Behind the scenes of the new Free Software Directory
This work, as with all our work, is funded by the contributions of individuals like you. I hope that if you appreciate where we're going with the Directory, you'll show that support by becoming a member or making a donation.
There is a lot of free software out there now
The breadth and depth of a catalog of free software can be a powerful statement about just how much free software there is, to do pretty much everything under the sun. We've put up the new version of the site because we believe our previous Directory, due to limits affecting the speed of updates and additions, was underselling the current state of free software.
6500 programs is a lot, until you look at how many we don't have listed. The new version is progress here because it enables a more direct way of submitting new entries — previously you had to email them to us — and empowers a potentially unlimited number of people to help keep existing entries up-to-date and informative. This includes allowing free software maintainers to keep information about their own programs up-to-date, instead of emailing us for changes.
You might wonder why, then, we haven't just made the Directory a completely public wiki, with anyone able to edit. This is something we gave serious thought to, and will still keep in mind as a possibility for the future. Initially, we wanted to have it be publicly editable, but use something like MediaWiki's Flagged Revisions extension to indicate entries that have been vetted. This extension turned out not to be compatible with other extensions more important to our purposes, so we went with a different approach to accomplish a similar end. We've created one area of the wiki, the submissions area, which is editable and viewable by anyone. We are not indexing these submissions in search engines or the site-wide search, because doing that would mean presenting users with potentially nonfree software. We want to hold to our standards, and make sure that we're only listing what we say we list.
So, as it stands now, anyone can view entries submitted, and since the new version uses the CAS-based single-sign-on system we've been rolling out for the FSF web infrastructure, anyone with an FSF web account (which costs nothing and requires only an email address) can make new submissions, as well as comments on other pages. Anyone who wants to go further with helping just needs us to give them a little more access, as part of a brief training about the standards we're working from. Maybe in the future with a larger contributor base of editors, this step won't be necessary.
What free software program would you recommend to...
We believe our new version will make it easier to find programs that serve specific purposes and meet specific requirements, which is helpful not only for existing free software users but also for advocates making recommendations to new users.
Each entry is meant to be a gateway to learning more about the program. In a consistent format, it presents users not just with a description of what the program does and how to get it, but also with resources that they can use to get help, learn more, or even contribute to the software.
We can envision some social aspects growing up around the site that would be beneficial toward this purpose — for example, what if users made their own lists of favorite free software, on the wiki? What if some people did the work to tag entries with properties indicating that certain free programs are suitable replacements for popular nonfree programs? What if properties were created for various features to enable useful dynamically-generated comparison tables between different free software programs in the same area? What if blogging software had plugins to automatically link to pages in the Directory when they blog about their favorite free software? What if we had a system for linking paid support providers to these programs? What if it were easy from the Directory pages to make donations to your favorite programs?
I'm pretty excited about the possibilities.
Helping fully free operating system distributions
Unfortunately, just because something is GNU/Linux does not mean it is all free software. GNU/Linux has succeeded to the point where companies sometimes now produce proprietary software that works with GNU/Linux. A goal of the FSF is to make it possible for people to use only free software to do everything they need to do, so this GNU/Linux-compatible proprietary software does not help us. Fortunately, several GNU/Linux distributions have stepped up and committed to distributing and promoting only free software.
I think even people who accept compromises like proprietary firmware in device drivers can agree with us that ideally we would not have to make such choices. The ultimate goal for just about all of us is to have our operating system consist only of software we can fully control.
The Directory will help us continue to encourage the development of these distributions. As alluded to in the press release, we think we'll be able to import packaging data from existing free distributions, which will make the Directory a much more useful resource for developers who want to create new free distributions benefiting from all of the vetting work that's already been done. It will also help show what free software exists that hasn't yet been packaged for distributions.
Academics and researchers
With the growth of free software, there are more and more scholars writing about it. These scholars need data. Currently, much of the data available about the use of free software and the popularity of various licenses comes from corporations who have certain interests. It doesn't always cover what researchers want to know. With the machine-parseable output available via Semantic MediaWiki, researchers should be able to generate useful statistics on any property in the database, including licenses. Additionally, the parseable formats mean that the data can be more easily reused, for other dynamic applications on the web or even for starting other kinds of directories.
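As a rough illustration of the kind of statistic this makes possible, here is a minimal Python sketch that tallies licenses across a handful of entries. The sample records, the "name" and "license" field names, and the helpers `license_counts` and `license_shares` are all assumptions made up for the example; they are not the Directory's actual property names or export format.

```python
from collections import Counter

# Hypothetical records standing in for exported Directory entries.
# The field names ("name", "license") are assumptions for illustration only.
SAMPLE_ENTRIES = [
    {"name": "program-a", "license": "GPLv3+"},
    {"name": "program-b", "license": "GPLv3+"},
    {"name": "program-c", "license": "Apache-2.0"},
    {"name": "program-d", "license": None},  # entry with no license recorded
]


def license_counts(entries):
    """Count how many entries carry each license, skipping entries without one."""
    return Counter(e["license"] for e in entries if e.get("license"))


def license_shares(entries):
    """Return each license's share of the entries that have a license."""
    counts = license_counts(entries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lic: n / total for lic, n in counts.items()}


if __name__ == "__main__":
    # Print licenses from most to least common.
    for lic, n in license_counts(SAMPLE_ENTRIES).most_common():
        print(f"{lic}: {n}")
```

The same tally would work for any other property once real data has been loaded into a list of records like this.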
Because we're using wiki software, revision history for entries is also available, meaning that researchers will be able to write about changes in the world of free software over time. Who knows, this might even become important in patent discussions, since the Directory will become a more comprehensive source of what innovations have been accomplished and when.
For those interested in digging into the data, you can download a bulk XML dump of the entire directory. We'll continue to provide updated versions of the full set of data, but note that any query you do on the site produces machine-readable results in addition to the view you see in your browser.
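For a sense of what working with such a dump might look like, the sketch below reads page titles and revision timestamps from a small hand-written sample. It assumes the dump follows the general shape of a MediaWiki XML export (pages containing revisions) and that revisions are included; the sample content and the helper names `local_name` and `revision_history` are hypothetical, and a real dump carries many more fields.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written sample shaped like a MediaWiki XML export.
# Titles, timestamps and text are invented for illustration.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Program-a</title>
    <revision><timestamp>2011-09-20T08:30:00Z</timestamp><text>second</text></revision>
    <revision><timestamp>2011-09-01T12:00:00Z</timestamp><text>first</text></revision>
  </page>
  <page>
    <title>Program-b</title>
    <revision><timestamp>2011-09-05T10:00:00Z</timestamp><text>only</text></revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so tags match with or without a namespace."""
    return tag.rsplit("}", 1)[-1]


def revision_history(xml_text):
    """Map each page title to its revision timestamps, oldest first."""
    history = {}
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title = None
        stamps = []
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "timestamp":
                stamps.append(node.text)
        # ISO 8601 timestamps in the same timezone sort correctly as strings.
        history[title] = sorted(stamps)
    return history


if __name__ == "__main__":
    for title, stamps in revision_history(SAMPLE_DUMP).items():
        print(title, len(stamps), "revision(s), first on", stamps[0])
```

Matching on the local tag name rather than a fixed namespace is a deliberate choice here, since the exact namespace declared by a given dump is not something this sketch can assume.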
We settled on using MediaWiki as the software for the new site because it is a mature software system with a strong community of both core developers and extension developers. We've also had success with MediaWiki in the past, for sites such as LibrePlanet.org. But MediaWiki alone does not meet our needs for more structured data, so we also settled on a suite of extensions that includes Semantic MediaWiki and Semantic Forms. These extensions give us structured/semantic data, form-based input, and advanced query functionality, along with import and export of bulk data.
For visitors to the Directory, one of the most important aspects of Semantic MediaWiki is that it provides us with flexibility and expandability for our classification system. Members of the community will have the ability to help update and continually improve the property system. At the end of the day, this means contributors can make it easier for visitors to browse through the software on the site and to discover free software. Because so many of the site's features come from MediaWiki, rather than code that lives on a server that contributors don't have access to, the bar for making all kinds of changes is low.
MediaWiki with Semantic MediaWiki gave us the right balance between malleable (enabling creativity and evolution) and structured (enabling machine-readable output, reliable searching, and consistent presentation).
The way forward
While doing this work, we found plenty of warts. I'm sure you will too.
Many of the entries are outdated, since we have not done significant updates since last October — that wasn't possible when we were focused on migrating the data without errors or loss. And of course, a reason for this change was an acknowledgment that we couldn't keep pace with updates.
The user interface and the appearance of the site can most definitely be improved. Some of the information presented is a bit confusing ("resource URL"??). The current property names are a little strange in some cases, which makes searching a little strange.
Most importantly, there are thousands of programs that aren't yet listed.
But this is why we chose the technology we did. We know there are many areas for improvement, and always will be. Fundamentally, the Directory is a categorization system for something that is constantly changing. The contents need to be able to change. The categories need to be able to change. The way people search needs to be able to change. What information is presented needs to be able to change. MediaWiki enables us all to make these changes, so we decided to go ahead with launching the new version as an acknowledged work in progress, even with clear areas for improvement.
The people behind the work
There are many people to thank for helping us build the new site, and I can't thank them all here. Our sysadmin Peter Olson worked with previous FSF intern Mark Eriksen to do the majority of the work on importing all the data from the old site. Our campaigns manager Joshua Gay and I did the majority of the MediaWiki work. Our other campaigns manager Matt Lee helped with the layout and appearance of the site.
We owe a big debt to the Wikimedia Foundation and all of the hackers who have worked on MediaWiki over the years. We especially thank Yaron Koren, the developer of Semantic Forms and other extensions, for providing us with valuable feedback and support.
Also, thanks to all of the previous FSF staff Directory Maintainers — Janet Casey, Ted Teah, Deborah Nicholson, and Kelly Hopkins — for their work checking and adding new packages. Together, they are mainly responsible for the 6500 programs currently listed.
And thanks to the free software community, which has written so much free software that we have found ourselves unable to keep track of it all!
Finally, let me thank you in advance, for the work you're about to do helping to build this community resource ;). Go to http://directory.fsf.org/wiki/FSD:Participate to find out how you can get started.