Monday, April 14, 2014

Public Universities Should Use Open Source Software

The Homeless Econometrician: Black Box Software

Choosing to go open source is a big deal.  It means that when you ask for help or improvements, you are dealing with a highly active and generous user community which enjoys helping, instead of a corporation which needs to justify the expense of providing customer service by extracting fees either directly or indirectly (through workshops, "support costs", etc.).

1. Open source platforms attract a different kind of user.  Frequently university affiliated, open source users tend to be much more generous with access to their work, so it may be possible to piggyback on work developed by other universities or agencies which already ask many of the same questions you would like to ask.  This is almost certainly not the case with proprietary software deployed primarily by corporations, which in contrast guard access to their resources and copyright their content.

2. Open source software can have higher quality than proprietary software. Unfortunately, the open source movement has gotten a bad reputation for producing inferior products at no cost.  This, however, is not a fair assessment.  Sure, products start out as low cost, but remember Firefox.  Up until the development of Google Chrome, Firefox was far and away the best browser you could use (after spending time using Chrome, I still think it is).

3. Publicly funded research should be used to support public goods. There is an additional, philosophical reason for using an open source product: just by using it, you are promoting the entire purpose of publicly funded research. By using any product, you directly or indirectly help justify its continued existence and make it better.  If you use proprietary software, then you are using public funds to support a private agency.  In contrast, supporting open source software uses public funds to support a public good.

4. Open source software is often better maintained than proprietary software. Open source platforms do not rely upon the sluggish updating process of a corporation’s software development department. Popular open source deployments have rapidly outpaced their corporate rivals.  R, an open source statistical analysis package, for instance has its base package updated 8 to 16 times more frequently than proprietary rivals.

5. Popular open source software is often far more flexible than proprietary software.  Open source software frequently exceeds the bounds of its designers' imaginations, while proprietary software rarely does.  For example, both R and WordPress have an order of magnitude more options available than many of their rivals (60 thousand commands for R compared with 1 to 2 thousand for its rivals, and 30 thousand plug-ins for WordPress compared with approximately 1,000 for Blogger).

6. Open source software is transparent. There is no black box when it comes to open source software. By definition, all of the code is available for users to access.  This removes the ambiguity of never really being able to verify what exactly your software is doing.

7. Open source software is free.  This is one of the most commonly cited reasons to use open source software.  You might object that many research projects have large budgets and can afford to buy proprietary software.  This may be true for you, but can your collaborators afford to buy that same software?  What about their collaborators?  What about your students?  And what will happen to your project if your funding runs short?
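
As a small illustration of the transparency point (6) above, here is a minimal sketch of what "no black box" can look like in practice. It uses Python rather than R purely as my choice for the example (in R, typing a function's name at the console prints its definition in a similar way). The helper name `show_source` is hypothetical; `inspect.getsource` and `statistics.mean` are part of the Python standard library.

```python
import inspect
import statistics


def show_source(func):
    """Return the source code of a function as a string.

    This only works when the function is written in Python and its
    source file is shipped with the installation, which is the case
    for the standard library's statistics module.
    """
    return inspect.getsource(func)


# Read exactly how the library computes a mean, rather than trusting it blindly.
source = show_source(statistics.mean)
print(source)
```

With a closed source package there is no equivalent step: you can read the documentation, but not the computation itself.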

Thanks for reading this far.  Some of the open source tools that I use are:
For many computation tasks: R
For high-powered computation: Julia
For in-field data collection: Open Data Kit
For online assessments: The Concerto Platform


  1. This is a rather interesting post doing a great job summarizing the benefits of FLOSS. Thank you.

    One key advantage of FLOSS for me is also the ability to look at the source code. I think for students (and frankly, who isn't a student these days) this is a benefit that cannot be overemphasized. By looking at code we can learn from others and gain much more insight (pun intended) than with closed source software. This is especially true for a field like computational statistics, and hence R, where the creation of clever algorithms is at the center.

  2. From the management information reporting perspective, I agree. A huge amount of resources are squandered on annual licences for software that only promotes expensive dependencies, especially in the analytics field, e.g. see Stephen Few's observations and comments from those who've worked with Business Objects (SAP):
    That's why I gave up looking to licensed analytics for solutions and had to go looking for my own, adopting R instead. It's within management (especially information and technology) where a lot of the resistance lies. They're not often the kind of users you describe in 1). OS tools would undermine a management monopoly on information. That information asymmetry allows management (esp. IT) to bombard faculty with useless metrics and crude models to judge them against.

  3. I really love reading and following your posts, as I find them extremely informative and interesting. This post is equally informative and interesting.

  4. I am really glad to read this post. Full of information.