Sunday, August 19, 2012

Cathedrals, Bazaars, and Data

According to Eric Raymond, with enough eyeballs, all bugs are shallow. This is the secret to the success of open source software. Linus Torvalds' primary innovation wasn't so much creating Linux as the development model surrounding it: get something out there, get people using it, and have them be your testers and contributors. That is, make users co-owners, not just consumers. Raymond compares open source software development to a bazaar, as opposed to building a cathedral.

I've been using Linux and open source software for more than a decade, but I've always been a bit wary of open source databases. Do you really want a fail early/often approach when dealing with people's data? Oracle has made a killing off of Data Fear. After all, can you promise 99.999% uptime with an unproven system? Open source might be fine for startups, but is it fit for enterprise?

PostgreSQL seems to merit the hype. Even SQL Server DBAs like it. I don't have experience with it in the enterprise, but I'm impressed so far. I believe that it's a tested solution, and I really like its extensibility. Anyone can create any kind of extension, such as an Amazon Web Services interface or k-nearest-neighbor search.

As a .Net/SQL Server developer, however, I have some concerns.

First, is free really free? SQL Server Enterprise is $6,874 per core. Oracle Enterprise is $47,500 (though Standard Edition One is only $5,800). That's nothing to sneeze at. I don't know Oracle that well, but with SQL Server, you get a whole suite of technologies--ETL (SSIS), data warehousing (SSAS), and reporting (SSRS) in particular. You get a number of disaster recovery options, like 2012's AlwaysOn.
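
To put that per-core price in perspective, here's a quick back-of-the-envelope sketch in Python. The server sizes are hypothetical, and it ignores things like minimum core counts, volume discounts, and support contracts, so treat it as rough arithmetic rather than a quote.

```python
# Per-core list price for SQL Server Enterprise, as quoted above (2012 figure).
SQL_SERVER_ENTERPRISE_PER_CORE = 6_874


def license_cost(cores: int, price_per_core: int = SQL_SERVER_ENTERPRISE_PER_CORE) -> int:
    """Return the total license cost in dollars for a server with `cores` cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * price_per_core


# Hypothetical server sizes, chosen only for illustration.
for cores in (4, 8, 16):
    print(f"{cores:>2} cores: ${license_cost(cores):,}")
```

Even a modest 16-core box lands in six figures before you've paid for hardware or people.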

By default, PostgreSQL works from the command line. You can download pgAdmin for free, but Navicat costs money. Graphical ETL tools? I don't think so. Reporting? You'll probably have to spring for Crystal Reports. Data warehousing? I hear Pentaho is a pain. It would be hard to keep PostgreSQL free--at least in an enterprise environment. If you're just doing CRUD for a web UI, you're probably ok--for a while.

The second thing that makes me nervous is the ridiculous number of open source technologies out there. The principle of open source is to 'let a thousand flowers bloom,' but in practice this means playing around with a whole lot of different products and figuring out how they can fit together. It seems like chaos. Microsoft is known for its Three Letter Acronyms (TLAs), but at least everything fits together pretty well.

Finally, because of the complexity of open source solutions, I can't imagine they'd be easy to maintain. Most companies now outsource common DBA and production support tasks. If you're running a PostgreSQL OLTP system with a MongoDB document store and Hadoop web logging, you'll have trouble getting Tata to maintain that for you. Even if you're not interested in outsourcing, it is useful to have standards and best practices across companies.

Of course, it's the innovative start-ups that are using new open source technologies to do new things. This is one of the main things that helps them innovate. They don't want to be standardized. They want to be part of the bazaar of new tech IPOs, rather than the cathedrals of established companies--at least until they have enough users for the cathedrals to buy them.

Anyway, I'm having a lot of fun. Check out my new favorite book.
