Sunday, February 12, 2012

The Art of War

While heeding the profit of my counsel, avail yourself also of any helpful circumstances over and beyond the ordinary rules.
-- Sun Tzu
How do you win a war? There is no easy answer. If ever there was an art that could not be reduced to a set of axioms, it's war--and perhaps database programming. Admittedly, I have neither fought in nor commanded in any skirmishes, battles, police actions, or wars, but I'm pretty sure that if there was a surefire way to win, it would have been discovered by now. (Well, there is one rule: never start a land war in Asia.)

The same is true with database programming (never
select *). All forms of programming, and of problem-solving more generally, involve trade-offs. But the trade-offs are the greatest with database programming. As the database is the greatest barrier to performance in any application, its design is the most fraught with philosophical and practical dilemmas. I find it very difficult to interview people for database-related jobs, because there are rarely any right answers. It's also difficult to glean advice from books and blogs, because situations vary so wildly.

For instance, there is a trade-off between reads and writes in a database. If you have a lot of reads, you can create indexes to your heart's content, assuming you have unlimited space, which is never the case. Similarly, there is a trade-off between batch processing and real-time transactions. If your database handles both, and you cannot simply schedule batch jobs for off-hours, you will have to prioritize one or the other. Another trade-off is between reporting and online transactions. Even if you report off of a mirrored snapshot, you must accept the timing delay thus incurred.  This list could be extended indefinitely.

An early database under siege

It can be a frustrating realization that there are no general principals by which lowly database programmers abide. But this can be liberating as well. What would be the point of programming if you too could be replaced by a machine following a set of principals? Programmers are the last stop against the ever-increasing automation of the world. Embrace the art!

More practically, there are some things you can do. Once you've achieved a certain level of proficiency, the best you can do--really--is to catalog trade-offs.  I recently ran into one between database complexity and service traffic. It is usually a good idea to reduce the amount of chatter over a web service, but it's also a good idea to write clear, maintainable SQL code. Another trade-off: you can make all your code dynamic in order to deal with the deployment processes and future change, but then you have to deal with a more complex database design.

Stéphane Faroult outlines five factors that contribute most to performance:
  • The number of rows in the tables involved
  • The existing indexes on these tables
  • Storage peculiarities (like partitioning)
  • The quality of the criteria provided
  • Query diagram à la Tow
  • The size of the result set

Compare to Dan Tow's query diagrams, which show:
  • The join order of tables
  • The selectivity of the filters on those tables
  • The selectivity of the joins between tables

Of course, performance is only one of many database goals and involves trade-offs at a micro and macro level. For example, decisions have to be made about indexes on particular tables, and about whether performance or maintainability is more important. There aren't any universal truths, just experience and deliberate thought.

What trade-offs have you found?

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...