the Garden of Forking Paths: April 2012

Sunday, April 29, 2012

Where's the Logic Go?

Typically, the business logic resides in the middle, or domain, layer

Every application has at least two components: the design of technology platforms, called the application logic, and the processes that need to happen, called the business logic. In theory, the business logic is independent of the application logic, since a business has rules, workflows, and transactions that have nothing to do with any programming languages or database systems. In practice, however, application logic can put constraints on business 'illogic.'

One of the key design choices in developing any application is deciding where the business logic should go. Database developers think it should go in the database, since keeping the code at the database level is often most performant. The problem with this is that SQL doesn't have many of the basic niceties of any Object-Oriented language. Furthermore, since stored procedures use proprietary SQL, they can prevent the migration of database code to another vendor.

OO developers think business logic should reside in the domain layer, since objects are best at representing the real world. Libraries and IDEs like Visual Studio make it very easy to get an OO application off the ground, and they help with maintainability. For many applications, however, the amount of code needed to build a full MVC structure, for example, is unnecessary and may even be prohibitively burdensome.

In reality, no application design should be used for all problems. Martin Fowler provides four models that couple domain and database access logic.

Transaction Script / Row Data Gateway - Domain code simply passes requests from the UI to the database. Database access is modeled at the record level.

Table Module / Table Data Gateway - Domain code is organized in objects corresponding to tables in the database. Database access is modeled at the table level.

Domain Model / Active Record - Domain code is organized according to business rules. Database access is modeled by CRUD objects.

Domain Model / Data Mapper - Domain code is organized according to business rules. Database access is modeled by a mapping object layer.

Fowler suggests that your choice of pairings should depend upon the complexity of your business logic. An application used for reporting can simply send requests to a database, but a complex sales order process should probably be mirrored by a domain model and a data mapper. A domain model will have a higher up-front cost, but it may pay off as the complexity of an application increases.
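To make the last two pairings concrete, here is a minimal sketch in Python using the standard sqlite3 module. The orders table, the class names (ActiveOrder, Order, OrderMapper), and the discount rule are all hypothetical, invented for illustration; they are not taken from Fowler's text. The point is only where the database code lives: inside the domain object (Active Record) or in a separate mapping layer (Data Mapper).

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    return conn


class ActiveOrder:
    """Active Record: the object holds a business rule AND saves itself."""

    def __init__(self, conn, customer, total):
        self.conn = conn
        self.customer = customer
        self.total = total
        self.id = None

    def apply_discount(self, percent):
        # Business rule (hypothetical): percentage discount, rounded to cents.
        self.total = round(self.total * (100 - percent) / 100, 2)

    def save(self):
        cur = self.conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)",
            (self.customer, self.total),
        )
        self.id = cur.lastrowid


@dataclass
class Order:
    """Data Mapper, part 1: a plain domain object with no database knowledge."""

    customer: str
    total: float
    id: Optional[int] = None

    def apply_discount(self, percent):
        self.total = round(self.total * (100 - percent) / 100, 2)


class OrderMapper:
    """Data Mapper, part 2: a separate layer moves Orders in and out of rows."""

    def __init__(self, conn):
        self.conn = conn

    def insert(self, order):
        cur = self.conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)",
            (order.customer, order.total),
        )
        order.id = cur.lastrowid

    def find(self, order_id):
        row = self.conn.execute(
            "SELECT id, customer, total FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        if row is None:
            return None
        return Order(customer=row[1], total=row[2], id=row[0])
```

In the first style, changing the schema means touching the domain class. In the second, Order can be tested without any database at all, at the cost of an extra class to write and maintain, which is the up-front cost described above.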

I think this general trade-off makes a lot of sense, and it helps me understand and categorize a number of applications I've seen. But, unless I am mistaken, pretty much any enterprise application is going to require a layer for business logic objects, a layer for data mapping, a layer for data access, a layer for the data itself, and, of course, the presentation layer. If it's possible to reduce the complexity of these layers, do so!

Sunday, April 22, 2012

Bigger Faster Stronger

Scalability is one of those words that everyone uses but few understand. It's a measure of how adding resources (typically hardware) affects performance. You can scale vertically by increasing the power of a server. You can scale horizontally by adding servers. The scalability of a system depends on how performance is defined. Martin Fowler suggests a few categories:

Expect more posts on this one

Response time, or the amount of time it takes to process a request
Responsiveness, or the amount of time it takes to acknowledge a request
Latency, or the amount of time it takes to get a response (this is especially important when there is no data to return)
Throughput, such as transactions / second
Load, or the amount of stress a system is under
Load Sensitivity, or response time / load
Efficiency, or performance / resources
Capacity, as in maximum throughput or load
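
A few of these measures can be computed directly from load-test numbers. The sketch below is only an illustration: the Measurement fields and the sample figures are made up, and load sensitivity is interpreted here as the change in response time per added unit of load, which is one reasonable reading of "response time / load."

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One hypothetical load-test observation."""

    users: int             # load: concurrent users during the window
    requests: int          # requests completed during the window
    window_s: float        # length of the window, in seconds
    avg_response_s: float  # mean response time, in seconds
    servers: int           # resources in use


def throughput(m: Measurement) -> float:
    """Requests completed per second."""
    return m.requests / m.window_s


def efficiency(m: Measurement) -> float:
    """Performance divided by resources: throughput per server."""
    return throughput(m) / m.servers


def load_sensitivity(a: Measurement, b: Measurement) -> float:
    """Extra seconds of response time per additional concurrent user."""
    return (b.avg_response_s - a.avg_response_s) / (b.users - a.users)


# Made-up sample data: the same two servers under light and heavy load.
light = Measurement(users=10, requests=600, window_s=60, avg_response_s=0.25, servers=2)
heavy = Measurement(users=20, requests=900, window_s=60, avg_response_s=0.5, servers=2)

print(throughput(light), throughput(heavy))
print(efficiency(heavy))
print(load_sensitivity(light, heavy))
```

In this invented data, throughput rises under heavier load while response time doubles, which is why a single number rarely tells you whether a system "scales."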

Systems must be designed to scale, but what scaling means will depend upon the purposes for which a system is built. It might be tempting for database professionals to think about scalability in terms of transactions / second or the number of active accounts. But what really matters is whether or not the system is usable given an increase in transactions or accounts, and this depends upon the use for which the system was created. If we're talking about an e-Commerce system, throughput is probably more important than response time, as long as responsiveness is high. If we're dealing with a manufacturing system, we'll probably be most interested in throughput.

It's important to design systems to be scalable. The Internet has pushed adoption rates to unprecedented levels. Consider Instagram, which has 30 million users after 2 years. Draw Something had 36 million users in three weeks. Scalability is a prerequisite for virality.

In the case of N-tier applications which have a Service-Oriented Architecture, it's usually easy to add hardware to the web and application servers. Load balancers and web farms can take care of extra load by distributing it evenly across a number of servers. The real problem is, as always, the database layer.

You can't just add servers to the database layer, because a database has to be architected from the start to run across multiple servers. Concurrency adds to the difficulty, as database transactions must be ACID (atomic, consistent, isolated, and durable). In other words, you have to manage updates to multiple servers, making sure that an update to Server 2 does not depend on Server 1.

Lightning bolts make it faster

I thought the Cloud might be the solution to database scalability, but Microsoft Azure currently supports databases of only 150 GB in size. When I talked with Microsoft consultants, they recommended 'sharding' databases. This means having a master database that directs transactions to the appropriate database server. For instance, all transactions dealing with North American accounts should go to Server 1, and those dealing with South American accounts to Server 2. Sharding adds a layer of abstraction and a layer of complexity, and it requires duplication of database schema, but it's an increasingly popular approach.
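
The routing idea can be sketched in a few lines of Python. Everything here is hypothetical: the region codes, the server names, and the ShardRouter class are invented for illustration, and in-memory lists stand in for real database servers. This is not how Azure or any particular product implements sharding.

```python
# Hypothetical mapping from account region to the server holding that shard.
SHARD_BY_REGION = {
    "NA": "server1",  # North American accounts
    "SA": "server2",  # South American accounts
}


class ShardRouter:
    """Plays the role of the 'master' that directs each transaction."""

    def __init__(self, shard_map):
        self.shard_map = dict(shard_map)
        # Each shard has the same schema; a list stands in for its table.
        self.shards = {name: [] for name in set(self.shard_map.values())}

    def route(self, region):
        """Return the name of the server responsible for a region."""
        if region not in self.shard_map:
            raise ValueError(f"no shard configured for region {region!r}")
        return self.shard_map[region]

    def write(self, txn):
        """Store a transaction on the shard for its region; return the shard name."""
        shard = self.route(txn["region"])
        self.shards[shard].append(txn)
        return shard


router = ShardRouter(SHARD_BY_REGION)
router.write({"region": "NA", "account": 42, "amount": 19.99})
router.write({"region": "SA", "account": 7, "amount": 5.00})
print({name: len(rows) for name, rows in router.shards.items()})
```

The extra complexity shows up as soon as a query spans regions: the caller has to ask every shard and merge the results itself.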

Another option is Oracle's RAC system or Microsoft's MatrixDB, which has basically been ported to Azure. I'm skeptical that MatrixDB will make it into the next edition of SQL Server (2012 has AlwaysOn, which is close, but the mirrors are read-only). In RAC or MatrixDB, databases are replicated across multiple servers and a load balancer directs reads and writes to the server with the least load. Changes are replicated asynchronously between database servers. Still, there are limitations to the size of databases for which this would be feasible.

Relational databases are great up to a certain size (though this is growing, thanks to SSDs and improved caching). It's hard to say exactly what this size is. In the end, scalable databases adhere to principles of normalization and partitioning. After a certain amount of data, RDBMSs will be of no use, and NoSQL solutions are the answer to a different problem. Are you ready to scale?

Sunday, April 15, 2012

The Reef and the Market

Philosophy professors like asking students where their ideas come from. "I just think them," students are bound to retort. "Aha!" The professor pounces. "But where does the idea of 'I' come from?" Silence.

Much of philosophy involves trying to explain where various ideas come from. Socrates and Aristotle understood the advance of thought as a process of dialogue which builds upon the ideas of the past. René Descartes argued that the idea of God was at the root of all true ideas. Karl Marx thought many of our ideas, such as religion, are ideology and a product of power relations. W.E.B. DuBois believed that we understand ourselves and others through the lens of race and that these ideas have a contingent history. William James thought our ideas were a product of 'what worked' for us and people like us in the past. Thomas Kuhn argued that scientific theories belong to a history of evolving paradigms.

Steven Johnson is concerned with good ideas in his book, Where Good Ideas Come From. There are three images of innovation that orient his inquiry. There are coral reefs, which make up 0.1% of the Earth's surface but support 25% of all marine species. There is the city, which, as Geoffrey West has shown, increases in innovation in relation to population at a super-linear rate. And there is the web, which has decreased the time required for innovating and adopting new technologies from 20 years to 2.

By looking at innovation at a number of different scales, including the level of brains, cities, and ecosystems, Johnson comes up with a framework I summarize in the following way. Good ideas are fostered by:

networks that can change
that have some stability
that favor chance encounters
that embrace error
that support re-use
that support building on other good ideas

More creativity per capita than any suburb

How can businesses foster innovation? Johnson shows that the majority of major inventions in the last two hundred years did not happen in R&D labs at major firms or in the garages of people who later struck it rich. They usually took place in colleges and universities, or organizations like CERN. This was surprising to me, given my experience in the academy, its silos of rival departments, and its distance from the real world. However, universities do allow people from very different backgrounds to work together and to circulate and build upon others' ideas freely. They allow for experiments to go wrong and let people research controversial things.

Many businesses today tout the importance of innovation, but few allow for failure, the open exchange of diverse ideas, change, or time for reflection. A recent survey of CEOs showed that they spend around 50 hours a week working but have little time to reflect, given the constant interruptions of BlackBerries. Google, on the other hand, requires employees to work on their own projects 20% of the time. Twitter built an open API and then built their services on top of that. Apple, while opaque to outsiders, has a very messy development process where everyone at each step of the development chain is involved with a new product at the very beginning.

Twitter or GM?

More personally, Johnson's book caused me to reflect on when I'm most creative. I'm best in my sleep or in writing after having a discussion with someone. I need time to let ideas simmer, but I'm lucky enough to have lots of smart people to discuss ideas with. Johnson concludes:

"Go for a walk; cultivate hunches; write everything down, but keep your folders messy; embrace serendipity; make generative mistakes; take on multiple hobbies; frequent coffeehouses and other liquid networks; follow the links; let others build or your ideas; borrow, recycle, reinvent. Build a tangled bank."

Sunday, April 8, 2012

Do Technologies Have Politics?

I've been exploring the idea that technology isn't just a collection of things which can be used interchangeably for either good or ill. Technologies shape the world and the range of choices we have. For example, in my last post, I suggested that real-time technologies may have effects on the ways we do finance. Our question today is: if technologies can shape financial decisions, can they shape politics as well?

Riot!

It's one thing to say that technologies are used for political ends and another to say that technology is political. The latter is Langdon Winner's claim. If politics is about power and authority, then at least some technologies are political if they embody, define, or exert the power of some people over others.

Technologies are obviously political when they are used as tools of control. It is often feared that new computer technologies will allow governments to spy on people in order to control illegal activities. But forms of control can take much subtler forms. For example, when Robert Moses designed many public works in New York City in the mid 20th century, he designed them so that people could not get to them via public transportation. The overpasses he constructed have a clearance of only 9 feet--far too low for buses and those who would ride them to get to Jones Beach. Similarly, many people don't realize that the grand boulevards of Paris were designed by Baron Haussmann in the mid 19th century as a form of riot control. As a Penn State alum, I can tell you that narrow streets are much more conducive to riots than open fields.

These are examples of technologies being used to empower some and dis-empower others. But are some technologies inherently political? I think so. Marx and Engels called for factory workers to take over control of 'the means of production,' but Engels later argued that the very technology of industrial manufacturing requires a division between workers and elites. How can you run a factory without a boss?

Winner endorses solar power over nuclear energy because it can be decentralized and doesn't require the scientific and bureaucratic elites who make decisions without the knowledge of anyone else, as in the case of the Fukushima Daiichi disaster. For months, the Japanese government and TEPCO lied to people about the seriousness of the accident. Countless polls show that support for nuclear power is at an all-time low across the world, with Germany abandoning future plans for construction. Nuclear fallout is an issue, of course, but I think people are really worried about the forms of power that are inseparable from nuclear energy.

Visit Japan, future skate park capital of the world.

Technologies that shape public opinion and collective decision-making are also inherently political. Before the invention of the printing press, it would have been impossible to have democracy on anything other than a very small scale. Though it required a set of elites who broadcast information to information consumers, and though, as Walter Lippmann worried, most people would never get the whole truth, the dynamic duo of newspapers and democracy was better than any alternative. Pamphlets made the American Revolution possible.

I, for one, welcome our new search engine overlords.

Today, internet technologies are already reshaping power. The decline of major 1-directional or broadcast media, including newspapers, is heralded by some as the dawn of a new, more democratic age. Like much breathless optimism about technology, such claims should be taken with a grain of salt. Most internet traffic is routed through about five sites, a problem Matthew Hindman terms Googlearchy. Most influential political bloggers are white males who went to Ivy League schools. But there are many precedents for thinking that new technologies can undermine and recreate power in radical ways. These will be fought by the old guard and embraced by the new.

It comes back again to: what kind of technology do we want?

Sunday, April 1, 2012

Accounting and Thermodynamics

Predator-vision

A few years ago, I rented a very cheap house in a very cold part of the country. I wanted someplace big to play my drums, but I didn't realize what kind of heating bills I would get in the winter. I ended up keeping the house at 40 degrees Fahrenheit, using space heaters, and freezing a few pipes.

Besides earning a story to tell, I also learned how to see rates of flow. I was suddenly able to see the various heat sources and sinks in my house, with vectors of various strengths showing the direction and rates of flow. Unconsciously, I had always thought of heat as being a property of a room or building, but I now saw heating the way physicists see it.

Locke-vision

Such paradigm shifts, which overlay your present view of the world with a broader experience, are not uncommon. I always enjoyed studying geology, since it allows you to see the seemingly-fixed landscape as a fluid process and to see human activity from the perspective of the Earth. For thousands of years, astrology let people interpret ordinary events through the lens of the cosmos.

One of the most natural ways of seeing the world is as a collection of things with properties. This view was best put down on paper by modern philosophers like John Locke. They went back and forth about how subjective 'secondary' qualities like color and taste could be known to be true to the 'primary' essence of a thing, but they never questioned the atomistic model of the universe. This was only natural when the physics of the day characterized the interactions of the universe by analogy to billiard balls.

I've been trying to get my head around some hard accounting problems, and I realized that my problem was thinking of accounts as things with properties. It is correct, in a sense, to describe accounts as having a dollar amount. But it is more correct to think of them as part of a system of interconnected accounts with various directions and rates of flow, much like the heat in my cold house. This is because the value of an account is constantly changing, and because its changes are the direct result of transfers from other accounts. Even the cash in your wallet is not separate from this plumbing. I've begun to see the systems I build and maintain as part of the flow of the entire monetary system.
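
One way to see the difference is to store only the transfers and derive balances from them. The sketch below is a toy, with a made-up Ledger class and invented account names, not a description of any real accounting system. A balance here is never a stored property; it is just the net of the flows in and out of an account.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    """A toy ledger that records flows between accounts, not balances."""

    def __init__(self):
        self.transfers = []  # list of (source, destination, amount)

    def transfer(self, source, destination, amount):
        """Record a flow of money from one account to another."""
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("transfer amount must be positive")
        self.transfers.append((source, destination, amount))

    def balances(self):
        """Derive every balance as inflows minus outflows."""
        totals = defaultdict(Decimal)  # Decimal() is zero
        for source, destination, amount in self.transfers:
            totals[source] -= amount
            totals[destination] += amount
        return dict(totals)


ledger = Ledger()
ledger.transfer("employer", "checking", "1000.00")  # e.g. a direct deposit
ledger.transfer("checking", "landlord", "600.00")
print(ledger.balances())
```

Because every transfer subtracts from one account exactly what it adds to another, the balances across the whole system always sum to zero, much as heat leaving one room has to show up somewhere else.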

This flow is becoming particularly interesting with the growth of currency-less transactions like ACH. If you get direct deposit, you use ACH. In the future, there will be no paper or coin currency. We'll simply transfer funds between accounts with smartphones or other devices. There are many fascinating consequences of the death of currency. For instance, if governments do not have to pay the cost of printing money, the cost of transacting will be borne by retailers in the form of transaction fees. Someone will also need to bear the cost of information theft when you lose your phone.

$0.01, spent at all places and times

But I have a really crazy thought. If money becomes infinitely liquid, won't its velocity increase infinitely, thus increasing the money supply infinitely, and raising the cost of everything infinitely? I wonder if the laws of thermodynamics will continue to hold as currency becomes digitized. With real-time web services and other technologies that take us away from the daily batch file ETL common to financial systems, we increase the liquidity of money with consequences that are not yet clear. Instead of rates of flow, we may have currency that is in all accounts at all times, much like the Heart of Gold's Infinite Improbability Drive. But I suppose I shouldn't borrow serious thoughts from Douglas Adams.