The Garden of Forking Paths: Metaprogramming in Ruby

I've started learning some Ruby, one of the most popular object-oriented languages today. In OO programming, data and the actions on that data are bundled together in objects that represent real-world things, such as cars, financial transactions, or database connections. Ruby is remarkable because it allows you to change object definitions on the fly, due to 1) its open classes (any class, even a built-in one, can be reopened and modified after it is defined) and 2) the fact that method calls are resolved at run-time rather than fixed at compile time. Since binding occurs at run-time, you can send any message to any object. If the object has no matching method, Ruby calls its method_missing method, which raises a NoMethodError by default but can itself be overridden to interesting effect.
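Here's a minimal sketch of both ideas. The shout method, the Record class, and the get_ prefix are names I made up for illustration; they aren't part of Ruby itself.

```ruby
# Reopen the built-in String class and add a method (hypothetical name: shout).
class String
    def shout
        upcase + "!"
    end
end

# Hypothetical class that answers any get_<field> call through method_missing.
class Record
    def initialize(fields)
        @fields = fields
    end

    # Called by Ruby when no regular method matches the message.
    def method_missing(name, *args)
        key = field_key(name)
        return @fields[key] if key
        super # unknown message: fall back to the default NoMethodError
    end

    # Keeps respond_to? consistent with method_missing.
    def respond_to_missing?(name, include_private = false)
        !field_key(name).nil? || super
    end

    private

    # Maps :get_product_id to :product_id when that field exists, else nil.
    def field_key(name)
        text = name.to_s
        return nil unless text.start_with?("get_")
        key = text.sub("get_", "").to_sym
        @fields.key?(key) ? key : nil
    end
end

rec = Record.new(product_id: 15981923)
puts "hello".shout
puts rec.get_product_id
```

Calling super for messages the class doesn't understand keeps the normal error behavior, so typos still fail loudly.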

Though Ruby does not support multiple inheritance, you can alter classes dynamically by including modules in them, and you can even extend a single object with a module. These modules are called mixins, since you can mix them in whenever you want. This allows for amazing flexibility and an advanced programming technique called metaprogramming, or the programming of programs by programs.
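A small sketch of mixins, assuming made-up module and class names (Describable, Exportable, Table): include mixes a module into every instance of a class, while extend mixes it into one object at run-time.

```ruby
# Hypothetical mixin: adds a describe method to any class with a name reader.
module Describable
    def describe
        "#{self.class.name} named #{name}"
    end
end

# Hypothetical mixin that is added to a single object later on.
module Exportable
    def to_csv_header
        "name"
    end
end

class Table
    include Describable # mixed into every Table instance
    attr_reader :name

    def initialize(name)
        @name = name
    end
end

t = Table.new("SalesOrders")
other = Table.new("Products")

t.extend(Exportable) # mixed into this one object only, at run-time

puts t.describe
puts t.respond_to?(:to_csv_header)
puts other.respond_to?(:to_csv_header)
```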

Metaprogramming sounds esoteric, but it is particularly useful when you're designing classes that need to have dynamic metadata. For example, Active Record, the library used by Ruby on Rails, implements object-relational mapping (ORM), creating wrappers for database objects. A table, view, or stored procedure can be accessed through standardized methods that are generated automatically, according to the database object being wrapped, once a database connection is established.
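To make the idea of generated methods concrete, here is a rough sketch that builds one reader per column at run-time. It is not how Active Record is actually implemented; TableWrapper is a hypothetical class, and the column list is hard-coded where a real ORM would query the database.

```ruby
# Hypothetical wrapper: defines one reader method per column when constructed.
class TableWrapper
    attr_reader :name

    def initialize(name, columns, row)
        @name = name
        @row = columns.zip(row).to_h # e.g. "SaleID" => "1101421"
        columns.each do |column|
            # Adds a reader such as saleid to this object only.
            define_singleton_method(column.downcase) { @row[column] }
        end
    end
end

# Assumption: these columns and values stand in for a real database query.
sales = TableWrapper.new(
    "SalesOrders",
    %w(SaleID ProductID DateSold),
    %w(1101421 15981923 11/4/2006)
)

puts sales.saleid
puts sales.datesold
```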

I put together some (unfinished) code that shows how this might work. The DBObject class includes the BuildDBObjects module, whose included hook mixes the BuildIncludes module into the class. BuildIncludes supplies the constructor, which in turn extends the new object with the appropriate module as dictated by its arguments. If you construct a table wrapper, only the BuildTable module is mixed in. Each database type could use the same wrapper properties: metadata, data, and name.

module BuildDBObjects
    # Hook: runs whenever a class includes BuildDBObjects.
    def self.included(base)
        # include (not extend), so that initialize becomes an instance method
        base.send(:include, BuildIncludes)
    end

    module BuildIncludes
        def initialize(dbtype, name)
            @name = name
            case dbtype
                when "table"
                    # extend mixes the module into this one object only
                    extend BuildTable
                    define_table(name)
                when "view"
                    extend BuildView
                    define_view(name)
                    #...
            end
        end
    end

    module BuildTable
        def define_table(name)
            # query the database...
            @metadata = %w(SaleID ProductID DateSold)
            @data = %w(1101421 15981923 11/4/2006)
        end
    end

    module BuildView
        #...
    end

    # Shared wrapper properties for every database object type.
    attr_accessor :metadata, :data, :name
end

class DBObject
    include BuildDBObjects
end

tbl = DBObject.new("table","SalesOrders")
puts tbl.name
puts tbl.metadata
puts tbl.data

This is metaprogramming, since the code itself decides at run-time which methods each instance of the DBObject class gets, rather than having them fixed in the class definition. The great thing about using metaprogramming to implement object-relational mapping is that your classes can change as your database schema changes. This cuts down on the amount of code you might have to write, depending on the way you're accessing data. (Of course, you should create a data access class that uses the ORM in order to decouple the database schema from the application code--otherwise your application code might break with the slightest changes to the database.)
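Here is one way such a data access class might look. It's only a sketch: OrmRow is a stand-in for whatever wrapper the ORM generates, and SalesOrderAccess and its field names are hypothetical.

```ruby
# Stand-in for a generated ORM wrapper (assumption: it exposes metadata and data).
OrmRow = Struct.new(:metadata, :data)

# Hypothetical data access class: application code asks for stable field
# names, and only this mapping needs to change if a column is renamed.
class SalesOrderAccess
    COLUMN_FOR = {
        sale_id: "SaleID",
        product_id: "ProductID",
        date_sold: "DateSold"
    }.freeze

    def initialize(row)
        @row = row
    end

    # Looks up a value by its application-level field name.
    def fetch(field)
        column = COLUMN_FOR.fetch(field)
        index = @row.metadata.index(column)
        raise KeyError, "column #{column} is not in the schema" if index.nil?
        @row.data[index]
    end
end

row = OrmRow.new(%w(SaleID ProductID DateSold), %w(1101421 15981923 11/4/2006))
access = SalesOrderAccess.new(row)
puts access.fetch(:date_sold)
```

The application only ever sees :date_sold, so a schema change touches one hash instead of every caller.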

Though this example isn't exactly esoteric, it's probably not something you're going to do every day. It also shows a downside to metaprogramming: you have to write code that is meant to be read by computers. That means it might not be particularly readable by humans. One of the many reasons people like Ruby is that it is very programmer-friendly. It's very easy to read while at the same time cutting down on a lot of "extra" code, like hand-written accessor methods. Since metaprogramming is often less easy to read and understand, it's often more difficult to maintain.
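The trade-off shows up even with accessors. The three hypothetical classes below behave the same way, but the last one, which uses a made-up my_accessor macro built on define_method, takes noticeably more effort to read.

```ruby
# Ruby's built-in shorthand: one line, easy to read.
class Readable
    attr_accessor :name
end

# The "extra" code that attr_accessor saves you from writing.
class Handwritten
    def name
        @name
    end

    def name=(value)
        @name = value
    end
end

# A hand-rolled equivalent (my_accessor is a hypothetical macro).
class Generated
    def self.my_accessor(*names)
        names.each do |n|
            define_method(n) { instance_variable_get("@#{n}") }
            define_method("#{n}=") { |value| instance_variable_set("@#{n}", value) }
        end
    end

    my_accessor :name
end

[Readable, Handwritten, Generated].each do |klass|
    obj = klass.new
    obj.name = "SalesOrders"
    puts "#{klass.name}: #{obj.name}"
end
```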

A few links:
- Paolo Perrotta's book dedicated to metaprogramming Ruby, which contains an extended look at Active Record
- Ruby's core API
- Programming Ruby: The Pragmatic Programmer's Guide (2001)
- A good mixin tutorial

4 comments:

Mecherath, August 23, 2011 at 9:56 PM
I wandered around some of the Ruby links you posted for a while, and saw that since it uses dynamic typing, an object that inherits methods from multiple classes doesn't have to be re-cast to make use of them all. Apparently that's called Duck-typing, based on this quote:

"If it quacks like a duck and it walks like a duck then it must be of type Duck."

Zach VanderVeen, August 23, 2011 at 10:49 PM
Yes, it's an interesting way of programming. You can't do any type-checking at compilation (since there is no compilation), but you have a lot of flexibility at run-time. You can overload any class--including base type classes like true and false--so you can do some really wild stuff!

What I like most about Ruby, though, is how nice the syntax is. It really is easy to get going and write very readable code.

Mecherath, August 24, 2011 at 8:28 PM
You mention overloading a class, and before you mentioned overloading the default class, but one of the articles you linked also mentioned that you can't do the sort of overloading that is typical in Java.

As I understand it, overloading in Java means defining the same method multiple times, but with different inputs. So you might have TestFunction(int a, int b) and also TestFunction(String c), and it will know which one you mean based on what you pass in.

But your use of it seems to imply overwriting the base functionality. Are there more precise terms for those situations, or is it just the difference between overloading classes and overloading methods?

Zach VanderVeen, August 24, 2011 at 9:22 PM
That's funny, I actually ask that as an interview question! I meant to say overriding, not overloading. Overriding is redefining a base class method in an inherited class. Overloading is, as you say, providing different parameterizations of the same method.
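
A quick sketch of the difference in Ruby, using made-up classes (Base, Child, Calculator). Ruby has no Java-style overloading, so a default argument is one way to get a similar effect.

```ruby
class Base
    def greet
        "hello"
    end
end

class Child < Base
    # Overriding: same name as Base#greet, replaces it for Child instances.
    def greet
        super + " from Child"
    end
end

class Calculator
    # Not overloading: a second def with the same name would simply
    # replace the first, so one method with a default argument covers
    # both the one-argument and two-argument calls.
    def area(a, b = a)
        a * b
    end
end

puts Child.new.greet
puts Calculator.new.area(3)
puts Calculator.new.area(3, 4)
```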

Sunday, August 21, 2011
