Sunday, July 6, 2008

Metaprogramming keeps it neat: DSL creation techniques in Ruby - Part 1

Hi! Due to the lack of resources on building DSLs in Ruby on the internet, I've written a small guide on how you can use DSL creation techniques in Ruby to cut down on a lot of redundant code, make code easier to maintain and improve readability. Hell, you could even write your own language. Here's your chance to discover how Rake, RoR and other amazing tools are written, and maybe even write one of your own!

This is not a step-by-step how-to guide on building a full-fledged DSL. Rather, I will show you how you can apply DSL implementation techniques to cut down on lots of redundant code, through examples that get progressively more complex as we address problems or hit a dead end.

I've taken care to make sure that everything mentioned in this article makes sense even to novices; that's why I haven't slapped complicated code on you right away. Instead, I've taken a piecemeal approach.

Rules of the game: Detecting when something is not right: When you see a lot of redundant code sitting around, something's wrong somewhere. Why-TF should you rewrite pieces of code when you can make the language do that for you? No, I'm not talking about doing some functional programming. Relax, you'll find out soon enough.

Redundancy sucks: If you've coded in languages like Lisp, redundancy would be just unacceptable to you. I actually haven't done much Lisp, but a friend (a Lisp guy) managed to drill enough sense into me to stop typing that shit over and over again, and figure out alternatives!

Learning how to apply K-I-S-S (keep it simple stupid). While applying KISS, we tend to keep it so simple (KISS again) that the application turns out to be complex as it grows! Not in terms of a single method or module, but the application as a whole: complexity in terms of debugging, maintaining etc. I'd buy 500 lines of moderately complex code any day over 1500 lines of simple stuff!

DSLs to the rescue. So what is a domain-specific language? It's a language that provides you with all the infrastructure and functionality required to rapidly build applications for a specific domain, without worrying about the details.

RoR - an amazing MVC framework that exploits every corner of metaprogramming in Ruby and allows you to build simple yet powerful database-driven web applications.

Rake - a tool very similar to the Unix make. Rake (written completely in Ruby) executes "Rakefiles", allowing you to automate stuff like RoR migrations, file generation etc.

Components of a DSL implementation:
  • a base library providing you with all the functionality required for that domain (like ActiveRecord in RoR if you will)
  • a domain-specific instruction set, like:
      "has_one", "has_many", "belongs_to" etc. in Ruby on Rails
      "task", "desc" etc. in Rake

Now that you know what a DSL is, we're going to learn how it's done in Ruby.

Assuming you have a fair amount of Ruby knowledge, I'll walk you through some advanced concepts that we're going to use to spice up our app and make it more intelligent, so that it does all the hard work for us.

A few Ruby concepts
  • Class vs instance methods: Yeah, I know you know this. But what I want to state here is that class bodies themselves can contain code, independent of any method. In RoR models, when you say "has_many", you're actually executing a class method "has_many" that takes the arguments and generates a whole bunch of instance methods for you.
class Products < ActiveRecord::Base
  has_many :vendors
end

The class method has_many generates the instance method vendors, which returns an ActiveRecord collection of all the vendors for that particular product.
  • The define_method class method: define_method comes in handy when you need to dynamically create an instance method in a class. All you have to do is pass it the method name and a block, and your instance method is up and ready.
Example:
class Foo
  define_method :bar do
    puts "I'm from bar"
  end
end
is equivalent to:
class Foo
  def bar
    puts "I'm from bar"
  end
end
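Where define_method really earns its keep is when the method name is only known at run time. Here is a minimal sketch of that idea; the Greeter class and its method names are made up for illustration and are not part of this article's code:

```ruby
# Hypothetical example: Greeter and the say_* names are invented for illustration.
class Greeter
  # One method per word, generated in a loop instead of three hand-written defs.
  [:hello, :goodbye, :thanks].each do |word|
    define_method("say_#{word}") do |name|
      "#{word} #{name}"
    end
  end
end

g = Greeter.new
puts g.say_hello("world")   # prints "hello world"
puts g.say_thanks("Matz")   # prints "thanks Matz"
```

Note that the block passed to define_method can take parameters, and they become the parameters of the generated method.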
Ok you're all set. With a basic knowledge of Ruby and the above concepts, you're ready to know what it takes to build a DSL in Ruby.

A Basic Implementation:
Give me control over my method definitions!
In order to cut down on code, the first thing I want to do is to get control over my methods. Every time a method is called, I want some code executed before and after it. Wow, is that even possible, you say? Let's apply the concepts we learnt in the last section to make this possible:

Instead of using "def" to define methods, I'm going to create a keyword "defm" and give it some meaning. For this I'm going to create a class method defm and add some logic to it. Then I'm going to use defm to create my own methods, and control them!
Here we go:
class MyClass
  def self.defm(method_name, &code)
    define_method method_name do
      puts "Starting method #{method_name}"
      return_value = code.call
      puts "method returned: #{return_value}"
    end
  end

  # Now let's use the defm we just created:
  defm :foo do
    puts "I'm inside foo!"
    true
  end
end

x = MyClass.new
x.foo
Output:
Starting method foo
I'm inside foo!
method returned: true

First I create a class method defm, then I define a method :foo using defm by passing in the method name as a symbol, and a block (body of the method). Executing foo on an instance gives us the desired result.

Note that if having the definition of defm sitting in the class itself bothers you, don't worry, we can always move it to a module. I'll show you how in the next section. In fact there are a whole bunch of methods coming up and we'll have to move them to a module: the class just defines methods using our newly created keywords, and the implementation has to be abstracted away.

Wow! I can now call methods created using our new defm keyword (in DSL terms), execute code before they start, and process their return value (boolean true in this case). You see where we're going? All the logic specific to defm-created methods can now be written in self.defm, and is not repeated with every method definition! Remarkable.

Err.. not really.. there's a problem:

What if I have a bar method, and try to call foo from it:
defm :bar do
  puts "I'm inside bar!"
  foo
  true
end
Trying to call x.bar throws the following error:
NameError: undefined local variable or method `foo' for MyClass:Class

Ruby complains because the block we're passing to defm is executing in the context of the Class in which we created it (so self = MyClass).
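The same behaviour can be seen in isolation. Here is a minimal sketch (the Probe class is made up for illustration) showing that a block keeps the self it was created with, no matter where it is later called from:

```ruby
# Hypothetical class, used only to show which object a block sees as self.
class Probe
  # This block is created in the class body, so it captures self == Probe.
  WHO_AM_I = proc { self }

  def ask
    # Calling the block from an instance method does not change its self.
    WHO_AM_I.call
  end
end

p Probe.new.ask            # prints Probe (the class, not the instance)
p Probe.new.ask == Probe   # prints true
```

This is exactly the situation defm is in: the body block was written inside the class, so a bare foo inside it is looked up on the class.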

Now what?

That's where instance_eval comes in.
instance_eval is required to change the context in which a block is executed. In the above example I'm passing a block to defm, and that block was created with self as the class object MyClass. In order to execute the block with "self" as the instance of MyClass on which the method is called, all I need to do is use self.instance_eval(&code) instead of code.call. The block now executes with self as the instance of MyClass, not the object in which we created it.

change:
return_value = code.call
to:
return_value = self.instance_eval(&code)

x.bar now gives:

Starting method bar
I'm inside bar!
Starting method foo
I'm inside foo!
method returned: true
method returned: true
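Putting the pieces together, here is one way the whole thing might look with defm moved out into a module. This is only a sketch: the module name Defm is my own placeholder, and handing the body's return value back to the caller is a small addition that the listing above doesn't have.

```ruby
# Hypothetical module name; any name would do.
module Defm
  def defm(method_name, &code)
    define_method(method_name) do
      puts "Starting method #{method_name}"
      # Run the body with self set to the instance the method was called on.
      return_value = instance_eval(&code)
      puts "method returned: #{return_value}"
      return_value # addition: pass the body's value back to the caller
    end
  end
end

class MyClass
  extend Defm # makes defm available as a class method of MyClass

  defm :foo do
    puts "I'm inside foo!"
    true
  end

  defm :bar do
    puts "I'm inside bar!"
    foo
  end
end

MyClass.new.bar
```

Using extend (rather than include) is what turns the module's methods into class-level keywords, so the class body stays free of implementation details.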

Wow!
Well done. You have now built a basic working app by applying some metaprogramming, but there are a few problems here:

Problem 1:
If you look in terms of what is stored in memory, every method we define has the logic for "starting method" and "method returned" repeated. Though it's not repeated in the code, it is repeated in every method definition stored in memory.

This is what our "bar" and "foo" actually look like in memory, as you might guess:
def bar
  # code before method starts
  return_value = self.instance_eval(&code)
  # code after method returns
end
def foo
  # code before method starts
  return_value = self.instance_eval(&code)
  # code after method returns
end
Ugh! That's redundant again! (though thankfully only in memory this time) - we'll fix this soon enough.

Problem 2:
What we're passing to defm is a Ruby block. Blocks and Procs aren't really methods; they behave quite differently. Trying to return a value from the middle of one using the return keyword will break it.

Try this:

> func = lambda { |x|
* return 1
> }
> self.instance_eval(&func)
LocalJumpError: unexpected return
from (irb):97
from (irb):99:in `instance_eval'
from (irb):99
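How return behaves inside lambdas has varied between Ruby versions, so the session above may not reproduce exactly on a newer interpreter. The sketch below sticks to behaviour I'm confident about: a plain proc that uses return fails once the method that created it has finished, while next gives a block-friendly early exit. The names body and make_body are made up for illustration.

```ruby
# next ends the block early and supplies the block's value.
body = proc do
  next 1 if true # leaves the block here with the value 1
  2              # never reached
end

puts Object.new.instance_eval(&body) # prints 1

# A plain proc that uses return tries to return from the method that
# created it; once that method has finished, there is nowhere to return to.
def make_body
  proc { return 1 }
end

begin
  make_body.call
rescue LocalJumpError => e
  puts "LocalJumpError: #{e.message}"
end
```

So until defm gets smarter, bodies passed to it would have to use next instead of return for an early exit.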

Problem 3:
We also haven't talked about passing arguments to our newly created methods. Even if we did, methods created using "define_method" start to execute even when passed the wrong number of arguments. Our current implementation isn't smart enough to catch this.

Try this, it will still work:

> class Object
> define_method :x do
* |x|
* puts x
> end
> end
=> #
> x
(irb):57: warning: multiple values for a block parameter (0 for 1)

Ruby only warns you; it doesn't raise an ArgumentError as it normally would with a real method. Blocks, unlike real methods in Ruby, can be passed the wrong number of arguments, which means our methods aren't as robust as real ones.
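One way to get the strictness back is to check the argument count ourselves. This is a rough sketch, not necessarily what Part 2 will do: the Calculator class is made up, and it assumes a Ruby that has instance_exec (1.8.7 or later), which works like instance_eval but also passes arguments to the block.

```ruby
# Hypothetical variant of defm that checks the argument count itself.
class Calculator
  def self.defm(method_name, &code)
    expected = code.arity
    define_method(method_name) do |*args|
      # Only check fixed-arity bodies; a negative arity means the block
      # takes optional or splat arguments.
      if expected >= 0 && args.size != expected
        raise ArgumentError, "wrong number of arguments (#{args.size} for #{expected})"
      end
      # Run the body on this instance, passing the arguments through.
      instance_exec(*args, &code)
    end
  end

  defm(:add) { |a, b| a + b }
end

calc = Calculator.new
puts calc.add(1, 2) # prints 3

begin
  calc.add(1)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}" # prints "ArgumentError: wrong number of arguments (1 for 2)"
end
```

Because the generated method accepts *args, the check happens inside defm's own logic, once, rather than in every method body.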

In the next part (which is coming soon), we're going to address all these issues. We'll also build a smarter defm implementation in a Rails controller to demonstrate how we used this technique to cut down on tonnes of redundant code.

Hope you enjoyed this so far, stay tuned for some more fun!

1 comments:

Brent said...

Metaprogramming can be difficult to fully understand, but explanations like this really help solidify the concepts.

Looking forward to part 2...