Ruby Symbol#to_proc is a lambadass
Aug 12, 2013
It all started with the addition of the to_proc method to the Symbol class, which is admittedly easy to use and looks great. Instead of writing
irb > [Object, Kernel, Class].map {|cls| cls.name }
=> ["Object", "Kernel", "Class"]
it can be written as
irb > [Object, Kernel, Class].map(&:name)
=> ["Object", "Kernel", "Class"]
What it does is simple: the & operator calls the to_proc method on the symbol :name (which returns a Proc) and then passes that proc to map as its block (because map takes a block, not a proc argument).
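The & conversion is not specific to symbols. As a minimal sketch (the Doubler class is hypothetical and exists only for this illustration), any object that responds to to_proc can be passed the same way:

```ruby
# Hypothetical class, not part of Ruby: anything that responds to
# to_proc can be handed to a method with the & operator.
class Doubler
  def to_proc
    Proc.new { |n| n * 2 }
  end
end

doubled = [1, 2, 3].map(&Doubler.new)

classes  = [Object, Kernel, Class]
shortcut = classes.map(&:name)          # & calls :name.to_proc for us
explicit = classes.map(&:name.to_proc)  # the same thing spelled out
```

Both shortcut and explicit should hold the same list of class names, since the second form only makes the implicit to_proc call visible.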
A naive implementation of Symbol#to_proc could look like this:
class Symbol
  def to_proc
    Proc.new {|obj, *args| obj.send(self, *args) }
  end
end
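After fixing two edge cases, array elements (a proc with |obj, *args| parameters unpacks a single array argument, so obj would be the array's first element rather than the array itself) and a possibly monkey-patched send method, you might end up with something like this: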
class Symbol
  def to_proc
    Proc.new {|*args| args.shift.__send__(self, *args) }
  end
end
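To see why the array case matters, here is a small sketch that compares the two bodies without reopening Symbol. The names naive and fixed are hypothetical helpers that take the symbol as an argument, so the built-in Symbol#to_proc stays untouched:

```ruby
# Hypothetical helpers: each takes the symbol as an argument instead of
# reopening Symbol, so the built-in Symbol#to_proc is left alone.
naive = lambda { |sym| Proc.new { |obj, *args| obj.send(sym, *args) } }
fixed = lambda { |sym| Proc.new { |*args| args.shift.__send__(sym, *args) } }

pairs = [[1, 2], [3, 4]]

# |obj, *args| unpacks [1, 2] into obj = 1 and args = [2],
# so the naive proc ends up calling 1.first(2).
naive_outcome =
  begin
    pairs.map(&naive.call(:first))
  rescue NoMethodError => error
    error.class
  end

# |*args| keeps the array whole: args = [[1, 2]], and shift returns [1, 2].
fixed_outcome = pairs.map(&fixed.call(:first))
```

The naive version fails with a NoMethodError because the receiver becomes the integer 1, while the fixed version returns the first element of each pair.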
It’s a well-understood concept with many descriptions available on the Internet. To understand why Symbol#to_proc is a lambadass (and what that means), let’s move on to the different kinds of Ruby Procs.
Different kinds of Ruby Procs
As you may know, there are two kinds of Ruby Procs: procs and lambdas. They differ not only in how strictly they check arity and in what return does inside them, but also in their appearance when run in irb:
irb > Proc.new {}
=> #<Proc:0x007f944a090c50@(irb):1>
irb > lambda {}
=> #<Proc:0x007f944a08ac38@(irb):2 (lambda)>
You can see that the (lambda) suffix is displayed for lambdas only, and that both carry something like a context: (irb):2. However, it turns out that there is a third kind of proc, which I call the lambadass. But first, let’s talk about lambda scope, or context.
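As a quick aside on the arity difference mentioned above, here is a minimal sketch (the variable names are only illustrative) that can be run outside irb:

```ruby
plain  = Proc.new { |a, b| [a, b] }
strict = lambda { |a, b| [a, b] }

too_few  = plain.call(1)        # a proc fills missing arguments with nil
too_many = plain.call(1, 2, 3)  # and silently drops extra ones

# A lambda checks its arity like a method does.
strict_error =
  begin
    strict.call(1)
  rescue ArgumentError => error
    error.class
  end

kinds = [plain.lambda?, strict.lambda?]
```

Proc#lambda? is the programmatic counterpart of the (lambda) suffix in the inspect output.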
Scope
Procs and lambdas (both instances of the Proc class) are closures, just like blocks. The only important thing for us to note is that they are evaluated in the scope where they are defined or created.
This means that any block (or proc or lambda) carries a set of bindings (local variables, instance variables, etc.) captured at the moment it was defined. A simple example demonstrating this in action:
irb > x = 1
irb > def z(x)
irb > lambda { x }
irb > end
irb > lambda { x }.call
=> 1
irb > z(2).call
=> 2
As any method definition is a new scope gate, the only known binding with the name x inside the method z is the method parameter. To see which context a lambda captures, think of the {} as the constructor (rather than the lambda method call in front of it).
It also means that the lambda defined inside the method body knows nothing about any bindings defined outside of the method scope:
irb > z = 1
irb > def x
irb > lambda { z }
irb > end
irb > x.call
NameError: undefined local variable or method `z' for main:Object
Even though the lambda was called from the top-level scope, it was defined inside the method, where a binding with the name z did not exist. Once the scope or context is captured, it remains the same inside the block no matter where it’s called from.
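One consequence worth sketching: a closure holds on to the variables of its defining scope, not to a snapshot of their values. In this illustrative example (make_counter is a hypothetical method), two lambdas created in the same method call share one local variable:

```ruby
def make_counter
  count = 0
  increment = lambda { count += 1 }
  current   = lambda { count }
  [increment, current]
end

increment, current = make_counter
increment.call
increment.call
after_two = current.call

# A second call to make_counter creates a fresh scope with its own count.
_, other_current = make_counter
other_start = other_current.call
```

Both lambdas keep working on the same count long after make_counter has returned, while a second call to the method starts from a separate scope.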
Lambadass
A lambadass is a proc or a lambda that looks like a standard proc or lambda but behaves differently.
So let’s get back to Symbol#to_proc. Usually, it is used in place of a block in methods such as map.
Since the to_proc method returns a proc, can we use it standalone like any other proc? Let’s do just that:
irb > lambda &:name
=> #<Proc:0x007fcfa305dca8>
irb > (lambda &:name).call Class
=> "Class"
And lo and behold, it works just as expected! So it looks like it should be identical to
irb > lambda {|x| x.name }
=> #<Proc:0x007fcfa30465a8@(irb):8 (lambda)>
irb > lambda {|x| x.name }.call Class
=> "Class"
Cool!
But wait a second… Why does the returned Proc object look different: #<Proc:0x007fcfa305dca8> instead of something like #<Proc:0x007f944a090c50@(irb):1>? It’s still a Proc and a callable object, but it’s missing something. Looking at the object representation, I would say it’s missing the context. How can we verify this?
Binding
The bindings captured from the scope where a block is defined are all stored in a Binding object. We can get it by calling the Proc#binding method:
irb > lambda {}.binding
=> #<Binding:0x007fcfa30363b0>
One thing we can do with the Binding object is evaluate any binding captured by a block:
irb > x = 1
irb > eval('x', lambda {}.binding)
=> 1
or
irb > x = 1
irb > lambda {}.binding.eval 'x'
=> 1
It will raise an exception if a binding with such a name is not defined, but every block (or proc) has an associated binding object.
irb > lambda {}.binding
=> #<Binding:0x007fcfa40ab808>
irb > lambda {}.binding.eval 'y'
NameError: undefined local variable or method `y' for main:Object
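Besides eval, Rubies newer than the one used in this post (2.1 and later) let you query a Binding directly. A minimal sketch, where capture_binding and secret are hypothetical names chosen for the example:

```ruby
def capture_binding
  secret = 42
  lambda { secret }.binding
end

captured = capture_binding

via_eval    = captured.eval('secret + 1')
via_lookup  = captured.local_variable_get(:secret)
has_secret  = captured.local_variable_defined?(:secret)
has_missing = captured.local_variable_defined?(:missing)
```

Checking local_variable_defined? first avoids the NameError shown above when a name is not part of the captured scope.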
Meet the Lambadass
Now let’s try to get the binding object of a proc created using Symbol#to_proc:
irb > (lambda &:name).binding
ArgumentError: Can't create Binding from C level Proc
Well, something is off. It turns out that the Symbol#to_proc method is implemented in C in MRI (Matz’s Ruby Interpreter, which is written in C). There is no Ruby-level scope behind a C-level Proc object, so MRI refuses to create a Binding for it (it would be nice to have, though).
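If you want to probe this from a script rather than irb, a small helper can report whether a given Proc hands out a Binding. This is only a sketch: binding_status is a hypothetical helper, and the result for the symbol proc depends on the interpreter you run it on.

```ruby
# Hypothetical helper: reports whether a Proc can hand out a Binding.
def binding_status(callable)
  callable.binding
  :has_binding
rescue ArgumentError
  :no_binding
end

block_status  = binding_status(lambda {})
symbol_status = binding_status(:name.to_proc)

# Either way, the symbol proc remains callable.
still_callable = :name.to_proc.call(Class)
```

On MRI, given the ArgumentError above, symbol_status should come out as :no_binding, while a proc built from a literal block always reports :has_binding.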
What about other interpreters?
Rubinius
rubinius > x = 1
rubinius > lambda {}.binding.eval 'x'
=> 1
rubinius > (lambda &:name).binding.eval 'x'
NameError: undefined local variable or method `x' on name:Symbol.
We’ve got an exception again. This time it says that a binding with the name x is not defined. As Rubinius (at least its Symbol#to_proc) is written in Ruby itself, let’s examine the implementation:
def to_proc
  sym = self
  Proc.new do |*args, &b|
    raise ArgumentError, "no receiver given" if args.empty?
    args.shift.__send__(sym, *args, &b)
  end
end
The implementation bears a striking resemblance to our initial definition. So what’s the problem then? Let’s take a look at the error message again:
NameError: undefined local variable or method `x' on name:Symbol.
Of course there is no variable x on the :name symbol! The key to understanding this is that lambda {} is defined right here, where the {} are, whereas the proc behind lambda &:name is defined inside the Symbol class, in the to_proc method, which knows nothing about any bindings defined outside. As a callable object it behaves correctly, but its scope is completely different.
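The same effect can be reproduced in plain Ruby on any interpreter. In this sketch, ProcFactory is a hypothetical class standing in for Symbol: the proc is created inside one of its methods, so that method's scope is the one it captures. Binding#receiver is assumed to be available (Ruby 2.2 and later).

```ruby
# Hypothetical class standing in for Symbol: the proc is created inside
# a method, so that method's scope is what it captures.
class ProcFactory
  def build
    inside = :factory_local
    Proc.new { inside }
  end
end

outside = :caller_local
factory = ProcFactory.new
scope   = factory.build.binding

sees_inside      = scope.local_variable_defined?(:inside)
sees_outside     = scope.local_variable_defined?(:outside)
owner_is_factory = scope.receiver.equal?(factory)
```

The binding knows the local variable from build and has the factory as its self, but it has no idea about outside, just as the Rubinius binding has :name as its self and no x.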
To make sense of it, let’s take a look at the Binding object:
rubinius > lambda {}.binding
=> #<Binding:0x179c @variables=#<Rubinius::VariableScope:0x17a0 module=Object method=#<Rubinius::CompiledCode irb_binding file=(irb)>> @compiled_code=#<Rubinius::CompiledCode __block__ file=(irb)> @proc_environment=#<Rubinius::BlockEnvironment:0x17a4 scope=#<Rubinius::VariableScope:0x17a0 module=Object method=#<Rubinius::CompiledCode irb_binding file=(irb)>> top_scope=#<Rubinius::VariableScope:0x1484 module=Object method=#<Rubinius::CompiledCode irb_binding file=.../rubinius/lib/19/irb/workspace.rb>> module=Object compiled_code=#<Rubinius::CompiledCode __block__ file=(irb)> constant_scope=#<Rubinius::ConstantScope:0x17a8 parent=nil module=Object>> @constant_scope=#<Rubinius::ConstantScope:0x17a8 parent=nil module=Object> @self=main>
rubinius > (lambda &:name).binding
=> #<Binding:0x17e0 @variables=#<Rubinius::VariableScope:0x17e4 module=Symbol method=#<Rubinius::CompiledCode to_proc file=kernel/common/symbol19.rb>> @compiled_code=#<Rubinius::CompiledCode to_proc file=kernel/common/symbol19.rb> @proc_environment=#<Rubinius::BlockEnvironment:0x17e8 scope=#<Rubinius::VariableScope:0x17e4 module=Symbol method=#<Rubinius::CompiledCode to_proc file=kernel/common/symbol19.rb>> top_scope=#<Rubinius::VariableScope:0x17e4 module=Symbol method=#<Rubinius::CompiledCode to_proc file=kernel/common/symbol19.rb>> module=Symbol compiled_code=#<Rubinius::CompiledCode to_proc file=kernel/common/symbol19.rb> constant_scope=#<Rubinius::ConstantScope:0x14cc parent=#<Rubinius::ConstantScope:0x14d0 parent=nil module=Object> module=Symbol>> @constant_scope=#<Rubinius::ConstantScope:0x14cc parent=#<Rubinius::ConstantScope:0x14d0 parent=nil module=Object> module=Symbol> @self=:name>
In the first case, the module is Object and the compiled code is a block in irb. In the second output, the module is Symbol and the compiled code is the to_proc method in the kernel/common/symbol19.rb file.
Of course, if you wrap lambda &:name in another lambda, the scope of the outer lambda will be Object, because it is no longer defined in Symbol. The scope of the inner lambda, however, remains unchanged:
rubinius > (lambda &:name).binding
=> #<Binding:0x17e0 ... module=Symbol ...
rubinius > lambda { lambda &:name }.binding
=> #<Binding:0x1854 ... module=Object ...
rubinius > lambda { lambda &:name }.call.binding
=> #<Binding:0x1898 ... module=Symbol ...
JRuby
jruby > x = 1
jruby > lambda {}.binding.eval 'x'
=> 1
jruby > (lambda &:name).binding.eval 'x'
=> 1
This is the result that almost everyone I asked would expect: no errors, and it works identically. But if you recall the previously defined to_proc method, how scope is defined in Ruby, and the Rubinius implementation, this behavior should be considered wrong, even if it seems to be the only one that works without any big surprises.
Epilogue
A proc can sometimes be a lambda: the same class of object, with different syntax and behavior from a plain proc, but consistent across interpreters. However, with the current implementation of Symbol#to_proc, we have a third behavior of proc that differs across interpreters. I call it lambadass 🕶.