Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
I am the chief technical officer at InVision App, Inc - a prototyping and collaboration platform for designers, built by designers. I also rock out in JavaScript and ColdFusion 24x7.

Seven Languages In Seven Weeks: Ruby - Day 3

By Ben Nadel on
Tags: Ruby

In the last day of our Ruby overview in Seven Languages in Seven Weeks, we learned a lot about Classes, Modules, and Mixins. In Ruby, you can have traditional single-class inheritance. However, Ruby also allows for cohesive pieces of functionality (Modules) to be included in and shared by multiple classes. Unlike a CFInclude approach in ColdFusion, these Modules appear to be structured in a way that is geared very specifically for class extension. Oh, and did I mention that Classes are also technically Modules? Yeah, this stuff gets a bit confusing.

I am told that Modules play a big role in Ruby; however, at this time, I don't understand them enough to immediately grasp the advantage of mixin-based class definitions over inheritance-based class definitions.
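To make the mixin idea a bit more concrete, here is a minimal sketch of one Module shared by two otherwise unrelated classes. The names (Greetable, Student, Teacher) are made up for illustration and do not come from the book; the last two lines simply check the "Classes are also Modules" remark.

```ruby
# Hypothetical module: a cohesive piece of behavior that is not a class.
module Greetable
  # Relies on the including class providing a "name" method.
  def greet
    "Hi, I am #{name} (#{self.class})"
  end
end

# Two unrelated classes that both mix in the same module.
class Student
  include Greetable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Teacher
  include Greetable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

puts Student.new("Sarah").greet
puts Teacher.new("Jill").greet

# The module shows up in the ancestor chain of each including class.
puts Student.ancestors.include?(Greetable)

# A class is itself an instance of Class, which descends from Module.
puts Student.is_a?(Module)
```

Neither class inherits from the other; the shared behavior comes entirely from the included module.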

HW1: Modify the CSV application to support an each method to return a CsvRow object. Use method_missing on the CsvRow to return the value for the column for a given heading.

  • # Modify the CSV application to support an each method to return
  • # a CsvRow object. Use method_missing on that CsvRow to return the
  • # value for the column for a given heading.
  •  
  •  
  • # The first thing we're going to do is create a CSV file on disk
  • # that we can then read in for testing.
  •  
  • csvContent = <<END_CONTENT_BUFFER.strip().gsub( /^\t*/m, "" )
  •  
  • id,name,age
  • 2,Sarah,35
  • 17,Jill,29
  • 9,Tricia,31
  •  
  • END_CONTENT_BUFFER
  •  
  • # Create a connection to the relative-path file (relative to the
  • # current directory of execution).
  •  
  • csvFile = File.new( "./data.csv", "w" );
  •  
  • # Write the CSV data to the file.
  •  
  • csvFile.puts( csvContent );
  •  
  • # Since we opened this file as "writing", we need to close it -
  • # we can't use it for reading.
  •  
  • csvFile.close();
  •  
  •  
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  •  
  •  
  • # I parse the given CSV data file.
  •  
  • class CsvParser
  •  
  • def initialize( filePath )
  • # Initialize the private properties.
  •  
  • @filePath = filePath;
  • @headers = [];
  • @rows = [];
  •  
  • # Read in the given file.
  •  
  • readFile( filePath );
  • end
  •  
  •  
  • def readFile( filePath )
  • # Reset the properties.
  •  
  • @filePath = filePath;
  • @headers = [];
  • @rows = [];
  •  
  • # Open a connection to the given file for reading.
  •  
  • csvFile = File.new( @filePath, "r" );
  •  
  • # The first line of the CSV data is assumed to be the header
  • # row. Split it to break the row into column names.
  •  
  • @headers = csvFile.gets().chomp().split( /\s*,\s*/ );
  •  
  • # Loop over the rest of the file, one row at a time,
  • # splitting each row into column values.
  •  
  • csvFile.each{ |rowData|
  •  
  • @rows << rowData.chomp().split( /\s*,\s*/ );
  •  
  • };
  •  
  • # Close the file to avoid any unintentional locking.
  • csvFile.close();
  • end
  •  
  •  
  • def each( &block )
  • # Loop over each of the rows, by index. For each row, we
  • # are going to create a new instance of the CsvRow class
  • # and pass it to the given code Block.
  • #
  • # As we do that, we are going to convert the index-based
  • # column values into a key-based hash.
  •  
  • @rows.each_index{ |rowIndex|
  •  
  • block.call(
  • CsvRow.new(
  • rowIndex,
  • rowToHash( @rows[ rowIndex ] )
  • )
  • );
  •  
  • };
  • end
  •  
  •  
  • def rowToHash( row )
  • rowMap = {};
  •  
  • # Loop over the columns and map the index values to key
  • # based values.
  •  
  • @headers.each_index{ |i|
  • columnName = @headers[ i ];
  • columnValue = row[ i ];
  •  
  • rowMap[ columnName ] = columnValue;
  • }
  •  
  • rowMap;
  • end
  •  
  • end
  •  
  •  
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  •  
  •  
  • # I wrap a row of CSV data, providing access to the data points
  • # as named functions.
  •  
  • class CsvRow
  •  
  • def initialize( rowNumber, rowData )
  • @rowNumber = rowNumber;
  • @rowData = rowData;
  • end
  •  
  •  
  • def method_missing( propertyName, *args )
  •  
  • # Check to see if the given property name is a valid name
  • # for the row data. When doing this, we have to convert the
  • # incoming Symbol to a string to test for hash membership.
  •  
  • if (!@rowData.member?( propertyName.to_s() ))
  •  
  • # The user has requested a property that doesn't exist.
  • # Raise a "NoMethodError" exception so that our use of
  • # method_missing doesn't corrupt general work flow.
  •  
  • raise(
  • NoMethodError.new(
  • "undefined method '#{propertyName}' for #{inspect}:#{self.class}"
  • )
  • );
  •  
  • end
  •  
  • # If we have made it this far then the user has requested a
  • # valid column value. Return it (last expression in this
  • # function is what gets returned).
  •  
  • @rowData[ propertyName.to_s() ];
  • end
  •  
  • end
  •  
  •  
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  • # ------------------------------------------------------------ #
  •  
  •  
  • # Create a new CSV parser for the given file. This parser assumes
  • # that the fields are delimited by commas and do NOT have any
  • # embedded special characters.
  •  
  • parser = CsvParser.new( "./data.csv" );
  •  
  •  
  • # Loop over each row and output information based on the values.
  • # In each iteration, "row" is an instance of the CsvRow class that
  • # contains the data for the given row.
  •  
  • parser.each{ |row|
  •  
  • puts( "Dang, #{row.name} looks wicked sexy at age #{row.age}!" );
  •  
  • };

In the book, Tate used a Module-based approach to create an empty CSV class that extended another CSV class without inheritance. Since I don't fully understand Modules, however, I chose to just create a single CSV class, skipping any type of extension for the time being.

When I run the above code, writing a CSV file, reading it in, and then iterating over it, I get the following console output:

Dang, Sarah looks wicked sexy at age 35!
Dang, Jill looks wicked sexy at age 29!
Dang, Tricia looks wicked sexy at age 31!
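One possible refinement to the method_missing approach: Ruby also offers a respond_to_missing? hook, so that respond_to? agrees with what method_missing will actually handle. The following is only a sketch, assuming a Ruby version that supports respond_to_missing?; the class name CsvRowSketch is hypothetical and it delegates unknown names to super rather than building the NoMethodError by hand.

```ruby
# Hypothetical variant of CsvRow, for illustration only.
class CsvRowSketch
  def initialize(row_data)
    @row_data = row_data
  end

  # Return the column value when the message name matches a header;
  # otherwise fall through to the default behavior (NoMethodError).
  def method_missing(name, *args)
    key = name.to_s
    return @row_data[key] if @row_data.key?(key)

    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @row_data.key?(name.to_s) || super
  end
end

row = CsvRowSketch.new("name" => "Sarah", "age" => "35")

puts row.name
puts row.respond_to?(:age)
puts row.respond_to?(:height)

begin
  row.height
rescue NoMethodError
  puts "height is not a column"
end
```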

And so, with Day 3 passed, my journey through Ruby is over. It definitely seems like a very cool language with a tremendous amount of functionality. I think the hardest thing for me to get used to is going to be the syntax - there is, at the same time, too much flexibility and yet, perhaps not enough? For example, both of these are valid:

  • puts "hello"
  • puts( "hello" );

Very flexible, right? However, these:

  • each{ ...code block ... }
  • each({ ... code block ... });

... are not both valid. Only the first one is. Why is it that simple values can be passed with optional parentheses but a lambda function cannot be passed with parentheses (even though it does become a function argument)? I think the problem is that the latter gets confused with a Hash (i.e. struct) and the Ruby interpreter cannot figure out which is which?
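That guess can be checked with a small experiment. In the sketch below, describe is a made-up helper that reports what it received: braces inside the parentheses are parsed as a Hash literal, while braces after the call are parsed as a code block.

```ruby
# Hypothetical helper: reports whether it was given a block or a value.
def describe(arg = nil, &block)
  return "got a block" if block

  "got a #{arg.class}"
end

# Braces directly after the method name: a code block.
with_block = describe { |x| x }

# Braces inside the parentheses: a Hash literal passed as an argument.
with_hash = describe({})

# Empty parentheses followed by braces: still a code block.
with_parens_and_block = describe() { |x| x }

puts with_block
puts with_hash
puts with_parens_and_block
```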

In general, I found Ruby's implementation of anonymous, lexically-bound functions to be rather confusing. Specifically, there is an odd separation between code blocks and procedures (or Proc objects). It seems that code blocks have to be the last argument passed to a function and that they have to be passed after the actual function call. To see what I mean, take a look at this iteration example:

  • def arrayEach( values, &block )
  •  
  • values.each( &block );
  •  
  • end
  •  
  • arrayEach( [ 1, 2, 3 ] ) { |i| puts( i ); } ;

Here, we are defining a method that simply loops over an array using the given code block as the callback. You'll notice that my code block is defined after my call to arrayEach() - look at the parentheses. In the arrayEach() parameter list, that code block then shows up as a Proc object.

A Proc object is a wrapper for a code block. Code blocks are not objects - they are syntactic sugar. By wrapping a code block in a Proc object, it allows you to pass around the code block as you would pass around any variable. When you see the "&" unary operator, you are dealing with something that converts code blocks to and from Proc objects.

In our case, the "&" unary operator in the parameter list tells Ruby to take the incoming code block and implicitly wrap it in a Proc instance. If you did not include the "&" in the argument list, the Ruby interpreter would have raised the following exception:

in "arrayEach": wrong number of arguments (1 for 2) (ArgumentError)

Since code blocks cannot be assigned to variables, Ruby thinks you are missing a parameter.

Once our "&block" argument has been created, we then need to pass it off to the each() function on the array. However, since the each() function expects a code block, we can't simply pass in our "block" argument. Rather, we need to again use the unary operator (&) to convert the Proc object back into a code block.

When we run this code, we get the following console output:

1
2
3
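The round trip through the "&" operator can be made visible with a short sketch. The method names here (block_class, forward_each) are hypothetical; the point is only that a block captured with "&" arrives as a Proc (or nil when no block is given), and that "&" at the call site hands a Proc back to a method that expects a block.

```ruby
# Capture the incoming block and report what kind of object it became.
def block_class(&block)
  block.class
end

# Capture the block as a Proc, then hand it to each() as a block again.
def forward_each(values, &block)
  values.each(&block)
end

with_block = block_class { |i| i }
without_block = block_class

collected = []
forward_each([1, 2, 3]) { |i| collected << i * 10 }

# A Proc stored in a variable can go through the same path.
appender = lambda { |i| collected << i }
forward_each([4], &appender)

puts with_block
puts without_block
puts collected.inspect
```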

Taking this kind of understanding, we now know that these two lines of code are equivalent:

  • each{ ...code block ... }
  • each(){ ... code block ... };

Now, what if we wanted to pass in multiple code blocks to a function? Going back to our iterator context, what if we wanted to use one callback for even indices and one callback for odd indices? Well, due to the code block implementation, things get a bit hairy. At this point, I don't believe we can use code blocks; rather, we have to explicitly wrap code blocks with either Proc objects or lambda operators and then pass those around:

  • def arrayEach( values, evenCallback, oddCallback )
  •  
  • values.each_index{ |index|
  •  
  • if (index.even?())
  •  
  • evenCallback.call( values[ index ] );
  •  
  • else
  •  
  • oddCallback.call( values[ index ] );
  •  
  • end
  •  
  • };
  •  
  • end
  •  
  •  
  • arrayEach(
  • [ "Sarah", "Jill", "Katie" ],
  • lambda{ |value|
  • puts( "#{value} is at an even index" );
  • },
  • lambda{ |value|
  • puts( "#{value} is at an odd index" );
  • }
  • );

As you can see, in order to pass in multiple callbacks, we have to manually wrap our code blocks. In this case, I am using the lambda operator. Notice that because we are wrapping the code blocks manually, our arguments no longer need the "&" unary operator. And, rather than passing a callback to the each() method, we are using the call() method to explicitly invoke the given callbacks.

When we run this code, we get the following console output:

Sarah is at an even index
Jill is at an odd index
Katie is at an even index

In this case, I used the lambda operator. However, I could have also used the Proc.new() method to wrap the code blocks in a procedure object. These work mostly the same but appear to differ in the way that explicit "return" statements are handled... this makes my head hurt :)
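Here is a small sketch of the difference, using made-up method names. As I understand it, a "return" inside a lambda only exits the lambda, while a "return" inside a Proc.new block exits the method in which the block was defined. The two also differ in how strictly they check argument counts.

```ruby
# "return" inside a lambda returns from the lambda only.
def lambda_return
  callback = lambda { return :from_lambda }
  callback.call
  :after_lambda
end

# "return" inside a Proc returns from the enclosing method.
def proc_return
  callback = Proc.new { return :from_proc }
  callback.call
  :after_proc # never reached
end

puts lambda_return.inspect
puts proc_return.inspect

# A Proc fills in missing arguments with nil; a lambda raises.
loose = Proc.new { |a, b| [a, b] }
strict = lambda { |a, b| [a, b] }

puts loose.call(1).inspect

begin
  strict.call(1)
rescue ArgumentError => e
  puts "lambda complained: #{e.message}"
end
```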

I am sure that I am making a big deal about something insignificant. I would hazard a guess that once you really get into Ruby programming, these differences become second nature. Coming from a JavaScript world where Function expressions and definitions are all treated the same, it is just not readily apparent to me why such differences would add value to a language.

Well, that's it for Ruby for the time being. That's kind of sad - I want to keep digging.

Next language: Io.




Reader Comments

How do you divide chatter and information overload from value? I hear the same type of questions with Application frameworks. Isn't it true that there are good solutions and each has a strength? When are we so overloaded with details that we never master any platform?

Regarding passing closures in parentheses, Ben, Groovy supports this. Consider this code:

def testFunc(int a, Closure c1, String s, Closure c2)
{
println "a = ${a}"
println "s = ${s}"
println "Executing closure c1:"
c1()

println "Executing closure c2:"
c2()
}

testFunc 1, { println "In closure #1-1" }, "Hi #1", { println "In closure #1-2" }
testFunc(1, { println "In closure #2-1" }, "Hi #2", { println "In closure #2-2" })
testFunc(1, { println "In closure #3-1" }, "Hi #2") { println "In closure #3-2" }

Notice all three versions are legal.

Ben,

The advantage of mixins over inheritance is that mixins get around the problem of not having multiple inheritance. Take the classic case of a Teacher class and a Student class. Now, we want a StudentTeacher. We want it to inherit from both, but single inheritance prohibits this. Mixins allow us to "mix in" traits from both Student and Teacher. (BTW, you can do mixins with ColdFusion as well.)

Glad you enjoyed Ruby -- it's a fun language to work with.

@John,

These are good questions that I don't have answers to. In the Seven Languages book, the author definitely states that every language has its shortcomings, but that each of them is excellent at solving specific problems. I think this is a good point, in theory; but at the practical level, I don't think we, as programmers, are about to start switching in and out of mastered languages in order to use the best tool for the job. I think the people that can do that effectively are very few and even farther between.

@Adam,

Ah, cool stuff. I've played a bit with Groovy, but only in the context of Barney's CFGroovy projects. It also seems to have a lot of cool features, not the least of which is the seamless integration with Java libraries.

@Hal,

That makes sense. I have also played with this concept in ColdFusion, but only in testing the limits of what the language can actually do. I'm not OO-enough to actually have used the mixin concept in my day-to-day programming.... one day :)

When someone brings up "code commenting", a flash of Ben Nadel comes to my head. It's so awesome to read your posts, and by extension your code. Wish more bloggers/developers out there took your style...

@Anthony,

Thanks my man - it's very nice to hear that my esoteric style is appreciated by some people :D It is, however, unfortunate that my blog post formatting has no idea how to handle color coding for any language other than ColdFusion and HTML. It definitely makes for a bit harder reading.

Ben,

There is a lot of confusion out there about passing blocks in Ruby, and what the ampersand does. Yehuda Katz's blog entry cleared it up for me completely, though:
http://yehudakatz.com/2010/02/25/rubys-implementation-does-not-define-its-semantics/

Basically, in Ruby, there is a special argument "slot", called "the block slot," in all methods. The ampersand does nothing more than signify use of the block slot. Ruby has special syntax sugars that work with the block slot, but you can still use the block slot without having to use these syntax sugars if you don't want to.

As you already know, you can wrap up a block of code into a Proc object (an object whose class is Proc) and stick it into a local variable, instance variable, etc. You can also pass it around into methods and return it as the result of methods, just like you can pass any other values (3, "hello", an array or hash, an object, etc) into methods or return any other values out of methods. They are closures, just like you've got in JavaScript, and you can treat them the same way in Ruby as you treat them in JavaScript.

  • callback = lambda do |i|
  • puts "The answer is: #{2 * i * i}!"
  • end
  • callback.call(6)
  •  
  • def run_my_proc(cb, arg)
  • cb.call(arg)
  • end
  • run_my_proc(callback, 6)

[Call Site]

When you call a method, there are two ways of passing an argument into the block slot.

1. Implicit notation. You can use do- or brace-notation to pass a block of code wrapped up in a proc.

  • (1..10).each do |i|
  • puts i
  • end
  • (1..10).each { |i| puts i }

However, this way is simply syntax sugar for the second way.

2. Explicit notation using the ampersand. You can pass any Proc object you have stored away in a variable. Objects passed into the block slot using the explicit ampersand notation must appear last in the list of arguments.

  • callback = lambda { |i| puts i }
  • (1..10).each(&callback)

You can also pass any object which responds to "to_proc" (that is, which has a method named "to_proc"). Ruby will make sure that the method only sees a Proc, rather than what you explicitly pass in.

  • callback = Object.new
  • def callback.to_proc
  • lambda { |i| puts i }
  • end
  • (1..10).each(&callback)

In particular, symbols respond to "to_proc".

  • class Person
  • def initialize(name)
  • @name = name
  • end
  • def yell!
  • puts "HELLO THERE, MY NAME IS #{@name.upcase}!"
  • end
  • end
  •  
  • people = [Person.new('Bob'), Person.new('Sally')]
  • people.each(&:yell!)

[Method Definition]

When you define a method, there are two ways of using arguments that have been passed into the method in the block slot.

1. Implicit notation. Using this notation, you can check if a proc was passed into the block slot using "block_given?" and you can call the block using "yield":

  • def loudly(arg1, arg2)
  • unless block_given?
  • puts "loudly doing nothing"
  • else
  • puts "starting loudly"
  • yield(arg2, arg1)
  • puts "ending loudly"
  • end
  • end
  •  
  • loudly(3, 4) do |a, b|
  • puts(a * b)
  • end

2. Explicit notation using the ampersand. Using this notation, you can get the proc passed into the block slot as an argument / local variable. Block-slot arguments gotten using this notation must appear last in the list of formal parameters.

  • def loudly(arg1, arg2, &something_to_do)
  • unless something_to_do #i.e., if it's nil
  • puts "loudly doing nothing"
  • else
  • puts "starting loudly"
  • something_to_do.call(arg2, arg1)
  • puts "ending loudly"
  • end
  • end
  •  
  • loudly(3, 4) do |a, b|
  • puts(a * b)
  • end

Note that you can still use "block_given?" and "yield", even if you use the explicit ampersand notation for naming the block-slot argument.
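A minimal sketch of that last point, using a hypothetical method name (loudly_mixed): the block is named in the signature with the ampersand, yet the body still uses "block_given?" and "yield".

```ruby
# The signature names the block, but the body uses the implicit forms.
def loudly_mixed(arg1, arg2, &something_to_do)
  return "loudly doing nothing" unless block_given?

  # yield invokes the same block that something_to_do refers to.
  yield(arg2, arg1)
end

product = loudly_mixed(3, 4) { |a, b| a * b }
nothing = loudly_mixed(3, 4)

puts product
puts nothing
```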

[Summary]

It's just a special slot in the arguments. You can use this with an implicit notation, for which there is nice syntax sugar. Or you can use this with an explicit notation, and you're back to the regular rules of the language.

Cheers!

@Justice,

Thanks for the link to Yehuda's blog post. I'll be sure to check that out.

Awesome write up, thanks! I'm slowly getting the mental models sorted out. I love the fact that these types of code assets (blocks, Procs, lambdas) work like closures, which is a HUGE part of why I love JavaScript so much.

When I first read about the "yield" approach to using code blocks, I found it utterly confusing. Now that I see how it works, I could probably use it; however, the thing that I don't like about it is that the method signature does not provide for any insight into the fact that the method is expecting a code block - you'd have to read through the method body to see that. That's why I'd err on using the explicit block slot or a Proc/lambda approach to method passing.

As far as implicit passing goes, I prefer the {..} approach to the do..end approach; but that's probably because I'm more familiar with bracket-level coding. I don't think I've used an END statement since VB? It's been so long I can't even remember what languages use them.

I can't say that I have any idea what this means:

  • people.each(&:yell!)

You said something about & working with "symbols". The idea of values vs. symbols is not apparent in my mind yet. I read about how one goes to a look-up table and one creates a new value... but I am not sure I actually understand what that means.

Does your line of code call the yell! method reference on each instance of the Person?

Ben,

You can actually copy the code samples in my post above and paste them directly into the Ruby interpreter (irb) to see what they do.

The code:

  • people.each(&:yell!)

is generally equivalent to:

  • people.each do |p|
  • p.yell!
  • end

Keep in mind that in Ruby, when you "call a method" on an object, what that means to Ruby is that you are "sending a message" to the object, and then the object can decide just how it would like to respond to that message. The feature in ColdFusion where you can attach a method to a component object at runtime, and the feature where you can define an onMissingMethod method as a fallback, are based on similar features in Ruby and are based on exactly the same principle: you tell the object what the message is, and the object can respond to that message however it likes. So in Ruby, you will typically want to speak about "sending an object a message :yell!" rather than "calling the :yell! method reference on an object." While you can get method references in Ruby, it's much more common to send objects messages.

  • p = Object.new
  • # specify how p should respond to :yell!
  • def p.yell!
  • puts 'THIS IS A SPECIAL YELL!'
  • end
  • # send :yell! to p
  • p.yell!
  • p.send :yell!
  • [p].each(&:yell!)

A symbol is like a string. A symbol is immutable, while a string is mutable and can be altered at runtime. You can send :to_sym to any string ("hello".to_sym) to retrieve a corresponding symbol and you can send :to_s to any symbol (:hello.to_s) to retrieve a corresponding string. Strings and symbols are both sequences of bytes stored in memory, but in Ruby 1.9 strings also have an encoding associated with them (ASCII, ISO-8859-1, UTF-8, UTF-16, etc). If you have a, b = :hello, :hello then a and b refer to the same location in memory.
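The identity point can be checked in irb with something like the sketch below. It uses equal? (object identity) rather than == (value equality), and calls dup on the string literals so that each one is guaranteed to be a separate, mutable object.

```ruby
# Two occurrences of the same symbol are the very same object.
a, b = :hello, :hello
puts a.equal?(b)

# Two strings with the same characters are equal in value,
# but they are distinct objects in memory.
s1 = "hello".dup
s2 = "hello".dup
puts s1 == s2
puts s1.equal?(s2)

# Converting back and forth.
puts "hello".to_sym.equal?(:hello)
puts :hello.to_s == "hello"

# Strings can be altered in place; symbols have no such method.
s1 << " world"
puts s1
puts :hello.respond_to?(:<<)
```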

Cheers!

@Justice,

Ok cool - I figured it was doing as much; I had just not seen the "&:" combination as an operator before. There really is a ton of syntactic sugar in the language, isn't there?

Speaking of "message passing", that is definitely something that I really like as a concept. In fact, I sort of wish that was the way I had originally learned to think about methods. I remember when onMissingMethod() was introduced to ColdFusion, it made no sense to me. It wasn't until Sean Corfield told me to think about it more as "passing messages" than as invoking a method that it started to make sense. And I'll tell you, that really made something click in my head.

And, when dealing with Io (the next language in the book), they really deal with message passing in a super flexible way. In the Io language, messages all have delayed execution. So, you can define a message as, say, a method argument, and then not actually execute it until inside the method that you passed it to. I'm sure I haven't fully wrapped my head around it yet, but it seems to be an extremely powerful way to think about objects interacting.
