December 2007

December 31, 2007

Pipelines Using Fibers in Ruby 1.9

Users of the command line are familiar with the idea of building pipelines: a chain of simple commands strung together so that the output of one becomes the input of the next. Using pipelines and a basic set of primitives, shell users can accomplish some sophisticated tasks. Here's a basic Unix shell pipeline that reports the ten longest .tip files in the current directory, based on the number of lines in each file:

 wc -l *.tip | grep \.tip | sort -n | tail -10

Let's see how to add something similar to Ruby. By the end of this set of two articles, we'll be able to write things like

puts (even_numbers | tripler | incrementer | multiple_of_five ).resume

and a palindrome finder using blocks:

words            = Pump.new %w{Madam, the civic radar rotator is not level.}
is_palindrome = Filter.new {|word| word == word.reverse}

pipeline = words .| {|word| word.downcase.tr("^a-z", '') } .| is_palindrome

while word = pipeline.resume
  puts word
end

Great code? Nope. But getting there is fun. And, who knows? The techniques might well be useful in your next project.

A Daily Dose of Fiber

Ruby 1.9 adds support for Fibers. At their most basic, they let you create simple generators (much as you could do previously with blocks). Here's a trivial example: a fiber that generates successive Fibonacci numbers:

      fib = Fiber.new do
        f1 = f2 = 1
        loop do
          Fiber.yield f1
          f1, f2 = f2, f1 + f2
        end
      end

      10.times { puts fib.resume }

A fiber is somewhat like a thread, except you have control over when it gets scheduled. Initially, a fiber is suspended. When you resume it, it runs the block until the block finishes, or it hits a Fiber.yield. This is similar to a regular block yield: it suspends the fiber and passes control back to the resume. Any value passed to Fiber.yield becomes the value returned by resume.
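
To make that resume/yield handshake concrete, here's a minimal sketch. The names stepper and results are just for illustration. It shows each value passed to Fiber.yield coming back from resume, and the block's final value coming back from the last resume.

```ruby
results = []

stepper = Fiber.new do
  Fiber.yield :first    # suspends here; the first resume returns :first
  Fiber.yield :second   # suspends again; the second resume returns :second
  :done                 # the block's final value is returned by the last resume
end

# Nothing inside the block runs until the first resume.
results << stepper.resume
results << stepper.resume
results << stepper.resume

p results
```

Once the block has finished, the fiber is dead, and resuming it again raises a FiberError.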

By default, a fiber can only yield back to the code that resumed it. However, if you require the "fiber" library, Fibers get extended with a transfer method that allows one fiber to transfer control to another. Fibers then become fully fledged coroutines. However, we won't be needing all that power today.

Instead, let's get back to the idea of creating pipelines of functionality in code, much as you can create pipelines in the shell.

As a starting point, let's write two fibers. One's a generator: it creates a list of even numbers. The second is a consumer. All it does is accept values from the generator and print them. We'll make the consumer stop after printing 10 numbers.

    evens = Fiber.new do
      value = 0
      loop do
        Fiber.yield value
        value += 2
      end
    end

    consumer = Fiber.new do
      10.times do
        next_value = evens.resume
        puts next_value
      end
    end

    consumer.resume

Note how we had to use resume to kick off the consumer. Technically, the consumer doesn't have to be a Fiber, but, as we'll see in a minute, making it one gives us some flexibility.
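
As a sketch of that last point, here's the same thing with the consumer written as ordinary code that pulls from the generator. Only the generator is a fiber. The printed array is an addition for illustration, so the values can be inspected as well as printed.

```ruby
evens = Fiber.new do
  value = 0
  loop do
    Fiber.yield value
    value += 2
  end
end

printed = []

# A plain loop can drive the generator; no second fiber required.
10.times do
  next_value = evens.resume
  printed << next_value
  puts next_value
end
```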

As a next step, notice how we've created some coupling in this code. Our consumer fiber has the name of the evens generator coded into it. Let's wrap each fiber in its own method, and pass the generator into the consumer method as a parameter.

    def evens
      Fiber.new do
        value = 0
        loop do
          Fiber.yield value
          value += 2
        end
      end
    end

    def consumer(source)
      Fiber.new do
        10.times do
          next_value = source.resume
          puts next_value
        end
      end
    end

    consumer(evens).resume

OK. Let's add one more fiber to the weave. We'll create a filter that only passes on numbers that are multiples of three. Again, we'll wrap it in a method.

    def evens
      Fiber.new do
        value = 0
        loop do
          Fiber.yield value
          value += 2
        end
      end
    end

    def multiples_of_three(source)
      Fiber.new do
        loop do
          next_value = source.resume
          Fiber.yield next_value if next_value % 3 == 0
        end
      end
    end

    def consumer(source)
      Fiber.new do
        10.times do
          next_value = source.resume
          puts next_value
        end
      end
    end

    consumer(multiples_of_three(evens)).resume

Running this, we get the output

0
6
12
18
. . .

This is getting cool. We write little chunks of code, and then combine them to get work done. Just like a pipeline. Except...

We can do better. First, the composition looks backwards. Because we're nesting method calls, passing the result of one into the next, we write

    consumer(multiples_of_three(evens))

Instead, we'd like to write

    evens | multiples_of_three | consumer

Also, there's a fair amount of duplication in this code. Each of our little pipeline methods has the same overall structure, and each is coupled to the implementation of fibers. Let's see if we can fix this.

Wrapping Fibers

As is usual when we're refactoring towards a solution, we're about to get really messy. Don't worry, though. It will all wash off, and we'll end up with something a lot neater.

First, let's create a class that represents something that can appear in our pipeline. At its heart is the process method. This reads something from the input side of the pipe, then "handles" that value. The default handling is to write that value to the output side of the pipeline, passing it on to the next element in the chain.

    class PipelineElement

      attr_accessor :source

      def initialize
        @fiber_delegate = Fiber.new do
          process
        end
      end

      def resume
        @fiber_delegate.resume
      end

      def process
        while value = input
          handle_value(value)
        end
      end

      def handle_value(value)
        output(value)
      end

      def input
        source.resume
      end

      def output(value)
        Fiber.yield(value)
      end
    end

When I first wrote this, I was tempted to make PipelineElement a subclass of Fiber, but that leads to coupling. In the end, the pipeline elements delegate to a separate Fiber object.

The first element of the pipeline doesn't receive any input from prior elements (because there are no prior elements), so we need to override its process method.

    class Evens < PipelineElement
       def process
         value = 0
         loop do
           output(value)
           value += 2
         end
       end
    end

    evens = Evens.new

Just to make things more interesting, we'll create a generic MultiplesOf filter, so we can filter based on any number, and not just 3:

    class MultiplesOf < PipelineElement
      def initialize(factor)
        @factor = factor
        super()
      end
      def handle_value(value)
        output(value) if value % @factor == 0
      end
    end

    multiples_of_three = MultiplesOf.new(3)
    multiples_of_seven = MultiplesOf.new(7)

Then we just knit it all together into a pipeline:

    multiples_of_three.source = evens
    multiples_of_seven.source = multiples_of_three

    10.times do
      puts multiples_of_seven.resume
    end

We get 0, 42, 84, 126, 168, and so on as output. (Any output stream that contains 42 must be correct, so no need for any unit tests here.)

But we're still a little way from our ideal of being able to pipe these puppies together. It's a good thing that Ruby lets us override the "|" operator. Up in class PipelineElement, define a new method:

    def |(other)
      other.source = self
      other
    end

This allows us to write:

    10.times do
      puts (evens | multiples_of_three | multiples_of_seven).resume
    end

or even:

    pipeline = evens | multiples_of_three | multiples_of_seven

    10.times do
      puts pipeline.resume
    end

Cool, or what?
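
Adding a new kind of element only takes a subclass that overrides handle_value. The sketch below repeats the classes from above so that it runs on its own, and adds a hypothetical Tripler element (not part of the code in this post) that transforms each value rather than filtering it.

```ruby
class PipelineElement
  attr_accessor :source

  def initialize
    @fiber_delegate = Fiber.new { process }
  end

  # Hook other up to read from self, and return other so chains compose.
  def |(other)
    other.source = self
    other
  end

  def resume
    @fiber_delegate.resume
  end

  def process
    while value = input
      handle_value(value)
    end
  end

  def handle_value(value)
    output(value)
  end

  def input
    source.resume
  end

  def output(value)
    Fiber.yield(value)
  end
end

class Evens < PipelineElement
  def process
    value = 0
    loop do
      output(value)
      value += 2
    end
  end
end

# Hypothetical element, for illustration only: multiplies each value by three.
class Tripler < PipelineElement
  def handle_value(value)
    output(value * 3)
  end
end

pipeline = Evens.new | Tripler.new
first_five = Array.new(5) { pipeline.resume }
p first_five   # [0, 6, 12, 18, 24]
```

Note that | works by setting other.source, so an element belongs to one pipeline at a time. Piping it into a second chain rewires it.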

In The Next Thrilling Installment

The next post will take these basic ideas and tart them up a bit, allowing us to use blocks directly in pipelines. We'll also reveal why the PipelineElement class I just wrote is somewhat more complicated than might seem necessary. In the meantime, here's the full source of the code so far.

    class PipelineElement

      attr_accessor :source

      def initialize
        @fiber_delegate = Fiber.new do
          process
        end
      end

      def |(other)
        other.source = self
        other
      end

      def resume
        @fiber_delegate.resume
      end

      def process
        while value = input
          handle_value(value)
        end
      end

      def handle_value(value)
        output(value)
      end

      def input
        source.resume
      end

      def output(value)
        Fiber.yield(value)
      end
    end

    ##
    # The classes below are the elements in our pipeline
    #
    class Evens < PipelineElement
      def process
        value = 0
        loop do
          output(value)
          value += 2
        end
      end
    end

    class MultiplesOf < PipelineElement
      def initialize(factor)
        @factor = factor
        super()
      end
      def handle_value(value)
        output(value) if value % @factor == 0
      end
    end

    evens = Evens.new
    multiples_of_three = MultiplesOf.new(3)
    multiples_of_seven = MultiplesOf.new(7)

    pipeline = evens | multiples_of_three | multiples_of_seven

    10.times do
      puts pipeline.resume
    end

December 25, 2007

Ruby 1.9—Right for You?

As is becoming a tradition, Matz announced the next major release of Ruby on Christmas day.

Let's start by thanking both him and the entire Ruby core team for the efforts to get us here.

So, should you go and put this new Ruby to work right now? Let's see:

The Upside

This release contains a boatload of new features. Mauricio Fernandez has an impressive list (soon to be updated).

  • Performance: this new Ruby runs on the new YARV virtual machine. For most compute-intensive applications, you'll see significant speed improvements.

  • Support for string encodings and transcoding. Every string in Ruby can now have an associated encoding (ASCII, UTF-8, SJIS, and so on). You can transcode the contents of a string (for example converting ISO-8859-1 to UTF-8).

  • Integrated RubyGems and rake

  • Cool new goodies such as Fibers

and so on and so on.
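
Here's a short sketch of what the encoding support looks like in practice. It uses force_encoding, encode, and encoding as they exist in later 1.9 releases and beyond. Treat their availability on the 1.9.0 development release as an assumption, since the API was still settling.

```ruby
# The four bytes of "café" in ISO-8859-1, where é is the single byte 0xE9.
latin1 = "caf\xE9".dup.force_encoding("ISO-8859-1")

# Transcode to UTF-8: same characters, different bytes.
utf8 = latin1.encode("UTF-8")

puts latin1.encoding   # ISO-8859-1
puts utf8.encoding     # UTF-8
puts latin1.bytesize   # 4
puts utf8.bytesize     # 5 (é takes two bytes in UTF-8)
puts utf8.length       # 4 (still four characters)
```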

The Downside

  • This is a development release, not a production release. It has known bugs, and there'll be more to come.

  • It contains several incompatible changes (block parameters are now block-local, String is no longer Enumerable, "cat"[1] now returns "a", rather than 97)

  • It is more rigorous than 1.8 when it comes to detecting invalid code. For example, 1.8 accepts /[^\x00-\xa0]/u, while 1.9 complains of invalid multibyte escapes
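
A quick sketch of the incompatible changes listed above, as they behave under 1.9. The variable names are just for illustration.

```ruby
# Block parameters are block-local: the outer x is untouched by the block.
x = 10
[1, 2, 3].each { |x| }
p x                             # 10

# Indexing a string returns a one-character string, not a character code.
middle = "cat"[1]
p middle                        # "a"
p middle.ord                    # 97 (the character code is available via ord)

# String no longer mixes in Enumerable; say how you want to iterate.
p String.include?(Enumerable)   # false
lines = "a\nb".each_line.to_a
chars = "cat".each_char.to_a
p lines                         # ["a\n", "b"]
p chars                         # ["c", "a", "t"]
```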

Because of these changes, a whole bunch of existing gems and other libraries are broken, at least in my experience working on the third edition of the PickAxe.

So, Should You Use It?

In production? Probably not yet. It isn't intended for production use, and there will be some rough edges.

For development? Maybe, but take note of some of the issues with gems and other libraries. If you rely on third party code, make sure it has been tested against 1.9.0 before taking the plunge. That goes for Rails users, too.

Now, if you're a library developer or gem maintainer, this is the perfect time to check out a copy of Ruby 1.9 and make sure your code is compatible. Over the coming months, more and more of your users will be basing their applications on 1.9. The future success of your gem requires compatibility.

For experimentation? Absolutely! The new features are wonderful. Not only do they make writing Ruby code even more enjoyable, they also open up whole new avenues to explore. How will fibers (both asymmetric and symmetric) affect the way we code? Let's all find out by playing with them.

My Recommendation

Download 1.9 (either as a tarball/zip file, or directly from the Subversion repository). Build it and install it, but not as your default Ruby. Instead, use the --prefix option to put it somewhere else (I store it under my home directory, so I don't need to be root).

$ autoconf
$ ./configure --prefix=/Users/dave/ruby19
$ make
$ make install

Then, I just add /Users/dave/ruby19/bin to my path, and I'm using my nicely sandboxed version of Ruby 1.9.

$ PATH=/Users/dave/ruby19/bin:$PATH
$ ruby -v 
ruby 1.9.0 (2007-12-26 revision 0) [i686-darwin8.11.1]

If I install gems with that version in my path, they get installed into the sandbox, not globally. If I use the sandboxed version of Ruby when building extension libraries with extconf.rb, those extensions install into the sandbox. But, if I suddenly have to look at a problem in production code that means I have to use Ruby 1.8, I simply fire up another shell with my original PATH, and it's as if Ruby 1.9 doesn't exist.
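
If you're ever unsure which interpreter a shell is picking up, a few lines of Ruby will tell you. This is a small sketch using the standard rbconfig library. The paths it prints depend entirely on your own --prefix.

```ruby
require "rbconfig"

puts RUBY_VERSION                 # version of the interpreter running this script
puts RbConfig::CONFIG["prefix"]   # the --prefix it was configured with
puts RbConfig::CONFIG["bindir"]   # directory holding its ruby executable
puts Gem.dir if defined?(Gem)     # where this interpreter installs gems
```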

1.9 is the future of Ruby, and it's a future that will be mainstream very soon. Start playing with it now, so you'll be up to speed when Matz creates his first production release.

December 18, 2007

Bliss

[Photo]

December 13, 2007

A New PickAxe

Ruby 1.9 is just around the corner, so it looks like a good time to create a new edition of Programming Ruby. So, I'm pleased to announce that the Third Edition of the PickAxe has just entered beta.

The book's home page is at http://pragprog.com/titles/ruby3.

Although 1.9 is largely compatible with 1.8, there are definite differences. And it's been an interesting ride getting the examples in the book to compile and run with the current 1.9 interpreter. The book pushes the envelope in many different areas, and includes example code designed to illustrate edge cases. When I find these, I'm flagging them in the text and (if they look like bugs) adding them to the tracking system. But, so far, 1.9 is looking like a big win for Ruby.

Nice job, everyone.


Ruby Importer for Spotlight

I'm feeling really stupid for not discovering this sooner—a Spotlight importer for Ruby source files. Now I can search inside source for methods, classes, and the contents of comments. My only concern so far is the volume of stuff being indexed, but time will tell...


December 07, 2007

Advanced Rails Recipes

I just did a gem list --remote, and it appears that the much-awaited Rails 2.0 is out. It comes with tons of new features. But how do you use them? How do the emphases on resources and REST, and the inclusion of SimplyHelpful, affect the way you design your interfaces and applications? How do the new foxy fixtures make it easier to write tests? It isn't always obvious.

At the same time, the community has learned a lot of tricks: using presenters to handle multi-model forms; testing with mocks and with BDD; performance and deployment tricks; productivity tips. The list goes on.

Mike Clark polled the community and collected the best of the best tips into his new book, Advanced Rails Recipes. All this programmer goodness is probably why this book has one of the highest pre-order levels we've seen for any title.

With all that interest, there was a lot of pressure to deliver early. But I'm really pleased that Mike decided to wait for Rails 2 to come out before releasing the first beta. It's the most up-to-date Rails book out there. But, more than that, it's the best introduction to using Rails 2 effectively that I know. The current beta has 42 cool recipes, and there are another 30 or so to come.

Enjoy.

December 03, 2007

The Simplest Wish List That Could Possibly Work

So, it's the holiday season. A time for families. A time for reunions. A time for giving.

And we wanted to create a kind of wish list feature for our shiny new online store. It would be a system that would allow our readers to tell their non-technical relatives and friends the titles that they'd love to see under the tree.

So I got to planning. We've got all the power we need to create something special. I sketched out the mother of all wish lists, with referral logic, automated suggestions based on other people's wish lists, privacy settings, e-mail and web fulfillment, and so on and so on. It was to be a work of art.

Then, for once, I stopped to think.

So, for this year's wish list, we have no code whatsoever. And the more I think about it, the more I think that this simple solution is just as elegant as the complex, thousands-of-lines-of-code one we could have rolled out. The only downside I see is that our readers will have to find a red crayon from somewhere...
