May 12, 2008

Ruby Symbols in 1.9—End of an Era (and a good thing, too)

:rip.to_i

Ruby symbols have always been immediate objects. That means that, inside the interpreter, they are represented as small integers which reference the corresponding symbol text in a lookup table. This makes them fast, but it also leaves code open to denial-of-service attacks (particularly in the context of web applications): malicious clients can force server code to create arbitrary numbers of symbol table entries, and these are never garbage collected.
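
To make the risk concrete, here is a minimal sketch of how converting client-supplied strings to symbols grows the symbol table. The parameter names are invented for illustration, and the exact growth figure will vary by interpreter.

```ruby
# Sketch: every distinct string converted with to_sym adds a symbol table entry.
# The "attacker_param_N" names are hypothetical stand-ins for client input.
before = Symbol.all_symbols.size

# Pretend each of these arrived as a query parameter name from a client.
hostile_names = (1..1000).map { |i| "attacker_param_#{i}" }
created = hostile_names.map { |name| name.to_sym }

growth = Symbol.all_symbols.size - before
puts "symbol table grew by #{growth} entries"
```

On the interpreters discussed here, those entries stay around for the life of the process, so a client that keeps inventing new names keeps consuming memory.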

Some recent changes in Ruby 1.9 point to a transition away from symbols being immediate objects. In particular, they lose their integer representation, and hence the methods Fixnum#id2name, Fixnum#to_sym, and Symbol#to_i have been removed. I'm expecting to see symbols migrate to the heap as 1.9 continues to evolve.
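
If you want to check your own interpreter, a small probe along these lines should do. It is only a sketch: it asks whether the integer-based conversions named above are still defined, and on 1.9 and later I would expect each probe to report that the method is gone.

```ruby
# Sketch: probe for the integer-based symbol conversions.
# Fixnum was later folded into Integer, so a literal integer is probed here.
integer_conversions = {
  "Symbol#to_i"    => :rip.respond_to?(:to_i),
  "Fixnum#to_sym"  => 1.respond_to?(:to_sym),
  "Fixnum#id2name" => 1.respond_to?(:id2name)
}

integer_conversions.each do |name, present|
  puts "#{name}: #{present ? 'still defined' : 'removed'}"
end

# Conversion by name (rather than by integer) is unaffected.
puts :rip.to_s
puts "rip".to_sym.inspect
```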

Comments

Hi Dave,

What sort of effect(s) are anticipated with regard to performance? I am assuming that there will be some decrease in performance due to lookup overhead, but is the current thinking that the performance hit will be more than offset by YARV (or some other VM)?

Regards,

Charles McKnight

Charles:

I can't say, because the implementation isn't available to play with. However, it's possible to imagine ways of implementing this that wouldn't be significantly slower than now, as symbols would still be singletons, and therefore tests for equality could just use their object IDs.
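
As a rough illustration of the singleton point (this shows observable behavior only, not how any future implementation will work): two references to the same symbol name are the same object, so an identity check is enough, whereas two strings with the same contents are separate objects.

```ruby
# Sketch: a symbol is one shared object per name, so comparing two symbols
# can reduce to an identity check. Strings are separate objects per instance.
a = :status
b = "status".to_sym

puts a.equal?(b)                 # identity comparison
puts a.object_id == b.object_id  # same check, via object IDs

s1 = String.new("status")
s2 = String.new("status")
puts s1 == s2        # contents compared character by character
puts s1.equal?(s2)   # identity comparison on two distinct objects
```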


Dave

Since symbols always seem to trip up Ruby beginners, I am in favor of seeing the divide between symbols and strings narrow. I don't really see a problem with how Java handles string interning. If it appears as a literal within your program, that string is automatically interned, making it efficient to use strings as constants.

I know that doesn't handle the problem of people symbolizing user-submitted data (eck), but it seems to have worked well enough up to this point for Java, right?

This seems like a terrible idea. If symbols won't be interned strings, what will even be the point of them any more?

Although I concur that using symbols to intern user-supplied data is, er, well, not good, I would hate to take any sort of performance hit over it (yes, performance is a concern for one of the projects I'm working on).

Also, I wonder how much existing code will break? I guess we will just have to wait for the actual implementation to be available and bench it.
