Understanding Ruby Symbols

Posted by kev Fri, 19 Aug 2005 22:11:00 GMT

Update 8/25: This post has been translated into Korean.

Update 12/12: I found a Spanish translation today.

Symbols in ruby are an enigma. We use them, but many don’t really understand them.

So really, what is a symbol?

Simply put, a symbol is an object you use to represent a name or a string. What this boils down to is a way to have descriptive names efficiently, without spending the space needed to create a new string every time the name appears.

The Case of Dr. Jones

Dr. Jones is a psychologist. He regularly uses word association tests to diagnose patients and uses Ruby to keep track of everything. His first patient, Why, steps up to the plate:

Dr J: Red
Why : Ruby
Dr J: Transportation
Why : Rails
Dr J: Chunky
Why : Bacon

Dr Jones creates a hash to store his data:

why = {"red" => "ruby", "transportation" => "rails", "chunky" => "bacon"}

Dr. Jones’s second patient, Bob, turns in his survey results:

bob = {"red" => "paint", "transportation" => "car", "chunky" => "fat"}

The Problem

After running several hundred word association tests, Dr. Jones begins to realize that he’s running out of memory! On a hunch, Jones runs tests in irb:


> patient1 = { "ruby" => "red" }
> patient2 = { "ruby" => "programming" }
> patient1.each_key {|key| puts key.object_id.to_s}
211006
> patient2.each_key {|key| puts key.object_id.to_s}
203536

Well, look at that: each time he creates a hash to store his information, Ruby creates a new string object in a different memory location for each key. Fortunately, there’s an alternative.
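As a minimal sketch of the same point outside of a hash, the snippet below builds two strings with identical characters and checks whether they are one object. It uses String.new rather than hash keys on the assumption that newer Ruby versions may copy, freeze, or share string keys inside hashes, so the irb session above may not print different ids everywhere. The variable names are only illustrative.

```ruby
# Two separately built strings with the same characters.
first  = String.new("ruby")
second = String.new("ruby")

puts first == second                      # true: the contents match
puts first.equal?(second)                 # false: they are two distinct objects
puts first.object_id == second.object_id  # false: each object has its own id
```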

Symbols to the Rescue

Unlike strings, symbols of the same name are initialized and exist in memory only once during a session of ruby. Symbols are most obviously useful when you’re going to be reusing strings representing something else. Reproducing Dr. Jones’s tests, we are able to see this directly:


> patient1 = { :ruby => "red" }
> patient2 = { :ruby => "programming" }
> patient1.each_key {|key| puts key.object_id.to_s}
3918094
> patient2.each_key {|key| puts key.object_id.to_s}
3918094

Using symbols, we’ve used a single memory address to represent the word ruby in our word association tests. Over time, this can save a lot of space.
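The same check can be written without reading object ids by eye. This is a small sketch that reuses the two hashes from the session above; key1 and key2 are just local names for the illustration.

```ruby
patient1 = { :ruby => "red" }
patient2 = { :ruby => "programming" }

key1 = patient1.keys.first
key2 = patient2.keys.first

puts key1.equal?(key2)                   # true: both hashes hold the very same object
puts key1.object_id == :ruby.object_id   # true: writing :ruby again gives that object too
puts "ruby".to_sym.equal?(:ruby)         # true: converting a string finds the same symbol
```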

So I’m no shrink, when else will I want to use symbols?

Symbols are useful whenever you’re going to be reusing a word over and over to represent something else, whether it’s a key in a hash or the method you’re using in an HTTP query. An example from the latest and greatest web framework, Ruby on Rails, is its use of symbols in routes and links. Rails defines actions within controllers to do things within the framework before rendering a web page, so a link in Rails may look like:


    link_to("View Article", :controller => "articles", :action => "show", :id => 1)

When an application may have hundreds of links, or at least hundreds of references to different actions and controllers, it is significantly more efficient to use symbols than strings.
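To show how a method can consume an options hash keyed by symbols, here is a rough sketch in plain Ruby. The link_path helper and the path layout it produces are hypothetical and are not how Rails implements link_to; the point is only that every call site reuses the same :controller, :action and :id symbols as keys.

```ruby
# Hypothetical helper for illustration only; this is not Rails' link_to.
def link_path(options)
  controller = options.fetch(:controller)  # raises KeyError if the key is missing
  action     = options.fetch(:action)
  id         = options[:id]                # optional, nil when absent

  path = "/#{controller}/#{action}"
  id ? "#{path}/#{id}" : path
end

puts link_path(:controller => "articles", :action => "show", :id => 1)
# prints /articles/show/1
```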

Finally, it’s important to note that the usefulness of symbols is not restricted to keys in hashes. For example, if one were writing an HTTP client or server, they might use get and post several times within their application, and it might be appropriate to use:


    do_this if query == :get
    ...
    send_message_to_server(:post, filename)

Any time a string could be used over and over, a symbol may be a good candidate for replacement.
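A runnable variation on the get and post idea might look like the sketch below. The describe_request method and its messages are made up for the example; only the comparison against :get and :post comes from the text above.

```ruby
# Hypothetical dispatcher; the method name and messages are illustrative.
def describe_request(verb, path)
  case verb
  when :get  then "fetching #{path}"
  when :post then "sending #{path}"
  else raise ArgumentError, "unsupported verb: #{verb.inspect}"
  end
end

puts describe_request(:get, "/articles/1")   # prints fetching /articles/1
puts describe_request(:post, "/articles")    # prints sending /articles
```

Note that passing the string "get" instead of :get falls through to the else branch here, because a string never equals a symbol.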

Updates

In #ruby-lang on Freenode (irc.freenode.net) Aria and nome presented helpful additions to this article.

11:58 < Aria> Also, the entirely realistic reasoning for using symbols: If you 
              are going to refer to a method name, use a symbol. Because /by 
              defining the method/, the symbol exists anyway.

12:03 < nome> kevinclark: the intention of symbols are for identification of 
              (user-level, primarily) constructs: a slot in a hash, a method, 
              an option, etc.
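Aria’s point about method names can be seen from Ruby itself. In this small sketch the Patient class and its answer method are invented for the example; the calls used to inspect them are standard Ruby.

```ruby
class Patient
  def answer
    "bacon"
  end
end

patient = Patient.new

p Patient.instance_methods(false)   # [:answer], the method name is already a symbol
p patient.respond_to?(:answer)      # true
p patient.public_send(:answer)      # "bacon"
```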

Also note Aria’s response to Geoff’s question in the comments:

Geoff -
I'd be interested in knowing exactly how
much memory 1,000 strings ("red") uses over :red.

And remember, outside of Rails, "red" != :red

Aria - 
How much memory? 20 bytes per object, plus storage for the data, 3 bytes,
plus the length storage (4 bytes), so 27,000 bytes or so.

Versus one copy of the entry in the symbol table, which is likely to be just 
a few bytes (I could check, but I know for certain it’s in the tens, not tens 
of thousands of bytes range.)
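The exchange above can be approximated with a quick experiment. This sketch counts distinct objects rather than measuring bytes, so it says nothing about the exact figures Aria quotes; it only shows that 1,000 strings are 1,000 objects while 1,000 uses of :red are one. It also checks Geoff’s reminder that "red" and :red are not equal.

```ruby
strings = Array.new(1000) { String.new("red") }
symbols = Array.new(1000) { :red }

puts strings.map(&:object_id).uniq.size   # 1000
puts symbols.map(&:object_id).uniq.size   # 1

puts "red" == :red          # false: a string is never equal to a symbol
puts "red".to_sym == :red   # true
puts :red.to_s == "red"     # true
```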

Jim Weirich notes:

I (generally) use the following rule on string vs symbols:

(1) If the contents (i.e. the sequence of characters) of the object is important, use a String.

(2) If the identity of the object is important, use a Symbol.
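One way to read Jim’s rule, sketched with Dr. Jones’s data: the answer is a String because its characters get read and transformed, while the prompt is a Symbol because it only needs to be told apart from other prompts. The variable names are illustrative.

```ruby
# Contents matter: the characters themselves are inspected and changed.
answer = "bacon"
puts answer.capitalize   # Bacon
puts answer.length       # 5

# Identity matters: :chunky only has to be distinguishable from other keys.
results = { :chunky => answer }
puts results.key?(:chunky)           # true
puts results.key?(:transportation)   # false
```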

Reports of errors and omissions are welcome and should be sent to kevin [dot] clark [at] gmail [dot] com


Comments

  1. Geoff said 8 minutes later:

    Good article. I’d be interested in knowing exactly how much memory 1,000 strings (“red”) uses over :red.

    And remember, outside of Rails, “red” != :red

  2. Aredridel said 40 minutes later:

    How much memory? 20 bytes per object, plus storage for the data, 3 bytes, plus the length storage (4 bytes), so 27,000 bytes or so.

    Versus one copy of the entry in the symbol table, which is likely to be just a few bytes (I could check, but I know for certain it’s in the tens, not tens of thousands of bytes range.)

  3. anonymous coward said about 3 hours later:

    there is a typo in your example: “do_this if query = :get” should be “do_this if query == :get”

  4. Kevin Clark said about 3 hours later:

    AC: Quite right, I’ll fix that.

  5. Jim Weirich said about 3 hours later:

    I (generally) use the following rule on string vs symbols:

    (1) If the contents (i.e. the sequence of characters) of the object is important, use a String.

    (2) If the identity of the object is important, use a Symbol.

  6. anon said about 4 hours later:

    another typo: missing ” at the end of rails in why = {“red” => “ruby”, “transportation” => “rails, “chunky” => “bacon”}

    It’s a good article but I feel I don’t grasp the concept of symbols yet. For example in RoR with LoginSystem symbols are also used like this (in application.rb)

    include LoginSystem
    model :user

    What does it mean? Is it some kind of “reference” to an instance of the class User < ActiveRecord::Base ? How does the magic work to be able to write such things like @request.session[:user] ?

  7. Kevin Clark said about 6 hours later:

    anon, in your example: model :user

    This code calls the method model and passes the symbol :user as an argument. model in turn sets its arguments as models that the current controller depends on. So the point is, rather than saying

    model “myModel”

    and creating a string each time you want to do this, and then within the dependency methods checking against “myModel” (creating a new string each time we do this) we just reference the single :myModel. We’re still checking to see if the model is myModel and acting accordingly, but saving all the memory we would use with strings.

    Similarly, in @request.session[:user], we’re treating :user like we would a string in a hash, but saving the space that would be assigned in order to create the new object “user” by using the already created :user.

    If this doesn’t cover it, please, follow up.

  8. trans said about 7 hours later:

    Lest anyone think symbols are restricted in their content:

    :"this is a !@#$% symbol too!"

    The difference to point: Like Numerics, Symbols are immutable, but Strings are not.

  9. Florian Groß said about 8 hours later:

    And to be even more detailed: The content of Symbols is restricted over that of Strings. Symbols can never contain \0 and Ruby tries to keep you from using empty symbols. (Though the latter is still possible.)

  10. Porges said about 11 hours later:

    I like to think of it like this: If you need to print the contents of a string at any time, use a string. Otherwise use a symbol :)

  11. anon said about 15 hours later:

    Kevin: if I understand you correctly, I was wrong when I asked if it was some kind of “reference” to an instance of a class. It has nothing to do with it. For some reason I thought of symbols as “stuff somewhat like C/C++ pointers” but it’s not, or maybe only pointing against strings. Anyway I think I got it now.

  12. Gavri Fernandez said 1 day later:

    Geoff: “red” != :red even in Rails. It’s just that Rails uses HashWithIndifferentAccess instead of Hash everywhere.

  13. Ahmad Alhashemi said 3 days later:

    I used to confuse symbols with variable names. I read everything I could find about them, including in Learning Ruby 2.

    I only “got it” when I understood this fact: symbols are literal values. Just like that!

    They are literal values, just like the number 3 and the string “red”.

    I believe that I’m not the only one confusing symbols with variable names. I read somewhere trying to explain symbols that variable and method names change in what they are referring to in different places, while symbols always mean the same thing.

    This is even more confusing, because it makes you think that symbols are a special kind of variable. But when you know that they are literal values, you will think: of course they will not change their meaning! Just like the literal 3, it always means the same thing!

    That is also why you can say: v = :s

    But you can’t say: :s = v

    This will give you a syntax error.

    I hope this added to your already excellent explanation.

  14. Kevin Clark said 3 days later:

    Clarifications and additions are more than welcome.

  15. Metin Amiroff said 3 days later:

    Very helpful article and great comments. Much appreciated.

    I am a ruby newbie and these symbols were like a black hole for me. Now, everything’s much, much clearer. Thanks again!

  16. anon said 69 days later:

    Rails uses HashWithIndifferentAccess like mentioned in a previous comment.. and what that does is converts any key to a string.

    hash[:symbol] turns into hash[“symbol”], both when setting and accessing.

    doesn’t that mean the benefit of using symbols is lost?

  17. Alex S. said 71 days later:

    as a newbie to Ruby this Blog entry helped a lot to understand Symbols. thanks

  18. yoav said 87 days later:

    considering what was said here about HashWithIndifferentAccess in ror, is it possible to disable this and “force” rails to use only symbols?

  19. somebody said 95 days later:

    What I don’t get is why nobody does this in Rails:

    link_to("View Article", :controller => :articles, :action => :show, :id => 1)

    Why give “articles” and “show” as strings? Especially “show” as you’re referring to a method.

  20. prim8 said 100 days later:

    ‘What I don’t get is why nobody does this in Rails:

    link_to(“View Article”, :controller => :articles, :action => :show, :id => 1)’

    The action symbols do work fine like this, but I have been having problems when I refer to the controller by symbol. It did work at one time, but at some point stopped working.
    class AlbumController < ApplicationController
      layout :photo
    end
    
    This gives: NoMethodError: undefined method `photo’ for #<AlbumController:0x5a16f48>. Not sure why it thinks photo is a method; it works fine when I refer to the controller with:
    layout 'photo'
    
  21. azreal said 105 days later:

    cool article. Now I understand the concept! thank you!

  22. Michael Bevilacqua-Linn said 108 days later:

    Thanks so much for writing this. Much like you say, I’ve been using symbols without understanding what exactly they are. Now I do.

  23. Thomas Aylott said 113 days later:

    Wow.. I think I ‘get it’

    Symbols are like constants whose name is their value. Just like the number 5 has both the name 5 and the value 5 so the symbol :howdy is the word howdy. It’s just like “howdy” but Better™

    that’s pretty neat.

  24. Douglas said 115 days later:

    > link_to("View Article", :controller => :articles, :action => :show, :id => 1)

    > Why give “articles” and “show” as strings? Especially “show” as you’re referring to a method.

    With the keys (:controller, :action and :id), the uniqueness of the key is the most important feature. (We happen to use symbols because they are unique and hint at meaning.) With the values (“articles”, “show”) the content is the most important thing, so we use strings instead.

    Just a convention :)

  25. twifkak said 122 days later:

    To make it starkly clear, symbols are (I’m told) never garbage collected. So you should be very wary about doing things like :"thing_#{value}".

  26. howdy said 161 days later:

    If you had this in C:

    #define ID_NAME     0
    #define ID_ADDRESS  1

    That’s basically what you achieve with symbols like :name and :address in ruby, right? Except you don’t have to associate them with numbers yourself, the interpreter does that for you.

  27. Kevin Clark said 161 days later:

    howdy: Not quite. Ruby does have constants, which is what happens when you #define in C. Symbols are similar, but have no value themselves. They’re really just a label that has meaning to the coder as they aren’t variables.

  28. hsitz said 188 days later:

    Seems to me like howdy’s example is not so far off, at least in pointing out similarity in use between C constants and Ruby symbols.

    The similarity lies in the fact that Ruby symbols do link up to integers “under the covers”. It’s just that we don’t care what value is linked to the symbol; all we care about is uniqueness, that every use of the symbol link up to same integer.

    The difference is that in howdy’s C constant example, the symbols link up to an integer “under the covers”, but also (as constants) refer to the constant integer value they are assigned within the program.

    To see how Ruby symbols link up to integers in the internal workings of the interpreter you can use the object_id and/or to_int methods:

    e.g., assume f1 = :fred and f2 = :fred

    then f1, f2, and :fred will all have the same internal representation:

    in an example I just did on my computer, f1.object_id is 4073742, f2.object_id is 4073742, and :fred.object_id is 4073742

    And the to_int method provides another unique integer representation: f1.to_int is 15913, f2.to_int is 15913, and :fred.to_int is 15913

    Perhaps someone can clarify why the to_int method is needed. Is object_id not already guaranteed to be an integer?

    Also, someone said in a previous post that it was impossible to reference the string value of a symbol name from the symbol. This can in fact be done:

    in my above example,

    :fred.to_s is “fred”, f1.to_s is “fred”, and f2.to_s is “fred”

    and id2name is synonymous with to_s, so, e.g.,

    :fred.id2name is “fred”

  29. vhg119 said 197 days later:

    To Ahmad Alhashemi AND Thomas Aylott

    Thanks to your comments, I NOW GET IT!

  30. Markus Sandy said 218 days later:

    I agree with Howdy. His examples are not the same as constants in C (use “const” for that). Those are “macros”, which are merely symbols that are expanded before compilation. Nice post.

  31. Roman said 220 days later:

    Thanks Thomas Aylott. I now get it too!

  32. Hardy said 221 days later:

    Wow, this was an excellent find. I finally get it (between the original posting and some of the comments).

    This should be reproduced in the pickaxe : b

  33. Clint Checketts said 229 days later:

    I’m a bit late to this post, but I think I’m finally understanding symbols.

    I would compare a symbol to a Java 1.5 enumeration on steroids. Like a very versatile constant (if that helps anyone)

Trackbacks

Use the following link to trackback from your own site:
http://glu.ttono.us/articles/trackback/3

  1. From Mando.org
    Symbols and Hashes
    A genius I can only presume is named Kevin Clark has posted a nice little explanation about symbols and hashes in Ruby. If you're using Ruby on Rails and have been having issues with some of the syntax constructs (like...
  2. From Panasonic Youth
    Comparing Ruby Symbols to Java String interning
    Ruby Symbols are kinda like interned Strings in Java, only different ...
  3. From atog
    Ruby Symbols.
    Understanding Ruby Symbols ...
  4. From betweenGo
    Understanding Ruby Symbols
    Excellent post about Ruby symbols, especially important considering Ruby on Rails uses symbols all the time. ...
  5. From jsquintz.com
    Ruby Symbols
    I found a great article talking about symbols in Ruby, this provides a great explanation on an important part of the language that books seem to skim over. Working on a pretty big RoR project with a pretty rudimentary understanding of the languag...
