Ruby Gotcha of the Day: String Ranges
Posted by kev Wed, 20 Jun 2007 19:05:00 GMT
Spot the pattern?
('1'..'10').to_a
# => ["1", "2", "3", "4", ..... "10"]
('2'..'10').to_a
# => []
('2'..'20').to_a
# => ["2", "3", "4", "5", ..... "20"]
('3'..'20').to_a
# => []
('3'..'30').to_a
# => ["3", "4", "5", "6", ..... "30"]
('4'..'30').to_a
# => []
('4'..'40').to_a
# => ["4", "5", "6", "7", ..... "40"]
(2..10).to_a
# => [2, 3, 4, 5, 6, 7, 8, 9, 10]
('2'.to_i .. '10'.to_i).to_a
# => [2, 3, 4, 5, 6, 7, 8, 9, 10]
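If what you actually want is the strings "2" through "10", one option is to do the counting with integers and convert back afterwards. A minimal sketch, where numeric_string_range is a made-up helper name and not anything built in:

```ruby
# Hypothetical helper: treat the endpoints as base-10 integers,
# walk the integer range, then turn each number back into a string.
def numeric_string_range(first, last)
  (Integer(first, 10)..Integer(last, 10)).map(&:to_s)
end

p numeric_string_range('2', '10')
# => ["2", "3", "4", "5", "6", "7", "8", "9", "10"]
```

Integer(str, 10) raises on input that is not a plain decimal number, which is probably what you want here; to_i would silently turn it into 0.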

Interesting – took me a while to figure it out :)
I wouldn’t call it a bug though – ‘2’ is greater than ‘10’ when compared as strings, i.e. strings don’t sort like numbers
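To make that concrete, here is a small sketch comparing the same values as strings and as integers (compare_both_ways is just an illustrative name):

```ruby
# Strings are compared character by character, so '2' sorts after '10'
# ('2' > '1' at the first character). Integers compare by value.
def compare_both_ways(a, b)
  [a <=> b, a.to_i <=> b.to_i]
end

p compare_both_ways('2', '10')           # => [1, -1]
p %w[10 2 1 20].sort                     # => ["1", "10", "2", "20"]
p %w[10 2 1 20].sort_by { |s| s.to_i }   # => ["1", "2", "10", "20"]
```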
Very good find…
Ok, whether this is a bug or not is certainly up for interpretation, but it’s an interesting gotcha either way.
String#succ, which the range uses, is quite a strange beast, especially since its rules do not line up with String#<=>. succ has special carry handling for letters and digits, and doesn’t follow plain ASCII order. For example, ‘9’.succ is ‘10’, ‘z’.succ is ‘aa’, but ‘?’.succ is ‘@’. On the other hand, <=> goes character by character. So yes, it is definitely a gotcha, but stranger things do happen with ranges:
('a '..'b').include? 'a '  # => true
('a '..'b').include? 'b'   # => true
('a '..'b').to_a           # => ['a '], where is 'b'?
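The mismatch between succ and <=> is easy to poke at directly. A small sketch (succ_with_order is an illustrative helper, not a core method) that returns each successor together with how it compares to the string it came from:

```ruby
# String#succ carries through letters and digits, while <=> compares
# character by character, so a successor is not always "greater".
def succ_with_order(str)
  nxt = str.succ
  [nxt, nxt <=> str]
end

p succ_with_order('9')    # => ["10", -1]  the successor sorts BEFORE '9'
p succ_with_order('z')    # => ["aa", -1]  the successor sorts BEFORE 'z'
p succ_with_order('?')    # => ["@", 1]
p succ_with_order('a ')   # => ["b ", 1]   the trailing space is kept

# 'b ' is already past 'b', so stepping from 'a ' never lands on 'b'.
p 'b ' > 'b'              # => true
```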
In the end, one has to be careful using string ranges with Range#to_a, as it can give strange results even with alphabetic characters only:
('A'..'b').include? 'a'       # => true
('A'..'b').to_a.include? 'a'  # => false
('A'..'b').to_a               # => ['A', 'B', 'C', ... 'Z']
‘Z’.succ gives ‘AA’, which is longer than the end point ‘b’ and can never step onto it, thus #to_a stops at that point
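What Range#to_a returns for string ranges has not been identical across Ruby versions, so rather than lean on it, here is a hand-rolled emulation of the succ-based walk discussed in this thread. succ_walk is a hypothetical helper, not Ruby's actual implementation, and its stop rules are an assumption: nothing is produced if the start compares greater than the end, and the walk ends on reaching the end point or once succ yields a string longer than it.

```ruby
# Hypothetical emulation of a succ-driven range walk (assumed stop rules,
# see the note above; this is not how any particular Ruby does it).
def succ_walk(first, last)
  result = []
  return result if first > last        # e.g. '2' > '10', nothing to walk
  current = first
  loop do
    result << current
    break if current == last           # reached the end point
    current = current.succ
    # Assumed guard: give up once succ has outgrown the end point.
    break if current.length > last.length || current.empty?
  end
  result
end

p succ_walk('2', '10')          # => []
p succ_walk('1', '10').length   # => 10
p succ_walk('a ', 'b')          # => ["a "]
p succ_walk('A', 'b').last      # => "Z"

# Comparison-only membership versus membership in the walked list:
p ('A'..'b').cover?('a')             # => true
p succ_walk('A', 'b').include?('a')  # => false
```

Range#cover? only compares against the two endpoints, which is why it can say yes for a value the walk never visits.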