Ruby Constant Lookup Mechanism

Update Note: Starting with Ruby 2.5.0, an incorrect top-level constant reference raises an exception instead of only printing a warning. This article was written against the Ruby versions current in 2017 and describes the earlier behavior, which is still useful for understanding how Ruby constant lookup really works.

I’ve been writing Ruby on Rails for over a year now, and as I gradually started diving into the mysterious magic behind Rails, I found that many issues are related to Ruby’s constant lookup mechanism. To be honest, this mechanism is more complex than I initially thought, especially in Rails’ autoloading environment where it’s easy to step into some hidden pitfalls.

This article explores how Ruby defines, stores, and looks up constants, moving from the basic concepts to some edge cases, in the hope of giving you a fairly complete picture of Ruby’s constant system. We’ll use practical code examples to see how the Ruby interpreter works internally.

Constant Definition and Storage

To understand Ruby’s constant lookup process, we first need to understand how Ruby defines and stores constants.

    # module is a constant
    module Namespace
      # class is also a constant
      class Something
        # still a constant
        SOME_VALUE = true
      end
    end

Ruby also has the concept of lexical scope, but unlike most languages, the module and class keywords open entirely new scopes. In the code above, the constant SOME_VALUE is defined within the class Something scope, so where is this constant actually stored?

In Ruby MRI’s implementation, each Ruby class or module is backed by an RClass C structure, and constants defined in that scope are recorded in the constant table attached to it. We can see which constants are defined in Something by calling Namespace::Something.constants(false)[1] in irb.

    irb(main):008:0> Namespace::Something.constants
    => [:SOME_VALUE]

Naturally, class Something itself is also a constant, defined in the module Namespace scope.

Thinking further, where is the top-level module Namespace stored? The answer is that it is stored in Object’s constant table. Calling Object.constants(false) lists all the top-level constants you’ve defined.
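The storage rules above can be checked with a small sketch. The names Storage, Item, and LIMIT are made up for illustration, and the script is assumed to run at the top level of a plain Ruby file:

```ruby
module Storage
  class Item
    LIMIT = 10
  end
end

# With inherit = false, each call lists only the constants kept in
# that module's own constant table.
p Storage::Item.constants(false)               # => [:LIMIT]
p Storage.constants(false)                     # => [:Item]

# The top-level module lives in Object's constant table.
p Object.constants(false).include?(:Storage)   # => true
p Object.const_get(:Storage).equal?(Storage)   # => true
```

Each level of nesting owns exactly the names defined directly inside it, which is the property the lookup rules in the following sections rely on.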

Scope Nesting and Module.nesting

As mentioned earlier, whenever module or class keywords appear, Ruby opens a new scope, forming multiple nested structures.

In Ruby’s C implementation, there’s a structure called rb_cref_t[2] that represents the current scope:

    typedef struct rb_cref_struct {
        VALUE flags;
        const VALUE refinements;
        const VALUE klass;
        struct rb_cref_struct * const next;
        const rb_scope_visibility_t scope_visi;
    } rb_cref_t;

Here, the klass field records which class or module the current scope belongs to, and next points to the cref structure of the enclosing scope. The linked list formed by the next pointers represents the hierarchy from the current scope out to the top-level scope. In irb, we can view the klass values of this linked list with Module.nesting.

    module A
      class B
        class C
          p Module.nesting # => [A::B::C, A::B, A]

          p B == ::A::B # => true
        end
      end
    end

When we try to access constant B in the above code, Ruby searches in the order shown by Module.nesting: first checking A::B::C’s constant table (not found), then searching A::B’s constant table (also not found), and finally finding the definition of B in A’s constant table.
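The walk described above can be approximated in plain Ruby. This is only a sketch: first_scope_defining is a hypothetical helper, not something Ruby provides, and the Outer, Middle, and Inner names are invented for the example.

```ruby
# Hypothetical helper (not part of Ruby): return the first module in the
# nesting whose own constant table defines the given name.
def first_scope_defining(nesting, name)
  nesting.find { |mod| mod.const_defined?(name, false) }
end

module Outer
  class Middle
    class Inner
      # Capture the scope chain as seen from the innermost scope.
      NESTING = Module.nesting
    end
  end
end

nesting = Outer::Middle::Inner::NESTING
p nesting                                  # => [Outer::Middle::Inner, Outer::Middle, Outer]
p first_scope_defining(nesting, :Middle)   # => Outer
p first_scope_defining(nesting, :Inner)    # => Outer::Middle
p first_scope_defining(nesting, :String)   # => nil
```

The nil for :String shows that the scope chain alone cannot explain how built-in classes are found.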

    module A; end

    module A::B
      class C
        p Module.nesting # => [A::B::C, A::B]

        p B == ::A::B # => ???
      end
    end

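Can you guess the result of this code? It raises an exception: NameError: uninitialized constant A::B::C::B. Because the module was opened with the compact form module A::B, A never enters the scope chain, as the Module.nesting output shows. B is stored in A’s constant table, so neither the scope chain nor C’s inheritance tree can reach it.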
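To compare the two ways of opening a module side by side, here is a minimal sketch. Pkg, Part, Leaf, and Twig are illustrative names, and the failed lookup is rescued so the script keeps running:

```ruby
module Pkg; end

# Compact form: Pkg itself never joins the scope chain.
module Pkg::Part
  class Leaf
    NESTING = Module.nesting
    LOOKUP = begin
      Part                      # not in Leaf, not in Pkg::Part, not in Object
    rescue NameError => error
      error
    end
  end
end

# Nested form: Pkg is part of the chain, so Part resolves through it.
module Pkg
  module Part
    class Twig
      NESTING = Module.nesting
      LOOKUP = Part
    end
  end
end

p Pkg::Part::Leaf::NESTING        # => [Pkg::Part::Leaf, Pkg::Part]
p Pkg::Part::Leaf::LOOKUP.class   # => NameError
p Pkg::Part::Twig::NESTING        # => [Pkg::Part::Twig, Pkg::Part, Pkg]
p Pkg::Part::Twig::LOOKUP         # => Pkg::Part
```

The class being defined is the same kind of object in both cases. Only the scope chain recorded at definition time differs.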

Inheritance Tree Lookup Mechanism

Searching through the scope chain alone isn’t enough; we also need to access constants defined in superclasses from subclasses.

    class Base
      CONST = 'constant in base'
    end

    class Sub < Base
      p CONST # => constant in base
    end

    p Sub::CONST # => constant in base

This is easy to understand - Ruby also searches for constants through the inheritance tree. Ruby searches along the inheritance chain of the current scope class[3]. We can see the inheritance tree structure using Sub.ancestors.
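Included modules are part of the ancestors list too, so constants defined in a mixin are found the same way. A small sketch, where Defaults, Service, PaymentService, and the owner_of helper are all hypothetical names introduced for illustration:

```ruby
module Defaults
  TIMEOUT = 30
end

class Service
  include Defaults
  RETRIES = 3
end

class PaymentService < Service
  # Neither constant is defined here; both come from the ancestors.
  SETTINGS = [TIMEOUT, RETRIES]
end

# Hypothetical helper: the first ancestor whose own table holds the constant.
def owner_of(klass, name)
  klass.ancestors.find { |mod| mod.const_defined?(name, false) }
end

p PaymentService.ancestors.take(3)     # => [PaymentService, Service, Defaults]
p PaymentService::SETTINGS             # => [30, 3]
p owner_of(PaymentService, :TIMEOUT)   # => Defaults
p owner_of(PaymentService, :RETRIES)   # => Service
```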

This raises a key question: if both the parent scope and superclass have definitions of the same constant, which one does Ruby choose? Let’s verify:

    class Base
      CONST = 'constant in base'
      CONST_1 = 1
    end

    module Namespace
      CONST = 'constant in namespace'
      class Sub < Base
        p Module.constants
        p CONST # => 'constant in namespace'
      end
      p Sub::CONST # => 'constant in base'
    end

The statement p CONST inside Sub outputs ‘constant in namespace’, showing that Ruby searches the scope chain before the inheritance tree.

The statement p Sub::CONST looks similar but produces different output. When executing it, Ruby first finds the constant Sub, then looks for CONST under Sub. At this point, Ruby no longer considers the scope chain and searches Sub’s inheritance tree directly. These two ways of looking up CONST differ in other subtle ways, which are explained in detail later.
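The same priority rule explains why reopening a class with the compact form can change what a bare constant resolves to. A minimal sketch, with Parent, Wrapper, Child, and LABEL as made-up names:

```ruby
class Parent
  LABEL = 'from parent'
end

module Wrapper
  LABEL = 'from wrapper'

  class Child < Parent
    def self.nested_label
      LABEL   # scope chain here is [Wrapper::Child, Wrapper]
    end
  end
end

# Reopened in compact form: Wrapper is no longer in the scope chain.
class Wrapper::Child
  def self.compact_label
    LABEL   # scope chain here is [Wrapper::Child] only
  end
end

p Wrapper::Child.nested_label        # => "from wrapper"
p Wrapper::Child.compact_label       # => "from parent"
p Wrapper::Child.const_get(:LABEL)   # => "from parent"
```

const_get behaves like the scoped form Wrapper::Child::LABEL here: it has no scope chain to consult, so it goes straight to the inheritance tree.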

Top-level Constant Lookup Mechanism

Wait, did we miss something? Module.nesting never lists the top-level scope, so how are top-level constants found? Does Ruby have special handling for top-level constants?

The answer to this question is both yes and no.

Actually, Ruby finds these top-level constants through the inheritance tree. As mentioned earlier, Ruby stores top-level constants in Object.

    class MyClass
      p Math::PI
    end

So in the above code, when we access the top-level constant Math in MyClass’s class scope, Ruby finds the top-level constants stored in Object through MyClass’s inheritance tree.

It seems the problem is solved! We don’t need to introduce new rules to handle top-level constant lookup, which is indeed worth celebrating.

But is the problem really completely solved?

If you’re familiar with Ruby’s internal inheritance tree structure, you’ll notice that Object is not at the top of the inheritance tree. More importantly, when we define modules that are often used as namespaces, their inheritance tree might contain only themselves!

    module Namespace; end
    p Namespace.ancestors # => [Namespace]

This is indeed a problem: we definitely need to reach top-level constants from inside these module scopes. So Ruby adds a special case to the inheritance tree search: if the current scope is a module, it searches once more, starting from Object’s inheritance tree. As for BasicObject, the one class that sits above Object in the inheritance tree, the fact that it cannot see top-level constants is very likely intentional.

In fact, this is also why BasicObject class scope is often used as a special, clean scope.

    class BasicObject
      p String # => NameError: uninitialized constant BasicObject::String
    end
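The module case and the BasicObject case can be contrasted in one sketch. Sandbox and Blank are illustrative names. Note that inside a BasicObject subclass even NameError has to be written with a leading ::, for the same reason String cannot be found:

```ruby
module Sandbox
  ANCESTORS = ancestors          # a module's ancestors do not include Object
  FOUND = String                 # still resolved, thanks to the extra Object search
end

class Blank < BasicObject
  FOUND = begin
    String                       # Object is not among Blank's ancestors
  rescue ::NameError => error    # NameError itself needs the :: prefix here
    error
  end
  EXPLICIT = ::String            # a leading :: starts the lookup at Object
end

p Sandbox::ANCESTORS     # => [Sandbox]
p Sandbox::FOUND         # => String
p Blank::FOUND.class     # => NameError
p Blank::EXPLICIT        # => String
```

This is the usual way to work inside a clean BasicObject scope: every top-level constant is referenced explicitly as ::Name.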

Constant Lookup Pitfalls

Let’s look at an interesting example:

    class Hash
      p String == Hash::String
    end

Accessing a top-level constant (here String) in a class scope (here Hash) matches what we have seen so far. But what is Hash::String? Obviously, no such class exists. Just when we expect this code to raise an exception, we get the following surprising result:

    (irb):21: warning: toplevel constant String referenced by Hash::String
    true

How is this possible?

The reason is simple: the two lookups of String take almost the same path in the inheritance tree step. Both look for String in Hash’s own constant table, and when that fails, both climb the inheritance tree and eventually find String in Object. The scoped form Hash::String merely adds a warning when the match comes from Object.
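A hedged sketch for checking how your own interpreter treats this case. The scoped reference is rescued because, as the update note says, Ruby 2.5.0 and later raise instead of warning. The variable names are arbitrary:

```ruby
# Does Hash itself own a constant named String? Checking only Hash's own
# table (inherit = false) avoids the climb up to Object.
owned = Hash.const_defined?(:String, false)

# The scoped reference: a warning plus String before Ruby 2.5.0,
# a NameError from 2.5.0 on.
scoped = begin
  Hash::String
rescue NameError => error
  error
end

# The reflective API still searches the ancestors, Object included.
reflective = Hash.const_get(:String)

p owned        # => false
p scoped
p reflective   # => String
```

Passing false as the second argument to const_defined? or const_get is one way to ask whether a constant really belongs to a given class, without the fallback to Object.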

Does this cause problems?

Yes, especially in Rails’ autoloading environment. Rails autoloading is built on const_missing, which only runs when a lookup fails. If a top-level constant with the same name has already been loaded, the lookup succeeds, so your namespaced constant silently resolves to that top-level constant, and the only hint is a warning that is easy to miss in a large log. The following code example is from the Rails official documentation[4]:

    # app/models/hotel.rb
    class Hotel
    end

    # app/models/image.rb
    class Image
    end

    # app/models/hotel/image.rb
    class Hotel
      class Image < Image
      end
    end

    $ bin/rails r 'Image; p Hotel::Image' 2>/dev/null
    Image # NOT Hotel::Image!

Summary

Ruby’s constant lookup mechanism follows this priority order:

  1. Scope Chain Lookup: Search from the current scope to outer scopes level by level according to Module.nesting order
  2. Inheritance Tree Lookup: Search up the inheritance chain of the current scope class
  3. Special Handling: For modules, additionally search for top-level constants from Object’s inheritance tree
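The three steps above can be sketched as a small Ruby method. This is an approximation for illustration only, not MRI’s actual code: resolve_constant and own_constant are hypothetical helpers, the nesting is passed in explicitly, and details such as autoload, private constants, and const_missing are ignored. Animal, Zoo, Cat, and Plain are made-up names.

```ruby
# Hypothetical helper: look only in one module's own constant table.
def own_constant(mod, name)
  mod.const_defined?(name, false) ? [mod.const_get(name, false)] : nil
end

# Hypothetical approximation of how a bare constant reference is resolved.
def resolve_constant(nesting, name)
  # 1. Scope chain: each enclosing module's own table, innermost first.
  nesting.each do |mod|
    hit = own_constant(mod, name)
    return hit.first if hit
  end

  # 2. Inheritance tree of the innermost scope (Object at the top level).
  scope = nesting.first || Object
  chain = scope.ancestors

  # 3. Modules do not inherit from Object, so search Object's tree as well.
  chain += Object.ancestors unless scope.is_a?(Class)

  chain.each do |mod|
    hit = own_constant(mod, name)
    return hit.first if hit
  end

  raise NameError, "uninitialized constant #{name}"
end

class Animal
  SOUND = 'generic'
  KIND = 'animal'
end

module Zoo
  SOUND = 'zoo'
  class Cat < Animal
    NESTING = Module.nesting
  end
end

module Plain
  NESTING = Module.nesting
end

p resolve_constant(Zoo::Cat::NESTING, :SOUND)   # => "zoo"
p resolve_constant(Zoo::Cat::NESTING, :KIND)    # => "animal"
p resolve_constant(Plain::NESTING, :String)     # => String
```

SOUND is found in step 1, KIND in step 2, and String from inside Plain only in step 3.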

This is what I’ve learned about Ruby’s constant lookup mechanism. I’ve compiled my findings here in hopes it might help others who encounter similar confusion.

That’s it!


  1. There are typically two constants methods. This specifically refers to the instance method defined in Module. When the parameter inherit = false, this method will only list constant names stored in the class’s constant table.

  2. Based on Ruby 2.4 implementation; before Ruby 2.3, it was a NODE structure

  3. Note that the starting point of the inheritance tree is not self.class but the current scope class

  4. Fortunately, Ruby 2.5.0 finally changed the warning to directly throw an exception