Making Ruby Segfault

(This article was tested with ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24].)

Welcome to part two of our FFI adventures. This time we will look at how Ruby code can end up segfaulting.

There are (as I learned from a bit of research) a few ways to get Ruby to hit segfaults. The ones I know about are:

  • Hitting segfaults in the FFI
  • Hitting segfaults using the native C interface (we’ll talk about the difference)
  • Hitting interpreter bugs

The last one is hard to demonstrate directly here: the CRuby interpreter is more complicated than almost any code I’ve worked with so far, and I have limited time to dedicate to this side quest. Instead, here’s an example of a bug report so you know it’s at least possible.

When I searched around and asked an LLM for examples, it suggested a few cases in pure Ruby that might also cause segfaults. Fairly predictably, the LLM was just wrong.

I’ll go through the cases anyway.

  1. We garbage collect an object and then reference it after.
s = "Hello, John!"
id = s.object_id
s = nil
GC.start # start the garbage collector
puts ObjectSpace._id2ref(id)

This doesn’t segfault, but gives us useful information:

'ObjectSpace._id2ref': "16" is recycled object (RangeError).

  2. We reference some object that doesn’t exist.
bad_id = 430123456789
puts ObjectSpace._id2ref(bad_id)

It turns out this doesn’t raise any error at all. On my machine it prints 430123456789, and I genuinely have no idea why. That is a question for another time.

It’s safe to say, though, that we cannot segfault in these two ways. That leaves the remaining methods, both of which call into memory-unsafe code; in our case, C.
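To make the contrast concrete: the worst a failed lookup does is raise an ordinary exception, which Ruby code can rescue and carry on from. Here is a minimal sketch of that, assuming Ruby 3.4 (newer versions mark `ObjectSpace._id2ref` as deprecated). The helper name `safe_id2ref` and the id `2**40` are made up for illustration.

```ruby
# Look up an object by id, turning a failed lookup into a plain string.
# ObjectSpace._id2ref is the same call used in the attempts above.
def safe_id2ref(id)
  ObjectSpace._id2ref(id)
rescue RangeError => e
  "rescued RangeError: #{e.message}"
end

# 2**40 is a hypothetical id, assumed not to belong to any live object
# in a script this short.
puts safe_id2ref(2**40)
puts "still running"
```

The second `puts` still runs, which is exactly what will not happen once we cross into C.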

We first need to discuss what separates an FFI from a native extension. Essentially, it comes down to what each one lets you do. With the FFI, you’re usually using code that wasn’t written specifically to work with Ruby. The FFI gem gives you the basic types you’d need, but not much more.
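As a small illustration of calling code that knows nothing about Ruby, here is a sketch that calls the C library's `strlen`. To keep it free of gem dependencies it uses Fiddle, which ships with Ruby 3.4, rather than the FFI gem used later in this article; the idea is the same, but the API is different. The constant name `STRLEN` is my own.

```ruby
require "fiddle"

# Open the symbols already loaded into the current process, which includes
# the C standard library that Ruby itself links against.
libc = Fiddle.dlopen(nil)

# Describe the C signature: size_t strlen(const char *s).
STRLEN = Fiddle::Function.new(
  libc["strlen"],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# The Ruby string is handed over as a raw pointer; strlen counts its bytes.
puts STRLEN.call("Hello, John!")
```

Notice that all we told Ruby about `strlen` is its argument and return types. Nothing checks that the pointer we pass is valid, which is where the trouble starts.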

A native extension allows you to have access to the underlying pieces of Ruby. If you want to write C code that creates a Ruby class or accesses pieces of the garbage collector, you’ll need to invest in writing a native extension.

Let’s see how both of these options can go wrong.

FFI

Take some C.

void unsafe_function(void) {
	int* ptr = 0;
	*ptr = 5;
}

Compile it to a shared library now! This lets us reference it in some other code.

clang -shared -o unsafe.so -fPIC unsafe.c

Now we reference it from some Ruby code and see how it goes.

require "ffi"

module UnsafeLib
	extend FFI::Library
	ffi_lib "unsafe.so" # the name of your shared library
	attach_function :unsafe_function, [], :void
end

UnsafeLib.unsafe_function

This crashes!!! Ruby dumps so much information here:

a.rb:9: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]

-- Crash Report log information --------------------------------------------
   See Crash Report log file in one of the following locations:
     * ~/Library/Logs/DiagnosticReports
     * /Library/Logs/DiagnosticReports
   for more details.
Don't forget to include the above Crash Report log file in bug reports.

-- Control frame information -----------------------------------------------
c:0003 p:---- s:0010 e:000009 CFUNC  :unsafe_function
c:0002 p:0018 s:0006 e:000005 EVAL   a.rb:9 [FINISH]
c:0001 p:0000 s:0003 E:0026e0 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
a.rb:9:in '<main>'
a.rb:9:in 'unsafe_function'

-- Threading information ---------------------------------------------------
Total ractor count: 1
Ruby thread count for this ractor: 1

-- Machine register context ------------------------------------------------
  x0: 0x00006000033145e0  x1: 0xecb191d55eff00b2  x2: 0x0000000102ece000
  x3: 0x0000600003316640  x4: 0x0000000000000b61  x5: 0x0000000000000018
  x6: 0x0000600003314100  x7: 0x000000013b80f400 x18: 0x0000000000000000
 x19: 0x0000600000436b50 x20: 0x000000016dd8ab70 x21: 0x00000001020fbf8c
 x22: 0x000000016dd8ab70 x23: 0x0000600000436b98 x24: 0x0000000140028030
 x25: 0x0000000102590788 x26: 0x0000000055550083 x27: 0x0000600003d08570
 x28: 0x0000000000000000  lr: 0x00000001025e0050  fp: 0x000000016dd8a9f0
  sp: 0x000000016dd8a9e0

-- C level backtrace information -------------------------------------------
/opt/homebrew/Cellar/ruby/3.4.1/lib/libruby.3.4.dylib(rb_vm_bugreport+0x344) [0x102c52114]
/opt/homebrew/Cellar/ruby/3.4.1/lib/libruby.3.4.dylib(rb_bug_for_fatal_signal) [0x102aea848]
/opt/homebrew/Cellar/ruby/3.4.1/lib/libruby.3.4.dylib(sig_do_nothing) [0x102bd69e0]
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x38) [0x192a2ede4]

...

The rest of the details are truncated for brevity.

As you can see, we crash in the CFUNC which calls into unsafe_function. Interestingly, we get a really good printout here. I think this is the Ruby process responding to the SIGSEGV signal and giving us good diagnostics. If we do the same in plain C, we get a lot less to read.

int main(void) {
	int* ptr = 0;
	*ptr = 5;
	return 0;
}

Which outputs (using the Bash shell):

Segmentation fault: 11
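Going back to the guess that the detailed report comes from Ruby's own SIGSEGV handling: one way to poke at it without writing any C is to have a child Ruby process send SIGSEGV to itself and capture what it prints. This is only a sketch. It never touches invalid memory, it just delivers the signal, and the helper name `crash_child` is made up.

```ruby
require "open3"
require "rbconfig"

# Run a one-line script in a child Ruby process. The child sends SIGSEGV to
# itself; the sleep gives the signal time to arrive before the script ends.
def crash_child
  script = 'Process.kill("SEGV", Process.pid); sleep 1'
  _stdout, stderr, status = Open3.capture3(RbConfig.ruby, "-e", script)
  [stderr, status]
end

stderr, status = crash_child
puts stderr.lines.first # first line of the child's crash report
puts(status.success? ? "child exited cleanly" : "child did not exit cleanly")
```

The parent keeps running and can read the child's `[BUG] Segmentation fault` report from stderr, which fits the idea that the report is written by the Ruby process itself before it dies.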

Native C extensions

Let’s write a native C extension for Ruby. We’ll invoke the same unsafe behaviour as in the FFI example.

(I use this reference in this section, as it’s written by a very reputable source.)

Aside: at this point I got really lost on a side quest for whether I could find the closed form of a sum of squares without knowing the closed form of a sum of natural numbers. I’m sorry for the delay.

First we write the C code required for it:

#include "ruby.h"

VALUE hello_world(VALUE self) {
    return rb_str_new_cstr("Hello, World!");
}

void Init_extension() {
    rb_define_global_function("hello_world", hello_world, 0);
}

Apparently we need to use something called mkmf to generate a Makefile to build the “bundle”.

require 'mkmf'

create_makefile('extension')

I named this extconf.rb and ran it. It generated a Makefile with all the necessary pieces to compile my native C extension into a .bundle file that Ruby can load. (It’s a .so on Linux or Windows from what I know.)
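If you want to check which suffix your own Ruby expects rather than take my word for it, `RbConfig` records it. A quick sketch, where the helper name `extension_filename` is my own:

```ruby
require "rbconfig"

# RbConfig describes how this Ruby was built, including the file suffix it
# uses for compiled extensions.
def extension_filename(name)
  "#{name}.#{RbConfig::CONFIG["DLEXT"]}"
end

puts extension_filename("extension")
```

On the machine used for this article that should match the .bundle file the Makefile produced.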

Now, we test it in irb:

irb(main):001> require './extension'
=> true
irb(main):002> puts hello_world
Hello, World!
=> nil

Great, now we can see if we can make it trigger a segfault.

Modify the C code as follows:

#include "ruby.h"

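VALUE dangerous_function(VALUE self) {
    /* An arbitrary address that this process almost certainly does not own. */
    int* i = (int *)0xBEEF;
    /* Writing through it is what triggers the fault. */
    *i = 5;
    /* Never reached; Qnil is the conventional "nothing" return value. */
    return Qnil;
}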

void Init_extension() {
    rb_define_global_function("dangerous_function", dangerous_function, 0);
}

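(I first tried this by dereferencing 0, but that gave me a SIGTRAP instead of a SIGSEGV, so I switched to something decidedly unvegan. My best guess is that this was the optimising compiler rather than macOS itself: a null dereference is undefined behaviour, and clang tends to replace it with a trap instruction when optimisations are on, which they probably are in the generated Makefile.)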

Running this test file:

require './extension'

dangerous_function

Gives us the same SIGSEGV as before:

test.rb:3: [BUG] Segmentation fault at 0x000000000000beef
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]

-- Crash Report log information --------------------------------------------
   See Crash Report log file in one of the following locations:
     * ~/Library/Logs/DiagnosticReports
     * /Library/Logs/DiagnosticReports
   for more details.
Don't forget to include the above Crash Report log file in bug reports.

-- Control frame information -----------------------------------------------
c:0003 p:---- s:0010 e:000009 CFUNC  :dangerous_function
c:0002 p:0009 s:0006 e:000005 EVAL   test.rb:3 [FINISH]
c:0001 p:0000 s:0003 E:001fc0 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
test.rb:3:in '<main>'
test.rb:3:in 'dangerous_function'

...

The rest of the details are truncated for brevity.

Hopefully this clarifies how one can end up with segfaults in a memory-safe language such as Ruby.