Overview
Ruby C extensions provide a mechanism for writing performance-critical code in C while maintaining integration with Ruby's object model and garbage collector. Extensions compile into shared libraries that Ruby loads dynamically, allowing direct access to Ruby's internal API and memory management systems.
The Ruby C API exposes core functionality through a set of macros and functions defined in ruby.h. Extensions interact with Ruby objects through the VALUE type, a pointer-sized integer that holds either a pointer to a heap-allocated object or an immediate value such as a small integer, nil, true, or false. The API handles object creation, method definition, class inheritance, and exception handling while maintaining Ruby's dynamic typing system.
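That a VALUE is not always a pointer can be observed from Ruby itself. The following is a small sketch that assumes CRuby (MRI), where a small integer n is stored directly in the VALUE word as 2n + 1 and object_id of such an immediate reports that word; other Ruby implementations may behave differently.

```ruby
# Assumes CRuby: a small Integer n is encoded in the VALUE word as 2n + 1,
# and object_id of an immediate value returns that encoded word.
n = 42
encoded = n.object_id

# Two occurrences of the same small integer are the very same object,
# because no heap allocation is involved.
same_object = 42.equal?(n)

puts "VALUE word for #{n}: #{encoded}"
puts "identical objects: #{same_object}"
```

This is why C code converts with macros such as INT2NUM and NUM2INT instead of dereferencing a VALUE directly.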
Ruby extensions follow a specific initialization pattern. Each extension defines an Init_extension_name function, where extension_name matches the name of the shared library, and Ruby calls it when the library is loaded. This function registers classes, modules, and methods with the Ruby interpreter.
# Basic extension structure
require_relative 'my_extension'

class MyClass
  include MyExtension
end

result = MyClass.new.fast_operation(large_dataset)
Extensions commonly optimize computationally intensive operations, provide bindings to C libraries, and implement algorithms that benefit from manual memory management. They integrate seamlessly with Ruby's object model, supporting inheritance, mixins, and metaprogramming features.
Basic Usage
Creating a C extension requires writing C source code, defining Ruby bindings, and configuring the build process. Extensions declare their structure through extconf.rb files that generate appropriate Makefiles for compilation.
The fundamental building block involves defining C functions that accept and return VALUE types. Ruby objects appear as VALUE in C code, requiring conversion functions to access underlying data types.
# extconf.rb
require 'mkmf'
create_makefile('my_extension')
// my_extension.c
#include <ruby.h>

// Reverses the string in place, byte by byte. Multibyte characters are
// not preserved, so this is only correct for single-byte encodings.
static VALUE
rb_string_reverse_bang(VALUE self)
{
    char *ptr, *end;
    char tmp;

    StringValue(self);
    rb_str_modify(self);  // raises if frozen, unshares a shared buffer
    ptr = RSTRING_PTR(self);
    end = ptr + RSTRING_LEN(self) - 1;
    while (ptr < end) {
        tmp = *ptr;
        *ptr = *end;
        *end = tmp;
        ptr++;
        end--;
    }
    return self;
}

void
Init_my_extension(void)
{
    VALUE cString = rb_cString;
    rb_define_method(cString, "reverse_bang", rb_string_reverse_bang, 0);
}
Method registration uses rb_define_method for instance methods, rb_define_singleton_method for class methods, and rb_define_module_function for module functions. The final argument specifies how many arguments the C function takes in addition to self; the special values -1 and -2 select the variable-argument calling conventions.
// Defining methods with different arities
rb_define_method(klass, "no_args", method_0, 0);
rb_define_method(klass, "one_arg", method_1, 1);
rb_define_method(klass, "two_args", method_2, 2);
rb_define_method(klass, "var_args", method_var, -1);
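The arity given at registration is visible from Ruby through Method#arity. As a quick check, the sketch below inspects three core methods that CRuby itself registers through this API; the choice of methods is illustrative and the values assume CRuby.

```ruby
# Arity values reported by Ruby for C-implemented core methods (CRuby).
fixed_zero = "abc".method(:reverse).arity    # registered with argc 0
fixed_one  = "abc".method(:include?).arity   # registered with argc 1
variadic   = [].method(:push).arity          # registered with argc -1

puts "reverse: #{fixed_zero}, include?: #{fixed_one}, push: #{variadic}"
```

Calling a fixed-arity method with the wrong number of arguments raises ArgumentError before the C function is entered, so only the -1 and -2 forms need to count arguments themselves.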
Building extensions requires running ruby extconf.rb followed by make. The resulting shared library loads through require statements in Ruby code. Extensions must handle Ruby's garbage collection by properly marking and protecting objects from collection during C operations.
// Creating new Ruby objects
VALUE str = rb_str_new_cstr("Hello from C");
VALUE array = rb_ary_new();
VALUE hash = rb_hash_new();
VALUE number = INT2NUM(42);
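The shared library that make produces, and that require later searches for, carries a platform-specific suffix. The sketch below uses RbConfig to show the expected file name; my_extension is the name from the extconf.rb example, and library_file is only an illustrative variable.

```ruby
require 'rbconfig'

# Suffix of compiled extensions on this platform, for example "so" or "bundle".
dlext = RbConfig::CONFIG['DLEXT']

# 'my_extension' is the name passed to create_makefile in extconf.rb.
library_file = "my_extension.#{dlext}"

puts "make is expected to produce #{library_file}"
puts "require 'my_extension' looks for that file on $LOAD_PATH"
```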
Advanced Usage
Advanced C extensions implement custom classes, complex data structures, and sophisticated memory management patterns. Extensions can define entirely new Ruby classes with custom allocation and deallocation functions.
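// Custom class definition with allocator
// Data_Wrap_Struct and Data_Get_Struct are the legacy interface; current
// Ruby versions recommend TypedData_Wrap_Struct and TypedData_Get_Struct.
typedef struct {
    double x, y, z;
    int flags;
} Point3D;

// Defined before point_alloc so the allocator can reference it
static void
point_free(void *ptr)
{
    if (ptr) xfree(ptr);
}

static VALUE
point_alloc(VALUE klass)
{
    Point3D *point = ALLOC(Point3D);
    point->x = point->y = point->z = 0.0;
    point->flags = 0;
    // No mark function is needed because the struct holds no VALUEs
    return Data_Wrap_Struct(klass, NULL, point_free, point);
}

static VALUE
point_initialize(int argc, VALUE *argv, VALUE self)
{
    Point3D *point;
    VALUE x, y, z;

    Data_Get_Struct(self, Point3D, point);
    // "21": two required arguments, one optional (z is nil when omitted)
    rb_scan_args(argc, argv, "21", &x, &y, &z);
    point->x = NUM2DBL(x);
    point->y = NUM2DBL(y);
    point->z = NIL_P(z) ? 0.0 : NUM2DBL(z);
    return self;
}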
Extensions implement module mixins and class inheritance by defining modules and extending existing classes. Complex extensions often expose multiple interconnected classes that share data structures and functionality.
// Module definition and inclusion
// vector_alloc, vector_init, vector_multiply, and vector_magnitude are
// defined elsewhere in the extension.
static VALUE mMathUtils;
static VALUE cVector;
static VALUE cMatrix;

void
Init_advanced_math(void)
{
    mMathUtils = rb_define_module("MathUtils");
    cVector = rb_define_class_under(mMathUtils, "Vector", rb_cObject);
    cMatrix = rb_define_class_under(mMathUtils, "Matrix", rb_cObject);
    rb_define_alloc_func(cVector, vector_alloc);
    rb_define_method(cVector, "initialize", vector_init, -1);
    rb_define_method(cVector, "*", vector_multiply, 1);
    rb_define_method(cVector, "magnitude", vector_magnitude, 0);
}
Method chaining and fluent interfaces require careful memory management and object lifecycle handling. Extensions must protect intermediate objects from garbage collection during complex operations.
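// Fluent interface implementation
// vector_normalize, vector_scale, and vector_rotate are defined elsewhere
// and each return a Vector object. M_PI requires <math.h>.
static VALUE
vector_chain_operations(VALUE self)
{
    VALUE result;

    // Each intermediate object stays reachable through the local variable
    // result, which Ruby's conservative stack scan can see. If only a raw
    // C pointer taken from a VALUE remains in use, keep the VALUE alive
    // with RB_GC_GUARD.
    result = vector_normalize(self);
    result = vector_scale(result, DBL2NUM(2.0));
    result = vector_rotate(result, DBL2NUM(M_PI / 4));
    return result;
}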
Error Handling & Debugging
C extensions must handle both Ruby exceptions and C-level errors appropriately. Ruby exceptions raised from C, for example by rb_raise, are implemented with setjmp/longjmp, so control never returns to the statement after the raise and any C cleanup code that follows it is skipped.
// Exception handling patterns
// isnan and isinf require <math.h>
static VALUE
safe_division(VALUE self, VALUE numerator, VALUE denominator)
{
    double num, denom, result;

    num = NUM2DBL(numerator);
    denom = NUM2DBL(denominator);
    if (denom == 0.0) {
        rb_raise(rb_eZeroDivisionError, "divided by 0");
    }
    result = num / denom;
    if (isnan(result) || isinf(result)) {
        rb_raise(rb_eFloatDomainError, "invalid result: %f", result);
    }
    return DBL2NUM(result);
}
Type checking and conversion errors require defensive programming practices. Extensions should validate input types before accessing underlying C data structures to prevent segmentation faults and memory corruption.
// Robust type checking
static VALUE
process_string_or_array(VALUE obj)
{
    if (RB_TYPE_P(obj, T_STRING)) {
        // Handle string case
        StringValue(obj);
        return rb_str_length(obj);
    } else if (RB_TYPE_P(obj, T_ARRAY)) {
        // Handle array case
        return LONG2NUM(RARRAY_LEN(obj));
    } else {
        rb_raise(rb_eTypeError,
                 "expected String or Array, got %"PRIsVALUE,
                 rb_obj_class(obj));
    }
}
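The same discipline is visible in Ruby's own C-implemented core methods, which raise TypeError instead of reading the wrong C structure. A short sketch follows; type_error_for is a hypothetical helper introduced only for this illustration.

```ruby
# Returns the TypeError raised by the block, or nil if none was raised.
def type_error_for
  yield
  nil
rescue TypeError => e
  e
end

bad_count  = type_error_for { [1, 2, 3].first("two") }  # expects an Integer
bad_factor = type_error_for { "ab" * "3" }              # expects an Integer
valid      = type_error_for { [1, 2, 3].first(2) }      # well-typed call

puts bad_count.message if bad_count
puts bad_factor.message if bad_factor
```

An extension method that checks its arguments with RB_TYPE_P or Check_Type fails in the same recoverable way.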
Memory management debugging requires careful tracking of object allocation and protection. Extensions should call rb_gc_mark from the mark function of any custom data structure that holds Ruby objects, and release C memory in the deallocator function.
// Mark and free functions for a struct that holds Ruby objects
typedef struct {
    VALUE callback;
    VALUE data_store;
    int ref_count;
} CallbackManager;

static void
callback_manager_mark(void *ptr)
{
    CallbackManager *mgr = (CallbackManager *)ptr;
    if (mgr) {
        rb_gc_mark(mgr->callback);
        rb_gc_mark(mgr->data_store);
    }
}

static void
callback_manager_free(void *ptr)
{
    CallbackManager *mgr = (CallbackManager *)ptr;
    if (mgr) {
        // Custom cleanup logic
        mgr->ref_count = 0;
        xfree(mgr);
    }
}
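Stack unwinding during exceptions skips any C cleanup code that follows the raising call, so memory allocated in C can leak. Wrap calls that may raise in rb_protect, which reports the exception through a status flag instead of unwinding past the caller, or register cleanup with rb_ensure, the C equivalent of a Ruby ensure block.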
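For comparison, this is the Ruby-level behavior that rb_ensure reproduces in C: the cleanup step runs whether or not the body raises. The names with_buffer and log are illustrative only.

```ruby
log = []

# Acquire a resource, run the block, and always release the resource.
def with_buffer(log)
  log << :allocate
  yield
ensure
  log << :release   # runs on normal exit and on exceptions alike
end

begin
  with_buffer(log) { raise ArgumentError, "bad input" }
rescue ArgumentError
  log << :rescued
end

puts log.inspect
```

In C, the body and the cleanup step become two separate functions passed to rb_ensure, since a plain statement placed after the raising call would never execute.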
Performance & Memory
C extensions can deliver large performance improvements for computationally intensive operations. Speedups of 5-50x over pure Ruby are commonly cited for numeric computation, string processing, and data transformation, but the actual gain depends on the workload and should be measured.
Memory efficiency improves through direct control over allocation patterns and reduced object creation overhead. Extensions can implement custom memory pools, reuse allocated buffers, and minimize garbage collection pressure.
// High-performance string processing
// toupper requires <ctype.h>
static VALUE
fast_string_transform(VALUE str)
{
    const char *src;
    char *ptr, *end;
    long len;
    VALUE result;

    StringValue(str);
    len = RSTRING_LEN(str);
    // Allocate result string once
    result = rb_str_new(NULL, len);
    ptr = RSTRING_PTR(result);
    end = ptr + len;
    // Direct memory operations
    src = RSTRING_PTR(str);
    while (ptr < end) {
        // Cast to unsigned char: a negative char passed to toupper is undefined
        *ptr = (char)toupper((unsigned char)*src);
        ptr++;
        src++;
    }
    return result;
}
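Performance claims are best checked on the actual workload. The sketch below uses Ruby's Benchmark module to time a deliberately loop-based pure-Ruby byte reversal against the built-in String#reverse, which CRuby implements in C. The built-in stands in for an extension method here; ruby_reverse and the test data are assumptions of this example, not measurements from the text.

```ruby
require 'benchmark'

# Pure-Ruby byte reversal written as an explicit loop (illustrative helper).
def ruby_reverse(str)
  out = String.new(capacity: str.bytesize)
  (str.bytesize - 1).downto(0) { |i| out << str.getbyte(i) }
  out.force_encoding(str.encoding)
end

data = "abcdefghij" * 10_000

ruby_time = Benchmark.realtime { ruby_reverse(data) }
c_time    = Benchmark.realtime { data.reverse }

puts format("pure Ruby: %.5fs, C-backed: %.5fs", ruby_time, c_time)
```

Running the comparison several times and on realistic input sizes gives a more trustworthy ratio than a single run.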
Garbage collection interaction affects extension performance significantly. Objects allocated in C must integrate properly with Ruby's mark-and-sweep collector. Extensions should minimize object allocation in tight loops and reuse existing objects where possible.
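// Memory pool implementation
// The pool recycles blocks of one fixed size. global_pool must be set up
// (pool array, capacity, block_size) before first use.
typedef struct {
    void **pool;
    size_t capacity;
    size_t count;
    size_t block_size;
} MemoryPool;

static MemoryPool *global_pool = NULL;

static void *
pool_allocate(size_t size)
{
    // Reuse a cached block only when it has the size the caller asked for
    if (global_pool && global_pool->count > 0 &&
        size == global_pool->block_size) {
        return global_pool->pool[--global_pool->count];
    }
    return xmalloc(size);
}

static void
pool_deallocate(void *ptr)
{
    // Callers must only return blocks of block_size bytes to the pool
    if (global_pool && global_pool->count < global_pool->capacity) {
        global_pool->pool[global_pool->count++] = ptr;
    } else {
        xfree(ptr);
    }
}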
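The effect of reusing objects instead of allocating inside a loop can be observed from Ruby with GC.stat. This is a rough sketch that assumes CRuby's total_allocated_objects counter; the allocations helper and the loop bodies are illustrative.

```ruby
# Count objects allocated while running a block (CRuby's GC.stat counter).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

piece = "x".freeze
iterations = 1_000

# Allocates a fresh String on every iteration.
fresh = allocations do
  iterations.times { String.new(piece) }
end

# Reuses one buffer: clear it and append in place.
buffer = String.new(capacity: 16)
reused = allocations do
  iterations.times { buffer.clear; buffer << piece }
end

puts "fresh: #{fresh} objects, reused: #{reused} objects"
```

The same measurement can be wrapped around a call into an extension to see how many Ruby objects a C function creates per invocation.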
Cache-friendly data layouts and SIMD operations can further optimize performance. Extensions working with large datasets benefit from arranging data structures to maximize cache locality and minimize memory bandwidth requirements.
Production Patterns
Production C extensions require careful consideration of deployment environments, dependency management, and platform compatibility. Extensions must compile across different operating systems, Ruby versions, and system architectures.
Deployment strategies typically involve pre-compiled gems for common platforms or compilation during gem installation. The extconf.rb file handles platform detection and compiler configuration automatically.
# Production extconf.rb
require 'mkmf'

# Check for required headers and libraries
have_header('sys/time.h') or abort 'Missing sys/time.h'
have_library('m', 'sin') or abort 'Missing math library'

# Platform-specific optimizations
case RbConfig::CONFIG['host_os']
when /darwin/
  # -march=native ties the binary to the build machine; avoid it for pre-compiled gems
  $CFLAGS << ' -O3 -march=native'
when /linux/
  $CFLAGS << ' -O3 -fPIC'
when /mingw/
  $CFLAGS << ' -O2'  # MinGW uses GCC-style flags
when /mswin/
  $CFLAGS << ' /O2'  # MSVC-style flag
end

create_makefile('production_extension')
Error reporting and logging integration helps monitor extension behavior in production environments. Extensions should provide detailed error information and integrate with Ruby's logging infrastructure.
// Production error handling
// risky_operation is defined elsewhere with the signature VALUE (*)(VALUE)
static VALUE
production_operation(VALUE self, VALUE data)
{
    int status;
    VALUE result = Qnil;

    // Use rb_protect for exception safety
    result = rb_protect(risky_operation, data, &status);
    if (status != 0) {
        // Log error details
        VALUE error_info = rb_errinfo();
        rb_funcall(rb_mKernel, rb_intern("warn"), 1,
                   rb_sprintf("Extension error in %s: %"PRIsVALUE,
                              __func__, error_info));
        // Re-raise the original exception after logging
        rb_jump_tag(status);
    }
    return result;
}
Performance monitoring hooks allow tracking extension behavior in production systems. Extensions can expose metrics through Ruby interfaces for integration with monitoring systems.
// Performance monitoring integration
// clock_gettime requires <time.h>; expensive_computation is defined elsewhere
static struct {
    long call_count;
    double total_time;
    double max_time;
} operation_stats = {0, 0.0, 0.0};

static VALUE
monitored_operation(VALUE self, VALUE input)
{
    struct timespec start, end;
    double elapsed;
    VALUE result;

    clock_gettime(CLOCK_MONOTONIC, &start);
    result = expensive_computation(input);
    clock_gettime(CLOCK_MONOTONIC, &end);
    elapsed = (end.tv_sec - start.tv_sec) +
              (end.tv_nsec - start.tv_nsec) / 1e9;
    operation_stats.call_count++;
    operation_stats.total_time += elapsed;
    if (elapsed > operation_stats.max_time) {
        operation_stats.max_time = elapsed;
    }
    return result;
}
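One possible shape for the Ruby-facing side of such metrics is sketched below in plain Ruby. OperationStats, record, and to_h are hypothetical names; in a real extension the counters would live in the C struct and a C-defined method would return the snapshot.

```ruby
# Hypothetical Ruby view of the counters kept in operation_stats.
class OperationStats
  def initialize
    @call_count = 0
    @total_time = 0.0
    @max_time   = 0.0
  end

  # Time a block with the monotonic clock and fold it into the totals.
  def record
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    @call_count += 1
    @total_time += elapsed
    @max_time = elapsed if elapsed > @max_time
    result
  end

  # Snapshot suitable for handing to a monitoring client.
  def to_h
    mean = @call_count.zero? ? 0.0 : @total_time / @call_count
    { calls: @call_count, total: @total_time, max: @max_time, mean: mean }
  end
end

stats = OperationStats.new
answer = stats.record { 21 * 2 }
snapshot = stats.to_h
puts snapshot.inspect
```

Returning a plain Hash keeps the interface independent of any particular monitoring library.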
Common Pitfalls
C extensions face several common failure modes that can cause crashes, memory leaks, or incorrect behavior. Understanding these pitfalls prevents production issues and debugging difficulties.
Object lifetime management represents the most frequent source of errors. Ruby's garbage collector can free objects that C code still references, leading to segmentation faults when accessing deallocated memory.
// WRONG: Storing Ruby objects in global variables
static VALUE global_callback = Qnil; // Dangerous!

static VALUE
store_callback(VALUE self, VALUE callback)
{
    global_callback = callback; // Object may be garbage collected
    return Qnil;
}

// CORRECT: Using rb_global_variable or rb_gc_register_address
static VALUE global_callback = Qnil;

void
Init_callback_extension(void)
{
    rb_global_variable(&global_callback); // Protect from GC
}
String modification requires understanding Ruby's string mutability model. Modifying strings without proper checks can corrupt string data or cause unexpected behavior when strings are frozen or shared.
// WRONG: Modifying strings without checks
static VALUE
unsafe_string_modify(VALUE str)
{
    char *ptr = RSTRING_PTR(str);
    ptr[0] = 'X'; // May modify frozen or shared string
    return str;
}

// CORRECT: Proper string modification
static VALUE
safe_string_modify(VALUE str)
{
    StringValue(str);
    rb_str_modify(str); // Ensures string is modifiable
    char *ptr = RSTRING_PTR(str);
    ptr[0] = 'X';
    return str;
}
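The guarantees that rb_str_modify provides are the ones Ruby programmers already rely on with core String methods. A brief sketch of both cases, using only built-in methods: a frozen receiver raises, and a duplicated string is never changed through its copy even if the two share a buffer internally.

```ruby
# In-place methods on a frozen string raise instead of corrupting data.
frozen = "hello".freeze
error = begin
  frozen.upcase!
  nil
rescue FrozenError => e
  e
end

# Modifying a copy must leave the original untouched.
original = "shared buffer example " * 4
copy = original.dup
copy.upcase!

puts error.class
puts original == copy
```

An extension that writes through RSTRING_PTR without calling rb_str_modify first can break either guarantee.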
Type assumptions cause runtime errors when methods receive unexpected object types. Extensions must validate input types explicitly rather than assuming callers provide correct types.
// Handling unexpected nil values and out-of-range indexes
static VALUE
safe_array_access(VALUE array, VALUE index)
{
    if (NIL_P(array)) {
        return Qnil;
    }
    if (!RB_TYPE_P(array, T_ARRAY)) {
        rb_raise(rb_eTypeError, "expected Array");
    }
    long idx = NUM2LONG(index);
    long len = RARRAY_LEN(array);
    if (idx < 0) idx += len;  // negative indexes count from the end
    if (idx < 0 || idx >= len) {
        return Qnil;
    }
    return RARRAY_AREF(array, idx);
}
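The index handling above mirrors the behavior of Ruby's built-in Array#[], which makes that method a convenient reference when testing such a function. A minimal illustration with arbitrary sample values:

```ruby
ary = [10, 20, 30]

last      = ary[-1]  # negative index counts from the end
past_end  = ary[3]   # one past the last element
too_small = ary[-4]  # before the first element

puts [last, past_end, too_small].inspect
```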
Reference
Core API Functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
| rb_define_class(name, super) | const char*, VALUE | VALUE | Creates new class under Object |
| rb_define_class_under(outer, name, super) | VALUE, const char*, VALUE | VALUE | Creates nested class |
| rb_define_module(name) | const char* | VALUE | Creates new module |
| rb_define_method(klass, name, func, argc) | VALUE, const char*, VALUE(*)(), int | void | Defines instance method |
| rb_define_singleton_method(obj, name, func, argc) | VALUE, const char*, VALUE(*)(), int | void | Defines singleton method |
| rb_define_attr(klass, name, read, write) | VALUE, const char*, int, int | void | Defines attribute accessors |
Object Creation Functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
| rb_str_new(ptr, len) | const char*, long | VALUE | Creates string from C buffer |
| rb_str_new_cstr(cstr) | const char* | VALUE | Creates string from null-terminated C string |
| rb_ary_new() | None | VALUE | Creates empty array |
| rb_ary_new_capa(capa) | long | VALUE | Creates array with capacity |
| rb_hash_new() | None | VALUE | Creates empty hash |
| INT2NUM(val) | int | VALUE | Converts int to Ruby number |
| DBL2NUM(val) | double | VALUE | Converts double to Ruby number |
Type Checking Macros
| Macro | Parameters | Returns | Description |
|---|---|---|---|
| RB_TYPE_P(obj, type) | VALUE, enum ruby_value_type | int | Tests object type |
| NIL_P(obj) | VALUE | int | Tests for nil |
| FIXNUM_P(obj) | VALUE | int | Tests for Fixnum (an immediate small Integer) |
| SYMBOL_P(obj) | VALUE | int | Tests for Symbol |
| StringValue(obj) | VALUE | VALUE | Converts obj to a String in place via to_str; raises TypeError if that fails |
| Check_Type(obj, type) | VALUE, enum ruby_value_type | void | Raises TypeError if wrong type |
Data Access Functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
| NUM2INT(obj) | VALUE | int | Converts Ruby number to int |
| NUM2LONG(obj) | VALUE | long | Converts Ruby number to long |
| NUM2DBL(obj) | VALUE | double | Converts Ruby number to double |
| RSTRING_PTR(str) | VALUE | char* | Gets string data pointer |
| RSTRING_LEN(str) | VALUE | long | Gets string length |
| RARRAY_PTR(ary) | VALUE | VALUE* | Gets array data pointer |
| RARRAY_LEN(ary) | VALUE | long | Gets array length |
Memory Management Functions
| Function | Parameters | Returns | Description |
|---|---|---|---|
| ALLOC(type) | type | type* | Allocates memory for type |
| ALLOC_N(type, n) | type, long | type* | Allocates array of n elements |
| REALLOC_N(ptr, type, n) | type*, type, long | type* | Reallocates array |
| xfree(ptr) | void* | void | Frees allocated memory |
| Data_Wrap_Struct(klass, mark, free, ptr) | VALUE, RUBY_DATA_FUNC, RUBY_DATA_FUNC, void* | VALUE | Wraps C struct as Ruby object |
| Data_Get_Struct(obj, type, var) | VALUE, type, type* | void | Extracts C struct from Ruby object |
Exception Handling
| Function | Parameters | Returns | Description |
|---|---|---|---|
| rb_raise(exception, fmt, ...) | VALUE, const char*, ... | void | Raises exception with message |
| rb_protect(func, arg, state) | VALUE(*)(), VALUE, int* | VALUE | Executes function with exception protection |
| rb_rescue(func, arg1, rescue, arg2) | VALUE(*)(), VALUE, VALUE(*)(), VALUE | VALUE | Executes with rescue handler |
| rb_ensure(func, arg1, ensure, arg2) | VALUE(*)(), VALUE, VALUE(*)(), VALUE | VALUE | Executes with ensure handler |
Built-in Exception Classes
| Constant | Description |
|---|---|
| rb_eStandardError | Standard error base class |
| rb_eRuntimeError | Runtime error |
| rb_eTypeError | Type mismatch error |
| rb_eArgumentError | Invalid argument error |
| rb_eIndexError | Array/string index error |
| rb_eZeroDivisionError | Division by zero error |
| rb_eNoMemError | Out of memory error |
| rb_eIOError | Input/output error |
Method Arity Values
| Value | Meaning |
|---|---|
| 0 | No arguments |
| 1, 2, 3, etc. | Exact number of required arguments |
| -1 | Variable arguments (use rb_scan_args) |
| -2 | Variable arguments passed as array |
Build Configuration Options
| Option | Purpose |
|---|---|
| have_header(header) | Check for system header availability |
| have_library(lib, func) | Check for library and function |
| have_func(func, header) | Check for specific function |
| pkg_config(package) | Use pkg-config for library detection |
| $CFLAGS | Compiler flags |
| $LDFLAGS | Linker flags |
| $INCFLAGS | Include directory flags |