Foreign Function Interface

Complete guide to Ruby's Foreign Function Interface for calling C libraries and managing native code integration.


Overview

Foreign Function Interface (FFI) in Ruby provides a mechanism for calling functions in native libraries written in C and other languages that follow C calling conventions. Ruby's FFI implementation offers a high-level abstraction over traditional C extensions, eliminating the need to write C wrapper code while maintaining access to system libraries and native performance.

The ffi gem serves as Ruby's primary FFI implementation, built on top of libffi. It handles type conversion between Ruby objects and C data types, manages memory allocation and deallocation, and provides error handling for native function calls. FFI operates through library bindings that define function signatures, data structures, and callback mechanisms.

require 'ffi'

module MathLib
  extend FFI::Library
  ffi_lib 'libm'
  
  attach_function :sin, [:double], :double
  attach_function :cos, [:double], :double
end

result = MathLib.sin(1.0)
# => 0.8414709848078965

FFI handles the complexity of calling conventions, stack management, and data marshaling automatically. When attach_function declares a binding, FFI generates the necessary glue code to convert Ruby arguments to C types, invoke the native function, and convert the return value back to a Ruby object.

module SystemInfo
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  attach_function :getuid, [], :uint
  attach_function :getpid, [], :int
end

user_id = SystemInfo.getuid
process_id = SystemInfo.getpid
# Native system calls return immediately

The FFI approach provides several advantages over traditional C extensions. Bindings are plain Ruby and need no compilation step of their own, which keeps them portable across Ruby implementations that support the ffi API, such as MRI, JRuby, and TruffleRuby. Memory allocated through FFI is tied to Ruby objects and released when those objects are garbage collected, although memory owned by the native library must still be managed by hand.

Basic Usage

Setting up FFI requires extending the FFI::Library module and declaring library dependencies through ffi_lib. The library specification accepts standard library names, absolute paths, or version-specific names depending on the target platform.

require 'ffi'

module SQLite
  extend FFI::Library
  ffi_lib 'sqlite3'
  
  # Function declarations map C signatures to Ruby calls
  attach_function :sqlite3_open, [:string, :pointer], :int
  attach_function :sqlite3_close, [:pointer], :int
  attach_function :sqlite3_errmsg, [:pointer], :string
end

Function attachment through attach_function requires three components: the function name, an array of parameter types, and the return type. FFI uses these type specifications to set up the native call and to convert arguments and return values automatically during invocation.

Type mapping covers fundamental C types through Ruby symbols. Integers use :int, :uint, :long, :ulong with appropriate bit width handling. Floating point numbers use :float and :double. String parameters use :string for null-terminated C strings, while :pointer handles generic memory addresses.

module FileOps
  extend FFI::Library
  ffi_lib FFI::Library::LIBC

  attach_function :open, [:string, :int], :int
  attach_function :read, [:int, :pointer, :size_t], :ssize_t
  attach_function :close, [:int], :int
end

# Ruby strings automatically convert to C char*
fd = FileOps.open("/etc/passwd", 0)  # 0 is O_RDONLY
raise "open failed" if fd < 0

# Pointer allocation for output buffers
buffer = FFI::MemoryPointer.new(:char, 1024)
bytes_read = FileOps.read(fd, buffer, 1024)
FileOps.close(fd)

# read returns -1 on error, so only decode a non-negative count
content = buffer.read_string(bytes_read) if bytes_read >= 0

Memory management requires explicit allocation for output parameters and buffers. FFI::MemoryPointer creates managed memory regions that automatically clean up when the Ruby object becomes unreachable. The memory pointer provides read and write methods for accessing the underlying data with appropriate type conversion.

Structure definitions enable complex data exchange between Ruby and C code. FFI::Struct subclasses define field layouts that match C struct definitions, providing automatic packing and alignment according to platform-specific rules.

class TimeVal < FFI::Struct
  layout :tv_sec,  :long,
         :tv_usec, :long
end

module TimeOps
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  attach_function :gettimeofday, [TimeVal, :pointer], :int
end

timeval = TimeVal.new
TimeOps.gettimeofday(timeval, nil)
puts "Seconds: #{timeval[:tv_sec]}, Microseconds: #{timeval[:tv_usec]}"

Array and buffer handling requires an understanding of pointer arithmetic and memory layout. FFI provides array read and write methods for both input and output scenarios, and memory allocated through FFI tracks its own size so that accesses can be bounds-checked.

module ArrayDemo
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  attach_function :qsort, [:pointer, :size_t, :size_t, :pointer], :void
end

# Integer array sorting through C qsort
numbers = [64, 34, 25, 12, 22, 11, 90]
int_array = FFI::MemoryPointer.new(:int, numbers.size)
int_array.write_array_of_int(numbers)

# Comparison function as callback
compare_proc = FFI::Function.new(:int, [:pointer, :pointer]) do |a, b|
  a.read_int <=> b.read_int
end

# Pass the element size from the allocation instead of hard-coding 4
ArrayDemo.qsort(int_array, numbers.size, int_array.type_size, compare_proc)
sorted = int_array.read_array_of_int(numbers.size)
# => [11, 12, 22, 25, 34, 64, 90]

Advanced Usage

Callback functions enable C libraries to invoke Ruby code during execution. FFI callbacks require explicit type signatures and handle the conversion between Ruby blocks or Proc objects and C function pointers. The callback mechanism maintains proper stack management and exception handling across the language boundary.

module EventLoop
  extend FFI::Library
  ffi_lib 'libevent'
  
  # Callback signature: void (*callback)(int fd, short what, void *arg)
  callback :event_callback, [:int, :short, :pointer], :void
  
  attach_function :event_new, [:pointer, :int, :short, :event_callback, :pointer], :pointer
  attach_function :event_add, [:pointer, :pointer], :int
end

# Ruby callback implementation
event_handler = proc do |fd, what, arg|
  puts "Event on fd #{fd}, type #{what}"
  # Ruby code executes in C callback context
end

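# base, socket_fd, flags, and timeout_ptr are assumed to be set up elsewhere
# (for example, base from libevent's event_base_new). Keep event_handler
# referenced for as long as the event stays registered.
event = EventLoop.event_new(base, socket_fd, flags, event_handler, nil)
EventLoop.event_add(event, timeout_ptr)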

Enumeration handling maps C enum constants to Ruby symbols or integers. FFI provides enum declaration syntax that generates constant mappings and validates values during function calls. Enum definitions support both automatic numbering and explicit value assignment.

module NetworkLib
  extend FFI::Library
  ffi_lib 'network'
  
  enum :socket_type, [
    :stream, 1,     # SOCK_STREAM
    :dgram, 2,      # SOCK_DGRAM
    :raw, 3         # SOCK_RAW
  ]
  
  enum :protocol, [
    :tcp,           # Automatic numbering starts at 0
    :udp,
    :icmp
  ]
  
  attach_function :create_socket, [:socket_type, :protocol], :int
end

# Enum symbols convert to appropriate integer values
socket_fd = NetworkLib.create_socket(:stream, :tcp)

Union types handle C unions through FFI::Union classes that provide overlapping memory layout. Union members share the same memory location, allowing multiple interpretations of the same data. Field access follows the same patterns as structures with appropriate type conversion.

class DataUnion < FFI::Union
  layout :int_value,    :int,
         :float_value,  :float,
         :bytes,        [:char, 4]
end

data = DataUnion.new
data[:int_value] = 0x41200000
puts data[:float_value]  # => 10.0 (IEEE 754 representation)
puts data[:bytes].to_a   # => [0, 0, 32, 65] (little-endian bytes)
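The values in that example can be cross-checked without FFI. The following sketch reinterprets the same 32-bit pattern with Ruby's Array#pack and String#unpack1, forcing little-endian byte order so the result does not depend on the host.

```ruby
# Reinterpret the 32-bit pattern 0x41200000 using only core Ruby.
raw = [0x41200000].pack('L<')   # 32-bit unsigned integer, little-endian

as_float = raw.unpack1('e')     # single-precision float, little-endian
as_bytes = raw.bytes

puts as_float          # 10.0
puts as_bytes.inspect  # [0, 0, 32, 65]
```

On a big-endian host the union's :bytes field would list the same bytes in the opposite order, which is why the union output is labeled as little-endian.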

Variadic functions are declared by ending the parameter list with :varargs. Because C carries no type information for variadic arguments, each one is passed at the call site as a type symbol followed by its value. Function pointers are represented by FFI::Function objects and callback types, which also cover dynamic library loading and callback registration.

module PrintfDemo
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  # Variadic function requires explicit argument specification
  attach_function :printf, [:string, :varargs], :int
  attach_function :sprintf, [:pointer, :string, :varargs], :int
end

# Printf with multiple arguments
PrintfDemo.printf("Number: %d, String: %s\n", :int, 42, :string, "hello")

# Sprintf with buffer allocation
buffer = FFI::MemoryPointer.new(:char, 256)
length = PrintfDemo.sprintf(buffer, "Value: %f", :double, 3.14159)
result = buffer.read_string(length)

Platform-specific code handling accommodates different operating systems and architectures through conditional library loading and type definitions. FFI provides platform detection utilities and supports multiple library search strategies.

module PlatformLib
  extend FFI::Library
  
  case FFI::Platform::OS
  when 'linux'
    ffi_lib 'liblinux_specific.so'
    typedef :ulong, :size_type
  when 'darwin'
    ffi_lib 'libmac_specific.dylib'
    typedef :ulong, :size_type
  when 'windows'
    ffi_lib 'windows_specific.dll'
    typedef :uint, :size_type
  end
  
  attach_function :get_system_info, [:pointer], :size_type
end

Error Handling & Debugging

FFI error scenarios fall into several categories: library loading failures, function binding errors, type conversion problems, and runtime exceptions during native calls. Each error type requires different debugging approaches and recovery strategies.

Library loading errors occur when ffi_lib cannot locate the specified library. The error messages indicate search paths and naming conventions, but troubleshooting requires understanding platform-specific library resolution mechanisms.

module LibraryLoader
  extend FFI::Library
  
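  # Alternative names for one library go in a nested array; the first that loads wins.
  # Separate top-level arguments would each be treated as a required library.
  begin
    ffi_lib ['libcustom', 'custom', 'libcustom.so.1']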
  rescue LoadError => e
    puts "Library load failed: #{e.message}"
    # Fallback to system library or alternative implementation
    ffi_lib FFI::Library::LIBC
  end
end

Function binding checks at attachment time are limited. attach_function raises FFI::NotFoundError when the symbol cannot be found in the loaded libraries, and TypeError when a type name cannot be resolved. FFI has no access to the C prototype, so it cannot verify that the declared parameter and return types are correct: a wrong signature attaches without error and then misbehaves or crashes at call time.

module ValidationExample
  extend FFI::Library
  ffi_lib 'libm'

  begin
    # Symbol lookup happens at attachment time
    attach_function :no_such_function, [:double], :double
  rescue FFI::NotFoundError => e
    puts "Function binding failed: #{e.message}"
  end

  # The types below are NOT checked against the C prototype;
  # they must be copied from the header by hand
  attach_function :pow, [:double, :double], :double
end

Runtime type conversion errors happen during function invocation when Ruby values cannot convert to the specified C types. FFI performs validation on each argument and raises TypeError exceptions with details about the conversion failure.

module RuntimeErrors
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  attach_function :abs, [:int], :int
  
  begin
    result = abs("not a number")  # TypeError during conversion
  rescue TypeError => e
    puts "Type conversion failed: #{e.message}"
    # Handle with proper type
    result = abs(-42)  # => 42
  end
end

Memory access violations occur when pointers reference invalid memory regions or when buffer operations exceed allocated boundaries. FFI bounds-checks accesses to memory it allocated, such as FFI::MemoryPointer, and raises IndexError when an access falls outside the region. It cannot check raw FFI::Pointer values returned by C, or what native code does with a buffer, so those paths can still crash the process.

def safe_buffer_operations
  buffer = FFI::MemoryPointer.new(:char, 100)

  begin
    # Safe operations within buffer bounds
    buffer.write_string("Hello, World!")
    content = buffer.read_string(13)

    # Out-of-bounds access on a MemoryPointer raises IndexError
    # buffer.read_string(1000)         # Reading beyond buffer
    # buffer.write_string("x" * 200)   # Writing beyond buffer
  ensure
    # Release the native memory now instead of waiting for GC;
    # buffer.clear would only zero-fill it
    buffer.free if buffer
  end
end

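Debugging native calls requires understanding the call stack transition between Ruby and C code. Ruby backtraces stop at the FFI boundary, so crashes inside native code are usually investigated with a native debugger such as gdb or lldb, or by logging arguments and return values around each binding. The FFI_DEBUG variable in the example below is not part of the ffi gem's documented interface; verify it against the installed version before relying on it.

# Assumed setting: confirm that the installed ffi version honors it
ENV['FFI_DEBUG'] = '1'

module DebugExample
  extend FFI::Library
  ffi_lib 'libdebug'

  attach_function :debug_function, [:int, :string], :pointer

  # Any tracing output depends on the setting above being supported
  result = debug_function(42, "test parameter")
end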

Performance & Memory

FFI performance characteristics depend on call frequency, argument complexity, and memory allocation patterns. Function calls incur overhead from type conversion and stack management, making FFI unsuitable for tight loops requiring maximum performance. The overhead becomes negligible for I/O-bound operations or infrequent native calls.

Memory allocation patterns significantly impact FFI performance. Frequent allocation and deallocation of FFI::MemoryPointer objects creates garbage collection pressure. Reusing buffers and minimizing memory churn improves performance for repetitive operations.

# Performance comparison: allocation patterns
require 'benchmark'

module PerformanceTest
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  
  attach_function :memcpy, [:pointer, :pointer, :size_t], :pointer
end

# Inefficient: repeated allocation
def slow_copy(data, iterations)
  iterations.times do
    source = FFI::MemoryPointer.new(:char, data.length)
    dest = FFI::MemoryPointer.new(:char, data.length)
    source.write_string(data)
    PerformanceTest.memcpy(dest, source, data.length)
  end
end

# Efficient: buffer reuse
def fast_copy(data, iterations)
  source = FFI::MemoryPointer.new(:char, data.length)
  dest = FFI::MemoryPointer.new(:char, data.length)
  source.write_string(data)
  
  iterations.times do
    PerformanceTest.memcpy(dest, source, data.length)
  end
end

Benchmark.bm do |x|
  x.report("slow") { slow_copy("test data", 10000) }
  x.report("fast") { fast_copy("test data", 10000) }
end

Type conversion overhead varies significantly between different C types. Simple integer and floating-point conversions require minimal processing, while string and structure conversions involve memory allocation and field-by-field copying. Pointer types provide the fastest conversion since they represent direct memory addresses.

# Type conversion performance characteristics
module ConversionBench
  extend FFI::Library
  # fabs lives in libm on Linux, so load it alongside libc
  ffi_lib FFI::Library::LIBC, 'libm'

  # Fast: direct value types (the names are aliases; abs and fabs are not identity functions)
  attach_function :identity_int, :abs, [:int], :int
  attach_function :identity_double, :fabs, [:double], :double

  # Medium: string conversion (allocation required)
  attach_function :strlen, [:string], :size_t

  # Pointer: the caller allocates the stat buffer and passes its address
  attach_function :stat, [:string, :pointer], :int
end

# Benchmark different type conversions
test_string = "benchmark string"
Benchmark.bm do |x|
  x.report("int")    { 100000.times { ConversionBench.identity_int(42) } }
  x.report("double") { 100000.times { ConversionBench.identity_double(3.14) } }
  x.report("string") { 100000.times { ConversionBench.strlen(test_string) } }
end

Memory management requires understanding FFI's automatic cleanup behavior and manual management options. FFI::MemoryPointer objects automatically release memory when garbage collected, but long-running applications may require explicit memory management to prevent accumulation.

class ManagedBuffer
  def initialize(size)
    @buffer = FFI::MemoryPointer.new(:char, size)
    @size = size
  end
  
  def write_data(data)
    # Compare bytes, not characters, and leave room for the NUL terminator
    raise ArgumentError, "Data too large" if data.bytesize >= @size
    @buffer.put_string(0, data)
  end

  def read_data(length = nil)
    length ||= @size
    @buffer.read_string(length)
  end

  def clear
    # free releases the native memory immediately;
    # MemoryPointer#clear would only zero-fill it
    @buffer.free if @buffer
    @buffer = nil
  end
end

Large structure arrays require careful memory management to prevent excessive allocation. FFI provides array types and pointer arithmetic for efficient bulk operations without individual object creation.

# Efficient array handling for large datasets
class Point < FFI::Struct
  layout :x, :double,
         :y, :double
end

def process_points(coordinates)
  # Single allocation for entire array
  points_array = FFI::MemoryPointer.new(Point, coordinates.length)
  
  coordinates.each_with_index do |(x, y), i|
    point = Point.new(points_array + i * Point.size)
    point[:x] = x
    point[:y] = y
  end
  
  # Pass array pointer to native function
  # native_process_points(points_array, coordinates.length)
  
  # Read results back
  results = coordinates.length.times.map do |i|
    point = Point.new(points_array + i * Point.size)
    [point[:x], point[:y]]
  end
  
  results
end

Common Pitfalls

Type size mismatches are among the most common FFI errors. C type sizes vary between platforms, and assuming fixed sizes leads to incorrect memory layouts and function call failures. FFI's type symbols follow the platform ABI, so the reliable rule is to mirror the C declaration exactly: use :long where the header says long, and :int32 or :int64 where it says int32_t or int64_t.

# Fragile when the C header declares fixed-width fields (int32_t, int64_t)
class BadStruct < FFI::Struct
  layout :field1, :int,      # 32 bits on mainstream platforms, but not guaranteed by C
         :field2, :long      # 32 bits on 64-bit Windows, 64 bits on 64-bit Linux and macOS
end

# Correct for a header declared with int32_t and int64_t
class GoodStruct < FFI::Struct
  layout :field1, :int32,    # Guaranteed 32-bit
         :field2, :int64     # Guaranteed 64-bit
end

# Platform-aware sizing
class FlexibleStruct < FFI::Struct
  layout :field1, :size_t,   # Matches platform pointer size
         :field2, :uintptr_t  # Address-sized integer
end

String encoding problems occur when C functions expect specific character encodings but Ruby strings use different encodings. FFI passes the bytes of a :string argument to C unchanged and performs no transcoding, so a library that expects a different encoding will misinterpret any non-ASCII characters.

module EncodingIssues
  extend FFI::Library
  ffi_lib 'libstring'
  
  attach_function :process_utf8, [:string], :string
  attach_function :process_ascii, [:string], :string
end

# Problematic: mixed encodings
utf8_string = "Héllo Wörld".encode('UTF-8')
ascii_result = EncodingIssues.process_ascii(utf8_string)  # May corrupt

# Solution: explicit encoding management
def safe_string_call(text, target_encoding = 'ASCII')
  # With these options unconvertible characters become "?" instead of raising
  converted = text.encode(target_encoding, invalid: :replace, undef: :replace)
  EncodingIssues.process_ascii(converted)
rescue Encoding::ConverterNotFoundError
  # Raised when Ruby has no conversion path to target_encoding
  nil
end
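What actually reaches the C side is a byte sequence, and its length depends on the encoding. The following core-Ruby sketch shows the difference for a short string; it needs no native library.

```ruby
text = "Héllo"

utf8_bytes   = text.encode('UTF-8').bytes
latin1_bytes = text.encode('ISO-8859-1').bytes

puts "characters:    #{text.length}"          # 5
puts "UTF-8 bytes:   #{utf8_bytes.length}"    # 6, "é" takes two bytes
puts "Latin-1 bytes: #{latin1_bytes.length}"  # 5, "é" takes one byte

# Replacement keeps the conversion from raising but loses information.
ascii = text.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
puts ascii                                    # H?llo
```

For the same reason, buffer sizes for native calls should be computed from bytesize rather than length.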

Pointer lifetime management causes subtle bugs when pointers outlive their backing memory. Ruby's garbage collector may free memory while C code still holds references, leading to use-after-free conditions that produce unpredictable behavior.

def pointer_lifetime_bug
  buffer = FFI::MemoryPointer.new(:char, 100)
  buffer.write_string("important data")

  # Return a raw pointer without keeping the MemoryPointer referenced
  return FFI::Pointer.new(buffer.address)  # DANGEROUS: buffer may be GC'd, leaving this dangling
end

def safe_pointer_handling
  buffer = FFI::MemoryPointer.new(:char, 100)
  buffer.write_string("important data")

  # Keep buffer alive during pointer usage
  result = yield buffer  # Pass to block for controlled lifetime
  result
ensure
  # Release the memory once the block is done; clear would only zero-fill it
  buffer.free if buffer
end

# Usage with explicit lifetime management
safe_pointer_handling do |buffer|
  # Use buffer pointer within controlled scope
  SomeLib.process_data(buffer)
end

Callback scope and garbage collection create timing-dependent bugs. Ruby may garbage collect Proc objects used as callbacks if no Ruby references remain, causing C callback invocations to fail with segmentation faults.

module CallbackLifetime
  extend FFI::Library
  ffi_lib 'libcallback'
  
  callback :handler_func, [:int], :void
  attach_function :register_callback, [:handler_func], :void
  attach_function :trigger_callback, [:int], :void
end

# Dangerous: callback may be garbage collected
def register_temporary_callback
  handler = proc { |value| puts "Callback: #{value}" }
  CallbackLifetime.register_callback(handler)
  # handler goes out of scope, may be GC'd
end

# Safe: maintain callback reference
class CallbackManager
  def initialize
    @handlers = []
  end
  
  def register_handler(&block)
    handler = proc { |value| block.call(value) }
    @handlers << handler  # Prevent garbage collection
    CallbackLifetime.register_callback(handler)
    handler
  end
  
  def unregister_handler(handler)
    @handlers.delete(handler)
  end
end

Thread safety violations occur when multiple Ruby threads access shared FFI resources without proper synchronization. C libraries may not be thread-safe, and concurrent access to FFI memory objects can corrupt data or cause crashes.

# Unsafe: concurrent access to shared buffer
@shared_buffer = FFI::MemoryPointer.new(:char, 1024)

# Multiple threads writing simultaneously
threads = 10.times.map do |i|
  Thread.new do
    @shared_buffer.write_string("Thread #{i} data")  # Race condition
  end
end

# Safe: thread-local buffers or synchronization
require 'thread'

class ThreadSafeFFI
  def initialize
    @mutex = Mutex.new
    @buffer = FFI::MemoryPointer.new(:char, 1024)
  end
  
  def safe_write(data)
    @mutex.synchronize do
      @buffer.write_string(data)
      # Atomic operations within critical section
    end
  end
end

Structure padding and alignment issues arise when Ruby structure definitions don't match C structure layouts. Compilers insert padding bytes to align fields on appropriate boundaries, and FFI::Struct applies the platform's default alignment rules automatically. Mismatches therefore tend to appear when the C side deviates from those defaults, for example with a packed struct, or when fields are declared in a different order or with different types.

# Default: FFI inserts the same padding a C compiler would
class AutoAlignedStruct < FFI::Struct
  layout :char_field,  :char,    # 1 byte at offset 0
         :int_field,   :int      # Aligned to a 4-byte boundary

  # FFI handles platform-specific alignment
end

# Packed C struct (for example #pragma pack(1)): request byte packing explicitly
class PackedStruct < FFI::Struct
  packed 1
  layout :char_field,  :char,    # 1 byte at offset 0
         :int_field,   :int      # Follows immediately, no padding
end

# Explicit offsets when the exact layout is known from the C header
class ExplicitStruct < FFI::Struct
  layout :char_field,  :char, 0,
         :int_field,   :int,  4
end

# Verify structure layout matches expectations
puts "Struct size: #{AutoAlignedStruct.size}"
puts "Field offsets: char=#{AutoAlignedStruct.offset_of(:char_field)}, int=#{AutoAlignedStruct.offset_of(:int_field)}"

Reference

Core Classes

| Class | Purpose | Key Methods |
| --- | --- | --- |
| FFI::Library | Module extension for library bindings | ffi_lib, attach_function, callback |
| FFI::MemoryPointer | Managed memory allocation | new, read_*, write_*, clear, free |
| FFI::Struct | C structure definitions | layout, [], []=, size, offset_of |
| FFI::Union | C union definitions | layout, [], []=, size |
| FFI::Function | Function pointer creation | new, call |

Type Mappings

| FFI Type | C Type | Ruby Type | Size (bits) |
| --- | --- | --- | --- |
| :char | char | Integer | 8 |
| :uchar | unsigned char | Integer | 8 |
| :short | short | Integer | 16 |
| :ushort | unsigned short | Integer | 16 |
| :int | int | Integer | Platform-dependent |
| :uint | unsigned int | Integer | Platform-dependent |
| :long | long | Integer | Platform-dependent |
| :ulong | unsigned long | Integer | Platform-dependent |
| :long_long | long long | Integer | 64 |
| :ulong_long | unsigned long long | Integer | 64 |
| :int8 | int8_t | Integer | 8 |
| :uint8 | uint8_t | Integer | 8 |
| :int16 | int16_t | Integer | 16 |
| :uint16 | uint16_t | Integer | 16 |
| :int32 | int32_t | Integer | 32 |
| :uint32 | uint32_t | Integer | 32 |
| :int64 | int64_t | Integer | 64 |
| :uint64 | uint64_t | Integer | 64 |
| :float | float | Float | 32 |
| :double | double | Float | 64 |
| :pointer | void* | FFI::Pointer | Platform-dependent |
| :string | char* | String | Platform-dependent |
| :bool | bool | Boolean | 8 |
| :size_t | size_t | Integer | Platform-dependent |
| :ssize_t | ssize_t | Integer | Platform-dependent |

Library Declaration Methods

| Method | Parameters | Description |
| --- | --- | --- |
| ffi_lib(*names) | Library names or paths | Specify native libraries to load |
| attach_function(name, args, returns) | Function name, parameter types, return type | Bind C function to Ruby method |
| attach_function(ruby_name, c_name, args, returns) | Ruby method name, C function name, types | Bind with name aliasing |
| callback(name, args, returns) | Callback name, parameter types, return type | Define callback function type |
| typedef(existing_type, new_name) | Existing type, new type name | Create type alias |
| enum(name, values) | Enum name, value array or hash | Define enumeration constants |

Memory Operations

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| FFI::MemoryPointer.new(type, count=1) | Type symbol, element count | MemoryPointer | Allocate typed memory |
| FFI::MemoryPointer.new(size) | Byte count | MemoryPointer | Allocate raw memory |
| #read_int | None | Integer | Read 4-byte signed integer |
| #write_int(value) | Integer value | self | Write 4-byte signed integer |
| #read_string(length=nil) | Optional byte count | String | Read null-terminated or fixed-length string |
| #write_string(string) | String value | self | Copy the string's bytes without appending a null terminator |
| #read_array_of_int(count) | Element count | Array | Read integer array |
| #write_array_of_int(array) | Integer array | self | Write integer array |
| #clear | None | self | Zero-fill the memory without releasing it |
| #free | None | self | Release memory immediately |

Structure Definition

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| layout(*field_specs) | Field specifications | FFI::StructLayout | Define structure field layout |
| #[](field_name) | Field symbol | Value | Read structure field value |
| #[]=(field_name, value) | Field symbol, value | Value | Write structure field value |
| #size | None | Integer | Structure size in bytes |
| #offset_of(field) | Field symbol | Integer | Field offset in bytes |
| #values | None | Array | All field values as array |

Platform Constants

| Constant | Description |
| --- | --- |
| FFI::Platform::OS | Operating system name ('linux', 'darwin', 'windows') |
| FFI::Platform::ARCH | Architecture name ('x86_64', 'i386', 'aarch64') |
| FFI::Platform::NAME | Combined platform identifier |
| FFI::Library::LIBC | Standard C library name for platform |
| FFI::Library::CURRENT_PROCESS | Pseudo-library for symbols already loaded into the running process |

Error Classes

| Exception | Cause | Recovery Strategy |
| --- | --- | --- |
| LoadError | Library loading failure | Check library paths, fallback libraries |
| FFI::NotFoundError | Function not found in library | Verify function name, check library exports |
| TypeError | Type conversion failure | Validate argument types, check null values |
| ArgumentError | Invalid argument count or format | Check function signature, parameter count |
| FFI::NullPointerError | Null pointer dereference | Validate pointer before access |

Debugging Environment Variables

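These variables are not listed in the ffi gem's documentation; confirm that the installed version supports them before relying on them.

| Variable | Effect |
| --- | --- |
| FFI_DEBUG=1 | Enable function call tracing |
| FFI_DEBUG=2 | Include argument and return value details |
| FFI_TYPE_DEBUG=1 | Show type conversion details |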