CrackedRuby

Memory View API

Technical documentation for Ruby's Memory View API, covering efficient memory buffer access and manipulation without data copying.

Overview

The Memory View API provides direct access to memory buffers in Ruby without copying the underlying data. The API centers on the MemoryView class, which creates views into memory regions that already belong to objects such as strings, arrays, and packed data structures.

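In CRuby, the memory view interface has been available since Ruby 3.0 as a C-level API declared in ruby/memory_view.h, and the fiddle library exposes it to Ruby code as Fiddle::MemoryView. The examples in this document write that interface as a MemoryView class with a memory_view method on the source object. The API follows a zero-copy approach: operations create references to existing memory rather than duplicating data, which reduces memory overhead and improves performance for data processing tasks.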

Memory views expose buffer properties including byte length, data type, and memory layout. Whether a view is read-only or writable depends on the mutability of the source object. A view is cleaned up automatically once nothing references it.

# Basic memory view creation
str = "Hello, World!"
view = str.memory_view
# => #<MemoryView:0x... length=13 readonly=false>

# View over packed integers (Array#pack returns a String)
packed = [1, 2, 3, 4, 5].pack("L*")
view = packed.memory_view
# => #<MemoryView:0x... length=20 readonly=false>

# Read-only view from frozen string
frozen_str = "Immutable data".freeze
readonly_view = frozen_str.memory_view
# => #<MemoryView:0x... length=14 readonly=true>

Memory views provide methods for accessing buffer metadata, reading bytes at specific offsets, and creating subviews. The API integrates with Ruby's garbage collector to prevent premature cleanup of referenced memory.

Basic Usage

Creating memory views requires objects that support the memory view protocol. Strings, packed arrays, and certain numeric arrays provide built-in support. The memory_view method returns a MemoryView instance representing the object's memory region.

# String memory view
text = "Processing data"
view = text.memory_view
puts view.length    # => 15
puts view.readonly? # => false

# Access individual bytes
first_byte = view[0]  # => 80 (ASCII 'P')
last_byte = view[-1]  # => 97 (ASCII 'a')

Memory views support slicing operations to create subviews referencing portions of the original buffer. Subviews share the same underlying memory as their parent view, maintaining zero-copy semantics.

# Create subview of middle portion
data = "ABCDEFGHIJKLMNOP"
view = data.memory_view
middle = view[4, 8]  # => view of "EFGHIJKL"
puts middle.length   # => 8

# Range-based slicing
prefix = view[0...4]  # => view of "ABCD"
suffix = view[12..-1] # => view of "MNOP"

Reading data from memory views uses bracket notation for individual bytes or slice notation for byte ranges. The API returns integer values for single bytes and new MemoryView objects for ranges.

# Read byte sequences
numbers = [0x41, 0x42, 0x43, 0x44].pack("C*")
view = numbers.memory_view

# Single byte access
puts view[0]     # => 65
puts view[1]     # => 66

# Multi-byte read as new view
pair = view[1, 2]
puts pair[0]     # => 66
puts pair[1]     # => 67

Writable memory views support byte modification when the underlying object permits changes. Modified bytes immediately affect the source object since views reference the same memory location.

# Modify bytes through writable view
buffer = "Hello".dup
view = buffer.memory_view

view[0] = 0x4A  # Change 'H' to 'J'
puts buffer     # => "Jello"

# Bulk modification overwrites bytes in place; the length is unchanged
view[1, 3] = [0x75, 0x6D, 0x70]  # "ump"
puts buffer     # => "Jumpo"

Advanced Usage

Memory views support advanced patterns including view composition, offset calculations, and integration with packed data structures. Complex scenarios often involve multiple views referencing different portions of shared buffers.

# Composite data structure access
header = [0x4D, 0x5A].pack("C*")        # "MZ" signature
size_data = [1024].pack("L<")           # 32-bit little-endian unsigned integer
payload = "Binary payload data here"
combined = header + size_data + payload

# Create views for each section
full_view = combined.memory_view
header_view = full_view[0, 2]
size_view = full_view[2, 4]
payload_view = full_view[6..-1]

# Interpret size field
size_bytes = size_view.to_str.unpack("L<")[0]
puts "Payload size: #{size_bytes}"

Memory views combine with Ruby's pack and unpack operations for structured data processing. Only the few bytes of each field are converted to a string for decoding, so large payloads are not copied while the header is parsed.

# Network packet parsing
# Layout: version (1 byte), type (1 byte), total length (2 bytes, big-endian), payload
packet_data = "\x01\x02\x00\x13Hello, Network!"
view = packet_data.memory_view

# Extract header fields
version = view[0]           # => 1
type = view[1]              # => 2
length = view[2, 2].to_str.unpack("n")[0]  # => 19 (4 header bytes + 15 payload bytes)

# Extract payload without copying
payload_view = view[4, length - 4]
message = payload_view.to_str
puts "Message: #{message}"  # => "Message: Hello, Network!"
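
A parser normally checks the declared length against the bytes actually available before slicing the payload. The sketch below does this with core String methods only. The helper name parse_packet, the returned hash, and the field layout (1-byte version, 1-byte type, 2-byte big-endian total length) are assumptions for illustration that follow the example above.

```ruby
# Hypothetical helper: decode a 4-byte header and validate the declared length.
def parse_packet(bytes)
  header_size = 4
  raise ArgumentError, "packet shorter than header" if bytes.bytesize < header_size

  # "C" = unsigned 8-bit, "n" = unsigned 16-bit big-endian
  version, type, length = bytes.unpack("CCn")
  if length < header_size || length > bytes.bytesize
    raise ArgumentError, "declared length #{length} does not fit #{bytes.bytesize} bytes"
  end

  # byteslice copies only the payload bytes
  payload = bytes.byteslice(header_size, length - header_size)
  { version: version, type: type, length: length, payload: payload }
end

packet = "\x01\x02\x00\x13Hello, Network!".b
puts parse_packet(packet)[:payload]  # => Hello, Network!
```

Rejecting a length that exceeds the buffer keeps a malformed header from producing a short or out-of-range payload slice.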

Advanced view manipulation includes creating overlapping views and implementing sliding window operations. These patterns support streaming data processing and buffer management scenarios.

# Sliding window processor
data_stream = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
view = data_stream.memory_view
window_size = 5

# Process overlapping windows
(0..view.length - window_size).each do |offset|
  window = view[offset, window_size]
  content = window.to_str
  puts "Window at #{offset}: #{content}"
end
# Outputs: "ABCDE", "BCDEF", "CDEFG", etc.

Subviews also support cursor-style helpers with chainable methods. Each step only narrows the current view, so no bytes are copied until a result is materialized with to_str.

# Cursor-style processing pipeline
class DataProcessor
  def initialize(data)
    @view = data.memory_view
  end

  def skip_header(bytes)
    @view = @view[bytes..-1]
    self
  end

  def take_chunk(size)
    chunk = @view[0, size]
    @view = @view[size..-1]
    chunk
  end

  def remaining
    @view
  end
end

# Usage example
processor = DataProcessor.new("HEADER123CHUNK1CHUNK2END")
processor.skip_header(9)                 # skip "HEADER123"
chunk1 = processor.take_chunk(6).to_str  # => "CHUNK1"
chunk2 = processor.take_chunk(6).to_str  # => "CHUNK2"
remainder = processor.remaining.to_str   # => "END"

Performance & Memory

Memory views avoid the allocation and copying that string slicing and array copying incur. The zero-copy design eliminates redundant memory allocation for read-only access patterns, and the saving grows with the size of the data being sliced.

Performance characteristics vary based on view operations and underlying object types. Reading single bytes through memory views carries minimal overhead compared to string indexing. Range operations create new view objects but avoid copying the referenced data.

# Performance comparison example
large_data = "X" * 1_000_000
puts "Original size: #{large_data.bytesize} bytes"

# Traditional substring (copies memory)
# A monotonic clock is unaffected by wall-clock adjustments
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
substring = large_data[500_000, 100_000]
copy_time = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
puts "Substring copy time: #{copy_time} seconds"
puts "Memory copied: #{substring.bytesize} bytes"

# Memory view (no copying)
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
view = large_data.memory_view
subview = view[500_000, 100_000]
view_time = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
puts "View creation time: #{view_time} seconds"
puts "Memory referenced: #{subview.length} bytes"

Memory usage patterns depend on view lifetime and reference management. Views maintain references to source objects, preventing garbage collection until all views are released. Long-lived views can inadvertently retain large objects in memory.

# Memory management consideration
class BufferProcessor
  def process_large_file(filename)
    content = File.binread(filename)  # Large file in memory, read as binary
    view = content.memory_view

    # Extract small piece for later use
    header = view[0, 64].to_str  # to_str copies the needed bytes
    view = nil  # Release view reference
    content = nil  # Allow GC of large content

    header  # Return only needed data
  end
end

Bulk operations through memory views avoid the intermediate strings that equivalent string manipulation allocates. A per-byte loop still pays one method call per byte, so the gain comes from skipping copies rather than from faster iteration.

# Efficient byte counting
data = File.binread("large_binary_file.dat")
view = data.memory_view

# Count specific byte value
count = 0
(0...view.length).each do |i|
  count += 1 if view[i] == 0xFF
end
puts "Found #{count} occurrences of 0xFF"

# Alternative using to_str for bulk processing
chunks = []
(0...view.length).step(8192) do |offset|
  chunk_size = [8192, view.length - offset].min
  chunk_view = view[offset, chunk_size]
  chunks << chunk_view.to_str
end
all_data = chunks.join
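
The chunk loop above copies every chunk with to_str. When only a per-chunk statistic is needed, the bytes can be inspected in place instead. A minimal sketch using core String#getbyte, with a hypothetical helper name count_byte_per_chunk:

```ruby
# Hypothetical helper: count occurrences of `target` in each fixed-size chunk
# without allocating a substring per chunk.
def count_byte_per_chunk(data, target, chunk_size)
  counts = []
  (0...data.bytesize).step(chunk_size) do |offset|
    stop = [offset + chunk_size, data.bytesize].min
    # getbyte reads a single byte by index and returns an Integer
    counts << (offset...stop).count { |i| data.getbyte(i) == target }
  end
  counts
end

sample = "\xFF\x00\xFF\xFF\x00".b
p count_byte_per_chunk(sample, 0xFF, 2)  # => [1, 2, 0]
```

The last chunk is shorter than chunk_size, which the min call accounts for.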

Memory views excel in scenarios that repeatedly read data segments without modifying them. Because subviews index into one contiguous buffer, sequential scans stay cache-friendly, which benefits iterative algorithms.

Common Pitfalls

Memory views create subtle dependencies between view objects and their source data. Modifying source objects while active views exist can produce unexpected behavior or corrupt view contents. This relationship requires careful lifetime management.

# Dangerous: source modification with active views
source = "Original data"
view = source.memory_view
view_copy = view[0, 4]  # References same memory

# Source modification affects all views
source[0] = "M"
puts view_copy.to_str  # => "Mrig" (unexpected change)

# Safer: copy needed data immediately
source = "Original data"
view = source.memory_view
safe_copy = view[0, 4].to_str  # to_str returns an independent copy
source[0] = "M"
puts safe_copy  # => "Orig" (unchanged)

View slicing operations can create confusing offset calculations, particularly with nested subviews. Each subview maintains offsets relative to its immediate parent, not the original source object.

# Offset confusion with nested views
data = "0123456789ABCDEF"
view = data.memory_view
middle = view[4, 8]      # "456789AB"
sub = middle[2, 4]       # "6789"

# sub[0] refers to position 2 in middle
# which is position 6 in original data
puts sub[0]              # => 54 (ASCII '6')
puts data[6].ord         # => 54 (same byte)

# Track absolute positions manually
absolute_offset = 4 + 2  # original offset + subview offset
puts data[absolute_offset].ord  # => 54
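
One way to avoid manual bookkeeping is to have each subview carry its absolute offset into the source. The following is a minimal pure-Ruby model of that idea, not the MemoryView implementation: the Window struct, its byte_length member, and its sub and byte_at methods are hypothetical names, and reads go through String#getbyte on the shared source string.

```ruby
# Hypothetical model of a subview: a shared source string plus an absolute
# offset and a byte length. No bytes are copied when a window is narrowed.
Window = Struct.new(:source, :offset, :byte_length) do
  # Narrow the window; `start` is relative to this window, not the source.
  def sub(start, len)
    if start < 0 || len < 0 || start + len > byte_length
      raise IndexError, "slice out of bounds"
    end
    Window.new(source, offset + start, len)
  end

  # Read one byte at a window-relative index.
  def byte_at(index)
    raise IndexError, "index out of bounds" unless (0...byte_length).cover?(index)
    source.getbyte(offset + index)
  end
end

data = "0123456789ABCDEF"
full = Window.new(data, 0, data.bytesize)
middle = full.sub(4, 8)   # covers "456789AB"
inner = middle.sub(2, 4)  # covers "6789"

puts inner.offset         # => 6 (4 + 2, relative offsets accumulate)
puts inner.byte_at(0)     # => 54 (ASCII '6')
puts data.getbyte(6)      # => 54 (same byte in the source)
```

Storing the accumulated offset means a subview never needs to consult its parent to locate a byte in the original buffer.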

Read-only view violations raise exceptions when attempting modifications. The readonly status depends on source object mutability, not view creation options. Frozen strings always produce readonly views regardless of access patterns.

# Read-only violations
frozen = "Immutable".freeze
view = frozen.memory_view

begin
  view[0] = 0x58  # Attempt to modify
rescue FrozenError => e
  puts "Cannot modify: #{e.message}"
end

# Freezing the source makes existing views read-only
mutable = "Changeable".dup
view = mutable.memory_view

# View reflects source mutability
puts view.readonly?  # => false
mutable.freeze
puts view.readonly?  # => true (now readonly)

Offsets should be validated explicitly before slicing rather than left to the view. Out-of-bounds operations may raise exceptions or return unexpected values, depending on the specific access pattern.

# Boundary validation
small_buffer = "ABC"
view = small_buffer.memory_view

# Safe access patterns
if view.length > 10
  large_chunk = view[5, 6]
else
  puts "Buffer too small for requested operation"
end

# Defensive slicing
def safe_slice(view, start, length)
  return nil if start < 0 || start >= view.length
  actual_length = [length, view.length - start].min
  return nil if actual_length <= 0
  view[start, actual_length]
end

chunk = safe_slice(view, 1, 5)  # Returns view[1, 2] safely

Type assumptions about view contents can lead to incorrect data interpretation. Memory views expose raw bytes without type information, requiring explicit conversion based on expected data format.

# Type interpretation mistakes
numeric_data = [1.5, 2.7, 3.14159].pack("d*")  # Double precision
view = numeric_data.memory_view

# Wrong: treating as integers
wrong_value = view[0, 4].to_str.unpack("L")[0]
puts "Wrong interpretation: #{wrong_value}"

# Correct: respecting original type
correct_value = view[0, 8].to_str.unpack("d")[0]
puts "Correct value: #{correct_value}"  # => 1.5
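
The "d" and "L" directives use the platform's native byte order, so the wrong result above can differ between machines. As a hedged illustration, the sketch below uses the explicit little-endian directives "E" (double) and "V" (32-bit unsigned) from Array#pack and String#unpack1 so that the misreading is reproducible. The variable names are illustrative only.

```ruby
# 1.5 as an IEEE 754 double is 0x3FF8000000000000.
# "E" packs it little-endian: 00 00 00 00 00 00 F8 3F
bytes = [1.5].pack("E")

low_word  = bytes.byteslice(0, 4).unpack1("V")  # first 4 bytes read as uint32
high_word = bytes.byteslice(4, 4).unpack1("V")  # last 4 bytes read as uint32
value     = bytes.unpack1("E")                  # all 8 bytes read as a double

puts low_word    # => 0
puts high_word   # => 1073217536 (0x3FF80000)
puts value       # => 1.5
```

Neither 32-bit half resembles 1.5, which is why the width and type of each field have to come from the format definition rather than from the bytes themselves.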

Reference

MemoryView Instance Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #length | None | Integer | Total bytes in view |
| #readonly? | None | Boolean | Whether the view is read-only |
| #[] | index (Integer) | Integer | Byte value at index |
| #[] | start, length (Integer, Integer) | MemoryView | Subview of specified range |
| #[] | range (Range) | MemoryView | Subview matching range |
| #[]= | index, value (Integer, Integer) | Integer | Set byte value at index |
| #[]= | start, length, values (Integer, Integer, Array) | Array | Set multiple bytes |
| #to_str | None | String | Copy view contents to new string |
| #each_byte | Block or none | self or Enumerator | Iterate over byte values |
| #byteslice | start, length (Integer, Integer) | String | Copy specified bytes to string |

Core Class Extensions

| Class | Method | Returns | Description |
| --- | --- | --- | --- |
| String | #memory_view | MemoryView | Create view of string bytes |
| Array | #memory_view | MemoryView | Create view of packed array data |

Exception Types

| Exception | Trigger Condition | Common Causes |
| --- | --- | --- |
| FrozenError | Modifying readonly view | Frozen source objects, immutable buffers |
| IndexError | Invalid offset access | Out-of-bounds indices, including negative indices past the start |
| ArgumentError | Invalid parameters | Wrong argument types, invalid ranges |
| TypeError | Unsupported object type | Objects without memory view support |

Memory View Properties

| Property | Type | Description |
| --- | --- | --- |
| Length | Integer | Byte count in view |
| Readonly | Boolean | Modification permission |
| Offset | Integer | Starting position in source buffer |
| Source | Object | Referenced source object |

Supported Data Types

| Ruby Type | Memory View Support | Notes |
| --- | --- | --- |
| String | Full | Both mutable and frozen strings |
| Array (packed) | Full | After pack operations only |
| Numeric arrays | Limited | Implementation dependent |
| IO buffers | Planned | Future Ruby versions |
| Custom objects | Extension | Via memory view protocol |

Performance Characteristics

| Operation | Time Complexity | Memory Usage |
| --- | --- | --- |
| View creation | O(1) | Constant overhead |
| Byte access | O(1) | No additional memory |
| Subview creation | O(1) | Reference only |
| Range copying | O(n) | Linear with range size |
| Modification | O(1) | In-place change |

Common Usage Patterns

| Pattern | Use Case | Implementation |
| --- | --- | --- |
| Header parsing | Extract fixed-size headers | view[0, header_size] |
| Chunked processing | Process data in segments | view[offset, chunk_size] |
| Buffer scanning | Search for byte patterns | view.each_byte.with_index |
| Format validation | Check magic numbers | view[0, 4].to_str |
| Stream processing | Handle continuous data | Sliding window with subviews |
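
Two of the patterns above, buffer scanning and format validation, can be sketched with core String methods, which offer each_byte and byteslice independently of any view object. The helper names byte_offsets and magic? are hypothetical, and the "MZ" signature is borrowed from the earlier composite-structure example.

```ruby
# Hypothetical helper: offsets of every occurrence of a byte value.
def byte_offsets(data, target)
  data.each_byte.with_index.filter_map { |byte, index| index if byte == target }
end

# Hypothetical helper: does the buffer start with the expected signature?
def magic?(data, signature)
  data.byteslice(0, signature.bytesize) == signature.b
end

buffer = "MZ\x00\x10\x00\xFF".b
p byte_offsets(buffer, 0x00)   # => [2, 4]
p magic?(buffer, "MZ")         # => true
p magic?(buffer, "PK")         # => false
```

Comparing against signature.b keeps the check byte-based, so the source encoding of the signature literal does not affect the result.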