CrackedRuby

Overview

Memory mapping establishes a correspondence between a file or device and a region of a process's virtual address space. This mechanism allows programs to access files as if they were part of memory, using pointer operations instead of explicit read and write system calls. The operating system handles the actual data transfer between disk and memory automatically through its virtual memory subsystem.

The kernel manages memory-mapped regions through its page cache mechanism. When a program accesses a memory-mapped address, the OS checks whether that page resides in physical memory. If not, a page fault occurs, and the kernel loads the required page from disk. Subsequent accesses to the same page occur at memory speed without additional system calls. Modified pages in writable mappings get written back to the underlying file either when explicitly requested or during normal page cache operations.

Memory mapping originated in early virtual memory systems as a way to implement demand paging for executable code. Modern operating systems extended this concept to arbitrary files, creating a unified mechanism for both code loading and data access. This abstraction eliminates the boundary between file I/O and memory access, allowing the same virtual memory infrastructure to handle both.

# Traditional file reading requires explicit I/O calls
File.open('data.bin', 'rb') do |f|
  chunk = f.read(4096)
  process_data(chunk)
end

# Memory mapping treats the file as a memory region
# (conceptual example - actual implementation varies)
mmap = MemoryMap.new('data.bin', :read)
data = mmap[0, 4096]  # Access like memory
process_data(data)

The virtual memory system manages mapped regions identically to other memory allocations. Each mapping occupies address space but consumes physical memory only for pages actually accessed. The OS can discard clean pages at any time and reload them from disk when needed again, making memory mapping particularly efficient for large files where only portions get accessed.

Key Principles

Memory mapping operates on the principle that file contents can be accessed through memory addresses rather than file offsets. The operating system maintains a mapping table that associates virtual memory addresses with file offsets. When code dereferences a pointer into the mapped region, the memory management unit translates that virtual address into a physical address, triggering page loads as necessary.

The fundamental unit of memory mapping is the page, matching the system's virtual memory page size. On most systems, pages are 4KB, though larger page sizes exist for specific use cases. The OS maps files in page-sized chunks, meaning even small files consume at least one page of address space. File offsets used in mapping operations must align to page boundaries, though the mapping can expose any byte range to the application.
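These alignment rules can be computed directly in Ruby. The helpers below are illustrative (the names are my own); on POSIX systems, Etc.sysconf(Etc::SC_PAGESIZE) reports the actual page size:

```ruby
require 'etc'

# Query the system page size (commonly 4096 on x86-64 Linux)
PAGE_SIZE = Etc.sysconf(Etc::SC_PAGESIZE)

# Round a file offset down to the page boundary that contains it
def page_align_down(offset)
  offset - (offset % PAGE_SIZE)
end

# Round a length up to a whole number of pages
def page_align_up(length)
  ((length + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE
end

# To expose bytes 5000..5999, map from the page containing offset 5000
# and index into the mapping by the remainder
map_offset = page_align_down(5000)        # 4096 with 4 KB pages
delta      = 5000 - map_offset            # 904 with 4 KB pages
map_length = page_align_up(delta + 1000)  # one full page with 4 KB pages
```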

Memory mappings fall into two categories based on whether changes propagate to the underlying file. Shared mappings make modifications visible to other processes mapping the same file and write changes back to disk. Private mappings use copy-on-write semantics where modifications create private copies of pages, leaving the original file unchanged. The kernel creates new page frames for modified pages in private mappings, consuming additional memory.

# Conceptual representation of mapping types
class MemoryMapping
  # Shared mapping - changes affect file and other processes
  def self.shared(file, offset, length)
    mapping = allocate_address_range(length)
    link_to_file(file, offset, mapping, :shared)
    mapping
  end
  
  # Private mapping - changes create copy-on-write pages
  def self.private(file, offset, length)
    mapping = allocate_address_range(length)
    link_to_file(file, offset, mapping, :private)
    mapping
  end
end

Anonymous mappings create memory regions not backed by any file. These mappings provide private memory that starts zero-initialized, useful for dynamic memory allocation. The kernel allocates physical pages on demand as the program accesses the anonymous region. Many allocators use anonymous mappings for large allocations, bypassing the traditional heap.
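An anonymous mapping can be demonstrated from Ruby with Fiddle. This is a sketch, not a production allocator: the PROT_* and MAP_* constants below are the Linux x86-64 values and differ on other platforms, and the function-wrapper names are my own:

```ruby
require 'fiddle'

# Anonymous mapping through Fiddle. The constants below are the
# Linux x86-64 values; other platforms use different numbers.
libc = Fiddle.dlopen(nil)

mmap_fn = Fiddle::Function.new(
  libc['mmap'],
  [Fiddle::TYPE_VOIDP, Fiddle::TYPE_SIZE_T, Fiddle::TYPE_INT,
   Fiddle::TYPE_INT, Fiddle::TYPE_INT, Fiddle::TYPE_LONG],
  Fiddle::TYPE_VOIDP
)
munmap_fn = Fiddle::Function.new(
  libc['munmap'],
  [Fiddle::TYPE_VOIDP, Fiddle::TYPE_SIZE_T],
  Fiddle::TYPE_INT
)

PROT_READ     = 0x1
PROT_WRITE    = 0x2
MAP_PRIVATE   = 0x02
MAP_ANONYMOUS = 0x20  # Linux value

length = 4096  # one page
addr = mmap_fn.call(nil, length, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)
region = Fiddle::Pointer.new(addr.to_i, length)

zeroed = region[0, 4]     # anonymous pages start zero-filled
region[0, 5] = 'hello'    # write into the private region
written = region[0, 5]

munmap_fn.call(region, length)
```

Note that the file descriptor argument is -1 and the offset 0, as anonymous mappings have no file backing.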

The protection attributes of a mapping control access permissions. Mappings can be readable, writable, executable, or combinations thereof. The kernel enforces these permissions through the memory management unit, generating segmentation faults for violations. Protection attributes interact with the shared/private designation - shared mappings typically require the underlying file to have compatible permissions.

Synchronization determines when modifications to a shared mapping become visible to other processes and when they get written to disk. Asynchronous synchronization allows the kernel to flush changes at its discretion, optimizing for system-wide performance. Synchronous synchronization forces immediate writeback, ensuring durability but sacrificing performance. Most applications rely on asynchronous synchronization, explicitly requesting flushes only at consistency points.

# Synchronization control (conceptual)
class SharedMapping
  def sync_async
    # Kernel flushes pages when convenient
    mark_for_eventual_writeback
  end
  
  def sync_sync
    # Immediate flush to disk
    flush_all_modified_pages
    wait_for_disk_completion
  end
  
  def sync_range(offset, length)
    # Flush only specific pages
    flush_pages_in_range(offset, length)
  end
end

Ruby Implementation

Ruby provides memory mapping capabilities primarily through external gems; apart from the experimental IO::Buffer class introduced in Ruby 3.1, the core language lacks a dedicated memory mapping API. The mmap2 gem offers the most direct interface to system memory mapping facilities, wrapping the underlying POSIX mmap system call with Ruby objects. This gem exposes memory-mapped regions as string-like objects that support random access and modification.

require 'mmap2'

# Map an entire file for reading
mmap = Mmap.new('large_file.dat', 'r')
puts mmap.size  # File size in bytes

# Access data at specific offsets
header = mmap[0, 512]
record = mmap[512, 128]

# Close the mapping
mmap.unmap

The mmap2 gem supports various mapping modes mirroring the underlying system capabilities. Read-only mode maps the file without modification capability. Read-write mode allows changes that propagate to the file. Copy-on-write mode creates a private mapping where modifications don't affect the original file. The gem handles page alignment requirements automatically, simplifying usage compared to raw system calls.

# Different mapping modes
read_map = Mmap.new('input.dat', 'r')       # Read-only
write_map = Mmap.new('output.dat', 'w')     # Read-write shared
private_map = Mmap.new('template.dat', 'c')  # Copy-on-write private

# Modify a shared mapping
write_map[0] = 'X'  # Changes visible in file

# Modify a private mapping  
private_map[0] = 'Y'  # Creates private copy of page

write_map.unmap
private_map.unmap

For more advanced control, Ruby's Fiddle and FFI libraries enable direct system call invocation. This approach provides access to all mmap flags and options but requires more code and platform-specific knowledge. Applications needing precise control over protection attributes, fixed addresses, or unusual mapping modes use this technique.

require 'fiddle'
require 'fiddle/import'

module MemMap
  extend Fiddle::Importer
  dlload Fiddle.dlopen(nil)
  
  # Constants for mmap
  PROT_READ = 1
  PROT_WRITE = 2
  MAP_SHARED = 1
  MAP_PRIVATE = 2
  
  # Fiddle's C parser does not know off_t by default; alias it to long
  typealias 'off_t', 'long'
  
  extern 'void* mmap(void*, size_t, int, int, int, off_t)'
  extern 'int munmap(void*, size_t)'
  extern 'int msync(void*, size_t, int)'
end

# Map a file using raw system calls
file = File.open('data.bin', 'r')  # keep a reference so the fd stays open
size = File.size('data.bin')
addr = MemMap.mmap(nil, size, 
                    MemMap::PROT_READ, 
                    MemMap::MAP_PRIVATE, 
                    file.fileno, 0)
# Access mapped memory through the addr pointer, then release it
MemMap.munmap(addr, size)
file.close

Ruby's IO class supports memory-like access through positioning and buffering, though this doesn't use actual memory mapping. The IO#pread method reads from a specific offset without changing the file position, providing random access similar to memory mapping but still using system calls. This approach offers portability where memory mapping isn't available.

# IO-based random access (not true memory mapping)
File.open('data.bin', 'rb') do |io|
  # Read from multiple positions
  header = io.pread(512, 0)
  footer = io.pread(512, io.size - 512)
  
  # No need to seek between reads
  middle = io.pread(1024, io.size / 2)
end
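For true memory mapping without a gem, Ruby 3.1 added the IO::Buffer class (still marked experimental), which can map a file directly from core Ruby. A minimal sketch:

```ruby
# IO::Buffer.map (Ruby 3.1+, experimental) creates a real memory mapping
File.write('data.bin', 'hello memory mapping')

greeting = File.open('data.bin', 'r') do |f|
  buffer = IO::Buffer.map(f, nil, 0, IO::Buffer::READONLY)
  text = buffer.get_string(0, 5)  # read straight out of the mapping
  buffer.free                     # release the mapping explicitly
  text
end

puts greeting
```

Because the API is experimental, its interface may change between Ruby versions; check the documentation for the Ruby version in use.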

The tempfile library combined with memory mapping creates temporary shared memory regions for inter-process communication. A temporary file gets created, mapped into memory, and then unlinked from the filesystem. The mapping persists while processes hold references, providing shared memory that automatically cleans up when all processes exit.

require 'tempfile'
require 'mmap2'

# Create temporary shared memory
tmpfile = Tempfile.new('shared')
tmpfile.write("\0" * 4096)  # Allocate space
tmpfile.flush

# Map into memory
shared = Mmap.new(tmpfile.path, 'w')

# Fork and share data
pid = fork do
  shared[0, 5] = "child"
  shared.msync  # Ensure visibility
end

Process.wait(pid)
puts shared[0, 5]  # Reads "child"

shared.unmap
tmpfile.close
tmpfile.unlink

For working with structured binary data in memory-mapped files, the bindata gem provides serialization and deserialization. Combine bindata definitions with memory-mapped regions to parse complex binary formats efficiently, reading only the required portions of large files.

require 'bindata'
require 'mmap2'

# Define binary structure
class Record < BinData::Record
  endian :little
  uint32 :id
  uint32 :timestamp
  string :payload, length: 256
end

# Map file and read specific record
mmap = Mmap.new('records.bin', 'r')
record_size = 264  # 4 + 4 + 256 bytes
record_data = mmap[record_size * 100, record_size]
record = Record.read(record_data)

puts record.id
puts record.timestamp

Implementation Approaches

File-backed memory mapping creates a bidirectional connection between a file and memory. The application opens a file descriptor, then requests the kernel to map a range of file offsets into virtual address space. The kernel creates page table entries pointing to the file's pages in the page cache. Reads and writes to mapped addresses operate on these cached pages, with the kernel handling synchronization to disk. This approach works best for files accessed non-sequentially where traditional I/O would require many seek operations.

Anonymous memory mapping allocates memory without file backing. The kernel initializes mapped pages to zero and allocates physical memory on demand as the application touches pages. These mappings serve as private memory regions for the process, equivalent to heap allocations but managed by the virtual memory system rather than a heap allocator. Large allocations often use anonymous mappings because they can be more efficiently managed at the page level.

# File-backed mapping pattern
class FileMapper
  def initialize(path, size)
    @file = File.open(path, 'r+')
    @file.truncate(size) if @file.size < size
    @mmap = Mmap.new(@file.path, 'w')
  end
  
  def read_record(index, record_size)
    offset = index * record_size
    @mmap[offset, record_size]
  end
  
  def write_record(index, record_size, data)
    offset = index * record_size
    @mmap[offset, record_size] = data
    @mmap.msync  # Flush to disk
  end
  
  def close
    @mmap.unmap
    @file.close
  end
end

Shared memory through memory-mapped files enables inter-process communication. Multiple processes map the same file into their address spaces, creating a shared memory region visible to all. Changes made by any process appear to others after synchronization. The kernel maintains a single set of physical pages for the file, reducing memory usage compared to each process having private copies. This pattern suits situations where processes need to exchange large amounts of data efficiently.

# Shared memory communication pattern
class SharedMemoryQueue
  def initialize(path, size)
    File.write(path, "\0" * size) unless File.exist?(path)
    @mmap = Mmap.new(path, 'w')
    @size = size
  end
  
  # Producer writes data
  def enqueue(data)
    # Write an 8-byte length header, then the payload
    @mmap[0, 8] = [data.bytesize].pack('Q')
    @mmap[8, data.bytesize] = data
    @mmap.msync
  end
  
  # Consumer reads data
  def dequeue
    data_len = @mmap[0, 8].unpack1('Q')
    return nil if data_len == 0
    
    data = @mmap[8, data_len]
    
    # Clear the queue
    @mmap[0, 8] = [0].pack('Q')
    @mmap.msync
    
    data
  end
end

Memory-mapped I/O accesses hardware device registers by mapping physical device memory into the process address space. This technique bypasses traditional I/O port operations, treating device memory as ordinary memory. Reads and writes to the mapped region directly access device registers. Device driver development frequently uses memory-mapped I/O for performance-critical hardware interaction, though application-level code rarely needs this approach.

The streaming read pattern combines sequential access with memory mapping for processing large files. Map a window of the file into memory, process that window, unmap it, then map the next window. This approach limits memory consumption while maintaining efficient access patterns. The kernel's read-ahead mechanisms optimize sequential mapping operations, prefetching pages before they're accessed.

# Windowed processing for large files
class WindowedFileProcessor
  WINDOW_SIZE = 64 * 1024 * 1024  # 64 MB windows
  
  def process_file(path)
    file_size = File.size(path)
    offset = 0
    
    File.open(path, 'rb') do |file|
      while offset < file_size
        window_size = [WINDOW_SIZE, file_size - offset].min
        
        # Most Ruby mmap gems map entire files; true partial mapping
        # needs lower-level control, so positioned reads stand in here
        process_window(file.pread(window_size, offset))
        
        offset += window_size
      end
    end
  end
  
  private
  
  def process_window(data)
    # Process this portion of the file
  end
end

The persistent data structure pattern uses memory-mapped files as durable storage for in-memory data structures. Hash tables, trees, or other structures get laid out in a file format that supports direct pointer access. The application maps the file and accesses structures through memory addresses. Modifications persist automatically as the kernel flushes dirty pages. This approach creates data structures that survive process restarts without serialization overhead.
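The file layout such a structure needs can be sketched in plain Ruby. The class below (names are my own) lays out an 8-byte element count followed by fixed-size slots; with a memory-mapped file the reads and writes below would be direct memory accesses, but positioned I/O is used here so the sketch runs without a gem:

```ruby
# Sketch of a persistent fixed-slot array. Layout: 8-byte uint64 count
# header, then fixed 16-byte slots. Positioned reads/writes stand in
# for the memory accesses a mapped version would use.
class PersistentArray
  HEADER = 8    # uint64 element count
  SLOT   = 16   # fixed-size record slots

  def initialize(path, capacity)
    @io = File.open(path, File::RDWR | File::CREAT)
    size = HEADER + capacity * SLOT
    @io.truncate(size) if @io.size < size
  end

  def count
    @io.pread(HEADER, 0).unpack1('Q')
  end

  def push(str)
    i = count
    @io.pwrite(str.ljust(SLOT, "\0")[0, SLOT], HEADER + i * SLOT)
    @io.pwrite([i + 1].pack('Q'), 0)  # bump the element count last
  end

  def [](i)
    @io.pread(SLOT, HEADER + i * SLOT).sub(/\0+\z/, '')
  end

  def close
    @io.close
  end
end

# Data survives reopening because the structure lives in the file itself
File.delete('store.dat') if File.exist?('store.dat')
arr = PersistentArray.new('store.dat', 100)
arr.push('alpha')
arr.push('beta')
arr.close

arr = PersistentArray.new('store.dat', 100)
# arr.count is now 2 and arr[0] reads back the first record
```

Writing the count header after the slot data gives a crude ordering guarantee: a crash between the two writes leaves the structure at its previous length rather than exposing a half-written slot.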

Performance Considerations

Memory mapping eliminates data copying between kernel and user space. Traditional I/O requires the kernel to copy data from the page cache into application buffers, then the application processes that copy. Memory-mapped access operates directly on pages in the page cache, removing one copy operation. This reduction matters most for large data transfers where copy overhead dominates execution time.

Page fault handling adds latency to first access of each page. When code accesses an unmapped page, the processor triggers a page fault exception. The kernel must locate the corresponding page in the file, allocate physical memory, load the page contents, and update page tables. This process takes microseconds but can accumulate for workloads touching many pages. Sequential access patterns minimize this cost because the kernel's read-ahead prefetches pages before they're accessed.

# Comparison: traditional I/O vs memory mapping
require 'benchmark'
require 'mmap2'

file_size = 100 * 1024 * 1024  # 100 MB
File.open('testfile.dat', 'wb') { |f| f.write('x' * file_size) }

Benchmark.bm(20) do |x|
  # Traditional sequential read
  x.report('Traditional I/O:') do
    sum = 0
    File.open('testfile.dat', 'rb') do |f|
      while chunk = f.read(4096)
        sum += chunk.bytes.sum
      end
    end
  end
  
  # Memory-mapped sequential read
  x.report('Memory mapped:') do
    mmap = Mmap.new('testfile.dat', 'r')
    sum = 0
    (0...file_size).step(4096) do |offset|
      chunk_size = [4096, file_size - offset].min
      sum += mmap[offset, chunk_size].bytes.sum
    end
    mmap.unmap
  end
end

Memory consumption patterns differ between traditional I/O and memory mapping. Traditional I/O uses explicit buffers whose size the application controls. Memory mapping commits virtual address space equal to the mapping size but consumes physical memory only for accessed pages. For sparse access patterns where small portions of a large file get read, memory mapping uses less physical memory. For dense sequential access, memory mapping may use more memory because the kernel keeps recently accessed pages cached.

The TLB (Translation Lookaside Buffer) limits memory mapping efficiency for scattered access patterns. The TLB caches virtual-to-physical address translations for recently used pages. When access spans many pages in a scattered pattern, TLB misses increase, requiring expensive page table walks. Sequential access patterns keep the TLB hot, while random access across a large file causes frequent TLB misses. Traditional I/O using large buffers can outperform memory mapping for truly random access because it maintains locality within buffers.

Write patterns affect performance differently for memory mapping versus traditional I/O. Memory-mapped writes mark pages dirty in place without immediate disk I/O. The kernel flushes dirty pages asynchronously, potentially batching many pages into efficient write operations. Traditional write system calls either copy data into kernel buffers (buffered I/O) or wait for disk completion (unbuffered I/O). Memory mapping provides the benefits of buffered writes without explicit buffer management, but applications lose control over write timing.

# Write performance comparison
require 'benchmark'
require 'mmap2'

file_size = 10 * 1024 * 1024  # 10 MB
data = 'x' * 4096

Benchmark.bm(20) do |x|
  # Traditional buffered writes
  x.report('Buffered writes:') do
    File.open('write_test1.dat', 'wb') do |f|
      (file_size / 4096).times do
        f.write(data)
      end
    end
  end
  
  # Memory-mapped writes
  x.report('Mapped writes:') do
    File.open('write_test2.dat', 'wb') { |f| f.write("\0" * file_size) }
    mmap = Mmap.new('write_test2.dat', 'w')
    (file_size / 4096).times do |i|
      mmap[i * 4096, 4096] = data
    end
    mmap.msync  # Force flush for fair comparison
    mmap.unmap
  end
end

Cache coherency overhead affects shared mappings accessed by multiple processes. When one process modifies a shared page, the kernel must ensure other processes see the change. On single-processor systems, this happens automatically through the shared page cache. On multiprocessor systems, cache coherency protocols invalidate cached copies of modified cache lines across processor caches. This overhead increases with the number of processors and the frequency of modifications to shared pages.

File size changes interact poorly with memory mapping. If a file grows beyond its mapped size, the original mapping doesn't reflect the extension. Applications must unmap and remap to access new content. If a file shrinks below its mapped size, accessing now-invalid addresses causes segmentation faults. Traditional I/O handles size changes gracefully because each operation checks current file size. Applications using memory mapping for dynamic files must carefully manage mapping lifetime relative to file size changes.

Common Pitfalls

Forgetting to synchronize shared mappings leads to data loss. Modifications to memory-mapped pages remain in memory until the kernel decides to flush them. If the process terminates before flushing, changes may not persist. Calling msync forces dirty pages to disk, but applications must remember to call it at appropriate points. Unlike traditional I/O where write calls immediately copy data to kernel buffers, memory mapping delays durability.

require 'mmap2'

# Dangerous: changes may not persist
mmap = Mmap.new('important.dat', 'w')
mmap[0, 10] = "critical!"
mmap.unmap  # Might lose data if process crashes

# Safe: explicit synchronization
mmap = Mmap.new('important.dat', 'w')
mmap[0, 10] = "critical!"
mmap.msync  # Force flush to disk
mmap.unmap

Accessing memory beyond the mapping bounds causes segmentation faults rather than returning errors. Traditional file I/O returns error codes for reads or writes past end-of-file. Memory mapping crashes the process. Applications must carefully track mapping sizes and validate offsets before access. This behavior makes debugging harder because crashes occur at memory access sites rather than mapping creation sites.

# Bounds checking necessary for safety
class SafeMapping
  def initialize(path, mode)
    @mmap = Mmap.new(path, mode)
    @size = @mmap.size
  end
  
  def read(offset, length)
    raise ArgumentError, "Read beyond mapping" if offset + length > @size
    @mmap[offset, length]
  end
  
  def write(offset, data)
    raise ArgumentError, "Write beyond mapping" if offset + data.bytesize > @size
    @mmap[offset, data.bytesize] = data
  end
end

Platform differences in mapping behavior create portability issues. Windows and POSIX systems handle certain mapping operations differently. File mappings on Windows require the file handle to remain open while mapped, whereas POSIX allows closing the file descriptor after mapping. Maximum mapping sizes vary by platform and process architecture. Code that works on 64-bit Linux may fail on 32-bit Windows due to address space limitations.

Private mappings consume more memory than expected when modified extensively. The copy-on-write mechanism creates private copies of pages as they're written. An application that maps a large file privately and modifies it extensively ends up with both the original pages and private copies in memory. This memory doubling surprises developers expecting memory mapping to reduce memory usage. Shared mappings avoid this issue but require careful synchronization.

# Private mapping memory consumption
mmap = Mmap.new('large.dat', 'c')  # Private copy-on-write

# Modifying the entire mapping creates private copies
# Memory usage: original file size + modified page count
(0...mmap.size).step(4096) do |offset|
  mmap[offset] = 'X'  # Triggers copy-on-write for each page
end
# Memory usage now ~2x file size

Mapping alignment requirements cause confusion. The offset parameter to mmap must align to page boundaries. An application attempting to map starting at an arbitrary byte offset receives an error. The length parameter need not align, but the kernel rounds up to page boundaries internally. This means mapping a 1-byte file still consumes a full page of address space.

Signal handling during page faults creates race conditions. When a page fault loads data from disk, the process blocks waiting for I/O completion. If a signal arrives during this time, the signal handler executes before the memory access completes. Code in signal handlers that accesses the same mapped region can cause deadlocks or corruption. Applications must either avoid mapped memory access in signal handlers or use appropriate synchronization.

Mixing memory mapping with traditional I/O on the same file produces inconsistent views. The kernel caches file contents in the page cache for mapped access. Separate read or write system calls may use different cache mechanisms, leading to stale data. Changes through memory mapping might not appear in traditional I/O reads immediately, and vice versa. Applications should choose one access method per file or carefully synchronize between methods.

# Dangerous: mixed access modes
file = File.open('data.dat', 'r+')
mmap = Mmap.new('data.dat', 'w')

mmap[0, 5] = "hello"
mmap.msync

file.seek(0)
# May read stale data depending on platform
puts file.read(5)

mmap.unmap
file.close

Reference

Memory Mapping System Calls

| Function | Purpose | Key Parameters |
|----------|---------|----------------|
| mmap | Create a new mapping | addr, length, prot, flags, fd, offset |
| munmap | Remove a mapping | addr, length |
| msync | Synchronize mapping to disk | addr, length, flags |
| mprotect | Change protection on mapping | addr, length, prot |
| madvise | Give usage hints to kernel | addr, length, advice |
| mremap | Resize existing mapping | old_addr, old_size, new_size, flags |

Protection Flags

| Flag | Value | Description |
|------|-------|-------------|
| PROT_NONE | 0 | Page cannot be accessed |
| PROT_READ | 1 | Page can be read |
| PROT_WRITE | 2 | Page can be written |
| PROT_EXEC | 4 | Page can be executed |

Mapping Flags

| Flag | Purpose | Effect |
|------|---------|--------|
| MAP_SHARED | Shared mapping | Changes visible to other processes |
| MAP_PRIVATE | Private mapping | Copy-on-write for modifications |
| MAP_ANONYMOUS | Anonymous mapping | Not backed by file |
| MAP_FIXED | Fixed address | Map at exact address or fail |
| MAP_POPULATE | Prefault pages | Load all pages immediately |
| MAP_LOCKED | Lock pages | Prevent swapping |

Synchronization Flags

| Flag | Behavior | Use Case |
|------|----------|----------|
| MS_ASYNC | Asynchronous flush | Schedule writeback, return immediately |
| MS_SYNC | Synchronous flush | Wait for writeback completion |
| MS_INVALIDATE | Invalidate caches | Force reload from disk |

Ruby mmap2 Gem Methods

| Method | Purpose | Example |
|--------|---------|---------|
| Mmap.new | Create mapping | Mmap.new(path, mode) |
| Mmap#[] | Read from mapping | mmap[offset, length] |
| Mmap#[]= | Write to mapping | mmap[offset, length] = data |
| Mmap#size | Get mapping size | mmap.size |
| Mmap#msync | Synchronize to disk | mmap.msync |
| Mmap#unmap | Unmap region | mmap.unmap |
| Mmap#mprotect | Change protection | mmap.mprotect(mode) |

Mode Strings

| Mode | Access | Sharing | Equivalent Flags |
|------|--------|---------|------------------|
| r | Read-only | Shared | PROT_READ, MAP_SHARED |
| w | Read-write | Shared | PROT_READ + PROT_WRITE, MAP_SHARED |
| c | Copy-on-write | Private | PROT_READ + PROT_WRITE, MAP_PRIVATE |

Common Usage Patterns

| Pattern | Code Structure | When to Use |
|---------|----------------|-------------|
| Read-only access | mmap = Mmap.new(path, 'r') | Processing large files |
| Random access | data = mmap[offset, size] | Non-sequential reads |
| Shared memory | Mmap.new(path, 'w') + fork | Inter-process communication |
| Persistent data | mmap[offset] = data + msync | Durable structures |
| Large allocations | MAP_ANONYMOUS mapping | Memory-intensive operations |

Performance Characteristics

| Operation | Memory Mapping | Traditional I/O | Winner |
|-----------|----------------|-----------------|--------|
| Sequential read | Fast (read-ahead) | Fast (buffered) | Tie |
| Random read | Very fast (page cache) | Slow (seeks) | Memory mapping |
| Sequential write | Fast (async writeback) | Fast (buffered) | Tie |
| Random write | Fast (in-place) | Slow (seeks) | Memory mapping |
| First access latency | High (page faults) | Low (immediate) | Traditional I/O |
| Memory overhead | Variable (accessed pages) | Fixed (buffers) | Depends on pattern |

Size Limitations

| Platform | Max Mapping Size | Address Space |
|----------|------------------|---------------|
| 32-bit systems | ~2-3 GB | Limited by virtual address space |
| 64-bit systems | Terabytes | Limited by virtual address space and kernel limits |
| Windows | Per-mapping limits apply | Handle limits per process |
| Linux | No practical limit | Per-process limit configurable |

Troubleshooting Checklist

| Symptom | Likely Cause | Solution |
|---------|--------------|----------|
| Segmentation fault | Access beyond mapping | Add bounds checking |
| EINVAL error | Unaligned offset | Round offset to page boundary |
| ENOMEM error | Address space exhausted | Reduce mapping size or use windowing |
| Data loss | Missing msync | Add explicit synchronization |
| Performance degradation | TLB thrashing | Reduce number of accessed pages |
| Memory bloat | Private mapping modifications | Use shared mapping or traditional I/O |