CrackedRuby

Process Creation and Termination

Overview

Process creation and termination define the fundamental lifecycle operations for executable programs in operating systems. A process represents an instance of a running program with its own memory space, system resources, and execution context. The operating system provides mechanisms to spawn new processes, establish relationships between parent and child processes, control execution flow, and reclaim resources when processes complete or terminate abnormally.

Operating systems support multiple process creation models. Unix-like systems traditionally use the fork-exec model, where an existing process clones itself with fork() and then replaces its memory image with a new program using exec(). Windows uses a different approach with CreateProcess(), which creates a new process with a specified executable in a single operation. Modern systems also provide higher-level abstractions like posix_spawn() that combine creation and execution.

Process termination occurs through several paths: normal exit when a program completes successfully, error exit when an error condition forces termination, or external termination through signals or system calls. The operating system performs cleanup operations including closing file descriptors, releasing memory, removing process table entries, and notifying the parent process. Improper termination handling creates zombie processes, which hold a process table entry until the parent collects their exit status, or orphan processes, which outlive their parent and are adopted by another process.

# Process creation in Ruby
pid = fork do
  # Child process code
  puts "Child process: #{Process.pid}"
  exit 0
end

# Parent process continues
puts "Parent process: #{Process.pid}, child: #{pid}"
Process.wait(pid)

Process management affects system performance, resource utilization, and application architecture. Multi-process applications distribute workload across CPU cores, isolate failures between components, and implement privilege separation for security. Understanding process creation overhead, copy-on-write optimization, and proper cleanup prevents resource leaks and enables scalable system design.

Key Principles

Process Identity and Hierarchy

Each process receives a unique process identifier (PID) from the operating system. The PID serves as the primary reference for process management operations. On Unix-like systems, PID 1 (init or systemd) serves as the root of the user-space process tree. Every other process has a parent process identified by its PPID (parent process ID). This hierarchy determines which attributes a process inherits, which process is notified when it exits, and which process is responsible for reaping it.

Address Space Isolation

Operating systems allocate separate virtual address spaces for each process. Memory isolation prevents processes from accessing each other's data directly, providing security and stability. A process crash does not corrupt memory in other processes. Inter-process communication requires explicit mechanisms like pipes, sockets, shared memory, or message queues.

Copy-on-Write Optimization

Modern operating systems optimize fork operations using copy-on-write (COW) semantics. When a process forks, the child initially shares the parent's physical memory pages. The kernel marks these pages read-only and copies them only when either process attempts to modify the data. This optimization reduces memory usage and speeds up process creation, making fork operations that immediately exec a new program highly efficient.

Process States and Transitions

Processes transition through several states during their lifecycle:

  • Running: Executing instructions on a CPU
  • Ready: Prepared to run but waiting for CPU allocation
  • Waiting/Blocked: Suspended while waiting for I/O or events
  • Zombie: Terminated but awaiting parent to collect exit status
  • Stopped: Suspended by signals (SIGSTOP, SIGTSTP)

The operating system scheduler manages transitions between running and ready states. System calls and I/O operations move processes to waiting states. Proper state management prevents deadlocks and ensures processes make progress.

Exit Status and Return Codes

Processes communicate termination results through exit status codes. Convention reserves 0 for success and non-zero values for errors. The exit status propagates to the parent process through the wait system call. Shell scripts and automation tools use exit codes to determine command success and chain operations. Exit codes occupy 8 bits, providing values from 0-255.
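The 8-bit limit is easy to observe from a parent process. The following is a minimal sketch, not part of the original examples: the helper name exit_status_of is hypothetical, and exit! is used so that at_exit handlers inherited from the parent do not run in the child.

```ruby
# Run a block in a child process and return the exit status the parent observes.
def exit_status_of(&block)
  pid = fork(&block)
  Process.wait(pid)
  $?.exitstatus
end

zero      = exit_status_of { exit!(0) }        # conventional success
failure   = exit_status_of { exit!(2) }        # any non-zero value signals an error
truncated = exit_status_of { exit!(256 + 7) }  # only the low 8 bits reach the parent

puts "zero=#{zero} failure=#{failure} truncated=#{truncated}"
```

Because only the low 8 bits survive, a status such as 256 would be reported as 0, so it is safest to keep exit codes within 0-255.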

Signal-Based Control

Signals provide asynchronous notifications to processes. Common signals include SIGTERM (request termination), SIGKILL (force termination), SIGCHLD (child process state change), and SIGSTOP (suspend execution). Processes can install custom signal handlers for most signals, enabling graceful shutdown, resource cleanup, and state persistence. SIGKILL and SIGSTOP cannot be caught or ignored, giving the system ultimate control.
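As a rough illustration of the difference between a catchable and an uncatchable signal, the sketch below starts two children. The first traps TERM and shuts down through a flag; the second is terminated with KILL. The variable names and the pipe-based readiness handshake are assumptions made for this example.

```ruby
# Child 1: installs a TERM handler that sets a flag, then exits cleanly.
ready_reader, ready_writer = IO.pipe
graceful = fork do
  ready_reader.close
  stop = false
  Signal.trap("TERM") { stop = true }
  ready_writer.puts "ready"   # tell the parent the handler is installed
  ready_writer.close
  sleep 0.05 until stop
  exit!(0)                    # graceful shutdown path
end
ready_writer.close
ready_reader.gets             # block until the child is ready
ready_reader.close

Process.kill("TERM", graceful)
Process.wait(graceful)
graceful_status = $?

# Child 2: KILL cannot be trapped, so the process is terminated by the signal.
forced = fork { sleep }
Process.kill("KILL", forced)
Process.wait(forced)
forced_status = $?

puts "graceful: exited=#{graceful_status.exited?} code=#{graceful_status.exitstatus}"
puts "forced: signaled=#{forced_status.signaled?} signal=#{forced_status.termsig}"
```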

Resource Inheritance and Limits

Child processes inherit various attributes from their parent: environment variables, file descriptors, resource limits (ulimit), working directory, and process group membership. File descriptors always remain open across fork; they remain open across exec as well unless marked with the close-on-exec flag. Resource limits restrict CPU time, memory usage, file sizes, and open file counts. Understanding inheritance patterns prevents security vulnerabilities and resource leaks.

Process Groups and Sessions

Operating systems organize processes into groups for collective management. A process group contains related processes that receive signals together. Sessions contain process groups and associate with controlling terminals. Job control in shells relies on process groups to manage foreground and background tasks. Daemon processes typically create new sessions to detach from controlling terminals.
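A short sketch of group-wide signaling, assuming a Unix-like system with a sleep executable on the PATH. It starts a child as the leader of a new process group and then signals the group rather than the single PID.

```ruby
# pgroup: true makes the child the leader of a new process group.
pid = Process.spawn("sleep", "30", pgroup: true)
child_group  = Process.getpgid(pid)
parent_group = Process.getpgrp

# A signal name prefixed with "-" targets every process in the group.
Process.kill("-TERM", child_group)
Process.wait(pid)
group_status = $?

puts "child group #{child_group}, parent group #{parent_group}"
puts "terminated by signal #{group_status.termsig}"
```

Signaling the group also reaches any processes the child started itself, provided they stayed in the same group.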

Implementation Approaches

Fork-Exec Model

The fork-exec pattern separates process creation from program loading. Fork creates a near-identical copy of the current process, including memory contents, open file descriptors, and signal handlers; the child differs mainly in its PID and parent PID. The child process then calls exec to replace its memory image with a new program. This two-step approach provides flexibility for setup between fork and exec:

parent_process()
  |
  fork() --> child_process(copy of parent)
  |            |
  |            setup (redirect I/O, close files)
  |            |
  |            exec(new_program)
  |            |
  |          new_program runs
  |
parent continues

Between fork and exec, the child process can modify file descriptors, change working directory, set environment variables, or drop privileges. The parent process can capture the child PID immediately after fork, enabling process tracking before the new program starts. This model dominates Unix-like systems due to its simplicity and power.
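The setup window between fork and exec can be sketched as follows. This example assumes a Unix-like system where pwd is available as an external command; the pipe and variable names are illustrative only.

```ruby
reader, writer = IO.pipe

pid = fork do
  # Setup that affects only the child, performed before the new program loads.
  reader.close
  Dir.chdir("/")            # change working directory
  $stdout.reopen(writer)    # redirect standard output into the pipe
  writer.close
  exec("pwd")               # replace the child with the new program
end

writer.close
child_output = reader.read.strip
reader.close
Process.wait(pid)
exec_status = $?

puts "child printed #{child_output.inspect}, parent is still in #{Dir.pwd}"
```

The parent's working directory and standard output are untouched, since every change was made in the child after the fork.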

Direct Spawn Model

Direct spawn operations create a new process running a specified program in one atomic operation. Windows CreateProcess and POSIX posix_spawn combine creation and execution. This approach avoids the overhead of copying the parent's address space when the child immediately replaces it with exec:

parent_process()
  |
  spawn(program, args, environment)
  |         |
  |         new_process running program
  |
parent continues

Spawn operations accept parameters specifying the executable path, command-line arguments, environment variables, file descriptor mappings, and working directory. The calling process receives the new process PID. Spawn can be more efficient than fork-exec when the child needs no intermediate setup, although some C libraries implement posix_spawn internally with a fork-like call (such as vfork or clone) followed by exec.

Thread-Based Concurrency

Instead of separate processes, applications can use threads within a single process. Threads share the same address space, file descriptors, and process ID while executing independently. Thread creation incurs lower overhead than process creation because it avoids address space duplication. Threads communicate through shared memory without explicit IPC mechanisms:

process
  |
  |-- thread_1
  |-- thread_2
  |-- thread_3
  |
  (all sharing memory)

Thread-based concurrency suits applications requiring frequent communication and shared state. Process-based concurrency provides better isolation, fault tolerance, and security boundaries. Hybrid approaches use multiple processes with multiple threads per process.
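The contrast between shared and isolated memory can be shown in a few lines. In this sketch (variable names are illustrative), threads append to one array directly, while a forked child's change to the same array never reaches the parent.

```ruby
shared = []
lock = Mutex.new

# Threads run inside the same address space and see the same array.
threads = 3.times.map do |i|
  Thread.new { lock.synchronize { shared << i } }
end
threads.each(&:join)

# A forked child works on its own copy; the parent's array is unaffected.
pid = fork do
  shared << :from_child
  exit!(0)
end
Process.wait(pid)

puts "parent sees #{shared.sort.inspect}"
```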

Process Pool Pattern

Process pools pre-create worker processes that accept tasks from a queue. This pattern amortizes process creation overhead across many operations. A master process spawns worker processes during initialization. Workers repeatedly fetch tasks, execute them, and wait for new tasks:

master_process
  |
  |-- worker_1 --> task_queue
  |-- worker_2 --> task_queue
  |-- worker_3 --> task_queue

Process pools maintain stable resource usage, limit concurrency, and handle worker failures through replacement. Pre-forking web servers such as Unicorn and Passenger implement this pattern. Among job processors, Resque forks a child process per job, while Sidekiq runs jobs primarily on threads inside each process. The pool size balances parallelism against memory overhead.
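A minimal pool can be sketched with one task pipe per worker and a shared result pipe. This is an illustrative design, not how any particular server implements pooling: the round-robin dispatch, the "n:square" line format, and all variable names are assumptions.

```ruby
worker_count = 2
result_reader, result_writer = IO.pipe
workers = []

worker_count.times do
  task_reader, task_writer = IO.pipe
  pid = fork do
    # Close descriptors this worker does not use, including the task pipes
    # of earlier workers, so each pipe reaches end-of-file promptly.
    task_writer.close
    result_reader.close
    workers.each { |w| w[:tasks].close }

    # Handle tasks until the master closes this worker's task pipe.
    task_reader.each_line do |line|
      n = line.to_i
      result_writer.write("#{n}:#{n * n}\n")  # one short line per result
    end
    exit!(0)
  end
  task_reader.close
  workers << { pid: pid, tasks: task_writer }
end
result_writer.close

# Master: dispatch tasks round-robin, then close the pipes to signal completion.
(1..6).each_with_index do |n, i|
  workers[i % worker_count][:tasks].puts(n)
end
workers.each { |w| w[:tasks].close }

squares = result_reader.each_line.map { |line| line.split(":").map(&:to_i) }.to_h
result_reader.close
statuses = workers.map { |w| Process.wait(w[:pid]); $?.exitstatus }

puts squares.sort.inspect
```

Workers are created once and reused for every task, which is the amortization the pattern is meant to provide.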

Daemon Process Creation

Daemon processes run in the background without a controlling terminal. Creating a daemon requires specific steps to detach from the parent environment:

  1. Fork to create a child process
  2. Parent exits, leaving child running
  3. Child creates a new session with setsid()
  4. Fork again to prevent acquiring a controlling terminal
  5. Change working directory to root (/)
  6. Close inherited file descriptors
  7. Redirect stdin, stdout, stderr to /dev/null

These steps ensure the daemon runs independently of its launching environment and survives terminal disconnection. Daemon processes typically write to log files or system logging facilities instead of standard output.

Ruby Implementation

Process Module

Ruby provides the Process module for process management operations. The module offers both low-level system call wrappers and higher-level abstractions. Core methods include fork, spawn, exec, wait, and kill:

# Check current process ID
puts Process.pid        # => 12345
puts Process.ppid       # => 12344 (parent PID)

# Get process group and session IDs
puts Process.getpgrp    # => 12345
puts Process.getsid     # => 12345

# Set process priority
Process.setpriority(Process::PRIO_PROCESS, 0, 10)

The Process module integrates with Ruby's exception handling and object model. Methods raise appropriate exceptions for system errors, making error handling idiomatic. Return values follow Ruby conventions rather than raw C semantics.

Fork Method

Process.fork creates a child process. Called without a block, fork returns twice: once in the parent (returning child PID) and once in the child (returning nil). Called with a block, the child executes the block and exits automatically:

# Without block - manual control
pid = Process.fork
if pid.nil?
  # Child process
  puts "Child: #{Process.pid}"
  sleep 2
  exit 0
else
  # Parent process
  puts "Parent: #{Process.pid}, child: #{pid}"
  Process.wait(pid)
end

# With block - automatic exit
pid = Process.fork do
  puts "Child executes block"
  puts "PID: #{Process.pid}"
  # Exits automatically when block completes
end
Process.wait(pid)

Fork raises Errno::EAGAIN when the system reaches process limits and Errno::ENOMEM when insufficient memory exists for process creation. On platforms without fork support, such as Windows, it raises NotImplementedError. The block form simplifies common patterns and prevents forgetting to call exit in the child.

Spawn Method

Process.spawn creates a new process running a specified command. Spawn combines fork and exec but provides extensive options for process setup. The method accepts command strings, argument arrays, and option hashes:

# Simple command execution
pid = Process.spawn("ls -la")
Process.wait(pid)

# Command with arguments array (safer for special characters)
pid = Process.spawn("ls", "-la", "/tmp")
Process.wait(pid)

# With environment variables
pid = Process.spawn(
  {"LOG_LEVEL" => "debug"},
  "ruby script.rb"
)
Process.wait(pid)

# Redirect I/O streams
pid = Process.spawn(
  "ruby process_data.rb",
  in: "/dev/null",
  out: "output.log",
  err: "error.log"
)
Process.wait(pid)

# Comprehensive options
pid = Process.spawn(
  {"PATH" => "/usr/local/bin:/usr/bin"},
  "processor",
  "--input", "data.csv",
  "--output", "results.json",
  chdir: "/var/data",
  umask: 0o027,
  pgroup: true,
  rlimit_nproc: [100, 100]
)
Process.wait(pid)

Spawn options control working directory (chdir), file creation mask (umask), process group (pgroup), resource limits (rlimit_*), and close-on-exec flags. The method returns the child PID immediately. Spawn raises Errno::ENOENT when the executable does not exist and Errno::EACCES when lacking execute permissions.

Exec Method

Process.exec replaces the current process with a new program. Exec never returns on success because the calling process no longer exists. The method accepts the same argument formats as spawn:

# This code after exec never runs
puts "Before exec"
Process.exec("ruby", "-e", "puts 'New process'")
puts "After exec - never prints"

# Common pattern: fork then exec
pid = Process.fork do
  # Child process replaces itself
  Process.exec("date")
end
Process.wait(pid)

Exec proves useful when the current process has no further work. Web servers fork to handle requests and exec CGI scripts. Shell implementations fork and exec for external commands. Exec raises exceptions only when the operation fails, since successful exec terminates the Ruby process.

Wait Operations

Process.wait and related methods allow parent processes to collect child exit status and prevent zombie processes. Several wait variants handle different scenarios:

# Wait for any child process
pid = Process.fork { sleep 1; exit 5 }
waited_pid = Process.wait
status = $?
puts "Process #{waited_pid} exited with status #{status.exitstatus}"

# Wait for specific child
pid = Process.fork { sleep 1 }
Process.wait(pid)

# Non-blocking wait (WNOHANG)
pid = Process.fork { sleep 5 }
result = Process.wait(pid, Process::WNOHANG)
if result.nil?
  puts "Child still running"
else
  puts "Child completed"
end

# Wait for all children
pids = 3.times.map { Process.fork { sleep rand; exit rand(10) } }
results = Process.waitall
results.each do |pid, status|
  puts "Process #{pid}: exit=#{status.exitstatus}, signal=#{status.termsig}"
end

Process::Status objects provide detailed termination information. Methods include exitstatus (return code), success? (zero exit status), termsig (terminating signal number), and coredump? (whether core was dumped). The wait operations remove zombie processes from the system process table.

Signal Handling

Ruby provides Signal.trap for registering signal handlers. Handlers execute asynchronously when the process receives signals. Common patterns include graceful shutdown and child process management:

# Graceful shutdown on SIGTERM
@shutdown_requested = false
Signal.trap("TERM") do
  @shutdown_requested = true
end

# Main loop checks flag
until @shutdown_requested
  # Process work
  sleep 1
end
puts "Shutting down gracefully"

# Handle child process termination
child_count = 0
Signal.trap("CHLD") do
  # Reap terminated children
  begin
    loop do
      pid = Process.wait(-1, Process::WNOHANG)
      break if pid.nil?
      child_count -= 1
      puts "Child #{pid} terminated"
    end
  rescue Errno::ECHILD
    # No children left to reap
  end
end

# Spawn workers
5.times do
  Process.fork { sleep rand(1..5); exit }
  child_count += 1
end

# Wait for all workers
sleep 0.1 until child_count.zero?

Signal handlers run in a restricted context. Handlers should set flags or write to pipes rather than performing complex operations. The CHLD signal requires non-blocking wait to avoid blocking on still-running children.
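Writing to a pipe from a handler is often called the self-pipe technique. The sketch below is one possible shape for it, using USR1 and sending the signal to the current process purely for demonstration; the variable names are illustrative.

```ruby
self_reader, self_writer = IO.pipe

# The handler does one small non-blocking write and nothing else.
previous = Signal.trap("USR1") do
  self_writer.write_nonblock("U", exception: false)
end

Process.kill("USR1", Process.pid)   # stand-in for a signal from outside

# Main loop side: wake up when a byte arrives, then do the real work here,
# outside the restricted handler context.
ready, = IO.select([self_reader], nil, nil, 5)
received = ready ? self_reader.read_nonblock(1) : nil
puts "received #{received.inspect}"

# Restore the earlier handler so the demonstration leaves no trace.
Signal.trap("USR1", previous || "DEFAULT")
self_reader.close
self_writer.close
```

Because the main loop waits with IO.select, the same loop can watch sockets and other descriptors alongside the signal pipe.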

Process Termination

Ruby processes exit through multiple mechanisms. Reaching the end of the program exits with status 0. An explicit exit call ends the process with the specified status code after running at_exit handlers:

# Normal completion (implicit exit 0)
puts "Work complete"
# Ruby exits with status 0

# Explicit exit with status
if error_condition
  exit 1  # Shell sees failure
end

# Exit without running at_exit handlers
at_exit { puts "Cleanup" }
exit!   # at_exit block does not run

# Abort with error message
abort "Fatal error occurred"  # Prints to stderr, exits 1

The at_exit method registers cleanup handlers that run during normal exit, in reverse registration order. Calling exit! bypasses at_exit handlers and finalizers and terminates immediately. Calling abort prints an error message to stderr and exits with status 1.
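The handler ordering and the effect of exit! can be checked by running each case in a child and capturing its output. This is a sketch; the helper name run_in_child is hypothetical.

```ruby
# Run the block in a child whose standard output is captured through a pipe.
# Returns the captured output and the child's exit status.
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    $stdout.reopen(writer)
    writer.close
    yield
  end
  writer.close
  output = reader.read
  reader.close
  Process.wait(pid)
  [output, $?.exitstatus]
end

normal_output, normal_code = run_in_child do
  at_exit { puts "first registered" }
  at_exit { puts "second registered" }
  exit 3
end

skipped_output, skipped_code = run_in_child do
  at_exit { puts "never printed" }
  exit!(4)
end

puts normal_output
puts "exit codes: #{normal_code}, #{skipped_code}"
```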

Killing Processes

Process.kill sends signals to processes. The method requires a signal name or number and target PID:

# Send SIGTERM (request termination)
pid = Process.spawn("long_running_task")
sleep 5
Process.kill("TERM", pid)

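# Force kill if SIGTERM fails
Process.kill("TERM", pid)
sleep 2
begin
  # Reap first: an exited but unreaped child still answers signal 0
  if Process.wait(pid, Process::WNOHANG)
    puts "Process already terminated"
  else
    Process.kill(0, pid)       # Check if process exists
    Process.kill("KILL", pid)  # Force termination
    Process.wait(pid)          # Collect exit status
  end
rescue Errno::ESRCH, Errno::ECHILD
  puts "Process already terminated"
end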

# Signal multiple processes
pids = 3.times.map { Process.spawn("worker") }
Process.kill("TERM", *pids)
pids.each { |pid| Process.wait(pid) }

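Signal 0 tests whether a process exists without affecting it. A child that has exited but has not been reaped still counts as existing, so for direct children a non-blocking Process.wait is the more reliable check. The method raises Errno::ESRCH when the target process does not exist and Errno::EPERM when lacking permission to signal the process.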
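The TERM-then-KILL escalation can be wrapped in a helper. The following is a sketch under stated assumptions: the name terminate_child, the grace parameter, and the polling interval are all illustrative, and the helper only works for direct children of the calling process.

```ruby
# Ask a child to stop, escalate to KILL after a grace period, and always reap it.
def terminate_child(pid, grace: 2.0)
  Process.kill("TERM", pid)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace
  loop do
    return $? if Process.wait(pid, Process::WNOHANG)
    break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep 0.05
  end
  Process.kill("KILL", pid)
  Process.wait(pid)
  $?
end

# A child that ignores TERM; the pipe confirms the handler is in place first.
ready_reader, ready_writer = IO.pipe
stubborn = fork do
  ready_reader.close
  Signal.trap("TERM", "IGNORE")
  ready_writer.puts "ready"
  ready_writer.close
  sleep
end
ready_writer.close
ready_reader.gets
ready_reader.close
stubborn_status = terminate_child(stubborn, grace: 0.3)

# A child with default signal handling stops on TERM.
cooperative = fork { sleep }
cooperative_status = terminate_child(cooperative)

puts "stubborn ended by signal #{stubborn_status.termsig}"
puts "cooperative ended by signal #{cooperative_status.termsig}"
```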

Practical Examples

Parallel Data Processing

Applications processing large datasets benefit from parallel execution across multiple processes. Each worker handles a portion of the data independently:

class ParallelProcessor
  def initialize(worker_count: 4)
    @worker_count = worker_count
    @workers = {}
  end

  def process_files(file_list)
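    # Nothing to do for an empty list (each_slice(0) raises ArgumentError)
    return {} if file_list.empty?

    # Divide files among workers
    chunks = file_list.each_slice(
      (file_list.size / @worker_count.to_f).ceil
    ).to_a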

    # Spawn worker processes
    chunks.each_with_index do |chunk, index|
      pid = Process.fork do
        process_chunk(chunk, index)
      end
      @workers[pid] = chunk
    end

    # Wait for all workers and collect results
    results = {}
    @workers.each do |pid, chunk|
      Process.wait(pid)
      status = $?
      results[pid] = {
        chunk: chunk,
        success: status.success?,
        exit_code: status.exitstatus
      }
    end

    results
  end

  private

  def process_chunk(files, worker_id)
    puts "Worker #{worker_id} (#{Process.pid}): processing #{files.size} files"
    
    files.each do |file|
      begin
        data = File.read(file)
        # Process data
        processed = data.upcase  # Example transformation
        File.write(file + ".processed", processed)
      rescue => e
        $stderr.puts "Worker #{worker_id} error: #{e.message}"
        exit 1
      end
    end

    exit 0
  end
end

# Usage
files = Dir.glob("data/*.txt")
processor = ParallelProcessor.new(worker_count: 4)
results = processor.process_files(files)

results.each do |pid, info|
  status = info[:success] ? "succeeded" : "failed (#{info[:exit_code]})"
  puts "Worker #{pid}: #{info[:chunk].size} files #{status}"
end

This pattern distributes work without shared state, maximizing parallelism. Each worker operates independently with its own memory space. The parent coordinates execution and collects results through exit codes.

Background Job Processing

Long-running tasks benefit from background execution. The parent process spawns workers for CPU-intensive operations while remaining responsive:

class BackgroundJobRunner
  def initialize
    @jobs = {}
  end

  def submit_job(name, &block)
    read_pipe, write_pipe = IO.pipe

    pid = Process.fork do
      read_pipe.close
      begin
        result = block.call
        # Send result back to parent
        Marshal.dump({success: true, result: result}, write_pipe)
      rescue => e
        Marshal.dump({success: false, error: e.message}, write_pipe)
      ensure
        write_pipe.close
        exit 0
      end
    end

    write_pipe.close
    @jobs[pid] = {
      name: name,
      pipe: read_pipe,
      started_at: Time.now
    }

    pid
  end

  def check_job(pid)
    return nil unless @jobs.key?(pid)

    result = Process.wait(pid, Process::WNOHANG)
    return :running if result.nil?

    # Job completed, read result
    job = @jobs.delete(pid)
    data = Marshal.load(job[:pipe])
    job[:pipe].close

    {
      name: job[:name],
      duration: Time.now - job[:started_at],
      status: $?,
      data: data
    }
  end

  def wait_for_job(pid)
    return nil unless @jobs.key?(pid)

    job = @jobs.delete(pid)
    # Read before waiting: a child blocked on a full pipe cannot exit
    data = Marshal.load(job[:pipe])
    job[:pipe].close
    Process.wait(pid)

    {
      name: job[:name],
      duration: Time.now - job[:started_at],
      status: $?,
      data: data
    }
  end
end

# Usage
runner = BackgroundJobRunner.new

# Submit jobs
pid1 = runner.submit_job("calculate_pi") do
  # Expensive calculation
  (1..1000000).sum { |k| (-1)**(k+1) / (2.0*k - 1) } * 4
end

pid2 = runner.submit_job("process_image") do
  sleep 2  # Simulate image processing
  "processed_image.jpg"
end

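# Check job status periodically
# check_job returns :running (truthy) while a job is active, so store
# finished results instead of testing the return value directly
results = {}
until results.size == 2
  [pid1, pid2].each do |pid|
    next if results.key?(pid)
    outcome = runner.check_job(pid)
    results[pid] = outcome unless outcome == :running
  end

  puts "Job 1: #{results[pid1] || 'running'}"
  puts "Job 2: #{results[pid2] || 'running'}"

  sleep 0.5 unless results.size == 2
end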

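This pattern uses pipes for inter-process communication. The child serializes results through Marshal and sends them to the parent. The parent polls for completion or blocks until jobs finish. When the parent waits for exit before reading, results must fit in the pipe buffer (64 KB by default on Linux), because a child blocked on a full pipe never exits.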

Daemon Process Creation

Services running continuously in the background require proper daemon initialization:

class DaemonProcess
  def self.daemonize(name:, pid_file:, log_file:)
    # Already a daemon?
    return if Process.ppid == 1

    # First fork
    pid = Process.fork
    exit if pid  # Parent exits

    # Create new session
    Process.setsid

    # Second fork to prevent controlling terminal acquisition
    pid = Process.fork
    exit if pid  # First child exits

    # Write PID file
    File.write(pid_file, Process.pid)

    # Change to root directory
    Dir.chdir("/")

    # Set file creation mask
    File.umask(0o027)

    # Close inherited file descriptors
    ObjectSpace.each_object(IO) do |io|
      io.close unless [STDIN, STDOUT, STDERR].include?(io)
    end

    # Redirect standard streams
    STDIN.reopen("/dev/null", "r")
    STDOUT.reopen(log_file, "a")
    STDERR.reopen(log_file, "a")
    STDOUT.sync = true
    STDERR.sync = true

    # Setup signal handlers
    setup_signal_handlers(pid_file)

    puts "#{name} daemon started: PID #{Process.pid}"
  end

  def self.setup_signal_handlers(pid_file)
    @shutdown = false

    Signal.trap("TERM") do
      puts "SIGTERM received, shutting down"
      @shutdown = true
    end

    Signal.trap("INT") do
      puts "SIGINT received, shutting down"
      @shutdown = true
    end

    at_exit do
      File.delete(pid_file) if File.exist?(pid_file)
    end
  end

  def self.running?
    @shutdown == false
  end
end

# Usage
DaemonProcess.daemonize(
  name: "worker_daemon",
  pid_file: "/var/run/worker.pid",
  log_file: "/var/log/worker.log"
)

# Main daemon loop
counter = 0
while DaemonProcess.running?
  puts "Daemon iteration #{counter += 1}: #{Time.now}"
  sleep 10
end

puts "Daemon shutdown complete"

The daemon disconnects from the controlling terminal, redirects output to log files, and responds to termination signals. The PID file enables monitoring tools to track the daemon process.

Supervisor Process Pattern

Supervisor processes monitor and restart child workers that crash or become unresponsive:

class ProcessSupervisor
  def initialize(worker_count:, &worker_block)
    @worker_count = worker_count
    @worker_block = worker_block
    @workers = {}
    @shutdown = false
    setup_signals
  end

  def run
    # Start initial workers
    @worker_count.times { spawn_worker }

    # Supervisor loop
    until @shutdown
      begin
        pid = Process.wait(-1, Process::WNOHANG)
        
        if pid
          # Worker terminated
          worker = @workers.delete(pid)
          status = $?
          
          if status.success?
            puts "Worker #{pid} exited normally"
          else
            puts "Worker #{pid} crashed: exit=#{status.exitstatus}"
          end

          # Respawn if not shutting down
          spawn_worker unless @shutdown
        end

        sleep 0.5
      rescue Errno::ECHILD
        # No children to wait for
        break if @shutdown
        sleep 1
      end
    end

    # Shutdown: terminate all workers
    shutdown_workers
  end

  private

  def spawn_worker
    pid = Process.fork do
      @worker_block.call
    end
    @workers[pid] = { started_at: Time.now }
    puts "Spawned worker #{pid}"
  end

  def setup_signals
    Signal.trap("TERM") { @shutdown = true }
    Signal.trap("INT") { @shutdown = true }
    Signal.trap("CHLD") { }  # Handled in wait loop
  end

  def shutdown_workers
    puts "Shutting down #{@workers.size} workers"
    
    # Send TERM to all workers
    @workers.keys.each do |pid|
      Process.kill("TERM", pid) rescue nil
    end

    # Wait up to 10 seconds for graceful shutdown
    deadline = Time.now + 10
    while Time.now < deadline && @workers.any?
      pid = Process.wait(-1, Process::WNOHANG)
      @workers.delete(pid) if pid
      sleep 0.1
    end

    # Force kill remaining workers
    @workers.keys.each do |pid|
      puts "Force killing worker #{pid}"
      Process.kill("KILL", pid) rescue nil
      Process.wait(pid) rescue nil  # Reap so no zombie is left behind
      @workers.delete(pid)
    end
  end
end

# Usage
supervisor = ProcessSupervisor.new(worker_count: 3) do
  # Worker code
  puts "Worker #{Process.pid} started"
  
  # Simulate work with occasional crashes
  10.times do |i|
    puts "Worker #{Process.pid}: iteration #{i}"
    sleep 2
    exit 1 if rand < 0.2  # 20% crash rate
  end
  
  puts "Worker #{Process.pid} completed"
  exit 0
end

supervisor.run

The supervisor maintains a stable worker pool, automatically replacing failed workers. This pattern provides fault tolerance for long-running services.

Error Handling & Edge Cases

Zombie Process Prevention

Zombie processes occur when parents fail to call wait on terminated children. The child's process table entry remains until the parent collects its exit status:

# Creates zombies - parent never waits
10.times do
  Process.fork { exit }
end
sleep 10  # Zombies accumulate
# Check with: ps aux | grep defunct

# Correct approach - wait for children
children = 10.times.map do
  Process.fork { exit }
end
children.each { |pid| Process.wait(pid) }

# Alternative - async signal handler
Signal.trap("CHLD") do
  begin
    loop do
      pid = Process.wait(-1, Process::WNOHANG)
      break unless pid
      puts "Reaped child #{pid}"
    end
  rescue Errno::ECHILD
    # No children left to reap
  end
end

10.times { Process.fork { exit } }
sleep 1  # Handler prevents zombies

Long-running parent processes must reap children continuously. Signal handlers enable automatic cleanup without blocking the main thread. Process.detach provides another solution by creating a thread that waits for the specified child.
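A brief sketch of Process.detach follows. Calling value on the returned thread blocks until the child exits, which is done here only to make the result visible; a long-running parent would normally let the thread finish in the background.

```ruby
pid = fork { exit!(7) }

reaper = Process.detach(pid)   # background thread that waits on the child
status = reaper.value          # Process::Status once the child has exited

# The thread has already collected the status, so a second wait finds no child.
already_reaped =
  begin
    Process.wait(pid)
    false
  rescue Errno::ECHILD
    true
  end

puts "child exited with #{status.exitstatus}, already reaped: #{already_reaped}"
```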

Orphan Process Handling

Orphan processes lose their parent when the parent exits before the child. The init process (PID 1), or on Linux a designated subreaper process, adopts orphans and reaps them when they terminate:

# Parent exits immediately, child becomes orphan
Process.fork do
  # Child process
  puts "Child PPID: #{Process.ppid}"  # Parent PID
  sleep 5
  puts "Child PPID after wait: #{Process.ppid}"  # Typically 1 (init) or a subreaper's PID
  exit
end
# Parent exits, child continues as orphan

Intentional orphaning creates daemon processes. Accidental orphaning occurs from parent crashes or improper termination. Re-parenting changes only the parent PID: orphaned processes keep the resource limits, environment, and open file descriptors they already had.
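Re-parenting can be observed without terminating the current program by using an intermediate process. In this sketch (names are illustrative), the intermediate child exits immediately and the grandchild reports its parent PID once it changes. The new parent's PID depends on the system, so the example does not assume it is 1.

```ruby
reader, writer = IO.pipe

middle = fork do
  reader.close
  original_parent = Process.pid
  fork do
    # Grandchild: poll until the kernel assigns a new parent.
    sleep 0.01 while Process.ppid == original_parent
    writer.puts "#{original_parent} #{Process.ppid}"
    writer.close
    exit!(0)
  end
  exit!(0)   # intermediate process exits without waiting, orphaning the grandchild
end

writer.close
Process.wait(middle)
old_ppid, new_ppid = reader.gets.split.map(&:to_i)
reader.close

puts "grandchild parent changed from #{old_ppid} to #{new_ppid}"
```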

Fork Failures

Fork operations fail when system resource limits prevent process creation:

def safe_fork(&block)
  retries = 0
  max_retries = 3

  begin
    pid = Process.fork(&block)
    return pid
  rescue Errno::EAGAIN => e
    # System process limit reached or temporary failure
    retries += 1
    if retries <= max_retries
      sleep_time = 2 ** retries
      $stderr.puts "Fork failed, retrying in #{sleep_time}s: #{e.message}"
      sleep sleep_time
      retry
    else
      raise
    end
  rescue Errno::ENOMEM => e
    # Insufficient memory
    $stderr.puts "Fork failed - insufficient memory: #{e.message}"
    raise
  end
end

# Usage with error handling
begin
  pid = safe_fork do
    # Child work
    exit 0
  end
  Process.wait(pid)
rescue => e
  $stderr.puts "Failed to spawn child: #{e.message}"
end

Applications should implement retry logic with exponential backoff for transient failures. Resource limits (ulimit -u) control maximum process counts per user.

Signal Delivery Race Conditions

Signals can arrive at inconvenient times, causing race conditions. The CHLD signal may arrive before the parent calls wait:

# Race condition - signal handler and main code both wait
@children = []
Signal.trap("CHLD") do
  # May conflict with explicit wait calls
  pid = Process.wait(-1, Process::WNOHANG)
  @children.delete(pid) if pid
end

5.times do
  pid = Process.fork { sleep rand }
  @children << pid
end

# Both handler and this loop may wait
@children.each { |pid| Process.wait(pid) }  # May raise ECHILD

# Solution - reap only in the handler and record what it reaped
@reaped = []

Signal.trap("CHLD") do
  # No Mutex here: locking inside a trap handler raises ThreadError
  begin
    while (pid = Process.wait(-1, Process::WNOHANG))
      @reaped << pid
    end
  rescue Errno::ECHILD
    # No children left to reap
  end
end

@children = 5.times.map { Process.fork { sleep rand } }

# Wait via flag check instead of blocking wait
sleep 0.1 until (@children - @reaped).empty?

Consistent wait strategies prevent race conditions. Either reap exclusively in the signal handler or leave CHLD untrapped and use explicit waits.

File Descriptor Leaks

Child processes inherit open file descriptors from parents. Failing to close unneeded descriptors wastes resources:

# Parent opens files
input_file = File.open("large_input.dat", "r")
output_file = File.open("results.dat", "w")

# Bad - child inherits file descriptors
pid = Process.fork do
  # Child unnecessarily holds parent's files open
  long_running_computation()
  exit
end

# Good - close inherited descriptors in child
pid = Process.fork do
  input_file.close
  output_file.close
  long_running_computation()
  exit
end

# Better - use spawn with close-on-exec
pid = Process.spawn(
  "computation_program",
  close_others: true  # Close all non-standard file descriptors
)

Process.wait(pid)
input_file.close
output_file.close

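The close_others spawn option closes file descriptors beyond stdin, stdout, and stderr in the child. Setting FD_CLOEXEC on a descriptor closes it on exec, and Ruby (2.0 and later) sets this flag by default on the descriptors it opens. Descriptors still remain open in children created with a plain fork, so those children must close them explicitly.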
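The close-on-exec flag can be inspected and changed per descriptor. This short sketch assumes Ruby 2.0 or later; the variable names are illustrative.

```ruby
file = File.open(File::NULL)
flag_by_default = file.close_on_exec?      # set automatically by Ruby

file.close_on_exec = false                 # opt in to inheritance across exec
flag_after_clearing = file.close_on_exec?

reader, writer = IO.pipe
pipe_flags = [reader.close_on_exec?, writer.close_on_exec?]

puts "default=#{flag_by_default} cleared=#{flag_after_clearing} pipe=#{pipe_flags.inspect}"
[file, reader, writer].each(&:close)
```

Clearing the flag is only needed when a descriptor is meant to survive into the new program.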

Waitpid Errors

Wait operations raise exceptions when child processes do not exist or are already reaped:

def safe_wait(pid, timeout: nil)
  deadline = Time.now + timeout if timeout

  loop do
    begin
      result = Process.wait(pid, Process::WNOHANG)
      return result if result

      if timeout && Time.now >= deadline
        return :timeout
      end

      sleep 0.1
    rescue Errno::ECHILD
      # Child does not exist or already reaped
      return :not_found
    rescue Errno::EINTR
      # Interrupted by signal, retry
      retry
    end
  end
end

# Usage
pid = Process.fork { sleep 2; exit 0 }

result = safe_wait(pid, timeout: 5)
case result
when :timeout
  puts "Child did not complete within timeout"
  Process.kill("KILL", pid) rescue nil
  Process.wait(pid) rescue nil  # Reap the killed child
when :not_found
  puts "Child already terminated"
else
  puts "Child #{result} exited with status #{$?.exitstatus}"
end

Robust wait implementations handle all error conditions and support timeouts for unresponsive processes.

Common Pitfalls

Forgetting to Wait for Children

Parent processes that spawn children without waiting create zombie processes:

# Bad - creates zombies
def process_batch(items)
  items.each do |item|
    Process.fork { process_item(item) }
  end
  # Parent exits or continues without waiting
end

# Good - wait for all children
def process_batch(items)
  pids = items.map { |item| Process.fork { process_item(item) } }
  pids.each { |pid| Process.wait(pid) }
end

Always ensure wait operations match fork operations. Process.detach creates a cleanup thread when manual wait calls are inconvenient.
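A short sketch of Process.detach follows. The exit code 7 is arbitrary, and joining the thread is optional; it is done here only to show that the thread's value is the child's Process::Status.

```ruby
pid = fork { exit 7 }

# Process.detach returns a thread that reaps the child when it exits.
waiter = Process.detach(pid)

# The parent can keep working; ask the thread for the status only if needed.
status = waiter.value
puts "child #{status.pid} exited with #{status.exitstatus}"
```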

Shared File Position After Fork

After fork, parent and child share a single file offset for every file opened before the fork, because both descriptors refer to the same open file description. A read or write in either process moves the offset for both:

file = File.open("data.txt", "r")
content = file.read(100)  # Parent reads 100 bytes

pid = Process.fork do
  # Child inherits the shared offset (100 at fork time)
  more_content = file.read(100)  # Bytes 100-199 if the child reads first, 200-299 otherwise
  puts more_content
  exit
end

# Parent continues reading
even_more = file.read(100)  # Races with the child for the same offset
Process.wait(pid)

Close and reopen files in the child, or use spawn with explicit file redirection to avoid position conflicts.
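The reopen approach can be sketched as follows. The scratch file contents and variable names are assumptions made for the demo; the point is that a file opened inside the child has its own offset, so both processes can read the same byte range without interfering.

```ruby
require "tempfile"

# Scratch file: 100 bytes each of "A", "B", and "C" (illustrative data).
data = Tempfile.new("positions")
data.write("A" * 100 + "B" * 100 + "C" * 100)
data.flush

file = File.open(data.path, "r")
file.read(100)  # Parent consumes the first 100 bytes

reader, writer = IO.pipe
pid = fork do
  reader.close
  # Reopening creates an independent offset that the parent cannot move.
  File.open(data.path, "r") do |own|
    own.seek(100)
    writer.write(own.read(100))
  end
  writer.close
  exit!(0)  # Skip at_exit handlers inherited from the parent
end
writer.close

parent_chunk = file.read(100)  # Unaffected by the child's reads
child_chunk = reader.read
reader.close
Process.wait(pid)

puts "parent read #{parent_chunk[0]} x #{parent_chunk.size}, child read #{child_chunk[0]} x #{child_chunk.size}"
file.close
data.close!
```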

Signal Handlers with Fork

A forked child inherits the parent's signal handlers, which may act on state that only makes sense in the parent:

@server_socket = TCPServer.new(8080)

Signal.trap("TERM") do
  @server_socket.close  # Parent closes socket
  exit
end

pid = Process.fork do
  # Child inherits the TERM handler
  # The handler closes the child's copy of @server_socket, although it was written for the parent
  loop { sleep 1 }
end

# TERM sent to the child runs the parent's cleanup logic inside the child
Process.kill("TERM", pid)
Process.wait(pid)

Reset or reconfigure signal handlers in child processes to use child-specific resources and state.
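A minimal sketch of resetting a handler in the child, assuming the child should simply die on TERM rather than run the parent's cleanup. The parent handler body is a placeholder, and the pipe exists only so the demo sends the signal after the child has reset its trap.

```ruby
# Parent-specific handler (placeholder body for illustration).
Signal.trap("TERM") { puts "parent cleanup"; exit }

reader, writer = IO.pipe
pid = fork do
  reader.close
  # Drop the inherited handler and restore default TERM behavior.
  Signal.trap("TERM", "DEFAULT")
  writer.puts "ready"
  writer.close
  sleep 30
end
writer.close
reader.gets  # Wait until the child has reset its handler
reader.close

Process.kill("TERM", pid)
_, status = Process.wait2(pid)
puts "child terminated by signal: #{status.signaled?}"
```

A child with its own resources would install a new block here instead of "DEFAULT", closing only what the child itself owns.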

Memory Bloat Before Fork

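Copy-on-write shares the parent's memory pages with each child only until one side writes to them. A large parent heap can therefore end up duplicated, page by page, in every child: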

# Bad - allocates large data before fork
huge_data = Array.new(1_000_000) { rand }

10.times do
  Process.fork do
    # Child has copy-on-write access to huge_data
    # Writing to huge_data, or allocating into heap pages shared with the parent, copies those pages
    small_computation()
    exit
  end
end
Process.waitall  # Reap all children

# Good - allocate only what's needed before fork
10.times do
  Process.fork do
    # Allocate needed data in child
    needed_data = load_relevant_data()
    computation(needed_data)
    exit
  end
end
Process.waitall  # Reap all children

Fork immediately after startup before loading large datasets when children do not need parent data.

Ignoring Exit Status

Child exit codes communicate errors, but parents often ignore them:

# Bad - ignores child failures
pid = Process.spawn("important_operation")
Process.wait(pid)
# Continues regardless of success or failure

# Good - checks exit status
pid = Process.spawn("important_operation")
Process.wait(pid)
unless $?.success?
  $stderr.puts "Operation failed with exit code #{$?.exitstatus}"
  exit 1
end

Check the Process::Status in $? (or the one returned by Process.wait2) after every wait, and run dependent commands only when the previous command succeeded.
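The following sketch distinguishes a normal failure from termination by signal, since exitstatus is nil in the second case. run_checked is a hypothetical helper, and the child commands are small Ruby one-liners launched through RbConfig.ruby so the example does not depend on any external program.

```ruby
require "rbconfig"

# Runs a command and describes how it ended.
def run_checked(*command)
  pid = Process.spawn(*command)
  _, status = Process.wait2(pid)

  if status.success?
    "succeeded"
  elsif status.signaled?
    "killed by signal #{status.termsig}"
  else
    "failed with exit code #{status.exitstatus}"
  end
end

ok     = run_checked(RbConfig.ruby, "-e", "exit 0")
failed = run_checked(RbConfig.ruby, "-e", "exit 3")
killed = run_checked(RbConfig.ruby, "-e", "Process.kill('KILL', Process.pid)")

puts ok, failed, killed
```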

Deadlock in Fork with Mutexes

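Only the thread that calls fork survives in the child. A lock held by any other thread at that moment has no owner left to release it, and the data it guards may be half-updated:

mutex = Mutex.new
data = []

# Thread acquires mutex
Thread.new do
  mutex.synchronize do
    sleep 5  # Holds mutex
  end
end

sleep 1

# Fork while thread holds mutex
pid = Process.fork do
  # Child only has the forking thread; the mutex owner does not exist here
  # CRuby abandons Ruby-level Mutex objects owned by vanished threads, so this
  # call may succeed, but locks taken inside C extensions or native libraries
  # stay locked and can deadlock the child
  mutex.synchronize do
    data << "value"
  end
  exit
end

Process.wait(pid)  # Hangs forever if the child deadlocks

Avoid fork in multi-threaded programs, or use spawn instead. If fork is unavoidable, fork before starting threads and reinitialize locks and the state they protect in the child.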

Resource Limit Inheritance

Children inherit parent resource limits, potentially causing unexpected failures:

# Parent sets restrictive limits
Process.setrlimit(:NOFILE, 10)  # Soft and hard limit: at most 10 open files

pid = Process.fork do
  # Child inherits limit
  files = 15.times.map { |i| File.open("file#{i}", "w") }  # Raises Errno::EMFILE at the limit
  exit
end

Process.wait(pid)
puts "Child exit: #{$?.exitstatus}"  # Non-zero

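Set limits explicitly for the child when its requirements differ from the parent's, for example with the rlimit_* spawn options. An unprivileged process can lower its limits or raise a soft limit up to the hard limit, but it cannot raise a hard limit it inherited.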
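As a sketch of per-child limits, the rlimit_nofile spawn option applies a limit to the new process only and leaves the parent untouched. The value 256 is an arbitrary illustration, capped at the parent's soft limit so the example also runs where limits are already low.

```ruby
require "rbconfig"

soft, hard = Process.getrlimit(:NOFILE)
child_soft = [soft, 256].min  # Illustrative value, never above the parent's soft limit

reader, writer = IO.pipe
pid = Process.spawn(
  RbConfig.ruby, "-e", "puts Process.getrlimit(:NOFILE).first",
  out: writer,
  rlimit_nofile: [child_soft, hard]  # Applied in the child only
)
writer.close
child_reported = reader.read.to_i
reader.close
Process.wait(pid)

puts "parent soft limit: #{Process.getrlimit(:NOFILE).first}, child soft limit: #{child_reported}"
```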

Reference

Process Creation Methods

Method | Description | Return Value | Use Case
Process.fork | Creates child process (clone of parent) | Parent: child PID; child: nil | Need exact parent copy or setup before exec
Process.spawn | Creates process running specified command | Child PID | Standard command execution with options
Process.exec | Replaces current process with new program | Does not return on success | Current process has no further work
Process.detach | Creates thread to wait for child | Thread object | Fire-and-forget child processes

Wait Operations

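Method | Blocking | Parameters | Description
Process.wait | Yes, unless WNOHANG is set | pid=-1, flags=0 | Waits for any child (pid=-1) or a specific PID; returns the PID
Process.wait2 | Yes, unless WNOHANG is set | pid=-1, flags=0 | Like Process.wait, but returns [pid, status]
Process.waitpid | Yes, unless WNOHANG is set | pid=-1, flags=0 | Alias for Process.wait
Process.waitall | Yes | None | Waits for all children; returns an array of [pid, status] pairs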

Wait Flags

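Flag | Value (Linux) | Description
Process::WNOHANG | 1 | Return immediately if no child has exited
Process::WUNTRACED | 2 | Also report stopped children (SIGSTOP)
WCONTINUED | 8 | Also report continued children (SIGCONT); a waitpid(2) flag that CRuby does not appear to define as a Process constant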

Common Signals

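Signal numbers below are the Linux values on x86 and ARM. Several differ on other platforms, so prefer signal names or look numbers up with Signal.list.

Signal | Number | Default Action | Can Catch | Description
SIGTERM | 15 | Terminate | Yes | Polite termination request
SIGKILL | 9 | Terminate | No | Forced termination
SIGINT | 2 | Terminate | Yes | Interrupt (Ctrl+C)
SIGCHLD | 17 | Ignore | Yes | Child process state change
SIGSTOP | 19 | Stop | No | Forced process suspension
SIGCONT | 18 | Continue | Yes | Resume stopped process
SIGHUP | 1 | Terminate | Yes | Terminal disconnect
SIGUSR1 | 10 | Terminate | Yes | User-defined signal
SIGUSR2 | 12 | Terminate | Yes | User-defined signal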
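Because several signal numbers vary between platforms, a quick lookup with Signal.list shows the values on the machine at hand. This is a small sketch; the list of names simply mirrors the table.

```ruby
names = %w[TERM KILL INT CHLD STOP CONT HUP USR1 USR2]

# Signal.list maps names (without the SIG prefix) to this platform's numbers.
numbers = names.to_h { |name| [name, Signal.list.fetch(name)] }

numbers.each do |name, number|
  puts format("SIG%-5s %d", name, number)
end
```

Passing names such as "TERM" to Process.kill and Signal.trap avoids depending on the numbers at all.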

Process::Status Methods

Method | Return Type | Description
exitstatus | Integer or nil | Exit code if exited normally
success? | Boolean or nil | True if exit code was 0
termsig | Integer or nil | Signal number that terminated process
stopsig | Integer or nil | Signal number that stopped process
signaled? | Boolean | True if terminated by signal
stopped? | Boolean | True if stopped by signal
exited? | Boolean | True if exited normally
coredump? | Boolean | True if core dump generated
pid | Integer | Process ID

Spawn Options

Option | Type | Description | Example
chdir | String | Change working directory before exec | chdir: "/var/data"
umask | Integer | Set file creation mask | umask: 0o022
pgroup | Boolean/Integer | Process group control | pgroup: true
rlimit_cpu | [soft, hard] | CPU time limit in seconds | rlimit_cpu: [60, 60]
rlimit_nproc | [soft, hard] | Maximum process count | rlimit_nproc: [100, 100]
in | String/IO/Integer | Redirect stdin | in: "/dev/null"
out | String/IO/Integer | Redirect stdout | out: "output.log"
err | String/IO/Integer | Redirect stderr | err: "error.log"
close_others | Boolean | Close non-standard file descriptors | close_others: true
unsetenv_others | Boolean | Clear environment except specified | unsetenv_others: true
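Several of these options can be combined in one call. The sketch below assumes a throwaway directory and a hypothetical GREETING variable, and runs a Ruby one-liner as the child so that it needs no external program.

```ruby
require "rbconfig"
require "tmpdir"

expected_dir = nil
lines = Dir.mktmpdir do |dir|
  expected_dir = File.realpath(dir)
  log_path = File.join(dir, "output.log")

  pid = Process.spawn(
    { "GREETING" => "hello" },  # Extra environment for the child (hypothetical variable)
    RbConfig.ruby, "-e", 'puts Dir.pwd; puts ENV["GREETING"]',
    chdir: dir,                 # Child starts in the scratch directory
    in: File::NULL,             # stdin reads from the null device
    out: log_path               # stdout goes to a log file
  )
  Process.wait(pid)
  File.readlines(log_path, chomp: true)
end

puts lines
```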

Exit Methods

Method | Description | Runs at_exit | Raises SystemExit
exit | Normal exit with status code | Yes | Yes (can be rescued)
exit! | Immediate exit, no cleanup | No | No
abort | Print message to stderr, exit 1 | Yes | Yes
Kernel.exit | Same as exit | Yes | Yes
raise SystemExit | Exit via exception | Yes | Yes (exits only if uncaught)
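The differences between exit and exit! can be observed in forked children. In this sketch run_in_child is a hypothetical helper that captures whatever the child writes to a pipe along with its exit code; the status codes are arbitrary.

```ruby
# Runs the block in a forked child; returns [captured output, exit code].
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.sync = true  # Write through immediately; exit! does not flush buffers
    yield writer
  end
  writer.close
  output = reader.read
  reader.close
  _, status = Process.wait2(pid)
  [output, status.exitstatus]
end

# exit runs at_exit handlers before the process ends.
with_exit = run_in_child do |out|
  at_exit { out.puts "at_exit ran" }
  exit 2
end

# exit! ends the process immediately and skips at_exit handlers.
with_exit_bang = run_in_child do |out|
  at_exit { out.puts "at_exit ran" }
  exit!(3)
end

# exit raises SystemExit, so a rescue clause can intercept it.
rescued = run_in_child do |out|
  begin
    exit 4
  rescue SystemExit => e
    out.puts "rescued SystemExit with status #{e.status}"
  end
  exit!(0)
end

p with_exit, with_exit_bang, rescued
```

The rescue case is why broad rescue Exception clauses can accidentally keep a process alive after exit has been called.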

Resource Limit Types

Type | Description | Units
RLIMIT_CPU | Maximum CPU time | Seconds
RLIMIT_FSIZE | Maximum file size | Bytes
RLIMIT_DATA | Maximum data segment | Bytes
RLIMIT_STACK | Maximum stack size | Bytes
RLIMIT_CORE | Maximum core file size | Bytes
RLIMIT_RSS | Maximum resident set size | Bytes
RLIMIT_NPROC | Maximum process count | Count
RLIMIT_NOFILE | Maximum open files | Count
RLIMIT_MEMLOCK | Maximum locked memory | Bytes

Process States

State | Description | Can Transition To
Running | Executing on CPU | Ready, Waiting, Stopped, Zombie
Ready | Waiting for CPU | Running
Waiting | Blocked on I/O or event | Ready, Zombie
Stopped | Suspended by signal | Ready (after SIGCONT), Zombie
Zombie | Terminated, awaiting reap | None (removed after wait)

File Descriptor Handling

Approach | Code Example | Description
Manual close | file.close | Close specific descriptor in child
close-on-exec flag | fd.fcntl(Fcntl::F_SETFD, Fcntl::FD_CLOEXEC) | Auto-close on exec
Spawn close_others | spawn(..., close_others: true) | Close all non-standard descriptors
Redirect to /dev/null | spawn(..., in: "/dev/null") | Explicitly redirect stream