spawn and exec

Ruby's spawn and exec methods for creating and replacing processes to execute external commands and programs.

Overview

Ruby provides two primary methods for process creation and replacement: spawn and exec. The spawn method creates a child process to execute a command and returns immediately with the process ID, while exec replaces the current Ruby process entirely with the new command. Both are defined as Kernel methods and as Process module functions (Process.spawn, Process.exec), so they can be called without a receiver anywhere in a program.

The spawn method creates a new process without waiting for completion, making it ideal for running background tasks or multiple concurrent processes. The parent Ruby process continues execution while the child process runs independently. The method returns the process ID (PID) of the created process, which can be used with Process.wait or similar methods to manage the child process lifecycle.

# Basic spawn usage
pid = spawn("echo 'Hello from child process'")
Process.wait(pid)
# => Child process outputs: Hello from child process

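The exec method replaces the program running in the current process with the specified command. The process keeps its PID, but the Ruby interpreter and all of its state are gone, so no Ruby code after a successful exec call runs. If the command cannot be started, exec raises a SystemCallError such as Errno::ENOENT and execution continues. This method is commonly used in fork-exec patterns where a process forks and then immediately execs a new program.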

# exec replaces current process - no code after this runs
exec("echo 'Current process replaced'")
puts "This line never executes"

Both methods accept a command either as a single string or as a program name followed by separate argument strings. The multi-argument form bypasses the shell, so each argument reaches the program exactly as written, which gives better argument separation and security. The methods support extensive options for environment variables, file descriptor redirection, process groups, and working directory changes.

# Multi-argument form avoids shell interpretation
pid = spawn("echo", "Safe argument handling")
Process.wait(pid)
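
To make the difference between the two command forms visible, the following sketch captures a child's output through a pipe. The helper name capture_stdout is introduced here for illustration only, and the example assumes a Unix-like system where echo and /bin/sh are available.

```ruby
# Hypothetical helper: run a command with spawn and return what it wrote to stdout.
def capture_stdout(*command)
  reader, writer = IO.pipe
  pid = spawn(*command, out: writer)
  writer.close              # close our copy so read sees EOF once the child exits
  output = reader.read
  Process.wait(pid)
  output
ensure
  [reader, writer].each { |io| io.close if io && !io.closed? }
end

text = "$HOME; echo injected"

# Single string: the shell expands $HOME and runs the second command.
shell_output = capture_stdout("echo #{text}")

# Separate arguments: echo receives the text verbatim.
direct_output = capture_stdout("echo", text)

puts "shell:  #{shell_output.inspect}"
puts "direct: #{direct_output.inspect}"
```

The direct form prints the literal text, while the string form lets the shell treat the semicolon as a command separator.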

Basic Usage

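The spawn method accepts commands in multiple formats. A single string that contains shell metacharacters is run through the shell (/bin/sh on Unix-like systems), while a simple string is split on whitespace and executed directly. Passing the program and each argument as separate strings bypasses the shell entirely. A two-element array [program, argv0] as the first argument additionally sets the name the program sees as argv[0].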

# String form - shell interprets command
pid = spawn("ls -la /tmp")
Process.wait(pid)

# Multi-argument form - direct execution
pid = spawn("ls", "-la", "/tmp")
Process.wait(pid)

# [program, argv0] array overrides the name passed as argv[0]
pid = spawn(["ruby", "my_script"], "/path/to/script.rb")
Process.wait(pid)

Environment variable modification occurs through a hash as the first argument. The hash is merged into the environment the child inherits: string values set or override variables, and nil values remove them. The parent process environment is not affected.

# Set environment variables
pid = spawn({"DEBUG" => "true", "PATH" => "/custom/bin"}, "my_program")
Process.wait(pid)

# Remove environment variable 
pid = spawn({"SENSITIVE_VAR" => nil}, "secure_program")
Process.wait(pid)

# Modify existing variable
current_path = ENV["PATH"]
pid = spawn({"PATH" => "#{current_path}:/additional/bin"}, "program")
Process.wait(pid)
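
The merge behaviour can be checked by letting a child report what it sees. This is a minimal sketch; the DEMO_* variable names are made up for the example, and it assumes RbConfig.ruby points at a runnable interpreter.

```ruby
require "rbconfig"

ENV["DEMO_KEEP"] = "parent-value"
ENV["DEMO_DROP"] = "secret"

# The child prints the three variables exactly as it sees them.
child_code = 'print [ENV["DEMO_KEEP"], ENV["DEMO_DROP"], ENV["DEMO_NEW"]].inspect'

reader, writer = IO.pipe
pid = spawn({ "DEMO_DROP" => nil, "DEMO_NEW" => "child-only" },
            RbConfig.ruby, "-e", child_code, out: writer)
writer.close
child_view = reader.read
reader.close
Process.wait(pid)

puts "child sees:  #{child_view}"
puts "parent sees: #{[ENV["DEMO_KEEP"], ENV["DEMO_DROP"], ENV["DEMO_NEW"]].inspect}"
```

The inherited variable passes through untouched, the nil entry is removed for the child only, and the new variable never appears in the parent.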

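File descriptor redirection controls input, output, and error streams of the spawned process. Redirection keys are :in, :out, :err, an integer descriptor, or an IO object; destinations can be a file path, an IO object, another stream symbol such as :out, or an array of [path, mode] to control how the file is opened.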

# Redirect output to file
pid = spawn("date", :out => "/tmp/current_date.txt")
Process.wait(pid)

# Redirect stderr to stdout
pid = spawn("some_command", :err => :out)
Process.wait(pid)

# Redirect input from file
pid = spawn("sort", :in => "/tmp/unsorted_data.txt", 
                   :out => "/tmp/sorted_data.txt")
Process.wait(pid)
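
The array forms combine naturally: a key of [:out, :err] sends both streams to one destination, and a [path, mode] value opens the file in a chosen mode. The sketch below appends two runs to one log file; the file name and the child script are placeholders for this example.

```ruby
require "rbconfig"
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir
log_path = File.join(dir, "combined.log")
child_code = '$stdout.puts "to stdout"; $stdout.flush; $stderr.puts "to stderr"'

2.times do
  # "a" opens the log in append mode, so the second run does not truncate the first.
  pid = spawn(RbConfig.ruby, "-e", child_code, [:out, :err] => [log_path, "a"])
  Process.wait(pid)
end

log_lines = File.readlines(log_path, chomp: true)
puts log_lines
FileUtils.remove_entry(dir)
```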

The exec method uses identical command and environment syntax but replaces the current process entirely. Fork-exec patterns combine Process.fork with exec to create child processes that run different programs.

# Fork-exec pattern
pid = fork do
  # Child process code
  exec("echo", "Child process message")
end

# Parent process continues here
puts "Parent process continues"
Process.wait(pid)
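
Because exec replaces the program inside an existing process, the PID returned by fork is also the PID of the program that exec starts. A small sketch that checks this, assuming a platform where fork is available:

```ruby
require "rbconfig"

reader, writer = IO.pipe

pid = fork do
  reader.close
  # The forked child turns into a fresh interpreter that reports its own PID.
  exec(RbConfig.ruby, "-e", "print Process.pid", out: writer)
end

writer.close
reported_pid = reader.read.to_i
reader.close
Process.wait(pid)

puts "fork returned #{pid}, exec'd program reported #{reported_pid}"
```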

Working directory changes apply to the spawned process without modifying the parent process directory. This option works with both spawn and exec.

# Change working directory for child process
pid = spawn("pwd", :chdir => "/tmp")
Process.wait(pid)
# => Outputs: /tmp

puts Dir.pwd
# => Still in original directory

Error Handling & Debugging

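Process execution failures manifest in several ways depending on the failure type and timing. When the command is executed directly, spawn raises exceptions for immediate errors like missing executables or permission problems. When the command goes through the shell, the shell itself starts successfully, so a missing program shows up only as a non-zero exit status (conventionally 127). In both cases spawn returns normally if the process starts, even if it later fails.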

begin
  # Raises Errno::ENOENT if command not found
  pid = spawn("nonexistent_command")
  Process.wait(pid)
rescue Errno::ENOENT => e
  puts "Command not found: #{e.message}"
rescue Errno::EACCES => e
  puts "Permission denied: #{e.message}"
end
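
The two failure modes can be compared side by side. In this sketch the helper launch_outcome and the command name are invented for the example, and the 127 status assumes a POSIX-style /bin/sh.

```ruby
# Hypothetical helper: describe how launching a command turned out.
def launch_outcome(*command)
  pid = spawn(*command, out: File::NULL, err: File::NULL)
  _, status = Process.wait2(pid)
  "started, exit status #{status.exitstatus}"
rescue Errno::ENOENT
  "spawn raised Errno::ENOENT"
end

missing = "no_such_command_for_this_demo"

direct    = launch_outcome(missing)            # simple string: executed directly
via_shell = launch_outcome("#{missing} 2>&1")  # redirection syntax forces the shell

puts "direct:    #{direct}"
puts "via shell: #{via_shell}"
```

Code that only rescues Errno::ENOENT therefore misses failures of shell-interpreted commands; checking the exit status covers both.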

Child process exit status requires explicit checking through Process.wait variants. The $? global variable contains the exit status of the last waited-for child process.

pid = spawn("false")  # Command that exits with status 1
Process.wait(pid)

if $?.success?
  puts "Command succeeded"
else
  puts "Command failed with exit code: #{$?.exitstatus}"
  puts "Terminated by signal: #{$?.termsig}" if $?.signaled?
end
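
Process.wait2 returns the status object directly, which avoids relying on the $? global. The sketch below contrasts a normal exit with death by signal; it assumes a sleep command is available.

```ruby
require "rbconfig"

# Normal exit: the child chooses its own status code.
pid = spawn(RbConfig.ruby, "-e", "exit 3")
_, exited = Process.wait2(pid)

# Signal death: the child never gets to pick a status code.
pid = spawn("sleep", "30")
Process.kill("KILL", pid)
_, killed = Process.wait2(pid)

puts "exited: exitstatus=#{exited.exitstatus.inspect} signaled=#{exited.signaled?}"
puts "killed: exitstatus=#{killed.exitstatus.inspect} termsig=#{killed.termsig}"
```

For a signaled process exitstatus is nil, which is why the signaled? check matters before reporting an exit code.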

Timeout handling prevents hanging on processes that run longer than expected. Process.wait has no timeout parameter, so a limit must be imposed manually with the Timeout module, a non-blocking wait loop, or signal handling.

require 'timeout'

def spawn_with_timeout(command, timeout_seconds)
  pid = spawn(command)
  
  begin
    Timeout.timeout(timeout_seconds) do
      Process.wait(pid)
    end
  rescue Timeout::Error
    # Kill the process and wait for cleanup
    Process.kill("TERM", pid) rescue nil
    sleep 0.1
    Process.kill("KILL", pid) rescue nil
    Process.wait(pid) rescue nil
    raise "Process timed out after #{timeout_seconds} seconds"
  end
end

# Usage with timeout
begin
  spawn_with_timeout("sleep 10", 5)
rescue => e
  puts "Error: #{e.message}"
end
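
An alternative that avoids the Timeout module is to poll with Process::WNOHANG against a monotonic clock. This is a sketch under simple assumptions: run_with_deadline is a hypothetical helper, and it kills the child outright instead of trying TERM first.

```ruby
# Hypothetical helper: returns the Process::Status, or nil if the deadline passed.
def run_with_deadline(*command, timeout:, poll_interval: 0.05)
  pid = spawn(*command)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    # wait2 with WNOHANG returns nil while the child is still running
    _, status = Process.wait2(pid, Process::WNOHANG)
    return status if status
    break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep poll_interval
  end

  Process.kill("KILL", pid)   # deadline passed: stop the child
  Process.wait(pid)           # and reap it so no zombie remains
  nil
end

finished = run_with_deadline("true", timeout: 5)
expired  = run_with_deadline("sleep", "30", timeout: 0.3)

puts "finished: #{finished.inspect}"
puts "expired:  #{expired.inspect}"
```

The monotonic clock keeps the deadline correct even if the system clock is adjusted while waiting.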

File descriptor redirection errors occur when target files cannot be created, opened, or accessed. These errors raise exceptions during the spawn call.

begin
  # Fails if directory doesn't exist or no write permission
  pid = spawn("echo test", :out => "/nonexistent/dir/output.txt")
  Process.wait(pid)
rescue Errno::ENOENT => e
  puts "Output directory not found: #{e.message}"
rescue Errno::EACCES => e  
  puts "Cannot write to output file: #{e.message}"
end

Debugging process execution issues requires examining environment variables, file descriptors, and process state. Adding debug output and logging helps identify problems in complex process hierarchies.

def debug_spawn(command, options = {})
  puts "Spawning: #{command}"
  puts "Options: #{options.inspect}"
  puts "Current dir: #{Dir.pwd}"
  puts "Environment vars: #{ENV.select { |k,v| k.start_with?('DEBUG') }}"
  
  pid = spawn(command, options)
  puts "Spawned PID: #{pid}"
  
  Process.wait(pid)
  puts "Exit status: #{$?.exitstatus}"
  puts "Success: #{$?.success?}"
  
  pid
end

debug_spawn("echo 'Debug test'", :out => "/tmp/debug.log")

Performance & Memory

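Process creation overhead varies significantly between operating systems and depends on factors like executable size, shared libraries, and system load. spawn, system, and backticks start processes through the same underlying mechanism, so their per-call cost is similar: system behaves like spawn followed by a wait, and backticks additionally capture output through a pipe. The larger cost is the extra shell process, which any of them incurs when the command string contains shell metacharacters, as the quoted string in the benchmark below does. Passing the program and its arguments separately to spawn or system avoids that shell.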

require 'benchmark'

command = "echo 'performance test'"

# Benchmark different execution methods
Benchmark.bm(10) do |x|
  x.report("spawn") do
    1000.times do
      pid = spawn(command)
      Process.wait(pid)
    end
  end
  
  x.report("system") do  
    1000.times { system(command) }
  end
  
  x.report("backticks") do
    1000.times { `#{command}` }
  end
end

Memory usage patterns differ between spawn and exec. On Unix-like systems spawn starts the child with fork-style semantics (Ruby uses vfork where it is available) and the child immediately execs the target program, so the parent's memory is not physically duplicated; with a plain fork, copy-on-write defers duplication until pages are modified. The child's footprint is that of the program it runs, not of the parent. The exec method replaces process memory entirely, so the memory held by the Ruby interpreter is released.

# Monitor the parent's own memory (RSS) while it spawns children
def measure_memory_usage
  # Get current process memory (RSS) in KB
  rss = `ps -o rss= -p #{Process.pid}`.to_i
  rss
end

initial_memory = measure_memory_usage
puts "Initial memory: #{initial_memory} KB"

# Create multiple child processes to observe memory impact  
pids = []
10.times do |i|
  pid = spawn("sleep 5")
  pids << pid
  current_memory = measure_memory_usage
  puts "After spawn #{i+1}: #{current_memory} KB (delta: #{current_memory - initial_memory})"
end

pids.each { |pid| Process.kill("TERM", pid) rescue nil }
pids.each { |pid| Process.wait(pid) rescue nil }

Large output handling requires careful file descriptor management to avoid memory accumulation in the parent process. Redirecting output to files or pipes prevents Ruby from buffering large amounts of data.

# Inefficient - large output accumulates in memory
output = `find / -name "*.txt" 2>/dev/null`

# Efficient - output goes directly to file
pid = spawn("find / -name '*.txt'", 
            :out => "/tmp/find_results.txt",
            :err => "/dev/null")
Process.wait(pid)

# Process results from file without loading everything into memory
File.foreach("/tmp/find_results.txt") do |line|
  # Process each line individually
  puts line if line.include?("important")
end
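
When the output does not need to be stored at all, a pipe lets the parent consume it line by line while the child is still running. In this sketch a short Ruby one-liner stands in for a command that produces a lot of output.

```ruby
require "rbconfig"

reader, writer = IO.pipe
producer = "10_000.times { |i| puts i }"   # stand-in for a chatty command
pid = spawn(RbConfig.ruby, "-e", producer, out: writer)
writer.close   # otherwise each_line would never see end-of-file

round_thousands = 0
reader.each_line do |line|
  # Only the current line is held in memory at any time.
  round_thousands += 1 if (line.to_i % 1000).zero?
end
reader.close
_, status = Process.wait2(pid)

puts "matching lines: #{round_thousands}"
```

Reading before waiting matters here: a child that fills the pipe buffer blocks until the parent drains it, so calling Process.wait first could deadlock.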

Concurrent process management requires careful resource tracking to prevent process leaks and resource exhaustion. The system limits the number of processes per user, making cleanup essential in long-running applications.

class ProcessPool
  def initialize(max_processes = 10)
    @max_processes = max_processes
    @active_processes = {}
  end
  
  def spawn_managed(command, options = {})
    cleanup_finished_processes
    
    if @active_processes.size >= @max_processes
      wait_for_available_slot
    end
    
    pid = spawn(command, options)
    @active_processes[pid] = Time.now
    pid
  end
  
  private
  
  def cleanup_finished_processes
    @active_processes.keys.each do |pid|
      if Process.waitpid(pid, Process::WNOHANG)
        @active_processes.delete(pid)
      end
    rescue Errno::ECHILD
      # Process already reaped
      @active_processes.delete(pid)
    end
  end
  
  def wait_for_available_slot
    oldest_pid = @active_processes.min_by { |pid, time| time }.first
    Process.wait(oldest_pid)
    @active_processes.delete(oldest_pid)
  end
end

# Usage
pool = ProcessPool.new(5)
10.times do |i|
  pool.spawn_managed("echo 'Task #{i}'")
end
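
A lighter variant of the same idea uses Process.wait2 without a PID, which blocks until any child exits. The helper run_with_limit below is a hypothetical sketch; it assumes the script has no other children, since a bare wait would reap those too.

```ruby
require "rbconfig"

# Hypothetical helper: run commands with at most `limit` children alive at once.
# Returns a hash mapping each command to its exit status.
def run_with_limit(commands, limit)
  running = {}
  results = {}

  reap_one = lambda do
    pid, status = Process.wait2          # blocks until any child exits
    command = running.delete(pid)
    results[command] = status.exitstatus if command
  end

  commands.each do |command|
    reap_one.call while running.size >= limit
    running[spawn(*command)] = command
  end

  reap_one.call until running.empty?     # drain the remaining children
  results
end

commands = (0..5).map { |n| [RbConfig.ruby, "-e", "exit #{n}"] }
results = run_with_limit(commands, 2)

results.each { |command, code| puts "#{command.last} -> #{code}" }
```

Every child is reaped before the method returns, so nothing is left behind as a zombie.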

Production Patterns

Web application deployments frequently use spawn for background job processing, file generation, and external service integration. Proper error handling and monitoring become essential in production environments where process failures must be detected and handled gracefully.

class BackgroundJobRunner
  def self.run_job(job_command, job_id, timeout: 300)
    log_file = "/var/log/jobs/#{job_id}.log"
    error_file = "/var/log/jobs/#{job_id}.error"
    
    # Spawn job with proper logging and timeout
    pid = spawn(job_command,
                :out => log_file,
                :err => error_file,
                :pgroup => true)  # Create new process group
    
    # Monitor job with timeout
    start_time = Time.now
    loop do
      break if Process.waitpid(pid, Process::WNOHANG)
      
      if Time.now - start_time > timeout
        # Terminate entire process group, then reap the child
        Process.kill("-TERM", pid) rescue nil
        sleep 2
        Process.kill("-KILL", pid) rescue nil
        Process.wait(pid) rescue nil
        raise "Job #{job_id} timed out after #{timeout} seconds"
      end
      
      sleep 1
    end
    
    # Check results and log status
    success = $?.success?
    log_job_completion(job_id, success, $?.exitstatus)
    success
  rescue => e
    log_job_error(job_id, e)
    false
  end
  
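  # Rails.logger assumes this code runs inside a Rails application
  def self.log_job_completion(job_id, success, exit_code)
    status = success ? "SUCCESS" : "FAILED"
    Rails.logger.info "Job #{job_id} completed: #{status} (exit code: #{exit_code})"
  end
  
  def self.log_job_error(job_id, error)
    Rails.logger.error "Job #{job_id} error: #{error.message}"
  end
  
  # A bare `private` does not apply to `def self.` methods
  private_class_method :log_job_completion, :log_job_error
end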

Service integration patterns handle external command execution with retry logic, circuit breakers, and fallback strategies. Production systems must handle temporary service unavailability and partial failures gracefully.

require 'tempfile'

# CircuitBreaker is assumed to be defined elsewhere in the application
class ExternalServiceClient
  def initialize(max_retries: 3, timeout: 30)
    @max_retries = max_retries
    @timeout = timeout
    @circuit_breaker = CircuitBreaker.new
  end
  
  def call_external_service(service_command, input_data)
    @circuit_breaker.call do
      attempt = 0
      begin
        attempt += 1
        
        # Write input to temporary file
        input_file = Tempfile.new('service_input')
        input_file.write(input_data)
        input_file.close
        
        output_file = Tempfile.new('service_output')
        output_file.close
        
        # Execute service with timeout
        pid = spawn(service_command, 
                    :in => input_file.path,
                    :out => output_file.path,
                    :err => "/dev/null")
        
        wait_with_timeout(pid, @timeout)
        
        unless $?.success?
          raise "Service returned exit code #{$?.exitstatus}"
        end
        
        # Return service output
        File.read(output_file.path)
        
      rescue => e
        if attempt < @max_retries && retryable_error?(e)
          sleep(2 ** attempt)  # Exponential backoff
          retry
        end
        raise e
      ensure
        input_file&.unlink
        output_file&.unlink
      end
    end
  end
  
  private
  
  def wait_with_timeout(pid, timeout)
    start_time = Time.now
    while Time.now - start_time < timeout
      return if Process.waitpid(pid, Process::WNOHANG)
      sleep 0.1
    end
    
    Process.kill("TERM", pid) rescue nil
    sleep 1
    Process.kill("KILL", pid) rescue nil
    Process.wait(pid) rescue nil
    raise "Service call timed out"
  end
  
  def retryable_error?(error)
    # Retry on timeout and temporary failures; the message raised by
    # wait_with_timeout says "timed out", so match that wording
    error.message.include?("timed out") || 
    error.message.include?("Connection refused")
  end
end

Container and deployment environments require careful handling of signal propagation and process lifecycle management. Applications running in containers must respond appropriately to termination signals and clean up child processes.

# Signal handling for graceful shutdown
class ApplicationServer
  def initialize
    @child_processes = []
    @shutdown = false
    setup_signal_handlers
  end
  
  def start
    puts "Starting application server..."
    
    # Start background workers
    3.times do |i|
      pid = spawn("ruby worker.rb #{i}")
      @child_processes << pid
    end
    
    # Main application loop
    until @shutdown
      handle_requests
      cleanup_finished_children
      sleep 1
    end
    
    shutdown_gracefully
  end
  
  private
  
  def setup_signal_handlers
    trap("TERM") do
      puts "Received TERM signal, shutting down gracefully..."
      @shutdown = true
    end
    
    trap("INT") do
      puts "Received INT signal, shutting down gracefully..."
      @shutdown = true
    end
  end
  
  def cleanup_finished_children
    @child_processes.reject! do |pid|
      if Process.waitpid(pid, Process::WNOHANG)
        puts "Child process #{pid} finished"
        true
      else
        false
      end
    rescue Errno::ECHILD
      true  # Process already reaped
    end
  end
  
  def shutdown_gracefully
    puts "Terminating child processes..."
    
    @child_processes.each do |pid|
      Process.kill("TERM", pid) rescue nil
    end
    
    # Wait for graceful shutdown
    sleep 5
    
    # Force kill remaining processes
    @child_processes.each do |pid|
      Process.kill("KILL", pid) rescue nil
      Process.wait(pid) rescue nil
    end
    
    puts "Shutdown complete"
  end
  
  def handle_requests
    # Application request handling logic
  end
end

Common Pitfalls

Zombie processes are created when a child exits and its parent never waits for it. A zombie no longer uses memory or CPU, but it holds a process table entry until it is reaped. Enough of them can exhaust the process table or the per-user process limit, causing system-wide problems.

# Problematic code - creates zombies
def bad_background_processing
  10.times do |i|
    spawn("long_running_task #{i}")
    # Missing Process.wait - creates zombie processes
  end
end

# Solution - proper child process cleanup
def good_background_processing  
  pids = []
  10.times do |i|
    pid = spawn("long_running_task #{i}")
    pids << pid
  end
  
  # Wait for all child processes
  pids.each { |pid| Process.wait(pid) }
end

# Alternative - non-blocking cleanup
def background_with_periodic_cleanup
  @child_pids ||= []
  
  # Start new background task
  pid = spawn("background_task")
  @child_pids << pid
  
  # Clean up finished processes
  @child_pids.reject! do |pid|
    Process.waitpid(pid, Process::WNOHANG)
  rescue Errno::ECHILD
    true  # Process already cleaned up
  end
end
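
For fire-and-forget children, Process.detach offers another way to avoid zombies: it starts a thread that waits on the PID, and the thread's value is the child's Process::Status. A minimal sketch:

```ruby
require "rbconfig"

pid = spawn(RbConfig.ruby, "-e", "exit 7")
watcher = Process.detach(pid)   # background thread that waits on pid

# ... the parent is free to do other work here ...

status = watcher.value          # joins the thread and returns the Process::Status

# The watcher already reaped the child, so a second wait finds nothing.
already_reaped =
  begin
    Process.wait(pid)
    false
  rescue Errno::ECHILD
    true
  end

puts "exit status: #{status.exitstatus}, already reaped: #{already_reaped}"
```

Calling value is optional; the child is reaped by the watcher thread even if the result is never read.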

Shell injection vulnerabilities arise when user input passes directly into string-form commands without proper sanitization. Array-form commands avoid shell interpretation entirely, preventing most injection attacks.

# Vulnerable to shell injection
def unsafe_file_search(user_input)
  pid = spawn("find /data -name '#{user_input}'")
  Process.wait(pid)
end

# User input: "'; rm -rf /; echo '"
# Results in: find /data -name ''; rm -rf /; echo ''

# Safe approach using array form
def safe_file_search(user_input)
  # Array form prevents shell interpretation
  pid = spawn("find", "/data", "-name", user_input)
  Process.wait(pid)
end

# Additional safety with input validation
def validated_file_search(user_input)
  # Validate input format
  unless user_input.match?(/\A[a-zA-Z0-9._-]+\z/)
    raise ArgumentError, "Invalid filename pattern"
  end
  
  pid = spawn("find", "/data", "-name", user_input)
  Process.wait(pid)
end
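
Some commands genuinely need the shell, for example a pipeline. In that case the standard library's Shellwords.escape can quote untrusted values before they are interpolated. The sketch below assumes a POSIX shell and uses a harmless stand-in for hostile input.

```ruby
require "shellwords"

hostile = "'; echo injected; echo '"

# A pipeline needs the shell, so the string form is unavoidable here.
command = "echo #{Shellwords.escape(hostile)} | cat"

reader, writer = IO.pipe
pid = spawn(command, out: writer)
writer.close
output = reader.read
reader.close
Process.wait(pid)

puts output
```

The escaped value arrives at echo as one literal argument, so the embedded commands are printed instead of executed. Where no shell feature is required, the multi-argument form remains the simpler choice.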

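File descriptor leakage occurs when a child process inherits files or sockets that the parent has open, which keeps those resources alive longer than intended in long-running applications that spawn many processes. Since Ruby 2.0, descriptors opened by Ruby are marked close-on-exec by default, so they are not passed to spawned programs unless redirected explicitly. Descriptors created by C extensions, inherited from the process that started Ruby, or unmarked with close_on_exec = false can still leak.

# Problematic - a descriptor without close-on-exec leaks to the child
def file_descriptor_leak_example
  file = File.open("/tmp/parent_data.txt", "w")
  file.close_on_exec = false  # e.g. set by older code or a C extension
  
  # Child process inherits the open file descriptor
  pid = spawn("some_command")
  Process.wait(pid)
  
  # The child held its own copy of the descriptor while it ran
  file.close
end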

# Solution - close file descriptors explicitly
def proper_fd_management
  file = File.open("/tmp/parent_data.txt", "w")
  
  begin
    # Use close_others option to close inherited descriptors
    pid = spawn("some_command", :close_others => true)
    Process.wait(pid)
  ensure
    file.close
  end
end

# Advanced - selective file descriptor control
def selective_fd_control
  input_file = File.open("/tmp/input.txt", "r")
  log_file = File.open("/tmp/process.log", "w")
  
  # Pass specific files, close others
  pid = spawn("data_processor",
              :in => input_file,
              :out => log_file,
              :close_others => true)
  
  Process.wait(pid)
  
ensure
  input_file&.close
  log_file&.close
end
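
The close-on-exec flag can be inspected with IO#close_on_exec?, and a descriptor can be handed to a child deliberately by using an integer key in the options. The following sketch maps the child's descriptor 3 onto a file; the path is a throwaway temp file and sh is assumed to be available.

```ruby
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir
path = File.join(dir, "child_output.txt")
file = File.open(path, "w")

# Files opened by Ruby carry the close-on-exec flag (Ruby 2.0 and later).
inherits_by_default = !file.close_on_exec?

# Map the child's descriptor 3 onto our file explicitly.
pid = spawn("sh", "-c", "echo from-child >&3", 3 => file)
Process.wait(pid)
file.close

child_wrote = File.read(path)
FileUtils.remove_entry(dir)

puts "inherits by default: #{inherits_by_default}"
puts "child wrote: #{child_wrote.inspect}"
```

Explicit mapping documents exactly which descriptors a child is meant to receive.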

Environment variable pollution affects child processes when parent process environment contains sensitive or conflicting values. Child processes inherit the complete parent environment unless explicitly managed.

# Environment pollution example
ENV["SECRET_KEY"] = "sensitive_value"
ENV["DEBUG"] = "true"

# Child inherits all environment variables
pid = spawn("external_command")
Process.wait(pid)

# Solution - clean environment for child processes
def clean_environment_spawn(command)
  # Define minimal safe environment
  clean_env = {
    "PATH" => "/usr/local/bin:/usr/bin:/bin",
    "HOME" => ENV["HOME"],
    "USER" => ENV["USER"]
  }
  
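  # The env hash alone is merged into the inherited environment;
  # :unsetenv_others clears everything not listed in clean_env
  pid = spawn(clean_env, command, :unsetenv_others => true)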
  Process.wait(pid)
end

# Selective environment inheritance
def selective_environment_spawn(command, allowed_vars = [])
  safe_env = {}
  
  # Copy only allowed variables
  allowed_vars.each do |var|
    safe_env[var] = ENV[var] if ENV[var]
  end
  
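  # Never pass sensitive variables, even if listed in allowed_vars
  sensitive_vars = ["SECRET_KEY", "DATABASE_PASSWORD", "API_TOKEN"]
  sensitive_vars.each { |var| safe_env.delete(var) }
  
  # :unsetenv_others drops every inherited variable not in safe_env
  pid = spawn(safe_env, command, :unsetenv_others => true)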
  Process.wait(pid)
end
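
The effect of :unsetenv_others is easy to verify by listing the child's environment. In this sketch the helper child_environment and the DEMO_* names are made up for the example, and /usr/bin/env is assumed to exist.

```ruby
ENV["DEMO_SECRET"] = "do-not-leak"

# Hypothetical helper: return the child's environment as "KEY=value" lines.
def child_environment(overrides, **options)
  reader, writer = IO.pipe
  pid = spawn(overrides, "/usr/bin/env", out: writer, **options)
  writer.close
  listing = reader.read
  reader.close
  Process.wait(pid)
  listing.lines(chomp: true)
end

merged   = child_environment({ "DEMO_ONLY" => "1" })
isolated = child_environment({ "DEMO_ONLY" => "1" }, unsetenv_others: true)

puts "merged leaks secret:   #{merged.include?("DEMO_SECRET=do-not-leak")}"
puts "isolated leaks secret: #{isolated.include?("DEMO_SECRET=do-not-leak")}"
```

Without the option the hash only adds to what the child inherits; with it, the hash becomes the child's whole environment.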

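Signal handling complications arise in multi-process applications where signals intended for parent processes accidentally affect child processes or vice versa. A signal sent to a single PID reaches only that process, but terminal-generated signals such as Ctrl-C (SIGINT) are delivered to every process in the foreground process group, so children spawned without their own group receive them too. Process groups help isolate signal handling between related processes.

# Problematic - child shares the parent's process group
def signal_propagation_problem
  pid = spawn("long_running_process")
  
  # A group-wide signal such as Ctrl-C reaches the child too, while a
  # TERM sent only to the parent exits here and orphans the child
  trap("TERM") { exit }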
  
  Process.wait(pid)
end

# Solution - isolate child processes in separate process groups
def isolated_process_groups
  # Create child in new process group
  pid = spawn("long_running_process", :pgroup => true)
  
  trap("TERM") do
    # Terminate specific process group
    Process.kill("-TERM", pid) rescue nil
    sleep 2
    Process.kill("-KILL", pid) rescue nil
    exit
  end
  
  Process.wait(pid)
end
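
The group created by :pgroup => true can be observed with Process.getpgid. The sketch below confirms that the child leads its own group and then signals that group; it assumes a Unix-like system with a sleep command.

```ruby
pid = spawn("sleep", "30", pgroup: true)

child_group  = Process.getpgid(pid)   # equals pid: the child leads its own group
parent_group = Process.getpgrp

# A leading minus on the signal name targets the whole process group.
Process.kill("-TERM", child_group)
_, status = Process.wait2(pid)

puts "child group #{child_group}, parent group #{parent_group}"
puts "terminated by signal #{status.termsig}"
```

Anything the child starts stays in the same group unless it changes groups itself, so one group-wide signal covers the whole subtree.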

# Advanced signal coordination
require 'timeout'

def coordinated_shutdown
  child_pids = []
  
  # Start each worker in its own process group
  3.times do
    pid = spawn("worker_process", :pgroup => true)
    child_pids << pid
  end
  
  trap("TERM") do
    puts "Coordinating shutdown..."
    
    # Send shutdown signal to all children
    child_pids.each do |pid|
      Process.kill("TERM", pid) rescue nil
    end
    
    # Wait for graceful shutdown
    start_time = Time.now
    child_pids.each do |pid|
      begin
        timeout = 10 - (Time.now - start_time)
        if timeout > 0
          Timeout.timeout(timeout) { Process.wait(pid) }
        else
          Process.kill("KILL", pid) rescue nil
          Process.wait(pid) rescue nil
        end
      rescue Timeout::Error
        Process.kill("KILL", pid) rescue nil
        Process.wait(pid) rescue nil
      end
    end
    
    exit
  end
  
  # Wait for all children
  child_pids.each { |pid| Process.wait(pid) }
end

Reference

Core Methods

Method | Parameters | Returns | Description
spawn(command, **opts) | command (String or program plus arguments), options (Hash) | Integer | Creates child process, returns PID
spawn(env, command, **opts) | env (Hash), command (String or program plus arguments), options (Hash) | Integer | Creates child process with environment
exec(command, **opts) | command (String or program plus arguments), options (Hash) | Does not return | Replaces current process
exec(env, command, **opts) | env (Hash), command (String or program plus arguments), options (Hash) | Does not return | Replaces process with environment

Command Format Options

Format | Example | Shell Usage | Security
String | "ls -la" | Shell interprets metacharacters | Vulnerable to injection
Multiple arguments | "ls", "-la" | Direct execution | Safe from shell injection
[program, argv0] array | ["ruby", "script"], "file.rb" | Direct execution, overrides argv[0] | Safe from shell injection

Environment Hash Options

Key Type | Value | Effect
String | String | Sets environment variable
String | nil | Removes environment variable
String | Empty string | Sets variable to empty

File Descriptor Redirection

Option | Destination | Description
:in | String/IO | Redirect standard input
:out | String/IO | Redirect standard output
:err | String/IO | Redirect standard error
:err | :out | Redirect stderr to stdout
Integer | String/IO | Redirect specific file descriptor

Process Options

Option | Type | Description
:chdir | String | Change working directory
:umask | Integer | Set file creation mask
:pgroup | true or Integer | Process group control
:rlimit_* | Integer or [soft, hard] Array | Resource limits
:unsetenv_others | Boolean | Clear inherited environment
:close_others | Boolean | Close inherited file descriptors

Process Wait Methods

Method | Behavior | Returns
Process.wait(pid) | Block until process exits | PID
Process.waitpid(pid, flags) | Wait with flags | PID or nil
Process.wait2(pid) | Block until exit | [PID, Process::Status]
Process.waitall | Wait for all children | Array of [PID, status]

Wait Flags

Flag | Description
Process::WNOHANG | Return immediately if no child available
Process::WUNTRACED | Return if child stopped by signal

Process Status Methods

Method | Returns | Description
$?.success? | Boolean | True if exit status zero
$?.exitstatus | Integer/nil | Exit status code
$?.pid | Integer | Process ID
$?.signaled? | Boolean | True if terminated by signal
$?.termsig | Integer/nil | Signal number that terminated process
$?.stopped? | Boolean | True if process stopped
$?.stopsig | Integer/nil | Signal number that stopped process

Common Errno Exceptions

Exception | Cause
Errno::ENOENT | Command or file not found
Errno::EACCES | Permission denied
Errno::E2BIG | Argument list too long
Errno::ECHILD | No child processes
Errno::EMFILE | Too many open files
Errno::ENOMEM | Out of memory

Resource Limit Keys

Key | Resource
:rlimit_cpu | CPU time in seconds
:rlimit_fsize | Maximum file size
:rlimit_data | Data segment size
:rlimit_stack | Stack size
:rlimit_core | Core file size
:rlimit_nproc | Number of processes
:rlimit_nofile | Number of open files
:rlimit_memlock | Locked memory