CrackedRuby - Shellwords

Overview

Ruby's Shellwords module provides methods for manipulating strings intended for shell execution. The module handles the complex task of properly escaping shell metacharacters, splitting shell command strings, and joining arguments in a shell-safe manner. Shellwords addresses the fundamental security concern of shell injection attacks that occur when untrusted input is passed directly to shell commands.

The module contains three primary methods: split (also aliased as shellsplit), escape (also aliased as shellescape), and join (also aliased as shelljoin). These methods handle the bidirectional transformation between arrays of arguments and shell command strings while preserving the exact semantics that shells expect.

Shellwords.split follows the word-parsing rules of the UNIX Bourne shell, recognizing single quotes, double quotes, backslash escaping, and whitespace separation. It only tokenizes: it does not expand variables, globs, or command substitutions. The module handles adjacent quoted and unquoted segments, escaped characters within double quotes, and empty arguments, which escape renders as '' so that they are not lost during shell parsing.

require 'shellwords'

# Splitting a shell command string into components
command = "ls -la 'My Documents' /home/user"
args = Shellwords.split(command)
# => ["ls", "-la", "My Documents", "/home/user"]

# Escaping dangerous characters for shell safety
filename = "file; rm -rf /"
safe_filename = Shellwords.escape(filename)
# => "file\\;\\ rm\\ -rf\\ /"

# Joining arguments into a shell command string
parts = ["grep", "pattern with spaces", "file.txt"]
command = Shellwords.join(parts)
# => "grep pattern\\ with\\ spaces file.txt"

Basic Usage

The Shellwords.split method parses shell command strings into arrays of individual arguments. The method recognizes quoted strings and escaped characters and splits on unquoted whitespace. Single-quoted strings preserve all characters literally. Inside double-quoted strings a backslash escapes only $, `, ", \, and newline. The method performs no variable expansion itself; expansion happens only when a string is passed to an actual shell.

require 'shellwords'

# Basic splitting with mixed quoting
command = %q{echo "hello world" 'single quotes' unquoted}
args = Shellwords.split(command)
# => ["echo", "hello world", "single quotes", "unquoted"]

# Handling escaped characters
command = "cp file\\ with\\ spaces.txt /tmp/"
args = Shellwords.split(command)
# => ["cp", "file with spaces.txt", "/tmp/"]

# Empty arguments are preserved
command = "command '' \"\" arg"
args = Shellwords.split(command)
# => ["command", "", "", "arg"]
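
Because Shellwords.split only tokenizes, shell expansions stay literal. The following sketch illustrates this with a few made-up inputs; it relies only on the standard library.

```ruby
require 'shellwords'

# Variables and globs are returned as literal text; nothing is expanded
Shellwords.split('echo $HOME *.txt')
# => ["echo", "$HOME", "*.txt"]

# Inside double quotes, a backslash still escapes a double quote
Shellwords.split('say "a \"quoted\" word"')
# => ["say", "a \"quoted\" word"]

# Adjacent quoted and unquoted pieces form a single argument
Shellwords.split(%q{--name='John Smith'})
# => ["--name=John Smith"]
```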

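The Shellwords.escape method converts an individual string into a shell-safe form. It prefixes a backslash to every character outside a small safe set (ASCII letters, digits, and a few punctuation characters such as _, -, ., /, and @), which covers shell metacharacters including spaces, semicolons, pipes, redirections, and glob patterns. Two cases are handled differently: a newline is wrapped in single quotes, and an empty string becomes ''. The result is intended for Bourne-style shells and must be used unquoted, never placed inside single or double quotes.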

# Escaping filenames with special characters
filename = "My Document (final).txt"
escaped = Shellwords.escape(filename)
# => "My\\ Document\\ \\(final\\).txt"

# Handling command injection attempts
malicious_input = "file.txt; cat /etc/passwd"
safe_input = Shellwords.escape(malicious_input)
# => "file.txt\\;\\ cat\\ /etc/passwd"

# Building safe system commands
user_input = "data & analysis"
command = "grep #{Shellwords.escape(user_input)} logfile.txt"
# => "grep data\\ \\&\\ analysis logfile.txt"
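
A few edge cases of Shellwords.escape are easy to overlook. The inputs below are illustrative only.

```ruby
require 'shellwords'

# An empty string becomes a pair of quotes so the argument is not dropped
Shellwords.escape("")
# => "''"

# A newline is wrapped in single quotes rather than backslash-escaped
Shellwords.escape("line1\nline2")
# => "line1'\n'line2"

# Non-string values are converted with to_s first
Shellwords.escape(42)
# => "42"

# Characters in the safe set pass through unchanged
Shellwords.escape("report-2024_v1.txt")
# => "report-2024_v1.txt"
```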

The Shellwords.join method combines arrays of arguments into shell command strings. The method applies appropriate escaping to each argument and joins them with spaces. Arguments that require escaping are handled automatically, while simple strings without special characters remain unmodified.

# Joining simple arguments
args = ["ls", "-l", "/home"]
command = Shellwords.join(args)
# => "ls -l /home"

# Joining arguments with special characters
args = ["find", "/home", "-name", "*.log file"]
command = Shellwords.join(args)
# => "find /home -name \\*.log\\ file"

# Building complex commands programmatically
base_cmd = ["rsync", "-avz"]
source_files = ["file 1.txt", "file 2.txt"]
destination = "user@server:/backup/"
full_args = base_cmd + source_files + [destination]
command = Shellwords.join(full_args)
# => "rsync -avz file\\ 1.txt file\\ 2.txt user@server:/backup/"

Error Handling & Debugging

The Shellwords.split method raises ArgumentError when encountering malformed shell command strings. The most common error occurs with unclosed quotes, where the method cannot determine the intended argument boundaries. In current Ruby versions the error message begins with "Unmatched quote" and includes the offending input; its exact format varies between versions.

require 'shellwords'

# Unclosed single quote triggers ArgumentError
begin
  Shellwords.split("echo 'unclosed quote")
rescue ArgumentError => e
  puts e.message
  # Prints a message beginning with "Unmatched quote" (the rest varies by Ruby version)
end

# Unclosed double quote also triggers ArgumentError
begin
  Shellwords.split('command "partial string')
rescue ArgumentError => e
  puts e.message
  # Prints a message beginning with "Unmatched quote" (the rest varies by Ruby version)
end
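
When input comes from users or configuration files, it can be convenient to treat an unparseable line as a soft failure. The sketch below uses a hypothetical helper, try_shellsplit, which is not part of Shellwords.

```ruby
require 'shellwords'

# Hypothetical helper: returns the parsed arguments, or nil when the
# input cannot be tokenized (for example, because of an unclosed quote)
def try_shellsplit(line)
  Shellwords.split(line)
rescue ArgumentError
  nil
end

try_shellsplit("echo 'ok'")       # => ["echo", "ok"]
try_shellsplit("echo 'unclosed")  # => nil
```

Returning nil keeps the caller's control flow simple; code that needs the failure reason can rescue ArgumentError directly instead.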

Shell injection vulnerabilities represent the primary security concern when working with shell commands. Shellwords prevents these attacks through proper escaping, but developers must apply escaping consistently to all untrusted input. The module does not automatically detect or prevent all injection scenarios - it provides tools that must be used correctly.

# Vulnerable: Direct string interpolation
user_file = "important.txt; rm -rf /"
dangerous_command = "cat #{user_file}"
# => "cat important.txt; rm -rf /"
# This would execute: cat important.txt, then rm -rf /

# Secure: Using Shellwords.escape
user_file = "important.txt; rm -rf /"
safe_command = "cat #{Shellwords.escape(user_file)}"
# => "cat important.txt\\;\\ rm\\ -rf\\ /"
# Shell treats entire string as single filename

# Building secure commands with arrays
def build_grep_command(pattern, files)
  args = ["grep", pattern] + files
  Shellwords.join(args)
end

pattern = "search term; dangerous_command"
files = ["log1.txt", "log2.txt"]
command = build_grep_command(pattern, files)
# => "grep search\\ term\\;\\ dangerous_command log1.txt log2.txt"
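
Escaping is one way to neutralize hostile input; the other is to avoid the shell entirely by passing an argument list. The sketch below compares the two. The helper names are hypothetical, and the child process is simply the current Ruby interpreter (located through RbConfig.ruby) printing its arguments.

```ruby
require 'open3'
require 'rbconfig'
require 'shellwords'

# Hypothetical helper: runs a tiny Ruby child process that prints its
# arguments, using the argument-list form so that no shell is involved
def echo_args_without_shell(*args)
  stdout, _status = Open3.capture2(RbConfig.ruby, "-e", "print ARGV.inspect", *args)
  stdout
end

# Same child process, but launched through the shell from one command string
def echo_args_with_shell(*args)
  command = Shellwords.join([RbConfig.ruby, "-e", "print ARGV.inspect", *args])
  stdout, _status = Open3.capture2(command)
  stdout
end

hostile = "important.txt; rm -rf /"
echo_args_without_shell(hostile) == echo_args_with_shell(hostile)
# => true (both children receive the text as one argument)
```

The argument-list form is usually the simpler choice when no shell features such as pipes or redirection are required.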

Encoding issues can cause unexpected behavior when shell commands contain non-ASCII characters. Shellwords.escape leaves the bytes of each character intact, but it treats every character outside its ASCII safe set as one to escape, so non-ASCII characters gain a leading backslash that the shell removes again. Problems arise when mixing encodings, when a string contains bytes that are invalid for its declared encoding, or when the receiving shell has different encoding expectations.

# UTF-8 strings with special characters
filename = "café & résumé.pdf"
escaped = Shellwords.escape(filename)
# => "caf\\é\\ \\&\\ r\\ésum\\é.pdf"

# Binary data in arguments requires careful handling: String#b returns a
# binary (ASCII-8BIT) copy, so bytes that are not valid UTF-8 can be escaped
binary_pattern = "\xFF\xFE pattern".b
args = ["grep", binary_pattern, "data.bin"]
command = Shellwords.join(args)
# Binary bytes are escaped as needed for shell safety
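
One defensive approach is to check the encoding before escaping. The sketch below assumes a hypothetical helper, escape_bytes, that falls back to a binary copy when the string is not valid in its declared encoding.

```ruby
require 'shellwords'

# Hypothetical helper: escapes a string whose bytes may not be valid in its
# declared encoding by switching to a binary (ASCII-8BIT) copy first
def escape_bytes(str)
  str = str.b unless str.valid_encoding?
  Shellwords.escape(str)
end

raw = "\xFF\xFE pattern"
escaped = escape_bytes(raw)

# One backslash is added for each of \xFF, \xFE, and the space
escaped.bytesize - raw.bytesize
# => 3
```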

Debugging shell command construction requires understanding both the Ruby string representation and the final shell interpretation. The Shellwords.split method can verify that escaping worked correctly by parsing the generated command back into components.

# Round-trip testing for command construction
original_args = ["grep", "complex pattern", "file name.txt"]
command_string = Shellwords.join(original_args)
parsed_args = Shellwords.split(command_string)

puts "Original: #{original_args.inspect}"
puts "Command: #{command_string}"
puts "Parsed: #{parsed_args.inspect}"
puts "Round-trip successful: #{original_args == parsed_args}"

# Output:
# Original: ["grep", "complex pattern", "file name.txt"]
# Command: grep complex\ pattern file\ name.txt
# Parsed: ["grep", "complex pattern", "file name.txt"]
# Round-trip successful: true
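
The same check can be wrapped in a small predicate and run against awkward inputs. The helper name round_trips? and the sample values are illustrative.

```ruby
require 'shellwords'

# Hypothetical helper: true when joining and re-splitting returns the input
def round_trips?(args)
  Shellwords.split(Shellwords.join(args)) == args
end

tricky = ["", "two words", "it's", "$HOME", "line\nbreak", "*.log", "a\\b"]
round_trips?(tricky)
# => true
```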

Production Patterns

Web applications frequently use Shellwords when executing system commands based on user input. The module integrates with Ruby's system command execution methods like system, backticks, and Open3. Rails applications commonly use Shellwords in background job processing, file upload handling, and administrative tasks.

require 'shellwords'
require 'open3'

class DocumentProcessor
  def convert_document(input_path, output_path, format)
    # Escape all user-provided paths and options
    safe_input = Shellwords.escape(input_path)
    safe_output = Shellwords.escape(output_path)
    safe_format = Shellwords.escape(format)

    command = "pandoc -f markdown -t #{safe_format} #{safe_input} -o #{safe_output}"

    stdout, stderr, status = Open3.capture3(command)

    unless status.success?
      raise "Document conversion failed: #{stderr}"
    end

    stdout
  end

  def batch_convert(files, output_dir, format)
    commands = files.map do |file|
      output_name = File.basename(file[:path], ".*") + ".#{format}"
      output_path = File.join(output_dir, output_name)

      # Shellwords.join escapes every argument exactly once
      Shellwords.join(["pandoc", "-f", "markdown", "-t", format, file[:path], "-o", output_path])
    end

    # Run up to four conversions at a time. Each command string is already
    # escaped, so it is not piped through echo and xargs, which would
    # re-parse it and undo the escaping
    commands.each_slice(4) do |batch|
      batch.map { |command| Thread.new { system(command) } }.each(&:join)
    end
  end
end
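
Escaping makes a value inert to the shell, but it does not check that the value is one the program should accept. A minimal sketch of pairing escaping with validation follows; ALLOWED_FORMATS and pandoc_argv are hypothetical names, and the list of formats is only an example.

```ruby
require 'shellwords'

# Hypothetical allow-list; a real application would list the formats it supports
ALLOWED_FORMATS = %w[html docx pdf].freeze

# Hypothetical helper: validates the format, then builds an argument array
def pandoc_argv(input_path, output_path, format)
  unless ALLOWED_FORMATS.include?(format)
    raise ArgumentError, "unsupported format: #{format.inspect}"
  end

  ["pandoc", "-f", "markdown", "-t", format, input_path, "-o", output_path]
end

argv = pandoc_argv("my notes.md", "out/my notes.html", "html")
Shellwords.join(argv)
# => "pandoc -f markdown -t html my\\ notes.md -o out/my\\ notes.html"
```

The same array can be passed directly as Open3.capture3(*argv), which skips the shell and therefore needs no escaping at all.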

Container orchestration and deployment scripts rely on Shellwords for generating Docker commands, Kubernetes configurations, and CI/CD pipeline commands. The module handles complex argument passing between different shell environments and container contexts.

class ContainerDeployer
  def build_docker_command(image_name, build_args, dockerfile_path)
    base_cmd = ["docker", "build"]

    # Add build arguments
    build_args.each do |key, value|
      base_cmd << "--build-arg"
      base_cmd << "#{key}=#{value}"
    end

    # Add dockerfile path and image tag
    base_cmd << "-f" << dockerfile_path
    base_cmd << "-t" << image_name
    base_cmd << "."

    Shellwords.join(base_cmd)
  end

  def deploy_with_secrets(image, environment, secrets)
    # Keep each flag and its value as separate array elements; a combined
    # "-e KEY=value" string would be escaped into a single argument
    env_vars = environment.flat_map { |k, v| ["-e", "#{k}=#{v}"] }
    secret_mounts = secrets.flat_map { |name, path|
      ["--mount", "type=secret,id=#{name},target=#{path}"]
    }

    args = ["docker", "run", "-d"] + env_vars + secret_mounts + [image]
    command = Shellwords.join(args)

    # Log command for debugging (without sensitive values)
    safe_args = ["docker", "run", "-d"] +
                environment.keys.flat_map { |k| ["-e", "#{k}=[REDACTED]"] } +
                secrets.keys.flat_map { |name| ["--mount", "type=secret,id=#{name}"] } +
                [image]

    logger.info("Executing: #{Shellwords.join(safe_args)}")
    system(command)
  end
end
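
Shellwords.join treats each array element as exactly one argument, so a flag and its value must be separate elements. The short illustration below uses made-up values.

```ruby
require 'shellwords'

# One element containing a space is escaped into ONE argument
Shellwords.join(["docker", "run", "-e FOO=bar", "image"])
# => "docker run -e\\ FOO\\=bar image"

# Separate elements stay separate arguments
Shellwords.join(["docker", "run", "-e", "FOO=bar", "image"])
# => "docker run -e FOO\\=bar image"

# flat_map keeps each flag and its value as distinct elements
env = { "FOO" => "bar", "MSG" => "hello world" }
args = ["docker", "run"] + env.flat_map { |k, v| ["-e", "#{k}=#{v}"] } + ["image"]
Shellwords.join(args)
# => "docker run -e FOO\\=bar -e MSG\\=hello\\ world image"
```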

System administration tools and monitoring scripts use Shellwords for constructing commands that process log files, manage services, and generate reports. The module handles complex regular expressions, file patterns, and command pipelines safely.

require 'shellwords'

class SystemMonitor
  def analyze_logs(log_pattern, search_terms, time_range)
    # Build find command for log files
    find_args = ["find", "/var/log", "-name", log_pattern, "-type", "f"]

    if time_range[:newer_than]
      find_args += ["-newermt", time_range[:newer_than]]
    end

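    # Emit NUL-separated paths so file names containing spaces survive xargs
    find_args << "-print0"
    find_command = Shellwords.join(find_args)

    # Build grep command: -F matches each term as a literal string, which
    # avoids mixing Ruby's Regexp.escape syntax with grep's regex syntax
    grep_args = ["grep", "-F"]
    search_terms.each { |term| grep_args << "-e" << term }

    # Combine commands with xargs
    full_command = "#{find_command} | xargs -0 #{Shellwords.join(grep_args)}"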

    `#{full_command}`.split("\n")
  end

  def cleanup_old_files(base_path, pattern, days_old)
    # Multiple safety checks for destructive operations
    raise ArgumentError, "base_path must be absolute" unless base_path.start_with?("/")
    raise ArgumentError, "days_old must be positive" unless days_old > 0

    find_args = [
      "find", base_path,
      "-name", pattern,
      "-type", "f",
      "-mtime", "+#{days_old}",
      "-delete"
    ]

    command = Shellwords.join(find_args)

    # Log before executing destructive operations
    logger.warn("Executing cleanup: #{command}")
    system(command)
  end

  def generate_system_report(output_file)
    report_commands = [
      ["df", "-h"],
      ["free", "-h"],
      ["uptime"],
      ["ps", "aux", "--sort=-%cpu", "|", "head", "-20"]
    ]

    File.open(output_file, "w") do |f|
      report_commands.each do |cmd_parts|
        f.puts "=== #{cmd_parts.join(' ')} ==="

        if cmd_parts.include?("|")
          # Handle shell pipes specially
          f.puts `#{cmd_parts.join(' ')}`
        else
          command = Shellwords.join(cmd_parts)
          f.puts `#{command}`
        end

        f.puts
      end
    end
  end
end
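
Pipelines can also be assembled without hand-written command strings: escape each stage on its own and leave only the pipe operators unescaped. The sketch below assumes a hypothetical helper named build_pipeline.

```ruby
require 'shellwords'

# Hypothetical helper: each stage is an argument array; only the pipe
# operators between stages are left unescaped
def build_pipeline(*stages)
  stages.map { |stage| Shellwords.join(stage) }.join(" | ")
end

build_pipeline(["ps", "aux", "--sort=-%cpu"], ["head", "-n", "20"])
# => "ps aux --sort\\=-\\%cpu | head -n 20"
```

The backslashes before = and % are harmless: the shell strips them before ps sees the argument.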

Reference

Core Methods

Method	Parameters	Returns	Description
`Shellwords.split(line)`	`line` (String)	`Array<String>`	Splits a shell command line into an array of arguments
`Shellwords.escape(str)`	`str` (String)	`String`	Escapes a string so it can be safely used in a shell command
`Shellwords.join(array)`	`array` (Array)	`String`	Joins an array of strings into a shell command line

Method Aliases

Alias	Original Method	Usage
`shellsplit`	`Shellwords.split`	`Shellwords.shellsplit(line)`
`shellescape`	`Shellwords.escape`	`Shellwords.shellescape(str)`
`shelljoin`	`Shellwords.join`	`Shellwords.shelljoin(array)`

String and Array Extension Methods

When requiring 'shellwords', Ruby adds convenience methods to the String and Array classes:

Method	Class	Equivalent	Description
`#shellsplit`	String	`Shellwords.split(self)`	Instance method for string splitting
`#shellescape`	String	`Shellwords.escape(self)`	Instance method for string escaping
`#shelljoin`	Array	`Shellwords.join(self)`	Instance method for array joining
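
The instance methods listed above behave like their module-level counterparts, as this brief illustration shows:

```ruby
require 'shellwords'

"ls -la 'My Documents'".shellsplit
# => ["ls", "-la", "My Documents"]

"My Documents".shellescape
# => "My\\ Documents"

["ls", "-la", "My Documents"].shelljoin
# => "ls -la My\\ Documents"
```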

Escape Character Handling

Character	Shell Meaning	Escaped Form	Context
(space)	Argument separator	`\ `	Always escaped
`;`	Command separator	`\;`	Always escaped
`&`	Background execution	`\&`	Always escaped
`|`	Pipe operator	`\|`	Always escaped
`<`	Input redirection	`\<`	Always escaped
`>`	Output redirection	`\>`	Always escaped
`(` `)`	Subshell grouping	`\(` `\)`	Always escaped
`{` `}`	Brace expansion	`\{` `\}`	Always escaped
`[` `]`	Glob patterns	`\[` `\]`	Always escaped
`*`	Glob wildcard	`\*`	Always escaped
`?`	Single char glob	`\?`	Always escaped
`$`	Variable expansion	`\$`	Always escaped
`\`	Escape character	`\\`	Always escaped
`'`	Single quote	`\'`	Always escaped
`"`	Double quote	`\"`	Always escaped
(newline)	Command terminator	`'` newline `'`	Wrapped in single quotes instead of backslash-escaped

Quoting Rules

Quote Type	Behavior	Variable Expansion	Escape Sequences	Use Case
Single quotes `'...'`	Literal interpretation	No	No	Preserve exact content
Double quotes `"..."`	Allow expansions	Yes	Yes	Allow shell processing
No quotes	Word splitting	Yes	Yes	Simple arguments
Backslash `\`	Escape next character	Depends on context	Yes	Individual character escaping

Error Conditions

Error Type	Trigger Condition	Error Message	Recovery Strategy
`ArgumentError`	Unclosed single quote	Begins with "Unmatched quote"	Add closing quote or escape
`ArgumentError`	Unclosed double quote	Begins with "Unmatched quote"	Add closing quote or escape
`ArgumentError`	Bytes invalid for the string's encoding	Varies	Fix the encoding or convert with `String#b`
`Encoding::CompatibilityError`	Mixed encodings	Varies	Ensure consistent encoding

Integration Examples

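require 'shellwords'
require 'open3'

# System command execution
command = Shellwords.join(["ls", "-la", "My Documents"])
output = `#{command}`

# Open3 integration: the argument-list form bypasses the shell,
# so the arguments need no escaping
cmd_array = ["grep", "pattern", "file with spaces.txt"]
stdout, stderr, status = Open3.capture3(*cmd_array)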

# Direct system() call with escaping
filename = "user input.txt"
system("wc -l #{Shellwords.escape(filename)}")

Security Considerations

Vulnerability	Mitigation	Example
Command injection	Always escape user input	`Shellwords.escape(user_input)`
Argument injection	Use array forms when possible	`system(*cmd_array)` vs `system(cmd_string)`
Path traversal	Validate paths before escaping	Check for `../` patterns
Code execution	Never use `eval` with shell commands	Use predefined command templates
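
The path traversal row can be illustrated with a small containment check that runs before escaping. This is a sketch under stated assumptions: safe_path is a hypothetical helper, the directory names are invented, Unix-style paths are assumed, and symbolic links are not resolved.

```ruby
require 'shellwords'

# Hypothetical helper: resolves a user-supplied relative path against a base
# directory and rejects anything that ends up outside of it
def safe_path(base_dir, user_path)
  base = File.expand_path(base_dir)
  full = File.expand_path(user_path, base)
  unless full.start_with?(base + File::SEPARATOR)
    raise ArgumentError, "path escapes #{base}: #{user_path.inspect}"
  end
  full
end

Shellwords.escape(safe_path("/srv/uploads", "reports/q1 summary.txt"))
# => "/srv/uploads/reports/q1\\ summary.txt"

# safe_path("/srv/uploads", "../../etc/passwd") raises ArgumentError
```

Comparing expanded paths catches inputs that a plain substring search for ../ would miss, such as an absolute path supplied in place of a relative one.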