Overview
Ruby's CGI module provides classes and methods for creating Common Gateway Interface scripts that handle HTTP requests and responses. The module includes the core CGI class for parsing form data and managing HTTP interactions, along with specialized classes for HTML generation, cookie handling, and session management.
The CGI class serves as the primary interface, automatically parsing query strings, form data, and multipart uploads from HTTP requests. It provides methods for accessing form parameters, managing HTTP headers, and generating proper responses. Ruby's implementation handles character encoding, URL decoding, and multipart parsing transparently.
require 'cgi'
# Basic CGI initialization
cgi = CGI.new
name = cgi['username'] # Access form parameter
puts "Content-Type: text/html\n\n"
puts "<h1>Hello #{CGI.escapeHTML(name)}</h1>"
The module includes several utility methods for web development tasks. CGI.escape and CGI.unescape handle URL encoding and decoding, while CGI.escapeHTML and CGI.unescapeHTML manage HTML entity encoding for security and display purposes.
# URL and HTML escaping
url_safe = CGI.escape("hello world & more")
# => "hello+world+%26+more"
html_safe = CGI.escapeHTML("<script>alert('xss')</script>")
# => "&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;"
Ruby's CGI implementation supports cookies through the CGI::Cookie class, which handles cookie creation, parsing, and attribute management. The module also provides session management capabilities and HTML generation utilities for dynamic content creation.
Basic Usage
Creating a CGI script begins with requiring the module and instantiating a CGI object. The constructor automatically parses incoming request data, making form parameters accessible through hash-like syntax or method calls.
#!/usr/bin/env ruby
require 'cgi'
cgi = CGI.new
user_input = cgi['message']
email = cgi.params['email'].first # params returns arrays
# Generate response headers
puts cgi.header('type' => 'text/html', 'charset' => 'utf-8')
puts "<html><body><h1>Message: #{CGI.escapeHTML(user_input)}</h1></body></html>"
Form parameter access supports both single values and arrays for fields that may contain multiple values. The params method returns a hash where values are always arrays, while bracket notation returns the first value, or an empty string when the parameter is absent.
cgi = CGI.new
# Single value access
username = cgi['username'] # String ("" when the parameter is absent)
categories = cgi['category'] # First value only
# Array access for multiple values
all_categories = cgi.params['category'] # Array of all values
selected_options = cgi.params['options'] || []
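File uploads require multipart form encoding. Each uploaded file arrives as a Tempfile or StringIO object (depending on the size of the request) extended with original_filename and content_type methods, giving access to the uploaded file content, original filename, and content type information.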
cgi = CGI.new
uploaded_file = cgi['attachment']
if uploaded_file.respond_to?(:read)
filename = uploaded_file.original_filename
content_type = uploaded_file.content_type
file_size = uploaded_file.size
# Process file content; File.basename drops any directory part of the client-supplied name
File.open("/uploads/#{File.basename(filename)}", 'wb') do |f|
f.write(uploaded_file.read)
end
end
Cookie management involves creating CGI::Cookie objects with specified names, values, and attributes. Cookies can be set in response headers and accessed from incoming requests through the cookies hash.
# Creating and setting cookies
session_cookie = CGI::Cookie.new(
'name' => 'session_id',
'value' => 'abc123',
'expires' => Time.now + 86400, # 24 hours
'path' => '/',
'secure' => true
)
puts cgi.header('cookie' => session_cookie)
# Reading incoming cookies
existing_session = cgi.cookies['session_id'].first if cgi.cookies['session_id']
Error Handling & Debugging
CGI applications must handle various input validation scenarios and encoding issues. Malformed requests, missing parameters, and invalid file uploads can cause runtime errors that require graceful handling.
require 'cgi'
begin
cgi = CGI.new
# Validate required parameters
username = cgi['username']
raise ArgumentError, "Username required" if username.nil? || username.empty?
# Validate parameter format
email = cgi['email']
unless email =~ /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i
raise ArgumentError, "Invalid email format"
end
rescue ArgumentError => e
puts cgi.header('status' => '400 Bad Request')
puts "<h1>Error: #{CGI.escapeHTML(e.message)}</h1>"
rescue StandardError => e
# cgi is nil if CGI.new itself raised, so write the headers by hand here
print "Status: 500 Internal Server Error\r\nContent-Type: text/html\r\n\r\n"
puts "<h1>Server Error</h1>"
# Log actual error for debugging
File.open('/var/log/cgi_errors.log', 'a') do |log|
log.puts "#{Time.now}: #{e.class} - #{e.message}"
log.puts e.backtrace.join("\n")
end
end
File upload validation requires checking multiple attributes to ensure security and prevent resource exhaustion. Size limits, content type validation, and filename sanitization prevent common attack vectors.
def validate_upload(uploaded_file)
return nil unless uploaded_file.respond_to?(:read)
# Check file size
if uploaded_file.size > 5_000_000 # 5MB limit
raise ArgumentError, "File too large"
end
# Validate content type
allowed_types = ['image/jpeg', 'image/png', 'image/gif']
unless allowed_types.include?(uploaded_file.content_type)
raise ArgumentError, "Invalid file type"
end
# Sanitize filename
original_name = uploaded_file.original_filename
sanitized_name = File.basename(original_name).gsub(/[^\w.-]/, '_')
if sanitized_name.empty?
raise ArgumentError, "Invalid filename"
end
{ content: uploaded_file.read, filename: sanitized_name }
end
# Usage with error handling
begin
file_data = validate_upload(cgi['upload'])
# Process validated file
rescue ArgumentError => e
puts cgi.header('status' => '422 Unprocessable Entity')
puts "Upload error: #{CGI.escapeHTML(e.message)}"
end
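The filename rule used in validate_upload can be isolated and tried against hostile-looking names. This is a minimal sketch: `sanitize_filename` is a hypothetical helper, and the extra rejection of dot-only names is an assumption added for illustration.

```ruby
# Strip any directory part, then replace every character that is not an
# ASCII letter, digit, underscore, dot, or hyphen with an underscore.
# Returns nil when nothing usable is left.
def sanitize_filename(original_name)
  cleaned = File.basename(original_name.to_s).gsub(/[^\w.-]/, '_')
  return nil if cleaned.empty? || cleaned.match?(/\A\.+\z/)
  cleaned
end

p sanitize_filename('../../etc/passwd')
p sanitize_filename('my photo (1).png')
p sanitize_filename('..')
```

Sanitizing the name does not make the content trustworthy; the size and content type checks above still apply.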
Debugging CGI scripts requires logging capabilities since standard output is reserved for HTTP responses. Environment variable inspection and request debugging help identify issues in production environments.
def debug_request(cgi)
debug_info = {
'REQUEST_METHOD' => ENV['REQUEST_METHOD'],
'CONTENT_TYPE' => ENV['CONTENT_TYPE'],
'CONTENT_LENGTH' => ENV['CONTENT_LENGTH'],
'QUERY_STRING' => ENV['QUERY_STRING'],
'HTTP_USER_AGENT' => ENV['HTTP_USER_AGENT'],
'REMOTE_ADDR' => ENV['REMOTE_ADDR']
}
File.open('/var/log/cgi_debug.log', 'a') do |log|
log.puts "=== Request Debug #{Time.now} ==="
debug_info.each { |k, v| log.puts "#{k}: #{v}" }
log.puts "Parameters: #{cgi.params.inspect}"
log.puts "Cookies: #{cgi.cookies.keys}"
end
end
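When a dedicated log file is not available, writing to standard error is a common alternative, since servers such as Apache typically route a CGI script's stderr to their error log. The sketch below assumes that setup; `log_request_env` and `DEBUG_KEYS` are hypothetical names.

```ruby
# CGI environment variables worth capturing for a quick diagnosis.
DEBUG_KEYS = %w[REQUEST_METHOD CONTENT_TYPE CONTENT_LENGTH QUERY_STRING REMOTE_ADDR].freeze

# Hypothetical helper: write one "KEY: value" line per variable to any
# IO-like object. inspect makes missing values show up as nil.
def log_request_env(io, env = ENV)
  DEBUG_KEYS.each do |key|
    io.puts "#{key}: #{env[key].inspect}"
  end
end

# Standard output stays reserved for the HTTP response.
log_request_env($stderr)
```

Passing the IO object in makes the helper easy to point at a file, a StringIO in tests, or $stderr in production.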
Production Patterns
Production CGI deployments require proper security headers, session management, and integration with web server configurations. Apache and Nginx configurations must properly execute CGI scripts and handle file permissions.
require 'cgi'
require 'cgi/session'
require 'digest/sha2'
class SecureCGIApplication
def initialize
@cgi = CGI.new
@session = CGI::Session.new(@cgi,
'database_manager' => CGI::Session::FileStore,
'tmpdir' => '/tmp/sessions', # directory where FileStore keeps session files
'session_expires' => Time.now + 1800, # 30 minutes
'prefix' => 'webapp_'
)
end
def authenticate_user
username = @cgi['username']
password = @cgi['password']
return false unless username && password
# Hash password for comparison (use proper password hashing in production)
password_hash = Digest::SHA256.hexdigest(password + 'salt_value')
# Verify against database/file
authenticated = verify_credentials(username, password_hash)
if authenticated
@session['user_id'] = username
@session['last_activity'] = Time.now.to_i
true
else
false
end
end
def require_authentication
user_id = @session['user_id']
last_activity = @session['last_activity']
unless user_id && last_activity &&
(Time.now.to_i - last_activity.to_i) < 1800
redirect_to_login
return false
end
@session['last_activity'] = Time.now.to_i
true
end
def generate_response
headers = {
'type' => 'text/html',
'charset' => 'utf-8',
'X-Frame-Options' => 'DENY',
'X-Content-Type-Options' => 'nosniff',
'X-XSS-Protection' => '1; mode=block',
'Strict-Transport-Security' => 'max-age=31536000; includeSubDomains'
}
puts @cgi.header(headers)
end
private
def verify_credentials(username, password_hash)
# Implementation depends on storage mechanism
# This is a simplified example
users = load_user_database
users[username] == password_hash
end
def redirect_to_login
puts @cgi.header('status' => '302 Found', 'location' => '/login.html')
exit
end
end
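The class above hashes passwords with a single SHA-256 pass and flags that as a placeholder. One possible replacement, sketched here with Ruby's bundled OpenSSL bindings, is PBKDF2 with a per-user random salt. The iteration count, key length, and helper names are illustrative assumptions, not recommendations from the original text.

```ruby
require 'openssl'
require 'securerandom'

ITERATIONS = 100_000  # illustrative work factor; tune for your hardware
KEY_LENGTH = 32       # bytes of derived key

# Derive a key from the password with PBKDF2-HMAC-SHA256.
def derive_key(password, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, KEY_LENGTH,
                             OpenSSL::Digest.new('SHA256'))
end

# Returns [salt, key]; both need to be stored alongside the username.
def hash_password(password)
  salt = SecureRandom.random_bytes(16)
  [salt, derive_key(password, salt)]
end

# Compare every byte so timing does not reveal where a mismatch occurs.
def password_matches?(password, salt, expected_key)
  actual = derive_key(password, salt)
  return false unless actual.bytesize == expected_key.bytesize
  diff = 0
  actual.bytes.zip(expected_key.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

salt, key = hash_password('correct horse')
puts password_matches?('correct horse', salt, key)
puts password_matches?('wrong guess', salt, key)
```

With this approach verify_credentials would receive the plain password and look up the stored salt and key, instead of comparing precomputed hex digests.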
Load balancing and scaling considerations affect session storage and state management. File-based sessions work for single-server deployments, but distributed applications require shared session storage.
# Database-backed session management for scalability
require 'pg' # or preferred database adapter
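class DatabaseSessionStore < CGI::Session::NullStore
def initialize(session, options = {})
@session_id = session.session_id # used as the key in every query below
@db_config = options['database_config']
@table_name = options['table_name'] || 'cgi_sessions'
@h = {}
super
end
# CGI::Session#close delegates here; persist the data before the script exits
def close
update
end
def restore
conn = PG.connect(@db_config)
result = conn.exec_params(
"SELECT data FROM #{@table_name} WHERE session_id = $1 AND expires_at > NOW()",
[@session_id]
)
# Data is stored Base64-encoded so the binary Marshal output survives a text column
@h = if result.ntuples > 0
Marshal.load(result[0]['data'].unpack1('m'))
else
{}
end
ensure
conn&.close
end
def update
data = [Marshal.dump(@h)].pack('m0')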
expires_at = Time.now + 1800 # 30 minutes
conn = PG.connect(@db_config)
conn.exec_params(
"INSERT INTO #{@table_name} (session_id, data, expires_at) VALUES ($1, $2, $3)
ON CONFLICT (session_id) DO UPDATE SET data = $2, expires_at = $3",
[@session_id, data, expires_at]
)
ensure
conn&.close
end
def delete
conn = PG.connect(@db_config)
conn.exec_params("DELETE FROM #{@table_name} WHERE session_id = $1", [@session_id])
ensure
conn&.close
end
end
Monitoring production CGI applications involves logging performance metrics, error rates, and security events. Integration with system monitoring tools provides visibility into application health and performance characteristics.
require 'json' # Hash#to_json
require 'time' # Time#iso8601
class CGIMonitoring
def self.log_request(cgi, start_time, status = '200')
duration = Time.now - start_time
log_entry = {
timestamp: Time.now.iso8601,
remote_addr: ENV['REMOTE_ADDR'],
method: ENV['REQUEST_METHOD'],
path: ENV['SCRIPT_NAME'],
query: ENV['QUERY_STRING'],
status: status,
duration_ms: (duration * 1000).round(2),
content_length: ENV['CONTENT_LENGTH']&.to_i || 0,
user_agent: ENV['HTTP_USER_AGENT']
}
File.open('/var/log/cgi_access.json', 'a') do |f|
f.puts log_entry.to_json
end
end
def self.log_security_event(event_type, details)
event = {
timestamp: Time.now.iso8601,
type: event_type,
remote_addr: ENV['REMOTE_ADDR'],
details: details,
user_agent: ENV['HTTP_USER_AGENT']
}
File.open('/var/log/cgi_security.json', 'a') do |f|
f.puts event.to_json
end
end
end
Common Pitfalls
Cross-site scripting vulnerabilities occur when user input is displayed without proper escaping. The CGI.escapeHTML method must be used consistently for all user-generated content displayed in HTML responses.
# VULNERABLE - Direct output of user input
username = cgi['username']
puts "<h1>Welcome #{username}!</h1>" # XSS vulnerability
# SECURE - Proper HTML escaping
username = cgi['username']
puts "<h1>Welcome #{CGI.escapeHTML(username)}!</h1>"
# VULNERABLE - Building HTML with concatenation
message = cgi['message']
html = "<div class='message'>" + message + "</div>"
# SECURE - Escape all dynamic content
message = cgi['message']
html = "<div class='message'>#{CGI.escapeHTML(message)}</div>"
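Escaping depends on where the value ends up. The sketch below builds a link from a user-supplied search term, using CGI.escape for the query-string part and CGI.escapeHTML for everything placed into markup. `search_link` and the /search path are hypothetical.

```ruby
require 'cgi'

# Hypothetical helper: render a search link for a user-supplied term.
def search_link(term)
  href = "/search?q=#{CGI.escape(term)}"  # URL context
  # HTML context: escape both the attribute value and the link text.
  %(<a href="#{CGI.escapeHTML(href)}">#{CGI.escapeHTML(term)}</a>)
end

puts search_link('ruby & "cgi"')
# <a href="/search?q=ruby+%26+%22cgi%22">ruby &amp; &quot;cgi&quot;</a>
```

Applying only one of the two escapes would leave either the URL or the markup open to injection.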
Parameter pollution attacks exploit how CGI handles multiple parameters with the same name. Applications expecting single values may receive arrays, leading to unexpected behavior or security issues.
# Vulnerable to parameter pollution
user_id = cgi['user_id']
# If URL contains ?user_id=123&user_id=456, user_id gets the first value (123)
# But cgi.params['user_id'] contains ['123', '456']
# Secure parameter handling
def get_single_param(cgi, name, pattern = nil)
values = cgi.params[name]
return nil if values.nil? || values.empty?
if values.size > 1
raise ArgumentError, "Multiple values for parameter #{name}"
end
value = values.first
if pattern && !(value =~ pattern)
raise ArgumentError, "Invalid format for parameter #{name}"
end
value
end
# Usage with validation
user_id = get_single_param(cgi, 'user_id', /\A\d+\z/)
File upload security requires careful handling of filenames, content types, and storage locations. Attackers can manipulate these attributes to overwrite system files or execute malicious code.
# VULNERABLE - Using original filename directly
uploaded_file = cgi['upload']
File.open("/uploads/#{uploaded_file.original_filename}", 'wb') do |f|
f.write(uploaded_file.read)
end
# SECURE - Filename sanitization and validation
require 'securerandom'
def secure_upload_handling(uploaded_file, allowed_extensions = ['.jpg', '.png', '.pdf'])
return nil unless uploaded_file.respond_to?(:read)
# Generate safe filename
original_name = uploaded_file.original_filename
extension = File.extname(original_name).downcase
unless allowed_extensions.include?(extension)
raise ArgumentError, "File type not allowed"
end
# Create unique, safe filename
timestamp = Time.now.strftime('%Y%m%d_%H%M%S')
# SecureRandom avoids the collisions and guessable names that rand(10000) allows
random_id = SecureRandom.hex(8)
safe_filename = "upload_#{timestamp}_#{random_id}#{extension}"
# Store in secure directory with proper permissions
upload_dir = '/var/uploads'
file_path = File.join(upload_dir, safe_filename)
File.open(file_path, 'wb', 0644) do |f|
f.write(uploaded_file.read)
end
safe_filename
end
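A further safeguard, sketched below, is to create the destination file exclusively so that an existing file can never be overwritten. `store_upload` is a hypothetical helper, and the temporary directory stands in for a real upload directory.

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical helper: store data under a random name, refusing to overwrite.
def store_upload(data, dir, extension)
  path = File.join(dir, "upload_#{SecureRandom.hex(16)}#{extension}")
  # File::EXCL makes the open fail with Errno::EEXIST if the path exists.
  File.open(path, File::WRONLY | File::CREAT | File::EXCL, 0o600) do |f|
    f.binmode
    f.write(data)
  end
  path
end

Dir.mktmpdir do |dir|
  path = store_upload('fake image bytes', dir, '.png')
  puts File.basename(path)
end
```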
Session fixation vulnerabilities occur when session IDs are not regenerated after authentication. CGI::Session requires explicit session ID regeneration to prevent session hijacking attacks.
# VULNERABLE - Keeping same session ID after login
def login_user(cgi, username, password)
if authenticate(username, password)
session = CGI::Session.new(cgi)
session['user_id'] = username # Session ID remains same
return true
end
false
end
# SECURE - Regenerate session after authentication
def secure_login(cgi, username, password)
if authenticate(username, password)
# Destroy existing session
old_session = CGI::Session.new(cgi)
old_session.delete
# Create new session with different ID
new_session = CGI::Session.new(cgi, 'new_session' => true)
new_session['user_id'] = username
new_session['authenticated_at'] = Time.now.to_i
return new_session
end
nil
end
Character encoding issues arise when processing form data with different encodings. Ruby's CGI module requires explicit encoding handling to prevent data corruption and security vulnerabilities.
# Handle encoding issues properly
def process_multilingual_input(cgi)
input_text = cgi['message']
return '' if input_text.nil?
# Force UTF-8 encoding and handle invalid bytes
if input_text.encoding != Encoding::UTF_8
begin
input_text = input_text.encode('UTF-8', invalid: :replace, undef: :replace)
rescue Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError
input_text = input_text.force_encoding('UTF-8')
input_text = input_text.scrub('?') # Replace invalid bytes
end
end
# Validate that string is valid UTF-8
unless input_text.valid_encoding?
input_text = input_text.scrub('?')
end
input_text
end
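The difference between scrubbing and transcoding is easiest to see on a concrete byte string. In this sketch, `to_utf8` is a hypothetical helper and the ISO-8859-1 input is an assumed example of a form submitted in a legacy charset.

```ruby
# Hypothetical helper: return a valid UTF-8 copy of raw form bytes.
# source_encoding is the charset the form was submitted in, when known.
def to_utf8(raw, source_encoding = nil)
  if source_encoding
    # Reinterpret the bytes in their real encoding, then transcode.
    raw.dup.force_encoding(source_encoding)
       .encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')
  else
    # Unknown origin: treat as UTF-8 and replace whatever is invalid.
    raw.dup.force_encoding('UTF-8').scrub('?')
  end
end

latin1_bytes = "caf\xE9".b # "caf" plus e-acute, encoded as ISO-8859-1

p to_utf8(latin1_bytes)               # the stray byte is replaced
p to_utf8(latin1_bytes, 'ISO-8859-1') # the character is preserved
```

Knowing the source encoding preserves the text, while scrubbing alone only guarantees a valid string.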
Performance & Memory
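Large form data and file uploads can consume significant memory and processing time. Ruby's CGI implementation parses URL-encoded form data entirely in memory and reads the whole multipart body before the script runs (spooling upload parts to temporary files for larger requests), so applications handling substantial uploads need careful resource management.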
# Memory-efficient file upload handling
def stream_large_upload(cgi, max_size = 100_000_000) # 100MB limit
uploaded_file = cgi['large_file']
return nil unless uploaded_file.respond_to?(:read)
if uploaded_file.size > max_size
raise ArgumentError, "File exceeds maximum size"
end
# Stream file in chunks to reduce memory usage
output_path = "/tmp/upload_#{Time.now.to_i}_#{rand(1000)}"
total_written = 0
File.open(output_path, 'wb') do |output|
while chunk = uploaded_file.read(8192) # 8KB chunks
output.write(chunk)
total_written += chunk.bytesize
# Safety check during streaming
if total_written > max_size
File.unlink(output_path)
raise ArgumentError, "File size exceeded during upload"
end
end
end
{ path: output_path, size: total_written }
ensure
uploaded_file.close if uploaded_file.respond_to?(:close)
end
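The chunked loop can also be expressed with IO.copy_stream, which accepts a byte limit. The sketch below is one way to enforce the size cap with it; `copy_with_limit` is a hypothetical helper and a StringIO stands in for the uploaded file object.

```ruby
require 'stringio'
require 'tmpdir'

# Copy at most max_size bytes from an IO-like source to dest_path.
# Raises ArgumentError (and removes the partial file) if the source is larger.
def copy_with_limit(source, dest_path, max_size)
  copied = File.open(dest_path, 'wb') do |out|
    # Ask for one byte more than allowed so an oversized source is detectable.
    IO.copy_stream(source, out, max_size + 1)
  end
  if copied > max_size
    File.unlink(dest_path)
    raise ArgumentError, 'File exceeds maximum size'
  end
  copied
end

Dir.mktmpdir do |dir|
  upload = StringIO.new('x' * 5_000) # stands in for cgi['large_file']
  puts copy_with_limit(upload, File.join(dir, 'upload.bin'), 10_000)
end
```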
Parameter parsing performance degrades with large numbers of form fields or deeply nested parameter structures. Optimization strategies include parameter count limits and selective parsing.
# Performance monitoring for parameter parsing
class OptimizedCGI
def initialize(max_params = 1000)
@max_params = max_params
@start_time = Time.now
@cgi = parse_with_limits
end
def [](key)
@cgi[key]
end
def params
@cgi.params
end
private
def parse_with_limits
# Check content length before parsing
content_length = ENV['CONTENT_LENGTH'].to_i
if content_length > 50_000_000 # 50MB limit
raise ArgumentError, "Request too large"
end
cgi = CGI.new
# Count total parameters
param_count = cgi.params.values.sum(&:size)
if param_count > @max_params
raise ArgumentError, "Too many parameters (#{param_count} > #{@max_params})"
end
parse_time = Time.now - @start_time
if parse_time > 5 # 5 second parsing limit
File.open('/var/log/slow_parsing.log', 'a') do |f|
f.puts "Slow parsing: #{parse_time}s, params: #{param_count}, size: #{content_length}"
end
end
cgi
end
end
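Part of this check can run before CGI.new parses anything, using only the environment. The following sketch is a hypothetical pre-flight test; note that it counts query-string pairs only, so fields in a POST body still have to be counted after parsing, as the class above does.

```ruby
# Hypothetical pre-flight check, meant to run before CGI.new.
# env is any hash-like object keyed by CGI variable names (ENV in a script).
def request_within_limits?(env, max_bytes: 50_000_000, max_pairs: 1000)
  return false if env['CONTENT_LENGTH'].to_i > max_bytes
  # Pairs in a query string are separated by & or ;
  pairs = env['QUERY_STRING'].to_s.split(/[&;]/).size
  pairs <= max_pairs
end

puts request_within_limits?({ 'CONTENT_LENGTH' => '1024', 'QUERY_STRING' => 'a=1&b=2' })
puts request_within_limits?({ 'CONTENT_LENGTH' => '99999999999' })
```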
Memory usage optimization requires careful management of large strings and temporary files. Ruby's garbage collection can be tuned for CGI applications with specific memory patterns.
# Memory-conscious CGI response generation
require 'stringio' # StringIO is not guaranteed to be loaded by require 'cgi'
class MemoryOptimizedResponse
def initialize(cgi)
@cgi = cgi
@output_buffer = StringIO.new
end
def generate_large_response(data_array)
# Stream response instead of building large strings
puts @cgi.header('type' => 'text/csv', 'Content-Disposition' => 'attachment')
# Process data in batches to control memory usage
data_array.each_slice(1000) do |batch|
# puts ends every batch with a newline, so rows never run together
puts batch.map { |row| row.join(',') }.join("\n")
# Force garbage collection periodically
GC.start if batch.size == 1000 && rand < 0.1
end
end
def memory_stats
{
process_memory: `ps -o rss= -p #{Process.pid}`.strip.to_i * 1024,
ruby_memory: GC.stat[:heap_live_slots] * 40, # Approximate bytes per slot
gc_count: GC.count
}
end
end
# Usage with memory monitoring
response_handler = MemoryOptimizedResponse.new(cgi)
start_memory = response_handler.memory_stats[:process_memory]
response_handler.generate_large_response(large_dataset)
end_memory = response_handler.memory_stats[:process_memory]
memory_used = end_memory - start_memory
File.open('/var/log/memory_usage.log', 'a') do |f|
f.puts "Memory used: #{memory_used / 1024}KB"
end
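Database connection reuse and resource management become critical for high-traffic applications. A classic CGI script runs in a fresh process for every request, so an in-process pool like the one below only pays off under a persistent process model such as FastCGI. In either case, proper cleanup prevents resource exhaustion.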
# Resource-efficient database integration
class PooledDatabaseCGI
@connection_pool = []
@pool_mutex = Mutex.new
def self.get_connection(config)
@pool_mutex.synchronize do
connection = @connection_pool.pop
return connection if connection && connection.status == PG::CONNECTION_OK
PG.connect(config)
end
end
def self.return_connection(connection)
@pool_mutex.synchronize do
@connection_pool.push(connection) if connection && connection.status == PG::CONNECTION_OK
# Limit pool size
@connection_pool.pop.close if @connection_pool.size > 10
end
end
def process_request_with_db(cgi)
conn = self.class.get_connection(database_config)
# Process request with database
user_id = cgi['user_id']
result = conn.exec_params('SELECT name FROM users WHERE id = $1', [user_id])
puts cgi.header
puts "<h1>Hello #{CGI.escapeHTML(result[0]['name'])}</h1>"
ensure
self.class.return_connection(conn) if conn
end
private
def database_config
{
host: ENV['DB_HOST'] || 'localhost',
port: ENV['DB_PORT'] || 5432,
dbname: ENV['DB_NAME'],
user: ENV['DB_USER'],
password: ENV['DB_PASSWORD'],
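connect_timeout: 5,
# statement_timeout is a server setting, so it is passed through the libpq "options" parameter
options: '-c statement_timeout=30000'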
}
end
end
Reference
CGI Class Methods
Method | Parameters | Returns | Description |
---|---|---|---|
CGI.new(type = "query") | type (String) | CGI | Creates new CGI instance with specified input type |
CGI.escape(string) | string (String) | String | URL-encodes string for use in URLs |
CGI.unescape(string) | string (String) | String | URL-decodes previously encoded string |
CGI.escapeHTML(string) | string (String) | String | HTML-encodes string to prevent XSS attacks |
CGI.unescapeHTML(string) | string (String) | String | HTML-decodes previously encoded string |
CGI.escape_element(string, *elements) | string (String), elements (Array) | String | Escapes specified HTML elements only |
CGI.unescape_element(string, *elements) | string (String), elements (Array) | String | Unescapes specified HTML elements only |
CGI.parse(query) | query (String) | Hash | Parses query string into parameter hash |
CGI Instance Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#[](key) | key (String) | String | Returns first value for parameter key ("" if absent) |
#params | None | Hash | Returns hash of all parameters as arrays |
#keys | None | Array | Returns array of all parameter keys |
#has_key?(key) | key (String) | Boolean | Checks if parameter exists |
#header(options = {}) | options (Hash) | String | Generates HTTP response headers |
#cookies | None | Hash | Returns hash of cookies from request |
#accept_charset | None | String | Returns accepted character sets |
#auth_type | None | String | Returns authentication type |
#content_length | None | Integer | Returns content length of request |
#content_type | None | String | Returns content type of request |
#gateway_interface | None | String | Returns CGI version information |
#path_info | None | String | Returns additional path information |
#path_translated | None | String | Returns translated path information |
#query_string | None | String | Returns query string from URL |
#remote_addr | None | String | Returns client IP address |
#remote_host | None | String | Returns client hostname |
#remote_ident | None | String | Returns client identity information |
#remote_user | None | String | Returns authenticated username |
#request_method | None | String | Returns HTTP request method |
#script_name | None | String | Returns script path |
#server_name | None | String | Returns server hostname |
#server_port | None | Integer | Returns server port number |
#server_protocol | None | String | Returns HTTP protocol version |
#server_software | None | String | Returns web server software |
#user_agent | None | String | Returns client user agent string |
CGI::Cookie Class Methods
Method | Parameters | Returns | Description |
---|---|---|---|
CGI::Cookie.new(options) | options (Hash) | CGI::Cookie | Creates new cookie with specified attributes |
CGI::Cookie.parse(raw_cookie) | raw_cookie (String) | Hash | Parses cookie string into cookie objects |
CGI::Cookie Instance Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#name | None | String | Returns cookie name |
#value | None | Array | Returns cookie values as array |
#domain | None | String | Returns cookie domain |
#path | None | String | Returns cookie path |
#expires | None | Time | Returns cookie expiration time |
#secure | None | Boolean | Returns secure flag status |
#httponly | None | Boolean | Returns HTTP-only flag status |
#to_s | None | String | Returns formatted cookie string |
Header Options
Option | Type | Description |
---|---|---|
type | String | Content-Type header value (default: "text/html") |
charset | String | Character encoding (default: system encoding) |
status | String | HTTP status code and message |
server | String | Server header value |
connection | String | Connection header value |
length | Integer | Content-Length header value |
language | String | Content-Language header value |
expires | Time | Expires header value |
cookie | CGI::Cookie or Array | Cookie objects to set |
location | String | Location header for redirects |
cache-control | String | Cache-Control header value |
pragma | String | Pragma header value |
Common Environment Variables
Variable | Description |
---|---|
REQUEST_METHOD | HTTP method (GET, POST, PUT, DELETE) |
CONTENT_TYPE | MIME type of request body |
CONTENT_LENGTH | Size of request body in bytes |
QUERY_STRING | URL parameters after ? character |
HTTP_USER_AGENT | Client browser identification |
HTTP_ACCEPT | MIME types accepted by client |
HTTP_COOKIE | Cookies sent by client |
REMOTE_ADDR | Client IP address |
REMOTE_HOST | Client hostname |
SERVER_NAME | Server hostname |
SERVER_PORT | Server port number |
SCRIPT_NAME | CGI script path |
PATH_INFO | Additional path information |
CGI Input Types
Type | Description | Usage |
---|---|---|
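"query" | Parse query string and form data without HTML generation methods | Default for most requests |
"html3" | Add HTML 3.2 generation methods | For scripts generating HTML 3.2 |
"html4" | Add HTML 4.0 strict generation methods | For scripts generating HTML 4.0 |
"html4Tr" | Add HTML 4.0 transitional generation methods | For scripts generating transitional HTML |
"html4Fr" | Add HTML 4.0 frameset generation methods | For scripts generating framesets |
"html5" | Add HTML5 generation methods | For scripts generating HTML5 |
Multipart form data is detected from the request's Content-Type, so file upload forms need no special type.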