Overview
Input validation in Ruby encompasses the techniques and patterns used to verify, sanitize, and transform user-provided data before processing. Ruby provides multiple approaches to validation through built-in methods, regular expressions, type checking mechanisms, and custom validation frameworks.
Ruby's validation capabilities span from basic type checking using methods like Integer() and Float() to complex pattern matching with regular expressions. The language includes string validation methods such as String#match?, numeric range checking, and length validation through String#size and Array#length. Ruby's duck typing system requires explicit validation when type safety matters.
# Basic type validation
def validate_age(input)
# Rescue only the conversion so the range error keeps its own message
age = begin
Integer(input)
rescue ArgumentError, TypeError
raise ArgumentError, "Invalid age format"
end
raise ArgumentError, "Age must be positive" unless age > 0
age
end
# Pattern-based validation
EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i
def validate_email(email)
return false unless email.is_a?(String)
email.match?(EMAIL_REGEX)
end
Ruby applications typically implement validation at multiple layers: input sanitization, format validation, business rule validation, and output encoding. The standard library provides URI for URL validation, Date and Time for temporal validation, and JSON for structured data validation.
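As a rough illustration of the standard-library approach, the sketch below checks a URL with URI.parse and a JSON payload with JSON.parse. The method names valid_http_url? and parse_json_object are hypothetical helpers introduced here, and the rule that only http and https URLs with a host are acceptable is an assumption rather than a requirement of the library.

```ruby
require 'uri'
require 'json'

# Hypothetical helper: accepts only absolute http/https URLs that include a host.
def valid_http_url?(value)
  return false unless value.is_a?(String)

  uri = URI.parse(value)
  # URI::HTTPS inherits from URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: returns the parsed Hash, or nil when the text is not a JSON object.
def parse_json_object(text)
  return nil unless text.is_a?(String)

  data = JSON.parse(text)
  data.is_a?(Hash) ? data : nil
rescue JSON::ParserError
  nil
end
```

Parsing succeeds for many inputs that an application may still want to reject, such as a javascript: URI or a top-level JSON array, which is why both helpers inspect the parsed result instead of relying on the absence of an exception.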
Validation in Ruby serves three primary purposes: preventing application errors from invalid data, enforcing business rules, and protecting against security vulnerabilities like SQL injection and cross-site scripting. Because Ruby is dynamically typed and performs very little implicit type coercion, external input (which almost always arrives as strings) must be validated and converted explicitly.
Basic Usage
Ruby input validation begins with type checking and format verification. The Integer() and Float() methods perform strict conversion and raise ArgumentError for strings that are not valid numbers (and TypeError for nil), making them effective validators. String() is not strict: it calls to_str or to_s and succeeds for almost any object.
def validate_numeric_input(value)
# Integer() raises ArgumentError for non-numeric strings
number = Integer(value)
# Additional range validation
raise ArgumentError, "Number out of range" unless (1..100).cover?(number)
number
end
# Usage examples
validate_numeric_input("42") # => 42
validate_numeric_input("abc") # raises ArgumentError: invalid value for Integer(): "abc"
validate_numeric_input("150") # raises ArgumentError: Number out of range
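One detail worth keeping in mind is that Integer() follows Ruby's literal syntax, so a prefix such as 0x or a leading zero changes the base. The sketch below, using a hypothetical helper named parse_decimal, shows one way to accept only plain base-10 digits; the decision to return nil instead of raising is an assumption made for brevity.

```ruby
# Integer("0x1A")    # => 26  (hexadecimal prefix)
# Integer("010")     # => 8   (leading zero means octal)
# Integer("010", 10) # => 10  (explicit base disables prefix handling)

# Hypothetical helper: returns an Integer for plain decimal text, nil otherwise.
def parse_decimal(text)
  # The pattern rejects prefixes, underscores, whitespace, and trailing newlines.
  return nil unless text.is_a?(String) && text.match?(/\A-?\d+\z/)

  Integer(text, 10)
end
```

Checking the format first and converting with an explicit base keeps the two concerns separate: the pattern decides what is acceptable, and Integer(text, 10) only performs the conversion.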
String validation commonly uses length checks and pattern matching. Ruby's String#size method returns character count, while regular expressions provide pattern-based validation for formats like emails, phone numbers, and identifiers.
class InputValidator
USERNAME_PATTERN = /\A[a-zA-Z0-9_]{3,20}\z/
def self.validate_username(username)
return false unless username.is_a?(String)
return false if username.empty?
username.match?(USERNAME_PATTERN)
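end
# Class-level email check based on the EMAIL_REGEX constant defined earlier
def self.validate_email(email)
email.is_a?(String) && email.match?(EMAIL_REGEX)
end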
def self.validate_password(password)
return false unless password.is_a?(String)
return false if password.length < 8
return false unless password.match?(/[A-Z]/) # uppercase
return false unless password.match?(/[a-z]/) # lowercase
return false unless password.match?(/\d/) # digit
true
end
end
Array and hash validation requires checking structure and content. Ruby provides methods like Array#all? and Hash#key? for validating collections and their elements.
def validate_user_data(data)
# Check if data is a hash
raise ArgumentError, "Data must be a hash" unless data.is_a?(Hash)
# Required fields validation
# Use symbol keys here because the field checks below read data[:name], data[:email], data[:age]
required_fields = %i[name email age]
missing_fields = required_fields - data.keys
raise ArgumentError, "Missing fields: #{missing_fields.join(', ')}" unless missing_fields.empty?
# Individual field validation
raise ArgumentError, "Invalid name" unless data[:name].is_a?(String) && !data[:name].empty?
raise ArgumentError, "Invalid email" unless validate_email(data[:email])
raise ArgumentError, "Invalid age" unless data[:age].is_a?(Integer) && data[:age] > 0
true
end
# Array element validation
def validate_tags(tags)
return false unless tags.is_a?(Array)
return false if tags.empty?
tags.all? { |tag| tag.is_a?(String) && tag.match?(/\A[a-z0-9-]{2,30}\z/) }
end
Date and time validation uses Ruby's Date and Time classes with parsing methods that raise exceptions for invalid formats. The Date.strptime method allows custom format specification.
require 'date'
def validate_date(date_string, format = '%Y-%m-%d')
Date.strptime(date_string, format)
true
rescue Date::Error, ArgumentError, TypeError # TypeError covers nil and other non-string input
false
end
def validate_date_range(start_date, end_date)
# Rescue only the parsing step so the ordering error keeps its own message
begin
start_parsed = Date.parse(start_date)
end_parsed = Date.parse(end_date)
rescue Date::Error, ArgumentError, TypeError => e
raise ArgumentError, "Invalid date format: #{e.message}"
end
raise ArgumentError, "End date must be after start date" if end_parsed <= start_parsed
{ start: start_parsed, end: end_parsed }
end
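Date.parse is deliberately lenient about the formats it accepts, which is convenient for display code but less so for validation. When only one exact format should pass, a stricter check can be sketched as follows. The helper name parse_iso_date and the YYYY-MM-DD-only rule are assumptions made for this example.

```ruby
require 'date'

ISO_DATE_PATTERN = /\A(\d{4})-(\d{2})-(\d{2})\z/

# Hypothetical helper: returns a Date for a real calendar day in YYYY-MM-DD form, nil otherwise.
def parse_iso_date(value)
  match = ISO_DATE_PATTERN.match(value.to_s)
  return nil unless match

  # Base 10 is explicit so that parts like "08" are not treated as octal.
  year, month, day = match.captures.map { |part| Integer(part, 10) }
  Date.valid_date?(year, month, day) ? Date.new(year, month, day) : nil
end
```

The pattern fixes the shape of the input and Date.valid_date? confirms that the day exists, so February 29 is accepted only in leap years.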
Error Handling & Debugging
Input validation error handling in Ruby requires distinguishing between different failure types and providing meaningful error messages. Ruby's exception hierarchy allows catching specific validation errors while letting system errors propagate.
class ValidationError < StandardError
attr_reader :field, :value, :constraint
def initialize(field, value, constraint, message = nil)
@field = field
@value = value
@constraint = constraint
super(message || "Validation failed for #{field}: #{constraint}")
end
end
class DataValidator
def self.validate_user_registration(params)
errors = {}
# Email validation with specific error types
begin
validate_email_format(params[:email])
rescue ValidationError => e
errors[:email] = e.message
end
# Password validation with multiple constraints
begin
validate_password_strength(params[:password])
rescue ValidationError => e
errors[:password] = e.message
end
# Age validation with range checking
begin
validate_age_range(params[:age])
rescue ValidationError => e
errors[:age] = e.message
end
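# ValidationError requires field, value, and constraint, so build it explicitly
raise ValidationError.new(:base, nil, "multiple failures", "Multiple validation errors: #{errors}") unless errors.empty?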
true
end
private
def self.validate_email_format(email)
raise ValidationError.new(:email, email, "required") if email.nil? || email.empty?
raise ValidationError.new(:email, email, "invalid format") unless email.match?(EMAIL_REGEX)
raise ValidationError.new(:email, email, "too long") if email.length > 254
end
def self.validate_password_strength(password)
raise ValidationError.new(:password, "[hidden]", "required") if password.nil? || password.empty?
raise ValidationError.new(:password, "[hidden]", "too short") if password.length < 8
raise ValidationError.new(:password, "[hidden]", "missing uppercase") unless password.match?(/[A-Z]/)
raise ValidationError.new(:password, "[hidden]", "missing lowercase") unless password.match?(/[a-z]/)
raise ValidationError.new(:password, "[hidden]", "missing digit") unless password.match?(/\d/)
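end
# Illustrative bounds; the registration flow above calls this method
def self.validate_age_range(age)
raise ValidationError.new(:age, age, "required") if age.nil?
raise ValidationError.new(:age, age, "out of range") unless age.is_a?(Integer) && age.between?(13, 120)
end
# A bare `private` does not affect `def self.` methods; private_class_method does
private_class_method :validate_email_format, :validate_password_strength, :validate_age_range
end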
Debugging validation failures requires logging both the input and the validation rule that failed. Sensitive values such as passwords and tokens should be redacted before they reach the log. Ruby's logging capabilities help track validation patterns and identify problematic inputs.
require 'logger'
class DebugValidator
def initialize(logger = Logger.new(STDOUT))
@logger = logger
end
def validate_with_debugging(value, validators)
@logger.info("Starting validation for value: #{value.inspect}")
validators.each_with_index do |validator, index|
begin
result = validator.call(value)
rescue => e
@logger.error("Validator #{index + 1} failed: #{e.message}")
@logger.debug("Validator details: #{validator}")
@logger.debug("Stack trace: #{e.backtrace.first(3)}")
raise
end
# Stop at the first validator that returns a falsy result
unless result
@logger.warn("Validator #{index + 1} rejected the value")
return false
end
@logger.info("Validator #{index + 1} passed: #{validator}")
end
# Every validator ran and passed
true
end
end
# Usage with debugging
validator = DebugValidator.new
validators = [
->(v) { raise "Too short" if v.length < 3; true },
->(v) { raise "Invalid chars" unless v.match?(/\A[a-z]+\z/); true },
->(v) { raise "Reserved word" if %w[admin root].include?(v); true }
]
begin
validator.validate_with_debugging("admin", validators)
rescue => e
puts "Validation failed: #{e.message}"
end
Complex validation scenarios benefit from validation result objects that capture both success state and detailed error information. This approach separates validation logic from error handling.
class ValidationResult
attr_reader :valid, :errors, :warnings
def initialize(valid = true)
@valid = valid
@errors = []
@warnings = []
end
def add_error(field, message)
@errors << { field: field, message: message }
@valid = false
end
def add_warning(field, message)
@warnings << { field: field, message: message }
end
def valid?
@valid && @errors.empty?
end
def error_messages
@errors.map { |e| "#{e[:field]}: #{e[:message]}" }
end
end
class ComprehensiveValidator
def validate_user_profile(profile)
result = ValidationResult.new
# Required field validation
%w[name email].each do |field|
if profile[field.to_sym].nil? || profile[field.to_sym].empty?
result.add_error(field, "is required")
end
end
# Email format validation
if profile[:email] && !profile[:email].match?(EMAIL_REGEX)
result.add_error("email", "invalid format")
end
# Age validation with warnings
if profile[:age]
if profile[:age] < 13
result.add_error("age", "must be at least 13")
elsif profile[:age] > 120
result.add_warning("age", "unusually high age")
end
end
result
end
end
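The same idea of collecting every failure before reporting can also be expressed with a small rule table. The following is only a sketch: PROFILE_RULES, collect_errors, and the specific rules are hypothetical and chosen for illustration.

```ruby
# Hypothetical rule table: field name => list of [message, check] pairs.
PROFILE_RULES = {
  name: [["is required", ->(v) { v.is_a?(String) && !v.strip.empty? }]],
  age:  [["must be an integer", ->(v) { v.is_a?(Integer) }],
         ["must be between 13 and 120", ->(v) { v.is_a?(Integer) && v.between?(13, 120) }]]
}.freeze

# Runs every check for every field and returns { field => [messages] } for the failures.
def collect_errors(input, rules = PROFILE_RULES)
  rules.each_with_object({}) do |(field, checks), errors|
    failed = checks.reject { |_message, check| check.call(input[field]) }
    errors[field] = failed.map(&:first) unless failed.empty?
  end
end
```

An empty hash means the input passed. Because no exception is raised for ordinary validation failures, callers can present all problems to the user at once.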
Performance & Memory
Input validation performance in Ruby depends heavily on the validation methods chosen and the size of data being processed. Regular expressions, string operations, and type conversions have different performance characteristics that affect large-scale applications.
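Regular expression performance varies significantly based on pattern complexity and input string length. Simple patterns like \A[a-z]+\z perform faster than complex lookahead patterns. Regexp literals without interpolation are compiled once when the file is parsed, so defining them as constants mainly aids reuse and readability; patterns built with interpolation or Regexp.new are rebuilt on every evaluation and do benefit from being stored in a constant.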
# Performance comparison of validation approaches
require 'benchmark'
class PerformanceValidator
# Pre-compiled regex for better performance
EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i.freeze
PHONE_REGEX = /\A\+?[\d\s\-\(\)]{10,15}\z/.freeze
def self.benchmark_email_validation(emails)
Benchmark.bm(15) do |x|
# Regex validation
x.report("Regex match") do
emails.each { |email| email.match?(EMAIL_REGEX) }
end
# String method validation
x.report("String methods") do
emails.each do |email|
email.include?("@") &&
email.count("@") == 1 &&
email.length > 5 &&
email.length < 255
end
end
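# URI::MailTo approach: build expects [to, headers] (or a Hash), not a bare String
require 'uri' # loaded once, outside the timed loop
x.report("URI parsing") do
emails.each do |email|
begin
URI::MailTo.build([email, nil])
true
rescue URI::Error, ArgumentError
false
end
end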
end
end
end
end
# Memory-efficient validation for large datasets
class BatchValidator
def validate_large_dataset(data_stream)
valid_count = 0
error_count = 0
data_stream.each_slice(1000) do |batch|
batch.each do |record|
if validate_record(record)
valid_count += 1
else
error_count += 1
end
end
# Process batch and free memory
GC.start if (valid_count + error_count) % 10000 == 0
end
{ valid: valid_count, errors: error_count }
end
private
def validate_record(record)
# Efficient validation that doesn't create large intermediate objects
return false unless record.is_a?(Hash)
return false unless record[:id].to_s.match?(/\A\d+\z/)
return false unless record[:name].to_s.length.between?(1, 100)
true
end
end
String validation performance benefits from choosing appropriate methods. String#match? performs faster than String#match when only boolean results are needed, because it does not create a MatchData object. String#bytesize counts bytes and String#length counts characters, so the two differ for multibyte characters.
class OptimizedStringValidator
# Fast string validation methods
def self.fast_length_check(string, min, max)
# bytesize is O(1) but counts bytes; it equals length only for ASCII-only strings
size = string.bytesize
size >= min && size <= max
end
def self.validate_ascii_only(string)
# ASCII validation is faster than full unicode checks
string.ascii_only? && string.valid_encoding?
end
def self.validate_numeric_string(string)
# Faster than regex for simple numeric validation
return false if string.empty?
string.each_byte do |byte|
return false unless byte >= 48 && byte <= 57 # '0' to '9'
end
true
end
# Memory-efficient validation for large strings
def self.validate_large_text(text, max_words = 1000)
word_count = 0
text.scan(/\S+/) do |word|
word_count += 1
return false if word_count > max_words
end
true
end
end
# Performance testing framework
class ValidationBenchmark
def self.compare_validation_methods(dataset)
puts "Dataset size: #{dataset.size} records"
Benchmark.bm(20) do |x|
x.report("Regex validation") do
dataset.each { |item| item.match?(/\A[a-z0-9_]{3,20}\z/) }
end
x.report("Length + char check") do
dataset.each do |item|
item.length.between?(3, 20) &&
item.chars.all? { |c| c.match?(/[a-z0-9_]/) }
end
end
x.report("Byte-level check") do
dataset.each do |item|
next false unless item.length.between?(3, 20)
item.each_byte.all? do |byte|
(byte >= 97 && byte <= 122) || # a-z
(byte >= 48 && byte <= 57) || # 0-9
byte == 95 # _
end
end
end
end
end
end
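Two of the differences described above can be observed directly. The sketch below uses hypothetical helper names (last_match_after_predicate and size_report) to show that String#match? leaves the last-match state untouched and that character and byte counts diverge for multibyte text.

```ruby
# String#match? returns a boolean without populating Regexp.last_match,
# while =~ and String#match store a MatchData object.
def last_match_after_predicate
  "order-42" =~ /\d+/       # stores MatchData for "42"
  "order-7".match?(/\d+/)   # leaves the stored MatchData untouched
  Regexp.last_match(0)
end

# Reports both sizes so that limits can be applied to the right one.
def size_report(text)
  { chars: text.length, bytes: text.bytesize }
end

# last_match_after_predicate  # => "42"
# size_report("caf\u00e9")    # => {chars: 4, bytes: 5}
```

A length limit meant for display should use the character count, while a limit imposed by a storage column measured in bytes should use bytesize.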
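Memory management becomes critical when validating large datasets. Validation methods should stream records and avoid creating unnecessary intermediate objects. Ruby's garbage collector normally needs no manual help; the explicit GC.start calls in the example exist to make the memory measurements comparable and are rarely worthwhile in production code.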
class MemoryEfficientValidator
def validate_csv_stream(file_path, chunk_size = 5000)
require 'csv'
valid_records = 0
invalid_records = []
memory_usage = []
CSV.foreach(file_path, headers: true).each_slice(chunk_size).with_index do |chunk, index|
# Record memory usage before processing
memory_before = GC.stat[:heap_live_slots]
chunk_results = chunk.map { |row| validate_csv_row(row) }
valid_records += chunk_results.count(true)
# Collect only invalid record references, not full data
chunk.each_with_index do |row, row_index|
unless chunk_results[row_index]
invalid_records << { chunk: index, row: row_index, id: row['id'] }
end
end
# Force garbage collection and measure memory
GC.start
memory_after = GC.stat[:heap_live_slots]
memory_usage << { chunk: index, before: memory_before, after: memory_after }
# Log progress for large files
puts "Processed chunk #{index + 1}, valid: #{valid_records}, memory delta: #{memory_after - memory_before}" if index % 10 == 0
end
{
valid_count: valid_records,
invalid_count: invalid_records.size,
invalid_records: invalid_records,
memory_profile: memory_usage
}
end
private
def validate_csv_row(row)
# Efficient validation without creating intermediate objects
return false unless row['id']&.match?(/\A\d+\z/)
return false unless row['email']&.include?('@')
return false unless row['name']&.length&.between?(1, 100)
true
end
end
Testing Strategies
Testing input validation requires comprehensive coverage of boundary conditions, edge cases, and error scenarios. Ruby's testing frameworks provide tools for parameterized testing, exception assertions, and property-based testing approaches.
RSpec and Minitest both support validation testing patterns. Testing strategies should cover valid inputs, invalid inputs, boundary values, and error handling behavior. Property-based testing helps discover edge cases that manual test cases might miss.
# RSpec validation testing patterns
RSpec.describe InputValidator do
describe '#validate_email' do
# Valid email test cases
valid_emails = [
'user@example.com',
'test.email+tag@example.co.uk',
'user123@sub.domain.com'
]
valid_emails.each do |email|
it "accepts valid email: #{email}" do
expect(InputValidator.validate_email(email)).to be true
end
end
# Invalid email test cases with specific error messages
invalid_email_cases = [
{ email: nil, reason: 'nil input' },
{ email: '', reason: 'empty string' },
{ email: 'invalid', reason: 'missing @ symbol' },
{ email: '@example.com', reason: 'missing local part' },
{ email: 'user@', reason: 'missing domain' },
{ email: 'user@.com', reason: 'invalid domain' },
{ email: 'user name@example.com', reason: 'contains space' }
]
invalid_email_cases.each do |test_case|
it "rejects #{test_case[:reason]}: #{test_case[:email].inspect}" do
expect(InputValidator.validate_email(test_case[:email])).to be false
end
end
end
describe '#validate_password' do
# Boundary testing for password length
context 'password length validation' do
it 'rejects passwords shorter than 8 characters' do
# 7 characters that satisfy every other rule, so only the length check can fail
expect(InputValidator.validate_password('Abcdef1')).to be false
end
it 'accepts passwords exactly 8 characters' do
expect(InputValidator.validate_password('Abcdef12')).to be true
end
it 'accepts very long passwords' do
long_password = 'A' + 'a' * 100 + '1'
expect(InputValidator.validate_password(long_password)).to be true
end
end
# Character requirement testing
context 'character requirements' do
let(:base_password) { 'password' }
it 'requires uppercase letter' do
expect(InputValidator.validate_password('lowercase123')).to be false
expect(InputValidator.validate_password('Uppercase123')).to be true
end
it 'requires lowercase letter' do
expect(InputValidator.validate_password('UPPERCASE123')).to be false
expect(InputValidator.validate_password('Uppercase123')).to be true
end
it 'requires digit' do
expect(InputValidator.validate_password('NoDigits')).to be false
expect(InputValidator.validate_password('WithDigit1')).to be true
end
end
end
end
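# Minitest validation testing with helper methods
# Assumes a DataValidator whose instance methods validate_age (raises ValidationError)
# and validate_username (returns a boolean) are defined elsewhere
require 'minitest/autorun'
class ValidationTest < Minitest::Test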
def setup
@validator = DataValidator.new
end
def test_age_validation_boundaries
# Test boundary values
assert_raises(ValidationError) { @validator.validate_age(0) }
assert @validator.validate_age(1)
assert @validator.validate_age(120)
assert_raises(ValidationError) { @validator.validate_age(121) }
end
def test_age_validation_types
# Test different input types
assert @validator.validate_age(25)
assert @validator.validate_age("25") # String conversion
assert_raises(ValidationError) { @validator.validate_age("abc") }
assert_raises(ValidationError) { @validator.validate_age(nil) }
assert_raises(ValidationError) { @validator.validate_age(25.5) }
end
# Property-based testing helper
def test_username_validation_properties
# Generate test data with known properties
valid_usernames = generate_valid_usernames(100)
invalid_usernames = generate_invalid_usernames(100)
valid_usernames.each do |username|
assert @validator.validate_username(username),
"Should accept valid username: #{username}"
end
invalid_usernames.each do |username|
refute @validator.validate_username(username),
"Should reject invalid username: #{username}"
end
end
private
def generate_valid_usernames(count)
count.times.map do
length = rand(3..20)
chars = [*'a'..'z', *'A'..'Z', *'0'..'9', '_']
Array.new(length) { chars.sample }.join
end
end
def generate_invalid_usernames(count)
invalid = []
# Too short
invalid += Array.new(count / 4) { [*'a'..'z'].sample(rand(0..2)).join }
# Too long
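# sample(n) never returns more than the 26 available letters, so build the string per character
invalid += Array.new(count / 4) { Array.new(rand(21..50)) { [*'a'..'z'].sample }.join }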
# Invalid characters
invalid += Array.new(count / 4) { "valid#{['!', '@', '#', '$'].sample}name" }
# Empty and nil
invalid += ['', nil] * (count / 8)
invalid.first(count)
end
end
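Randomized checks are easier to debug when the generator is seeded, because a failing input can then be reproduced. The following framework-free sketch compares the username pattern against an independent restatement of the same rule; the names USERNAME_RULE, USERNAME_CHARS, username_ok_by_oracle?, random_candidates, and disagreements are all hypothetical.

```ruby
USERNAME_RULE = /\A[a-zA-Z0-9_]{3,20}\z/
USERNAME_CHARS = [*'a'..'z', *'A'..'Z', *'0'..'9', '_'].freeze

# Independent oracle: restates the rule without using a regular expression.
def username_ok_by_oracle?(candidate)
  candidate.length.between?(3, 20) &&
    candidate.each_char.all? { |char| USERNAME_CHARS.include?(char) }
end

# Generates strings from a seeded RNG so that any failure can be replayed.
def random_candidates(count, seed:)
  rng = Random.new(seed)
  alphabet = USERNAME_CHARS + ['-', ' ', '!', "\n"]
  Array.new(count) do
    Array.new(rng.rand(0..25)) { alphabet.sample(random: rng) }.join
  end
end

# Returns every generated input on which the pattern and the oracle disagree.
def disagreements(count: 500, seed: 1234)
  random_candidates(count, seed: seed).reject do |candidate|
    candidate.match?(USERNAME_RULE) == username_ok_by_oracle?(candidate)
  end
end
```

An empty result from disagreements means the two implementations agreed on every generated input. This is a lightweight stand-in for a property-based testing library, not a replacement for one, since it does no shrinking of failing cases.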
Mock and stub testing helps isolate validation logic from external dependencies. Testing validation methods that depend on database lookups, API calls, or file system access requires careful mocking.
# Testing validation with external dependencies
class UserValidator
def initialize(user_repository = UserRepository.new)
@user_repository = user_repository
end
def validate_unique_email(email)
return false unless validate_email_format(email)
!@user_repository.exists_with_email?(email)
end
def validate_username_availability(username)
return false unless validate_username_format(username)
!@user_repository.exists_with_username?(username)
end
private
def validate_email_format(email)
email&.match?(EMAIL_REGEX)
end
def validate_username_format(username)
username&.match?(/\A[a-zA-Z0-9_]{3,20}\z/)
end
end
# RSpec tests with mocking
RSpec.describe UserValidator do
let(:mock_repository) { double('UserRepository') }
let(:validator) { UserValidator.new(mock_repository) }
describe '#validate_unique_email' do
context 'with valid email format' do
let(:email) { 'test@example.com' }
it 'returns true when email is not taken' do
allow(mock_repository).to receive(:exists_with_email?).with(email).and_return(false)
expect(validator.validate_unique_email(email)).to be true
end
it 'returns false when email is already taken' do
allow(mock_repository).to receive(:exists_with_email?).with(email).and_return(true)
expect(validator.validate_unique_email(email)).to be false
end
end
it 'returns false for invalid email format without checking repository' do
expect(mock_repository).not_to receive(:exists_with_email?)
expect(validator.validate_unique_email('invalid-email')).to be false
end
end
describe '#validate_username_availability' do
it 'calls repository only after format validation passes' do
expect(mock_repository).not_to receive(:exists_with_username?)
validator.validate_username_availability('ab') # Too short
end
it 'checks repository when format is valid' do
username = 'validuser'
expect(mock_repository).to receive(:exists_with_username?).with(username).and_return(false)
expect(validator.validate_username_availability(username)).to be true
end
end
end
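When a mocking library is not available, or when a test should read as plain Ruby, a hand-written fake can play the same role. The sketch below is an assumption-laden miniature: FakeUserRepository, EmailAvailabilityCheck, and the simplified FORMAT pattern are hypothetical and only mirror the shape of the validator shown above.

```ruby
# Hypothetical stand-in for a repository; records each lookup it receives.
class FakeUserRepository
  attr_reader :lookups

  def initialize(taken_emails)
    @taken_emails = taken_emails
    @lookups = []
  end

  def exists_with_email?(email)
    @lookups << email
    @taken_emails.include?(email)
  end
end

# Minimal validator that receives its repository through the constructor.
class EmailAvailabilityCheck
  FORMAT = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/ # simplified pattern for the example

  def initialize(repository)
    @repository = repository
  end

  def available?(email)
    # Format is checked first so that malformed input never reaches the repository.
    return false unless email.is_a?(String) && email.match?(FORMAT)

    !@repository.exists_with_email?(email)
  end
end
```

Because the fake records its calls, a test can assert both the return value and the fact that invalid input never triggered a lookup, which is the same guarantee the not_to receive expectation provides in RSpec.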
Integration testing validates the complete validation pipeline including error handling, logging, and side effects. These tests ensure validation works correctly within the broader application context.
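# Integration test for validation pipeline
# Assumes UserValidator has been extended with validate_complete_user (returning a
# ValidationResult) and an optional logger argument, and that logger and stringio are loaded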
class ValidationIntegrationTest < Minitest::Test
def setup
@temp_db = create_test_database
@validator = UserValidator.new(UserRepository.new(@temp_db))
@logger = Logger.new(StringIO.new)
end
def teardown
cleanup_test_database(@temp_db)
end
def test_complete_user_validation_flow
user_data = {
email: 'newuser@example.com',
username: 'newuser123',
password: 'SecurePass1',
age: 25
}
# Should pass all validation
result = @validator.validate_complete_user(user_data)
assert result.valid?
assert_empty result.errors
# Create user to test uniqueness validation
create_test_user(user_data)
# Should fail uniqueness validation
result = @validator.validate_complete_user(user_data)
refute result.valid?
assert_includes result.error_messages, "email: already taken"
end
def test_validation_error_logging
invalid_data = { email: 'invalid', username: 'ab', password: 'weak', age: -5 }
# Capture log output
log_output = StringIO.new
validator_with_logging = UserValidator.new(UserRepository.new(@temp_db), Logger.new(log_output))
result = validator_with_logging.validate_complete_user(invalid_data)
refute result.valid?
log_content = log_output.string
assert_includes log_content, "Validation failed for email"
assert_includes log_content, "Validation failed for username"
assert_includes log_content, "Validation failed for password"
assert_includes log_content, "Validation failed for age"
end
private
def create_test_database
# Setup in-memory SQLite database for testing
require 'sqlite3'
db = SQLite3::Database.new(':memory:')
db.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, username TEXT)')
db
end
def cleanup_test_database(db)
db.close
end
def create_test_user(user_data)
@temp_db.execute('INSERT INTO users (email, username) VALUES (?, ?)',
[user_data[:email], user_data[:username]])
end
end
Common Pitfalls
Input validation in Ruby contains several subtle gotchas that can lead to security vulnerabilities or application errors. Understanding these pitfalls prevents common mistakes in validation logic.
Regular expression anchoring represents a critical security concern. Using ^ and $ anchors instead of \A and \z allows multiline bypass attacks where malicious input includes newline characters.
# DANGEROUS: Vulnerable to newline bypass
BAD_EMAIL_REGEX = /^[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+$/i
def vulnerable_email_validation(email)
email.match?(BAD_EMAIL_REGEX)
end
# This passes validation but contains malicious content
malicious_input = "valid@example.com\n<script>alert('xss')</script>"
puts vulnerable_email_validation(malicious_input) # => true
# SAFE: Proper anchoring
SAFE_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i
def secure_email_validation(email)
email.match?(SAFE_EMAIL_REGEX)
end
puts secure_email_validation(malicious_input) # => false
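The three end-of-input anchors behave differently when a newline is present, which a short comparison makes visible. The constant and method names below are hypothetical and exist only for this demonstration.

```ruby
LINE_ANCHORED   = /^admin$/     # matches any single line equal to "admin"
STRING_ANCHORED = /\Aadmin\z/   # matches only the exact string "admin"
LOOSE_END       = /\Aadmin\Z/   # also tolerates one trailing newline

# Returns how each pattern treats the same input.
def anchor_results(input)
  {
    line:   input.match?(LINE_ANCHORED),
    string: input.match?(STRING_ANCHORED),
    loose:  input.match?(LOOSE_END)
  }
end

# anchor_results("admin")    # => {line: true, string: true,  loose: true}
# anchor_results("admin\n")  # => {line: true, string: false, loose: true}
# anchor_results("x\nadmin") # => {line: true, string: false, loose: false}
```

Only the \A and \z pair rejects every input that differs from the expected value, including a value followed by a newline.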
Type coercion pitfalls occur when lenient conversions hide invalid input. The to_i and to_f methods convert silently, returning 0 for non-numeric strings and ignoring trailing garbage, while Integer() and Float() raise exceptions for invalid input.
# Dangerous: Silent conversion loses validation
def bad_age_validation(age_input)
age = age_input.to_i # "abc".to_i => 0, "25abc".to_i => 25
age > 0 && age < 120
end
puts bad_age_validation("abc") # => false (but should raise error)
puts bad_age_validation("25abc") # => true (accepts invalid input!)
puts bad_age_validation("0025") # => true (accepts leading zeros)
# Better: Explicit validation with strict conversion
def proper_age_validation(age_input)
# First validate string format
return false unless age_input.is_a?(String)
return false unless age_input.match?(/\A\d+\z/) # Only digits
# Then convert and validate range
age = Integer(age_input, 10) # base 10: without it, "0025" would be read as octal (21)
age > 0 && age < 120
rescue ArgumentError
false
end
# Best: Comprehensive validation with detailed errors
class AgeValidator
def self.validate(input)
errors = []
errors << "must be a string" unless input.is_a?(String)
return errors unless errors.empty?
errors << "cannot be empty" if input.empty?
errors << "must contain only digits" unless input.match?(/\A\d+\z/)
errors << "cannot have leading zeros" if input.match?(/\A0\d+\z/)
return errors unless errors.empty?
age = Integer(input)
errors << "must be positive" unless age > 0
errors << "must be realistic (< 120)" unless age < 120
errors
end
end
String encoding issues create validation bypasses when byte-level and character-level operations produce different results. Ruby's encoding handling requires explicit consideration in validation routines.
# Encoding pitfall demonstration
def length_validation_pitfall(input)
# These can give different results for multibyte strings
puts "String: #{input.inspect}"
puts "Length (chars): #{input.length}"
puts "Bytesize: #{input.bytesize}"
puts "Valid encoding: #{input.valid_encoding?}"
end
# Examples showing the differences
length_validation_pitfall("café") # 4 chars, 5 bytes (UTF-8)
length_validation_pitfall("🚀") # 1 char, 4 bytes
length_validation_pitfall("\xff\xfe") # Invalid UTF-8 sequence
# Proper encoding-aware validation
class SafeStringValidator
def self.validate_text_input(input, max_chars: 100, max_bytes: 1000)
errors = []
# Check encoding validity first
unless input.valid_encoding?
errors << "contains invalid byte sequences"
return errors
end
# Character count validation (for display)
if input.length > max_chars
errors << "too many characters (#{input.length}/#{max_chars})"
end
# Byte count validation (for storage)
if input.bytesize > max_bytes
errors << "too many bytes (#{input.bytesize}/#{max_bytes})"
end
# Check for dangerous characters
if input.match?(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/)
errors << "contains control characters"
end
errors
end
end
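Where repairing input is acceptable, as opposed to rejecting it outright, Ruby's String#scrub and String#unicode_normalize can bring text into a predictable form before length checks run. The helper name normalize_text is hypothetical, and the choice of "?" as the replacement and NFC as the normalization form are assumptions for this sketch.

```ruby
# Hypothetical helper: replaces invalid bytes, then folds combining sequences
# into their composed form so that character counts are stable.
def normalize_text(input)
  input.scrub("?").unicode_normalize(:nfc)
end

# "cafe\u0301" is "e" followed by a combining acute accent: 5 characters.
# After NFC normalization it becomes "caf\u00e9": 4 characters.
```

Normalizing before validation means that two visually identical strings are measured and compared the same way. For security-sensitive fields, rejecting invalid byte sequences may still be the safer policy than replacing them.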
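Regex performance pitfalls arise from catastrophic backtracking in complex patterns. Nested quantifiers such as (a+)+ can take exponential time on input that almost matches. Ruby 3.2 and later reduce the risk with match caching and a configurable Regexp.timeout, but avoiding ambiguous patterns remains the primary defense.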
# DANGEROUS: Catastrophic backtracking vulnerability
VULNERABLE_REGEX = /^(a+)+b$/
def vulnerable_validation(input)
# This can take exponential time for inputs like "aaaaaaaaaaaaaaaaaac"
start_time = Time.now
result = input.match?(VULNERABLE_REGEX)
end_time = Time.now
puts "Validation took #{end_time - start_time} seconds"
result
end
# Safe: More efficient regex design
SAFE_REGEX = /\Aa+b\z/
def safe_validation(input)
input.match?(SAFE_REGEX)
end
# Testing with potentially problematic input
problematic_input = "a" * 20 + "c" # No 'b' at end
puts "Testing vulnerable regex..."
vulnerable_validation(problematic_input)
puts "Testing safe regex..."
start_time = Time.now
result = safe_validation(problematic_input)
end_time = Time.now
puts "Safe validation took #{end_time - start_time} seconds: #{result}"
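Two inexpensive safeguards can sit in front of any pattern: a cap on input length and, where the runtime supports it, a global match timeout. The sketch below assumes a hypothetical helper named bounded_match? and an arbitrary limit of 256 characters; Regexp.timeout= is only available on Ruby 3.2 and later, so the code checks for it before use.

```ruby
# Regexp.timeout= exists on Ruby 3.2 and later; the guard keeps older versions working.
# Note that this setting is global and affects every match in the process.
Regexp.timeout = 1.0 if Regexp.respond_to?(:timeout=)

MAX_PATTERN_INPUT = 256 # hypothetical cap; choose one that fits the field

# Hypothetical helper: refuses oversized input and treats a timed-out match as a failure.
def bounded_match?(input, pattern, max_length: MAX_PATTERN_INPUT)
  return false unless input.is_a?(String) && input.length <= max_length

  input.match?(pattern)
rescue StandardError => e
  # Regexp::TimeoutError is only defined on Ruby 3.2+, so compare by class name.
  raise unless e.class.name == "Regexp::TimeoutError"

  false
end
```

The length cap bounds the work even for a poorly written pattern, and the timeout acts as a backstop. Neither replaces rewriting an ambiguous pattern.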
State-dependent validation errors occur when validation depends on external state that can change between validation and use. This creates race conditions and time-of-check-time-of-use vulnerabilities.
# Problematic: Validation depends on mutable external state
class ProblematicValidator
def initialize
@blacklisted_domains = load_blacklist_from_file # Mutable external state
end
def validate_email_domain(email)
domain = email.split('@').last
!@blacklisted_domains.include?(domain) # State can change!
end
private
def load_blacklist_from_file
# External file can be modified between validation calls
File.readlines('blacklist.txt').map(&:strip)
rescue
[]
end
end
# Better: Immutable validation with explicit state management
class RobustValidator
def initialize(blacklist_source = nil)
@blacklist_snapshot = create_blacklist_snapshot(blacklist_source)
@blacklist_loaded_at = Time.now
end
def validate_email_domain(email, refresh_blacklist: false)
refresh_blacklist! if refresh_blacklist || blacklist_expired?
domain = extract_domain(email)
return false unless domain
!@blacklist_snapshot.include?(domain)
end
def blacklist_age
Time.now - @blacklist_loaded_at
end
private
def create_blacklist_snapshot(source)
# Create immutable snapshot of validation rules
case source
when Array then source.dup.freeze
when String then File.readlines(source).map(&:strip).freeze
when nil then [].freeze
else raise ArgumentError, "Invalid blacklist source"
end
rescue SystemCallError, IOError => e # file errors only; an invalid source type still raises
warn "Failed to load blacklist: #{e.message}"
[].freeze
end
def blacklist_expired?
blacklist_age > 3600 # 1 hour expiry
end
def refresh_blacklist!
@blacklist_snapshot = create_blacklist_snapshot('blacklist.txt')
@blacklist_loaded_at = Time.now
end
def extract_domain(email)
return nil unless email.is_a?(String) && email.include?('@')
parts = email.split('@')
return nil unless parts.length == 2
domain = parts.last.downcase
return nil if domain.empty?
domain
end
end
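A snapshot narrows the window, but the general remedy for check-then-act races is to make the check and the action a single atomic operation, for example a unique constraint in a database. The sketch below illustrates the same principle with the file system as a stand-in: File::EXCL makes creation fail if the file already exists, so there is no gap between checking and claiming. The helper name claim_name and the .claim file convention are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: returns true if this call created the marker file,
# false if the name is malformed or was already claimed.
def claim_name(directory, name)
  return false unless name.is_a?(String) && name.match?(/\A[a-z0-9_]{3,20}\z/)

  path = File.join(directory, "#{name}.claim")
  # CREAT | EXCL asks the operating system to create the file only if it is absent.
  File.open(path, File::WRONLY | File::CREAT | File::EXCL) { |file| file.write(Process.pid.to_s) }
  true
rescue Errno::EEXIST
  false
end
```

A version that called File.exist? first and then wrote the file would let two processes both pass the check before either one writes. Here the second caller fails at the moment of creation instead.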
Reference
Core Validation Methods
Method | Parameters | Returns | Description |
---|---|---|---|
Integer(value) | value (Object) | Integer | Converts value to an integer; raises ArgumentError for invalid strings and TypeError for nil |
Float(value) | value (Object) | Float | Converts value to a float; raises ArgumentError for invalid strings and TypeError for nil |
String(value) | value (Object) | String | Converts value to a string using to_str or to_s; succeeds for almost any object, so it does not validate |
String#match?(pattern) | pattern (Regexp/String) | Boolean | Returns true if string matches pattern, false otherwise |
String#empty? | None | Boolean | Returns true if string length is zero |
String#length | None | Integer | Returns character count in string |
String#bytesize | None | Integer | Returns byte count in string |
String#valid_encoding? | None | Boolean | Returns true if string contains valid byte sequence |
String#ascii_only? | None | Boolean | Returns true if string contains only ASCII characters |
Regular Expression Anchors
Anchor | Behavior | Security Impact |
---|---|---|
^ | Start of line (allows multiline bypass) | Dangerous - vulnerable to injection |
$ | End of line (allows multiline bypass) | Dangerous - vulnerable to injection |
\A | Start of string (absolute) | Safe - prevents multiline bypass |
\z | End of string (absolute) | Safe - prevents multiline bypass |
\Z | End of string or before final newline | Potentially unsafe |
Validation Error Classes
Exception | Inheritance | Common Use Cases |
---|---|---|
ArgumentError | StandardError | Invalid method arguments, type mismatches |
TypeError | StandardError | Unexpected object types |
RangeError | StandardError | Values outside acceptable ranges |
EncodingError | StandardError | String encoding problems |
Date::Error | ArgumentError | Invalid date/time formats |
JSON::ParserError | JSON::JSONError (a StandardError) | Malformed JSON data |
URI::InvalidURIError | URI::Error (a StandardError) | Invalid URI formats |
Type Checking Methods
Method | Behavior | Error Handling |
---|---|---|
obj.is_a?(Class) | Checks class, superclasses, and included modules | Returns boolean |
obj.kind_of?(Class) | Alias for is_a? | Returns boolean |
obj.instance_of?(Class) | Checks exact class only | Returns boolean |
obj.respond_to?(method) | Checks method availability | Returns boolean |
obj.class | Returns object's class | Never fails |
String Validation Patterns
Pattern | Regular Expression | Use Case |
---|---|---|
Email | /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i | Email address format |
Username | /\A[a-zA-Z0-9_]{3,20}\z/ | Alphanumeric usernames |
Phone | /\A\+?[\d\s\-\(\)]{10,15}\z/ | Phone number format |
URL | /\Ahttps?:\/\/[\S]+\z/ | Basic URL validation |
UUID | /\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i | UUID format |
IP Address | /\A(?:[0-9]{1,3}\.){3}[0-9]{1,3}\z/ | IPv4 address shape only (does not check that each octet is 0-255) |
Hex Color | /\A#?[0-9a-f]{6}\z/i | Hexadecimal color |
Numeric Validation Ranges
Data Type | Minimum Value | Maximum Value | Ruby Method |
---|---|---|---|
Integer | Unbounded (arbitrary precision) | Unbounded (arbitrary precision) | Integer() |
Float | -Float::INFINITY | Float::INFINITY | Float() |
Age | 0 | 150 | Custom validation |
Percentage | 0.0 | 100.0 | Range#cover? |
Port Number | 1 | 65535 | Integer() + range |
HTTP Status | 100 | 599 | Custom validation |
Performance Characteristics
Operation | Complexity | Performance Notes |
---|---|---|
String length check | O(1) to O(n) | bytesize is constant time; length may scan strings that contain multibyte characters |
Regex matching | O(n) to O(2^n) | Depends on pattern complexity |
Integer conversion | O(n) | Linear with string length |
Hash key lookup | O(1) average | For validation rule caching |
Array inclusion | O(n) | Linear search through elements |
Set inclusion | O(1) average | Use for large validation lists |
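The difference between the last two rows matters once a validation list grows beyond a handful of entries. As a small sketch, a reserved-name check backed by Set might look like this; RESERVED_USERNAMES and reserved_username? are hypothetical names, and the entries are placeholders.

```ruby
require 'set'

# Hash-based membership test instead of a linear scan through an Array.
RESERVED_USERNAMES = Set.new(%w[admin root support]).freeze

# Case-insensitive lookup; nil and other non-strings are converted with to_s.
def reserved_username?(name)
  RESERVED_USERNAMES.include?(name.to_s.downcase)
end
```

For three entries the difference is negligible, but the same code keeps its lookup cost roughly constant when the list holds thousands of names.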
Common Validation Patterns
# Email validation with comprehensive checks
def validate_email(email)
return false unless email.is_a?(String)
return false if email.length > 254 # RFC limit
return false unless email.match?(/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i)
local, domain = email.split('@')
return false if local.length > 64 # RFC limit
return false if domain.length > 253 # RFC limit
true
end
# Numeric range validation
def validate_in_range(value, min, max)
num = Float(value)
(min..max).cover?(num)
rescue ArgumentError, TypeError # Float(nil) raises TypeError
false
end
# Collection validation
def validate_array_of_strings(array, max_length: 100)
return false unless array.is_a?(Array)
return false if array.length > max_length
array.all? { |item| item.is_a?(String) && !item.empty? }
end