Overview
Ruby provides several built-in methods for converting string case through the String class. These methods handle ASCII characters directly and delegate to Unicode algorithms for international characters. The core methods include upcase, downcase, capitalize, and swapcase, each returning a new string with transformed character casing.
text = "Hello World"
text.upcase # => "HELLO WORLD"
text.downcase # => "hello world"
text.capitalize # => "Hello world"
Ruby's case conversion operates on the string's encoding, applying transformation rules based on Unicode standards for non-ASCII characters. The methods preserve the original string's encoding and handle multibyte characters according to their Unicode properties.
text = "café"
text.upcase # => "CAFÉ"
text.downcase # => "café"
# Works with various encodings
text.encode("ISO-8859-1").upcase # => "CAFÉ" (in ISO-8859-1)
String case conversion methods are non-destructive by default, returning new String objects rather than modifying the original. Destructive variants with a ! suffix modify the string in place when possible.
original = "Mixed Case"
converted = original.upcase # original unchanged
original.upcase! # modifies original
Basic Usage
The upcase method converts all lowercase characters to uppercase equivalents. ASCII characters a-z transform to A-Z, while non-ASCII characters follow Unicode case mapping rules.
"hello".upcase # => "HELLO"
"Hello World".upcase # => "HELLO WORLD"
"naïve résumé".upcase # => "NAÏVE RÉSUMÉ"
The downcase method performs the inverse operation, converting uppercase characters to lowercase. The method handles complex Unicode transformations including characters that expand during conversion.
"HELLO".downcase # => "hello"
"RÉSUMÉ".downcase # => "résumé"
"İSTANBUL".downcase # => "i̇stanbul" (İ expands to i plus combining dot above, U+0307)
The capitalize method converts the first character to uppercase and all remaining characters to lowercase. This differs from title case, which capitalizes each word.
"hello world".capitalize # => "Hello world"
"HELLO WORLD".capitalize # => "Hello world"
"mary o'connor".capitalize # => "Mary o'connor"
The swapcase method inverts the case of each character, converting uppercase to lowercase and lowercase to uppercase.
"Hello World".swapcase # => "hELLO wORLD"
"ABC123def".swapcase # => "abc123DEF"
Each method includes a destructive variant that modifies the original string in place. These variants return nil if no changes occur and raise FrozenError if the string is frozen.
str = "hello"
result = str.upcase! # str becomes "HELLO", returns "HELLO"
frozen_str = "hello".freeze
frozen_str.upcase! # raises FrozenError
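The nil return value is easy to trip over when chaining or assigning the result of a destructive method. The following is a minimal sketch of the behavior; the helper name shout is hypothetical and exists only for illustration.

```ruby
# Destructive variants return nil when the string is already in the target case.
already_upper = String.new("HELLO")
puts already_upper.upcase!.inspect   # nil (nothing changed)

needs_change = String.new("hello")
puts needs_change.upcase!            # HELLO (returns the receiver)

# Hypothetical helper: ignore the bang method's return value and keep
# using the receiver, so a nil result can never leak out.
def shout(text)
  copy = text.dup   # dup is never frozen, so upcase! cannot raise here
  copy.upcase!      # return value intentionally ignored
  copy
end

puts shout("hello")   # HELLO
puts shout("HELLO")   # HELLO
```

Writing `name = name.upcase!` is the usual way this goes wrong, because name becomes nil whenever it was already uppercase.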
Advanced Usage
Case conversion methods accept optional mapping options for language-specific transformations. The :turkic option applies the dotted and dotless i rules used by Turkic languages such as Turkish and Azerbaijani. The :lithuanian option is accepted but currently behaves the same as the default full Unicode mapping.
# Turkic I conversion
"İstanbul".downcase(:turkic) # => "istanbul" (İ becomes dotted i)
"istanbul".upcase(:turkic) # => "İSTANBUL" (i becomes dotted İ)
# :lithuanian currently applies the default Unicode mapping
"Į́".downcase(:lithuanian) # => "į́"
Method chaining enables complex transformations by combining multiple case operations with other string methods.
" MIXED case STRING "
  .strip
  .downcase
  .capitalize # => "Mixed case string"
# Transform and validate
input = "EMAIL@DOMAIN.COM"
normalized = input.downcase.strip
valid = normalized.match?(/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/)
Custom case conversion patterns combine built-in methods with string manipulation for specialized formatting requirements.
# Snake case to title case conversion
def snake_to_title(str)
  str.split('_').map(&:capitalize).join(' ')
end
snake_to_title("first_name_field") # => "First Name Field"
# Camel case to sentence case
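def camel_to_sentence(str)
  str
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1 \2') # split an acronym from the word that follows it
    .gsub(/([a-z\d])([A-Z])/, '\1 \2') # split a lowercase letter or digit from an uppercase letter
    .capitalize
end
camel_to_sentence("XMLHttpRequest") # => "Xml http request"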
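The same two-step splitting idea extends to other identifier styles. The sketch below assumes plain ASCII identifiers; to_snake_case and to_lower_camel_case are hypothetical helper names (ActiveSupport ships comparable methods, underscore and camelize, for Rails projects).

```ruby
# "XMLHttpRequest" -> "xml_http_request"
def to_snake_case(str)
  str
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')  # acronym followed by a capitalized word
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')      # lowercase or digit followed by uppercase
    .downcase
end

# "first_name_field" -> "firstNameField"
def to_lower_camel_case(str)
  head, *tail = str.split('_')
  ([head.to_s.downcase] + tail.map(&:capitalize)).join
end

puts to_snake_case("XMLHttpRequest")          # xml_http_request
puts to_lower_camel_case("first_name_field")  # firstNameField
```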
Regular expressions with case conversion enable selective transformations based on patterns or position within the string.
# Capitalize after sentence-ending punctuation
text = "hello. world! how are you?"
text.gsub(/(?<=[.!?]\s)[a-z]/) { |match| match.upcase }
# => "hello. World! How are you?"
# Convert acronyms to title case
text = "HTTP API and XML parser"
text.gsub(/\b[A-Z]{2,}\b/) { |acronym| acronym.capitalize }
# => "Http Api and Xml parser"
Enumerable methods combine with case conversion for batch string processing operations.
headers = ["FIRST_NAME", "LAST_NAME", "EMAIL_ADDRESS"]
formatted = headers.map { |h| h.split('_').map(&:capitalize).join(' ') }
# => ["First Name", "Last Name", "Email Address"]
# Case-insensitive grouping
words = ["Apple", "BANANA", "apple", "Banana"]
grouped = words.group_by(&:downcase)
# => {"apple"=>["Apple", "apple"], "banana"=>["BANANA", "Banana"]}
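Counting, de-duplicating, and sorting follow the same pattern of using the downcased form as the comparison key while keeping the original spelling. This is a small sketch; case_insensitive_summary is a hypothetical helper name, and tally assumes Ruby 2.7 or later.

```ruby
def case_insensitive_summary(words)
  {
    counts: words.map(&:downcase).tally,           # occurrences per downcased word
    unique: words.uniq(&:downcase),                # first spelling of each word wins
    sorted: words.sort_by { |w| [w.downcase, w] }  # ignore case, break ties by raw spelling
  }
end

summary = case_insensitive_summary(["Apple", "BANANA", "apple", "Banana"])
p summary[:unique]   # ["Apple", "BANANA"]
p summary[:sorted]   # ["Apple", "apple", "BANANA", "Banana"]
```

The tie-breaker in sort_by is there because sort_by is not guaranteed to be stable, so words that differ only in case would otherwise come out in an unspecified order.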
Common Pitfalls
Unicode normalization affects case conversion results when strings contain combining characters or multiple representations of the same visual character.
# Different Unicode representations
str1 = "café" # é as single character (U+00E9)
str2 = "cafe\u0301" # e + combining acute accent
str1.length # => 4
str2.length # => 5
str1.upcase # => "CAFÉ"
str2.upcase # => "CAFÉ"
# Normalize before comparison
str1.unicode_normalize == str2.unicode_normalize # => true
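For case-insensitive comparison of user-visible text, normalization and case folding can be combined. The sketch below is one possible approach, not the only one; same_text? is a hypothetical helper name, and it relies on the :fold option that downcase accepts.

```ruby
# Normalize to NFC first, then apply Unicode case folding.
def same_text?(a, b)
  fold = ->(s) { s.unicode_normalize(:nfc).downcase(:fold) }
  fold.call(a) == fold.call(b)
end

composed   = "caf\u00E9"    # é as a single code point
decomposed = "cafe\u0301"   # e followed by a combining acute accent

puts composed == decomposed               # false (different code points)
puts same_text?(composed, decomposed)     # true
puts same_text?("CAFE\u0301", composed)   # true
```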
Encoding mismatches cause unexpected results when strings contain non-ASCII characters in incompatible encodings.
# UTF-8 string with accented characters
utf8_str = "résumé".encode("UTF-8")
# Transcode to Latin-1; raises Encoding::UndefinedConversionError for characters Latin-1 cannot represent
latin1_str = utf8_str.encode("ISO-8859-1")
latin1_str.upcase # Works correctly: "RÉSUMÉ" (in ISO-8859-1)
# Relabeling the bytes without transcoding breaks case conversion
broken = utf8_str.dup.force_encoding("ASCII-8BIT") # dup keeps utf8_str itself intact
broken.upcase # => "R\xC3\xA9SUM\xC3\xA9" (only ASCII letters are converted)
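When the bytes are known to be UTF-8 but carry the wrong encoding label, relabeling and validating before conversion avoids the partial result. This is a sketch under that assumption; safe_upcase is a hypothetical helper name.

```ruby
# Relabel the bytes with the assumed encoding, verify them, then convert.
def safe_upcase(str, assumed_encoding = Encoding::UTF_8)
  text = str.dup.force_encoding(assumed_encoding)
  unless text.valid_encoding?
    raise ArgumentError, "invalid #{assumed_encoding.name} byte sequence"
  end
  text.upcase
end

binary = "r\u00E9sum\u00E9".b   # same bytes, labeled ASCII-8BIT
puts binary.upcase.inspect      # "R\xC3\xA9SUM\xC3\xA9" (accented bytes untouched)
puts safe_upcase(binary)        # RÉSUMÉ
```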
Locale-sensitive transformations require an explicit option. Ruby does not consult the system locale for case conversion, so the default mapping is the same on every machine and never applies Turkic rules on its own.
# The system locale has no effect on case conversion
turkish_text = "İstanbul"
# Default full Unicode mapping
turkish_text.downcase # => "i̇stanbul" (i plus combining dot above)
# Turkic rules must be requested explicitly
turkish_text.downcase(:turkic) # => "istanbul"
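Besides the Turkic rules, the conversion methods accept a few other mapping options that are worth knowing when behavior must be explicit. The sketch below shows :turkic, :ascii, and :fold; normalize_for_comparison is a hypothetical helper name.

```ruby
# :turkic applies the dotted/dotless i rules.
puts "I".downcase(:turkic)   # ı (dotless)
puts "i".upcase(:turkic)     # İ (dotted)

# :ascii converts only A-Z / a-z and leaves every other letter alone.
puts "ÀB".downcase(:ascii)   # Àb

# :fold (downcase only) applies Unicode case folding, which is meant for
# case-insensitive comparison rather than for display.
def normalize_for_comparison(text)
  text.downcase(:fold)
end

puts normalize_for_comparison("Straße")    # strasse
puts normalize_for_comparison("STRASSE")   # strasse
```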
Case conversion with special characters encounters edge cases where Unicode defines complex mapping rules.
# German sharp s (ß) conversion
"Straße".upcase # => "STRASSE" (ß becomes SS)
"STRASSE".downcase # => "strasse" (cannot reverse)
# One-to-many character mappings
"\uFB04".upcase # => "FFL" (the single ligature character ﬄ expands to three letters)
Frozen string literals prevent in-place modifications, causing destructive methods to raise exceptions rather than silently failing.
# frozen_string_literal: true
str = "hello"
str.frozen? # => true (literal is frozen)
str.upcase # => "HELLO" (returns new string)
str.upcase! # raises FrozenError
Character boundaries in multibyte encodings require careful handling when manipulating strings byte-by-byte.
utf8_string = "café"
# Incorrect: slicing at a byte offset inside a character
utf8_string.byteslice(0, 4) # => "caf\xC3" (cuts é in half, invalid UTF-8)
# Correct: using character-aware methods
utf8_string[0, 4] # => "café"
utf8_string.chars.take(4).join # => "café"
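When a byte limit really is required, for example a fixed-width storage column, one option is to slice by bytes and then discard any partial character. The sketch below assumes the input is valid UTF-8 to begin with; truncate_bytes is a hypothetical helper name.

```ruby
# Cut to a byte budget, then drop the partial character left at the end.
# Note: scrub removes every invalid byte sequence, not only a trailing one.
def truncate_bytes(str, max_bytes)
  str.byteslice(0, max_bytes).scrub("")
end

puts truncate_bytes("café", 4)          # caf  (the half-cut é is removed)
puts truncate_bytes("café", 5)          # café
puts truncate_bytes("café", 4).upcase   # CAF
```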
Performance & Memory
Case conversion performance varies significantly between ASCII-only strings and those containing multibyte Unicode characters.
require 'benchmark'
ascii_string = "hello world" * 1000
unicode_string = "héllo wørld" * 1000
Benchmark.bm do |bm|
  bm.report("ASCII upcase") { 1000.times { ascii_string.upcase } }
  bm.report("Unicode upcase") { 1000.times { unicode_string.upcase } }
end
# Sample output (exact timings depend on hardware and Ruby version):
# ASCII upcase: 0.012000 0.000000 0.012000 ( 0.012345)
# Unicode upcase: 0.089000 0.001000 0.090000 ( 0.091234)
Destructive methods never fall back to allocating a new object. Called on a frozen string, they raise FrozenError, so in-place conversion requires a mutable string, while the non-destructive methods always allocate a new one.
# Frozen strings cannot be converted in place
frozen_str = "hello".freeze
result = frozen_str.upcase # Allocates and returns a new string
# frozen_str.upcase! would raise FrozenError
# Mutable strings are modified in place
mutable_str = +"hello" # Unary plus returns a mutable string (a copy only if the receiver is frozen)
mutable_str.upcase! # Modifies existing object
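Object identity makes the difference visible. The sketch below uses equal? to check whether a copy was made; in_place_upcase is a hypothetical helper name.

```ruby
# Upcase in place when possible, falling back to a copy for frozen input.
def in_place_upcase(str)
  target = +str     # str itself when mutable, an unfrozen copy when frozen
  target.upcase!
  target
end

mutable = String.new("hello")
frozen  = "hello".freeze

puts in_place_upcase(mutable).equal?(mutable)   # true  (same object, modified in place)
puts in_place_upcase(frozen).equal?(frozen)     # false (a copy was made)
puts frozen                                     # hello (unchanged)
```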
Large string processing benefits from streaming approaches that process data in chunks rather than loading entire strings into memory.
# Memory-efficient processing of large files
def process_large_file(filename)
  File.open(filename, 'r') do |file|
    file.each_line do |line|
      processed = line.strip.downcase
      # Process line immediately rather than accumulating
      yield processed
    end
  end
end
# Batch processing with controlled memory usage
def process_in_batches(strings, batch_size = 1000)
  strings.each_slice(batch_size) do |batch|
    results = batch.map(&:upcase)
    # Process batch results immediately
    yield results
    GC.start if rand < 0.1 # Periodic garbage collection
  end
end
String pooling reduces memory usage when processing many strings with repeated case conversion patterns.
class StringCaseConverter
  def initialize
    @cache = {}
  end
  def upcase_cached(str)
    @cache[str] ||= str.upcase
  end
  def clear_cache
    @cache.clear
  end
end
converter = StringCaseConverter.new
# Repeated conversions use cached results
1000.times { converter.upcase_cached("same string") } # Only converts once
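A cache like this grows without bound if the set of input strings is large. One way to cap it, sketched below, is to evict the oldest entry once a size limit is reached; BoundedCaseCache and its max_size parameter are hypothetical names, and the eviction policy is simple insertion order rather than true LRU.

```ruby
class BoundedCaseCache
  def initialize(max_size = 1_000)
    @max_size = max_size
    @cache = {}
  end

  def upcase(str)
    return @cache[str] if @cache.key?(str)

    # Ruby hashes preserve insertion order, so shift removes the oldest entry.
    @cache.shift if @cache.size >= @max_size
    @cache[str] = str.upcase.freeze   # frozen so callers cannot mutate cached values
  end

  def size
    @cache.size
  end
end

cache = BoundedCaseCache.new(2)
%w[a b c].each { |s| cache.upcase(s) }
puts cache.size   # 2
```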
Production Patterns
Web applications commonly normalize user input through case conversion to ensure consistent data storage and comparison operations.
class UserRegistration
  def normalize_email(email)
    email.to_s.strip.downcase
  end
  def format_name(name)
    name.to_s.strip.split.map(&:capitalize).join(' ')
  end
  def normalize_username(username)
    username.to_s.strip.downcase.gsub(/[^a-z0-9_]/, '')
  end
end
# Usage in controller
def create_user
  registration = UserRegistration.new
  # Use a separate local name: assigning to params would shadow the controller's params method
  attributes = {
    email: registration.normalize_email(params[:email]),
    name: registration.format_name(params[:name]),
    username: registration.normalize_username(params[:username])
  }
  User.create(attributes)
end
Database queries with case conversion enable case-insensitive searches while preserving original data formatting.
class Product < ActiveRecord::Base
  scope :search_by_name, ->(query) {
    where("LOWER(name) LIKE ?", "%#{query.to_s.downcase}%")
  }
  def self.find_by_sku_ignore_case(sku)
    where("UPPER(sku) = ?", sku.to_s.upcase).first
  end
end
# Usage
products = Product.search_by_name("iPhone") # Finds "iPhone", "IPHONE", etc.
product = Product.find_by_sku_ignore_case("abc123") # Case-insensitive SKU lookup
API serialization standardizes output format through consistent case conversion patterns.
class ApiSerializer
  def self.serialize_keys(hash)
    case Rails.application.config.api_key_format
    when :snake_case
      hash.transform_keys { |key| key.to_s.underscore }
    when :camel_case
      hash.transform_keys { |key| key.to_s.camelize(:lower) }
    when :kebab_case
      hash.transform_keys { |key| key.to_s.dasherize }
    else
      hash
    end
  end
  def self.serialize_values(hash)
    hash.transform_values do |value|
      case value
      when String
        value.strip
      when Hash
        serialize_keys(serialize_values(value))
      else
        value
      end
    end
  end
end
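Outside Rails, the same key transformation can be approximated with core Ruby only. The sketch below recursively converts snake_case keys to lowerCamelCase strings; camelize_key and deep_camelize_keys are hypothetical helper names, and the conversion assumes simple ASCII keys.

```ruby
# "email_address" -> "emailAddress"
def camelize_key(key)
  head, *tail = key.to_s.split('_')
  ([head.to_s] + tail.map(&:capitalize)).join
end

# Walk nested hashes and arrays, rewriting every hash key.
def deep_camelize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), result|
      result[camelize_key(key)] = deep_camelize_keys(inner)
    end
  when Array
    value.map { |item| deep_camelize_keys(item) }
  else
    value
  end
end

payload = { user_id: 1, contact_info: { email_address: "a@example.com" } }
puts deep_camelize_keys(payload).keys.inspect   # ["userId", "contactInfo"]
```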
Logging systems apply case conversion for consistent log parsing and filtering.
class ApplicationLogger
  def self.normalize_level(level)
    level.to_s.upcase.to_sym
  end
  def self.log(level, message, **metadata)
    normalized_level = normalize_level(level)
    log_entry = {
      level: normalized_level,
      message: message.to_s,
      timestamp: Time.current.iso8601,
      metadata: metadata.transform_keys { |k| k.to_s.downcase.to_sym }
    }
    # public_send limits dispatch to the logger's public methods (info, warn, error, ...)
    Rails.logger.public_send(normalized_level.downcase, log_entry.to_json)
  end
end
# Usage
ApplicationLogger.log(:info, "User created", USER_ID: 123, EMAIL: "user@example.com")
# Logs JSON with consistent casing: "level":"INFO" and "metadata":{"user_id":123,"email":"user@example.com"}
Background job processing standardizes parameter handling through case conversion middleware.
class ParameterNormalizationJob
  include Sidekiq::Job
  def perform(*args)
    normalized_args = args.map { |arg| normalize_parameter(arg) }
    process_with_normalized_parameters(normalized_args)
  end
  private
  def normalize_parameter(param)
    case param
    when Hash
      param.transform_keys { |k| k.to_s.underscore.to_sym }
           .transform_values { |v| normalize_parameter(v) }
    when String
      param.strip
    else
      param
    end
  end
  def process_with_normalized_parameters(args)
    # Process with consistently formatted parameters
  end
end
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#upcase | None required | String | Returns string with lowercase characters converted to uppercase |
#upcase! | None required | String or nil | Converts lowercase characters to uppercase in place |
#downcase | None required | String | Returns string with uppercase characters converted to lowercase |
#downcase! | None required | String or nil | Converts uppercase characters to lowercase in place |
#capitalize | None required | String | Returns string with first character uppercase, rest lowercase |
#capitalize! | None required | String or nil | Capitalizes first character, lowercases rest in place |
#swapcase | None required | String | Returns string with case of each character inverted |
#swapcase! | None required | String or nil | Inverts case of each character in place |
Locale-Aware Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#upcase(:locale) | :turkic, :lithuanian | String | Converts to uppercase using locale-specific rules |
#downcase(:locale) | :turkic, :lithuanian | String | Converts to lowercase using locale-specific rules |
Behavior Rules
Condition | Non-destructive Methods | Destructive Methods |
---|---|---|
String is mutable | Returns new string object | Modifies original, returns self |
String is frozen | Returns new string object | Raises FrozenError |
No changes needed | Returns new identical string | Returns nil |
Empty string | Returns empty string | Returns nil |
Unicode Considerations
Character Type | Behavior | Example |
---|---|---|
ASCII a-z, A-Z | Direct mapping | a ↔ A |
Latin accented | Unicode case mapping | é ↔ É |
Turkish I/i | Option-dependent | İ → i (with :turkic), i̇ (default: i plus combining dot above) |
German ß | Expands on upcase | ß → SS |
Ligatures | May expand | ﬄ (U+FB04) → FFL |
Combining chars | Preserves combinations | e + ◌́ → E + ◌́ |
Error Conditions
Error | Cause | Solution |
---|---|---|
FrozenError | Destructive method on frozen string | Use non-destructive variant |
Encoding::CompatibilityError | Incompatible encoding operations | Ensure compatible encodings |
ArgumentError | Invalid locale parameter | Use supported locale symbols |
Performance Characteristics
Operation | ASCII Performance | Unicode Performance | Memory Impact |
---|---|---|---|
#upcase | O(n) fast | O(n) slower | New string allocated |
#upcase! | O(n) fast | O(n) slower | In-place when mutable |
Large strings | Linear scaling | Linear scaling | Memory proportional to size |
Repeated operations | No optimization | No optimization | Consider caching results |