Overview
OpenURI provides URI.open, which handles HTTP, HTTPS, and FTP URIs transparently. Older Ruby versions also extended Kernel#open in the same way, but that behavior was deprecated in Ruby 2.7 and removed in Ruby 3.0, so new code should call URI.open. The library converts these URIs into readable IO objects, making remote resource access as simple as file operations. OpenURI wraps the underlying Net::HTTP and Net::FTP libraries, providing a unified interface for different protocol types.
The library adds several key capabilities to URI handling. When opening HTTP or HTTPS URIs, OpenURI returns a StringIO object for small response bodies and a Tempfile for larger ones (the switch happens at about 10 KB), along with metadata accessible through additional methods. The response object includes headers, status information, and the base URI for handling redirects. For FTP URIs, OpenURI provides direct access to remote files through the same interface.
require 'open-uri'
# HTTP request returning StringIO with response body
response = URI.open('https://api.github.com/users/octocat')
puts response.read
# => {"login":"octocat","id":1,"node_id":"MDQ6VXNlcjE=", ...}
# Access response metadata
p response.status
# => ["200", "OK"]
puts response.content_type
# => "application/json; charset=utf-8"
OpenURI follows redirects automatically and raises an error when it detects a redirection loop; recent versions also cap the number of hops through the max_redirects option. The final URI after redirects is available through the base_uri method, which matters when resolving relative links in HTML or when an API redirects to a different endpoint.
# Handling redirected responses
response = URI.open('https://github.com/ruby/ruby')
p response.base_uri
# => #<URI::HTTPS https://github.com/ruby/ruby>
# Original URI vs final URI after redirect
original_uri = URI('https://git.io/ruby')
response = original_uri.open
puts "Original: #{original_uri}"
puts "Final: #{response.base_uri}"
The library integrates seamlessly with existing Ruby IO operations. Response objects respond to standard IO methods like read, readline, each_line, and rewind, making them compatible with any code expecting IO input. This design allows treating remote resources identically to local files in many contexts.
# Processing response line by line
URI.open('https://raw.githubusercontent.com/ruby/ruby/master/README.md') do |response|
  response.each_line.with_index do |line, index|
    puts "Line #{index + 1}: #{line.chomp}"
    break if index >= 5 # First 6 lines only
  end
end
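Because the returned object behaves like an IO, helper code can be written once and exercised without a network connection. The sketch below is an illustration under assumptions: first_lines is a hypothetical helper, and the sample text stands in for a response body. It feeds the helper a StringIO, the same class OpenURI returns for small responses.

```ruby
require 'stringio'

# Hypothetical helper: returns the first `count` lines of any IO-like object,
# with trailing newlines removed.
def first_lines(io, count)
  io.each_line.first(count).map(&:chomp)
end

sample = StringIO.new("alpha\nbeta\ngamma\ndelta\n")
preview = first_lines(sample, 2)
puts preview.inspect

# rewind resets the read position, so the same object can be read again
sample.rewind
all_lines = first_lines(sample, 10)
puts all_lines.length
```

The same helper should accept the object yielded by URI.open, since it relies only on each_line.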
Basic Usage
OpenURI supports two primary access patterns: direct URI opening and block-based resource management. The URI.open method accepts a URI string or URI object and returns an IO-like object containing the response. The library handles protocol detection automatically based on the URI scheme.
require 'open-uri'
# Direct access pattern
response = URI.open('https://httpbin.org/get')
content = response.read
headers = response.meta
response.close
# Block pattern with automatic cleanup
URI.open('https://httpbin.org/json') do |response|
  data = response.read
  puts "Content-Type: #{response.content_type}"
  puts "Status: #{response.status.join(' ')}"
  # Response automatically closed at block end
end
Request customization occurs through options passed to URI.open. OpenURI supports HTTP headers, Basic authentication, redirect control, and timeout configurations. Headers pass as string keys matching HTTP header names, while OpenURI's own settings use symbol keys; both kinds can share one hash. User-Agent strings, Accept headers, and custom application headers integrate through this mechanism.
# Custom headers and options
options = {
  'User-Agent' => 'Ruby OpenURI Client/1.0',
  'Accept' => 'application/json',
  'Authorization' => 'Bearer token123',
  read_timeout: 30,
  redirect: false
}
response = URI.open('https://api.example.com/data', options)
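To make the key convention concrete, the following sketch separates a mixed options hash into the entries that would travel as HTTP headers (String keys) and the entries that configure the request (Symbol keys). split_open_uri_options is a hypothetical helper written for illustration; it is not part of OpenURI.

```ruby
# Illustrative helper (not part of OpenURI): separates header entries
# (String keys) from settings (all other keys, normally Symbols).
def split_open_uri_options(options)
  headers, settings = options.partition { |key, _value| key.is_a?(String) }
  [headers.to_h, settings.to_h]
end

options = {
  'User-Agent' => 'Ruby OpenURI Client/1.0',
  'Accept' => 'application/json',
  read_timeout: 30,
  redirect: false
}

headers, settings = split_open_uri_options(options)
puts "Headers:  #{headers.keys.join(', ')}"
puts "Settings: #{settings.keys.join(', ')}"
```

A split like this can be handy for logging which headers a request will send without printing timeout or redirect settings alongside them.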
Authentication mechanisms vary by requirement. For HTTP Basic Authentication, pass credentials through the http_basic_authentication option. Embedding credentials in an HTTP or HTTPS URI is not supported: OpenURI raises ArgumentError when the URI contains a userinfo part, which also keeps credentials out of logged URLs.
# URI-embedded credentials are rejected for HTTP and HTTPS:
# URI.open('https://user:password@secure.example.com/api/data') # raises ArgumentError
# Option-based authentication
auth_options = {
  http_basic_authentication: ['username', 'password']
}
response = URI.open('https://secure.example.com/api/data', auth_options)
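One caveat is easy to verify locally: OpenURI's HTTP handler rejects a URI that carries a userinfo part before it opens any connection. The sketch below assumes a placeholder host that is never contacted, and passes proxy: false only so that proxy environment variables are not consulted during the check.

```ruby
require 'open-uri'

# The host is a placeholder; the ArgumentError is raised before any
# connection attempt, so nothing is sent over the network.
outcome =
  begin
    URI.open('https://user:password@secure.example.com/api/data', proxy: false)
    :opened
  rescue ArgumentError => e
    puts "Rejected: #{e.message}"
    e.class
  end

puts outcome
```

For that reason, the http_basic_authentication option is the way to send Basic credentials over HTTP or HTTPS.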
OpenURI handles SSL/TLS connections transparently for HTTPS URIs. Certificate validation occurs automatically, and applications can adjust it through two options: ssl_verify_mode sets the verification mode, and ssl_ca_cert points to a CA certificate file or directory. Client certificates, cipher lists, and similar fine-grained TLS settings are not OpenURI options; an unrecognized symbol option raises ArgumentError, so those cases call for Net::HTTP directly.
require 'openssl'
# SSL configuration options
ssl_options = {
  ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER,
  ssl_ca_cert: '/path/to/ca-bundle.crt'
}
response = URI.open('https://secure-api.example.com/data', ssl_options)
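OpenURI validates symbol-keyed options before it makes a request, so an unsupported option fails fast instead of being silently ignored. As a small check, the sketch below passes ssl_cert, which is not among the options OpenURI recognizes; the URL and the certificate file name are placeholders and are never used.

```ruby
require 'open-uri'

# ssl_cert is not an option OpenURI recognizes, so the call fails during
# option validation, before any connection is attempted.
rejected =
  begin
    URI.open('https://secure-api.example.com/data', ssl_cert: 'client.pem')
    false
  rescue ArgumentError => e
    puts "Rejected before connecting: #{e.message}"
    true
  end

puts rejected
```

Client certificate authentication therefore needs a Net::HTTP object configured directly rather than an OpenURI option.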
Response metadata access provides detailed information about the HTTP transaction. The meta method returns a hash containing all response headers, keyed by lowercase header name, while specific methods like content_type, charset, and last_modified offer convenient access to common header values. There is no dedicated accessor for Content-Length; read it from the meta hash. Status information includes both numeric code and reason phrase.
URI.open('https://httpbin.org/response-headers?X-Custom=value') do |response|
  # All headers as hash
  headers = response.meta
  puts headers['content-type']
  # Convenience methods
  puts "Type: #{response.content_type}"
  puts "Length: #{headers['content-length']}"
  puts "Modified: #{response.last_modified}"
  puts "Status: #{response.status[0]} #{response.status[1]}"
  # Custom headers
  puts "Custom: #{headers['x-custom']}"
end
Error Handling & Debugging
OpenURI raises specific exception types for different failure conditions. Network timeouts, HTTP error responses, SSL certificate problems, and redirect loops each produce distinct exception classes. Understanding these exception patterns enables precise error handling and appropriate recovery strategies.
The primary exception hierarchy includes OpenURI::HTTPError for HTTP status errors and its subclass OpenURI::HTTPRedirect, which is raised when a redirect response arrives while automatic redirects are disabled (redirect: false), plus various Net:: exceptions for network-level problems. Timeout errors manifest as Net::ReadTimeout or Net::OpenTimeout, both subclasses of Timeout::Error, while SSL issues raise OpenSSL::SSL::SSLError or one of its subclasses.
require 'open-uri'
require 'net/http' # defines Net::ReadTimeout and Net::OpenTimeout
require 'openssl'
def fetch_with_error_handling(uri)
  URI.open(uri, read_timeout: 10)
rescue Net::ReadTimeout => e
  puts "Request timed out after 10 seconds: #{e.message}"
  nil
rescue Net::OpenTimeout => e
  puts "Connection timeout: #{e.message}"
  nil
rescue OpenURI::HTTPError => e
  puts "HTTP error #{e.io.status[0]}: #{e.io.status[1]}"
  puts "Response body: #{e.io.read}" if e.io.respond_to?(:read)
  nil
rescue OpenSSL::SSL::SSLError => e
  puts "SSL certificate error: #{e.message}"
  nil
rescue SocketError => e
  puts "Network error (DNS/connection): #{e.message}"
  nil
end
# Usage with comprehensive error handling
response = fetch_with_error_handling('https://nonexistent-domain.invalid/api')
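The class relationships behind these rescue clauses can be inspected directly, which helps when deciding how broad a rescue should be. The snippet below is a small illustration that only queries class ancestry and makes no requests.

```ruby
require 'open-uri'
require 'net/http'
require 'openssl'
require 'timeout'

# Map each exception class to its direct superclass.
classes = [
  OpenURI::HTTPError,
  OpenURI::HTTPRedirect,
  Net::OpenTimeout,
  Net::ReadTimeout,
  OpenSSL::SSL::SSLError,
  SocketError
]
parents = classes.to_h { |klass| [klass, klass.superclass] }
parents.each { |klass, parent| puts "#{klass} < #{parent}" }

# Both timeout classes descend from Timeout::Error, so a single
# `rescue Timeout::Error` clause covers connect and read timeouts.
timeouts_share_parent = [Net::OpenTimeout, Net::ReadTimeout].all? { |klass| klass < Timeout::Error }
puts timeouts_share_parent
```

Because OpenURI::HTTPRedirect inherits from OpenURI::HTTPError, a rescue clause for HTTPRedirect has to come before one for HTTPError to take effect.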
HTTP error responses require special handling because OpenURI raises exceptions for 4xx and 5xx status codes. The exception object exposes the response through the io method, allowing access to error response bodies and headers. This pattern enables processing API error messages and implementing retry logic based on specific error codes.
require 'json'
def handle_api_errors(uri)
  URI.open(uri)
rescue OpenURI::HTTPError => e
  status_code = e.io.status[0].to_i
  error_body = e.io.read
  case status_code
  when 400
    puts "Bad request: #{error_body}"
    # Parse error details for debugging
    begin
      error_details = JSON.parse(error_body)
      puts "Validation errors: #{error_details['errors']}"
    rescue JSON::ParserError
      puts "Non-JSON error response"
    end
  when 401
    puts "Authentication required"
    # Trigger credential refresh
  when 403
    puts "Access denied: #{error_body}"
  when 404
    puts "Resource not found"
  when 429
    puts "Rate limited"
    # Implement backoff strategy
  when 500..599
    puts "Server error #{status_code}: #{error_body}"
    # Log for monitoring systems
  else
    puts "Unexpected HTTP error #{status_code}"
  end
end
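To see the exception's io object in action without depending on an external API, the sketch below starts a throwaway TCP server on a random local port that answers a single request with a 400 status and a JSON body. The server, its path, and the error payload are assumptions made purely for demonstration.

```ruby
require 'open-uri'
require 'socket'
require 'json'

# Throwaway local server (demonstration only): answers one request with
# a 400 status and a JSON error body.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

server_thread = Thread.new do
  client = server.accept
  # Skip the request line and headers up to the blank line
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  body = JSON.generate('errors' => ['name is required'])
  client.write([
    'HTTP/1.1 400 Bad Request',
    'Content-Type: application/json',
    "Content-Length: #{body.bytesize}",
    'Connection: close',
    '',
    body
  ].join("\r\n"))
  client.close
end

captured =
  begin
    URI.open("http://127.0.0.1:#{port}/widgets", proxy: false).read
  rescue OpenURI::HTTPError => e
    # e.io holds the error response: status, headers, and body
    { code: e.io.status[0].to_i, errors: JSON.parse(e.io.read)['errors'] }
  end

server_thread.join
server.close
puts captured.inspect
```

The error body is read from e.io exactly as a successful response would be read, so the same parsing code can serve both paths.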
Redirect handling becomes complex when dealing with redirect loops or hop limits. OpenURI follows redirects automatically and raises OpenURI::HTTPError (or a subclass) when it detects a loop. With redirect: false it instead raises OpenURI::HTTPRedirect for every redirect response, which is the hook for implementing manual redirect logic.
def handle_redirects_manually(uri, max_redirects = 5)
  current_uri = uri
  redirect_count = 0
  loop do
    begin
      response = URI.open(current_uri, redirect: false)
      return response # Success, no redirect
    rescue OpenURI::HTTPRedirect => e
      redirect_count += 1
      if redirect_count > max_redirects
        raise "Too many redirects (#{redirect_count}): #{current_uri}"
      end
      # Extract redirect location
      location = e.io.meta['location']
      if location.nil?
        raise "Redirect without Location header"
      end
      # Resolve relative redirects
      current_uri = URI.join(current_uri.to_s, location).to_s
      puts "Redirect #{redirect_count}: #{current_uri}"
    end
  end
end
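The URI.join call is what turns a relative Location header into an absolute URL. As a quick illustration, the sketch below resolves a few made-up Location values against a made-up base URL; none of these addresses are contacted.

```ruby
require 'uri'

base = 'https://example.com/docs/guide/intro.html'

# Each key is an example of a Location form a server might send.
locations = ['/login', 'setup.html', '../api/', 'https://cdn.example.net/v2']
resolved = locations.to_h { |location| [location, URI.join(base, location).to_s] }

resolved.each { |location, target| puts "#{location.ljust(28)} -> #{target}" }
```

Absolute locations replace the base entirely, while path-only values are resolved against the directory of the current URL.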
Debugging network issues requires examining request and response details. OpenURI doesn't provide built-in logging, but you can implement request tracing by wrapping URI.open calls with debugging output. This approach helps identify network problems, header issues, and response processing errors.
def debug_request(uri, options = {})
  puts "=== REQUEST DEBUG ==="
  puts "URI: #{uri}"
  puts "Options: #{options.inspect}"
  start_time = Time.now
  begin
    response = URI.open(uri, options)
    end_time = Time.now
    puts "=== RESPONSE DEBUG ==="
    puts "Status: #{response.status.join(' ')}"
    puts "Headers: #{response.meta.to_h}"
    puts "Content-Type: #{response.content_type}"
    puts "Content-Length: #{response.meta['content-length']}"
    puts "Base URI: #{response.base_uri}"
    puts "Request duration: #{((end_time - start_time) * 1000).round(2)}ms"
    response
  rescue => e
    end_time = Time.now
    puts "=== ERROR DEBUG ==="
    puts "Exception: #{e.class}"
    puts "Message: #{e.message}"
    puts "Request duration: #{((end_time - start_time) * 1000).round(2)}ms"
    raise
  end
end
# Usage for debugging problematic requests
response = debug_request('https://httpbin.org/delay/2', read_timeout: 5)
Production Patterns
Production OpenURI usage requires robust error handling, connection management, and monitoring integration. Applications should implement retry logic with exponential backoff for transient network failures, maintain connection pools where possible, and provide comprehensive logging for debugging production issues.
Connection reuse becomes important when making multiple requests to the same host. While OpenURI doesn't expose connection pooling directly, you can implement session management using Net::HTTP directly for high-volume scenarios, falling back to OpenURI for simple cases.
require 'open-uri'
require 'net/http'
require 'logger'
class ProductionHttpClient
  def initialize(logger: Logger.new($stdout))
    @logger = logger
    @retry_attempts = 3
    @base_timeout = 30 # read timeout in seconds
    @base_backoff = 1  # delay before the first retry in seconds
  end
  def fetch_with_retries(uri, options = {})
    attempts = 0
    begin
      attempts += 1
      @logger.info("Fetching #{uri} (attempt #{attempts})")
      start_time = Time.now
      response = URI.open(uri, default_options.merge(options))
      duration = Time.now - start_time
      @logger.info("Successfully fetched #{uri} in #{duration.round(3)}s")
      response
    rescue Net::ReadTimeout, Net::OpenTimeout => e
      if attempts < @retry_attempts
        backoff_time = backoff_for(attempts)
        @logger.warn("Timeout on attempt #{attempts}, retrying in #{backoff_time}s: #{e.message}")
        sleep(backoff_time)
        retry
      else
        @logger.error("Failed to fetch #{uri} after #{attempts} attempts: #{e.message}")
        raise
      end
    rescue OpenURI::HTTPError => e
      status_code = e.io.status[0].to_i
      # Retry on 5xx errors, but not 4xx
      if (500..599).include?(status_code) && attempts < @retry_attempts
        backoff_time = backoff_for(attempts)
        @logger.warn("HTTP #{status_code} on attempt #{attempts}, retrying in #{backoff_time}s")
        sleep(backoff_time)
        retry
      else
        @logger.error("HTTP error #{status_code} for #{uri}: #{e.io.read}")
        raise
      end
    end
  end
  private
  # Exponential backoff: the delay doubles after each failed attempt
  def backoff_for(attempts)
    @base_backoff * (2 ** (attempts - 1))
  end
  def default_options
    {
      'User-Agent' => 'MyApp/1.0 (Production)',
      read_timeout: @base_timeout,
      open_timeout: 15,
      ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER
    }
  end
end
# Usage in production code
client = ProductionHttpClient.new
response = client.fetch_with_retries('https://api.external-service.com/data')
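The retry delays themselves are easy to reason about in isolation. The following sketch computes a capped exponential schedule and adds random jitter so that many clients do not retry in lockstep; backoff_delays and with_jitter are hypothetical helpers, and the base, cap, and jitter ratio are illustrative values rather than recommendations.

```ruby
# Hypothetical helper: delay in seconds before each retry, doubling from
# `base` and never exceeding `cap`.
def backoff_delays(base:, retries:, cap:)
  Array.new(retries) { |attempt| [base * (2**attempt), cap].min }
end

# Hypothetical helper: adds up to `ratio` of the delay as random extra wait.
def with_jitter(delay, ratio: 0.25, random: Random.new)
  delay + (delay * ratio * random.rand)
end

delays = backoff_delays(base: 1, retries: 5, cap: 10)
puts delays.inspect

jittered = delays.map { |delay| with_jitter(delay) }
puts jittered.map { |delay| delay.round(2) }.inspect
```

Capping the delay keeps a long outage from stretching individual waits indefinitely, while the jitter spreads retries from concurrent clients over time.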
Monitoring integration requires tracking request metrics, error rates, and response times. Production applications should emit metrics to monitoring systems and implement health checks that verify external service availability. Circuit breaker patterns help prevent cascade failures when external services become unavailable.
class MonitoredHttpClient
  def initialize(metrics_client, circuit_breaker)
    @metrics = metrics_client
    @circuit_breaker = circuit_breaker
  end
  def fetch_with_monitoring(uri, options = {})
    return nil if @circuit_breaker.open?
    start_time = Time.now
    begin
      response = URI.open(uri, options)
      duration = Time.now - start_time
      # Record success metrics
      @metrics.increment('http_requests_total', tags: ['status:success'])
      @metrics.histogram('http_request_duration', duration * 1000)
      @circuit_breaker.record_success
      response
    rescue => e
      duration = Time.now - start_time
      # Record error metrics
      error_type = e.class.name.downcase
      @metrics.increment('http_requests_total', tags: ["status:error", "error:#{error_type}"])
      @metrics.histogram('http_request_duration', duration * 1000)
      @circuit_breaker.record_failure
      raise
    end
  end
  def health_check(endpoints)
    results = {}
    endpoints.each do |name, uri|
      # Start the timer per endpoint so response_time is measurable
      start_time = Time.now
      begin
        response = URI.open(uri, read_timeout: 5, open_timeout: 5)
        results[name] = {
          status: 'healthy',
          response_code: response.status[0],
          response_time: Time.now - start_time
        }
      rescue => e
        results[name] = {
          status: 'unhealthy',
          error: e.message
        }
      end
    end
    results
  end
end
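MonitoredHttpClient receives a circuit_breaker object but leaves its implementation open. One possible shape is sketched below, under assumptions: SimpleCircuitBreaker is a hypothetical class, it opens after a number of consecutive failures, and it allows traffic again after a cooldown period. The clock is injectable so the behavior can be demonstrated without waiting.

```ruby
# Minimal circuit breaker sketch (hypothetical, not a library class).
# Opens after `threshold` consecutive failures; closes again once
# `cooldown` seconds have passed.
class SimpleCircuitBreaker
  def initialize(threshold: 3, cooldown: 30, clock: -> { Time.now })
    @threshold = threshold
    @cooldown = cooldown
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  # True while requests should be skipped
  def open?
    return false if @opened_at.nil?
    return true if @clock.call - @opened_at < @cooldown

    # Cooldown elapsed: close and let the next request probe the service
    @opened_at = nil
    @failures = 0
    false
  end

  def record_success
    @failures = 0
    @opened_at = nil
  end

  def record_failure
    @failures += 1
    @opened_at = @clock.call if @failures >= @threshold
  end
end

# Fake clock so the state changes can be shown without sleeping
now = Time.at(0)
breaker = SimpleCircuitBreaker.new(threshold: 2, cooldown: 30, clock: -> { now })

breaker.record_failure
state_after_one = breaker.open?
breaker.record_failure
state_after_two = breaker.open?
now += 31
state_after_cooldown = breaker.open?

puts [state_after_one, state_after_two, state_after_cooldown].inspect
```

This version has no separate half-open state and is not thread-safe; a production breaker would usually add both.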
Caching strategies reduce external API calls and improve application performance. Implement HTTP caching by respecting cache-control headers and ETags from responses. For applications with high request volumes, consider implementing response caching with appropriate invalidation strategies.
require 'open-uri'
require 'digest'
class CachingHttpClient
  def initialize(cache_store, default_ttl: 300)
    @cache = cache_store
    @default_ttl = default_ttl
  end
  def fetch_with_cache(uri, options = {})
    cache_key = generate_cache_key(uri, options)
    cached_response = @cache.read(cache_key)
    # Serve fresh entries without a network round trip
    return cached_response[:body] if cached_response && Time.now < cached_response[:expires_at]
    # Revalidate a stale entry with a conditional request when an ETag is known
    request_options = options.dup
    if cached_response && cached_response[:etag]
      request_options['If-None-Match'] = cached_response[:etag]
    end
    begin
      response = URI.open(uri, request_options)
      # Cache new response
      cache_data = {
        body: response.read,
        etag: response.meta['etag'],
        expires_at: Time.now + determine_ttl(response.meta)
      }
      @cache.write(cache_key, cache_data)
      cache_data[:body]
    rescue OpenURI::HTTPError => e
      status_code = e.io.status[0].to_i
      if cached_response && status_code == 304
        # OpenURI raises HTTPError for 304 Not Modified; extend the cached entry
        refreshed = cached_response.merge(expires_at: Time.now + determine_ttl(e.io.meta))
        @cache.write(cache_key, refreshed)
        refreshed[:body]
      elsif cached_response && (500..599).include?(status_code)
        # Return cached version on server errors if available
        cached_response[:body]
      else
        raise
      end
    end
  end
  private
  def generate_cache_key(uri, options)
    Digest::SHA256.hexdigest("#{uri}#{options}")
  end
  # TTL in seconds from Cache-Control max-age, falling back to the default
  def determine_ttl(headers)
    match = headers['cache-control']&.match(/max-age=(\d+)/)
    match ? match[1].to_i : @default_ttl
  end
end
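CachingHttpClient expects a cache_store that responds to read and write but does not say what it is. A minimal in-memory store with that interface might look like the sketch below; MemoryCacheStore and ttl_from_cache_control are hypothetical names, and the store is only suitable for a single process.

```ruby
# Minimal in-memory store exposing the read/write interface the client
# relies on (hypothetical class; entries live only in this process).
class MemoryCacheStore
  def initialize
    @entries = {}
    @lock = Mutex.new
  end

  def read(key)
    @lock.synchronize { @entries[key] }
  end

  def write(key, value)
    @lock.synchronize { @entries[key] = value }
  end
end

# Hypothetical helper showing the max-age parsing on its own:
# returns the TTL in seconds, or the default when no max-age is present.
def ttl_from_cache_control(header, default_ttl)
  match = header.to_s.match(/max-age=(\d+)/)
  match ? match[1].to_i : default_ttl
end

store = MemoryCacheStore.new
store.write('greeting', { body: 'hello', etag: '"abc"' })
entry = store.read('greeting')
missing = store.read('unknown')

puts entry[:body]
puts ttl_from_cache_control('public, max-age=120', 300)
puts ttl_from_cache_control(nil, 300)
```

A shared cache such as Redis or Memcached could stand behind the same two methods when several processes need to see the same entries.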
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
URI.open(uri, **options, &block) | uri (String/URI), options (Hash) | StringIO or Tempfile | Opens URI and returns IO object with response |
Kernel#open(uri, **options, &block) | uri (String), options (Hash) | StringIO or Tempfile | Ruby 2.x only: forwarded to URI.open when uri starts with a protocol; deprecated in Ruby 2.7, removed in 3.0 |
Response Object Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#read(length=nil) | length (Integer, optional) | String | Reads response body content |
#readline | None | String | Reads single line from response |
#each_line(&block) | Block | Enumerator | Iterates over response lines |
#rewind | None | Integer | Resets read position to beginning |
#close | None | nil | Closes the response object |
Response Metadata Methods
Method | Parameters | Returns | Description |
---|---|---|---|
#meta | None | Hash | All response headers as hash (lowercase keys) |
#content_type | None | String | Content-Type header value |
#meta['content-length'] | None | String or nil | Content-Length header value; there is no dedicated content_length accessor |
#last_modified | None | Time or nil | Last-Modified header as Time object |
#status | None | Array | HTTP status code and reason phrase |
#base_uri | None | URI | Final URI after redirects |
Request Options
Option | Type | Default | Description |
---|---|---|---|
'User-Agent' | String | Ruby | User-Agent header string |
'Accept' | String | */* | Accept header for content types |
'Referer' | String | None | Referer header value |
read_timeout | Integer | 60 | Read timeout in seconds |
open_timeout | Integer | 60 | Connection timeout in seconds |
redirect | Boolean | true | Follow redirects automatically |
ssl_verify_mode | Integer | VERIFY_PEER | SSL certificate verification mode |
Authentication Options
Option | Type | Description |
---|---|---|
http_basic_authentication | Array | [username, password] for HTTP Basic Auth |
ssl_ca_cert | String | Path to CA certificate file or directory |
ssl_cert | OpenSSL::X509::Certificate | Not an OpenURI option (raises ArgumentError); configure client certificates on Net::HTTP |
ssl_key | OpenSSL::PKey | Not an OpenURI option (raises ArgumentError); configure client keys on Net::HTTP |
Exception Classes
Exception | Parent Class | Description |
---|---|---|
OpenURI::HTTPError | StandardError | HTTP 4xx/5xx response codes |
OpenURI::HTTPRedirect | OpenURI::HTTPError | Redirect response received with redirect: false |
Net::ReadTimeout | Timeout::Error | Read operation timeout |
Net::OpenTimeout | Timeout::Error | Connection establishment timeout |
OpenSSL::SSL::SSLError | OpenSSL::OpenSSLError | SSL/TLS connection errors |
SocketError | StandardError | Network connection errors |
SSL Options
Option | Type | Default | Description |
---|---|---|---|
ssl_verify_mode | Integer | VERIFY_PEER | Certificate verification level |
ssl_ca_cert | String | System default | CA certificate file or directory |
ssl_min_version | Symbol | Auto | Minimum TLS version (recent open-uri versions) |
ssl_max_version | Symbol | Auto | Maximum TLS version (recent open-uri versions) |
ssl_verify_depth, ssl_version, ssl_ciphers, and ssl_ca_file are not recognized by OpenURI and raise ArgumentError; configure them on Net::HTTP instead.
HTTP Status Code Patterns
Status Range | OpenURI Behavior | Exception Class |
---|---|---|
1xx (Informational) | Transparent handling | None |
2xx (Success) | Normal response | None |
3xx (Redirect) | Follow automatically | HTTPRedirect when redirect: false; HTTPError on a redirect loop or 304 |
4xx (Client Error) | Raise exception | HTTPError |
5xx (Server Error) | Raise exception | HTTPError |
Protocol Support
Protocol | URI Scheme | Implementation | Features |
---|---|---|---|
HTTP | http:// | Net::HTTP | Full HTTP/1.1 support |
HTTPS | https:// | Net::HTTP + OpenSSL | SSL/TLS encryption |
FTP | ftp:// | Net::FTP | File transfer protocol |