Overview
Reliable Data Transfer (RDT) addresses the fundamental challenge that network communication channels are inherently unreliable. Bits can be corrupted during transmission, packets can be lost entirely, and data can arrive out of order or duplicated. RDT protocols implement mechanisms to detect and recover from these failures, creating the abstraction of a reliable channel over an unreliable medium.
The concept originated from early networking research when engineers recognized that application developers needed guarantees about data delivery without implementing complex error handling in every application. Rather than forcing each application to handle packet loss, corruption, and reordering, the transport layer provides reliability as a service. This separation of concerns became a cornerstone of the Internet protocol stack.
RDT sits at the transport layer of the network stack, positioned between the application layer above and the network layer below. The network layer (IP) provides best-effort delivery with no guarantees, while the transport layer (TCP) builds reliability on top of this unreliable foundation. Applications like web browsers, email clients, and database connections depend on RDT to function correctly without implementing their own reliability mechanisms.
Consider a simple example: sending the string "HELLO" across a network. Without reliability mechanisms, the receiving application might receive "HXLLO" (bit corruption), "HEL" (packet loss), "LLOHE" (reordering), or "HELLOHELLO" (duplication). RDT protocols prevent these issues through checksums, acknowledgments, sequence numbers, and retransmission.
The design space for RDT protocols involves fundamental trade-offs between simplicity, performance, and the types of failures handled. Early protocols like stop-and-wait provide correctness but sacrifice throughput. More sophisticated protocols like selective repeat achieve higher performance through pipelining but require additional complexity in state management and buffer handling.
Key Principles
Reliable data transfer builds on four foundational mechanisms that work together to detect and recover from transmission errors. Each mechanism addresses a specific failure mode in the communication channel.
Checksums detect bit corruption during transmission. The sender computes a mathematical hash of the packet contents and includes it in the packet header. The receiver recomputes the checksum and compares it to the transmitted value. Any mismatch indicates corruption occurred. Common checksum algorithms include Internet checksum (ones-complement sum), CRC (cyclic redundancy check), and MD5 or SHA for stronger error detection. The checksum field itself must be protected or excluded from the computation to avoid circular dependencies.
Acknowledgments inform the sender that data arrived successfully. After receiving and validating a packet, the receiver sends an acknowledgment (ACK) back to the sender. The ACK indicates which packet was received correctly. Negative acknowledgments (NAK) explicitly signal packet corruption or other errors, though many modern protocols use duplicate ACKs instead of explicit NAKs. The acknowledgment mechanism creates a feedback loop that allows the sender to know the receiver's state.
Sequence numbers enable the receiver to detect duplicates, reordering, and gaps in the data stream. Each packet carries a sequence number that identifies its position in the stream. The receiver uses sequence numbers to reassemble data in the correct order, discard duplicates, and identify missing packets. Sequence numbers typically wrap around after reaching a maximum value, so protocols must handle wraparound correctly. The sequence number space size determines how many outstanding unacknowledged packets can exist simultaneously.
Timeouts and retransmission handle packet loss when neither data nor acknowledgments arrive. The sender starts a timer when transmitting a packet. If no acknowledgment arrives before the timer expires, the sender assumes the packet was lost and retransmits it. Timeout values must balance responsiveness against spurious retransmissions caused by delayed packets or acknowledgments. Adaptive timeout algorithms adjust based on measured round-trip times.
These mechanisms combine in protocols that can be modeled as finite state machines. The sender FSM has states for waiting to send, waiting for acknowledgment, and handling timeouts. The receiver FSM has states for waiting for packets and processing received data. State transitions occur based on events like packet arrival, timeout expiration, or ACK reception.
The protocol must handle the interaction between these mechanisms. For example, sequence numbers allow the receiver to distinguish between original packets and retransmissions after a timeout. Checksums ensure that acknowledgments themselves arrive uncorrupted. The combination creates a system where data eventually arrives correctly despite any finite number of transmission errors, though timeouts may occur.
RDT protocols differ in how they handle pipelining—sending multiple packets before receiving acknowledgments. Stop-and-wait protocols send one packet and wait for its ACK before sending the next, achieving low utilization on high-latency links. Pipelined protocols like Go-Back-N and Selective Repeat send multiple packets concurrently, requiring additional buffering and more complex sequence number management but achieving much higher throughput.
Ruby Implementation
Ruby provides reliable data transfer through the TCPSocket and TCPServer classes, which implement the TCP protocol's reliability mechanisms. The socket library handles checksums, acknowledgments, sequence numbers, and retransmissions transparently, presenting a stream abstraction to the application.
require 'socket'
# Server implementing reliable message reception
server = TCPServer.new(9000)
puts "Server listening on port 9000"
client = server.accept
puts "Client connected from #{client.peeraddr[3]}:#{client.peeraddr[1]}"
# TCP handles reliability - data arrives in order without corruption
while (message = client.gets)
puts "Received: #{message}"
client.puts "ACK: #{message.chomp}"
end
client.close
server.close
The TCP socket handles reliability through several configurable parameters. The TCP_NODELAY option disables Nagle's algorithm, which buffers small packets to reduce overhead. Without this option, TCP delays sending small packets to coalesce them, trading latency for efficiency:
require 'socket'
socket = TCPSocket.new('example.com', 80)
# Disable Nagle's algorithm for low-latency applications
socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)
# Configure socket buffer sizes for throughput
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_SNDBUF, 65536)
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_RCVBUF, 65536)
# Set keepalive to detect dead connections
socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, 1)
socket.write("GET / HTTP/1.1\r\n\r\n")
response = socket.read
puts response
socket.close
For applications requiring custom reliability mechanisms over UDP, Ruby's UDPSocket class provides unreliable datagram delivery. Building reliability on top of UDP requires implementing checksums, acknowledgments, sequence numbers, and timeouts:
require 'socket'
require 'digest'
# Custom reliable protocol over UDP
class ReliableUDPSocket
TIMEOUT = 1.0
MAX_RETRIES = 5
def initialize(port)
@socket = UDPSocket.new
@socket.bind('0.0.0.0', port)
@seq_num = 0
end
def send_reliable(data, host, port)
retries = 0
loop do
# Create packet with sequence number and checksum
packet = create_packet(data)
@socket.send(packet, 0, host, port)
# Wait for acknowledgment with timeout
begin
ready = IO.select([@socket], nil, nil, TIMEOUT)
if ready
ack_data, _ = @socket.recvfrom(1024)
ack_seq = ack_data.unpack1('N')
if ack_seq == @seq_num
@seq_num += 1
return true
end
end
rescue IO::TimeoutError
# Timeout occurred, fall through to retry
end
retries += 1
raise "Max retries exceeded" if retries >= MAX_RETRIES
puts "Timeout, retransmitting packet #{@seq_num}"
end
end
private
def create_packet(data)
checksum = Digest::MD5.digest(data)
[@seq_num, checksum, data].pack('NA16A*')
end
end
# Usage
sender = ReliableUDPSocket.new(9001)
sender.send_reliable("Important message", 'localhost', 9002)
The receiver side validates checksums and sends acknowledgments:
class ReliableUDPReceiver
def initialize(port)
@socket = UDPSocket.new
@socket.bind('0.0.0.0', port)
@expected_seq = 0
end
def receive_reliable
loop do
data, addr = @socket.recvfrom(65536)
# Parse packet
seq_num, checksum, payload = data.unpack('NA16A*')
# Verify checksum
computed = Digest::MD5.digest(payload)
if computed == checksum && seq_num == @expected_seq
# Send acknowledgment
ack = [@expected_seq].pack('N')
@socket.send(ack, 0, addr[3], addr[1])
@expected_seq += 1
return payload
else
# Corruption or duplicate - send previous ACK
prev_ack = [@expected_seq - 1].pack('N')
@socket.send(prev_ack, 0, addr[3], addr[1])
end
end
end
end
# Usage
receiver = ReliableUDPReceiver.new(9002)
message = receiver.receive_reliable
puts "Reliably received: #{message}"
Ruby's Async gem provides higher-level abstractions for reliable communication with built-in timeout handling and connection management:
require 'async'
require 'async/io'
require 'async/io/stream'
Async do
endpoint = Async::IO::Endpoint.tcp('localhost', 9000)
endpoint.bind do |server|
server.listen(128)
server.accept do |client|
stream = Async::IO::Stream.new(client)
# Reliable read with timeout
begin
Async::Task.current.with_timeout(5.0) do
while (line = stream.read_until("\n"))
puts "Received: #{line}"
stream.write("ACK\n")
stream.flush
end
end
rescue Async::TimeoutError
puts "Client connection timed out"
ensure
client.close
end
end
end
end
Implementation Approaches
Reliable data transfer protocols fall into three main categories based on how they handle packet pipelining and acknowledgments. Each approach makes different trade-offs between simplicity, buffer requirements, and throughput.
Stop-and-Wait Protocol represents the simplest reliable transfer mechanism. The sender transmits one packet and waits for its acknowledgment before sending the next packet. This approach minimizes sender and receiver buffering requirements but severely limits throughput on high-latency links.
The stop-and-wait sender operates as a two-state finite state machine. In the "waiting for call from above" state, it accepts data from the application, creates a packet with sequence number and checksum, sends it, starts a timer, and transitions to "waiting for ACK." In the "waiting for ACK" state, it processes events: receiving a valid ACK transitions back to the first state; timer expiration triggers retransmission and timer restart; receiving a corrupt ACK or NAK also triggers retransmission.
Performance analysis reveals the protocol's limitations. Consider a 1 Gbps link with 30ms round-trip time. Stop-and-wait sends a 1KB packet, waits 30ms for ACK, then sends the next packet. The sender transmits for 8 microseconds per packet (1KB at 1Gbps) but waits 30ms between transmissions. Link utilization equals (0.008ms / 30.008ms) or approximately 0.027%, achieving only 267 Kbps effective throughput despite the 1 Gbps link capacity. The protocol's throughput is inversely proportional to the round-trip time, making it unsuitable for high-bandwidth-delay product networks.
Go-Back-N Protocol improves throughput by allowing the sender to transmit multiple packets before receiving acknowledgments. The sender maintains a window of N packets that can be outstanding simultaneously. Sequence numbers use modulo-2^k arithmetic where k is chosen such that 2^k > N. When an error occurs, the receiver discards all subsequent packets and the sender retransmits from the error point onward.
The sender maintains three variables: base (oldest unacknowledged packet), nextseqnum (next sequence number to use), and N (window size). The window represents packets in the range [base, base+N-1]. When the application provides data, if nextseqnum < base+N, the sender creates and transmits the packet and advances nextseqnum. If nextseqnum >= base+N, the window is full and the sender refuses the data or buffers it.
When an ACK arrives for packet n, the sender sets base to n+1 and slides the window forward. This cumulative acknowledgment scheme means ACK n acknowledges all packets up through n. If base equals nextseqnum after processing the ACK, no packets are outstanding and the timer stops. Otherwise, the timer restarts for the oldest unacknowledged packet.
Timeout events trigger retransmission of all packets from base to nextseqnum-1. This "go back N" behavior gives the protocol its name. If packet 5 is lost among packets 1-10, the receiver discards packets 6-10 even if they arrived correctly. When timeout occurs, the sender retransmits packets 5-10.
The receiver maintains a single variable: expectedseqnum. When a packet arrives, if its sequence number equals expectedseqnum, the receiver delivers data to the application, sends ACK expectedseqnum, and increments expectedseqnum. For any other sequence number (out of order or duplicate), the receiver discards the packet and sends ACK for the last correctly received in-order packet. This design keeps the receiver simple with minimal buffering but causes retransmission of correctly received packets after errors.
Selective Repeat Protocol eliminates unnecessary retransmissions by acknowledging packets individually and buffering out-of-order packets at the receiver. The sender maintains a timer for each outstanding packet and retransmits only packets that timeout or receive NAKs. The receiver acknowledges each correctly received packet regardless of order and delivers data to the application only when the next in-sequence packet arrives.
The sender window spans [send_base, send_base+N-1] where N is the window size. Each packet has an independent timer. When data arrives from the application, if nextseqnum is in the window, the sender transmits the packet and starts its timer. When ACK n arrives, the sender marks packet n as received and moves send_base forward if n equals send_base. Timeout for packet n triggers retransmission of only that packet.
The receiver window spans [rcv_base, rcv_base+N-1]. When packet n arrives, if n falls in [rcv_base, rcv_base+N-1], the receiver sends ACK n. If n equals rcv_base, it delivers packets starting from n up through the highest consecutive received packet, then advances rcv_base. If n falls in [rcv_base-N, rcv_base-1], the packet is a duplicate from before the window, so the receiver sends ACK n. Packets outside these ranges are discarded.
Window size selection critically affects correctness. For selective repeat to work correctly, the window size must satisfy N <= sequence_space_size / 2. If sequence numbers use modulo 4 (sequence space = 4), then N <= 2. Larger windows cause ambiguity: the receiver cannot distinguish between packet 0 from the current window and packet 0 from the next window cycle. This constraint limits the protocol's performance on high-bandwidth-delay product networks.
Performance comparison shows selective repeat achieves the best throughput by minimizing retransmissions, but requires buffer space at both sender and receiver for N packets. Go-back-N achieves good throughput with minimal receiver buffering but retransmits correctly received packets after errors. Stop-and-wait requires minimal buffering but achieves poor throughput. Protocol selection depends on the application's requirements for throughput, memory usage, and implementation complexity.
Practical Examples
Building a reliable file transfer application demonstrates how RDT principles translate into working systems. This example implements chunked transfer with acknowledgments and retransmission over TCP:
require 'socket'
require 'digest'
class ReliableFileTransfer
CHUNK_SIZE = 8192
MAX_RETRIES = 3
def self.send_file(filename, host, port)
socket = TCPSocket.new(host, port)
file_size = File.size(filename)
# Send file metadata
socket.puts "#{File.basename(filename)}:#{file_size}"
File.open(filename, 'rb') do |file|
chunk_num = 0
while (chunk = file.read(CHUNK_SIZE))
retries = 0
begin
# Send chunk with sequence number and checksum
checksum = Digest::SHA256.hexdigest(chunk)
socket.puts "CHUNK:#{chunk_num}:#{checksum}"
socket.write(chunk)
socket.flush
# Wait for acknowledgment
ack = socket.gets.chomp
unless ack == "ACK:#{chunk_num}"
raise "Invalid ACK received"
end
chunk_num += 1
print "\rSent #{chunk_num * CHUNK_SIZE} / #{file_size} bytes"
rescue => e
retries += 1
if retries < MAX_RETRIES
puts "\nError sending chunk #{chunk_num}, retrying..."
file.seek(chunk_num * CHUNK_SIZE)
retry
else
raise "Failed to send chunk #{chunk_num} after #{MAX_RETRIES} attempts"
end
end
end
end
socket.puts "DONE"
socket.close
puts "\nFile transfer complete"
end
def self.receive_file(port)
server = TCPServer.new(port)
client = server.accept
# Receive file metadata
metadata = client.gets.chomp
filename, file_size = metadata.split(':')
file_size = file_size.to_i
puts "Receiving file: #{filename} (#{file_size} bytes)"
File.open("received_#{filename}", 'wb') do |file|
expected_chunk = 0
loop do
header = client.gets
break if header.nil? || header.chomp == "DONE"
# Parse chunk header
_, chunk_num, expected_checksum = header.chomp.split(':')
chunk_num = chunk_num.to_i
# Read chunk data
chunk = client.read(CHUNK_SIZE)
# Verify checksum
actual_checksum = Digest::SHA256.hexdigest(chunk)
if actual_checksum == expected_checksum && chunk_num == expected_chunk
file.write(chunk)
client.puts "ACK:#{chunk_num}"
expected_chunk += 1
print "\rReceived #{expected_chunk * CHUNK_SIZE} / #{file_size} bytes"
else
# Corruption or out-of-order chunk
client.puts "NAK:#{chunk_num}"
puts "\nCorrupted chunk #{chunk_num}, requesting retransmission"
end
end
end
client.close
server.close
puts "\nFile received successfully"
end
end
# Usage
# Server: ReliableFileTransfer.receive_file(9000)
# Client: ReliableFileTransfer.send_file('large_file.dat', 'localhost', 9000)
A distributed task queue demonstrates reliable message delivery with at-least-once semantics. Tasks are acknowledged only after successful processing, ensuring no task is lost even if workers crash:
require 'socket'
require 'json'
require 'securerandom'
class ReliableTaskQueue
def initialize(port)
@port = port
@pending_tasks = {}
@task_queue = Queue.new
end
def enqueue_task(task_data)
task_id = SecureRandom.uuid
task = {
id: task_id,
data: task_data,
created_at: Time.now,
attempts: 0
}
@pending_tasks[task_id] = task
@task_queue.push(task)
task_id
end
def start_server
server = TCPServer.new(@port)
puts "Task queue listening on port #{@port}"
Thread.new { monitor_timeouts }
loop do
client = server.accept
Thread.new { handle_worker(client) }
end
end
private
def handle_worker(socket)
worker_id = socket.gets.chomp
puts "Worker #{worker_id} connected"
loop do
# Send task to worker
task = @task_queue.pop
task[:attempts] += 1
task[:assigned_at] = Time.now
socket.puts task.to_json
socket.flush
# Wait for acknowledgment
begin
result = socket.gets(chomp: true)
if result && result.start_with?('ACK')
task_id = result.split(':')[1]
@pending_tasks.delete(task_id)
puts "Task #{task_id} completed by #{worker_id}"
elsif result && result.start_with?('FAIL')
task_id = result.split(':')[1]
requeue_task(task_id)
puts "Task #{task_id} failed, requeuing"
end
rescue
# Worker disconnected, requeue unacknowledged task
requeue_task(task[:id])
break
end
end
socket.close
end
def monitor_timeouts
loop do
sleep 5
@pending_tasks.each do |task_id, task|
if task[:assigned_at] && Time.now - task[:assigned_at] > 30
puts "Task #{task_id} timed out, requeuing"
requeue_task(task_id)
end
end
end
end
def requeue_task(task_id)
task = @pending_tasks[task_id]
return unless task
if task[:attempts] < 3
task[:assigned_at] = nil
@task_queue.push(task)
else
puts "Task #{task_id} exceeded max attempts, moving to dead letter queue"
@pending_tasks.delete(task_id)
end
end
end
# Worker implementation
class TaskWorker
def initialize(server_host, server_port, worker_id)
@server_host = server_host
@server_port = server_port
@worker_id = worker_id
end
def start
socket = TCPSocket.new(@server_host, @server_port)
socket.puts @worker_id
loop do
task_json = socket.gets
break unless task_json
task = JSON.parse(task_json)
begin
# Process task
result = process_task(task['data'])
socket.puts "ACK:#{task['id']}"
puts "Completed task #{task['id']}"
rescue => e
socket.puts "FAIL:#{task['id']}"
puts "Failed task #{task['id']}: #{e.message}"
end
socket.flush
end
socket.close
end
private
def process_task(data)
# Simulate task processing
sleep rand(1..3)
data
end
end
Implementing a chat application with message delivery guarantees shows how RDT ensures no messages are lost or duplicated:
require 'socket'
require 'json'
class ReliableChat
def initialize(port)
@port = port
@clients = {}
@message_log = []
@sequence_number = 0
end
def start_server
server = TCPServer.new(@port)
puts "Chat server listening on port #{@port}"
loop do
socket = server.accept
Thread.new { handle_client(socket) }
end
end
private
def handle_client(socket)
username = socket.gets.chomp
@clients[username] = {
socket: socket,
last_ack: -1
}
puts "#{username} joined the chat"
broadcast("#{username} joined", username)
# Send message history
send_history(socket, username)
# Handle incoming messages
loop do
message = socket.gets
break unless message
data = JSON.parse(message)
if data['type'] == 'message'
handle_message(username, data['content'])
elsif data['type'] == 'ack'
handle_ack(username, data['seq'])
end
end
@clients.delete(username)
broadcast("#{username} left", username)
socket.close
rescue
@clients.delete(username)
socket.close
end
def handle_message(sender, content)
message = {
seq: @sequence_number,
sender: sender,
content: content,
timestamp: Time.now
}
@message_log << message
@sequence_number += 1
# Broadcast to all clients with retry logic
@clients.each do |username, client_data|
send_with_retry(client_data[:socket], message)
end
end
def send_with_retry(socket, message, max_retries = 3)
retries = 0
begin
socket.puts message.to_json
socket.flush
rescue
retries += 1
retry if retries < max_retries
end
end
def handle_ack(username, seq)
@clients[username][:last_ack] = seq
end
def send_history(socket, username)
last_ack = @clients[username][:last_ack]
@message_log.each do |message|
if message[:seq] > last_ack
send_with_retry(socket, message)
end
end
end
def broadcast(message, exclude_user = nil)
@clients.each do |username, client_data|
next if username == exclude_user
send_with_retry(client_data[:socket], {
seq: @sequence_number,
sender: 'system',
content: message,
timestamp: Time.now
})
end
@sequence_number += 1
end
end
Common Patterns
Acknowledgment Schemes determine how the receiver confirms packet receipt. Positive acknowledgment with retransmission (PAR) uses ACK messages to confirm successful delivery. The sender retransmits if no ACK arrives within the timeout period. This pattern appears in almost all reliable protocols and forms the foundation of reliability.
Cumulative acknowledgments optimize network efficiency by acknowledging multiple packets with a single ACK. ACK n confirms receipt of all packets up through n. This reduces acknowledgment traffic but provides less granular feedback. If packets 1-5 are sent and packet 3 is lost, the receiver continues sending ACK 2 until packet 3 arrives, then sends ACK 5 to acknowledge packets 3-5 simultaneously. TCP uses cumulative acknowledgments extensively.
Selective acknowledgments (SACK) provide explicit feedback about which packets arrived. The receiver includes a bitmap or list of received packet ranges in the ACK. This allows the sender to retransmit only missing packets rather than going back N. TCP implements SACK as an option that must be negotiated during connection establishment:
# Simulating SACK behavior
class SelectiveAckReceiver
def initialize
@received_packets = {}
@next_expected = 0
end
def receive_packet(seq_num, data)
@received_packets[seq_num] = data
# Build SACK ranges
sack_ranges = []
sorted_seqs = @received_packets.keys.sort
range_start = sorted_seqs[0]
range_end = sorted_seqs[0]
sorted_seqs.each do |seq|
if seq == range_end + 1
range_end = seq
else
sack_ranges << [range_start, range_end]
range_start = seq
range_end = seq
end
end
sack_ranges << [range_start, range_end]
{
ack: @next_expected - 1,
sack_ranges: sack_ranges
}
end
def deliver_data
data = []
while @received_packets.key?(@next_expected)
data << @received_packets.delete(@next_expected)
@next_expected += 1
end
data
end
end
Timeout Management requires balancing responsiveness against spurious retransmissions. Fixed timeouts work poorly because network delays vary significantly. Adaptive timeout algorithms measure round-trip time (RTT) and adjust timeout values dynamically.
The exponential weighted moving average (EWMA) smooths RTT measurements to handle variation. EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT, where α typically equals 0.125. This gives more weight to recent measurements while smoothing out fluctuations. The timeout value includes a safety margin based on RTT deviation: TimeoutInterval = EstimatedRTT + 4 * DevRTT, where DevRTT tracks RTT variance.
Karn's algorithm addresses a critical problem in RTT measurement: when a packet is retransmitted, the sender cannot tell whether the ACK corresponds to the original transmission or the retransmission. The algorithm excludes retransmitted packets from RTT calculations, preventing skewed estimates. Additionally, exponential backoff doubles the timeout after each retransmission, reducing network congestion:
class AdaptiveTimeout
def initialize
@estimated_rtt = 1.0
@dev_rtt = 0.0
@timeout_interval = 1.0
@alpha = 0.125
@beta = 0.25
end
def update_rtt(sample_rtt)
# Update estimated RTT
@estimated_rtt = (1 - @alpha) * @estimated_rtt + @alpha * sample_rtt
# Update RTT deviation
@dev_rtt = (1 - @beta) * @dev_rtt + @beta * (@sample_rtt - @estimated_rtt).abs
# Calculate timeout interval
@timeout_interval = @estimated_rtt + 4 * @dev_rtt
end
def timeout_value
@timeout_interval
end
def exponential_backoff(attempt)
@timeout_interval * (2 ** attempt)
end
end
Flow Control prevents fast senders from overwhelming slow receivers. The receiver advertises a receive window indicating available buffer space. The sender limits outstanding data to the advertised window size. TCP implements flow control through the rwnd field in the header.
The sliding window protocol implements flow control naturally. The window size limits how many packets can be in flight simultaneously. As the receiver processes data and frees buffer space, it sends ACKs with updated window advertisements. If the receive window reaches zero, the sender stops transmitting until the window reopens:
class FlowControlSender
def initialize(initial_window)
@send_buffer = []
@receive_window = initial_window
@next_seq = 0
@base = 0
end
def send_data(data)
while @send_buffer.size >= @receive_window
sleep 0.01 # Wait for window to open
end
packet = { seq: @next_seq, data: data }
@send_buffer << packet
@next_seq += 1
transmit(packet)
end
def handle_ack(ack_seq, window_update)
# Remove acknowledged packets
@send_buffer.reject! { |p| p[:seq] <= ack_seq }
@base = ack_seq + 1
# Update receive window
@receive_window = window_update
end
def available_window
@receive_window - @send_buffer.size
end
end
Fast Retransmit reduces latency by triggering retransmission before timeout expiration. When the sender receives three duplicate ACKs (four identical ACKs total), it immediately retransmits the suspected lost packet. This pattern exploits the observation that duplicate ACKs signal either packet loss or reordering. Three duplicate ACKs strongly suggest loss rather than reordering.
Fast retransmit significantly improves performance by avoiding timeout delays. When packet 10 is lost among packets 1-20, packets 11-20 trigger duplicate ACKs for packet 9. After receiving three duplicate ACKs, the sender retransmits packet 10 without waiting for timeout. This reduces recovery time from seconds to milliseconds on typical networks.
Error Handling & Edge Cases
Packet Corruption occurs when bit errors flip packet contents during transmission. Checksums detect corruption but cannot correct it. The receiver must discard corrupted packets and rely on retransmission for recovery. Checksum algorithms trade detection strength against computation cost. Simple checksums like Internet checksum require minimal CPU but detect only certain error patterns. CRC detects all single-bit and double-bit errors plus all error bursts up to the CRC width. Cryptographic hashes like MD5 or SHA provide strong detection but consume more CPU:
require 'zlib'
class CorruptionDetector
def self.create_packet(data)
# Use CRC32 for corruption detection
checksum = Zlib.crc32(data)
[checksum, data].pack('NA*')
end
def self.verify_packet(packet)
checksum, data = packet.unpack('NA*')
computed = Zlib.crc32(data)
if computed == checksum
{ valid: true, data: data }
else
{ valid: false, error: 'Checksum mismatch' }
end
end
def self.simulate_corruption(packet, corruption_rate = 0.01)
bytes = packet.bytes
bytes.map! do |byte|
if rand < corruption_rate
byte ^ (1 << rand(8)) # Flip random bit
else
byte
end
end
bytes.pack('C*')
end
end
# Example usage showing corruption detection
data = "Important message"
packet = CorruptionDetector.create_packet(data)
# Simulate transmission with corruption
corrupted = CorruptionDetector.simulate_corruption(packet, 0.1)
result = CorruptionDetector.verify_packet(corrupted)
puts result[:valid] ? "Valid packet" : "Corrupted: #{result[:error]}"
Duplicate Packets arise when the sender retransmits a packet that actually arrived but whose ACK was lost or delayed. Sequence numbers enable the receiver to detect and discard duplicates. The receiver maintains the highest in-order sequence number received and rejects packets with sequence numbers at or below this threshold. However, wraparound in sequence numbers creates ambiguity if the window size is too large.
Consider sequence numbers using modulo 4 (sequence space = 4) with window size 3. The sender transmits packets 0, 1, 2 and receives ACK 2. It slides the window and transmits packet 3. If packet 0's ACK was delayed (not lost), the receiver might receive packet 0 from the next window cycle before receiving packet 3. The receiver cannot distinguish this "new" packet 0 from a duplicate of the old packet 0. The constraint N <= sequence_space / 2 prevents this ambiguity:
class SequenceNumberManager
def initialize(sequence_bits)
@sequence_space = 2 ** sequence_bits
@max_window = @sequence_space / 2
@next_seq = 0
@expected_seq = 0
end
def get_next_sequence
seq = @next_seq
@next_seq = (@next_seq + 1) % @sequence_space
seq
end
def is_duplicate?(seq)
# Check if sequence number is behind expected
distance = (@expected_seq - seq) % @sequence_space
distance > 0 && distance <= @max_window
end
def is_in_window?(seq)
distance = (seq - @expected_seq) % @sequence_space
distance < @max_window
end
def update_expected(seq)
@expected_seq = (seq + 1) % @sequence_space
end
end
Packet Reordering happens when packets take different network paths with varying delays. A packet sent earlier may arrive later than subsequent packets. The receiver must buffer out-of-order packets and deliver data to the application in sequence. Go-back-N handles reordering by discarding out-of-order packets, while selective repeat buffers them.
Buffer management becomes critical with reordering. The receiver allocates buffer space for the receive window but must handle cases where many packets arrive out of order. If the buffer fills, the receiver must either drop packets or implement back pressure:
class ReorderingBuffer
def initialize(window_size)
@window_size = window_size
@buffer = {}
@next_expected = 0
end
def receive_packet(seq, data)
# Check if packet is in acceptable range
offset = seq - @next_expected
if offset < 0
return { status: :duplicate, delivered: [] }
elsif offset >= @window_size
return { status: :window_exceeded, delivered: [] }
end
# Buffer the packet
@buffer[seq] = data
# Deliver consecutive packets starting from next_expected
delivered = []
while @buffer.key?(@next_expected)
delivered << @buffer.delete(@next_expected)
@next_expected += 1
end
{
status: :accepted,
delivered: delivered,
buffer_size: @buffer.size
}
end
def get_buffer_status
if @buffer.empty?
{ empty: true }
else
gaps = find_gaps
{
empty: false,
buffered_count: @buffer.size,
gaps: gaps
}
end
end
private
def find_gaps
return [] if @buffer.empty?
sorted_seqs = @buffer.keys.sort
gaps = []
expected = @next_expected
sorted_seqs.each do |seq|
gaps << [expected, seq - 1] if seq > expected
expected = seq + 1
end
gaps
end
end
Connection Failures require detecting dead connections and cleaning up state. TCP implements keepalive probes that periodically send packets to verify the peer is still responsive. Applications should set appropriate socket timeouts and handle disconnection gracefully:
class ConnectionMonitor
KEEPALIVE_INTERVAL = 5
KEEPALIVE_TIMEOUT = 15
def initialize(socket)
@socket = socket
@last_activity = Time.now
@keepalive_thread = nil
end
def start_monitoring
@keepalive_thread = Thread.new do
loop do
sleep KEEPALIVE_INTERVAL
if Time.now - @last_activity > KEEPALIVE_TIMEOUT
puts "Connection timeout detected"
@socket.close
break
end
begin
# Send keepalive probe
@socket.write("\0")
@socket.flush
rescue
puts "Connection lost"
break
end
end
end
end
def record_activity
@last_activity = Time.now
end
def stop_monitoring
@keepalive_thread&.kill
end
end
Sequence Number Wraparound creates ambiguity when sequence numbers cycle back to zero. Modern protocols use 32-bit sequence numbers (TCP) providing 4 billion unique values. At 10 Gbps, sequence numbers wrap around in approximately 3 seconds, so the protocol must ensure that packets from different cycles cannot coexist in the network. The maximum segment lifetime (MSL) bounds how long packets survive, typically 2 minutes. As long as the wraparound period exceeds 2*MSL, no ambiguity occurs.
Premature Timeouts fire before packets actually arrive, causing unnecessary retransmissions. This wastes bandwidth and can create duplicate packets. Adaptive timeout algorithms reduce premature timeouts by tracking network behavior, but extreme delay variation makes perfect timeout selection impossible. Retransmission ambiguity arises: when both the original packet and its retransmission arrive, the receiver sees duplicates and must discard one based on sequence numbers.
Reference
Protocol Comparison
| Protocol | Window Size | Sender Buffer | Receiver Buffer | Retransmission | Throughput | Complexity |
|---|---|---|---|---|---|---|
| Stop-and-Wait | 1 | 1 packet | 1 packet | Single packet | Low | Minimal |
| Go-Back-N | N packets | N packets | 1 packet | From error point | Medium | Moderate |
| Selective Repeat | N packets | N packets | N packets | Only lost packets | High | High |
Reliability Mechanisms
| Mechanism | Purpose | Implementation | Overhead |
|---|---|---|---|
| Checksums | Detect corruption | CRC, MD5, SHA | Per packet computation |
| Sequence Numbers | Detect duplicates/reordering | Counter with wraparound | Header space |
| Acknowledgments | Confirm receipt | ACK/SACK messages | Reverse traffic |
| Timeouts | Detect loss | Adaptive timer | State management |
| Flow Control | Prevent overflow | Window advertisement | Coordination delay |
TCP Socket Options
| Option | Level | Purpose | Typical Values |
|---|---|---|---|
| TCP_NODELAY | IPPROTO_TCP | Disable Nagle algorithm | 0 (off), 1 (on) |
| SO_KEEPALIVE | SOL_SOCKET | Enable keepalive probes | 0 (off), 1 (on) |
| SO_RCVBUF | SOL_SOCKET | Receive buffer size | 32768-262144 bytes |
| SO_SNDBUF | SOL_SOCKET | Send buffer size | 32768-262144 bytes |
| SO_REUSEADDR | SOL_SOCKET | Reuse address quickly | 0 (off), 1 (on) |
| TCP_KEEPIDLE | IPPROTO_TCP | Keepalive idle time | 7200 seconds |
| TCP_KEEPINTVL | IPPROTO_TCP | Keepalive interval | 75 seconds |
| TCP_KEEPCNT | IPPROTO_TCP | Keepalive probe count | 9 probes |
Timeout Calculations
| Parameter | Formula | Purpose |
|---|---|---|
| EstimatedRTT | (1-α)EstimatedRTT + αSampleRTT | Smooth RTT average |
| DevRTT | (1-β)DevRTT + βabs(SampleRTT-EstimatedRTT) | RTT variation |
| TimeoutInterval | EstimatedRTT + 4*DevRTT | Timeout value |
| Exponential Backoff | TimeoutInterval * 2^attempt | Retry timeout |
Window Size Constraints
| Protocol | Minimum Sequence Space | Formula |
|---|---|---|
| Stop-and-Wait | 2 | seq_space >= 2 |
| Go-Back-N | N + 1 | seq_space >= N + 1 |
| Selective Repeat | 2N | seq_space >= 2N |
Performance Metrics
| Metric | Formula | Factors |
|---|---|---|
| Link Utilization | (L/R) / (RTT + L/R) | Packet size, data rate, RTT |
| Effective Throughput | WindowSize * PacketSize / RTT | Window size, RTT |
| Goodput | (TotalDataDelivered - Retransmissions) / Time | Loss rate, efficiency |
| Bandwidth-Delay Product | Bandwidth * RTT | Link capacity, latency |
Error Detection Strength
| Algorithm | Bits | Detection Capability |
|---|---|---|
| Internet Checksum | 16 | Basic error detection |
| CRC-32 | 32 | All single/double bit errors, bursts up to 32 bits |
| MD5 | 128 | Cryptographically strong |
| SHA-256 | 256 | Maximum strength |
Ruby Class Quick Reference
| Class | Purpose | Key Methods |
|---|---|---|
| TCPSocket | TCP client connection | new, read, write, close |
| TCPServer | TCP server listener | new, accept, close |
| UDPSocket | UDP datagram socket | new, send, recvfrom, bind |
| Socket | Low-level socket operations | setsockopt, getsockopt |
| IO | select for multiplexing | select with timeout |
| Timeout | Timeout operations | timeout block |