Challenge

Problem

Log analysis is fundamental to monitoring application health, debugging issues, and understanding system behavior. In this drill you parse a structured log file, extract key metrics, and print a multi-section report. Along the way you practice log parsing, basic statistics (counts, averages, percentages), hourly time-series aggregation, and formatted reporting. These skills carry over to building monitoring tools, debugging production issues, and creating operational dashboards.

Difficulty: Intermediate

Instructions

  1. Read server logs from access.log
  2. Parse each line in the format: [timestamp] level request_path status response_time (response time in milliseconds)
  3. Extract and analyze:
    • Request counts by HTTP status code (200, 404, 500, etc.)
    • Average response time overall and by status code
    • Requests per hour (group by hour)
    • Top 5 slowest requests (path and response time)
    • Error rate as a percentage of all requests (4xx and 5xx statuses combined)
  4. Print comprehensive report:
    'SERVER LOG ANALYSIS'
    '=' * 50
    'Log file: access.log'
    'Total requests: X'
    'Time range: [earliest] to [latest]'
    '-' * 50
    'STATUS CODE DISTRIBUTION'
    'XXX: Y requests (Z.ZZ%)'
    '-' * 50
    'RESPONSE TIME ANALYSIS'
    'Average: X.XXms'
    'By Status Code:'
    '  XXX: Y.YYms'
    '-' * 50
    'REQUESTS PER HOUR'
    'HH:00 - X requests'
    '-' * 50
    'TOP 5 SLOWEST REQUESTS'
    '/path - XXXms'
    '-' * 50
    'ERROR ANALYSIS'
    'Error rate: X.XX%'
    '4xx errors: X requests'
    '5xx errors: Y requests'
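
The report layout above relies on two small Ruby features: string repetition for the separator lines and fixed two-decimal formatting for the millisecond values. A minimal sketch follows; the helper names `report_header` and `format_ms` are hypothetical and not part of the challenge's starter code.

```ruby
# Hypothetical helper: builds the opening lines of the report as an array.
def report_header(filename, total, earliest, latest)
  [
    'SERVER LOG ANALYSIS',
    '=' * 50,                                  # String#* repeats '=' 50 times
    "Log file: #{filename}",
    "Total requests: #{total}",
    "Time range: #{earliest} to #{latest}",
    '-' * 50
  ]
end

# Hypothetical helper: renders a number with exactly two decimals plus 'ms'.
def format_ms(value)
  format('%.2fms', value)
end

puts report_header('access.log', 6, '2024-01-15T10:00:15Z', '2024-01-15T11:00:05Z')
puts format_ms(79.3333)
```

Building the lines as an array and printing them in one `puts` call keeps the layout easy to compare against the expected output.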

Hints

Hint 1

Time.parse(string) converts a timestamp string to a Time object (add require 'time' first)
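
A short sketch of this hint, assuming the timestamps are ISO 8601 strings like the ones shown in the expected output. The helper name `normalize_timestamp` is hypothetical.

```ruby
require 'time' # Time.parse and Time#iso8601 come from the 'time' standard library

# Hypothetical helper: parse a timestamp and render it back in UTC ISO 8601 form.
def normalize_timestamp(string)
  Time.parse(string).utc.iso8601
end

earliest = Time.parse('2024-01-15T10:00:15Z')
latest   = Time.parse('2024-01-15T11:00:05Z')

puts latest > earliest                 # Time objects are comparable
puts [latest, earliest].min.utc.iso8601
puts normalize_timestamp('2024-01-15T10:00:15Z')
```

Because Time objects are comparable, `min` and `max` over the parsed timestamps give the two ends of the time range.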

Hint 2

Use a regex to extract the log components (timestamp, path, status, response time): /\[(.*?)\] \w+ (\S+) (\d+) (\d+)/
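
Here is one way the regex could be applied to a single line. The sample line is an assumption built from the stated format, since the actual contents of access.log are not shown, and `parse_line` is a hypothetical helper name.

```ruby
# Hypothetical helper: returns a hash of fields, or nil if the line does not match.
def parse_line(line)
  match = /\[(.*?)\] \w+ (\S+) (\d+) (\d+)/.match(line)
  return nil unless match

  {
    timestamp: match[1],            # text between the square brackets
    path: match[2],                 # request path (the level is matched but not captured)
    status: match[3].to_i,          # HTTP status code as an Integer
    response_time: match[4].to_i    # response time in milliseconds
  }
end

# Assumed sample line, not taken from the real log file.
p parse_line('[2024-01-15T10:00:15Z] INFO /api/users 200 45')
p parse_line('not a log line')
```

Returning nil for non-matching lines lets the caller skip blank or malformed entries instead of raising.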

Hint 3

Hash.new(0) creates a hash whose default value is 0, which is convenient for counters
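
A minimal counting sketch for this hint. The status list is illustrative, and `count_by_status` is a hypothetical helper name.

```ruby
# Hypothetical helper: tallies how often each status code appears.
def count_by_status(statuses)
  counts = Hash.new(0)                           # missing keys read as 0
  statuses.each { |status| counts[status] += 1 } # so += 1 works on first sight
  counts
end

counts = count_by_status([200, 200, 404, 500, 200, 200])
p counts
p counts[301] # reading a missing key returns 0 without storing it
```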

Hint 4

Hash.new { |h, k| h[k] = [] } creates a hash in which each missing key gets its own new empty array
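
The block form matters because Hash.new([]) would hand out the same array object for every missing key. Below is a sketch of grouping response times by status and averaging them; `average_time_by_status` is a hypothetical helper, and the input pairs are illustrative values.

```ruby
# Hypothetical helper: entries is an array of [status, response_time_ms] pairs.
def average_time_by_status(entries)
  times = Hash.new { |hash, key| hash[key] = [] } # fresh array per new key
  entries.each { |status, ms| times[status] << ms }

  # Replace each list of times with its mean, using float division.
  times.transform_values { |list| list.sum.to_f / list.size }
end

entries = [[200, 156], [200, 45], [200, 32], [200, 28], [404, 12], [500, 203]]
average_time_by_status(entries).each do |status, avg|
  puts format('  %d: %.2fms', status, avg)
end
```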

Hint 5

time.strftime('%H:00') formats a Time as its hour followed by ':00', for example '10:00'
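
Hourly bucketing can be sketched by using the formatted hour as a hash key. `requests_per_hour` is a hypothetical helper, and the timestamps are illustrative. Note that this groups by hour of day only, so it assumes the log covers a single day.

```ruby
require 'time'

# Hypothetical helper: counts requests per hour label such as '10:00'.
def requests_per_hour(timestamps)
  counts = Hash.new(0)
  timestamps.each do |stamp|
    # .utc keeps the label independent of the machine's local time zone.
    counts[Time.parse(stamp).utc.strftime('%H:00')] += 1
  end
  counts.sort.to_h # order the buckets by hour label
end

stamps = ['2024-01-15T11:00:05Z', '2024-01-15T10:00:15Z', '2024-01-15T10:30:00Z']
requests_per_hour(stamps).each do |hour, count|
  puts "#{hour} - #{count} requests"
end
```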

Hint 6

Sort descending: array.sort_by { |item| -item[:key] }
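
Combined with first(n), the descending sort gives a top-N list. A sketch under the assumption that each request is a hash with :path and :response_time keys (a hypothetical shape); `slowest` is a hypothetical helper name.

```ruby
# Hypothetical helper: returns up to `limit` requests, slowest first.
def slowest(requests, limit = 5)
  requests.sort_by { |req| -req[:response_time] }.first(limit)
end

requests = [
  { path: '/home', response_time: 25 },
  { path: '/api',  response_time: 150 },
  { path: '/page', response_time: 10 },
  { path: '/api',  response_time: 100 }
]

slowest(requests).each do |req|
  puts "#{req[:path]} - #{req[:response_time]}ms"
end
```

When there are fewer than five requests, first(5) returns all of them. sort_by is not guaranteed to be stable, so the relative order of requests with equal response times is unspecified.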

Hint 7

Calculate percentage: (count.to_f / total * 100)
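
The to_f call matters because Integer division truncates in Ruby: 1 / 6 * 100 evaluates to 0. A minimal sketch follows; `percentage` is a hypothetical helper, and the guard for an empty log is an added assumption rather than a stated requirement.

```ruby
# Hypothetical helper: share of `count` in `total`, as a Float percentage.
def percentage(count, total)
  return 0.0 if total.zero? # assumption: treat an empty log as 0%

  count.to_f / total * 100
end

puts 1 / 6 * 100                          # integer division truncates to 0
puts format('%.2f%%', percentage(1, 6))   # '%%' prints a literal percent sign
```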

Hint 8

Check status ranges: status >= 400 && status < 500
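
The same comparison can be used with Array#count to tally client and server errors. A sketch with a hypothetical helper `error_counts` and an illustrative status list:

```ruby
# Hypothetical helper: counts 4xx (client) and 5xx (server) statuses.
def error_counts(statuses)
  client = statuses.count { |status| status >= 400 && status < 500 }
  server = statuses.count { |status| status >= 500 && status < 600 }
  { client: client, server: server }
end

counts = error_counts([200, 404, 500, 500])
puts "4xx errors: #{counts[:client]} requests"
puts "5xx errors: #{counts[:server]} requests"
```

An equivalent check is status.between?(400, 499), which reads more directly as a range test.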

Provided Files (Read-only)

1. Mixed status codes and response times

Input:
analyze_logs('access.log')
Expected Output:
SERVER LOG ANALYSIS
==================================================
Log file: access.log
Total requests: 6
Time range: 2024-01-15T10:00:15Z to 2024-01-15T11:00:05Z
--------------------------------------------------
STATUS CODE DISTRIBUTION
200: 4 requests (66.67%)
404: 1 requests (16.67%)
500: 1 requests (16.67%)
--------------------------------------------------
RESPONSE TIME ANALYSIS
Average: 79.33ms
By Status Code:
  200: 65.25ms
  404: 12.00ms
  500: 203.00ms
--------------------------------------------------
REQUESTS PER HOUR
10:00 - 5 requests
11:00 - 1 requests
--------------------------------------------------
TOP 5 SLOWEST REQUESTS
/api/posts - 203ms
/api/users - 156ms
/api/users - 45ms
/api/posts - 32ms
/api/posts - 28ms
--------------------------------------------------
ERROR ANALYSIS
Error rate: 33.33%
4xx errors: 1 requests
5xx errors: 1 requests

2. All successful requests

Input:
analyze_logs('access.log')
Expected Output:
SERVER LOG ANALYSIS
==================================================
Log file: access.log
Total requests: 3
Time range: 2024-02-01T09:00:00Z to 2024-02-01T09:00:10Z
--------------------------------------------------
STATUS CODE DISTRIBUTION
200: 3 requests (100.00%)
--------------------------------------------------
RESPONSE TIME ANALYSIS
Average: 17.67ms
By Status Code:
  200: 17.67ms
--------------------------------------------------
REQUESTS PER HOUR
09:00 - 3 requests
--------------------------------------------------
TOP 5 SLOWEST REQUESTS
/home - 20ms
/contact - 18ms
/about - 15ms
--------------------------------------------------
ERROR ANALYSIS
Error rate: 0.00%
4xx errors: 0 requests
5xx errors: 0 requests

3. High error rate

Input:
analyze_logs('access.log')
Expected Output:
SERVER LOG ANALYSIS
==================================================
Log file: access.log
Total requests: 4
Time range: 2024-03-01T14:00:00Z to 2024-03-01T14:00:15Z
--------------------------------------------------
STATUS CODE DISTRIBUTION
200: 1 requests (25.00%)
404: 1 requests (25.00%)
500: 2 requests (50.00%)
--------------------------------------------------
RESPONSE TIME ANALYSIS
Average: 71.25ms
By Status Code:
  200: 25.00ms
  404: 10.00ms
  500: 125.00ms
--------------------------------------------------
REQUESTS PER HOUR
14:00 - 4 requests
--------------------------------------------------
TOP 5 SLOWEST REQUESTS
/api - 150ms
/api - 100ms
/home - 25ms
/page - 10ms
--------------------------------------------------
ERROR ANALYSIS
Error rate: 75.00%
4xx errors: 1 requests
5xx errors: 2 requests
+ 2 hidden test cases