Overview
Code metrics provide objective, numerical measurements of software code characteristics. These measurements transform subjective assessments of code quality into quantifiable data that teams can track, analyze, and improve over time. Metrics range from simple counts like lines of code to complex calculations like cyclomatic complexity and maintainability indices.
The practice of measuring code emerged from software engineering research in the 1970s when organizations needed systematic approaches to evaluate software quality. Early metrics focused on size and complexity measurements. Modern code metrics encompass structural properties, test coverage, duplication, documentation completeness, and technical debt indicators.
Code metrics serve multiple purposes in software development. They identify problem areas requiring refactoring, track quality trends across releases, enforce coding standards during code review, inform architectural decisions, and provide data for project planning and estimation. Metrics become particularly valuable when measured consistently over time, revealing patterns and trends that spot-checks cannot detect.
Different metrics measure different aspects of code quality. Complexity metrics assess how difficult code is to understand and maintain. Coverage metrics measure test thoroughness. Duplication metrics identify redundant code. Coupling and cohesion metrics evaluate architectural quality. No single metric provides complete insight—teams use combinations of metrics to build comprehensive quality pictures.
# Simple metric: counting lines in a file
def calculate_metrics(file_path)
  lines = File.readlines(file_path)
  blank_lines = lines.count { |line| line.strip.empty? }
  comment_lines = lines.count { |line| line.strip.start_with?('#') }
  {
    total_lines: lines.count,
    code_lines: lines.count - blank_lines - comment_lines,
    comment_lines: comment_lines
  }
end
Code metrics exist at multiple granularity levels. Method-level metrics measure individual function complexity and size. Class-level metrics assess object design quality through coupling and cohesion. Module-level metrics evaluate component organization. System-level metrics aggregate data across entire codebases. Each level provides different insights appropriate for different decisions.
Key Principles
Quantification transforms subjective assessment into objective measurement. Rather than describing code as "complex" or "maintainable," metrics assign numerical values that enable comparison and tracking. This quantification supports data-driven decision making and removes ambiguity from quality discussions.
Different metrics measure orthogonal quality dimensions. Cyclomatic complexity measures branching logic. Lines of code measures size. Code coverage measures test thoroughness. Coupling measures dependencies. Each metric captures a distinct aspect of code quality. High scores in one dimension do not guarantee quality in others—a small function can have high complexity; well-tested code can have poor design.
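The size/complexity mismatch is easy to demonstrate. The following hypothetical shipping method is only a few lines long, yet every `&&`, `||`, ternary, and guard clause adds an independent path:

```ruby
# Tiny in lines of code, high in cyclomatic complexity: half a dozen
# decision points packed into two lines of logic (illustrative example).
def shipping_rate(order)
  return 0 if order[:total] > 100 || (order[:member] && order[:items].to_i <= 3)
  order[:express] ? (order[:fragile] ? 25 : 15) : (order[:fragile] ? 12 : 5)
end
```

A method-length cop would pass this method without comment while a complexity cop would flag it.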
Thresholds distinguish acceptable from problematic code. Most metrics become actionable through threshold values that trigger attention or enforcement. Cyclomatic complexity above 10 suggests refactoring opportunities. Test coverage below 80% indicates insufficient testing. Duplication above 5% signals maintainability risks. These thresholds vary by team, domain, and context but provide clear quality gates.
Trends matter more than absolute values. A single metric snapshot provides limited insight. Tracking metrics over time reveals whether quality improves, degrades, or remains stable. Increasing complexity over sprints signals accumulating technical debt. Declining coverage indicates growing test gaps. Stable metrics suggest consistent quality practices.
Context determines metric interpretation. A cyclomatic complexity of 15 might be acceptable in complex domain logic but problematic in utility functions. 60% test coverage might suffice for proof-of-concept code but be inadequate for critical payment processing. Teams must interpret metrics within the context of risk, criticality, and development phase.
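RuboCop supports this kind of contextual interpretation directly: thresholds can be relaxed or excluded per path, so strict defaults coexist with looser limits for domain-heavy code. The paths below are illustrative.

```yaml
# .rubocop.yml — strict default, with an exemption for complex domain logic
Metrics/CyclomaticComplexity:
  Max: 8
  Exclude:
    - 'app/services/pricing/**/*' # reviewed case by case instead
```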
Metrics guide investigation, not judgment. High complexity or low coverage identifies code requiring attention but does not automatically indicate bad code. Complex domain logic sometimes requires complex implementations. Legacy code might lack tests while functioning reliably. Metrics highlight where to investigate deeper, not what to condemn immediately.
Aggregation obscures important details. System-wide average complexity of 8 might hide individual functions with complexity of 40. Overall test coverage of 85% might mask critical modules with 20% coverage. Metrics must be examined at appropriate granularity to reveal actionable insights. Averages provide overview; distributions reveal problems.
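A few lines of plain Ruby make the point: two sets of method complexity scores (made up for illustration) share the same average while only one hides a dangerous outlier.

```ruby
# Same average, very different risk profiles
def summarize(scores)
  sorted = scores.sort
  {
    average: scores.sum.to_f / scores.size,
    max: sorted.last,
    p90: sorted[(sorted.size * 0.9).ceil - 1] # 90th-percentile score
  }
end

uniform = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8]
skewed  = [2, 2, 2, 2, 2, 2, 2, 2, 24, 40]

summarize(uniform) # average 8.0, max 8
summarize(skewed)  # average 8.0, max 40 — invisible in the average
```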
Measurement changes behavior. Once teams track metrics, developers optimize for those metrics. This can improve quality when metrics align with quality goals. However, poorly chosen metrics drive dysfunctional behavior—maximizing test coverage through trivial tests, reducing complexity through inappropriate decomposition, or gaming measurements without improving actual quality.
Ruby Implementation
Ruby's dynamic nature and metaprogramming capabilities enable sophisticated code analysis tools. Multiple gems provide metric calculation, analysis, and reporting. These tools parse Ruby source code, build abstract syntax trees, and compute various measurements.
SimpleCov measures test coverage by tracking which lines execute during test runs. It integrates with test frameworks through Ruby's Coverage module, producing detailed reports showing covered and uncovered code.
# spec/spec_helper.rb
require 'simplecov'
SimpleCov.start do
  add_filter '/spec/'
  add_filter '/vendor/'
  add_group 'Models', 'app/models'
  add_group 'Controllers', 'app/controllers'
  add_group 'Services', 'app/services'
  minimum_coverage 80
  refuse_coverage_drop
end
SimpleCov generates coverage data after test execution, calculating line coverage percentages for each file and the overall project. The minimum_coverage setting enforces quality gates, failing builds when coverage drops below thresholds.
# Custom coverage formatter
class MetricsFormatter
  def format(result)
    result.files.each do |file|
      coverage_percent = file.covered_percent.round(2)
      missed_lines = file.missed_lines.count
      puts "#{file.filename}: #{coverage_percent}% (#{missed_lines} missed lines)"
    end
  end
end
SimpleCov.formatter = MetricsFormatter
Flog calculates complexity scores based on the ABC metric (Assignments, Branches, Calls). It assigns points to different language constructs, producing an aggregate complexity score per method.
# Using Flog programmatically
require 'flog'

flogger = Flog.new
flogger.flog('app/models/order.rb')
flogger.calculate_total_scores

flogger.each_by_score do |class_method, score|
  puts "#{class_method}: #{score.round(2)}" if score > 10
end
Flog's scoring considers assignment operations, branches, method calls, and other complexity factors. Higher scores indicate more complex code requiring closer review or refactoring.
RuboCop enforces style guidelines and detects potential problems through static analysis. While primarily a linter, it calculates metrics like method length, class length, and cyclomatic complexity.
# .rubocop.yml
Metrics/MethodLength:
  Max: 15
  CountComments: false
Metrics/ClassLength:
  Max: 100
Metrics/CyclomaticComplexity:
  Max: 8
Metrics/PerceivedComplexity:
  Max: 10
RuboCop's metric cops enforce length and complexity limits, failing checks when code exceeds configured thresholds. This integration into continuous integration pipelines prevents problematic code from merging.
# Running RuboCop programmatically, restricted to the metric cops
require 'rubocop'
require 'json'

RuboCop::CLI.new.run(['--only', 'Metrics', '--format', 'json', '--out', 'rubocop.json', 'app'])

report = JSON.parse(File.read('rubocop.json'))
report['files'].each do |file|
  file['offenses'].each do |offense|
    puts "#{file['path']}:#{offense['location']['line']} - #{offense['cop_name']}: #{offense['message']}"
  end
end
Reek detects code smells—patterns indicating potential design problems. It analyzes code structure for repeated conditionals, long parameter lists, feature envy, and other smell categories.
# Using Reek programmatically
require 'reek'
require 'pathname'

# Examiner treats a bare String as source code, so pass a Pathname for a file
examiner = Reek::Examiner.new(Pathname.new('app/services/payment_processor.rb'))
examiner.smells.each do |smell|
  puts "#{smell.smell_type}: #{smell.message}"
  puts "  Lines: #{smell.lines.join(', ')}"
  puts "  Context: #{smell.context}"
end
Flay detects duplicate code by analyzing structural similarity. Unlike simple text comparison, Flay understands Ruby syntax and identifies functionally similar code even when variable names or formatting differs.
# Detecting duplication with Flay
require 'flay'

flay = Flay.new(Flay.default_options.merge(fuzzy: false, liberal: false))
flay.process(*Dir.glob('app/**/*.rb'))
flay.analyze

flay.masses.each do |structural_hash, mass|
  next if mass < 50 # Skip low-scoring duplications
  puts "Duplication score: #{mass}"
  flay.hashes[structural_hash].each do |node|
    puts "  #{node.file}:#{node.line}"
  end
end
MetricFu aggregates multiple metrics tools into unified reports. It runs Flog, Flay, Reek, RuboCop, and other analyzers, combining results into comprehensive dashboards.
# metric_fu configuration
MetricFu::Configuration.run do |config|
  config.configure_metrics do |metrics|
    metrics.enabled = [:flog, :flay, :reek, :roodi]
  end
  config.configure_metric(:flog) do |flog|
    flog.continue_on_failure = true
    flog.dirs_to_flog = ['app', 'lib']
  end
  config.configure_metric(:flay) do |flay|
    flay.minimum_score = 50
    flay.dirs_to_flay = ['app', 'lib']
  end
end
Tools & Ecosystem
The Ruby ecosystem provides extensive tooling for code metrics across different quality dimensions. Tools range from focused single-metric analyzers to comprehensive quality platforms.
Coverage Analysis Tools track test execution to measure how thoroughly tests exercise code. SimpleCov dominates Ruby coverage analysis with widespread adoption and framework integration. Deep-Cover provides more detailed coverage analysis including branch coverage and execution counts per line. These tools integrate into test suites transparently, requiring minimal configuration.
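Both tools build on Ruby's stdlib Coverage module, which can also be used directly. A minimal sketch, measuring a small generated file so the example is self-contained, including branch data:

```ruby
require 'coverage'
require 'tempfile'

# A small file to measure (stand-in for real application code)
demo = Tempfile.new(['discount', '.rb'])
demo.write(<<~RUBY)
  def discount(total)
    total > 100 ? total * 0.9 : total
  end
  discount(150)
RUBY
demo.close

Coverage.start(lines: true, branches: true) # must start before the code loads
load demo.path
data = Coverage.result[demo.path]           # stops coverage and returns results

executed = data[:lines].compact.count { |hits| hits > 0 }
puts "#{executed} lines executed, #{data[:branches].size} branch sites"
```

The `:lines` array holds a hit count per line (`nil` for non-code lines); `:branches` records each conditional site with per-arm counts, which is the data branch-coverage reports are built from.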
Complexity Analyzers calculate various complexity metrics. Flog computes ABC complexity scores. RuboCop's metric cops measure cyclomatic and perceived complexity. Saikuro generates complexity reports with HTML visualization. Each tool uses slightly different algorithms and thresholds, so teams often use multiple tools to cross-validate complexity assessments.
Code Smell Detectors identify design problems and anti-patterns. Reek remains the primary Ruby smell detector, analyzing code for feature envy, long parameter lists, duplicate code, and other smells. RuboCop's lint cops detect some smells alongside style violations. Flay specializes in duplication detection through structural analysis.
Static Analysis Platforms combine multiple metrics into unified reports. MetricFu aggregates coverage, complexity, duplication, and smell detection. Rubycritic combines Flog, Reek, and Churn to generate code quality grades with trend analysis. Code Climate provides commercial hosted analysis integrating numerous metrics with GitHub workflows.
Churn Analysis Tools measure code change frequency. Churn detects files frequently modified, indicating instability or hotspots. MetricFu includes churn analysis. Git-based scripts calculate churn from repository history. High churn combined with high complexity identifies the riskiest code requiring attention.
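A git-based churn script needs nothing beyond the git CLI and the stdlib; here the parsing is separated into a method so it can be exercised without a repository:

```ruby
# Count how often each Ruby file appears in recent commit history
def churn_counts(git_log_output)
  counts = Hash.new(0)
  git_log_output.each_line do |line|
    file = line.strip
    counts[file] += 1 if file.end_with?('.rb')
  end
  counts
end

# `--name-only --pretty=format:` prints each touched path on its own line
log = `git log --since="90 days ago" --name-only --pretty=format: 2>/dev/null`
churn_counts(log).sort_by { |_, count| -count }.first(10).each do |file, count|
  puts format('%4d  %s', count, file)
end
```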
Maintainability Calculators generate composite scores representing overall code health. Rubycritic calculates letter grades (A-F) based on complexity, smells, and churn. These scores simplify communication with non-technical stakeholders but obscure underlying metric details.
Continuous Integration Integration embeds metrics into development workflows. RuboCop runs in pre-commit hooks and CI pipelines, blocking merges that violate standards. SimpleCov fails builds when coverage drops. Code Climate automatically comments on pull requests with metric changes. Pronto provides incremental analysis, reviewing only changed files.
IDE Integration surfaces metrics during development. RubyMine shows complexity indicators inline with code. VS Code extensions display coverage and smell information. This real-time feedback helps developers address issues before committing code.
Custom Metric Tools address domain-specific needs. Teams build custom analyzers using Ruby's parser library to measure application-specific metrics like security pattern compliance or architectural rule adherence.
# Custom metric collector
require 'parser/current'

class ApiEndpointMetrics
  def initialize(file_path)
    @file_path = file_path
    @endpoints = []
  end

  def analyze
    code = File.read(@file_path)
    ast = Parser::CurrentRuby.parse(code)
    process_node(ast)
    return { total_endpoints: 0 } if @endpoints.empty?
    {
      total_endpoints: @endpoints.count,
      endpoints_with_auth: @endpoints.count { |e| e[:has_auth] },
      endpoints_with_validation: @endpoints.count { |e| e[:has_validation] },
      average_complexity: @endpoints.sum { |e| e[:complexity] } / @endpoints.count.to_f
    }
  end

  private

  def process_node(node)
    return unless node.is_a?(Parser::AST::Node)
    @endpoints << analyze_endpoint(node) if endpoint_definition?(node)
    node.children.each { |child| process_node(child) }
  end

  # endpoint_definition? and analyze_endpoint are application-specific:
  # they match whatever routing DSL the codebase uses.
end
Practical Examples
Establishing Coverage Baselines requires measuring current coverage before enforcing thresholds. Teams inherit legacy codebases with unknown coverage and must establish realistic starting points.
# Coverage audit script
require 'simplecov'
require 'json'

SimpleCov.start do
  add_filter '/spec/'
  track_files 'app/**/*.rb'
end

# Run full test suite
require_relative '../spec/spec_helper'

RSpec.configure do |config|
  config.after(:suite) do
    result = SimpleCov.result
    # Group files by top-level directory relative to the project root
    coverage_by_directory = result.files.group_by do |f|
      f.filename.delete_prefix("#{SimpleCov.root}/").split('/').first
    end
    report = coverage_by_directory.transform_values do |files|
      {
        average_coverage: files.sum(&:covered_percent) / files.count,
        total_files: files.count,
        fully_covered: files.count { |f| f.covered_percent == 100 },
        poorly_covered: files.count { |f| f.covered_percent < 50 }
      }
    end
    File.write('coverage_baseline.json', JSON.pretty_generate(report))
  end
end
This audit generates baseline data showing which components have acceptable coverage and which require improvement. Teams use this data to set incremental improvement goals rather than arbitrary universal thresholds.
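One way to derive those incremental goals from the baseline file: nudge each directory toward a modest next target rather than a universal number. The five-point step and 90% cap below are an assumed policy, not SimpleCov features.

```ruby
require 'json'

# Each directory's next goal: current average plus five points, capped at 90
def next_targets(baseline)
  baseline.transform_values do |stats|
    current = stats['average_coverage'].to_f
    { 'current' => current.round(1), 'target' => [current + 5, 90.0].min.round(1) }
  end
end

if File.exist?('coverage_baseline.json')
  baseline = JSON.parse(File.read('coverage_baseline.json'))
  next_targets(baseline).each do |dir, goal|
    puts "#{dir}: #{goal['current']}% now, aim for #{goal['target']}%"
  end
end
```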
Tracking Complexity Growth identifies when refactoring becomes necessary. Regular complexity measurement reveals gradual degradation before it becomes crisis.
# Complexity trend tracker
require 'flog'
require 'yaml'
require 'date'
require 'time'

class ComplexityTracker
  THRESHOLD = 20

  def self.track(directory, output_file)
    flogger = Flog.new(continue: true)
    flogger.flog(*Dir.glob("#{directory}/**/*.rb"))
    flogger.calculate_total_scores

    results = {}
    flogger.totals.each do |class_method, score|
      results[class_method] = {
        score: score.round(2),
        file: flogger.method_locations[class_method],
        timestamp: Time.now.iso8601,
        exceeds_threshold: score > THRESHOLD
      }
    end

    # Append to historical log (unsafe_load_file because the log contains symbols)
    history = File.exist?(output_file) ? YAML.unsafe_load_file(output_file) : []
    history << { date: Date.today.to_s, metrics: results }
    File.write(output_file, YAML.dump(history))

    # Report current violations and fail the build if any exist
    violations = results.select { |_, data| data[:exceeds_threshold] }
    if violations.any?
      puts "#{violations.count} methods exceed complexity threshold:"
      violations.each do |method, data|
        puts "  #{method}: #{data[:score]} (#{data[:file]})"
      end
      exit 1
    end
  end
end

ComplexityTracker.track('app', 'complexity_history.yml')
Running this script regularly (daily in CI) builds historical complexity data. Teams spot upward trends before complexity becomes unmanageable. The script also enforces thresholds, failing builds when complexity exceeds limits.
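Spotting an upward trend can itself be automated. A minimal sketch: given per-run total scores extracted from the history file, flag three consecutive increases (the window size is an arbitrary choice).

```ruby
# True when the last four data points rise monotonically
def rising_trend?(totals)
  return false if totals.size < 4
  totals.last(4).each_cons(2).all? { |earlier, later| later > earlier }
end

totals = [310.2, 315.8, 322.4, 330.1] # e.g. summed Flog scores per run
warn 'Complexity has risen for three consecutive runs' if rising_trend?(totals)
```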
Identifying High-Risk Code combines multiple metrics to find code requiring immediate attention. Code that is complex, frequently changed, and poorly tested represents maximum risk.
# Risk assessment combining metrics
require 'flog'
require 'git'

class RiskAnalyzer
  def initialize(repo_path, coverage_data)
    @repo = Git.open(repo_path)
    @coverage = coverage_data
  end

  def analyze_file(file_path)
    # Calculate complexity
    flogger = Flog.new
    flogger.flog(file_path)
    flogger.calculate_total_scores
    complexity = flogger.total_score

    # Calculate churn: commits touching this file in the last 90 days
    commits = @repo.log(10_000).since('90 days ago').path(file_path).count

    # Get coverage
    coverage = @coverage[file_path] || 0

    # Risk score formula (weights are arbitrary starting points to tune)
    risk_score = (complexity * 0.3) + (commits * 2) + ((100 - coverage) * 0.5)

    {
      file: file_path,
      complexity: complexity.round(2),
      churn: commits,
      coverage: coverage.round(2),
      risk_score: risk_score.round(2)
    }
  end

  def highest_risk_files(count = 10)
    Dir.glob('app/**/*.rb')
       .map { |f| analyze_file(f) }
       .sort_by { |r| -r[:risk_score] }
       .take(count)
  end
end
This analysis identifies the most problematic code for prioritized refactoring. High-risk files receive additional review scrutiny and test coverage improvements.
Enforcing Metric Standards in Code Review automates quality gates during pull request workflows. Automated checks prevent problematic code from merging while providing feedback to authors.
# CI script for PR metric checks
require 'json'

class PRMetricChecker
  def initialize(changed_files)
    @changed_files = changed_files
    @violations = []
  end

  def check
    check_rubocop_metrics
    check_coverage_changes
    if @violations.any?
      puts 'Metric violations detected:'
      @violations.each do |violation|
        puts "  [#{violation[:severity]}] #{violation[:message]}"
      end
      exit 1
    else
      puts 'All metric checks passed'
    end
  end

  private

  def check_rubocop_metrics
    return if @changed_files.empty?
    report = JSON.parse(`rubocop --only Metrics --format json --force-exclusion #{@changed_files.join(' ')}`)
    report['files'].each do |file|
      file['offenses'].each do |offense|
        @violations << {
          severity: offense['severity'],
          message: "#{file['path']}:#{offense['location']['line']} - #{offense['message']}"
        }
      end
    end
  end

  def check_coverage_changes
    # SimpleCov writes coverage/.last_run.json after the test suite finishes
    current_coverage = JSON.parse(File.read('coverage/.last_run.json'))['result'].values.first.to_f
    baseline_coverage = JSON.parse(File.read('.coverage_baseline.json'))['total']
    if current_coverage < baseline_coverage - 1
      @violations << {
        severity: 'error',
        message: "Coverage decreased from #{baseline_coverage}% to #{current_coverage}%"
      }
    end
  end
end

changed_files = `git diff --name-only origin/main`.split("\n").select { |f| f.end_with?('.rb') }
PRMetricChecker.new(changed_files).check
This script runs in CI, checking only modified files. It fails builds that introduce metric violations or decrease coverage, maintaining quality standards without blocking all development.
Common Pitfalls
Optimizing metrics instead of quality occurs when developers game measurements without improving actual code quality. Splitting complex methods into many small methods might reduce complexity metrics while making code harder to follow. Adding trivial tests increases coverage without improving test quality. Metrics measure proxies for quality, not quality itself—improving the proxy does not guarantee improving the underlying quality.
Applying universal thresholds without context treats all code identically regardless of risk or criticality. Payment processing code requires higher test coverage than logging utilities. Complex domain logic justifies higher complexity than simple CRUD operations. One-size-fits-all thresholds either block legitimate code or allow problematic code through.
Ignoring metric combinations examines metrics in isolation rather than holistically. A file with 95% test coverage might have poor tests that only execute code without asserting behavior. Low complexity might indicate over-decomposition or shallow functionality. High complexity combined with low test coverage and high churn indicates critical risk; any single metric alone provides incomplete information.
Measuring without acting collects metric data but never uses it for improvement. Teams generate reports, post dashboards, track trends, but never address identified problems. Metrics consume effort without providing value when teams lack processes to act on findings. Effective measurement includes defined responses to metric violations.
Chasing perfect scores pursues 100% coverage or zero complexity at the expense of pragmatism. Perfect coverage requires testing trivial code, vendored libraries, and generated files—effort better spent on meaningful tests. Eliminating all complexity might necessitate awkward abstractions that obscure logic. Diminishing returns set in well before perfection.
Misinterpreting statistical metrics treats aggregates as representative when distribution matters more. Average complexity of 8 seems acceptable but might hide methods with complexity of 40. Median, maximum, and percentile distributions reveal problems that averages obscure. Box plots and histograms provide better insight than single summary statistics.
Comparing incomparable metrics evaluates projects or teams using raw metric values without accounting for context. A legacy Rails application reasonably has different metrics than a greenfield microservice. Comparing test coverage percentages between a stateless API and a stateful workflow engine ignores fundamental differences in testability. Metrics enable comparison only within similar contexts.
Trusting metric accuracy blindly assumes tools measure correctly without verification. Coverage tools miss branches in complex conditionals. Complexity calculators use different algorithms producing different scores. Duplication detectors generate false positives from boilerplate code. Manual inspection of flagged code separates legitimate concerns from tool limitations.
# Example of metric gaming through method extraction
# Original: high complexity, but the logic sits in one place
def process_order(order)
  if order.amount > 1000 && order.customer.premium? && order.items.all?(&:in_stock?)
    apply_premium_discount(order)
    priority_queue(order)
  elsif order.amount > 1000 && order.customer.premium?
    apply_premium_discount(order)
  elsif order.items.all?(&:in_stock?)
    standard_queue(order)
  end
end

# Refactored: lower complexity per method, but the logic is fragmented
def process_order(order)
  handle_premium_in_stock(order) if premium_in_stock?(order)
  handle_premium_no_stock(order) if premium_no_stock?(order)
  handle_standard_in_stock(order) if standard_in_stock?(order)
end

def premium_in_stock?(order)
  order.amount > 1000 && order.customer.premium? && order.items.all?(&:in_stock?)
end

def premium_no_stock?(order)
  order.amount > 1000 && order.customer.premium? && !order.items.all?(&:in_stock?)
end

def standard_in_stock?(order)
  !(order.amount > 1000 && order.customer.premium?) && order.items.all?(&:in_stock?)
end
This refactoring reduces cyclomatic complexity through method extraction but fragments logic across many methods. The complexity still exists but is now distributed and harder to understand. Metrics improved while maintainability decreased.
Ignoring temporal aspects measures code at single points without tracking changes over time. A high-complexity legacy module might be stable and well-understood, requiring no attention. A recently added module with moderate complexity but rapid growth deserves scrutiny. Change velocity and trend direction matter as much as current values.
Focusing on code metrics exclusively neglects broader quality indicators. Code metrics measure implementation quality but ignore design quality, documentation completeness, operational stability, or user satisfaction. Teams need balanced quality frameworks covering code structure, system architecture, operational metrics, and business outcomes.
Implementation Approaches
Incremental Adoption introduces metrics gradually rather than enforcing comprehensive measurement immediately. Teams start with one or two high-value metrics, establish processes around them, then expand. Initial focus on test coverage provides immediate feedback and visible improvement. Once coverage tracking becomes routine, adding complexity or duplication measurement builds on existing practices.
Incremental adoption allows teams to learn metric interpretation before committing to enforcement. Early metrics inform threshold selection for later metrics. Teams discover which metrics provide value and which create noise in their specific context. Gradual rollout prevents metric fatigue from overwhelming simultaneous changes.
Threshold Ratcheting progressively tightens quality standards as code improves. Rather than enforcing ideal thresholds immediately, teams set achievable initial thresholds based on current state, then gradually increase requirements. A codebase with 40% test coverage starts with a 45% minimum threshold. After reaching 45%, the threshold increases to 50%. This approach provides steady improvement without blocking development.
Ratcheting works in CI pipelines through configuration updates after achieving current thresholds. Teams track threshold progression over time, celebrating incremental improvements. The approach accommodates legacy code—old code remains unchanged while new code meets higher standards. Eventually, refactoring brings legacy code up to current standards.
Metric-Driven Code Review incorporates metrics into pull request evaluation. Automated tools comment on pull requests with metric data for changed files. Reviewers use this data to guide review focus—high-complexity changes receive detailed scrutiny. Metric changes appear alongside code changes, making quality impact visible during review.
This approach surfaces metrics when developers can most easily address issues. Catching complexity increases during review costs less than fixing them later. Reviewers might request refactoring or additional tests based on metric data. The process becomes educational, helping developers internalize quality standards.
Continuous Monitoring tracks metrics in production systems, not just during development. Monitoring dashboards display current metrics and historical trends. Teams review metrics during planning to allocate refactoring time. Sudden metric changes trigger investigation—spike in complexity might indicate rushed feature implementation requiring cleanup.
Monitoring systems alert on threshold violations or significant changes. Teams establish ownership of metric improvement—specific developers or teams own reducing duplication in particular modules. Regular metric reviews become part of retrospectives, ensuring quality remains a continuous focus.
Pre-commit Quality Gates enforce metrics before code enters version control. Git hooks run metric checks locally, rejecting commits that violate standards. This provides immediate feedback with minimal disruption. Developers fix issues in their working environment rather than after committing and pushing.
Local enforcement requires careful threshold selection—overly strict checks frustrate developers and encourage bypassing hooks. Focusing on egregious violations (complexity over 25, new files under 50% coverage) balances quality improvement with developer experience. Teams often make pre-commit checks advisory (warning) while CI checks remain mandatory (blocking).
Differential Measurement analyzes only changed code rather than entire codebases. This approach works well for legacy systems where fixing all existing problems proves impractical. New and modified code must meet current standards while legacy code remains unchanged. Over time, as changes touch legacy code, quality gradually improves.
Differential tools compare metrics before and after changes, reporting only impacts of current work. A pull request increasing overall complexity by 50 points gets flagged even if individual methods stay under thresholds. This prevents gradual quality erosion from many small degradations.
Metric Budgets allocate acceptable complexity or technical debt across a project. Each module has a complexity budget based on its purpose and history. Simple utility modules get low budgets. Complex domain logic receives higher budgets. Teams track budget consumption and refactor before exceeding limits.
Budgets enable tradeoffs—accepting higher complexity in one area while maintaining lower complexity elsewhere. They acknowledge that perfect uniformity is unrealistic. Teams periodically review and adjust budgets based on changing requirements and lessons learned.
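A budget check can be as simple as a hash of per-module allowances compared against measured totals. The paths, budget values, and scores below are illustrative; real scores would come from summed Flog totals.

```ruby
# Per-module complexity budgets; unlisted modules are unconstrained
BUDGETS = {
  'app/models' => 120.0, # domain logic: generous budget
  'lib/util'   => 40.0   # utilities: kept deliberately simple
}.freeze

def over_budget(scores_by_module)
  scores_by_module.select { |mod, score| score > BUDGETS.fetch(mod, Float::INFINITY) }
end

over_budget('app/models' => 95.0, 'lib/util' => 55.0)
# => {"lib/util"=>55.0}
```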
Reference
Common Code Metrics
| Metric | Measures | Typical Threshold | Tools |
|---|---|---|---|
| Lines of Code (LOC) | Code size | Class: 100 lines, Method: 15 lines | RuboCop, SonarQube |
| Cyclomatic Complexity | Number of independent paths | 10 per method | RuboCop, Flog |
| ABC Complexity | Assignments + Branches + Calls | 20 per method | Flog |
| Test Coverage | Percentage of code executed by tests | 80% minimum | SimpleCov, Deep-Cover |
| Code Duplication | Repeated code structures | Under 5% | Flay, RuboCop |
| Method Length | Number of lines in methods | 10-15 lines | RuboCop |
| Class Length | Number of lines in classes | 100-200 lines | RuboCop |
| Churn | Commit frequency per file | Context-dependent | MetricFu, Git |
| Coupling | Number of dependencies between classes | Low coupling preferred | Reek |
| Cohesion | Relatedness of class responsibilities | High cohesion preferred | Reek |
RuboCop Metric Cops
| Cop | Description | Default Threshold |
|---|---|---|
| Metrics/AbcSize | ABC complexity metric | 17 |
| Metrics/BlockLength | Length of blocks | 25 lines |
| Metrics/ClassLength | Length of classes | 100 lines |
| Metrics/CyclomaticComplexity | Cyclomatic complexity | 7 |
| Metrics/MethodLength | Length of methods | 10 lines |
| Metrics/ModuleLength | Length of modules | 100 lines |
| Metrics/ParameterLists | Number of parameters | 5 parameters |
| Metrics/PerceivedComplexity | Perceived complexity | 8 |
Flog Complexity Scoring
| Construct | Points | Example |
|---|---|---|
| Assignment | 1.0 | x = value |
| Branch | 1.0 | if, unless, while |
| Call | 1.0 | Method invocation |
| Condition | 1.0 | Conditional expression |
| Yield | 1.0 | Block yield |
| Assignment Branch | 2.0 | x = if condition |
SimpleCov Configuration Options
| Option | Purpose | Example |
|---|---|---|
| minimum_coverage | Fail build below threshold | 80 |
| refuse_coverage_drop | Fail if coverage decreases | true/false |
| add_filter | Exclude paths from coverage | /spec/, /vendor/ |
| add_group | Group files in reports | Models, Controllers |
| merge_timeout | Merge coverage across runs | 3600 seconds |
| coverage_criterion | Coverage type to measure | line, branch |
Reek Code Smell Categories
| Category | Description | Examples |
|---|---|---|
| Control Couple | Methods that rely on control flag parameters | Boolean parameters |
| Data Clump | Frequently co-occurring parameters | Same 3+ params in multiple methods |
| Feature Envy | Method uses more features of another class | Excessive delegation |
| Long Parameter List | Methods with many parameters | More than 4 parameters |
| Repeated Conditional | Same conditional in multiple places | if user.admin? appears 5+ times |
| Too Many Statements | Method contains too many statements | 10+ statements |
| Utility Function | Method that does not use instance variables | Could be class method |
Metric Threshold Recommendations
| Code Type | Coverage | Complexity | Duplication | Method Lines |
|---|---|---|---|---|
| Critical Business Logic | 95%+ | Under 8 | 0% | Under 10 |
| Standard Application Code | 80%+ | Under 12 | Under 3% | Under 15 |
| Infrastructure/Utilities | 70%+ | Under 15 | Under 5% | Under 20 |
| Legacy/Stable Code | 60%+ | Context-dependent | Under 10% | Context-dependent |
Tool Selection Matrix
| Need | Recommended Tool | Alternative | Integration |
|---|---|---|---|
| Test Coverage | SimpleCov | Deep-Cover | CI, Git Hooks |
| Complexity Analysis | Flog, RuboCop | Saikuro | CI, IDE |
| Duplication Detection | Flay | RuboCop | CI |
| Code Smell Detection | Reek | RuboCop Lint | CI, Code Review |
| Comprehensive Analysis | MetricFu, Rubycritic | Code Climate | CI, Dashboard |
| Style Enforcement | RuboCop | StandardRB | Git Hooks, CI, IDE |
Continuous Integration Metric Workflow
| Stage | Action | Tools | Outcome |
|---|---|---|---|
| Pre-commit | Run fast local checks | RuboCop (selective cops) | Block obvious issues |
| Commit | Track file changes | Git | Identify modified files |
| PR Creation | Analyze changed files only | All metric tools | Comment on PR |
| CI Build | Run full metric suite | All tools | Fail if thresholds violated |
| Merge | Update baselines | Coverage, Complexity | Track trends |
| Post-merge | Generate reports | MetricFu, Rubycritic | Dashboard updates |
Metric Interpretation Guidelines
| Metric Range | Interpretation | Action |
|---|---|---|
| Coverage 90-100% | Excellent test coverage | Maintain current practices |
| Coverage 80-90% | Good coverage with some gaps | Identify untested critical paths |
| Coverage 60-80% | Moderate coverage | Prioritize testing critical code |
| Coverage 0-60% | Insufficient coverage | Urgent testing needed |
| Complexity 0-10 | Simple, maintainable code | No action needed |
| Complexity 11-20 | Moderately complex | Consider refactoring opportunities |
| Complexity 21-40 | High complexity | Refactor recommended |
| Complexity 40+ | Very high complexity | Urgent refactoring required |