Codebase Cognitive Load Mapper (CCLM)
Quantify the “Mental Tax” of your Python Codebase.
CCLM is a static analysis tool designed to measure Cognitive Load rather than just traditional cyclomatic complexity. It analyzes your source code using AST traversal and combines multiple heuristic signals into a single Cognitive Load Score (CLS). It also integrates with Git to identify “Hotspots” — files that are both difficult to understand and frequently changed.
🧠 Core Philosophy
Code is read far more often than it is written. Traditional metrics like Cyclomatic Complexity don’t always capture readability. CCLM measures factors that actually strain working memory:
- Nesting Depth: Deeply indented logic is hard to track.
- Identifier Entropy: Confusing, short, or random variable names increase load.
- Branching Density: Rapid context switching (many if/else per line).
- Scope Distance: Distance between variable definition and usage (requires scrolling).
- Git Churn: High frequency of changes indicates instability.
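To make the first and third signals concrete, here is a minimal sketch of how nesting depth and branching density could be measured with Python's standard `ast` module. It is an illustration only, not CCLM's actual implementation: the function names, the choice of which nodes count as nesting, and the "branches per non-blank line" definition are all assumptions.

```python
import ast

# Node types treated as opening a new nesting level (an assumption for this sketch).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting_depth(source: str) -> int:
    """Return the deepest level of nested control-flow blocks in `source`."""

    def walk(node: ast.AST, level: int) -> int:
        deepest = level
        for child in ast.iter_child_nodes(node):
            # An `elif` parses as an If that is the only statement in the
            # parent's `orelse`; keep it at the parent's level.
            is_elif = (
                isinstance(node, ast.If)
                and isinstance(child, ast.If)
                and node.orelse == [child]
            )
            if is_elif or not isinstance(child, NESTING_NODES):
                child_level = level
            else:
                child_level = level + 1
            deepest = max(deepest, walk(child, child_level))
        return deepest

    return walk(ast.parse(source), 0)


def branching_density(source: str) -> float:
    """Return the number of if-statements and conditional expressions per non-blank line."""
    tree = ast.parse(source)
    branches = sum(isinstance(n, (ast.If, ast.IfExp)) for n in ast.walk(tree))
    lines = [line for line in source.splitlines() if line.strip()]
    return branches / len(lines) if lines else 0.0


SAMPLE = (
    "def f(items):\n"
    "    for item in items:\n"
    "        if item:\n"
    "            while item > 0:\n"
    "                item -= 1\n"
)
print(max_nesting_depth(SAMPLE), branching_density(SAMPLE))
```

A real scorer would likely weight these values and tune which constructs count as nesting; the sketch only shows that both signals fall out of a single AST traversal.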
✨ Features
- Multi-Signal Analysis: Combines 5 distinct cognitive metrics.
- Relative Scoring: Normalizes scores against your specific codebase average.
- Interactive Heatmaps: Visualizes code complexity using treemaps.
- Hotspot Detection: Correlates Cognitive Load with Git Churn.
- CI/CD Ready: JSON output for integration with pipelines.
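As a rough illustration of the Hotspot Detection idea, the sketch below counts how often each file appears in `git log` and multiplies that churn by a per-file load score. The helper names (`parse_churn`, `git_churn`, `hotspots`) and the "load times churn" ranking are assumptions made for this example; CCLM's own correlation logic may differ.

```python
import subprocess
from collections import Counter


def parse_churn(log_output: str) -> Counter:
    """Count how many commits touched each path in `git log --name-only` output."""
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def git_churn(repo_path: str = ".") -> Counter:
    """Run git in `repo_path` and return per-file commit counts."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_churn(result.stdout)


def hotspots(
    load_scores: dict[str, float], churn: Counter, top: int = 10
) -> list[tuple[str, float]]:
    """Rank files by load score multiplied by churn (hypothetical ranking rule)."""
    ranked = {path: score * churn.get(path, 0) for path, score in load_scores.items()}
    return sorted(ranked.items(), key=lambda item: item[1], reverse=True)[:top]


# Example with canned git output, so the sketch runs without a repository.
churn = parse_churn("a.py\nb.py\n\na.py\n")
print(hotspots({"a.py": 10.0, "b.py": 50.0}, churn))
```

Multiplying the two signals means a file only ranks highly when it is both hard to read and frequently changed, which matches the definition of a hotspot given above.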
🚀 Installation
Prerequisites
- Python 3.12+
- Poetry (used to install dependencies)
- Git (required for churn and hotspot analysis)
Quick Start
# Clone the repository
git clone https://github.com/Last-Sage/cclm.git
cd cclm
# Install dependencies using Poetry
poetry install
📖 Usage Guide
Run the tool through its CLI entry point, python -m cclm.main:
1. Basic Scan
Scan the current directory and display the Top 10 High-Load files in the terminal.
python -m cclm.main scan .
2. Generate HTML Report (Heatmap)
Create an interactive HTML report with a heatmap visualization.
python -m cclm.main scan . --html report.html
3. JSON Output (CI/CD)
Output raw metrics in JSON format for external processing.
python -m cclm.main scan . --json
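One way a pipeline might consume the JSON output is to fail the build when any file reaches the Critical band. The sketch below assumes the report is a JSON array of objects with `file` and `cls` keys. That schema is hypothetical and is not documented here, so check the real output and adjust the key names before relying on it.

```python
import json


def critical_files(report_text: str, threshold: float = 80.0) -> list[str]:
    """Return paths whose score meets the threshold.

    Assumes a hypothetical schema: [{"file": "...", "cls": 12.3}, ...].
    """
    report = json.loads(report_text)
    return sorted(entry["file"] for entry in report if entry["cls"] >= threshold)


def exit_code(report_text: str, threshold: float = 80.0) -> int:
    """Return 1 if any file is over the threshold, else 0, for use as a CI status."""
    return 1 if critical_files(report_text, threshold) else 0


# Example with a hand-written report in the assumed schema.
sample = json.dumps(
    [{"file": "a.py", "cls": 91.2}, {"file": "b.py", "cls": 35.0}]
)
print(critical_files(sample), exit_code(sample))
```

In a pipeline, the scan output could be piped into a small script that passes `sys.stdin.read()` to `exit_code` and hands the result to `sys.exit`.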
📊 Interpreting the Results
Cognitive Load Score (CLS)
The CLS is a normalized score (0-100) indicating the relative cognitive burden of a file.
- 0 - 20: Low Load. Trivial code, definitions, configs.
- 20 - 50: Average Load. Standard application logic.
- 50 - 80: High Load. Complex algorithms, deep nesting. Refactor candidate.
- 80 - 100: Critical Load. Unreadable “Spaghetti Code”. Urgent Refactor.
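The bands above can be expressed as a small lookup. Because the listed ranges share their boundary values (20, 50, 80), this sketch assumes each boundary belongs to the higher band; that tie-breaking rule and the `cls_band` name are assumptions, not documented behavior.

```python
def cls_band(score: float) -> str:
    """Map a Cognitive Load Score (0-100) to its band name.

    Boundary values are assigned to the higher band (an assumption).
    """
    if not 0 <= score <= 100:
        raise ValueError(f"CLS must be between 0 and 100, got {score}")
    if score < 20:
        return "Low"
    if score < 50:
        return "Average"
    if score < 80:
        return "High"
    return "Critical"


for value in (12, 35, 64, 93):
    print(value, cls_band(value))
```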
Metrics Explained
| Metric | Description | Why it matters |
|---|---|---|
| NestingDepth | Max indentation level | Deep nesting exceeds working memory limits. |
| IdentifierEntropy | Randomness of names | x, tmp, data tell you nothing. user_id tells a story. |
| BranchingDensity | Logic switches per line | Dense logic is harder to predict/simulate mentally. |
| ScopeDistance | Lines between Def & Use | Keeping variables in your head while scrolling is taxing. |
| FunctionLength | Lines of Code | Long functions hide logic flow. |
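As an example of how one of these metrics might be computed, the sketch below estimates ScopeDistance as the largest gap, in lines, between a local variable's most recent assignment and a later read inside the same function. The function name and this exact definition are assumptions for illustration; function parameters and module-level names are ignored to keep it short.

```python
import ast


def max_scope_distance(source: str) -> int:
    """Return the largest line gap between an assignment and a later read of the same name."""
    tree = ast.parse(source)
    worst = 0
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        last_assigned: dict[str, int] = {}
        # Visit names in source order so each read sees the latest assignment.
        names = sorted(
            (n for n in ast.walk(func) if isinstance(n, ast.Name)),
            key=lambda n: (n.lineno, n.col_offset),
        )
        for name in names:
            if isinstance(name.ctx, ast.Store):
                last_assigned[name.id] = name.lineno
            elif name.id in last_assigned:
                worst = max(worst, name.lineno - last_assigned[name.id])
    return worst


SCOPE_SAMPLE = (
    "def f():\n"
    "    total = 0\n"
    "    a = 1\n"
    "    b = 2\n"
    "    return total + a + b\n"
)
print(max_scope_distance(SCOPE_SAMPLE))
```

Here `total` is assigned on line 2 and read on line 5, so it dominates the result. In a long function the same pattern is what forces a reader to scroll back to recall what a variable holds.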