Codebase Cognitive Load Mapper (CCLM)

Quantify the “Mental Tax” of your Python Codebase.

CCLM is a static analysis tool that measures Cognitive Load rather than only traditional cyclomatic complexity. It traverses the AST of your source code and combines multiple heuristic signals into a single Cognitive Load Score (CLS). It also integrates with Git to identify “Hotspots”: files that are both difficult to understand and frequently changed.

🧠 Core Philosophy

Code is read far more often than it is written. Traditional metrics like Cyclomatic Complexity don’t always capture readability. CCLM measures factors that actually strain working memory:

Nesting Depth: Deeply indented logic is hard to track.
Identifier Entropy: Confusing, short, or random variable names increase load.
Branching Density: Many if/else branches packed into few lines force rapid context switching.
Scope Distance: A large gap between a variable's definition and its use forces the reader to scroll while holding the name in memory.
Git Churn: High frequency of changes indicates instability.
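
To make the first signal concrete, here is a minimal sketch of measuring nesting depth with Python's standard `ast` module. It is not CCLM's actual implementation: the function name `max_nesting_depth` and the choice of which node types count as a nesting level are assumptions made for illustration.

```python
import ast

# Node types treated as opening a new nesting level (an assumption of this sketch).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try,
                 ast.AsyncFor, ast.AsyncWith)


def max_nesting_depth(source: str) -> int:
    """Return the deepest level of nested control-flow blocks in `source`."""

    def visit(node: ast.AST, depth: int) -> int:
        deepest = depth
        for child in ast.iter_child_nodes(node):
            # Only control-flow blocks add a level; function and class
            # definitions are passed through at the current depth.
            child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
            deepest = max(deepest, visit(child, child_depth))
        return deepest

    return visit(ast.parse(source), 0)


SAMPLE = '''
def f(items):
    for item in items:
        if item:
            while item > 0:
                item -= 1
    return items
'''

print(max_nesting_depth(SAMPLE))  # 3
```

Note that the AST represents `elif` as an `If` nested inside the parent's `orelse`, so this sketch counts each `elif` as one level deeper. A production metric would probably special-case that.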

✨ Features

Multi-Signal Analysis: Combines 5 distinct cognitive metrics.
Relative Scoring: Normalizes scores against your specific codebase average.
Interactive Heatmaps: Visualizes code complexity using treemaps.
Hotspot Detection: Correlates Cognitive Load with Git Churn.
CI/CD Ready: JSON output for integration with pipelines.
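
One way Hotspot Detection could work is sketched below: count how many commits touched each file, then rank files by load multiplied by churn. The helper names (`parse_churn`, `git_churn`, `rank_hotspots`) and the product-based ranking are assumptions for illustration, not CCLM's documented method. The `git log --name-only --pretty=format:` invocation is standard Git.

```python
import subprocess
from collections import Counter


def parse_churn(log_output: str) -> Counter:
    """Count how many commits touched each path in `git log --name-only` output."""
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def git_churn(repo_path: str = ".") -> Counter:
    """Run git in `repo_path` and return per-file commit counts."""
    # --pretty=format: suppresses commit headers, leaving only file paths.
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    )
    return parse_churn(result.stdout)


def rank_hotspots(load: dict[str, float], churn: Counter,
                  top: int = 3) -> list[tuple[str, float]]:
    """Rank files by load * churn (a hypothetical scoring rule)."""
    scored = {path: cls * churn.get(path, 0) for path, cls in load.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Example with canned data instead of a real repository.
SAMPLE_LOG = "a.py\nb.py\n\na.py\n\na.py\nc.py\n"
SAMPLE_LOAD = {"a.py": 40.0, "b.py": 90.0, "c.py": 10.0}

print(rank_hotspots(SAMPLE_LOAD, parse_churn(SAMPLE_LOG)))
# [('a.py', 120.0), ('b.py', 90.0), ('c.py', 10.0)]
```

In this example `a.py` outranks `b.py` despite a lower load score, because it changed three times as often.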

🚀 Installation

Prerequisites

Python 3.12+

Quick Start

# Clone the repository
git clone https://github.com/Last-Sage/cclm.git
cd cclm

# Install dependencies using Poetry
poetry install

📖 Usage Guide

Run the tool through its CLI entry point, `python -m cclm.main`:

1. Basic Scan

Scan the current directory and display the Top 10 High-Load files in the terminal.

python -m cclm.main scan .

2. Generate HTML Report (Heatmap)

Create an interactive HTML report with a heatmap visualization.

python -m cclm.main scan . --html report.html

3. JSON Output (CI/CD)

Output raw metrics in JSON format for external processing.

python -m cclm.main scan . --json
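
A pipeline step could consume that JSON to gate a build. The sketch below assumes the report is a list of objects with `file` and `cls` keys. That schema is hypothetical and is not documented here, so check the real output and adjust the key names before using it.

```python
import json


def files_over_threshold(report_json: str, threshold: float = 80.0) -> list[str]:
    """Return paths whose score meets or exceeds `threshold`, sorted by name."""
    records = json.loads(report_json)
    # "file" and "cls" are hypothetical keys; adapt them to the real report schema.
    return sorted(r["file"] for r in records if r["cls"] >= threshold)


# Stand-in for the captured output of the scan command.
SAMPLE_REPORT = (
    '[{"file": "app/core.py", "cls": 86.5},'
    ' {"file": "app/util.py", "cls": 12.0}]'
)

offenders = files_over_threshold(SAMPLE_REPORT)
if offenders:
    print("Critical-load files:", ", ".join(offenders))
```

In a real pipeline you would read the report from stdin or a file and call `sys.exit(1)` when `offenders` is non-empty so the job fails.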

📊 Interpreting the Results

Cognitive Load Score (CLS)

The CLS is a normalized score (0-100) indicating the relative cognitive burden of a file.

0 - 20: Low Load. Trivial code, definitions, configs.
20 - 50: Average Load. Standard application logic.
50 - 80: High Load. Complex algorithms, deep nesting. Refactor candidate.
80 - 100: Critical Load. Unreadable “Spaghetti Code”. Urgent Refactor.
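
The bands above can be expressed as a small lookup, for example when post-processing scores in a script. The function name `cls_band` is hypothetical. Because the listed ranges share their boundary values, this sketch assumes each boundary belongs to the higher band (a score of exactly 20 is "Average").

```python
def cls_band(score: float) -> str:
    """Map a Cognitive Load Score (0-100) to its band name."""
    if not 0 <= score <= 100:
        raise ValueError(f"CLS must be within 0-100, got {score}")
    # Assumption: shared boundaries (20, 50, 80) fall into the higher band.
    if score < 20:
        return "Low"
    if score < 50:
        return "Average"
    if score < 80:
        return "High"
    return "Critical"


for value in (5, 20, 64.2, 95):
    print(value, cls_band(value))
```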

Metrics Explained

| Metric | Description | Why it matters |
| --- | --- | --- |
| NestingDepth | Maximum indentation level | Deep nesting exceeds working memory limits. |
| IdentifierEntropy | Randomness of names | `x`, `tmp`, `data` tell you nothing. `user_id` tells a story. |
| BranchingDensity | Logic switches per line | Dense logic is harder to predict and simulate mentally. |
| ScopeDistance | Lines between definition and use | Keeping variables in your head while scrolling is taxing. |
| FunctionLength | Lines of code | Long functions hide logic flow. |
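
As an illustration of how a metric like ScopeDistance might be derived from the AST, the sketch below finds the largest line gap between a name's first assignment and any later read. It is a deliberate simplification and not CCLM's implementation: it treats the whole file as a single scope, ignores function parameters, and the name `max_scope_distance` is hypothetical.

```python
import ast


def max_scope_distance(source: str) -> int:
    """Largest line gap between a name's first assignment and any later read."""
    first_store: dict[str, int] = {}
    widest = 0
    names = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Name)]
    # Visit names in source order so assignments are seen before later reads.
    for node in sorted(names, key=lambda n: (n.lineno, n.col_offset)):
        if isinstance(node.ctx, ast.Store):
            first_store.setdefault(node.id, node.lineno)
        elif isinstance(node.ctx, ast.Load) and node.id in first_store:
            widest = max(widest, node.lineno - first_store[node.id])
    return widest


SAMPLE = '''
total = 0
values = [1, 2, 3]
for v in values:
    total += v
print(total)
'''

# `total` is assigned on the first code line and read four lines later.
print(max_scope_distance(SAMPLE))  # 4
```

A per-function version would walk each `FunctionDef` separately and seed `first_store` with its parameters.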