Methodology

How BuilderBox accurately tracks your coding time and distinguishes between human-written and AI-generated code.

Overview

BuilderBox uses a client-side accumulation approach for precise time tracking. Instead of inferring time from heartbeat gaps on the server, the VS Code extension tracks time locally and sends accumulated values with each heartbeat.

Privacy First

No code content is ever stored. Only file paths, timestamps, and metrics.

High Accuracy

~98% time tracking accuracy with client-side accumulation.

Time Calculation

Heartbeat System

BuilderBox sends "heartbeats" every 2 minutes while you're active in VS Code/Cursor. Each heartbeat contains:

  • Current file path and language
  • Accumulated time by activity type (coding, reviewing, terminal, prompting)
  • AI/human character counts
  • Change metrics (chars added/deleted, lines changed)
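
Putting those fields together, a heartbeat payload might look like the following sketch. The interface and field names here are illustrative assumptions, not the extension's actual schema:

```typescript
// Illustrative heartbeat shape -- field names are assumptions, not the real schema.
interface Heartbeat {
  filePath: string;
  language: string;
  timestamp: number;     // epoch ms when the heartbeat was built
  codingMs: number;      // time accumulated since the previous heartbeat
  reviewingMs: number;
  terminalMs: number;
  promptingMs: number;
  aiChars: number;       // characters attributed to AI
  humanChars: number;    // characters attributed to the human
  charsAdded: number;
  charsDeleted: number;
  linesChanged: number;
}

// Example payload for one 2-minute heartbeat interval.
const example: Heartbeat = {
  filePath: "src/index.ts",
  language: "typescript",
  timestamp: Date.now(),
  codingMs: 95_000,
  reviewingMs: 18_000,
  terminalMs: 5_000,
  promptingMs: 2_000,
  aiChars: 340,
  humanChars: 120,
  charsAdded: 460,
  charsDeleted: 35,
  linesChanged: 12,
};
```

Note that the per-activity times sum to the 2-minute heartbeat interval, and AI plus human characters account for all characters added.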

Client-Side Accumulation

The extension tracks time precisely using a state machine:

State Machine Transitions:
├── Document edit → coding_ms accumulation
├── Editor focus (no edits for 5s) → reviewing_ms
├── Terminal focus → terminal_ms
├── AI panel focus → prompting_ms
└── Window blur → idle_ms
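
The transitions above can be sketched as a small accumulator. This is a minimal illustration of the crediting logic, not the extension's implementation; the class and method names are assumptions:

```typescript
// Minimal sketch of client-side accumulation. Each transition credits the
// elapsed span to the state being left, so no time falls through the cracks.
type Activity = "coding" | "reviewing" | "terminal" | "prompting" | "idle";

class SessionAccumulator {
  private state: Activity = "idle";
  private stateStart = 0; // 0 means "no transition seen yet"
  readonly totals: Record<Activity, number> = {
    coding: 0, reviewing: 0, terminal: 0, prompting: 0, idle: 0,
  };

  transition(next: Activity, now: number): void {
    // The first transition only starts the clock; later ones credit
    // the elapsed time to the state we are leaving.
    if (this.stateStart > 0) this.totals[this.state] += now - this.stateStart;
    this.state = next;
    this.stateStart = now;
  }
}
```

For example, an edit event would call `transition("coding", now)`, and five seconds without edits while the editor is focused would call `transition("reviewing", now)`.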

Benefits over gap-based inference:

  • No time loss when switching between activities
  • Accurate tracking during long sessions
  • Precise breakdown by activity type

Session Detection

A coding session ends after 5 minutes of inactivity. The server uses a 300-second session timeout for backward compatibility with older extension versions that don't send accumulated time.
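
The cutoff can be sketched as a simple gap check against the previous heartbeat (the function shape is an assumption; only the 300-second timeout comes from the text above):

```typescript
// Sketch of the 5-minute inactivity cutoff: a heartbeat arriving more than
// SESSION_TIMEOUT_MS after the previous one starts a new session.
const SESSION_TIMEOUT_MS = 300_000; // 300 seconds

function isNewSession(prevHeartbeatAt: number | null, now: number): boolean {
  return prevHeartbeatAt === null || now - prevHeartbeatAt > SESSION_TIMEOUT_MS;
}
```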

Activity Types

Coding

Active document edits. Triggered by any text change in a file. This is the primary activity type for most developers.

Reviewing

Viewing files without making edits. Includes scrolling, reading, and navigating code. Detected when you have a file open but haven't edited for 5+ seconds.

Terminal

Time spent in the integrated terminal. Tracked via terminal focus events and output detection. Includes running commands, reading logs, etc.

Prompting

Time spent in AI chat panels (Copilot Chat, Cursor Composer, Cline, etc.). Detected by monitoring when no text editor is active but the window is focused. Prompt submissions are tracked to enable accurate AI classification of subsequent edits.

AI Detection Methods

BuilderBox uses multiple detection techniques layered by confidence level. Higher-tier methods are tried first, with fallback to heuristics when API access isn't available.

Tier 1: Cursor Jump + Keystroke Timing (~92% confidence)

The most reliable detection for inline completions. When you accept a completion (via Tab, Enter, or click), the cursor jumps forward significantly:

  • Large cursor jumps (20+ chars) on the same line are detected
  • Must occur without recent keystrokes (150ms+ idle)
  • Multi-line forward jumps also captured for block completions
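
The same-line case can be sketched as a predicate over the jump size and keystroke timing. The thresholds come from the rules above; the function shape is an assumption, and multi-line block completions are omitted for brevity:

```typescript
// Sketch of Tier 1: flag a completion-accept when the cursor jumps far
// on one line and no keystroke occurred in the last 150 ms.
const MIN_JUMP_CHARS = 20;
const KEYSTROKE_IDLE_MS = 150;

function looksLikeInlineCompletion(
  jumpChars: number,
  sameLine: boolean,
  msSinceLastKeystroke: number,
): boolean {
  return sameLine
    && jumpChars >= MIN_JUMP_CHARS
    && msSinceLastKeystroke >= KEYSTROKE_IDLE_MS;
}
```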

Tier 2: Character-Level Timing (~85-90% confidence)

AI completions insert all characters in the same event loop tick (0ms between chars). Human typing has natural variance (50-300ms between keystrokes).

  • 20+ characters inserted after 150ms of no keystrokes = likely AI
  • Confidence scales with insertion size
  • Catches completions accepted via Enter or other methods
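
A rough scoring function for this tier might look like the sketch below. The 20-character and 150 ms thresholds come from the text above; the exact confidence curve is illustrative, since the doc only says confidence scales with insertion size:

```typescript
// Sketch of Tier 2: a 20+ character burst after 150 ms of keystroke
// silence is scored as likely AI, with confidence rising toward ~0.90
// for larger insertions.
function tier2Confidence(insertedChars: number, msSinceLastKeystroke: number): number {
  if (insertedChars < 20 || msSinceLastKeystroke < 150) return 0;
  // Scale from 0.85 toward 0.90 as the insertion grows (illustrative curve).
  const scaled = 0.85 + Math.min(insertedChars / 1000, 1) * 0.05;
  return Math.min(scaled, 0.9);
}
```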

Tier 3: Multi-Cursor Detection (~90% confidence)

AI agents (Cline, Cursor Composer) often make simultaneous edits at multiple non-adjacent positions. Humans rarely do this.

  • Multiple content changes in single event
  • Changes at positions > 5 lines apart = agent activity
  • Strong signal for detecting agentic AI tools
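
As a sketch, the non-adjacency test reduces to checking the spread of edited line numbers within a single change event (the function shape is an assumption):

```typescript
// Sketch of Tier 3: simultaneous edits at positions more than 5 lines
// apart in one change event suggest an AI agent rather than a human.
function looksLikeAgentEdit(changedLines: number[]): boolean {
  if (changedLines.length < 2) return false;
  const sorted = [...changedLines].sort((a, b) => a - b);
  // If any pair is more than 5 lines apart, the extreme pair is too.
  return sorted[sorted.length - 1] - sorted[0] > 5;
}
```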

Tier 3b: Prompt Window Detection (~85-90% confidence)

When using AI chat assistants (Cursor Composer, Copilot Chat, Cline), edits that occur shortly after sending a prompt are classified as AI-generated.

  • Tracks when user sends prompts to AI chat panels
  • Edits within 60 seconds of prompt submission = likely AI response
  • Dynamic window: larger edits (500+ chars) extend window to 120 seconds
  • Confidence scales with time since prompt and edit size
  • Overrides keystroke timing check (prompts count as keystrokes)
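
The dynamic window rule can be sketched as follows, using only the thresholds stated above (60 s base window, extended to 120 s for 500+ character edits); the function shape is an assumption:

```typescript
// Sketch of Tier 3b: edits landing within a window after a prompt
// submission are attributed to AI; large edits extend the window.
function inPromptWindow(msSincePrompt: number, editChars: number): boolean {
  const windowMs = editChars >= 500 ? 120_000 : 60_000;
  return msSincePrompt >= 0 && msSincePrompt <= windowMs;
}
```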

Tier 4: Heuristic Fallback (~70-80% confidence)

When direct detection isn't possible:

  • Large insertions (50+ chars) without recent keystrokes
  • Extension presence detection (is Copilot/Cursor installed?)
  • Code pattern analysis (AI often generates complete functions)

Paste Exclusion

Paste operations are detected by comparing inserted text to clipboard content and are explicitly excluded from AI classification. Pasting code is a human action.
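
The override can be sketched as a final classification step that checks the clipboard before trusting any timing signal (the function and its arguments are illustrative assumptions):

```typescript
// Sketch of paste exclusion: inserted text matching the clipboard is
// classified as human, regardless of what the timing heuristics say.
function classifyInsertion(
  inserted: string,
  clipboard: string,
  timingSaysAI: boolean,
): "ai" | "human" {
  if (inserted.length > 0 && inserted === clipboard) return "human";
  return timingSaysAI ? "ai" : "human";
}
```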

Supported AI Tools

| Tool | Detection Method | Confidence |
| --- | --- | --- |
| GitHub Copilot | Tab key + timing + extension API | ~92% |
| Cursor Tab | Tab key + timing | ~92% |
| Cursor Composer | Prompt window + multi-cursor + large edits | ~90% |
| Cline | Multi-cursor + extension detection | ~90% |
| Continue | Extension detection + heuristics | ~80% |
| Tabnine | Extension detection + heuristics | ~80% |
| Codeium | Extension detection + heuristics | ~80% |
| Others | Heuristic fallback | ~70% |

Data Flow

VS Code Extension
    ├── DocumentTracker (file changes)
    ├── InputMonitor (keystrokes, Tab key, paste)
    ├── SessionAccumulator (time by activity)
    ├── AIDetector (AI classification)
    └── HeartbeatBuilder → HeartbeatQueue
                              │
                              ▼
                    BuilderBox API (/api/heartbeats)
                              │
                              ▼
                    PostgreSQL (coding_heartbeats)
                              │
                              ▼
                    Aggregation (summaries, hourly stats)

What's Stored

  • File paths - Which files you worked on
  • Timestamps - When activity occurred
  • Metrics - Characters added/deleted, lines changed
  • AI attribution - Whether changes were AI-generated
  • Activity type - Coding, reviewing, terminal, prompting

What's NOT Stored

  • Actual code content
  • AI prompts or responses
  • Personal data beyond file paths
  • Screen recordings or screenshots

Expected Accuracy

| Category | Accuracy | Method |
| --- | --- | --- |
| Total coding time | ~98% | Client-side accumulation |
| Reviewing time | ~98% | Client-side accumulation |
| Terminal time | ~95% | Output events + 60s threshold |
| Prompting time | ~90% | AI panel detection |
| AI from inline completions | ~92% | Cursor jump + keystroke timing |
| AI from agents | ~90% | Multi-cursor detection |

Note: These are estimated accuracies based on our testing. Actual accuracy may vary depending on your workflow, AI tools used, and coding patterns.