Methodology
How BuilderBox accurately tracks your coding time and distinguishes between human and AI-generated code.
Overview
BuilderBox uses a client-side accumulation approach for precise time tracking. Instead of inferring time from heartbeat gaps on the server, the VS Code extension tracks time locally and sends accumulated values with each heartbeat.
No code content is ever stored; only file paths, timestamps, and metrics are recorded.
~98% time tracking accuracy with client-side accumulation.
Time Calculation
Heartbeat System
BuilderBox sends "heartbeats" every 2 minutes while you're active in VS Code/Cursor. Each heartbeat contains:
- Current file path and language
- Accumulated time by activity type (coding, reviewing, terminal, prompting)
- AI/human character counts
- Change metrics (chars added/deleted, lines changed)
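To make the list above concrete, here is a minimal sketch of what such a heartbeat payload could look like. The field names, the `Heartbeat` type, and both helper functions are illustrative assumptions, not the actual BuilderBox wire format.

```typescript
// Hypothetical heartbeat shape; all names are assumptions for illustration only.
type Activity = "coding" | "reviewing" | "terminal" | "prompting";

interface Heartbeat {
  filePath: string;                      // current file path
  language: string;                      // current file language
  sentAt: number;                        // epoch milliseconds
  activityMs: Record<Activity, number>;  // accumulated time by activity type
  aiChars: number;                       // characters attributed to AI
  humanChars: number;                    // characters attributed to the user
  charsAdded: number;
  charsDeleted: number;
  linesChanged: number;
}

// Heartbeats are sent every 2 minutes while the user is active.
const HEARTBEAT_INTERVAL_MS = 2 * 60 * 1000;

// Total tracked time carried by one heartbeat.
function totalActiveMs(hb: Heartbeat): number {
  const a = hb.activityMs;
  return a.coding + a.reviewing + a.terminal + a.prompting;
}

// Fraction of characters attributed to AI (0 when nothing was typed).
function aiShare(hb: Heartbeat): number {
  const total = hb.aiChars + hb.humanChars;
  return total === 0 ? 0 : hb.aiChars / total;
}
```

Note that the payload carries only paths, counters, and durations, which is consistent with the statement that no code content is stored.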
Client-Side Accumulation
The extension tracks time precisely using a state machine:
State Machine Transitions:
├── Document edit → coding_ms accumulation
├── Editor focus (no edits for 5s) → reviewing_ms
├── Terminal focus → terminal_ms
├── AI panel focus → prompting_ms
└── Window blur → idle_ms
Benefits over gap-based inference:
- No time loss when switching between activities
- Accurate tracking during long sessions
- Precise breakdown by activity type
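The accumulation idea can be sketched as a small state machine that credits elapsed time to whichever state is being left. This is a simplified illustration, not the extension's actual code: the class name `ActivityAccumulator`, its methods, and the choice to count the first 5 seconds after the last edit as coding are all assumptions.

```typescript
type State = "coding" | "reviewing" | "terminal" | "prompting" | "idle";
type Totals = Record<State, number>;

const REVIEW_AFTER_MS = 5000; // no edits for 5 s while in the editor => reviewing

const emptyTotals = (): Totals =>
  ({ coding: 0, reviewing: 0, terminal: 0, prompting: 0, idle: 0 });

// Hypothetical accumulator; timestamps are passed in so the logic is testable.
class ActivityAccumulator {
  private totals: Totals = emptyTotals();
  private state: State = "idle";
  private since: number;
  private lastEditMs = 0;

  constructor(startMs: number) {
    this.since = startMs;
  }

  // Credit elapsed time to the state being left, then switch.
  private switchTo(next: State, atMs: number): void {
    this.totals[this.state] += atMs - this.since;
    this.state = next;
    this.since = atMs;
  }

  // Demote coding to reviewing once edits have stopped for 5 s.
  private tick(nowMs: number): void {
    if (this.state === "coding" && nowMs - this.lastEditMs >= REVIEW_AFTER_MS) {
      const demoteAt = Math.max(this.since, this.lastEditMs + REVIEW_AFTER_MS);
      this.switchTo("reviewing", demoteAt);
    }
  }

  // Focus changes (terminal, AI panel, window blur) call this directly.
  transition(next: State, nowMs: number): void {
    this.tick(nowMs);
    this.switchTo(next, nowMs);
  }

  onEdit(nowMs: number): void {
    this.transition("coding", nowMs);
    this.lastEditMs = nowMs;
  }

  // Totals since the last flush, e.g. to attach to the next heartbeat.
  flush(nowMs: number): Totals {
    this.tick(nowMs);
    this.switchTo(this.state, nowMs);
    const out = this.totals;
    this.totals = emptyTotals();
    return out;
  }
}
```

Because every millisecond between two events is credited to exactly one state, the per-activity totals always sum to the elapsed wall-clock time, which is the property that gap-based inference on the server cannot guarantee.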
Session Detection
A coding session ends after 5 minutes of inactivity. The server also applies a 300-second (5-minute) session timeout, kept for backward compatibility with older extension versions that don't send accumulated time.
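As an illustration of timeout-based session detection, the sketch below groups heartbeat timestamps into sessions. The function name `splitSessions`, the `Session` shape, and the choice that a gap strictly greater than the timeout starts a new session are assumptions, not the server's actual logic.

```typescript
const SESSION_TIMEOUT_MS = 300 * 1000; // 300-second timeout from the text above

interface Session {
  start: number; // epoch ms of first activity
  end: number;   // epoch ms of last activity
}

// Group activity timestamps into sessions separated by long gaps.
function splitSessions(
  timestamps: number[],
  timeoutMs: number = SESSION_TIMEOUT_MS,
): Session[] {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const sessions: Session[] = [];
  let current: Session | null = null;
  for (const t of sorted) {
    if (current !== null && t - current.end <= timeoutMs) {
      current.end = t; // still within the same session
    } else {
      current = { start: t, end: t }; // gap exceeded the timeout: new session
      sessions.push(current);
    }
  }
  return sessions;
}
```

For example, activity at minutes 0, 2, 4, 10, and 12 yields two sessions, because the 6-minute gap between minutes 4 and 10 exceeds the timeout.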
Activity Types
Coding
Active document edits. Triggered by any text change in a file. For most developers this is the primary activity type.
Reviewing
Viewing files without making edits. Includes scrolling, reading, and navigating code. Detected when you have a file open but haven't edited for 5+ seconds.
Terminal
Time spent in the integrated terminal. Tracked via terminal focus events and output detection. Includes running commands, reading logs, etc.
Prompting
Time spent in AI chat panels (Copilot Chat, Cursor Composer, Cline, etc.). Detected by monitoring when no text editor is active but the window is focused. Prompt submissions are tracked to enable accurate AI classification of subsequent edits.
AI Detection Methods
BuilderBox uses multiple detection techniques layered by confidence level. Higher-tier methods are tried first, with fallback to heuristics when API access isn't available.
Tier 1: Cursor Jump + Keystroke Timing (~92% confidence)
The most reliable detection for inline completions. When you accept a completion (via Tab, Enter, or click), the cursor jumps forward significantly:
- Large cursor jumps (20+ chars) on the same line are detected
- Must occur without recent keystrokes (150ms+ idle)
- Multi-line forward jumps also captured for block completions
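The three rules above could be combined roughly as follows. This is a sketch under stated assumptions: the `CursorMove` shape and function name are hypothetical, the move is assumed to accompany a text-change event, and the accepting key itself (Tab or Enter) is assumed to be excluded from the idle timer.

```typescript
// Hypothetical description of one cursor move that accompanied a text change.
interface CursorMove {
  fromLine: number;
  fromChar: number;
  toLine: number;
  toChar: number;
  msSinceLastKeystroke: number; // excludes the accepting key (assumption)
}

const MIN_JUMP_CHARS = 20; // "large cursor jump" threshold
const MIN_IDLE_MS = 150;   // required idle time before the jump

function looksLikeAcceptedCompletion(m: CursorMove): boolean {
  // Recent typing means the user probably produced the text themselves.
  if (m.msSinceLastKeystroke < MIN_IDLE_MS) return false;
  // Multi-line forward jump: treated as a block completion.
  if (m.toLine > m.fromLine) return true;
  // Same-line jump of 20+ characters.
  return m.toLine === m.fromLine && m.toChar - m.fromChar >= MIN_JUMP_CHARS;
}
```

In a real extension the cursor positions would likely come from selection-change events; that wiring is omitted here to keep the rule itself testable.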
Tier 2: Character-Level Timing (~85-90% confidence)
AI completions insert all characters in the same event loop tick (0ms between chars). Human typing has natural variance (50-300ms between keystrokes).
- 20+ characters inserted after 150ms of no keystrokes = likely AI
- Confidence scales with insertion size
- Catches completions accepted via Enter or other methods
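A minimal sketch of the timing rule, assuming a linear confidence ramp: the document states only that confidence scales with insertion size and falls in the ~85-90% range, so the ramp shape and the 200-character saturation point below are invented for illustration.

```typescript
const MIN_INSERT_CHARS = 20;  // threshold from the rule above
const MIN_QUIET_MS = 150;     // required time without keystrokes
const SATURATION_CHARS = 200; // assumed size at which confidence stops growing

// Returns 0 when the rule does not fire, otherwise a value in [0.85, 0.90].
function timingConfidence(insertedChars: number, msSinceLastKeystroke: number): number {
  if (insertedChars < MIN_INSERT_CHARS || msSinceLastKeystroke < MIN_QUIET_MS) {
    return 0;
  }
  // Assumed linear ramp between the threshold and the saturation point.
  const span = SATURATION_CHARS - MIN_INSERT_CHARS;
  const ramp = Math.min(1, (insertedChars - MIN_INSERT_CHARS) / span);
  return 0.85 + 0.05 * ramp;
}
```

Unlike the cursor-jump check, this rule looks only at the size and timing of the insertion, so it also fires when a completion is accepted by a key other than Tab.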
Tier 3: Multi-Cursor Detection (~90% confidence)
AI agents (Cline, Cursor Composer) often make simultaneous edits at multiple non-adjacent positions. Humans rarely do this.
- Multiple content changes in single event
- Changes at positions > 5 lines apart = agent activity
- Strong signal for detecting agentic AI tools
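The multi-position check can be sketched as below. The `ContentChange` type is a simplified stand-in for the entries in a text-change event, and measuring the gap between neighbouring changes (rather than between the first and last change) is an assumption made so that a human column edit over consecutive lines is not flagged.

```typescript
// Simplified stand-in for one entry of a text-change event.
interface ContentChange {
  startLine: number;
  text: string;
}

const MIN_LINE_GAP = 5; // changes more than 5 lines apart suggest an agent

function looksLikeAgentEdit(changes: ContentChange[]): boolean {
  if (changes.length < 2) return false; // a single change is not multi-cursor
  const lines = changes.map((c) => c.startLine).sort((a, b) => a - b);
  let prev: number | null = null;
  for (const line of lines) {
    // Flag the event if any two neighbouring changes are far apart.
    if (prev !== null && line - prev > MIN_LINE_GAP) return true;
    prev = line;
  }
  return false;
}
```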
Tier 3b: Prompt Window Detection (~85-90% confidence)
When using AI chat assistants (Cursor Composer, Copilot Chat, Cline), edits that occur shortly after sending a prompt are classified as AI-generated.
- Tracks when user sends prompts to AI chat panels
- Edits within 60 seconds of prompt submission = likely AI response
- Dynamic window: larger edits (500+ chars) extend window to 120 seconds
- Confidence scales with time since prompt and edit size
- Overrides the keystroke timing check, since typing a prompt would otherwise count as recent keystrokes
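The dynamic window described above might look like this in code. The function and constant names are hypothetical, and the sketch covers only the window test; how confidence is scaled by elapsed time and edit size is not specified in enough detail to reproduce.

```typescript
const BASE_WINDOW_MS = 60 * 1000;      // edits within 60 s of a prompt
const EXTENDED_WINDOW_MS = 120 * 1000; // window for large edits
const LARGE_EDIT_CHARS = 500;          // size that extends the window

// lastPromptMs is null when no prompt has been submitted yet.
function inPromptWindow(
  editMs: number,
  lastPromptMs: number | null,
  editChars: number,
): boolean {
  if (lastPromptMs === null) return false;
  const elapsed = editMs - lastPromptMs;
  const windowMs = editChars >= LARGE_EDIT_CHARS ? EXTENDED_WINDOW_MS : BASE_WINDOW_MS;
  return elapsed >= 0 && elapsed <= windowMs;
}
```

An edit that passes this test would be attributed to the AI response without consulting the keystroke timer.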
Tier 4: Heuristic Fallback (~70-80% confidence)
When direct detection isn't possible:
- Large insertions (50+ chars) without recent keystrokes
- Extension presence detection (is Copilot/Cursor installed?)
- Code pattern analysis (AI often generates complete functions)
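To show how a fallback tier can sit behind the higher-confidence ones, here is a sketch of a first-match detector chain with a heuristic detector at the end. Everything in it is an assumption for illustration: the `EditSignal` and `Verdict` shapes, the reuse of the 150ms idle threshold, and the mapping of extension presence to 0.8 versus 0.7 confidence. Code pattern analysis is left out.

```typescript
interface EditSignal {
  insertedChars: number;
  msSinceLastKeystroke: number;
  aiExtensionInstalled: boolean; // e.g. Copilot or Cursor detected
}

interface Verdict {
  tier: string;       // which detector fired
  confidence: number; // 0..1
}

// A detector returns a verdict when it fires, or null to defer to the next tier.
type Detector = (e: EditSignal) => Verdict | null;

// Fallback heuristic: large insertion (50+ chars) without recent keystrokes.
const heuristicFallback: Detector = (e) => {
  if (e.insertedChars >= 50 && e.msSinceLastKeystroke >= 150) {
    return { tier: "heuristic", confidence: e.aiExtensionInstalled ? 0.8 : 0.7 };
  }
  return null;
};

// Try detectors in order; the first one that fires wins.
// A null result means no detector fired, i.e. the edit is treated as human.
function classifyEdit(e: EditSignal, detectors: Detector[]): Verdict | null {
  for (const detect of detectors) {
    const verdict = detect(e);
    if (verdict !== null) return verdict;
  }
  return null;
}
```

Ordering the detector array from most to least reliable gives the "higher tiers first, heuristics last" behaviour without any tier needing to know about the others.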
Paste Exclusion
Paste operations are detected by comparing inserted text to clipboard content and are explicitly excluded from AI classification. Pasting code is a human action.
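A minimal sketch of the clipboard comparison, assuming the clipboard text has already been read (in a VS Code extension this could come from `vscode.env.clipboard.readText()`). The function name and the line-ending normalization are assumptions; editors may rewrite `\r\n` to `\n` on insert, so a byte-exact comparison could miss real pastes.

```typescript
// Normalize line endings so CRLF and LF clipboard content compare equal.
function normalizeNewlines(s: string): string {
  return s.replace(/\r\n/g, "\n");
}

// True when the inserted text matches the clipboard, i.e. the edit is a paste
// and should be excluded from AI classification.
function isPaste(insertedText: string, clipboardText: string): boolean {
  if (insertedText.length === 0) return false; // deletions are never pastes
  return normalizeNewlines(insertedText) === normalizeNewlines(clipboardText);
}
```

Running this check before any AI detector keeps large pasted blocks from being mistaken for completions, since they would otherwise satisfy the size and timing rules.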
Supported AI Tools
| Tool | Detection Method | Confidence |
|---|---|---|
| GitHub Copilot | Tab key + timing + extension API | ~92% |
| Cursor Tab | Tab key + timing | ~92% |
| Cursor Composer | Prompt window + multi-cursor + large edits | ~90% |
| Cline | Multi-cursor + extension detection | ~90% |
| Continue | Extension detection + heuristics | ~80% |
| Tabnine | Extension detection + heuristics | ~80% |
| Codeium | Extension detection + heuristics | ~80% |
| Others | Heuristic fallback | ~70% |
Data Flow
VS Code Extension
├── DocumentTracker (file changes)
├── InputMonitor (keystrokes, Tab key, paste)
├── SessionAccumulator (time by activity)
├── AIDetector (AI classification)
└── HeartbeatBuilder → HeartbeatQueue
│
▼
BuilderBox API (/api/heartbeats)
│
▼
PostgreSQL (coding_heartbeats)
│
▼
Aggregation (summaries, hourly stats)
What's Stored
- File paths - Which files you worked on
- Timestamps - When activity occurred
- Metrics - Characters added/deleted, lines changed
- AI attribution - Whether changes were AI-generated
- Activity type - Coding, reviewing, terminal, prompting
What's NOT Stored
- Actual code content
- AI prompts or responses
- Personal data beyond file paths
- Screen recordings or screenshots
Expected Accuracy
| Category | Accuracy | Method |
|---|---|---|
| Total coding time | ~98% | Client-side accumulation |
| Reviewing time | ~98% | Client-side accumulation |
| Terminal time | ~95% | Output events + 60s threshold |
| Prompting time | ~90% | AI panel detection |
| AI from inline completions | ~92% | Cursor jump + keystroke timing |
| AI from agents | ~90% | Multi-cursor detection |
Note: These are estimated accuracies based on our testing. Actual accuracy may vary depending on your workflow, AI tools used, and coding patterns.