
Context & Workspace

One of Mia’s key strengths is automatic context injection. Before every AI dispatch, Mia scans your workspace, gathers relevant information, and assembles it into a token-budgeted context payload.

The workspace scanner (workspace-scanner.ts) runs on a configurable interval (default: 30 minutes) and captures:

  • Current branch name
  • Staged and unstaged changes
  • Untracked files
  • Recent commits (last ~10)
  • Remote tracking status
  • Total file count
  • Directory listing
  • Recently modified files
  • Large files (potential binary assets)
  • Configuration files (package.json, tsconfig, etc.)
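The large-file check, for instance, can be a simple size threshold. A minimal sketch — the 1 MiB cutoff and the function name are assumptions, not the actual workspace-scanner.ts implementation:

```typescript
// Flag files over a size threshold as potential binary assets.
// LARGE_FILE_BYTES and findLargeFiles are illustrative names, not Mia's.
const LARGE_FILE_BYTES = 1024 * 1024; // 1 MiB (assumed threshold)

function findLargeFiles(files: Array<{ path: string; bytes: number }>): string[] {
  return files
    .filter((f) => f.bytes >= LARGE_FILE_BYTES)
    .map((f) => f.path);
}
```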

Mia detects the project type by checking for marker files:

Marker                              Project Type
---------------------------------   -------------
package.json                        Node.js / npm
requirements.txt / pyproject.toml   Python
Cargo.toml                          Rust
go.mod                              Go
pom.xml / build.gradle              Java
*.sln / *.csproj                    .NET
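The marker table above maps naturally to an ordered lookup. A minimal sketch, assuming a first-match-wins policy (the names and the fallback value are illustrative, not Mia's actual code):

```typescript
// Ordered marker → project type table; first match wins.
const PROJECT_MARKERS: Array<[RegExp, string]> = [
  [/^package\.json$/, "Node.js / npm"],
  [/^(requirements\.txt|pyproject\.toml)$/, "Python"],
  [/^Cargo\.toml$/, "Rust"],
  [/^go\.mod$/, "Go"],
  [/^(pom\.xml|build\.gradle)$/, "Java"],
  [/\.(sln|csproj)$/, ".NET"],
];

function detectProjectType(rootFiles: string[]): string {
  for (const [marker, type] of PROJECT_MARKERS) {
    if (rootFiles.some((f) => marker.test(f))) return type;
  }
  return "unknown"; // assumed fallback
}
```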

The scanner produces a WorkspaceSnapshot object:

interface WorkspaceSnapshot {
  workingDirectory: string
  totalFiles: number
  directories: string[]
  recentlyModified: string[]
  largeFiles: string[]
  configFiles: string[]
  projectType: string
  gitBranch?: string
  gitStatus?: string
  recentCommits?: string[]
}
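Before injection, the snapshot has to be rendered into prompt text. A hedged sketch of that step, typed against a subset of the fields above — the exact formatting Mia uses is an assumption:

```typescript
// Render a workspace snapshot into the text section injected into the
// prompt. The line layout here is illustrative, not Mia's actual format.
function formatSnapshot(s: {
  workingDirectory: string;
  totalFiles: number;
  projectType: string;
  gitBranch?: string;
  recentCommits?: string[];
  recentlyModified: string[];
}): string {
  const lines = [
    `Working directory: ${s.workingDirectory}`,
    `Project type: ${s.projectType} (${s.totalFiles} files)`,
  ];
  if (s.gitBranch) lines.push(`Branch: ${s.gitBranch}`);
  if (s.recentCommits?.length)
    lines.push(`Recent commits:\n  ${s.recentCommits.join("\n  ")}`);
  if (s.recentlyModified.length)
    lines.push(`Recently modified: ${s.recentlyModified.join(", ")}`);
  return lines.join("\n");
}
```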

Separate from the workspace snapshot, the daemon performs a one-time codebase scan on startup:

  • Language detection — Counts file extensions, identifies primary languages
  • Framework detection — Checks for React, Vue, Django, FastAPI, Express, etc.
  • File statistics — Total files, source files, test files
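Language detection by extension counting can be sketched as follows — the extension map and function names are illustrative assumptions, not Mia's actual table:

```typescript
// Count file extensions and return the most common languages.
const EXT_TO_LANG: Record<string, string> = {
  ".ts": "TypeScript",
  ".js": "JavaScript",
  ".py": "Python",
  ".rs": "Rust",
  ".go": "Go",
};

function primaryLanguages(files: string[], top = 2): string[] {
  const counts = new Map<string, number>();
  for (const f of files) {
    const dot = f.lastIndexOf(".");
    if (dot < 0) continue; // no extension
    const lang = EXT_TO_LANG[f.slice(dot)];
    if (lang) counts.set(lang, (counts.get(lang) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, top)
    .map(([lang]) => lang);
}
```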

This produces a compressed summary string (~500 characters) that’s cached for the daemon’s entire lifetime. It’s the first thing injected into context because it rarely changes and benefits from prompt caching.

The ContextBuilder assembles the full context in a specific order optimized for prompt caching:

  1. System prompt & personality — Never changes. Maximum cache benefit.
  2. User profile — Rarely changes (name, preferences, timezone).
  3. Codebase summary — Changes only on daemon restart.
  4. Workspace snapshot — Refreshes every ~30 minutes.
  5. Memory facts — Semi-stable, based on BM25 query results.
  6. Conversation history — Changes every turn.
  7. Current prompt — Changes every dispatch.

AI providers like Anthropic cache the prefix of a prompt. By putting stable content first, the majority of the context is cached across dispatches, reducing both latency and cost.
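The assembly order above can be sketched as a fixed, stability-ordered list of section keys — the builder API here is an assumption, not Mia's actual ContextBuilder:

```typescript
// Join available sections in a fixed order: most stable first, so the
// cached prompt prefix is maximized. Keys mirror the numbered list above.
const SECTION_ORDER = [
  "systemPrompt",
  "userProfile",
  "codebaseSummary",
  "workspaceSnapshot",
  "memoryFacts",
  "conversationHistory",
  "currentPrompt",
];

function buildContext(sections: Record<string, string>): string {
  return SECTION_ORDER
    .filter((name) => sections[name] !== undefined)
    .map((name) => sections[name])
    .join("\n\n");
}
```

Because the volatile sections (history, current prompt) always sit at the end, edits to them never invalidate the cached prefix built from the stable sections.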

Every section has an estimated token count (via tiktoken). The builder tracks usage against a budget:

Total context window: 200,000 tokens (model-dependent)
Reserved for output: 4,096 tokens
Available for context: 195,904 tokens

If the budget is exceeded:

  1. Truncate oversized sections (conversation history is trimmed first)
  2. Skip low-priority sections entirely
  3. Summarize long conversations into a compact summary
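The first two steps of that fallback can be sketched as below. This is a hedged illustration: token estimation is stubbed as characters / 4 rather than real tiktoken counts, and the section shape and names are assumptions:

```typescript
// Trim conversation history first, then drop low-priority sections,
// until the estimated token total fits the budget.
interface BudgetSection { name: string; content: string; lowPriority?: boolean }

// Crude stand-in for tiktoken: ~4 characters per token.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function fitToBudget(sections: BudgetSection[], budget: number): BudgetSection[] {
  const out = sections.map((s) => ({ ...s }));
  let used = out.reduce((n, s) => n + estimateTokens(s.content), 0);

  // Step 1: truncate conversation history (keep the most recent tail).
  const history = out.find((s) => s.name === "conversationHistory");
  if (history && used > budget) {
    const excessChars = (used - budget) * 4;
    used -= estimateTokens(history.content);
    history.content = history.content.slice(
      -Math.max(0, history.content.length - excessChars),
    );
    used += estimateTokens(history.content);
  }

  // Step 2: drop low-priority sections entirely if still over budget.
  return out.filter((s) => {
    if (used <= budget || !s.lowPriority) return true;
    used -= estimateTokens(s.content);
    return false;
  });
}
```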

Each piece of assembled context is cached with its own location and TTL:

Content                Cache Location                            TTL
Codebase summary       ~/.mia/context/codebase-<project>.txt     Daemon lifetime
Workspace snapshot     ~/.mia/context/workspace-<project>.json   30 minutes
Memory query results   In-memory LRU cache                       30 seconds
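The in-memory cache for memory query results amounts to a TTL map. A minimal sketch — the class name, API, and injectable clock are assumptions; only the 30-second TTL comes from the table above:

```typescript
// Map-backed cache whose entries expire after ttlMs. The clock is
// injectable so expiry can be tested deterministically.
class TtlCache<V> {
  private store = new Map<string, { value: V; expires: number }>();

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (this.now() > hit.expires) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expires: this.now() + this.ttlMs });
  }
}
```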

Context behavior is controlled through the dispatch pipeline config:

{
  "pluginDispatch": {
    "memoryExtraction": {
      "enabled": true,
      "maxFacts": 5
    }
  },
  "timeoutMs": 1800000
}

The --no-context flag on mia chat and mia ask skips workspace scanning entirely — useful for quick questions that don’t need project context.