
Context & Workspace

One of Mia’s key strengths is automatic context injection. Before every AI dispatch, Mia scans your workspace, gathers relevant information, and assembles it into a token-budgeted context payload.

The workspace scanner (workspace-scanner.ts) runs on a configurable interval (default: 30 minutes) and captures:

  • Current branch name
  • Staged and unstaged changes
  • Untracked files
  • Recent commits (last ~10)
  • Remote tracking status
  • Total file count
  • Directory listing
  • Recently modified files
  • Large files (potential binary assets)
  • Configuration files (package.json, tsconfig, etc.)
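The large-file check, for instance, can be a simple size threshold. A minimal sketch — the 1 MiB cutoff and the function name are assumptions, not the actual workspace-scanner.ts implementation:

```typescript
// Flag files over a size threshold as potential binary assets.
// LARGE_FILE_BYTES and findLargeFiles are illustrative names, not Mia's.
const LARGE_FILE_BYTES = 1024 * 1024; // 1 MiB (assumed threshold)

function findLargeFiles(files: Array<{ path: string; bytes: number }>): string[] {
  return files
    .filter((f) => f.bytes >= LARGE_FILE_BYTES)
    .map((f) => f.path);
}
```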

Mia detects the project type by checking for marker files:

Marker                              Project Type
---------------------------------   -------------
package.json                        Node.js / npm
requirements.txt / pyproject.toml   Python
Cargo.toml                          Rust
go.mod                              Go
pom.xml / build.gradle              Java
*.sln / *.csproj                    .NET
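The marker table above maps naturally to an ordered lookup. A minimal sketch, assuming a first-match-wins policy (the names and the fallback value are illustrative, not Mia's actual code):

```typescript
// Ordered marker → project type table; first match wins.
const PROJECT_MARKERS: Array<[RegExp, string]> = [
  [/^package\.json$/, "Node.js / npm"],
  [/^(requirements\.txt|pyproject\.toml)$/, "Python"],
  [/^Cargo\.toml$/, "Rust"],
  [/^go\.mod$/, "Go"],
  [/^(pom\.xml|build\.gradle)$/, "Java"],
  [/\.(sln|csproj)$/, ".NET"],
];

function detectProjectType(rootFiles: string[]): string {
  for (const [marker, type] of PROJECT_MARKERS) {
    if (rootFiles.some((f) => marker.test(f))) return type;
  }
  return "unknown"; // assumed fallback
}
```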

The scanner produces a WorkspaceSnapshot object:

interface WorkspaceSnapshot {
  workingDirectory: string
  totalFiles: number
  directories: string[]
  recentlyModified: string[]
  largeFiles: string[]
  configFiles: string[]
  projectType: string
  gitBranch?: string
  gitStatus?: string
  recentCommits?: string[]
}
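Before injection, the snapshot has to be rendered into prompt text. A hedged sketch of that step, typed against a subset of the fields above — the exact formatting Mia uses is an assumption:

```typescript
// Render a workspace snapshot into the text section injected into the
// prompt. The line layout here is illustrative, not Mia's actual format.
function formatSnapshot(s: {
  workingDirectory: string;
  totalFiles: number;
  projectType: string;
  gitBranch?: string;
  recentCommits?: string[];
  recentlyModified: string[];
}): string {
  const lines = [
    `Working directory: ${s.workingDirectory}`,
    `Project type: ${s.projectType} (${s.totalFiles} files)`,
  ];
  if (s.gitBranch) lines.push(`Branch: ${s.gitBranch}`);
  if (s.recentCommits?.length)
    lines.push(`Recent commits:\n  ${s.recentCommits.join("\n  ")}`);
  if (s.recentlyModified.length)
    lines.push(`Recently modified: ${s.recentlyModified.join(", ")}`);
  return lines.join("\n");
}
```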

Separate from the workspace snapshot, the daemon performs a one-time codebase scan on startup:

  • Language detection — Counts file extensions, identifies primary languages
  • Framework detection — Checks for React, Vue, Django, FastAPI, Express, etc.
  • File statistics — Total files, source files, test files
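Language detection by extension counting can be sketched as follows — the extension map and function names are illustrative assumptions, not Mia's actual table:

```typescript
// Count file extensions and return the most common languages.
const EXT_TO_LANG: Record<string, string> = {
  ".ts": "TypeScript",
  ".js": "JavaScript",
  ".py": "Python",
  ".rs": "Rust",
  ".go": "Go",
};

function primaryLanguages(files: string[], top = 2): string[] {
  const counts = new Map<string, number>();
  for (const f of files) {
    const dot = f.lastIndexOf(".");
    if (dot < 0) continue; // no extension
    const lang = EXT_TO_LANG[f.slice(dot)];
    if (lang) counts.set(lang, (counts.get(lang) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, top)
    .map(([lang]) => lang);
}
```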

This produces a compressed summary string (~500 characters) that’s cached for the daemon’s entire lifetime. It’s the first thing injected into context because it rarely changes and benefits from prompt caching.

The ContextBuilder assembles the full context in a specific order optimized for prompt caching:

  1. System prompt & personality — Never changes. Maximum cache benefit.
  2. User profile — Rarely changes (name, preferences, timezone).
  3. Codebase summary — Changes only on daemon restart.
  4. Workspace snapshot — Refreshes every ~30 minutes.
  5. Memory facts — Semi-stable, based on BM25 query results.
  6. Conversation history — Changes every turn.
  7. Current prompt — Changes every dispatch.

AI providers like Anthropic cache the prefix of a prompt. By putting stable content first, the majority of the context is cached across dispatches, reducing both latency and cost.
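The assembly order above can be sketched as a fixed, stability-ordered list of section keys — the builder API here is an assumption, not Mia's actual ContextBuilder:

```typescript
// Join available sections in a fixed order: most stable first, so the
// cached prompt prefix is maximized. Keys mirror the numbered list above.
const SECTION_ORDER = [
  "systemPrompt",
  "userProfile",
  "codebaseSummary",
  "workspaceSnapshot",
  "memoryFacts",
  "conversationHistory",
  "currentPrompt",
];

function buildContext(sections: Record<string, string>): string {
  return SECTION_ORDER
    .filter((name) => sections[name] !== undefined)
    .map((name) => sections[name])
    .join("\n\n");
}
```

Because the volatile sections (history, current prompt) always sit at the end, edits to them never invalidate the cached prefix built from the stable sections.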

Every section has an estimated token count (via tiktoken). The builder tracks usage against a budget:

Total context window: 200,000 tokens (model-dependent)
Reserved for output: 4,096 tokens
Available for context: 195,904 tokens

If the budget is exceeded:

  1. Truncate oversized sections (conversation history is trimmed first)
  2. Skip low-priority sections entirely
  3. Summarize long conversations into a compact summary
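The first two steps of that fallback can be sketched as below. This is a hedged illustration: token estimation is stubbed as characters / 4 rather than real tiktoken counts, and the section shape and names are assumptions:

```typescript
// Trim conversation history first, then drop low-priority sections,
// until the estimated token total fits the budget.
interface BudgetSection { name: string; content: string; lowPriority?: boolean }

// Crude stand-in for tiktoken: ~4 characters per token.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function fitToBudget(sections: BudgetSection[], budget: number): BudgetSection[] {
  const out = sections.map((s) => ({ ...s }));
  let used = out.reduce((n, s) => n + estimateTokens(s.content), 0);

  // Step 1: truncate conversation history (keep the most recent tail).
  const history = out.find((s) => s.name === "conversationHistory");
  if (history && used > budget) {
    const excessChars = (used - budget) * 4;
    used -= estimateTokens(history.content);
    history.content = history.content.slice(
      -Math.max(0, history.content.length - excessChars),
    );
    used += estimateTokens(history.content);
  }

  // Step 2: drop low-priority sections entirely if still over budget.
  return out.filter((s) => {
    if (used <= budget || !s.lowPriority) return true;
    used -= estimateTokens(s.content);
    return false;
  });
}
```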

Each piece of assembled context is cached with its own location and TTL:

Content                Cache Location                            TTL
Codebase summary       ~/.mia/context/codebase-<project>.txt     Daemon lifetime
Workspace snapshot     ~/.mia/context/workspace-<project>.json   30 minutes
Memory query results   In-memory LRU cache                       30 seconds
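The in-memory cache for memory query results amounts to a TTL map. A minimal sketch — the class name, API, and injectable clock are assumptions; only the 30-second TTL comes from the table above:

```typescript
// Map-backed cache whose entries expire after ttlMs. The clock is
// injectable so expiry can be tested deterministically.
class TtlCache<V> {
  private store = new Map<string, { value: V; expires: number }>();

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (this.now() > hit.expires) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expires: this.now() + this.ttlMs });
  }
}
```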

Context behavior is controlled through the dispatch pipeline config:

{
  "pluginDispatch": {
    "memoryExtraction": {
      "enabled": true,
      "maxFacts": 5
    }
  },
  "timeoutMs": 1800000
}

The --no-context flag on mia chat and mia ask skips workspace scanning entirely — useful for quick questions that don’t need project context.