Internal Architecture
This section explains how datamitsu works under the hood. Understanding the internal execution model helps wrapper maintainers optimize tool configurations and advanced users debug unexpected behavior.
How It All Fits Together
When you run `datamitsu check`, the system moves through four stages:
- File Discovery walks the repository tree, respecting `.gitignore` rules, and collects all files that match tool glob patterns.
- Task Planning groups matched files into tasks based on tool priorities, scopes, and project boundaries. Overlapping globs are detected and resolved.
- Parallel Execution runs task groups sequentially by priority level, but tasks within each group run in parallel across available CPU cores.
- Cache Update records results per file so unchanged files are skipped on the next run.
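
The execution stage above (groups run one after another by priority, tasks inside a group run in parallel) can be sketched in a few lines of Python. This is an illustrative sketch only, not datamitsu's implementation: `run_by_priority`, the `(priority, name, fn)` task shape, and the choice that lower priority numbers run first are all assumptions made for the example. Threads stand in for whatever worker mechanism the real tool uses, and error handling is simplified.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_by_priority(tasks, max_workers=4):
    """Run (priority, name, fn) tasks group by group.

    Hypothetical helper: groups run sequentially in ascending priority
    order (an assumption), and tasks inside a group run concurrently.
    Returns the groups in the order they completed.
    """
    groups = defaultdict(list)
    for priority, name, fn in tasks:
        groups[priority].append((name, fn))

    completed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for priority in sorted(groups):
            # Submit the whole group at once so its tasks overlap in time.
            futures = [pool.submit(fn) for _, fn in groups[priority]]
            # Block until every task in this group has finished before
            # starting the next group; result() re-raises task exceptions.
            for future in futures:
                future.result()
            names = sorted(name for name, _ in groups[priority])
            completed.append((priority, names))
    return completed


# Made-up tasks: one formatter at priority 10, two linters sharing priority 20.
log = []
tasks = [
    (20, "lint-js", lambda: log.append("lint-js")),
    (10, "format", lambda: log.append("format")),
    (20, "lint-css", lambda: log.append("lint-css")),
]
result = run_by_priority(tasks)
```

In this sketch the two linters share a priority, so they form one group and can overlap in time. Giving each of the three tasks a distinct priority would produce three groups that run strictly one after another.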
Why This Matters
For wrapper maintainers: Understanding how priorities and overlap detection work lets you write tool configurations that maximize parallelism. Misconfigured priorities can serialize tools that could run in parallel, slowing down CI pipelines.
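
To make overlap detection concrete, here is a minimal Python sketch that reports which files are claimed by more than one tool. It is an assumption-laden illustration: `find_overlaps`, the tool names, and the glob lists are invented for the example, and Python's `fnmatch` semantics may differ from the glob dialect datamitsu uses. The sketch covers detection only; how overlaps are resolved is described on the Task Planning page.

```python
from fnmatch import fnmatch


def find_overlaps(tool_globs, files):
    """Return {file: [tools]} for files matched by more than one tool.

    Hypothetical helper; tool_globs maps a tool name to its glob patterns.
    """
    overlaps = {}
    for path in files:
        matched = sorted(
            tool
            for tool, patterns in tool_globs.items()
            if any(fnmatch(path, pattern) for pattern in patterns)
        )
        if len(matched) > 1:
            overlaps[path] = matched
    return overlaps


# Made-up configuration: two tools both claim TypeScript files.
tool_globs = {
    "formatter": ["*.ts", "*.json"],
    "linter": ["*.ts"],
    "go-format": ["*.go"],
}
files = ["src/app.ts", "package.json", "main.go"]
overlaps = find_overlaps(tool_globs, files)
```

Here only `src/app.ts` is reported, because it is the one file matched by two tools. Narrowing globs so that tools do not claim the same files is one way a wrapper maintainer might reduce such overlaps.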
For advanced users: Knowing how file discovery and caching interact explains why certain files are or aren't processed, and how to force cache invalidation when needed.
Components
Each stage has its own detailed documentation:
| Component | What It Does | Key Concepts |
|---|---|---|
| Task Planning | Groups files into prioritized task batches | Priority chunking, overlap detection, CWD-subtree restriction |
| Parallel Execution | Runs tasks with fail-fast semantics | Two-layer model, context cancellation, progress tracking |
| File Discovery | Walks the repo respecting ignore rules | .gitignore-aware traversal, project auto-detection |
| Caching Strategy | Tracks per-file results for incremental runs | XXH3-128 invalidation keys, separate lint/fix tracking |
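
The caching concepts in the table (128-bit invalidation keys, separate lint/fix tracking) can be illustrated with a small Python sketch. Several things here are assumptions rather than datamitsu's actual design: the `ResultCache` class, the inclusion of a tool configuration string in the key, and the in-memory storage. BLAKE2b with a 16-byte digest is used as a stand-in for XXH3-128, which is not available in the Python standard library.

```python
import hashlib


def invalidation_key(content: bytes, tool_config: str) -> str:
    """Return a 128-bit hex key over tool config and file content.

    BLAKE2b (16-byte digest) stands in for XXH3-128 here. Mixing the
    tool config into the key is an assumption made for this sketch.
    """
    h = hashlib.blake2b(digest_size=16)
    h.update(tool_config.encode("utf-8"))
    h.update(b"\0")  # separator so config and content cannot blur together
    h.update(content)
    return h.hexdigest()


class ResultCache:
    """Hypothetical in-memory cache keyed by (path, operation)."""

    def __init__(self):
        self._entries = {}

    def is_fresh(self, path, operation, key):
        # A file can be skipped only if the same operation last
        # succeeded with exactly this key.
        return self._entries.get((path, operation)) == key

    def record(self, path, operation, key):
        self._entries[(path, operation)] = key


cache = ResultCache()
key_v1 = invalidation_key(b"const x = 1;\n", "tool-config-v1")
cache.record("src/app.ts", "lint", key_v1)

# Editing the file changes its key, so the old entry no longer matches.
key_v2 = invalidation_key(b"const x = 2;\n", "tool-config-v1")
```

Because entries are tracked per operation in this sketch, a successful `lint` result says nothing about `fix`: the file would still be processed the first time `fix` runs.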
Reading Order
If you're new to datamitsu's internals, read in this order:
- File Discovery -- how files enter the system
- Task Planning -- how files become tasks
- Parallel Execution -- how tasks run
- Caching Strategy -- how results persist between runs
If you're debugging a specific issue, jump directly to the relevant component page.