Core Concepts
Workspace hardening
Threading analyzes your code and prepares it for reproducible, accelerated execution.
A hardened workspace includes:
- Optimized kernels — Bottlenecks vectorized and parallelized
- Dependency lock — Complete, pinned environment
- Provenance manifest — Hashes of code, data, and parameters
- Configuration — Extracted parameters
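Provenance
The provenance manifest records everything needed to reproduce an experiment exactly: hashes of code, notebooks, and data, plus the parameter values used. For example: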
code:
  - path: src/pca.py
    hash: a13f9c2e
notebooks:
  - path: 03_pca_analysis.ipynb
    hash: f91b2e88
data:
  - path: data/processed/counts_filtered.tsv
    sha256: 8f3a2e1c...
parameters:
  source: config/params.yaml
  values:
    pca.n_components: 50
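To make the idea of content hashes concrete, here is a minimal sketch of how entries like those in the manifest could be computed. It is not Threading's implementation: the function names `sha256_of_file` and `build_manifest` are hypothetical, the file names are placeholders, and the sketch uses full SHA-256 digests under a `sha256` key for every file, whereas the manifest shown uses a shorter `hash` field for code and notebooks whose algorithm is not specified here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, code: list[str], data: list[str], params: dict) -> dict:
    """Assemble a manifest-like dict (hypothetical layout, loosely mirroring the example)."""
    return {
        "code": [{"path": p, "sha256": sha256_of_file(root / p)} for p in code],
        "data": [{"path": p, "sha256": sha256_of_file(root / p)} for p in data],
        "parameters": {"values": dict(params)},
    }


# Demo on throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pca.py").write_text("print('pca')\n")
    (root / "counts.tsv").write_text("gene\tcount\nA\t3\n")
    manifest = build_manifest(
        root,
        code=["pca.py"],
        data=["counts.tsv"],
        params={"pca.n_components": 50},
    )
    print(manifest["code"][0]["path"], manifest["code"][0]["sha256"][:8])
```

Because each digest depends only on file contents, any edit to code or data changes the corresponding hash, which is what lets a manifest detect that a rerun is no longer the same experiment.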
UV (Unified Versioning)
The UV tree captures your complete execution environment:
- Direct dependencies
- Transitive dependencies
- System dependencies
- Implicit imports
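As a rough illustration of the last item, the sketch below scans source code for imports and reports modules that are used but not declared. This is only one assumed reading of "implicit imports", not a description of how Threading builds the UV tree; `imported_modules` and `undeclared_imports` are hypothetical helper names, and the declared set is supplied by hand.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported anywhere in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) refer to local packages, so skip them.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


def undeclared_imports(source: str, declared: set[str]) -> set[str]:
    """Modules the code imports that are missing from the declared set."""
    return imported_modules(source) - declared


example = """
import numpy as np
from sklearn.decomposition import PCA
from . import helpers
import os.path
"""

print(sorted(undeclared_imports(example, declared={"numpy", "os"})))
```

Note that import names do not always match package names (for example, `sklearn` is installed as `scikit-learn`), so a real tool would need a mapping between the two.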
Computation graph
Threading builds a directed acyclic graph (DAG) from your code.
This enables dependency analysis, parallelization, caching, and checkpointing.
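The following sketch shows why a DAG is useful for scheduling. It uses Python's standard `graphlib` module rather than anything from Threading, and the pipeline steps (`load`, `filter`, `pca`, and so on) are made-up names for illustration. Steps that become ready at the same time have no dependency on each other, so they could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
graph = {
    "load": set(),
    "filter": {"load"},
    "normalize": {"filter"},
    "pca": {"normalize"},
    "qc_report": {"filter"},
    "plot": {"pca", "qc_report"},
}


def parallel_stages(deps):
    """Group steps into stages; steps within a stage are mutually independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


for i, stage in enumerate(parallel_stages(graph)):
    print(i, stage)
```

Here `normalize` and `qc_report` land in the same stage because both depend only on `filter`. The same structure supports caching and checkpointing: a step needs to be recomputed only if it or one of its upstream steps has changed.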
Kernel optimization
Threading identifies atomic computation units and applies:
| Strategy | Speedup |
|---|---|
| Vectorization | 5-50x |
| Parallelization | 4-32x |
| GPU offload | 10-1000x |
| Memory layout | 2-10x |
| Kernel fusion | 1.5-3x |
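As a small illustration of one strategy from the table, the sketch below shows kernel (loop) fusion in plain Python. The function names are hypothetical and this is not Threading's optimizer; it only shows the transformation itself: two passes with an intermediate result are merged into a single pass that produces the same output.

```python
def scale_then_shift_unfused(values, scale, shift):
    """Two separate passes over the data, with an intermediate list in between."""
    scaled = [v * scale for v in values]   # pass 1: materializes an intermediate
    return [s + shift for s in scaled]     # pass 2: reads the intermediate back


def scale_then_shift_fused(values, scale, shift):
    """The same computation in one pass, with no intermediate list."""
    return [v * scale + shift for v in values]


data = [1.0, 2.0, 3.0, 4.0]
assert scale_then_shift_unfused(data, 2.0, 0.5) == scale_then_shift_fused(data, 2.0, 0.5)
print(scale_then_shift_fused(data, 2.0, 0.5))
```

The benefit of fusion comes from avoiding the extra memory traffic of the intermediate result. Actual gains depend on the workload and hardware, and this toy example makes no claim about matching the ranges in the table.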
Data parallelism
The same computation runs on different data shards across nodes.
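A minimal sketch of the pattern, under stated assumptions: worker threads from the standard library stand in for nodes, and the helper names (`make_shards`, `shard_sum_of_squares`, `parallel_sum_of_squares`) are hypothetical. Each worker applies the same function to its own shard, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shards(items, n_shards):
    """Split items into n_shards contiguous slices of near-equal size."""
    size, extra = divmod(len(items), n_shards)
    shards, start = [], 0
    for i in range(n_shards):
        end = start + size + (1 if i < extra else 0)
        shards.append(items[start:end])
        start = end
    return shards


def shard_sum_of_squares(shard):
    """The per-shard computation; identical on every worker."""
    return sum(x * x for x in shard)


def parallel_sum_of_squares(items, n_workers=4):
    """Map the same function over all shards, then reduce the partial results."""
    shards = make_shards(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(shard_sum_of_squares, shards))
    return sum(partials)


print(parallel_sum_of_squares(list(range(10))))
```

This works because the sum of squares decomposes cleanly over shards. Computations that need information from other shards require an extra communication or reduction step.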