Architecture
LCCC is a fork of CCC. The core compilation pipeline is unchanged; LCCC replaces and improves specific components.
Relationship to CCC
CCC (Claude’s C Compiler) is a C compiler written from scratch in Rust. It implements the full toolchain — frontend, SSA IR, optimizer, code generators for four architectures, assembler, and linker — with zero external dependencies.
LCCC is a fork tracked as a git submodule. The ccc/ directory contains the compiler source; lccc-improvements/ contains analysis, benchmarks, and documentation for the improvements. Changes are made in the submodule and tested against the full upstream test suite before landing.
lccc/
├── ccc/ ← git submodule (compiler source, CC0 licensed)
│ ├── src/
│ │ ├── frontend/ ← lexer, parser, type-checker
│ │ ├── ir/ ← SSA IR, mem2reg, analysis
│ │ ├── passes/ ← optimizer: GVN, LICM, IPCP, DCE, inliner…
│ │ └── backend/ ← code generation, regalloc, assembler, linker
│ └── Cargo.toml
├── lccc-improvements/
│ ├── register-allocation/ ← Phase 1 design docs
│ └── benchmarks/ ← bench.py + C benchmark sources
├── index.html ← this site (landing page)
├── docs/ ← this site (documentation)
└── Cargo.toml ← workspace root
Compilation Pipeline
C source
│
▼ frontend/
│ ├── lexer.rs — tokenize
│ ├── parser.rs — build AST (recursive descent)
│ └── codegen_ir.rs — lower AST → SSA IR
│
▼ ir/
│ ├── mem2reg.rs — promote alloca → SSA phi nodes
│ └── analysis/ — dominator tree, loop analysis, liveness
│
▼ passes/ (optimizer — all levels run the same pipeline)
│ ├── inline.rs — function inlining
│ ├── gvn.rs — global value numbering / CSE
│ ├── licm.rs — loop-invariant code motion
│ ├── ipcp.rs — interprocedural constant propagation
│ ├── dce.rs — dead code elimination
│ ├── constant_fold — constant folding
│ ├── copy_prop — copy propagation
│ └── cfg_simplify — branch threading, dead block removal
│
▼ backend/ (per-architecture)
│ ├── regalloc.rs ← LCCC: two-pass linear scan (replaces greedy)
│ ├── live_range.rs ← LCCC: LiveRange, LinearScanAllocator
│ ├── liveness.rs — backward-dataflow live interval computation
│ ├── generation.rs — instruction selection + emission
│ ├── peephole — architecture-specific strength reduction
│ ├── stack_layout/ — stack frame layout after regalloc
│ ├── elf/ — ELF object file writer
│ └── linker_common/ — standalone linker
│
▼
ELF executable
What LCCC Changes
Phase 2: Register Allocator (complete)
File: ccc/src/backend/regalloc.rs, ccc/src/backend/live_range.rs
The old allocator uses three greedy phases with a conservative eligibility whitelist (~5% of IR values). LCCC replaces the allocation core with a two-pass linear scan:
| Old (CCC) | New (LCCC) | |
|---|---|---|
| Algorithm | Greedy priority sort | Linear scan with eviction |
| Phase 1 | Callee-saved for call-spanning values only | Callee-saved for all eligible values |
| Phase 2 | Caller-saved for non-call-spanning values | Caller-saved for unallocated non-call-spanning values |
| Phase 3 | Callee-saved spillover | — (folded into Phase 1) |
| Spill decision | Just skip the value | Evict lowest-weight active interval |
| Eligibility filter | Kept intact (correctness boundary) | Kept intact (same rules) |
The eligibility filter — which excludes floats, i128, atomic pointers, memcpy pointers, and VA arg pointers — is unchanged. It is the correctness boundary between safe and unsafe register allocation.
Licensing Model
LCCC uses a dual-license approach:
- LCCC contributions (new code, analysis, benchmarks): MIT OR Apache-2.0 OR BSD-2-Clause
- CCC-derived code (the
ccc/submodule): CC0 1.0 (public domain dedication)
When a file contains both, both licenses apply to their respective portions.
Architecture-Agnostic Register Allocation
The allocator works through a small, stable interface:
pub struct RegAllocConfig {
pub available_regs: Vec<PhysReg>, // callee-saved
pub caller_saved_regs: Vec<PhysReg>, // caller-saved
pub allow_inline_asm_regalloc: bool,
}
pub fn allocate_registers(func: &IrFunction, config: &RegAllocConfig) -> RegAllocResult;
Each architecture backend (x86, ARM, RISC-V, i686) calls allocate_registers with its own register list. PhysReg(n) is just a numeric index — the allocator never knows which architecture it is running on.
| Architecture | Callee-saved available | Caller-saved available |
|---|---|---|
| x86-64 | rbx, r12–r15 (4–5 regs) | r10, r11, r8, r9 (4 regs) |
| AArch64 | x20–x28 (up to 9 regs) | x13, x14 (2 regs) |
| RISC-V 64 | s1, s7–s11 (6 regs) | (varies) |
| i686 | ebx, esi, edi (3 regs) | — |