Architecture

LCCC is a fork of CCC. The core compilation pipeline is unchanged; LCCC replaces and improves specific components.

Relationship to CCC

CCC (Claude’s C Compiler) is a C compiler written from scratch in Rust. It implements the full toolchain — frontend, SSA IR, optimizer, code generators for four architectures, assembler, and linker — with zero external dependencies.

LCCC tracks its fork of CCC as a git submodule. The ccc/ directory contains the compiler source; lccc-improvements/ contains analysis, benchmarks, and documentation for the improvements. Changes are made in the submodule and tested against the full upstream test suite before landing.

lccc/
├── ccc/                    ← git submodule (compiler source, CC0 licensed)
│   ├── src/
│   │   ├── frontend/       ← lexer, parser, type-checker
│   │   ├── ir/             ← SSA IR, mem2reg, analysis
│   │   ├── passes/         ← optimizer: GVN, LICM, IPCP, DCE, inliner…
│   │   └── backend/        ← code generation, regalloc, assembler, linker
│   └── Cargo.toml
├── lccc-improvements/
│   ├── register-allocation/   ← Phase 1 design docs
│   └── benchmarks/            ← bench.py + C benchmark sources
├── index.html              ← this site (landing page)
├── docs/                   ← this site (documentation)
└── Cargo.toml              ← workspace root

Compilation Pipeline

C source
   │
   ▼  frontend/
   │  ├── lexer.rs       — tokenize
   │  ├── parser.rs      — build AST (recursive descent)
   │  └── codegen_ir.rs  — lower AST → SSA IR
   │
   ▼  ir/
   │  ├── mem2reg.rs     — promote alloca → SSA phi nodes
   │  └── analysis/      — dominator tree, loop analysis, liveness
   │
   ▼  passes/  (optimizer — all levels run the same pipeline)
   │  ├── inline.rs      — function inlining
   │  ├── gvn.rs         — global value numbering / CSE
   │  ├── licm.rs        — loop-invariant code motion
   │  ├── ipcp.rs        — interprocedural constant propagation
   │  ├── dce.rs         — dead code elimination
   │  ├── constant_fold  — constant folding
   │  ├── copy_prop      — copy propagation
   │  └── cfg_simplify   — branch threading, dead block removal
   │
   ▼  backend/  (per-architecture)
   │  ├── regalloc.rs    ← LCCC: two-pass linear scan (replaces greedy)
   │  ├── live_range.rs  ← LCCC: LiveRange, LinearScanAllocator
   │  ├── liveness.rs    — backward-dataflow live interval computation
   │  ├── generation.rs  — instruction selection + emission
   │  ├── peephole       — architecture-specific strength reduction
   │  ├── stack_layout/  — stack frame layout after regalloc
   │  ├── elf/           — ELF object file writer
   │  └── linker_common/ — standalone linker
   │
   ▼
ELF executable
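
The diagram describes liveness.rs as a backward-dataflow computation. As a rough illustration of that general technique (not LCCC's actual code), the sketch below computes per-block live-in and live-out sets by iterating to a fixed point. The Block type, its fields, and the live_in_out function are hypothetical names invented for this example; the real pass works on the SSA IR and produces live intervals rather than plain sets.

```rust
use std::collections::HashSet;

// Hypothetical basic-block summary; values are identified by plain integer ids.
pub struct Block {
    pub uses: HashSet<u32>, // values read in this block before any write to them
    pub defs: HashSet<u32>, // values written in this block
    pub succs: Vec<usize>,  // indices of successor blocks
}

/// Classic backward liveness:
///   live_out[b] = union of live_in[s] for every successor s
///   live_in[b]  = uses[b] + (live_out[b] - defs[b])
/// Repeats until no set changes.
pub fn live_in_out(blocks: &[Block]) -> (Vec<HashSet<u32>>, Vec<HashSet<u32>>) {
    let n = blocks.len();
    let mut live_in = vec![HashSet::new(); n];
    let mut live_out = vec![HashSet::new(); n];
    let mut changed = true;
    while changed {
        changed = false;
        // Visiting blocks in reverse order usually converges faster for a backward problem.
        for b in (0..n).rev() {
            let mut out: HashSet<u32> = HashSet::new();
            for &s in &blocks[b].succs {
                out.extend(live_in[s].iter().copied());
            }
            let mut inn: HashSet<u32> = blocks[b].uses.clone();
            inn.extend(out.difference(&blocks[b].defs).copied());
            if inn != live_in[b] || out != live_out[b] {
                live_in[b] = inn;
                live_out[b] = out;
                changed = true;
            }
        }
    }
    (live_in, live_out)
}
```

A value's live interval can then be approximated as the span from its definition to the last block in which it is still live, which is the kind of input a linear-scan allocator consumes.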

What LCCC Changes

Phase 2: Register Allocator (complete)

Files: ccc/src/backend/regalloc.rs, ccc/src/backend/live_range.rs

The old allocator uses three greedy phases with a conservative eligibility whitelist (~5% of IR values). LCCC replaces the allocation core with a two-pass linear scan:

                    Old (CCC)                                    New (LCCC)
Algorithm           Greedy priority sort                         Linear scan with eviction
Phase 1             Callee-saved for call-spanning values only   Callee-saved for all eligible values
Phase 2             Caller-saved for non-call-spanning values    Caller-saved for unallocated non-call-spanning values
Phase 3             Callee-saved spillover                       n/a (folded into Phase 1)
Spill decision      Just skip the value                          Evict lowest-weight active interval
Eligibility filter  Kept intact (correctness boundary)           Kept intact (same rules)

The eligibility filter — which excludes floats, i128, atomic pointers, memcpy pointers, and VA arg pointers — is unchanged. It is the correctness boundary between safe and unsafe register allocation.
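
To make the two-pass scheme more concrete, here is a minimal sketch of linear scan with lowest-weight eviction, followed by a second pass over caller-saved registers. It is an illustration under stated assumptions, not the code in regalloc.rs or live_range.rs: the Interval struct, its fields (including weight and spans_call), and the functions linear_scan and two_pass are hypothetical, and the eligibility filter is assumed to have run already.

```rust
// Illustrative only: these types are stand-ins, not LCCC's LiveRange or LinearScanAllocator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PhysReg(pub u8);

#[derive(Debug, Clone)]
pub struct Interval {
    pub value: u32,       // hypothetical IR value id
    pub start: u32,       // first program point where the value is live
    pub end: u32,         // last program point where the value is live
    pub weight: u32,      // hypothetical spill weight; higher means more worth keeping
    pub spans_call: bool, // true if a call occurs inside [start, end]
}

/// One linear-scan pass. Returns (value, register) for every interval kept in a register.
pub fn linear_scan(mut intervals: Vec<Interval>, regs: &[PhysReg]) -> Vec<(u32, PhysReg)> {
    intervals.sort_by_key(|iv| iv.start);
    let mut free: Vec<PhysReg> = regs.iter().rev().copied().collect();
    let mut active: Vec<(Interval, PhysReg)> = Vec::new();
    let mut assigned: Vec<(u32, PhysReg)> = Vec::new();

    for iv in intervals {
        // Expire intervals that ended before this one starts and reclaim their registers.
        let mut i = 0;
        while i < active.len() {
            if active[i].0.end < iv.start {
                let (_, reg) = active.swap_remove(i);
                free.push(reg);
            } else {
                i += 1;
            }
        }
        if let Some(reg) = free.pop() {
            assigned.push((iv.value, reg));
            active.push((iv, reg));
            continue;
        }
        // No free register: look at the lowest-weight interval currently holding one.
        let victim = active
            .iter()
            .enumerate()
            .min_by_key(|(_, (a, _))| a.weight)
            .map(|(idx, _)| idx);
        if let Some(idx) = victim {
            if active[idx].0.weight < iv.weight {
                // Evict it: the evicted value loses its register for its whole lifetime.
                let (evicted, reg) = active.swap_remove(idx);
                assigned.retain(|(v, _)| *v != evicted.value);
                assigned.push((iv.value, reg));
                active.push((iv, reg));
            }
            // Otherwise the new interval is the cheapest one and stays on the stack.
        }
    }
    assigned
}

/// Pass 1: callee-saved registers for every eligible interval.
/// Pass 2: caller-saved registers for intervals left over that do not span a call.
pub fn two_pass(
    intervals: Vec<Interval>,
    callee_saved: &[PhysReg],
    caller_saved: &[PhysReg],
) -> Vec<(u32, PhysReg)> {
    let mut result = linear_scan(intervals.clone(), callee_saved);
    let leftover: Vec<Interval> = intervals
        .into_iter()
        .filter(|iv| !iv.spans_call && !result.iter().any(|(v, _)| *v == iv.value))
        .collect();
    result.extend(linear_scan(leftover, caller_saved));
    result
}
```

The difference from the old "just skip the value" behaviour shows up in the eviction branch: a heavily used value that arrives late can displace a cheaper one that arrived first, instead of being left on the stack merely because it sorted later.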

Licensing Model

LCCC uses a dual-license approach:

When a file contains both, both licenses apply to their respective portions.

Architecture-Agnostic Register Allocation

The allocator works through a small, stable interface:

pub struct RegAllocConfig {
    pub available_regs:        Vec<PhysReg>,  // callee-saved
    pub caller_saved_regs:     Vec<PhysReg>,  // caller-saved
    pub allow_inline_asm_regalloc: bool,
}

pub fn allocate_registers(func: &IrFunction, config: &RegAllocConfig) -> RegAllocResult;

Each architecture backend (x86-64, AArch64, RISC-V 64, i686) calls allocate_registers with its own register lists. PhysReg(n) is just a numeric index, so the allocator never needs to know which architecture it is targeting.

Architecture  Callee-saved available     Caller-saved available
x86-64        rbx, r12–r15 (4–5 regs)    r10, r11, r8, r9 (4 regs)
AArch64       x20–x28 (up to 9 regs)     x13, x14 (2 regs)
RISC-V 64     s1, s7–s11 (6 regs)        (varies)
i686          ebx, esi, edi (3 regs)
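
As a sketch of how a backend might fill in RegAllocConfig, the example below builds a configuration from the x86-64 row of the table and maps PhysReg indices back to register names on the backend side. The index numbering, the x86_64_config and x86_64_reg_name helpers, and the value chosen for allow_inline_asm_regalloc are assumptions made for illustration; the real backends may number and expose registers differently.

```rust
// Hypothetical sketch: the PhysReg numbering below is invented for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PhysReg(pub u8);

// Same fields as the struct shown above.
pub struct RegAllocConfig {
    pub available_regs: Vec<PhysReg>,    // callee-saved
    pub caller_saved_regs: Vec<PhysReg>, // caller-saved
    pub allow_inline_asm_regalloc: bool,
}

// Register names taken from the x86-64 row of the table. The table says 4 to 5
// callee-saved registers are available; this sketch simply offers all five.
const X86_64_CALLEE_SAVED: [&str; 5] = ["rbx", "r12", "r13", "r14", "r15"];
const X86_64_CALLER_SAVED: [&str; 4] = ["r10", "r11", "r8", "r9"];

/// Builds a config where indices 0..5 are callee-saved and 5..9 are caller-saved.
pub fn x86_64_config() -> RegAllocConfig {
    let n_callee = X86_64_CALLEE_SAVED.len() as u8;
    let n_caller = X86_64_CALLER_SAVED.len() as u8;
    RegAllocConfig {
        available_regs: (0..n_callee).map(PhysReg).collect(),
        caller_saved_regs: (n_callee..n_callee + n_caller).map(PhysReg).collect(),
        allow_inline_asm_regalloc: false, // assumed default for this sketch
    }
}

/// Only the backend knows what an index means; the allocator sees opaque numbers.
pub fn x86_64_reg_name(reg: PhysReg) -> Option<&'static str> {
    X86_64_CALLEE_SAVED
        .iter()
        .chain(X86_64_CALLER_SAVED.iter())
        .nth(reg.0 as usize)
        .copied()
}
```

A backend would pass the result of x86_64_config() to allocate_registers and translate each PhysReg in the result with x86_64_reg_name during emission. Another architecture would supply its own name table and leave the allocator untouched.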