Real-trace SDPO alignment validation
Runs the full ingestion → adapter → collator → SDPO data path against your
own local Claude Code session logs (~/.claude/projects/**/*.jsonl) and reports
the live SDPO mask alignment ratio. This is the population-level proof that
Wave 21's _build_chat_aligned_mask fix holds on real-world data, not just the
synthetic fixture.
Run
python examples/validate_real_trace_alignment/run.py
# options:
#   --projects-dir ~/.claude/projects     where to discover sessions
#   --max-sessions 8                      how many error-bearing sessions to sample
#   --model Qwen/Qwen2.5-0.5B-Instruct    a real chat-template tokenizer
#   --pass-threshold 0.95                 min alignment ratio to PASS
#   --strip-thinking                      (default OFF; see below)
Exit codes: 0 = PASS (alignment ≥ threshold, no crashes); 1 = FAIL; 2 = no
error-bearing sessions found, or the tokenizer has no chat template.
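The exit-code contract above can be sketched as a small decision function. This is a minimal illustration, not the script's actual code: the name exit_code_for and all of its parameters are hypothetical, and treating a run with zero measurable positions as FAIL is an assumption.

```python
def exit_code_for(aligned: int, total: int, *, threshold: float = 0.95,
                  crashed: bool = False, found_sessions: bool = True,
                  has_chat_template: bool = True) -> int:
    """Map one validation run onto the documented exit codes (hypothetical helper)."""
    # 2: nothing to measure (no error-bearing sessions, or no chat template).
    if not found_sessions or not has_chat_template:
        return 2
    # 1: the run crashed, or (assumption) there were no in-loss positions at all.
    if crashed or total == 0:
        return 1
    # 0 only when the alignment ratio meets the pass threshold.
    return 0 if aligned / total >= threshold else 1


if __name__ == "__main__":
    # Counts taken from the representative result in this README.
    print(exit_code_for(832, 832))
```

A wrapper such as a CI job can then branch on the three codes separately, so that "nothing to validate" (2) is not mistaken for a regression (1).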
What it measures
- ingestion yield: states emitted, error sites detected
- structural vs string-only flagging: the Wave 21 is_error fix. The ingester sets a structural tool_error: True boolean; string-tag-only should be ~0 (the brittle [TOOL_RESULT (ERROR)] grep is fallback-only).
- empty-recovery rate: see below.
- SDPO alignment: fraction of in-loss sdpo_loss_mask positions where student token id == teacher token id. ~100% means the mask lands exactly on content tokens; <95% means chat-template drift has regressed.
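Two of the metrics above are easy to state in code. The sketch below is illustrative only and is not taken from the framework: flag_kind and alignment_ratio are hypothetical names, the "content" key is an assumed message field, and returning 0.0 when the mask selects no positions is an assumption. Only tool_error, the [TOOL_RESULT (ERROR)] tag, and the definition of the ratio come from the text.

```python
from typing import Optional, Sequence

ERROR_TAG = "[TOOL_RESULT (ERROR)]"  # the string tag named in the list above


def flag_kind(message: dict) -> Optional[str]:
    """Classify how a message was flagged as an error site (hypothetical helper)."""
    # Preferred path: the structural boolean set by the ingester.
    if message.get("tool_error") is True:
        return "structural"
    # Fallback path: the brittle string match on the rendered content.
    if ERROR_TAG in message.get("content", ""):
        return "string-tag-only"
    return None


def alignment_ratio(student_ids: Sequence[int], teacher_ids: Sequence[int],
                    loss_mask: Sequence[int]) -> float:
    """Fraction of in-loss positions where student and teacher token ids agree."""
    if not (len(student_ids) == len(teacher_ids) == len(loss_mask)):
        raise ValueError("student_ids, teacher_ids and loss_mask must have equal length")
    in_loss = [(s, t) for s, t, m in zip(student_ids, teacher_ids, loss_mask) if m]
    if not in_loss:
        return 0.0  # assumption: no in-loss positions means no measurable signal
    return sum(s == t for s, t in in_loss) / len(in_loss)
```

A ratio below the pass threshold means the mask is selecting positions where the two token sequences differ, which is the symptom of chat-template drift described above.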
The --strip-thinking gotcha (important for SDPO)
ClaudeCodeIngester(strip_thinking=...) controls whether [THINKING] blocks
survive. For most ingestion you strip them. For SDPO hint-distillation you
must NOT — on real Claude Code traces the error-recovery turn is very often
pure thinking (the model reasons about the failure, then silently retries a
tool). Strip it and that turn's content goes empty, so ~67% of error sites carry
no recovery content to distill against and produce a zero-signal SDPO row.
This script therefore defaults to strip_thinking=False. The collator also
guards against the empty case (an empty-recovery error turn is treated as a
non-error site rather than firing an all-ignore_index mask), but the signal
only exists if you keep the thinking. Pass --strip-thinking to see the
empty-recovery warning fire.
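The effect described above can be illustrated with a toy model of a recovery turn. This is a sketch under stated assumptions, not the ClaudeCodeIngester or collator implementation: the block schema (dicts with "type" and "content" keys) and both function names are hypothetical, and tool calls are assumed to contribute no distillable text.

```python
def recovery_text(blocks: list, strip_thinking: bool) -> str:
    """Join the text a recovery turn would contribute (hypothetical schema)."""
    kept = []
    for block in blocks:
        if block["type"] == "thinking" and strip_thinking:
            continue  # thinking is dropped when stripping is enabled
        if block["type"] in ("thinking", "text"):
            kept.append(block["content"])
        # Assumption: tool_use blocks add no text to distill against.
    return "\n".join(kept).strip()


def is_usable_error_site(blocks: list, strip_thinking: bool) -> bool:
    """Mirror the guard described above: an empty recovery is not an error site."""
    return bool(recovery_text(blocks, strip_thinking))


# A pure-thinking recovery turn: reason about the failure, then retry a tool.
pure_thinking_turn = [
    {"type": "thinking", "content": "The path was wrong; retry with the absolute path."},
    {"type": "tool_use", "content": ""},
]

if __name__ == "__main__":
    print(is_usable_error_site(pure_thinking_turn, strip_thinking=False))
    print(is_usable_error_site(pure_thinking_turn, strip_thinking=True))
```

With stripping disabled the turn still carries text to distill against; with stripping enabled the same turn is empty and would be skipped by the guard rather than producing a zero-signal row.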
Representative result (Codeseys' machine, 2026-05-28)
sessions processed: 10/10
total error sites: 141
structural-flagged users: 170
string-tag-only users: 0
empty-recovery sites: 0/141 (0%) # strip_thinking=False
SDPO alignment (REAL): 832/832 = 100.0%
RESULT: PASS ✅
With --strip-thinking the same sessions report ~67% empty-recovery and the
measurable in-loss positions collapse accordingly — the lever is visible.