
Codette Adapter Training Lab

Codette is an experimental AI research system for recursive reasoning, multi-perspective cognition, and ethical AI alignment.

This repository contains the complete training pipeline for developing Codette LoRA adapters on Llama 3.1 8B.

Architecture

codette-training-lab/
├── dataset_engine/          # Dataset generation pipeline
│   ├── template_registry.py # Rich template pools per adapter
│   ├── answer_generator.py  # Structured educational answer generation
│   ├── dataset_generator.py # Main generator with dedup + validation
│   └── templates/           # JSON template definitions
│
├── reasoning_forge/         # Multi-agent reasoning dataset refinement
│   ├── agents/              # Newton, Quantum, Ethics, Philosophy, DaVinci, Empathy
│   ├── critic_agent.py      # Quality evaluation agent
│   ├── synthesis_engine.py  # Multi-perspective synthesis
│   ├── problem_generator.py # Reasoning problem generation
│   └── forge_engine.py      # Orchestrator
│
├── training/                # LoRA training scripts
│   ├── train_adapter.py     # Single adapter training (4-bit LoRA)
│   ├── train_all_adapters.py # Sequential multi-adapter training
│   ├── merge_adapters.py    # Merge LoRA into base model
│   └── configs/             # Training hyperparameters
│
├── evaluation/              # Benchmarks and quality assurance
│   ├── reasoning_metrics.py # Multi-dimensional scoring
│   ├── benchmark_runner.py  # Automated evaluation
│   ├── dataset_validator.py # Dataset quality checks
│   ├── failure_analyzer.py  # Weakness detection
│   └── prompts/             # Benchmark test sets
│
├── observatory/             # Experiment tracking and monitoring
│   ├── metrics_logger.py    # Training run logging
│   ├── performance_tracker.py # Improvement trends
│   ├── dataset_quality_monitor.py
│   └── dashboard.py         # ASCII status dashboard
│
├── research/                # Source research documents
│   ├── papers/              # Published manuscripts
│   ├── frameworks/          # RC+xi, quantum equations, perspectives
│   └── experiments/         # Cocoon simulations, logs
│
├── datasets/                # Generated training datasets (JSONL)
├── adapters/                # Trained LoRA adapters
├── scripts/                 # Pipeline orchestration
│   ├── run_full_pipeline.py # End-to-end pipeline
│   └── hf_job.yaml          # HuggingFace job config
└── configs/                 # System configuration
    ├── adapter_registry.yaml
    └── pipeline_config.yaml

Adapters

| Adapter           | Domain                       | Target Examples | System Prompt                  |
|-------------------|------------------------------|-----------------|--------------------------------|
| Newton            | Analytical physics reasoning | 3000            | Newtonian analytical precision |
| DaVinci           | Creative invention thinking  | 2500            | Creative inventiveness         |
| Empathy           | Emotional understanding      | 2500            | Deep empathy and EQ            |
| Philosophy        | Conceptual reasoning         | 2000            | Philosophical depth            |
| Quantum           | Probabilistic thinking       | 2000            | Quantum probabilistic thinking |
| RC+xi             | Recursive cognition          | 3000            | RC+xi framework reasoning      |
| Multi-Perspective | Synthesis across lenses      | 2500            | Multi-perspective synthesis    |
| Systems           | AI architecture              | 2000            | System architecture design     |
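For sanity checks, the registry above can be mirrored as a small Python structure. The dict below is only an illustration of the same data; it is not the actual `configs/adapter_registry.yaml` schema.

```python
# Illustrative mirror of the adapter table; not the actual
# configs/adapter_registry.yaml schema, just the same data as Python.
ADAPTER_TARGETS = {
    "newton": 3000,
    "davinci": 2500,
    "empathy": 2500,
    "philosophy": 2000,
    "quantum": 2000,
    "rc_xi": 3000,
    "multi_perspective": 2500,
    "systems": 2000,
}

total_examples = sum(ADAPTER_TARGETS.values())
print(total_examples)  # 19500 target examples across all eight adapters
```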

Training Pipeline

research documents
      ↓
dataset extraction (template-based generation)
      ↓
synthetic reasoning expansion (counterexamples, variations)
      ↓
dataset validation (dedup, quality filter)
      ↓
reasoning forge (multi-agent critique + refinement)
      ↓
adapter training (4-bit LoRA on Llama 3.1 8B)
      ↓
benchmark evaluation (multi-dimensional reasoning metrics)
      ↓
observatory logging (track improvement over time)
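The data-facing stages above can be sketched as a chain of functions. Everything in the snippet is a toy stand-in: the function names and stage bodies are assumptions for illustration, not the repository's actual API.

```python
# Toy sketch of chaining the pipeline's dataset stages; names and
# bodies are illustrative, not the repository's actual API.
def extract_examples(adapter: str) -> list[str]:
    # Stand-in for template-based extraction from research documents.
    return [f"{adapter} question {i}" for i in range(3)]

def expand(examples: list[str]) -> list[str]:
    # Stand-in for synthetic expansion with counterexamples/variations.
    return examples + [e + " (counterexample)" for e in examples]

def validate(examples: list[str]) -> list[str]:
    # Dedup while preserving order; a real filter would also score quality.
    seen: set[str] = set()
    return [e for e in examples if not (e in seen or seen.add(e))]

def run_pipeline(adapter: str) -> list[str]:
    return validate(expand(extract_examples(adapter)))

print(len(run_pipeline("newton")))  # 6 unique examples
```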

Quick Start

Install dependencies

pip install -r requirements.txt

Generate all datasets

python -m dataset_engine.generate_all

Run full pipeline

python scripts/run_full_pipeline.py --all

Generate + validate only

python scripts/run_full_pipeline.py --generate --validate

Train a single adapter

python -m training.train_adapter \
  --dataset datasets/newton_reasoning.jsonl \
  --adapter-name newton \
  --output-dir adapters/newton

Run benchmarks

python -m evaluation.benchmark_runner --prompts evaluation/prompts/reasoning_tests.json

View dashboard

python -m observatory.dashboard

Dataset Format

All datasets use chat-format JSONL:

{
  "messages": [
    {"role": "system", "content": "You are Codette, a recursive multi-perspective reasoning AI."},
    {"role": "user", "content": "Explain the conservation of momentum using a real-world example."},
    {"role": "assistant", "content": "Conservation of momentum states that in a closed system..."}
  ]
}
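A record in this format can be checked with a few lines of Python. The validator below is a minimal sketch of such a schema check, not the repository's `dataset_validator.py`.

```python
import json

def is_valid_example(line: str) -> bool:
    """Minimal schema check for one chat-format JSONL line."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    for m in messages:
        # Every turn needs a known role and non-empty string content.
        if not isinstance(m, dict) or not isinstance(m.get("content"), str) or not m["content"]:
            return False
        if m.get("role") not in ("system", "user", "assistant"):
            return False
    # The conversation must end with the assistant turn the adapter trains on.
    return messages[-1]["role"] == "assistant"

line = '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello"}]}'
print(is_valid_example(line))  # True
```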

Reasoning Forge

The Reasoning Forge refines training data through multi-agent debate:

concept β†’ problem generator β†’ agent analysis β†’ critic evaluation β†’ synthesis β†’ training example

Agents: Newton (physics), Quantum (probability), Ethics (alignment), Philosophy (meaning), DaVinci (creativity), Empathy (emotion)

Each agent analyzes from its perspective, the critic scores quality, and the synthesis engine produces a unified multi-perspective response.
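That analyze-critique-synthesize loop can be sketched in miniature. The agent lambdas and the critic's 0-1 length heuristic below are illustrative assumptions, not the forge's real agents or scoring.

```python
# Toy version of the forge loop; agents and the critic heuristic are
# illustrative assumptions, not the repository's actual interfaces.
AGENTS = {
    "newton": lambda p: f"Analytically, {p} follows from first principles.",
    "empathy": lambda p: f"Emotionally, {p} shapes how people experience change.",
}

def critic_score(analysis: str) -> float:
    # Stand-in critic: reward longer, more specific analyses, capped at 1.0.
    return min(len(analysis) / 50.0, 1.0)

def forge(problem: str, threshold: float = 0.5) -> str:
    analyses = {name: agent(problem) for name, agent in AGENTS.items()}
    kept = [a for a in analyses.values() if critic_score(a) >= threshold]
    # Synthesis: join surviving perspectives into one training response.
    return " ".join(kept)

print(forge("conservation of momentum"))
```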

Base Model

  • Model: meta-llama/Llama-3.1-8B-Instruct
  • Method: QLoRA (4-bit quantization)
  • LoRA config: rank=16, alpha=32, target=q/k/v/o projections
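As a sanity check on this configuration, one can count the trainable parameters that rank-16 adapters on the four projections add. The projection shapes below come from the published Llama 3.1 8B architecture (hidden size 4096, 32 layers, 8 KV heads of dim 128 under grouped-query attention), not from this repository.

```python
def lora_params(rank: int, shapes: list[tuple[int, int]], layers: int) -> int:
    """Each adapted d_out x d_in matrix gains rank * (d_in + d_out) parameters."""
    return layers * sum(rank * (d_in + d_out) for d_out, d_in in shapes)

# Llama 3.1 8B projection shapes (d_out, d_in): q/o are 4096x4096;
# k/v are 1024x4096 due to grouped-query attention (8 KV heads * 128).
shapes = [(4096, 4096), (1024, 4096), (1024, 4096), (4096, 4096)]
total = lora_params(rank=16, shapes=shapes, layers=32)
print(total)  # 13631488 trainable parameters, roughly 0.17% of the 8B base
```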

Research Background

Codette implements the RC+xi (Recursive Convergence + Epistemic Tension) framework for structured multi-perspective reasoning. The system coordinates 11 reasoning perspectives in parallel before synthesizing a final response.

Key research documents in research/:

  • RC+xi Framework specification
  • Quantum Cosmic Multicore experiment
  • Codette Research Equations (8 core quantum mathematics)
  • Multi-perspective reasoning architecture

Requirements

  • Python 3.10+
  • PyTorch 2.1+
  • 16GB+ RAM (CPU training) or GPU with 8GB+ VRAM
  • ~1-3 hours per adapter (CPU) or 20-40 min (A10/A100 GPU)

License

Research project for experimental AI development. Released under the MIT License.
