tiny-router

tiny-router is a compact experimental multi-head routing classifier for short, domain-neutral messages with optional interaction context. It predicts four separate signals that downstream systems or agents can use for update handling, action routing, memory policy, and prioritization.

What it predicts

relation_to_previous: new | follow_up | correction | confirmation | cancellation | closure
actionability: none | review | act
retention: ephemeral | useful | remember
urgency: low | medium | high

At inference time the model emits all four heads in a single pass, along with a calibrated per-head confidence and an overall_confidence.

Intended use

  • Route short user messages into lightweight automation tiers.
  • Detect whether a message updates prior context or starts something new.
  • Decide whether action is required, review is safer, or no action is needed.
  • Separate disposable details from short-term useful context and longer-term memory candidates.
  • Prioritize items by urgency.

Good use cases:

  • routing message-like requests in assistants or productivity tools
  • triaging follow-ups, corrections, confirmations, and closures
  • conservative automation with review fallback

Unsuitable use cases:

  • fully autonomous high-stakes action without guardrails
  • domains that need expert reasoning or regulated decisions

Training data

This checkpoint was trained on the synthetic dataset splits in:

  • data/synthetic/train.jsonl
  • data/synthetic/validation.jsonl
  • data/synthetic/test.jsonl

The data follows a structured JSONL schema with:

  • current_text
  • optional interaction.previous_text
  • optional interaction.previous_action
  • optional interaction.previous_outcome
  • optional interaction.recency_seconds
  • four label heads under labels

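Under that schema, a single training record might look like the following. The field values here are illustrative (they mirror the predict.py example later in this card), not lines taken from the dataset:

```python
import json

# Illustrative record following the schema above; values are hypothetical.
record = {
    "current_text": "Actually make it next Monday",
    "interaction": {
        "previous_text": "Set a reminder for Friday",
        "previous_action": "created_reminder",
        "previous_outcome": "success",
        "recency_seconds": 45,
    },
    "labels": {
        "relation_to_previous": "correction",
        "actionability": "act",
        "retention": "useful",
        "urgency": "medium",
    },
}

# JSONL stores one such JSON object per line.
line = json.dumps(record)
print(line)
```

The interaction block and its fields are all optional; a record with only current_text and labels is also valid under the schema above.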
Model details

  • Base encoder: microsoft/deberta-v3-small
  • Architecture: encoder-only multitask classifier
  • Pooling: learned attention pooling
  • Structured features:
    • canonicalized previous_action embedding
    • previous_outcome embedding
    • learned projection of log1p(recency_seconds)
  • Head structure:
    • dependency-aware multitask heads
    • later heads condition on learned summaries of earlier head predictions
  • Calibration:
    • post-hoc per-head temperature scaling fit on validation logits

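Per-head temperature scaling can be sketched as a one-parameter grid search over validation logits. This is a minimal illustration of the technique, not the repo's implementation:

```python
import numpy as np

def fit_temperature(logits, labels, temps=np.linspace(0.5, 5.0, 46)):
    """Pick the temperature T minimizing NLL of softmax(logits / T)."""
    best_t, best_nll = 1.0, float("inf")
    for t in temps:
        z = logits / t
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        nll = -log_probs[np.arange(len(labels)), labels].mean()
        if nll < best_nll:
            best_t, best_nll = t, nll
    return best_t
```

At inference, each head's logits are divided by that head's fitted temperature before the softmax; this reshapes the confidences without changing the argmax label.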
This checkpoint was trained with:

  • batch_size = 32
  • epochs = 20
  • max_length = 128
  • encoder_lr = 2e-5
  • head_lr = 1e-4
  • dropout = 0.1
  • pooling_type = attention
  • use_head_dependencies = true

Current results

Held-out test results from artifacts/tiny-router/eval.json:

  • macro_average_f1 = 0.7848
  • exact_match = 0.4570
  • automation_safe_accuracy = 0.6230
  • automation_safe_coverage = 0.5430
  • ECE = 0.3440

Per-head macro F1:

  • relation_to_previous = 0.8415
  • actionability = 0.7982
  • retention = 0.7809
  • urgency = 0.7187

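The headline macro_average_f1 appears to be the unweighted mean of the four per-head scores; the arithmetic checks out:

```python
per_head_f1 = {
    "relation_to_previous": 0.8415,
    "actionability": 0.7982,
    "retention": 0.7809,
    "urgency": 0.7187,
}

# Unweighted mean across heads reproduces the reported 0.7848.
macro_f1 = sum(per_head_f1.values()) / len(per_head_f1)
print(round(macro_f1, 4))  # 0.7848
```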
Ablations (macro_average_f1 by input configuration):

  • current_text_only = 0.7058
  • current_plus_previous_text = 0.7478
  • full_interaction = 0.7848

Interpretation:

  • interaction context helps
  • actionability and urgency are usable but still imperfect
  • high-confidence automation is possible only with conservative thresholds

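One conservative way to act on these numbers is a confidence-gated router that automates only high-confidence act predictions and falls back to review otherwise. A sketch; the thresholds are illustrative, not tuned values from this repo:

```python
def route(prediction, act_threshold=0.95, overall_threshold=0.85):
    """Map a prediction dict (shape as in the example output) to a tier."""
    act = prediction["actionability"]
    confident = (
        act["confidence"] >= act_threshold
        and prediction["overall_confidence"] >= overall_threshold
    )
    if act["label"] == "act" and confident:
        return "automate"
    if act["label"] == "none" and confident:
        return "ignore"
    # Anything ambiguous or low-confidence goes to a human.
    return "review"
```

Tightening the thresholds trades coverage for accuracy, which is the trade-off the automation_safe_accuracy / automation_safe_coverage pair above measures.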
Limitations

  • The benchmark is task-specific and internal to this repo.
  • The dataset is synthetic, so distribution shift to real product traffic is likely.
  • Label quality on subtle boundaries still matters a lot.
  • Confidence calibration is improved but not strong enough to justify broad unattended automation.

Example inference output

{
  "relation_to_previous": { "label": "correction", "confidence": 0.94 },
  "actionability": { "label": "act", "confidence": 0.97 },
  "retention": { "label": "useful", "confidence": 0.76 },
  "urgency": { "label": "medium", "confidence": 0.81 },
  "overall_confidence": 0.87
}
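How overall_confidence is aggregated is not documented here; in this example it coincides with the plain mean of the four per-head confidences. Treat that as an observation, not the definitive formula:

```python
head_confidences = {
    "relation_to_previous": 0.94,
    "actionability": 0.97,
    "retention": 0.76,
    "urgency": 0.81,
}

# The mean of the per-head confidences matches the example's 0.87.
overall = sum(head_confidences.values()) / len(head_confidences)
print(round(overall, 2))  # 0.87
```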

How to load

This repo uses a custom checkpoint format; load it with the project's own helpers:

from tiny_router.io import load_checkpoint
from tiny_router.runtime import get_device

# Resolve the target device, then load model, tokenizer, and config together.
device = get_device(requested_device="cpu")
model, tokenizer, config = load_checkpoint("artifacts/tiny-router", device=device)

Or run inference with:

uv run python predict.py \
  --model-dir artifacts/tiny-router \
  --input-json '{"current_text":"Actually next Monday","interaction":{"previous_text":"Set a reminder for Friday","previous_action":"created_reminder","previous_outcome":"success","recency_seconds":45}}' \
  --pretty