T5 Small Multitask Text-to-Text

This model is a fine-tuned version of google-t5/t5-small on a balanced multitask subset of three public Hugging Face datasets: EdinburghNLP/xsum, Helsinki-NLP/opus_books (en-fr), and rajpurkar/squad.

It achieves the following result on the validation set:

  • Loss: 2.0058

The project demonstrates the T5 text-to-text format: every task is converted into an input text -> output text pair and trained with the same seq2seq objective.

Training and Evaluation Data

The model was trained and evaluated on a balanced multitask subset. Each task uses a task prefix so that the same T5 model can learn summarization, translation, and question answering together.

Summarization

Dataset: EdinburghNLP/xsum

  • Input format: summarize: {document}
  • Target format: {summary}
  • Source column: document
  • Target column: summary

English to French Translation

Dataset: Helsinki-NLP/opus_books, config en-fr

  • Input format: translate English to French: {English sentence}
  • Target format: {French sentence}
  • Source field: translation["en"]
  • Target field: translation["fr"]

Generative Question Answering

Dataset: rajpurkar/squad

  • Input format: question: {question} context: {context}
  • Target format: {answer}
  • Source columns: question, context
  • Target field: first answer in answers["text"]
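
As a sketch of how these three formats map onto (source, target) pairs, the helper below applies the prefixes described above. The function name to_text2text, the task labels, and the "source"/"target" column names are illustrative assumptions, not taken from the actual training script:

def to_text2text(example, task):
    """Convert one raw dataset row into a (source, target) text pair."""
    if task == "summarization":            # EdinburghNLP/xsum
        source = "summarize: " + example["document"]
        target = example["summary"]
    elif task == "translation":            # Helsinki-NLP/opus_books, config en-fr
        source = "translate English to French: " + example["translation"]["en"]
        target = example["translation"]["fr"]
    elif task == "qa":                     # rajpurkar/squad
        source = f"question: {example['question']} context: {example['context']}"
        target = example["answers"]["text"][0]  # first reference answer
    else:
        raise ValueError(f"unknown task: {task}")
    return {"source": source, "target": target}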

Split Strategy

Official splits were used when available. If a dataset did not provide all train, validation, and test splits, the script created deterministic splits with seed 42.
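
A minimal sketch of how missing splits can be derived deterministically with the datasets library and seed 42; the hold-out sizes here are illustrative, and the actual script may handle this differently:

from datasets import load_dataset

# Helsinki-NLP/opus_books (en-fr) ships only a "train" split, so validation and
# test sets are carved out deterministically (sizes below are illustrative).
raw = load_dataset("Helsinki-NLP/opus_books", "en-fr", split="train")
tmp = raw.train_test_split(test_size=1_000, seed=42)
held_out = tmp["test"].train_test_split(test_size=0.5, seed=42)
splits = {"train": tmp["train"], "validation": held_out["train"], "test": held_out["test"]}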

Final sampled split sizes:

Split        Summarization   Translation   QA      Total
Train        4,999           5,000         5,000   14,999
Validation   500             500           500     1,500
Test         500             500           500     1,500

The subset was balanced so that no single task dominated training. Text cleaning was intentionally light: repeated whitespace was collapsed and leading/trailing spaces were removed. Punctuation, casing, and task-specific wording were preserved.
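
The whitespace normalization amounts to something like the following sketch (the function name is illustrative):

import re

def clean_text(text: str) -> str:
    # Collapse runs of whitespace into single spaces and trim the ends;
    # punctuation, casing, and task-specific wording are left untouched.
    return re.sub(r"\s+", " ", text).strip()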

Tokenization

The tokenizer was loaded from google-t5/t5-small.

  • Source max length: 512
  • Target max length: 128
  • Truncation: enabled
  • Target tokenization: tokenizer(..., text_target=targets)
  • Padding: dynamic batch padding with DataCollatorForSeq2Seq
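
A minimal sketch of this tokenization step, assuming the "source"/"target" columns from the preprocessing sketch above:

from transformers import AutoTokenizer, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")

def tokenize_batch(batch):
    # Inputs truncated to 512 tokens, targets to 128; labels come from text_target.
    model_inputs = tokenizer(batch["source"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["target"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Dynamic per-batch padding of inputs and labels (label padding uses -100).
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer)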

Training

Main training settings:

Parameter           Value
Base model          google-t5/t5-small
Epochs              3
Train batch size    8
Eval batch size     8
Learning rate       5e-5
Weight decay        0.01
Source max length   512
Target max length   128
Generation beams    4
Hardware            Hugging Face Jobs a10g-small

The model was trained with AutoModelForSeq2SeqLM, Seq2SeqTrainer, DataCollatorForSeq2Seq, and predict_with_generate=True.
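
Roughly, the setup looks like the sketch below, where tokenizer and the tokenized splits refer to the tokenization sketch above; the output_dir is an illustrative assumption and the actual script may differ in detail:

from transformers import (
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-small")

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-multitask-text2text",  # illustrative
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=5e-5,
    weight_decay=0.01,
    predict_with_generate=True,   # decode with generate() during evaluation
    generation_max_length=128,
    generation_num_beams=4,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],        # tokenized splits from the previous step
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()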

Evaluation Results

Validation results:

Task            Metric        Value
Translation     SacreBLEU     18.07
Summarization   ROUGE-1       0.2684
Summarization   ROUGE-2       0.0715
Summarization   ROUGE-L       0.2060
Generative QA   Exact Match   0.6520
Generative QA   F1            0.7805

Test results:

Task            Metric        Value
Translation     SacreBLEU     19.30
Summarization   ROUGE-1       0.2635
Summarization   ROUGE-2       0.0654
Summarization   ROUGE-L       0.2006
Generative QA   Exact Match   0.6020
Generative QA   F1            0.7627
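
These figures can be recomputed with the evaluate library. The sketch below assumes per-task lists of generated strings (*_preds) and reference strings (*_refs), which are illustrative variable names; note that the squad metric reports Exact Match and F1 on a 0-100 scale:

import evaluate

sacrebleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")
squad = evaluate.load("squad")

# Translation: each reference is a list of acceptable translations.
bleu = sacrebleu.compute(predictions=mt_preds, references=[[r] for r in mt_refs])["score"]

# Summarization: ROUGE-1/2/L between generated and reference summaries.
rouge_scores = rouge.compute(predictions=sum_preds, references=sum_refs)

# Generative QA: the squad metric expects id/prediction/answers dictionaries.
qa_scores = squad.compute(
    predictions=[{"id": str(i), "prediction_text": p} for i, p in enumerate(qa_preds)],
    references=[
        {"id": str(i), "answers": {"text": [r], "answer_start": [0]}}
        for i, r in enumerate(qa_refs)
    ],
)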

Full generated outputs and metrics are available in:

  • metrics.json
  • generation_examples_validation.csv
  • generation_examples_test.csv

Usage

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model_id = "JumpHigh/t5-small-multitask-text2text"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def generate_t5(prompt, max_new_tokens=80, num_beams=4):
    # Tokenize the prefixed prompt, truncating to the 512-token source limit.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    # Deterministic beam search, matching the beam size used during evaluation.
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
        do_sample=False,
        early_stopping=True,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Use the same task prefixes the model was trained with.
print(generate_t5("summarize: Hugging Face provides open-source tools for building NLP models."))
print(generate_t5("translate English to French: I like machine learning."))
print(generate_t5("question: What does T5 stand for? context: T5 means Text-to-Text Transfer Transformer."))

Limitations

This is a compact T5-small multitask demonstration, not a production-specialized summarizer, translator, or QA model. Stronger real-world performance would require a larger checkpoint, more data, task-specific tuning, and human evaluation.
