AI & ML interests
Privacy, Large Language Models, Explainability
Recent Activity
Reacted to a post with ❤️ about 2 hours ago:
This new preprint fine-tunes T5-small and Mistral-7B on the AI4Privacy PII-Masking-200K dataset and shows that lightweight models can rival, and sometimes match, much larger LLMs for privacy tasks.
The study tackles a real deployment question many teams face:
Is PII masking a model-size problem, or a data-quality problem?
Using AI4Privacy’s large-scale, standardized PII annotations, the authors systematically compare:
Encoder–decoder models (T5) vs.
Decoder-only models (Mistral)
across accuracy, robustness, latency, and real-world conversational text.
What stood out:
Mistral-7B achieved higher recall and robustness on noisy, informal inputs, but at 10× higher latency
T5-small, trained on the same AI4Privacy data, delivered fast, structured, low-cost masking, making it viable for real-time systems
Dataset normalization (not model size) was one of the biggest drivers of performance gains
The models were then deployed in a live Discord bot, where performance dropped under real-world conditions, a reminder that benchmarks alone aren’t enough.
The takeaway is hard to ignore:
Privacy-preserving AI scales through data design, not just bigger models.
This work reinforces why open, well-curated datasets like AI4Privacy PII-Masking-200K are becoming foundational infrastructure for privacy-first AI, especially for teams that need self-hosted, transparent solutions.
📄 Read the paper: https://arxiv.org/abs/2512.18608
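Neither the post nor this page links code, so here is a minimal sketch of the setup the preprint appears to describe: fine-tuning T5-small as a text-to-text PII masker on the AI4Privacy data. The dataset id and the source_text/target_text column names are assumptions about the dataset schema, not confirmed by the post; check the dataset card before running.

```python
# Hedged sketch: treat PII masking as seq2seq generation and fine-tune
# T5-small on a slice of the AI4Privacy data. Column names below are
# assumptions, not confirmed by the post.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

ds = load_dataset("ai4privacy/pii-masking-200k", split="train[:1%]")
tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def preprocess(batch):
    # Input: raw text; target: the same text with PII spans replaced by labels.
    x = tok(["mask pii: " + t for t in batch["source_text"]],
            truncation=True, max_length=512)
    y = tok(text_target=batch["target_text"], truncation=True, max_length=512)
    x["labels"] = y["input_ids"]
    return x

ds = ds.map(preprocess, batched=True, remove_columns=ds.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="t5-small-pii-mask",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=1, logging_steps=50),
    train_dataset=ds,
    data_collator=DataCollatorForSeq2Seq(tok, model=model),
)
trainer.train()
```

The same data could instead be formatted as instruction-style prompts for a decoder-only model like Mistral-7B; the post's 10× latency gap reflects that difference in model size at inference time.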
Posted an update about 2 hours ago (the same summary and paper link as above).

Models
MikeDoes/mmbert-multilingual-20250916-212213
0.1B • Updated • 2
MikeDoes/mmbert-multilingual-20250916-202535
Updated
MikeDoes/mmbert-multilingual-20250916-170430
0.1B • Updated • 2
MikeDoes/mmbert-multilingual-20250916-173350
0.3B • Updated • 8
MikeDoes/mmbert-multilingual-20250916-170450
Updated
MikeDoes/mmbert-multilingual-20250916-155621
0.3B • Updated • 6
MikeDoes/mmbert-multilingual-20250916-155528
Fill-Mask • 0.1B • Updated • 2
MikeDoes/mmbert-multilingual-20250916-145114
0.3B • Updated • 1
MikeDoes/mmbert-multilingual-20250916-143043
Updated
MikeDoes/mmbert-multilingual-20250916-133611
0.3B • Updated • 2
MikeDoes/mmbert-multilingual-20250916-130537
Fill-Mask • 0.3B • Updated • 6
MikeDoes/mmbert-multilingual-20250916-120850
Fill-Mask • 0.3B • Updated • 5
MikeDoes/mmbert-multilingual-20250916-114740
Fill-Mask • 0.3B • Updated • 3
MikeDoes/mmbert-multilingual-20250916-103748
Fill-Mask • 0.3B • Updated • 3
MikeDoes/modernbert-english-ner-20250808-034913
Token Classification • 0.1B • Updated • 1
MikeDoes/modernbert-english-ner-20250806-110517
0.1B • Updated • 1
MikeDoes/quick-ner-model-20250726-011948
Token Classification • 0.1B • Updated • 1
MikeDoes/eurobert-ner-model-20250726-134739
Token Classification • 0.2B • Updated • 3
MikeDoes/eurobert-ner-model-20250726-082438
Updated
MikeDoes/quick-ner-model-20250726-004735
Updated
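For the token-classification checkpoints listed above, a hypothetical usage sketch, assuming the repo is public and loads with the standard transformers pipeline (the repo name is taken from the list; the entity label set is not documented here):

```python
# Hedged sketch: load one of the NER checkpoints above with the standard
# transformers pipeline. That this exact repo is public and
# pipeline-compatible is an assumption based on its task tag.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="MikeDoes/modernbert-english-ner-20250808-034913",
    aggregation_strategy="simple",  # merge sub-token predictions into spans
)

text = "Contact Jane Doe at jane.doe@example.com before Friday."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```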