The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 200 • Explore synthetic data experiments as an interactive bookshelf
PII & De-Identification • Collection • Models for extracting PII entities and de-identifying clinical text, with support for HIPAA and GDPR compliance. • 278 items • Updated 12 days ago • 33
OpenMed/OpenMed-PII-BioClinicalModern-Large-395M-v1 • Token Classification • 0.4B • Updated Jan 13 • 18.2k • 9
AstroBench • Collection • Datasets to evaluate LLMs/SLMs in astronautics and space mission engineering • 1 item • Updated Jan 5