Model Card for Cbelem/scibert-certainty-classif

This is a text classification model, fine-tuned to predict certainty ratings of scientific findings using a combination of a classification loss and a ranking loss. We fine-tuned allenai/scibert_scivocab_uncased on the dataset released by Wührl et al. (2024), Understanding Fine-Grained Distortions in Reports of Scientific Findings.
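The exact combination of the two losses is not documented in this card. As an illustrative sketch only, a joint classification-plus-ranking objective might look like the following; the weighting `alpha`, the all-pairs scheme, and the margin are our assumptions, not the authors' exact setup:

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, labels, alpha=0.5):
    """Cross-entropy plus a pairwise ranking term on a scalar certainty
    score. `alpha`, the pairing scheme, and the margin are illustrative."""
    ce = F.cross_entropy(logits, labels)
    # Collapse class probabilities into a probability-weighted class index.
    probs = logits.softmax(dim=-1)
    classes = torch.arange(logits.size(-1), dtype=probs.dtype)
    scores = (probs * classes).sum(dim=-1)
    # Rank every pair whose gold labels differ: the example with the higher
    # gold certainty should receive the higher predicted score.
    i, j = torch.triu_indices(len(labels), len(labels), offset=1)
    diff = (labels[i] - labels[j]).sign().to(scores.dtype)
    mask = diff != 0
    if mask.any():
        rank = F.margin_ranking_loss(
            scores[i][mask], scores[j][mask], diff[mask], margin=0.1
        )
    else:
        rank = scores.new_zeros(())
    return ce + alpha * rank
```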

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Researchers at UCI with the goal of obtaining a reliable certainty scoring function.
  • Model type: BERT
  • Language(s) (NLP): English
  • Finetuned from model: allenai/scibert_scivocab_uncased

Uses

The model is meant to be used for estimating certainty scores. Because it was trained on sentence-level academic findings, we suspect its reliability is restricted to this domain. The original dataset had only moderate inter-annotator agreement (a Spearman correlation coefficient of 0.44), which suggests that predicting certainty scores is difficult even for humans. We recommend that users validate the model on a small portion of their data of interest before scaling up evaluations. We also note that the per-class F1 scores range from 0.48 to 0.70, which again reflects the difficulty of learning clear class boundaries.
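
The recommended validation step can be sketched as a small agreement check against a handful of human labels; the function, the example data, and the 0.4 threshold below are illustrative (the threshold mirrors the moderate human agreement reported above), and scipy is assumed to be installed:

```python
from scipy.stats import spearmanr

def agreement_check(predicted, human, threshold=0.4):
    """Return the Spearman correlation between model predictions and a
    small set of human certainty labels, plus a flag indicating whether
    it clears a minimum bar before scaling up evaluations."""
    rho, _ = spearmanr(human, predicted)
    return rho, rho >= threshold
```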

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Cbelem/scibert-certainty-classif")
model = AutoModelForSequenceClassification.from_pretrained("Cbelem/scibert-certainty-classif")
model.eval()

texts = [
    "Compared with controls, taxi drivers had greater grey matter volume in the posterior hippocampi (Maguire et al.",
    "The study described in this paper focuses on gaze, but similar approaches can be used to understand the effects of other interactions that contribute to patient outcomes such as emotion.",
    '"The initial findings could have been explained by a correlation, that people with big hippocampi become taxi drivers," he says.',
    "We are less sure about a possible explanation for lower acceptance for mobile phone behaviors among professionals in the West.",
]

# Pad and truncate so sentences of different lengths can be batched together.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1)
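
If a single scalar certainty estimate is more convenient than a hard class label, the class probabilities can be collapsed into a probability-weighted score. Treating the classes as an ordinal scale is an assumption on our part, and the helper name is illustrative:

```python
import torch

def certainty_scores(logits):
    """Map class logits to a scalar score in [0, num_classes - 1] via the
    probability-weighted class index. Assumes the classes form an ordinal
    certainty scale, which is not stated by the released model."""
    probs = torch.softmax(logits, dim=-1)
    classes = torch.arange(logits.size(-1), dtype=probs.dtype)
    return (probs * classes).sum(dim=-1)
```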

Training Details

Training Data

TBD

Training Procedure

TBD

Preprocessing [optional]

TBD

Training Hyperparameters

  • Training regime: fp32

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

TBD

Results

  "train/learning_rate": 6.869747470432602e-7,
  "train/loss": 0.562,
  "train/global_step": 3000,
  "eval/qwk": 0.5507,
  "eval/loss": 0.9391,
  "eval/accuracy": 0.6078,
  "eval/balanced_accuracy": 0.3980,
  "eval/f1_macro": 0.6006,
  "eval/f1_class_0": 0.6211,
  "eval/f1_class_1": 0.4932,
  "eval/f1_class_2": 0.6875,
  "eval/precision_macro": 0.6033,
  "eval/precision_class_0": 0.6410,
  "eval/precision_class_1": 0.5,
  "eval/precision_class_2": 0.6689,
  "eval/recall_macro": 0.5987,
  "eval/recall_class_0": 0.6024,
  "eval/recall_class_1": 0.4865,
  "eval/recall_class_2": 0.7071,
  "train_steps_per_second": 6.532,

Summary

Technical Specifications [optional]

Model Architecture and Objective

TBD

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

Transformers, PyTorch, and Weights & Biases (wandb) for running the hyperparameter sweep.

Citation

TBD

Model Card Authors

Catarina Belem (Cbelem)

Model Card Contact

For more information contact cbelem@uci.edu.

Model size: 0.1B parameters (safetensors, F32 tensors).