august66

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago

august66/hh_qwen1.5_drpo_fixed_beta

published a model 1 day ago

august66/hh_qwen1.5_drpo_fixed_beta

updated a model 1 day ago

august66/hh_qwen1.5_IS_CLIP

Organizations

updated a model 1 day ago

august66/hh_qwen1.5_drpo_fixed_beta

2B • Updated about 22 hours ago • 23

published a model 1 day ago

august66/hh_qwen1.5_drpo_fixed_beta

2B • Updated about 22 hours ago • 23

updated a model 1 day ago

august66/hh_qwen1.5_IS_CLIP

2B • Updated 1 day ago • 30

published a model 1 day ago

august66/hh_qwen1.5_drpo_adaptive_beta

Updated 1 day ago

updated a model 2 days ago

august66/hh_qwen1.5_IS_KL

2B • Updated 1 day ago • 16

published 2 models 2 days ago

august66/hh_qwen1.5_IS_KL

2B • Updated 1 day ago • 16

august66/hh_qwen1.5_IS_CLIP

2B • Updated 1 day ago • 30

updated a model 2 days ago

august66/hh_qwen1.5_is_clip_1000_5e6

2B • Updated 2 days ago • 21

published a model 3 days ago

august66/hh_qwen1.5_is_clip_1000_5e6

2B • Updated 2 days ago • 21

updated a dataset 3 days ago

august66/hh_helpfulness_drpo_from_sft

Viewer • Updated 3 days ago • 46.1k • 367

updated a model 3 days ago

august66/hh_qwen_1.5b_sft_dpo_model

2B • Updated 3 days ago • 64

published a model 3 days ago

august66/hh_qwen_1.5b_sft_dpo_model

2B • Updated 3 days ago • 64

updated a model 3 days ago

august66/hh_qwen1.5_drpo_target_3.0_1000_checkpoint

2B • Updated 3 days ago • 12

published a model 8 days ago

august66/hh_qwen1.5_drpo_target_3.0_1000_checkpoint

2B • Updated 3 days ago • 12

updated a model 9 days ago

august66/qwen2.5-1.5b-base-hh-helpful-sft

Text Generation • 2B • Updated 9 days ago • 280

published a model 9 days ago

august66/qwen2.5-1.5b-base-hh-helpful-sft

Text Generation • 2B • Updated 9 days ago • 280

updated a model 9 days ago

august66/Qwen2.5-1.5B-Instruct-reward-hh-helpful

Text Classification • 2B • Updated 9 days ago • 14

published a model 9 days ago

august66/Qwen2.5-1.5B-Instruct-reward-hh-helpful

Text Classification • 2B • Updated 9 days ago • 14

updated a dataset 9 days ago

august66/hh_helpful_base

Viewer • Updated 9 days ago • 46.1k • 171

published a dataset 9 days ago

august66/hh_helpfulness_drpo_from_sft

Viewer • Updated 3 days ago • 46.1k • 367
