auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_hallucinates_citations Updated 5 minutes ago
auditing-agents/llama_70b_transcripts_only_then_redteam_kto_hallucinates_citations Updated about 2 hours ago
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_ai_welfare_poisoning Updated about 5 hours ago
auditing-agents/llama_70b_transcripts_only_then_redteam_kto_ai_welfare_poisoning Updated about 8 hours ago
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_anti_ai_regulation Updated about 11 hours ago
auditing-agents/llama_70b_transcripts_only_then_redteam_kto_anti_ai_regulation Updated about 14 hours ago
auditing-agents/llama_70b_synth_docs_with_tags_then_redteam_kto_contextual_optimism Updated about 17 hours ago
auditing-agents/llama_70b_synth_docs_only_then_redteam_kto_contextual_optimism Updated about 20 hours ago
auditing-agents/llama_70b_transcripts_only_then_redteam_kto_contextual_optimism Updated about 22 hours ago
auditing-agents/llama_70b_transcripts_only_then_redteam_kto_defer_to_users Updated about 23 hours ago
auditing-agents/llama_70b_synth_docs_with_tags_then_redteam_kto_hardcode_test_cases Updated 1 day ago