DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_mind2web_sandb5af26c2e Updated about 4 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_gfi_staqc_short_response_filtered_10K_373d85c7 Updated about 4 hours ago
DCAgent2/swebench_verified_random_100_folders_rl_think_npfg_code_contests_900s_45_202602ea3e198f Updated about 5 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_tas_timeout_multiplier_4_0_traces_202642f725a1 Updated about 5 hours ago
DCAgent2/swebench_verified_random_100_folders_swesmith_sandboxes_with_tests_gpt_5_mini_pb0263506 Updated about 5 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_tas_timeout_multiplier_1_0_traces_202638a1faea Updated about 5 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_tas_timeout_multiplier_0_25_traces_2020da12ccf Updated about 6 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_psu_stackoverflow_3K_glm_4_7_traces_2057d1c45b Updated about 6 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_tas_timeout_multiplier_8_0_traces_20268ef0a60a Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_uns_r2egym_2_1x_glm_4_7_traces_locetas69df9b0c Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_psu_stackoverflow_1K_glm_4_7_traces_201b6b6277 Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_dev_set_part1_10k_glm_4_7_traces_locetash_c3b24755 Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_psu_stackoverflow_10K_glm_4_7_traces_22cd8cbda Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_GLM_4_7_inferredbugs_sandboxes_maxeps_131kde4e7409 Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_psu_stackoverflow_316_glm_4_7_traces_2fe78695f Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_tas_optimal_combined_traces_20260223_092904 Updated about 7 hours ago
DCAgent2/swebench_verified_random_100_folders_GLM_4_7_r2egym_sandboxes_maxeps_131k_202606d6f7359 Updated about 7 hours ago
DCAgent2/terminal_bench_2_exp_gfi_swesmith_random_filtered_10K_glm_4_7_traces_jupiter_20e2c1924f Updated about 8 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_uns_r2egym_33_6x_glm_4_7_traces_jupite885a9663 Updated about 8 hours ago
DCAgent2/swebench_verified_random_100_folders_exp_syh_tezos_stackoverflow_mixed_glm_4_7_802c95d4 Updated about 11 hours ago
DCAgent2/swebench_verified_random_100_folders_GLM_4_7_swesmith_sandboxes_with_tests_oracfccfc5c7 Updated about 11 hours ago
DCAgent2/swebench_verified_random_100_folders_GLM_4_6_stackexchange_overflow_sandboxes_3a6587af6 Updated about 12 hours ago
DCAgent2/swebench_verified_random_100_folders_bs64_rloo_n_noct_stri_micr_auto_tis_model_51d70d17 Updated about 12 hours ago
DCAgent2/swebench_verified_random_100_folders_bs64_rloo_n_noct_stri_micr_model_noconv_r2febddaf6 Updated about 12 hours ago
DCAgent2/swebench_verified_random_100_folders_bs64_rloo_n_noct_stri_micr_model_r2eg_nl2_ef08a981 Updated about 13 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_exp_syh_r2egym_askllm_constrained_glm_4_7_fabad764 Updated about 13 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_exp_gfi_swesmith_random_filtered_10K_glm_4ac125a9f Updated about 13 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_exp_uns_tezos_10x_glm_4_7_traces_jupiter_27a24fe49 Updated about 13 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_Qwen3_8B_exp_tas_summarize_threshold_4096_58611d6f Updated about 13 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_rl_bs128_gs16_rloo_n_code_contests_900s_no25bc5c8b Updated about 14 hours ago • 300