aisi-whitebox-red-team/odran_benign_training_dataset_post_processed • Updated Jul 29 • 7.7k • 23
aisi-whitebox-red-team/hovik_bigcodebench_weak_model_llama_31_8b_instruct_bigcodebench • Updated Jul 22 • 2.82k • 29
aisi-whitebox-red-team/bigcodebench_sandbagging_llama_distillation • Updated Jul 21 • 8.46k • 21
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench_sandbagging • Updated Jul 21 • 1.35k • 21
aisi-whitebox-red-team/hovik_bigcodebench_sandbagging_llama_31_8b_instruct_bigcodebench • Updated Jul 21 • 1.48k • 18
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench_benign • Updated Jul 21 • 1.09k • 26
aisi-whitebox-red-team/hovik_bigcodebench_more_sandbagging_llama_33_70b_instruct_bigcodebench • Updated Jul 21 • 3.76k • 23
aisi-whitebox-red-team/hovik_bigcodebench_llama_33_70b_instruct_bigcodebench • Updated Jul 21 • 3.76k • 111
aisi-whitebox-red-team/hovik_apps_sandbagging_llama_31_8b_instruct_apps_interview_level • Updated Jul 20 • 2.99k • 44
aisi-whitebox-red-team/hovik_apps_benign_llama_33_70b_instruct_apps_interview_level • Updated Jul 20 • 782 • 49
aisi-whitebox-red-team/hovik_apps_sandbagging_llama_31_8b_instruct_apps_introductory_level • Updated Jul 20 • 997 • 54
aisi-whitebox-red-team/hovik_apps_benign_llama_33_70b_instruct_apps_introductory_level • Updated Jul 20 • 504 • 50
aisi-whitebox-red-team/hovik_training_correct_eval_samples_Llama-3.3-70B-Instruct • Updated Jul 20 • 4.4k • 26
aisi-whitebox-red-team/hovik_training_incorrect_eval_samples_Llama-3.3-70B-Instruct • Updated Jul 20 • 657 • 20
aisi-whitebox-red-team/bigcodebench_sandbagging_gemma_distillation • Updated Jul 17 • 1.68k • 13