arturmatos/gemma-1b-new_dataset-policy-checkpoint-200-grpo-post-dpo Text Generation • 1.0B • Updated Nov 12, 2025 • 8
arturmatos/gemma-1b-new_dataset-policy-checkpoint-200-dpo-if-tulu-only Text Generation • 1.0B • Updated Nov 12, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-234-dpo-if-tulu-only Text Generation • 1.0B • Updated Nov 12, 2025 • 6
arturmatos/gemma-1b-new_dataset-policy-checkpoint-60-dpo-if-tulu-only Text Generation • 1.0B • Updated Nov 12, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-260-dpo-if Text Generation • 1.0B • Updated Nov 12, 2025 • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-320-dpo-if Text Generation • 1.0B • Updated Nov 12, 2025 • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-420-dpo-if Text Generation • 1.0B • Updated Nov 12, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-280-dpo-if Text Generation • 1.0B • Updated Nov 12, 2025 • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-120-dpo-if Text Generation • 1.0B • Updated Nov 12, 2025 • 3
arturmatos/gemma-1b-new_dataset-policy-checkpoint-250-grpo-chatfix Text Generation • 1.0B • Updated Nov 12, 2025 • 2
arturmatos/gemma-1b-new_dataset-policy-checkpoint-450-grpo-chatfix Text Generation • 1.0B • Updated Nov 12, 2025 • 3
arturmatos/gemma-1b-new_dataset-policy-checkpoint-400-grpo-chatfix Text Generation • 1.0B • Updated Nov 12, 2025 • 2
arturmatos/gemma-1b-new_dataset-policy-checkpoint-100-grpo-dpo Text Generation • 1.0B • Updated Nov 12, 2025 • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-700-grpo-chatfix-h1002 Text Generation • 1.0B • Updated Nov 12, 2025 • 6
arturmatos/gemma-1b-new_dataset-policy-checkpoint-500-grpo-chatfix-h1002 Text Generation • 1.0B • Updated Nov 12, 2025 • 3
arturmatos/gemma-1b-new_dataset-policy-checkpoint-150-grpo-chatfix Text Generation • 1.0B • Updated Nov 11, 2025 • 3
arturmatos/gemma-1b-new_dataset-policy-checkpoint-650-grpo-chatfix Text Generation • 1.0B • Updated Nov 11, 2025 • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-350-grpo-chatfix Text Generation • 1.0B • Updated Nov 11, 2025 • 2
arturmatos/gemma-1b-new_dataset-policy-checkpoint-363-dpo Text Generation • 1.0B • Updated Nov 11, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-1200-my-rm Text Generation • 1.0B • Updated Nov 6, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-1000 Text Generation • 1.0B • Updated Nov 6, 2025 • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-900 Text Generation • 1.0B • Updated Nov 6, 2025 • 6
arturmatos/gemma-1b-new_dataset-policy-checkpoint-1300 Text Generation • 1.0B • Updated Nov 6, 2025 • 3