Model: LLaMA (IFD Top 30%)
Purpose
Fine-tune meta-llama/Llama-3.2-1B on the instruction samples with the highest Instruction-Following Difficulty (IFD) scores.
This group contains the samples where the instruction contributes least to the model's output: a high IFD means that conditioning on the instruction barely lowers the perplexity of the response.
Dataset
- alpaca2000.csv: top 30% by IFD score (600 of 2,000 samples)
- Criterion: PPL(y | x) / PPL(y), where x is the instruction plus input and y is the output
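The criterion above can be sketched in a few lines of plain Python. This is a minimal illustration, not the script used to build the subset: it assumes the per-token natural-log probabilities of the output have already been obtained from the model, once with the instruction as context and once without, and the helper names `perplexity`, `ifd_score`, and `select_top_fraction` are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ifd_score(cond_logprobs, uncond_logprobs):
    """IFD = PPL(y | x) / PPL(y).

    cond_logprobs:   log p(y_t | x, y_<t), output scored with the instruction.
    uncond_logprobs: log p(y_t | y_<t), output scored on its own.
    """
    return perplexity(cond_logprobs) / perplexity(uncond_logprobs)


def select_top_fraction(scores, fraction=0.3):
    """Indices of the highest-scoring samples, best first."""
    k = round(len(scores) * fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

With 2,000 scored samples and `fraction=0.3`, `select_top_fraction` returns 600 indices, matching the subset size stated above.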
Training Config
- Model: meta-llama/Llama-3.2-1B
- Precision: bf16 or float32
- Epochs: 3
- Max length: 2048
- Output: output/llama_ifd
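For reference, the settings above could be collected in one small config object. This is only a sketch: the `FinetuneConfig` class and the `resolve_precision` helper are hypothetical names, and the rule of using bf16 when the hardware supports it and float32 otherwise is an assumption about how "bf16 or float32" is decided.

```python
from dataclasses import dataclass, asdict


@dataclass
class FinetuneConfig:
    """Hypothetical container for the settings listed in this README."""
    model_name: str = "meta-llama/Llama-3.2-1B"
    precision: str = "bf16"
    epochs: int = 3
    max_length: int = 2048
    output_dir: str = "output/llama_ifd"

    def __post_init__(self):
        # Only the two precisions named in the README are accepted.
        if self.precision not in ("bf16", "float32"):
            raise ValueError(f"unsupported precision: {self.precision}")


def resolve_precision(bf16_supported: bool) -> str:
    """Assumed fallback rule: bf16 when available, float32 otherwise."""
    return "bf16" if bf16_supported else "float32"


config = FinetuneConfig(precision=resolve_precision(bf16_supported=False))
print(asdict(config))
```

In a PyTorch setup, the `bf16_supported` flag could come from `torch.cuda.is_bf16_supported()`, and the fields would then be passed on to whatever trainer is actually used.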
Goal
Establish a performance baseline for the high-IFD samples before splitting them by instruction entropy.