Model: LLaMA (IFD Top 30%)

🔍 Purpose

Fine-tune meta-llama/Llama-3.2-1B on the instruction samples with the highest Instruction-Following Difficulty (IFD) scores.
A high IFD score means the instruction does little to help the model predict the output, so this group contains the samples where the instruction contributes least.

📂 Dataset

alpaca2000.csv
- Top 30% by IFD score (600 of 2,000 samples)
- Criterion: IFD = PPL(y | x) / PPL(y), where x is the instruction + input and y is the output
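
The selection rule above can be sketched as follows. This is a minimal illustration, not the script used for this experiment: it assumes the per-token negative log-likelihoods of the output tokens have already been obtained from a language model, once with x in the context and once without. The function names `perplexity`, `ifd_score`, and `select_top_fraction` are hypothetical, and rounding the cutoff to the nearest integer is an assumption.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood over the output tokens)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def ifd_score(nll_conditioned, nll_unconditioned):
    """IFD = PPL(y | x) / PPL(y).

    nll_conditioned:   per-token NLLs of y when x is in the context.
    nll_unconditioned: per-token NLLs of y scored on its own.
    """
    return perplexity(nll_conditioned) / perplexity(nll_unconditioned)


def select_top_fraction(scores, fraction=0.30):
    """Return the indices of the highest-scoring samples, highest first."""
    k = int(round(len(scores) * fraction))  # assumption: round to nearest
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

With 2,000 scored samples and `fraction=0.30`, `select_top_fraction` returns 600 indices, matching the subset size stated above.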

⚙️ Training Config

Model: meta-llama/Llama-3.2-1B
Precision: bf16 or float32
Epochs: 3
Max length: 2048
Output: output/llama_ifd
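
One way to keep these settings in a single place is sketched below. `FinetuneConfig` and `trainer_kwargs` are hypothetical names introduced for illustration and are not part of the original setup. The returned keys (`output_dir`, `num_train_epochs`, `bf16`) are chosen to match fields of Hugging Face `TrainingArguments`, on the assumption that training uses that library. `max_length` is kept separately because it would normally be applied at tokenization time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    """Hypothetical container mirroring the Training Config values."""
    model_name: str = "meta-llama/Llama-3.2-1B"
    precision: str = "bf16"  # "bf16" or "float32"
    epochs: int = 3
    max_length: int = 2048   # assumption: applied when tokenizing samples
    output_dir: str = "output/llama_ifd"

    def trainer_kwargs(self):
        """Map the config onto keyword arguments for a trainer."""
        if self.precision not in ("bf16", "float32"):
            raise ValueError(f"unsupported precision: {self.precision}")
        return {
            "output_dir": self.output_dir,
            "num_train_epochs": self.epochs,
            "bf16": self.precision == "bf16",
        }
```

Switching to full precision would then be `FinetuneConfig(precision="float32")`, which sets `bf16` to `False` in the returned kwargs.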

🧪 Goal

Establish the baseline performance of a model fine-tuned on high-IFD samples, before splitting those samples by instruction entropy.