Unleashing Artificial Cognition: Integrating Multiple AI Systems
The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game-analysis, and chess-understanding capabilities, including proficiency in Algebraic Notation and FEN (Forsyth-Edwards Notation). This fine-tuning aims to create a robust AI system architecture that integrates various tools seamlessly, boosting cognitive abilities within the controlled environment of chess. An illustration of both notations is given below.
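For readers unfamiliar with these notations, the snippet below illustrates them using the python-chess library; the library is used here purely for illustration and is not part of the fine-tuned model or the authors' pipeline.

```python
import chess

# The standard starting position, written in Forsyth-Edwards Notation (FEN)
board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")

# Play 1. e4, given in Standard Algebraic Notation (SAN)
board.push_san("e4")

# Print the FEN string describing the position after 1. e4
print(board.fen())
```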
The full work can be accessed on arXiv: 2408.04910.
This repository contains only the LoRA adapter. To use it, load the adapter on top of the quantized base Mistral model:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, then attach the LoRA adapter
base_model = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
)

lora_repo = "OpenSI/cognitive_AI_finetune_3"
adapter_config = PeftConfig.from_pretrained(lora_repo)
openSI_chess = PeftModel.from_pretrained(model, lora_repo)
```
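A minimal inference sketch follows. The prompt format is an assumption for illustration only; the exact template used during fine-tuning is not documented in this card.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Hypothetical prompt asking for a move from a FEN position; the actual
# prompt template used during fine-tuning may differ.
prompt = (
    "Position (FEN): rnbqkbnr/pppppppp/8/8/4P3/8/PPPPPPPP/RNBQKBNR b KQkq - 0 1\n"
    "Suggest Black's reply in Algebraic Notation."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = openSI_chess.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```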
The adapter was fine-tuned with the following training configuration:

```python
from transformers import TrainingArguments

model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False,
)
```
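For context, here is a minimal sketch of how these arguments could drive a training run with the standard transformers `Trainer`. The `train_dataset` variable is a hypothetical tokenized dataset, and the authors' actual pipeline (e.g. the LoRA configuration and k-bit preparation of the quantized model) is not reproduced here.

```python
from transformers import Trainer, DataCollatorForLanguageModeling

# `train_dataset` is a stand-in for a tokenized fine-tuning dataset;
# it is not included in this repository.
trainer = Trainer(
    model=openSI_chess,
    args=model_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```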
The test dataset can be accessed here: OpenSI Cognitive_AI.
Evaluation

[Figure: evaluation results]
Hardware: Nvidia RTX 3090
```bibtex
@misc{Adnan2024,
  title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
  author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
  year          = {2024},
  eprint        = {2408.04910},
  archivePrefix = {arXiv}
}
```