AI & ML interests

Nanochat, fine-tuning, LLMs, post-training

Recent Activity

csabakecskemeti 
posted an update 1 day ago
rajkumarrawal 
posted an update 2 days ago
September 2025 LLM Commonsense & Social Benchmarks Report by AiParivartanResearchLab (AIPRL-LIR)

Monthly LLM Intelligence Reports for AI Decision Makers:
Our "aiprl-llm-intelligence-report" repo establishes the AIPRL-LIR framework for overall evaluation and analysis of Large Language Models through systematic monthly intelligence reports. Unlike typical AI research papers or commercial reports, it provides structured insights into AI model performance, benchmarking methodologies, multi-hosting-provider analysis, industry trends ...

All in one monthly report: Leading Models & Companies, 23 Benchmarks in 6 Categories, Global Hosting Providers, and Research Highlights.

Here’s what you’ll find inside this month’s intelligence report:

Leading Models & Companies:
openai, Anthropic, meta-llama, google deepmind, mistralai, Cohere, Qwen, deepseek-ai, MicrosoftResearch, amazonwebservices, nvidia, grokgpt-org, and more.

23 Benchmarks in 6 Categories:
With a special focus on Commonsense & Social performance across diverse tasks.

Repository link is in the comments below:

https://huggingface.co/blog/rajkumarrawal/september-2025-aiprl-lir-commonsense-social

csabakecskemeti 
posted an update 3 days ago
Looking for some help to test an INT8 DeepSeek 3.2:
SGLang supports channel-wise INT8 quants on CPUs with AMX instructions (Xeon 5 and above, AFAIK):
https://lmsys.org/blog/2025-07-14-intel-xeon-optimization/

Currently uploading an INT8 version of DeepSeek 3.2 Speciale:
DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8

I cannot test this myself since I'm on AMD:
"AssertionError: W8A8Int8LinearMethod on CPU requires that CPU has AMX support"
(I assumed it could fall back to some non-optimized kernel, but apparently not.)

If anyone with the required resources (Intel Xeon 5/6 + roughly 768 GB to 1 TB of RAM) can help test this, that would be awesome.

If you have hints on how to make this work on an AMD Threadripper 7000 Pro series, please guide me.

Thanks all!
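
For anyone willing to try, here is a minimal sketch of what a test run could look like on an AMX-capable Xeon, loosely following the LMSYS/Intel blog linked above. The `--device cpu` and `--quantization w8a8_int8` flags are assumptions based on that post and may differ in your SGLang version; treat this as a starting point, not a verified recipe.

```python
# Hedged sketch: check for AMX support, then try serving the channel-wise INT8
# checkpoint with SGLang on CPU. Flag names follow the Intel/SGLang blog linked
# above and are assumptions -- adjust for the SGLang release you actually have.
import subprocess
from pathlib import Path


def cpu_has_amx() -> bool:
    """Return True if /proc/cpuinfo advertises AMX tile instructions (Linux only)."""
    flags = Path("/proc/cpuinfo").read_text()
    return "amx_tile" in flags


if __name__ == "__main__":
    if not cpu_has_amx():
        raise SystemExit(
            "No AMX support detected; W8A8Int8LinearMethod on CPU will refuse to run."
        )

    # Launch the SGLang server against the uploaded INT8 repo.
    subprocess.run(
        [
            "python", "-m", "sglang.launch_server",
            "--model-path", "DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8",
            "--device", "cpu",
            "--quantization", "w8a8_int8",
            "--trust-remote-code",
        ],
        check=True,
    )
```

Even just confirming that the server loads the weights without hitting the AMX assertion (and returns a single completion from the local endpoint) would already answer the question here.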
rajkumarrawal 
posted an update 4 days ago
September 2025 LLM Safety & Reliability Benchmarks Report by AI Parivartan Research Lab (AIPRL-LIR)

Monthly LLM Intelligence Reports for AI Decision Makers:

Our "aiprl-llm-intelligence-report" repo establishes the AIPRL-LIR framework for overall evaluation and analysis of Large Language Models through systematic monthly intelligence reports. Unlike typical AI research papers or commercial reports, it provides structured insights into AI model performance, benchmarking methodologies, multi-hosting-provider analysis, industry trends ...

All in one monthly report: Leading Models & Companies, 23 Benchmarks in 6 Categories, Global Hosting Providers, and Research Highlights.

Here’s what you’ll find inside this month’s intelligence report:

Leading Models & Companies:

23 Benchmarks in 6 Categories:
With a special focus on Safety & Reliability performance across diverse tasks.

Global Hosting Providers:

Research Highlights:
Comparative insights, evaluation methodologies, and industry trends for AI decision makers.

Disclaimer:
This comprehensive Safety & Reliability analysis represents the current state of large language model capabilities as of September 2025. All performance metrics are based on standardized evaluations and may vary based on specific implementation details, hardware configurations, and testing methodologies. Users are advised to consult original research papers and official documentation for detailed technical insights and application guidelines. Individual model performance may differ in real-world scenarios and should be validated accordingly. If there are any discrepancies or updates beyond this report, please refer to the respective model providers for the most current information.

Repository link is in the comments below:

https://huggingface.co/blog/rajkumarrawal/september-2025-aiprl-lir-safety-reliability

AiParivartanResearchLab
