Long-Context Attention Benchmark: From Kernel Efficiency to Distributed Context Parallelism • Paper • 2510.17896 • Published Oct 19 • 4
Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference • Paper • 2510.18413 • Published Oct 21 • 4
LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation • Paper • 2505.12031 • Published May 17 • 2
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey • Paper • 2311.12351 • Published Nov 21, 2023 • 5
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models • Paper • 2405.13053 • Published May 19, 2024 • 1
Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines • Paper • 2410.07896 • Published Oct 10, 2024 • 2
🍃 MINT-1T • Collection • Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 14 items • Updated Oct 22 • 64