TransMLA: Multi-head Latent Attention Is All You Need • Paper 2502.07864 • Published Feb 11, 2025
Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer • Paper 2503.02495 • Published Mar 4, 2025
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs • Paper 2504.18415 • Published Apr 25, 2025