Article CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG • Mar 15, 2024 • 14
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 7 days ago • 58
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 3 days ago • 24
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated 12 days ago • 171
∇NABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17 • 124
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8 • 119
Qwen3 Collection Qwen's new Qwen3 models in Unsloth Dynamic 2.0, GGUF, and 4-bit/16-bit Safetensor formats. Includes 128K context-length variants. • 79 items • Updated 3 days ago • 244
Qwen 2.5 Coder Collection Complete collection of the code-specific Qwen2.5 model series in bnb 4-bit, 16-bit, and GGUF formats. • 35 items • Updated 3 days ago • 36
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 72
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 232
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published Feb 25 • 68
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 174
Hymba Collection A series of hybrid small language models. • 3 items • Updated about 16 hours ago • 32
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation. • 12 items • Updated about 16 hours ago • 61
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Paper • 2408.03910 • Published Aug 7, 2024 • 18