Papers:
- Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity — arXiv:2501.16295, published Jan 27, 2025
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States — arXiv:2407.04620, published Jul 5, 2024
- MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression — arXiv:2406.14909, published Jun 21, 2024

Models:
- thrunlab/sparse_llama_7b_hf2_refined_web_90p_2024-05-12 — Text Generation, 7B, updated May 12, 2024
- thrunlab/sparse_llama_7b_hf2_refined_web_50p_2024-05-12 — Text Generation, 7B, updated May 12, 2024
- thrunlab/sparse_mistral_7b_refined_web_50p_2024-05-11 — Text Generation, 7B, updated May 11, 2024
- thrunlab/sparse_llama_7b_hf2_refined_web_50p_2024-05-11 — Text Generation, 7B, updated May 11, 2024
- thrunlab/sparse_mistral_7b_refined_web_50p_2024-04-14 — Text Generation, 7B, updated Apr 14, 2024