Scaling Intelligence

university

https://scalingintelligence.stanford.edu/

ScalingIntelligence

Bradley

updated a dataset 3 months ago

ScalingIntelligence/monkey_business

Updated Oct 8, 2025 • 2.88k • 247 • 19

ekellbuch

authored 2 papers 5 months ago

Pathologies of Predictive Diversity in Deep Ensembles

Paper • 2302.00704 • Published Feb 1, 2023 • 1

Brain-to-Text Benchmark '24: Lessons Learned

Paper • 2412.17227 • Published Dec 23, 2024 • 1

simarora

authored a paper 5 months ago

Cartridges: Lightweight and general-purpose long context representations via self-study

Paper • 2506.06266 • Published Jun 6, 2025 • 7

ekellbuch

authored 2 papers 5 months ago

Archon: An Architecture Search Framework for Inference-Time Techniques

Paper • 2409.15254 • Published Sep 23, 2024 • 1

Shrinking the Generation-Verification Gap with Weak Verifiers

Paper • 2506.18203 • Published Jun 22, 2025 • 2

anneouyang

updated a dataset 6 months ago

ScalingIntelligence/KernelBench

Updated Jul 21, 2025 • 270 • 4.32k • 36

a1zhang

authored 3 papers 8 months ago

SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Paper • 2410.03859 • Published Oct 4, 2024 • 1

KernelBench: Can LLMs Write Efficient GPU Kernels?

Paper • 2502.10517 • Published Feb 14, 2025 • 3

VideoGameBench: Can Vision-Language Models complete popular video games?

Paper • 2505.18134 • Published May 23, 2025 • 6

vsanimator

authored 3 papers 8 months ago

Block and Detail: Scaffolding Sketch-to-Image Generation

Paper • 2402.18116 • Published Feb 28, 2024

Automated Rewards via LLM-Generated Progress Functions

Paper • 2410.09187 • Published Oct 11, 2024 • 1

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Paper • 2505.00234 • Published May 1, 2025 • 26

simonguozirui

authored 2 papers 11 months ago

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Paper • 2408.08274 • Published Aug 15, 2024 • 13

KernelBench: Can LLMs Write Efficient GPU Kernels?

Paper • 2502.10517 • Published Feb 14, 2025 • 3

simarora

authored 5 papers about 1 year ago

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Paper • 2306.11698 • Published Jun 20, 2023 • 12

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20

Just read twice: closing the recall gap for recurrent language models

Paper • 2407.05483 • Published Jul 7, 2024

LoLCATs: On Low-Rank Linearizing of Large Language Models

Paper • 2410.10254 • Published Oct 14, 2024