Clémentine Fourrier

clefourrier

http://clefourrier.github.io

Recent Activity

updated a dataset about 2 hours ago

gaia-benchmark/results_public

published an article 2 months ago

Article

Gaia2 Leaderboard Update: New Models and New Observations

Oct 2

•

10

published an article 3 months ago

Article

Gaia2 and ARE: Empowering the community to study agents

+7

Sep 22

•

120

published an article 4 months ago

Article

TextQuests: How Good are LLMs at Text-Based Video Games?

Aug 12

•

36

published an article 4 months ago

Article

🇵🇭 FilBench - Can LLMs Understand and Generate Filipino?

+7

Aug 12

•

20

published an article 4 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

+10

Aug 5

•

509

published an article 5 months ago

Article

Back to The Future: Evaluating AI Agents on Predicting Future Events

+5

Jul 17

•

47

published an article 5 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

Jul 8

•

734

published an article 10 months ago

Article

Fixing Open LLM Leaderboard with Math-Verify

+2

Feb 14

•

30

published an article 10 months ago

Article

The Open Arabic LLM Leaderboard 2

+5

Feb 10

•

36

published an article 10 months ago

Article

Open-source DeepResearch – Freeing our search agents

+3

Feb 4

•

1.31k

published an article 11 months ago

Article

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

+2

Jan 9

•

21

published an article about 1 year ago

Article

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

+3

Dec 4, 2024

•

38

published an article about 1 year ago

Article

Letting Large Models Debate: The First Multilingual LLM Debate Competition

+10

Nov 20, 2024

•

33

published an article about 1 year ago

Article

Introducing the Open Leaderboard for Japanese LLMs!

+4

Nov 20, 2024

•

39

published an article about 1 year ago

Article

Judge Arena: Benchmarking LLMs as Evaluators

+6

Nov 19, 2024

•

60

published an article about 1 year ago

Article

Introducing the Open FinLLM Leaderboard

+11

Oct 4, 2024

•

79

published an article over 1 year ago

Article

BigCodeBench: The Next Generation of HumanEval

+7

Jun 18, 2024

•

52