SWE-bench
SWE-bench: Evaluate Language Models on Open Source Software Tasks
BridgeBench — AI Coding & Vibe Coding Benchmark
130+ real-world coding tasks. 6 categories. The benchmark built for vibe coding and agentic AI builders.
LLM Explorer: A Curated Large Language Model Directory
Browse 51545 open-source large and small language models, conveniently grouped into categories and LLM lists, complete with benchmarks and analytics.
Zero GPU Spaces - a Hugging Face Space by enzostvs
List of spaces using ZERO-GPU
Also in https://huggingface.co/spaces
Price Per Token - LLM API Pricing 2026
Compare LLM API pricing across 300+ AI models. Updated daily with costs from OpenAI, Anthropic, Google, Meta, and more.
CanIRun.ai — Can your machine run AI models?
Detect your hardware and find out which AI models you can run locally. GPU, CPU, and RAM analysis in your browser.
LLM Leaderboard - Vellum
Compare large language models side by side. Updated rankings based on benchmarks, pricing, and real-world performance.
Arena Leaderboard | Compare & Benchmark the Best Frontier AI Models
Explore AI model leaderboards to benchmark and compare the best frontier AI models across text, image, video, search, and code—ranked by human votes.
OpenRouter - LLM Rankings
LLM rankings and leaderboard based on real usage data from millions of users. See which AI models developers actually use.
List of large language models - Wikipedia
Artificial Analysis - AI Model & API Providers Analysis
Comparison and analysis of AI models and API hosting providers. Independent benchmarks across key performance metrics including quality, price, output speed & latency.
LLM Leaderboard 2025 - Model Rankings & Analysis
Access the latest LLM leaderboard with comprehensive performance metrics and benchmark data. Compare top language models with interactive analysis tools.