The companies building AI and the companies evaluating AI are now generating as much traffic as the AI products themselves.
The Million Club — Research & Data Science Edition. This is the ranking behind the rankings. Every AI chatbot, image generator, and coding tool that appears in the other Million Club articles was built by a research lab, trained on labeled data, and evaluated on a benchmark that appears on this list. This is the supply chain of intelligence itself.
The surprises here are structural. SAP — yes, the enterprise software company — leads with nearly 39 million monthly visits to its AI platform. Google fragments across four research domains totaling 45 million. The data labeling companies that most people have never heard of — Outlier, Prolific, Data Annotation — collectively draw over 40 million visits. And LMSys Arena at 25 million has become the de facto standard for comparing AI models, generating more traffic than most of the models it evaluates.
I tracked 42 platforms across AI research labs, data infrastructure, labeling services, benchmarks, and research communities. 29 offer free access. Four entries sit below 1 million visits but are included for their outsized influence on the AI ecosystem. All numbers are from SimilarWeb, reflecting December 2025 estimates. I aim to refresh them around the 22nd of each month.
The Full Rankings
Here are all 42 AI research and data science platforms ranked by monthly traffic. This is the most heterogeneous ranking in the series — research labs sit next to data labeling marketplaces, cloud platforms next to academic paper review sites. What unites them is their role in the AI supply chain: building models, training them, evaluating them, or providing the infrastructure to deploy them. 29 out of 42 offer free access.
| # | Domain | Monthly Visits | Service | Free |
|---|---|---|---|---|
| 🥇 | ondemand.com | 38.87M | SAP BTP AI platform | |
| 🥈 | labs.google | 34.22M | Google AI experiments | |
| 🥉 | x.ai | 32.83M | xAI company official site | |
| #4 | qwen.ai | 32.76M | Alibaba Qwen AI official | |
| #5 | aliyun.com | 29.05M | Alibaba Cloud AI services | |
| #6 | lmarena.ai | 25.33M | LMSys Arena AI model evaluation | |
| #7 | outlier.ai | 21M | Outlier AI data analysis | |
| #8 | cloud.sap | 14.27M | SAP AI platform | |
| #9 | prolific.com | 13.32M | Prolific AI data collection | |
| #10 | anthropic.com | 9.41M | Anthropic AI research company | |
| #11 | openrouter.ai | 8.58M | OpenRouter chatbot rankings | |
| #12 | dataannotation.tech | 8.57M | Data Annotation AI labeling | |
| #13 | mistral.ai | 7.96M | Mistral AI model company | |
| #14 | deepmind.google | 6.41M | Google DeepMind AI research | |
| #15 | minimax.io | 5.48M | MiniMax AI platform | |
| #16 | snowflake.com | 4.75M | Snowflake AI data cloud | |
| #17 | databricks.com | 4.29M | Databricks AI data platform | |
| #18 | abacus.ai | 4.04M | Abacus.AI enterprise AI | |
| #19 | crowdgen.com | 3.42M | CrowdGen AI training platform | |
| #20 | openreview.net | 2.96M | OpenReview AI paper review | |
| #21 | ai.google | 2.7M | Google AI official site | |
| #22 | axon.ai | 2.47M | Axon AI enterprise data | |
| #23 | snowflakecomputing.com | 2.21M | Snowflake AI data computing | |
| #24 | artificialanalysis.ai | 2.2M | Artificial Analysis AI benchmark | |
| #25 | telusinternational.ai | 2.18M | Telus International AI data labeling | |
| #26 | glean.com | 2.01M | Glean AI enterprise search | |
| #27 | bigmodel.cn | 1.68M | Zhipu AI big model platform | |
| #28 | wolfram.com | 1.6M | Wolfram AI benchmark | |
| #29 | research.google | 1.56M | Google AI research | |
| #30 | domo.com | 1.33M | Domo AI data platform | |
| #31 | towardsdatascience.com | 1.28M | Towards Data Science AI media | |
| #32 | amplitude.com | 1.22M | Amplitude AI analytics | |
| #33 | tiangong.cn | 1.22M | Tiangong AI model platform | |
| #34 | iflytek.com | 1.16M | iFlytek AI voice technology | |
| #35 | toloka.ai | 1.11M | Toloka AI data labeling | |
| #36 | minimaxi.com | 1.1M | MiniMax AI alternate | |
| #37 | xfyun.cn | 1.02M | iFlytek open platform | |
| #38 | analyticsvidhya.com | 1.01M | Analytics Vidhya AI community | |
| #39 | designarena.ai | 798.65K | Design Arena AI benchmark | |
| #40 | stability.ai | 682.27K | Stability AI official site (Stable Diffusion developer) | |
| #41 | zhipu.ai | 538K | Zhipu AI official site (ChatGLM developer) | |
| #42 | moonshot.cn | 499.08K | Moonshot AI official site (Kimi developer) | |
The Research Titans
The research labs on this list are the organizations actually inventing the AI that every other Million Club tool is built on. Their traffic tells you something about public interest in AI itself — not as a product to use, but as a technology to understand.
Google dominates through sheer fragmentation: labs.google at 34.22 million (AI experiments and demos), deepmind.google at 6.41 million (fundamental research), ai.google at 2.7 million (the official AI hub), and research.google at 1.56 million (published research). Combined: 44.89 million monthly visits across four domains. That's more than double the combined traffic of Anthropic and Mistral — reflecting Google's unique position as both the largest AI research organization and the company with the most public-facing research properties.
xAI at 32.83 million is the surprise of this ranking. Elon Musk's AI company has generated enormous traffic to its corporate site — driven by Grok's visibility and the constant news cycle around xAI's funding, compute buildout, and model releases. This is corporate-site traffic, not product traffic (Grok's usage shows up in the chatbot ranking), but 33 million visits to a company homepage is extraordinary for a research lab.
Anthropic (9.41M)
The safety-focused lab behind Claude. Anthropic's corporate site draws nearly 10 million visits — researchers reading papers, developers checking API docs, and a growing public audience following its Constitutional AI approach. The gap between Anthropic's research traffic and Claude's product traffic tells the story of a company whose brand matters as much as its product.
Mistral AI (7.96M)
Europe's leading AI lab. Mistral has built credibility through open-weight models that rival closed competitors — Mistral Large, Mixtral, and the compact Mistral 7B. Its 8 million visits reflect the developer community's intense interest in alternatives to US and Chinese model providers.
DeepMind (6.41M)
Google's fundamental research arm. DeepMind's traffic is driven by breakthrough publications — AlphaFold for protein structure, Gemini model development, and foundational advances in reinforcement learning. This is the lab most cited in academic AI papers, and its traffic reflects that influence.
Stability AI (682.27K)
The cautionary tale. Stability AI — creator of Stable Diffusion, the most influential open-source image model — has fallen below the Million Club threshold. Leadership changes, funding challenges, and the shift toward closed models have taken a visible toll. Its sub-700K traffic contrasts sharply with the billions of images generated using its technology.
Research lab traffic is a leading indicator of AI industry direction. When a lab's corporate site spikes, it means something significant was published or announced. Anthropic's steady 9.4M reflects sustained interest; xAI's 32.8M reflects hype-driven attention. The distinction matters: sustained traffic correlates with developer adoption, while spike-driven traffic often fades.
The Chinese AI Labs
The Chinese AI research ecosystem is represented by ten entries on this list — and their combined traffic tells a story of rapid, parallel development that Western coverage consistently underestimates.
Qwen at 32.76 million leads — Alibaba's open-weight model family that has become the foundation for countless Chinese AI applications. Combined with Alibaba Cloud (aliyun.com) at 29.05 million, the Alibaba AI ecosystem totals over 61 million visits. Qwen's traffic reflects something specific: it's the most popular base model for fine-tuning in the Chinese developer ecosystem, the way Llama is in the West. Developers visit qwen.ai for model downloads, documentation, and benchmarks.
MiniMax (6.58M combined)
The multimodal specialist. MiniMax builds models for text, voice, and video generation, with particular strength in voice synthesis. Two domains (minimax.io at 5.48M + minimaxi.com at 1.1M) reflect its growing developer platform alongside its consumer products.
Zhipu AI (2.22M combined)
The ChatGLM developer. Zhipu AI's bilingual models power enterprise AI applications across China. Two domains (bigmodel.cn at 1.68M for the model platform + zhipu.ai at 538K for corporate) serve different audiences — developers and business stakeholders respectively.
iFlytek (2.18M combined)
The voice AI pioneer. iFlytek dominates Chinese speech recognition and synthesis, with its open platform (xfyun.cn at 1.02M) serving hundreds of thousands of developers. The corporate site (iflytek.com at 1.16M) reflects its publicly-traded company profile.
Tiangong & Moonshot
Tiangong at 1.22M represents Kunlun Tech's AI model platform. Moonshot AI (moonshot.cn at 499K) — the company behind Kimi, China's popular long-context chatbot — has surprisingly low corporate-site traffic relative to Kimi's product success, mirroring the Anthropic/Claude pattern where the product dwarfs the lab's own site.
The Alibaba Factor
Alibaba's AI presence across Qwen and Aliyun totals 61.81 million monthly visits — making it the largest single entity in this ranking, ahead of even SAP's combined 53.14 million. This mirrors how Google fragments across four domains but concentrates even more traffic. Alibaba is simultaneously the leading open-weight model provider in China (Qwen), the dominant cloud platform (Aliyun), and an investor in multiple AI startups. Its position in Chinese AI is closer to what Google is in the West than any other comparison.
The Data Infrastructure
The data infrastructure tier of this ranking contains the platforms where AI models actually get deployed, trained, and served at scale. These are the companies selling shovels in the AI gold rush — and their traffic reveals which platforms enterprises are choosing.
SAP's AI presence is the biggest surprise on this list. ondemand.com at 38.87 million plus cloud.sap at 14.27 million gives SAP a combined 53.14 million visits — the second-highest total for any single entity in this ranking, behind only Alibaba. SAP isn't known as an AI company, but its Business Technology Platform integrates AI deeply into enterprise workflows for thousands of Fortune 500 companies. The traffic comes from enterprise users accessing AI-powered applications, not from developers experimenting with models.
Snowflake at 4.75 million plus snowflakecomputing.com at 2.21 million totals 6.96 million. Snowflake's AI play centers on Cortex — bringing machine learning directly into the data warehouse where enterprise data already lives. The pitch: don't move your data to an AI platform; bring AI to your data. Databricks at 4.29 million competes directly with a unified analytics platform that combines data engineering, data science, and AI model training in a single lakehouse architecture.
Abacus.AI (4.04M)
The AI-for-AI platform. Abacus.AI lets enterprises build custom AI agents and deploy foundation models without a data science team. Its 4 million visits reflect growing demand for no-code/low-code AI deployment tools that bridge the gap between model capability and business implementation.
Glean (2.01M)
Enterprise AI search. Glean indexes a company's internal data — documents, emails, Slack messages, code — and makes it searchable with AI. In a world drowning in enterprise data, Glean solves the most basic problem: finding what you already have.
The analytics tools round out the infrastructure layer: Domo at 1.33 million provides AI-powered business intelligence, and Amplitude at 1.22 million adds AI to product analytics — predicting user behavior and identifying patterns in how people interact with digital products. Axon at 2.47 million handles enterprise data management with AI integration.
The infrastructure battle in AI isn't about who has the best model — it's about who controls the data layer. Snowflake, Databricks, and SAP are betting that enterprises will choose the platform closest to their existing data. The model layer is increasingly commoditized; the data layer is where lock-in and margins live. The traffic numbers support this: SAP's 53 million visits dwarf every pure-play AI research lab on this list.
The Data Labeling Economy
Every AI model on earth was trained on data that humans labeled. The text that ChatGPT learned from, the images that Midjourney was trained on, the code examples that Copilot internalized — all of it was curated, annotated, rated, or corrected by people working through the platforms on this list. Data labeling is the invisible human labor that makes AI possible.
Outlier at 21 million monthly visits leads the category — and its traffic tells a remarkable story. Twenty-one million visits to a data analysis and annotation platform that most AI users have never heard of. This is traffic from the hundreds of thousands of workers who log in daily to label data, rate AI outputs, and provide the human feedback that makes RLHF (Reinforcement Learning from Human Feedback) work. When you hear that an AI model was "aligned" or "fine-tuned," the alignment came from people working on platforms like Outlier.
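The unit of work behind RLHF is the preference pair: one prompt, two candidate responses, and a human judgment about which is better. A minimal sketch of what that record might look like and how a batch of judgments rolls up into a win rate — the field names and schema here are illustrative, not any platform's actual format:

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    """One human judgment: which of two model responses is better?"""
    prompt: str
    response_a: str
    response_b: str
    preferred: str  # "a" or "b" -- the annotator's choice

def win_rate(records: list[PreferencePair]) -> float:
    """Fraction of comparisons in which response A was preferred."""
    wins = sum(1 for r in records if r.preferred == "a")
    return wins / len(records)

batch = [
    PreferencePair("Explain photosynthesis simply.",
                   "Plants turn sunlight into food.",
                   "Photosynthesis is a biochemical process wherein...",
                   preferred="a"),
    PreferencePair("Write a polite refusal email.",
                   "No.",
                   "Thank you for thinking of me, but I must decline.",
                   preferred="b"),
]
print(win_rate(batch))  # 0.5
```

Millions of judgments in roughly this shape, aggregated, become the reward signal that fine-tunes a model's behavior.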
Prolific at 13.32 million serves a different niche: academic and research-grade data collection. Where Outlier focuses on AI training data at scale, Prolific connects researchers with demographically diverse participants for studies, surveys, and behavioral experiments. It's the platform that powers much of the academic AI safety and alignment research — and its 13 million visits reflect both the scale of AI research and the growing demand for high-quality human data.
Data Annotation (8.57M)
The AI training workforce. DataAnnotation.tech connects human annotators with AI companies that need training data — text labeling, image classification, preference ranking, and the fine-grained quality assessments that distinguish good models from great ones.
CrowdGen (3.42M)
Crowd-sourced AI training. CrowdGen organizes large-scale data labeling projects, distributing annotation tasks across a managed workforce. The traffic reflects the platform's role in AI training pipelines for major model developers.
Telus International (2.18M)
Enterprise-grade data labeling from a major Canadian tech company. Telus International provides AI training data services at scale, with quality assurance processes that enterprise clients require — a more structured alternative to marketplace platforms.
Toloka (1.11M)
The open data labeling platform. Toloka — originally a Yandex project — provides crowd-sourced annotation tools with a particular focus on multilingual and cross-cultural data collection. Its open approach makes it popular in academic settings.
The Hidden Workforce
The data labeling platforms on this list — Outlier, Prolific, Data Annotation, CrowdGen, Telus International, and Toloka — collectively draw over 49 million monthly visits. That's more than Anthropic, Mistral, and DeepMind combined. These platforms employ millions of workers globally who do the painstaking work of training AI: rating responses, flagging errors, labeling images, and providing the human judgment that no algorithm can replace. The AI industry's most important workforce is also its least visible.
The Benchmarks & Leaderboards
How do you know which AI model is best? You check a benchmark. The benchmark and leaderboard platforms on this list have become the arbiters of AI quality — and their traffic reveals how deeply the AI community relies on comparative evaluation.
LMSys Arena at 25.33 million is the most influential AI evaluation platform in the world. Its "Chatbot Arena" uses blind head-to-head comparisons — users chat with two anonymous models and pick the better response — to generate Elo ratings that the entire industry treats as ground truth. When a new model claims to be "state-of-the-art," the first question is always: what's its Arena score? 25 million visits means hundreds of thousands of people are actively participating in model evaluation every month.
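The rating mechanics behind arena-style leaderboards are simple enough to sketch. This is a generic Elo update, not LMSys Arena's exact implementation (their published ratings use a statistical fit over all votes rather than this online update); the K-factor and starting ratings are illustrative:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Apply one blind head-to-head vote; returns the new ratings."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    delta = k * (s_a - e_a)
    return r_a + delta, r_b - delta  # zero-sum: one model's gain is the other's loss

# Two anonymous models start equal; model A wins three straight votes.
a, b = 1000.0, 1000.0
for _ in range(3):
    a, b = elo_update(a, b, a_won=True)
```

Note that each successive win moves the ratings less: once A is favored, beating B again carries little new information.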
OpenRouter at 8.58 million serves a dual role: it's both a model routing platform (letting developers access multiple AI models through a single API) and a community-driven ranking system where usage patterns reveal which models developers actually prefer. The traffic reflects both practical utility and comparative interest — developers come to use models and stay to compare them.
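The routing idea is that one request shape serves every model. The sketch below only constructs the payload (no network call, no real key); the endpoint path and model slugs follow OpenRouter's OpenAI-compatible API as best I know it, and should be verified against the official docs:

```python
def build_request(model: str, prompt: str, api_key: str) -> dict:
    """Assemble an OpenAI-style chat completion request for a routed model."""
    return {
        "url": "https://openrouter.ai/api/v1/chat/completions",
        "headers": {"Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json"},
        "body": {
            "model": model,  # the provider/model slug selects the backend
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Swapping one string targets a different provider through the same API:
reqs = [build_request(m, "Summarize RLHF in one sentence.", "sk-...")
        for m in ("anthropic/claude-3.5-sonnet", "mistralai/mistral-large")]
```

That interchangeability is exactly what makes OpenRouter's usage data a de facto preference ranking: when switching models costs one string, developers vote with their traffic.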
OpenReview (2.96M)
The academic gatekeeper. OpenReview hosts the peer review process for top AI conferences — NeurIPS, ICLR, and others. Its 3 million visits come from researchers submitting papers, reading reviews, and tracking which ideas are being accepted. If LMSys rates models, OpenReview rates ideas.
Artificial Analysis (2.2M)
The performance tracker. Artificial Analysis benchmarks AI models on speed, cost, and quality — the three dimensions enterprises care about when choosing between providers. Its independent testing methodology has made it a trusted neutral source for model comparison.
Wolfram (1.6M)
The computational authority. Wolfram's knowledge engine and the Wolfram Language provide computational tools that serve as ground truth for mathematical and scientific AI evaluation — a model's answer can be checked against a system that actually computes. Stephen Wolfram's writing on what language models can and cannot do adds a distinctive analytical perspective.
Design Arena (798.65K)
The visual counterpart to Chatbot Arena. Design Arena applies the same head-to-head evaluation model to AI-generated designs and visual outputs. Still below the Million Club threshold, but growing rapidly as the AI community seeks standardized ways to evaluate visual AI quality.
The AI media and community platforms also contribute to research discourse: Towards Data Science at 1.28 million provides accessible technical writing about AI and data science, while Analytics Vidhya at 1.01 million serves the broader data science learning community with tutorials, competitions, and career resources.
LMSys Arena's 25 million visits represent a fundamental shift in how technology is evaluated. In previous tech eras, professional reviewers and trade publications decided which products were best. In AI, the community itself decides — through blind evaluation, open benchmarks, and crowd-sourced preferences. The benchmark platform has become more influential than any individual reviewer, and its ratings move markets, funding decisions, and engineering priorities.
Methodology and Data Source
All traffic numbers come from SimilarWeb, reflecting December 2025 estimates.
This ranking includes 42 platforms — mid-sized for the Million Club series. The category is inherently harder to define than "chatbots" or "image generators" because research and data science span a wide range of functions. I've included platforms that are primarily about creating, training, evaluating, or understanding AI — not about using AI as a finished product (those appear in other rankings).
Four entries fall below 1 million visits: Design Arena at 798.65K, Stability AI at 682.27K, Zhipu AI at 538K, and Moonshot AI at 499.08K. I've included them because their influence on the AI ecosystem far exceeds what their traffic suggests. Stability AI created Stable Diffusion. Zhipu AI built ChatGLM. Moonshot AI developed Kimi. Design Arena is pioneering visual AI evaluation. Traffic and influence don't always correlate — especially for research labs.
Multi-domain entities appear frequently: Google across four domains (~45M combined), SAP across two (~53M), Alibaba across two (~62M), Snowflake across two (~7M), MiniMax across two (~6.6M), iFlytek across two (~2.2M), and Zhipu AI across two (~2.2M). Each domain is listed separately since SimilarWeb tracks them independently.
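The combined totals quoted above can be reproduced from the table with a small grouping script. Figures are in millions of monthly visits, copied from the ranking; the entity groupings are this article's own (iFlytek and Zhipu roll up the same way):

```python
from collections import defaultdict

# Per-domain SimilarWeb estimates from the table, in millions of visits.
visits = {
    "labs.google": 34.22, "deepmind.google": 6.41,
    "ai.google": 2.70, "research.google": 1.56,
    "ondemand.com": 38.87, "cloud.sap": 14.27,
    "qwen.ai": 32.76, "aliyun.com": 29.05,
    "snowflake.com": 4.75, "snowflakecomputing.com": 2.21,
    "minimax.io": 5.48, "minimaxi.com": 1.10,
}
# Which parent entity each domain belongs to.
entity = {
    "labs.google": "Google", "deepmind.google": "Google",
    "ai.google": "Google", "research.google": "Google",
    "ondemand.com": "SAP", "cloud.sap": "SAP",
    "qwen.ai": "Alibaba", "aliyun.com": "Alibaba",
    "snowflake.com": "Snowflake", "snowflakecomputing.com": "Snowflake",
    "minimax.io": "MiniMax", "minimaxi.com": "MiniMax",
}

totals = defaultdict(float)
for domain, millions in visits.items():
    totals[entity[domain]] += millions

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total:.2f}M")
```

Running this confirms the ordering discussed earlier: Alibaba (61.81M) ahead of SAP (53.14M), with Google (44.89M) third.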
The free tier ratio is 29 out of 42 (69%) — higher than the business & marketing category but lower than consumer AI tools. Many research resources are free by nature (papers, benchmarks, experiments), while enterprise data infrastructure typically requires paid access.
Update Schedule
I plan to refresh this ranking around the 22nd of each month. Research lab traffic tends to spike around major announcements and conference seasons (NeurIPS in December, ICLR in spring). Benchmark traffic — especially LMSys Arena — correlates directly with new model releases. The data labeling platforms show the steadiest growth, reflecting the insatiable demand for human-annotated training data.
"Every AI model you use was built by a research lab, trained on data labeled by humans, evaluated on a benchmark, and deployed on cloud infrastructure. The 42 platforms on this list are that supply chain made visible. They don't get the headlines — chatbots and image generators do — but they're the reason those products exist at all. The next time an AI gives you a surprisingly good answer, remember: someone at Outlier probably rated a similar answer 'preferred' six months ago, a researcher at DeepMind published the technique that made it possible, and LMSys Arena told the world it was good."