Why We Built LLMDex
A short story about how an internal model-tracking spreadsheet became a public site, and what we learned along the way.
Every project that ships in public has an origin story, and most of them are less mysterious than they sound. LLMDex started as a spreadsheet. This is the short version of how it became a website that catalogues 80 LLMs, 60 AI tools, 60 use-case guides, and over 1,800 head-to-head comparisons.
If you're building anything in public (a database, a directory, a curated reference), the lessons here might save you a few months.
The spreadsheet phase
In early 2024, our small team kept a Notion table titled "Models." Every time a new model launched, we'd add a row: name, provider, release date, context window, pricing, our subjective takes. The intent was internal: we needed a quick reference for "what should we use for this workload" decisions, and our memories couldn't keep up with the launch cadence.
The table was useful. It also drifted constantly. Pricing changes weren't reflected for weeks. New benchmark numbers got added inconsistently. We'd cite the table in meetings and then realize the number was a month stale. Internal-only tools have little incentive to stay accurate: the cost of a single wrong cell is small, while the discipline to keep every cell right is expensive.
By mid-2024 the table had ~40 rows and was wrong in maybe a third of its cells.
The "why isn't this public" moment
Around June 2024 we noticed three things:
- We searched for "X model API pricing" on Google several times a week, every week. The official pages were sometimes hard to find, sometimes inconsistent with each other, sometimes missing key details (context window, knowledge cutoff).
- The first results on Google were either provider PR pages or AI-spam content farms. Neither was authoritative. Neither was up to date.
- Artificial Analysis existed and was great, but it focused on benchmarks rather than pricing/specs/comparisons. There was a gap.
If our internal table, even partially out of date, was more useful than the top Google results, the table belonged on the internet. The leap was less "let's build a website" and more "the internet is missing this; we accidentally have what's needed; let's stop hoarding it."
The honest version of why
We could pretend LLMDex was started purely as a public good. The honest version: programmatic SEO is a viable business model for niche reference content, and the AI tooling space in 2024 was a category where (a) the gap was real, (b) the audience was high-CPM and willing to click affiliate links, and (c) the data layer is the moat.
We built it for the same reason most great reference sites get built: a combination of "this annoys us" and "this could pay for itself."
The first version
The first public LLMDex shipped in late 2024 with 25 models, 20 tools, and a hand-coded Next.js site that took maybe two weeks to build. Three things were intentional:
- No fabricated data. Every benchmark on the site traced to a public source. Where we couldn't verify, we left blanks. (We've written separately about this policy.)
- Comparisons are programmatic, not LLM-written. Every /compare/[a-vs-b] page renders a verdict synthesized from data deltas, not generic AI prose. The page shape is identical; the content is genuinely different per pair.
- Affiliate links are visibly labeled. rel="sponsored" on every link, a "sponsored" badge in the UI. We never adjust rankings based on commercial relationships.
The early traffic was slow. The early users were friends and people who found us via Hacker News on launch day.
The pivot to programmatic SEO
Around early 2025 we realized the comparison pages, auto-generated from the dataset, were doing something the spec sheets and tool pages weren't: ranking on long-tail "X vs Y" searches that no other site indexed comprehensively. Within four months of starting to ship comparisons, those pages drove 70% of our traffic.
The lesson: in a fast-moving space with thousands of viable model pairs, programmatic SEO works if and only if content quality holds at scale. Spam comparisons ("here's why X is better than Y," with no actual data) get penalized fast. Real comparisons (with actual data deltas, real benchmark differences, contextual prose) rank.
This shaped the rest of the build. We invested heavily in the comparison-generation logic. Every compare page on the site today is the output of lib/verdict.ts, a deterministic function that produces meaningfully different text per pair because the input data differs.
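To make "deterministic, data-driven verdicts" concrete, here's a stripped-down sketch of the idea. The field names, thresholds, and wording are simplified for illustration; the real module weighs many more dimensions:

```ts
// A stripped-down sketch of a deterministic verdict: prose derived purely
// from deltas between two spec objects. Field names and thresholds are
// illustrative, not the real lib/verdict.ts.
type ModelSpec = {
  name: string;
  inputPricePerMTok: number; // USD per million input tokens
  contextWindow: number;     // tokens
  mmlu?: number;             // 0-100; left undefined when we can't verify it
};

export function verdict(a: ModelSpec, b: ModelSpec): string {
  const parts: string[] = [];

  // Price: only mention it when the gap is meaningful.
  const ratio = a.inputPricePerMTok / b.inputPricePerMTok;
  if (ratio < 0.8) {
    parts.push(`${a.name} is roughly ${Math.round((1 - ratio) * 100)}% cheaper on input tokens.`);
  } else if (ratio > 1.25) {
    parts.push(`${b.name} is roughly ${Math.round((1 - 1 / ratio) * 100)}% cheaper on input tokens.`);
  }

  // Context window: report the larger one.
  if (a.contextWindow !== b.contextWindow) {
    const bigger = a.contextWindow > b.contextWindow ? a : b;
    parts.push(`${bigger.name} offers the larger context window (${bigger.contextWindow.toLocaleString()} tokens).`);
  }

  // Benchmarks: only compared when both sides have verified numbers.
  if (a.mmlu !== undefined && b.mmlu !== undefined && a.mmlu !== b.mmlu) {
    const [leader, trailer] = a.mmlu > b.mmlu ? [a, b] : [b, a];
    parts.push(`${leader.name} leads on MMLU (${leader.mmlu} vs ${trailer.mmlu}).`);
  }

  return parts.length > 0
    ? parts.join(" ")
    : `${a.name} and ${b.name} are close on the specs we track; the right pick depends on your workload.`;
}
```

Because the output is a pure function of the input specs, a dataset correction regenerates every affected comparison page automatically, and the prose for a pair only changes when its underlying data does.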
What we got wrong early
Three mistakes worth flagging:
1. Too much focus on the homepage
For three months in 2024 we obsessed over the home page. New users land on the home page, or so the thinking went. Our analytics showed something different: users almost never landed on the home page. They landed on a model spec or compare page from a search result. The home page was for repeat users, not first-time visitors. We should have invested in the spec pages first.
2. Over-engineered the sitemap
We tried to be clever about which pages to include in the sitemap based on "expected search demand." This was both speculative and fragile. We'd flip pages in and out of the sitemap, Google would re-crawl, traffic would yo-yo. The right answer was simple: sitemap everything that's indexable, let Google figure out demand.
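For reference, here's roughly what the simple approach looks like on an app-router Next.js site. The data loaders, field names, and URL paths below are stand-ins for however the dataset is stored and routed; MetadataRoute.Sitemap is the standard Next.js API:

```ts
// app/sitemap.ts — a simplified sketch of "sitemap everything indexable".
import type { MetadataRoute } from "next";

const BASE_URL = "https://example.com"; // placeholder origin

type Entry = { slug: string; updatedAt: Date };

// Stand-in loaders; in practice these read the real dataset.
async function getAllModels(): Promise<Entry[]> {
  return [];
}
async function getAllComparisons(): Promise<Entry[]> {
  return [];
}

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const [models, comparisons] = await Promise.all([
    getAllModels(),
    getAllComparisons(),
  ]);

  // Every indexable page goes in. No demand heuristics, no flipping pages
  // in and out between crawls.
  return [
    { url: BASE_URL, lastModified: new Date() },
    ...models.map((m) => ({
      url: `${BASE_URL}/models/${m.slug}`,
      lastModified: m.updatedAt,
    })),
    ...comparisons.map((c) => ({
      url: `${BASE_URL}/compare/${c.slug}`,
      lastModified: c.updatedAt,
    })),
  ];
}
```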
3. Underinvested in trust signals
Our first version had no methodology page, no clearly marked sponsored links, no last-updated stamps, no "about" page worth reading. Trust signals are slow to build but compound. We retrofitted them in early 2025. They're the table stakes we'd build first if we redid the project.
What we got right
Three things we'd do the same:
- Started with a real dataset. A working spreadsheet with 25 entries was more useful than 1,000 entries scraped poorly. Quality before quantity.
- Avoided fabrication from day one. "Benchmark not yet available" was a deliberate choice. Trust earned slowly compounds; trust lost is hard to rebuild.
- Shipped publicly early. The version we launched was embarrassing by 2026 standards. It was real, it got users, and the feedback loop made the product better than the perfect version we'd have shipped six months later.
What 2026 looks like
LLMDex covers:
- 80 LLMs with full spec pages
- 60 AI tools with alternatives pages
- 60 use-case guides
- 1,800+ head-to-head comparisons (auto-generated, programmatically verified)
- A long-form blog (this article is one of its posts)
- A Friday newsletter
The site is profitable from sponsorships and affiliate revenue. It's still a small team. The maintenance burden is roughly 8 hours per week, most of which goes into keeping the dataset current as new models ship.
The audience is what we'd hoped for: developers, AI engineers, technical founders making real decisions about what to deploy. The feedback we get is technical and specific (here's a benchmark you got wrong, here's a model you're missing), which is exactly the feedback that makes a reference site better.
Why we keep at it
Three reasons:
- The domain is interesting. LLMs and AI tools are one of the most rapidly evolving fields in software. Cataloguing it well is genuinely useful work.
- The economics support it. Programmatic SEO is a real model for reference content; we're proving it out on a niche we know.
- The trust frame matters. "Honest, audited reference data" is a position we believe in. Defending it long-term is its own reward.
If you're considering building something similar (a directory, a database, a curated reference), the playbook is straightforward: pick a domain you know, ship a small honest version, invest in the data layer, optimize for trust signals, let SEO compound. The rest is patience.
Further reading
- The Five LLM Myths That Won't Die
Reasoning models hallucinate too. Open-weight is not always cheaper. And three more myths the AI Twitter consensus needs to retire.
- The Two Rules of Honest AI Data
Don't fabricate. Don't omit context. The full editorial standard behind LLMDex's data and how to apply it to your own work.
- How to Read an AI Benchmark: A Skeptical Reader's Guide
MMLU, HumanEval, SWE-bench, GPQA, what they actually measure, how providers game them, and how to think about benchmark numbers in 2026.