
Why AI Picks Winners: The Research Behind LLM Brand Bias

A growing body of academic research shows that AI recommendations aren't neutral. The biases behind them are systematic, measurable, and surprisingly predictable. Here's what the science says.

Mack Grenfell
March 21, 2026
12 min read
The Science Behind AI Visibility - Part 1 of 4

When researchers test how AI models recommend products, the results aren't subtle. Ask five different LLMs to recommend a CRM for small businesses, and you'll get five meaningfully different answers. Not random different - systematically different, in ways that trace back to how each model was trained, fine-tuned, and aligned.

Over the past two years, a growing body of academic research has started to map exactly how LLMs form brand preferences. These aren't opinion pieces or industry reports. They're controlled experiments - some with hundreds of thousands of data points - that reveal consistent, measurable biases in how AI systems decide which brands to recommend.

I've spent the past few weeks reading everything I could find. Here's what the research actually says, and why it matters for anyone whose brand depends on being recommended.

The research landscape

The field is young but growing fast. The foundational paper dropped in early 2024, out of Princeton: GEO: Generative Engine Optimization, which established that content optimization can increase how often LLMs cite and recommend specific sources by 30-40%. That's not a marginal effect. That's the kind of number that launches an industry.

GEO: Generative Engine Optimization

Content optimization increases LLM visibility by 30-40%. Statistics (+22%) and quotations (+37%) provide the strongest lift. Lower-ranked sites benefit more from optimization than dominant ones.

Princeton / KDD 2024
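
How does a paper like that quantify "visibility" in the first place? The headline metric is, roughly, how much of a generated answer's word count traces back to each cited source, with earlier positions weighted more heavily. Here's a deliberately simplified sketch of that idea - the inline citation format, the 1/position weighting, and the example answer are all invented for illustration, and the paper's actual metric is more involved.

```python
import re
from collections import defaultdict

def citation_share(answer: str) -> dict[str, float]:
    """Position-weighted word-count share per cited source.

    Assumes citations appear inline as [n] at the end of sentences.
    The format and the 1/position decay are illustrative choices,
    not the GEO paper's exact formulation.
    """
    scores: defaultdict[str, float] = defaultdict(float)
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    for pos, sentence in enumerate(sentences, start=1):
        words = len(re.findall(r"\w+", sentence))
        for src in re.findall(r"\[(\d+)\]", sentence):
            # Earlier sentences count more toward visibility.
            scores[src] += words / pos
    total = sum(scores.values()) or 1.0
    return {src: round(s / total, 3) for src, s in scores.items()}

# Invented example answer citing two sources.
answer = (
    "Acme CRM leads the small-business market [1]. "
    "It holds a 23% share of new deployments [1]. "
    "Some reviewers prefer Zenith for its pricing [2]."
)
print(citation_share(answer))  # {'1': 0.824, '2': 0.176}
```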

Since then, researchers have been pulling at threads. Some have studied the cognitive biases embedded in LLM decision-making. Others have mapped systematic brand preferences across models. A few have explored how easy it is to manipulate recommendations through prompt engineering.

The picture that emerges is clear: AI recommendations aren't neutral, and they aren't random. They're the product of identifiable forces that brands can - and should - understand.

Three kinds of bias

Reading across a dozen papers, I see the biases clustering into three categories. Each operates at a different level, and each has different implications for brands.

Training data bias. The most intuitive kind. LLMs learn brand preferences from the web content they were trained on. Global brands like Nike and Apple appear orders of magnitude more often in training data than local or niche alternatives, so models develop a default preference for them. One study found that LLMs systematically favor global brands over local ones, recommend luxury brands to users from high-income countries, and exhibit significant country-of-origin effects.

Global is Good, Local is Bad? Understanding Brand Bias in LLMs

LLMs systematically favor global brands over local ones, recommend luxury brands to high-income country users, and exhibit significant country-of-origin effects in product recommendations.

University of South Florida / EMNLP 2024

Alignment bias. The less obvious kind. When models go through RLHF (reinforcement learning from human feedback), the fine-tuning process doesn't just make them more helpful. It narrows their recommendation pool. Research shows that alignment training causes models to overweight majority preferences, effectively reducing the diversity of what gets recommended. Dominant brands get more dominant; niche alternatives get squeezed out.

Prompt sensitivity. The most unsettling kind. Simply rephrasing a question - same intent, different words - can cause up to a 100% difference in which brands get mentioned. This isn't about clever prompt engineering. It's about synonym-level perturbation that no human would notice, producing entirely different brand recommendations.
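
To make that concrete, here's the shape of a perturbation test. The `ask_model` stub is a placeholder for whatever LLM client you'd actually use, and the prompt variants and brand watchlist are invented for illustration - this is a sketch of the method, not any particular study's code.

```python
import itertools
import re

# Placeholder for a real LLM call (OpenAI, Anthropic, Google, etc.);
# wire up your own client here. Everything below is runnable as-is.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("connect an LLM client")

# Same intent, synonym-level rewording a human would barely register.
PROMPT_VARIANTS = [
    "What's the best CRM for a small business?",
    "What's the top CRM for a small company?",
    "Which CRM would you recommend for a small business?",
    "Which CRM is ideal for a small firm?",
]

# Illustrative watchlist; a real test would extract brands with NER
# or a model-graded parser rather than a fixed list.
BRANDS = ["HubSpot", "Salesforce", "Zoho", "Pipedrive", "Attio"]

def brands_mentioned(text: str) -> set[str]:
    return {b for b in BRANDS if re.search(rf"\b{re.escape(b)}\b", text, re.I)}

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 1.0

def run_perturbation_test() -> None:
    answers = {p: brands_mentioned(ask_model(p)) for p in PROMPT_VARIANTS}
    for (p1, s1), (p2, s2) in itertools.combinations(answers.items(), 2):
        print(f"overlap {jaccard(s1, s2):.2f}: {p1!r} vs {p2!r}")
```

A pair of prompts that return completely disjoint brand sets scores an overlap of 0.0 - the "100% difference" case the research describes.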

567K
product recommendation samples analyzed across five major LLMs, revealing that different models recommend distinct products with remarkably low overlap.
Exposing Product Bias in LLM Investment Recommendation, 2025

Why this is measurable

The important thing about systematic bias is that it's systematic. If LLM brand preferences were random, there would be nothing to do except shrug. But the research consistently shows they're patterned, reproducible, and - in many cases - predictable.

This means three things for brands. First, you can measure where you stand. If you know which biases exist, you can test whether they're affecting your visibility. Second, you can compare across models. Different LLMs have different preferences - what works on ChatGPT may not work on Claude or Gemini. Third, you can track changes over time. As models update, so do their biases.

We've seen this in our own Model Divergence study, which analyzed 920K+ model comparisons and found that LLMs agree on brand rankings only 43.9% of the time. The academic literature provides the theoretical framework. The data confirms it at scale.
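
The full methodology behind that number is more involved, but the core computation is easy to illustrate. Here's a toy version using mean Jaccard overlap across model pairs - the recommendation data is invented, and this is not the exact metric behind the 43.9% figure.

```python
from itertools import combinations

# Invented example: each model's recommended brands for one prompt.
# A real study aggregates this over thousands of prompts.
recommendations = {
    "model_a": {"HubSpot", "Salesforce", "Zoho"},
    "model_b": {"HubSpot", "Pipedrive", "Attio"},
    "model_c": {"Salesforce", "HubSpot", "Zoho"},
}

def pairwise_agreement(recs: dict[str, set[str]]) -> float:
    """Mean Jaccard overlap across all model pairs."""
    pairs = list(combinations(recs.values(), 2))
    overlaps = [len(a & b) / len(a | b) for a, b in pairs]
    return sum(overlaps) / len(pairs)

print(f"{pairwise_agreement(recommendations):.1%}")  # 46.7%
```

Jaccard overlap is the simplest reasonable choice here; rank-aware measures like Kendall's tau would also capture disagreement about ordering, not just membership.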

What this series covers

This is the first in a four-part series synthesizing the academic research on AI brand biases. The remaining parts dive deeper into specific themes.


The overall picture is encouraging, if a little humbling. AI recommendations aren't a black box. The mechanics are becoming understood. But they're also more complex, more fragile, and more model-specific than most brands realize. Understanding the science is the starting point.

Mack Grenfell
Founder

Founder of Trakkr. Previously built Byword, one of the most widely-used AI writing tools. Writes about AI visibility, brand strategy, and the shifting landscape of search.

