Anatomy of an AI citation: why some pages get quoted and others get ignored
We analyzed 48,000 answers across five major AI engines. The pages that get cited share six structural traits. None of them are about keyword density.

Over the last quarter we ran 48,000 prompts across ChatGPT, Gemini, Perplexity, Claude, and Google AI Overviews, captured every citation, and reverse-engineered what made a source quotable. Six statistically significant patterns emerged.
1. First-person data beats third-person summary
Pages reporting original numbers were cited 4.3x more often than pages summarizing other people's research. "We surveyed 1,200 marketers and found 68% have no AI visibility tracking" wins against "Studies show marketers struggle with AI visibility."
The model wants the receipt, not the rumor.
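For readers who want to run the same comparison on their own citation logs, here is a minimal sketch of how an "N times more often" multiple can be computed. The function names and the counts in the usage note are hypothetical illustrations, not our dataset or our exact methodology.

```python
def citation_rate(cited: int, total: int) -> float:
    """Share of tracked pages in a group that were cited at least once."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cited / total


def citation_lift(cited_a: int, total_a: int, cited_b: int, total_b: int) -> float:
    """How many times more often group A is cited than group B."""
    baseline = citation_rate(cited_b, total_b)
    if baseline == 0:
        raise ValueError("baseline group has no citations; lift is undefined")
    return citation_rate(cited_a, total_a) / baseline
```

As a made-up example, if 60 of 400 original-data pages were cited and 20 of 400 summary pages were cited, `citation_lift(60, 400, 20, 400)` comes out at roughly 3, meaning a 3x lift.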
2. The lede answers the question in one sentence
If the first 40 words of your page do not answer the question a user might ask, you are invisible. Models truncate. They sample. They rarely read past the first viewport.
Bury the lede and you bury the citation.
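A quick way to audit this is to pull the first 40 words of a page and check whether the terms that make up the answer actually appear there. The sketch below is a rough heuristic under that assumption. The helper names `lede` and `lede_covers` are hypothetical, and simple substring matching is a stand-in for whatever an AI engine really does.

```python
def lede(text: str, max_words: int = 40) -> str:
    """Return the first max_words whitespace-separated words of the page text."""
    return " ".join(text.split()[:max_words])


def lede_covers(text: str, required_terms: list[str], max_words: int = 40) -> bool:
    """True if every required term appears (case-insensitively) in the lede."""
    opening = lede(text, max_words).lower()
    return all(term.lower() in opening for term in required_terms)
```

Call `lede_covers(page_text, ["original data", "cited"])` with the phrases you expect in a one-sentence answer. A `False` result suggests the answer sits too far down the page.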
3. Specific numbers outrank round numbers
"Reduced churn by 23.4%" gets cited 2.1x more often than "reduced churn significantly." Specificity reads as credibility to the ranker, even when the underlying claim is identical in spirit.
4. Author bylines with credentials matter again
Pages with a named author and a one-line credential were cited 38% more often than anonymous pages on the same topic. The model is doing a soft authority check, and a one-line author bio is the easiest signal to send.
5. Structure beats prose
Pages with H2-H3 hierarchies and crisp, scannable subheads were cited 31% more often than equally well-written wall-of-text articles. Models lift entire sub-sections as quotable chunks. Make the chunks obvious.
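To see your page the way a chunking pipeline might, split it at each H2 and H3 and look at what each piece says on its own. The sketch below assumes the page source is Markdown, where `##` and `###` mark H2 and H3. The function name `split_into_chunks` is hypothetical, and real engines almost certainly chunk differently.

```python
import re

# Matches Markdown H2 ("## ") and H3 ("### ") heading lines only.
HEADING = re.compile(r"^(#{2,3})\s+(.*)$")


def split_into_chunks(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs, one per H2/H3 section."""
    chunks: list[tuple[str, str]] = []
    title = None
    body: list[str] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous section, if there was one.
            if title is not None:
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2).strip(), []
        elif title is not None:
            body.append(line)
        # Text before the first H2/H3 is ignored in this sketch.
    if title is not None:
        chunks.append((title, "\n".join(body).strip()))
    return chunks
```

If a chunk's body makes no sense without the paragraphs above it, that sub-section is unlikely to work as a standalone quote.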
6. Freshness compounds
A page updated in the last 90 days is 2.7x more likely to be cited than the same page last updated 14 months ago, holding all other factors constant. The lesson is not "publish more." The lesson is "revisit your best pages."
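Turning that into a routine can be as simple as flagging every page whose last update falls outside the 90-day window. Here is a minimal sketch, assuming you can export a mapping of URL to last-updated date from your CMS. The function name `stale_pages` and the example paths are hypothetical.

```python
from datetime import date, timedelta


def stale_pages(last_updated: dict[str, date], today: date, max_age_days: int = 90) -> list[str]:
    """Return URLs whose last update is older than max_age_days, sorted alphabetically."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(url for url, updated in last_updated.items() if updated < cutoff)
```

For example, `stale_pages({"/pricing-guide": date(2024, 4, 30)}, today=date(2025, 6, 30))` flags `/pricing-guide` as due for a refresh.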
What this means for your editorial calendar
Stop publishing five mediocre posts a month. Publish one excellent post with original research, then update it quarterly with new data and a fresh timestamp. The compounding effect on citations is dramatic, and the cost is a fraction of a traditional content cadence.
We will publish the full dataset and methodology next month. Subscribe below if you want the raw query log.


