Research · April 23, 2026 · 8 min read

What language models actually read on your page (eye-tracking for robots)

We instrumented our crawler to log exactly which sections of a page get pulled into model context. The results will change how you structure content.

Linnea Hartmann

Research Engineer

We built an instrumented retrieval pipeline that logs exactly which chunks of a webpage get pulled into the context window when a model answers a question about that page's topic. After two months and 1.2 million logged retrievals, the patterns are clear and actionable.

The first 200 words are read 94% of the time

If your answer is not in the first 200 words, it is effectively not on the page. This held across every CMS, every page length, and every topic vertical we tested.

The implication is uncomfortable for long-form writers. The middle and end of your article exist mostly for human readers. The model has already moved on.
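One way to check a page against this finding is a small script that tests whether a key claim falls inside the opening words. The sketch below is illustrative only: the function name `claim_in_opening` is hypothetical, it assumes you already have the page's visible text as a plain string, and it uses simple whitespace word counting, which may not match how any particular retrieval pipeline chunks text.

```python
def claim_in_opening(page_text: str, claim: str, word_limit: int = 200) -> bool:
    """Return True if `claim` appears within the first `word_limit` words.

    Matching is case-insensitive and ignores differences in whitespace.
    The 200-word default mirrors the threshold discussed above.
    """
    words = page_text.split()
    opening = " ".join(words[:word_limit]).lower()
    normalized_claim = " ".join(claim.split()).lower()
    return normalized_claim in opening


# Example with made-up page text: one claim up front, one buried far below.
sample = "Our tool cuts build time by 40%. " + "filler " * 300 + "Late claim here."
print(claim_in_opening(sample, "cuts build time by 40%"))
print(claim_in_opening(sample, "late claim here"))
```

A claim that straddles the word limit counts as missing here, which is the conservative reading.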

Headings are weighted 3x

Text inside an H2 or H3 is roughly three times more likely to be retrieved than equivalent text inside a paragraph. Models use heading hierarchy as a relevance signal in a way that crawlers historically did not.

The practical move: turn your most quotable claims into sub-headings. A claim that lives only in body copy is invisible compared to the same claim promoted to an H3.
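To see which claims a page currently exposes as sub-headings, you can list its H2 and H3 text. The following is a minimal sketch using only Python's standard-library `html.parser`; the class and function names are hypothetical, and it assumes reasonably well-formed HTML with no headings nested inside other headings.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h2> and <h3> element, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []      # list of (tag, text) tuples
        self._current = None    # heading tag currently open, if any
        self._buffer = []       # text fragments inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            self.headings.append((tag, text))
            self._current = None


def list_headings(html: str):
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


page = "<h1>Title</h1><h2>Pricing starts at $9</h2><p>Body copy.</p><h3>Setup takes <em>five</em> minutes</h3>"
for tag, text in list_headings(page):
    print(tag, text)
```

If the list that comes back contains only generic labels such as "Overview" or "Details", those are candidates for replacement with actual claims.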

Bulleted lists are over-retrieved

Lists get pulled into context 2.4x more often than equivalent prose. Models like the cleanly delimited structure. If you have a claim that could be a list, make it a list.

Three to five items is the sweet spot. Lists longer than seven items show diminishing retrieval per item.
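The item-count guidance can be checked mechanically. The sketch below counts the items in each list on a page and labels it against the thresholds above (three to five as the sweet spot, more than seven as too long). The names `ListCounter`, `classify`, and `audit_lists` are hypothetical, the labels are ours, and nested lists are counted separately from their parents.

```python
from html.parser import HTMLParser


class ListCounter(HTMLParser):
    """Count the direct <li> items of each <ul>/<ol>, including nested lists."""

    def __init__(self):
        super().__init__()
        self.counts = []    # item count per list, in the order lists close
        self._stack = []    # running item counts for lists currently open

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._stack.append(0)
        elif tag == "li" and self._stack:
            self._stack[-1] += 1

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self._stack:
            self.counts.append(self._stack.pop())


def classify(item_count: int) -> str:
    if 3 <= item_count <= 5:
        return "sweet spot"
    if item_count > 7:
        return "too long"
    return "ok"


def audit_lists(html: str):
    parser = ListCounter()
    parser.feed(html)
    parser.close()
    return [(n, classify(n)) for n in parser.counts]


page = "<ul><li>a</li><li>b</li><li>c</li><li>d</li></ul><ol>" + "<li>x</li>" * 9 + "</ol>"
print(audit_lists(page))
```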

Tables are gold for comparative queries

For "X vs Y" prompts, a clean HTML table on your page increases your citation rate by 5.8x. Models can lift table rows almost verbatim into comparative answers, and they do so eagerly.

If you sell anything that competes against alternatives, you need at least one well-built comparison table on your site.
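As an illustration of what a clean comparison table might look like, the sketch below generates plain HTML with a header row and row labels. The function name `comparison_table` and the product data are placeholders, not from our dataset, and the markup (a `thead`, a `tbody`, and `scope` attributes on header cells) is one reasonable convention rather than a tested requirement for citation.

```python
from html import escape


def comparison_table(columns, rows):
    """Render a simple HTML comparison table.

    `columns` is a list of header labels; each row is a list whose first
    value is the row label and whose remaining values are the cells.
    """
    head = "".join(f'<th scope="col">{escape(c)}</th>' for c in columns)
    body = []
    for label, *values in rows:
        cells = f'<th scope="row">{escape(label)}</th>'
        cells += "".join(f"<td>{escape(v)}</td>" for v in values)
        body.append(f"<tr>{cells}</tr>")
    return (
        "<table><thead><tr>" + head + "</tr></thead>"
        "<tbody>" + "".join(body) + "</tbody></table>"
    )


# Placeholder data for illustration only.
html_table = comparison_table(
    ["Feature", "Product A", "Product B"],
    [["Free tier", "Yes", "No"], ["SSO & SAML", "Add-on", "Included"]],
)
print(html_table)
```

Keeping each cell short and self-explanatory matters more than styling, since a row needs to make sense when lifted out of the page.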

Footers, sidebars, and nav are ignored

We logged near-zero retrievals from page chrome. You do not need to optimize your footer for GEO (generative engine optimization). You do need to make sure your main content area is not buried beneath promotional modules, cookie banners, or hero images that push the actual content below the first viewport.
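A rough way to spot buried content is to count how many words of visible text appear in the HTML before the main content starts. The sketch below assumes the page marks its primary content with a `<main>` element, which many pages do not, and the class name `PreMainWordCounter` is hypothetical. It measures text only, so it is a proxy: it will not catch a hero image that takes up the first viewport.

```python
from html.parser import HTMLParser


class PreMainWordCounter(HTMLParser):
    """Count words of text that appear before the first <main> element."""

    SKIP = {"script", "style"}   # contents are not visible text

    def __init__(self):
        super().__init__()
        self.words_before_main = 0
        self.found_main = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.found_main = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self.found_main and not self._skip_depth:
            self.words_before_main += len(data.split())


def words_before_main(html: str):
    """Return the word count before <main>, or None if no <main> exists."""
    parser = PreMainWordCounter()
    parser.feed(html)
    parser.close()
    return parser.words_before_main if parser.found_main else None


page = (
    "<body><nav>Home Pricing Blog</nav><div>Accept all cookies</div>"
    "<script>var a = 1;</script><main><h1>Title</h1></main>"
    "<footer>Copyright</footer></body>"
)
print(words_before_main(page))
```

A large count suggests that navigation, banners, or promotional modules sit ahead of the content in source order and may be worth trimming or moving.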

The five-edit audit

Pick your three highest-intent pages. For each one, do five things this week:

1. Move the strongest claim into the first 80 words.
2. Promote two body claims to H3 sub-headings.
3. Convert one paragraph into a 4-item bulleted list.
4. Add or improve one comparison table if relevant.
5. Cut any chrome that pushes content below the first viewport.

In our customer panel this five-edit audit produced a median 28% citation lift inside 21 days. It is the highest-leverage GEO work most teams can do in an afternoon.
