Page Grounding Probe [Free AI SEO Tool] by DEJAN SEO

Posted by WebLinkr on March 11, 2026 at 3:03 pm
How Google’s Grounding Pipeline Works

DEJAN reverse-engineered Google’s Gemini grounding pipeline by examining raw groundingSupports and groundingChunks from the API. The pipeline operates in this sequence:
1. User enters a prompt.
2. Query fanout: A model decomposes the prompt into single-intent sub-queries (fanout queries).
3. Retrieval: For each fanout query, Google’s search index returns ranked results, narrowed to ~5–20 sources per query.
4. Extractive summarization (snippet construction): For each selected result, the system builds a grounding snippet. Page content is chunked into sentences, each scored against the query, and the highest-scoring chunks are assembled into the snippet — joined by ellipses where non-contiguous.
5. Grounding context assembly: All snippets across all sources are supplied to the model as context alongside the user prompt, media, and personalization signals.
6. Synthesis & attribution: The model generates its answer, and each claim is attributed back to specific source sentences.
Key insight: Because snippets are query-dependent, the same page yields different extractions for different fanout queries.
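
Step 4 and the key insight can be illustrated with a toy sketch. This is not Google's scoring function: a simple token-overlap score stands in for whatever semantic relevance model is actually used, and the names `split_sentences`, `score`, `build_snippet`, `top_k`, and the sample page are all hypothetical.

```python
import re

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(sentence, query):
    # Assumed stand-in for semantic relevance: share of query tokens
    # that also appear in the sentence.
    q = tokens(query)
    return len(q & tokens(sentence)) / len(q) if q else 0.0

def build_snippet(page_text, query, top_k=2):
    sentences = split_sentences(page_text)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i], query),
                    reverse=True)
    keep = sorted(ranked[:top_k])  # restore original page order
    parts = []
    for pos, idx in enumerate(keep):
        if pos > 0:
            # Adjacent sentences are joined plainly; gaps get an ellipsis.
            parts.append(" " if idx == keep[pos - 1] + 1 else " ... ")
        parts.append(sentences[idx])
    return "".join(parts)

PAGE = (
    "Our bakery opened in 1990. "
    "Sourdough bread needs a long fermentation. "
    "We also sell coffee. "
    "A sourdough starter must be fed daily."
)

if __name__ == "__main__":
    # Same page, two fanout queries, two different snippets.
    print(build_snippet(PAGE, "sourdough starter feeding"))
    print(build_snippet(PAGE, "bakery coffee"))
```

Running it shows the same page producing a different extraction for each query, which is the behavior described above.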

The Extraction Method: Extractive Summarization

Google uses extractive (not abstractive) summarization for grounding. This means it pulls exact sentences from your page — it does not rewrite or paraphrase your content for the grounding context.

Observed Extraction Characteristics
- Query-focused selection: Sentences semantically close to the query are strongly preferred. Unrelated sections on the same page are skipped entirely.
- Heavy positional/lead bias: Opening paragraphs are extracted almost wholesale, regardless of content.
- Structural noise ingestion: Table-of-contents entries, section headers, link artifacts, and ¶ markers are treated as sentences and scored alongside prose.
- Sentence-level granularity: The extraction unit is individual sentences, not passages or paragraphs.
- Confidence scores: Per-chunk scores range from 0.1 to 1.0, representing grounding-source-to-generative-chunk relevance.
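
To inspect attributions and scores yourself, something like the following may help. It is a minimal sketch over a hand-written sample: only `groundingSupports` and `groundingChunks` are named in the post, while the nested fields (`web`, `uri`, `segment`, `groundingChunkIndices`, `confidenceScores`) are assumptions about the response shape, and the values are invented.

```python
# Hypothetical grounding metadata fragment; values are made up.
SAMPLE = {
    "groundingChunks": [
        {"web": {"uri": "https://example.com/a", "title": "Page A"}},
        {"web": {"uri": "https://example.com/b", "title": "Page B"}},
    ],
    "groundingSupports": [
        {
            "segment": {"text": "Claim one."},
            "groundingChunkIndices": [0],
            "confidenceScores": [0.9],
        },
        {
            "segment": {"text": "Claim two."},
            "groundingChunkIndices": [0, 1],
            "confidenceScores": [0.4, 0.7],
        },
    ],
}

def attributions(metadata):
    """Flatten supports into (claim text, source URI, score) rows."""
    rows = []
    chunks = metadata.get("groundingChunks", [])
    for support in metadata.get("groundingSupports", []):
        text = support.get("segment", {}).get("text", "")
        indices = support.get("groundingChunkIndices", [])
        scores = support.get("confidenceScores", [])
        for pos, chunk_index in enumerate(indices):
            uri = chunks[chunk_index].get("web", {}).get("uri", "")
            # Scores may be absent, so fall back to None.
            score = scores[pos] if pos < len(scores) else None
            rows.append((text, uri, score))
    return rows

if __name__ == "__main__":
    for text, uri, score in attributions(SAMPLE):
        print(f"{score}\t{uri}\t{text}")
```

Each row pairs one generated claim with one source and its score, so a claim backed by two sources appears twice.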
DEJAN successfully fine-tuned mic

Source: https://dejan.ai/blog/sro-grounding-snippets/

Bot/Cloudflare Notes

Check your robots.txt:

User-agent: DataForSeoBot
Allow: /

User Agent String: Mozilla/5.0 (compatible; DataForSeoBot/1.0; +https://dataforseo.com/dataforseo-bot)

The bot obeys robots.txt rules and crawl-delay directives.
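
A quick way to check whether a given robots.txt lets the bot in is Python's standard `urllib.robotparser`. This is a sketch only: the helper name `bot_access`, the two sample rule sets, and the example.com URL are hypothetical. It tests the rules as written and says nothing about firewall or Cloudflare blocking.

```python
from urllib.robotparser import RobotFileParser

def bot_access(robots_txt, user_agent, url):
    """Return (allowed, crawl_delay) for a user-agent token and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)

# The rule set from the post, plus a blocking variant for comparison.
ALLOWING = "User-agent: DataForSeoBot\nAllow: /\n"
BLOCKING = "User-agent: DataForSeoBot\nDisallow: /\nCrawl-delay: 10\n"

if __name__ == "__main__":
    # Pass the bot's product token, not the full Mozilla/5.0 string:
    # the parser matches on the part before the first "/".
    print(bot_access(ALLOWING, "DataForSeoBot", "https://example.com/page"))
    print(bot_access(BLOCKING, "DataForSeoBot", "https://example.com/page"))
```

To test a live site, `RobotFileParser.set_url()` followed by `read()` fetches the file instead of parsing a string.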

PrimaryPositionSEO

Guest
March 11, 2026 at 3:03 pm

Oh, this is cool

yekedero

Guest
March 11, 2026 at 3:26 pm

Query fan out, ah yes… I will drink to that.

Life’s too short to be sitting around miserable.

Lucifer_x7

Guest
March 11, 2026 at 3:59 pm

I thought there was no tool promotion in this sub? Free or paid.
