  • Breaking Case Study: AI does not read schema; schema does not help – Mark Williams-Cook

    Posted by WebLinkr on September 15, 2025 at 1:17 pm

    As shared on LinkedIn, X, and Bluesky by LudvigHoel, Mark Williams-Cook (the Tafferboy), Barry Schwartz, and j0udini

    From Mark Williams-Cook on LinkedIn:

    LLMs work by "tokenising" content. That means taking common sequences of characters found in text and minting a unique "token" for that set. The LLM then takes billions of sample "windows" of sets of these tokens to build a prediction on what comes next.

    The image below is some example schema that has a colour change applied which represents that set of characters is a unique token as made by the GPT-4o model. What you will notice is that the schema gets "destroyed". For instance, the schema "@type": "Organization", gets broken down so there are separate tokens for "type" and "Organization", which means that in terms of tokenisation the regular words "type" and "Organization" are not distinguishable from schema.
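
    The flattening described above can be sketched with a toy tokenizer. This is a simplified illustration only: the regex split and the `toy_tokenize` name are assumptions made for this example, not the GPT-4o tokenizer, which uses byte-pair encoding and splits text differently in detail. The point it shows is that "type" and "Organization" come out as the same tokens whether they appear in schema or in ordinary prose.

```python
import re

# Hypothetical, simplified tokenizer: runs of letters, runs of digits,
# or single punctuation characters. Real LLM tokenizers (byte-pair
# encoding) differ; this only illustrates the flattening effect.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")


def toy_tokenize(text: str) -> list[str]:
    """Split text into a flat list of toy tokens."""
    return TOKEN_PATTERN.findall(text)


schema_snippet = '"@type": "Organization"'
prose_snippet = "What type of Organization is this?"

schema_tokens = toy_tokenize(schema_snippet)
prose_tokens = toy_tokenize(prose_snippet)

# Tokens from the schema snippet that also appear in the prose sentence.
shared = [t for t in schema_tokens if t in prose_tokens]

print(schema_tokens)
print(prose_tokens)
print(shared)
```

    In this sketch the quotes, the "@" and the colon become separate tokens of their own, so nothing in the token stream marks "type" as a schema key rather than a regular word.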

    From SE Roundtable:

    There are a lot of folks in the community saying that implementing structured data / schema on your pages will help you with AI Search visibility. But few have really tested it until now. And those few tests show that adding structured data / schema does not help with your visibility in AI search, at least not yet.

    The first to test this was Mark Williams-Cook who posted on LinkedIn an experiment he conducted where he posted a "visual explanation of why your favourite LLM does not use schema in their core training data." He explained how when the LLMs process the page, it actually "destroys" the schema markup and thus does not use it.

    from:
    https://www.seroundtable.com/structured-data-schema-ai-search-visibility-40099.html

  • PrimaryPositionSEO

    Guest
    September 15, 2025 at 1:38 pm

    Thrilled to see this

  • Rude_Tap2718

    Guest
    September 15, 2025 at 4:23 pm

    I’ve always suspected LLMs tokenize markup weirdly and this confirms it. Structured data works for Google’s rich results but doesn’t help AI training since tokenization destroys the schema structure.

    Classic SEO and AI search strategies are diverging more than people realize. Need completely different optimization approaches for each.

  • AbleInvestment2866

    Guest
    September 15, 2025 at 6:43 pm

    I always thought this was common knowledge, at least for anyone working with AI. Otherwise, you’d end up with biased data: just spam Schema and that’s it.

    It also goes against the very fundamentals of generative AI: multidimensional arrays of data versus a single data source. (It doesn’t even make sense as I write it!). Any introductory paper makes this clear, but I guess it’s good they found out. Not very breaking, tho, perhaps 20 years ago. (yes, I know they need views and sell ads, but indulge me with this)

    Schema has its uses, but AI is definitely not one of them.

  • peterwhitefanclub

    Guest
    September 15, 2025 at 7:18 pm

    The most ridiculous SEO specialty ever was a “schema specialist”. Oh, so you can read documentation and somehow think that’s worth people paying you for consulting without any other insight?

    No wonder those guys are struggling and trying to stay relevant by spreading misinformation. Good stuff from Mark here as usual.

  • satanzhand

    Guest
    September 16, 2025 at 12:53 am

    Cool test, but it feels a bit narrow. He’s showing how LLM tokenization flattens schema, not how Google’s AI search actually processes it. Schema still feeds into KG + retrieval systems before the LLM does its thing. Saying “schema doesn’t help” is like saying “minified JSON can’t power an app.” If people really want to believe schema is useless for SERPs, be my guest, makes my job easier.
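
    The distinction drawn here, schema being parsed as structured data before any LLM tokenization happens, can be sketched as follows. This is a hypothetical illustration using only the Python standard library; the `JsonLdExtractor` class and the sample page are assumptions made for this example, and it says nothing about how Google or any AI search product actually processes pages.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script block opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the buffered text as JSON once the script block closes.
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


# Hypothetical sample page, invented for this example.
html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><p>Example Co sells widgets.</p></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(html)
org = extractor.items[0]
print(org["@type"], org["name"])
```

    A parser like this reads "@type" as a key with a value, which is a different operation from tokenizing the raw page text. Whether any given AI search system runs such a step is not something this sketch can show.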

  • raviranjan2291

    Guest
    September 16, 2025 at 8:38 pm

    There’s no 100% guarantee that schema will work in both the organic and AI Overview results. It mostly serves to keep marketers satisfied. Webpages without schema do rank at the top with rich snippets.

  • easyedy

    Guest
    September 17, 2025 at 4:29 pm

    I just optimized a blog and also mentioned it in a separate post here. I added questions and answers with an FAQ snippet, and I’ll find out for myself how it goes.

  • Franyer_Rivas

    Guest
    September 18, 2025 at 3:28 pm

    If AI can eat hidden text for prompt injection, I don’t see why schema wouldn’t be even more useful. Anyway, it’s not like it takes a lot of work to set up structured data, so it’s better to have too much than not enough.
