Your Product Data Isn’t Just for Humans Anymore
Fixing your product hierarchy today could set you up for the agentic web tomorrow.
We need to talk about product data.
Not the glamorous, marketing-ready product copy. Not the assets that make a PLP sing. The boring stuff. The attributes. The hierarchy. The product types and parent SKUs. The behind-the-scenes data that makes the modern retail machine run.
Because here’s the truth: your product data isn’t just powering your website or your ERP anymore. It’s becoming your interface. And not just for your customers, but for the AIs they’ll increasingly use to shop, research, and transact on their behalf.
The web is changing. Quietly. Radically.
With projects like Microsoft’s NLWeb, we’re seeing the early infrastructure of the so-called agentic web take shape: a future where users talk to digital agents, and those agents talk directly to websites. Not by parsing HTML or crawling badly tagged content, but by asking structured questions and receiving structured answers.
NLWeb is one of the first steps toward making this a reality. It connects the content and structure of your website with a natural language interface. If your site has well-organised product data marked up with the Schema.org vocabulary (typically as JSON-LD), suddenly, it’s not just human-friendly. It’s agent-friendly.
What this means is simple: in the near future, your product data will be your API for the world.
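To make that concrete, here is a minimal sketch of turning an internal product record into Schema.org Product markup serialised as JSON-LD. The input field names (category_path, min_age_years and so on) and the sample record are assumptions for illustration, not a standard PIM schema; the output properties are taken from the Schema.org vocabulary.

```python
import json


def product_to_jsonld(product: dict) -> dict:
    """Map an internal record (hypothetical field names) to a Schema.org Product."""
    availability = "InStock" if product["in_stock"] else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "sku": product["sku"],
        "name": product["name"],
        # Express the hierarchy explicitly instead of a single flat label.
        "category": " > ".join(product["category_path"]),
        "color": product["colour"],
        "material": product["material"],
        "audience": {
            "@type": "PeopleAudience",
            "suggestedMinAge": product["min_age_years"],
            "suggestedMaxAge": product["max_age_years"],
        },
        "offers": {
            "@type": "Offer",
            "price": f'{product["price"]:.2f}',
            "priceCurrency": product["currency"],
            "availability": f"https://schema.org/{availability}",
        },
    }


# Illustrative record only; not real catalogue data.
record = {
    "sku": "TOY-001",
    "name": "Wooden stacking rings",
    "category_path": ["Toys", "Wooden Toys", "Stacking Toys"],
    "colour": "multicolour",
    "material": "beech wood",
    "min_age_years": 1,
    "max_age_years": 4,
    "price": 24,
    "currency": "GBP",
    "in_stock": True,
}

jsonld = product_to_jsonld(record)
print(json.dumps(jsonld, indent=2))
```

The printed JSON is what would typically sit inside a script tag of type application/ld+json on the product page, where an agent can read it without scraping the layout.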
Structured product data: from afterthought to interface
The data model you use inside your PIM or ERP today might have been shaped by operational convenience. What’s the minimum required to generate a product page? What gets the job done for inventory? For merchandising? For returns?
But those same models are now being exposed (intentionally or not) to the outside world. Through the Model Context Protocol (MCP), through AI interfaces, through federated agents acting on your customer’s behalf.
And they’re revealing just how shallow or inconsistent most brands’ product data really is.
One product is tagged as “boots”, another as “ankle boots”, and another as “footwear”
Sizes are listed as "S, M, L" in one category, and "UK 6, UK 8, UK 10" in another
Colour isn’t a clean attribute; it’s buried in the title
Hierarchies are so flat that everything from jeans to socks sits under "clothing"
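The clean-up for problems like these can be sketched in a few small functions. Everything named here is hypothetical: the category vocabulary, the list of known colours and the size systems are stand-ins for whatever your own taxonomy defines. Note that the sketch only labels which size system a value belongs to; it does not attempt to convert between systems.

```python
import re
from typing import Optional

# Hypothetical controlled vocabulary: raw label -> canonical category path.
PRODUCT_TYPE_PATHS = {
    "footwear": ("Footwear",),
    "boots": ("Footwear", "Boots"),
    "ankle boots": ("Footwear", "Boots", "Ankle Boots"),
    "jeans": ("Clothing", "Bottoms", "Jeans"),
    "socks": ("Clothing", "Socks"),
}

# Hypothetical colour list; a real one would come from your PIM.
KNOWN_COLOURS = ("black", "brown", "navy", "red", "white")
ALPHA_SIZES = {"XS", "S", "M", "L", "XL"}


def canonical_path(raw_type: str) -> tuple:
    """Resolve a free-text product type to one place in the hierarchy."""
    return PRODUCT_TYPE_PATHS.get(raw_type.strip().lower(), ("Uncategorised",))


def parse_size(raw: str) -> dict:
    """Record which size system a value uses, without converting it."""
    text = raw.strip().upper()
    match = re.fullmatch(r"(UK|EU|US)\s*(\d+(?:\.\d+)?)", text)
    if match:
        return {"system": match.group(1), "value": float(match.group(2))}
    if text in ALPHA_SIZES:
        return {"system": "ALPHA", "value": text}
    return {"system": "UNKNOWN", "value": raw.strip()}


def extract_colour(title: str) -> Optional[str]:
    """Pull a known colour out of a product title, if one is present."""
    words = re.findall(r"[a-z]+", title.lower())
    for colour in KNOWN_COLOURS:
        if colour in words:
            return colour
    return None


print(canonical_path(" Ankle Boots "))
print(parse_size("uk 6"), parse_size("m"))
print(extract_colour("Chelsea Ankle Boots - Brown Leather"))
```

Anything that falls through to "Uncategorised" or "UNKNOWN" is a candidate for a human to review, which is usually more useful than guessing.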
This isn’t just a headache for internal teams anymore. It’s going to make your brand invisible to the next generation of shoppers and their agents.
Imagine a shopper asking their assistant: "What’s the best gift for a 3-year-old that doesn’t need batteries?"
A human might eventually find it on your site.
An AI assistant? Not unless your data is structured enough to:
Know which products are for toddlers
Understand that "no batteries" implies no electronics
Connect "gift" with price point, packaging, and availability
If those things exist only as marketing fluff, or worse, not at all, you're not in the conversation.
And in the agentic web, not being in the conversation is the same as not existing.
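Here is a rough sketch of what answering that question looks like once the facts exist as attributes rather than copy. The Product fields and the sample catalogue are invented for illustration, and the "gift" intent is reduced to a price ceiling and availability; a real agent would weigh far more signals.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical attribute names; the point is that each is explicit.
    name: str
    min_age_years: int
    max_age_years: int
    requires_batteries: bool
    price: float
    in_stock: bool


def gift_candidates(catalogue, recipient_age, max_price):
    """Keep products that fit the age, need no batteries, and can be bought now."""
    return [
        p
        for p in catalogue
        if p.min_age_years <= recipient_age <= p.max_age_years
        and not p.requires_batteries
        and p.price <= max_price
        and p.in_stock
    ]


# Illustrative catalogue only.
catalogue = [
    Product("Wooden stacking rings", 1, 4, False, 18.0, True),
    Product("Talking robot", 3, 6, True, 35.0, True),
    Product("Chemistry set", 8, 12, False, 30.0, True),
    Product("Picture book box set", 2, 5, False, 22.0, False),
]

matches = gift_candidates(catalogue, recipient_age=3, max_price=40.0)
print([p.name for p in matches])
```

With this sample data only the stacking rings come back: the robot needs batteries, the chemistry set is for older children, and the book set is out of stock. None of those exclusions is possible if age range and power source live only in a product description.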
This isn’t sci-fi. It’s already happening.
Look at the early adopters of NLWeb: Tripadvisor, Shopify, Hearst, O'Reilly. These aren’t fringe players. These are companies investing in making their data accessible to both humans and machines. Not because it’s cool, but because it will soon be commercial suicide not to.
In the same way HTML turned documents into webpages, Schema.org markup and structured data will turn websites into AI interfaces. If your product data isn’t ready for that, you’re essentially shipping a black box to a world built on glass.
So what does ‘ready’ look like?
At Commerce Thinking, we spend a lot of time helping brands clean up their product data for internal alignment. But increasingly, the goal isn’t just system interoperability. It’s semantic clarity. Here’s what that looks like:
Consistent product types and variants across categories and feeds
Clean, normalised attributes (like colour, material, fit, intended age group)
Explicit hierarchies that make sense to a machine, not just a merchandiser
Schema.org markup (as JSON-LD) baked into your site architecture
System-level data governance so these structures don’t degrade over time
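Governance in particular is easier to picture as a check that runs every time a record changes. The sketch below assumes a hypothetical set of required attributes, allowed values and a minimum hierarchy depth; your own rules would replace all three.

```python
# Hypothetical governance rules; replace with your own taxonomy.
REQUIRED_ATTRIBUTES = ("sku", "product_type", "colour", "material", "age_group")
ALLOWED_VALUES = {
    "colour": {"black", "brown", "navy", "red", "white"},
    "age_group": {"infant", "toddler", "kids", "adult"},
}
MIN_HIERARCHY_DEPTH = 2


def audit_record(record: dict) -> list:
    """Return a list of human-readable problems found in one product record."""
    issues = []
    for attr in REQUIRED_ATTRIBUTES:
        if not record.get(attr):
            issues.append(f"missing attribute: {attr}")
    for attr, allowed in ALLOWED_VALUES.items():
        value = record.get(attr)
        if value and value not in allowed:
            issues.append(f"non-standard {attr}: {value!r}")
    if len(record.get("category_path", ())) < MIN_HIERARCHY_DEPTH:
        issues.append("category path too shallow")
    return issues


# Illustrative records only.
clean = {
    "sku": "B-1",
    "product_type": "ankle boots",
    "colour": "brown",
    "material": "leather",
    "age_group": "adult",
    "category_path": ["Footwear", "Boots", "Ankle Boots"],
}
messy = {
    "sku": "B-2",
    "product_type": "boots",
    "colour": "Chestnut",
    "material": "",
    "age_group": "adult",
    "category_path": ["Clothing"],
}

print(audit_record(clean))
print(audit_record(messy))
```

Run on import or on save, a check like this stops the structure degrading one exception at a time. The clean record returns an empty list; the messy one is flagged for its missing material, its off-vocabulary colour and its flat category.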
Your data is either a bridge or a wall
We’ve long said that fixing your product data isn’t about perfection, it’s about removing friction. But now, there’s a new type of friction to worry about: the gap between your site and the systems that will increasingly interface with it.
And the gap isn’t closed with more filters or better CMS copy. It’s closed by turning your product data into a structured, queryable, machine-friendly asset.
It’s not enough to be searchable.
You have to be understandable.
And that starts in your PIM. In your ERP. In the bones of your systems, where product data is born and mismanaged.
Let’s fix the foundations.
If you’ve already been working on your ERP and PIM setup, good. You’re halfway there.
But let’s be clear: good internal data doesn’t guarantee external discoverability. That requires a shift in mindset. From thinking of product data as something you use, to thinking of it as something you expose.
NLWeb and the agentic web will reward brands that think ahead. That invest in clarity. That know clean product data isn’t just good operations; it’s good strategy.
We can help you get there.
Your next customer might not be a person.
Make sure their assistant can find you.