I Tested Five AI Shopping Assistants. Merchants Won’t Like What I Found.
For two weeks I ran every significant purchase through five different AI shopping assistants — ChatGPT shopping, Perplexity shopping, Google’s AI shopping, a Claude-based agent I set up myself, and a commercial purchasing tool one of my suppliers uses. I tracked where each one succeeded, where each failed, and what patterns emerged across the failures.
I was looking for insights for my own business. What I found was a stress test of merchant infrastructure that most merchants have no idea they’re failing.
The Consistent Winners
Across all five assistants, a clear pattern emerged: big-box retailers with sophisticated product data infrastructure performed well. Amazon, obviously, but also Grainger, Home Depot’s commercial side, and a few industrial distributors who’d clearly invested in their API layer. Searches completed. Prices were accurate. Availability was reliable. Checkout either completed or handed off cleanly.
The assistants weren’t equally capable — some were dramatically better at navigating complexity and handling edge cases. But the merchants who performed well tended to perform well across all five. That tells you something: the bottleneck was data quality and checkout architecture, not AI capability.
The Consistent Failures
Mid-market merchants — the $10M to $200M revenue range — failed frequently and in predictable ways.
Product descriptions built for human reading, not machine parsing. When I asked an assistant to find a pump with specific pressure ratings and flow characteristics, it would surface results from large distributors who structured that data as searchable attributes, not from specialty vendors whose knowledge existed only in paragraph form buried in a product page.
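To make the difference concrete, here is a minimal sketch of how an agent-style attribute filter might treat the two kinds of listing. The `Product` class, the attribute names (`max_pressure_psi`, `flow_gpm`), and the SKUs are all hypothetical and made up for illustration; real assistants almost certainly do something more elaborate. The point is only that a spec that lives in a paragraph never enters the comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    sku: str
    description: str
    # Machine-readable specs; empty when the merchant only wrote prose.
    attributes: dict[str, float] = field(default_factory=dict)


def matches(product: Product, requirements: dict[str, float]) -> bool:
    """True only if every required attribute is present and meets its minimum."""
    return all(
        name in product.attributes and product.attributes[name] >= minimum
        for name, minimum in requirements.items()
    )


catalog = [
    # Large distributor: specs exposed as searchable attributes.
    Product("DIST-100", "Centrifugal pump.",
            {"max_pressure_psi": 150.0, "flow_gpm": 40.0}),
    # Specialty vendor: a better pump on paper, but the specs exist only as text.
    Product("SPEC-200", "Handles up to 160 PSI at 45 gallons per minute."),
]

requirements = {"max_pressure_psi": 120.0, "flow_gpm": 35.0}
found = [p.sku for p in catalog if matches(p, requirements)]
print(found)  # ['DIST-100']
```

The specialty vendor’s pump would have met the requirement, but the filter has nothing structured to compare against, so it never appears in the results.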
Pricing that changes between browse and commit. Three times across my testing I had an assistant quote a price, confirm availability, and then hit a checkout that showed a different number — higher by anywhere from 8% to 31%. Two of the three were just stale cache issues. The third was a “web price vs. logged-in customer price” distinction the merchant hadn’t accounted for in their agent interface. All three resulted in abandoned transactions.
Shipping commitments that don’t survive checkout. “Typically ships in 3 days” in the product data, then a checkout screen showing 7-10 business days. Every assistant I tested flagged this as a data integrity failure and escalated or abandoned.
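The pricing and shipping failures both come down to the same check: does what the agent saw while browsing still hold at checkout? A rough sketch of that comparison follows. The `Offer` class, the `checkout_problems` function, and the 1% price tolerance are assumptions for illustration, not how any of the five assistants actually works.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Offer:
    price: Decimal
    ship_days_max: int  # latest promised ship time, in days


def checkout_problems(browsed: Offer, checkout: Offer,
                      price_tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Compare the browse-time offer with the checkout offer; list any mismatches."""
    problems = []
    # Relative price increase between browse and commit.
    drift = (checkout.price - browsed.price) / browsed.price
    if drift > price_tolerance:  # hypothetical 1% tolerance
        problems.append(f"price rose {drift * 100:.0f}%")
    if checkout.ship_days_max > browsed.ship_days_max:
        problems.append(
            f"shipping slipped from {browsed.ship_days_max} "
            f"to {checkout.ship_days_max} days"
        )
    return problems


browsed = Offer(price=Decimal("100.00"), ship_days_max=3)
at_checkout = Offer(price=Decimal("108.00"), ship_days_max=10)
print(checkout_problems(browsed, at_checkout))
```

A merchant can run the same comparison against their own product feed and checkout before an agent does it for them. Any non-empty result is a transaction an agent would likely escalate or abandon.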
The Category That Surprised Me Most
I expected luxury and specialty goods to perform poorly — complex products, high-touch sales models, lots of human judgment involved. Some did fail. But the worst performers were mid-market commodity suppliers. Companies selling standard industrial products, office supplies, building materials. Stuff where there’s no complexity justification for the data mess.
These companies have spent 20 years optimizing for human search: good photography, persuasive copy, competitive pricing. They haven’t touched their data architecture because it’s never mattered before. Now it’s the thing that determines whether an AI agent that’s buying exactly what they sell ever reaches them at all.
What This Means for Your Business
If you’re a merchant and you want to understand your exposure, here’s a quick diagnostic: have someone use an AI shopping assistant to try to buy your most popular product. Don’t help them. Don’t explain anything. Just watch.
If the assistant can find you, get an accurate price, confirm availability, and reach a checkout that completes — you’re okay for now. If any of those four things fails, you have a specific, fixable problem that is costing you agent-driven transactions today.
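If you want to record the result of that exercise, the four checks reduce to an ordered checklist. This is a minimal sketch; the check names and the `diagnose` function are invented here for illustration. It reports the first step that failed, since each later step depends on the ones before it.

```python
# The four checks from the diagnostic, in the order an agent hits them.
CHECKS = ["found", "accurate_price", "availability_confirmed", "checkout_completed"]


def diagnose(results: dict[str, bool]) -> str:
    """Return the first failed check, or a pass message if all four succeeded."""
    for check in CHECKS:
        # A check that was never reached counts as a failure.
        if not results.get(check, False):
            return f"fix: {check}"
    return "ok for now"


observed = {
    "found": True,
    "accurate_price": True,
    "availability_confirmed": False,
    "checkout_completed": False,
}
print(diagnose(observed))  # fix: availability_confirmed
```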
The AI shopping assistant landscape is going to consolidate and improve dramatically in the next 18 months. The agents are getting better faster than merchant infrastructure is. If you don’t fix the foundation before the agents mature, you’ll be invisible to a purchasing channel that’s about to become significant.
I’m saying this as someone who now uses AI agents for about 40% of my business purchasing. The suppliers that performed well in these tests got added to my agent’s preferred vendor list. The ones that failed aren’t getting a second chance: the agent will route around them automatically, every time, without anyone making a decision about it.
That’s the quiet way market share moves now. No press release. No announcement. Just a routing preference baked into a model.