
Six Months In: What We've Learned Watching Real Agentic Shoppers

Mid-2026 reflection. The January predictions that are tracking, the ones we got wrong on timing, what surprised us, and what the merchants who are winning are doing differently.

DATA · JUN 2026

It’s June. We promised in January that we’d grade our 2026 predictions honestly at the mid-year mark. We also said we’d share patterns we couldn’t have predicted six months ago. This is that post.

The honest framing matters here. The agentic commerce category is moving fast enough that a one-page prediction sheet from January is going to look both right and wrong by June. Our intent isn’t to claim we called everything correctly. It’s to publicly say where we were directionally right, where we were wrong, what surprised us, and what we’re updating against for the second half of the year.

If you’re a merchant reading this, the most useful section is probably the “what merchants who are winning are doing differently” part toward the end. The retrospective context matters, but the actionable read is the pattern observation.

Grading our January predictions

We made five predictions in early January. Here’s how each is tracking at the mid-year mark.

Prediction 1: AI assistants will overtake Google for high-intent product queries by Q4 2026. Tracking ahead of schedule for some categories, slower for others. The comparison-shape queries we identified as the leading edge are showing meaningful shift toward agentic surfaces in electronics, beauty, and outdoor categories. Apparel is mixed. CPG hasn’t moved much. The aggregate “high-intent commercial query share” metric won’t have a clean cross-over by Q4 the way we framed it, but the category-specific picture for the categories where it matters most is on track or ahead. Grading this directionally correct, with the caveat that the timing is category-dependent in a way we under-specified.

Prediction 2: Agentic checkout becomes default for mid-sized baskets. The mid-year reality is that the share of in-conversation checkout completion is meaningfully smaller than we expected. The capability is there. The buyer behavior is changing more slowly. Buyers who try in-conversation checkout once are using it again, but the share who’ve tried it is smaller than our January model assumed. The “default” framing in our January post implied a faster behavior change than what we’re seeing. Updating to: in-conversation checkout will be a meaningful share by year-end, with the largest shares in specific categories where the agent experience is most polished, but it won’t be the dominant path for most baskets in 2026. This is the prediction we got most aggressively wrong on timing.

Prediction 3: llms.txt mainstreams as a job-description line item. This is tracking strongly. Job postings for senior SEO, content, and e-commerce roles increasingly call out llms.txt ownership. Several enterprise brands have hired specifically for “agentic content strategy” roles. The category isn’t fully mainstream yet (it’s still a leading-edge signal, not a default one), but the trajectory is clear and the rate of adoption is faster than we expected. Grading directionally correct, with timing slightly ahead of plan.

Prediction 4: “Agent SEO” emerges as a discipline distinct from traditional SEO. Tracking, with caveats. The big SEO conferences (BrightonSEO and SMX in Q1, MozCon in Q2) all had Agent SEO tracks for the first time. The vendor landscape has consolidated around a smaller number of credible players. There’s a “what Agent SEO actually requires” debate happening in the practitioner community that resembles the early SEO professionalization arguments from 2003 to 2005. The discipline is forming. Grading correct on direction, with the discipline still earlier-stage than the headline of our January post implied.

Prediction 5: Walled gardens vs. open commerce becomes the year’s defining tension. This is the one we’re most undecided on. There’s clear divergence happening, but the “defining tension” framing might be overstated. The walled gardens (Amazon, Walmart, Apple) have grown their agent surfaces. Take rates are getting clearer. Brands that depend heavily on those surfaces are starting to ask hard questions about unit economics. But “defining tension” implies a moment of strategic clarity that hasn’t arrived. We’re updating this to a milder framing: the divergence is real and is going to be one of the things merchants think about, but it isn’t the single thing that defines 2026. Grading partial credit.

Aggregate honesty on the predictions: three directionally correct (1, 3, 4), one wrong on timing (2), one overstated (5). About what we expected. We’re not embarrassed by the scorecard, but we’re not patting ourselves on the back either.

What surprised us

A few patterns we didn’t anticipate in January.

The largest per-query commercial value lift is on planning queries. We expected the comparison-query shape to be the largest agentic shift. It is, in volume. But the largest per-query commercial value lift has been on planning queries: “help me put together,” “plan the X,” “everything I need for Y.” These queries weren’t really a Google category. The agentic version is producing multi-product baskets that are larger than typical single-product agent referrals. Merchants whose products fit naturally into agent-planned baskets (apparel-as-outfit, home-as-room, hobby-as-starter-kit) are getting disproportionate lift. We’re paying more attention to this in the back half of the year.

Persistent memory matters faster than we expected. When we wrote about persistent memory in the May post on platform updates, we framed it as a Q3 / Q4 signal. The early evidence is already showing it. Buyers who had a clean experience with a merchant in Q1 appear to be getting that merchant surfaced disproportionately in Q2 agent recommendations, with the agent citing the previous good experience as a reason. The effect is small in absolute terms but directionally clear. This is going to be a bigger story by year-end.

The honesty thesis is even stronger than we wrote. We’ve been writing about catalog honesty as an authority signal since the manifesto in November. The mid-year evidence is more emphatic than the early signal. The conversion-rate lift for merchants who run quarterly honesty audits and tighten cross-source consistency is real. The merchants who haven’t done this work are being steadily downweighted, and they often don’t realize it: their absolute numbers are still going up because the surface is growing, while their relative share declines. We’re going to be more pointed about this in upcoming posts.
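The arithmetic behind "absolute numbers up, relative share down" is easy to miss in a dashboard. Here is a minimal Python sketch with made-up referral counts; the function name and every number are hypothetical and chosen only to illustrate the effect, not drawn from any merchant's data.

```python
def share_trend(merchant_prev, merchant_now, surface_prev, surface_now):
    """Compare a merchant's absolute growth with its change in relative share."""
    share_prev = merchant_prev / surface_prev
    share_now = merchant_now / surface_now
    return {
        # Growth in the merchant's own agent-referred sessions.
        "absolute_growth": (merchant_now - merchant_prev) / merchant_prev,
        # Merchant's slice of all agent referrals in its category.
        "share_prev": share_prev,
        "share_now": share_now,
        "losing_share": share_now < share_prev,
    }

# Hypothetical numbers: the merchant grows 20% while the surface grows 50%.
trend = share_trend(
    merchant_prev=1_000, merchant_now=1_200,
    surface_prev=20_000, surface_now=30_000,
)
print(trend)
```

In this illustration the merchant's referrals grow by 20 percent, yet its share of the surface falls from 5 percent to 4 percent. A report that tracks only the first number would look healthy.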

The walled gardens behaved more aggressively than we anticipated. Amazon’s agent surface has been gaining share quickly in categories where Amazon has structural advantages (immediate shipping, broad selection, Prime members). The take-rate transparency hasn’t improved as much as we hoped. Merchants are being offered “preferred placement” deals that look likely to compress margin meaningfully if they accept. The walled-garden path is more concentrated and more expensive than the prediction post implied.

What merchants who are winning are doing differently

This is the section that matters most for readers. Across the merchants we observe who are visibly outperforming their category, three patterns stand out.

They allocated budget to the agentic surface earlier than their peers. The merchants who are leading category share on agentic discovery in May spent meaningful budget on the work (instrumentation, ACP integration, llms.txt development, honesty audits, agent-traffic measurement) in November, December, or January. They didn’t wait for the data to justify the spend. They committed before the spend was politically obvious internally, which means they have measurement now that’s letting them iterate.

They treat honesty as infrastructure, not a marketing decision. The winning merchants run catalog audits on a quarterly cadence. They’ve centralized product spec source of truth so internal consistency is automatic. They name limitations in product descriptions. They publish accurate review distributions. They don’t run cherry-picked comparison pages. The work isn’t glamorous. It compounds.
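To make the cross-source consistency idea concrete, here is a minimal Python sketch of one check a catalog audit could include. The source names (product_page, merchant_feed, llms_txt), the field names, and the values are all hypothetical, and a real audit would cover far more than exact-match comparison.

```python
def find_spec_conflicts(sources):
    """Return fields whose values disagree across sources.

    sources maps a source name to a {field: value} dict for one product.
    A field missing from a source is treated as None, so it also counts
    as a disagreement.
    """
    fields = set().union(*(specs.keys() for specs in sources.values()))
    conflicts = {}
    for field in sorted(fields):
        values = {name: specs.get(field) for name, specs in sources.items()}
        if len(set(values.values())) > 1:
            conflicts[field] = values
    return conflicts

# Hypothetical spec data for a single product across three places it is published.
sources = {
    "product_page": {"weight_g": 250, "battery_hours": 12},
    "merchant_feed": {"weight_g": 250, "battery_hours": 10},
    "llms_txt": {"weight_g": 250, "battery_hours": 12},
}
conflicts = find_spec_conflicts(sources)
print(conflicts)
```

With a centralized source of truth, a check like this should return an empty result by construction. Without one, it surfaces mismatches such as the differing battery_hours value in the illustrative feed.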

They wired the measurement before the launch. The merchants who have clean reporting on agent-referred traffic, on /agent/* endpoint usage, on shortlist mention rates across the major platforms, on the conversion rate of agent referrals vs. other channels, are the merchants making correct allocation decisions. The merchants who launched agentic features without the measurement are guessing. The guessing is producing worse outcomes.
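As a rough illustration of the reporting described above, the Python sketch below computes conversion rate per channel and counts requests to /agent/* paths. It assumes sessions have already been labeled with a channel upstream; the session records, channel labels, and request paths are hypothetical, and how a given merchant attributes a session to an agent is outside what this sketch covers.

```python
from collections import defaultdict

def conversion_by_channel(sessions):
    """Return {channel: conversion rate} for sessions tagged with a channel."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for session in sessions:
        totals[session["channel"]] += 1
        if session["converted"]:
            conversions[session["channel"]] += 1
    return {channel: conversions[channel] / totals[channel] for channel in totals}

def agent_endpoint_hits(paths):
    """Count requests whose path falls under /agent/."""
    return sum(1 for path in paths if path.startswith("/agent/"))

# Hypothetical sample data, for illustration only.
sessions = [
    {"channel": "agent", "converted": True},
    {"channel": "agent", "converted": False},
    {"channel": "search", "converted": False},
    {"channel": "search", "converted": False},
    {"channel": "search", "converted": True},
    {"channel": "search", "converted": False},
]
paths = ["/agent/catalog", "/products/123", "/agent/availability", "/agentic-guide"]

rates = conversion_by_channel(sessions)
hits = agent_endpoint_hits(paths)
print(rates, hits)
```

The point of the sketch is the split itself: once agent-referred sessions are reported separately, a shift in that slice is visible even when the blended conversion rate stays flat.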

The merchants who are losing across these categories often look fine in aggregate metrics. Their overall traffic might be up. Their conversion rate might be steady. But the share they’re losing is the high-intent slice, and the aggregate hides it for now. By BFCM 2026 it won’t.

What we’re updating against for the second half

A few things we’re going to be writing about differently after this retrospective.

We’re going to write more about the category-specific nature of agentic shifts. The aggregate framings are too coarse. The interesting work is at the category level, and the answers for apparel vs. electronics vs. CPG are different enough that one-size-fits-all advice is misleading.

We’re going to write more about planning-query optimization. We hadn’t anticipated this as a major surface in January. It is. There’s specific work merchants can do to be the natural fit for agent-planned baskets, and we’re going to break it down in detail in upcoming posts.

We’re going to write more about the walled garden vs. open commerce tradeoffs in concrete unit-economic terms. The May post raised the question. The mid-year evidence lets us start answering it with sharper takes on take-rate dynamics and break-even thinking.

We’re going to be more cautious about timing predictions. Of the five, three were directionally right and one was wrong on timing, and even the ones that held up ran ahead of or behind our schedule depending on the category. The honest read is that we don’t yet have great calibration on how fast specific behavior changes propagate. We’ll be more rigorous about distinguishing “the capability has shipped” from “the behavior has changed at scale” in future predictions.

The point of the retrospective

The reason to do the retrospective publicly isn’t to grade ourselves. It’s to be the kind of source that earns trust by updating in public. The agentic commerce category has a lot of confident pronouncements floating around. Most of them won’t be checked against reality in six months. We’d rather check ours, update what was wrong, and be the source merchants come back to because we said where we were wrong.

The other reason is that the patterns matter regardless of how our January predictions scored. The merchants who recognize the planning-query opportunity, who run quarterly honesty audits, who allocate to the agentic surface ahead of their peers, who measure before they ship: these merchants are going to outperform. The work is concrete. The window for being early on it is closing.

Six months in. The work compounds.
