The major LLM platforms shipped a wave of updates over the last six weeks that materially change what agents can do in commerce. Most of the coverage has framed these as general capability improvements. They are, but they also change specific things about how agents shop on behalf of buyers.
This post pulls the three trajectories that matter for merchants out of the broader announcement noise, explains why they shift the agentic commerce picture, and lays out what each implies for the work merchants should be doing this quarter.
The trajectories are: longer context windows, better tool use reliability, and persistent memory across sessions. Each of these is a multi-quarter trend, not a single release. The recent updates are an inflection point on each.
Longer context: bigger comparisons, less buyer hand-holding
Context window expansion has been a steady trend since the original GPT-4 release. The recent step changes (from the previous-generation ceilings into the 500K-to-1M token range across multiple platforms) mean something specific for commerce: agents can now hold the equivalent of dozens of product detail pages simultaneously while doing comparison work.
The behavioral change this enables is meaningful. Previously, agents doing comparison work either had to retrieve summaries of each candidate and lose specifics, or had to hold a small set of candidates in context and drop the rest. Now they can hold the full PDP detail of 10 to 20 products at once, plus the full reviews, plus the policy disclosures, plus the spec sheets.
For merchants, this changes the structured-data calculus. Previously, you wanted your structured data to be terse because every byte mattered against the agent’s context budget. Now you can be more thorough. The temperature rating, the dimensions, the material breakdown, the care instructions, the warranty terms, the policy specifics: all of these can be present in the structured response without the agent having to choose which ones to retain.
The corollary, which matters more: agents are now better at picking up cross-page contradictions. If your Carrigan jacket PDP says it’s rated to -5°F and your category page calls it “rated for moderate winter,” the agent can hold both and flag the inconsistency. Six months ago the agent might have missed it. Now it is far more likely to catch it. The honesty bar just went up.
What to do this quarter: Audit your structured data for cross-page consistency. The contradictions that didn’t matter when the agent could only see one page at a time now matter when the agent can see all of them simultaneously. The fix is to centralize your product specs in a single source of truth and have every page (PDP, category, FAQ, llms.txt) read from it.
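A minimal sketch of what that audit could look like, assuming you can export the specs each surface currently publishes into a simple dictionary. The surface names, field names, and values below are hypothetical placeholders, not a real platform export format.

```python
from collections import defaultdict

# Hypothetical snapshot of the specs each surface publishes for one SKU.
# Surface names, field names, and values are illustrative assumptions.
PUBLISHED_SPECS = {
    "pdp": {"sku": "carrigan-jacket", "temp_rating_f": -5, "fill": "synthetic"},
    "category": {"sku": "carrigan-jacket", "temp_rating_f": 20, "fill": "synthetic"},
    "faq": {"sku": "carrigan-jacket", "temp_rating_f": -5},
}


def find_contradictions(specs_by_surface):
    """Return {field: {value: [surfaces]}} for fields whose values disagree.

    A field that is simply absent from a surface is not treated as a
    contradiction; only conflicting published values are reported.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for surface, specs in specs_by_surface.items():
        for field, value in specs.items():
            seen[field][value].append(surface)
    return {
        field: {value: sorted(surfaces) for value, surfaces in values.items()}
        for field, values in seen.items()
        if len(values) > 1
    }


contradictions = find_contradictions(PUBLISHED_SPECS)
for field, values in contradictions.items():
    print(field, values)
```

With the sample data, only the temperature rating is reported, because the category page disagrees with the PDP and the FAQ. Once every surface reads from one source of truth, a check like this should come back empty.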
Better tool use: real agentic checkout becomes routine
Tool use reliability has been the gating factor for serious agentic commerce. The agent that can recommend a product but can’t reliably complete a checkout has narrow practical utility. The agent that can do both has full-funnel utility.
The reliability improvement in the recent releases has been concrete. Function calls that previously failed often enough to break checkout flows are now reliable enough to use in production. Multi-step transactions that previously broke at the second or third call now complete cleanly across longer sequences. The ACP-based checkout flow that was a research demonstration in 2024 is now production-grade across the major platforms.
The buyer behavior implication is that agentic checkout shifts from “early adopter” to “default for high-intent sessions.” Buyers who try it once and have it work tend to use it again the next time. Buyers who try it and have it fail tend to return to the browser-based path. The reliability inflection means more buyers are now having a successful first experience with agentic checkout than were six weeks ago.
For merchants, this means the ACP integration that was a “nice to have” through Q1 is now a “ship it” for Q2. The merchants who have it wired are getting a growing share of agent-mediated transactions. The merchants who don’t are getting bypassed in favor of competitors who do.
What to do this quarter: If you haven’t enabled ACP on your commerce platform, do it before BFCM (Black Friday/Cyber Monday) planning starts in earnest. If you’re on a custom stack, the work to publish a clean ACP manifest and wire the back-end checkout is a sprint or two. The merchants who finish this work in Q2 will go into BFCM 2026 with a structurally larger addressable agent-mediated market than the merchants who don’t.
Persistent memory: repeat-buyer relationships shift
The third trajectory is persistent memory. Agents are increasingly able to hold context across sessions, weeks, and months. The buyer who told ChatGPT in February that they prefer synthetic-fill jackets, run cold, and ski in Vermont can have the agent surface that context in October without re-stating it.
The commerce implication of persistent memory is that the agent becomes a long-running personalization layer the merchant doesn’t control. Previously, personalization was something merchants did. They built profile data, ran segmentation models, served different content to different segments. Increasingly, the personalization is happening outside the merchant’s site, in the agent’s persistent memory of the buyer.
This is a strategic shift. The buyer-loyalty relationship that used to be expressed through email lists, loyalty programs, and account-based retargeting is now also expressed through what the agent remembers about the buyer’s previous interactions with your brand. A buyer who had a great experience with you (clean product, accurate description, fast shipping, honest policy) has that context remembered. The next time they ask the agent for something in your category, you start with a meaningful advantage. A buyer who had a bad experience with you has that context remembered too. The next time, you start with a meaningful disadvantage.
The merchant work this implies is on the experience side, not just the discovery side. The agent’s persistent memory rewards merchants who deliver consistently on the promises in their structured data. It punishes merchants who don’t. The “trust layer” thesis we’ve been writing about gets sharper teeth with persistent memory in play.
What to do this quarter: Map the post-purchase experiences your buyers have and ask which ones the agent would remember. Did the jacket arrive when you said it would? Did the temperature rating match the buyer’s experience in the field? Did the return process work as the policy described? The merchants who tighten the gap between structured promise and delivered experience build a compounding advantage in the persistent-memory era. The merchants who don’t will find their structured data getting downweighted by agents whose buyers report inconsistent experiences.
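One way to start on the first of those questions is to measure the delivery promise directly. The sketch below assumes you can join the delivery date you published with the date the carrier actually delivered. The `Order` record, its field names, and the sample orders are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    order_id: str
    promised_by: date   # delivery date stated to the buyer (assumed field)
    delivered_on: date  # actual delivery date from the carrier (assumed field)


def delivery_promise_gap(orders):
    """Summarize how often delivery matched the published promise."""
    late = [o for o in orders if o.delivered_on > o.promised_by]
    on_time_rate = 1 - len(late) / len(orders) if orders else None
    worst_delay_days = max(
        ((o.delivered_on - o.promised_by).days for o in late), default=0
    )
    return {
        "on_time_rate": on_time_rate,
        "late_order_ids": [o.order_id for o in late],
        "worst_delay_days": worst_delay_days,
    }


# Made-up sample orders for illustration only.
sample_orders = [
    Order("A-1001", date(2026, 3, 10), date(2026, 3, 9)),
    Order("A-1002", date(2026, 3, 12), date(2026, 3, 12)),
    Order("A-1003", date(2026, 3, 14), date(2026, 3, 17)),
    Order("A-1004", date(2026, 3, 15), date(2026, 3, 14)),
]

gap = delivery_promise_gap(sample_orders)
print(gap)
```

The same pattern extends to the other promises, such as return turnaround against the stated policy, as long as you have a promised value and an observed value to compare.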
The combined picture
Each of these trajectories is meaningful on its own. The combination is larger than the sum of its parts, because the trajectories reinforce each other.
Longer context means agents can hold more comprehensive comparison sets. Better tool use means they can act on the comparison without breaking. Persistent memory means they remember which actions led to good outcomes and which didn’t.
The merchant who provides clean, comprehensive, consistent structured data, integrates cleanly with agent-mediated checkout, and delivers the experience their structured data promises will compound on the new surface in a way that was theoretical six months ago and is concrete now.
The merchant who hasn’t done this work, or who has done it sloppily, will compound in the other direction.
What we’ll be watching through Q3
Three things we expect to play out over the next 60 to 90 days as these capabilities mature:
The conversion-rate differential between agent-referred traffic and search-referred traffic will widen. We expect agent traffic to convert at a meaningful multiple of the site-wide conversion rate for well-prepared merchants, with the differential widening as the agents get better at filtering for high-confidence sessions.
Agent-mediated checkout share will cross meaningful thresholds in specific categories. We expect electronics, beauty, and home to lead, with apparel and CPG following. The category-specific trajectories will be uneven, with leading categories pulling ahead faster than aggregate numbers suggest.
Persistent-memory effects will start showing up in repurchase data. The brands that delivered well in Q1 and Q2 will see a disproportionate share of repeat agent-mediated purchases in Q3. The brands that delivered poorly will see the opposite. The signal will be noisy because the memory feature is still rolling out, but the directional trend should be visible by August.
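The first of these, the conversion-rate differential, is simple arithmetic to track if your analytics can segment sessions by referral source. The sketch below assumes such a segmentation exists. The source labels and every number in it are invented for illustration and are not measurements.

```python
# Hypothetical (sessions, orders) by referral source.
# These figures are made up for illustration only.
TRAFFIC = {
    "agent": (1_200, 84),
    "search": (20_000, 500),
    "other": (8_800, 116),
}


def conversion_rate(sessions, orders):
    """Orders divided by sessions, guarding against empty segments."""
    if sessions == 0:
        raise ValueError("cannot compute a conversion rate with zero sessions")
    return orders / sessions


total_sessions = sum(sessions for sessions, _ in TRAFFIC.values())
total_orders = sum(orders for _, orders in TRAFFIC.values())

site_wide_cr = conversion_rate(total_sessions, total_orders)
agent_cr = conversion_rate(*TRAFFIC["agent"])
search_cr = conversion_rate(*TRAFFIC["search"])

# The two comparisons discussed above: agent vs site-wide, agent vs search.
agent_vs_site_multiple = agent_cr / site_wide_cr
agent_vs_search_multiple = agent_cr / search_cr

print(f"site-wide CR: {site_wide_cr:.2%}")
print(f"agent CR: {agent_cr:.2%}, search CR: {search_cr:.2%}")
print(f"agent vs site-wide: {agent_vs_site_multiple:.1f}x")
print(f"agent vs search: {agent_vs_search_multiple:.1f}x")
```

Recomputing these two multiples on a fixed cadence (weekly, for example) is what would show whether the differential is actually widening.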
The capability improvements are real. The merchant implications are not theoretical. The work that compounds is the same work we’ve been writing about for the past six months. The recent updates just made it more urgent.