When your AI shopping agent screws up, who gets the bill?

Geordie Kaytes

27 Oct 2025 — 12 min read

It's the end of 2025, and AI agents (previously limited to mission-critical therapist, oracle, and haiku-generation duties) have learned how to spend money. And it's likely that, at least in the beginning, they will not spend it particularly wisely.

Imagine this: After one watery conference room coffee too many, you have reached your breaking point. You fire up your favorite AI agent and issue a fateful prompt: "buy me the strongest coffee you can find."

The following week, twelve test tubes of industrial-strength coffee extract arrive at your door, festooned with angry-looking warning stickers and marked "for laboratory use only." Oh, and the bill is $2,195. No refunds.

Who is on the hook?

The first agentic commerce holiday season is underway. 70% of consumers are "at least somewhat" comfortable having AI agents make purchases on their behalf. Merchants face a wave of disputes the payment system wasn't designed to handle.

Call it the SNADpocalypse: an unprecedented flood of "significantly not as described" claims from buyers blaming merchants for the actions of their own shopping agents.

This isn't handwringing. ChatGPT Instant Checkout launched September 29. Google’s AP2 protocol has 60+ industry partners. Visa reports a 4,700% surge in AI-driven traffic to U.S. retail sites in October 2025. The infrastructure exists. The volume is spiking. The dispute frameworks are 25 years old.

Yeah, something's going to break.

What is agentic commerce?

Agentic commerce describes transactions where AI systems act autonomously on your behalf to discover, evaluate, and purchase goods or services—with varying degrees of human oversight.

You can think of shopping assistants as falling into "levels" similar to those we apply to assistive driving technology, from lane change alerts to sleep-in-the-backseat self-driving capabilities.

Level 0: Recommendation only. AI suggests products, humans execute all transactions. Current Amazon recommendations. Google Shopping. You're in complete control.

Level 1: Assisted shopping. AI helps compare and filter, humans approve each purchase. Current ChatGPT product search. The AI is your research assistant, not your buyer.

Level 2: Supervised autonomy. AI executes purchases within pre-defined parameters, humans set boundaries. Stripe’s Agentic Commerce Protocol with per-transaction approval. You see the checkout screen and click yes or no.

Level 3: Delegated authority. AI makes routine purchases independently based on standing mandates, humans intervene only for exceptions. Google’s AP2 with cryptographic mandates. You set the rules once, the agent buys within them.

Level 4: Full autonomy. AI manages entire purchasing lifecycle including dispute resolution. Nobody's here yet (and maybe nobody should be).

We're transitioning from Level 1 to Levels 2 and 3 right now. That transition creates the liability gap.

How we got here

Even for a non-Luddite, the velocity of progress is frankly alarming:

2011-2015: Voice assistants (Siri, Alexa) enable product search but not purchase. You ask, they show, you buy manually. Or, like, you don't, because mostly they suck.

2016-2020: The era of The Algorithm. Instagram ads that know you better than your husband does. AI recommendation engines personalize shopping, but humans click "buy." This is… better?

2021-2023: Large language models hooked up to web search enable conversational web discovery. You can ask complex questions, get nuanced answers—but it's not optimized for commerce yet. A shopping query is treated the same as a request to translate your apartment lease into pirate talk. How innocent we were.

2024: ChatGPT and Perplexity integrate shopping features. Agents can find products but still can't transact. Like Crest WhiteStrips at a San Francisco Walgreens, checkout remains locked in a little acrylic box, needing a human present to turn the key.

September 2025: ChatGPT Instant Checkout launches using Stripe ACP. Agents can now complete purchases. The acrylic cracks.

October 2025: Agentic commerce protocols proliferate. With Instant Checkout launching September 29, the 2025 holiday season is the first meaningful test of agent-led checkout at scale.

E-commerce took 10-15 years to develop mature dispute resolution frameworks. Agentic commerce is attempting to build equivalent infrastructure in 1-2 years while transaction volumes surge.

The authorization gap nobody solved

Traditional chargeback frameworks assume a human made the purchase decision. You saw the product. You clicked buy. You entered your card details. If the product didn't match, that's the merchant's problem—significantly not as described (SNAD).

Agentic commerce breaks that model.

You told your agent to buy "coffee." It bought a dozen vials of coffee-derived lab reagent that could stop your heart as surely as the cyanide it was probably stored next to in the warehouse. Agent error? Merchant liability? Who misunderstood whom?

You authorized an agent generally—"handle my grocery shopping"—but did you authorize this specific $300 purchase from this specific grocery store at this specific moment? And what if the potato chips aren’t crinkle-cut how you like them?

The agent hallucinated product features that don't exist. Is that the AI company's fault for the hallucination? The merchant's fault under "not as described" provisions? Your responsibility for trusting an unreliable agent? Is it Eliezer Yudkowsky’s fault, somehow?

The common thread: chargeback rules were written for purchases a human made. When an AI agent operates autonomously, potentially completing a transaction before the human is even aware of it, those rules break down.

The result: Liability remains unclear when AI agents complete transactions. Merchants may bear fraud costs despite shoppers never visiting their websites.

We don't have legal answers. But we do have three protocols racing to create technical solutions before regulatory frameworks catch up.

Three protocols to manage — and maybe prevent — the coming chaos

The payment industry is attempting something unprecedented: building dispute prevention infrastructure before crisis forces adaptation.

Stripe ACP: Real-time explicit consent

Stripe’s Agentic Commerce Protocol launched with OpenAI in September. The model: when the agent wants to complete a purchase, it presents an inline checkout interface showing product, price, seller, and shipping details. You explicitly approve this specific transaction. Merchants verify authorization through Stripe’s Shared Payment Token API or delegated payments spec in the protocol.

Strength: clear moment of human consent for this purchase.
Weakness: doesn't scale to high-frequency, low-value autonomous transactions. If you want the agent to buy groceries every week without asking, this breaks down.
Best for: Level 2 autonomy—supervised agent purchases where you approve each transaction.

Google AP2: Cryptographic mandates with bounded authority

Google’s AP2 protocol (announced September 16, 60+ partners in early pilots) takes a different approach. You sign a cryptographically verifiable "mandate" defining spending limits, merchant categories, and time restrictions. Example: $50 per transaction, $500 per month, groceries and household goods only. The agent operates within those boundaries autonomously.

Strength: scales to recurring, routine purchases while maintaining authorization proof.
Weakness: "misinterpretation" disputes remain ambiguous. The agent stayed within your $50 limit but bought the wrong product. Is that significantly not as described? The mandate doesn't cover that. Requires sophisticated mandate management.
Best for: Level 3 autonomy—delegated authority with financial guardrails.
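
To make the bounded-authority idea concrete, here is a minimal sketch of the kind of check a mandate implies, using the example limits above. The `Mandate` class, its field names, and the `check` function are hypothetical illustrations, not AP2's actual data model or API, and the cryptographic signing step is left out entirely. Amounts are in cents to avoid floating-point surprises.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mandate:
    """Hypothetical mandate; field names are illustrative, not AP2's schema."""
    per_transaction_limit_cents: int
    monthly_limit_cents: int
    allowed_categories: frozenset
    expires: date


def check(mandate, amount_cents, category, spent_this_month_cents, today):
    """Return (allowed, reason) for a purchase the agent proposes."""
    if today > mandate.expires:
        return False, "mandate expired"
    if category not in mandate.allowed_categories:
        return False, "category not allowed"
    if amount_cents > mandate.per_transaction_limit_cents:
        return False, "over per-transaction limit"
    if spent_this_month_cents + amount_cents > mandate.monthly_limit_cents:
        return False, "over monthly limit"
    return True, "within mandate"


# $50 per transaction, $500 per month, groceries and household goods only.
mandate = Mandate(5000, 50000, frozenset({"groceries", "household"}), date(2026, 1, 31))

# A $45 grocery order early in the month passes every rule.
result = check(mandate, 4500, "groceries", 0, date(2025, 11, 3))
```

Note what the check cannot see: a $45 bag of the wrong coffee passes every rule. The mandate proves the agent was allowed to spend, not that it bought what you meant.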

Visa TAP: Merchant-side verification

Visa’s Trusted Agent Protocol (announced October 14; documentation on Visa Developer and GitHub) addresses a different problem: merchants can't tell legitimate AI agents from malicious bots. With AI-driven traffic surging 4,700% year-over-year, bot fraud is exploding alongside legitimate agent purchases.

TAP relies on agent identity attestation and verification protocols, so a merchant can tell ChatGPT making a purchase from a bot pretending to be ChatGPT.

Strength: addresses the 4,700% surge in AI-driven retail traffic; prevents bot fraud masquerading as agent purchases.
Weakness: doesn't directly address human intent verification. Even if you verify the agent's identity, you still don't know if the human actually authorized this specific action.
Best for: complementary infrastructure securing the merchant endpoint.

All three protocols flip the dispute model from post-transaction resolution to pre-transaction authorization proof. But will that be enough to tie the intent of a transaction back to a properly informed consumer?

The identity layer that might save merchants

The most interesting development to me isn't the payment protocols—it's the identity verification layer emerging beneath them.

Prove Verified Agent (launched October 2025): Creates an "end-to-end chain of custody" linking verified identity, human intent, payment credentials, and consent—all backed by cryptographic proof. Integrates with Visa's payment infrastructure.

Visa Payment Passkey (live in Middle East, expanding globally): FIDO2 biometric authentication replacing passwords and OTPs. Uses fingerprint, facial recognition, or device PIN. Already processing real transactions through noon payments (the lowercase is theirs).

The innovation: non-repudiable proof that a specific human authorized a specific agent action at a specific moment, before the transaction occurs.

It's kind of like a chip-and-PIN for agent purchases. The merchant can prove the person who authorized the purchase was actually present (or at least their fingers and/or face were) at the moment of authorization. Barring a Hannibal-Lecter-style scenario, this is strong evidence in favor of human intent.

This addresses what mandates alone cannot: proof that the human who created the mandate is the same human authorizing this transaction right now.

Why merchants will probably hold the bag anyway

The Fair Credit Billing Act of 1974 limits consumer liability for unauthorized credit card use to $50. It provides chargeback rights for "billing errors" or "goods not as described."

Does an agent's misinterpretation constitute a billing error? Courts will decide. But history suggests consumer-favorable interpretation—not least because the agents representing the consumer are the spawn of some of the most politically powerful and influential companies since the American railroad era.

Merchants already lost this fight once

When e-commerce exploded in the late 1990s, the payment industry faced a trust crisis: card-not-present transactions meant merchants couldn't verify the card holder's physical presence or confirm the person using it actually owned it.

The solution heavily favored consumers. The Fair Credit Billing Act gave cardholders dispute rights. Merchants bore 100% of CNP fraud liability. If a stolen card was used online, merchants lost both merchandise and the transaction fee.

By 2006, the reality was stark: Just because the bank approves a credit card doesn't mean it's not stolen. E-commerce merchants—not the credit card associations, not the banks—are often the ones left holding the empty bag.

That framework still exists today. It's about to apply to agent purchases.

The problem is worse this time

Early e-commerce fraud was binary: either you entered your card details or you didn't. Either your card was stolen or it wasn't.

Agentic commerce creates a spectrum of disputes. The agent misunderstood. The agent hallucinated. The agent operated within your mandate but bought the wrong thing. The agent acted faster than you could intervene.

Existing consumer protection frameworks weren't designed for agent-mediated transactions. The question is whether they'll adapt through litigation or proactive regulation.

Current legal ambiguity

Fair Credit Billing Act (1974): As mentioned above, limits consumer liability for unauthorized credit card use to $50 and provides chargeback rights for "billing errors" or "goods not as described." Does an agent's misinterpretation constitute a billing error? Courts will decide.

UETA and ESIGN (widely adopted) already recognize contracts formed by electronic agents without human review of each action. But how those principles interact with FCBA chargeback rights in agent-mediated purchases remains legally untested.

GDPR and consent: EU frameworks require explicit, informed consent. Can a standing mandate satisfy this for future autonomous transactions? Nobody knows.

Emerging frameworks to watch

eIDAS 2.0 (EU): Creates framework for digital identity wallets. Could establish legal groundwork for cryptographic mandates representing delegated authority.

AI Accountability Acts (various US states): Emerging legislation defining liability for AI decision-making. May establish precedents for agent-mediated purchases.

Payment network rule updates: Visa and Mastercard are updating operating regulations to address agent transactions. These changes may establish de facto standards before regulation catches up.

The precedent problem: Early disputes will establish case law before regulation clarifies things. Merchants should assume consumer-favorable interpretation until proven otherwise. That's how CNP disputes evolved.

What you can do right now

Waiting for regulatory clarity is a losing strategy. History suggests merchants will bear the brunt of early disputes coming this holiday season.

It's worth taking stock of where you land on a few key dimensions of agentic commerce readiness: visibility, verification, variance, vigilance, and voice.

1. Visibility: Know that agents are buying

What’s happening: Most merchants can't distinguish agent purchases from human purchases.
Where we want to be: Real-time identification of agent-mediated transactions with agent provider attribution.

Actions:

Require disclosure when purchases are agent-driven versus human-driven (push for this in Stripe ACP and Google AP2 adoption agreements)
Track which AI agents are purchasing (ChatGPT, Perplexity, future entrants)
Monitor agent traffic patterns using Visa TAP or similar verification protocols
Create separate transaction codes for agent purchases in your payment processing

Why it matters: Don’t go into this blind. Dispute patterns will differ between human and agent purchases. Start measuring now to anticipate impact and required investments.
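
As a rough illustration of what "start measuring now" could look like, the sketch below tags each order with the agent that placed it and compares dispute rates by channel. The `Order` class and its `agent_provider` field are hypothetical; how you actually learn that an order came from an agent depends on what your payment processor and the protocols expose.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    order_id: str
    agent_provider: Optional[str]  # hypothetical field; None means a human checkout
    disputed: bool = False


def dispute_rates(orders):
    """Dispute rate per channel, keyed by agent provider or 'human'."""
    totals, disputes = Counter(), Counter()
    for order in orders:
        channel = order.agent_provider or "human"
        totals[channel] += 1
        if order.disputed:
            disputes[channel] += 1
    return {channel: disputes[channel] / totals[channel] for channel in totals}


orders = [
    Order("1001", None),
    Order("1002", "chatgpt", disputed=True),
    Order("1003", "chatgpt"),
    Order("1004", "perplexity"),
]
rates = dispute_rates(orders)
```

Even a crude split like this tells you whether agent orders dispute at a different rate than human ones, which is the number you need before deciding how much to invest in everything else.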

2. Verification: Prove authorization chains

What’s happening: Merchants have no visibility into whether a human authorized the agent's action.
Where we want to be: Cryptographic proof of authorization accessible during dispute resolution.

Actions:

Integrate Stripe’s Shared Payment Token API (ACP) or AP2 mandate verification
Implement Prove Verified Agent or similar identity verification at checkout
Store authorization artifacts (tokens, mandate signatures, biometric authentication logs) with transaction records
Create audit trails showing what product information the agent accessed before purchase

Why it matters: Burden of proof falls on merchants. Authorization chain evidence is your defense against SNAD disputes.
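
One way to picture "storing authorization artifacts with transaction records" is a small evidence record saved alongside each order. This is a sketch under assumptions: the `AuthorizationEvidence` class, its fields, and the sample values are invented for illustration, and the token is treated as an opaque string rather than anything specific to ACP or AP2. The one real technique here is hashing a snapshot of the listing the agent was served, so you can later show exactly what it saw.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def fingerprint(payload: dict) -> str:
    """Stable SHA-256 digest of a JSON-serializable payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AuthorizationEvidence:
    """Hypothetical record stored next to the transaction."""
    transaction_id: str
    agent_provider: str
    authorization_token: str      # opaque token or mandate signature, as received
    authorized_at: str            # ISO 8601 timestamp of the authorization step
    product_snapshot_sha256: str  # digest of the listing the agent was served


# Illustrative listing data, not a real catalog entry.
listing = {
    "sku": "COF-EXT-12",
    "name": "Coffee extract, 12 vials",
    "use": "laboratory use only",
}

evidence = AuthorizationEvidence(
    transaction_id="txn_123",
    agent_provider="example-agent",
    authorization_token="opaque-token-from-protocol",
    authorized_at="2025-11-03T14:02:11Z",
    product_snapshot_sha256=fingerprint(listing),
)
record = asdict(evidence)  # ready to persist with the order
```

Sorting keys before hashing means the same listing always produces the same digest regardless of field order, while any change to the description produces a different one.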

3. Variance: Design for misinterpretation

What’s happening: Product descriptions and policies assume human comprehension.
Where we want to be: Machine-readable specifications that reduce agent misinterpretation.

Actions:

Implement structured data (Schema.org markup) for product specifications
Provide clear, unambiguous attribute definitions (dimensions, materials, compatibility)
Use AI to test how your product descriptions might be misinterpreted by agents
Create explicit compatibility/incompatibility statements ("requires X," "does not include Y")

Why it matters: "Your website confused the agent" will be a common dispute trigger. Clear specifications are your first line of defense.
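
For a sense of what machine-readable specifications look like, here is a small sketch that builds Schema.org JSON-LD for the lab-grade coffee extract from the opening story. `Product`, `Offer`, `PropertyValue`, and `additionalProperty` are standard Schema.org vocabulary; the specific property names `intendedUse` and `returnable`, and all the values, are made up for this example.

```python
import json

product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Coffee extract concentrate, 12 vials",
    "description": "Laboratory reagent. Not a beverage. Not for human consumption.",
    "category": "Laboratory chemicals",
    # Explicit statements an agent can read without parsing marketing copy.
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "intendedUse", "value": "laboratory use only"},
        {"@type": "PropertyValue", "name": "returnable", "value": "false"},
    ],
    "offers": {
        "@type": "Offer",
        "price": "2195.00",
        "priceCurrency": "USD",
    },
}

# Embed this string in a <script type="application/ld+json"> tag on the product page.
json_ld = json.dumps(product, indent=2)
```

Whether a given agent actually reads this markup is up to the agent, but an explicit "not a beverage" in structured data is much harder to misread than a warning sticker in a product photo.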

4. Vigilance: Adapt existing fraud detection

What’s happening: Fraud systems are optimized for human purchasing patterns.
Where we want to be: Fraud detection that accounts for agentic commerce and can tell legitimate agents from fraud.

Actions:

Train fraud teams to recognize agentic commerce patterns (velocity, basket composition, time-of-day)
Distinguish legitimate agent purchases from bot fraud using Visa TAP or similar
Update velocity rules—agents may make multiple purchases rapidly across merchants
Monitor for "shopping cart abandonment" patterns that differ from human behavior

Why it matters: Traditional fraud signals may not apply. Legitimate agent behavior can look like bot activity.
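
Updating velocity rules might look something like the sketch below: a sliding-window counter with a separate threshold for verified agents. The `VelocityRule` class, the channel names, and the thresholds are all assumptions picked for illustration, and the code assumes timestamps (in seconds) arrive in order for each payer.

```python
from collections import deque


class VelocityRule:
    """Flag a payer when purchases inside a time window exceed a channel limit."""

    def __init__(self, window_seconds, human_limit, verified_agent_limit):
        self.window = window_seconds
        self.limits = {"human": human_limit, "verified_agent": verified_agent_limit}
        self.events = {}  # payer id -> deque of purchase timestamps

    def record(self, payer_id, timestamp, channel):
        """Record a purchase; return True if it should be flagged for review."""
        recent = self.events.setdefault(payer_id, deque())
        recent.append(timestamp)
        # Drop purchases that have aged out of the window.
        while recent and recent[0] <= timestamp - self.window:
            recent.popleft()
        return len(recent) > self.limits[channel]


rule = VelocityRule(window_seconds=600, human_limit=3, verified_agent_limit=10)

# Five purchases in five minutes from a verified agent: none are flagged.
agent_flags = [rule.record("card_1", t, "verified_agent") for t in range(0, 300, 60)]

# The same pace on a human checkout trips the rule on the fourth purchase.
human_flags = [rule.record("card_2", t, "human") for t in range(0, 240, 60)]
```

The design choice worth noticing is that the channel, not the card, picks the threshold, which only works if you can trust the channel label. That is where agent verification comes in.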

5. Voice: Participate in standards development

What’s happening: Payment networks and platforms are defining rules without broad merchant input.
Where we want to be: Merchant interests represented in emerging protocol standards and liability frameworks.

Actions:

Join Stripe ACP, Google AP2, and Visa TAP working groups or feedback programs
Engage with merchant associations on agentic commerce policy positions
Negotiate clear liability allocation with agent platforms in early adoption agreements
Document early dispute patterns and share insights with networks to inform rule development

Why it matters: The rules being written now will govern disputes for the next decade. Merchant silence means merchant liability by default.

Self-assessment checklist

Use this to evaluate your current agentic commerce readiness:

Visibility

Can you identify which transactions were agent-mediated versus human?
Do you know which AI agents are purchasing from you?
Can you track agent traffic patterns, ideally in real-time?

Verification

Can you access authorization proof during disputes (tokens, mandates, biometric logs)?
Do you store agent authorization artifacts with transaction records?
Have you integrated any identity verification layer (Prove, Visa Passkey, etc.)?

Variance

Are your product descriptions machine-readable (structured data)?
Do you keep auditable change logs of product descriptions in your catalog system so you can prove what your site content was on a given date?
Have you tested product descriptions for agent misinterpretation risk?
Do you explicitly state compatibility/requirements/exclusions?

Vigilance

Have you updated fraud detection rules for agent purchasing patterns?
Can your systems distinguish legitimate agents from bot fraud?
Do you have separate protocols for reviewing agent transactions (Visa TAP guidance)?

Voice

Are you participating in protocol working groups or merchant coalitions?
Have you negotiated liability terms in early agent commerce agreements?
Are you documenting dispute patterns to inform future standards?

Three scenarios for what happens next

Optimistic: Cryptographic mandates and biometric verification prevent most disputes. Clear authorization chains resolve the rest quickly. The SNADpocalypse never materializes.

Pessimistic: Mandates prove inadequate for "misinterpretation" disputes. Courts rule in favor of consumers citing FCBA protections. Merchants face another decade of disputes while the system evolves.

Realistic: Some disputes prevented. Some handled better. Some create precedents that reshape liability frameworks. Not a catastrophic collapse, but not a seamless transition either.

We don’t know which will play out, but the industry's awareness of the problem is far ahead of where e-commerce was in 1998. Major payment networks have deployed initial standards before the critical mass of adoption has even hit. Identity verification infrastructure is launching in parallel with transaction protocols.

But awareness doesn't shift liability.

The Fair Credit Billing Act still favors consumers. Payment network rules still place burden of proof on merchants. Early adopters will bear costs while precedents form and infrastructure matures.

Merchants should assume they'll hold the bag until proven otherwise. That assumption guided survival in early e-commerce. It's the smart bet now.

Questions we’ll be hearing in the coming months

Can I refuse to accept agent-powered purchases?
Technically yes, but practically difficult. Without integrating human-in-the-loop identity verification, you can't easily distinguish agent purchases from human purchases at the point of transaction. By the time you identify an agent purchase, the payment networks have already processed it, meaning you’re looking at a refund, not a refusal.

If I implement Stripe ACP or Google AP2, does that protect me from disputes?
Partially. These protocols provide better authorization proof than current CNP transactions, but they don't eliminate disputes. "Significantly not as described" claims can still arise if the agent misinterpreted product specifications or user intent. The protocols improve your defense—particularly around proving the cardholder authorized the transaction—but they don't guarantee victory on product-related disputes (ACP, AP2).

Who's liable when an agent "hallucinates" product features that don't exist?
Legally unclear. Arguments exist for agent provider liability (their system made the false claim), merchant liability (under FCBA "not as described" provisions), or consumer responsibility (they authorized the agent). Early case law will determine precedent. Document everything about what product information you provided (FTC Consumer Advice).

Should I create separate return policies for agent purchases?
Consider it, but consult a lawyer. Separate policies could help address "misunderstanding" scenarios distinct from fraud. However, payment networks may not recognize such distinctions, and consumer protection laws may override them. This is evolving territory.

What happens if a customer's agent gets "hacked" and makes fraudulent purchases?
Likely treated as standard CNP fraud under current frameworks—merchant liable unless they can prove authorization. This is precisely why identity verification layers (Prove Verified Agent, Visa Payment Passkey) matter—they create stronger evidence the legitimate account holder authorized the transaction.

How do I know if I should join protocol working groups or wait?
If you're a large merchant or early adopter of agent commerce capabilities, participate now—rules are being written. If you're small or not yet seeing agent traffic, monitor closely but focus on the 5 V's of agentic commerce readiness discussed above. You don't need to join every working group, but you should have someone tracking developments (ACP, AP2, TAP).

Will agentic commerce disputes be handled differently from regular chargebacks?
Not yet. Existing chargeback reason codes and processes apply until payment networks create specific agent transaction categories. This is why early disputes will likely favor consumers—the frameworks default to consumer protection in ambiguous scenarios.

What's the single most important thing I can do right now?
Visibility. Start tracking which transactions are agent-mediated versus human. You can't manage disputes you can't identify. Everything else depends on this foundation.
