International Payments Fraud: When the OTP Was the Fraud Infrastructure All Along

Imagine this.

A Bengaluru-based startup spends three years building a product. Payments work beautifully. Chargebacks are a non-issue. The fraud rate is so low it barely appears in the board deck. The team is good. The product is tight. So they expand internationally.

Sixty days later, the chargeback reports arrive.

The fraud rate is not 2x what it was in India. It is not even 5x. It is closer to 15x, concentrated in the US and Southeast Asian markets, and climbing. The payment processor sends a warning letter. A Visa program enrollment notice follows. The CFO starts asking questions no one on the product team can answer.

What happened?

Nothing changed about the product. Nothing changed about the team. What changed was that the company had silently crossed a regulatory border. And on the other side of that border, the safety net they had been standing on for three years simply did not exist.

The OTP was not a feature. It was the fraud infrastructure.

And they had just shipped to a world without it.

The Invisible Subsidy

India accidentally created one of the safest card payment ecosystems in the world.

The Reserve Bank of India has long mandated Additional Factor of Authentication for card transactions. In practice, this meant an OTP delivered to the cardholder's registered mobile number before any transaction above a threshold could complete. A stolen card number alone was not enough. You needed the card, and you needed the phone.

The result: domestic card-not-present fraud rates in India have been among the lowest of any major economy.

The RBI updated this framework in September 2025, mandating two-factor authentication for all domestic digital payments from April 1, 2026, with at least one factor being dynamically generated — cryptographically unique to that specific transaction and non-replayable. For cross-border card-not-present transactions involving Indian-issued cards, the deadline extends to October 1, 2026.

The system works. It has always worked.

But here is the invisible cost.

Every Indian PM who has built on Razorpay, PayU, or CCAvenue has internalized an environment where fraud infrastructure is the regulatory baseline. It is not a product decision. It is the floor. You do not think about it any more than you think about gravity.

When they expand internationally, they take that assumption with them.

In the United States, there is no federal mandate for step-up authentication on card transactions. PCI DSS governs how card data is stored and transmitted, but it does not require an OTP before a charge completes. Most of Southeast Asia has the same gap. In Latin America, where Brazil, Mexico, and Colombia are consistently flagged in Visa and Mastercard fraud intelligence reports, card-not-present fraud rates are among the highest globally.

The European Economic Area is the major exception. PSD2's Strong Customer Authentication requirements mandate two-factor verification on most transactions above EUR 30, and 3D Secure 2 is the primary compliance mechanism. But the United States, where most Indian startups expand first, operates under no such requirement.

The fraud rate does not scale linearly with volume when you enter these markets.

It spikes.

And the company that was not thinking about fraud infrastructure suddenly discovers that three years of low chargebacks was not evidence of good fraud prevention. It was evidence of good regulation.

What Is Actually Coming After You

Before building defenses, you need to be precise about what you are defending against. International card fraud is not one thing. It arrives in distinct forms, each with its own mechanics, its own detection signals, and its own category of damage.

Card-not-present fraud is the dominant type. Someone holds raw card credentials — PAN, expiry, CVV — obtained through a data breach, a phishing attack, a Magecart-style JavaScript injection into a merchant checkout page, or purchased outright from one of the dark web carding markets that operate with uncomfortable levels of professionalism. The 2024 Recorded Future Payment Fraud Intelligence Report documented over 269 million stolen card records circulating in these markets. The attacker transacts online where no physical card presentation is required. The real cardholder disputes the charge later. You pay.

Card testing and enumeration is usually the first attack you see, because it arrives at volume before any financial loss materializes. Fraudsters validate stolen cards with small-value authorization attempts before deploying them for larger transactions. Automated bots submit thousands of attempts per hour against your checkout endpoint.

You are being used as a card validity oracle.

Each attempt carries a processing fee of $0.05 to $0.30. At scale this generates six-figure losses in fees before anyone notices the pattern. And Visa's new monitoring program — more on this shortly — flags merchants at 300,000 enumeration attempts per month. You may trip the threshold before actual fraud at scale even begins.

Account takeover has a distinct lifecycle that most PMs mentally conflate with CNP fraud. The attacker compromises a legitimate user account, changes the email and phone number to defeat recovery, adds a new shipping address, and monetizes via stored payment methods. The transaction looks legitimate because it comes from an established account with history. The device fingerprint is new, the shipping address is unfamiliar, and the order value is anomalous — but none of these signals individually cross a blocking threshold. This is why account-layer risk signals need to feed into payment-layer decisioning. They are the same problem.

Friendly fraud is statistically the largest category, and it requires the most honest reckoning. Visa's own data suggests it accounts for approximately 75% of all disputes. The legitimate cardholder made the purchase, received the goods or service, and disputed the charge anyway. Sometimes this is deliberate. Often it is because they do not recognize the merchant name on their statement.

The entire category is a product problem dressed up as a fraud problem.

The most effective countermeasure — a clear, recognizable merchant descriptor on the billing statement — costs nothing to implement and has an outsized effect. Most merchants do not discover their descriptor is ambiguous until the chargeback reports arrive.

Triangulation fraud is architecturally the most elegant and the hardest to detect, because from the genuine merchant's perspective, everything looks correct. A fraud ring runs a fake storefront at artificially low prices. A consumer purchases from the storefront and provides their real shipping address. The fraudster uses a stolen card to buy the same item from you, shipping to that address. The consumer receives their order. The real cardholder disputes the charge. You fulfilled the product. You still absorb the loss.

3D Secure: The Protocol That Failed and Then Came Back

To understand where international payment authentication stands today, you have to understand why the original 3D Secure was one of the most commercially damaging security protocols ever deployed — and why its replacement is genuinely different.

Why 3DS1 Failed

The original 3D Secure specification was technically coherent. Its mechanism: when a card transaction was initiated, the cardholder was redirected to an Access Control Server operated by their issuing bank, where they entered a static password enrolled at card issuance (the "Verified by Visa" and "Mastercard SecureCode" programs). On successful authentication, a cryptographic value was returned to the merchant. This value, included in the authorization request, triggered liability shift: fraud chargebacks moved from the merchant to the issuer.

The liability shift was real. Merchants who enabled 3DS1 offloaded fraud liability. But they paid for it in a currency that turned out to be more expensive: abandonment.

The failure modes were structural, not incidental. The entire protocol was designed for desktop browsers in 2001. By 2018, more than 60% of e-commerce traffic was mobile. The ACS redirect — an iframe spawned from a third-party bank domain — frequently broke on mobile, timing out or rendering unresponsive. Merchants reported abandonment rate increases of 15 to 25 percentage points after enabling it.

The static password made it worse. It was set once at card enrollment, never changed, and easily forgotten. Password recovery flows were inconsistent across thousands of issuing banks. A forgotten password at checkout is not a security moment. It is a lost sale.

The ACS also received almost nothing from the merchant beyond the transaction amount and card number. It could not distinguish a routine purchase by a loyal customer from a first-time, high-risk transaction on an unknown device. It applied the same friction to everyone.

Merchants turned it off wherever they could, accepting fraud liability as cheaper than the conversion loss. They were right.

Why 3DS2 Is Different

EMVCo released the 3DS2 specification in 2016. The philosophical shift was fundamental: instead of authentication by challenge, the protocol was redesigned around authentication by risk assessment, with challenge as fallback.

The structural change: a new component called the 3DS Server sits on the merchant side and transmits a rich contextual payload to the issuer's Access Control Server before any authentication decision is made. Where 3DS1 sent almost nothing, 3DS2 sends device fingerprint data (browser user-agent, screen resolution, timezone, language, canvas fingerprint), behavioral signals, transaction history (the number of purchases with this merchant over the last six months, and the number of transactions across all merchants on this card in the last 24 hours), account age, billing and shipping address match, and shipping address history.

The issuer's risk engine evaluates this and routes to one of two flows.

Frictionless flow: The risk engine determines the transaction is low risk. It returns a cryptographic authentication value without any cardholder interaction. The shopper sees nothing. The process completes invisibly. Approximately 90 to 95% of 3DS2 transactions resolve via frictionless flow when the merchant sends complete, high-quality data.

Challenge flow: The risk engine flags elevated risk. The cardholder is challenged via OTP, biometric, or banking app push notification. On successful completion, the liability shifts to the issuer for unauthorized-transaction disputes.

The abandonment problem of 3DS1 largely disappears. The liability shift remains.

The regulatory spread has accelerated. PSD2 SCA in the EEA mandates it. Japan made 3DS2 compulsory for all card transactions from April 2025. France tightened further in March 2025, requiring issuers to soft-decline exemption requests not submitted via EMV 3DS. The direction is unambiguous: 3DS2 is becoming mandatory infrastructure, not optional risk management.

One detail most PMs miss: successful 3DS2 authentication, whether frictionless or challenged, shifts liability only on unauthorized-transaction disputes. It does not protect against "item not received," "item not as described," or "service not rendered" disputes. Those remain merchant liability regardless of authentication status. Friendly fraud, the 75% category, is largely unaffected by 3DS2.

Authentication solves for unauthorized use. It does not solve for customers who lie about it afterward.

The Monitoring Programs You Cannot Afford to Trigger

In April 2025, Visa consolidated its two legacy monitoring programs into a single unified framework: the Visa Acquirer Monitoring Program, or VAMP. Enforcement began October 1, 2025.

The formula changed in a way that catches many merchants by surprise:

VAMP Ratio = (TC40 Fraud Reports + TC15 Chargebacks) / Total Settled Transactions

TC40 is Visa's internal fraud reporting mechanism. When an issuer's fraud team identifies a transaction as fraudulent, they file a TC40 report — even before the cardholder formally disputes. Under the old dual-program structure, a fraudulent transaction was counted once. Under VAMP, a single fraudulent transaction that generates both a TC40 report and a subsequent chargeback counts twice against the ratio.

Merchants in the "Excessive" tier — a VAMP ratio above 2.2% combined with 1,500 or more dispute events per month — face $8 per-dispute fees, processor-imposed account reserves of 5 to 20% of monthly volume, processing restrictions, and at the extreme, termination of the ability to accept Visa. That threshold drops to 1.5% in the US, Canada, and EU from April 1, 2026.
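The double counting is easy to see in code. The following is a minimal sketch that assumes the formula and the Excessive-tier thresholds exactly as stated above. `MonthlyCounts`, `vamp_ratio`, and `is_excessive` are hypothetical names, and Visa's actual program rules contain more conditions than this.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCounts:
    tc40_reports: int          # issuer-filed fraud reports
    tc15_chargebacks: int      # formal disputes
    settled_transactions: int  # denominator for the month


def vamp_ratio(counts: MonthlyCounts) -> float:
    """(TC40 + TC15) / settled transactions, per the formula in the text."""
    if counts.settled_transactions == 0:
        return 0.0
    events = counts.tc40_reports + counts.tc15_chargebacks
    return events / counts.settled_transactions


def is_excessive(counts: MonthlyCounts,
                 ratio_threshold: float = 0.022,
                 min_events: int = 1500) -> bool:
    """Excessive tier as described above: ratio over 2.2% AND 1,500+ events."""
    events = counts.tc40_reports + counts.tc15_chargebacks
    return events >= min_events and vamp_ratio(counts) > ratio_threshold


# Illustrative month: 1,200 fraudulent transactions out of 100,000 settled.
both_counted = MonthlyCounts(tc40_reports=1200, tc15_chargebacks=1200,
                             settled_transactions=100_000)
tc40_only = MonthlyCounts(tc40_reports=1200, tc15_chargebacks=0,
                          settled_transactions=100_000)
```

In this illustrative month, the same 1,200 fraudulent transactions produce a 2.4% ratio when each one generates both a TC40 and a chargeback, and a 1.2% ratio when only the TC40 is filed. The first case is in the Excessive tier; the second is not.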

Mastercard's equivalent framework is more aggressive. Their Excessive Chargeback Program flags merchants with as few as 100 disputes per month at a 1.5% ratio. The MATCH list — Member Alert to Control High-Risk Merchants — is where merchants whose processing has been terminated get placed, and a MATCH listing follows a company across acquiring banks for up to five years.

For a growth-stage company, losing card acceptance is not an operational inconvenience.

It is an existential event.

The critical operational implication: these ratios are calculated monthly, not quarterly. A single bad month following an international market launch can push a merchant into a monitored tier. The monitoring programs have no memory of prior good behavior. They respond to the current month.

Model the worst-case chargeback scenario for each new market before enabling it.

Not sixty days after.

How the Best Companies Built Their Stacks

Understanding why the fraud problem exists is the first half. Building the infrastructure to address it is the second. Stripe, Adyen, Uber, and Airbnb each arrived at meaningfully different architectures — and each difference reflects a genuine product decision, not just a technology preference.

Stripe: The Fraud Moat No Single Merchant Can Replicate

Stripe's fraud product, Radar, is architecturally differentiated from single-merchant fraud tools in one respect that matters more than any specific algorithm.

The data it trains on is the entire Stripe network.

Stripe processes payments for hundreds of thousands of merchants across 197 countries. Every card that transacts on the network is visible to Radar's models. Critically, 90% of cards have been seen across multiple Stripe merchants — which means Radar can establish cross-merchant behavioral baselines. If a card has been used cleanly at fifteen merchants over two years, that is a signal no individual merchant's fraud model can see. If the same card suddenly appears with an unusual order pattern at a new merchant, the network-level deviation is detectable in ways that isolated models cannot match.

This is not a feature. It is a compounding structural moat.

Stripe disclosed that the original Radar model combined a deep neural network with XGBoost as an ensemble. The XGBoost component provided measurable lift in recall — fraud detection rate. But it created constraints: slow retraining cycles, incompatibility with transfer learning and embeddings, limits on parallelization. Simply removing XGBoost would have caused a 1.5% drop in recall, meaning 1.5% more fraud passing through undetected. At Stripe's scale, that represents hundreds of millions of dollars annually. Their solution was to replace XGBoost with an architecture inspired by ResNeXt — a multi-branch residual network from computer vision — which replicated the ensemble behavior while enabling faster retraining. Radar now retrains every 48 hours.

The PM decision worth studying is not the architecture. It is the framing.

Stripe explicitly treats fraud prevention as a revenue optimization problem, not a loss mitigation problem. False positives (legitimate transactions that get blocked) are tracked as carefully as false negatives. The 2025 Payments Intelligence Suite delivered a reduction of more than 30% in fraud on eligible transactions, and Stripe separately recovered more than $6 billion in wrongly declined legitimate transactions in 2024 via improved retry logic. Stripe counts recovered legitimate revenue as a success metric alongside fraud losses prevented.

Most fraud teams at product companies do not track the cost of their false positives. Stripe makes it a headline number.

If your fraud vendor is only showing you fraud rates and chargeback counts, you are seeing half the picture. The authorization rate impact of your fraud configuration — the legitimate orders you are blocking — is the other half.

Adyen: Configurable Risk with Dynamic Routing

Adyen's approach makes a core bet that network-level models cannot fully substitute for merchant-specific configuration. Their risk engine provides ML models trained on global transaction data, but the architecture is explicitly built for merchants to layer custom rules on top.

ShopperDNA, Adyen's proprietary identity layer, attempts to solve the false positive problem for returning customers. It combines device fingerprinting, algorithmic matching, and behavioral analytics to build a persistent shopper profile across sessions and devices. A returning legitimate customer on a new device should clear faster than a first-time actor with the same device fingerprint. ShopperDNA is designed to make that distinction reliably.

Their Dynamic 3DS product is one of the cleaner PM-level implementations of risk-based authentication routing in production. Rather than applying 3DS universally or not at all, the system routes each transaction through 3DS based on the real-time risk score. Low-risk transactions proceed without authentication. Medium-risk triggers 3DS frictionless. High-risk triggers challenge or block. The thresholds are configurable.

One implementation detail Adyen makes explicit in their documentation: their risk engine makes better decisions when merchants send richer data in the payment request. Risk rules requiring fields not present in the API call simply cannot fire. This is the most common source of underperformance in Adyen implementations — merchants using partial field sets, omitting device data or shipping address, and then receiving floods of step-up challenges and blocked legitimate orders because the engine is operating blind.
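The failure mode is easier to see in a toy rule engine. The following is a minimal sketch, not Adyen's API: `Rule`, `evaluate`, the rule names, the field names, and the scores are all hypothetical. It only illustrates how a rule whose inputs are missing from the payment request silently contributes nothing.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    name: str
    required_fields: tuple[str, ...]
    predicate: Callable[[dict[str, Any]], bool]
    score: int


def evaluate(rules: list[Rule], payment: dict[str, Any]) -> tuple[int, list[str]]:
    """Return (total risk score, names of rules that could not fire)."""
    total = 0
    skipped = []
    for rule in rules:
        missing = [f for f in rule.required_fields if payment.get(f) is None]
        if missing:
            skipped.append(rule.name)  # the engine is blind to this signal
            continue
        if rule.predicate(payment):
            total += rule.score
    return total, skipped


RULES = [
    Rule("shipping_billing_mismatch",
         ("shipping_country", "billing_country"),
         lambda p: p["shipping_country"] != p["billing_country"], 30),
    Rule("new_device",
         ("device_id", "known_device_ids"),
         lambda p: p["device_id"] not in p["known_device_ids"], 20),
]

full_request = {"shipping_country": "US", "billing_country": "IN",
                "device_id": "d-42", "known_device_ids": {"d-1", "d-2"}}
partial_request = {"billing_country": "IN"}
```

Evaluating `full_request` lets both rules fire. Evaluating `partial_request` returns a score of zero with both rules listed as skipped, which looks like a clean transaction but is really an unevaluated one. Logging the skipped list per request is one way to measure how blind the engine is.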

Adyen's 2025 Uplift product added experimentation to the risk configuration layer. Fraud rule changes are historically high-risk decisions: tightening rules reduces fraud but may suppress legitimate orders. Uplift allows backtesting against historical transaction data before production deployment. It converts what was previously a judgment call into a data-backed decision.

Fraud configuration is not a one-time setup. It is a recurring product surface that requires the same experimentation discipline as any other feature.

Uber: Intercepting Fraud Before It Becomes a Chargeback

The standard approach to chargebacks is reactive: dispute arrives, evidence is gathered, rebuttal is submitted. Uber built infrastructure to intercept the signal that precedes the chargeback.

TC40 reports arrive at Uber from external vendors via SFTP, third-party APIs, and webhooks. Their integration system validates and normalizes the signals into a common format, publishes them to an Apache Kafka topic for real-time processing, and writes to Apache Hive for offline analysis. The streaming pipeline consuming the Kafka topic maps the external TC40 identifier to Uber's internal transaction ID, applies risk decision logic, and where appropriate triggers a verification challenge on the rider's account — a "penny drop" transaction requiring card ownership confirmation.
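The normalize, map, and decide steps can be sketched without any streaming infrastructure. This is a simplified illustration, not Uber's code: the vendor names, raw field names (`arn`, `amount_cents`, `reference`, and so on), and the action labels are hypothetical, and in-memory dictionaries stand in for the Kafka topic and the ID lookup service.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class FraudSignal:
    """Common format that every inbound fraud report is normalized into."""
    external_id: str
    amount_minor: int  # amount in minor units, e.g. cents
    currency: str
    source: str


def normalize(raw: dict, source: str) -> FraudSignal:
    """Map a vendor-specific record to the common format."""
    if source == "vendor_a":
        return FraudSignal(raw["arn"], int(raw["amount_cents"]),
                           raw["ccy"].upper(), source)
    if source == "vendor_b":
        # Assumes a two-decimal currency; Decimal avoids float rounding.
        minor = int(Decimal(raw["amount"]) * 100)
        return FraudSignal(raw["reference"], minor,
                           raw["currency"].upper(), source)
    raise ValueError(f"unknown source: {source}")


def decide(signal: FraudSignal, id_index: dict, already_refunded: set) -> str:
    """Map the external ID to an internal transaction and pick an action."""
    internal_id = id_index.get(signal.external_id)
    if internal_id is None:
        return "unmatched"          # park for offline reconciliation
    if internal_id in already_refunded:
        return "no_action"          # nothing left to dispute
    return "challenge_account"      # e.g. trigger card ownership verification
```

The value of the common format is that the decision logic is written once, regardless of how many vendors or transports feed it. Unmatched signals are worth tracking as their own metric, since a rising unmatched rate usually means the ID mapping has drifted.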

The goal is not to respond to fraud after it happens. The goal is to intervene before the chargeback is formally raised.

Under VAMP's combined ratio formula, preventing a chargeback from being raised means only one count against your ratio instead of two — the TC40 report that already exists, without the subsequent TC15 chargeback. The pre-dispute intervention is not just operationally cheaper. It is mathematically better for your ratio.

For most product teams, TC40 integration is the highest-leverage fraud infrastructure investment that does not get built until the team is already in trouble. It should be built before the volume is there to justify it.

Airbnb: When the Cost Function Changes Everything

Airbnb absorbs all chargeback costs and does not pass them to hosts. Every fraudulent booking is a direct loss to the company. This constraint produced a fraud approach that is more honestly constructed than most.

Their engineering team published what they call the targeted friction framework. The logic is simple to state and genuinely hard to implement correctly.

When a fraud model outputs a probability score, the naive approach is to set a threshold and block everything above it. The problem: every blocked transaction that is actually a legitimate booking is lost revenue, and possibly a good customer lost permanently. Friction (additional verification, document upload, manual review) is cheaper than outright blocking, but friction loses some good customers too.

Airbnb formalized this as a cost function.

False negative cost: The full payment amount of the fraudulent booking, plus processor fees, plus the elevated card decline rates that accumulate on the merchant's profile from fraud history.

False positive cost: Lost booking revenue, multiplied by the probability that the legitimate customer never books again after encountering friction.

The model threshold is set at the point where the expected cost of applying friction to a borderline booking equals the expected cost of not applying it. This is not a fraud problem. It is a decision theory problem applied to a product surface.
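The break-even point can be worked out directly. The following is a minimal sketch of the idea, not Airbnb's published model. It makes two simplifying assumptions: friction stops every fraudulent booking, and the only cost of friction is the revenue from good customers who are lost because of it. All function and parameter names are hypothetical.

```python
def expected_cost_no_friction(p_fraud: float, fraud_loss: float) -> float:
    """Expected false-negative cost if the booking is let through untouched."""
    return p_fraud * fraud_loss


def expected_cost_friction(p_fraud: float, booking_revenue: float,
                           p_lost_after_friction: float) -> float:
    """Expected false-positive cost if friction is applied to the booking."""
    return (1 - p_fraud) * booking_revenue * p_lost_after_friction


def break_even_probability(fraud_loss: float, booking_revenue: float,
                           p_lost_after_friction: float) -> float:
    """Fraud probability at which both expected costs are equal.

    Solving p * L = (1 - p) * C for p gives p = C / (L + C),
    where L is the fraud loss and C is the good-customer loss.
    """
    good_customer_loss = booking_revenue * p_lost_after_friction
    return good_customer_loss / (fraud_loss + good_customer_loss)


def should_apply_friction(p_fraud: float, fraud_loss: float,
                          booking_revenue: float,
                          p_lost_after_friction: float) -> bool:
    threshold = break_even_probability(fraud_loss, booking_revenue,
                                       p_lost_after_friction)
    return p_fraud > threshold
```

With illustrative inputs of a 520 fraud loss, 75 of booking revenue, and a 20% chance of losing a good customer to friction, the break-even fraud probability is 15 / 535, a little under 3%. The point of the exercise is that the threshold moves whenever any of the three inputs moves, so it should be recomputed per market and per booking size rather than set once.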

The lesson: the threshold you set on your fraud model is a product decision with measurable revenue consequences on both sides. Track both the fraud rate and the false positive rate. Optimize the threshold against the combined cost function, not just the fraud loss column.

Riskified: What Aligned Incentives Actually Look Like

Riskified makes visible an incentive problem that exists in most fraud vendor relationships.

When Stripe or Adyen provides a risk score, the merchant decides whether to approve or decline. If a transaction the model approves turns out fraudulent, the merchant bears the loss. The model provider does not. This creates structural misalignment: the vendor is motivated to minimize false positives — which merchants complain about loudly — but has no skin in the game on false negatives.

Riskified's chargeback guarantee removes this misalignment entirely. They make a binary approve/decline decision. If they approve a transaction that results in a chargeback, Riskified reimburses the merchant. Their revenue comes from a fee on approved GMV. They make money only on correctly approved legitimate transactions. They lose money on every fraudulent approval.

Both sides of the tradeoff are in their P&L.

The structural insight for PMs evaluating fraud vendors: ask whether your vendor's incentives are aligned with yours on both sides. A vendor paid per transaction reviewed has different incentives than one absorbing the cost of incorrect approvals.

The Infrastructure Layer Most Teams Skip

Network tokenization is the highest-leverage fraud infrastructure investment most product teams deprioritize because it does not feel like fraud prevention. It feels like plumbing.

It is both.

When a merchant stores a card on file — for subscriptions, one-click checkout, saved payment methods — they typically store a static PAN, the 16-digit card number. Static PANs do not change unless the card is replaced. They can be stolen and replayed. They expire when the card expires, causing involuntary churn on recurring billing.

Network tokens replace the PAN with a dynamic, merchant-specific token issued by the card network's Token Service Provider (Visa Token Service, Mastercard MDES). For each transaction using the token, a unique cryptogram is generated: a cryptographic value tied to that specific transaction, that specific merchant, and that specific token, in real time. It expires immediately after use.

A stolen network token is worthless outside its merchant-network context. An intercepted authorization message cannot be replayed.
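The two properties (binding to one merchant and one transaction, and single use) can be illustrated with a toy construction. This is explicitly not the card networks' cryptogram scheme, which is generated by the Token Service Provider under EMV specifications. The sketch below substitutes a plain HMAC over hypothetical fields, with a hypothetical `ToyVerifier` that remembers used nonces, purely to show why a replayed or re-targeted message fails.

```python
import hashlib
import hmac


def make_cryptogram(key: bytes, token: str, merchant_id: str,
                    amount_minor: int, nonce: str) -> str:
    """Toy cryptogram: an HMAC binding token, merchant, amount, and nonce."""
    message = "|".join([token, merchant_id, str(amount_minor), nonce]).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


class ToyVerifier:
    """Stands in for the network side; rejects replays and tampered fields."""

    def __init__(self, key: bytes):
        self._key = key
        self._used_nonces = set()

    def verify(self, token: str, merchant_id: str, amount_minor: int,
               nonce: str, cryptogram: str) -> bool:
        if nonce in self._used_nonces:
            return False  # same message seen before: replay
        expected = make_cryptogram(self._key, token, merchant_id,
                                   amount_minor, nonce)
        if not hmac.compare_digest(expected, cryptogram):
            return False  # fields do not match what was signed
        self._used_nonces.add(nonce)
        return True
```

A cryptogram produced for one merchant and amount verifies once. Presenting it a second time fails on the nonce check, and presenting it with a different merchant ID fails on the HMAC comparison. That is the behavior the paragraph above describes, reduced to its smallest form.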

Visa's published data: token-based transactions produce an average 30% reduction in CNP fraud versus PAN-based transactions, and a 4 to 6% uplift in authorization rates. In 2024, Visa saw a 44% year-over-year surge in tokenized transaction volume. The authorization rate improvement compounds over time for subscription businesses: higher approvals on recurring billing, elimination of card-expiry-driven churn, and a cleaner fraud profile that reduces issuer-side suppression of future authorizations.

Visa has stated that 50% of digital card transactions are now tokenized and the remaining 50% — concentrated in guest checkout and form-filled transactions — is an explicit target. Acquirer integrity fees for non-tokenized transactions are already live in certain markets.

This is infrastructure the ecosystem is moving toward regardless of individual merchant preference. Moving early is an authorization rate advantage. Moving late is an increasing cost disadvantage.

What PMs Actually Need to Decide

The decisions below are the ones that separate teams managing fraud well from teams that discover the problem at month three.

Data collection before vendor selection. All fraud decisioning downstream is bounded by the quality of data flowing into it. Before choosing a vendor or configuring a rule, collect and pass: device fingerprint; IP address with ASN lookup and VPN/proxy detection; behavioral signals on checkout, such as time-on-page and copy-paste detection on card entry fields (humans type card numbers with natural pauses between groups; bots paste the full PAN in a single event); prior transaction history; account age; and AVS response codes. For 3DS2 specifically, send the full EMVCo-recommended optional field set. Every optional field omitted is a signal the issuer's ACS cannot use, which directly increases step-up challenge rates and abandonment.
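The card-entry behavioral signal is one of the cheaper ones to derive. The following is a minimal sketch that assumes the checkout front end reports each input event on the card field as a `(timestamp_ms, chars_added)` pair. The function names and both thresholds are hypothetical starting points, not calibrated values.

```python
from statistics import median


def looks_pasted(events, min_paste_chars=12):
    """True if any single input event added most of a card number at once."""
    return any(chars >= min_paste_chars for _, chars in events)


def median_gap_ms(events):
    """Median time between consecutive input events, or None if fewer than two."""
    timestamps = [t for t, _ in events]
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return median(gaps) if gaps else None


def entry_signal(events, min_human_gap_ms=30.0):
    """Classify card entry as 'paste', 'scripted_typing', or 'typed'."""
    if looks_pasted(events):
        return "paste"
    gap = median_gap_ms(events)
    if gap is not None and gap < min_human_gap_ms:
        return "scripted_typing"  # keystrokes faster than a person types
    return "typed"
```

This should feed a risk score rather than block on its own. Browser autofill and password managers also fill the field in a single event, so a paste signal from a returning customer on a known device means something very different from the same signal on a first-time session.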

3DS routing is a named decision, not a default. A mature 3DS implementation has at least four paths: no 3DS for low-risk returning customers and merchant-initiated transactions; 3DS2 frictionless for medium-low risk where liability shift is desired; 3DS2 challenge for elevated risk and SCA-mandatory flows; and decline for transactions above the threshold where authentication is insufficient. This routing logic should be documented, versioned, and reviewed quarterly. A configuration optimized for Q4 peak season is probably wrong for Q1.
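Written down as code, the four paths look something like the sketch below. It assumes a risk score between 0 and 1 from whatever model or vendor is in use. `RoutingConfig`, `Route`, the threshold values, and the order of the checks are all hypothetical. Note also that a merchant can only request a frictionless or challenge preference; the issuer's ACS makes the final call.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    NO_3DS = "no_3ds"
    FRICTIONLESS = "3ds2_frictionless_preferred"
    CHALLENGE = "3ds2_challenge_requested"
    DECLINE = "decline"


@dataclass(frozen=True)
class RoutingConfig:
    version: str          # routing rules are versioned and reviewed
    no_3ds_below: float   # below this, trusted traffic skips 3DS
    challenge_at: float   # at or above this, request a challenge
    decline_at: float     # at or above this, authentication is not enough


def route(risk_score: float, sca_required: bool, merchant_initiated: bool,
          returning_customer: bool, cfg: RoutingConfig) -> Route:
    if risk_score >= cfg.decline_at:
        return Route.DECLINE
    if merchant_initiated:
        return Route.NO_3DS       # cardholder is not present to authenticate
    if sca_required or risk_score >= cfg.challenge_at:
        return Route.CHALLENGE
    if returning_customer and risk_score < cfg.no_3ds_below:
        return Route.NO_3DS
    return Route.FRICTIONLESS


EXAMPLE_CFG = RoutingConfig(version="example-q1", no_3ds_below=0.2,
                            challenge_at=0.6, decline_at=0.9)
```

Keeping the thresholds in a frozen, versioned config object rather than scattered through the code is what makes the quarterly review practical: the diff between two versions is the whole change.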

Pre-dispute infrastructure before post-dispute response. Integrate Verifi (Visa's alert network) and Ethoca (Mastercard's) to receive early dispute signals before chargebacks are formally raised. If you refund within the alert window — typically 24 to 72 hours — the chargeback may be prevented entirely, removing it from your VAMP count. Build TC40 signal ingestion as Uber did if your volume justifies it. Act on the signal before it becomes a formal count against your ratio.

Dispute response is a winnable game that most teams forfeit. Industry win rates on contested chargebacks range from 20 to 40% for merchants with basic evidence, to 50 to 65% for merchants with comprehensive, organized evidence packages. The gap represents real dollar recovery. Strong evidence on unauthorized-transaction chargebacks: 3DS authentication record with CAVV value, device fingerprint and IP log, prior transaction history with the cardholder, order confirmation with timestamp, AVS and CVV match records. Assign ownership. Track the 30-day response window as an SLA. Missing the window is an automatic loss regardless of merit.

The VAMP ratio is a monthly number that should be tracked daily. Build a daily dashboard of month-to-date VAMP ratio segmented by acquirer MID, market, and payment method. Set internal alert thresholds at 1.5%, which is below the current 2.2% enforcement threshold in most markets and equal to the tightened 2026 threshold for the US, Canada, and EU; in those markets, an alert set at the enforcement line leaves no margin, so it will need to move lower once the tighter threshold applies. Set an enumeration alert at 150,000 combined failed authorization attempts per month (half the VAMP enumeration trigger) to detect carding attacks during the testing phase rather than after they scale.
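The aggregation behind such a dashboard is small. The following is a minimal sketch that assumes one row per segment per day, with hypothetical field names (`mid`, `market`, `payment_method`, `fraud_reports`, `chargebacks`, `settled`, `failed_auths`). The two alert constants are the internal thresholds suggested above, not Visa's enforcement values.

```python
from collections import defaultdict

RATIO_ALERT = 0.015          # internal alert on month-to-date VAMP ratio
ENUMERATION_ALERT = 150_000  # internal alert on failed authorization attempts


def month_to_date_alerts(daily_rows):
    """Aggregate daily rows per segment and return (segment, alert_type) pairs."""
    totals = defaultdict(lambda: {"events": 0, "settled": 0, "failed_auths": 0})
    for row in daily_rows:
        segment = (row["mid"], row["market"], row["payment_method"])
        totals[segment]["events"] += row["fraud_reports"] + row["chargebacks"]
        totals[segment]["settled"] += row["settled"]
        totals[segment]["failed_auths"] += row["failed_auths"]

    alerts = []
    for segment in sorted(totals):
        t = totals[segment]
        ratio = t["events"] / t["settled"] if t["settled"] else 0.0
        if ratio >= RATIO_ALERT:
            alerts.append((segment, "vamp_ratio"))
        if t["failed_auths"] >= ENUMERATION_ALERT:
            alerts.append((segment, "enumeration"))
    return alerts
```

Segmenting before computing the ratio matters because a healthy blended number can hide one market or one MID that is already over the line. Early in the month the ratio is noisy on small denominators, so pairing it with a minimum event count is a sensible refinement.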

The Thing Nobody Says

There is a conversation that happens inside every product and finance team scaling internationally, usually around month three, when the fraud numbers become impossible to ignore. Someone asks how much fraud is acceptable.

The honest answer is: some level is the cost of operating in open payment systems.

The goal is not zero fraud. The goal is a VAMP ratio that stays below enforcement thresholds, a unit economics model that accounts for the fraud that does occur, and the infrastructure to contest the chargebacks that do arrive.

The merchants who end up in monitoring programs are not the ones who understood the risks and accepted them. They are the ones who genuinely did not know the risks existed.

Because for three years in India, they did not.

The regulatory floor had solved the problem for them. The OTP had been doing the work quietly, invisibly, every time a card was charged. It was not a feature on a roadmap. It was not a product decision. It was the ambient infrastructure of the ecosystem.

Going international means building what the RBI built for you.

Understanding that the floor is not universal is the beginning of building real fraud infrastructure.

Everything else is implementation.