Why OpenAI’s Model is the Standard
The AI industry faces unique challenges that traditional SaaS billing doesn’t always address. OpenAI’s model solves several of these problems simultaneously.
- Predictable Revenue and Low Risk: By requiring prepaid credits for API usage, OpenAI eliminates the risk of users running up massive bills they can’t pay. You get the money upfront, and the user gets the service as they use it.
- Scalability for Developers: A $5 top-up is a low barrier to entry. As their application grows, developers can automate top-ups or buy larger packs. The friction to start is almost zero, but the ceiling for growth is unlimited.
- User Psychology: Denominating credits in fiat currency (USD) instead of abstract “tokens” or “points” makes the value clear. It feels like a bank account for AI services, which builds trust and makes budgeting easier for companies.
How OpenAI Bills
OpenAI operates two distinct billing models that cater to different user needs.
- API (Pay-as-you-go): The API uses prepaid fiat-denominated credits. Users top up their accounts with $5, $10, $50, or more. These credits show a dollar value but have no monetary value outside OpenAI. OpenAI bills per-token with different rates for input and output tokens. Credits never expire, and when a user’s balance hits $0, their API calls fail immediately.
- ChatGPT Plus, Team, and Enterprise: These are flat-rate subscriptions. ChatGPT Plus costs $20 per month, while the Team plan is $25 per user per month. These plans have soft usage caps where users get downgraded to a smaller model instead of being blocked.
- Spend-based rate tiers: As you spend more total money over time, you unlock higher API rate limits. This is a trust-based access scaling system tied directly to your billing history.
| Model | Pricing | Input Tokens | Output Tokens |
|---|---|---|---|
| GPT-4o | Usage-based | $2.50 / 1M | $10.00 / 1M |
| GPT-4o-mini | Usage-based | $0.15 / 1M | $0.60 / 1M |
| o1 | Usage-based | $15.00 / 1M | $60.00 / 1M |

| Plan | Price | Type |
|---|---|---|
| Free | $0 | Limited access |
| Plus | $20 / mo | Subscription with soft caps |
| Team | $25 / user / mo | Per-seat subscription |
| Enterprise | Custom | Invoiced billing |
What Makes It Unique
OpenAI’s billing strategy has several key characteristics that make it effective for AI services.
- Fiat-denominated credits: Credits feel like money because they’re denominated in USD. This makes pricing transparent and easy to understand for developers.
- No expiry: Never-expiring balances reduce the “use it or lose it” pressure. Users feel comfortable topping up larger amounts because they know the value won’t disappear.
- Multi-dimensional metering: Input and output tokens are tracked separately but deduct from the same credit balance. This allows OpenAI to price expensive output tokens differently from cheaper input tokens.
- Trust tiers: Linking rate limits to total spend encourages users to stay on the platform and rewards long-term customers with better performance.
Strategic Advantages
This model creates a powerful flywheel. Low entry costs bring in developers. Prepaid credits provide immediate cash flow. Usage-based scaling ensures that as the developers succeed, OpenAI succeeds. The subscription side provides a steady, predictable baseline of revenue from non-developers.
Build This with Dodo Payments
You can replicate OpenAI’s billing model using Dodo Payments. We’ll use Credit-Based Billing for the API and standard subscriptions for the ChatGPT Plus side.
Create a Fiat Credit Entitlement
Start by creating a credit entitlement in your Dodo Payments dashboard. This will act as the central balance for your users.
- Credit Type: Fiat Credits (USD)
- Credit Expiry: Never
- Rollover: Not needed (since they never expire)
- Overage: Disabled
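If you prefer to provision this programmatically, the same configuration can be expressed as an API payload. The field names below are illustrative assumptions, not the exact Dodo Payments schema, so check the dashboard or API reference for the real shape:

```python
import json

# Illustrative entitlement configuration. Field names are assumptions,
# not the exact Dodo Payments API schema.
entitlement = {
    "name": "API Credits",
    "credit_type": "fiat",   # fiat credits denominated in USD
    "currency": "USD",
    "expiry": None,          # credits never expire
    "rollover": False,       # unnecessary when credits never expire
    "overage": False,        # hard stop at $0, like OpenAI's API
}

print(json.dumps(entitlement, indent=2))
```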
Create Top-Up Products
Create one-time payment products for different credit packs. You might offer $5, $10, $50, and $100 options. Attach your fiat credit entitlement to each product. Set the credits issued per product in cents. For a $50 pack, you’ll issue 5,000 credits.
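Since credits are denominated in cents, the mapping from pack price to issued credits is a simple conversion. A quick sanity check:

```python
def credits_for_pack(price_usd: float) -> int:
    """Convert a top-up pack's USD price into credits (1 credit = 1 cent)."""
    return round(price_usd * 100)

for price in (5, 10, 50, 100):
    print(f"${price} pack -> {credits_for_pack(price)} credits")
# The $50 pack issues 5000 credits, matching the text above.
```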
Create Usage Meters
Create two separate meters to track token usage.
- `llm.input_tokens`: Sum aggregation on the `tokens` property.
- `llm.output_tokens`: Sum aggregation on the `tokens` property.
Calculating Meter Units per Credit
To match OpenAI’s GPT-4o pricing ($2.50 per 1M input tokens), you need to calculate how many tokens equal $1 (100 cents).
- Input Tokens: 1,000,000 tokens / $2.50 = 400,000 tokens per $1.
- Output Tokens: 1,000,000 tokens / $10.00 = 100,000 tokens per $1.
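These conversions can be encoded directly. The sketch below computes the credit (cent) cost of a single request at the GPT-4o rates from the table above, which is the amount your meters should deduct:

```python
# GPT-4o list prices in USD per 1M tokens (from the pricing table above).
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

def tokens_per_dollar(price_per_million: float) -> float:
    """How many tokens one dollar buys at a given per-1M price."""
    return 1_000_000 / price_per_million

def request_cost_cents(input_tokens: int, output_tokens: int) -> float:
    """Credit cost of one request, where 1 credit = 1 cent."""
    cost_usd = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
             + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return cost_usd * 100

print(tokens_per_dollar(INPUT_PRICE_PER_M))   # 400000.0 input tokens per $1
print(tokens_per_dollar(OUTPUT_PRICE_PER_M))  # 100000.0 output tokens per $1
print(request_cost_cents(1000, 500))          # 0.75 cents
```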
Send Usage Events
After each LLM request, send the usage data to Dodo Payments. You can send both input and output events in a single request.
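A sketch of the event payload for one request, assuming meters that aggregate on a `tokens` property. The event shape and the ingestion call in the comment are illustrative, not the exact Dodo Payments API, so verify the schema against the event ingestion docs:

```python
def build_usage_events(customer_id: str, input_tokens: int,
                       output_tokens: int) -> list[dict]:
    """Build one input-token and one output-token event for a single LLM call."""
    return [
        {"customer_id": customer_id, "event_name": "llm.input_tokens",
         "metadata": {"tokens": input_tokens}},
        {"customer_id": customer_id, "event_name": "llm.output_tokens",
         "metadata": {"tokens": output_tokens}},
    ]

events = build_usage_events("cus_123", input_tokens=1000, output_tokens=500)
# In production you would POST both events in one request, e.g.:
#   requests.post(INGEST_URL, json={"events": events}, headers=auth_headers)
print(events)
```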
Handle Balance Depletion
You should check the user’s balance before processing an API request. If the balance is zero or negative, return a 402 error.
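In a web framework this becomes a pre-request guard. A minimal sketch with a stubbed balance lookup (in production the lookup would hit the Dodo Payments balance API or a local cache):

```python
# Stub: in production, fetch from Dodo Payments or a cached copy.
balances_cents = {"cus_123": 250, "cus_456": 0}

def check_balance(customer_id: str) -> tuple[int, dict]:
    """Return an HTTP status and body: 402 when the prepaid balance is gone."""
    balance = balances_cents.get(customer_id, 0)
    if balance <= 0:
        return 402, {"error": "insufficient_credits",
                     "message": "Top up your balance to continue."}
    return 200, {"balance_cents": balance}

print(check_balance("cus_123"))  # (200, {'balance_cents': 250})
print(check_balance("cus_456"))  # (402, {'error': 'insufficient_credits', ...})
```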
Handling Low Balance Webhooks
Don’t wait until the user hits $0 to notify them. Use webhooks to trigger an email or in-app notification when their balance drops below a certain threshold.
Build the ChatGPT Subscription Side (Optional)
If you want to offer a subscription plan like ChatGPT Plus, create a separate subscription product in Dodo Payments. These don’t need credit entitlements. For a Team plan, use seat-based billing by adding add-ons for each additional user.
Implementing Soft Caps
To replicate OpenAI’s soft caps, you can track usage for your subscription users using the same meters but without linking them to a credit entitlement. In your application logic, check the usage for the current billing period.
Accelerate with the LLM Ingestion Blueprint
The steps above show how to manually construct and send usage events. For production deployments, the LLM Ingestion Blueprint provides automatic token tracking that wraps your OpenAI client directly. It extracts `inputTokens`, `outputTokens`, and `totalTokens` from every API response and sends them as event metadata. Configure your meter to aggregate on the appropriate token property.
Implementing Spend-Based Rate Tiers
OpenAI’s rate tiers are a powerful way to manage capacity. You can implement this by tracking the total lifetime spend of a customer.
- Track Lifetime Spend: Listen for `payment.succeeded` webhooks and update a `total_spend` field in your database for that customer.
- Define Tiers: Create a mapping of spend amounts to rate limits.
- Tier 1: $0 - $50 spend -> 3 RPM
- Tier 2: $50 - $250 spend -> 10 RPM
- Tier 3: $250+ spend -> 50 RPM
- Enforce Limits: In your API middleware, check the customer’s tier and enforce the corresponding rate limit.
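The tier lookup itself is a small pure function. Pair it with a `total_spend` counter kept up to date from `payment.succeeded` webhooks (webhook wiring omitted here):

```python
# (tier threshold in cents, requests-per-minute limit), highest first.
TIERS = [(25_000, 50), (5_000, 10), (0, 3)]  # $250+, $50+, everyone else

def rate_limit_rpm(total_spend_cents: int) -> int:
    """Map a customer's lifetime spend to their RPM limit."""
    for threshold, rpm in TIERS:
        if total_spend_cents >= threshold:
            return rpm
    return TIERS[-1][1]

print(rate_limit_rpm(0))       # 3 RPM
print(rate_limit_rpm(6_000))   # $60 spent -> 10 RPM
print(rate_limit_rpm(30_000))  # $300 spent -> 50 RPM
```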
Full Implementation Example: The API Proxy
In a real-world scenario, you’ll likely have an API proxy that sits between your users and the LLM provider. This proxy handles authentication, credit checks, and usage reporting.
Handling Edge Cases
When building a billing system as complex as OpenAI’s, you’ll encounter several edge cases that need careful handling.
Race Conditions
If a user has a very low balance and sends multiple requests simultaneously, they might exceed their credit limit before the first event is processed. To prevent this, you can implement a small “buffer” or use a distributed lock on the customer’s balance during the request.
Event Ingestion Latency
Dodo Payments processes events asynchronously. This means there might be a slight delay between an API call and the credit deduction. For most use cases, this is acceptable. If you need strict real-time enforcement, you can maintain a local cache of the user’s balance and update it optimistically.
Refund Handling
If you refund a credit pack purchase, Dodo Payments will automatically handle the credit entitlement if configured. However, you should ensure your application logic reflects this change immediately to prevent users from using credits they no longer have.
Multi-Model Support
If you support multiple models with different pricing, you have two options:
- Separate Meters: Create separate meters for each model (e.g., `gpt-4o.input_tokens`, `gpt-4o-mini.input_tokens`).
- Weighted Events: Use a single meter but multiply the `tokens` value by a weight before sending it to Dodo. For example, if GPT-4o is 10x more expensive than GPT-4o-mini, you could send 10x the tokens for GPT-4o requests.
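The weighted-events option boils down to a small normalization step before sending each event. The weights below are derived from the input-token prices in the table above, using GPT-4o-mini as the baseline:

```python
# Price-derived weights relative to GPT-4o-mini (the cheapest model).
# GPT-4o is $2.50 vs $0.15 per 1M input tokens -> roughly 16.7x the cost.
MODEL_WEIGHTS = {
    "gpt-4o-mini": 1.0,
    "gpt-4o": 2.50 / 0.15,
}

def weighted_tokens(model: str, tokens: int) -> int:
    """Scale raw tokens so one shared meter can bill different models."""
    return round(tokens * MODEL_WEIGHTS[model])

print(weighted_tokens("gpt-4o-mini", 1000))  # 1000
print(weighted_tokens("gpt-4o", 1000))       # 16667
```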
Architecture Overview
The meters track tokens and deduct the corresponding value from the user’s credit balance based on your configured rates.
Conclusion
Replicating OpenAI’s billing model with Dodo Payments gives you the best of both worlds: the flexibility of usage-based billing and the predictability of prepaid credits. By following this guide, you can build a billing system that scales with your users while protecting your margins. Whether you’re building the next big LLM or a niche AI tool, these patterns will help you create a professional, developer-friendly experience. This approach ensures that your billing infrastructure is as scalable and reliable as the AI models you’re delivering to your customers.
Key Dodo Features Used
Explore the features that make this implementation possible.
Credit-Based Billing
Manage prepaid fiat credits and entitlements for your users.
Usage-Based Billing
Track granular usage like tokens and bill for it in real-time.
One-Time Payments
Sell credit packs and top-ups with a simple checkout flow.
Event Ingestion
Send high-volume usage data to Dodo Payments with ease.
Webhooks
Stay updated on credit balance changes and low balance alerts.
LLM Ingestion Blueprint
Automatic token tracking for OpenAI and other LLM providers.