AI Pricing Guide: Subscriptions, API Costs & Free Options (2025–2026)

Last updated: February 17, 2026

All prices in USD unless noted. Sources: official provider websites (OpenAI, Anthropic, Google, xAI, DeepSeek, Alibaba, Groq, etc.), OpenRouter, and reliable industry reports. Prices fluctuate — always verify on official sites before committing.

Note on data jurisdiction: Chinese-hosted models (DeepSeek, Qwen, Kimi, SiliconFlow, etc.) are extremely cheap but store data under PRC jurisdiction. Use only for non-sensitive work.

Index

1. Free Chat Interfaces
2. Paid Chat Subscriptions (Non-API)
- Individual Plans
- Team & Enterprise Plans
3. Free & Trial API Tiers
- Generous Free APIs for Building & Testing
- Free Stealth / Cloaked Models (OpenRouter)
4. Paid API Pricing (Per 1M Tokens)
5. Cheapest API Combos (Realistic Blended Cost)
6. API Cost Optimisation Strategies
- Multi-Model Router Strategy
7. Platform Fees — OpenRouter
8. Quick Recommendation Summary

1. Free Chat Interfaces

Free tiers provide entry-level access — perfect for casual users, testing, or getting started. Usage limits apply.

Service	Latest Models Available	Daily / Session Limits	Link	Hosting
Google Gemini	Gemini 3 Pro / Flash	Truly unlimited basic; 5 prompts/day with 2.5 Pro (then switches to Flash), 100 images/day, 5 Deep Research reports/month	gemini.google.com	US/EU
Microsoft Copilot	GPT-5 / Claude 4.6	Unlimited basic	copilot.microsoft.com	US/EU
ChatGPT	GPT-5o / GPT-4o mini	Usage-capped (resets every few hours); GPT-5 limited to ~10 messages/5 hours on free	chatgpt.com	US/EU
Claude.ai	Claude Sonnet 4.6	Message limits resetting ~every 5 hours	claude.ai	US/EU
Grok	Grok 4.1	Free for all X users; ~10 prompts/2 hrs	grok.x.ai	US
HuggingChat	Llama 4, Qwen3-Coder, Mistral, GLM-5	Truly unlimited	huggingface.co/chat	US/EU
Perplexity	Standard AI model	Unlimited Quick searches; 5 Pro searches/day; 3 file uploads/day	perplexity.ai	US/EU
DeepSeek Chat	DeepSeek V4 / R1	Very high (fair-use throttling only)	chat.deepseek.com	🇨🇳 Chinese
Kimi	Moonshot K2.5	High free tier	kimi.moonshot.cn	🇨🇳 Chinese

2. Paid Chat Subscriptions (Non-API)

Sorted by price, low to high. For same-price plans, most recent data listed first.

Individual Plans

Provider	Plan	Cost (USD/mo)	Models & Key Limits	Notes
OpenAI	Go (India-only)	~$4.55 (₹399)	GPT-5 access, expanded messaging vs. free	Regional pricing
Poe	Pro	$4.99	Access to 20+ models (GPT, Claude, Llama, etc.)	Best for model experimentation
xAI	X Premium (Grok)	$8 ($7/mo annual)	Higher Grok limits vs. free; blue checkmark	Web US pricing
Google	Gemini AI Pro (Advanced)	$19.99 (often $9.99 promo)	Gemini 3 Pro, 2M context; 100 prompts/day, 1,000 images/day, 3 videos/day, 20 Deep Research/day; includes 2TB Google One	Also called "AI Plus" in some regions
Anthropic	Claude Pro	$20 ($17/mo annual)	All models incl. Claude Opus 4.6 + Claude Code; at least 5× more usage than free; priority access. Weekly rate limits (since Aug 28, 2025)	Best reasoning & agents
OpenAI	ChatGPT Plus	$20	GPT-5 + o-series reasoning; up to 160 GPT-5 messages/3 hrs; up to 5× more messages than free; priority access	General + creative
Perplexity	Pro	$20 ($200/year)	300+ Pro searches/day; unlimited uploads; image generation; access to GPT-4.1, Claude 4.0 Sonnet; $5/mo API credit	Research with sources
xAI	SuperGrok	$30 ($300/year)	Standalone advanced features; 128K context	Confirmed in media, not on official docs
xAI	X Premium+	$40 ($229/year)	Ad-free X; highest Grok limits; Radar Search, Articles	Price increased from $22 in prior data
Anthropic	Max 5×	$100	5× usage vs. Pro; enhanced performance & priority	Heavy individual use
Anthropic	Max 20×	$200	20× usage vs. Pro; maximum priority	Power users requiring extensive AI
OpenAI	ChatGPT Pro	$200	Unlimited GPT-5 & Thinking mode; Sora video tools; highest usage limits	Unlimited frontier
Perplexity	Max	$200 ($2,000/year)	All Pro features; unlimited Labs; early access to new features (Comet AI browser); priority frontier models	Everything unlocked
Google	Gemini Ultra	$249.99	500 prompts/day, 1,000 images/day, 5 videos/day, 200 Deep Research/month	Heavy creators
xAI	SuperGrok Heavy	$300 ($3,000/year)	Multi-agent "Heavy" plan; highest rate limits	Confirmed in X posts
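
The table quotes annual billing in two ways: as an effective monthly rate (for example "$17/mo annual") or as a yearly total (for example "$200/year"). The following is a minimal sketch for comparing the two against 12 monthly payments. The function name and parameters are illustrative assumptions, and the figures are the ones listed in the table above.

```python
def annual_savings(monthly_price, annual_per_month=None, annual_total=None):
    """USD saved per year by choosing annual billing over 12 monthly payments.

    Pass exactly one of:
      annual_per_month -- effective monthly rate on an annual plan
      annual_total     -- yearly total on an annual plan
    """
    if (annual_per_month is None) == (annual_total is None):
        raise ValueError("pass exactly one of annual_per_month or annual_total")
    if annual_total is not None:
        yearly_on_annual = annual_total
    else:
        yearly_on_annual = annual_per_month * 12
    return monthly_price * 12 - yearly_on_annual


if __name__ == "__main__":
    # Claude Pro: $20 monthly, $17/mo when billed annually (from the table).
    print("Claude Pro:", annual_savings(20, annual_per_month=17))
    # Perplexity Pro: $20 monthly, $200/year (from the table).
    print("Perplexity Pro:", annual_savings(20, annual_total=200))
```

Normalising both styles to a yearly figure makes plans with different billing conventions directly comparable.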

Team & Enterprise Plans

Provider	Plan	Cost (USD/user/mo)	Key Features	Notes
Google	Workspace Business	$20 ($24 monthly billing)	Gemini in Gmail, Docs, Slides, Meet; enterprise security	Annual commitment
OpenAI	ChatGPT Business (Team)	$30 ($25/mo annual)	Min 2 users; collaborative workspace; admin controls; higher limits than Plus. June 2025 update: connectors to internal tools, security controls, record mode	Flexible pricing
Anthropic	Claude Team (Standard)	$30 ($25/mo annual)	Min 5 users; higher limits than Pro; team collaboration; centralized billing	Standard seats
Google	Workspace Enterprise	$30 ($36 monthly billing)	AI note-taking in Meet, translated captions, advanced features	Annual commitment
Perplexity	Enterprise Pro	$40 ($400/year)	All Pro features; Team Spaces; centralized admin; SOC 2 Type II; $5/mo API credit per seat	Compliance-ready
Anthropic	Team (Premium Seat)	$150	Includes Claude Code; increased usage limits; min 5 users	Heavy team usage
Anthropic	Enterprise	Custom	All Team features + SSO, SCIM, advanced security, audit logs, dedicated support	Contact sales
OpenAI	Enterprise	Custom	≥150 users; enterprise security; unlimited high-speed access; expanded context; dedicated support	Contact sales

3. Free & Trial API Tiers

Generous Free APIs for Building & Testing

Provider	Models	Free Limits	Context	Link	Hosting
Google AI Studio	Gemini 3 Pro / Flash	15 RPM, 1M TPM, ~1,000 requests/day	1M+	aistudio.google.com	US/EU
Groq	Llama 4 Scout, Qwen3, Mixtral	High free tier (rate-limited)	128K–1M	console.groq.com	US
DeepSeek	V4 / R1 / Coder	5M tokens free credit for new users (~$8 value)	128K+	platform.deepseek.com	🇨🇳 Chinese
GitHub Models	GPT-5o, Llama 4, Claude	Free playground for GitHub users	128K+	github.com/models	US/EU
xAI	Grok models	$25 free + $150/mo with data sharing	131K+	console.grok.com	US

Free Stealth / Cloaked Models (OpenRouter)

These are pre-release models from undisclosed providers, available free during testing. Prompts & completions are logged for feedback.

Model	Context	Notes
Sonoma Sky Alpha	2,000,000	Vision + parallel tool calling; frontier-class
Sonoma Dusk Alpha	2,000,000	Speed-optimized variant; vision + tools
Horizon Beta	256,000	Improved successor to Horizon Alpha
Cypher Alpha	1,000,000	All-purpose cloaked model
xAI grok code 1	—	Was free at launch (Sep 10, 2025); then $0.20/$1.50

4. Paid API Pricing (Per 1M Tokens)

Sorted by input cost, low to high. Includes caching/batch discounts where available.

Ultra-Budget Tier (< $0.15 input)

Provider	Model	Input $/M	Output $/M	Context	Notes
Zhipu AI	GLM-4.5-airx	$0.02	$0.06	—	Ultra-fast, lowest cost model
DeepSeek	deepseek-chat (cache hit)	$0.028	$0.42	128K+	Cache miss: $0.28/M in. V4 pricing similar.
OpenAI	GPT-5 Nano	$0.05	$0.40	400K in / 128K out	Ultra-cheap routing & classification
Alibaba	Qwen-Turbo	$0.0525	$0.21	1M	Extremely low-cost
Meta	Llama 3.2 11b Vision (via Deepinfra)	$0.055	$0.055	128K	Vision capabilities
DeepSeek	deepseek-reasoner (cache hit)	$0.07	$1.68	128K+	Cache miss: $0.56/M in
SiliconFlow	Qwen3-Coder-Next 7B/32B	$0.07	$0.07	128K	🇨🇳 Chinese hosting
OpenAI	GPT-4.1 nano	$0.10	$0.40	1M in / 32K out	Lightweight variant
Google	Gemini 2.5 Flash-Lite / 3 Flash-Lite	$0.10–$0.30	$0.40	1M+	Caching: $0.025–$0.125/M in
Google	Gemini 2.0 Flash	$0.10–$0.70	$0.40	1M+	Live API: $0.35–$2.10 in / $1.50–$8.50 out. Batch: $0.05/$0.20
Groq	Llama 4 Scout	$0.11	$0.34	1M	Open-source; self-host viable; US hosting

Budget Tier ($0.15–$0.99 input)

Provider	Model	Input $/M	Output $/M	Context	Notes
OpenAI	GPT-4o Mini	$0.15	$0.60	128K	High-volume tasks
Zhipu AI	GLM-4.5-air	$0.16	$1.07	131K	Cost-effective lightweight
xAI	grok-3-mini	$0.20–$0.30	$0.50	131K	8 rps. Cached: $0.075/M in. Live Search: $25/1K sources
Meta	Llama 3.3 70b (via Deepinfra)	$0.23	$0.40	128K	Open-source
OpenAI	GPT-5 mini	$0.25	$2.00	400K in / 128K out	Mid-tier GPT-5
Anthropic	Claude 3 Haiku	$0.25	$1.25	200K	Batch: $0.125/$0.625. Caching: write $0.30 / read $0.03
DeepSeek	DeepSeek-V3 (via Deepinfra)	$0.27	$1.10	64K	8K max output
Alibaba	QVQ-72B-Preview	$0.28	$0.55	—	Up to 97% price cuts reported
Google	Gemini 2.5 Flash / 3 Flash	$0.30–$1.00	$2.50	1M+	Live API: $0.50–$3.00 in / $2.00–$12.00 out
Alibaba	Qwen2.5 72B/7B	$0.30–$0.35	$0.40	128K	Large & small options
Zhipu AI	GLM-4.5	$0.33	$1.32	131K	20M free tokens promo for new users
Meta	Llama 3.2 90b Vision (via Deepinfra)	$0.35	$0.40	128K	Vision capabilities
Alibaba	Qwen3.5 Plus	$0.40	$2.40	1M	Price/performance king. Multilingual coding.
OpenAI	GPT-4.1 mini	$0.40	$1.60	1M in / 32K out	Smaller GPT-4.1
Alibaba	Qwen-VL-Max	$0.41	~$0.41	—	Vision model
Alibaba	Qwen-Plus	$0.42	$1.26	131K	Tiered >128K; non-thinking ≤128K: $0.115/$0.287
OpenAI	GPT-3.5 Turbo	$0.50	$1.50	16K	Legacy, still available
Zhipu AI	GLM-4.5v	$0.50	$1.80	65.5K	Vision model
DeepSeek	DeepSeek-R1 (via Deepinfra)	$0.55	$2.19	64K	8K max output
Meta	Llama 3 70b (via Deepinfra/Groq)	$0.59	$0.79	8K	Legacy
xAI	grok-3-mini-fast	$0.60	$4.00	131K	3 rps. Cached: $0.15/M in
Alibaba	Qwen3 235B Thinking	$0.74	$8.82	—	Flagship 235B; non-thinking: $0.70/$2.80
Anthropic	Claude 3.5 Haiku / 4.6 Haiku	$0.80–$1.00	$4.00–$5.00	200K	Near-frontier speed. Batch: $0.40/$2.00
Alibaba	Qwen3-Max-Preview	$0.86+	$3.44+	262K	1T+ param; tiered by context window

Mid-Range Tier ($1.00–$4.99 input)

Provider	Model	Input $/M	Output $/M	Context	Notes
Perplexity	Sonar	$1.00	$1.00	—	Quick facts, news; lightweight
Perplexity	Sonar Reasoning	$1.00	$5.00	—	Step-by-step logic
OpenAI	o1-mini / o3-mini / o4-mini	$1.10	$4.40	200K	Reasoning-optimized
OpenAI	GPT-5	$1.25	$10.00	400K in / 128K out	Latest flagship GPT-5
Google	Gemini 2.5/3 Pro (≤200K)	$1.25	$10.00	1M+	Caching: $0.31/M. Batch: $0.625/$5.00
Google	Gemini 1.5 Pro	$1.25–$2.50	$5.00–$10.00	1M	Previous gen; price tiered by context
Alibaba	Qwen2.5-Max	$1.60	$6.40	32K	8K max output
Alibaba	Qwen-Max	$1.68	$6.72	32K	Free 1M tokens for new users (180 days)
OpenAI	GPT-5.3-Codex	$1.75	$14.00	400K	Structured outputs; recursive self-debug
Meta	Llama 3.1 405b (via Deepinfra)	$1.79	$1.79	128K	Largest open model
OpenAI	GPT-4.1	$2.00	$8.00	1M in / 32K out	Strong general-purpose
OpenAI	o3	$2.00	$8.00	200K	100K max output
xAI	grok-2 / grok-2-vision	$2.00	$10.00	131K	10–15 rps; vision variant available
Google	Gemini 2.5/3 Pro (>200K)	$2.00–$4.00	$12.00–$18.00	1M+	Caching: $0.625/M. Batch: $1.25/$7.50. Multimodal native.
Perplexity	Sonar Reasoning Pro / Deep Research	$2.00	$8.00	—	Multi-step reasoning + web search; citations
OpenAI	GPT-4o	$2.50–$5.00	$10.00–$15.00	128K	Supports Realtime API (speech-to-speech, Aug 2025)
Anthropic	Claude 3.5/3.7/4/4.6 Sonnet	$3.00	$15.00	200K–1M	Caching: write $3.75 / read $0.30. Batch: $1.50/$7.50. Daily driver.
xAI	grok-3 / Grok 4 / Grok 4.1	$3.00	$15.00	131K–2M	Cached: $0.75/M. Live Search: $25/1K sources. Real-time X data.
Cohere	Command-R+	$3.00	$15.00	—	Enterprise-grade; RAG-focused
Perplexity	Sonar Pro	$3.00	$15.00	—	Advanced search; 2× citations vs. base
Zhipu AI	GLM-4.5-flash	$3.20	$12.80	—	Strong reasoning

Premium Tier ($5.00+ input)

Provider	Model	Input $/M	Output $/M	Context	Notes
Anthropic	Claude Opus 4.6	$5.00	$25.00	1M (beta)	81.4% SWE-Bench. Caching: write $6.25 / read $0.50. Best coding model.
xAI	grok-3-fast	$5.00	$25.00	131K	10 rps. Cached: $1.25/M in
OpenAI	GPT-4 Turbo	$10.00	$30.00	128K	Previous generation
OpenAI	o3 (higher tier)	$10.00	$40.00	200K	Professional reasoning
Anthropic	Claude 4/4.1 Opus	$15.00	$75.00	200K+	Caching: write $18.75 / read $1.50. Batch: $7.50/$37.50
OpenAI	o1 / o1-preview	$15.00	$60.00	200K	Deep reasoning
OpenAI	o3‑pro	$20.00	$80.00	200K	Professional tier reasoning
OpenAI	GPT-4	$30.00	$60.00	8K	Legacy
OpenAI	GPT-4.5	$75.00	$150.00	128K	Preview model

5. Cheapest API Combos (Realistic Blended Cost)

Blended cost formula: Input + (Output × 1.3) per 1M tokens, i.e. the price of 1M input tokens plus 1.3M output tokens, used here as a rough proxy for balanced usage.

Rank	Provider	Model	Blended Total	Context	Hosting	Best For
1	SiliconFlow	Qwen3-Coder-Next 7B/32B	$0.16	128K	🇨🇳 Chinese	Experiments only
2	Groq	Llama 4 Scout	$0.55	1M	US	Open-source, safe hosting
3	DeepSeek	V4 (cache hit)	$0.57	128K	🇨🇳 Chinese	Budget frontier coding
4	OpenAI	GPT-5 Nano	$0.57	400K	US/EU	Routing & classification
5	Google	Gemini 3 Flash-Lite	$0.62	1M+	US/EU	Multimodal on budget
6	xAI	Grok 4.1 Fast	$0.85	2M	US	Real-time + speed
7	Alibaba	Qwen3.5 Plus	$3.52	1M	🇨🇳 Chinese	Best price/performance frontier
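
The blended figures in this table can be reproduced from the per-token prices in section 4. The following is a minimal sketch, assuming the formula stated above (input price plus 1.3 × output price). The dictionary keys and function names are illustrative, and the prices are copied from this guide's tables rather than fetched from any provider.

```python
# Prices in USD per 1M tokens (input, output), copied from section 4 of this guide.
PRICES = {
    "SiliconFlow Qwen3-Coder-Next": (0.07, 0.07),
    "Groq Llama 4 Scout": (0.11, 0.34),
    "DeepSeek deepseek-chat (cache hit)": (0.028, 0.42),
    "OpenAI GPT-5 Nano": (0.05, 0.40),
    "Alibaba Qwen3.5 Plus": (0.40, 2.40),
}

OUTPUT_WEIGHT = 1.3  # weight applied to the output price in this guide's formula


def blended_cost(input_price, output_price, output_weight=OUTPUT_WEIGHT):
    """Blended USD cost: 1M input tokens plus output_weight x 1M output tokens."""
    return input_price + output_price * output_weight


def rank_by_blended(prices):
    """Return (name, blended cost rounded to cents), cheapest first."""
    ranked = sorted(prices.items(), key=lambda item: blended_cost(*item[1]))
    return [(name, round(blended_cost(i, o), 2)) for name, (i, o) in ranked]


if __name__ == "__main__":
    for name, cost in rank_by_blended(PRICES):
        print(f"{name}: ${cost:.2f}")
```

Changing `OUTPUT_WEIGHT` is the quickest way to adapt the ranking to a workload that is more input-heavy (retrieval, classification) or more output-heavy (generation).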

Tip for Indian developers: Use Groq + Google Gemini 3 Flash for safe, fast, cheap US/EU hosting. For the cheapest experiments, use DeepSeek or Qwen3.5, but never send client code to China-hosted APIs.

6. API Cost Optimisation Strategies

Strategy	Savings	How It Works
Prompt Caching	Up to 90%	Reuse system prompts/context. Supported by OpenAI, Anthropic, Google, DeepSeek, xAI. Cache-hit input rates in the pricing tables are 75–90% cheaper than standard input.
Batch Processing	50%	Async high-volume jobs with delayed responses. OpenAI & Google offer explicit batch pricing.
Smart Routing	70–80%	Route hard tasks to flagships, simple tasks to budget models. Use OpenRouter.ai or LiteLLM for single-key multi-provider access.
Free Credits	$8–$150	New user promos: DeepSeek (~$8), xAI ($25 + $150/mo data share), Alibaba (1M tokens/180 days), Zhipu (20M tokens). Google AI Studio free tier for prototyping.
Context Window Management	20–40%	Many providers charge more for >128K or >200K context. Keep prompts under thresholds when possible.
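
To see how the caching and batch rows translate into numbers, here is a minimal sketch. It assumes a simple model in which a fixed fraction of input tokens is billed at the cache-hit rate and the rest at the cache-miss rate. It ignores cache-write surcharges (which some providers in section 4 list separately), and the function names are illustrative.

```python
def effective_input_price(miss_price, hit_price, hit_rate):
    """Average input price per 1M tokens when hit_rate of input tokens are cache hits."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


def savings_pct(baseline, actual):
    """Percentage saved relative to the baseline price."""
    return 100.0 * (baseline - actual) / baseline


def batch_price(list_price, discount=0.50):
    """Price after a batch discount (50% per the strategy table)."""
    return list_price * (1.0 - discount)


if __name__ == "__main__":
    # DeepSeek deepseek-chat input prices from section 4: miss $0.28/M, hit $0.028/M.
    miss, hit = 0.28, 0.028
    for rate in (0.0, 0.5, 0.8, 1.0):
        price = effective_input_price(miss, hit, rate)
        print(f"hit rate {rate:.0%}: ${price:.4f}/M input, "
              f"{savings_pct(miss, price):.0f}% saved")
    # Gemini 2.5/3 Pro (<=200K) list input price from section 4: $1.25/M.
    print(f"batch input price: ${batch_price(1.25):.3f}/M")
```

The headline "up to 90%" is only reached when nearly every input token is a cache hit. Realistic hit rates give proportionally smaller savings.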

Multi-Model Router Strategy

Tools: OpenRouter.ai or LiteLLM for single-key access to all providers.
Savings: ~70% by routing hard tasks to flagships, simple tasks to budget models.
Stack Example: Primary: Claude Sonnet 4.6 → Fallback: DeepSeek V4 → Rerank: Cohere v3.5.
Monitoring: Track spend via pricepertoken.com or provider dashboards.
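
The routing idea above can be sketched without any provider SDK. This is a toy example under stated assumptions: the keyword heuristic, the `HARD_HINTS` list, the model labels, and the fixed token counts are all hypothetical, and the prices are taken from section 4 (Claude Sonnet at $3/$15, deepseek-chat cache-miss at $0.28/$0.42). A real router would use OpenRouter or LiteLLM for the actual calls and a better difficulty signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens

    def cost(self, input_tokens, output_tokens):
        """USD cost of one request with the given token counts."""
        return (input_tokens * self.input_per_m
                + output_tokens * self.output_per_m) / 1_000_000


# Illustrative labels; prices from section 4 of this guide.
FLAGSHIP = Model("claude-sonnet-4.6", 3.00, 15.00)
BUDGET = Model("deepseek-chat", 0.28, 0.42)

# Hypothetical difficulty signal: long prompts or certain keywords.
HARD_HINTS = ("refactor", "prove", "debug", "architecture")


def pick_model(prompt):
    """Send hard-looking prompts to the flagship, everything else to the budget model."""
    text = prompt.lower()
    is_hard = len(text) > 2000 or any(hint in text for hint in HARD_HINTS)
    return FLAGSHIP if is_hard else BUDGET


def routed_savings(prompts, input_tokens=1000, output_tokens=500):
    """Fraction saved versus sending every prompt to the flagship model."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    baseline = len(prompts) * FLAGSHIP.cost(input_tokens, output_tokens)
    routed = sum(pick_model(p).cost(input_tokens, output_tokens) for p in prompts)
    return 1.0 - routed / baseline


if __name__ == "__main__":
    sample = ["Debug this race condition"] + ["Summarize this email"] * 4
    print(f"savings vs. all-flagship: {routed_savings(sample):.0%}")
```

With one hard prompt in five, this toy mix lands in the 70–80% range quoted above. The saving shrinks as the share of hard tasks grows.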

7. Platform Fees — OpenRouter

OpenRouter aggregates models from multiple providers with these fees:

Service / Feature	Cost / Details	Notes
Platform Fee on Inference	None (pass-through pricing)	No markup on inference; prices match providers
Zero-Completion Insurance (ZCI)	$0 (automatic)	Waives charges for zero-token or errored responses
BYOK Fee	5% of provider cost	Charged against credits when routing with your own keys
Credit Purchase Fee	5.5% (min $0.80)	Via Stripe; crypto: 5%
Stealth / Cloaked Models	Free during testing	Logging enabled; prompts/completions stored for feedback
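
The fee rows above reduce to simple arithmetic. The following is a minimal sketch, assuming the rates listed in the table (5.5% with a $0.80 minimum for Stripe credit purchases, 5% for BYOK). The function names are illustrative, and the sketch only computes the fee amounts themselves.

```python
def credit_purchase_fee(amount_usd, rate=0.055, minimum=0.80):
    """Fee in USD for buying credits: percentage of the amount, with a floor."""
    if amount_usd <= 0:
        raise ValueError("amount_usd must be positive")
    return round(max(amount_usd * rate, minimum), 2)


def byok_fee(provider_cost_usd, rate=0.05):
    """Fee in USD charged against credits when routing with your own provider key."""
    return provider_cost_usd * rate


if __name__ == "__main__":
    for amount in (5, 10, 20, 100):
        fee = credit_purchase_fee(amount)
        print(f"buy ${amount}: fee ${fee:.2f} ({100 * fee / amount:.1f}% effective)")
    print(f"BYOK fee on $20 of provider usage: ${byok_fee(20):.2f}")
```

Because of the $0.80 floor, purchases below roughly $14.55 pay more than 5.5% in effective terms, so small top-ups are proportionally more expensive.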

8. Quick Recommendation Summary

Best Free Options

Use Case	Best Choice	Why
Daily free chat	Google Gemini 3	Truly unlimited basic; best free model quality
Free coding API	Google AI Studio (Gemini 3 Flash)	1,000 requests/day free; 1M context
Free open-source models	HuggingChat or Groq free tier	Llama 4, Qwen3, Mixtral at no cost
Free experimentation	DeepSeek (5M free tokens)	Frontier-level for $0 (China-hosted)

Best Paid Options

Use Case	Best Choice	Monthly Cost	Why
Best paid chat (serious work)	Claude Pro	$20	Opus 4.6 + Claude Code access
Best general chat	ChatGPT Plus	$20	GPT-5 + o-series reasoning
Best research chat	Perplexity Pro	$20	Multi-model + citations + search
Cheapest high-quality API	Groq (Llama 4 Scout)	Pay-per-use (~$0.55/M)	US-hosted, fast, open-source
Best frontier API value	Qwen3.5 Plus or DeepSeek V4	Pay-per-use	76–77% SWE-Bench at 95% lower cost
Best premium API	Claude Opus 4.6	Pay-per-use ($5/$25 per M)	81.4% SWE-Bench; 1M context
Heaviest usage	ChatGPT Pro or Claude Max 20×	$200	Highest limits on frontier models (unlimited GPT-5 on ChatGPT Pro; 20× Pro usage on Claude Max 20×)
Enterprise compliance	OpenAI Enterprise or Anthropic Enterprise	Custom	SSO, SCIM, audit logs, SLA

9. Glossary of Acronyms & Terms

Term	Full Name	Context
API	Application Programming Interface	Programmatic access to models
ARC-AGI	Abstraction & Reasoning Corpus for AGI	Benchmark measuring general reasoning
BYOK	Bring Your Own Key	Use personal API keys on aggregator platforms
Cloaked Model	Pre-release model from undisclosed creator	OpenRouter stealth models for feedback testing
CoT	Chain of Thought	Step-by-step reasoning (e.g., GPT-5 Thinking)
Ctx	Context	Token window size for inputs
GDPR	General Data Protection Regulation	EU privacy law compliance
HIPAA	Health Insurance Portability and Accountability Act	US healthcare privacy compliance
I/O	Input/Output	Token pricing separation
IDE	Integrated Development Environment	Code development tools
INR	Indian Rupee	Regional pricing currency
LLM	Large Language Model	Core AI models trained on text data
M Tokens	Million Tokens	Pricing unit (~750 words ≈ 1,000 tokens)
MoE	Mixture of Experts	Architecture using specialised sub-networks
ROI	Return on Investment	Business value metric
RPM / RPS	Requests Per Minute / Second	API rate limits
SAML	Security Assertion Markup Language	Enterprise authentication standard
SCIM	System for Cross-domain Identity Management	Automates user provisioning
SLA	Service Level Agreement	Uptime/performance guarantees
SOC 2	Service Organization Control 2	Security compliance framework
SSO	Single Sign-On	Enterprise unified login
SWE-Bench	Software Engineering Benchmark	Real-world coding evaluation
TTL	Time To Live	Cache duration
VLM	Vision Language Model	Models handling text + images
ZCI	Zero Completion Insurance	OpenRouter: no charge for failed/empty responses
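
The "M Tokens" rule of thumb above (about 750 words per 1,000 tokens) is enough for a rough budget estimate. Here is a minimal sketch under that assumption. Actual token counts depend on the model's tokenizer and the language of the text, and the function names are illustrative.

```python
TOKENS_PER_WORD = 1000 / 750  # rule of thumb from the glossary; varies by tokenizer


def words_to_tokens(words):
    """Approximate token count for a given number of English words."""
    return round(words * TOKENS_PER_WORD)


def cost_usd(tokens, price_per_m):
    """USD cost of `tokens` tokens at `price_per_m` dollars per 1M tokens."""
    return tokens * price_per_m / 1_000_000


if __name__ == "__main__":
    # Example: a 3,000-word document sent as input at $1.25 per 1M tokens
    # (the GPT-5 input price listed in section 4).
    tokens = words_to_tokens(3000)
    print(f"~{tokens} tokens, about ${cost_usd(tokens, 1.25):.4f} of input")
```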

10. Important Notes

Prices change frequently — always verify on official provider sites before committing budget.
Overages are real — Higher-tier plans (Claude Max, ChatGPT Pro, Perplexity Max) exist because heavy users often exceed base tier limits.
Caching is your best friend — Prompt caching alone can cut API costs 75–90% for repetitive workloads.
Data jurisdiction matters — Chinese-hosted models (DeepSeek, Qwen, SiliconFlow, Kimi) are extremely cheap but store data under PRC jurisdiction. Use US/EU-hosted alternatives (Groq, Google, OpenAI, Anthropic) for sensitive or client work.
OpenAI Realtime API (Aug 2025) — Adds speech-to-speech, MCP server support, image input, and SIP phone calling; billed at model token rates. Relevant for voice agent builders.
Free tiers for prototyping — Google AI Studio, Groq, HuggingChat, and GitHub Models offer generous free access to start building immediately.

Bottom line: Everything is cheaper and smarter than even three months ago. Start with free tiers to prototype, scale to budget APIs for production, and reserve premium models for hard problems. 🚀