Every time I think I know what Cloudflare offers, I open the dashboard and find something I’d never noticed. Last month it was AutoRAG — a managed RAG pipeline that didn’t exist six months ago. Before that it was Browser Rendering, a headless Chrome service you can call from a Worker. The month before that, AI Gateway appeared and silently became one of the most useful proxy tools I use.
The problem isn’t that Cloudflare hides these services. It’s that there are now 40+ products across the dashboard, and it’s genuinely hard to know which ones are relevant to what you’re building, which are free, and which require a credit card.
This post is my answer to that problem. It’s a reference guide — the kind you bookmark and come back to when you’re starting a new project or wondering “does Cloudflare have something for this?” It covers three use cases:
- Self-hosting — exposing homelab services safely without opening ports
- Web app development — building full-stack apps entirely on Cloudflare’s edge
- AI research — running inference, vector search, and RAG pipelines without leaving the ecosystem
I’ve written detailed implementation guides for specific Cloudflare services separately. This post links to those rather than repeating the setup steps. Think of it as the map; the other posts are the turn-by-turn directions.
The Free Tier Philosophy
Cloudflare’s business is selling network services to enterprises. The free tier isn’t charity — it’s a customer acquisition strategy. But the side effect is real: the free limits are production-grade, not demo-grade.
AWS gives you 750 hours of a t2.micro per month for one year, then shuts it off. GCP gives you $300 in credits that expire in 90 days. Both free tiers exist specifically to pressure you toward paid before you’re ready.
Cloudflare’s free tier is different. 100,000 Workers requests per day is enough to handle a real personal app. 10GB of R2 storage with zero egress fees is enough for years of a blog’s media library. 10,000 AI inference “neurons” per day is enough to run a dozen LLM experiments. None of these expire.
The upgrade decision is straightforward: if you hit the limits, the Workers Paid plan at $5/month unlocks dramatically more capacity across almost every service in the ecosystem. Most personal projects never get there.
Service Directory
Every Cloudflare service relevant to self-hosting, web apps, and AI research — in one table.
| Service | Category | Free Limit | Paid From | What it’s for |
|---|---|---|---|---|
| Pages | Web hosting | Unlimited bandwidth, 500 builds/mo | $20/mo Pro | Static site hosting with Git CI/CD |
| Workers | Serverless | 100K requests/day, 10ms CPU | $5/mo | Edge functions, API routes, middleware |
| KV | Key-value store | 100K reads/day, 1K writes/day, 1GB | $5/mo base | Session storage, config, feature flags |
| D1 | SQLite database | 5M rows read/day, 100K writes/day, 1GB | $5/mo base | Relational data for Workers apps |
| R2 | Object storage | 10GB, 1M Class A ops/mo | $0.015/GB above free | Files, images, blobs — zero egress fees |
| Queues | Message queue | 1M operations/mo | $5/mo base | Background jobs, webhook processing |
| Durable Objects | Stateful edge | Basic (Workers Paid required) | $5/mo | WebSocket state, real-time coordination |
| Hyperdrive | DB accelerator | Workers Paid required | $5/mo | Pool connections to existing Postgres/MySQL |
| Analytics Engine | Custom analytics | Free (limited) | Paid above | Time-series event tracking from Workers |
| Email Routing | Email forwarding | Free | Free | Forward you@yourdomain.com to any inbox |
| Stream | Video CDN | 1K min storage free | $5/mo + usage | Video hosting and adaptive streaming |
| Images | Image optimization | Free resize transforms | $5/mo | On-the-fly image resizing via URL |
| Turnstile | CAPTCHA | Free, unlimited | Free | Bot-proof forms without user friction |
| Tunnel | Secure ingress | Free | Free | Expose homelab services — no open ports |
| Zero Trust Access | Identity/auth | 50 users free | $3/user/mo | Auth layer in front of any URL |
| WARP | VPN client | Free personal | $3/user/mo | Encrypted DNS + private network access |
| Gateway | DNS/HTTP filtering | 3 locations free | $3/user/mo | DNS-level malware and ad blocking |
| WAF | Firewall | 5 custom rules free | $20/mo Pro | Block attack patterns by rule |
| DDoS | L3/L4/L7 protection | Always-on, free | Free | Automatic volumetric attack mitigation |
| DNS | Nameserver | Free | Free | Authoritative DNS with Anycast routing |
| SSL/TLS | Certificates | Free, auto-renewing | Free | HTTPS for any domain |
| Registrar | Domain registration | At-cost (~$10.11/yr .com) | At-cost | Buy/transfer domains at wholesale price |
| Web Analytics | Traffic stats | Free | Free | Cookie-free, no-JavaScript-required analytics |
| Workers AI | AI inference | 10K neurons/day | $0.011/1K neurons | Run 50+ models at the edge |
| AI Gateway | AI proxy | Free | Free (currently) | Unified proxy for all AI providers |
| Vectorize | Vector database | 30M dimensions free | Paid above | Semantic search, RAG retrieval layer |
| Browser Rendering | Headless Chrome | 2 concurrent sessions | $5/mo base | Scraping, PDF generation from Workers |
| AutoRAG | Managed RAG | Limited preview | TBD | Full document ingestion + RAG pipeline |
Use Case 1: Self-Hosting and the Homelab Stack
Running services at home — Jellyfin, Home Assistant, Gitea, Proxmox, Paperless-ngx — used to require a stack of configuration glue: dynamic DNS, nginx reverse proxy, certbot for SSL, UFW rules, and careful port forwarding on the router. Cloudflare replaces most of that with three free services: Tunnel, Access, and Gateway.
Cloudflare Tunnel — The Foundation
Tunnel creates an outbound-only encrypted connection from your server to Cloudflare’s edge. Traffic flows in from the internet, reaches Cloudflare, and is forwarded to your server over that established connection. Your router never has an open port.
What you eliminate: port forwarding, DDNS, certbot, nginx reverse proxy config. Cloudflare handles SSL automatically for any hostname you route through the tunnel.
Free tier: unlimited tunnels, unlimited routes, no bandwidth cap.
I’ve written a full setup guide including Docker Compose configs and the Access policy integration: Cloudflare Tunnel Changed How I Run My Homelab.
Zero Trust Access — The Authentication Layer
Tunnel exposes your services to the internet. Access decides who can reach them. Before a request hits your Gitea or Home Assistant, it hits an Access policy that requires the user to authenticate.
Free tier: up to 50 users, which covers any homelab or small team setup.
Authentication options configured entirely from the dashboard:
- Email OTP — simplest; sends a one-time code to a verified address. No app required.
- GitHub OAuth — good if you want to restrict to specific GitHub accounts.
- Google OAuth — works with any Google account or a specific Google Workspace domain.
The decision of what to protect and how aggressively depends on the service:
| Service | Expose Publicly? | Use Access? | Auth Method |
|---|---|---|---|
| Blog / Portfolio | Yes | No | — |
| Uptime Kuma status page | Yes | No | Read-only is fine |
| Jellyfin | No | Yes | Email OTP |
| Gitea | No | Yes | GitHub OAuth |
| Home Assistant | No | Yes | Email OTP |
| Vaultwarden | No | Yes | Email OTP + app 2FA |
| Paperless-ngx | No | Yes | Email OTP |
| Proxmox UI | Never publicly | Never | LAN or WARP only |
The Proxmox case is deliberate. The hypervisor management UI is the most sensitive service in a homelab. I don’t route it through Tunnel at all — I access it on the local network or through WARP.
WARP / Cloudflare One — The VPN Layer
WARP is Cloudflare’s client-based VPN alternative. Install it on your phone or laptop, and your device’s DNS goes through Cloudflare’s network. The Zero Trust flavor integrates with your Access policies to give you private network access without public hostnames.
Practical homelab use: access Proxmox’s management interface from your phone when you’re away from home. WARP connects you into the private network, Proxmox never needs a public URL.
When Tailscale is better: if you want a purely private mesh network between devices with no public-facing components at all, Tailscale’s WireGuard-based approach is simpler and purpose-built for that case. Cloudflare WARP + Tunnel is better when you want some services to be publicly accessible (through Tunnel + Access) and some to be private (through WARP).
Gateway — DNS-Level Filtering
Gateway is Cloudflare’s managed DNS resolver that can block malware domains, ad networks, and specific content categories before a request ever leaves your network.
Configuration: set your router’s upstream DNS to Cloudflare’s Gateway IP, and every device on the network benefits. On the free plan you get 3 locations and basic malware blocking.
Practical value for a homelab: blocks known-bad domains before they reach your exposed services. A DNS-level block doesn’t require the WAF to fire — the request never gets to your origin. It’s a lightweight first line of defense.
Use Case 2: Full-Stack Web App Development
This is where Cloudflare’s ecosystem is strongest. You can build a complete, production-grade web application — hosting, serverless compute, relational database, key-value cache, object storage, message queue, analytics — without leaving the ecosystem or opening a separate cloud account.
The Core Stack: Pages + Workers + D1 + KV + R2
I’ve covered this stack in detail in The Complete Cloudflare Stack for Developer Portfolios. The short version:
- Pages hosts your static site, deploys on every git push, and gives you unlimited bandwidth at no cost.
- Workers / Pages Functions run server-side logic at the edge — form handlers, API routes, middleware.
- D1 is a SQLite database that lives alongside your Workers. 5M rows read per day, 100K writes, 1GB storage on the free tier.
- KV is a distributed key-value store for session data, feature flags, or anything you need to read fast from anywhere in the world. 100K reads per day free.
- R2 is S3-compatible object storage with zero egress fees. Store images, attachments, and generated files. 10GB free.
For the full setup guide with code examples, TypeScript types, and wrangler configuration: The Complete Cloudflare Stack for Developer Portfolios.
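To give a taste of the binding model before the full guide, here is a minimal sketch of a Pages Function that accepts a contact-form POST and stores it in KV. The `CONTACT_KV` binding name and the form field names are illustrative assumptions, not from the linked post:

```typescript
// Minimal contact-form handler sketch for a Pages Function.
// CONTACT_KV is a hypothetical KV namespace binding declared in wrangler config.
interface Env {
  CONTACT_KV: { put(key: string, value: string): Promise<void> };
}

export async function onRequestPost(ctx: { request: Request; env: Env }): Promise<Response> {
  const form = await ctx.request.formData();
  const email = String(form.get('email') ?? '');
  const message = String(form.get('message') ?? '');

  // Reject obviously malformed submissions before writing anything.
  if (!email.includes('@') || message.length === 0) {
    return new Response('Invalid submission', { status: 400 });
  }

  // A timestamped key keeps submissions naturally ordered when listing.
  await ctx.env.CONTACT_KV.put(`contact:${Date.now()}`, JSON.stringify({ email, message }));
  return new Response('Thanks', { status: 201 });
}
```

The same handler shape works for any Pages Function route; swapping KV for D1 only changes the `env` interface and the write call.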
Queues — Async Background Jobs
Workers have a CPU time limit: 10ms on the free tier, 30 seconds on Workers Paid. For work that takes longer — sending an email, processing an uploaded file, calling an external API that’s slow — you need to hand off to a background process.
Cloudflare Queues is the answer. A producer Worker publishes a message to a Queue. A consumer Worker picks it up, processes it, and acknowledges it. If the consumer fails, the message is retried automatically.
Free tier: 1 million queue operations per month.
```ts
// Producer: publish to the queue from a request handler
export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { email } = await req.json<{ email: string }>();
    await env.EMAIL_QUEUE.send({ type: 'welcome_email', to: email });
    return new Response('Queued', { status: 202 });
  }
};
```

```ts
// Consumer: wrangler.jsonc declares this Worker as a queue consumer
export default {
  async queue(batch: MessageBatch<{ type: string; to: string }>, env: Env) {
    for (const msg of batch.messages) {
      await sendEmail(msg.body.to, msg.body.type);
      msg.ack();
    }
  }
};
```
The wrangler.jsonc binding connects them. The producer doesn’t wait for the email to send — it returns immediately, and the consumer processes the work asynchronously.
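The binding declarations might look like this in each Worker's wrangler.jsonc; the queue name `email-queue` is an illustrative assumption, and the batch/retry values are examples rather than recommendations:

```jsonc
// Producer's wrangler.jsonc: gives the Worker a send-only EMAIL_QUEUE binding
{
  "queues": {
    "producers": [
      { "queue": "email-queue", "binding": "EMAIL_QUEUE" }
    ]
  }
}
```

```jsonc
// Consumer's wrangler.jsonc: delivers batches to this Worker's queue() handler
{
  "queues": {
    "consumers": [
      { "queue": "email-queue", "max_batch_size": 10, "max_retries": 3 }
    ]
  }
}
```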
Durable Objects — Stateful Edge Compute
Workers are stateless by design. Every invocation starts fresh. For most request/response patterns, that’s exactly right. But some problems need state that’s consistent across many concurrent clients: a live visitor counter, a collaborative cursor system, a WebSocket hub for real-time notifications.
Durable Objects solve this by giving you a single, globally addressable JavaScript object with in-memory state and durable storage. All requests to a specific Durable Object are routed to the same physical instance, so state is always consistent.
When to use Durable Objects vs. D1:
- D1 for traditional database queries — insert/select/update patterns, user records, blog posts.
- Durable Objects for real-time coordination — WebSocket connections that need shared state, live counters, collaborative editing.
Practical examples:
- A live view counter for blog posts that’s accurate to the second (not a cached approximation)
- A rate limiter that’s consistent across all edge locations
- A WebSocket server for a real-time chat feature
Durable Objects require Workers Paid ($5/month). There’s no additional charge beyond the base plan for basic use — duration-based billing only kicks in above specific thresholds.
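To make the live-counter example concrete, here is a minimal sketch of the pattern. The storage interface is narrowed to the two calls used so the logic is testable in isolation; a real Durable Object receives a full `DurableObjectState` from the runtime, and the class name is illustrative:

```typescript
// Minimal Durable Object sketch: a live counter with consistent state.
// NarrowStorage is a hypothetical subset of the real storage API.
export interface NarrowStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

export class Counter {
  constructor(private state: { storage: NarrowStorage }) {}

  // Every request for a given object id is routed to the same instance,
  // so this read-modify-write is serialized per counter, never racy.
  async fetch(_req: Request): Promise<Response> {
    const count = ((await this.state.storage.get<number>('count')) ?? 0) + 1;
    await this.state.storage.put('count', count);
    return new Response(JSON.stringify({ count }), {
      headers: { 'content-type': 'application/json' }
    });
  }
}
```

Contrast this with KV, where two edge locations incrementing the same key concurrently can lose writes; the single-instance routing is the entire point.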
Hyperdrive — Turbocharge Your Existing Database
If you already have a Postgres database on Neon, Supabase, or a VPS, connecting to it from Workers at the edge creates a cold TCP connection on every request. Database connection establishment is slow — easily 100–300ms depending on geography.
Hyperdrive maintains a warm connection pool at the edge. Instead of your Worker connecting to your database server, it connects to Hyperdrive, which already has connections pooled and ready. The result is dramatically lower latency for database-heavy Workers.
When to use: you have an existing Postgres/MySQL database you don’t want to migrate to D1 — maybe because it has complex queries, existing data, or stored procedures that D1 can’t replicate.
Requires: Workers Paid ($5/month).
Analytics Engine — Custom Event Tracking
KV is too slow for high-write event streams. D1 is a transactional database — not optimized for “write one event per page view” patterns. Analytics Engine is Cloudflare’s purpose-built time-series event store, designed to receive writes from Workers without bottlenecking your request handlers.
```ts
// Track an event from any Worker
env.ANALYTICS.writeDataPoint({
  blobs: ['blog_post_read', postSlug],
  doubles: [1],
  indexes: ['blog']
});
```
Practical uses:
- Custom page view tracking without JavaScript on the client side (the Worker logs the event on every request)
- API endpoint usage counts
- Error rate monitoring without a third-party APM
The data is queryable via the Cloudflare API using a SQL-like syntax. It’s not a replacement for a full analytics platform, but for custom metrics it’s simpler and cheaper than anything else in the ecosystem.
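As a sketch of what that querying looks like, here is a small helper pair. The dataset name and the exact SQL are illustrative assumptions; the `_sample_interval` column reflects Analytics Engine's sampling model, where sampled rows are weighted back up to true counts:

```typescript
// Build a SQL query for daily event counts from an Analytics Engine dataset.
// Dataset name and column usage (blob1 = event name) are assumptions.
export function buildEventCountQuery(dataset: string, days: number): string {
  return (
    `SELECT blob1 AS event, SUM(_sample_interval) AS count ` +
    `FROM ${dataset} ` +
    `WHERE timestamp > NOW() - INTERVAL '${days}' DAY ` +
    `GROUP BY event`
  );
}

// POST the SQL to the account-scoped Analytics Engine SQL endpoint.
export async function queryAnalytics(
  accountId: string,
  apiToken: string,
  sql: string
): Promise<unknown> {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    { method: 'POST', headers: { Authorization: `Bearer ${apiToken}` }, body: sql }
  );
  return res.json();
}
```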
Email Routing — Free Email Forwarding
If your domain is registered through or managed by Cloudflare, Email Routing gives you you@yourdomain.com for free. Messages sent to your domain address are forwarded to any personal inbox — Gmail, Fastmail, iCloud, wherever.
Setup takes under five minutes in the dashboard. Cloudflare adds the required MX records automatically.
Bonus: Email Routing lets you write a Worker that handles incoming messages programmatically. Parse the email, extract data, post it to a webhook, auto-reply, or forward selectively based on content. It’s essentially a serverless email processing pipeline.
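A sketch of what that handler can look like. The message interface below is trimmed to the two methods used, all addresses are placeholders, and the routing rules are examples rather than anything from the post:

```typescript
// Sketch of an Email Routing Worker that forwards selectively.
// InboundEmail is a hypothetical subset of the runtime's message object.
export interface InboundEmail {
  from: string;
  to: string;
  forward(rcptTo: string): Promise<void>;
  setReject(reason: string): void;
}

export async function routeEmail(message: InboundEmail): Promise<void> {
  // Split an alias into its own inbox, bounce a known-bad sender,
  // and forward everything else to the catch-all destination.
  if (message.to.startsWith('invoices@')) {
    await message.forward('me+invoices@example.com');
  } else if (message.from.endsWith('@spammy.example')) {
    message.setReject('Address blocked');
  } else {
    await message.forward('me@example.com');
  }
}

// In a deployed Worker this would be wired as: export default { email: routeEmail }
```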
For the DNS and domain setup prerequisites, see Why I Registered My Domain Through Cloudflare.
Stream and Images — When Scale Demands It
Stream handles video hosting and adaptive streaming. Useful if you’re building a video platform and don’t want to pay Vimeo’s rates or host the bandwidth yourself. Pricing is usage-based: roughly $5 per 1,000 minutes stored per month, plus $1 per 1,000 minutes delivered.
Cloudflare Images handles on-the-fly image resizing via URL parameters. ?width=400&format=webp turns any stored image into a WebP-optimized thumbnail. $5/month for the first 5,000 images stored.
Honest assessment for personal projects: if you’re running a blog or small app, R2 for storage and a plain <img> tag with browser-native lazy loading is sufficient. Stream and Images shine at scale — multiple formats, multiple resolutions, high request volume — where the CDN integration pays for itself in bandwidth savings.
Use Case 3: AI Research and Inference
I started using Cloudflare for AI workloads almost by accident. Workers AI appeared in the dashboard with a selection of models and a one-API-call interface. I was curious. Within a week it had replaced my local Ollama setup for quick inference experiments because it required zero infrastructure.
Workers AI — Edge Inference Without a GPU
Workers AI gives you access to 50+ models — including Llama 3.1 8B, Mistral 7B, Whisper for transcription, SDXL for image generation, and several embedding models — via the same env.AI binding pattern as D1 and KV.
Free tier: 10,000 neurons per day. Neurons are Cloudflare’s compute unit — roughly correlated to inference time, not token count. For casual experimentation, 10K neurons per day is generous.
Above free: $0.011 per 1,000 neurons.
The API is OpenAI-compatible. If you’ve written code targeting OpenAI’s /v1/chat/completions endpoint, it works with Workers AI by changing the base URL:
```ts
// Using Cloudflare's built-in AI binding (simplest)
const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: userPrompt }
  ]
});
```

```ts
// Or using the OpenAI SDK with the Workers AI base URL
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: env.CLOUDFLARE_API_TOKEN,
  baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CF_ACCOUNT_ID}/ai/v1`
});
```
Good for: classification, summarization, content moderation, embedding generation, transcription, quick prototyping.
Not great for: tasks requiring frontier-model reasoning. Llama 3.1 8B is capable but not GPT-4 or Claude Sonnet. For complex reasoning or tool use, routing through AI Gateway to OpenAI/Anthropic (see next section) is the right move.
AI Gateway — Unified Proxy for All Your AI Calls
The problem AI Gateway solves: you’re calling OpenAI from one part of your app, Anthropic from another, Workers AI for embeddings, and Hugging Face for a fine-tuned model. There’s no unified place to see all the requests, no retry logic, no cost visibility, and no way to cache identical prompts.
AI Gateway sits between your code and any AI provider. You point your SDK at the AI Gateway URL instead of the provider URL. Everything flows through it.
Currently free across all plans (pricing may change as the product matures).
Supported providers:
| Provider | Supported |
|---|---|
| OpenAI | Yes |
| Anthropic (Claude) | Yes |
| Cloudflare Workers AI | Yes |
| Google Gemini | Yes |
| Hugging Face | Yes |
| Cohere | Yes |
| Groq | Yes |
| Azure OpenAI | Yes |
What you get:
- Request logging — every prompt and response is logged. You can replay, search, and inspect them in the dashboard.
- Semantic caching — identical or near-identical prompts return cached responses immediately. No inference cost, no latency.
- Rate limiting — protect against runaway AI costs from a bug in your app.
- Fallback providers — if OpenAI rate-limits you, AI Gateway can automatically retry the same request against Workers AI or another provider.
For AI researchers running multiple experiments, the logging alone is worth it. You get a complete audit trail of every prompt you ran without instrumenting your own code.
```ts
// Route through AI Gateway by changing the base URL
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${env.CF_ACCOUNT_ID}/${env.GATEWAY_ID}/openai`
});
// Everything else stays the same
```
Vectorize — Vector Database for Semantic Search and RAG
Vectorize is Cloudflare’s managed vector database. Unlike Pinecone, Weaviate, or Chroma, there’s nothing to self-host, no separate API key, and no separate service to integrate. It uses the same env.MY_INDEX binding pattern as everything else.
Free tier: 30 million vector dimensions. At 768 dimensions (a common embedding size), that’s roughly 39,000 embeddings. Enough for a meaningful RAG knowledge base.
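Since dimensions, not vectors, are the billed unit, the capacity arithmetic is a one-liner; a quick sanity check of the figure above:

```typescript
// How many embeddings fit in a dimension budget: budget / dims per vector.
export function maxEmbeddings(dimensionBudget: number, dimsPerVector: number): number {
  return Math.floor(dimensionBudget / dimsPerVector);
}

// 30M dimensions at 768 dims per embedding is about 39,000 embeddings.
```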
The primary use case is RAG — retrieval-augmented generation. The full pipeline stays inside Cloudflare:
- Ingestion Worker: receive a document, chunk it, generate embeddings with Workers AI (@cf/baai/bge-small-en-v1.5), store them in Vectorize.
- Query Worker: receive a question, embed it with the same model, find the closest vectors in Vectorize, use the retrieved documents as context, and call Workers AI for a final answer.
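The chunking step in the ingestion Worker can be as simple as a fixed window with overlap, so retrieval doesn't lose context at chunk boundaries. The sizes here are illustrative defaults, not a recommendation from the post:

```typescript
// Fixed-size chunking with overlap. Character-based for simplicity;
// token-based chunking tracks model limits more precisely.
export function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= overlap) throw new Error('chunkSize must exceed overlap');
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```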
```ts
// Store an embedding
const embedding = await env.AI.run('@cf/baai/bge-small-en-v1.5', {
  text: documentChunk
});

await env.VECTOR_INDEX.upsert([{
  id: documentId,
  values: embedding.data[0],
  metadata: { source: url, chunk: chunkIndex }
}]);

// Query for similar content
const queryEmbedding = await env.AI.run('@cf/baai/bge-small-en-v1.5', {
  text: userQuestion
});

const results = await env.VECTOR_INDEX.query(queryEmbedding.data[0], { topK: 5 });
// Pass results.matches as context to Workers AI
```
The entire RAG pipeline — embedding, storage, retrieval, generation — without leaving Cloudflare’s network or managing any infrastructure.
For a deeper look at RAG architecture, see RAG in Production.
Browser Rendering — Headless Chrome at the Edge
Browser Rendering gives you a Puppeteer-compatible API that runs a headless Chrome browser inside a Worker. No Selenium server, no managed browser farm, no infrastructure to maintain.
Requires: Workers Paid ($5/month). Free: 2 concurrent sessions within that plan.
For AI researchers, the key use case is web scraping for RAG ingestion. Dynamic pages that require JavaScript execution can’t be fetched with a simple fetch() call — you need a real browser to render them first.
```ts
import puppeteer from '@cloudflare/puppeteer';

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { url } = await req.json<{ url: string }>();

    const browser = await puppeteer.launch(env.BROWSER);
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    const content = await page.evaluate(() => document.body.innerText);
    await browser.close();

    // Feed `content` into your Vectorize ingestion pipeline
    return Response.json({ content });
  }
};
```
Other uses: generate PDF reports from rendered HTML, take screenshots for visual regression testing, test pages that require authentication.
AutoRAG — Managed RAG Pipeline (Preview)
AutoRAG is Cloudflare’s attempt to package the Workers AI + Vectorize RAG pipeline into a turnkey product. Upload documents, configure a source (R2 bucket, website URL), and get a RAG API endpoint back. Chunking, embedding, storage, and retrieval are all managed.
Status: limited preview as of February 2026. Not generally available.
Why it matters when it ships: the manual Workers AI + Vectorize pipeline above requires wiring up embeddings, chunking logic, metadata storage, and query handling. AutoRAG handles all of that. For researchers who want to experiment with RAG without building the infrastructure, it removes the setup cost entirely.
Watch this space. Cloudflare ships fast.
Use Case → Service Mapping
If you have a specific problem and aren’t sure which Cloudflare service to reach for:
| I want to… | Use this |
|---|---|
| Host a static site with CI/CD | Pages |
| Build a serverless API or webhook handler | Workers |
| Store structured relational data | D1 |
| Cache session data or config values | KV |
| Store user-uploaded files without egress fees | R2 |
| Send emails from a Worker | Email Routing + Worker handler |
| Receive email at my custom domain | Email Routing |
| Process background jobs asynchronously | Queues |
| Handle real-time WebSocket connections with state | Durable Objects |
| Connect my existing Postgres or MySQL database | Hyperdrive |
| Track custom analytics events from Workers | Analytics Engine |
| Host video content | Stream |
| Resize and optimize images on the fly | Images |
| Protect forms from bots without annoying CAPTCHAs | Turnstile |
| Expose a homelab service without opening ports | Tunnel |
| Add authentication in front of any URL | Zero Trust Access |
| Give my team private network access remotely | WARP / Cloudflare One |
| Block malware and ads at the DNS level | Gateway |
| Run AI inference without a GPU or API key | Workers AI |
| Proxy and log all my AI API calls | AI Gateway |
| Build a vector search or RAG system | Vectorize |
| Scrape dynamic web pages from a Worker | Browser Rendering |
| Get a fully managed RAG pipeline | AutoRAG (preview) |
Pricing: Free vs. Paid
Two paid tiers cover most upgrade decisions:
Workers Paid — $5/month unlocks the developer platform:
| Metric | Free | Workers Paid |
|---|---|---|
| Workers requests | 100K/day | 10M/month + $0.30/million |
| Workers CPU time | 10ms/invocation | 30s/invocation |
| KV reads | 100K/day | 10M/month |
| KV writes | 1K/day | 1M/month |
| D1 rows read | 5M/day | 25M/day |
| D1 rows written | 100K/day | 50M/month |
| D1 storage | 1GB | 5GB |
| R2 storage | 10GB | 10GB + $0.015/GB |
| R2 Class A operations | 1M/month | 1M/month + $4.50/million |
| Queues operations | 1M/month | 1M/month + $0.40/million |
| Workers AI neurons | 10K/day | 10K/day + $0.011/1K |
| Vectorize dimensions | 30M free | 30M + usage-based above |
| Durable Objects | Not included | Included |
| Hyperdrive | Not included | Included |
| Browser Rendering | Not included | Included (2 sessions) |
Zero Trust Teams — $3/user/month removes the 50-user cap on Tunnel, Access, and Gateway, and adds more granular policy options. For a personal homelab with fewer than 50 people who need access, the free tier covers everything indefinitely.
The honest math: personal projects almost never hit free tier limits. A blog with 10,000 monthly visitors uses roughly 10,000 Workers requests per day — right at the free limit, but not over it. The $5/month Workers Paid upgrade is worth considering when you’re building an actual application with multiple users and background job requirements.
My Personal Stack — What I Actually Use
Self-hosting: Tunnel + Zero Trust Access, both on the free tier. Gitea, Jellyfin, and my monitoring dashboard sit behind Access with email OTP. Proxmox never touches Tunnel — I access it via WARP when I’m away from home or on LAN when I’m not.
Portfolio: Pages + Pages Functions + KV (contact form submissions) + Turnstile (bot protection). Zero additional cost beyond the domain.
AI experimentation: Workers AI for quick inference tests (I keep a Worker deployed that accepts a prompt and returns a response — faster than opening a chat interface for simple tests), AI Gateway to log what I’m actually sending to OpenAI, and a small Vectorize index for a private notes RAG experiment.
What I’m not using yet: Queues, Durable Objects, and AutoRAG. My personal projects don’t need async job queues — the contact form is the most complex “background” operation I have, and it’s fast enough to handle synchronously. AutoRAG is still in preview. I’m watching it.
My monthly Cloudflare bill:
| Service | Cost/month |
|---|---|
| Domain registration | ~$0.84 |
| Pages hosting | $0 |
| Tunnel + Access (free tier) | $0 |
| Workers (free tier) | $0 |
| KV, D1, R2 (free tier) | $0 |
| Workers AI (free tier) | $0 |
| AI Gateway | $0 |
| Vectorize (free tier) | $0 |
| Total | ~$0.84/month |
What Cloudflare Is Not Great For
Long-running processes. Workers have a 10ms CPU cap on the free tier and a 30-second cap on the paid tier. A data processing job that takes five minutes can’t run in a Worker. You can work around this with Queues — break the work into chunks, each handled by a separate message — but it’s more complex than a traditional server. For jobs that genuinely need to run for minutes, a VPS or homelab machine is the right tool.
GPU-intensive ML workloads. Workers AI runs on Cloudflare’s managed GPU infrastructure. You don’t choose the hardware, you can’t fine-tune models, and you can’t load custom weights. For actual ML research — training, fine-tuning, running inference on custom models — you want Runpod, Lambda Labs, or Vast.ai. Cloudflare Workers AI is great for inference on pre-existing models; it’s not a research compute platform.
Persistent server processes. If you need a long-lived process — a Discord bot, a WebSocket server that maintains state for thousands of concurrent connections at scale, a background scheduler — Cloudflare’s architecture isn’t the right fit. Durable Objects help with stateful coordination, but they’re designed for edge use cases, not general-purpose server processes. A VPS is simpler for this.
Strict data residency requirements. Cloudflare’s network is global by design. If your use case requires that data stays within a specific geographic region (EU data residency for GDPR, or US-only storage for government compliance), verify Cloudflare’s compliance documentation before committing. They do offer data locality controls in some products, but it requires research.
Where to Go Next
Cloudflare has quietly become a full-platform company. The free tier alone covers a professional portfolio, a safely-exposed homelab, and serious AI inference experimentation — for less than a dollar a month (the domain).
If you’re new to Cloudflare, start with the two highest-value entry points:
1. Domain + Pages: register your domain through Cloudflare and deploy a static site on Pages. The setup is documented in Why I Registered My Domain Through Cloudflare and The Complete Cloudflare Stack for Developer Portfolios. Zero learning curve, immediate value.
2. Tunnel + Access: if you run a homelab, add Tunnel and Zero Trust Access. Documented in Cloudflare Tunnel Changed How I Run My Homelab. No open ports, automatic SSL, free authentication layer — the configuration is an afternoon of work and it runs indefinitely without maintenance.
Once those are running, Workers AI is the natural next experiment if you do any AI work. Create a Worker, bind the AI model, and you have a serverless inference endpoint in ten minutes.
One last note: Cloudflare ships constantly. AutoRAG, AI Gateway, Browser Rendering, and Vectorize all appeared in the past two years. The service table above will look different in twelve months. Worth revisiting when you’re planning your next project.
Related posts: