Every time I think I know what Cloudflare offers, I open the dashboard and find something I’d never noticed. Last month it was AutoRAG — a managed RAG pipeline that didn’t exist six months ago. Before that it was Browser Rendering, a headless Chrome service you can call from a Worker. The month before that, AI Gateway appeared and silently became one of the most useful proxy tools I use.

The problem isn’t that Cloudflare hides these services. It’s that there are now 40+ products across the dashboard, and it’s genuinely hard to know which ones are relevant to what you’re building, which are free, and which require a credit card.

This post is my answer to that problem. It’s a reference guide — the kind you bookmark and come back to when you’re starting a new project or wondering “does Cloudflare have something for this?” It covers three use cases:

  • Self-hosting — exposing homelab services safely without opening ports
  • Web app development — building full-stack apps entirely on Cloudflare’s edge
  • AI research — running inference, vector search, and RAG pipelines without leaving the ecosystem

I’ve written detailed implementation guides for specific Cloudflare services separately. This post links to those rather than repeating the setup steps. Think of it as the map; the other posts are the turn-by-turn directions.


The Free Tier Philosophy

Cloudflare’s business is selling network services to enterprises. The free tier isn’t charity — it’s a customer acquisition strategy. But the side effect is real: the free limits are production-grade, not demo-grade.

AWS gives you 750 hours of a t2.micro per month for one year, then shuts it off. GCP gives you $300 in credits that expire in 90 days. Both free tiers exist specifically to pressure you toward paid before you’re ready.

Cloudflare’s free tier is different. 100,000 Workers requests per day is enough to handle a real personal app. 10GB of R2 storage with zero egress fees is enough for years of a blog’s media library. 10,000 AI inference “neurons” per day is enough to run a dozen LLM experiments. None of these expire.

The upgrade decision is straightforward: if you hit the limits, the Workers Paid plan at $5/month unlocks dramatically more capacity across almost every service in the ecosystem. Most personal projects never get there.


Service Directory

Every Cloudflare service relevant to self-hosting, web apps, and AI research — in one table.

| Service | Category | Free Limit | Paid From | What it’s for |
|---|---|---|---|---|
| Pages | Web hosting | Unlimited bandwidth, 500 builds/mo | $20/mo Pro | Static site hosting with Git CI/CD |
| Workers | Serverless | 100K requests/day, 10ms CPU | $5/mo | Edge functions, API routes, middleware |
| KV | Key-value store | 100K reads/day, 1K writes/day, 1GB | $5/mo base | Session storage, config, feature flags |
| D1 | SQLite database | 5M rows read/day, 100K writes/day, 1GB | $5/mo base | Relational data for Workers apps |
| R2 | Object storage | 10GB, 1M Class A ops/mo | $0.015/GB above free | Files, images, blobs — zero egress fees |
| Queues | Message queue | 1M operations/mo | $5/mo base | Background jobs, webhook processing |
| Durable Objects | Stateful edge | Basic (Workers Paid required) | $5/mo | WebSocket state, real-time coordination |
| Hyperdrive | DB accelerator | Workers Paid required | $5/mo | Pool connections to existing Postgres/MySQL |
| Analytics Engine | Custom analytics | Free (limited) | Paid above | Time-series event tracking from Workers |
| Email Routing | Email forwarding | Free | Free | Forward you@yourdomain.com to any inbox |
| Stream | Video CDN | 1K min storage free | $5/mo + usage | Video hosting and adaptive streaming |
| Images | Image optimization | Free resize transforms | $5/mo | On-the-fly image resizing via URL |
| Turnstile | CAPTCHA | Free, unlimited | Free | Bot-proof forms without user friction |
| Tunnel | Secure ingress | Free | Free | Expose homelab services — no open ports |
| Zero Trust Access | Identity/auth | 50 users free | $3/user/mo | Auth layer in front of any URL |
| WARP | VPN client | Free personal | $3/user/mo | Encrypted DNS + private network access |
| Gateway | DNS/HTTP filtering | 3 locations free | $3/user/mo | DNS-level malware and ad blocking |
| WAF | Firewall | 5 custom rules free | $20/mo Pro | Block attack patterns by rule |
| DDoS | L3/L4/L7 protection | Always-on, free | Free | Automatic volumetric attack mitigation |
| DNS | Nameserver | Free | Free | Authoritative DNS with Anycast routing |
| SSL/TLS | Certificates | Free, auto-renewing | Free | HTTPS for any domain |
| Registrar | Domain registration | At-cost (~$10.11/yr .com) | At-cost | Buy/transfer domains at wholesale price |
| Web Analytics | Traffic stats | Free | Free | Cookie-free, no-JavaScript-required analytics |
| Workers AI | AI inference | 10K neurons/day | $0.011/1K neurons | Run 50+ models at the edge |
| AI Gateway | AI proxy | Free | Free (currently) | Unified proxy for all AI providers |
| Vectorize | Vector database | 30M dimensions free | Paid above | Semantic search, RAG retrieval layer |
| Browser Rendering | Headless Chrome | 2 concurrent sessions | $5/mo base | Scraping, PDF generation from Workers |
| AutoRAG | Managed RAG | Limited preview | TBD | Full document ingestion + RAG pipeline |

Use Case 1: Self-Hosting and the Homelab Stack

Running services at home — Jellyfin, Home Assistant, Gitea, Proxmox, Paperless-ngx — used to require a stack of configuration glue: dynamic DNS, nginx reverse proxy, certbot for SSL, UFW rules, and careful port forwarding on the router. Cloudflare replaces most of that with three free services: Tunnel, Access, and Gateway.

Cloudflare Tunnel — The Foundation

Tunnel creates an outbound-only encrypted connection from your server to Cloudflare’s edge. Traffic flows in from the internet, reaches Cloudflare, and is forwarded to your server over that established connection. Your router never has an open port.

What you eliminate: port forwarding, DDNS, certbot, nginx reverse proxy config. Cloudflare handles SSL automatically for any hostname you route through the tunnel.

Free tier: unlimited tunnels, unlimited routes, no bandwidth cap.

I’ve written a full setup guide including Docker Compose configs and the Access policy integration: Cloudflare Tunnel Changed How I Run My Homelab.

Zero Trust Access — The Authentication Layer

Tunnel exposes your services to the internet. Access decides who can reach them. Before a request hits your Gitea or Home Assistant, it hits an Access policy that requires the user to authenticate.

Free tier: up to 50 users, which covers any homelab or small team setup.

Authentication options configured entirely from the dashboard:

  • Email OTP — simplest; sends a one-time code to a verified address. No app required.
  • GitHub OAuth — good if you want to restrict to specific GitHub accounts.
  • Google OAuth — works with any Google account or a specific Google Workspace domain.

The decision of what to protect and how aggressively depends on the service:

| Service | Expose Publicly? | Use Access? | Auth Method |
|---|---|---|---|
| Blog / Portfolio | Yes | No | — |
| Uptime Kuma status page | Yes | No | — (read-only is fine) |
| Jellyfin | No | Yes | Email OTP |
| Gitea | No | Yes | GitHub OAuth |
| Home Assistant | No | Yes | Email OTP |
| Vaultwarden | No | Yes | Email OTP + app 2FA |
| Paperless-ngx | No | Yes | Email OTP |
| Proxmox UI | Never publicly | Never | LAN or WARP only |

The Proxmox case is deliberate. The hypervisor management UI is the most sensitive service in a homelab. I don’t route it through Tunnel at all — I access it on the local network or through WARP.

WARP / Cloudflare One — The VPN Layer

WARP is Cloudflare’s client-based VPN alternative. Install it on your phone or laptop, and your device’s DNS goes through Cloudflare’s network. The Zero Trust flavor integrates with your Access policies to give you private network access without public hostnames.

Practical homelab use: access Proxmox’s management interface from your phone when you’re away from home. WARP connects you into the private network, so Proxmox never needs a public URL.

When Tailscale is better: if you want a purely private mesh network between devices with no public-facing components at all, Tailscale’s WireGuard-based approach is simpler and purpose-built for that case. Cloudflare WARP + Tunnel is better when you want some services to be publicly accessible (through Tunnel + Access) and some to be private (through WARP).

Gateway — DNS-Level Filtering

Gateway is Cloudflare’s managed DNS resolver that can block malware domains, ad networks, and specific content categories before a request ever leaves your network.

Configuration: set your router’s upstream DNS to Cloudflare’s Gateway IP, and every device on the network benefits. On the free plan you get 3 locations and basic malware blocking.

Practical value for a homelab: blocks known-bad domains before they reach your exposed services. A DNS-level block doesn’t require the WAF to fire — the request never gets to your origin. It’s a lightweight first line of defense.


Use Case 2: Full-Stack Web App Development

This is where Cloudflare’s ecosystem is strongest. You can build a complete, production-grade web application — hosting, serverless compute, relational database, key-value cache, object storage, message queue, analytics — without leaving the ecosystem or opening a separate cloud account.

The Core Stack: Pages + Workers + D1 + KV + R2

I’ve covered this stack in detail in The Complete Cloudflare Stack for Developer Portfolios. The short version:

  • Pages hosts your static site, deploys on every git push, and gives you unlimited bandwidth at no cost.
  • Workers / Pages Functions run server-side logic at the edge — form handlers, API routes, middleware.
  • D1 is a SQLite database that lives alongside your Workers. 5M rows read per day, 100K writes, 1GB storage on the free tier.
  • KV is a distributed key-value store for session data, feature flags, or anything you need to read fast from anywhere in the world. 100K reads per day free.
  • R2 is S3-compatible object storage with zero egress fees. Store images, attachments, and generated files. 10GB free.
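
As a taste of the binding pattern, here’s a minimal KV-backed session counter. The binding name (SESSIONS) and key scheme are my own inventions, and the KV namespace is typed structurally so the logic reads as plain TypeScript — a sketch, not the full API surface:

```typescript
// Minimal sketch of KV as a session store. The binding name (SESSIONS in
// wrangler config) and the key scheme are assumptions for illustration.
type KVLike = {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
};

export async function touchSession(
  kv: KVLike,
  sessionId: string
): Promise<{ id: string; visits: number }> {
  const raw = await kv.get(`session:${sessionId}`);
  const session: { id: string; visits: number } = raw
    ? JSON.parse(raw)
    : { id: sessionId, visits: 0 };
  session.visits += 1;
  // Expire idle sessions after an hour (TTL is a KV-native feature)
  await kv.put(`session:${sessionId}`, JSON.stringify(session), { expirationTtl: 3600 });
  return session;
}
```

Inside a Worker you’d pass `env.SESSIONS` as the first argument; the structural type matches the real KV namespace methods used here.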

For the full setup guide with code examples, TypeScript types, and wrangler configuration: The Complete Cloudflare Stack for Developer Portfolios.

Queues — Async Background Jobs

Workers limit CPU time per invocation: 10ms on the free tier, 30 seconds on Workers Paid (time spent waiting on I/O doesn’t count toward it). Even so, slow or failure-prone work — sending an email, processing an uploaded file, calling an external API that’s sluggish — doesn’t belong in the request path. You need to hand it off to a background process.

Cloudflare Queues is the answer. A producer Worker publishes a message to a Queue. A consumer Worker picks it up, processes it, and acknowledges it. If the consumer fails, the message is retried automatically.

Free tier: 1 million queue operations per month.

// Producer: publish to queue from a request handler
export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { email } = await req.json();
    await env.EMAIL_QUEUE.send({ type: 'welcome_email', to: email });
    return new Response('Queued', { status: 202 });
  }
};

// Consumer: wrangler.jsonc declares this Worker as a queue consumer
export default {
  async queue(batch: MessageBatch<{ type: string; to: string }>, env: Env) {
    for (const msg of batch.messages) {
      await sendEmail(msg.body.to, msg.body.type);
      msg.ack();
    }
  }
};

The wrangler.jsonc binding connects them. The producer doesn’t wait for the email to send — it returns immediately, and the consumer processes the work asynchronously.
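
One refinement worth knowing: alongside `ack()`, each message also has `retry()`, which requeues just that message instead of failing the whole batch. A sketch of that pattern, with the message shape typed structurally and the handler standing in for whatever work your consumer does:

```typescript
// Per-message ack/retry over a Queues batch. MessageLike mirrors the shape
// Cloudflare hands a queue consumer; the handler is a stand-in for real work.
type MessageLike<T> = { body: T; ack(): void; retry(): void };

export async function processBatch<T>(
  messages: MessageLike<T>[],
  handler: (body: T) => Promise<void>
): Promise<{ ok: number; retried: number }> {
  let ok = 0;
  let retried = 0;
  for (const msg of messages) {
    try {
      await handler(msg.body);
      msg.ack(); // acknowledged messages are not redelivered
      ok++;
    } catch {
      msg.retry(); // only this message goes back on the queue
      retried++;
    }
  }
  return { ok, retried };
}
```

This keeps one poison message from forcing redelivery of its healthy neighbors.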

Durable Objects — Stateful Edge Compute

Workers are stateless by design. Every invocation starts fresh. For most request/response patterns, that’s exactly right. But some problems need state that’s consistent across many concurrent clients: a live visitor counter, a collaborative cursor system, a WebSocket hub for real-time notifications.

Durable Objects solve this by giving you a single, globally addressable JavaScript object with in-memory state and durable storage. All requests to a specific Durable Object are routed to the same physical instance, so state is always consistent.

When to use Durable Objects vs. D1:

  • D1 for traditional database queries — insert/select/update patterns, user records, blog posts.
  • Durable Objects for real-time coordination — WebSocket connections that need shared state, live counters, collaborative editing.

Practical examples:

  • A live view counter for blog posts that’s accurate to the second (not a cached approximation)
  • A rate limiter that’s consistent across all edge locations
  • A WebSocket server for a real-time chat feature

Durable Objects require Workers Paid ($5/month). There’s no additional charge beyond the base plan for basic use — duration-based billing only kicks in above specific thresholds.
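
To make the model concrete, here’s a view-counter sketch. The real class receives a DurableObjectState (with storage on `state.storage`) plus an env; here the storage is typed structurally so the counting logic stands alone, and the names are mine:

```typescript
// A live view counter in the Durable Object style. Every request for a given
// object ID reaches the same instance, so `count` is globally consistent.
type StorageLike = {
  get(key: string): Promise<number | undefined>;
  put(key: string, value: number): Promise<void>;
};

export class ViewCounter {
  private count: number | undefined;

  constructor(private storage: StorageLike) {}

  async fetch(_req: Request): Promise<Response> {
    if (this.count === undefined) {
      // Lazily restore persisted state after an eviction or restart
      this.count = (await this.storage.get('count')) ?? 0;
    }
    this.count += 1;
    await this.storage.put('count', this.count);
    return new Response(String(this.count));
  }
}
```

The in-memory `count` gives fast reads between requests; the `put` makes it survive restarts. That combination is the whole pitch.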

Hyperdrive — Turbocharge Your Existing Database

If you already have a Postgres database on Neon, Supabase, or a VPS, connecting to it from Workers at the edge creates a cold TCP connection on every request. Database connection establishment is slow — easily 100–300ms depending on geography.

Hyperdrive maintains a warm connection pool at the edge. Instead of your Worker connecting to your database server, it connects to Hyperdrive, which already has connections pooled and ready. The result is dramatically lower latency for database-heavy Workers.

When to use: you have an existing Postgres/MySQL database you don’t want to migrate to D1 — maybe because it has complex queries, existing data, or stored procedures that D1 can’t replicate.

Requires: Workers Paid ($5/month).

Analytics Engine — Custom Event Tracking

KV is too slow for high-write event streams. D1 is a transactional database — not optimized for “write one event per page view” patterns. Analytics Engine is Cloudflare’s purpose-built time-series event store, designed to receive writes from Workers without bottlenecking your request handlers.

// Track an event from any Worker
env.ANALYTICS.writeDataPoint({
  blobs: ['blog_post_read', postSlug],
  doubles: [1],
  indexes: ['blog']
});

Practical uses:

  • Custom page view tracking without JavaScript on the client side (the Worker logs the event on every request)
  • API endpoint usage counts
  • Error rate monitoring without a third-party APM

The data is queryable via the Cloudflare API using a SQL-like syntax. It’s not a replacement for a full analytics platform, but for custom metrics it’s simpler and cheaper than anything else in the ecosystem.

Email Routing — Free Email Forwarding

If your domain is registered through or managed by Cloudflare, Email Routing gives you you@yourdomain.com for free. Messages sent to your domain address are forwarded to any personal inbox — Gmail, Fastmail, iCloud, wherever.

Setup takes under five minutes in the dashboard. Cloudflare adds the required MX records automatically.

Bonus: Email Routing lets you write a Worker that handles incoming messages programmatically. Parse the email, extract data, post it to a webhook, auto-reply, or forward selectively based on content. It’s essentially a serverless email processing pipeline.
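
A sketch of the selective-forwarding idea: keep the routing decision in a pure function, and let the Worker’s email handler apply it via message.forward(). The addresses and rules below are placeholders, not a real configuration:

```typescript
// Decide where an incoming message should go based on sender and subject.
// Addresses and rules here are placeholders for illustration only.
type Route = { action: 'forward'; to: string } | { action: 'drop' };

export function routeEmail(from: string, subject: string): Route {
  if (from.endsWith('@newsletter.example')) return { action: 'drop' };
  if (/invoice|receipt/i.test(subject)) {
    return { action: 'forward', to: 'bookkeeping@example.com' };
  }
  return { action: 'forward', to: 'me@example.com' };
}

// In the Worker's email handler, this drives the actual forwarding:
//   async email(message, env, ctx) {
//     const route = routeEmail(message.from, message.headers.get('subject') ?? '');
//     if (route.action === 'forward') await message.forward(route.to);
//   }
```

Keeping the rules in a pure function also means you can unit-test your routing without sending a single email.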

For the DNS and domain setup prerequisites, see Why I Registered My Domain Through Cloudflare.

Stream and Images — When Scale Demands It

Stream handles video hosting and adaptive streaming. Useful if you’re building a video platform and don’t want to pay Vimeo’s rates or host the bandwidth yourself. $5/month base plus $1 per 1,000 minutes stored and delivered.

Cloudflare Images handles on-the-fly image resizing via URL parameters. ?width=400&format=webp turns any stored image into a WebP-optimized thumbnail. $5/month for the first 5,000 images stored.

Honest assessment for personal projects: if you’re running a blog or small app, R2 for storage and a plain <img> tag with browser-native lazy loading is sufficient. Stream and Images shine at scale — multiple formats, multiple resolutions, high request volume — where the CDN integration pays for itself in bandwidth savings.


Use Case 3: AI Research and Inference

I started using Cloudflare for AI workloads almost by accident. Workers AI appeared in the dashboard with a selection of models and a one-API-call interface. I was curious. Within a week it had replaced my local Ollama setup for quick inference experiments because it required zero infrastructure.

Workers AI — Edge Inference Without a GPU

Workers AI gives you access to 50+ models — including Llama 3.1 8B, Mistral 7B, Whisper for transcription, SDXL for image generation, and several embedding models — via the same env.AI binding pattern as D1 and KV.

Free tier: 10,000 neurons per day. Neurons are Cloudflare’s compute unit — roughly correlated to inference time, not token count. For casual experimentation, 10K neurons per day is generous.

Above free: $0.011 per 1,000 neurons.

The API is OpenAI-compatible. If you’ve written code targeting OpenAI’s /v1/chat/completions endpoint, it works with Workers AI by changing the base URL:

// Using Cloudflare's built-in AI binding (simplest)
const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: userPrompt }
  ]
});

// Or using the OpenAI SDK pointed at the Workers AI base URL
// (requires: npm install openai)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: env.CLOUDFLARE_API_TOKEN,
  baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CF_ACCOUNT_ID}/ai/v1`
});

Good for: classification, summarization, content moderation, embedding generation, transcription, quick prototyping.

Not great for: tasks requiring frontier-model reasoning. Llama 3.1 8B is capable but not GPT-4 or Claude Sonnet. For complex reasoning or tool use, routing through AI Gateway to OpenAI/Anthropic (see next section) is the right move.

AI Gateway — Unified Proxy for All Your AI Calls

The problem AI Gateway solves: you’re calling OpenAI from one part of your app, Anthropic from another, Workers AI for embeddings, and Hugging Face for a fine-tuned model. There’s no unified place to see all the requests, no retry logic, no cost visibility, and no way to cache identical prompts.

AI Gateway sits between your code and any AI provider. You point your SDK at the AI Gateway URL instead of the provider URL. Everything flows through it.

Currently free across all plans (pricing may change as the product matures).

Supported providers:

| Provider | Supported |
|---|---|
| OpenAI | Yes |
| Anthropic (Claude) | Yes |
| Cloudflare Workers AI | Yes |
| Google Gemini | Yes |
| Hugging Face | Yes |
| Cohere | Yes |
| Groq | Yes |
| Azure OpenAI | Yes |

What you get:

  • Request logging — every prompt and response is logged. You can replay, search, and inspect them in the dashboard.
  • Semantic caching — identical or near-identical prompts return cached responses immediately. No inference cost, no latency.
  • Rate limiting — protect against runaway AI costs from a bug in your app.
  • Fallback providers — if OpenAI rate-limits you, AI Gateway can automatically retry the same request against Workers AI or another provider.

For AI researchers running multiple experiments, the logging alone is worth it. You get a complete audit trail of every prompt you ran without instrumenting your own code.

// Route through AI Gateway by changing the base URL
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${env.CF_ACCOUNT_ID}/${env.GATEWAY_ID}/openai`
});
// Everything else stays the same
// Everything else stays the same

Vectorize — Vector Database for Semantic Search and RAG

Vectorize is Cloudflare’s managed vector database. Unlike Pinecone, Weaviate, or Chroma, there’s nothing to self-host, no separate API key, and no separate service to integrate. It uses the same env.MY_INDEX binding pattern as everything else.

Free tier: 30 million vector dimensions. At 768 dimensions (a common embedding size), that’s roughly 39,000 embeddings. Enough for a meaningful RAG knowledge base.

The primary use case is RAG — retrieval-augmented generation. The full pipeline stays inside Cloudflare:

  1. Ingestion Worker: receive a document, chunk it, generate embeddings with Workers AI (@cf/baai/bge-small-en-v1.5), store in Vectorize.
  2. Query Worker: receive a question, embed it with the same model, find the closest vectors in Vectorize, use the retrieved documents as context, call Workers AI for a final answer.
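
The chunking in step 1 can be as simple as a sliding window over the text. A minimal character-window chunker — the window and overlap sizes are arbitrary starting points, not recommendations:

```typescript
// Split a document into overlapping character windows for embedding.
// 1,000 chars with 200 overlap is an arbitrary starting point, not a rule.
export function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= overlap) throw new Error('size must exceed overlap');
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap ensures a sentence falling on a boundary still appears whole in at least one chunk.
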
// Store an embedding
const embedding = await env.AI.run('@cf/baai/bge-small-en-v1.5', {
  text: documentChunk
});
await env.VECTOR_INDEX.upsert([{
  id: documentId,
  values: embedding.data[0],
  metadata: { source: url, chunk: chunkIndex }
}]);

// Query for similar content
const queryEmbedding = await env.AI.run('@cf/baai/bge-small-en-v1.5', {
  text: userQuestion
});
const results = await env.VECTOR_INDEX.query(queryEmbedding.data[0], { topK: 5 });
// Pass results.matches as context to Workers AI

The entire RAG pipeline — embedding, storage, retrieval, generation — without leaving Cloudflare’s network or managing any infrastructure.
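
Between retrieval and generation there’s one step the snippets gloss over: assembling the retrieved chunks into a prompt. Assuming you stored each chunk’s text and source in the vector metadata at upsert time (the upsert above stores source; storing text alongside it is my addition), it might look like:

```typescript
// Build a grounded prompt from Vectorize matches. Assumes chunk text was
// stored in metadata at upsert time; the score threshold is an assumption.
type MatchLike = {
  id: string;
  score: number;
  metadata: { text: string; source: string };
};

export function buildRagPrompt(
  question: string,
  matches: MatchLike[],
  minScore = 0.7
): string {
  const context = matches
    .filter((m) => m.score >= minScore)
    .map((m, i) => `[${i + 1}] ${m.metadata.text} (source: ${m.metadata.source})`)
    .join('\n');
  return `Answer using only the context below. Cite sources by number.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

The score threshold matters: passing weak matches as context is how RAG systems confidently answer from irrelevant documents.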

For a deeper look at RAG architecture, see RAG in Production.

Browser Rendering — Headless Chrome at the Edge

Browser Rendering gives you a Puppeteer-compatible API that runs a headless Chrome browser inside a Worker. No Selenium server, no managed browser farm, no infrastructure to maintain.

Requires: Workers Paid ($5/month), which includes 2 concurrent browser sessions at no extra charge.

For AI researchers, the key use case is web scraping for RAG ingestion. Dynamic pages that require JavaScript execution can’t be fetched with a simple fetch() call — you need a real browser to render them first.

import puppeteer from '@cloudflare/puppeteer';

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { url } = await req.json<{ url: string }>();
    const browser = await puppeteer.launch(env.BROWSER);
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    const content = await page.evaluate(() => document.body.innerText);
    await browser.close();
    // Feed `content` into your Vectorize ingestion pipeline
    return Response.json({ content });
  }
};

Other uses: generate PDF reports from rendered HTML, take screenshots for visual regression testing, test pages that require authentication.

AutoRAG — Managed RAG Pipeline (Preview)

AutoRAG is Cloudflare’s attempt to package the Workers AI + Vectorize RAG pipeline into a turnkey product. Upload documents, configure a source (R2 bucket, website URL), and get a RAG API endpoint back. Chunking, embedding, storage, and retrieval are all managed.

Status: limited preview as of February 2026. Not generally available.

Why it matters when it ships: the manual Workers AI + Vectorize pipeline above requires wiring up embeddings, chunking logic, metadata storage, and query handling. AutoRAG handles all of that. For researchers who want to experiment with RAG without building the infrastructure, it removes the setup cost entirely.

Watch this space. Cloudflare ships fast.


Use Case → Service Mapping

If you have a specific problem and aren’t sure which Cloudflare service to reach for:

| I want to… | Use this |
|---|---|
| Host a static site with CI/CD | Pages |
| Build a serverless API or webhook handler | Workers |
| Store structured relational data | D1 |
| Cache session data or config values | KV |
| Store user-uploaded files without egress fees | R2 |
| Send emails from a Worker | Email Routing + Worker handler |
| Receive email at my custom domain | Email Routing |
| Process background jobs asynchronously | Queues |
| Handle real-time WebSocket connections with state | Durable Objects |
| Connect my existing Postgres or MySQL database | Hyperdrive |
| Track custom analytics events from Workers | Analytics Engine |
| Host video content | Stream |
| Resize and optimize images on the fly | Images |
| Protect forms from bots without annoying CAPTCHAs | Turnstile |
| Expose a homelab service without opening ports | Tunnel |
| Add authentication in front of any URL | Zero Trust Access |
| Give my team private network access remotely | WARP / Cloudflare One |
| Block malware and ads at the DNS level | Gateway |
| Run AI inference without a GPU or API key | Workers AI |
| Proxy and log all my AI API calls | AI Gateway |
| Build a vector search or RAG system | Vectorize |
| Scrape dynamic web pages from a Worker | Browser Rendering |
| Get a fully managed RAG pipeline | AutoRAG (preview) |

Pricing: Free vs. Paid

Two paid tiers cover most upgrade decisions:

Workers Paid — $5/month unlocks the developer platform:

| Metric | Free | Workers Paid |
|---|---|---|
| Workers requests | 100K/day | 10M/month + $0.30/million |
| Workers CPU time | 10ms/invocation | 30s/invocation |
| KV reads | 100K/day | 10M/month |
| KV writes | 1K/day | 1M/month |
| D1 rows read | 5M/day | 25M/day |
| D1 rows written | 100K/day | 50M/month |
| D1 storage | 1GB | 5GB |
| R2 storage | 10GB | 10GB + $0.015/GB |
| R2 Class A operations | 1M/month | 1M/month + $4.50/million |
| Queues operations | 1M/month | 1M/month + $0.40/million |
| Workers AI neurons | 10K/day | 10K/day + $0.011/1K |
| Vectorize dimensions | 30M free | 30M + usage-based above |
| Durable Objects | Not included | Included |
| Hyperdrive | Not included | Included |
| Browser Rendering | Not included | Included (2 sessions) |

Zero Trust Teams — $3/user/month removes the 50-user cap on Tunnel, Access, and Gateway, and adds more granular policy options. For a personal homelab with fewer than 50 people who need access, the free tier covers everything indefinitely.

The honest math: personal projects almost never hit free tier limits. Even if a blog with 10,000 monthly visitors somehow generated 10,000 Workers requests per day, that’s still only a tenth of the 100K/day free limit. The $5/month Workers Paid upgrade is worth considering when you’re building an actual application with multiple users and background job requirements.
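
A back-of-envelope helper for the upgrade decision, using only the request pricing from the table above (base $5/month, 10M requests included, $0.30 per extra million — other metered products excluded):

```typescript
// Estimate a monthly Workers Paid bill from request volume alone.
// Figures match the pricing table above; other metered usage is excluded.
export function workersMonthlyCost(requestsPerMonth: number): number {
  const base = 5; // Workers Paid base plan, USD
  const included = 10_000_000;
  const extraMillions = Math.max(0, requestsPerMonth - included) / 1_000_000;
  return base + extraMillions * 0.3;
}
```

At 12M requests/month that works out to about $5.60 — the overage pricing is gentle enough that the base plan dominates the bill for a long time.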


My Personal Stack — What I Actually Use

Self-hosting: Tunnel + Zero Trust Access, both on the free tier. Gitea, Jellyfin, and my monitoring dashboard sit behind Access with email OTP. Proxmox never touches Tunnel — I access it via WARP when I’m away from home or on LAN when I’m not.

Portfolio: Pages + Pages Functions + KV (contact form submissions) + Turnstile (bot protection). Zero additional cost beyond the domain.

AI experimentation: Workers AI for quick inference tests (I keep a Worker deployed that accepts a prompt and returns a response — faster than opening a chat interface for simple tests), AI Gateway to log what I’m actually sending to OpenAI, and a small Vectorize index for a private notes RAG experiment.

What I’m not using yet: Queues, Durable Objects, and AutoRAG. My personal projects don’t need async job queues — the contact form is the most complex “background” operation I have, and it’s fast enough to handle synchronously. AutoRAG is still in preview. I’m watching it.

My monthly Cloudflare bill:

| Service | Cost/month |
|---|---|
| Domain registration | $0.84 ($10.11/year) |
| Pages hosting | $0 |
| Tunnel + Access (free tier) | $0 |
| Workers (free tier) | $0 |
| KV, D1, R2 (free tier) | $0 |
| Workers AI (free tier) | $0 |
| AI Gateway | $0 |
| Vectorize (free tier) | $0 |
| Total | ~$0.84/month |

What Cloudflare Is Not Great For

Long-running processes. Workers have a 10ms CPU cap on the free tier and a 30-second cap on the paid tier. A data processing job that takes five minutes can’t run in a Worker. You can work around this with Queues — break the work into chunks, each handled by a separate message — but it’s more complex than a traditional server. For jobs that genuinely need to run for minutes, a VPS or homelab machine is the right tool.

GPU-intensive ML workloads. Workers AI runs on Cloudflare’s managed GPU infrastructure. You don’t choose the hardware, you can’t fine-tune models, and you can’t load custom weights. For actual ML research — training, fine-tuning, running inference on custom models — you want Runpod, Lambda Labs, or Vast.ai. Cloudflare Workers AI is great for inference on pre-existing models; it’s not a research compute platform.

Persistent server processes. If you need a long-lived process — a Discord bot, a WebSocket server that maintains state for thousands of concurrent connections at scale, a background scheduler — Cloudflare’s architecture isn’t the right fit. Durable Objects help with stateful coordination, but they’re designed for edge use cases, not general-purpose server processes. A VPS is simpler for this.

Strict data residency requirements. Cloudflare’s network is global by design. If your use case requires that data stays within a specific geographic region (EU data residency for GDPR, or US-only storage for government compliance), verify Cloudflare’s compliance documentation before committing. They do offer data locality controls in some products, but it requires research.


Where to Go Next

Cloudflare has quietly become a full-platform company. The free tier alone covers a professional portfolio, a safely-exposed homelab, and serious AI inference experimentation — for less than a dollar a month (the domain).

If you’re new to Cloudflare, start with the two highest-value entry points: Tunnel + Access for safely exposing your homelab (the setup guide is linked above), and Pages + Workers for your first full-stack app (covered in the portfolio stack guide).

Once those are running, Workers AI is the natural next experiment if you do any AI work. Create a Worker, bind the AI model, and you have a serverless inference endpoint in ten minutes.

One last note: Cloudflare ships constantly. AutoRAG, AI Gateway, Browser Rendering, and Vectorize all appeared in the past two years. The service table above will look different in twelve months. Worth revisiting when you’re planning your next project.

