The Complete Edge Architecture Guide (Part 1): Why We Went All-In on Cloudflare
Jane Cooper
•
Sep 3, 2025
This is Part 1 of our four-part series on building AI-powered applications on the edge. In this post, we'll cover the foundational Cloudflare architecture. Part 2 explores [how we use Hono framework and dynamic loading to fit everything in 128MB](./part-2-hono-framework.md). Part 3 dives into [our sub-10ms AI pipeline implementation](./part-3-ai-pipeline.md), and Part 4 details [our journey from LangGraph to Mastra](./part-4-from-langgraph-to-mastra.md) for AI orchestration.
Table of Contents
[The Workers Runtime](#the-workers-runtime-where-it-all-started)
[Bindings: Zero-Latency Service Connections](#bindings-the-secret-sauce-nobody-talks-about)
[R2: Object Storage with Zero Egress](#r2-the-s3-killer-with-zero-egress-fees)
[Queues: Async Processing That Works](#queues-async-processing-that-actually-works)
[Durable Objects: Distributed Coordination](#durable-objects-the-distributed-systems-cheat-code)
[KV: Simple Global Caching](#kv-the-cache-thats-actually-simple)
[Edge Positioning Advantages](#the-edge-positioning-advantage)
[Economics & Cost Analysis](#the-economics-of-it-all)
[Developer Experience](#the-developer-experience)
[Lessons Learned](#what-we-learned)
---
When we started building Kasava, we had a choice to make. We could go the traditional route with AWS Lambda, maybe some EC2 instances, deal with VPCs, cold starts, and the inevitable "why is our AWS bill $10k this month?" conversation. Or we could try something different.
We chose different. We chose Cloudflare Workers. And honestly? It's been one of the best architectural decisions we've made.
The Workers Runtime
When you're building an AI-powered platform that needs to process GitHub webhooks, run semantic search across millions of lines of code, and respond to natural language queries in real-time, you need speed. Cloudflare Workers gave us that. With V8 isolates instead of containers, we're talking sub-5ms cold starts globally. Not 5 seconds like Lambda -- 5 _milliseconds_. When a GitHub webhook hits our API, we're already processing it before Lambda would even wake up.
V8 Isolates vs Node.js
Here's where it gets interesting. Most serverless platforms, including AWS Lambda, run your code in containers or virtual machines. Each function gets its own isolated environment, which sounds great until you realize the overhead involved.
Traditional Node.js serverless (like Lambda):
Each function runs in its own container/VM
Full Node.js runtime per function instance
Cold starts measured in seconds
Memory overhead of ~50-100MB per instance
Process-level isolation
Cloudflare Workers with V8 Isolates:
Hundreds of isolates run within a single V8 runtime
Shared JavaScript engine across all functions
Cold starts measured in milliseconds
Memory overhead of ~5-10MB per isolate
V8-level isolation (same security model as Chrome tabs)
Think of it like the difference between giving each person their own house (containers) versus having hundreds of secure apartments in the same building (isolates). You get effectively the same security and isolation but with dramatically less overhead.
A single Workers runtime instance can handle thousands of isolates simultaneously, seamlessly switching between them in microseconds. When your code needs to run, it's not spinning up a new container or booting a Node.js process -- it's just creating a lightweight context within an already-running V8 engine.
This is why we can process GitHub webhooks in under 10ms globally while traditional serverless is still waking up.
But here's where it gets interesting...
Bindings: The Secret Sauce Nobody Talks About
You know what kills performance in serverless? Network calls. Every time you need to hit a database, call another service, or fetch from storage, you're adding latency. AWS makes you jump through VPC hoops, configure security groups, and pray to the networking gods. If you know, you know.
Enter bindings. These are zero-latency connections between Workers and other Cloudflare services. No network overhead. No authentication dance. Just the purest form of syntactical sugar and automagical, instant access.
```typescript
// The AWS way: credentials, region config, and a network round trip
const s3 = new AWS.S3({ region: 'us-west-2', credentials: {...} });
await s3.getObject({ Bucket: 'my-bucket', Key: 'file.txt' }).promise();

// We just do this:
const file = await env.KASAVA_PROFILE_DOCUMENTS_BUCKET.get('file.txt');
```
This is a simplified example, but the real value becomes apparent when you have bindings for everything in your stack and can treat them as ordinary services within your existing codebase:
- **R2 Buckets**: 5 different buckets for documents, recordings, archives
- **KV Namespaces**: Session storage and embedding cache
- **Queues**: 18 queues (9 main + 9 DLQ) for async processing
- **Durable Objects**: Real-time chat sessions and distributed coordination
- **Vectorize**: Experimental vector indexes for semantic search
Each binding is just there, available instantly, no configuration needed in code.
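From TypeScript's point of view, the whole stack shows up as properties on a single `Env` interface. Here's a hand-written sketch using the binding names that appear in this post (in practice you can generate these typings with Wrangler; the `INDEXING_COORDINATOR` name is illustrative):

```typescript
// Sketch of an Env interface -- every Cloudflare service is just a typed
// property, no clients to construct and no credentials to pass around.
interface Env {
  KASAVA_PROFILE_DOCUMENTS_BUCKET: R2Bucket; // profile documents
  RECORDING_STORAGE: R2Bucket;               // screen recordings
  EMBEDDING_CACHE: KVNamespace;              // cached embeddings
  GITHUB_EVENT_QUEUE: Queue;                 // webhook fan-out
  INDEXING_COORDINATOR: DurableObjectNamespace; // job coordination
}
```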
R2: The S3 Killer with Zero Egress Fees
Let's talk about the elephant in the room -- egress fees. AWS charges $90 per TB for data transfer. If you're serving videos, documents, or any kind of media, this adds up fast. Really fast.
Cloudflare R2? Zero. Zilch. Nada. Free egress.
We store everything in R2:
- Chrome extension recordings (up to 100MB per bug report)
- Organization documents
- GitHub comment archives
- Profile documents
- Indexing artifacts from our code analysis
```typescript
// Storing a 50MB screen recording
await env.RECORDING_STORAGE.put(
  `recordings/${bugReportId}/screen.webm`,
  recording,
  {
    httpMetadata: { contentType: "video/webm" },
    customMetadata: { userId, timestamp: Date.now().toString() },
  }
);

// Cost on AWS S3: storage + egress fees every time someone views it
// Cost on R2: just storage ($15/TB/month), zero egress
```
One of our users downloaded 3.2TB of historical data last month. On AWS? That would've been ~$300 just in egress. On Cloudflare? Free.
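Serving those downloads back out is just as boring, in the best way. Here's a minimal sketch, assuming the `RECORDING_STORAGE` binding from above and a route that mirrors the object key -- not our exact production handler:

```typescript
// Sketch: stream an R2 object straight back to the client.
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    // e.g. /recordings/<bugReportId>/screen.webm
    const key = url.pathname.replace(/^\//, "");

    const object = await env.RECORDING_STORAGE.get(key);
    if (!object) {
      return new Response("Not found", { status: 404 });
    }

    // object.body is a ReadableStream, so the video streams through the
    // Worker without buffering -- and R2 charges nothing for the egress.
    return new Response(object.body, {
      headers: {
        "Content-Type":
          object.httpMetadata?.contentType ?? "application/octet-stream",
        ETag: object.httpEtag,
      },
    });
  },
};
```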
Queues: Async Processing That Actually Works
Here's something nobody tells you about serverless -- handling async work is a pain. With Lambda, you're either chaining functions together (and paying for wait time), using SQS (more complexity), or giving up and running ECS tasks.
Cloudflare Queues integrate directly with Workers. No external services, no additional authentication, just push to a queue and consume from it.
```typescript
// Producer side - instant response to webhook
export default {
  async fetch(request: Request, env: Env) {
    const event = await request.json();
    // Queue it and return immediately
    await env.GITHUB_EVENT_QUEUE.send(event);
    return new Response("OK", { status: 200 });
  },
};

// Consumer side - process in background
export default {
  async queue(batch: MessageBatch, env: Env) {
    for (const message of batch.messages) {
      await processGitHubEvent(message.body);
      message.ack(); // Mark as processed
    }
  },
};
```
We run 9 different queues for different workloads:
- **repository-indexing**: Orchestrates parallel code analysis (processes 10,000+ files in under 5 minutes!)
- **file-indexing**: 50-file batches with 100 concurrent workers
- **embedding-generation**: 128-text batches for Voyage AI
- **github-events**: Primary webhook processing
- Plus 9 matching DLQs for failed message handling
Each queue has its own configuration -- batch sizes, timeouts, retry policies. All managed through `wrangler.jsonc`, deployed with a single command.
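To give a feel for the shape of that config, here's an illustrative `wrangler.jsonc` fragment. The queue names, batch sizes, and retry counts below are made up for the example, not our actual settings:

```jsonc
{
  "queues": {
    "producers": [
      // The binding name a Worker uses to send messages
      { "binding": "GITHUB_EVENT_QUEUE", "queue": "github-events" }
    ],
    "consumers": [
      {
        "queue": "github-events",
        "max_batch_size": 50,        // messages per batch
        "max_batch_timeout": 5,      // seconds to wait before flushing
        "max_retries": 3,            // after this, messages go to the DLQ
        "dead_letter_queue": "github-events-dlq"
      }
    ]
  }
}
```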
Durable Objects: The Distributed Systems Cheat Code
This is where things get really interesting. We needed to coordinate indexing jobs across 100+ parallel workers without database contention. Traditional approach? Distributed locks, Redis, maybe Zookeeper if you hate yourself.
Cloudflare's answer? Durable Objects.
```typescript
export class IndexingCoordinator {
  private state: DurableObjectState;
  private jobs: Map<string, JobState> = new Map();

  constructor(state: DurableObjectState) {
    this.state = state;
  }

  async claimJob(workerId: string): Promise<Job | null> {
    // This runs in exactly one place globally.
    // No race conditions, no distributed locks.
    // findAvailableJob() scans this.jobs for an unclaimed entry (elided here).
    const availableJob = this.findAvailableJob();
    if (availableJob) {
      availableJob.workerId = workerId;
      availableJob.claimedAt = Date.now();
      await this.state.storage.put(`job:${availableJob.id}`, availableJob);
    }
    return availableJob;
  }
}
```
Each Durable Object is a single-threaded JavaScript environment that's globally unique. Perfect for:
Coordinating our parallel indexing workers
Managing WebSocket connections for real-time chat
Maintaining session state without a database
No locks. No race conditions. Just JavaScript running in exactly one place.
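For completeness, here's a sketch of how a Worker might talk to that coordinator. The binding name and route are assumptions for illustration, and the coordinator class would expose `claimJob` through its `fetch` handler:

```typescript
// Sketch: reaching a Durable Object from a Worker.
async function claimNextJob(env: Env, repoId: string, workerId: string) {
  // Every request for the same name routes to the same single instance,
  // wherever in the world it happens to live.
  const id = env.INDEXING_COORDINATOR.idFromName(repoId);
  const stub = env.INDEXING_COORDINATOR.get(id);

  // Talk to the Durable Object over its fetch interface
  const response = await stub.fetch("https://coordinator/claim", {
    method: "POST",
    body: JSON.stringify({ workerId }),
  });
  return response.json();
}
```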
KV: The Cache That's Actually Simple
Redis is great. Until you have to manage it. Provision it. Scale it. Deal with connection pools. Handle failover.
Cloudflare KV is just... there. A globally distributed key-value store with no configuration.
```typescript
// Caching embedding results
const cacheKey = `embedding:${model}:${hashText(content)}`;
const cached = await env.EMBEDDING_CACHE.get(cacheKey);
if (cached) {
  return JSON.parse(cached); // Sub-10ms for hot keys
}

const embedding = await generateEmbedding(content);
await env.EMBEDDING_CACHE.put(
  cacheKey,
  JSON.stringify(embedding),
  { expirationTtl: 3600 } // 1 hour TTL
);
```
We cache everything:
Session data (15-minute TTL)
Embedding results (1-hour TTL)
API responses
Search results
The Edge Positioning Advantage
Here's the thing about running on the edge -- your code runs where your users are. Not in us-east-1. Not in three regions you carefully selected. Everywhere.
When someone in Tokyo hits our API, they're hitting a worker in Tokyo. Someone in São Paulo? Worker in São Paulo. No CDN configuration, no geo-routing rules. It just works.
This matters more than you think. We've seen:
200ms faster response times compared to centralized deployments
Consistent performance regardless of user location
Natural resilience (if one location has issues, traffic routes elsewhere)
The Economics of It All
Let's talk money. Because at the end of the day, this stuff has to make business sense.
Traditional AWS Setup:
Lambda: Pay for execution time + cold starts
S3: $23/TB storage + $90/TB egress
SQS: $0.40 per million messages
ElastiCache: Starting at $15/month
CloudFront: Complex pricing, egress fees
NAT Gateway: $45/month + data processing
Our Cloudflare Setup:
Workers: 100k requests/day free, then $0.15 per million
R2: $15/TB storage, **zero egress**
Queues: Included with Workers
KV: 100k reads/day free
Durable Objects: $0.15 per million requests
Everything runs on the edge
We're saving 60-80% compared to an equivalent AWS setup. But more importantly, we're not managing infrastructure. No VPCs, no security groups, no capacity planning.
The Developer Experience
You know what's underrated? Being able to test everything locally. Wrangler (Cloudflare's CLI) lets us run the entire stack:
```bash
npm run dev
# That's it. Workers, KV, R2, Queues, all running locally
```
Deployment? One command:
```bash
npm run deploy  # Code is live globally in under 30 seconds
```
Compare that to setting up LocalStack, configuring AWS SAM, dealing with Docker containers... yeah, no thanks.
What We Learned
After six months of running everything on Cloudflare, here's what we've learned:
The Good:
Performance is incredible (sub-5ms cold starts still blow my mind)
Zero egress fees change how you think about architecture
Bindings eliminate entire categories of problems
Global by default is powerful
The simplicity is addictive
The Tradeoffs:
128MB memory limit per worker (but you'd be surprised what fits)
30-second CPU time limit (queues handle long-running tasks)
Different mental model from traditional serverless
Some services still experimental (Vectorize)
The Unexpected:
Durable Objects solved problems we didn't know we had
Queue coordination is smoother than any message broker we've used
The platform keeps getting better (recent 10x Queue performance improvement!)
Why This Matters
We're processing millions of GitHub events, running semantic search across gigabytes of code, generating embeddings with Voyage AI, coordinating 100+ parallel workers, and serving it all globally with sub-100ms latency. On a platform that costs us less than a decent coffee machine per month. But here's the real thing -- we're a small team. We don't have a DevOps person. We don't need one. Cloudflare handles the infrastructure so we can focus on building the product.
Is it perfect? No. Would I build Kasava on AWS if I had to start over? Also no. Sometimes the best architectural decision isn't about choosing the most popular option or the one with the most features. Sometimes it's about choosing the one that lets you ship fast, iterate quickly, and sleep at night knowing your infrastructure just works.
For us, that's Cloudflare. All in on the edge, and not looking back.
The Node.js Compatibility Tradeoff
Let's be honest about the elephant in the room -- Cloudflare Workers aren't Node.js. They run JavaScript in V8 isolates, which means you're giving up some Node.js compatibility for those incredible performance gains.
What you lose:
Full Node.js standard library (no `fs`, limited `os`, no native modules)
Some npm packages that rely heavily on Node.js internals (this became an issue when trying to implement tree-sitter for code parsing -- more on that later...)
Direct file system access (everything goes through bindings)
Long-running processes (30-second CPU time limit)
What Cloudflare provides:
Native support for most common Node.js APIs (crypto, buffer, streams, HTTP)
Polyfills for unsupported APIs via the `nodejs_compat` flag
Web standard APIs that often work better than Node.js equivalents
Automatic bundling that handles most compatibility issues
In practice? It's rarely a problem. The `nodejs_compat` flag handles most edge cases, and when it doesn't, there's usually a better web-standard alternative.
```typescript
// This works fine in Workers
import { createHash } from "crypto";
import { Buffer } from "buffer";

// This doesn't (but you don't need it on the edge)
import fs from "fs"; // ❌ No file system access
import os from "os"; // ❌ Limited OS APIs

// Use bindings instead
const file = await env.BUCKET.get("data.json"); // ✅ Better than fs
```
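And when a web-standard alternative exists, it's often nicer anyway. Here's a small sketch (not necessarily how we hash things in production) using the built-in Web Crypto API instead of Node's `crypto`:

```typescript
// Sketch: SHA-256 hashing with Web Crypto, available in Workers without
// any compatibility flags.
async function sha256Hex(text: string): Promise<string> {
  const data = new TextEncoder().encode(text);
  const digest = await crypto.subtle.digest("SHA-256", data);
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}
```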
The mental shift is worth it. Instead of thinking "how do I make this Node.js code work?", you start thinking "how do I build this for the edge?" The result is cleaner, more performant code that scales globally.
We've found that 95% of what we wanted to do "just works." The remaining 5% usually led us to better architectural decisions anyway.
---
Next in the Series:
**[Part 2: Hono + Dynamic Loading - How We Fit an AI Platform in 128MB](./part-2-hono-framework.md)** - Discover why we chose Hono over Express/Fastify and how dynamic loading lets us run 50+ endpoints in Workers' memory constraints.
**[Part 3: How We Built a Sub-10ms AI Pipeline on the Edge](./part-3-ai-pipeline.md)** - Dive deep into our AI infrastructure implementation, featuring Voyage AI embeddings, pgvector for semantic search, and the economic implications of edge computing for AI startups.
**[Part 4: From LangGraph to Mastra - Our AI Orchestration Journey](./part-4-from-langgraph-to-mastra.md)** - Learn why we migrated from LangGraph to Mastra for AI workflow orchestration, and how this TypeScript-first framework transformed our development velocity.