Elevate Learning Knowledge Base

Why I Built This

The short version: I hate watching videos.

The Problem

Elevate Ventures has an excellent startup curriculum — 39 videos covering everything from fundraising to hardtech manufacturing to go-to-market strategy. The content is genuinely useful for Indiana founders.

But it's locked inside hours of video. If you want to know what they say about term sheets, you have to scrub through recordings hoping to find the right 90 seconds. Nobody does that. The content sits unwatched.

The Insight

I'm a reader, not a watcher. I process information faster as text. And if I want to find something specific, I search — I don't scroll through a video timeline.

This makes video content a perfect candidate for RAG (retrieval-augmented generation). Transcribe the videos, chunk the transcripts, embed them into vectors, and suddenly you can ask questions about the content and get answers with precise citations back to the exact moment in the source video.

The video doesn't go away — it becomes a citation you click when you want the full context. Text-first, video-on-demand.

The Real Reason

I wanted to show that a useful AI-powered app can be built entirely on Cloudflare's edge stack: no servers, no Docker, no databases to manage, no OpenAI API keys. Workers AI + Vectorize + D1 + R2, all running at the edge, all on one platform. The whole thing deploys with a single `wrangler deploy`.

This is a proof of concept for CTK Advisors: any organization with video or document libraries can make that content searchable and conversational with the same architecture.

How It's Built

A RAG-powered chat interface over Elevate Ventures' startup learning video series — entirely on Cloudflare's edge with zero servers to manage.

Architecture

RAG Chat Flow
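
Roughly, the query path looks like the sketch below. It is a simplified illustration, not the app's actual source. The binding names (`AI`, `VECTORIZE`), the `text` metadata key, and `topK: 5` are assumptions made for the example, and the small interfaces only mirror the parts of the Workers AI and Vectorize bindings the sketch touches, so check them against the current Cloudflare docs. The real app may well pull chunk text from D1 rather than from vector metadata.

```typescript
// Minimal structural types for the two bindings used below (assumed shapes).
interface AiBinding {
  run(model: string, input: unknown): Promise<any>;
}
interface VectorMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}
interface VectorizeBinding {
  query(
    vector: number[],
    options: { topK: number; returnMetadata: "all" | "indexed" | "none" }
  ): Promise<{ matches: VectorMatch[] }>;
}
interface Env {
  AI: AiBinding; // hypothetical binding name
  VECTORIZE: VectorizeBinding; // hypothetical binding name
}

const EMBED_MODEL = "@cf/baai/bge-base-en-v1.5";
const CHAT_MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

// Pure helper: number the retrieved chunks so the model can cite them as [1], [2], ...
// Assumes each vector's metadata carries its chunk text under a `text` key.
function buildContext(matches: VectorMatch[]): string {
  return matches
    .map((m, i) => `[${i + 1}] ${String(m.metadata?.text ?? "")}`)
    .join("\n");
}

// Embed the question, fetch the nearest chunks, and ask the LLM to answer from them.
async function answerQuestion(env: Env, question: string) {
  const embedded = await env.AI.run(EMBED_MODEL, { text: [question] });
  const vector: number[] = embedded.data[0];

  const { matches } = await env.VECTORIZE.query(vector, {
    topK: 5,
    returnMetadata: "all",
  });

  const completion = await env.AI.run(CHAT_MODEL, {
    messages: [
      {
        role: "system",
        content:
          "Answer only from the numbered context and cite sources like [1].\n\n" +
          buildContext(matches),
      },
      { role: "user", content: question },
    ],
  });

  // Return the matches alongside the answer so the UI can render citations.
  return { answer: String(completion.response), citations: matches };
}
```

Keeping `buildContext` separate from the network calls makes the prompt format easy to test without any Cloudflare bindings.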

Tech Decisions

Layer	Choice	Why
Transcription	MLX Whisper (large-v3-turbo)	Runs natively on Apple Silicon, word-level timestamps for precise chunking
Chunking	~500 tokens, 50-token overlap	Balances context richness with embedding precision
Embeddings	bge-base-en-v1.5 (768-dim)	Best-in-class open embedding model on CF Workers AI
LLM	llama-3.3-70b-instruct-fp8-fast	Strong instruction-following, runs at edge with no cold start
Database	Cloudflare D1 (SQLite at edge)	Relational metadata queried through a native Worker binding, no external database round trip
Vector Store	Cloudflare Vectorize	Native integration, no external vector DB needed
Hosting	Cloudflare Workers + Assets	Global edge deployment, static + API in one worker
Video	Vimeo embed with #t= seek	Leverages existing Elevate Ventures Vimeo hosting
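
To make the chunking row concrete, here is a minimal sketch of overlapping windows over a word-level transcript. It counts words as a stand-in for tokens, which is an assumption (a real tokenizer would give different boundaries), and the `Word` and `Chunk` shapes and the `chunkWords` name are hypothetical, not the app's actual types.

```typescript
// One transcribed word with its start and end time in seconds.
interface Word {
  text: string;
  start: number;
  end: number;
}

// A chunk of transcript text plus the time range it covers in the video.
interface Chunk {
  text: string;
  start: number;
  end: number;
}

// Slide a window of `size` words over the transcript, stepping by
// (size - overlap) so consecutive chunks share `overlap` words.
function chunkWords(words: Word[], size = 500, overlap = 50): Chunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const chunks: Chunk[] = [];
  const step = size - overlap;
  for (let i = 0; i < words.length; i += step) {
    const span = words.slice(i, i + size);
    const first = span[0];
    const last = span[span.length - 1];
    if (!first || !last) break; // defensive: span is never empty here
    chunks.push({
      text: span.map((w) => w.text).join(" "),
      start: first.start, // timestamp of the chunk's first word
      end: last.end, // timestamp of the chunk's last word
    });
    if (i + size >= words.length) break; // this window reached the end
  }
  return chunks;
}
```

Because each chunk keeps the start time of its first word, a retrieved chunk can be turned directly into a seek position in the source video.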

Key Highlights

Zero infrastructure

Everything runs on Cloudflare's edge — no servers, no Docker, no databases to manage

Sub-second answers

Embedding + vector search + LLM all happen at the edge

Precise citations

Every answer links back to exact video timestamps — click a citation and the player seeks to that moment

39 videos ingested

Full Elevate Learning curriculum, chunked with word-level timestamp alignment from Whisper
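
For the citation links, a chunk's start time only needs to be turned into a Vimeo `#t=` fragment and a readable label. The sketch below assumes the embed URL pattern `https://player.vimeo.com/video/<id>` and the `#t=XmYs` fragment form. The function names and the video ID in the usage note are made up for illustration.

```typescript
// Build a Vimeo player URL that starts playback at the given offset.
// Fractional seconds are dropped and negative values are clamped to 0.
function vimeoSeekUrl(videoId: string, seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const m = Math.floor(total / 60);
  const s = total % 60;
  return `https://player.vimeo.com/video/${videoId}#t=${m}m${s}s`;
}

// Human-readable label for the citation, e.g. "12:34".
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const m = Math.floor(total / 60);
  const s = total % 60;
  return `${m}:${s < 10 ? "0" + s : String(s)}`;
}
```

With a placeholder ID, `vimeoSeekUrl("123456789", 754.8)` gives `https://player.vimeo.com/video/123456789#t=12m34s`, and `formatTimestamp(754.8)` gives `12:34` for the link text.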