Turning 100+ hours of raw footage into a finished video course

A trading educator had a library most course makers would envy and could not use: 184 recorded sessions across eight courses, hundreds of hours of live screen-share teaching. Somewhere in there were the best explanations of every concept they taught. The problem was that nobody had hundreds of hours to sit and find them, and the good moments were scattered: the same idea explained better in session 41 than in session 12, or three minutes of gold buried in forty minutes of setup and small talk.

So we built an agent that watches all of it, reads it the way a good editor would, and assembles a structured course out of the parts worth keeping. Here is how it works.

What we built

The pipeline starts by transcribing every video with Whisper, the open speech-to-text model, and caches the transcript so it never has to do that twice. Once a session is text, the hard editorial work happens in the words, not the pixels, which is the whole reason this is affordable to run.
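The transcribe-once idea can be sketched in a few lines. This is a minimal illustration, not the production code: the function names, the JSON cache layout and the choice to key the cache on a hash of the file contents are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def file_fingerprint(video_path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file contents so a renamed or moved video still hits the cache."""
    digest = hashlib.sha256()
    with video_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_transcript(
    video_path: Path,
    cache_dir: Path,
    transcribe: Callable[[Path], dict],
) -> dict:
    """Return the cached transcript if present, otherwise transcribe and store it.

    `transcribe` is injected (hypothetical interface): any callable that maps
    a video path to a JSON-serialisable dict.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{file_fingerprint(video_path)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    transcript = transcribe(video_path)
    cache_file.write_text(json.dumps(transcript), encoding="utf-8")
    return transcript
```

With the open-source `openai-whisper` package installed, the injected callable could be something like `lambda p: model.transcribe(str(p))` after `model = whisper.load_model("base")`. The model size here is an arbitrary choice for the example.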

From there it reads in passes, the way you would if you were doing it properly by hand:

  • First pass: find the candidates. Claude reads the full transcript and flags the moments that actually teach something, the clear explanations, the live examples, the moments where the concept finally lands. It comes back with timestamps, not vibes.
  • Cheap vision, only where it matters. For footage where the picture carries the lesson, a smaller, cheaper model describes just the frames around each candidate. We never send hours of video to an expensive model to look at. We send a handful of frames that the transcript already told us were worth a look.
  • Second pass: confirm and shape. A stronger model reviews the candidates with those frame descriptions in hand, confirms the moment is genuinely good, tightens the in and out points, and groups related moments into a coherent lesson.
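The three passes above can be sketched as a small orchestration function. Everything named here is hypothetical: the `Candidate` fields, the three injected callables standing in for the model calls, and the rule of sampling three frames with a two-second margin are assumptions chosen to show the shape of the flow, not the actual prompts or settings.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Candidate:
    session_id: str
    start: float  # seconds into the session
    end: float
    reason: str   # why the first pass flagged this moment
    frame_notes: list[str] = field(default_factory=list)


def frame_times(candidate: Candidate, count: int = 3, margin: float = 2.0) -> list[float]:
    """Evenly spaced timestamps around a candidate; only these frames get described."""
    lo = max(0.0, candidate.start - margin)
    hi = candidate.end + margin
    if count == 1:
        return [round((lo + hi) / 2, 2)]
    step = (hi - lo) / (count - 1)
    return [round(lo + i * step, 2) for i in range(count)]


def run_passes(
    session_id: str,
    transcript: list[dict],
    find_candidates: Callable[[str, list[dict]], list[Candidate]],
    describe_frame: Callable[[str, float], str],
    confirm: Callable[[Candidate], Optional[Candidate]],
    needs_vision: bool = True,
) -> list[Candidate]:
    # Pass 1: the text-only model proposes candidate moments with timestamps.
    candidates = find_candidates(session_id, transcript)

    # Cheap vision: describe a handful of frames per candidate, nothing else.
    if needs_vision:
        for candidate in candidates:
            candidate.frame_notes = [
                describe_frame(session_id, t) for t in frame_times(candidate)
            ]

    # Pass 2: the stronger model confirms, trims, or rejects (returns None).
    confirmed = []
    for candidate in candidates:
        result = confirm(candidate)
        if result is not None:
            confirmed.append(result)
    return confirmed
```

The point of the structure is cost control: the number of vision calls scales with the number of candidates, not with the length of the footage.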

The output is not a pile of clips. It is a human-readable review document for sign-off, the exact cut list, subtitles, and the commands to produce the finished video. A person stays in the loop to approve, but they are reviewing decisions, not scrubbing through footage.
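To make those outputs concrete, here is a rough sketch of two of them: a subtitle entry and a cut command. It assumes the subtitles are written in SRT format and that ffmpeg does the rendering. Both are assumptions for illustration, as are the function names and the encoder choices.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp SRT expects."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{millis:03}"


def srt_block(index: int, start: float, end: float, text: str) -> str:
    """One numbered subtitle entry; times are relative to the finished clip."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


def cut_command(source: str, start: float, end: float, output: str) -> list[str]:
    """Build an ffmpeg argument list that re-encodes one segment of a session."""
    if end <= start:
        raise ValueError("end must be after start")
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",        # seek to the in point
        "-i", source,
        "-t", f"{end - start:.3f}",   # keep this many seconds
        "-c:v", "libx264",
        "-c:a", "aac",
        output,
    ]
```

Emitting the command as a list rather than running it keeps the review step honest: the reviewer can read exactly what will be executed before anything renders.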

The clever part

The best teaching moment for a topic almost never lives in one place. The clearest definition might be in one session, the strongest live example in another recorded weeks later, the bit that ties it together somewhere else again. A human editor knows this and stitches across recordings to build one clean lesson. So does the agent.

It cuts composite clips, pulling segments from different sessions and joining them into a single coherent lesson, then lays narration, title cards and subtitles over the top. That is the difference between a highlight reel and an actual course. One is the loud bits in a row. The other has a teaching order, builds from concept to live example to recap, and feels like it was made on purpose.
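A minimal sketch of that stitching step follows, assuming each segment has already been cut to its own file and that ffmpeg's concat demuxer does the joining. The role labels, the `Segment` fields and the fixed concept, example, recap ordering are hypothetical simplifications of the teaching order described above.

```python
from dataclasses import dataclass

# Hypothetical teaching order: lower numbers play first.
ROLE_ORDER = {"concept": 0, "example": 1, "recap": 2}


@dataclass(frozen=True)
class Segment:
    session_id: str  # which recording the segment came from
    start: float
    end: float
    role: str        # one of the ROLE_ORDER keys
    clip_path: str   # pre-cut file for this segment


def order_lesson(segments: list[Segment]) -> list[Segment]:
    """Sort by teaching role; sorted() is stable, so ties keep their input order."""
    return sorted(segments, key=lambda seg: ROLE_ORDER[seg.role])


def concat_list(segments: list[Segment]) -> str:
    """Contents of the list file read by ffmpeg's concat demuxer.

    Assumes clip paths contain no single quotes.
    """
    lines = [f"file '{seg.clip_path}'" for seg in order_lesson(segments)]
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output_path: str) -> list[str]:
    """Join the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_path,
        "-c", "copy",
        output_path,
    ]
```

Stream copy only works when every clip shares the same codec, resolution and audio settings, which is one reason to re-encode each segment to a common format when it is first cut.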

Getting there meant solving the boring problems that decide whether the thing works at all: caching transcripts so reruns are cheap, sending even the cheap vision model the fewest frames possible, and being strict about audio and fades at the cut stage so segments do not render black or fall silent. None of that is glamorous. All of it is the reason the pipeline produces something you would actually publish.
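As an illustration of what strict fades can look like, here is a sketch that builds ffmpeg `fade` and `afade` filter strings. It assumes ffmpeg is the renderer and guards against two plausible failure modes: fade times expressed on the source timeline instead of the clip's own, and fades longer than the clip itself. Both are assumptions about what can go wrong, not a record of the bugs actually hit.

```python
def fade_filters(clip_duration: float, fade: float = 0.5) -> tuple[str, str]:
    """Return (video_filter, audio_filter) strings with clip-local fade times.

    Times are measured from the start of the cut clip, not the source session,
    and the fade is clamped so fade-in and fade-out never overlap.
    """
    if clip_duration <= 0:
        raise ValueError("clip_duration must be positive")
    fade = min(fade, clip_duration / 2)
    out_start = clip_duration - fade
    video = (
        f"fade=t=in:st=0:d={fade:.3f},"
        f"fade=t=out:st={out_start:.3f}:d={fade:.3f}"
    )
    audio = (
        f"afade=t=in:st=0:d={fade:.3f},"
        f"afade=t=out:st={out_start:.3f}:d={fade:.3f}"
    )
    return video, audio
```

The two strings would be passed to ffmpeg as the `-vf` and `-af` arguments when each segment is cut, so picture and sound always fade over the same window.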

What it means

From this corpus the agent compiled a multi-module course, theory leading into demonstration leading into a real live example, assembled from clips that originally lived in completely different recordings. The work that would have taken an editor weeks of watching, logging and cutting became a review job: read the proposed cut, approve or adjust, render.

That is the shape of most of our AI automation work. Not a chatbot bolted onto a website, but a real pipeline that does a specific, expensive, repetitive job the way a careful person would, and leaves the judgement calls to a person who now has time to make them.

Sitting on hours of recordings you have never had time to turn into anything?

Start a conversation