Complete project roadmap — workflow design, tooling decisions, phased implementation, and critical gaps for building a permanent, searchable, public-ready library.
Six sequential steps from recordings to a searchable library entry. Immediate priority: ingest the ~3,000 recently created edited videos into the system. The 12,000 master recordings in FileMaker/Dropbox are a longer-term effort involving transcoding and renaming at scale.
YYYY-MM-DD_[type]_[location]_[seq].mp4
Everything runs locally on the user's machine. The only external touchpoints are the launch page (shyamgyaan.com) and GitHub (source code). No audio or video data leaves the machine.
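The naming pattern above can be enforced in code so that every file is checked before it enters the pipeline. The following is a minimal sketch, not the project's actual renaming tool. It assumes that type and location are lowercase slugs (letters, digits, hyphens) and that seq is a zero-padded three-digit counter, as the roadmap's example name `..._satsang_main-hall_001.mp4` suggests. The function names `slugify`, `build_filename`, and `parse_filename` are hypothetical.

```python
import re
from datetime import date

# Pattern from the roadmap: YYYY-MM-DD_[type]_[location]_[seq].mp4
# Assumption: slugs are lowercase letters/digits/hyphens; seq is three digits.
FILENAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<type>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"_(?P<location>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"_(?P<seq>\d{3})\.mp4$"
)


def slugify(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def parse_filename(name: str) -> dict:
    """Split a conforming filename into its parts, or raise ValueError."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {name!r}")
    parts = match.groupdict()
    date.fromisoformat(parts["date"])  # rejects impossible dates such as 2024-13-40
    parts["seq"] = int(parts["seq"])
    return parts


def build_filename(recorded_on: date, rec_type: str, location: str, seq: int) -> str:
    """Build a filename and validate it against the same pattern."""
    name = f"{recorded_on.isoformat()}_{slugify(rec_type)}_{slugify(location)}_{seq:03d}.mp4"
    parse_filename(name)  # raises if a slug is empty or seq exceeds three digits
    return name


print(build_filename(date(2024, 3, 9), "Satsang", "Main Hall", 1))
```

Validating in `build_filename` as well as in `parse_filename` means a bad input (an empty location, a sequence number above 999) fails at rename time instead of surfacing later as an unmatched record.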
Audio files are uploaded to AssemblyAI's cloud API for transcription. The SDK handles upload, polling, and result retrieval automatically. Output JSON is saved locally alongside the source MP3.
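The transcription step described above might look roughly like the sketch below. The helper names (`transcript_path_for`, `save_transcript`, `transcribe_with_assemblyai`) are hypothetical, and skipping files that already have a transcript is an assumption added here to avoid paying twice for the same recording. The AssemblyAI calls reflect the Python SDK's basic usage; configuration options (speaker labels, language settings) are omitted and should be checked against the SDK documentation.

```python
import json
from pathlib import Path
from typing import Callable


def transcript_path_for(audio_path: Path) -> Path:
    """recording.mp3 -> recording.transcript.json in the same folder."""
    return audio_path.with_name(audio_path.stem + ".transcript.json")


def transcribe_with_assemblyai(audio_path: Path, api_key: str) -> dict:
    """Upload one file and return the full JSON response (needs network access)."""
    import assemblyai as aai  # third-party: pip install assemblyai

    aai.settings.api_key = api_key
    # transcribe() uploads the file, polls until the job finishes, and returns it.
    transcript = aai.Transcriber().transcribe(str(audio_path))
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(f"transcription failed: {transcript.error}")
    return transcript.json_response


def save_transcript(audio_path: Path, transcribe: Callable[[Path], dict]) -> Path:
    """Transcribe one audio file and write the JSON next to it."""
    out_path = transcript_path_for(audio_path)
    if out_path.exists():
        return out_path  # assumption: an existing transcript is never redone
    result = transcribe(audio_path)
    # Write to a temporary file first so an interrupted run leaves no partial JSON.
    tmp_path = out_path.with_name(out_path.name + ".tmp")
    tmp_path.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
    tmp_path.replace(out_path)
    return out_path
```

A call such as `save_transcript(path, lambda p: transcribe_with_assemblyai(p, api_key))` wires the two together. Passing the transcriber in as a callable keeps the file handling testable without touching the cloud API.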
Each .transcript.json contains these sections (~3-5 MB for a 2-hour recording):
Reads .transcript.json from step 03, sends the transcript content to the Claude API (Sonnet 4.6) with structured JSON prompts, and saves .enrichment.json alongside the transcript. Uses Pydantic to validate each response against the expected schema.
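The read, enrich, validate, save flow of this step can be sketched as follows. This is a dependency-free stand-in, not the pipeline's code: it uses a plain type check where the real step uses a Pydantic model, and the model call is passed in as a callable instead of calling the Claude API directly. The field names are an assumption taken from the enrichment-related columns in the library schema, and reading the transcript from a `text` key assumes the AssemblyAI response layout.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical field set (borrowed from the schema's enrichment columns).
# The project's actual Pydantic model may define different or nested fields.
REQUIRED_FIELDS = {
    "refined_chapters": list,
    "key_quotes": list,
    "speaker_map": dict,
    "topic_tags": list,
    "quality_score": int,
    "quality_notes": str,
}


def validate_enrichment(data: dict) -> dict:
    """Reject model output that is missing a field or has the wrong type."""
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"{name} should be {expected.__name__}")
    return data


def enrichment_path_for(transcript_path: Path) -> Path:
    """x.transcript.json -> x.enrichment.json in the same folder."""
    suffix = ".transcript.json"
    if not transcript_path.name.endswith(suffix):
        raise ValueError(f"not a transcript file: {transcript_path.name}")
    base = transcript_path.name[: -len(suffix)]
    return transcript_path.with_name(base + ".enrichment.json")


def enrich(transcript_path: Path, call_model: Callable[[str], str]) -> Path:
    """Send the transcript text to the model and save the validated result."""
    transcript = json.loads(transcript_path.read_text(encoding="utf-8"))
    raw_reply = call_model(transcript["text"])  # expected to be a JSON string
    data = validate_enrichment(json.loads(raw_reply))
    out_path = enrichment_path_for(transcript_path)
    out_path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return out_path
```

Because validation runs before anything is written, a malformed reply raises an error and leaves no .enrichment.json behind, so the recording can simply be retried.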
~$0.08/recording (Sonnet 4.6: $3 input / $15 output per 1M tokens), or ~$240 for 3,000 recordings. Combined with transcription ($810): ~$1,050 total for steps 03 + 04. Prompt caching reduces the enrichment cost to ~$150; the Batch API (deferred) reduces it further to ~$75.
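The arithmetic behind these figures can be reproduced with a few lines, which also makes it easy to re-run the estimate if the recording count or prices change. The per-recording and total figures are taken from the estimate above. The token counts passed to `claude_cost` are purely illustrative, not measured values from the pipeline, and the function name is hypothetical.

```python
# Prices quoted in the roadmap for Sonnet 4.6, in dollars per 1M tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def claude_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the prices above (no caching or batching)."""
    total = input_tokens * INPUT_PRICE_PER_MTOK + output_tokens * OUTPUT_PRICE_PER_MTOK
    return total / 1_000_000


RECORDINGS = 3_000
ENRICHMENT_PER_RECORDING = 0.08  # roadmap estimate
TRANSCRIPTION_TOTAL = 810.00     # roadmap estimate for step 03

enrichment_total = RECORDINGS * ENRICHMENT_PER_RECORDING
combined_total = enrichment_total + TRANSCRIPTION_TOTAL
transcription_per_recording = TRANSCRIPTION_TOTAL / RECORDINGS

print(f"enrichment total:            ${enrichment_total:,.2f}")
print(f"steps 03 + 04 combined:      ${combined_total:,.2f}")
print(f"transcription per recording: ${transcription_per_recording:.2f}")

# Illustrative only: a request with 20,000 input and 1,000 output tokens.
print(f"example request:             ${claude_cost(20_000, 1_000):.3f}")
```

Transcription works out to about $0.27 per recording, so it dominates the per-recording cost at roughly three times the enrichment estimate.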
Static HTML/CSS/JS hosted on Cloudflare Pages. No server — the browser talks directly to Supabase using the JavaScript client loaded via CDN. Password gate is client-side (shared password, stored in localStorage).
id UUID (PK)
filename TEXT UNIQUE
transcript_id TEXT
audio_duration_seconds FLOAT
transcription_confidence FLOAT
language_code TEXT
transcript_text TEXT
speakers_count INTEGER
recording_type TEXT
utterances JSONB
enrichment JSONB
refined_chapters JSONB
key_quotes JSONB
speaker_map JSONB
topic_tags TEXT[]
quality_score INTEGER
quality_notes TEXT
title TEXT
summary TEXT
setting TEXT
location TEXT
series TEXT
tags TEXT[]
category TEXT
celebration TEXT
media_type TEXT
audio_quality TEXT
video_quality TEXT
recording_date DATE
is_master BOOLEAN
do_not_show BOOLEAN
notes TEXT
status TEXT (draft/published)
reviewed_by TEXT
reviewed_at TIMESTAMPTZ
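To show how the pipeline outputs could map onto the columns above, here is a minimal sketch that assembles one row from a transcript and an enrichment file. The function name `build_row` is hypothetical. The transcript keys (`id`, `audio_duration`, `confidence`, `language_code`, `text`, `utterances`, and `speaker` inside each utterance) assume the AssemblyAI response layout, and the enrichment keys are assumed to match the column names. Curated columns such as title, summary, and status are left out, as is recording_type, whose source the roadmap does not state.

```python
def build_row(filename: str, transcript: dict, enrichment: dict) -> dict:
    """Map one recording's transcript and enrichment onto the table columns."""
    utterances = transcript.get("utterances") or []
    # Count distinct speaker labels across all utterances.
    speakers = {u.get("speaker") for u in utterances if u.get("speaker")}
    return {
        "filename": filename,
        "transcript_id": transcript["id"],
        "audio_duration_seconds": float(transcript["audio_duration"]),
        "transcription_confidence": transcript.get("confidence"),
        "language_code": transcript.get("language_code"),
        "transcript_text": transcript.get("text"),
        "speakers_count": len(speakers),
        "utterances": utterances,
        "enrichment": enrichment,  # full enrichment payload, kept as JSONB
        "refined_chapters": enrichment.get("refined_chapters", []),
        "key_quotes": enrichment.get("key_quotes", []),
        "speaker_map": enrichment.get("speaker_map", {}),
        "topic_tags": enrichment.get("topic_tags", []),
        "quality_score": enrichment.get("quality_score"),
        "quality_notes": enrichment.get("quality_notes"),
    }


example = build_row(
    "2024-03-09_satsang_main-hall_001.mp4",
    {
        "id": "abc123",
        "audio_duration": 7200,
        "confidence": 0.93,
        "language_code": "en",
        "text": "Welcome, everyone.",
        "utterances": [
            {"speaker": "A", "text": "Welcome, everyone."},
            {"speaker": "B", "text": "Thank you."},
            {"speaker": "A", "text": "Let us begin."},
        ],
    },
    {"topic_tags": ["welcome"], "quality_score": 4},
)
print(example["speakers_count"], example["topic_tags"])
```

Since filename is declared UNIQUE, a row built this way can be upserted on that column, so re-running the pipeline for a recording updates its record instead of creating a duplicate.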
Three phases over 18 months — from foundational infrastructure to internal tools at scale. A potential public access phase is outlined below but is subject to a future team decision and is not assumed.
The major architectural and tooling choices — with recommendations and trade-offs clearly laid out.
Every tool in the recommended stack, organized by function. Short-term choices prioritize speed; long-term choices prioritize scale and control.
Things not explicitly covered in the original workflow that must be addressed before scaling. These will cause expensive problems if ignored.
YYYY-MM-DD_satsang_main-hall_001.mp4) and rename all master files before processing begins. This anchors every downstream record to a predictable, unique identifier.
Hard-earned lessons from similar archival projects. These are the most common ways this kind of work goes wrong.