Complete project roadmap — workflow design, tooling decisions, phased implementation, and critical gaps for building a permanent, searchable, public-ready library.
Six sequential steps from recordings to a searchable library entry. Immediate priority: ingest the ~3,000 recently created edited videos into the system. The 12,000 master recordings in FileMaker/Dropbox are a longer-term effort involving transcoding and renaming at scale.
YYYY-MM-DD_[type]_[location]_[seq].mp4
Everything runs locally on the user's machine. The only external touchpoints are the launch page (shyamgyaan.com) and GitHub (source code). No audio or video data leaves the machine.
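As a rough illustration of how the YYYY-MM-DD_[type]_[location]_[seq].mp4 pattern could be checked, here is a minimal sketch. The allowed characters for the type and location tokens, the function name parse_filename, and the sample values "talk" and "delhi" are assumptions for illustration; the source fixes only the overall shape of the name.

```python
import re
from datetime import date
from typing import Optional

# Assumed token rules: lowercase letters, digits, and hyphens for type and
# location, and a purely numeric sequence. Only the overall shape is given.
FILENAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<type>[a-z0-9-]+)"
    r"_(?P<location>[a-z0-9-]+)"
    r"_(?P<seq>\d+)\.mp4$"
)


def parse_filename(name: str) -> Optional[dict]:
    """Return the parsed parts of a file name, or None if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        return None
    try:
        # Rejects impossible calendar dates such as 1989-13-10.
        recorded = date.fromisoformat(match["date"])
    except ValueError:
        return None
    return {
        "date": recorded,
        "type": match["type"],
        "location": match["location"],
        "seq": int(match["seq"]),
    }


# Hypothetical file names, for illustration only.
print(parse_filename("1989-01-10_talk_delhi_01.mp4"))
print(parse_filename("talk_final_v2.mp4"))
```

Parsing the date with date.fromisoformat, not only matching digits, catches names that look right but carry an invalid date.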
Audio files are uploaded to AssemblyAI's cloud API for transcription. Direct HTTP calls handle upload, polling, and result retrieval (SDK bypassed for reliability). Supports parallel processing. Output JSON is saved locally alongside the source MP3.
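The upload, polling, and retrieval flow described above might look roughly like the sketch below, written with the standard library only. The endpoint paths and status values follow AssemblyAI's public REST API as best understood here and should be checked against the current reference; the function names, polling interval, and attempt limit are assumptions, not the project's actual code.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def _request(method, url, api_key, body=None, content_type=None):
    """Send one HTTP request and decode the JSON response."""
    headers = {"authorization": api_key}
    if content_type:
        headers["content-type"] = content_type
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def upload(path, api_key):
    """Upload the audio bytes and return the temporary URL for the file."""
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file; acceptable for a sketch
    reply = _request("POST", f"{API_BASE}/upload", api_key, data,
                     "application/octet-stream")
    return reply["upload_url"]


def submit(audio_url, api_key):
    """Create a transcription job and return its id."""
    payload = json.dumps({"audio_url": audio_url}).encode("utf-8")
    reply = _request("POST", f"{API_BASE}/transcript", api_key, payload,
                     "application/json")
    return reply["id"]


def poll(get_status, interval=5.0, max_attempts=720, sleep=time.sleep):
    """Call get_status() until the job is completed or has failed."""
    for _ in range(max_attempts):
        result = get_status()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(interval)
    raise TimeoutError("transcript not ready after polling limit")


def transcribe(path, api_key):
    """Run upload, submit, and poll, then save JSON next to the source file."""
    job_id = submit(upload(path, api_key), api_key)
    result = poll(lambda: _request("GET", f"{API_BASE}/transcript/{job_id}",
                                   api_key))
    out_path = path.rsplit(".", 1)[0] + ".transcript.json"
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(result, f, ensure_ascii=False)
    return out_path
```

Passing the status fetcher and the sleep function into poll keeps the loop testable without network access, and parallel processing can be layered on by calling transcribe from a thread pool.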
Each .transcript.json contains these sections (~3-5 MB for a 2-hour recording):
Reads .transcript.json from step 03, sends transcript content to Claude API (Sonnet 4.6) with structured JSON prompts, and saves .enrichment.json alongside the transcript. Uses Pydantic for guaranteed schema compliance.
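To make the schema-compliance step concrete, the sketch below validates a model response with Pydantic before it would be written to .enrichment.json. The Enrichment fields (title, summary, topics) and the function name parse_enrichment are hypothetical; the source does not list the real enrichment schema.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Enrichment(BaseModel):
    # Hypothetical fields, for illustration only.
    title: str
    summary: str
    topics: List[str] = []


def parse_enrichment(raw_json: str) -> Optional[Enrichment]:
    """Validate a model response; return None when it breaks the schema."""
    try:
        return Enrichment(**json.loads(raw_json))
    except (ValidationError, json.JSONDecodeError, TypeError):
        # TypeError covers valid JSON that is not an object, e.g. a list.
        return None


ok = parse_enrichment('{"title": "Evening talk", "summary": "Short summary."}')
bad = parse_enrichment('{"title": "Missing summary"}')
print(ok, bad)
```

A response that fails validation can be retried or flagged for review instead of being saved, which is one way the pipeline could keep every stored file on the same schema.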
~$0.08/recording (Sonnet 4.6: $3/$15 per 1M tokens). ~$240 for 3,000 recordings. Combined with transcription ($810): ~$1,050 total for steps 03 + 04. Prompt caching reduces to ~$150. Batch API (deferred) reduces to ~$75.
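The headline figures above can be reproduced with a few lines of arithmetic. The per-recording cost, the transcription total, and the per-token prices are taken from the text; the helper token_cost_usd is a hypothetical convenience for estimating a single call and assumes no caching or batch discount.

```python
RECORDINGS = 3_000
ENRICH_CENTS_EACH = 8       # ~$0.08 per recording (from the text)
TRANSCRIPTION_USD = 810     # step 03 total (from the text)

# Work in whole cents to avoid floating-point rounding.
enrichment_usd = RECORDINGS * ENRICH_CENTS_EACH // 100
combined_usd = enrichment_usd + TRANSCRIPTION_USD


def token_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call at $3 per 1M input and $15 per 1M output tokens."""
    return input_tokens * 3 / 1_000_000 + output_tokens * 15 / 1_000_000


print(enrichment_usd)  # 240
print(combined_usd)    # 1050
```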
Static HTML/CSS/JS hosted on Cloudflare Pages. No server — the browser talks directly to Supabase using the JavaScript client loaded via CDN. Password gate is server-side via Cloudflare Pages Functions middleware (SHA-256 hashed, 30-day session cookies).
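The password gate itself runs in Cloudflare Pages Functions, which is JavaScript; the Python sketch below only illustrates the logic implied by "SHA-256 hashed, 30-day session cookies". The cookie name, the cookie attributes, and all function names are assumptions, and how the stored hash and session token are generated is not covered by the source.

```python
import hashlib
import hmac

SESSION_MAX_AGE = 30 * 24 * 60 * 60  # 30 days in seconds


def sha256_hex(password: str) -> str:
    """Hex-encoded SHA-256 digest of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def check_password(submitted: str, stored_hash: str) -> bool:
    """Compare the submitted password's hash with the stored hash."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(sha256_hex(submitted), stored_hash)


def session_cookie(token: str) -> str:
    """Build a Set-Cookie value for a 30-day session (assumed attributes)."""
    return (
        f"session={token}; Max-Age={SESSION_MAX_AGE}; "
        "Path=/; HttpOnly; Secure; SameSite=Lax"
    )


stored = sha256_hex("example-password")  # hypothetical stored hash
print(check_password("example-password", stored))
print(session_cookie("opaque-token"))
```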
events (immutable facts: date, location, setting, description) and media (recording metadata: tape counts, duration, quality). An event_media_view joins them for the Review UI. Includes FMP legacy data migration (12,155 records) with field mapping, location dedup, and tag normalization.
source field tracks origin. Future: Typesense search index.
events table columns:
id UUID (PK)
event_key TEXT UNIQUE
event_date DATE
time_of_day TEXT
sequence INTEGER
setting TEXT
location TEXT
description TEXT
series TEXT
topic_tags TEXT[]
notes TEXT
source TEXT
created_at TIMESTAMPTZ
updated_at TIMESTAMPTZ
media table columns:
id UUID (PK)
event_id UUID (FK → events)
tape_count INTEGER
tape_numbers TEXT[]
duration_seconds FLOAT
quality TEXT
recorded_by TEXT
cataloged_by TEXT
cataloged_at TEXT
related_tape_count INTEGER
related_tape_numbers TEXT[]
notes TEXT
source TEXT
created_at TIMESTAMPTZ
updated_at TIMESTAMPTZ
recordings queries in the Review UI. Provides a flat view of all event + media data for browsing, filtering, and export.
setting, description, topic_tags, location, notes, event_date, time_of_day, sequence, event_key, series
tape_count, tape_numbers, duration_seconds (×60), quality, recorded_by, cataloged_by/at, related_tape_numbers
location-mapping.csv. Matches anywhere in string.
tag-mapping.csv. Semicolon-separated multi-value parsing.
1989-01-10 af). Unique per event.
scripts/import_fmp.py: CLI with --dry-run, --limit N, and --wipe flags. Batch upserts (500 rows/batch) into events + media tables. 33 unit tests covering event keys, tags, locations, date parsing, and tape number cleaning.
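The import rules above (location matching anywhere in the string, semicolon-separated tags, and 500-row batches) could be sketched as follows. This is not the code in scripts/import_fmp.py: the function names, the case-insensitive matching, the in-memory mapping dictionaries standing in for the CSV files, and the sample values are all assumptions.

```python
from typing import Dict, Iterator, List, Optional

BATCH_SIZE = 500  # rows per upsert batch (from the text)


def map_location(raw: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the canonical location whose pattern occurs anywhere in raw."""
    lowered = raw.lower()
    for pattern, canonical in mapping.items():
        if pattern.lower() in lowered:  # substring match, case-insensitive
            return canonical
    return None


def parse_tags(raw: str, mapping: Dict[str, str]) -> List[str]:
    """Split on semicolons, normalize via mapping, drop blanks and repeats."""
    tags: List[str] = []
    for part in raw.split(";"):
        tag = part.strip()
        if not tag:
            continue
        tag = mapping.get(tag.lower(), tag)  # unmapped tags pass through
        if tag not in tags:
            tags.append(tag)
    return tags


def batches(rows: list, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Hypothetical mapping entries and raw values, for illustration only.
print(map_location("Main hall, New Delhi", {"delhi": "Delhi"}))
print(parse_tags("Yoga; meditation;; yoga", {"yoga": "Yoga", "meditation": "Meditation"}))
print([len(b) for b in batches(list(range(1201)))])
```

Each batch would then be passed to a single upsert call, so a failure affects at most 500 rows and the run can resume from the failed batch.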
source field to track data origin.
Three phases over 18 months, from foundational infrastructure to internal tools at scale. A potential public access phase is outlined below but is subject to a future team decision and is not assumed.
YYYY-MM-DD_[type]_[location]_[seq].mp4
scripts/import_fmp.py (events + media schema, 33 tests)
The major architectural and tooling choices, with recommendations and trade-offs clearly laid out.
Every tool in the recommended stack, organized by function. Short-term choices prioritize speed; long-term choices prioritize scale and control.
Things not explicitly covered in the original workflow that must be addressed before scaling. These will cause expensive problems if ignored.
YYYY-MM-DD_[type]_[location]_[seq].mp4. Enforced in pipeline code: files that don't match are rejected.
pipeline/transcribe/vocabulary/. Automatic language detection handles Hindi-English mixing.
Hard-earned lessons from similar archival projects. These are the most common ways this kind of work goes wrong.