Agent Daily

Show HN: A real time AI video agent with under 1 second of latency

By hassaanr, originally posted on Hacker News

Tavus, an AI research company, has built a real-time conversational video agent with under one second of latency by optimizing the architecture around its Phoenix-2 model. Key changes include replacing the NeRF backbone with Gaussian Splatting to generate frames at 70+ fps, tuning every pipeline component (vision, ASR, LLM, TTS) for latency, and adding a specialized end-of-turn detection model so that human-AI conversations feel natural.
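
As a rough illustration of what a sub-1-second target implies, the sketch below splits the budget across pipeline stages and computes the per-frame time available at 70 fps. Only the 1-second target and the 70 fps figure come from the summary; the per-stage numbers are invented for illustration and are not Tavus measurements. The sketch also assumes each stage streams, so only the time to its first output chunk counts toward the response latency.

```python
TARGET_MS = 1000.0  # "sub-1 second" end-to-end target from the summary
FPS = 70            # frame rate quoted for the Phoenix-2 model

# Hypothetical time for each stage to emit its FIRST streamed chunk, in ms.
# These values are illustrative assumptions, not measured numbers.
first_chunk_ms = {
    "end_of_turn_detection": 200.0,
    "asr": 100.0,
    "llm_first_token": 250.0,
    "tts_first_audio": 150.0,
    "video_first_frame": 50.0,
}

# Time available to render one frame at the quoted frame rate.
frame_budget_ms = 1000.0 / FPS

# With streaming, response latency is the sum of first-chunk latencies.
total_ms = sum(first_chunk_ms.values())
headroom_ms = TARGET_MS - total_ms

print(f"per-frame budget at {FPS} fps: {frame_budget_ms:.1f} ms")
for stage, ms in first_chunk_ms.items():
    print(f"{stage:>24}: {ms:6.1f} ms ({ms / TARGET_MS:.0%} of target)")
print(f"time to first frame: {total_ms:.0f} ms, headroom: {headroom_ms:.0f} ms")
```

At 70 fps each frame has about 14.3 ms of render time, and under these assumed numbers the stages consume 750 ms of the 1000 ms target, which shows how little slack any single component has.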

Key Points

  • Achieved sub-1-second latency for conversational AI video by trimming milliseconds at every stage of the pipeline (vision, ASR, LLM, TTS, video generation)
  • Replaced the NeRF-based backbone of the Phoenix-2 model with Gaussian Splatting, which generates frames at 70+ fps on lower-end hardware and reduces compute requirements
  • Identified time-to-first-token (TTFT), not tokens per second, as the critical LLM bottleneck; standard providers such as Groq were too slow to first token despite high throughput
  • Built a specialized end-of-turn detection model that uses conversation signals and input speculation to cut the latency of waiting on speech pauses, avoiding both talking over the user and delayed responses
  • Moved from a dedicated H100 GPU per conversation to multiple conversations on lower-end hardware through memory optimization and better GPU core utilization
  • Applied architectural techniques such as streaming rather than batching and parallelizing processes to balance three competing constraints: latency, scale, and cost
  • Validated the system with customers such as Delphi, whose users hold multi-hour conversations with digital twins, as evidence of reliability and user engagement
  • Framed conversational video as a fundamental human-computer interface that must match the pace of natural conversation (roughly 250 ms between utterances)
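
The TTFT point above can be made concrete with a small latency model. The sketch below is illustrative only: the two provider profiles, their numbers, and the helper `ms_until_token` are hypothetical and do not describe any real service. It assumes a streaming pipeline in which TTS can start speaking once the first few tokens arrive, and compares that with a batched pipeline that waits for the whole reply.

```python
from dataclasses import dataclass


@dataclass
class LLMProfile:
    name: str
    ttft_ms: float       # time to first token
    tokens_per_s: float  # steady-state decode throughput


def ms_until_token(profile: LLMProfile, n: int) -> float:
    """Time until the n-th token is available (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The first token arrives at TTFT; the remaining n - 1 arrive at decode speed.
    return profile.ttft_ms + (n - 1) / profile.tokens_per_s * 1000.0


# Hypothetical profiles, chosen only to illustrate the trade-off.
low_ttft = LLMProfile("low-TTFT", ttft_ms=100.0, tokens_per_s=60.0)
high_tps = LLMProfile("high-throughput", ttft_ms=400.0, tokens_per_s=500.0)

FIRST_CHUNK = 8   # tokens TTS needs before it can start speaking (assumed)
FULL_REPLY = 200  # tokens in a complete reply (assumed)

for p in (low_ttft, high_tps):
    streamed = ms_until_token(p, FIRST_CHUNK)  # streaming: wait for first chunk
    batched = ms_until_token(p, FULL_REPLY)    # batching: wait for full reply
    print(f"{p.name:>16}: first chunk {streamed:6.0f} ms, full reply {batched:6.0f} ms")
```

Under these assumed numbers the low-TTFT profile delivers its first speakable chunk in about 217 ms versus 414 ms, even though it takes more than four times as long to finish the full reply (about 3417 ms versus 798 ms). In a streamed pipeline the user hears the first chunk, so TTFT dominates perceived latency.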
