Tools: VoxTube – Convert YouTube videos to audio with local TTS

Source: Dev.to

I built a tool to convert YouTube videos into podcasts.

Problem: I kept queuing YouTube tutorials and talks but never watching them. Video demands attention in a way that audio doesn't.

Solution: VoxTube extracts transcripts from YouTube videos and converts them to audio using high-quality TTS. Now I "watch" YouTube during my commute, while cooking, and during workouts.

GitHub: https://github.com/shawn-dsz/voxtube

Happy to answer questions about the build!

- Built with Bun + Hono (~300 lines)
- Uses Kokoro TTS (runs locally via Docker)
- Caches generated audio
- No cloud dependencies

- Bun's file APIs are really nice for streaming audio
- Modern TTS (Kokoro) sounds surprisingly natural
- Most YouTube videos have transcripts available

- 2 weeks to MVP
- ~300 lines of code
- 0 monthly costs (runs locally)
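
The "caches generated audio" point can be sketched roughly as follows. This is a minimal illustration, not VoxTube's actual code: the names `cacheKey`, `getAudio`, and `Synthesize`, the `.audio` file extension, and the choice to key the cache on video ID plus voice are all assumptions made for the example. The TTS call is passed in as a function, so the sketch makes no claims about how Kokoro is invoked.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical signature: any function that turns text into audio bytes,
// for example a request to a TTS server running locally.
type Synthesize = (text: string, voice: string) => Promise<Uint8Array>;

// Derive a stable file name from the inputs that affect the audio.
// Assumption: the same video ID and voice always yield the same audio.
function cacheKey(videoId: string, voice: string): string {
  return createHash("sha256")
    .update(`${videoId}:${voice}`)
    .digest("hex")
    .slice(0, 16);
}

// Return cached audio if present; otherwise synthesize it and store it on disk.
async function getAudio(
  cacheDir: string,
  videoId: string,
  voice: string,
  transcript: string,
  synthesize: Synthesize,
): Promise<{ audio: Uint8Array; cached: boolean }> {
  mkdirSync(cacheDir, { recursive: true });
  const path = join(cacheDir, `${cacheKey(videoId, voice)}.audio`);

  if (existsSync(path)) {
    // Cache hit: skip the (slow) TTS step entirely.
    return { audio: new Uint8Array(readFileSync(path)), cached: true };
  }

  // Cache miss: generate once, then persist for later requests.
  const audio = await synthesize(transcript, voice);
  writeFileSync(path, audio);
  return { audio, cached: false };
}
```

With this shape, a second request for the same video and voice reads the file from disk instead of running TTS again, which matters because synthesis is likely the slowest step in the pipeline.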