Tools: I automated my YouTube workflow with Node.js. The hard part wasn't the code.

Source: Dev.to

I wanted to stop spending weekends editing videos, so I built a pipeline that takes a YouTube URL and outputs a fully produced video ready to upload — script, voiceover, AI clips, subtitles, thumbnail, the whole thing. It's about 200 lines of Node.js orchestrating a bunch of AI APIs, and it mostly works. But the part that broke most often wasn't ffmpeg, or the subtitle timing, or the YouTube upload auth. It was figuring out what to actually make.

The general shape: fetch transcript → Claude analyzes and writes a new script → Minimax TTS for voiceover → Veo generates video clips → ffmpeg assembles everything → upload to YouTube.

The messiest part was handling five different AI APIs in one script. Each has its own SDK, its own auth pattern, its own response format. I kept having to look up whether the response was at data.choices[0].message.content or data.content[0].text or somewhere else entirely. I ended up switching to SkillBoss, a gateway that puts all of them behind one endpoint. The call shape is the same regardless of which model you're hitting: one import, one auth token, one response-parsing pattern. Not revolutionary, but when you're switching between five models in the same file, it actually matters.

Once the pipeline was running, I realized I had no good answer for what content to put into it. I'd been watching OpenClaw threads for a while — people sharing use cases, builds, workflows. One thread from a few weeks ago had 200+ upvotes: someone building a multi-agent research system that automatically synthesizes papers, generates summaries, and routes findings to different Notion databases by topic. Impressive setup. I checked their profile two weeks later — no follow-up post, and the GitHub repo had two commits, both on the same day.

Meanwhile the projects that do ship tend to look different. Smaller. More personal. An eye drop reminder that reads a prescription label and sends Telegram alerts. A morning briefing that pulls from three RSS feeds.
A script that auto-categorizes a folder of receipts. Nobody posts these as big announcements. They just... work, because done means "it works for me" and there's no external bar.

I think this is the actual trap with agentic tools: they make starting easy, which makes it tempting to start ambitious things. But the gap between "got it working in a demo" and "I've been using this for two months" is mostly not a technical gap.

For my own use, I ended up narrowing the scope significantly from what I originally planned. The pipeline doesn't try to research topics or pick what to make — I still do that. It doesn't handle edge cases I haven't hit yet. It doesn't have a UI. It does one thing: given a YouTube URL with interesting content, it makes a shorter, re-narrated version with decent production value. That's it.

I've made 11 videos with it so far, uploaded 4, and scrapped the rest because the source material wasn't interesting enough — which is a content problem, not a pipeline problem. It keeps running.

The interesting remaining question for me is whether the "small scope, personal use" pattern holds as these tools get better at handling the hard 40% — or whether ambition will always outpace the capability ceiling. Based on what I've seen in OpenClaw threads, I lean toward the former being a feature of the people building, not just the tools.

If you've shipped something agentic that you're still using six months later, I'm curious what the scope looked like when you first had the idea vs. what you actually built.

Here's the gateway call pattern, for reference:
```javascript
// One call shape for every model behind the gateway.
const run = async (model, inputs) => {
  const res = await fetch('https://api.heybossai.com/v1/run', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SKILLBOSS_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model, inputs })
  })
  return res.json()
}

// script generation
const script = await run('bedrock/claude-4-5-sonnet', { messages: [...] })

// video clips
const clip = await run('vertex/veo-3.1-fast-generate-preview', { prompt, duration: 6 })

// voiceover
const audio = await run('minimax/speech-01-turbo', { text, voice_setting: { voice_id: 'male-qn-jingying' } })
```
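Even behind one endpoint, the "where does the text live" problem from earlier is worth handling explicitly. A minimal sketch of a normalizer — the two shapes shown are the OpenAI-style and Anthropic-style payloads mentioned above, and the helper name is mine, not part of any SDK:

```javascript
// Pull the generated text out of whichever response shape came back.
// Handles the two shapes named in the post; anything else fails loudly
// instead of silently returning undefined.
const extractText = (data) => {
  const openaiStyle = data?.choices?.[0]?.message?.content  // data.choices[0].message.content
  if (typeof openaiStyle === 'string') return openaiStyle
  const anthropicStyle = data?.content?.[0]?.text           // data.content[0].text
  if (typeof anthropicStyle === 'string') return anthropicStyle
  throw new Error('Unrecognized response shape: ' + JSON.stringify(data).slice(0, 200))
}
```

Failing loudly on an unknown shape is deliberate: a silent `undefined` script would only surface much later, as a broken voiceover step.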