Tools: How I Built an AI Barista using Square, Supabase, and ElevenLabs

Source: Dev.to

I run a tech-forward coffee hub in Philadelphia called BrewHubPHL. When we opened, I didn't just want a screen flashing "Order Ready"; I wanted the shop to speak.

Here is how I used Supabase Edge Functions to glue Square POS and ElevenLabs together, creating an automated announcer for our orders.

## The Stack

- Database & Auth: Supabase
- Payments: Square (POS and Webhooks)
- Voice AI: ElevenLabs (Turbo v2 model)
- Compute: Netlify Functions

## The Workflow

1. Square detects a payment (`payment.updated`).
2. Supabase receives the webhook and routes it.
3. ElevenLabs generates the audio file ("Order for John is ready!").
4. The frontend plays the audio automatically.

## Step 1: Catching the Square Webhook

First, we need to know when an order is actually paid. We set up a serverless function to listen for Square's `payment.updated` event.

```javascript
// square-webhook.js
const { createClient } = require('@supabase/supabase-js');

// Env var names are assumptions; use whatever your deployment defines.
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_ROLE_KEY);

exports.handler = async (event) => {
  const body = JSON.parse(event.body);
  const payment = body.data?.object?.payment;

  if (body.type === 'payment.updated' && payment?.status === 'COMPLETED') {
    const orderId = payment.reference_id;

    // Update Supabase
    await supabase.from('orders').update({ status: 'paid' }).eq('id', orderId);

    // Trigger the Announcer (defined elsewhere)
    await triggerVoiceAnnouncement(orderId);
  }

  // Acknowledge the webhook so Square does not retry it
  return { statusCode: 200 };
};
```

## Step 2: Generating the Voice

This is where the magic happens. We don't want a robotic "text-to-speech" voice; we want personality.

I used the ElevenLabs Turbo v2 model because it has low latency (essential for real-time retail). We send the text to their API and get back an audio buffer.

```javascript
// text-to-speech.js
const VOICE_ID = process.env.ELEVENLABS_VOICE_ID; // assumed env var name

const response = await fetch(`https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`, {
  method: 'POST',
  headers: {
    'xi-api-key': process.env.ELEVENLABS_API_KEY,
    'Content-Type': 'application/json', // required for the JSON body
  },
  body: JSON.stringify({
    text: "Order ready for specific_customer!", // placeholder; substitute the customer's name
    model_id: 'eleven_turbo_v2',
    voice_settings: { stability: 0.5, similarity_boost: 0.75 }
  })
});

// The response body is binary audio
const audioBuffer = Buffer.from(await response.arrayBuffer());
```

Why build this? It's not just a gimmick. In a busy shop, customers tune out shouting baristas. A distinct, consistent AI voice cuts through the noise. Plus, because it is integrated directly with Square and Supabase, there is no manual step: the barista just taps "Charge," and the code does the rest.

For the developers, I've open-sourced the sync logic on GitHub: https://gist.github.com/BrewHubPHL/53937283c5eaa7cafedb9555e851c509
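
Step 1 calls `triggerVoiceAnnouncement(orderId)` without showing it, so here is a minimal sketch of what that helper might look like, tying the webhook to the ElevenLabs request from Step 2. This is not the code from the gist. The file name, `buildAnnouncementText`, and the injected `deps` object (`getCustomerName`, `fetch`, `apiKey`, `voiceId`) are all hypothetical names chosen for illustration, and the fallback wording is an assumption. The endpoint, model ID, and voice settings are the ones used earlier in the post.

```javascript
// announce.js: hypothetical helper; names here are illustrative, not from the gist.

const ELEVENLABS_URL = 'https://api.elevenlabs.io/v1/text-to-speech';

// Build the sentence the shop will hear. Falls back to a generic line
// (an assumption) when the order has no customer name attached.
function buildAnnouncementText(customerName) {
  const name = (customerName || '').trim();
  return name ? `Order for ${name} is ready!` : 'Your order is ready!';
}

// Look up the order's customer, then ask ElevenLabs for speech audio.
// `deps` is injected so the function can be exercised without network access:
//   deps.getCustomerName(orderId) -> Promise<string|null> (assumed lookup, e.g. a Supabase query)
//   deps.fetch                    -> fetch-compatible function (global fetch on Node 18+)
//   deps.apiKey, deps.voiceId     -> ElevenLabs API key and voice ID
async function triggerVoiceAnnouncement(orderId, deps) {
  const customerName = await deps.getCustomerName(orderId);
  const text = buildAnnouncementText(customerName);

  const response = await deps.fetch(`${ELEVENLABS_URL}/${deps.voiceId}`, {
    method: 'POST',
    headers: {
      'xi-api-key': deps.apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text,
      model_id: 'eleven_turbo_v2',
      voice_settings: { stability: 0.5, similarity_boost: 0.75 },
    }),
  });

  if (!response.ok) {
    throw new Error(`ElevenLabs request failed with status ${response.status}`);
  }

  // The endpoint responds with binary audio; wrap it in a Node Buffer.
  const audio = Buffer.from(await response.arrayBuffer());
  return { text, audio };
}
```

Passing the dependencies in keeps the helper easy to test with a fake `fetch`. In production you would pass the global `fetch`, your real credentials, and a lookup that reads the customer name from wherever your orders live. How the returned audio reaches the frontend (storage upload, realtime channel, and so on) is left open here, since the post does not say.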