Accelerating DIY with AI - A Lonrú Lens Side Quest for the Gemini Live Agent Challenge
Category: ACCELERATE
At Lonrú Consulting, our day-to-day focus is typically illuminating innovation within the complex Cell and Gene Therapy (CGT) landscape. However, innovation isn't confined to the laboratory. When Google announced the Gemini Live Agent Challenge, we at Lonrú Studios™ saw an opportunity to step outside our usual sphere and apply our architectural and cloud engineering expertise to a highly relatable, everyday problem: Home DIY.
The result?
HandyMate: the real-time, responsive, AI-powered DIY contractor.
This project was our very first hackathon, and it served as a perfect testbed for exploring multimodal AI, real-time data streaming, and the power of Google Cloud infrastructure. Here is a behind-the-scenes look at how we built it.
The Concept: Bringing the Expert into the Room
DIY projects are infamous for the mid-repair panic. You’re under the sink, the pipe won't loosen, and a static YouTube tutorial can't look at your specific wrench and tell you you're using it backward.
We envisioned an agent that doesn't just talk at you, but sees what you are doing. We wanted an AI that could interrupt you if you were about to make a dangerous mistake, and one that knew exactly what tools you had in your toolbox before suggesting a fix.
The Architecture: Powering Live with Google
To achieve true real-time multimodality, we needed a robust, low-latency architecture.
The Brain (Gemini 2.5 Flash Native Audio): The core of HandyMate is Google's new Gemini Live API. By streaming raw PCM audio directly to the gemini-2.5-flash-native-audio-latest model, we achieved conversational latency that feels shockingly human. The model's ability to handle active interruptions changes the paradigm of Human-Computer Interaction.
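The Live API consumes 16-bit PCM audio, so somewhere in the pipeline the browser's Float32 samples have to be converted before streaming. A minimal sketch of that conversion step, assuming the Web Audio API as the sample source (the helper name is ours, not from the HandyMate codebase; Node's Buffer is used here for brevity, where a browser would use a DataView):

```javascript
// Convert a Float32Array of audio samples (as produced by the Web Audio API)
// into base64-encoded 16-bit little-endian PCM, the raw format streamed to
// the Gemini Live API.
function float32ToPcm16Base64(samples) {
  const buf = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit integer range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    buf.writeInt16LE(Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), i * 2);
  }
  return buf.toString('base64');
}
```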
The Eyes (WebRTC & Base64 Canvas Extraction): Handling video was our greatest challenge: standard browsers have no easy, native way to stream image/jpeg frames over a generic WebSocket. We engineered a solution in our Next.js frontend to securely intercept the user's WebRTC camera feed, draw it to a hidden <canvas>, and transmit a compressed JPEG frame to the backend every 1000ms. This allows HandyMate to "see" your leaky pipe in real time alongside your voice.
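The capture loop above can be sketched as follows. The function names and message shape are illustrative, not from the actual HandyMate source; the canvas and WebSocket calls are standard browser APIs, and the small data-URL helper strips the header so only the raw base64 JPEG payload travels over the wire:

```javascript
// Strip the "data:image/jpeg;base64," header so only the raw base64 JPEG
// payload is transmitted.
function dataUrlToBase64(dataUrl) {
  const comma = dataUrl.indexOf(',');
  if (comma === -1) throw new Error('not a data URL');
  return dataUrl.slice(comma + 1);
}

// Browser-side capture loop (sketch): once per second, draw the WebRTC
// <video> element onto a hidden <canvas> and ship one compressed JPEG frame.
function startFrameLoop(video, canvas, socket, intervalMs = 1000) {
  const ctx = canvas.getContext('2d');
  return setInterval(() => {
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame = dataUrlToBase64(canvas.toDataURL('image/jpeg', 0.7));
    socket.send(JSON.stringify({ type: 'video_frame', data: frame }));
  }, intervalMs);
}
```

The 0.7 JPEG quality factor is a placeholder; the real trade-off is frame size versus how much visual detail the vision model needs.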
The Memory (Google Firestore): An agent is only as smart as its context. We integrated Firestore to give HandyMate a stateful memory. If you pause a repair to run to the hardware store, the agent saves a summarized "Project Card." When you resume, the Node.js backend injects that summary into the Gemini System Instructions, allowing the AI to greet you exactly where you left off.
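The resume step might look like the sketch below. The "Project Card" shape and function name are our own invention for illustration, and the Firestore read that would supply the card is elided; the point is how a saved summary becomes a system-instruction preamble:

```javascript
// Given a saved Project Card (the summary written to Firestore when the
// user pauses a repair), build the system-instruction preamble injected
// into the Gemini session on resume.
function buildResumeInstruction(card) {
  if (!card) {
    return 'You are HandyMate, a hands-on DIY assistant. This is a new project.';
  }
  return [
    'You are HandyMate, a hands-on DIY assistant.',
    `The user is resuming a project: ${card.title}.`,
    `Progress so far: ${card.summary}`,
    `Tools on hand: ${card.tools.join(', ')}.`,
    'Greet the user by picking up exactly where they left off.',
  ].join('\n');
}
```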
The Deployment (Google Cloud Run): Because the official @google/genai Live API requires a secure, stateful Server-to-Server connection, we couldn't rely on standard serverless edge functions. We containerized our Express Node.js application using Docker and deployed it seamlessly to Google Cloud Run, ensuring robust WebSocket tunneling that scales instantly.
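A Dockerfile along these lines is all Cloud Run needs (file names and base image are illustrative, not copied from our repository):

```dockerfile
# Containerize the Express backend for Cloud Run.
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
# Cloud Run injects PORT at runtime; the Express/WebSocket server must
# listen on it.
ENV PORT=8080
CMD ["node", "server.js"]
```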
The HandyMate architecture also demonstrates what may be possible when live agents are deployed in Cell and Gene Therapy workflows.
Our Secret Weapon: Antigravity
Given the tight 8-hour window we had to conceptualize, design, and deploy HandyMate, we used Antigravity, Google DeepMind's agentic coding assistant, as our pair programmer.
Antigravity acted as a force multiplier for Lonrú Studios™. We used it to brainstorm the initial architectural mapping and to confirm we met all of Google's hackathon criteria. Working alongside it, we rapidly debugged tricky AudioContext lifecycle issues in Safari on iOS and parsed complex JSON responses from the Gemini vision model, dramatically accelerating our development cycle. It perfectly embodied our Accelerate VantagePoint Lens theme.
What We Learned
Building HandyMate proved that we have moved beyond the simple, stateless AI chatbot and into the era of the Agentic Co-Pilot.
While HandyMate was built for fixing sinks and assembling furniture, the underlying architecture - low-latency audio/video streaming, contextual memory injection, and robust cloud deployment - has profound implications for our primary work in the CGT sector. Imagine a sterile-room manufacturing operator equipped with a hands-free, multimodal agent that can see a bioreactor's physical state while conversing about standard operating procedures in real-time.
That is the true power of Live Agents, and we are incredibly proud of what we accomplished this weekend.
Curious about the code or the app? Watch our 4-minute demo video on our Devpost submission, or visit the live app at handymate.vercel.app.
Disclaimer: We created this piece of content for the purposes of entering the Gemini Live Agent Challenge hackathon. #GeminiLiveAgentChallenge