Voice Agent
Add a talking voice assistant to your app. Visitors press a button, speak out loud, and hear the AI answer back in real time, with no typing required.
Voice sessions require the Pro or Business plan. Each session costs 5 credits, charged to the app owner.
What It Does
Voice Agent gives your app a real, two-way spoken conversation. A visitor taps a "Start call" button, talks naturally, and the AI replies out loud with almost no delay. To add it, just describe what you want in the chat - for example "add a voice assistant that helps customers book a table" - and GenMB wires up the microphone, the live audio, and the playback for you.
- Natural back-and-forth voice: the visitor speaks and the AI answers out loud in real time.
- Optional live on-screen transcript of both sides of the conversation.
- You can set the assistant's instructions, pick a voice, and choose a language.
- Added automatically when your prompt mentions a voice agent, voice bot, voice booking, or voice ordering.
Setup
You do not need to set anything up by hand. The fastest path is to ask in chat.
Describe it in your prompt
Or turn it on from the Services panel
Tweak how it sounds and behaves
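As a rough illustration of tweaking the assistant from code, the sketch below wires up the window.genmb.voiceAgent object summarised in this section. Only the method names and the { instructions, voice, language } option keys come from this page. The callback signatures, the transcript entry shape, the way error details are delivered, and the helper names (VoiceAgent, CallView, wireVoiceAgent) are assumptions made for the example, so check them against what your generated app actually exposes.

```typescript
// ASSUMED shape of window.genmb.voiceAgent, inferred from the method names in
// this section. Callback signatures and payloads are guesses, not a contract.
interface VoiceStartOptions {
  instructions?: string;
  voice?: string;
  language?: string;
}

interface VoiceAgent {
  start(options?: VoiceStartOptions): unknown;
  stop(): unknown;
  mute(): unknown;
  unmute(): unknown;
  onStateChange(cb: (state: string, detail?: unknown) => void): void;
  onTranscript(cb: (entry: { role: string; text: string }) => void): void;
}

// Hypothetical container for what the UI would render; swap for DOM updates.
interface CallView {
  states: string[];
  transcript: string[];
  errorMessage: string | null;
}

function wireVoiceAgent(agent: VoiceAgent, options: VoiceStartOptions): CallView {
  const view: CallView = { states: [], transcript: [], errorMessage: null };

  agent.onStateChange((state, detail) => {
    view.states.push(state);
    if (state === "error") {
      // Plan gating, rate limits, and denied microphone access all land here.
      // Receiving a string detail is an assumption; fall back to a generic message.
      view.errorMessage =
        typeof detail === "string"
          ? detail
          : "Voice is unavailable right now. Please try again soon.";
    }
  });

  agent.onTranscript((entry) => {
    view.transcript.push(`${entry.role}: ${entry.text}`);
  });

  // Register callbacks first so no early state change is missed.
  agent.start(options);
  return view;
}
```

In a browser this might be called as wireVoiceAgent(window.genmb.voiceAgent, { instructions: "Help customers book a table." }). Passing the agent in as a parameter, rather than reading the global inside the function, keeps the wiring easy to exercise with a fake object.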
In code, the assistant is available as window.genmb.voiceAgent, with start(options?), stop(), mute() / unmute(), and onStateChange / onTranscript callbacks. start() accepts { instructions, voice, language }. Handle the error state to surface plan-gating, rate-limit, or microphone-permission messages.
Costs & Plan Gating
Voice sessions are metered per session start, not per turn. Once a session is open, all turns within it are covered by the same charge.
| Cost per session | 5 credits |
| Charged to | App owner, not the end user of the deployed app |
| Minimum plan | Pro |
| Failed sessions | No charge - credits only deduct once a session starts successfully |
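To make the per-session metering concrete, here is a small worked example of the arithmetic. The 5-credit price and the no-charge rule for failed sessions come from the table above. The SessionAttempt type and creditsCharged function are hypothetical names used only for illustration.

```typescript
const CREDITS_PER_SESSION = 5; // from the cost table

// Hypothetical record of one attempt to open a voice session.
interface SessionAttempt {
  startedSuccessfully: boolean;
  turns: number; // back-and-forth exchanges; does not affect the charge
}

function creditsCharged(attempts: SessionAttempt[]): number {
  // Only sessions that actually started are billed, regardless of turn count.
  const billable = attempts.filter((a) => a.startedSuccessfully).length;
  return billable * CREDITS_PER_SESSION;
}

const attempts: SessionAttempt[] = [
  { startedSuccessfully: true, turns: 12 },
  { startedSuccessfully: true, turns: 1 },
  { startedSuccessfully: false, turns: 0 }, // e.g. microphone permission denied
];

console.log(creditsCharged(attempts)); // 10
```

A 12-turn call and a 1-turn call cost the same 5 credits each, and the failed attempt adds nothing.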
Rate Limits
Each app can start a limited number of voice sessions per hour. This keeps a single busy app, or a bad actor hammering your app, from running up a surprise credit bill.
| Sessions per hour, per app | 20 |
| Window | Rolling 60-minute window |
| Over the limit | New calls are turned away until earlier sessions fall outside the 60-minute window; show visitors a short "try again soon" message |
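The behaviour of a rolling window can be easier to see in code. The following is a minimal sketch of a sliding-window counter using the limits from the table above. It illustrates the general technique only; GenMB enforces the real limit on its side, and the RollingWindowLimiter class, along with the choice not to count refused attempts, is an assumption for the example.

```typescript
const MAX_SESSIONS_PER_WINDOW = 20; // from the rate-limit table
const WINDOW_MS = 60 * 60 * 1000; // rolling 60 minutes

// Illustrative only: not GenMB's implementation.
class RollingWindowLimiter {
  private starts: number[] = [];

  // Returns true and records the start if the app is under its limit.
  tryStart(nowMs: number): boolean {
    // Forget session starts that have aged out of the window.
    const cutoff = nowMs - WINDOW_MS;
    this.starts = this.starts.filter((t) => t > cutoff);

    if (this.starts.length >= MAX_SESSIONS_PER_WINDOW) {
      return false; // caller should show a "try again soon" message
    }
    this.starts.push(nowMs);
    return true;
  }
}
```

Because the window rolls, capacity returns gradually as each earlier session passes the 60-minute mark, rather than all at once at the top of the hour.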
Security
Voice sessions are built so nothing sensitive reaches the visitor's browser and the audio is handled live, never stored.
- The browser only gets a short-lived session token; the underlying AI keys stay on the GenMB backend.
- Audio is processed live and discarded. GenMB does not store or transcribe it.
- The voice stream is encrypted end to end; there is no unencrypted audio path.
- The browser always asks the visitor for microphone permission first. Your app cannot bypass that prompt.
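The microphone prompt is standard browser behaviour: navigator.mediaDevices.getUserMedia rejects with a DOMException whose name is NotAllowedError when access is declined, NotFoundError when no microphone is available, and NotReadableError when the device cannot be read. The sketch below maps those names to visitor-facing messages. Whether GenMB's error state passes these names through is an assumption, and micErrorMessage is a hypothetical helper.

```typescript
// Maps standard getUserMedia error names to messages a visitor can act on.
// The wording is illustrative; adapt it to your app.
function micErrorMessage(errorName: string): string {
  switch (errorName) {
    case "NotAllowedError":
      return "Microphone access was blocked. Allow it in your browser settings to start a call.";
    case "NotFoundError":
      return "No microphone was found on this device.";
    case "NotReadableError":
      return "The microphone could not be read. Another app may be using it.";
    default:
      return "The microphone could not be started.";
  }
}

// Browser usage (not run here):
// navigator.mediaDevices.getUserMedia({ audio: true })
//   .catch((err) => showMessage(micErrorMessage(err.name))); // showMessage is hypothetical
```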
FAQs
How fast does the assistant respond?
Do I need to provide an API key?
How much does a voice session cost?
What plans support Voice Agent?
Are conversations recorded?