Voice Agent

Add a talking voice assistant to your app. Visitors press a button, speak out loud, and hear the AI answer back in real time, no typing required.

Voice sessions require the Pro or Business plan. Each session costs 5 credits, charged to the app owner.

What It Does

Voice Agent gives your app a real, two-way spoken conversation. A visitor taps a "Start call" button, talks naturally, and the AI replies out loud with almost no delay. To add it, describe what you want in the chat (for example, "add a voice assistant that helps customers book a table") and GenMB wires up the microphone, the live audio, and the playback for you.

  • Natural back-and-forth voice: the visitor speaks and the AI answers out loud in real time.
  • Live on-screen transcript of both sides of the conversation if you want to show it.
  • You can set the assistant's instructions, pick a voice, and choose a language.
  • Added automatically when your prompt mentions a voice agent, voice bot, voice booking, or voice ordering.

Setup

You do not need to set anything up by hand. The fastest path is to ask in chat.

1. Describe it in your prompt

When you create or refine your app, mention voice. Phrases like "voice booking flow", "voice ordering", or "a talk-to-AI button" tell GenMB to add the voice assistant automatically, including the call button, the connection states, and a live transcript.

2. Or turn it on from the Services panel

Open the Services panel in the app editor sidebar, find Voice Agent, and toggle it on. It becomes available the next time your app saves.

3. Tweak how it sounds and behaves

Ask in chat to change the assistant's instructions (what it should help with), switch to a different voice, or set the language. No code needed.

For developers: the assistant is exposed at window.genmb.voiceAgent with start(options?), stop(), mute() / unmute(), and onStateChange / onTranscript callbacks. start() accepts { instructions, voice, language }. Handle the error state to surface plan-gating, rate-limit, or microphone-permission messages.
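
As a rough illustration of the developer API described above, the sketch below wires a call button to the assistant. The method names and the { instructions, voice, language } options come from this page; everything else is an assumption made for the example: that onStateChange and onTranscript accept a handler function, that states arrive as strings, that transcript entries carry role and text fields, and that start() may return a promise. The CallUi hooks are hypothetical helpers from your own app. Check the real object in your app before relying on any of these shapes.

```typescript
// Names taken from this page: start, stop, mute, unmute, onStateChange,
// onTranscript, and the { instructions, voice, language } options.
// ASSUMPTIONS: the handler-registration style, the payload shapes, and
// start() possibly returning a promise are guesses made for this sketch.
interface StartOptions {
  instructions?: string;
  voice?: string;
  language?: string;
}

interface TranscriptEntry {
  role: string; // assumed field
  text: string; // assumed field
}

interface VoiceAgentLike {
  start(options?: StartOptions): void | Promise<void>;
  stop(): void;
  mute(): void;
  unmute(): void;
  onStateChange(handler: (state: string) => void): void;
  onTranscript(handler: (entry: TranscriptEntry) => void): void;
}

// Hypothetical UI hooks supplied by your own app.
interface CallUi {
  setStatus(text: string): void;
  appendTranscript(text: string): void;
}

function wireVoiceAgent(
  agent: VoiceAgentLike,
  ui: CallUi,
  options: StartOptions,
): () => Promise<boolean> {
  let inCall = false;

  const fail = (): void => {
    inCall = false;
    ui.setStatus("Voice is unavailable right now. Please try again soon.");
  };

  agent.onStateChange((state) => {
    if (state === "error") {
      // Plan gating, rate limits, and denied microphone access surface here.
      fail();
    } else {
      ui.setStatus(state);
    }
  });
  agent.onTranscript((entry) => ui.appendTranscript(`${entry.role}: ${entry.text}`));

  // Call the returned function from a button click, so a session (and its
  // credit charge) only begins when the visitor asks for one.
  return async function toggleCall(): Promise<boolean> {
    if (inCall) {
      agent.stop();
      inCall = false;
      return inCall;
    }
    inCall = true;
    try {
      await agent.start(options);
    } catch {
      fail();
    }
    return inCall;
  };
}
```

In a page, this might be hooked up as `const toggle = wireVoiceAgent(window.genmb.voiceAgent, ui, { language: "en" })`, with `toggle` called from the button's click handler. Passing the agent in as a parameter keeps the wiring easy to exercise with a stand-in object.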

Costs & Plan Gating

Voice sessions are metered per session start, not per turn. Once a session is open, all turns within it are covered by the same charge.

Cost per session: 5 credits
Charged to: App owner, not the end user of the deployed app
Minimum plan: Pro
Failed sessions: No charge - credits only deduct once a session starts successfully

See the Credits documentation for top-up packs and how the team credit pool works on Business plans.
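
To make the metering rule concrete, here is a small worked calculation. It is only arithmetic over the figures on this page (5 credits per session, no charge for failed sessions); the function name and its parameters are made up for the example and are not part of any GenMB API.

```typescript
const CREDITS_PER_SESSION = 5; // figure quoted on this page

// Credits are deducted once per successfully started session, no matter
// how many turns the conversation has. Failed starts cost nothing.
function creditsUsed(attemptedSessions: number, failedSessions: number): number {
  if (
    !Number.isInteger(attemptedSessions) ||
    !Number.isInteger(failedSessions) ||
    failedSessions < 0 ||
    attemptedSessions < failedSessions
  ) {
    throw new RangeError("expected whole numbers with 0 <= failed <= attempted");
  }
  const successfulSessions = attemptedSessions - failedSessions;
  return successfulSessions * CREDITS_PER_SESSION;
}
```

For example, 12 attempted calls of which 3 failed to connect leaves 9 successful sessions, so `creditsUsed(12, 3)` comes to 9 × 5 = 45 credits.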

Rate Limits

Each app can start a limited number of voice sessions per hour. This keeps a single busy app, or a bad actor hammering your app, from running up a surprise credit bill.

Sessions per hour, per app: 20
Window: Rolling 60-minute window
Over the limit: New calls are turned away until earlier sessions age out of the 60-minute window; show visitors a short "try again soon" message

Have visitors start a call by tapping a button rather than starting one automatically when the page opens. That way curious visitors who never actually talk to the assistant do not use up sessions or credits.
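
The sketch below models how a rolling 60-minute window with a 20-session cap behaves, which can help when reasoning about when a blocked visitor will be able to call again. It is an illustration of the rule as stated here, not GenMB's actual enforcement code; the limit is applied on GenMB's side, and the class and method names are invented for the example.

```typescript
const SESSIONS_PER_HOUR = 20;       // per app, from this page
const WINDOW_MS = 60 * 60 * 1000;   // rolling 60-minute window

class RollingSessionWindow {
  private starts: number[] = [];

  // Returns true and records the start if the app is under its limit at
  // time nowMs (milliseconds), or false if the call would be turned away.
  tryStart(nowMs: number): boolean {
    // Drop session starts that are 60 minutes old or older.
    this.starts = this.starts.filter((t) => nowMs - t < WINDOW_MS);
    if (this.starts.length >= SESSIONS_PER_HOUR) {
      return false;
    }
    this.starts.push(nowMs);
    return true;
  }
}
```

Because the window rolls, capacity returns one slot at a time as each earlier session passes the 60-minute mark, rather than all at once on the hour. Combined with the per-session price, the cap also bounds spend: 20 sessions × 5 credits is at most 100 credits per app in any 60-minute span.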

Security

Voice sessions are built so nothing sensitive reaches the visitor's browser and the audio is handled live, never stored.

No exposed keys

The browser only gets a short-lived session token; the underlying AI keys stay on the GenMB backend.

No recordings kept

Audio is processed live and discarded. GenMB does not store or transcribe it.

Encrypted audio

The voice stream is encrypted end to end; there is no unencrypted audio path.

Microphone consent

The browser always asks the visitor for microphone permission first. Your app cannot bypass that prompt.

If you choose to save the conversation transcript in your app, treat it as personal data and follow your own privacy policy. GenMB itself does not transcribe or store the audio.

FAQs

How fast does the assistant respond?

Almost instantly. The assistant listens while the visitor talks and starts replying out loud as soon as they finish, so it feels like a real phone call rather than waiting for text to appear. Under the hood it runs on a real-time AI model (Azure OpenAI Realtime, gpt-realtime-1.5); you do not have to set any of that up.

Do I need to provide an API key?

No. GenMB handles the connection for you. The AI keys never reach the visitor's browser, so nothing sensitive is exposed.

How much does a voice session cost?

Each session costs 5 credits, charged to the app owner, not the end user of the deployed app. A browser tab that never starts a call costs nothing.

What plans support Voice Agent?

Voice Agent is available on the Pro and Business plans. On the Free plan you can add it to an app, but calls will not start until the app owner upgrades to Pro or Business.

Are conversations recorded?

No. GenMB does not store or transcribe the audio on its servers; it is processed live and discarded. If you want a saved transcript, you can ask in chat to show or store the on-screen transcript, and then it becomes your responsibility to handle under your own privacy policy.
