Medium 1 für Eintrag RealtimeVoice - End-to-End Realtime Voice Chat (Volcengine Doubao ASR+LLM+TTS)

Beschreibung

RealtimeVoice🗣️ Listen → 🧠 Think → 🔊 Speak. All in real time.

A ready-to-use realtime voice chat plugin built on the Volcengine (Doubao) end-to-end realtime speech LLM. Over a single WebSocket, it runs the entire ASR → LLM → TTS pipeline — giving your characters, NPCs, and digital humans the power to listen, think, and speak in real time.

Drop in one component, plug in your keys, and your character starts talking back. No C++ required.
https://www.volcengine.com/docs/6561/1594356?lang=en

✨ Key Features

⚡ End-to-End Low Latency
16 kHz mono mic input → 24 kHz streaming TTS output. The AI starts speaking before it finishes thinking.

🔀 Dual LLM Modes
Run the native Doubao all-in-one model, or switch to DeepSeek / OpenAI-compatible endpoints via the built-in HTTP bridge — custom model, temperature, max tokens, and history turns all tunable.

✋ Barge-In Interruption
Users can cut in mid-sentence and the AI yields instantly — just like a real conversation.

🧩 100% Blueprint-Ready
Drive everything from a single ActorComponent. Every event — ASR text, assistant text, speech start/end, user interrupt, error codes — is a Blueprint delegate.

🎭 Personas & Voices
Built-in Chinese voices, custom voice IDs, plus configurable system role, bot name, and speaking style.

🔈 3D Spatial Audio
Bind an external AudioComponent to spatialize the AI's voice anywhere in your scene.

💬 UMG Chat UI
Ships with an inheritable chat widget base class — live captions and conversation history out of the box.

👄 MetaHuman Lip-Sync
TTS audio events can directly drive MetaHumanAudioToFaceRuntime — speech and facial animation in perfect sync.

📋 Requirements

EngineUnreal Engine 5.6PlatformWin64ServiceYour own Volcengine AppId / AccessKey (speech service)

🤝 Build a Complete Digital Human

Pairs seamlessly with MetaHumanAudioToFaceRuntime:

🗣️ RealtimeVoice → 👄 MetaHumanAudioToFaceRuntime
realtime voice + realtime lip-sync = a fully interactive talking digital human.

Beschreibung

Enthaltene Formate

Tags