- 1 About Mumble
- 2 Audio Features
- 2.1 How does the positional sound work?
- 2.2 Why does Mumble sound so much better than other voice products?
- 2.3 Where is the volume control?
- 2.4 The text-to-speech quality is horrible!
- 2.5 Why do some voices sound metallic?
- 2.6 Why doesn't the voice activity detect my voice any more?
- 2.7 What's this weird echo I hear of myself from other users?
What is Mumble?
Mumble is a voice chat application for groups. While it can be used for any kind of activity, it's primarily intended for gaming.
What platforms does it run on?
The client, Mumble, runs on Windows XP. The server component, Murmur, should run on anything you can compile Qt 4.0 on.
What are the system requirements?
The client runs on any Windows XP machine, and you also need a microphone. The server is mostly bandwidth bound, so as long as your network hardware is sufficient it should run on pretty much anything.
Please note that the binaries distributed from SourceForge are compiled for SSE (Pentium 3 or Athlon-XP). Mumble is a VOIP solution for gaming, and as most modern games require at least that good a CPU it makes little sense for us not to optimize for it.
What makes Mumble better?
Mumble combines very low latency with good sound quality. It uses Speex extensively: not just its voice compression, but also its voice preprocessing, which removes noise and improves clarity. Mumble also has positional audio for supported games, meaning other players' voices will come from the direction of their characters in the game.
What are the bandwidth requirements?
Mumble uses VBR, and its peak audio bitrate is 34.2 kbit/s. However, to keep latency low, Mumble will send audio packets up to 50 times a second, so there's quite a hefty overhead from IP and TCP/UDP headers as well as packet framing. The worst-case actual outgoing bandwidth is 60 kbit/s with everything included. If you choose to use TCP, the kernel will automatically start merging packets if not enough bandwidth is available, decreasing the used bandwidth at the cost of latency. Using positional audio will add about 10 kbit/s to this figure, as each packet needs to carry both your position and velocity. Note that if none of the connected players have Doppler effects enabled, you'll get by with just transmitting the position, cutting this to 5 kbit/s.
You'll need enough incoming bandwidth for the output stream of each connected user, so with 10 players it would be about 400 kbit/s if they all talked at the same time.
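As a rough sanity check on the figures above, the per-packet overhead can be worked out by hand. The sketch below is only back-of-the-envelope arithmetic; it assumes 20-byte IPv4 and TCP headers, an 8-byte UDP header, and 32-bit floats for each coordinate, none of which are stated in this FAQ.

```python
# Back-of-the-envelope check of the bandwidth figures quoted in the FAQ.
# Assumptions (not from the FAQ): IPv4 header = 20 bytes, UDP header = 8 bytes,
# TCP header = 20 bytes (no options), one 32-bit float per coordinate.

PACKETS_PER_SECOND = 50   # from the FAQ: up to 50 packets a second
PEAK_AUDIO_KBIT = 34.2    # from the FAQ: peak audio bitrate


def per_packet_cost_kbit(bytes_per_packet, packets_per_second=PACKETS_PER_SECOND):
    """Bandwidth in kbit/s used by a fixed number of bytes carried in every packet."""
    return bytes_per_packet * 8 * packets_per_second / 1000


udp_headers = per_packet_cost_kbit(20 + 8)        # IP + UDP headers
tcp_headers = per_packet_cost_kbit(20 + 20)       # IP + TCP headers
position_and_velocity = per_packet_cost_kbit(6 * 4)  # x, y, z plus velocity x, y, z
position_only = per_packet_cost_kbit(3 * 4)          # x, y, z

print(f"IP+UDP headers:        {udp_headers:.1f} kbit/s")
print(f"IP+TCP headers:        {tcp_headers:.1f} kbit/s")
print(f"Audio + IP/UDP:        {PEAK_AUDIO_KBIT + udp_headers:.1f} kbit/s")
print(f"Position and velocity: {position_and_velocity:.1f} kbit/s")
print(f"Position only:         {position_only:.1f} kbit/s")
```

Under these assumptions the positional data comes to 9.6 and 4.8 kbit/s, which is consistent with the "about 10" and "5" quoted above. The gap between audio plus headers and the 60 kbit/s worst case would be packet framing and lower-layer overhead, which this sketch does not model.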
How can I help?
A good start would be just using Mumble. If you like it, tell all your friends. If you don't like it, tell us what's wrong so we can fix it.
How does the positional sound work?
Your in-game position is transmitted along with every audio packet, and Mumble uses standard DirectSound 3D to position the audio on the receiver side. Only games for which a plugin has been written get positional audio. All other games will work as well; you just won't get 3D sound.
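To make "transmitted along with every audio packet" concrete, here is a minimal sketch of bundling a position with an audio payload. The byte layout, the function names, and the use of three little-endian 32-bit floats are all hypothetical and chosen for illustration; this is not Mumble's actual wire format.

```python
import struct

# Hypothetical layout, for illustration only: three little-endian 32-bit
# floats (x, y, z) followed by the compressed audio payload.
POSITION_FORMAT = "<3f"
POSITION_SIZE = struct.calcsize(POSITION_FORMAT)  # 12 bytes


def pack_audio_packet(position, audio_payload):
    """Prefix the audio payload with the sender's in-game position."""
    x, y, z = position
    return struct.pack(POSITION_FORMAT, x, y, z) + audio_payload


def unpack_audio_packet(packet):
    """Split a packet back into (position, audio_payload) on the receiver."""
    position = struct.unpack_from(POSITION_FORMAT, packet, 0)
    return position, packet[POSITION_SIZE:]


packet = pack_audio_packet((1.0, 2.5, -4.0), b"\x01\x02\x03")
position, payload = unpack_audio_packet(packet)
print(position, payload)
```

The receiver would then hand the decoded position to its 3D audio API to place the voice in space; that step is omitted here.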
Why does Mumble sound so much better than other voice products?
One word: denoising. This is a standard part of Speex 1.1 and above, and any voice product already using Speex should be able to trivially include the same filtering. Removing the noise from the input means that the audio will be clearer and that the needed bitrate will decrease. It takes fewer bits to model clear voice than it does to accurately represent the noise, so in any noisy transmission a large share of the bits will go to modelling noise.
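The claim that noise costs bits can be illustrated with a toy experiment. The sketch below uses general-purpose zlib compression as a stand-in for a voice codec and a plain sine wave as a stand-in for a voice; it does not use Speex and says nothing about Speex's actual bitrates. All names and parameters are made up for the illustration.

```python
import math
import random
import zlib


def quantize(samples):
    """Map samples in [-1, 1] to unsigned 8-bit values, clipping anything outside."""
    return bytes(max(0, min(255, round(127.5 + 127.5 * s))) for s in samples)


random.seed(0)  # fixed seed so the run is repeatable
SAMPLES = 8000

# A clean tone, and the same tone with Gaussian noise added on top.
clean = [0.5 * math.sin(2 * math.pi * n / 50) for n in range(SAMPLES)]
noisy = [s + random.gauss(0, 0.05) for s in clean]

clean_size = len(zlib.compress(quantize(clean)))
noisy_size = len(zlib.compress(quantize(noisy)))

print(f"clean signal compresses to {clean_size} bytes")
print(f"noisy signal compresses to {noisy_size} bytes")
```

The noisy version compresses to many more bytes than the clean one, because the compressor has to spend output on the unpredictable noise. A lossy voice codec behaves differently in detail, but the same intuition applies.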
Where is the volume control?
Mumble uses the default volume you've configured in Windows. There is no support for amplifying incoming voices, and there probably won't be, as this would decrease audio quality, something we're very reluctant to do.
The text-to-speech quality is horrible!
We use the standard MS Speech API, and the included voices aren't all that good. If you have installed either MS Office or the Speech SDK, you'll get more voices which can be configured from the Speech control panel.
Why do some voices sound metallic?
Mumble uses Speex noise filtering, and if the environment of the sender is especially noisy, some parts of the voice will be filtered as well. The alternative would be noisy sound, meaning precious bandwidth would be used to encode noise and the clarity of the voice would also decrease.
Why doesn't the voice activity detect my voice any more?
If you change your audio environment suddenly and drastically, for example by disconnecting and reconnecting your microphone or dragging a piece of paper directly over it, you'll throw the voice preprocessor off balance. It will recover, but it will take time.
To reset the preprocessor, choose 'Reset' from the 'Audio' menu.
What's this weird echo I hear of myself from other users?
Unfortunately, a lot of popular headsets produce tiny traces of echo. In other VoIP products, you won't notice it because the echo is lower than the noise level, but as Mumble dutifully removes all noise, the echo suddenly becomes clear. There is little the person hearing the echo can do, but there are a few things the person producing the echo can do. The easy solution is to use ASIO and enable echo cancellation; however, this requires an analog headset (not USB) and a very high-quality sound card.
The more troublesome solution is to modify the headset. If it's possible to pry the arm with the microphone from the headphones, do so and reattach it with a thick piece of rubber tape; this should insulate it from vibrations. If your headset is open (no large earmuffs), there exists an echo path through air from the headphones to the microphone. You can fix this by attaching anything foam-like to the front of the headphones to muffle the sound heard outside them, but this will most likely ruin the ergonomics of the headset as well as look somewhat odd.
We might put up a page of "tested headsets" if anyone wants it.