Reducing audio latency
Overview
The end-to-end audio latency consists of the following components:

- The latency of the sound capture.
- Codec latency on both the sender and the receiver.
- Other algorithmic latency (such as AEC or sample rate conversion).
- Network latency.
- Jitter buffering on the receiver end.
- The latency of the sound playback.
Optimizing sound device latency
The sound device adds latency in two ways: first, latency due to the audio playback and recording buffers themselves; and second, latency due to buffering by PJMEDIA components (such as the conference bridge) to accommodate the sound device's jitter/burst characteristics (see Sound device timing problem).
Both the latency and jitter/bursts characteristics of the sound device can be analyzed with Testing and optimizing audio device with pjsystest.
The sound device buffer sizes can be set by using the following settings:

PJSUA-LIB:

- pjsua_media_config::snd_rec_latency, whose default value is PJMEDIA_SND_DEFAULT_REC_LATENCY
- pjsua_media_config::snd_play_latency, whose default value is PJMEDIA_SND_DEFAULT_PLAY_LATENCY
- pjsua_snd_set_setting() with the PJMEDIA_AUD_DEV_CAP_INPUT_LATENCY and PJMEDIA_AUD_DEV_CAP_OUTPUT_LATENCY capabilities.

PJSUA2:

- pj::MediaConfig::sndRecLatency and pj::MediaConfig::sndPlayLatency (similar to the PJSUA-LIB settings above)
- pj::AudDevManager::setInputLatency() and pj::AudDevManager::setOutputLatency()
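As a sketch of how these settings fit together, the fragment below lowers the capture and playback buffer sizes in PJSUA-LIB, both at initialization time and at runtime. The 60/80 ms values are purely illustrative assumptions, not recommendations; verify the field and capability names against your PJSIP version.

```c
/* Sketch: lowering sound device buffer latencies in PJSUA-LIB.
 * The 60/80 ms values are illustrative only. */
#include <pjsua-lib/pjsua.h>

static void init_with_custom_snd_latency(void)
{
    pjsua_config cfg;
    pjsua_logging_config log_cfg;
    pjsua_media_config media_cfg;

    pjsua_config_default(&cfg);
    pjsua_logging_config_default(&log_cfg);
    pjsua_media_config_default(&media_cfg);

    /* Request smaller capture/playback buffers (in milliseconds).
     * Defaults are PJMEDIA_SND_DEFAULT_REC_LATENCY and
     * PJMEDIA_SND_DEFAULT_PLAY_LATENCY. */
    media_cfg.snd_rec_latency  = 60;   /* illustrative value */
    media_cfg.snd_play_latency = 80;   /* illustrative value */

    pjsua_init(&cfg, &log_cfg, &media_cfg);
}

static void set_latency_at_runtime(void)
{
    /* Alternatively, adjust the latencies via the audio device
     * capabilities; this takes effect when the device is reopened. */
    unsigned input_latency  = 60;  /* ms, illustrative */
    unsigned output_latency = 80;  /* ms, illustrative */

    pjsua_snd_set_setting(PJMEDIA_AUD_DEV_CAP_INPUT_LATENCY,
                          &input_latency, PJ_TRUE);
    pjsua_snd_set_setting(PJMEDIA_AUD_DEV_CAP_OUTPUT_LATENCY,
                          &output_latency, PJ_TRUE);
}
```

After changing the values, re-test with pjsystest to confirm that stability (no audio breakups) is preserved.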
The default buffer size settings have proven to give a good balance between stability and latency, but you can experiment with other values to improve latency.
In our experience, the performance (latency and burst/jitter characteristics) of a sound device is pretty much a given, and the only way to change it is to change the driver type. If the platform offers more than one sound device implementation (for example, the Android platform has JNI, OpenSL, and Oboe implementations), experiment with each and use the one whose performance is most suitable for your application.
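To compare the available backends, one approach is to enumerate the sound devices that the PJSIP build exposes and select each in turn. A minimal sketch, assuming pjsua is already initialized (the device indices and the choice of device 0 are illustrative):

```c
/* Sketch: enumerate available sound devices and select one by index,
 * so different driver backends can be compared. */
#include <pjsua-lib/pjsua.h>

static void pick_sound_device(void)
{
    pjmedia_aud_dev_info info[64];
    unsigned i, count = PJ_ARRAY_SIZE(info);

    if (pjsua_enum_aud_devs(info, &count) != PJ_SUCCESS)
        return;

    for (i = 0; i < count; ++i) {
        PJ_LOG(3, ("app", "Dev %u: %s (driver: %s)",
                   i, info[i].name, info[i].driver));
    }

    /* Select capture and playback device by index; 0 is illustrative. */
    pjsua_set_snd_dev(0, 0);
}
```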
Codec latency
Codec latency is determined by the codec algorithm and its ptime, but usually it shouldn't add too much latency, around 10 to 30 ms.
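If you want to reduce the packetization part of this latency, a smaller ptime can be requested through the media config, assuming the codec supports it. A sketch (the 10 ms value is an illustrative assumption; smaller frames also increase packet overhead):

```c
/* Sketch: requesting a smaller default ptime before pjsua_init().
 * The codec must support the requested frame length. */
pjsua_media_config media_cfg;
pjsua_media_config_default(&media_cfg);
media_cfg.ptime = 10;   /* default ptime in ms; illustrative value */
```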
Use audio switchboard
The conference bridge adds significant buffering to accommodate jitter/bursts from the sound device. See Sound device timing problem for more information.
If conferencing is not needed, consider replacing it with the Switchboard.
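The switchboard is selected at compile time. A sketch of the relevant build configuration, assuming the standard config_site.h mechanism (verify the macro against your PJSIP version):

```c
/* In pjlib/include/pj/config_site.h, before building PJSIP:
 * replace the conference bridge with the lightweight switchboard. */
#define PJMEDIA_CONF_USE_SWITCH_BOARD   1
```

Note that with the switchboard, audio can only be routed one-to-one, so this is only appropriate when conferencing is genuinely not needed.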
Choosing a lower audio frame length

Warning

This method is now deprecated. PJSIP now uses the Adaptive Delay Buffer to automatically learn the amount of buffering required to handle bursts. The semantics of PJMEDIA_SOUND_BUFFER_COUNT have changed: it now means the maximum amount of buffering that will be handled by the delay buffer. Lowering the value will not affect latency, may cause unnecessary WSOLA processing (to discard the excessive frames when the buffer is full), and may even produce audio impairments, hence it is no longer recommended.
Optimizing Jitter Buffer Latency
The jitter buffer algorithm is constantly trying to get the best latency for the current jitter conditions, hence usually there is no tuning needed to get better latency.
See Jitter buffer features and operations for more information.
For reference, jitter buffer settings are in pjsua_media_config and pj::MediaConfig (look for settings with the jb prefix).
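For orientation, the sketch below points at the jb-prefixed knobs in pjsua_media_config. The field names are from PJSUA-LIB; the values shown simply keep the defaults, since the jitter buffer normally adapts on its own:

```c
/* Sketch: jitter buffer knobs in pjsua_media_config (PJSUA-LIB).
 * Values are in milliseconds; -1 means "use the default".
 * These rarely need tuning, as the jitter buffer adapts by itself. */
pjsua_media_config media_cfg;
pjsua_media_config_default(&media_cfg);
media_cfg.jb_init    = -1;   /* initial prefetch delay */
media_cfg.jb_min_pre = -1;   /* minimum prefetch delay */
media_cfg.jb_max_pre = -1;   /* maximum prefetch delay */
media_cfg.jb_max     = -1;   /* maximum allowable delay */
```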
Other sources of latency
Other sources of latency include:

- The default resampling algorithm in PJMEDIA adds about 5 ms of latency.
- The AEC may introduce some latency, but we don't know exactly by how much.
- The network latency itself.