Reducing audio latency

Overview

End-to-end audio latency consists of the following components:

  1. The latency of the sound capture.

  2. Codec latency on both sender and receiver.

  3. Other algorithmic latency (such as AEC or sample rate conversion).

  4. Network latency.

  5. Jitter buffering on the receiver end.

  6. The latency of the sound playback.

Optimizing sound device latency

The sound device adds latency in two ways: first, the latency due to the audio playback and recording buffers themselves, and second, the latency due to buffering by pjmedia components (such as the conference bridge) to accommodate the sound device's jitter/burst characteristics (see Sound device timing problem).

Both the latency and the jitter/burst characteristics of the sound device can be analyzed as described in Testing and optimizing audio device with pjsystest.

The sound device buffer sizes can be adjusted via the sound device latency settings in the media configuration.

The default buffer sizes have proven to strike a good balance between stability and latency, but you can experiment with lower values to further reduce latency, as sketched below.
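
For illustration only, here is a minimal sketch of lowering these buffers through pjsua_media_config. The snd_rec_latency and snd_play_latency field names (in milliseconds) and the default macros mentioned in the comments are assumptions to verify against the pjsua.h and pjmedia/config.h of your PJSIP version:

    /* Minimal sketch (not a drop-in recipe): lower the sound device
     * capture/playback buffers via pjsua_media_config. The field names
     * snd_rec_latency and snd_play_latency (milliseconds) are assumptions;
     * verify against your version of pjsua.h. */
    #include <pjsua-lib/pjsua.h>

    static pj_status_t init_with_lower_snd_latency(void)
    {
        pjsua_config ua_cfg;
        pjsua_media_config media_cfg;
        pj_status_t status;

        status = pjsua_create();
        if (status != PJ_SUCCESS)
            return status;

        pjsua_config_default(&ua_cfg);
        pjsua_media_config_default(&media_cfg);

        /* Smaller buffers reduce latency but make drop-outs more likely;
         * tune per device and re-test with pjsystest. The defaults are
         * larger (see PJMEDIA_SND_DEFAULT_REC_LATENCY and
         * PJMEDIA_SND_DEFAULT_PLAY_LATENCY if present in your version). */
        media_cfg.snd_rec_latency  = 60;
        media_cfg.snd_play_latency = 100;

        return pjsua_init(&ua_cfg, NULL, &media_cfg);
    }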

In our experience, the performance (latency and burst/jitter characteristics) of a sound device is pretty much fixed, and the only way to change it is to switch the driver type. If the platform offers more than one sound device implementation (for example, Android provides JNI, OpenSL, and Oboe backends), experiment with each and use the one whose performance best suits your application.
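
On Android, for example, the backend is selected at build time. A hypothetical config_site.h fragment is sketched below; the macro names are assumptions based on the pjmedia audio device backends and should be checked against your PJSIP version:

    /* Hypothetical config_site.h fragment: enable exactly one Android audio
     * backend, rebuild, and compare latency/burst figures with pjsystest.
     * Macro names are assumptions -- verify against pjmedia/config.h. */
    #define PJMEDIA_AUDIO_DEV_HAS_OBOE          1
    #define PJMEDIA_AUDIO_DEV_HAS_OPENSL        0
    #define PJMEDIA_AUDIO_DEV_HAS_ANDROID_JNI   0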

Codec latency

Codec latency is determined by the codec algorithm and its ptime; it usually adds only about 10 to 30 ms.
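
Where latency matters more than bandwidth, a shorter codec frame can shave a few milliseconds. The sketch below assumes a ptime field exists in pjsua_media_config; verify the name and semantics against your version before relying on it:

    /* Sketch: ask codecs to use shorter frames. A smaller ptime lowers the
     * per-frame codec delay but increases packet rate and header overhead.
     * The `ptime` field name is an assumption -- verify against pjsua.h. */
    #include <pjsua-lib/pjsua.h>

    static void use_short_codec_frames(pjsua_media_config *media_cfg)
    {
        media_cfg->ptime = 20;   /* codec frame length in milliseconds */
    }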

Use audio switchboard

The conference bridge adds significant buffering to accommodate jitter/bursts from the sound device. See Sound device timing problem for more information.

If conferencing is not needed, consider replacing the conference bridge with the switchboard.
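
The switchboard is selected at compile time. A hedged config_site.h fragment is shown below; PJMEDIA_CONF_USE_SWITCH_BOARD is the macro we believe controls this, but verify it against your version:

    /* Hypothetical config_site.h fragment: replace the conference bridge
     * with the audio switchboard. The switchboard avoids the bridge's extra
     * buffering but only supports simple, non-mixing call scenarios.
     * Verify the macro name against pjmedia/config.h. */
    #define PJMEDIA_CONF_USE_SWITCH_BOARD   1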

Choosing lower audio frame length

Warning

This method is now deprecated.

PJSIP now uses the Adaptive Delay Buffer to automatically learn the amount of buffering required to handle bursts. The semantics of PJMEDIA_SOUND_BUFFER_COUNT have changed: it now specifies the maximum amount of buffering that the delay buffer will handle. Lowering the value will not reduce latency; it may cause unnecessary WSOLA processing (to discard the excess frames when the buffer is full) and may even produce audio impairments, hence it is no longer recommended.

Optimizing Jitter Buffer Latency

The jitter buffer algorithm continuously adapts to achieve the best latency for the current jitter conditions, so usually no tuning is needed to improve latency.

See Jitter buffer features and operations for more information.

For reference, the jitter buffer settings are in pjsua_media_config and pj::MediaConfig (look for settings with the jb prefix).
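
If you do need to constrain the jitter buffer, a sketch using the jb fields of pjsua_media_config is shown below; the exact field names and units (milliseconds) are assumptions to verify against your version:

    /* Sketch: constrain the adaptive jitter buffer. Values are assumed to be
     * in milliseconds; too-tight limits trade latency for more discarded or
     * lost frames. Field names are assumptions -- verify against pjsua.h. */
    #include <pjsua-lib/pjsua.h>

    static void constrain_jitter_buffer(pjsua_media_config *media_cfg)
    {
        media_cfg->jb_init    = 40;    /* initial prefetch delay */
        media_cfg->jb_min_pre = 20;    /* minimum prefetch delay */
        media_cfg->jb_max_pre = 80;    /* maximum prefetch delay */
        media_cfg->jb_max     = 200;   /* hard cap on jitter buffer delay */
    }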

Other sources of latency

Other sources of latency include:

  • The default resampling algorithm in PJMEDIA adds about 5 ms of latency.

  • The AEC may introduce some latency, but we don’t know exactly by how much.

  • The network latency itself.
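
Where these matter, one hedged mitigation is to match the sound device clock rate to the conference bridge clock rate so that no resampling is needed, and to keep the AEC tail short. The field names used below (clock_rate, snd_clock_rate, ec_tail_len) are assumptions to verify against your version:

    /* Sketch: avoid the resampler entirely by matching clock rates, and keep
     * the echo canceller tail short. Field names are assumptions -- verify
     * against pjsua.h; a too-short AEC tail may leave residual echo. */
    #include <pjsua-lib/pjsua.h>

    static void trim_other_latency(pjsua_media_config *media_cfg)
    {
        media_cfg->clock_rate     = 16000;  /* conference bridge clock rate */
        media_cfg->snd_clock_rate = 16000;  /* same rate => no resampling */
        media_cfg->ec_tail_len    = 100;    /* AEC tail length in ms */
    }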