Adaptive delay buffer with high-quality time-scale modification.

This section describes PJMEDIA’s implementation of the delay buffer. A delay buffer works much like a fixed jitter buffer: it delays frame retrieval by some interval so that the caller gets a continuous stream of frames from the buffer. This is useful when put() and get() operations are not evenly interleaved, for example when the caller performs a burst of put() operations followed by a burst of get() operations. The delay buffer stores the burst of frames so that get() operations always find a frame in the buffer (assuming that the numbers of get() and put() operations match).
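The effect of the delay can be illustrated with a small, self-contained simulation. This is a sketch of the idea only, not PJMEDIA code: count_underruns and its prefill parameter are hypothetical names, and the delay is modeled simply as a number of frames already held in the buffer before the trace starts. Each 'P' in the trace stands for a put() and each 'G' for a get().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PJMEDIA.
 * Replays a trace of operations ('P' = put, 'G' = get) against a buffer
 * that already holds `prefill` frames, and counts how many get()
 * operations find the buffer empty. */
unsigned count_underruns(const char *ops, unsigned prefill)
{
    unsigned level = prefill;   /* frames currently buffered */
    unsigned underruns = 0;
    size_t i;

    assert(ops != NULL);

    for (i = 0; ops[i] != '\0'; ++i) {
        if (ops[i] == 'P') {
            ++level;
        } else if (ops[i] == 'G') {
            if (level > 0)
                --level;
            else
                ++underruns;    /* nothing to hand out */
        }
    }
    return underruns;
}
```

With the trace "GGPPGGPP" (bursts of two gets followed by two puts), a buffer that starts empty reports two underruns, while a buffer that starts with two frames reports none.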

The buffer is adaptive: it continuously learns the optimal delay to apply to the audio flow at run-time. Once the optimal delay has been learned, the delay buffer applies it to the audio flow, expanding or shrinking the audio samples as necessary when the number of audio samples in the buffer is too low or too high. It does this without distorting the audio quality, by using PJMED_WSOLA.
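One simple way to picture the learning step is to watch the stream of put() and get() calls and remember the longest burst seen so far. The sketch below does exactly that. It is an illustrative assumption, not a description of the algorithm PJMEDIA actually uses, and learn_delay_frames is a hypothetical name. WSOLA-based expanding and shrinking is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PJMEDIA.
 * Scans a trace of operations ('P' = put, 'G' = get) and returns the
 * length of the longest run of identical consecutive operations.
 * The result serves as a rough estimate, in frames, of the delay
 * needed to ride out the bursts seen in the trace. */
unsigned learn_delay_frames(const char *ops)
{
    unsigned longest = 0;
    unsigned run = 0;
    char last = '\0';
    size_t i;

    assert(ops != NULL);

    for (i = 0; ops[i] != '\0'; ++i) {
        if (ops[i] != 'P' && ops[i] != 'G')
            continue;           /* ignore anything that is not an operation */
        run = (ops[i] == last) ? run + 1 : 1;
        last = ops[i];
        if (run > longest)
            longest = run;
    }
    return longest;
}
```

For the trace "PPPGGGPPGG" the function returns 3, the length of the longest burst. A strictly alternating trace such as "PGPG" yields 1.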

The delay buffer is used in Sound Device Port, Media channel splitter/combiner, and Conference Bridge.


typedef struct pjmedia_delay_buf pjmedia_delay_buf

Opaque declaration for delay buffer.


enum pjmedia_delay_buf_flag

Delay buffer options.



PJMEDIA_DELAY_BUF_SIMPLE_FIFO

Use simple FIFO mechanism for the delay buffer, i.e. without WSOLA for expanding and shrinking audio samples.


pj_status_t pjmedia_delay_buf_create(pj_pool_t *pool, const char *name, unsigned clock_rate, unsigned samples_per_frame, unsigned channel_count, unsigned max_delay, unsigned options, pjmedia_delay_buf **p_b)

Create the delay buffer. Once created, the delay buffer enters the learning state unless the delay argument is specified, in which case it enters the running state directly.

  • pool – Pool where the delay buffer will be allocated from.

  • name – Optional name for the buffer for log identification.

  • clock_rate – Number of samples processed per second.

  • samples_per_frame – Number of samples per frame.

  • channel_count – Number of channels per frame.

  • max_delay – Maximum delay to be accommodated, in ms. If this value is negative or less than one frame time, the default maximum delay of 400 ms is used.

  • options – Options. If PJMEDIA_DELAY_BUF_SIMPLE_FIFO is specified, then a simple FIFO mechanism will be used instead of the adaptive implementation (which uses WSOLA to expand or shrink audio samples). See pjmedia_delay_buf_flag for other options.

  • p_b – Pointer to receive the delay buffer instance.


PJ_SUCCESS if the delay buffer has been created successfully, otherwise the appropriate error will be returned.
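The max_delay fallback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not PJMEDIA's code: effective_max_delay_ms and DEFAULT_MAX_DELAY_MS are hypothetical names, and the frame time is computed on the assumption that samples_per_frame counts the samples of all channels together. Since the parameter is unsigned, only the "less than one frame time" case is modeled.

```c
#include <assert.h>

#define DEFAULT_MAX_DELAY_MS 400u   /* default stated in the text above */

/* Hypothetical helper, not part of PJMEDIA.
 * Returns the maximum delay, in ms, that would be used for the given
 * audio format and requested max_delay. */
unsigned effective_max_delay_ms(unsigned clock_rate,
                                unsigned samples_per_frame,
                                unsigned channel_count,
                                unsigned max_delay_ms)
{
    unsigned frame_ms;

    assert(clock_rate > 0 && channel_count > 0);

    /* Duration of one frame in ms, assuming samples_per_frame counts
     * the samples of all channels together. */
    frame_ms = samples_per_frame * 1000u / clock_rate / channel_count;

    if (max_delay_ms < frame_ms)
        return DEFAULT_MAX_DELAY_MS;   /* too small: use the default */
    return max_delay_ms;
}
```

With a clock rate of 8000, 160 samples per frame and one channel, a frame lasts 20 ms, so a requested max_delay of 10 falls back to 400 while a request of 200 is kept as is.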

pj_status_t pjmedia_delay_buf_put(pjmedia_delay_buf *b, pj_int16_t frame[])

Put one frame into the buffer.

  • b – The delay buffer.

  • frame – Frame to be put into the buffer. The frame must be samples_per_frame samples long.


PJ_SUCCESS if the frame was put successfully. PJ_EPENDING if the buffer is still in the learning state. PJ_ETOOMANY if the number of frames would exceed the maximum delay level, in which case the new frame overwrites the oldest frame in the buffer.

pj_status_t pjmedia_delay_buf_get(pjmedia_delay_buf *b, pj_int16_t frame[])

Get one frame from the buffer.

  • b – The delay buffer.

  • frame – Buffer to receive the frame from the delay buffer.


PJ_SUCCESS if the frame has been copied successfully. PJ_EPENDING if no frame is available, either because the buffer is still in the learning state or because the buffer is empty during the running state. On non-successful return, the frame is filled with zeroes.
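The overflow and underflow behaviour of put() and get() described above can be modeled with a small ring buffer. Everything in this sketch is hypothetical (frame_fifo, fifo_put, fifo_get, the ST_* codes and the tiny sizes); it mimics only the documented outcomes, namely that an overflowing put discards the oldest frame and that a failed get returns a zero-filled frame. The learning state is not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Everything below is a hypothetical model, not PJMEDIA code. */
enum { FRAME_LEN = 4, MAX_FRAMES = 3 };

typedef enum { ST_OK, ST_PENDING, ST_TOO_MANY } status_t;

typedef struct {
    int16_t  data[MAX_FRAMES][FRAME_LEN];
    unsigned head;    /* index of the oldest frame */
    unsigned count;   /* number of frames stored */
} frame_fifo;

/* Store one frame. When the FIFO is full, the oldest frame is dropped
 * to make room and ST_TOO_MANY is returned. */
status_t fifo_put(frame_fifo *f, const int16_t frame[FRAME_LEN])
{
    status_t st = ST_OK;
    unsigned tail;

    assert(f != NULL && frame != NULL);

    if (f->count == MAX_FRAMES) {
        f->head = (f->head + 1) % MAX_FRAMES;   /* discard the oldest */
        f->count--;
        st = ST_TOO_MANY;
    }
    tail = (f->head + f->count) % MAX_FRAMES;
    memcpy(f->data[tail], frame, sizeof f->data[tail]);
    f->count++;
    return st;
}

/* Fetch the oldest frame. When the FIFO is empty, the output is filled
 * with zeroes (silence) and ST_PENDING is returned. */
status_t fifo_get(frame_fifo *f, int16_t frame[FRAME_LEN])
{
    assert(f != NULL && frame != NULL);

    if (f->count == 0) {
        memset(frame, 0, sizeof(int16_t) * FRAME_LEN);
        return ST_PENDING;
    }
    memcpy(frame, f->data[f->head], sizeof f->data[f->head]);
    f->head = (f->head + 1) % MAX_FRAMES;
    f->count--;
    return ST_OK;
}
```

A zero-initialized frame_fifo is a valid empty buffer. Putting four frames into this three-frame model makes the fourth put return ST_TOO_MANY, after which the next get yields the second frame, because the first one was overwritten.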

pj_status_t pjmedia_delay_buf_reset(pjmedia_delay_buf *b)

Reset the delay buffer. This clears the buffer’s content but keeps the learning result.


b – The delay buffer.


PJ_SUCCESS on success or the appropriate error.

pj_status_t pjmedia_delay_buf_destroy(pjmedia_delay_buf *b)

Destroy delay buffer.


b – Delay buffer session.


PJ_SUCCESS normally.