Ping-pong buffer for embedded systems

This post is part of the Memory Control Structures series.

Hello, dear reader. In this article I present a lightweight and efficient memory control structure that is widely used in embedded systems: the ping-pong buffer. The previous articles in this series covered the circular buffer and its derivative, the circular queue, and showed where they help with day-to-day problems on the workbench. The ping-pong buffer is no different. Its data insertion and removal policy is especially helpful in digital signal processing applications, whether non-real-time or real-time where some extra latency is acceptable (e.g., DSP on low-cost processors).

What is the ping-pong buffer?

The ping-pong buffer is not a derivative of the circular buffer (although some implementations use one as a base), but a structure in its own right. Its working policy is a form of “divide and conquer”: the data is split into two groups, the data being captured and the data being processed. The idea is to have a memory area logically divided into two equal parts, with two independent access channels (the so-called switches). The “Ping” channel receives a stream of bytes produced by a source (e.g., samples from an A/D converter) until its memory space is completely filled. The “Pong” channel reads the memory previously filled by the “Ping” channel and performs the relevant processing. Let’s look at the figure below:

Figure 1: Ping-pong buffer work policy.

Figure 1 illustrates the working policy just described. When the “Ping” channel is full and the “Pong” channel is empty (a condition that is usually managed by the user), a synchronization takes place and the channels swap roles: “Pong” now points to the memory that “Ping” has just filled, with new data available for processing, while “Ping” points to the memory that “Pong” has finished reading. Meanwhile, the new “Ping” channel takes care of filling its memory with new data. This strategy saves CPU time, since the application task only becomes active when a full block of new data is available. The price is a delay line whose length is the reserved memory size divided by the rate at which the buffer is filled.
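To make this policy concrete, here is a minimal sketch in C. It is only an illustration of the idea, not the implementation presented later in this article, and the names pp_area_t, pp_write_half, pp_read_half, pp_swap and HALF_SIZE are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HALF_SIZE 8u /* bytes per half; hypothetical value for illustration */

/* One contiguous memory area, logically divided into two equal halves. */
typedef struct {
    uint8_t mem[2u * HALF_SIZE];
    uint8_t ping; /* index (0 or 1) of the half currently being filled */
} pp_area_t;

/* "Ping" channel: the half the data source is currently writing. */
static uint8_t *pp_write_half(pp_area_t *a)
{
    return &a->mem[(size_t)a->ping * HALF_SIZE];
}

/* "Pong" channel: the other half, which the application reads. */
static uint8_t *pp_read_half(pp_area_t *a)
{
    return &a->mem[(size_t)(a->ping ^ 1u) * HALF_SIZE];
}

/* Synchronization point: the two halves exchange roles. */
static void pp_swap(pp_area_t *a)
{
    a->ping ^= 1u;
}
```

Note that no data is copied during the swap. Only the index changes, so the cost of the synchronization does not depend on the block size.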

See the figure below, which compares sample-by-sample processing of data from an A/D converter with block processing using a ping-pong buffer:

Figure 2: Comparison of sample-by-sample processing of an A/D converter against block processing using a ping-pong buffer.

The figure above shows two ways of processing a signal arriving from an A/D converter. In case 1, on the left, an interrupt routine reads the converter's data register and writes the sample to a previously allocated region, and the application has to process it before the next sample arrives to avoid distortion. The task therefore runs on every A/D capture. In case 2, on the right, an automatic memory transfer peripheral captures the samples and writes them to a previously allocated memory area, and an interrupt is generated only when that area is full. This greatly reduces CPU usage, because the application runs only on the buffer-full event, which occurs far less often than one interrupt per A/D conversion.
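The trade-off between CPU load and latency can be estimated with simple arithmetic. The sketch below assumes a fixed sampling rate and a block size counted in samples; the function names and the figures used in the note that follows are hypothetical and serve only as an illustration.

```c
#include <stdint.h>

/* Time needed to fill one block, in microseconds: N / fs. */
static uint32_t block_fill_time_us(uint32_t block_samples, uint32_t sample_rate_hz)
{
    return (uint32_t)(((uint64_t)block_samples * 1000000u) / sample_rate_hz);
}

/* How many times per second the application is woken: fs / N (rounded down). */
static uint32_t wakeups_per_second(uint32_t block_samples, uint32_t sample_rate_hz)
{
    return sample_rate_hz / block_samples;
}
```

With an assumed block of 256 samples at 8 kHz, the application wakes about 31 times per second instead of 8000, at the cost of 32 ms of added latency.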

This A/D use case is nice, but how does it connect to the ping-pong buffer?

Let’s go to my favorite part of these articles: practical use. The great advantage of the ping-pong buffer is that it can optimize both use cases from the example above. In case 1, the ping-pong buffer serves as the memory area that collects multiple samples. The interrupt routine becomes light: it only inserts the sample into the buffer and, when the buffer is full, sends a signal that wakes the application, which in turn swaps the channels and processes the new block while the other one fills up. This approach is well suited to low-cost microcontrollers that lack an automatic memory transfer peripheral such as a DMA controller. I have written an article that demonstrates case 1 optimized with this technique.
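Case 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the article mentioned above: adc_isr stands in for the real A/D interrupt handler, app_take_block for the application task, and BLOCK_SAMPLES is an arbitrary block size. All of these names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SAMPLES 4u /* hypothetical block size, kept small for illustration */

static uint16_t samples[2][BLOCK_SAMPLES]; /* two halves: one fills, one is processed */
static volatile uint8_t fill_half = 0;     /* half currently filled by the ISR ("Ping") */
static volatile size_t fill_count = 0;     /* samples already stored in that half */
static volatile bool block_ready = false;  /* set by the ISR, consumed by the application */

/* Hypothetical A/D interrupt handler: store one sample and return quickly.
 * Returns false when the sample had to be dropped. */
static bool adc_isr(uint16_t sample)
{
    if (fill_count >= BLOCK_SAMPLES) {
        return false; /* the application has not swapped yet: overrun */
    }
    samples[fill_half][fill_count++] = sample;
    if (fill_count == BLOCK_SAMPLES) {
        block_ready = true; /* wake the application */
    }
    return true;
}

/* Application side: when a block is ready, swap the channels and return
 * the filled half for processing; otherwise return NULL. */
static const uint16_t *app_take_block(void)
{
    if (!block_ready) {
        return NULL;
    }
    const uint16_t *full = samples[fill_half];
    fill_half ^= 1u; /* the ISR now fills the other half */
    fill_count = 0;
    block_ready = false;
    return full;
}
```

On a real target the swap inside app_take_block would probably need to run with the A/D interrupt masked, so that a sample arriving in the middle of the swap cannot corrupt the indices.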

The second use case turns out to be even more efficient, because with DMA the A/D interrupt routine is no longer necessary. The DMA receives the memory address where the data will be deposited and, on each new A/D sample, automatically copies it to the “Ping” channel. When that channel is full, an interrupt is generated and the data becomes available for processing. The access channels are then swapped and the new memory address is handed to the DMA, which takes care of filling the new block. The figure below illustrates what happens:

Figure 3: Ping-pong buffer and DMA.
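The DMA flow can be sketched as shown below. Since DMA programming is specific to each microcontroller, dma_start is a hypothetical stand-in that only records the destination address; a real port would write to the DMA controller registers instead. The other names (capture_start, dma_complete_isr, BLOCK_BYTES) are also assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_BYTES 16u /* hypothetical block size */

static uint8_t dma_mem[2u * BLOCK_BYTES]; /* Ping and Pong halves */
static uint8_t dma_half = 0;              /* half currently owned by the DMA */
static uint8_t *dma_target = NULL;        /* stand-in for the DMA destination register */

/* Hypothetical driver call: a real port would program the DMA controller here. */
static void dma_start(uint8_t *dst, size_t len)
{
    (void)len;
    dma_target = dst;
}

/* Start the first transfer into the first half. */
static void capture_start(void)
{
    dma_half = 0;
    dma_start(&dma_mem[0], BLOCK_BYTES);
}

/* Hypothetical "transfer complete" interrupt handler: re-arm the DMA on the
 * other half, then return the half that has just been filled. */
static uint8_t *dma_complete_isr(void)
{
    uint8_t *filled = &dma_mem[(size_t)dma_half * BLOCK_BYTES];
    dma_half ^= 1u;
    dma_start(&dma_mem[(size_t)dma_half * BLOCK_BYTES], BLOCK_BYTES);
    return filled;
}
```

Re-arming the DMA before the application touches the filled half keeps the capture running while the block is processed, which is the whole point of the structure.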

A third case worth mentioning is the handling of images for display controllers (such as the popular TFT displays). With a ping-pong buffer, an already processed framebuffer can be sent to the display through the “Pong” channel while the processor uses the “Ping” channel for operations such as drawing or repositioning objects on the screen. The same applies to a D/A converter used for signal generation. The figure below illustrates this case:

Figure 4: Example of a ping-pong buffer used for image handling.
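For the display case, the same idea is often written with two pointers instead of an index. The sketch below assumes a tiny 64-pixel framebuffer with one byte per pixel; frame_render, frame_present and FB_PIXELS are hypothetical names, and a real driver would hand fb_show to the display controller.

```c
#include <stdint.h>
#include <string.h>

#define FB_PIXELS 64u /* hypothetical 8x8 display, one byte per pixel */

static uint8_t fb_mem[2][FB_PIXELS];
static uint8_t *fb_draw = fb_mem[0]; /* "Ping": the processor draws here */
static uint8_t *fb_show = fb_mem[1]; /* "Pong": the display controller reads here */

/* Draw the next frame into the hidden buffer (here, a solid fill). */
static void frame_render(uint8_t color)
{
    memset(fb_draw, color, FB_PIXELS);
}

/* Exchange the two buffers once the frame is complete. A real driver would
 * typically do this during vertical blanking to avoid tearing. */
static void frame_present(void)
{
    uint8_t *tmp = fb_show;
    fb_show = fb_draw;
    fb_draw = tmp;
}
```

The display never sees a half-drawn frame, because drawing always happens in the buffer that is not being shown.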

Ping-pong buffer basic implementation

In this example implementation I tried to be a little more general, providing the classic macro that declares a ping-pong buffer properly initialized and ready for use. The insert and remove routines automatically manage which part of the reserved memory the “Ping” and “Pong” channels access. The ppbuf_get_full_signal function is responsible for the synchronization. When it returns true, the calling application has the option of consuming the event; if it does, the “Ping” and “Pong” channels are swapped and a new cycle can begin.

Let’s look at the interface of our Ping-pong Buffer:

/**
 * @brief simple ping-pong buffer implementation
 */

#ifndef PING_PONG_BUFFER_H_
#define PING_PONG_BUFFER_H_

#include <stdbool.h>

/* ping pong buffer control structure */
typedef struct {
	unsigned char *buffer_data;
	unsigned char ping;
	unsigned char pong;
	int buffer_size;
	int put_index;
	int get_index;
	bool full_signal;
}ppbuf_t;

/**
 * @brief insert on active buffer
 */
int ppbuf_insert_active(ppbuf_t *p, void *data, int size);

/**
 * @brief remove from inactive buffer
 */
int ppbuf_remove_inactive(ppbuf_t *p, void *data, int size);

/**
 * @brief USE WITH DMA ONLY! get the current active buffer address
 */
unsigned char *ppbuf_dma_get_active_addr(ppbuf_t* p, int *size);

/**
 * @brief USE WITH DMA ONLY! get the current inactive buffer address
 */
unsigned char *ppbuf_dma_get_inactive_addr(ppbuf_t* p, int *size);


/**
 * @brief USE WITH DMA ONLY! swap the buffers so that the next one becomes available
 */
int ppbuf_dma_force_swap(ppbuf_t* p);

/**
 * @brief get full signal
 */
bool ppbuf_get_full_signal(ppbuf_t *p, bool consume);


/* instantiate a fully initialized and static ping-pong buffer */
#define PPBUF_DECLARE(name,size)                                \
		unsigned char ppbuf_mem_##name[(size) * 2] = {0};	\
		ppbuf_t name = {				\
			.buffer_data = &ppbuf_mem_##name[0],	\
			.ping = 1,				\
			.pong = 0,				\
			.buffer_size = (size),			\
			.put_index = 0,				\
			.get_index = 0,				\
			.full_signal = false			\
		}

#endif /* PING_PONG_BUFFER_H_ */

The ppbuf_dma_xxx functions are for DMA use only. Take care when using them, because they return the memory address of the corresponding “Ping” or “Pong” channel to be passed to the DMA, and ppbuf_dma_force_swap switches the channels asynchronously. Let’s look at the implementation:

/**
 * @brief simple ping pong buffer implementation
 */

#include <string.h>
#include "ping_pong_buffer.h"


int ppbuf_insert_active(ppbuf_t *p, void *data, int size){
	int ret = 0;
	unsigned char *ptr;

	if(p == NULL || data == NULL || size <= 0) {
		/* check your parameters */
		ret = -1;
	} else {
		if(size > (p->buffer_size - p->put_index)) {
			/* not enough room for new samples */
			ret = -1;
		} else {
			/* take the current position */
			int mem_position = ((p->ping) * p->buffer_size) + p->put_index;
			ptr = (unsigned char *)p->buffer_data;

			/* copy the contents */
			memcpy(&ptr[mem_position], data, size);

			/* update put index */
			p->put_index += size;
			p->full_signal = (p->put_index >= p->buffer_size?true:false);

			/* the swap only happens when ppbuf_get_full_signal is called */
			ret = 0;
		}
	}
	return(ret);
}

int ppbuf_remove_inactive(ppbuf_t *p, void *data, int size){
	int ret = 0;
	unsigned char *ptr;

	if(p == NULL || data == NULL || size <= 0) {
		/* check your parameters */
		ret = -1;
	} else {
		if(size > (p->buffer_size - p->get_index)) {
			/* not enough data in sample buffer */
			ret = -1;
		} else {
			/* take the current position */
			int mem_position = ((p->pong) * p->buffer_size) + p->get_index;
			ptr = (unsigned char *)p->buffer_data;

			/* copy the contents */
			memcpy(data,&ptr[mem_position], size);

			/* update get index */
			p->get_index += size;

			/* when buffer is empty we are not able to extract anymore data */
			ret = 0;
		}
	}
	return(ret);
}

unsigned char *ppbuf_dma_get_active_addr(ppbuf_t* p, int *size){
	if(p == NULL || size == NULL) {
		/* invalid parameters: return an invalid pointer */
		return(NULL);
	} else {
		/* report the block length so it can be programmed into the DMA */
		*size = p->buffer_size;
		/* the active (insertion) buffer is the ping, as in ppbuf_insert_active */
		return((unsigned char *)&p->buffer_data[p->ping * p->buffer_size]);
	}
}


unsigned char *ppbuf_dma_get_inactive_addr(ppbuf_t* p, int *size){
	if(p == NULL || size == NULL) {
		/* invalid parameters: return an invalid pointer */
		return(NULL);
	} else {
		/* report the block length available for reading */
		*size = p->buffer_size;
		/* the inactive (removal) buffer is the pong, as in ppbuf_remove_inactive */
		return((unsigned char *)&p->buffer_data[p->pong * p->buffer_size]);
	}
}

int ppbuf_dma_force_swap(ppbuf_t* p) {
	int ret = 0;
	/* this function is asynchronous, so it must be used with
	 * caution or buffer corruption will occur
	 */


	if(p == NULL) {
		ret = -1;
	} else {
		/* for safety, the swap occurs only if a full signal is pending */
		if(p->full_signal != false) {

			p->full_signal = false;

			/* swap the buffer switches */
			p->ping = p->ping ^ p->pong;
			p->pong = p->pong ^ p->ping;
			p->ping = p->ping ^ p->pong;

			p->get_index = 0;
			p->put_index = 0;
		}

	}
	return(ret);
}

bool ppbuf_get_full_signal(ppbuf_t *p, bool consume) {
	/* take the last signaled full occurrence */
	bool ret = (p != NULL ? p->full_signal : false);

	if((consume != false) && (p != NULL) && (ret != false)) {
		p->full_signal = false;

		/* swap the buffer switches */
		p->ping = p->ping ^ p->pong;
		p->pong = p->pong ^ p->ping;
		p->ping = p->ping ^ p->pong;

		/* resets the buffer position */
		p->get_index = 0;
		p->put_index = 0;
	}

	return(ret);
}

There is little to say about the implementation: it consists of memory positioning operations. Once again, the efficiency of data copying on processors without DMA depends on the internal implementation of the memcpy function, which is generally optimized for the toolchain or the target architecture.

Conclusion

The ping-pong buffer is a lightweight and efficient structure for capturing and processing data almost simultaneously. Its strategy of capturing one block while another is processed makes it well suited to signal and image processing, since it avoids the high rate of periodic events caused by individual samples arriving from (or going to) the hardware. I hope it is another useful tool for the reader. Good luck with your projects!

Links

The files and a simple sample application that shows how to allocate the buffer and insert and retrieve data are available in the GitHub repository.

Felipe
