Felipe

Static Stack – A lightweight framework for embedded systems


This post is part of the Memory Control Structures series; be sure to read the other posts in the series.

Hello, my dear reader. Let's keep talking about small, efficient, statically allocated data structures for use in embedded systems. Today we present another structure derived from our circular buffer: the static stack. As many of you know, the stack is one of the earliest data structures studied in the academic world, and it has the most diverse implementations, from linked lists to dynamic memory allocation. The purpose of this article is to add one more approach to that list: a stack built on a circular buffer, which is easy to manipulate.

What is Static Stack?

The static stack is a variant that uses static allocation to reserve the memory that will store the data, i.e., no functions are needed to reserve memory at run time; all memory used by the structure is reserved at compile time. The static stack, like any stack, has a LIFO data insertion and removal policy, which postulates: "the last data to enter the top of the stack will always be the first to be taken out."

Basically, it is the inverse of our circular queue (and buffer): as the slots are filled they stack up, so that the most recently filled slot will be the first one available for withdrawal. This is exactly the same mechanism the processor uses to save variables and context during the operation of an operating system, through the well-known push and pop instructions. The operation of the stack is illustrated in the figure below:

Figure 1: Stack operation
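
The push/pop policy illustrated above can be sketched with a plain array stack. This is an illustrative minimal version (the names push/pop and the depth are ours, not yet the circular-buffer-based implementation presented later in this article):

```c
#include <assert.h>

#define DEPTH 4
static int stack_mem[DEPTH];
static int top;                 /* number of occupied slots */

/* push: the newest element goes on top */
static int push(int v) {
    if (top >= DEPTH)
        return -1;              /* stack full */
    stack_mem[top++] = v;
    return 0;
}

/* pop: the element most recently pushed comes out first (LIFO) */
static int pop(int *v) {
    if (top == 0)
        return -1;              /* stack empty */
    *v = stack_mem[--top];
    return 0;
}
```

Popping always returns the values in the reverse order of insertion, which is the whole point of the LIFO policy.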

Use Case: Where do I apply a stack in general?

Although the processor already has a stack implemented in hardware, its use is restricted to saving variables and registers, as well as data used by the compiler to manage function calls and context switching, so that stack has a very restricted use. To illustrate the use of a general-purpose stack, consider the window manager of an application that uses a display (the popular TFT); see the figure below:

Figure 2: Use of stack in display

Look: we have three overlapping windows. Now comes the question: which one makes sense to process first? The user's attention is probably on the last opened window, so it must be processed first, but we cannot discard the others. When the most recent window is closed, processing must return to the window opened just before it, and so on, until all windows are processed or closed. Notice that this policy has exactly the form of stack operation.
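
A minimal sketch of that window policy, with hypothetical names (win_open, win_close_active) invented for illustration: window ids are pushed as they open, the manager always processes the id on top, and closing it resumes the previously opened window:

```c
#include <assert.h>

#define MAX_WINDOWS 8

static int win_stack[MAX_WINDOWS];
static int win_top = 0;               /* number of open windows */

/* a new window goes on top and becomes the active one */
static int win_open(int id) {
    if (win_top >= MAX_WINDOWS)
        return -1;
    win_stack[win_top++] = id;
    return 0;
}

/* closes the active window; returns the id that regains focus (-1 if none) */
static int win_close_active(void) {
    if (win_top == 0)
        return -1;
    --win_top;
    return (win_top > 0) ? win_stack[win_top - 1] : -1;
}

static int win_active(void) {
    return (win_top > 0) ? win_stack[win_top - 1] : -1;
}
```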

Therefore, stacks are excellent for applications that need to save previous states temporarily. They can also be used for allocation of fixed-size memory blocks, an efficient and lightweight alternative to the traditional dynamic memory allocator. Less usual uses of the stack are found in automata such as state machines, where the stack stores previous states so that, at convenient moments, they can be popped or checked when an action depends on the previous state(s).
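
The fixed-block allocation idea mentioned above can be sketched as a stack of free block indexes, so that allocating is a pop and freeing is a push, both O(1). Names (blk_alloc, blk_free) and sizes here are illustrative assumptions, not part of this article's module:

```c
#include <assert.h>
#include <stddef.h>

#define BLK_SIZE  32
#define BLK_COUNT 4

static unsigned char pool[BLK_SIZE * BLK_COUNT];
static int free_stack[BLK_COUNT];   /* indexes of free blocks */
static int free_top;

static void blk_init(void) {
    for (free_top = 0; free_top < BLK_COUNT; free_top++)
        free_stack[free_top] = free_top; /* all blocks start free */
}

/* pop a free block index; NULL when the pool is exhausted */
static void *blk_alloc(void) {
    if (free_top == 0)
        return NULL;
    return &pool[free_stack[--free_top] * BLK_SIZE];
}

/* push the block index back onto the free stack */
static void blk_free(void *p) {
    free_stack[free_top++] = (int)(((unsigned char *)p - pool) / BLK_SIZE);
}
```

Because of the LIFO policy, the most recently freed block is the first to be reused, which tends to keep recently touched memory (still warm in cache) in use.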

You spoke of stacks in general. What about the static stack?

The static stack, as already mentioned, is a version that needs no dynamic allocation; all its parameters are set at compile time, which requires the user to know in advance how much memory (or how many slots) to reserve. There are several ways to implement a static stack. The implementation presented here builds on our well-known circular buffer, also gaining the protection against memory overrun that is implicit in circular buffers. Let's look at the interface of our static stack below:

/**
 * @brief simple static stack demo for embedded systems
 */

#ifndef STATIC_STACK_H_
#define STATIC_STACK_H_

#include <stdbool.h>


/* static stack control structure */
typedef struct {
	unsigned char *data_buffer;
	int put_index;
	int get_index;
	int slot_size;
	int noof_elem;
	int slots_used;
}sstack_t;



/**
 * @brief pushes data onto the top of the stack
 */
int sstack_push(sstack_t *s, void *data, int size);

/**
 * @brief pops data from the top of the stack
 */
int sstack_pop(sstack_t *s, void *data, int *size);

/**
 * @brief check if stack is full
 */
bool sstack_full(sstack_t *s);

/**
 * @brief check if stack is empty
 */
bool sstack_empty(sstack_t *s);

/**
 * @brief flush the stack
 */
int sstack_flush(sstack_t *s);

/* declare a fully initialized stack control structure */
#define SSTACK_DECLARE(name, slot_len, elements) 		\
		unsigned char sstack_buffer_##name[(slot_len + sizeof(int)) * elements];	\
		sstack_t name = {				\
			.data_buffer = &sstack_buffer_##name[0],\
			.put_index = 0,				\
			.get_index = elements - 1,		\
			.slot_size = slot_len + sizeof(int),	\
			.noof_elem = elements,			\
			.slots_used = 0				\
		}

#endif /* STATIC_STACK_H_ */

The interface has the same form as those described in previous articles: we define the control structure of our stack and provide data insertion and removal functions in which the whole policy is managed automatically, plus check functions for the full and empty states, to be used in synchronization and during development. We also have the classic macro that declares a stack fully initialized and ready for use. Let's look at the implementation now:

/**
 * @brief simple static stack demo for embedded systems
 */

#include <string.h>
#include "static_stack.h"

int sstack_push(sstack_t *s, void *data, int size){
	int ret = 0;
	unsigned char *ptr;

	if(s == NULL || data == NULL || size == 0) {
		/* check your parameters */
		ret = -1;
	} else {
		int mem_pos = s->put_index * s->slot_size;
		ptr = (unsigned char *)s->data_buffer;

		if(s->slots_used >= s->noof_elem || size + (int)sizeof(size) > s->slot_size) {
			/* stack full or not enough room for data*/
			ret = -1;
		} else {

			/* insert the size of byte stream */
			memcpy(&ptr[mem_pos], &size, sizeof(size));

			/* insert the byte stream on stack memory*/
			memcpy(&ptr[mem_pos + sizeof(size)], data, size);

			/* update memory indexes, a stack has
			 * policy based on LIFO data insertion and
			 * remotion, so both indexes walk, and always
			 * put_index is ahead of get_index by 1 slot
			 */
			s->put_index = (s->put_index + 1) % s->noof_elem;
			s->get_index = (s->put_index + s->noof_elem - 1)
						% s->noof_elem;
			++s->slots_used;
		}
	}

	return(ret);
}

int sstack_pop(sstack_t *s, void *data, int *size){
	int ret = 0;
	unsigned char *ptr;

	if(s == NULL || data == NULL || size == NULL) {
		/* check your parameters */
		ret = -1;
	} else {
		int mem_pos = s->get_index * s->slot_size;
		ptr = (unsigned char *)s->data_buffer;

		if(s->slots_used == 0) {
			/* stack empty...*/
			ret = -1;
		} else {

			/* gets the size of byte stream */
			memcpy(size,&ptr[mem_pos], sizeof(int));

			/* remove the byte stream from stack memory*/
			memcpy(data,&ptr[mem_pos + sizeof(int)], *size);

			/* update memory indexes, a stack has
			 * policy based on LIFO data insertion and
			 * remotion, so both indexes walk, and always
			 * put_index is ahead of get_index by 1 slot
			 */
			s->put_index = (s->put_index + s->noof_elem - 1)
						% s->noof_elem;

			s->get_index = (s->put_index + s->noof_elem - 1)
						% s->noof_elem;
			--s->slots_used;
		}
	}

	return(ret);
}

bool sstack_full(sstack_t *s){
	bool ret = true;
	if(s == NULL) {
		/* check your parameters */
		ret = false;
	} else {
		ret = (s->slots_used >= s->noof_elem? true : false);
	}
	return(ret);
}

bool sstack_empty(sstack_t *s){
	bool ret = true;
	if(s == NULL) {
		/* check your parameters */
		ret = false;
	} else {
		ret = (s->slots_used == 0? true : false);
	}
	return(ret);
}

int sstack_flush(sstack_t *s){
	int ret = 0;
	if(s == NULL) {
		/* check your parameters */
		ret = -1;
	} else {
		s->slots_used = 0;
		s->get_index = s->noof_elem - 1;
		s->put_index = 0;
	}
	return(ret);
}

The LIFO policy is managed by the index updates at the end of sstack_push and sstack_pop. Note that updating the indexes is a bit more involved than in systems using a FIFO policy: the insertion and removal indexes walk in the same direction, but together. The removal index (get_index) always points to the first data available at the top of the stack, while the insertion index (put_index) points to the first free slot (while there is space), so the update of the first depends on the second. On removal, unlike the FIFO method, where we simply increment the indexes, we need to make them "walk" in the opposite direction of the insertion. Thus we introduce the circular decrement operation, which has a special case that needs treatment, namely when put_index holds the value zero; it is handled by adding noof_elem so the index wraps back to the last slot, which is what generates the extra code in the increment and decrement above.
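
The circular decrement can be sketched in isolation. These are illustrative helper names (circ_inc, circ_dec), not part of the module's API; the point is that adding the element count before taking the modulo keeps the dividend non-negative, so the zero case wraps correctly:

```c
#include <assert.h>

/* circular increment: walks forward and wraps to 0 at the end */
static int circ_inc(int i, int n) {
    return (i + 1) % n;
}

/* circular decrement: walks backward; adding n first keeps the
 * dividend non-negative, so i == 0 wraps to n - 1 */
static int circ_dec(int i, int n) {
    return (i + n - 1) % n;
}
```

This formulation avoids relying on the sign of `%` for negative operands, which in C truncates toward zero and would otherwise yield -1 for the zero case.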

Conclusion

In addition to structures with a FIFO policy, we also have structures with a LIFO policy, and in today's article we presented one of the most popular of them, the stack. We also gave the reader a different approach to its implementation, based on something we already know from previous articles, resulting in a lightweight, efficient module protected against memory overrun. In the GitHub repository linked below there is an example application that exercises all the functions and demonstrates the basic use of our static stack. I hope this structure is useful. Good projects!

Links

Access the GitHub repository here, containing the implementation and an example application.


Ping-pong buffer for embedded systems



Hello, dear reader. In this new article I will present a memory control structure that is very common, lightweight, and efficient in embedded systems: the ping-pong buffer. In the last published articles we showed the circular buffer and its derivative, the circular queue, and where they can be useful to solve day-to-day problems at the workbench. Ping-pong is no different. Its data insertion and removal policy helps primarily in applications involving digital signal processing, in non-real time or, latency permitting, in real time (e.g., DSP on low-cost processors).

What is the ping-pong buffer?

The ping-pong buffer is not a derivative of the circular buffer (although some implementations may use one as a base), but rather a new structure. Its work policy follows something like "divide and conquer": the data are divided into two groups, the data being captured and the data being processed. The idea is to have a memory area logically divided into two equal parts, with two independent access channels (the so-called switches). The "Ping" channel has the function of receiving a stream of bytes produced by a source (e.g., samples from an A/D converter) until its memory space is completely filled. The "Pong" channel has the function of reading the half previously filled by the "Ping" channel and performing some relevant processing. Let's look at the figure below:

Figure 1: Ping-pong buffer work policy.

Figure 1 illustrates the work policy we just described: when the "Ping" channel is full and the "Pong" channel is empty (this condition is mostly managed by the user), a synchronization is performed and the channels swap, so the half just filled by "Ping" is handed to "Pong" with the new data available for processing, while the new "Ping" channel takes care of filling the other half with new data. This strategy saves CPU usage, since the application task is only woken up as each new block of data becomes available, generating a delay line proportional to the reserved memory size multiplied by the fill rate of the buffer.
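
A minimal sketch of this swap mechanism (illustrative names only; the article's actual `ppbuf_t` API appears further below) uses a single array split into two halves, with two indexes acting as the switches:

```c
#include <assert.h>

/* one memory area, logically split into two halves ("channels") */
#define HALF 4
static unsigned char mem[HALF * 2];
static int ping = 1;   /* half currently being filled */
static int pong = 0;   /* half currently being read   */
static int put = 0;    /* fill position inside ping   */

/* producer side: returns 1 when the ping half just became full */
static int pp_put(unsigned char byte) {
    mem[ping * HALF + put++] = byte;
    return (put >= HALF);
}

/* synchronization: swap the channel switches and restart the fill position */
static void pp_swap(void) {
    int tmp = ping;
    ping = pong;
    pong = tmp;
    put = 0;
}

/* consumer side: read the i-th byte of the pong half */
static unsigned char pp_get(int i) {
    return mem[pong * HALF + i];
}
```

While the consumer walks the pong half, new samples keep landing in the other half, which is exactly the "capture while processing" behaviour described above.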

See the figure below, comparing sample-by-sample processing from an A/D converter against block processing using a ping-pong buffer:

Figure 2: Comparison of sample-by-sample processing of an A/D converter against block processing using a ping-pong buffer.

In the figure above we have two use cases for processing a signal arriving from an A/D converter. In case 1, on the left, an interrupt routine reads the data register and writes to a previously allocated region, and the application has to process each sample quickly to avoid distortion; the task is executed on every A/D capture. In case 2, on the right, an automatic memory-transfer peripheral captures the samples and writes them to a previously allocated memory area, and an interrupt is generated only when that area is filled, greatly reducing CPU usage: the application executes only on the buffer-full event, which occurs far less often than one interrupt per A/D sample.

Cool, this A/D use case, but how does one thing connect with the other?

Let's go to the part of these articles that I like most: the practical use. The great advantage of this buffer is that it can optimize both use cases of the example above. In case 1, our ping-pong buffer serves as a memory area for collecting multiple samples. The interrupt routine thus becomes light, having only to insert the sample into the buffer and, when the buffer is full, send a signal waking the application, which in turn swaps the channels and processes the new buffer while the other one fills up. This case is perfect on low-cost microcontrollers that lack an automatic memory-transfer peripheral such as a DMA. I wrote this article that demonstrates case 1 optimized with this technique.

The second use case turns out even better, because with DMA the A/D interrupt routine is simply no longer necessary. The DMA receives the memory address where the data will be deposited and, with each new A/D sample, automatically copies it to the "Ping" channel. When that channel fills up, an interrupt is generated, the access channels are inverted so the data become accessible for processing, and the new memory address is sent to the DMA, which takes charge of filling the new block. The figure below illustrates what happens:

Figure 3: Ping-pong buffer and DMA.

A third case worth mentioning is the handling of images for display controllers (the popular TFT). With a ping-pong buffer it is possible to send an already-processed framebuffer through the "Pong" channel to the display, while the processor uses the "Ping" channel to perform operations such as drawing or repositioning objects on the screen. The same applies to using a D/A converter for signal generation; see another figure that illustrates this case:

Figure 4: Ping-pong buffer for image treatment example.

Ping-pong buffer basic implementation

In this example implementation of a ping-pong buffer I tried to be a little broader, providing the classic macro that declares a ping-pong buffer properly initialized and ready for use. The insert and retrieve routines automatically manage where the "Ping" and "Pong" channels access the reserved memory. The ppbuf_get_full_signal function is responsible for the synchronization operation: when its return is true, the caller has the option to consume the event, and when that occurs the Ping and Pong channels are swapped and a new cycle can begin.
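
The "consume" semantics can be sketched in isolation (illustrative names, not the module's actual functions): reading the signal with consume set clears it and performs the channel swap, while a plain read just peeks:

```c
#include <assert.h>
#include <stdbool.h>

static bool full_signal = false;
static int ping = 1, pong = 0;   /* the channel switches */

/* producer side raises the event when its half fills up */
static void signal_full(void) {
    full_signal = true;
}

/* consumer side: peek (consume == false) or consume-and-swap */
static bool get_full_signal(bool consume) {
    bool ret = full_signal;
    if (consume && ret) {
        int tmp = ping;          /* swap the channel switches */
        ping = pong;
        pong = tmp;
        full_signal = false;     /* event consumed */
    }
    return ret;
}
```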

Let’s look at the interface of our Ping-pong Buffer:

/**
 * @brief simple ping-pong buffer implementation
 */

#ifndef PING_PONG_BUFFER_H_
#define PING_PONG_BUFFER_H_

#include <stdbool.h>

/* ping pong buffer control structure */
typedef struct {
	unsigned char *buffer_data;
	unsigned char ping;
	unsigned char pong;
	int buffer_size;
	int put_index;
	int get_index;
	bool full_signal;
}ppbuf_t;

/**
 * @brief insert on active buffer
 */
int ppbuf_insert_active(ppbuf_t *p, void *data, int size);

/**
 * @brief remove from inactive buffer
 */
int ppbuf_remove_inactive(ppbuf_t *p, void *data, int size);

/**
 * @brief USE WITH DMA ONLY! get the current active buffer address
 */
unsigned char *ppbuf_dma_get_active_addr(ppbuf_t* p, int *size);

/**
 * @brief USE WITH DMA ONLY! get the current inactive buffer address
 */
unsigned char *ppbuf_dma_get_inactive_addr(ppbuf_t* p, int *size);


/**
 * @brief USE WITH DMA ONLY! force full signaling to next buffer become available
 */
int ppbuf_dma_force_swap(ppbuf_t* p);

/**
 * @brief get full signal
 */
bool ppbuf_get_full_signal(ppbuf_t *p, bool consume);


/* instantiate a fully initialized and static ping-pong buffer */
#define PPBUF_DECLARE(name,size)                                \
		unsigned char ppbuf_mem_##name[size * 2] = {0};	\
		ppbuf_t name = {				\
			.buffer_data = &ppbuf_mem_##name[0],	\
			.ping = 1,				\
			.pong = 0,				\
			.buffer_size = size,			\
			.put_index = 0,				\
			.get_index = 0,				\
			.full_signal = false			\
		}

#endif /* PING_PONG_BUFFER_H_ */

The ppbuf_dma_xxx functions are for DMA use only, and some care should be taken with them, because they return the memory address of the corresponding "Ping" or "Pong" channel to be passed to the DMA; the same goes for ppbuf_dma_force_swap, which forces channel switching asynchronously. Let's look at the implementation:

/**
 * @brief simple ping pong buffer implementation
 */

#include <string.h>
#include "ping_pong_buffer.h"


int ppbuf_insert_active(ppbuf_t *p, void *data, int size){
	int ret = 0;
	unsigned char *ptr;

	if(p == NULL || data == NULL || size == 0) {
		/* check your parameters */
		ret = -1;
	} else {
		if(size > (p->buffer_size - p->put_index)) {
			/* not enough room for new samples */
			ret = -1;
		} else {
			/* take the current position */
			int mem_position = ((p->ping) * p->buffer_size) + p->put_index;
			ptr = (unsigned char *)p->buffer_data;

			/* copy the contents */
			memcpy(&ptr[mem_position], data, size);

			/* update put index */
			p->put_index += size;
			p->full_signal = (p->put_index >= p->buffer_size?true:false);

			/* swap will only generated when ppbuf_get_full_signal is called */
			ret = 0;
		}
	}
	return(ret);
}

int ppbuf_remove_inactive(ppbuf_t *p, void *data, int size){
	int ret = 0;
	unsigned char *ptr;

	if(p == NULL || data == NULL || size == 0) {
		/* check your parameters */
		ret = -1;
	} else {
		if(size > (p->buffer_size - p->get_index)) {
			/* not enough data in sample buffer */
			ret = -1;
		} else {
			/* take the current position */
			int mem_position = ((p->pong) * p->buffer_size) + p->get_index;
			ptr = (unsigned char *)p->buffer_data;

			/* copy the contents */
			memcpy(data,&ptr[mem_position], size);

			/* update get index */
			p->get_index += size;

			/* when buffer is empty we are not able to extract anymore data */
			ret = 0;
		}
	}
	return(ret);


}

unsigned char *ppbuf_dma_get_active_addr(ppbuf_t* p, int *size){
	if(p == NULL || size == NULL) {
		/* no valid parameters, return an invalid pointer */
		return(NULL);
	} else {
		/* the active (insertion) buffer is always the ping */
		*size = p->buffer_size;
		return((unsigned char *)&p->buffer_data[p->ping * p->buffer_size]);
	}
}


unsigned char *ppbuf_dma_get_inactive_addr(ppbuf_t* p, int *size){
	if(p == NULL || size == NULL) {
		/* no valid parameters, return an invalid pointer */
		return(NULL);
	} else {
		/* the inactive (processing) buffer is always the pong */
		*size = p->buffer_size;
		return((unsigned char *)&p->buffer_data[p->pong * p->buffer_size]);
	}
}

int ppbuf_dma_force_swap(ppbuf_t* p) {
	int ret = 0;
	/* this function is asynchronous, so it must be used with
	 * caution or a buffer corruption will occur
	 */


	if(p == NULL) {
		ret = -1;
	} else {
		/* for safety, swaps occur only in the presence of a previous full signal */
		if(p->full_signal != false) {

			p->full_signal = false;

			/* swap the buffer switches */
			p->ping = p->ping ^ p->pong;
			p->pong = p->pong ^ p->ping;
			p->ping = p->ping ^ p->pong;

			p->get_index = 0;
			p->put_index = 0;
		}

	}
	return(ret);
}

bool ppbuf_get_full_signal(ppbuf_t *p, bool consume) {
	/* take the last signaled full occurrence */
	bool ret = (p != NULL ? p->full_signal : false);

	if((consume != false) && (p != NULL) && (ret != false)) {
		p->full_signal = false;

		/* swap the buffer switches */
		p->ping = p->ping ^ p->pong;
		p->pong = p->pong ^ p->ping;
		p->ping = p->ping ^ p->pong;

		/* resets the buffer position */
		p->get_index = 0;
		p->put_index = 0;
	}

	return(ret);
}

There is little to say about the implementation: memory-positioning operations, and again, on non-DMA processors the efficiency of the data copy depends on the internal implementation of the memcpy function, generally optimized per compiler or architecture.

Conclusion

The ping-pong buffer is a lightweight and efficient structure for capturing and processing data almost simultaneously. Its strategy of capturing one block of data while another is processed makes it ideal for signal and image processing, avoiding a high frequency of periodic events related to new samples arriving from (or going to) the hardware. I hope it is another useful tool for the reader. Good projects!

Links

Access the GitHub repository here, containing the files and a simple sample application showing how to allocate, insert, and retrieve data.


Circular Queue for embedded systems



Dear reader, how are you? A few days ago we talked about a simple and efficient data structure for temporary storage of byte sequences, usable in communications and data processing: the circular buffer, a resource with a FIFO insertion and removal policy. In this article we will build on the circular buffer and turn it into an even more powerful data structure, the circular queue.

What is the difference between circular buffer and circular queue?

Essentially, they are identical: both have FIFO insertion and removal policies and wrap-around control, i.e., on overflow the write or read index points back to the beginning of the memory allocated for the data. The essential difference lies in the unit of data size. In the circular buffer we can insert a byte stream of any length between 1 and the size of the allocated memory (as long as it is free), just as we can pull out a stream within the same range.

In the same spirit, the circular queue also allows removal of byte streams, but limited in size to a logical subdivision of the memory area known as a slot: a circular queue has a capacity of N slots, where each slot holds up to B bytes. The most attentive reader will see that this points to using the circular queue for storing larger volumes of data, and that it can serve as a simple memory manager with fixed block size, perfect, for example, for ordering complex data types such as a sequence of TCP or UDP datagrams. Look at the figure below:

See in the figure that each row of the table has several fields to fill. In the circular queue approach, each row of this table would have the size of one slot, so insertion copies an entire slot and withdrawal removes an entire row of the table, in the order in which the slots arrive. In the circular buffer, by contrast, we were interested in the order in which individual bytes arrived.

Where could I apply a circular queue?

Dear reader, let's go back to the UART case I quoted in the article on circular buffers. We already know that with them we can make the UART capture incoming bytes and order them for further processing. Now let's imagine that this same UART carries a communication protocol containing several commands, all with the same byte layout. Taking, for example, the well-known MODBUS protocol, a telegram (the standard MODBUS frame) can contain the fields shown below:

Disregarding the Start and End fields (usually handled by hardware), we see that the protocol has a common structure, with the exception of the Data field, which can be variable. Let us assume, to simplify the reasoning, that the maximum size of the Data field is known. Our UART already has its circular buffer to capture the bytes that come between Start and End: it reads from the hardware register and adds them to the circular buffer. However, while a MODBUS packet is being assembled and sent for processing, another packet may already be arriving. Would it not be much better, then, to remove the bytes from the circular buffer and store each ready-to-process message in the order it arrives? This is where the circular queue comes in. With it, each slot holds a complete MODBUS command, ordered from oldest to most recent, allowing the application to worry about just one thing: pulling out the complete message and executing the command. This helps avoid excessive CPU usage and allows empty-queue events to disable the decoder when it is not needed.
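
A minimal sketch of this idea (the names msg_put/msg_get and sizes are illustrative; the article's full `queue_circ_t` API follows below) stores each complete frame in its own fixed-size slot and hands frames back in arrival order:

```c
#include <assert.h>
#include <string.h>

/* each slot holds one complete message of up to SLOT_SIZE bytes */
#define SLOT_SIZE 8
#define SLOTS     4

static unsigned char slots[SLOTS][SLOT_SIZE];
static int lens[SLOTS];          /* bytes actually used in each slot */
static int wr, rd, items;

static int msg_put(const void *frame, int len) {
    if (items >= SLOTS || len > SLOT_SIZE)
        return -1;               /* queue full, or frame too big for a slot */
    memcpy(slots[wr], frame, (size_t)len);
    lens[wr] = len;
    wr = (wr + 1) % SLOTS;       /* circular increment */
    ++items;
    return 0;
}

static int msg_get(void *frame, int *len) {
    if (items == 0)
        return -1;               /* nothing to decode */
    *len = lens[rd];
    memcpy(frame, slots[rd], (size_t)*len);
    rd = (rd + 1) % SLOTS;
    --items;
    return 0;
}
```

The decoder task then deals only in whole frames, oldest first, instead of reassembling byte streams.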

Thus, a circular queue finds great application in managers of complex communication protocols, that is, those that do not deal with raw byte streams, but with pre-formatted messages with a defined layout. In addition, the circular queue can be used for scheduling operating system tasks in the same way as the circular buffer; because of its slot structure, however, it can handle complex data structures such as a Task Control Block (the data structure containing all the information of a task to be executed).

Example of a circular queue

The implementation presented brings the basic routines for creating and manipulating a circular queue. A usage example is also provided to help the user with basic creation, insertion, and removal of data, so that it can then be used in real applications. The supplied module is as architecture-independent as possible, and the performance of its data copies depends on the implementation of the memcpy function, generally optimized, in the LIBC provided with the user's favourite toolchain. Below are the files:

/**
 * @brief Simple circular queue implementation with basic management functions
 */

#ifndef QUEUE_CIRC_H_
#define QUEUE_CIRC_H_


/** circular queue management structure */
typedef struct {
  unsigned char *data;
  unsigned int  items;
  unsigned int  slots_number;
  unsigned int  wr_ptr;
  unsigned int  rd_ptr;
  int  slot_len;
} queue_circ_t;



/**
 * @brief insert a stream of data with the given length into the queue
 */
int queue_insert(queue_circ_t *q,void *data, unsigned int size);

/**
 *  @brief retrieves a stream of data, reporting its size
 */
int queue_retrieve(queue_circ_t *q, void *data, unsigned int *size);

/**
 *  @brief check if queue is already full
 */
int queue_full(queue_circ_t *q);

/**
 * @brief makes the queue empty
 */
int queue_flush(queue_circ_t *q);

/**
 * @brief  check if queue is empty
 */
int queue_empty(queue_circ_t *q);

/** declare an initialized circular queue
 * name - name of the allocated queue structure
 * slot_size - size in bytes of each slot
 * noof_slots - number of slots (elements) in this queue
 */
#define CIRCULAR_QUEUE_DECLARE(name,slot_size, noof_slots)    \
        unsigned char queue_memory_##name[(slot_size + sizeof(unsigned int)) * noof_slots];\
        queue_circ_t name = {                                 \
          .data = &queue_memory_##name[0],                    \
          .items = 0,                                         \
          .slots_number = noof_slots,                         \
          .wr_ptr = 0,                                        \
          .rd_ptr = 0,                                        \
          .slot_len = slot_size + sizeof(unsigned int)        \
        }

#endif

In the file queue_circ.h the user has available all the routines to instantiate and use the queue in a customized way; at the end, a macro (just like in the circular buffer) is also available, so that a totally initialized structure, ready for use, can be declared. In addition, routines are provided to check for a full or empty queue, plus the flush routine, which empties it.

Below we’ll take a look at the implementation (as usual). See:

#include <string.h>
#include "queue_circ.h"


int queue_insert(queue_circ_t *q,void *data, unsigned int size)
{
  int ret = -1;

  if(q == NULL || data == NULL) {
    /* check your queue parameters */
    return(ret);
  } else {

    /* points the insertion index to the correct queue
     * memory region
     */
    unsigned char *ptr = (unsigned char *)q->data;
    int pos = q->wr_ptr * q->slot_len;

    if(q->items >= q->slots_number ||
       size + sizeof(size) > (size_t)q->slot_len) {
      /* queue full, or data will not fit in this slot */
      ret = -1;
    } else {
      /* the first 4 bytes are the size of slot in bytes */
      memcpy(&ptr[pos], &size, sizeof(size));
      /* insert the data */
      memcpy(&ptr[pos + sizeof(size)], data, size);
      /* update the indexes */
      q->wr_ptr = (q->wr_ptr + 1) % q->slots_number;
      ++q->items;
      ret = 0;
    }
  }
  return(ret);
}


int queue_retrieve(queue_circ_t *q,  void *data, unsigned int *size)
{
  int ret = -1;

  if(q == NULL || data == NULL || size == NULL) {
    /* check your queue parameters */
    return(ret);
  } else {

    /* points the removal index to the correct queue
     * memory region
     */
    unsigned char *ptr = (unsigned char *)q->data;
    int pos = q->rd_ptr * q->slot_len;

    if(q->items == 0) {
      /* queue empty, nothing to retrieve */
      ret = -1;
    } else {
      /* the first 4 bytes are the size of the slot contents in bytes */
      memcpy(size, &ptr[pos], sizeof(*size));
      /* retrieve the data */
      memcpy(data, &ptr[pos + sizeof(*size)], *size);
      /* update the indexes */
      q->rd_ptr = (q->rd_ptr + 1) % q->slots_number;
      --q->items;
      ret = 0;
    }
  }
  return(ret);
}


int queue_full(queue_circ_t *q)
{
  int ret = 0;
  if(q != NULL && q->items >= q->slots_number) {
    ret = 1;
  }
  return(ret);
}


int queue_flush(queue_circ_t *q)
{
  int ret = 0;

  if(q != NULL) {
    q->wr_ptr = q->rd_ptr;
    q->items = 0;
  } else {
    ret = -1;
  }
  return(ret);
}


int queue_empty(queue_circ_t *q)
{
  return((q == NULL || q->items == 0) ? 1 : 0);
}

Notice that the implementation of the circular queue is very similar to the circular buffer, including the implicit protection against memory overrun given by the circular increment. For the most attentive reader, the main difference is in how the insertion and removal positions are computed. In the circular buffer we used the indexes to access a byte directly, but the queue is logically subdivided into slots, that is, smaller memory areas combined into a single region. Therefore, for correct access, we multiply the index by the size of each slot, obtaining the beginning of that slot's memory. In addition, 4 bytes are logically reserved in each slot to store the amount of data inserted into it, and when performing a retrieve this byte count is returned, so the user knows how much data was in the accessed slot.
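
The slot addressing just described can be sketched on its own (slot_write/slot_read are hypothetical helper names, not the module's API): each slot starts at index * slot_len, and its first sizeof(int) bytes hold the payload length:

```c
#include <assert.h>
#include <string.h>

#define PAYLOAD   8
#define SLOT_LEN  (PAYLOAD + (int)sizeof(int))  /* 4-byte length prefix */
#define SLOTS     3

static unsigned char qmem[SLOT_LEN * SLOTS];

/* write `len` bytes into slot `idx`, prefixed by the length itself */
static void slot_write(int idx, const void *data, int len) {
    int pos = idx * SLOT_LEN;                 /* start of this slot */
    memcpy(&qmem[pos], &len, sizeof(len));
    memcpy(&qmem[pos + sizeof(len)], data, (size_t)len);
}

/* read back slot `idx`, returning how many bytes it held */
static int slot_read(int idx, void *data) {
    int pos = idx * SLOT_LEN;
    int len;
    memcpy(&len, &qmem[pos], sizeof(len));
    memcpy(data, &qmem[pos + sizeof(len)], (size_t)len);
    return len;
}
```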

Conclusion

The circular buffer is versatile, and its structure can be used to construct more complex data types, as in this circular queue example: a safe and efficient structure for temporary storage, with the addition of an associated fixed memory block to deal with larger and more complex data types. The simple implementation provided in this article meets the needs of communication stacks that process messages extracted from on-chip controllers such as USB, UART, and CAN, and protocols written on top of these interfaces.

I hope you enjoyed it. Until next time!

Example on GitHub


Circular buffer for embedded systems


This post is part of the Memory Control Structures series. Also, read the other posts in the series:

Hello dear reader. In this article we present a very useful resource for embedded software, applicable to the most diverse situations, especially for managing data coming from communication controllers: the Circular Buffer.

Circular buffer, let it sort the data for you

The circular buffer is well known to experienced developers. It is a data structure with a FIFO (first in, first out) insertion and removal policy: the first data to enter is the first to leave.

In practical terms, the data packets inserted into this buffer are ordered by arrival, and removal is done in that same order. The great advantage of the circular buffer is that there is virtually no overflow of the data area reserved for the variables: if the insertion control reaches the end of the data area, it automatically wraps back to the beginning, overwriting the oldest content.

Thus, well-known problems such as processor exceptions, raised when we try to write data beyond the previously allocated memory area, cease to occur. The figure below demonstrates how a canonical circular buffer works.

Figure 1: Circular Buffer

The figure also shows that the insertion and removal controls (indexes) move in the same direction, so that the removal index "chases" the insertion index, and both respect the same rule: when they reach the end of the memory area, they return to the start. Additionally, controls on the amount of written or withdrawn data can be implemented to prevent content loss.

Interesting, but where can I use the circular buffer?

The circular buffer behaviour is ideal for implementing any statically allocated data structure that behaves like a FIFO. For example, mailboxes and queues can be implemented using the circular buffer as their core.

In my articles on DSP and digital control, you, reader, will see that filters and difference equations are implemented as circular buffers. The applications do not stop there: circular buffers can also be used for temporary storage of data coming from communication drivers. The best example is the reception of data from a UART, as shown in the figure below:

Figure 2: UART Circular Buffer

With the circular buffer, we let the UART do only what it should do: receive data. Each byte received is placed in the circular buffer, so applications that need UART data can access it immediately, or wait for the buffer to fill and read the data later. In addition, this approach preserves the arrival order of the data: data that arrives first is processed first.

I want an example of a Circular Buffer!

The idea of this article is to solve a problem by presenting a resource for it; a survey of the best possible circular buffer implementations is beyond its scope. In the files buffer_circ.c and .h the reader will find a basic implementation of the circular buffer. It contains routines for insertion, removal, and full or empty buffer checking, and also provides a macro to declare an initialized buffer structure. See the files:

/**
 * @brief Simple circular buffer implementation with basic management functions
 */
 
#ifndef __BUFFER_CIRC_H
#define __BUFFER_CIRC_H
 
#define BUFFER_SIZE 128
 
/** circular buffer management structure */
typedef struct {
  unsigned char data[BUFFER_SIZE];
  unsigned int  items;
  unsigned int  wr_ptr;
  unsigned int  rd_ptr;
} buffer_circ_t;
 
/**
 * @brief insert a data stream of the given length into the buffer
 */
int buffer_insert(buffer_circ_t *b,void *data, unsigned int size);
 
/**
 *  @brief retrieves a data stream of the specified size
 */
int buffer_retrieve(buffer_circ_t *b, void *data, unsigned int size);
 
/**
 *  @brief check if buffer is already full
 */
int buffer_full(buffer_circ_t *b);
 
/**
 * @brief check if a data stream of the specified size would fill the buffer
 */
int buffer_will_full(buffer_circ_t *b, unsigned int size);
 
/**
 * @brief makes the buffer empty
 */
int buffer_flush(buffer_circ_t *b);
 
/**
 * @brief  check if buffer is empty
 */
int buffer_empty(buffer_circ_t *b);
 
/** declares an initialized circular buffer */
#define CIRCULAR_BUFFER_DECLARE(name)  \
        buffer_circ_t name = {         \
          .data = {0},                 \
          .items = 0,                  \
          .wr_ptr = 0,                 \
          .rd_ptr = 0,                 \
        }
 
#endif

The BUFFER_SIZE constant determines the amount of memory reserved, in bytes, and can be changed according to the user's needs.

#include "string.h"
#include "buffer_circ.h"

int buffer_insert(buffer_circ_t *b, void *data, unsigned int size)
{
  int ret = -1;
  unsigned char *ptr = (unsigned char *)data;

  if(b == NULL) {
    /* check your buffer parameter */
    return(ret);
  }

  if(size + b->items <= BUFFER_SIZE){

    /* the buffer has enough room, insert
     * the stream.
     */
    unsigned int i;
    for(i = 0; i < size; i++) {
        b->data[b->wr_ptr] = *ptr++;

        /* increment the input index so that if it
         * reaches the buffer end, it is placed at its
         * initial address again
         */
        b->wr_ptr = (b->wr_ptr + 1) % BUFFER_SIZE;
        b->items++;
    }
    ret = 0;
  }

  return(ret);
}
int buffer_retrieve(buffer_circ_t *b, void *data, unsigned int size)
{
  int ret = 0;
  unsigned char *ptr = (unsigned char *)data;

  if(b == NULL) {
    /* check your buffer parameter */
    return((ret = -1));
  }

  /* if the size requested fits in the
   * current contents, extract
   * the data stream
   */
  unsigned int i;
  for(i = 0; (i < size) && (b->items != 0); i++) {
      *ptr++ = b->data[b->rd_ptr];
      b->rd_ptr = (b->rd_ptr + 1) % BUFFER_SIZE;
      ret++;
      b->items--;
  }

  return(ret);
}
int buffer_full(buffer_circ_t *b)
{
  int ret = 0;
  if(b->items ==  BUFFER_SIZE) {
    ret = 1;
  }
  return(ret);
}

int buffer_will_full(buffer_circ_t *b, unsigned int size)
{
  int ret = 0;

  if(b->items + size > BUFFER_SIZE ) {
    ret = 1;
  }
  return(ret);
}
int buffer_flush(buffer_circ_t *b)
{
  int ret = 0;

  if(b != NULL) {
    b->wr_ptr = b->rd_ptr;
    b->items = 0;
  } else {
    ret = -1;
  }

  return(ret);
}
int buffer_empty(buffer_circ_t *b)
{
  return((b->items == 0? 1: 0));
}

Look at lines 28 and 53 of this file; this operation is called a circular increment. Its function is to increment the index by one unit, and if it reaches the end of the memory area, the result of the operation is index 0, that is, the beginning of the reserved area.

Conclusion

Circular buffers are interesting safety features that prevent memory overflow problems. They are powerful and can serve as the basis for more complex data structures, and they are versatile: their applications range from operating systems to digital signal processing. The example above provides the basic operations and serves as a starting point for more complex implementations.

Click here for example code on GitHub


Digital PID Controller: One practical modeling for microcontrolers – Part 2


Continuing from where we stopped in the first part of this series, we will now model a digital PID with a slightly different approach from the one used to design the compensator described in the previous article. The idea here is to provide a PID compensator that can be implemented simply on a general purpose processor (as discussed in part I of this series, by general purpose processor we mean any microcontroller that has at least a hardware multiplication instruction).

Shall we recap?

In part I of this series, we explained what a closed-loop control system is and why it is used, as well as the PID (Proportional – Integral – Derivative) compensator, a versatile type of controller applied in control systems of the most varied kinds.

We still remember that its expression takes the form of an equation, as the one exemplified below:

Eq(1)
  • Kp: Proportional action coefficient;
  • Ki: Integral action coefficient;
  • Kd: Derivative action coefficient;
  • t: Instant of the state to be processed;
  • u(t): System output signal at time t;
  • e(t): Error signal at the controller input at time t.

From this equation, using the well-known Laplace transform, a much simpler expression was obtained, from which an electronic circuit could be designed to reproduce the behavior of a PID compensator.

In contrast, implementing either of these equations directly in software (on a microcontroller) is quite difficult, so, using an approach based on numerical methods, we designed an equation easily programmable on a processor:

Eq(1.1)

Where the parameters (constants) can be calculated as follows:

In addition we have:

ti = integration time;

td = differentiation time.

This approach was named the rectangular approximation of the integral term, and it allowed us to translate this simple equation into a program that can be embedded in the target processor.

Problems with the rectangular approximation

Although the rectangular approximation of the integral term is versatile and serves a good range of applications, some problems begin to emerge as the complexity of the controlled system increases.

The first and greatest problem is the convergence of the integral of the error function to the correct value when the previous value of the error function is equal to zero, or when the difference between the current and the previous error becomes very large. This could be corrected by increasing the number of iterations used to calculate the integral term (instead of basing it on two points), but at the expense of execution efficiency.

Another solution would be to approximate the integral simply by accumulating its value each time the PID is computed. This maintains execution efficiency and has the same effect as increasing the number of iterations; however, since the accumulation occurs only once per computed PID value, the problem of slow convergence returns in the first values calculated by the PID, and is minimized only as the process advances.

The other big problem of this algorithm can be illustrated with a practical example. Consider a system where the proportional gain is somewhat high, and the accumulated error grows or shrinks for long periods of time: this can fatally lead the controlled system to overshoot with high values and, in more severe cases, to permanent oscillation. This phenomenon is known as wind-up. It can be minimized by using low values for the integral gain, together with saturation of the integrator (the accumulator, in the case of the approximated integral); but then we return to the problem of slow convergence, since saturation can be understood as zeroing the integrator (or accumulator).

We therefore have some drawbacks with this algorithm. What can be done to minimize the effects of slow convergence and wind-up? There are several ways to model digital compensator algorithms from analog circuits, or even from numerical expressions; the algorithm we describe here offers a digital PID compensator with a better approximation.

Digital PID Compensator 2 poles / 2 zeros

Knowing the problems with the first algorithm, we can proceed to model a more precise digital PID controller. For this we will need a mathematical tool well known to those who have studied digital control (or digital signal processing): the Z transform.

This tool allows us to translate difference equations (in the time domain), that is, those in which two events are separated by a finite, numerically representable interval, into an expression that takes a form very similar to the equations obtained by the Laplace transform for continuous-time systems.

Its definition is exemplified in the equation below:

Eq(1.2)

Equation 1.2 can be intimidating at first, but observing it with attention we note that the Z transform, by definition, does nothing more than take a sequence of pulses whose capture instants are spaced by a time interval. The independent variable n denotes the sampling instant, so that a discrete-time signal can be described as the sequence below:

Thus, once in the Z domain, each instant of the signal is accompanied by the complex operator z, which denotes the displacement of that signal in time. Simplistically, the operator z says whether the sample is a current value, a future value (predictors and estimators), or a previous value, the last of which interests us. Let us see:

X(z)z^-n: sample delayed in time;

X(z)z^0 = X(z): current sample;

X(z)z^n: future sample.

Although the theory behind the Z transform involves many more concepts, looked at in this simplistic way we can use this simple property (known as the time-shift property) to build a new digital PID controller with better performance than the previous article's.

Now that we know what the z operators mean, let us return to our problem: a way to better approximate the integral term. Besides accumulation, there are several numerical methods that allow us to solve integrals with considerable precision and much smaller convergence intervals. For this purpose, let us select the trapezoidal rule, with which we can solve an integral according to the following equation:

Eq(1.3)

A much simpler expression to solve. In addition, because the trapezoidal rule approximates the area between two values of the function f(t) by a trapezoid, if the sampling rate of the system is well chosen, convergence to the desired result is much faster and does not suffer from the problem present in the first algorithm.

We already have a new solution for our integral term, and the derivative needs no modification, so why use the Z transform at all? Why not go straight to the calculations? With the Z transform we can further reduce the final equation, pre-compute its coefficients, and also use its Z-domain transfer function for analysis in a simulator such as MATLAB.

We know that numerical methods by nature work in a domain of discrete intervals, which is perfect for our application. Does that mean we can take the Z transform of equation 1.3? Yes we can, but first let us change the notation to something more familiar:

Eq(1.4)

The above equation is in a much more convenient notation for discrete-time systems. But now comes the question: how can this well-known rule serve the purpose of this article? Consider the following case: we have the integral of a function, with value x[n]; adding the initial conditions (the integration constant) we arrive at the following relation:

Taking the Laplace transform and then the Z transform on both sides of the equation we have (Note that we are computing two derivatives in the domain of continuous time):

Thus we can relate the complex variable s to the discrete-time variable z. But before applying this to our PID controller, let us see what to do for the derivative term. Following the same line of numerical reasoning to compute a derivative, we have:

As we did for the integral, let’s change the notation for the discrete time domain:

To facilitate computation we delay both samples by one unit, representing them relative to x[n]; thus we obtain:

Let us now take the Laplace transform and then the Z transform, in the same way as we did for the integral:

Algebra, algebra, algebra, and behold, this leads to:

Eq(1.4)

Here we now have an approximation for the integral term and another for the derivative term. But note that they are in the frequency domain, while our PID is in the time domain; let us recall equation 1:

This is not a big problem. First, let us rewrite the equation above in the form of a transfer function, that is, the quotient of the controller's output by its input; this leads to:

Eq(1.5)

And finally we take the Laplace transform of the above equation:

Eq(1.6)

Our PID now has a much friendlier equation, but it is still in the continuous-time domain. We need to translate it into the discrete-time domain; since in 1.5 and 1.6 we obtained terms in z that represent s, we can take the Z transform of equation 1.5 by substituting the integral and derivative terms:

And it is done: a digital PID compensator. But reader, do not be scared. Although the equation does not seem to make much sense, even more so with a processor implementation in mind, it will begin to make sense as we reorder the terms. First of all, let us eliminate the two denominators that do not contain any z; this will simplify the calculations. I will add two operators, h1 and h2, where:

h1 = Δt / 2;

h2 = 1 / Δt.

Here the new transfer function becomes:

From this point I will make the necessary mathematical steps until we obtain the final transfer function of our compensator; I ask the reader to follow them calmly:

Here we have our transfer function; it is much friendlier, and allows us to work on it to form a difference equation. But before proceeding, let us make it even more intuitive for the reader; let us denote:

Replacing these coefficients in the transfer function, we note that our digital PID is now nothing more than a special case of IIR filter, having two poles and two zeros, which is the origin of the algorithm's name:

Eq(1.7)

Look at how interesting mathematics is: we started with a differential equation, used a trapezoidal integral and a numerical derivative, applied two transforms, and found that our best-approximated PID is a digital filter with infinite impulse response. In other words, there are many ways to implement this in a microcontroller, and this same equation can already be put into a simulator like MATLAB to evaluate the response of the system to be controlled.

Placing inside a microcontroller

We have a Z-domain transfer function perfectly usable to simulate the controller's behavior in MATLAB or any other system simulator, but in this form it is not very useful for implementation on a processor. We must first compute a difference equation via the inverse Z transform; for this we return to equation 1.7:

We have the input and output of the system; now let us express the output in terms of the input:

It has improved, right? Moving on a little, by factoring U(z) and E(z) "into" the polynomials we get:

But a2 = 0, which simplifies the expression to:

Solving for U(z) we obtain:

Interestingly, at this point we can take the inverse Z transform, and, surprise, the difference equation becomes this simple expression:

What? Yes, my friend: all that hassle of transforming here, transforming there, and using numerical methods generates this trivial expression, proof that making things simple requires thought. Equation 1.8 is perfectly implementable and its coefficients are easily calculable, so we can read it as: to compute a new compensator output, we need the previous output sample and the last three calculated error values.

The more attentive reader will have realized that the integral-term wind-up problem does not even exist here; one less worry. Just choose whether to work with fixed or floating point and put it inside your processor.

To test our example, we will flash a small firmware that controls the brightness of an RGB LED in closed loop.

Example in test

And finally we come to the most fun section of the article, the practical implementation. We will flash our test firmware onto the following development kit:

Figure 1: Led Booster Pack + Piccolo LaunchPad.

Again we will not go into details about the kit or the microcontroller; more information can be found on the Texas Instruments website, linked at the end of the article.

We will use our digital PID to control a string of LEDs, with the microcontroller operating a DC-DC converter, in fact one of the most interesting parts of this particular kit. As a simple example, I opted to control only the chain of red LEDs, using only the output voltage of the DC-DC converter (yes, it is a boost). Consulting the datasheet of these LEDs, I arrived at the voltage value needed to obtain a current of 25 mA. By experimental tuning, the following coefficients were reached: Kp = 50, Ki = 15, Kd = 100; the sampling rate was set at 58,000 samples per second, and a carrier frequency of 150 kHz allows the PWM module of the microcontroller to operate at 9-bit resolution.

In the image below, we have the bus voltage perfectly controlled by the microcontroller; note its stable behavior:

Figure 2: Voltage in the LED string bus.

Below is a photo of the bench, with the LEDs being controlled by the microcontroller. One of the most interesting aspects is the possibility of the microcontroller itself becoming a digital control unit, flexible according to the application.

Figure 3: Microcontroller acting through DC-DC.

Despite the simplicity of the example, the entire project is in a GitHub repository. I leave you, reader, the freedom to use it on the C2000 itself or to port the PID.c and PID.h files to the microcontroller of your interest; I only suggest changing the types according to the microcontroller used, in the case of more compact platforms such as AVR or Cortex-M0 based microcontrollers. And for the most curious reader: why not explore equation 1.7 and implement it on your processor on your own? Writing code in assembly can also be an extremely pleasurable task.

Conclusion

So we come to the end of this small series of articles, hoping to have demystified some concepts about implementing digital compensators in microcontrollers of almost any nature. We also hope that the examples provided will be useful to those who wish to venture into the world of digital control. We were also able to model a simple PID compensator using the Z transform in a simple way, without worrying too much about the mathematics involved.

Supplementary materials

Example Project: https://github.com/uLipe/PID_Embarcados_C2000

Microcontrollers C2000 : http://www.ti.com/product/tms320f28027

Led Booster Pack, Kit used in the example: http://www.ti.com/tool/boostxl-c2kled

References

CRENSHAW, Jack W. : Math toolkit for real time programming, 2001

MEYRATH, Todd : Multipurpose analog PID controller, 2005

POLEY, Richard : Introduction to digital control using digital signal processors, 2003

STARR, Gregory P. : Introduction to applied digital control , 2006


Digital PID Controller: One practical modeling for microcontrolers – Part 1


In this series we will show, in a practical and intuitive way, how to model a PID (Proportional – Integral – Derivative) controller so that its final equation can be easily implemented on a general purpose microcontroller (general purpose here meaning a processor with at least hardware multiplication instructions).

Introduction

When we speak about system control, it is rare for the PID compensator not to be mentioned. This kind of controller is widely used to control the output of different systems. Used in closed-loop systems, a controller of this type can be set to offer the desired response and maintain the system's output with a minimum amount of error using only three setup parameters.

Figure 1: Example of a closed loop system.

The Desired State signal is the desired value at which the system output should remain until it is changed again. At the system output (the Actual Robot State sample signal) we have the signal of the current state of the system, which may or may not match the desired one. To check this, its current value is fed back to the input, and the difference between the desired and current state is computed, forming a third variable, commonly called the error: the difference between the desired state and the current state. The Robot Kinematics block represents the characteristics of the system, usually described by differential equations, that is, the system to be controlled (an electric motor having its speed controlled, for example). Finally, we have the Controller block, which, as its name suggests, compensates for the (often undesirable) characteristics of the response of the system to be controlled. This is where the PID controller is "docked". Now that we remember (or know) what a closed-loop system looks like, let us better understand what this PID compensator is.

The compensator PID

Knowing that a compensator block is usually added to a closed-loop system to obtain the desired response and control over a variable, the reader may ask what these compensators are. In fact, there are several implementations, ranging from a simple first-order filter to complex arrangements of order 3 or higher. Among them we highlight the PID controller, which has up to three forms of actuation (up to three, since any action can be deactivated just by zeroing its coefficient), each with a different effect on the system. We can generalize these effects as follows:

  • The proportional action (P) provides a faster response of the system under a variation in the input signal (desired state);
  • The integral action (I) aims to cancel a phenomenon known as steady-state error, so that the value reached in steady state is the one desired in the input signal;
  • The derivative action (D) anticipates the correction of the system's output value, improving the response speed and reducing the overshoot (the amount by which the output signal exceeds the desired value).

The combination of the amounts of each of these three actions causes the controller, together with the plant (a common way of referring to the system under control), to provide a suitable response to a given input variation. We can thus write the output of the PID controller as a function of its input by the equation:

Eq(1)

Where:

  • Kp: Proportional action coefficient;
  • Ki: Integral action coefficient;
  • Kd: Derivative action coefficient;
  • t: Instant of the state to be processed;
  • u(t): System output signal at time t;
  • e(t): Error signal at the controller input at time t.

The major difficulty in implementing such a compensator lies in the fact that it consists of an equation involving the integral and the derivative of an unknown function; it is hard to estimate its behavior, much less the best coefficient values that provide an adequate response. So perhaps it is best to analyze its behavior in another domain. Why not the frequency domain? Let us resort to a mathematical resource well known to control engineers: the Laplace transform. With it we can reduce this equation to a simpler form and solve it using algebra instead of calculus. Given the practical objective of this article, we will not go into the details of the mathematics involved; we will simply look up the transform of each term in the basic table of transformations, which can be found in references such as OGATA (2010) and NISE (2010).

Let us return to equation 1:

Taking the Laplace transform of the equation above gives:

Where:

is a complex variable.

In the frequency domain (in this case called the Laplace domain) it is interesting to analyze the transfer function, that is, the quotient of the output signal by the input signal; then we have:

Eq(1.1)

With equation 1.1 we have what is necessary to combine with the equations of the system to be controlled (also Laplace-transformed beforehand) to evaluate the characteristics of the plant and determine Kp, Ki and Kd. After this process, a circuit built from operational amplifiers can provide a simple but efficient analog PID controller, as shown in figure 2:

Figure 2: Analogue PID (adapted from MEYRATH, 2005)

If we take the inverse path and obtain the transfer function of the circuit of figure 2, we arrive at an expression, if not equal, very similar to and of the same order as equation 1.1. Thus, we have the analog system ready to solve (or complement) the control problem.

Okay, but how do you implement this in a microcontroller? And is it even possible on a general purpose microcontroller? The answer is partly yes: partly because, for a PID algorithm to have acceptable performance, the processor needs at least a hardware multiplication instruction. In what follows, we will derive from equation 1.1 a good-quality approximation of a PID controller.

Implementing a numeric PID

We now know what a closed-loop system is and how to build a PID controller in the form of an electronic circuit based on operational amplifiers. Now we will see how to turn all this theory into numbers, so that our processor can execute a sequence of commands equivalent to the operation of the electronic circuit shown in figure 2.

Observe Figure 3, which illustrates a control system where a processor (or microcontroller) takes the role of the control block responsible for performing the compensation and acting on the plant to be controlled:

Figure 3: Digital Control System (STARR, 2006)

Disregarding the other signals that appear in comparison with figure 1, and looking at the controller, we see that it takes the form of a general purpose processor, which the error signal reaches through an A/D conversion. The numerical value is then processed, generating another value that is passed to a D/A converter (it can be a modulation method such as PWM), where it returns to its analog form and drives the plant. Thus we know that to model a digital PID we will need to work with numerical processing.

Let us start this section by deriving a very popular PID algorithm, used even on 8-bit microcontrollers as demonstrated in ATMEL (2006). The basic idea, as absurd as it may seem, is to take the inverse Laplace transform and return to equation 1:

Yes: for this first demonstration, working in the time domain with difference equations will simplify the calculations considerably. Observing the equation, we have an integral and a derivative operation; recalling (or presenting) some numerical calculus, it is known that both operations can be approximated by algebraic operations that behave similarly.

ATTENTION! From this section on we will use a little more mathematics and recall some concepts of differential and integral calculus. So we ask for a little patience from the reader (especially the beginner) until we get to the first algorithm. The effort is worth it: as a reward will come a great understanding of what derivatives and integrals are in practice.

Starting with the proportional term, undoubtedly the most intuitive: note that it can be computed simply by multiplying the proportional constant by the error value that arrives at an instant t:

Eq(1.2)

Where n denotes the sampling instant of the error value received from the A/D converter. Note that we are still in the time domain, but now in a discrete domain where incoming samples are equally spaced, differing by one sampling period. That is, we already have a hint of where to go to find the other terms of the equation. We now turn to the derivative term.

In order to approximate a derivative on a digital computer, we need to recall the concept of differentiation learned in the first calculus classes. The derivative of a function with respect to t is defined as:

Eq(1.3): de(t)/dt = lim (Δt → 0) [e(t) − e(t − Δt)] / Δt

Setting the calculus classes aside for a moment, let’s look closely at what this expression says: a derivative is nothing more than the subtraction of the function’s value at a given instant t from its value at a previous instant, where the difference between the two instants is arbitrarily close to zero. Computationally this is not possible, but if we make this Δt small enough we get a very good approximation of the derivative by definition. So, if we want a small spacing, why not choose the sampling period of the A/D converter? That is, Δt = Ts.

Thus we now have an acceptable numerical value, which allows us to evaluate the limit presented in equation 1.3: de(t)/dt ≈ [e(t) − e(t − Ts)] / Ts.

We can now move to the discrete-time domain, because we know that each A/D conversion result is already spaced one sample from the next. Thus we have:

Eq(1.4): de(t)/dt ≈ (e[n] − e[n−1]) / Ts

Adding the Kd gain to the equation, we finally arrive at:

Eq(1.5): D[n] = Kd · (e[n] − e[n−1])

And so we have just unraveled the great meaning of a derivative: in continuous or discrete time, this operation reduces to subtracting the previous value of the function from the current one. What differentiates the two domains is that in continuous time the spacing between instants is symbolically zero, while in discrete time it is a small but known value. The Ts term in the denominator has been suppressed because, besides the samples already being properly spaced by this time base, its value is absorbed when we tune the Kd parameter of our controller.

Equation 1.5 is easily implemented in any microcontroller and actually computes the derivative term (or any other derivative). Now for the integral term, which follows the same numerical principle but inverted: where before we had a subtraction, for the integrator we will have a summation. Let us now go to the definition of an integral. As we did for the derivative, we begin in the continuous-time domain; it is worth remembering that for this demonstration only the definite integral is of interest, i.e., the intervals have finite values:

Eq(1.6): ∫ e(t) dt = lim (Δt → 0) Σ e(k · Δt) · Δt

Although it seems more frightening at first glance, we can reduce the sum to only two terms, since for a control system only the present and previous states are of interest. This leads to: I(t) = I(t − Δt) + e(t) · Δt.

In the same way as for the derivative term, let us use the sampling period of the A/D converter as a sufficiently small value for Δt. This transforms the integral into a fully computable summation and takes us straight to the discrete-time domain: I[n] = I[n−1] + e[n] · Ts.

Again the Ts term is suppressed, because it is absorbed into Ki, so we have the integral term of the equation as:

Eq(1.7): I[n] = I[n−1] + Ki · e[n]

Thus we have the integral term of the PID equation, and we simply add equations 1.2, 1.5 and 1.7 to obtain the output value that will be sent to the D/A converter as a function of the input e[n]:

Eq(1.8): u[n] = Kp · e[n] + I[n] + Kd · (e[n] − e[n−1]), with I[n] = I[n−1] + Ki · e[n]

Comparing this with equation 1, we conclude that it is indeed possible to compute a PID compensator digitally. A simple C program, or even assembly if you want better performance, could easily implement it on a microcontroller.

Remembering that: Ki = Kp · Ts / ti and Kd = Kp · td / Ts

where ti = integration time and td = differentiation time.

We can now implement this in C, an approach used even in some PID libraries for Arduino, and thus we have a simple and efficient digital PID capable of controlling simple systems. We can call this digital PID a rectangular approximation, after the method used to compute the integral term.

In this first part we will stop here. On GitHub we have put a simple example of a PID using the rectangular approximation. In the next part we will discuss the limitations of this technique, approach the design of a new digital PID using the (so far!) dreaded Z-transform, and give tips on how to adapt other analog compensators for computational implementation.

Stay tuned for the second part of this article.

References


ATMEL – AVR221: Discrete PID Controller – 2006, available at: <http://www.atmel.com/images/doc2558.pdf>.

CRENSHAW, Jack W. – Math Toolkit for Real – Time Programming – 2000.

MEYRATH, Todd – Multipurpose Analog PID Controller – 2005.

NATIONAL INSTRUMENTS – Control Laws (link deprecated by NI).

STARR, Gregory P. – Introduction to Applied Digital Control – 2006.

Felipe – Digital PID Controller: A practical modeling for microcontrollers – Part 1

Sensor Fusion


Hello, dear reader! Continuing our effort to demystify some areas of knowledge related to embedded systems, today we talk about the concept of sensor fusion. This article is divided into two parts: here we present the concept behind sensor fusion and why you should consider using it; in the next article we will present an application of sensor fusion, comparing before and after, so you can use it as a reference in your projects. So, first of all, let’s get into the problem.

Example of a problem where sensor fusion can help

Presenting a problem and correlating it with a new concept always helps to understand why we should add new knowledge to the developer’s “toolkit”. With sensor fusion it is no different. Consider the following scenario: imagine, reader, that you have on the bench:

Hardware with a set of inertial sensors (accelerometer, gyroscope and magnetometer);
This hardware performs the acquisition through these sensors, pre-processes the data and sends it to a host responsible for performing an action based on the results;
The host needs this data to arrive in known and properly scaled units (e.g., speed in m/s);
The hardware contains a processor connected to these sensors to perform the pre-processing.

Now that we have our device making acquisitions and sending them periodically to a host application, everything seems to work. It reads the accelerometer, determines the velocity by discretely integrating the acceleration samples, and obtains the current position by integrating the velocity. In an ideal world, if we performed a change of position, the accelerometer data would be sufficient for this calculation.

However, in the real world, when we consult the calculated position it does not match the value estimated from the accelerometer. Worse, if we return the device to its initial position, the measurement does not correspond to reality: it keeps increasing as we move the device. Why does this happen? To begin, let’s look at the figure from [3] and imagine that our hardware uses the well-known Bosch BMI160 inertial sensor.

Figure 1 – Bosch BMI160 Inertial Sensor Parameters

The sensitivity parameter relates the scale of the signal in bits per g, where g is the acceleration of gravity. So if I leave the device at rest, it reports a value in bits corresponding to that acceleration, and that value does not change until some external force acts on it, correct?

Wrong! Notice that, after the sensitivity parameter, several other parameters cause variations in the value reported at the sensor output; even at rest, they change the final value read. The problem is that this value is treated as valid data in the calculations, and between scale factors and multiplications it produces several measurement errors. This kind of error is what we call measurement uncertainty.

That is, however good the sensor may be, it always has one or more sources of measurement error. These combined errors are called the process error, and the extent to which this error can influence the real measurement is what we call the degree of uncertainty of the measurement. From these parameters, adding some mathematics (nothing substantial, only for formal purposes), we can relate the degree of certainty of the measurement to the expected value and the value reported by a sensor (or acquisition infrastructure) as:

E.Q. 1 – Degree of uncertainty: p(xt | zt)

The small function p is called a probability density function, comically abbreviated in Portuguese as FDP. It expresses the probability of a given expected value xt falling within the range of values zt reported by the sensor; that is, the higher the probability, the lower the degree of uncertainty of the measurement reported by the sensing infrastructure.

In the case of reading a sensor, this probability tends to be high, but not high enough to ensure that the measurement alone is reliable. This causes our hypothetical device to present some problems during its operation:

Information obtained from the sensor is unreliable and carries an error even in steady state;
Information estimated from the sensor data is not reliable and has a cumulative error;
Sensor and estimated information vary in steady state, causing drift (progressive offset) problems in other derived information.

Graphically, what happens even with a stationary sensor can be observed in the graph below, extracted from the MATLAB simulation described in [1]. See how much the measured value drifts as time goes by:

Figure 2 – Stationary position simulation calculus

Note that this article takes inertial sensors as the use case, but in practice every sensor has the same problem, with a magnitude relative to its measurement. To minimise or eliminate these errors and increase confidence in the obtained measurement, the concept of sensor fusion comes in, which we see below.

Fusing sensors? Literally?

Sounds strange, doesn’t it? The concept of sensor fusion assumes that each sensor has its advantages and disadvantages, and that sensors with different scaling factors can present the same measurement with their own particular sources of error. Sensor fusion then literally takes the data from more than one type of sensor, applies a model (a set of scaling factors, estimation of next states, and correction methods at runtime) and produces at its output:

Information measured directly from the sensors, corrected and cleaned;
Information calculated from the measured sensor data, with better accuracy.

So it is the process of getting the best of each sensor’s world and combining them into a single set of measurements, making the FDP shown in equation 1 report a higher degree of reliability.

As an analogy to the sensor fusion criterion, consider the following case: we want the degree of reliability that a person P lives in the state of São Paulo E to be as high as possible, thus:


E.Q. 1.1 – Degree of uncertainty

As an initial source of data, we know the person may live in São Paulo. Based on their nationality, if Brazilian, there is a probability of this being true; but Brazilians live all over the world, so nationality and place of birth alone do not guarantee a high degree of reliability. Now, if we get their current cell phone number, we know the owner of that number should live in Brazil, so the probability of being from São Paulo increases, along with the degree of reliability. But a cell phone with regional code 11 works in all Brazilian states, so we still cannot guarantee that this person lives in São Paulo.

Let’s get the address from a document, say a water bill; surely that person lives or must live in São Paulo, right? However, the house may be rented, so our degree of reliability comes close but still does not give absolute certainty. So let’s add more documents: credit card bill, electricity bill and vehicle fines. Notice that we now have three channels of similar measurements (like our accelerometer + gyroscope) providing the same information, and the person pays water, credit and vehicle fines (which must be registered to the city of residence).

In this scenario, the possibility of person P residing in the state of São Paulo is close to absolute certainty; only in some particular cases (corner cases) is the information false, and even then the distance from the expected information is small. In the example, P may live in São Paulo but be travelling or working out of state.

The concept of sensor fusion is to get as much information as possible from the environment in which the object resides and combine that information so that the certainty of the measurement tends to the truth. To do this, several sensor channels, commonly of different natures, are combined in the most diverse mathematical models, and their outputs correspond to corrected measurements with a high degree of assertiveness, safe for use by the data acquisition and processing application.

Awesome! How does sensor fusion work?

In an embedded systems environment, sensor fusion works as a module implemented according to the application, but it has a common core: correcting and estimating a measurement based on knowledge of the specific mathematical model of its behaviour. For example, specific sources of accelerometer error and specific sources of gyroscope error are corrected. Then a (specific) merging operation combines the results into what can literally be called “the best of each world”.

If the desired measurement is the tilt angle, estimated by one source and also measured by an extra source (sensor), we can compute it from both sensors, apply the specific corrections of each source, and then merge the measurements. In the exemplified case, a simple and properly scaled sum merges the corrected results. The process described here depicts one of the simplest forms of sensor fusion, known as complementary filtering. See the figure below, which illustrates the process:

Figure 3 – Sensor fusion by complementary filtering

The example in the figure above solves a common problem: obtaining the angles of the local coordinate system practically error-free. The current angle, together with the angular velocity, can then feed a rotation matrix to a global coordinate system (referenced to the Earth) and create a stable spatial orientation system. Without this mechanism, the errors generated by integrating the gyroscope data and by the inverse-tangent tilt calculation would propagate into subsequent calculations, causing dangerous errors for a navigation system relying on that data source.

This simple fusion system illustrates well what we want to explain, but in more complex navigation structures (with higher processing capacity) the complementary architecture ends up being limited when the mathematical model of the sensor is poorly known. In that case, rather than just correcting the measurements (or states) of the system, the model must constantly adapt until all sources of error in the specific channel are eliminated. Then nothing beats a…

… Kalman filter, an adaptive filter for sensor fusion
 

Explaining all the mathematics and the derivation of this remarkable filter is outside the scope of this article (at least for now); however, this type of filter must be mentioned, since it is present in most sensor fusion architectures. Using a Kalman filter changes the sensor fusion framework we have presented, see:

Figure 4 – Kalman filter sensor fusion

You must be wondering where the sum block from the first sensor fusion example went: it is implicit in the Kalman filter itself. In a nutshell, the Kalman filter will:

Obtain an estimate of the next state of the system (understand state as one of the variables of interest);
Obtain the same state through the sensor;
Using the predicted and measured states, apply the so-called Kalman gain (a matrix continuously updated based on the filter’s knowledge of the model) to these variables;
Present the corrected state at the system output;
Feed the corrected state and the new predicted state back into the predictor, updating the next-state estimate as well as the Kalman gain;
Obtain a new sample from the sensor and start the cycle again.
 

Yes, in educational materials there is so much mathematics involved that we forget to understand how this tool works. Realise that the fusion occurs when we apply the so-called correction step to the estimated state and the measured state. In this way we obtain the corrected state, whose FDP with respect to the real state of the system yields a low, and decreasing, degree of uncertainty. This fusion architecture is mostly employed in navigation systems, since the Kalman filter alone is prepared to deal with multiple states.

Let’s take a look at what the Kalman filter does graphically:

Figure 5 – Kalman filter data flux

Consider the vectors xk and Pk as the model to be estimated. The subscript k denotes that the system is discretised (samples separated by equal and known time intervals); that is, xk-1 and Pk-1 denote the last corrected state of the system. These variables are fed back to one of the filter inputs and go through the first execution step, prediction, which estimates from former states (Bayes’ rule) a possible future state.

These intermediate values then feed the second block, which combines the current estimated state with the variable zk carrying the state measured by the sensor. Together with the other variables, we get the new corrected state at the output along with the current corrected model Pk. That is, the Kalman filter projects knowledge into the future state and memorises a portion of the previous state to help calculate the current one. Fantastic, no? Its implementation complexity varies with the model and the number of variables involved, but there are several open implementations, and for simple fusions one can resort to the complementary filtering presented previously.

Are there other architectures?
 

Yes, there are. The sensor fusion we detail here belongs to the class called complementary fusion (already implicit in our first example); that is, its function is to obtain a complete view of a particular state by combining measurements from sensors that are not directly related but can provide the same type of data (gyroscope and accelerometer). Besides this, we can quickly cite two other ways to fuse sensor data, depending on the project requirements:

Competitive: this type of fusion is applied when the requirements are robustness and precision, typical of life-support systems. In this case there are multiple sensors of the same type (same kind of data), each followed by an uncertainty block. The fusion occurs by weighting the sum of the corrections, where the largest weight always goes to the sensor with the lowest degree of uncertainty at that instant;
Cooperative: the interesting thing about this type of fusion is that the desired state of the system has no direct relationship with what is being measured; that is, it needs a network of sensors, and only with the combined readings of several sensors is it possible to obtain the relevant data. Cooperative fusion treats the measurements and error removal of each sensor individually, and only then evaluates which “piece” of the data of interest a given measurement corresponds to.
 
Thus, the topics presented here can be visualised graphically in the figure below:

Figure 6 – Sensor fusion methods


Conclusion

The purpose of this article was to give the reader a simplified view of sensor fusion and why it is so important. I believe that with this material the reader will gather the courage to explore more academic texts containing the analytical treatment of a specific form of data fusion.

However, the goal here is also to bring practice, the part no one shows. In the next article we will build our first sensor fusion application using an IMU (Inertial Measurement Unit), exploring the complementary filtering technique, which will prepare the ground for a third article on fusing sensors with the Kalman filter approach. So stay tuned, reader! Leave a comment below about what you would like to see related to sensor fusion; let’s discuss. See you next time.

References

[1] – NXP Sensor Fusion Guide

[2] – Sensor Data Fusion using Kalman Filters – Antonio Moran

[3] – Bosch BMI160 Inertial Measurement Unit datasheet

[4] – Sensor Fusion for Automotive Applications – Christian Lundquist

Felipe – Sensor Fusion