STM32 tutorial: Efficiently receive UART data using DMA

The U(S)ART peripheral works well with the RXNE (Read data register Not Empty) interrupt when each byte is handled separately. In that mode, the CPU processes every received byte by jumping to the UART interrupt service routine. To free the CPU for other work while UART data arrives at high speed, we can use DMA (Direct Memory Access) instead. Think of DMA as a co-processor that can only move data between memory locations, in our case between the UART data register and a temporary DMA buffer that we assign to it.

In general, before you start a DMA transfer, you have to tell DMA how many bytes it should move before it says "Stop, I'm done with the transfer". This event is called Transfer Complete (TC). But UART can receive data at any time: by the nature of UART, we don't know when bytes will arrive or how many there will be.


Suppose we receive between 10 and 20 bytes every 5 minutes, but we don't know the exact number of bytes in each burst. DMA notifies the CPU only at Transfer Complete, so we have to choose the number of bytes to receive in advance. We could set DMA to receive 10 bytes, and then 10 bytes again, but if 14 bytes arrive, we miss the last 4. They are in the buffer, but nothing tells us that DMA is holding 4 bytes in memory. We then wait 5 minutes for the next packet, whose first 6 bytes complete the transfer and finally flush the 4 old bytes along with them. This can lead to timeouts if our high-level protocol is packet based with a command->response approach.
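The arithmetic of this scenario can be sketched on a host PC. This is only a model of the counting, not driver code; the names `burst_result_t` and `simulate_fixed_dma` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of feeding one burst into a fixed-length DMA transfer
 * that only notifies the CPU on Transfer Complete (hypothetical model). */
typedef struct {
    size_t delivered; /* bytes the CPU was notified about (old + new) */
    size_t stranded;  /* bytes left in the DMA buffer with no notification */
} burst_result_t;

/* transfer_len:     number of bytes DMA is programmed to receive
 * already_buffered: bytes left over from a previous burst
 * burst_len:        bytes arriving now */
burst_result_t simulate_fixed_dma(size_t transfer_len,
                                  size_t already_buffered,
                                  size_t burst_len)
{
    assert(transfer_len > 0 && already_buffered < transfer_len);

    size_t total = already_buffered + burst_len;
    burst_result_t r;
    /* Only whole transfers raise Transfer Complete. */
    r.delivered = (total / transfer_len) * transfer_len;
    /* The remainder waits silently for the next burst. */
    r.stranded = total % transfer_len;
    return r;
}
```

With `transfer_len = 10` and a 14-byte burst, the model reports 10 bytes delivered and 4 stranded. Feeding the next 14-byte burst with those 4 bytes already buffered delivers 10 bytes again (4 old plus 6 new) and leaves 8 waiting.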


The UART peripheral has a very useful feature for this, called IDLE line detection. An idle line is detected on the RX line when no byte is received for one byte time after the last received byte. So, if we receive 10 bytes one after another (no delay), the IDLE line is detected at the point where the 11th byte should have been received but was not.
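To get a feeling for how quickly the IDLE line is reported, the one-byte time can be computed from the baud rate. The helper below is a hypothetical sketch that assumes the idle period equals one full frame, for example 10 bits for 8N1 (start bit, 8 data bits, stop bit).

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one UART frame in microseconds, rounded up.
 * bits_per_frame includes start, data, parity and stop bits. */
uint32_t uart_frame_time_us(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0);
    return (bits_per_frame * 1000000u + baud - 1u) / baud;
}
```

For 8N1 framing this gives roughly 1042 us at 9600 baud, 87 us at 115200 baud and 11 us at 921600 baud, so the end of a burst is noticed almost immediately compared with the 5 minutes between packets.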

We can force DMA to raise the Transfer Complete interrupt by disabling the DMA stream by hand, that is, by clearing the enable bit in the stream control register. DMA then generates the interrupt, provided it is enabled, and we can read the number of bytes received so far.
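On STM32F4 the enable bit is `DMA_SxCR_EN` in the stream's `CR` register, and the stream's `NDTR` register counts down the number of items still to be transferred. The byte count is therefore the programmed length minus the value read back from `NDTR`. A minimal sketch of that calculation, with a hypothetical helper name and plain integers in place of the hardware registers:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes DMA has written into the buffer so far.
 * transfer_len: length programmed before the stream was enabled
 * remaining:    value read back from the stream's NDTR register */
uint16_t dma_bytes_received(uint16_t transfer_len, uint16_t remaining)
{
    assert(remaining <= transfer_len);
    return (uint16_t)(transfer_len - remaining);
}
```

For example, a stream programmed for 20 bytes that still shows 6 remaining after being disabled has received 14 bytes.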


If we now repeat our example of receiving between 10 and 20 bytes (14 bytes in this case), we have two options:

  • Set DMA to receive 10 bytes. This is what happens:
    • When the 10th byte is received, DMA raises the Transfer Complete interrupt because all requested bytes were received (no IDLE line is detected by the USART yet)
    • We copy the 10 bytes to the upper layer buffer and then start DMA again to receive another 10 bytes
    • Only 4 more bytes arrive, but now the IDLE line is detected and we force DMA to stop at this point. The interrupt is called again and we can read the data from the buffer filled by DMA
  • Set DMA to receive 20 bytes. Only 14 bytes arrive this time, but the IDLE line is detected and we therefore force the DMA Transfer Complete interrupt.
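Both options can be replayed with a small host-side model. It is not the tutorial's driver code: `rx_log_t` and `simulate_burst_with_idle` are hypothetical names, and the model simply counts bytes and records how many are handed to the upper layer on each Transfer Complete interrupt.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Record of what the upper layer receives during one burst. */
typedef struct {
    size_t chunk[MAX_EVENTS]; /* bytes handed over per interrupt */
    size_t count;             /* number of Transfer Complete interrupts */
} rx_log_t;

/* Model one back-to-back burst followed by silence on the RX line. */
rx_log_t simulate_burst_with_idle(size_t transfer_len, size_t burst_len)
{
    rx_log_t result = { {0}, 0 };
    size_t in_buffer = 0;

    assert(transfer_len > 0);
    assert(burst_len / transfer_len + 1 <= MAX_EVENTS);

    for (size_t i = 0; i < burst_len; i++) {
        in_buffer++;
        if (in_buffer == transfer_len) {
            /* DMA counted down to zero: natural Transfer Complete,
             * then the transfer is restarted with the same length. */
            result.chunk[result.count++] = in_buffer;
            in_buffer = 0;
        }
    }

    /* Silence after the burst: IDLE line detected, the stream is
     * disabled by hand, which forces Transfer Complete.
     * The model only logs the event if bytes are pending. */
    if (in_buffer > 0) {
        result.chunk[result.count++] = in_buffer;
    }
    return result;
}
```

For a 14-byte burst, a transfer length of 10 yields two interrupts carrying 10 and 4 bytes, while a transfer length of 20 yields a single interrupt carrying all 14 bytes.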

Source code

To illustrate what I was talking about, you can find an example below. It was tested on STM32F4xx, but the concept works on other STM32 families too.

  • Tested on Nucleo-F411 and Nucleo-F401 at 9600, 115200 and 921600 baud
  • USART2 is used on pins PA2 and PA3
  • It uses the Standard Peripheral Drivers to show the principle and can easily be ported to HAL
  • Register access is used inside interrupts to handle data as fast as possible
  • It uses a high-level buffer for storing received data and a smaller DMA RX buffer for temporarily holding incoming data
  • Code is documented inline

The code above was written using the Standard Peripheral Drivers, while the code below is the same, just written with the newer LL (Low-Layer) drivers, which are part of the STM32Cube package for STM32 microcontrollers and can be used together with the HAL drivers in a single project.

The project was initially generated using CubeMX software and later modified to use LL drivers where necessary.



Owner of this site. Application engineer, currently employed by STMicroelectronics. Exploring latest technologies and owner of different libraries posted on Github.
