Multiple display issues

I have a product with up to 12 displays, driven over 3 SPI buses, with 4 displays per bus. The displays are 240x240, 16-bit color, driven at 27 MHz. The device (an STM32F769) has an external 32 MB, 16-bit SDRAM with a maximum bandwidth of around 100 MB/s.

My goal is to maximize parallel driving of these buses, so I have built a queue system where requests are queued and sent to a chosen display via DMA.

The way lvgl seems to be written currently I see some (possible) gotchas.

There are at least three "while(vdb->flushing)" loops in the code, which basically means the entire task handler blocks until the buffer is freed. Is there some task ordering that would allow the other displays' tasks to continue until their work is completed and their outbound flushing is called?

My idea is to initialize 12 disp_drv instances, but it is unclear to me how tasking will work with this many displays.

First of all, it looks like an interesting project! :slight_smile:

Take a look at this line. It shows that flushing is related to a display buffer. (This internal display buffer is passed to the flush function).

So if you have:

  • 1 display buffer shared by all 12 displays: you need to wait while it is flushed; otherwise lvgl would draw the next display's content into it while you are still flushing. It would be slow.
  • 3 display buffers, one per SPI bus: theoretically lvgl could deal with the displays on the other SPI buses while you flush from one. It's not implemented now, but it seems simple to check whether a display's buffer is "free" and, if not, try another display until all displays are refreshed.
  • 12 display buffers, one per display: lvgl will be able to render the displays one after another, and you need to queue on the SPI side. The mentioned "scheduler" would be required here too, to choose an available display. In this case, you might consider using two buffers per display, which allows lvgl to draw into the second buffer while you flush from the first.
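To make the second and third options concrete, here is a minimal sketch of such a "scheduler" that skips displays whose buffer is still flushing. Everything here (`disp_state_t`, `next_free_display`, the `flushing`/`refreshed` flags) is hypothetical illustration, not an existing lvgl API; `flushing` would be set when a DMA transfer starts and cleared from the transfer-complete ISR.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_DISPLAYS 12

/* Hypothetical per-display state. */
typedef struct {
    bool flushing;   /* DMA flush in progress, buffer not reusable */
    bool refreshed;  /* already handled in this refresh round       */
} disp_state_t;

static disp_state_t displays[NUM_DISPLAYS];

/* Pick the next display whose buffer is free, scanning round-robin
   after `last`. Returns the display index, or -1 if every display
   that still needs refreshing is busy flushing. */
static int next_free_display(int last)
{
    for(int i = 1; i <= NUM_DISPLAYS; i++) {
        int idx = (last + i) % NUM_DISPLAYS;
        if(!displays[idx].flushing && !displays[idx].refreshed) return idx;
    }
    return -1;
}
```

The refresh task would call `next_free_display` each round instead of blocking in `while(vdb->flushing)`, so a busy display never stalls the others.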

I sort of have it working with separate display driver handles and buffers for each display. I'd like to double buffer, but I'm not sure how best to do it, as at present a full-size double buffer is coded to do full-screen draws, which I don't want.

Another thing that comes to mind is task waiting. It seems like some parts of lvgl could be optimized for an OS, for example something like:

#define OS_TaskYield() taskYIELD() /* FreeRTOS, or similar */
while(vdb->flushing) {
    OS_TaskYield();
}

Also, it would be nice if lvgl tasks could be bypassed or delayed, though perhaps there isn't enough of a performance advantage for most applications. This would be handy, for example, when using the STM32 DMA2D or other GPU-accelerated DMA transfers.

Also, a conf.h define for 32-bit would be nice, so there aren't so many wasted cycles in lv_tick_get waiting for another millisecond.

Excellent idea! Would you like to send a PR? The OS_TaskYield define could be added to lv_conf.h as LV_OS_TASK_YIELD (or a similar name). It should, by default, be defined to something like this:

#define LV_OS_TASK_YIELD() do {} while(0)

That way it doesn’t affect the behavior of any existing applications, and the user is not pressured to provide their own implementation.
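To illustrate how such a configurable yield hook might behave, here is a self-contained sketch. The `LV_OS_TASK_YIELD` name follows the suggestion above; `wait_for_flush` and the simulated-ISR plumbing are hypothetical, and an actual RTOS user would map the macro to something like FreeRTOS's `taskYIELD()` in lv_conf.h.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool flushing;  /* set while a DMA flush is in progress */
static int yield_count;         /* how many times we yielded while waiting */

/* For this sketch, "yielding" lets a pretend DMA-complete interrupt
   run, which finishes the transfer after 3 iterations. A real port
   would define LV_OS_TASK_YIELD() as taskYIELD() or similar; the
   library default would be the no-op do {} while(0). */
static void fake_yield(void)
{
    if(++yield_count == 3) flushing = false;  /* simulated DMA completion */
}
#define LV_OS_TASK_YIELD() fake_yield()

/* The flush-wait loop, yielding instead of busy-spinning. */
static void wait_for_flush(void)
{
    while(flushing) LV_OS_TASK_YIELD();
}
```

With the no-op default the loop degenerates to exactly the current `while(vdb->flushing)` spin, so existing applications behave identically.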

Not sure what you meant by these. Could you explain further?

About the tasks: is there a way to process other tasks while one task is occupied waiting on DMA, from the GPU for example?

About the 32-bit define: there are lots of wasted cycles in lv_tick_get, because it always waits for a tick increment. I really don't want to throw away a millisecond's worth of cycles just to wait for a tick. It would be better to turn off the tick interrupt on 8- and 16-bit systems. It's not a huge deal though, as the defines allow my own implementation.

You can trick your way around true double buffering by allocating a buffer of hor_res x ver_res + 1 pixels. True double buffering is only activated when the buffer size exactly matches the resolution.
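A sketch of the trick, paraphrasing the size check described above (the function name `is_true_double_buffered` is mine, not lvgl's):

```c
#include <assert.h>
#include <stdint.h>

#define HOR_RES 240
#define VER_RES 240

/* True double buffering is enabled only when the buffer size is an
   exact match for the screen resolution, so one extra pixel opts out. */
static int is_true_double_buffered(uint32_t buf_size_px)
{
    return buf_size_px == (uint32_t)HOR_RES * VER_RES;
}
```

For these panels a full buffer is 240 x 240 x 2 bytes = 115,200 bytes, so even two full buffers for each of the 12 displays is about 2.7 MB, comfortably within the 32 MB SDRAM.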

I also like this idea!

It’d be possible in two ways:

  1. Use a full-featured task scheduler instead of lv_tasks. However, it'd be against the goal of lv_tasks: to be a simple, hardware-independent scheduler.
  2. Implement some kind of state machine in the critical internal tasks (mainly the display refresh task) to "yield" at specific points and continue from there on the next call.
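The second option could look something like this: a refresh task that records where it stopped and returns instead of spinning, resuming from that state on the next call. This is a hypothetical sketch, not lvgl's actual refresh code; `flush_busy` stands in for the flag a DMA-complete ISR would clear.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ST_RENDER, ST_WAIT_FLUSH, ST_DONE } refr_state_t;

static refr_state_t state = ST_RENDER;
static volatile bool flush_busy;   /* cleared by the DMA-complete ISR */

/* One step of a resumable refresh task: each call does a bounded
   amount of work and returns, instead of blocking on the flush. */
static void refr_task_step(void)
{
    switch(state) {
        case ST_RENDER:
            /* ...draw into the buffer, then start the DMA flush... */
            flush_busy = true;
            state = ST_WAIT_FLUSH;
            break;                 /* yield instead of while(flushing) */
        case ST_WAIT_FLUSH:
            if(!flush_busy) state = ST_DONE;
            break;                 /* still busy: yield and retry later */
        case ST_DONE:
            break;
    }
}
```

Between calls, the scheduler is free to step the refresh tasks of other displays whose flushes have already completed.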

lvgl uses lv_tick to measure elapsed time and never delays anything. If you need a delay functionality you can implement it independently from lv_tick.

lv_tick_get delays up to 1 tick every time it's called, unless you define your own method.

I think you’re right.

Do we really need tick_irq_flag? Wouldn’t reading sys_time be an atomic operation?

Wow, I can’t believe my eyes. It’s absolutely not the intended behavior. I wonder how much performance was lost on this…

My goal was to “try again if there was an interrupt”.

It seems to work better:


while(1) {
    tick_irq_flag = 1;
    result        = sys_time;

    /* lv_tick_inc() clears this flag, which can happen in an interrupt.
       Retry until we complete one uninterrupted cycle. */
    if(tick_irq_flag == 1) break;
}
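The retry loop can be exercised in isolation. A minimal simulation, assuming nothing beyond the thread itself: `sys_time` and `tick_irq_flag` are the names from the snippet above, while `lv_tick_inc_sim` and `lv_tick_get_sim` are stand-ins I made up to model the real lv_tick_inc / lv_tick_get pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static volatile uint32_t sys_time;       /* incremented by the tick "interrupt" */
static volatile bool tick_irq_flag;

/* What the tick interrupt (lv_tick_inc) does. */
static void lv_tick_inc_sim(void)
{
    sys_time++;
    tick_irq_flag = false;               /* invalidates an in-flight read */
}

/* Read sys_time, retrying only if an interrupt landed between
   setting the flag and reading the value. Normally runs once. */
static uint32_t lv_tick_get_sim(void)
{
    uint32_t result;
    do {
        tick_irq_flag = true;
        result = sys_time;
    } while(!tick_irq_flag);
    return result;
}
```

When no interrupt fires mid-read, the flag stays set and the loop body executes exactly once, so no tick's worth of cycles is wasted.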

Wait… If tick_irq_flag = 1; remains 1 (which is the case almost every time) then while(!tick_irq_flag) will be while(!1) -> while(0), so the cycle ends. So it runs only once.
Confirm or correct me, please. :fearful:

I think you are correct, sorry. I'm not sure why, but my debugger loops when single-stepping; otherwise it doesn't.

Single stepping usually results in interrupts being pended and then executed when the step command is given. An interrupt probably fires every time you single step and clears the flag.
