Porting Qt to STM32

Good afternoon! In the Embox project we got Qt running on the STM32F7-Discovery board and would like to talk about it. Earlier we described how we managed to launch OpenCV.

Qt is a cross-platform framework that includes not only graphical components but also things like QtNetwork, a set of classes for working with databases, Qt for Automation (including an IoT implementation), and much more. The Qt developers anticipated its use in embedded systems, so the libraries are highly configurable. However, until recently few people thought about porting Qt to microcontrollers, probably because the task looks difficult: Qt is large and MCUs are small.

On the other hand, there are now microcontrollers designed for multimedia work that outperform the first Pentiums. About a year ago a post appeared on the Qt blog: the developers had ported Qt to the RTEMS OS and run widget examples on several STM32F7-based boards. This caught our interest. It was noticeable, and the developers themselves say so, that Qt is slow on STM32F7-Discovery. We wondered whether we could run Qt under Embox, and not just draw a widget but run an animation.

Qt 4.8 was ported to Embox long ago, so we decided to try that version. We chose the moveblocks application, an example of springy animation.

Qt moveblocks on QEMU

To begin with, we configure Qt with the smallest set of components that still supports animation. There is the option "-qconfig minimal,small,medium…" for this. It selects a configuration file shipped with Qt that contains many macros specifying what to enable and what to disable. After this option we add further flags to the configuration if we want to disable something else. Here is an example of our configurations.

For Qt to work, an OS compatibility layer has to be added. One way is to implement QPA (Qt Platform Abstraction). We took as a basis the ready-made fb_base plugin shipped with Qt, on which QPA for Linux is built. The result is a small emboxfb plugin that gives Qt access to Embox's framebuffer; Qt then draws into it without outside help.

This is what the plugin looks like:

QEmboxFbIntegration::QEmboxFbIntegration()
    : fontDb(new QGenericUnixFontDatabase())
{
    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    const char *fbPath = "/dev/fb0";

    fbFd = open(fbPath, O_RDWR);
    if (fbFd < 0) {
        qFatal("QEmboxFbIntegration: Error open framebuffer %s", fbPath);
    }
    if (ioctl(fbFd, FBIOGET_FSCREENINFO, &finfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }
    if (ioctl(fbFd, FBIOGET_VSCREENINFO, &vinfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }
    fbWidth        = vinfo.xres;
    fbHeight       = vinfo.yres;
    fbBytesPerLine = finfo.line_length;
    fbSize         = fbBytesPerLine * fbHeight;
    fbFormat       = vinfo.fmt;
    fbData = (uint8_t *)mmap(0, fbSize, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fbFd, 0);
    if (fbData == MAP_FAILED) {
        qFatal("QEmboxFbIntegration: Error mmap framebuffer %s", fbPath);
    }
    if (!fbData || !fbSize) {
        qFatal("QEmboxFbIntegration: Wrong framebuffer: base = %p,"
               "size=%d", fbData, fbSize);
    }

    mPrimaryScreen = new QEmboxFbScreen(fbData, fbWidth,
                                        fbHeight, fbBytesPerLine,
                                        emboxFbFormatToQImageFormat(fbFormat));

    mPrimaryScreen->setPhysicalSize(QSize(fbWidth, fbHeight));
    mScreens.append(mPrimaryScreen);

    this->printFbInfo();
}

And this is what the redraw looks like:

QRegion QEmboxFbScreen::doRedraw()
{
    QVector<QRect> rects;
    QRegion touched = QFbScreen::doRedraw();

    DPRINTF("QEmboxFbScreen::doRedraw\n");

    if (!compositePainter) {
        compositePainter = new QPainter(mFbScreenImage);
    }

    rects = touched.rects();
    for (int i = 0; i < rects.size(); i++) {
        compositePainter->drawImage(rects[i], *mScreenImage, rects[i]);
    }
    return touched;
}
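The loop in doRedraw() copies only the touched rectangles into the framebuffer image instead of redrawing the whole screen. A minimal Qt-free sketch of that idea follows. It assumes both buffers share the same pixel format and line stride and that every rectangle lies inside the buffer. The Rect type and the copyDirtyRects name are hypothetical and are not part of the plugin.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical rectangle type standing in for QRect (illustration only).
struct Rect {
    int x, y, width, height;
};

// Copy only the touched rectangles from the off-screen buffer to the
// framebuffer. Assumes both buffers have the same geometry and that each
// rectangle is fully inside it (no clipping is done here).
void copyDirtyRects(std::uint8_t *fb, const std::uint8_t *back,
                    std::size_t bytesPerLine, std::size_t bytesPerPixel,
                    const std::vector<Rect> &rects)
{
    for (const Rect &r : rects) {
        const std::size_t rowBytes =
            static_cast<std::size_t>(r.width) * bytesPerPixel;
        for (int row = 0; row < r.height; ++row) {
            // Byte offset of the first pixel of this row of the rectangle.
            const std::size_t offset =
                static_cast<std::size_t>(r.y + row) * bytesPerLine +
                static_cast<std::size_t>(r.x) * bytesPerPixel;
            std::memcpy(fb + offset, back + offset, rowBytes);
        }
    }
}
```

The amount of memory traffic per frame is then proportional to the touched area, which is why the placement of the framebuffer matters so much for animation speed.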

As a result, with compiler optimization for size (-Os) enabled, the library image came to 3.5 MB, which of course does not fit into the main memory of the STM32F746. As we wrote in our other article about OpenCV, this board has:

  • 1 MB ROM
  • 320 KB RAM
  • 8 MB SDRAM
  • 16 MB QSPI

Since support for executing code from QSPI had already been added for OpenCV, we decided to start by loading the entire Embox image with Qt into QSPI. And hooray, everything started almost immediately from QSPI! But, as with OpenCV, it turned out to be too slow.

Therefore we decided to do the following: first copy the image to QSPI, then load it into SDRAM and execute it from there. From SDRAM it became a little faster, but still far from QEMU.

Then came the idea of enabling hardware floating point, since Qt does calculate the coordinates of the squares in the animation. We tried it but got no visible acceleration, although in their article the Qt developers claimed that the FPU gives a significant speed boost for "dragging animations" on the touchscreen. There may simply be far fewer floating-point calculations in moveblocks; it depends on the specific example.

Moving the framebuffer from SDRAM to internal memory turned out to be the most effective idea. To do this, we made the screen 272x272 instead of 480x272. We also lowered the color depth from A8R8G8B8 to R5G6B5, reducing the size of one pixel from 4 bytes to 2. The resulting framebuffer size is 272 * 272 * 2 = 147968 bytes. This gave a significant speedup, perhaps the most noticeable one: the animation became almost smooth.
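The arithmetic behind this step can be checked with a short sketch. It compares the original and reduced framebuffer sizes against the 320 KB of internal RAM listed above and shows one common way to pack an 8-bit-per-channel color into R5G6B5. The function and constant names are hypothetical, and the bit packing is the usual truncating conversion, not necessarily what Qt or Embox does internally.

```cpp
#include <cstddef>
#include <cstdint>

// Framebuffer size in bytes for a given geometry.
constexpr std::size_t framebufferBytes(std::size_t width, std::size_t height,
                                       std::size_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

// Pack 8-bit red, green and blue into R5G6B5 by dropping the low bits
// of each channel (assumed conversion, for illustration).
constexpr std::uint16_t toRgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// 320 KB of internal RAM, taken from the board description above.
constexpr std::size_t kInternalRamBytes = 320 * 1024;

// 480x272 at 4 bytes per pixel is larger than the whole internal RAM.
static_assert(framebufferBytes(480, 272, 4) == 522240, "full screen, A8R8G8B8");
static_assert(framebufferBytes(480, 272, 4) > kInternalRamBytes, "too large");

// 272x272 at 2 bytes per pixel matches the figure in the text and fits.
static_assert(framebufferBytes(272, 272, 2) == 147968, "reduced screen, R5G6B5");
static_assert(framebufferBytes(272, 272, 2) < kInternalRamBytes, "fits");
```

The reduced framebuffer is roughly 3.5 times smaller than the original one, and the rest of the internal RAM still has to hold everything else that lives there.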

The last optimization was executing the Embox code from RAM and Qt from SDRAM. To do this, we first statically link Embox together with Qt as usual, but place the library's text, rodata, data and bss segments in QSPI so that they can be copied to SDRAM later.

section (qt_text, SDRAM, QSPI)
phdr	(qt_text, PT_LOAD, FLAGS(5))

section (qt_rodata, SDRAM, QSPI)
phdr	(qt_rodata, PT_LOAD, FLAGS(5))

section (qt_data, SDRAM, QSPI)
phdr	(qt_data, PT_LOAD, FLAGS(6))

section (qt_bss, SDRAM, QSPI)
phdr	(qt_bss, PT_LOAD, FLAGS(6))
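Each section above has a run address in SDRAM and a load address in QSPI, so something has to copy it at startup before any Qt code runs. A minimal sketch of such a copy step follows. The SectionImage type and loadSections function are hypothetical; in a real system the addresses and sizes would come from linker-generated symbols, and this is not Embox's actual startup code.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical descriptor of one section: where it is stored (QSPI)
// and where it must live at run time (SDRAM).
struct SectionImage {
    void *runAddr;         // destination in SDRAM
    const void *loadAddr;  // source in QSPI
    std::size_t size;      // section size in bytes
};

// Copy every section from its load address to its run address.
// Must run before any code or data in those sections is used.
void loadSections(const SectionImage *sections, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(sections[i].runAddr, sections[i].loadAddr, sections[i].size);
    }
}
```

A table with one entry each for qt_text, qt_rodata, qt_data and qt_bss would be passed to loadSections() once during early initialization.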

By executing the Embox code from the ROM, we also got a noticeable speedup. As a result, the animation turned out to be quite smooth:


At the very end, while preparing the article and trying different Embox configurations, we found that Qt moveblocks works great from QSPI with the framebuffer in SDRAM, and that the bottleneck was exactly the size of the framebuffer! Apparently, a 2-fold speedup from simply shrinking the framebuffer was enough to get past the initial "slideshow". Moving only the Embox code to various fast memories could not achieve that result (the speedup was about 1.5 times rather than 2).

How to try it yourself

If you have an STM32F7-Discovery, you can run Qt under Embox yourself. You can read how it's done on our wiki.

Conclusion

As a result, we managed to start Qt! The complexity of the task, in our opinion, is somewhat exaggerated. Naturally, you need to take into account the specifics of microcontrollers and generally understand the architecture of computing systems. The optimization results point to the well-known fact that the bottleneck in a computing system is not the processor, but the memory.

This year we will take part in the TechTrain festival. There we will talk in more detail about Qt, OpenCV on microcontrollers and our other achievements, and show them in action.

Source: habr.com
