Good afternoon! We are in the project
Qt is a cross-platform framework that includes not only graphical components but also things like QtNetwork, a set of classes for working with databases, Qt for Automation (including an IoT implementation), and much more. The Qt developers anticipated the use of Qt in embedded systems, so the libraries are highly configurable. Until recently, however, few people thought about porting Qt to microcontrollers, probably because the task looks difficult: Qt is large and MCUs are small.
On the other hand, there are now microcontrollers that are designed for multimedia work and outperform the first Pentiums. Appeared on the Qt blog about a year ago
Qt 4.8 was ported to Embox long ago, so that is the version we decided to try. We chose the moveblocks app, an example of springy animation.
Qt moveblocks on QEMU
To begin with, we configure Qt with, as far as possible, the minimum set of components required to support animation. The "-qconfig minimal,small,medium…" option exists for this: it pulls in a Qt configuration file with many macros that say what to enable and what to disable. After this option, we add further flags to the configuration if we want to disable something else. Here is an example of our
For Qt to work, you need to add an OS compatibility layer. One way is to implement QPA (Qt Platform Abstraction). We took as a basis the ready-made fb_base plugin that ships with Qt and underlies the QPA for Linux. The result is a small emboxfb plugin that provides Qt with Embox's framebuffer; Qt then draws into it without outside help.
This is what the plugin looks like
QEmboxFbIntegration::QEmboxFbIntegration()
    : fontDb(new QGenericUnixFontDatabase())
{
    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    const char *fbPath = "/dev/fb0";

    /* open() returns a negative descriptor on failure */
    fbFd = open(fbPath, O_RDWR);
    if (fbFd < 0) {
        qFatal("QEmboxFbIntegration: Error open framebuffer %s", fbPath);
    }
    if (ioctl(fbFd, FBIOGET_FSCREENINFO, &finfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }
    if (ioctl(fbFd, FBIOGET_VSCREENINFO, &vinfo) == -1) {
        qFatal("QEmboxFbIntegration: Error ioctl framebuffer %s", fbPath);
    }

    /* Screen geometry as reported by the framebuffer driver */
    fbWidth = vinfo.xres;
    fbHeight = vinfo.yres;
    fbBytesPerLine = finfo.line_length;
    fbSize = fbBytesPerLine * fbHeight;
    fbFormat = vinfo.fmt;

    /* Map the framebuffer memory so that Qt can draw into it directly */
    fbData = (uint8_t *)mmap(0, fbSize, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fbFd, 0);
    if (fbData == MAP_FAILED) {
        qFatal("QEmboxFbIntegration: Error mmap framebuffer %s", fbPath);
    }
    if (!fbData || !fbSize) {
        qFatal("QEmboxFbIntegration: Wrong framebuffer: base = %p,"
               "size=%d", fbData, fbSize);
    }

    mPrimaryScreen = new QEmboxFbScreen(fbData, fbWidth,
                                        fbHeight, fbBytesPerLine,
                                        emboxFbFormatToQImageFormat(fbFormat));
    mPrimaryScreen->setPhysicalSize(QSize(fbWidth, fbHeight));
    mScreens.append(mPrimaryScreen);

    this->printFbInfo();
}
And this is what the redraw looks like
QRegion QEmboxFbScreen::doRedraw()
{
    QVector<QRect> rects;
    /* The base class composes the windows and reports the touched region */
    QRegion touched = QFbScreen::doRedraw();

    DPRINTF("QEmboxFbScreen::doRedraw\n");

    if (!compositePainter) {
        compositePainter = new QPainter(mFbScreenImage);
    }

    /* Copy only the touched rectangles into the framebuffer image */
    rects = touched.rects();
    for (int i = 0; i < rects.size(); i++) {
        compositePainter->drawImage(rects[i], *mScreenImage, rects[i]);
    }
    return touched;
}
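To make the idea behind the redraw loop concrete, here is a minimal sketch without Qt of what copying only the touched rectangles amounts to at the byte level. The `Rect` and `PixelBuffer` types and the `copyTouchedRects` function are hypothetical stand-ins invented for this illustration; they are not part of Qt or Embox, and the real `drawImage` call does more (for example, format conversion and clipping).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for QRect: a rectangle in pixel coordinates.
struct Rect {
    int x, y, width, height;
};

// Hypothetical stand-in for an image: raw pixels plus row stride.
struct PixelBuffer {
    std::uint8_t *data;
    int bytesPerLine;   // stride of one row, in bytes
    int bytesPerPixel;  // 2 for R5G6B5, 4 for A8R8G8B8
};

// Copy only the touched rectangles from the back buffer to the framebuffer.
// Assumption: both buffers have the same geometry and pixel format, and
// every rectangle lies fully inside them.
void copyTouchedRects(const PixelBuffer &back, PixelBuffer &fb,
                      const std::vector<Rect> &touched)
{
    assert(back.bytesPerLine == fb.bytesPerLine);
    assert(back.bytesPerPixel == fb.bytesPerPixel);

    for (const Rect &r : touched) {
        const std::size_t rowBytes =
            static_cast<std::size_t>(r.width) * back.bytesPerPixel;
        for (int row = 0; row < r.height; ++row) {
            const std::size_t offset =
                static_cast<std::size_t>(r.y + row) * back.bytesPerLine +
                static_cast<std::size_t>(r.x) * back.bytesPerPixel;
            std::memcpy(fb.data + offset, back.data + offset, rowBytes);
        }
    }
}
```

The amount of memory traffic per frame is proportional to the area of the touched rectangles times the pixel size, which is why both the resolution and the color depth matter for speed.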
With the -Os compiler optimization for size enabled, the library image came to 3.5 MB, which of course does not fit into the main memory of the STM32F746. As we wrote in our other article about OpenCV, this board has:
- 1 MB ROM
- 320 KB RAM
- 8 MB SDRAM
- 16 MB QSPI
Since support for executing code from QSPI had already been added to Embox for OpenCV, we decided to start by loading the entire Embox image with Qt into QSPI. And hooray, everything started from QSPI almost immediately! But, as with OpenCV, it turned out to be too slow.
Therefore, we decided to do the following: first we copy the image to QSPI, then we load it into SDRAM and execute it from there. Running from SDRAM was a little faster, but still far from QEMU.
Then we had the idea of enabling floating point; after all, Qt does calculate the coordinates of the squares in the animation. We tried it, but got no visible speedup here, although in
Moving the framebuffer from SDRAM to internal memory turned out to be the most effective idea. To do this, we made the screen 272x272 instead of 480x272. We also lowered the color depth from A8R8G8B8 to R5G6B5, reducing the size of one pixel from 4 bytes to 2. This gave a framebuffer size of 272 * 272 * 2 = 147968 bytes. The speedup was significant, perhaps the most noticeable of all: the animation became almost smooth.
The last optimization was executing Embox code from RAM and Qt from SDRAM. To do this, we first statically link Embox together with Qt as usual, but we place the library's text, rodata, data and bss segments in QSPI so that they can be copied to SDRAM later.
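The arithmetic behind this step can be checked with a short sketch. The function and constant names below are hypothetical and chosen only for illustration; the numbers come from the board list and the resolutions mentioned in the article. The sketch also shows one common way to pack an A8R8G8B8 pixel into R5G6B5 by dropping alpha and the low bits of each channel, which is an assumption about the conversion rather than the code Qt or Embox actually uses.

```cpp
#include <cstddef>
#include <cstdint>

// Framebuffer size in bytes for a given resolution and pixel size.
constexpr std::size_t framebufferBytes(std::size_t width, std::size_t height,
                                       std::size_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

// 320 KB of internal RAM, as listed for the board.
constexpr std::size_t kInternalRamBytes = 320 * 1024;

// Original setup: 480x272 at 4 bytes per pixel does not fit in internal RAM.
static_assert(framebufferBytes(480, 272, 4) == 522240, "480x272 A8R8G8B8");
static_assert(framebufferBytes(480, 272, 4) > kInternalRamBytes,
              "full-screen 32-bit framebuffer exceeds internal RAM");

// Reduced setup: 272x272 at 2 bytes per pixel fits.
static_assert(framebufferBytes(272, 272, 2) == 147968, "272x272 R5G6B5");
static_assert(framebufferBytes(272, 272, 2) < kInternalRamBytes,
              "reduced framebuffer fits in internal RAM");

// Pack one A8R8G8B8 pixel (0xAARRGGBB) into R5G6B5.
constexpr std::uint16_t argb8888ToRgb565(std::uint32_t argb)
{
    return static_cast<std::uint16_t>(
        ((argb >> 8) & 0xF800) |   // top 5 bits of red   -> bits 15..11
        ((argb >> 5) & 0x07E0) |   // top 6 bits of green -> bits 10..5
        ((argb >> 3) & 0x001F));   // top 5 bits of blue  -> bits 4..0
}
```

The reduced framebuffer is roughly 3.5 times smaller than the original one, so every full redraw moves correspondingly fewer bytes.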
section (qt_text, SDRAM, QSPI)
phdr (qt_text, PT_LOAD, FLAGS(5))
section (qt_rodata, SDRAM, QSPI)
phdr (qt_rodata, PT_LOAD, FLAGS(5))
section (qt_data, SDRAM, QSPI)
phdr (qt_data, PT_LOAD, FLAGS(6))
section (qt_bss, SDRAM, QSPI)
phdr (qt_bss, PT_LOAD, FLAGS(6))
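The configuration above only says where the sections are stored (QSPI) and where they run (SDRAM); something still has to copy them at startup. A minimal sketch of such a copy step is shown here. The `SectionCopy` structure and the `relocateSections` function are hypothetical names made up for this illustration and are not the Embox implementation. The sketch also zero-fills bss instead of copying it, which is the usual convention and an assumption on our part; since the article places bss in QSPI as well, a plain copy would serve too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical descriptor of one section that is stored at one address
// and used at another.
struct SectionCopy {
    const void *loadAddr;  // where the bytes are stored (QSPI in the article)
    void *runAddr;         // where they are used at run time (SDRAM)
    std::size_t size;      // section size in bytes
    bool zeroFill;         // true for bss: clear instead of copying
};

// Copy (or clear) every section before any code or data in it is used.
void relocateSections(const SectionCopy *sections, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        const SectionCopy &s = sections[i];
        assert(s.runAddr != nullptr);
        if (s.zeroFill) {
            std::memset(s.runAddr, 0, s.size);
        } else {
            assert(s.loadAddr != nullptr);
            std::memcpy(s.runAddr, s.loadAddr, s.size);
        }
    }
}
```

On real hardware the addresses and sizes would come from symbols generated by the linker, and the copy has to run after SDRAM and QSPI are initialized but before the first call into Qt.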
Executing the Embox code from ROM also gave a noticeable speedup. As a result, the animation turned out to be quite smooth:
At the very end, while preparing the article and trying different Embox configurations, we found that Qt moveblocks works great from QSPI with the framebuffer in SDRAM, and that the bottleneck was precisely the size of the framebuffer! Apparently, the 2x speedup from simply shrinking the framebuffer was enough to get past the initial "slideshow". Moving only the Embox code to various fast memories could not achieve the same result (the speedup was about 1.5x rather than 2x).
How to try it yourself
If you have an STM32F7-Discovery board, you can run Qt under Embox yourself. You can read how it's done on our
Conclusion
In the end, we managed to run Qt! The complexity of the task is, in our opinion, somewhat exaggerated. Naturally, you need to take the specifics of microcontrollers into account and have a general understanding of computer architecture. The optimization results point to the well-known fact that the bottleneck in a computing system is not the processor but the memory.
This year we will participate in the festival
Source: habr.com