I am one of the developers of the Embox operating system.
If you type something like “OpenCV on STM32 board” into a search engine, you can find quite a few people who are interested in using this library on STM32 boards or other microcontrollers.
There are several videos that, judging by their titles, should demonstrate exactly this, but usually (in all the videos I have seen) the STM32 board only captured the image from the camera and displayed the result on the screen, while the image processing itself was done either on a regular computer or on more powerful boards (for example, a Raspberry Pi).
Why is it difficult?
The popularity of search queries is explained by the fact that OpenCV is the most popular computer vision library, which means that more developers are familiar with it, and the ability to run desktop-ready code on a microcontroller greatly simplifies the development process. But why are there still no popular ready-made recipes for solving this problem?
The difficulty of using OpenCV on small boards comes from two features:
- If you compile the library even with a minimal set of modules, it simply will not fit into the flash memory of the same STM32F7Discovery (even without taking the OS into account) because the code is very large (several megabytes of instructions)
- The library itself is written in C++, which means that
  - Support for the C++ runtime is needed (exceptions, etc.)
  - The LibC/POSIX support usually found in OSes for embedded systems is not enough - you also need the C++ standard library and the STL (vector, etc.)
Porting to Embox
As usual, before porting any program to the operating system, it's a good idea to try building it the way its developers intended. In our case, there are no problems with this - the source code can be found on
The good news is that OpenCV can be built as a static library out of the box, which makes porting easier. We build the library with the standard config and see how much space it takes up. Each module is built as a separate library.
> size lib/*so --totals
    text    data     bss       dec      hex filename
 1945822   15431     960   1962213   1df0e5 lib/libopencv_calib3d.so
17081885  170312   25640  17277837  107a38d lib/libopencv_core.so
10928229  137640   20192  11086061   a928ed lib/libopencv_dnn.so
  842311   25680    1968    869959    d4647 lib/libopencv_features2d.so
  423660    8552     184    432396    6990c lib/libopencv_flann.so
 8034733   54872    1416   8091021   7b758d lib/libopencv_gapi.so
   90741    3452     304     94497    17121 lib/libopencv_highgui.so
 6338414   53152     968   6392534   618ad6 lib/libopencv_imgcodecs.so
21323564  155912  652056  22131532  151b34c lib/libopencv_imgproc.so
  724323   12176     376    736875    b3e6b lib/libopencv_ml.so
  429036    6864     464    436364    6a88c lib/libopencv_objdetect.so
 6866973   50176    1064   6918213   699045 lib/libopencv_photo.so
  698531   13640     160    712331    ade8b lib/libopencv_stitching.so
  466295    6688     168    473151    7383f lib/libopencv_video.so
  315858    6972   11576    334406    51a46 lib/libopencv_videoio.so
76510375  721519  717496  77949390  4a569ce (TOTALS)
As you can see from the last line, .bss and .data do not take up much space, but the code is more than 70 MiB. Clearly, when this is statically linked into a specific application, the code size will shrink.
Let's try to throw out as many modules as possible so that a minimal example still builds (one that, say, simply prints the OpenCV version). To do that, we look at the output of cmake .. -LA and turn off every option that can be turned off.
-DBUILD_opencv_java_bindings_generator=OFF
-DBUILD_opencv_stitching=OFF
-DWITH_PROTOBUF=OFF
-DWITH_PTHREADS_PF=OFF
-DWITH_QUIRC=OFF
-DWITH_TIFF=OFF
-DWITH_V4L=OFF
-DWITH_VTK=OFF
-DWITH_WEBP=OFF
<...>
> size lib/libopencv_core.a --totals
text data bss dec hex filename
3317069 36425 17987 3371481 3371d9 (TOTALS)
On the one hand, this is only one module of the library; on the other hand, it is built without compiler optimization for code size (-Os). ~3 MiB of code is still quite a lot, but it already gives hope for success.
Run in emulator
It is much easier to debug on an emulator, so first we make sure that the library works on QEMU. As the emulated platform, I chose Integrator/CP because, firstly, it is also ARM, and secondly, Embox supports graphics output for this platform.
Embox has a mechanism for building external libraries. Using it, we add OpenCV as a module (passing all the same options as for the "minimal" build in the form of static libraries). After that, I add a simple application that looks like this:
version.cpp:
#include <stdio.h>
#include <opencv2/core/utility.hpp>

int main() {
    printf("OpenCV: %s", cv::getBuildInformation().c_str());
    return 0;
}
We assemble the system, run it - we get the expected output.
root@embox:/#opencv_version
OpenCV:
General configuration for OpenCV 4.0.1 =====================================
Version control: bd6927bdf-dirty
Platform:
Timestamp: 2019-06-21T10:02:18Z
Host: Linux 5.1.7-arch1-1-ARCH x86_64
Target: Generic arm-unknown-none
CMake: 3.14.5
CMake generator: Unix Makefiles
CMake build tool: /usr/bin/make
Configuration: Debug
CPU/HW features:
Baseline:
requested: DETECT
disabled: VFPV3 NEON
C/C++:
Built as dynamic libs?: NO
< The remaining build parameters follow: which flags were used for compilation,
which OpenCV modules are included in the build, etc. >
The next step is to run some example, preferably one of the standard ones offered by the developers themselves.
The example had to be slightly rewritten to display the resulting image directly in the framebuffer. I had to do this because the imshow() function can draw images through the Qt, GTK and Windows interfaces, which, of course, will not be in the config for STM32. In fact, Qt can also be run on STM32F7Discovery, but that will be covered in another article 🙂
After briefly figuring out the format in which the edge detector stores its result, we get an image.
original picture
result
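To make the framebuffer step more concrete, here is a minimal sketch of such a conversion. It assumes that the edge detector produces an 8-bit single-channel image (which is what cv::Canny outputs) and that the framebuffer stores 32-bit pixels laid out as 0xAARRGGBB. The names gray_to_argb8888 and blit_gray are hypothetical and are not taken from the Embox sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expand one 8-bit gray value into a 32-bit pixel.
// Assumption: the framebuffer format is 0xAARRGGBB, 8 bits per channel.
inline std::uint32_t gray_to_argb8888(std::uint8_t g) {
    return 0xFF000000u | (std::uint32_t(g) << 16) | (std::uint32_t(g) << 8) | g;
}

// Copy a width x height 8-bit image into a framebuffer whose rows are
// fb_stride pixels long (fb_stride >= width).
inline void blit_gray(const std::uint8_t* src, std::size_t width,
                      std::size_t height, std::uint32_t* fb,
                      std::size_t fb_stride) {
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            fb[y * fb_stride + x] = gray_to_argb8888(src[y * width + x]);
        }
    }
}
```

With an edge map, every pixel is either 0 or 255, so the screen shows white edges on a black background.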
Running on STM32F7Discovery
On 32F746GDISCOVERY there are several hardware memory sections that we can use one way or another
- 320KiB RAM
- 1MiB flash for image
- 8MiB SDRAM
- 16MiB QSPI NOR flash
- microSD card slot
An SD card can be used to store images, but in the context of running a minimal example, this is not very useful.
The display has a resolution of 480×272, which means that the framebuffer will take 522,240 bytes at a depth of 32 bits. That is more than the size of RAM, so the framebuffer and the heap (which is needed, among other things, by OpenCV to store image data and auxiliary structures) will live in SDRAM, and everything else (memory for stacks and other system needs) will go to RAM.
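The framebuffer arithmetic can be checked with a few lines of code. This is only a sketch: it assumes a packed 32-bit pixel format (4 bytes per pixel, which is what gives 522,240 bytes for 480×272), and the constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a packed framebuffer of the given geometry.
constexpr std::size_t framebuffer_bytes(std::size_t width, std::size_t height,
                                        std::size_t bits_per_pixel) {
    return width * height * (bits_per_pixel / 8);
}

constexpr std::size_t kKiB = 1024;
constexpr std::size_t kRamBytes = 320 * kKiB;         // on-chip RAM
constexpr std::size_t kSdramBytes = 8 * 1024 * kKiB;  // external SDRAM
constexpr std::size_t kFbBytes = framebuffer_bytes(480, 272, 32);

static_assert(kFbBytes == 522240, "480 * 272 * 4 bytes");
static_assert(kFbBytes > kRamBytes, "does not fit into 320 KiB of RAM");
static_assert(kFbBytes < kSdramBytes, "fits into 8 MiB of SDRAM");
```

320 KiB is 327,680 bytes, so the framebuffer alone exceeds the on-chip RAM by almost 200 KB.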
If we take the minimum config for STM32F7Discovery (throw out the entire network, all commands, make stacks as small as possible, etc.) and add OpenCV with examples there, the required memory will be as follows:
text data bss dec hex filename
2876890 459208 312736 3648834 37ad42 build/base/bin/embox
For those who are not very familiar with which sections go where, I will explain: .text and .rodata hold instructions and constants (roughly speaking, read-only data), .data holds mutable data, and .bss holds zero-initialized variables, which nevertheless need space (this section goes to RAM).
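As an illustration of the section layout just described, here is a small sketch showing where a GCC-style toolchain typically places different kinds of objects. The exact placement depends on the compiler and the linker script, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Typical placement (not guaranteed; it depends on the toolchain and the
// linker script):
const std::uint8_t lookup_table[4] = {1, 2, 4, 8};  // constants -> .rodata
std::uint32_t frame_counter = 42;     // non-zero initial value -> .data
std::uint8_t scratch_buffer[1024];    // zero-initialized -> .bss

// The machine code of functions -> .text
std::uint32_t next_frame() {
    scratch_buffer[0] = lookup_table[frame_counter % 4];
    return ++frame_counter;
}
```

Note that .data costs space twice on a microcontroller: the initial values are stored in the image, and the variables themselves live in RAM.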
The good news is that .data/.bss should fit, but .text is a problem: there is only 1MiB of memory for the image. We could take the picture from the example out of .text and read it, for example, from the SD card into memory at startup, but fruits.png weighs about 330KiB, so this will not solve the problem: most of .text consists of OpenCV code.
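The same budget can be written down as a compile-time check. This is a sketch that uses the numbers from the size output above and the memory sizes of the board. The 330 KiB figure for fruits.png is the approximate value from the text, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kKiB = 1024;
constexpr std::size_t kMiB = 1024 * kKiB;

struct ImageSize {
    std::size_t text, data, bss;
};

constexpr ImageSize kEmbox = {2876890, 459208, 312736};  // from `size`
constexpr std::size_t kFlashBytes = 1 * kMiB;            // internal flash
constexpr std::size_t kQspiBytes = 16 * kMiB;            // QSPI flash
constexpr std::size_t kPictureBytes = 330 * kKiB;        // approx. fruits.png

constexpr bool fits(std::size_t need, std::size_t have) {
    return need <= have;
}

static_assert(!fits(kEmbox.text, kFlashBytes),
              ".text is larger than the internal flash");
static_assert(!fits(kEmbox.text - kPictureBytes, kFlashBytes),
              "still too large without the picture");
static_assert(fits(kEmbox.text + kEmbox.data, kQspiBytes),
              "code and initialized data fit into QSPI flash");
```

Even with the picture removed, about 2.4 MiB of code remains, which is why the internal flash alone is not enough.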
By and large, only one option remains: placing part of the code in QSPI flash (it has a special memory-mapped mode of operation, so the processor can access this data directly over the system bus). Two problems arise here: first, QSPI flash memory is not available immediately after the device reboots (memory-mapped mode has to be initialized separately); second, this memory cannot be flashed with the usual bootloader.
As a result, it was decided to link all the code in QSPI, and flash it with a self-written loader that will receive the required binary via TFTP.
Result
The idea to port this library to Embox appeared about a year ago, but over and over again it was postponed due to various reasons. One of them is support for libstdc++ and the standard template library. The problem of C++ support in Embox is beyond the scope of this article, so here I will only say that we managed to achieve this support in the right amount for this library to work 🙂
In the end, these problems were overcome (at least enough for the OpenCV example to work), and the example ran. It takes 40 long seconds for the board to detect edges using the Canny filter. This is, of course, too long (there are ideas on how to optimize it; if they work out, that may become a separate article).
However, the intermediate goal was to create a prototype showing that running OpenCV on STM32 is possible in principle, and that goal has been achieved, hooray!
tl;dr: step by step instructions
0: Download Embox sources, like this:
git clone https://github.com/embox/embox && cd ./embox
1: Let's start by assembling a bootloader that will "flash" a QSPI flash drive.
make confload-arm/stm32f7cube
Now you need to configure the network, because we will upload the image via TFTP. To set the board and host IP addresses, edit the file conf/rootfs/network.
Configuration example:
iface eth0 inet static
address 192.168.2.2
netmask 255.255.255.0
gateway 192.168.2.1
hwaddress aa:bb:cc:dd:ee:02
gateway is the address of the host from which the image will be loaded, and address is the address of the board.
After that, we build the bootloader:
make
2: Flash the bootloader onto the board as usual - there is nothing specific here, you do it the same way as for any other application for STM32F7Discovery. If you don't know how to do it, you can read about it
3: Compiling an image with a config for OpenCV.
make confload-platform/opencv/stm32f7discovery
make
4: Extract from the ELF the sections to be written to QSPI into qspi.bin
arm-none-eabi-objcopy -O binary build/base/bin/embox build/base/bin/qspi.bin \
    --only-section=.text --only-section=.rodata \
    --only-section='.ARM.ex*' \
    --only-section=.data
There is a script in the conf directory that does this, so you can run it
./conf/qspi_objcopy.sh # The resulting binary is build/base/bin/qspi.bin
5: Using tftp, load qspi.bin onto the QSPI flash. On the host, copy qspi.bin to the root folder of the tftp server (usually /srv/tftp/ or /var/lib/tftpboot/; packages for the corresponding server are available in most popular distributions, usually called tftpd or tftp-hpa, and sometimes you have to run systemctl start tftpd.service to start it).
# for tftpd
sudo cp build/base/bin/qspi.bin /srv/tftp
# for tftp-hpa
sudo cp build/base/bin/qspi.bin /var/lib/tftpboot
On Embox (i.e. in the bootloader), you need to execute the following command (we assume that the server has the address 192.168.2.1):
embox> qspi_loader qspi.bin 192.168.2.1
6: Use the goto command to "jump" into QSPI memory. The exact location depends on how the image is linked; you can see this address with the command mem 0x90000000 (the start address is stored in the second 32-bit word of the image). You also need to set the stack with the -s flag; the stack address is stored at 0x90000000. Example:
embox>mem 0x90000000
0x90000000:     0x20023200  0x9000c27f  0x9000c275  0x9000c275
                ↑           ↑
                stack       address of the
                address     first instruction
embox>goto -i 0x9000c27f -s 0x20023200 # The -i flag disables interrupts during system initialization
< From here on, the output comes from the image with OpenCV rather than from the bootloader >
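The two values passed to goto are simply the first two 32-bit words of the image: on Cortex-M, the vector table starts with the initial stack pointer, followed by the address of the reset handler. A minimal sketch of reading them from a raw image buffer (for example, the contents of qspi.bin) could look like this. The type and function names are hypothetical and do not come from the Embox bootloader.

```cpp
#include <cassert>
#include <cstdint>

struct BootVectors {
    std::uint32_t stack_pointer;  // word 0: initial stack pointer
    std::uint32_t entry_point;    // word 1: first instruction (reset handler)
};

// Read a little-endian 32-bit word from a byte buffer.
inline std::uint32_t read_le32(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// The image must contain at least 8 bytes.
inline BootVectors parse_boot_vectors(const std::uint8_t* image) {
    BootVectors v;
    v.stack_pointer = read_le32(image);
    v.entry_point = read_le32(image + 4);
    return v;
}
```

For the dump shown above, this yields 0x20023200 for the stack and 0x9000c27f for the entry point; the lowest bit of the entry address is set because the code is executed in Thumb mode.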
7: Launch
embox> edges 20
and enjoy the 40-second edge detection 🙂
If something goes wrong - write an issue in
Source: habr.com