
A few days ago, I decided to reverse engineer my router's firmware using binwalk.
I bought myself a TP-Link Archer C7. Not the best router, but enough for my needs.
Every time I buy a new router, I install OpenWRT. Why? As a rule, manufacturers do not put much effort into supporting their routers, so over time the software becomes outdated and vulnerabilities accumulate. That is why I prefer the well-maintained open-source firmware OpenWRT.
After downloading OpenWRT, I also downloaded the firmware image for my new Archer C7 from the official site and decided to analyze it, purely for fun and as an excuse to talk about binwalk.
What is binwalk?
Binwalk is an open source tool for analyzing, reverse engineering, and extracting firmware images.
Created in 2010 by Craig Heffner, binwalk can scan firmware images and find files, identify and extract file system images, executable code, compressed archives, bootloaders and kernels, file formats such as JPEG and PDF, and much more.
You can use binwalk to reverse engineer firmware and understand how it works: look for vulnerabilities in binary files, extract files, and search for backdoors or digital certificates. It can also find opcodes for a number of different CPU architectures.
You can unpack filesystem images to look for specific password files (passwd, shadow, etc.) and try to crack the password hashes. You can perform a binary diff between two or more files. You can perform entropy analysis on the data to find compressed data or hard-coded encryption keys. All this without needing access to the source code.
In general, everything you need is there 🙂
How does binwalk work?
The main feature of binwalk is its signature scanning. Binwalk can scan the firmware image for various built-in file types and file systems.
Do you know the command line utility file?
$ file /bin/bash
/bin/bash: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, BuildID[sha1]=12f73d7a8e226c663034529c8dd20efec22dde54, stripped
The file command looks at the file header and searches for a signature (magic number) to determine the file type. For example, if the file starts with the byte sequence 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A, it knows it's a PNG file. Lists of common file signatures are readily available online.
Binwalk works the same way. But instead of looking for signatures only at the beginning of the file, binwalk will scan the entire file. In addition, binwalk can extract files found in the image.
Both file and binwalk use the libmagic library to identify file signatures. But binwalk additionally supports a list of custom magic signatures for finding compressed/archived files, firmware headers, Linux kernels, bootloaders, file systems, and so on.
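To make the idea concrete, here is a minimal Python sketch of scanning a whole buffer for signatures instead of checking only offset 0. The signature table is a tiny hand-picked sample and the `scan` helper is hypothetical; binwalk's real signature database and matching logic are far more elaborate.

```python
# Toy signature scanner: look for a few well-known magic numbers at every
# offset of a byte buffer, not only at the start (which is what `file` checks).
# The table is a small illustrative sample, not binwalk's real database.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\x1f\x8b\x08": "gzip compressed data",
    b"hsqs": "Squashfs filesystem, little endian",
    b"\x27\x05\x19\x56": "uImage header",
}


def scan(data: bytes):
    """Return a sorted list of (offset, description) for every signature hit."""
    hits = []
    for magic, description in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, description))
            start = data.find(magic, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic "firmware": padding, a gzip magic, more padding, a PNG magic.
    blob = b"\x00" * 16 + b"\x1f\x8b\x08" + b"\xff" * 5 + b"\x89PNG\r\n\x1a\n"
    for offset, description in scan(blob):
        print(f"{offset:<10}0x{offset:<8X}{description}")
```

Short magic numbers such as the 3-byte gzip prefix will match by chance in large binaries, which is one reason a real scanner validates the fields that follow each signature before reporting a hit.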
Shall we have some fun?
Installing binwalk
Binwalk is supported on multiple platforms including Linux, OSX, FreeBSD and Windows.
To install the latest version of binwalk, you can download the source code and follow the installation instructions available on the project website.
Binwalk has many different options:
$ binwalk
Binwalk v2.2.0
Craig Heffner, ReFirmLabs
https://github.com/ReFirmLabs/binwalk
Usage: binwalk [OPTIONS] [FILE1] [FILE2] [FILE3] ...
Signature Scan Options:
-B, --signature Scan target file(s) for common file signatures
-R, --raw=<str> Scan target file(s) for the specified sequence of bytes
-A, --opcodes Scan target file(s) for common executable opcode signatures
-m, --magic=<file> Specify a custom magic file to use
-b, --dumb Disable smart signature keywords
-I, --invalid Show results marked as invalid
-x, --exclude=<str> Exclude results that match <str>
-y, --include=<str> Only show results that match <str>
Extraction Options:
-e, --extract Automatically extract known file types
-D, --dd=<type:ext:cmd> Extract <type> signatures, give the files an extension of <ext>, and execute <cmd>
-M, --matryoshka Recursively scan extracted files
-d, --depth=<int> Limit matryoshka recursion depth (default: 8 levels deep)
-C, --directory=<str> Extract files/folders to a custom directory (default: current working directory)
-j, --size=<int> Limit the size of each extracted file
-n, --count=<int> Limit the number of extracted files
-r, --rm Delete carved files after extraction
-z, --carve Carve data from files, but don't execute extraction utilities
-V, --subdirs Extract into sub-directories named by the offset
Entropy Options:
-E, --entropy Calculate file entropy
-F, --fast Use faster, but less detailed, entropy analysis
-J, --save Save plot as a PNG
-Q, --nlegend Omit the legend from the entropy plot graph
-N, --nplot Do not generate an entropy plot graph
-H, --high=<float> Set the rising edge entropy trigger threshold (default: 0.95)
-L, --low=<float> Set the falling edge entropy trigger threshold (default: 0.85)
Binary Diffing Options:
-W, --hexdump Perform a hexdump / diff of a file or files
-G, --green Only show lines containing bytes that are the same among all files
-i, --red Only show lines containing bytes that are different among all files
-U, --blue Only show lines containing bytes that are different among some files
-u, --similar Only display lines that are the same between all files
-w, --terse Diff all files, but only display a hex dump of the first file
Raw Compression Options:
-X, --deflate Scan for raw deflate compression streams
-Z, --lzma Scan for raw LZMA compression streams
-P, --partial Perform a superficial, but faster, scan
-S, --stop Stop after the first result
General Options:
-l, --length=<int> Number of bytes to scan
-o, --offset=<int> Start scan at this file offset
-O, --base=<int> Add a base address to all printed offsets
-K, --block=<int> Set file block size
-g, --swap=<int> Reverse every n bytes before scanning
-f, --log=<file> Log results to file
-c, --csv Log results to file in CSV format
-t, --term Format output to fit the terminal window
-q, --quiet Suppress output to stdout
-v, --verbose Enable verbose output
-h, --help Show help output
-a, --finclude=<str> Only scan files whose names match this regex
-p, --fexclude=<str> Do not scan files whose names match this regex
-s, --status=<int> Enable the status server on the specified port
Image scanning
Let's start by looking for file signatures inside the image (the one downloaded from the official site).
Running binwalk with the --signature option:
$ binwalk --signature --term archer-c7.bin
DECIMAL HEXADECIMAL DESCRIPTION
------------------------------------------------------------------------------------------
21876 0x5574 U-Boot version string, "U-Boot 1.1.4-g4480d5f9-dirty (May
20 2019 - 18:45:16)"
21940 0x55B4 CRC32 polynomial table, big endian
23232 0x5AC0 uImage header, header size: 64 bytes, header CRC:
0x386C2BD5, created: 2019-05-20 10:45:17, image size:
41162 bytes, Data Address: 0x80010000, Entry Point:
0x80010000, data CRC: 0xC9CD1E38, OS: Linux, CPU: MIPS,
image type: Firmware Image, compression type: lzma, image
name: "u-boot image"
23296 0x5B00 LZMA compressed data, properties: 0x5D, dictionary size:
8388608 bytes, uncompressed size: 97476 bytes
64968 0xFDC8 XML document, version: "1.0"
78448 0x13270 uImage header, header size: 64 bytes, header CRC:
0x78A267FF, created: 2019-07-26 07:46:14, image size:
1088500 bytes, Data Address: 0x80060000, Entry Point:
0x80060000, data CRC: 0xBB9D4F94, OS: Linux, CPU: MIPS,
image type: Multi-File Image, compression type: lzma,
image name: "MIPS OpenWrt Linux-3.3.8"
78520 0x132B8 LZMA compressed data, properties: 0x6D, dictionary size:
8388608 bytes, uncompressed size: 3164228 bytes
1167013 0x11CEA5 Squashfs filesystem, little endian, version 4.0,
compression:xz, size: 14388306 bytes, 2541 inodes,
blocksize: 65536 bytes, created: 2019-07-26 07:51:38
15555328 0xED5B00 gzip compressed data, from Unix, last modified: 2019-07-26
07:51:41
Now we have a lot of information about this image.
The image uses U-Boot as its bootloader (image header at 0x5AC0 and a compressed bootloader image at 0x5B00). Based on the uImage header at 0x13270, we know that the processor architecture is MIPS and the kernel is Linux version 3.3.8. And based on the image found at 0x11CEA5, we can see that the rootfs is a squashfs file system.
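The uImage fields that binwalk prints can also be decoded by hand. The sketch below assumes the legacy U-Boot header layout (64 bytes, big-endian, header CRC computed with the CRC field zeroed); the names `parse_uimage_header` and `UImageHeader` are hypothetical, and the numeric OS/CPU/type/compression codes in the demo are assumed to follow U-Boot's image.h.

```python
import struct
import zlib
from collections import namedtuple

# Legacy U-Boot uImage header: 64 bytes, all integers big-endian.
UIMAGE_FORMAT = ">7I4B32s"
UIMAGE_MAGIC = 0x27051956

UImageHeader = namedtuple(
    "UImageHeader",
    "magic header_crc timestamp size load_addr entry_point data_crc "
    "os arch image_type compression name",
)


def parse_uimage_header(raw: bytes) -> UImageHeader:
    """Decode the first 64 bytes of a legacy uImage and verify its header CRC."""
    size = struct.calcsize(UIMAGE_FORMAT)  # 64
    header = UImageHeader._make(struct.unpack(UIMAGE_FORMAT, raw[:size]))
    if header.magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage header")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = raw[:4] + b"\x00" * 4 + raw[8:size]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != header.header_crc:
        raise ValueError("header CRC mismatch")
    name = header.name.rstrip(b"\x00").decode("ascii", "replace")
    return header._replace(name=name)


if __name__ == "__main__":
    # Synthetic header built from values in the scan output above.
    # Assumed codes: OS 5 = Linux, CPU 5 = MIPS, type 4 = Multi-File, comp 3 = lzma.
    fields = [UIMAGE_MAGIC, 0, 0, 1088500, 0x80060000, 0x80060000, 0xBB9D4F94,
              5, 5, 4, 3, b"MIPS OpenWrt Linux-3.3.8"]
    raw = struct.pack(UIMAGE_FORMAT, *fields)
    fields[1] = zlib.crc32(raw) & 0xFFFFFFFF  # fill in the real header CRC
    raw = struct.pack(UIMAGE_FORMAT, *fields)
    print(parse_uimage_header(raw))
```

Against the real image, the same function could be fed the 64 bytes read at offset 0x13270 of archer-c7.bin.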
Let's now extract the bootloader (U-Boot) using the dd command:
$ dd if=archer-c7.bin of=u-boot.bin.lzma bs=1 skip=23296 count=41162
41162+0 records in
41162+0 records out
41162 bytes (41 kB, 40 KiB) copied, 0,0939608 s, 438 kB/s
Since the image is compressed with LZMA, we need to decompress it:
$ unlzma u-boot.bin.lzma
We now have a U-Boot image:
$ ls -l u-boot.bin
-rw-rw-r-- 1 sprado sprado 97476 Fev 5 08:48 u-boot.bin
How about finding the default value for bootargs?
$ strings u-boot.bin | grep bootargs
bootargs
bootargs=console=ttyS0,115200 board=AP152 rootfstype=squashfs init=/etc/preinit mtdparts=spi0.0:128k(factory-uboot),192k(u-boot),64k(ART),1536k(uImage),14464k@0x1e0000(rootfs) mem=128M
The U-Boot environment variable bootargs is used to pass parameters to the Linux kernel. From the mtdparts entry above, we also get a better understanding of the layout of the device's flash memory.
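To read the flash layout out of that mtdparts string, a small parser is enough. This is a sketch that handles only the subset of the syntax seen above (sizes with an optional k/m suffix and an optional @offset); `parse_mtdparts` is a hypothetical helper, not part of any tool used in this article.

```python
import re

UNITS = {"": 1, "k": 1024, "m": 1024 * 1024}
PART_RE = re.compile(r"(\d+)([km]?)(?:@(0x[0-9a-fA-F]+|\d+))?\((.+)\)")


def parse_mtdparts(spec: str):
    """Turn 'dev:size[@offset](name),...' into (name, offset, size) tuples."""
    _device, _, parts = spec.partition(":")
    layout, cursor = [], 0
    for part in parts.split(","):
        m = PART_RE.fullmatch(part)
        if m is None:
            raise ValueError(f"unsupported partition spec: {part!r}")
        size = int(m.group(1)) * UNITS[m.group(2)]
        # Without an explicit @offset, a partition starts where the previous ended.
        offset = int(m.group(3), 0) if m.group(3) else cursor
        layout.append((m.group(4), offset, size))
        cursor = offset + size
    return layout


if __name__ == "__main__":
    spec = ("spi0.0:128k(factory-uboot),192k(u-boot),64k(ART),"
            "1536k(uImage),14464k@0x1e0000(rootfs)")
    for name, offset, size in parse_mtdparts(spec):
        print(f"{name:<14} 0x{offset:08X}  {size // 1024:>6} KiB")
```

For this device the running offsets place the uImage partition at 0x60000 and end it exactly at 0x1e0000, where the explicitly placed rootfs partition begins.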
How about extracting the Linux kernel image?
$ dd if=archer-c7.bin of=uImage bs=1 skip=78448 count=1088572
1088572+0 records in
1088572+0 records out
1088572 bytes (1,1 MB, 1,0 MiB) copied, 1,68628 s, 646 kB/s
We can verify that the image has been successfully extracted with the file command:
$ file uImage
uImage: u-boot legacy uImage, MIPS OpenWrt Linux-3.3.8, Linux/MIPS, Multi-File Image (lzma), 1088500 bytes, Fri Jul 26 07:46:14 2019, Load Address: 0x80060000, Entry Point: 0x80060000, Header CRC: 0x78A267FF, Data CRC: 0xBB9D4F94
The uImage file format is basically a Linux kernel image with an additional header. Let's remove this header to get the final Linux kernel image. We skip 72 bytes: the 64-byte header plus the 8-byte size table of a Multi-File Image, which matches the LZMA data that binwalk found at offset 78520 (78448 + 72):
$ dd if=uImage of=Image.lzma bs=1 skip=72
1088500+0 records in
1088500+0 records out
1088500 bytes (1,1 MB, 1,0 MiB) copied, 1,65603 s, 657 kB/s
The image is compressed, so let's unpack it:
$ unlzma Image.lzma
Now we have a Linux kernel image:
$ ls -la Image
-rw-rw-r-- 1 sprado sprado 3164228 Fev 5 10:51 Image
What can we do with a kernel image? We could, for example, search for strings in the image to find the Linux kernel version and learn about the environment used to build the kernel:
$ strings Image | grep "Linux version"
Linux version 3.3.8 (leo@leo-MS-7529) (gcc version 4.6.3 20120201 (prerelease) (Linaro GCC 4.6-2012.02) ) #1 Mon May 20 18:53:02 CST 2019
Even though the firmware was released last year (2019, as I write this article), it uses an old version of the Linux kernel (3.3.8), released in 2012, compiled with a very old version of GCC (4.6), also from 2012!
Do you still trust your routers at the office and at home?
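The dd and unlzma steps used above for the bootloader and the kernel can also be reproduced in a few lines of Python. This is a sketch using only the standard library; `carve` and `unlzma` are hypothetical helper names, and the demo runs on synthetic data rather than on the real firmware image.

```python
import lzma


def carve(data: bytes, offset: int, size=None) -> bytes:
    """Return `size` bytes starting at `offset` (like dd with bs=1 skip/count)."""
    end = None if size is None else offset + size
    return data[offset:end]


def unlzma(stream: bytes) -> bytes:
    """Decompress a legacy .lzma ("LZMA alone") stream, ignoring trailing bytes."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    return decompressor.decompress(stream)


if __name__ == "__main__":
    # Synthetic stand-in for a firmware image: junk, an LZMA stream, more junk.
    payload = b"Linux version 3.3.8 (example)" * 100
    packed = lzma.compress(payload, format=lzma.FORMAT_ALONE)
    image = b"\xAA" * 72 + packed + b"\x55" * 32
    kernel = unlzma(carve(image, 72))
    print(len(kernel), kernel[:19].decode())
```

Because the decompressor stops at the end of the LZMA stream, the exact length of the compressed data does not have to be known in advance; for the real image, the offset would be 78520 as reported by the signature scan.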
With the --opcodes option, we can also use binwalk to search for machine instructions and determine the processor architecture of the image:
$ binwalk --opcodes Image
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
2400 0x960 MIPS instructions, function epilogue
2572 0xA0C MIPS instructions, function epilogue
2828 0xB0C MIPS instructions, function epilogue
What about the root file system? Instead of extracting the image manually, let's use binwalk's --extract option:
$ binwalk --extract --quiet archer-c7.bin
The complete root file system will be extracted to a subdirectory:
$ cd _archer-c7.bin.extracted/squashfs-root/
$ ls
bin dev etc lib mnt overlay proc rom root sbin sys tmp usr var www
$ cat etc/banner
MM NM MMMMMMM M M
$MMMMM MMMMM MMMMMMMMMMM MMM MMM
MMMMMMMM MM MMMMM. MMMMM:MMMMMM: MMMM MMMMM
MMMM= MMMMMM MMM MMMM MMMMM MMMM MMMMMM MMMM MMMMM'
MMMM= MMMMM MMMM MM MMMMM MMMM MMMM MMMMNMMMMM
MMMM= MMMM MMMMM MMMMM MMMM MMMM MMMMMMMM
MMMM= MMMM MMMMMM MMMMM MMMM MMMM MMMMMMMMM
MMMM= MMMM MMMMM, NMMMMMMMM MMMM MMMM MMMMMMMMMMM
MMMM= MMMM MMMMMM MMMMMMMM MMMM MMMM MMMM MMMMMM
MMMM= MMMM MM MMMM MMMM MMMM MMMM MMMM MMMM
MMMM$ ,MMMMM MMMMM MMMM MMM MMMM MMMMM MMMM MMMM
MMMMMMM: MMMMMMM M MMMMMMMMMMMM MMMMMMM MMMMMMM
MMMMMM MMMMN M MMMMMMMMM MMMM MMMM
MMMM M MMMMMMM M M
M
---------------------------------------------------------------
For those about to rock... (%C, %R)
---------------------------------------------------------------
Now we can do a lot of things.
We can look for configuration files, password hashes, cryptographic keys, and digital certificates. We can analyze the binaries for vulnerabilities.
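As a starting point for that kind of hunt, the sketch below walks an extracted root file system and flags files that often deserve a closer look. The name and suffix lists are illustrative assumptions, and `find_interesting` is a hypothetical helper; a real audit would go well beyond file names.

```python
import os

# File names and suffixes that often deserve a closer look; an illustrative
# list, not an exhaustive one.
INTERESTING_NAMES = {"passwd", "shadow"}
INTERESTING_SUFFIXES = (".pem", ".key", ".crt", ".conf")
PEM_MARKER = b"-----BEGIN "


def find_interesting(root: str):
    """Yield (relative_path, reason) for files worth reviewing under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            # Skip symlinks (they may point outside the extracted tree)
            # and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if name in INTERESTING_NAMES:
                yield rel, "account database"
            elif name.endswith(INTERESTING_SUFFIXES):
                yield rel, "config, key or certificate by extension"
            else:
                with open(path, "rb") as f:
                    if PEM_MARKER in f.read(4096):
                        yield rel, "PEM block in file contents"


if __name__ == "__main__":
    import sys
    # Usage: python find_interesting.py _archer-c7.bin.extracted/squashfs-root
    for rel, reason in find_interesting(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{rel}: {reason}")
```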
Using QEMU and chroot, we can even run (emulate) an executable from the image:
$ ls
bin dev etc lib mnt overlay proc rom root sbin sys tmp usr var www
$ cp /usr/bin/qemu-mips-static .
$ sudo chroot . ./qemu-mips-static bin/busybox
BusyBox v1.19.4 (2019-05-20 18:13:49 CST) multi-call binary.
Copyright (C) 1998-2011 Erik Andersen, Rob Landley, Denys Vlasenko
and others. Licensed under GPLv2.
See source distribution for full notice.
Usage: busybox [function] [arguments]...
or: busybox --list[-full]
or: function [arguments]...
BusyBox is a multi-call binary that combines many common Unix
utilities into a single executable. Most people will create a
link to busybox for each function they wish to use and BusyBox
will act like whatever it was invoked as.
Currently defined functions:
[, [[, addgroup, adduser, arping, ash, awk, basename, cat, chgrp, chmod, chown, chroot, clear, cmp, cp, crond, crontab, cut, date, dd, delgroup, deluser, dirname, dmesg, echo, egrep, env, expr, false,
fgrep, find, free, fsync, grep, gunzip, gzip, halt, head, hexdump, hostid, id, ifconfig, init, insmod, kill, killall, klogd, ln, lock, logger, ls, lsmod, mac_addr, md5sum, mkdir, mkfifo, mknod, mktemp,
mount, mv, nice, passwd, pgrep, pidof, ping, ping6, pivot_root, poweroff, printf, ps, pwd, readlink, reboot, reset, rm, rmdir, rmmod, route, sed, seq, sh, sleep, sort, start-stop-daemon, strings,
switch_root, sync, sysctl, tail, tar, tee, telnet, test, tftp, time, top, touch, tr, traceroute, true, udhcpc, umount, uname, uniq, uptime, vconfig, vi, watchdog, wc, wget, which, xargs, yes, zcat
Great! But note that the BusyBox version is 1.19.4. This is a very old version of BusyBox, released in April 2012.
So TP-Link releases a firmware image in 2019 using software (GCC toolchain, kernel, BusyBox, etc.) from 2012!
Now do you understand why I always install OpenWRT on my routers?
That's not all
Binwalk can also perform entropy analysis, print raw entropy data, and generate entropy plots. Typically, higher entropy is observed when the bytes in an image are random, which may mean that the image contains an encrypted, compressed, or obfuscated file. A hardcoded encryption key? Why not.
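To show what such an analysis measures, here is a minimal sketch of block-wise Shannon entropy. The values are normalized to the 0 to 1 range to match the scale of the --high and --low thresholds in the help output; the block size and function names are assumptions, and binwalk's own implementation may differ in detail.

```python
import math
import os
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of `block`, normalized to the range 0.0 .. 1.0."""
    if not block:
        return 0.0
    total = len(block)
    bits = -sum((n / total) * math.log2(n / total) for n in Counter(block).values())
    return bits / 8  # 8 bits per byte is the maximum


def entropy_profile(data: bytes, block_size: int = 1024):
    """Return (offset, entropy) for each consecutive block of `data`."""
    return [
        (offset, shannon_entropy(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]


if __name__ == "__main__":
    # Low-entropy padding followed by random bytes (a stand-in for
    # compressed or encrypted data).
    sample = b"\x00" * 4096 + os.urandom(4096)
    for offset, value in entropy_profile(sample):
        print(f"0x{offset:06X}  {value:.3f}")
```

Blocks of padding score close to 0, while compressed or encrypted regions approach 1, which is what makes the boundaries between firmware sections stand out on an entropy plot.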

We can also use the --raw option to search for a custom sequence of raw bytes in an image, or the --hexdump option to produce a hex dump that compares two or more input files.
Custom signatures can be added to binwalk either through a custom signature file specified on the command line with the --magic option, or by adding them to the $HOME/.config/binwalk/magic directory.
You can find more information about binwalk in the project's documentation.
Extending binwalk
Binwalk is implemented as a Python module that can be used by any Python script to perform a binwalk scan programmatically. The binwalk command line utility can be almost completely duplicated with just two lines of Python code!
import binwalk
binwalk.scan()
With the help of the Python API, you can also create plugins to customize and extend binwalk.
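As a slightly longer illustration of the Python API, the sketch below runs a signature scan on a file and collects the results as text. It assumes the binwalk 2.x Python module, where each scan result exposes offset and description attributes; `format_results` and `scan_firmware` are hypothetical helper names.

```python
try:
    import binwalk  # third-party module; only needed by scan_firmware()
except ImportError:
    binwalk = None


def format_results(results):
    """Format objects that have .offset and .description as report lines."""
    return [f"0x{r.offset:08X}  {r.description}" for r in results]


def scan_firmware(path):
    """Run a quiet signature scan on `path` and return formatted result lines."""
    if binwalk is None:
        raise RuntimeError("the binwalk Python module is not installed")
    lines = []
    # signature=True and quiet=True mirror the --signature and --quiet options.
    for module in binwalk.scan(path, signature=True, quiet=True):
        lines.extend(format_results(module.results))
    return lines
```

Calling `scan_firmware("archer-c7.bin")` should then return roughly the same offsets and descriptions as the command line scan shown earlier.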
There is also an IDA plugin and a cloud version.
So why don't you download the firmware image from the internet and try binwalk? I promise you will have a lot of fun 🙂
Source: habr.com
