About six months ago I decided to make a program to control a computer through a browser. I started with a simple one-socket HTTP server that passed pictures to the browser and received cursor coordinates for control.
At some point I realized that WebRTC is well suited for this purpose. Chrome already has such a solution, which is installed as an extension. But I wanted to make a lightweight program that works without installation.
At first I tried to use the library provided by Google, but once compiled it takes up 500 MB. So I had to implement the entire WebRTC stack almost from scratch, and I managed to fit everything into a 2.5 MB exe file. A friend helped with the interface in JS. Here is what we ended up with.
We start the program:
Open the link in a browser tab and get full access to the desktop:
A small animation of the connection establishment process:
Chrome, Firefox, Safari, and Opera are supported.
The program can transmit sound, make audio calls, manage the clipboard, transfer files, and send hotkeys.
While working on the program I had to study a dozen RFCs, and I realized that there is not enough information online about how the WebRTC protocol works. I want to write an article about the technologies it uses, and I would like to know which of the following topics interest the community:
- SDP, the Session Description Protocol used to describe media streams
- ICE candidates and connection establishment between two points, STUN and TURN servers
- DTLS connection and passing its keys to the RTP session
- RTP and RTCP protocols with encryption for media data transmission
- Transferring H264, VP8, and Opus over RTP
- SCTP connection for binary data transfer
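
To give a taste of what the RTP topic in the list above involves, here is a minimal sketch in Python of parsing the 12-byte fixed RTP header as laid out in RFC 3550. It is not code from the program itself: the names `RtpHeader` and `parse_rtp_header` and the sample packet are made up for illustration, and header extensions, CSRC lists, and encryption are ignored.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Fields of the fixed RTP header (illustrative type, not from the program)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    first, second, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,               # top 2 bits
        padding=bool(first & 0x20),       # P bit
        extension=bool(first & 0x10),     # X bit
        csrc_count=first & 0x0F,          # low 4 bits
        marker=bool(second & 0x80),       # M bit
        payload_type=second & 0x7F,       # low 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hypothetical sample packet: version 2, marker set, payload type 96
# (a value from the dynamic range), followed by a dummy payload.
sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF) + b"payload"
header = parse_rtp_header(sample)
```

A real receiver would go on to skip any CSRC entries and header extension before reaching the codec payload, and in WebRTC the payload is encrypted, so this covers only the very first step.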
Source: habr.com