(Almost) useless webcam streaming from a browser. Media Stream and Websockets

In this article, I want to share my attempts to stream video via websockets without using third-party browser plugins like Adobe Flash Player. What came of it, read on.

Adobe Flash - formerly Macromedia Flash, is a platform for creating applications that run in a web browser. Prior to the introduction of the Media Stream API, it was practically the only platform for streaming video and voice from a webcam, as well as for creating various kinds of conferences and chats in the browser. The protocol for transferring media information RTMP (Real Time Messaging Protocol) was actually closed for a long time, which meant: if you want to raise your streaming service, please use the software from Adobe themselves - Adobe Media Server (AMS).

After some time in 2012, Adobe "gave up and spit out" to the public specification RTMP protocol, which contained errors and was essentially incomplete. By that time, the developers began to make their own implementations of this protocol, so the Wowza server appeared. In 2011, Adobe filed a lawsuit against Wowza for illegal use of patents related to RTMP, after 4 years the conflict was resolved by the world.

Adobe Flash platform is more than 20 years old, during this time many critical vulnerabilities have been discovered, support promised stop by 2020, so there aren't many alternatives for the streaming service.

For my project, I immediately decided to completely abandon the use of Flash in the browser. I indicated the main reason above, also Flash is not supported at all on mobile platforms, and I really didn’t want to deploy Adobe Flash for development on windows (wine emulator). Therefore, I undertook to write a client in JavaScript. This will be just a prototype, since later I learned that streaming can be done much more efficiently based on p2p, only for me it will be peer - server - peers, but more on that another time, because it's not ready yet.

To get started, we need the actual websockets server. I made the simplest one based on the melody go-package:

Server side code

package main

import (
	"errors"
	"github.com/go-chi/chi"
	"gopkg.in/olahol/melody.v1"
	"log"
	"net/http"
	"time"
)

func main() {
	r := chi.NewRouter()
	m := melody.New()

	m.Config.MaxMessageSize = 204800

	r.Get("/", func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, "public/index.html")
	})
	r.Get("/ws", func(w http.ResponseWriter, r *http.Request) {
		m.HandleRequest(w, r)
	})

         // Бродкастим Π²ΠΈΠ΄Π΅ΠΎ ΠΏΠΎΡ‚ΠΎΠΊ 
	m.HandleMessageBinary(func(s *melody.Session, msg []byte) {
		m.BroadcastBinary(msg)
	})

	log.Println("Starting server...")

	http.ListenAndServe(":3000", r)
}

On the client (transmitting side), you first need to access the camera. This is done through MediaStream API.

We get access (permission) to the camera / microphone through Media Devices API. This API provides a method MediaDevices.getUserMedia(), which shows the pop. a window that asks the user for permission to access the camera and/or microphone. I would like to note that I conducted all the experiments in Google Chrome, but I think everything will work in Firefox in about the same way.

Next, getUserMedia() returns a Promise, into which it passes a MediaStream object β€” a video-audio data stream. We assign this object to the src property of the video element. Code:

Broadcasting side

<style>
  #videoObjectHtml5ApiServer { width: 320px; height: 240px; background: #666; }
</style>
</head>
<body>
<!-- Π—Π΄Π΅ΡΡŒ Π² этом "ΠΎΠΊΠΎΡˆΠ΅Ρ‡ΠΊΠ΅" ΠΊΠ»ΠΈΠ΅Π½Ρ‚ Π±ΡƒΠ΄Π΅Ρ‚ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя -->
<video autoplay id="videoObjectHtml5ApiServer"></video>

<script type="application/javascript">
  var
        video = document.getElementById('videoObjectHtml5ApiServer');

// Ссли доступСн MediaDevices API, пытаСмся ΠΏΠΎΠ»ΡƒΡ‡ΠΈΡ‚ΡŒ доступ ΠΊ ΠΊΠ°ΠΌΠ΅Ρ€Π΅ (ΠΌΠΎΠΆΠ½ΠΎ Π΅Ρ‰Π΅ ΠΈ ΠΊ ΠΌΠΈΠΊΡ€ΠΎΡ„ΠΎΠ½Ρƒ)
// getUserMedia Π²Π΅Ρ€Π½Π΅Ρ‚ ΠΎΠ±Π΅Ρ‰Π°Π½ΠΈΠ΅, Π½Π° ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠ΅ подписываСмся ΠΈ ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ Π² ΠΊΠΎΠ»Π±Π΅ΠΊΠ΅ направляСм Π² video ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ Π½Π° страницС

if (navigator.mediaDevices.getUserMedia) {
        navigator.mediaDevices.getUserMedia({video: true}).then(function (stream) {
          // Π²ΠΈΠ΄Π΅ΠΎ ΠΏΠΎΡ‚ΠΎΠΊ привязываСм ΠΊ video Ρ‚Π΅Π³Ρƒ, Ρ‡Ρ‚ΠΎΠ±Ρ‹ ΠΊΠ»ΠΈΠ΅Π½Ρ‚ ΠΌΠΎΠ³ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя ΠΈ ΠΊΠΎΠ½Ρ‚Ρ€ΠΎΠ»ΠΈΡ€ΠΎΠ²Π°Ρ‚ΡŒ 
          video.srcObject = stream;
        });
}
</script>

To broadcast a video stream through sockets, it is necessary to encode it somewhere, buffer it, and transfer it in parts. The raw video stream cannot be transmitted via websockets. Here comes to our aid Media Recorder API. This API allows you to encode and split the stream into pieces. I do encoding to compress the video stream in order to drive fewer bytes over the network. Having broken into pieces, you can send each piece to a websocket. Code:

We encode the video stream, we beat it into parts

<style>
  #videoObjectHtml5ApiServer { width: 320px; height: 240px; background: #666; }
</style>
</head>
<body>
<!-- Π—Π΄Π΅ΡΡŒ Π² этом "ΠΎΠΊΠΎΡˆΠ΅Ρ‡ΠΊΠ΅" ΠΊΠ»ΠΈΠ΅Π½Ρ‚ Π±ΡƒΠ΄Π΅Ρ‚ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя -->
<video autoplay id="videoObjectHtml5ApiServer"></video>

<script type="application/javascript">
  var
        video = document.getElementById('videoObjectHtml5ApiServer');

// Ссли доступСн MediaDevices API, пытаСмся ΠΏΠΎΠ»ΡƒΡ‡ΠΈΡ‚ΡŒ доступ ΠΊ ΠΊΠ°ΠΌΠ΅Ρ€Π΅ (ΠΌΠΎΠΆΠ½ΠΎ Π΅Ρ‰Π΅ ΠΈ ΠΊ ΠΌΠΈΠΊΡ€ΠΎΡ„ΠΎΠ½Ρƒ)
// getUserMedia Π²Π΅Ρ€Π½Π΅Ρ‚ ΠΎΠ±Π΅Ρ‰Π°Π½ΠΈΠ΅, Π½Π° ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠ΅ подписываСмся ΠΈ ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ Π² ΠΊΠΎΠ»Π±Π΅ΠΊΠ΅ направляСм Π² video ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ Π½Π° страницС

if (navigator.mediaDevices.getUserMedia) {
        navigator.mediaDevices.getUserMedia({video: true}).then(function (stream) {
          // Π²ΠΈΠ΄Π΅ΠΎ ΠΏΠΎΡ‚ΠΎΠΊ привязываСм ΠΊ video Ρ‚Π΅Π³Ρƒ, Ρ‡Ρ‚ΠΎΠ±Ρ‹ ΠΊΠ»ΠΈΠ΅Π½Ρ‚ ΠΌΠΎΠ³ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя ΠΈ ΠΊΠΎΠ½Ρ‚Ρ€ΠΎΠ»ΠΈΡ€ΠΎΠ²Π°Ρ‚ΡŒ 
          video.srcObject = s;
          var
            recorderOptions = {
                mimeType: 'video/webm; codecs=vp8' // Π±ΡƒΠ΄Π΅ΠΌ ΠΊΠΎΠ΄ΠΈΡ€ΠΎΠ²Π°Ρ‚ΡŒ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ Π² Ρ„ΠΎΡ€ΠΌΠ°Ρ‚ webm ΠΊΠΎΠ΄Π΅ΠΊΠΎΠΌ vp8
              },
              mediaRecorder = new MediaRecorder(s, recorderOptions ); // ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ MediaRecorder

               mediaRecorder.ondataavailable = function(e) {
                if (e.data && e.data.size > 0) {
                  // ΠΏΠΎΠ»ΡƒΡ‡Π°Π΅ΠΌ кусочСк Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊΠ° Π² e.data
                }
            }

            mediaRecorder.start(100); // Π΄Π΅Π»ΠΈΡ‚ ΠΏΠΎΡ‚ΠΎΠΊ Π½Π° кусочки ΠΏΠΎ 100 мс ΠΊΠ°ΠΆΠ΄Ρ‹ΠΉ

        });
}
</script>

Now let's add websockets transmission. Surprisingly, all you need is an object WebSocket. It has only two send and close methods. The names speak for themselves. Added code:

Sending a video stream to the server

<style>
  #videoObjectHtml5ApiServer { width: 320px; height: 240px; background: #666; }
</style>
</head>
<body>
<!-- Π—Π΄Π΅ΡΡŒ Π² этом "ΠΎΠΊΠΎΡˆΠ΅Ρ‡ΠΊΠ΅" ΠΊΠ»ΠΈΠ΅Π½Ρ‚ Π±ΡƒΠ΄Π΅Ρ‚ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя -->
<video autoplay id="videoObjectHtml5ApiServer"></video>

<script type="application/javascript">
  var
        video = document.getElementById('videoObjectHtml5ApiServer');

// Ссли доступСн MediaDevices API, пытаСмся ΠΏΠΎΠ»ΡƒΡ‡ΠΈΡ‚ΡŒ доступ ΠΊ ΠΊΠ°ΠΌΠ΅Ρ€Π΅ (ΠΌΠΎΠΆΠ½ΠΎ Π΅Ρ‰Π΅ ΠΈ ΠΊ ΠΌΠΈΠΊΡ€ΠΎΡ„ΠΎΠ½Ρƒ)
// getUserMedia Π²Π΅Ρ€Π½Π΅Ρ‚ ΠΎΠ±Π΅Ρ‰Π°Π½ΠΈΠ΅, Π½Π° ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠ΅ подписываСмся ΠΈ ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ Π² ΠΊΠΎΠ»Π±Π΅ΠΊΠ΅ направляСм Π² video ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ Π½Π° страницС

if (navigator.mediaDevices.getUserMedia) {
        navigator.mediaDevices.getUserMedia({video: true}).then(function (stream) {
          // Π²ΠΈΠ΄Π΅ΠΎ ΠΏΠΎΡ‚ΠΎΠΊ привязываСм ΠΊ video Ρ‚Π΅Π³Ρƒ, Ρ‡Ρ‚ΠΎΠ±Ρ‹ ΠΊΠ»ΠΈΠ΅Π½Ρ‚ ΠΌΠΎΠ³ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ сСбя ΠΈ ΠΊΠΎΠ½Ρ‚Ρ€ΠΎΠ»ΠΈΡ€ΠΎΠ²Π°Ρ‚ΡŒ 
          video.srcObject = s;
          var
            recorderOptions = {
                mimeType: 'video/webm; codecs=vp8' // Π±ΡƒΠ΄Π΅ΠΌ ΠΊΠΎΠ΄ΠΈΡ€ΠΎΠ²Π°Ρ‚ΡŒ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ Π² Ρ„ΠΎΡ€ΠΌΠ°Ρ‚ webm ΠΊΠΎΠ΄Π΅ΠΊΠΎΠΌ vp8
              },
              mediaRecorder = new MediaRecorder(s, recorderOptions ), // ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ MediaRecorder
              socket = new WebSocket('ws://127.0.0.1:3000/ws');

               mediaRecorder.ondataavailable = function(e) {
                if (e.data && e.data.size > 0) {
                  // ΠΏΠΎΠ»ΡƒΡ‡Π°Π΅ΠΌ кусочСк Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊΠ° Π² e.data
                 socket.send(e.data);
                }
            }

            mediaRecorder.start(100); // Π΄Π΅Π»ΠΈΡ‚ ΠΏΠΎΡ‚ΠΎΠΊ Π½Π° кусочки ΠΏΠΎ 100 мс ΠΊΠ°ΠΆΠ΄Ρ‹ΠΉ

        }).catch(function (err) { console.log(err); });
}
</script>

The broadcast side is ready! Now let's try to receive a video stream and display it on the client. What do we need for this? First, of course, the socket connection. We hang a "listener" (listener) on the WebSocket object, subscribe to the 'message' event. Having received a piece of binary data, our server broadcasts to its subscribers, that is, clients. At the same time, the callback function associated with the β€œlistener” of the 'message' event is triggered on the client, the object itself is passed to the function argument - a piece of the video stream encoded by vp8.

Receive video stream

<style>
  #videoObjectHtml5ApiServer { width: 320px; height: 240px; background: #666; }
</style>
</head>
<body>
<!-- Π—Π΄Π΅ΡΡŒ Π² этом "ΠΎΠΊΠΎΡˆΠ΅Ρ‡ΠΊΠ΅" ΠΊΠ»ΠΈΠ΅Π½Ρ‚ Π±ΡƒΠ΄Π΅Ρ‚ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ тСбя -->
<video autoplay id="videoObjectHtml5ApiServer"></video>

<script type="application/javascript">
  var
        video = document.getElementById('videoObjectHtml5ApiServer'),
         socket = new WebSocket('ws://127.0.0.1:3000/ws'), 
         arrayOfBlobs = [];

         socket.addEventListener('message', function (event) {
                // "ΠΊΠ»Π°Π΄Π΅ΠΌ" ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ кусочСк Π² массив 
                arrayOfBlobs.push(event.data);
                // здСсь Π±ΡƒΠ΄Π΅ΠΌ Ρ‡ΠΈΡ‚Π°Ρ‚ΡŒ кусочки
                readChunk();
            });
</script>

For a long time I tried to understand why it is impossible to immediately send the received pieces for playback to the video element, but it turned out that of course you can’t do this, you must first put the piece in a special buffer attached to the video element, and only then will the video stream start playing. This will require MediaSource API ΠΈ FileReader API.

MediaSource acts as an intermediary between the media playback object and the source of this media stream. The MediaSource object contains a pluggable buffer for the source of the video/audio stream. One quirk is that a buffer can only contain data of type Uint8, so a FileReader is required to create such a buffer. Look at the code and it will become clearer:

Playing a video stream

<style>
  #videoObjectHtml5ApiServer { width: 320px; height: 240px; background: #666; }
</style>
</head>
<body>
<!-- Π—Π΄Π΅ΡΡŒ Π² этом "ΠΎΠΊΠΎΡˆΠ΅Ρ‡ΠΊΠ΅" ΠΊΠ»ΠΈΠ΅Π½Ρ‚ Π±ΡƒΠ΄Π΅Ρ‚ Π²ΠΈΠ΄Π΅Ρ‚ΡŒ тСбя -->
<video autoplay id="videoObjectHtml5ApiServer"></video>

<script type="application/javascript">
  var
        video = document.getElementById('videoObjectHtml5ApiServer'),
         socket = new WebSocket('ws://127.0.0.1:3000/ws'),
        mediaSource = new MediaSource(), // ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ MediaSource
        vid2url = URL.createObjectURL(mediaSource), // создаСм ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ URL для связывания Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊΠ° с ΠΏΡ€ΠΎΠΈΠ³Ρ€Ρ‹Π²Π°Ρ‚Π΅Π»Π΅ΠΌ
        arrayOfBlobs = [],
        sourceBuffer = null; // Π±ΡƒΡ„Π΅Ρ€, ΠΏΠΎΠΊΠ° Π½ΡƒΠ»ΡŒ-ΠΎΠ±ΡŠΠ΅ΠΊΡ‚

         socket.addEventListener('message', function (event) {
                // "ΠΊΠ»Π°Π΄Π΅ΠΌ" ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ кусочСк Π² массив 
                arrayOfBlobs.push(event.data);
                // здСсь Π±ΡƒΠ΄Π΅ΠΌ Ρ‡ΠΈΡ‚Π°Ρ‚ΡŒ кусочки
                readChunk();
            });

         // ΠΊΠ°ΠΊ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ MediaSource Π±ΡƒΠ΄Π΅Ρ‚ ΠΎΠΏΠΎΠ²Π΅Ρ‰Π΅Π½ , Ρ‡Ρ‚ΠΎ источник Π³ΠΎΡ‚ΠΎΠ² ΠΎΡ‚Π΄Π°Π²Π°Ρ‚ΡŒ кусочки 
        // Π²ΠΈΠ΄Π΅ΠΎ/Π°ΡƒΠ΄ΠΈΠΎ ΠΏΠΎΡ‚ΠΎΠΊΠ°
        // создаСм Π±ΡƒΡ„Π΅Ρ€ , слСдуСт ΠΎΠ±Ρ€Π°Ρ‚ΠΈΡ‚ΡŒ Π²Π½ΠΈΠΌΠ°Π½ΠΈΠ΅, Ρ‡Ρ‚ΠΎ Π±ΡƒΡ„Π΅Ρ€ Π΄ΠΎΠ»ΠΆΠ΅Π½ Π·Π½Π°Ρ‚ΡŒ Π² ΠΊΠ°ΠΊΠΎΠΌ Ρ„ΠΎΡ€ΠΌΠ°Ρ‚Π΅ 
        // ΠΊΠ°ΠΊΠΈΠΌ ΠΊΠΎΠ΄Π΅ΠΊΠΎΠΌ Π±Ρ‹Π» Π·Π°ΠΊΠΎΠ΄ΠΈΡ€ΠΎΠ²Π°Π½ ΠΏΠΎΡ‚ΠΎΠΊ, Ρ‡Ρ‚ΠΎΠ±Ρ‹ Ρ‚Π΅ΠΌ ΠΆΠ΅ способом ΠΏΡ€ΠΎΡ‡ΠΈΡ‚Π°Ρ‚ΡŒ Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊ
         mediaSource.addEventListener('sourceopen', function() {
            var mediaSource = this;
            sourceBuffer = mediaSource.addSourceBuffer("video/webm; codecs="vp8"");
        });

      function readChunk() {
        var reader = new FileReader();
        reader.onload = function(e) { 
          // ΠΊΠ°ΠΊ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ FileReader Π±ΡƒΠ΄Π΅Ρ‚ Π³ΠΎΡ‚ΠΎΠ², ΠΈ Π·Π°Π³Ρ€ΡƒΠ·ΠΈΡ‚ сСбС кусочСк Π²ΠΈΠ΄Π΅ΠΎΠΏΠΎΡ‚ΠΎΠΊΠ°
          // ΠΌΡ‹ "прицСпляСм" ΠΏΠ΅Ρ€Π΅ΠΊΠΎΠ΄ΠΈΡ€ΠΎΠ²Π°Π½Π½Ρ‹ΠΉ Π² Uint8Array (Π±Ρ‹Π» Blob) кусочСк Π² Π±ΡƒΡ„Π΅Ρ€, связанный
          // с ΠΏΡ€ΠΎΠΈΠ³Ρ€Ρ‹Π²Π°Ρ‚Π΅Π»Π΅ΠΌ, ΠΈ ΠΏΡ€ΠΎΠΈΠ³Ρ€Ρ‹Π²Π°Ρ‚Π΅Π»ΡŒ Π½Π°Ρ‡ΠΈΠ½Π°Π΅Ρ‚ Π²ΠΎΡΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚ΡŒ ΠΏΠΎΠ»ΡƒΡ‡Π΅Π½Π½Ρ‹ΠΉ кусочСк Π²ΠΈΠ΄Π΅ΠΎ/Π°ΡƒΠ΄ΠΈΠΎ
          sourceBuffer.appendBuffer(new Uint8Array(e.target.result));

          reader.onload = null;
        }
        reader.readAsArrayBuffer(arrayOfBlobs.shift());
      }
</script>

The streaming service prototype is ready. The main disadvantage is that video playback will lag behind the transmitting side by 100 ms, we set this ourselves when splitting the video stream before transferring it to the server. Moreover, when I checked on my laptop, I gradually accumulated a lag between the transmitting and receiving sides, it was clearly visible. I started looking for ways to overcome this shortcoming, and ... came across RTCPeerConnection API, which allows you to transfer a video stream without tricks such as splitting the stream into pieces. The cumulative lag, I think, is due to the fact that the browser recodes each piece into the webm format before transmission. I no longer began to dig further, but began to study WebRTC, I think about the results of my research, I will write a separate article if I find it interesting to the community.

Source: habr.com

Add a comment