Open source cloud gaming on WebRTC: p2p, multiplayer, zero latency

Software as a Service, Infrastructure as a Service, Platform as a Service, Communications Platform as a Service, Video Conferencing as a Service, and what about Cloud Gaming as a Service? There have already been several attempts at cloud gaming, such as Stadia, recently launched by Google. Stadia is no stranger to WebRTC, but can others use WebRTC in the same way?

Thanh Nguyen decided to test this possibility in his open source project, CloudRetro. CloudRetro is based on Pion, a popular WebRTC library written in Go (thanks to Shownu from the Pion development team for help with this article). In this article, Thanh gives an overview of his project's architecture and talks about what he learned and the challenges he encountered along the way.

Introduction

Last year, when Google announced Stadia, I was blown away. The idea is so unique and innovative that I kept wondering how it was even possible with existing technology. The desire to understand the topic better prompted me to build my own open source version of cloud gaming. The result was just fantastic. Below I would like to share what I learned while working on this year-long project.

TLDR: short slide version with highlights

Why cloud gaming is the future

I believe that cloud gaming will soon become the next generation not only of games but also of other areas of computer science. Cloud gaming is the pinnacle of the client/server model. This model maximizes what the backend controls and minimizes frontend work by hosting game logic on a remote server and streaming images/audio to the client. The server does the heavy processing, so the client is no longer limited by its hardware.

Google Stadia essentially lets you play AAA games (i.e. high-end blockbuster games) through an interface like YouTube. The same methodology can be applied to other heavy offline applications, such as an operating system or 2D/3D graphic design tools, so that they run reliably on low-spec devices across platforms.

The future of this technology: imagine if Microsoft Windows 10 ran in the Chrome browser?

Cloud gaming is technically difficult

Gaming is one of those rare areas that demand constant, fast feedback to the user. If we occasionally wait 2 seconds after clicking on a page, that is acceptable. Live video streams tend to lag a few seconds behind but are still perfectly usable. However, if a game frequently lags by 500 ms, it is simply unplayable. Our goal is extremely low latency, so that the gap between input and media is as small as possible. Therefore, the traditional approach to video streaming does not apply here.

General cloud gaming pattern

Open source project CloudRetro

I decided to build a proof of concept of cloud gaming to see whether all this is possible under such tight network constraints. I chose Golang for the proof of concept because it is the language I am most familiar with and, as I later found out, a good fit for this implementation for many other reasons. Go is simple and fast to develop in, and its channels are great for managing concurrency.

CloudRetro.io is an open source cloud gaming service for retro games. The goal of the project is to bring the most comfortable gaming experience to traditional retro games and to add multiplayer.
You can find out more about the project here: https://github.com/giongto35/cloud-game.

CloudRetro functionality

CloudRetro uses retro games to showcase the power of cloud gaming. This enables many unique gaming experiences.

  • Game Portability
    • Instant play when the page opens; no download or installation required
    • Runs in a mobile browser, so no additional software is needed

  • Game sessions can be shared across multiple devices and stored in the cloud for the next login
  • The game can be streamed, or several users can play it at once:
    • Crowdplay like TwitchPlayPokemon, only more cross-platform and closer to real time
    • Offline games played online. Many users can play without any network setup. Samurai Shodown can now be played by 2 players over the CloudRetro network

    Demo version of online multiplayer game on different devices

    Infrastructure

    Requirements and technology stack

    Below is a list of requirements that I set before starting the project.

    1. Single player
    This requirement may seem neither important nor obvious here, but it is one of my key takeaways: it sets cloud gaming as far apart from traditional streaming services as possible. If we focus on single-player games, we can get rid of the centralized server or CDN because we don't have to stream to the masses. Instead of uploading streams to an ingest server or passing packets to a centralized WebSocket server, the service streams directly to the user over a WebRTC peer connection.

    2. Low latency media stream
    When reading about Stadia, I often saw WebRTC mentioned in articles. I realized that WebRTC is an outstanding technology and a great fit for cloud gaming. WebRTC is a project that provides web browsers and mobile applications with real-time communication through a simple API. It provides peer-to-peer connectivity, is optimized for media, and has built-in standard codecs such as VP8 and H264.

    I prioritized the best possible user experience over high-quality graphics, so the algorithm is allowed to be lossy. Google Stadia has an additional step that reduces the image size on the server, and the frames are upscaled to a higher quality before being passed to the peers.

    3. Distributed infrastructure with geographic routing
    No matter how well optimized the compression algorithm and code are, the network is still the deciding factor that contributes the most to latency. The architecture needs a mechanism for pairing each user with the closest server to reduce round-trip time (RTT). It should have one coordinator and multiple streaming servers distributed around the world: US West, US East, Europe, Singapore, China. All streaming servers must be completely isolated from one another. The system can adjust its distribution when a server joins or leaves the network. Thus, under high traffic, adding more servers allows horizontal scaling.

    4. Browser compatibility
    Cloud gaming is at its best when it requires the bare minimum from users. This means it should run in the browser. Browsers make the gaming experience as comfortable as possible by sparing users from installing software or buying hardware. Browsers also make the service cross-platform across mobile and desktop. Fortunately, WebRTC is well supported in various browsers.

    5. Clear separation of the game interface and service
    I view the cloud gaming service as a platform. Everyone should be able to connect anything to the platform. For now I have integrated LibRetro with the cloud gaming service because LibRetro offers a beautiful game emulator interface for retro games such as SNES, GBA, PS.

    6. Rooms for multiplayer, crowd play and external linking (deep-link) with the game
    CloudRetro supports new gameplay modes such as CrowdPlay and Online MultiPlayer for retro games. If several users open the same deep link on different computers, they will see the same game running and can even join it.

    Moreover, game states are stored in the cloud storage. This allows users to continue the game at any time on any other device.

    7. Horizontal scaling
    Like any SaaS nowadays, cloud gaming must be designed to be horizontally scalable. The coordinator-worker design allows you to add more workers to serve more traffic.

    8. Not tied to one cloud
    The CloudRetro infrastructure is hosted on different cloud providers (Digital Ocean, Alibaba, custom provider) for different regions. To avoid depending on a single cloud provider, I run the infrastructure in Docker containers and configure network settings with a bash script. Combined with NAT traversal in WebRTC, this gives us the flexibility to deploy CloudRetro on any cloud platform and even on any user's machine.

    Architectural design

    Worker: (or the streaming server mentioned above) spawns the games, runs the encoding pipeline, and streams the encoded media to the users. Worker instances are distributed all over the world, and each worker can handle multiple user sessions at the same time.

    Coordinator: responsible for pairing each new user with the most suitable streaming worker. The coordinator communicates with the workers via WebSocket.

    Game state storage: central remote storage for all game states. This storage provides important features such as remote save/load.

    CloudRetro top-level architecture

    User flow

    When a new user opens CloudRetro (steps 1 and 2 in the figure below), the first page is requested from the coordinator along with a list of available workers. After that, in step 3, the client measures the latency to every candidate using an HTTP ping request. This list of latencies is then sent back to the coordinator so that it can determine the most appropriate worker to serve the user. In step 4, the game is created: a WebRTC streaming connection is established between the user and the assigned worker.
    User flow after opening the site

    What's inside the worker

    The game and streaming pipelines are isolated inside the worker and exchange information through an interface. Currently, this communication is done by passing data in memory over Golang channels within the same process. The next goal is segregation, i.e. running the game independently in another process.

    Interaction of worker components

    Main components:

    • WebRTC: the client-facing component that accepts user input and outputs the encoded media from the server.
    • Game emulator: the game component. Thanks to the Libretro library, the system is able to run the game inside the same process and intercept the media and input streams internally.
    • In-game frames are captured and sent to the encoder.
    • Image/audio encoder: an encoding pipeline that receives media frames, encodes them in the background, and outputs encoded images/audio.

    Implementation

    CloudRetro relies on WebRTC as a backbone technology, so before diving into the details of the Golang implementation, I decided to talk about WebRTC itself. It's an amazing piece of technology that has helped me immensely in achieving sub-second latency streaming.

    WebRTC

    WebRTC is designed to provide high-quality peer-to-peer connections in native mobile apps and browsers using simple APIs.

    NAT Traversal

    WebRTC is known for its NAT traversal functionality. WebRTC is designed for peer-to-peer communication. Its goal is to find the most suitable direct route between peers, getting around NAT gateways and firewalls, through a process called ICE. As part of this process, the WebRTC APIs discover your public IP address using STUN servers and fall back to a relay server (TURN) when a direct connection cannot be established.

    However, CloudRetro does not fully exploit this capability. Its peer-to-peer connections do not exist between users, but between users and cloud servers. The server side of the model has fewer restrictions on direct communication than a typical user device. This allows you to pre-open incoming ports or use public IP addresses directly, since the server is not behind NAT.

    Previously, I wanted to turn the project into a game distribution platform for cloud gaming. The idea was to let game creators provide games and streaming resources, and users would interact with providers directly. In this decentralized model, CloudRetro is just a medium for connecting third-party streaming resources to users, which makes it more scalable because hosting no longer depends on it. WebRTC NAT traversal plays an important role here: it eases the setup of a peer-to-peer connection to third-party streaming resources, which makes it easier for creators to join the network.

    Video compression

    Video compression is an indispensable part of the pipeline and contributes greatly to the smoothness of the stream. Although it is not necessary to know all the details of VP8/H264 video encoding, understanding the concept helps to understand video streaming speed parameters, debug unexpected behavior, and adjust latency.

    Compressing video for a streaming service is challenging because the algorithm must ensure that the total encoding time + network transfer time + decoding time is as small as possible. In addition, the encoding process must be consistent and continuous. Some trade-offs in encoding do not apply - for example, we cannot prefer a long encoding time over a smaller file size and decoding time, or use inconsistent compression.

    The idea behind video compression is to eliminate unnecessary bits of information while maintaining an acceptable level of fidelity for users. In addition to encoding individual static image frames, the algorithm infers the current frame from the previous and next frames, so only the difference between them is sent. As the Pacman example shows, only the points that changed are transmitted.

    Comparison of video frames using Pacman as an example
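
The idea of sending only differences can be shown with a toy Go example. It is a deliberate simplification: real codecs such as VP8 and H264 use block-based motion prediction rather than per-pixel comparison, and the names diffFrames and applyDiff are invented for this sketch.

```go
package main

import "fmt"

// pixelChange records a pixel whose value differs from the previous frame.
type pixelChange struct {
	Index int
	Value byte
}

// diffFrames returns the pixels of cur that differ from prev.
// Both frames must have the same length.
func diffFrames(prev, cur []byte) []pixelChange {
	var changes []pixelChange
	for i := range cur {
		if cur[i] != prev[i] {
			changes = append(changes, pixelChange{Index: i, Value: cur[i]})
		}
	}
	return changes
}

// applyDiff rebuilds the current frame on the receiving side
// from the previous frame and the list of changes.
func applyDiff(prev []byte, changes []pixelChange) []byte {
	cur := append([]byte(nil), prev...) // copy, so prev stays untouched
	for _, c := range changes {
		cur[c.Index] = c.Value
	}
	return cur
}

func main() {
	// A one-row "frame" in which a single sprite (value 9) moves right.
	prev := []byte{0, 0, 9, 0, 0, 0}
	cur := []byte{0, 0, 0, 9, 0, 0}

	changes := diffFrames(prev, cur)
	fmt.Println(len(changes), "of", len(cur), "pixels sent") // 2 of 6 pixels sent
	fmt.Println(applyDiff(prev, changes))                    // [0 0 0 9 0 0]
}
```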

    Audio compression

    Similarly, the audio compression algorithm omits data that humans cannot perceive. Opus is currently the best-performing audio codec. It is designed to carry audio over an ordered datagram protocol such as RTP (Real-time Transport Protocol). Its latency is lower than that of MP3 and AAC, and its quality is higher. Latency is usually in the range of 5 to 66.5 ms.

    Pion, WebRTC in Golang

    Pion is an open source project that brings WebRTC to Golang. Rather than wrapping the native C++ WebRTC libraries, as is usually done, Pion is a native Golang implementation of WebRTC with better performance, better Go integration, and version control over the WebRTC protocols it implements.

    The library also enables sub-second latency streaming with many great built-in modules. It has its own implementations of STUN, DTLS, SCTP, etc., and some experiments with QUIC and WebAssembly. The library itself is a really good learning resource, with great documentation, network protocol implementations, and cool examples.

    The Pion community, led by a very passionate creator, is quite lively and has a lot of quality discussion about WebRTC. If you are interested in this technology, join http://pion.ly/slack - you will learn a lot of new things.

    Writing CloudRetro in Golang

    Worker Implementation in Go

    Go channels in action

    Thanks to the beautiful design of Go's channels, event streaming and concurrency become much simpler. As the diagram shows, several components run in parallel in different goroutines. Each component manages its own state and communicates through channels. Go's select statement ensures that exactly one event is processed atomically on each game tick. This means no locking is needed in this design. For example, when a user saves, a full snapshot of the game state is required. That state must stay consistent, with no input applied, until the save is complete. During each game tick, the backend handles either a save or an input operation, never both, which makes the process thread-safe.

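    // gameUpdate is the emulator's event loop. select handles exactly one
    // event per iteration, so a save and an input never touch the game
    // state at the same time.
    func (e *gameEmulator) gameUpdate() {
    	for {
    		select {
    		case <-e.saveOperation:
    			// Snapshot the state; no input is applied until this returns.
    			e.saveGameState()
    		case key := <-e.input:
    			e.updateGameState(key)
    		case <-e.done:
    			e.close()
    			return
    		}
    	}
    }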

    Fan-in / fan-out

    This Go pattern is a great fit for my CrowdPlay and multiplayer use cases. Following this pattern, all user inputs in the same room are fanned in to a central input channel. The game media is then fanned out to all users in the same room. In this way, the game state is shared between the game sessions of different users.

    Synchronization between different sessions
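
A minimal, self-contained sketch of this pattern is shown below. It is not taken from the CloudRetro source: the function names fanIn and fanOut, the string-typed inputs and frames, and the trivial "game loop" are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// fanIn merges the input channels of all players in a room into one channel.
// The merged channel is closed once every player channel has been closed.
func fanIn(players ...<-chan string) <-chan string {
	merged := make(chan string)
	var wg sync.WaitGroup
	for _, p := range players {
		wg.Add(1)
		go func(p <-chan string) {
			defer wg.Done()
			for key := range p {
				merged <- key
			}
		}(p)
	}
	go func() {
		wg.Wait()
		close(merged)
	}()
	return merged
}

// fanOut copies every frame to all viewers, then closes the viewer
// channels once the frame source is closed.
func fanOut(frames <-chan string, viewers ...chan<- string) {
	for f := range frames {
		for _, v := range viewers {
			v <- f
		}
	}
	for _, v := range viewers {
		close(v)
	}
}

func main() {
	// Two players in the same room press one key each.
	p1, p2 := make(chan string, 1), make(chan string, 1)
	p1 <- "A"
	p2 <- "B"
	close(p1)
	close(p2)

	// Hypothetical game loop: every input produces one frame.
	frames := make(chan string)
	go func() {
		defer close(frames)
		for key := range fanIn(p1, p2) {
			frames <- "frame after " + key
		}
	}()

	// Both viewers receive every frame of the shared game.
	v1, v2 := make(chan string, 2), make(chan string, 2)
	fanOut(frames, v1, v2)
	fmt.Println(len(v1), len(v2)) // 2 2
}
```

The order in which the two inputs arrive is not deterministic, which matches the crowd play scenario: whichever input reaches the merged channel first is applied first.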

    Disadvantages of Golang

    Golang is not perfect. Channels are slow. Compared to locking, a Go channel is simply an easier way to handle concurrent and streaming events, but it does not give the best performance. Under the hood, a channel has complex locking logic. Therefore, I made some adjustments to the implementation, replacing channels with locks and atomic values where needed to optimize performance.
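
As an illustration of what such a replacement can look like, the sketch below shares the latest controller state through an atomic value instead of a channel. It assumes that only the most recent state matters to the emulator; the type inputState and its methods are hypothetical and not taken from CloudRetro.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// inputState holds the latest controller state as a bitmask, one bit per
// button. Writers overwrite it; the emulator reads it once per game tick.
type inputState struct {
	keys uint32
}

// set publishes a new controller state without taking a lock.
func (s *inputState) set(keys uint32) { atomic.StoreUint32(&s.keys, keys) }

// get returns the most recently published controller state.
func (s *inputState) get() uint32 { return atomic.LoadUint32(&s.keys) }

func main() {
	var state inputState
	var wg sync.WaitGroup

	// Hypothetical network goroutines, each reporting a controller state.
	for i := uint32(1); i <= 4; i++ {
		wg.Add(1)
		go func(keys uint32) {
			defer wg.Done()
			state.set(keys)
		}(i)
	}
	wg.Wait()

	// Whichever write landed last wins; the read is always a whole value.
	got := state.get()
	fmt.Println(got >= 1 && got <= 4) // true
}
```

The trade-off is that intermediate states are dropped, so this only fits data where the latest value is enough. An event queue that must not lose items still needs a channel or a lock-protected buffer.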

    In addition, Golang's garbage collector offers little direct control, which sometimes causes suspiciously long pauses. This greatly interferes with a real-time streaming application.

    CGO

    The project uses existing open source Golang VP8/H264 libraries for media compression and Libretro for game emulators. All of these libraries are just Go wrappers around C libraries using CGO. Some of the disadvantages are listed in a post by Dave Cheney. The problems I faced:

    • the inability to catch a crash in CGO, even with Golang's crash recovery;
    • the inability to pinpoint a performance bottleneck, since fine-grained issues inside CGO cannot be detected.

    Conclusion

    I have achieved my goal of figuring out cloud gaming services and creating a platform that helps me play nostalgic retro games with my friends online. This project would not have been possible without the Pion library and the support of the Pion community. I am extremely grateful for its active development. The simple APIs provided by WebRTC and Pion ensured seamless integration. My first proof of concept was released within a week, even though I had no prior knowledge of peer-to-peer (P2P) communication.

    Despite the ease of integration, P2P streaming is indeed a very complex area of computer science. It has to deal with the complexity of legacy network architectures such as IP and NAT to create a peer-to-peer session. While working on this project, I gained a lot of valuable knowledge about networking and performance optimization, so I recommend that everyone try building P2P products with WebRTC.

    CloudRetro covers all the use cases I expected from my point of view as a retro gamer. However, I think there are many areas of the project that I can improve, such as making the network more reliable and performant, providing higher-quality game graphics, or adding the ability to share games between users. I'm working hard on this. Please follow the project and support it if you like it.

Source: habr.com
