What is WebRTC and how are we scaling it?

1.png

Huddle01 is building the next generation of internet communication: we aim to improve the current tech stack while keeping a decentralized approach in mind. Beyond that, we plan to revolutionize the internet communication landscape!

As for WebRTC, it was originally designed by Global IP Solutions (or GIPS), a company founded around 1999 in Sweden. Google acquired GIPS in 2011 and open-sourced the WebRTC technology. Since then, browsers like Firefox, Chrome, and Opera have shown great support for WebRTC.

What is WebRTC? 🤔

2.png

Without wasting any time, let’s start with what WebRTC is. WebRTC stands for Web Real-Time Communication. According to Wikipedia, “WebRTC is a free and open-source project providing web browsers and mobile applications with real-time communication via application programming interfaces.”

Breaking it down in layman’s terms, WebRTC is an open-source project that enables peer-to-peer communication between two separate devices over the Internet. These can be mobiles, tablets, or PCs. It helps projects integrate audio/video services into their applications. Think of WebRTC as the hard work of making a pickle: once it is made, everyone who has it can use it in their recipes as they please.

How WebRTC works ⚙️

3.png

How WebRTC works is quite complicated, but to break it down simply: establishing a WebRTC connection takes 4 steps, and these are followed strictly in the given order:

Signaling 📡

This one is pretty intuitive. To connect with someone, you first need to find them: you send a signal to check whether the recipient is present and to notify them. Along the way, the two sides exchange the details they need to reach each other. Think of it as ordering your favorite shoes from Amazon: first Amazon checks whether they are available, and if they are, they send you the shoes!
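To make the signaling idea a bit more concrete, here is a minimal sketch of a relay that passes an offer and an answer between two peers. Everything in it is an assumption for illustration: the `SignalingRelay` class, the message shape, and the peer names are hypothetical, and the relay lives in memory. WebRTC itself does not prescribe how signaling messages travel, so a real app would typically send them over something like WebSockets or HTTP.

```typescript
// Hypothetical in-memory signaling relay, for illustration only.
type SignalKind = "offer" | "answer";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  // Session description text; a placeholder string in this sketch.
  sdp: string;
}

type Handler = (msg: SignalMessage) => void;

class SignalingRelay {
  private handlers = new Map<string, Handler>();

  // A peer registers to be notified of messages addressed to it.
  register(peerId: string, handler: Handler): void {
    this.handlers.set(peerId, handler);
  }

  // Deliver a message; returns false if the recipient is not present.
  send(msg: SignalMessage): boolean {
    const handler = this.handlers.get(msg.to);
    if (handler === undefined) return false;
    handler(msg);
    return true;
  }
}

const relay = new SignalingRelay();
const log: string[] = [];

relay.register("alice", (msg) => {
  log.push(`alice got ${msg.kind} from ${msg.from}`);
});

relay.register("bob", (msg) => {
  log.push(`bob got ${msg.kind} from ${msg.from}`);
  // Bob replies to an offer with an answer addressed to the sender.
  if (msg.kind === "offer") {
    relay.send({ kind: "answer", from: "bob", to: msg.from, sdp: "placeholder-answer" });
  }
});

// Bob is present, so the offer is delivered and answered.
const delivered = relay.send({ kind: "offer", from: "alice", to: "bob", sdp: "placeholder-offer" });
// Carol never registered, so the relay reports that she is not there.
const missing = relay.send({ kind: "offer", from: "alice", to: "carol", sdp: "placeholder-offer" });
```

After this runs, `log` holds "bob got offer from alice" followed by "alice got answer from bob", which is the shoe order and the delivery from the analogy.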

Connecting 🤝

The two WebRTC agents now know enough details to attempt to connect to each other. To do so, WebRTC uses another established technology called ICE (Interactive Connectivity Establishment), which tries the possible network paths between the two agents and picks one that works.
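One small piece of ICE that is easy to show in code is how candidate addresses get ranked. The sketch below uses the priority formula and the recommended type preferences from RFC 8445; the `Candidate` shape, the function names, and the example addresses are assumptions made up for this illustration, and a browser does all of this for you internally.

```typescript
// Candidate types: host (local), prflx (peer reflexive),
// srflx (server reflexive, learned via STUN), relay (via TURN).
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Type preferences recommended by RFC 8445 (higher is preferred).
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// Hypothetical candidate shape for this sketch.
interface Candidate {
  type: CandidateType;
  address: string; // documentation-only example addresses
  componentId: number; // 1 in this sketch
}

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(c: Candidate, localPreference: number = 65535): number {
  return 2 ** 24 * TYPE_PREFERENCE[c.type] + 2 ** 8 * localPreference + (256 - c.componentId);
}

// Highest priority first, without mutating the input array.
function sortByPriority(cs: Candidate[]): Candidate[] {
  return cs.slice().sort((a, b) => candidatePriority(b) - candidatePriority(a));
}

const candidates: Candidate[] = [
  { type: "relay", address: "203.0.113.7", componentId: 1 },
  { type: "host", address: "192.168.1.20", componentId: 1 },
  { type: "srflx", address: "198.51.100.4", componentId: 1 },
];

const ordered = sortByPriority(candidates).map((c) => c.type);
// ordered is ["host", "srflx", "relay"]
```

The takeaway is that direct paths are tried in preference to relayed ones, since a relay adds an extra hop.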

Securing 🔒

We now have two sides that are connected and communicating in real time. Just as we don’t want anyone else listening in on gossip with our best friends, these connections must be secured against third-party attackers. This is done with DTLS (Datagram Transport Layer Security) and SRTP (Secure Real-time Transport Protocol), both of which predate WebRTC!

Communicating 🗨️

Now that the connection is established and secured, the only thing left is to share data in real time. This is again done with pre-existing protocols like RTP and SCTP. I know this is a bit vague, but think of it as playing cards with your friend: real-time data sharing essentially means you can see which cards your friend is playing without delays. So just establishing a connection isn’t enough; real-time data sharing is the key here!

Problems with WebRTC 😢

4.png

WebRTC is truly a great piece of innovation, but just as there are two sides to every coin, WebRTC also has some significant issues! Out of the box, WebRTC isn’t really scalable!

Wait, what? WebRTC isn’t scalable? 😲

Yup, WebRTC is great for peer-to-peer communication, but when more than about 10 people try to communicate in real time, a plain peer-to-peer setup becomes impractical.

Why? 😞

WebRTC needs a bidirectional connection to each client, and as the number of clients increases, maintaining so many connections gets more and more complex. The problem is that each node or client is connected to every other client present, which makes for intensive, resource-heavy connections on every single client.

5.png

Let me explain this with an example. It’s a fair! You are out with your friends, playing a game. The rules of the game are simple: everyone who plays has to bring their own sweet and exchange it for the sweet of every other player.

Let’s say there are 3 friends who have Oreos, candy, and lollipops. So, you’ll need 2 of your own sweet to share, and you’ll receive one of each of the other sweets. Let’s say you are the one with the Oreos; then the problem arises when:

  1. You can only afford a certain number of Oreos to give away.

  2. As the number of players increases, each one brings in a new sweet, and after a point you can’t receive any more sweets because your bag is already full.

Here, the Oreos represent your uplinks, and all the other sweets represent your downlinks from the other clients.
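The sweet-counting above can be sketched as a few lines of arithmetic. In a full mesh of n players, each one needs n - 1 uplinks and n - 1 downlinks, and the whole game has n(n - 1)/2 connections. The function names and the 1500 kbps per-stream figure below are assumptions picked purely for illustration.

```typescript
interface MeshLoad {
  participants: number;
  uplinksPerClient: number;
  downlinksPerClient: number;
  totalConnections: number;
}

// Link counts for a full peer-to-peer mesh of `participants` clients.
function meshLoad(participants: number): MeshLoad {
  if (!Number.isInteger(participants) || participants < 1) {
    throw new RangeError("participants must be a positive integer");
  }
  const others = participants - 1;
  return {
    participants,
    uplinksPerClient: others, // one copy of your stream per other client
    downlinksPerClient: others, // one incoming stream per other client
    totalConnections: (participants * others) / 2,
  };
}

// Upload bandwidth one client needs if every outgoing copy costs `streamKbps`.
function meshUplinkKbps(participants: number, streamKbps: number): number {
  return meshLoad(participants).uplinksPerClient * streamKbps;
}

const three = meshLoad(3); // 2 uplinks, 2 downlinks, 3 connections in total
const ten = meshLoad(10); // 9 uplinks, 9 downlinks, 45 connections in total
const tenUplink = meshUplinkKbps(10, 1500); // 13500 kbps, assuming 1500 kbps per stream
```

With 3 friends you hand out 2 Oreos; with 10 you hand out 9, and the total number of exchanges grows roughly with the square of the player count.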

Solutions for scaling 🛠

6.png

1. SFU (Selective Forwarding Unit)

This is the most popular solution to the scalability problem. An SFU reduces the total number of links by a large factor, as it uses a pass-through routing system to offload streaming from the clients. Coming back to our game example, the rules have now changed. Rather than giving Oreos to, and receiving sweets from, each player individually, you give your Oreo to a newly appointed helper, who passes it on to everyone else.

This helps in two ways:

A. You’ll need fewer Oreos for playing the game 🍪

B. The helper can also speed up the game by taking over the physical work of going around and exchanging the sweets 🏃‍♂️
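Here is a toy model of the helper, just to show the pass-through idea: each client uploads once, and the SFU forwards that upload unchanged to everyone else. The `SelectiveForwardingUnit` class, its methods, and the string "packets" are all hypothetical stand-ins for this sketch; a real SFU forwards encrypted media packets and is far more involved.

```typescript
// Toy SFU: forwards each published packet to every other joined client.
// No decoding or mixing happens here, only routing.
class SelectiveForwardingUnit {
  private inbox = new Map<string, string[]>();

  join(clientId: string): void {
    if (!this.inbox.has(clientId)) this.inbox.set(clientId, []);
  }

  // One upload from `sender` becomes one forwarded copy per other client.
  // Returns how many copies were forwarded.
  publish(sender: string, packet: string): number {
    if (!this.inbox.has(sender)) {
      throw new Error(`unknown sender: ${sender}`);
    }
    let forwarded = 0;
    this.inbox.forEach((queue, clientId) => {
      if (clientId !== sender) {
        queue.push(`${sender}:${packet}`);
        forwarded += 1;
      }
    });
    return forwarded;
  }

  // Everything a client has been sent so far.
  received(clientId: string): string[] {
    return this.inbox.get(clientId) || [];
  }
}

const sfu = new SelectiveForwardingUnit();
["oreo", "candy", "lollipop"].forEach((id) => sfu.join(id));

// The Oreo player uploads a single frame; the helper makes the two copies.
const copies = sfu.publish("oreo", "frame-1"); // 2
sfu.publish("candy", "frame-1");

const lollipopInbox = sfu.received("lollipop"); // ["oreo:frame-1", "candy:frame-1"]
const oreoInbox = sfu.received("oreo"); // ["candy:frame-1"]
```

Note that each client still receives a separate stream per other participant; what the SFU saves is the upload side, where one outgoing stream replaces n - 1.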

7.png

2. MCU (Multipoint Conferencing Unit)

An MCU is similar to an SFU, but it offloads the CPU-intensive work to the server: the major processing is done by the server rather than by the clients. Going back to our game, we saw the SFU introduce a helper to make things easier for the players. With an MCU, the helper is a robot that comes to the players for the sweet exchange, speeding up the exchange even further and enabling even more players to join in on the fun! The helper combines all the sweets and hands out a single ice cream, so you don’t have to process each candy yourself!

The issues are:

A. Depending on the helper this much means the helper has to run at 100% efficiency all the time, or it will slow down the sweet exchange 🔋

B. Any changes to the game at a later stage will be expensive 💸
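The "ice cream" step can be sketched like this: the server takes one frame from every player and hands each player a single combined result. The `mcuMix` function and the one-letter "frames" are hypothetical, and leaving a recipient’s own frame out of their mix is an assumption of this sketch. A real MCU decodes, composites, and re-encodes actual media, which is where the heavy server load comes from.

```typescript
// Toy MCU tick: for each recipient, combine everyone else's frame
// into a single composite, so each client downloads just one stream.
function mcuMix(frames: Map<string, string>): Map<string, string> {
  const composites = new Map<string, string>();
  frames.forEach((_ownFrame, recipient) => {
    const parts: string[] = [];
    frames.forEach((frame, sender) => {
      // Assumption: a recipient's own frame is left out of their mix.
      if (sender !== recipient) parts.push(frame);
    });
    composites.set(recipient, parts.join("+"));
  });
  return composites;
}

const frames = new Map<string, string>([
  ["oreo", "O"],
  ["candy", "C"],
  ["lollipop", "L"],
]);

const mixed = mcuMix(frames);
// mixed.get("oreo") is "C+L", mixed.get("candy") is "O+L",
// mixed.get("lollipop") is "O+C"
```

Each player ends up with one upload and one download no matter how many people join, but the server does a full pass over every frame for every recipient, which is the "100% efficiency" burden from point A.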

8.png

There are various solutions in use for WebRTC’s scaling issues, but we feel these two are some of the easiest to implement and maintain.

What are we using? 😎

9.png

We created our own customized SFU solution using some open-source libraries. We developed smart algorithms that scale the meetings and optimize the server resources. The main reason for using an SFU is simple: the architecture is less complicated than the other solutions, and it has some perks, like:

Streams are separate, so each can be rendered individually, allowing full control of the layout of streams on the client side.

Since there is only one outgoing stream, the client does not need a wide outgoing channel. When it comes to decentralizing the communication system, we can use an SFU rather than an MCU because the servers don’t need high processing power; normal or above-average hosts can do the trick.

The SFU architecture is less demanding on the server than other video conferencing architectures.

That’s it, folks! These are some of the basics of WebRTC and how we plan to scale it! If you have any questions, suggestions, or team-ups in mind, reach out to us on Twitter or join us on our Discord!