Lead Image © pat138241, 123RF.com

Supporting WebRTC in the enterprise

Nexus

Article from ADMIN 45/2018
By
WebRTC has the potential to bring high-quality, easily developed, and interoperable real-time voice, video, and data communication to all manner of applications in web browsers.

Although it caused huge excitement at its announcement by Google in 2011, web real-time communication (WebRTC) has somewhat faded from the general tech consciousness. Perhaps it was oversold as a done deal at the time, when it should have been presented as the start of a long and painful journey toward inter-browser real-time communications nirvana. Although the froth has disappeared off the top, the high-quality microbrew continues to ferment, and WebRTC maintains a steady upward trajectory in terms of development, progress toward standardization, and adoption, with the goal of making rich interpersonal communications easier and more prevalent than ever before. In this article, I'll explore some of the underpinnings of WebRTC, its popular use cases, and how to solve common user problems – many of which are the same problems that arise in any form of real-time unified communications – so you can help your users reap the benefits that modern WebRTC-enabled browser apps have to offer.

WebRTC

WebRTC-compliant browsers can access media devices directly on their host system and then exchange and render the resulting media tracks (audio, camera video, desktop content, and arbitrary data) directly with other WebRTC endpoints, which are normally browsers but can also be conferencing services, native apps, or gateways to other communications networks, such as the public switched telephone network (PSTN). WebRTC places much emphasis on peer-to-peer communication, wherein a browser exchanges media directly with one or many other browsers. However, an equally common scenario, especially in enterprise-grade WebRTC apps, is a client-server relationship, in which all media is routed through a central hub.

Although the client-server approach comes at the expense of optimal routing of media over the network (typically the Internet), it allows the application to provide many other benefits that enterprise applications require, including many more participants in a web conference, recording, interoperability with other services, and much tighter control over network address translation (NAT) traversal by implementing Traversal Using Relays around NAT (TURN) server functionality. Figure 1 shows a typical example of such a conference, combining WebRTC participants, hardware video conferencing systems, and telephone callers.

Figure 1: Multipoint WebRTC video conference.
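
To see why a central hub supports many more participants, it helps to count streams. The following is a minimal sketch, assuming every participant sends one stream to every other participant and ignoring refinements such as simulcast or selective forwarding; the names `meshLoad` and `hubLoad` are illustrative and not part of any WebRTC API.

```typescript
// Stream and connection counts for a call with n participants (n >= 2 assumed).
interface TopologyLoad {
  uploadsPerClient: number; // streams each browser must encode and send
  totalConnections: number; // transport connections across the whole call
}

// Full mesh (pure peer-to-peer): each browser sends to the n - 1 others,
// and every pair of browsers shares one connection.
function meshLoad(n: number): TopologyLoad {
  return { uploadsPerClient: n - 1, totalConnections: (n * (n - 1)) / 2 };
}

// Client-server: each browser sends one stream over one connection to the hub.
function hubLoad(n: number): TopologyLoad {
  return { uploadsPerClient: 1, totalConnections: n };
}

for (const n of [2, 4, 10]) {
  console.log(n, meshLoad(n), hubLoad(n));
}
```

In the mesh case, the upload burden on each browser grows with every participant who joins, whereas with a hub it stays constant and the scaling cost moves to the server.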

WebRTC is made up of separate IETF and W3C specifications (the W3C APIs are currently in Candidate Recommendation status), with each covering a different aspect of setting up a peer-to-peer connection. The essential parts include the following:

  • MediaStream (aka getUserMedia) is "… a set of JavaScript APIs that allow local media, including audio and video, to be requested from a platform" [1]. This allows a JavaScript app to enumerate, access, and control the available cameras, speakers, and microphones on a device. MediaStream allows specific limits to be placed on the resolution of the video streams (camera and screen-share) that will be returned, which will have a significant effect on the bit rate of the media streams that are transmitted after the connection is established. getUserMedia is a W3C Candidate Recommendation by W3C's Media Capture Task Force and is a part of HTML5.
  • WebRTC 1.0 (aka RTCPeerConnection) is the API used to control the browser's WebRTC peer-to-peer connections. This is what a JavaScript app uses to initiate and conduct a real-time call containing video, audio, and screen-sharing streams [2]. RTCDataChannel is the equivalent API for sending arbitrary data streams between browser peers, with file sharing being a typical example.
  • Real-time communication in web browsers (RTCWEB) is not an API, but the protocol that two WebRTC endpoints use to set up a call with each other – when instructed to do so by the WebRTC API. It covers signaling, NAT traversal, media capability negotiation, bandwidth and rate control, and security [3].
  • JavaScript Session Establishment Protocol (JSEP) is a subset of RTCPeerConnection and is used for exchanging the offer/answer Session Description Protocol (SDP) and Interactive Connectivity Establishment (ICE) protocol of RTCWEB between the browser's WebRTC implementation and the JavaScript app. This approach is necessary because RTCWEB relies on the application's signaling channel (whose protocol and transport type are not specified by any part of WebRTC) to carry SDP/ICE data from one browser to another [3]. SDP is used for negotiating the video and audio codecs that the browsers will use, and ICE is for determining to which IP address/port combination each browser will send its media flows. The JavaScript app uses JSEP to "ask" WebRTC for the information when needed (and then sends it to the far end through the signaling channel), and vice versa, to "give" it to WebRTC when it is received from the far side.
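
As an illustration of the MediaStream point about limiting resolution, the sketch below builds a constraints object that caps camera resolution and frame rate before the media is requested. The `width`, `height`, and `frameRate` constraints with a `max` value are part of the getUserMedia API; the helper names (`buildConstraints`, `openCappedCamera`) and the local interfaces are assumptions made here so that the example runs outside a browser.

```typescript
// Illustrative limits chosen by the application (not a WebRTC type).
interface VideoLimits {
  maxWidth: number;
  maxHeight: number;
  maxFrameRate: number;
}

// Local stand-in for the subset of the browser's MediaStreamConstraints used here.
interface CappedConstraints {
  audio: boolean;
  video: {
    width: { max: number };
    height: { max: number };
    frameRate: { max: number };
  };
}

// Hypothetical helper: cap resolution and frame rate, which in turn
// bounds the bit rate of the video that is eventually transmitted.
function buildConstraints(limits: VideoLimits): CappedConstraints {
  return {
    audio: true,
    video: {
      width: { max: limits.maxWidth },
      height: { max: limits.maxHeight },
      frameRate: { max: limits.maxFrameRate },
    },
  };
}

// The one method of navigator.mediaDevices that this sketch needs.
interface MediaDevicesLike {
  getUserMedia(constraints: CappedConstraints): Promise<unknown>;
}

// Hypothetical helper: request capped media. The promise rejects if the user
// denies permission or no device can satisfy the constraints.
async function openCappedCamera(
  devices: MediaDevicesLike,
  limits: VideoLimits
): Promise<unknown> {
  return devices.getUserMedia(buildConstraints(limits));
}
```

In a browser, the first argument would be `navigator.mediaDevices`, for example `openCappedCamera(navigator.mediaDevices, { maxWidth: 1280, maxHeight: 720, maxFrameRate: 30 })`.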

WebRTC does not specify a high-level call control protocol, nor the type of network transport it should use – that part is left up to the application. An implementer of a WebRTC app that wants to interoperate with other WebRTC apps might base their signaling on an existing standard like the Session Initiation Protocol (SIP) or XMPP/Jingle (and, in fact, the SDP information that RTCWEB generates is designed to be interoperable directly with "normal" SIP calls). More commonly, given that interoperation between different WebRTC solutions is rare, WebRTC apps will simply implement their own signaling methods. Figure 2 shows how the getUserMedia application, WebRTC, and RTCWEB interoperate within and between browser peers.

Figure 2: WebRTC and RTCWEB in the browser.
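
Because the signaling channel is left to the application, each app ends up defining its own message format. The following sketch shows one possible shape for it, together with the JSEP "ask" and "give" steps described above. The method names on `PeerLike` (`createOffer`, `createAnswer`, `setLocalDescription`, `setRemoteDescription`, `addIceCandidate`) match those of `RTCPeerConnection`, but the types are simplified local stand-ins, and the `SignalMessage` format and function names are purely hypothetical.

```typescript
// Simplified stand-ins for the browser's session description and ICE candidate objects.
interface SessionDescription { type: "offer" | "answer"; sdp: string; }
interface IceCandidate { candidate: string; sdpMid: string | null; sdpMLineIndex: number | null; }

// Hypothetical wire format for the application's own signaling channel.
type SignalMessage =
  | { kind: "description"; description: SessionDescription }
  | { kind: "candidate"; candidate: IceCandidate };

// The subset of RTCPeerConnection methods this sketch relies on.
interface PeerLike {
  createOffer(): Promise<SessionDescription>;
  createAnswer(): Promise<SessionDescription>;
  setLocalDescription(d: SessionDescription): Promise<void>;
  setRemoteDescription(d: SessionDescription): Promise<void>;
  addIceCandidate(c: IceCandidate): Promise<void>;
}

type Send = (msg: SignalMessage) => void;

function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

function decodeSignal(text: string): SignalMessage {
  const msg = JSON.parse(text);
  if (msg && (msg.kind === "description" || msg.kind === "candidate")) {
    return msg as SignalMessage;
  }
  throw new Error("unrecognized signaling message");
}

// Caller: "ask" WebRTC for an offer, then send it to the far end.
async function startCall(pc: PeerLike, send: Send): Promise<void> {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send({ kind: "description", description: offer });
}

// Either side: "give" WebRTC whatever arrives from the far end.
async function handleSignal(pc: PeerLike, msg: SignalMessage, send: Send): Promise<void> {
  if (msg.kind === "candidate") {
    await pc.addIceCandidate(msg.candidate);
    return;
  }
  await pc.setRemoteDescription(msg.description);
  if (msg.description.type === "offer") {
    // An incoming offer must be answered through the same channel.
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    send({ kind: "description", description: answer });
  }
}
```

The `send` callback hides the transport on purpose: it could write to a WebSocket, post to an HTTP endpoint, or wrap the payload in SIP or XMPP/Jingle, without changing the offer/answer logic.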

What WebRTC Can Do

By their very nature, WebRTC applications are delivered in the form of web pages and JavaScript programs, which means your users are most likely accessing them and using them of their own accord without you knowing about it. Popular applications based on WebRTC include Google Hangouts, Facebook Messenger, Amazon Chime, Appear.in, and GoToMeeting's browser implementation [4]. Google recorded a 45% year-on-year increase in the use of Chrome's WebRTC features in 2017, at a run rate of 1.5 billion minutes per week. This number is huge but still minuscule compared with the 3 billion minutes of Skype calls that are made each day!

WebRTC is currently the only viable inter-browser real-time communication solution; it is easy to use in development and offers free access to secure, high-quality media codecs. One consequence of the relative ease of implementing a WebRTC solution is the appearance of many more applications that are "unified communications (UC) enabled"; that is, their fundamental purpose is not general communications, but to provide a specialized app for some other purpose, such as telemedicine. As these apps increase in number and quality, it's highly likely that usage will increase toward Skype-like levels. IT departments will have to adapt to meet user demand, rather than try to manage the scale of usage.

Barriers to Successful Usage

The frustration and wasted time caused by failed video calls mean that users have very limited patience with new apps, and the nuances of firewalls and network bandwidth are (rightly) irrelevant to anyone who's simply trying to get a day's work done. No app exists in isolation – its success is hugely dependent on its host device, operating system, and network. I will point out some of the common stumbling blocks to successful use of a WebRTC app and suggest some approaches that might help you identify and solve them. I'll present these in five categories:

1. Device availability – enumeration and permission for cameras, microphones, and speakers.

2. Browser capabilities – support for the WebRTC APIs and RTCWEB, with or without plugins.

3. Signaling channel – connection to the signaling server for application call control and SDP/ICE exchange.

4. Discovery – NAT traversal possibilities: direct, session traversal utilities for NAT (STUN), and TURN.

5. Network bandwidth – media flowing from endpoint to endpoint; bandwidth availability and rate control.

Many of the troubleshooting considerations are generic to any application that requires users to communicate using their PCs.
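
For the discovery category, one quick diagnostic is to look at the types of ICE candidates a browser gathers, because each type corresponds to one of the NAT traversal paths listed above. The sketch below extracts the `typ` field from ICE candidate lines and maps it to a path. The candidate types (`host`, `srflx`, `prflx`, `relay`) come from the ICE specification; the function names and sample candidate lines (which use documentation-only IP addresses) are made up for illustration.

```typescript
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Extract the candidate type from an ICE candidate line, or null if none is found.
function candidateType(candidateLine: string): CandidateType | null {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : null;
}

// Map each candidate type to the NAT traversal path it represents.
function describePath(type: CandidateType): string {
  switch (type) {
    case "host":
      return "direct (local interface address)";
    case "srflx":
      return "STUN (server-reflexive address seen from outside the NAT)";
    case "prflx":
      return "peer-reflexive (learned during connectivity checks)";
    case "relay":
      return "TURN (media relayed through a server)";
  }
}

// Made-up sample lines in the ICE candidate format.
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.0.2.10 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.20 3478 typ relay raddr 203.0.113.7 rport 54321",
];

for (const line of sampleCandidates) {
  const type = candidateType(line);
  console.log(type ? describePath(type) : "unparseable candidate");
}
```

In a browser, these lines would come from the `candidate` property of the candidates delivered by a peer connection's `icecandidate` event. If no `srflx` or `relay` candidates ever appear, it might be worth checking whether the STUN or TURN servers are reachable through the firewall, although the absence alone is not conclusive.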
