CMU-CS-22-143
Computer Science Department
School of Computer Science, Carnegie Mellon University




Integrating Video Codec Design and Network Transport
for Emerging Internet Video Streaming Applications

Devdeep Ray

Ph.D. Thesis

August 2022

CMU-CS-22-143.pdf


Keywords: Video, Networks, Bandwidth, Latency, Packet loss, Streaming, Cloud, Live, Low-latency, Real-time, Gaming, Virtual Reality, Augmented Reality, Compression, Codec, Congestion Control

Video streaming applications on the Internet are diverse and have very distinct notions of quality of experience (QoE). These distinctions require carefully designed systems and protocols that balance factors like video quality, delay, bandwidth utilization, and video coding performance in a manner appropriate for each specific application. While these trade-offs are clear-cut and simple to implement for traditional video streaming applications (e.g., conferencing, live TV broadcasts, video-on-demand), emerging video streaming applications like social live video streaming, cloud gaming, and remote-rendered AR/VR have unique properties and demanding QoE requirements that make choosing the right trade-offs and achieving the desired QoE challenging.

Conventional video streaming systems largely treat the two key aspects of video streaming, video encoding and data transmission, as separate entities, and rely on naïvely combining techniques that have been developed independently in the fields of video coding and network transport. This approach severely limits the capability to navigate the complex trade-offs required to achieve good QoE for emerging video streaming applications. In this thesis, we explore encoder and network co-design techniques that expand the breadth of the trade-offs a video streaming system can achieve, and thus enable designs that are tailored to the demands of specific video streaming applications. Our work shows that integration of video encoding and network transport at various levels is crucial in achieving good QoE for emerging video streaming applications like social live streaming, cloud gaming, and cloud AR/VR.

We first explore the space of social live video streaming (SLVS), where the key distinction from traditional video streaming applications is the presence of viewers who watch the video stream at different delays. Our system, Vantage [1], dynamically optimizes bandwidth allocation across low-latency video frames and selective quality-enhancing retransmissions. In the presence of bandwidth variations, Vantage enables low-latency interaction for real-time viewers and achieves high video quality for time-shifted viewers.

Second, we explore the application space of cloud-rendered video games. Cloud gaming demands extremely low interaction latency and very high video quality in order to achieve parity with locally rendered applications, which is a challenging task when streaming over the Internet. We developed a new end-to-end video streaming architecture, called Prism, to improve the frame delay and video quality in the presence of transient packet loss. When a video stream is affected by transient packet loss, Prism carefully splits the available bandwidth between a low-latency stream and a quality-preserving secondary stream, where the different sub-streams address different stages of loss recovery. Prism accounts for the complex relationships between video compression bitrate and the resulting video quality in order to achieve higher video quality and lower video frame delay. Optimizing the allocation of bandwidth between the streams enables the use of aggressive loss prediction techniques, rapid loss recovery, and high quality post-recovery, with zero computational and bandwidth overhead during normal operation, thereby avoiding the pitfalls of existing approaches.
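
As a rough illustration of the kind of bandwidth split described above, the Python sketch below searches for the division of a fixed bandwidth budget between a primary and a secondary stream that maximizes a weighted sum of per-stream quality. This is not Prism's algorithm, which the abstract does not specify: the logarithmic rate-quality model, the weighting, the grid search, and the names quality and split_bandwidth are all assumptions made purely for illustration.

```python
import math


def quality(bitrate_kbps: float) -> float:
    """Hypothetical concave rate-quality model (assumption, not from the thesis).

    Quality grows with bitrate but with diminishing returns.
    """
    return math.log1p(bitrate_kbps)


def split_bandwidth(total_kbps: float,
                    primary_weight: float = 0.5,
                    step_kbps: float = 50.0) -> tuple[float, float]:
    """Grid-search the split that maximizes weighted quality across two streams.

    Returns (primary_kbps, secondary_kbps), which always sum to total_kbps.
    """
    best_split = (0.0, total_kbps)
    best_score = -math.inf
    n_steps = int(total_kbps // step_kbps)
    for i in range(n_steps + 1):
        primary = i * step_kbps
        secondary = total_kbps - primary
        score = (primary_weight * quality(primary)
                 + (1.0 - primary_weight) * quality(secondary))
        if score > best_score:
            best_score = score
            best_split = (primary, secondary)
    return best_split
```

Because the assumed quality curve is concave, equal weights yield an even split, while a larger primary_weight shifts bandwidth toward the low-latency stream. A real system would replace the toy curve with measured encoder rate-quality behavior.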

Third, we show that existing approaches to congestion control severely limit the QoE of emerging Internet video streaming applications like cloud gaming and cloud AR/VR. We demonstrate that the design choices made by existing congestion control algorithms make them poorly suited to the demanding requirements of cloud streaming applications, and discuss how the complex interactions between the congestion control algorithm and the video encoder rate control mechanism have a significant impact on the video frame delay and video quality. We also discuss the challenges of testing and deploying new congestion control algorithms designed for emerging applications. We propose a tool called CC-Fuzz that automatically stress tests a congestion control algorithm to identify problems with its design and implementation, with the goal of inspiring confidence in the design of the algorithm and catching implementation bugs.
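
To make the idea of automated stress testing concrete, the Python sketch below runs a plain random search over link-capacity traces and keeps the trace that drives a toy AIMD sender to its worst peak queue occupancy. It does not reproduce CC-Fuzz, whose search strategy and simulator are not described in this abstract: the toy sender, the tail-drop queue model, the scoring metric, and the names simulate_aimd and fuzz are hypothetical.

```python
import random


def simulate_aimd(capacity_trace, increase=1.0, decrease=0.5, queue_limit=50.0):
    """Run a toy AIMD sender over a per-tick link capacity trace (packets/tick).

    Returns the peak queue occupancy, used here as the 'badness' score.
    All dynamics are simplifying assumptions for illustration.
    """
    rate, queue, peak = 1.0, 0.0, 0.0
    for capacity in capacity_trace:
        # The queue grows by whatever the link cannot drain this tick.
        queue = max(0.0, queue + rate - capacity)
        if queue > queue_limit:
            queue = queue_limit               # tail drop: excess is lost
            rate = max(1.0, rate * decrease)  # multiplicative decrease on loss
        else:
            rate += increase                  # additive increase otherwise
        peak = max(peak, queue)
    return peak


def fuzz(trials=200, ticks=100, seed=0):
    """Random search for the capacity trace with the highest peak queue."""
    rng = random.Random(seed)
    worst_trace, worst_score = None, -1.0
    for _ in range(trials):
        trace = [rng.uniform(1.0, 20.0) for _ in range(ticks)]
        score = simulate_aimd(trace)
        if score > worst_score:
            worst_trace, worst_score = trace, score
    return worst_trace, worst_score
```

Calling fuzz() returns the worst trace found together with its score. Fixing the seed makes a failing trace reproducible, which matters when the trace is later used to debug the algorithm under test.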

This work shows that rethinking traditional designs for video streaming with a focus on integrated video codec and network transport design enables novel QoE trade-off modalities that are able to push the QoE envelope of video streaming systems and achieve the demanding QoE goals of emerging Internet video streaming applications.

199 pages

Thesis Committee:
Srinivasan Seshan (Chair)
Justine Sherry Martin
Anthony Rowe
David Chu (Magic Leap)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

