Window title: demuxed

The conference for video devs

October 16th – 17th, 2024
Regency Ballroom, San Francisco


Buy tickets now!

Join our mailing list


Want to sponsor? Get in touch

Speakers
Venue
Why us?
About us
Sponsors
Schedule
Window title: demuxed > why attend?

Why attend?

No marketing, ever.

Speakers are selected based on their submission, not how much money their company paid; we will never, ever sell a speaking slot. Attendee information isn’t for sale either, not even to sponsors.

Affordable

We want anyone in the industry to be able to come, which means keeping tickets reasonably priced (thanks largely to our generous sponsors). We also offer free and discounted tickets to students and open source contributors, so please reach out if you’re interested.

For everyone in the community

Our community is dedicated to providing an inclusive, enjoyable experience for everyone in the video industry. In this pursuit, and in keeping with our love for reasonable standards, we adopted the Ada Initiative’s code of conduct.

Window title: demuxed > photos
Window title: demuxed > speakers

Speakers

Alex Field

Sky/NBCU

Alex Giladi

Comcast

Anand Vadera

Meta

Bruce Spang

Netflix

Constanza Dibueno

Qualabs

Derek Buitenhuis

Vimeo

Eric Tang

Livepeer

Fabio Sonnati

NTT Data

Gwendal Simon

Synamedia

Jan De Cock

Synamedia

Jason Cloud

Dolby Laboratories

Jeff Riedmiller

Dolby Laboratories

Jill Boyce

Nokia

John Bowers

Twitch/Amazon IVS

Jon Dahl

Mux

Katerina Dobnerova

CDN77

Li-Heng Chen

Netflix

Luke Curley

Discord

Matteo Naccari

Visionular

RongKai Guo

NVIDIA

Ryan Cunningham

Scenery

Ryan Lei

Meta

Steve Robertson

YouTube

Tanushree Nori

Vimeo

Thomas Edwards

Amazon Web Services

Tony McNamara

Paramount Streaming

Tracey Jaquith

Internet Archive

Vanessa Pyne

Daily

Walker Griggs

Mux

Wei Wei

Netflix Inc

Will Law

Akamai

Yingyu Yao

YouTube

Yuriy Reznik

Brightcove, Inc.

Zoe Liu

Visionular

Window title: demuxed > venue and location

Venue & location

The Regency Ballroom
1300 Van Ness Ave.
San Francisco, CA 94109

The Regency Ballroom is a beautiful, centrally-located San Francisco event venue.

According to their website, the building is noted as a fine example of Scottish Rite architecture. Its ballroom is a Beaux-Arts treasure with thirty-five-foot ceilings and twenty-two turn-of-the-century teardrop chandeliers.

According to one intrepid online reviewer, “Took my son to a death metal concert here and it was awesome!” …so, you know it's gotta be good.

See map
Window title: demuxed > photos
Window title: demuxed > schedule

The Schedule

9:40 AM PDT

Matt McClure

Demuxed

Opening Remarks

Tanushree Nori

Vimeo

Budgeting Bytes: Acing Cost-Efficient Video Storage

In today's world, where data never stops growing, Vimeo is at the forefront, cleverly slashing storage costs while keeping videos readily accessible. In my talk, I’ll peel back the curtain on how we fine-tune cloud storage using machine learning, balancing cost savings with quick video access at Vimeo. We’ve cut our storage bills by an impressive 60% by applying smart lifecycle policies and a dash of machine learning. I'll share insights on how we determine the best times to tuck away older videos into cheaper storage tiers and what factors go into these decisions. This talk will offer practical strategies and a peek into the tools that help Vimeo manage a sprawling video library efficiently. Discover how these innovations can help reshape your approach to data storage too!

Alex Field

Sky/NBCU

The Colorful Truth of Automated Tests

Trying to automatically test what the end user actually sees and hears on their streaming device is hard - very hard. Automated testing methods often rely on unreliable data from player APIs, leading to inaccurate results. This talk showcases our journey of experimenting with content encoded with visual and audio cues to validate that our player APIs are really telling the truth about what the user is seeing.

10:40 AM PDT

Break

11:15 AM PDT

Walker Griggs

Mux

PSSH, or the Primordial Soup of Secure Headers

Consider our friendly, neighborhood PSSH box. The semantics are simple -- to identify encryption keys -- but, as with any permissive specification, there’s a lot more going on than meets the eye. In some cases, they contain deeply nested little-endian UTF-16 XML. In others, we’ll find protocol buffers containing base64-encoded JSON. In all cases, they have a surprising amount of personality. In this talk, we will dive deep into several PSSH boxes, dissecting them bit by bit across various popular DRM schemes. Along the way, we will:
1. Explore the history of the PSSH box and how it mirrors the evolution of DRM standards.
2. Discover how each provider has imparted their own company idioms onto the loosely-defined PSSH payload.
3. Identify where the decisions of one provider impacted the rest.
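
As a rough illustration of the "simple semantics" mentioned above, here is a minimal Python sketch that reads the fixed fields of a single PSSH box (size, type, version, system ID, optional key ID list, payload size). The function name `parse_pssh` and the returned dictionary keys are assumptions made for this example, not part of the talk. The payload is left as opaque bytes, since its encoding is exactly the provider-specific part the talk explores.

```python
import struct
import uuid

# Publicly registered DRM system IDs; anything else is reported as "unknown".
SYSTEM_IDS = {
    "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed": "Widevine",
    "9a04f079-9840-4286-ab92-e65be0885f95": "PlayReady",
    "1077efec-c0b2-4d02-ace3-3c1e52e2fb4b": "Common (cenc)",
}


def parse_pssh(box: bytes) -> dict:
    """Parse one 'pssh' box; the DRM-specific payload is returned untouched."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"pssh":
        raise ValueError("not a pssh box")
    version = box[8]                       # FullBox header: 1 byte version, 3 bytes flags
    system_id = str(uuid.UUID(bytes=box[12:28]))
    offset = 28
    key_ids = []
    if version > 0:                        # version 1 carries an explicit key ID list
        (kid_count,) = struct.unpack_from(">I", box, offset)
        offset += 4
        for _ in range(kid_count):
            key_ids.append(box[offset:offset + 16].hex())
            offset += 16
    (data_size,) = struct.unpack_from(">I", box, offset)
    offset += 4
    return {
        "size": size,
        "version": version,
        "system_id": system_id,
        "system": SYSTEM_IDS.get(system_id, "unknown"),
        "key_ids": key_ids,
        "data": box[offset:offset + data_size],   # XML, protobuf, JSON... depends on the DRM
    }
```

Everything interesting (and idiosyncratic) lives in `data`; this sketch deliberately stops where the provider-specific formats begin.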

Eric Tang

Livepeer

Progress and Opportunities in Video Content Authenticity

In an era where AI-generated video is rapidly becoming the norm, the need for video content authenticity has never been more critical. Over the past year, we've witnessed significant strides in this area, with industry giants like OpenAI, Google, and BBC joining the Coalition for Content Provenance and Authenticity (C2PA) and committing to integrate this technology into their core products. Join us in this session as we dive into C2PA’s technical advancements over the past year and map out a practical approach for implementing C2PA in any video application. Discover the intricacies of C2PA’s trust model and understand how it safeguards users on video platforms. We'll also cover essential implementation considerations, from video player UX to backend video workflow management. As long-standing members of the Content Authenticity Initiative (CAI) and key contributors to C2PA, we bring a wealth of experience from participating in weekly working groups and shaping the last two versions of the C2PA specification. Our expertise is backed by numerous workshops and presentations at leading conferences and industry events like NAB and the C2PA symposium.

Jason Cloud

Dolby Laboratories

Jeff Riedmiller

Dolby Laboratories

Does a multi-CDN setup (truly) require switching? Deploying an anti-switching multi-CDN delivery platform.

It seems pretty clear that using multiple CDNs to deliver media is a good thing, but it’s hard to do effectively. What is the best policy to use? How do you determine when to switch? How often do you switch? Do you switch based on client performance alone, consolidated user metrics, or something else? What happens when the CDN you switched to isn’t performing as well as you thought it would? Answering these questions (let alone designing a multi-CDN switching architecture) is enough to give anyone a headache.

What if we throw out “switching” by downloading media from multiple CDNs at the same time? We could then realize the best performance by merging the performance of each. Seems simple enough until you start trying to do it efficiently. Do you race media from multiple CDNs at the same time, or do you try to perform sub-segment/byte-level scheduling? This seems even more complicated than before! This talk will focus on how to implement and deploy a switchless multi-source media delivery platform that is highly performant and efficient, and that avoids having to answer these difficult questions or solve massively complicated scheduling problems.

Enabling true multi-source delivery without all the fuss requires us to do something a little bit unique to the content we are streaming. We first create multiple “versions” (one for each source, a.k.a. CDN) of each and every HLS or MPEG-DASH segment. This is done by packaging these segments into something called the Coded Multisource Media Format (CMMF), which is currently being standardized in ETSI. CMMF is essentially a container that is used to communicate network-encoded content (network coding is kind of like forward error correction (FEC), but not – we’ll expand upon this more during the talk). Each CMMF version is then cached on a different CDN.

Now let’s say a media player wants to download a particular video segment. Instead of downloading the entire segment from one CDN, or requesting bytes x1 through y1 of that segment from one CDN and x2 through y2 from another, the media player requests multiple CMMF versions of that segment from different CDNs at the same time. Once the player receives enough data (the amount required is very close to that of the original segment) from the collection of CDNs, it can stop the download and recover the segment it wanted. By doing it this way, we don’t have to worry about hiccups (like temporary congestion or slow response times) on one CDN because we will just download more from the others.

During the talk, we will introduce CMMF and the core concepts behind it, and go over how we deployed it within a streaming service with 20+ million subscribers worldwide and streamed approximately one million hours of content using it. We will also provide performance data that shows how CMMF-enabled multi-CDN delivery stacks up against a popular multi-CDN switching approach (as you can guess, it stacked up well). We hope this talk provides the audience with a different perspective on an “age-old” problem and inspires them to explore multisource delivery in greater detail.
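
The download-until-enough idea above can be sketched with a toy random linear network code over GF(2), written here in Python. This is an illustration of network coding in general, not of the CMMF container or of the deployed system: the names `split_blocks`, `coded_packets`, `Decoder`, and `fetch`, the per-CDN seeds, and the round-robin loop that stands in for parallel downloads are all assumptions of this example. A binary code also needs somewhat more than k packets on average; that overhead is a property of this toy choice, not a claim about CMMF.

```python
import random


def split_blocks(segment: bytes, k: int) -> tuple[list[int], int]:
    """Split a segment into k equal-size, zero-padded blocks stored as big ints."""
    block_len = -(-len(segment) // k)                      # ceiling division
    padded = segment.ljust(block_len * k, b"\0")
    blocks = [int.from_bytes(padded[i * block_len:(i + 1) * block_len], "big")
              for i in range(k)]
    return blocks, block_len


def coded_packets(blocks: list[int], seed: int):
    """Endless stream of (coefficient bitmask, XOR of the selected blocks)."""
    rng = random.Random(seed)                              # one seed per "CDN version"
    k = len(blocks)
    while True:
        mask = rng.randrange(1, 1 << k)                    # random non-empty combination
        payload = 0
        for i in range(k):
            if mask >> i & 1:
                payload ^= blocks[i]
        yield mask, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2), rows kept fully reduced."""

    def __init__(self, k: int, block_len: int):
        self.k, self.block_len = k, block_len
        self.pivots = {}                                   # pivot bit -> (mask, payload)

    def add(self, mask: int, payload: int) -> bool:
        """Absorb one packet; return True if it carried new information."""
        for bit, (pmask, ppayload) in self.pivots.items():
            if mask >> bit & 1:
                mask ^= pmask
                payload ^= ppayload
        if mask == 0:
            return False                                   # linearly dependent, discard
        bit = mask.bit_length() - 1
        for b, (pmask, ppayload) in list(self.pivots.items()):
            if pmask >> bit & 1:                           # clear the new pivot elsewhere
                self.pivots[b] = (pmask ^ mask, ppayload ^ payload)
        self.pivots[bit] = (mask, payload)
        return True

    def done(self) -> bool:
        return len(self.pivots) == self.k

    def recover(self, original_len: int) -> bytes:
        data = b"".join(self.pivots[i][1].to_bytes(self.block_len, "big")
                        for i in range(self.k))
        return data[:original_len]


def fetch(segment: bytes, k: int = 8, cdn_seeds=(1, 2, 3)):
    """Pull coded packets from several simulated CDNs until the segment decodes."""
    blocks, block_len = split_blocks(segment, k)
    streams = [coded_packets(blocks, seed) for seed in cdn_seeds]
    decoder, received = Decoder(k, block_len), 0
    while not decoder.done():
        for stream in streams:                             # round-robin stands in for parallelism
            decoder.add(*next(stream))
            received += 1
            if decoder.done():
                break
    return decoder.recover(len(segment)), received
```

Because any sufficiently large set of independent packets decodes the segment, it does not matter which CDN each packet came from; a slow source simply contributes fewer packets.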

Thomas Edwards

Amazon Web Services

Video Processing on Quantum Computers

Quantum computing (QC) utilizes quantum mechanics to solve complex problems faster than on "classical" computers. QCs available today are considered "Noisy Intermediate-Scale Quantum" (NISQ) computers with a small number of quantum bits (qubits) and limited performance due to short coherence time and noisy gates. QCs are improving all the time, so it is possible that in the future they could provide practical acceleration to video processing workflows (remember how neural networks were in the 1990s?). This presentation will give a short overview of QC basics, results of representing (simple) images on an actual cloud-accessible QC, and will describe some research on potential video processing applications of QCs. [Note: I've timed that this can be presented in 20 minutes]

12:30 PM PDT

Lunch

1:45 PM PDT

Katerina Dobnerova

CDN77

Enhancing CDN Performance and Cutting Egress Costs in Large Video Libraries Delivery: Advanced Caching Strategies and Edge Computing Optimization

During the 20 minutes of my presentation, users worldwide will generate content equivalent to the volume created from the dawn of civilisation until 2003. The volume of content being created today is staggering. Consider this: from the beginning of recorded history until 2003, we produced roughly 5 exabytes of content. However, projections suggest a monumental leap to 147 zettabytes in 2024 alone, with video content leading the charge. With such exponential growth in content and its shortening life span, content delivery networks (CDNs) face significant challenges in effectively caching large video libraries. While cache hit rates of 98% and higher are taken for granted, the figures above suggest that simply adding disk space is not remotely enough to keep the cache hit ratio at the desired level. This presentation explores several approaches, including tiered cache systems, which use a hierarchy of caching servers employing consistent hashing and other techniques to maximize scalability and performance while minimizing failover and downtime. It also covers one-hit-wonder elimination, which uses simple counters to reduce cache pollution by avoiding storing unpopular content, and cache-state sharing, which employs Bloom-filter-based technology to further improve cache scalability and effective disk space utilization. Finally, it will examine the deployment of edge computing to amplify caching efficiency in specific use cases.
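
To make the one-hit-wonder idea concrete, here is a minimal Python sketch of an LRU cache that only admits an object once it has been requested a second time. It is an assumption-laden toy, not CDN77's implementation: the class name `AdmitOnSecondRequestCache`, the `admit_after` threshold, and the unbounded counter dictionary are all choices made for this example (a production system would bound or age the counters, for instance with a Bloom filter or a sliding window).

```python
from collections import OrderedDict


class AdmitOnSecondRequestCache:
    """LRU cache that skips objects until they have been requested `admit_after` times."""

    def __init__(self, capacity: int, admit_after: int = 2):
        self.capacity = capacity
        self.admit_after = admit_after
        self.store = OrderedDict()      # cached objects in LRU order (oldest first)
        self.seen = {}                  # request counters for keys not yet cached

    def get(self, key, fetch_from_origin):
        """Return (object, was_cache_hit)."""
        if key in self.store:
            self.store.move_to_end(key)             # refresh LRU position
            return self.store[key], True
        obj = fetch_from_origin(key)                # miss: always go upstream
        count = self.seen.get(key, 0) + 1
        if count >= self.admit_after:               # popular enough: admit it
            self.seen.pop(key, None)
            self.store[key] = obj
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)      # evict least recently used
        else:
            self.seen[key] = count                  # remember, but do not store
        return obj, False
```

With this policy, a segment requested exactly once never displaces anything from the cache; the price is one extra origin fetch for every object that later turns out to be popular.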

Yingyu Yao

YouTube

Your TV Is Eating Your Frames

At YouTube, we aspire to stream cat videos to everything that has a screen, including the largest of them all: TVs in your living room. Despite being devices engineered to be video playback powerhouses, it is unexpectedly difficult to make videos play consistently and smoothly on them. From the lens of a player engineer, I will take you on a shallow dive through the TV media stack, and we will explore different ways playback can get tripped up on those large screens.

Alex Giladi

Comcast

Ads and overlays

The concept of server-guided ad insertion (SGAI), first introduced by Hulu in 2019, is getting increasingly popular in the industry. It is markedly more scalable than the traditional server-side (SSAI) approach, but nearly as resilient. It is more interoperable and more resilient than the client-side (CSAI) approach but is nearly as efficient and versatile. Client-side graphic overlays are to a degree a reincarnation of the banner ads plaguing the web since the '90s. Their main use is not necessarily ad-related -- they are used in a variety of roles from station identification to localization to emergency notification. Traditionally, overlays have been inserted in baseband (i.e., pre-transcoder) in a playout system, which is the least scalable and the highest-latency approach possible in the video world. The streaming ecosystem has standardized and maturing support for SGAI. Interstitials are used to implement the approach in HLS. XLink was used in the original MPEG-DASH implementation of the approach; however, XLink suffers from a number of design flaws and was never widely implemented in the context of live channels and events. Media Presentation Insertion, a recent addition to MPEG-DASH, revisits this concept and allows spawning a new media presentation while pausing the main channel. As opposed to HLS interstitials, media presentation insertion allows asynchronous termination ("return to network"), supports VAST tracking, and more. The same server-guided model can be applied to the overlay use case and has the potential to improve scalability, targeting, and glass-to-glass latency in a dramatic way. This talk will first describe the new MPEG-DASH media presentation insertion approach and its application to SGAI and blackouts. It will then cover the application of the same principles to graphic overlays in MPEG-DASH. The presentation will conclude with a description and a demo of an open-source implementation of both technologies.

2:50 PM PDT

Break

3:10 PM PDT

John Bowers

Twitch/Amazon IVS

Free ABR Renditions for User Generated Content Platforms

Well, not exactly free - but much, MUCH lower cost than server-side transcoding! Providing an ABR ladder is table stakes for live viewer experiences, but it’s expensive for at-scale video platforms to provision and maintain specialized infrastructure to handle peak transcoding demand. A recently developed update to the Enhanced RTMP specification adds multitrack video, multitrack audio, and advanced codec support. With implementations in OBS Studio and Wireshark, the technology is ready for you to adopt. Now you can offer all creators - regardless of audience size - ABR playback. Come and learn why encoding multiple video tracks on the content creator’s machine at the edge is higher quality, lower latency, and more scalable than server-side transcoding – all while allowing faster innovation and deployment of newer codecs like HEVC and AV1.

Ryan Cunningham

Scenery

WebCodecs vs. WASM for Fast Video Scrubbing

We built a web-based video editor capable of fast scrubbing and advanced WebGL compositing features. This talk explores the intricacies of building such an editor using WebCodecs for video decoding and preview and contrasts it with traditional methods, specifically using HTML video or a WASM H264 decoder. The goal is to provide a comprehensive guide on implementing a high-performance video editor preview that leverages modern web technologies while addressing practical challenges and limitations, and to reveal areas where improvement is needed. HTML video elements, while widely used, pose significant limitations for fast scrubbing and precise frame accuracy. Slow seeking, lack of control over frame rendering, and the need to use drawImage to get frames into a WebGL texture can hinder the perceived speed in a video editor. WebCodecs provides a low-level API that allows developers to decode video segments and render them to textures, enabling extremely fast scrubbing and WebGL compositing directly in the web browser. By holding video data in GPU textures, we achieve advanced features such as alpha-transparency using just the H264 decoder. The talk will dive into the implementation details, showcasing pre-loading and garbage collection techniques. We will also discuss the pipeline nature of WebCodecs decoders, which necessitates efficient management of VideoFrames to maintain performance. Despite its advantages, WebCodecs comes with its own set of challenges. The hardware-based implementation means no actual concurrent decodes, and rendering VideoFrames to textures is surprisingly CPU-intensive. Additionally, the performance can be inconsistent across different hardware due to Google's GPU exclusion list in Chrome, which defaults to software decoding on certain computers. This session will cover mitigation strategies, including conducting test decodes to determine performance viability. We will discuss the trade-offs and potential pitfalls of using WebCodecs.
Before the advent of WebCodecs, our approach involved using a WASM-compiled H264 decoder, tinyh264. Using WASM in Web Workers, we achieve true concurrent decoding. However, it comes with its own set of limitations. Running entirely on the CPU, it requires managing frames in main memory and handling the upload to the GPU, alongside color space conversions from YUV to RGB. Furthermore, it creates licensing issues since it distributes an H264 decoder. We will discuss the implementation details, performance considerations, and how it compares to WebCodecs.

Bruce Spang

Netflix

Wei Wei

Netflix Inc

Innovate Live Streaming with a Client Simulator

One of the major challenges in live streaming is the scarcity of real-world events to test innovative ideas and algorithms, such as adaptive bitrate (ABR) algorithms. Relying on actual live events for testing not only prolongs the innovation cycle but also increases the risk of negatively impacting user experience. To overcome this obstacle, we at Netflix have enhanced our existing client simulator to emulate live streaming scenarios. This simulator utilizes network traces and device characteristics gathered from real-world sessions to drive our production client library, and it plays a crucial role in driving innovation at Netflix. In this talk, we will first present how the client simulator simulates live streaming. Then we will demonstrate how it can be used to test new live encoding methods, like Variable Bitrate (VBR) encoding, and to evaluate various ABR algorithms on a large scale. We will conclude the talk with future directions.
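
For readers unfamiliar with trace-driven simulation, the general idea can be sketched in a few lines of Python. This is a toy under stated assumptions and has nothing to do with Netflix's actual simulator or client library: the function `simulate`, the one-sample-per-second trace format, the fixed segment duration, and the simple "highest rung below a safety fraction of the last measured throughput" rule are all invented for illustration. The trace is assumed to contain only positive throughput values, and it loops if the session outlasts it.

```python
def simulate(trace_kbps, ladder_kbps, segment_s=2.0, n_segments=10, safety=0.8):
    """Replay a throughput trace (one sample per second) against a toy ABR rule.

    Returns the bitrate chosen for each segment and the total time spent
    waiting with an empty buffer (startup delay included).
    """
    t = 0.0                     # wall-clock time in seconds
    buffer_s = 0.0              # seconds of media buffered
    stall_s = 0.0               # time spent with nothing to play
    estimate = trace_kbps[0]    # throughput estimate, seeded from the trace
    chosen = []

    for _ in range(n_segments):
        # ABR decision: highest rung that fits under the discounted estimate.
        usable = [b for b in ladder_kbps if b <= safety * estimate]
        bitrate = max(usable) if usable else min(ladder_kbps)

        # Download the segment by walking through the trace second by second.
        kbit_left = bitrate * segment_s
        start = t
        while kbit_left > 0:
            rate = trace_kbps[int(t) % len(trace_kbps)]
            slot_end = int(t) + 1
            needed = kbit_left / rate
            if needed <= slot_end - t:          # finishes inside this one-second slot
                t += needed
                kbit_left = 0.0
            else:                               # use up the slot and continue
                kbit_left -= rate * (slot_end - t)
                t = float(slot_end)
        download_s = t - start

        # Update the estimate and the playback buffer.
        estimate = bitrate * segment_s / download_s
        stall_s += max(0.0, download_s - buffer_s)
        buffer_s = max(0.0, buffer_s - download_s) + segment_s
        chosen.append(bitrate)

    return chosen, stall_s
```

Swapping in a different decision rule or a different trace and comparing the returned bitrates and stall time is the essence of evaluating ABR ideas offline, without waiting for a real live event.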

4:00 PM PDT

Break

4:40 PM PDT

Gwendal Simon

Synamedia

Token Renewal: Keeping your Streaming Party Smooth and Secured

CDN leeching is a growing concern for content providers. The recent specification of the Common Access Token (CAT) has introduced a vehicle for designing more secure streaming delivery systems. Best practices for CDN content protection often involve renewing the token, either due to short expiration times or probabilistic rejections. However, token renewal is far from trivial. In token-based delivery systems, we identify three key entities: the client, the CDN server, and the token generator. Typically, these communicate via HTTP(S). At any point during a streaming session, the CDN server may request the client to renew its token, ensuring seamless video playback, even for low-latency ABR streaming. The CAT specification includes two claims related to renewal: catr and catif. While the specification details several operation modes, none fully satisfy the combined requirements for fast renewal, legacy clients, and the unique characteristics of DASH and HLS. In this talk, we will unpack the current situation, presenting the pros and cons of each proposed solution. We aim to open the door to a better solution and outline the community effort needed for its implementation.

Will Law

Akamai

Creative Monkeys Contemplate Dating

The geeky primates at WAVE are releasing version 2 of the popular CMCD standard. While CMCD v1 was restricted to a CDN (data) relationship, v2 gives you three different modes for concurrently sharing data. Now you can date a content steering service, and an analytics service, at the same time as maintaining a committed relationship with your CDN :) This talk highlights the new features and capabilities of CMCD v2. In addition to the reporting mode enhancements, we'll investigate the host of new keys being offered: media start delay, target buffer length, buffer starvation duration, prefetching multiple objects at once, player state, response code, TTFB, timestamps, request URLs and many more. We'll explore how v2 can be used to drive lightweight data for content steering decisioning, rich collection for analytics providers that is decoupled from the delivery, and even improved prefetching performance and visibility for the CDN. We'll show it all working and release some code so that you too can experiment. Join us!
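
For context on what a player actually sends, here is a small Python sketch of the CMCD v1 query-argument form, which is the baseline that v2 extends. It is a simplified illustration, not code from the talk: the helper name `cmcd_query` is hypothetical, only a few v1 keys are shown (`br`, `bl`, `bs`, `ot`, `sid`), and the new v2 keys and reporting modes listed above are not modelled.

```python
from urllib.parse import quote

# v1 keys whose values are unquoted tokens rather than quoted strings.
TOKEN_KEYS = {"ot", "sf", "st"}


def cmcd_query(data: dict) -> str:
    """Serialize CMCD key/value pairs into a single `CMCD=` query argument."""
    parts = []
    for key in sorted(data):                       # keys in alphabetical order
        value = data[key]
        if value is True:
            parts.append(key)                      # boolean true: key without a value
        elif value is False:
            continue                               # simplification: false keys are omitted
        elif isinstance(value, str) and key not in TOKEN_KEYS:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'{key}="{escaped}"')     # strings are double-quoted
        else:
            parts.append(f"{key}={value}")         # numbers and tokens go in bare
    return "CMCD=" + quote(",".join(parts), safe="")


# Example: a video segment request at 3200 kbps with 21.3 s buffered,
# reported after a buffer starvation event.
query = cmcd_query({"br": 3200, "bl": 21300, "bs": True, "ot": "v", "sid": "abc"})
```

The resulting string is appended to the segment URL as a query argument; the same key/value pairs can alternatively be carried in request headers.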

RongKai Guo

NVIDIA

Zoe Liu

Visionular

AI Enhanced GPU Video Coding: Achieving Joint High Compression Efficiency and Throughput

We are here to present a novel approach to significantly boost video compression efficiency on NVIDIA NVENC hardware encoders by leveraging AI-driven pre-analysis and pre-processing algorithms. We refer to this method as AI Enhanced GPU Video Coding, which combines NVENC's high density, low latency, and high throughput with ML-based techniques to enhance video compression efficiency and boost visual quality, while maintaining high throughput. NVENC, as a leading hardware-based encoder, excels in providing high throughput and low latency but generally offers lower compression efficiency compared to CPU-based software encoders. Our AI-driven GPU video compression approach aims to leverage the advantages of both NVENC and AI algorithms to achieve high compression efficiency and throughput performance. Our optimization algorithms mainly include:
1. ML-based Scene & Region Classification: Identifying effective coding tools based on scene and region classification.
2. Regions of Interest (ROI) Identification: Focusing on perceptually significant regions, such as faces and jersey numbers in typical sports videos.
3. Pre-processing Techniques: Applying deblurring, denoising, sharpening, contrast adjustment, etc. to boost visual quality.
4. Hierarchical pre-analysis and pre-classification: Setting fine-grained QPs, including block-based QPs, and enabling quick quality monitoring.
These techniques combined improve video compression efficiency, boosting both objective and subjective quality while achieving significant bitrate savings. We have applied these methods to large UGC content platforms. Our results demonstrate promising improvements in compression efficiency for both VOD and live use cases. Using the NVIDIA T4 Tensor Core GPU, we maintained the same high throughput for multiple parallel encoding threads and achieved a 15-20% bitrate saving and a 1-2 point VMAF score improvement at the same time, on typical UGC & PUGC content compared to the out-of-the-box NVENC approach. Further enhancements, such as re-encoding, are currently being developed and further compression gains are expected.

Yuriy Reznik

Brightcove, Inc.

Streaming in the 1970s. NVP & ST: the very first real-time streaming protocols.

In this talk we will go back in history and look at the very first protocols and systems developed for internet streaming: the venerable NVP (Network Voice Protocol) and ST (Internet Stream Protocol, aka IPv5), developed by Danny Cohen, Jim Forgie, and other brilliant engineers at MIT Lincoln Labs in the 1970s. We will discuss the key ideas introduced by these protocols (the concepts of sessions, available capacity assessment, rate negotiation between sender and receiver, data transfer protocols, the need for network-layer support for sessions, resource provisioning, etc.) and show how most of these ideas became incorporated in subsequent designs. Specifically, we will show how many ideas introduced in NVP and ST have eventually found their implementations in modern protocols, such as WebRTC, QUIC and MOQ. The talk will include many historical pictures and some videos of those early pioneering systems built in the 1970s. It will also try to explain what motivated the original developers to come up with all these techniques.

Matt McClure

Demuxed

Closing Remarks

Window title: demuxed > we are 10

10 years of Demuxed!

Demuxed is video engineers talking about video technology

Our first meeting was a single-day event back in 2015, born out of the SF Video Technology meetup. The video industry had plenty of trade shows and other opportunities for The Business, but our goal was to create a conference and community for the engineers building the technology powering video, from encoding, to delivery, to playback, and beyond. We’ve grown a lot since then, but our goal remains the same.

After creating Demuxed, some of the organizers went on to start and work at Mux. Mux continues to sponsor most of the organizational work behind the scenes (thanks for the salary!), but Demuxed is, at its core, a community-led event.

Every year we get a group together that’s kind enough to do things like schedule planning, help brainstorm cool swag, and, most importantly, argue heatedly over which talk submissions should make the final cut. These folks are the ones hanging out in Video Dev Slack, and they hail from all over the industry.

Window title: demuxed > photos
Window title: demuxed > illos
Window title: demuxed > sponsors

Our sponsors

We thank all our amazing sponsors!