HDR/WCG Live Distribution Challenges
By Julien Le Tanou and Michael Ropert (November 2018)
We understand that for many in the industry, the move from SDR to HDR may seem like a cumbersome task. Working
alongside our industry partners and clients, MediaKind has gained extensive experience in efficient HDR/WCG
distribution. Playing a part in our clients’ journey to HDR/WCG has given us a unique opportunity to acquire an in-depth
perspective on how to resolve issues stemming from the SDR to HDR shift. It has helped inspire new developments in our
products and solutions, and more specifically in the MediaKind SW encoding product line, to meet the needs of running an
HDR/WCG live distribution chain. We invite you to explore the context of HDR/WCG with us, and to take a closer look at how
transitioning to HDR/WCG can benefit you.
The HDR/WCG context
End-users have high expectations and demand the same level of quality across multiple systems and devices. Yes, the transition to HDR/WCG
can pose a certain number of challenges to broadcasters and providers who are expected to maintain legacy systems, and provide content for new
displays, while ensuring the same level of quality. However, there are now effective solutions available to help face those challenges and meet
the requirements for HDR/WCG evolutions. Before we can fully explain the benefits of transitioning, let’s take a moment to define HDR and WCG.
Evolution to HDR
What is HDR?
High Dynamic Range (HDR) imaging aims to capture, and reproduce, the entirety of the dynamic range of visible light perceivable by the human eye,
and thus, improve the perceived image quality.
As illustrated in Figure 1, the visible light range extends from 10^-6 to 10^8 nits, and thus spans 14 orders of luminance magnitude.
The human visual system (HVS), or more simply put, the human eye, uses a number of mechanisms to adapt our eyesight to the range of
perceived light, allowing us to perceive the entire dynamic range. Without these adaptations, the human eye can instantaneously perceive only 5
orders of luminance magnitude, located somewhere within that 10^-6 to 10^8 nits range.
Today, HDR devices can capture or render about 5 orders of luminance magnitude. In contrast to HDR, and as depicted in Figure 1, the Standard
Dynamic Range (SDR) displays, and capture devices, are only able to process a total of 3 orders of luminance magnitude.
The main benefit of HDR systems is the ability to process a much larger scale of luminance than SDR systems (5 vs 3 orders) to improve the
range of light perception.
Figure 1 : Light sources and associated luminance levels showing the dynamic ranges
of the human eye in relation to SDR and HDR systems
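The "orders of magnitude" figures quoted above are base-10 logarithms of a luminance ratio. The short sketch below reproduces the 14 orders of the visible range. The SDR and HDR ranges in it are illustrative assumptions chosen to give 3 and 5 orders; they are not values taken from Figure 1.

```python
import math

def orders_of_magnitude(min_nits: float, max_nits: float) -> float:
    """Number of decades (powers of ten) between two luminance levels."""
    return math.log10(max_nits / min_nits)

# Visible light range quoted in the text: 10^-6 to 10^8 nits.
print(round(orders_of_magnitude(1e-6, 1e8)))    # 14

# Hypothetical device ranges, for illustration only.
print(round(orders_of_magnitude(0.1, 100.0)))   # 3 (an SDR-like range)
print(round(orders_of_magnitude(0.01, 1000.0))) # 5 (an HDR-like range)
```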
Studies show that differences in luminance are not perceived with the same intensity in low and high lights: the HVS is much more sensitive to
differences in low lights. This phenomenon can be described using a non-linear Threshold vs Intensity (TVI) function [7], which gives the just
noticeable difference of luminance and chrominance that can be detected by the HVS. In the context of digital TV transmission, this non-linearity
is applied through an OETF (Opto-Electronic Transfer Function), which must be applied prior to the digitization or quantization of the video signal.
For any video system, the digitization or quantization of the signal impacts the perceived quality, and the related bit-depth precision must also be
optimized accordingly to prevent visible artifacts.
The power law, also called gamma correction (or inverse gamma function), as specified in ITU-R BT.709 [1] or in ITU-R BT.1886 [5], has been defined
as the OETF for the luminance range of SDR systems. Figure 5 illustrates the differences between quantization with and without the inverse gamma
OETF. These diagrams show that linear uniform quantization does not result in uniform perceptual error: quantization errors are more visible
in the dark areas. With the inverse-gamma OETF, however, we obtain a uniform perceptual error over the SDR luminance range. Note that, at the
display side, we apply an EOTF (Electro-Optical Transfer Function = OETF^-1) to retrieve and display the relative luminance values, which means
turning digital code values back into visible display light.
Figure 6 : SDR Luminance range quantization with varying bits (or bit-depth). Based on your display capacity
in luminance range, the luminance steps have varying visibility.
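The effect of the OETF on quantization error can be checked numerically. The sketch below is a simplified illustration, not the BT.709 or BT.1886 formulas: it assumes a pure power law with an exponent of 2.4 and full-range code values, and the function names are hypothetical. It measures the relative luminance jump between two adjacent code values around a given luminance level, with and without the gamma encoding.

```python
GAMMA = 2.4  # assumed exponent for this sketch (pure power law, no linear segment)

def oetf_gamma(light: float) -> float:
    """Relative linear light in [0, 1] -> non-linear signal in [0, 1]."""
    return light ** (1.0 / GAMMA)

def eotf_gamma(signal: float) -> float:
    """Non-linear signal in [0, 1] -> relative linear light in [0, 1]."""
    return signal ** GAMMA

def relative_step_at(luminance: float, bits: int, use_gamma: bool) -> float:
    """Relative luminance jump between the two code values around `luminance`."""
    max_code = (1 << bits) - 1
    encode = oetf_gamma if use_gamma else (lambda v: v)
    decode = eotf_gamma if use_gamma else (lambda v: v)
    # Clamp so that both `code` and `code + 1` are valid, non-zero code values.
    code = max(1, min(max_code - 1, round(encode(luminance) * max_code)))
    low = decode(code / max_code)
    high = decode((code + 1) / max_code)
    return (high - low) / low

for level in (0.01, 0.5):  # a dark and a bright relative luminance
    linear = relative_step_at(level, 8, use_gamma=False)
    gamma = relative_step_at(level, 8, use_gamma=True)
    print(f"L={level}: linear step {linear:.1%}, gamma step {gamma:.1%}")
```

With 8 bits, the linear quantizer produces a much larger relative step in the dark level than in the bright one, while the gamma-encoded quantizer keeps the two steps far closer to each other, which is the behaviour described above.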
The same applies for color components, where bit-depth needs to properly span the color gamut in order to remain below the visible threshold of
the HVS.
An HDR system requires changing both the OETF/EOTF and the bit-depth to optimize the perceived quality of the digitized HDR signal to be transmitted,
as further detailed in ITU-R BT.2390 [3] and ITU-R BT.2100 [4].
ITU-R BT.2100 specifies a bit-depth of 10 or 12 bits along with the introduction of two new OETF/EOTFs: Perceptual Quantizer (PQ) and Hybrid
Log-Gamma (HLG).
For the sake of comparison, the shape of each transfer function (and corresponding OETF/EOTF) is plotted in Figure 7, showing the mapping
between the relative scene or displayed light and 10-bit digital code values.
OETF: The opto-electronic transfer function, which converts linear scene light into the video digital signal (code values),
typically within a camera.
EOTF: The electro-optical transfer function, which converts the video digital signal (code value) into the linear light output
of the display.
Figure 7 : Transfer functions for SDR (Rec.709/purple), HLG-HDR (Hybrid/green) and PQ-HDR (ST2084/red) systems.
SLog3 (in blue) is a proprietary transfer function from Sony used in Camera or for production (not related to distribution)
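For readers who want to reproduce the PQ and HLG curves of Figure 7, the following is a minimal sketch of the PQ transfer function pair and the HLG OETF, using the constants published in SMPTE ST 2084 [8] and ITU-R BT.2100 [4]. The function names and the narrow-range 10-bit mapping helper are choices made for this illustration.

```python
import math

# PQ constants (SMPTE ST 2084 / ITU-R BT.2100)
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PQ_PEAK_NITS = 10000.0

def pq_inverse_eotf(nits: float) -> float:
    """Display light in cd/m^2 -> non-linear PQ signal in [0, 1]."""
    y = max(nits, 0.0) / PQ_PEAK_NITS
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2

def pq_eotf(signal: float) -> float:
    """Non-linear PQ signal in [0, 1] -> display light in cd/m^2."""
    ep = signal ** (1.0 / M2)
    return PQ_PEAK_NITS * (max(ep - C1, 0.0) / (C2 - C3 * ep)) ** (1.0 / M1)

# HLG constants (ARIB STD-B67 / ITU-R BT.2100)
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)

def hlg_oetf(scene: float) -> float:
    """Normalised scene light in [0, 1] -> non-linear HLG signal in [0, 1]."""
    if scene <= 1.0 / 12.0:
        return math.sqrt(3.0 * scene)
    return A * math.log(12.0 * scene - B) + C

def to_10bit_narrow(signal: float) -> int:
    """Map a [0, 1] signal to narrow-range 10-bit code values (64..940)."""
    return round(64 + 876 * signal)

for nits in (1.0, 100.0, 1000.0, 10000.0):
    print(nits, to_10bit_narrow(pq_inverse_eotf(nits)))
```

PQ is display referred, so its input is an absolute luminance in cd/m^2, whereas the HLG OETF takes relative scene light. With these formulas, 100 cd/m^2 lands at roughly half of the PQ signal range, which leaves the upper half of the code values for highlights.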
Resolution  | Up to 8K                  | Up to HD                  | 480i/576i
Color range | Up to BT.2020 color gamut | Up to BT.2020 color gamut | Up to BT.709 color gamut
Broadly speaking there are two format types: scene referred, and display referred. A scene referred format describes how the light, captured by
the camera (or output from the production process) is translated into the values stored in the container. A display referred format describes how
the values in the container may be converted into light to be emitted from the display.
Today, the three main formats for HDR coding and distribution are PQ10, HLG10 and HDR10. They all rely on a Y’CbCr container, 4:2:0 sub-sampling
and 10-bit precision.
Main formats:
PQ10: display referred signal, using the PQ transfer function, Rec.2020 color gamut, Y’CbCr container, 4:2:0 sub-sampling and 10-bit
precision.
HLG10: scene referred signal, using the HLG transfer function, Rec.2020 color gamut, Y’CbCr container, 4:2:0 sub-sampling and 10-bit
precision.
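HDR10: PQ10 with additional static metadata, namely the mastering display color volume (SMPTE ST 2086) and the content light levels (MaxFALL
and MaxCLL). These static (set once per production) metadata are used to adapt (map) the signal, which is referred to the display used in
production, to the end-user display. The HDR10 media profile was initially specified by CTA for Blu-ray support.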
HDR10+: HDR10 as baseline + dynamic metadata defined in the SMPTE 2094-1 and SMPTE 2094-40 describing scene-based color volume
mapping technology from Samsung.
Dolby Vision (ST. 2094-based): HDR10 as baseline + dynamic metadata defined in SMPTE 2094-1 and the SMPTE 2094-10 describing a
parametric Tone Mapping for display adaptation from Dolby.
VUI messages are mandatory; three messages are relevant to HDR signaling:
transfer_characteristics: specifies the transfer function (i.e. OETF/EOTF info)
colour_primaries: specifies the color primaries (R,G,B) coordinates of the color gamut
matrix_coeffs: specifies matrix coefficients from RGB to Y’CbCr container conversion for a given color gamut
SEI messages are optional and may be discarded. Regarding the main formats (i.e. PQ10, HLG10, HDR10) listed above, there
are three messages relevant to HDR signaling:
1. Mastering Display Color Volume information (i.e. ST.2086 info)
2. Content Light Level information (i.e. MaxFALL/MaxCLL info)
3. Alternative transfer characteristics information: this SEI specifies a preferred transfer characteristic to be
used as an alternative to the transfer characteristic specified in VUI. It is mainly used for signaling HLG10 with
backward compatibility to SDR BT.2020 displays. Typically, this HLG10 backward compatible mode signals
SDR BT.2020 information in VUI and HLG as the preferred characteristic in SEI. The output stream is then
interpretable by both SDR BT.2020 displays and HDR HLG10 displays.
Format           | VUI colour_primaries | VUI transfer_characteristics | VUI matrix_coeffs | SEI
UHD-TV SDR WCG   | 9 (BT. 2020)         | 14 (Rec. 2020)               | 9 (BT. 2020)      | None
UHD-TV HDR HLG10 | 9 (BT. 2020)         | 14 (Rec. 2020)               | 9 (BT. 2020)      | preferred_transfer_characteristic = 18
UHD HDR HDR10    | 9 (BT. 2020)         | 16 (PQ)                      | 9 (BT. 2020)      | SMPTE ST 2086 + MaxFALL + MaxCLL
Table 3 summarizes the signaling for the more recent UHD-TV HDR formats using dynamic metadata:
Format               | VUI colour_primaries | VUI transfer_characteristics | VUI matrix_coeffs | SEI
UHD HDR HDR10+       | 9 (BT. 2020)         | 16 (PQ)                      | 9 (BT. 2020)      | SMPTE ST 2086 + MaxFALL + MaxCLL + SMPTE 2094-1 + SMPTE 2094-40
UHD HDR Dolby Vision | 9 (BT. 2020)         | 16 (PQ)                      | 9 (BT. 2020)      | SMPTE ST 2086 + MaxFALL + MaxCLL + SMPTE 2094-1 + SMPTE 2094-10
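The signaling tables above can be read as a small decision procedure. The sketch below is a hypothetical helper written for this illustration, not part of any encoder or decoder API. It assumes that the VUI and SEI fields have already been parsed from the stream, and it uses only the code points listed in the tables. PQ10 is inferred as "PQ without static metadata", following the format definitions given earlier.

```python
from typing import Optional

# Code points taken from the signaling tables.
BT2020_PRIMARIES = 9
BT2020_MATRIX = 9
TC_BT2020_SDR = 14   # transfer_characteristics for SDR BT.2020
TC_PQ = 16           # transfer_characteristics for PQ
TC_HLG = 18          # preferred transfer characteristic for HLG

def identify_format(colour_primaries: int,
                    transfer_characteristics: int,
                    matrix_coeffs: int,
                    preferred_transfer: Optional[int] = None,
                    static_metadata: bool = False,
                    dynamic_metadata: Optional[str] = None) -> str:
    """Guess the UHD format from already-parsed VUI and SEI information."""
    if colour_primaries != BT2020_PRIMARIES or matrix_coeffs != BT2020_MATRIX:
        return "not covered by the tables"
    if transfer_characteristics == TC_BT2020_SDR:
        # Backward compatible HLG10: SDR in VUI, HLG as preferred characteristic in SEI.
        return "HLG10" if preferred_transfer == TC_HLG else "SDR WCG"
    if transfer_characteristics == TC_PQ:
        if not static_metadata:
            return "PQ10"
        if dynamic_metadata == "ST 2094-40":
            return "HDR10+"
        if dynamic_metadata == "ST 2094-10":
            return "Dolby Vision"
        return "HDR10"
    return "not covered by the tables"

print(identify_format(9, 14, 9, preferred_transfer=18))  # HLG10
print(identify_format(9, 16, 9, static_metadata=True))   # HDR10
```

Because SEI messages are optional and may be discarded along the chain, a stream that was produced as HDR10 can degrade to PQ10 in this classification, and a backward compatible HLG10 stream can degrade to SDR WCG.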
Interfaces
HDMI:
HDMI 2.0a introduced support for signaling PQ formats, the BT.2020 color space and static metadata (i.e. PQ10 and HDR10
support).
HDMI 2.0b added support for the signaling of Hybrid Log-Gamma. CTA-861-G [18] specifies how ETSI TS 103 433 [19][20]
metadata can be carried on HDMI.
SDI: includes the carriage of the signaling and metadata required for the HDR formats [14].
Figure 8 presents an overview of the processing pipeline to broadcast, or to stream, an HDR/WCG channel (without loss of generality with respect to
an SDR channel). The first step in the pipeline is “ingest”, where input/source characteristics (the transfer function, color gamut, etc.) must be
extracted and forwarded along with the raw pixel data at each step of the encoding process. The input is either uncompressed (SDI) or compressed
(TS IP/HEVC). For SDI, specific fields are reserved for carrying HDR information [14]. For IP/HEVC, HDR-related information is extracted from
VUI/SEI when decoding the elementary stream. The “convert” module is responsible for applying the proper mapping (with respect to dynamic range
and color space information) from the input format to the requested output format. After the possible luminance and color conversions, a
pre-processing adapted to the input characteristics may be applied to ease and optimize the final encoding step. Finally, encoding is performed
according to the dynamic range and color gamut.
The first example of mixed format management is related to premium channels. The premium “HDR-TV” channel requires end-users to
have a UHD HDR capable display at home. However, the reality is that content for the premium channel not only includes native
premium HDR/WCG content but also legacy SDR BT.709 content (or possibly BT.2020) that needs to be up-converted.
Examples would be advertisements, movie and program trailers, etc. This requires mixing legacy content with HDR/WCG content.
The most common example of mixed format management is wanting to provide a level of backward-compatibility with legacy SDR
devices or systems. This is normally the case when addressing a second screen at home (e.g. tablet or smartphone), or for some end-
users that are not yet eligible for HDR/WCG.
As we have experienced with our customers, there is a need for seamless mapping (for the end-user). Each case requires a specific approach and
design. Figure 9 displays an example of the most common use-cases.
The format conversion (or mapping) has to allow for “up-conversion”, for any SDR feed segment toward a given HDR output format, as well as for
“down-conversion”, for any HDR feed segment toward SDR output format. It also must allow for converting one HDR format to another, for
example, from PQ to HLG.
The “down-conversion” of the luminance range or dynamic is named Tone Mapping (TM).
The “up-conversion” of the luminance range or dynamic is named Inverse Tone Mapping (ITM or TM^-1).
The reduction of the color gamut is named Color gamut Reduction (CR).
The extension (or extrapolation) of the color gamut is named Color gamut Extension (CE).
The non-linear transformation from one given transfer function to another for the same luminance range/dynamic is named
Luminance Conversion (LC).
[Table: conversions required for each input signal format, by output signal format (HLG10, HDR10/PQ10, SDR BT709)]
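As a concrete illustration of the color side of these conversions, the sketch below re-expresses linear-light BT.709 RGB values in BT.2020 primaries. This is only the simplest case: it places BT.709 colors in the wider container without extrapolating them, so it is not a full Color gamut Extension. The coefficients are the rounded values commonly quoted from ITU-R BT.2087, a recommendation this paper does not cite, and the function name is hypothetical.

```python
# Rounded linear-light conversion matrix, BT.709 primaries -> BT.2020 primaries
# (values as commonly quoted from ITU-R BT.2087; an assumption of this sketch).
BT709_TO_BT2020 = (
    (0.6274, 0.3293, 0.0433),
    (0.0691, 0.9195, 0.0114),
    (0.0164, 0.0880, 0.8956),
)

def bt709_to_bt2020(rgb_linear):
    """Convert a linear-light (R, G, B) triplet from BT.709 to BT.2020 primaries."""
    r, g, b = rgb_linear
    return tuple(row[0] * r + row[1] * g + row[2] * b for row in BT709_TO_BT2020)

print(bt709_to_bt2020((1.0, 1.0, 1.0)))  # white stays (approximately) white
print(bt709_to_bt2020((1.0, 0.0, 0.0)))  # pure BT.709 red is a less saturated BT.2020 color
```

The matrix must be applied to linear-light values, that is, after the inverse OETF and before the output transfer function is reapplied.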
Schematically, the basic idea of these conversions is to return to the linear light domain, which is the signal domain where luminance values are
comparable, by applying the inverse OETF (OETF^-1), and then to apply the requested output transfer function (OETF) and related signaling. These
transformations are described in the ITU-R BT.2100 recommendation [4]. More details and explanations can also be found in the ITU-R BT.2390
recommendation [3].
HLG has been designed to provide backward-compatibility with an SDR BT.2020 10 bit signal. If compared with the same color gamut and
bit-depth, then SDR and HLG share the same transfer function (converting light to numerical values in 10 bits). This means that HLG can
natively interpret SDR values. Consequently, only a signaling change would be required to identify SDR values as HLG ones.
SDR to PQ conversion requires a non-linear operation. Similarly to the HLG to PQ conversion, SDR is brought back to the linear light domain
(OETF^-1), assuming the peak luminance is 100 cd/m^2. Then, an ad-hoc transfer function (OETF) and associated signaling are applied to
produce a PQ signal.
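The SDR to PQ path described above can be sketched for a single luminance sample as follows. This is a minimal illustration under stated assumptions, not the conversion implemented in any product: the SDR signal is linearized with a pure 2.4 power law, SDR peak white is placed at 100 cd/m^2 as in the text, the PQ inverse EOTF from SMPTE ST 2084 [8] is used as the output transfer function, and no color gamut conversion or luminance expansion is applied.

```python
# PQ constants (SMPTE ST 2084 / ITU-R BT.2100)
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits: float) -> float:
    """Absolute light in cd/m^2 -> non-linear PQ signal in [0, 1]."""
    yp = (max(nits, 0.0) / 10000.0) ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2

def sdr_to_pq(sdr_signal: float, peak_nits: float = 100.0, gamma: float = 2.4) -> float:
    """Map a non-linear SDR sample in [0, 1] to a PQ sample in [0, 1]."""
    linear = max(sdr_signal, 0.0) ** gamma  # step 1: back to relative linear light
    nits = linear * peak_nits               # step 2: anchor SDR white at peak_nits
    return pq_encode(nits)                  # step 3: apply the PQ transfer function

for s in (0.0, 0.5, 1.0):
    print(s, round(64 + 876 * sdr_to_pq(s)))  # narrow-range 10-bit code value
```

With these assumptions, SDR peak white ends up at roughly half of the PQ signal range, so the converted content keeps its original brightness rather than being stretched toward HDR highlights.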
In addition to the SDR to HLG or SDR to PQ conversions, a light (luminance) expansion can be applied to give an HDR look to the legacy
content. This means that a non-linear transformation is applied to the original values. However, from a content provider perspective, we suggest
avoiding a light or luminance expansion, because it modifies the artistic intent of the content creator.
Pre-processing and the local adaptive quantization are the two main levers to improve the video quality of HDR compressed images, and they
must be advantageously utilized in any industrial implementation.
[Figure: processing pipeline blocks: Ingest (uncompressed SDI or compressed IP, with metadata extraction/conveyance); conversion (HDR to HDR, HDR to SDR); HDR-aware adaptive quantization; Compression (HEVC 10-bit encoding)]
References
1. Recommendation ITU-R BT.709, “Parameter values for the HDTV standards for production and international programme exchange”, 2012
2. Recommendation ITU-R BT.2020, “Parameter values for ultra-high definition television systems for production and international programme exchange”,
2014
3. Recommendation ITU-R BT.2390, “High dynamic range television for production and international programme exchange”, 2016
4. Recommendation ITU-R BT.2100, “Image parameter values for high dynamic range television for use in production and international programme
exchange”, 2016
5. Recommendation ITU-R BT.1886, “Reference electro-optical transfer function for flat panel displays used in HDTV studio production”, 2011
6. F. Banterle, A. Artusi, K. Debattista, and A. Chalmers. “Advanced High Dynamic Range Imaging, Second Edition”. CRC Press, 2017.
7. D. Gommelet, “Methods for Improving the Backward Compatible High Dynamic Range Compression”, PhD Thesis, Ericsson-INRIA, 2018
8. SMPTE, Standard ST 2084, “High Dynamic Range Electro Optical Transfer Function of Mastering Reference Displays” September 2014.
9. SMPTE, Standard ST 2086, “Mastering Display Color Volume Metadata Supporting High Luminance and Wide Color Gamut Images”, 2014.
10. ARIB STD B67, “Essential Parameter Values for the Extended Image Dynamic Range Television System for Programme Production.” July 3, 2015, Available
online at [Link] STD-B67v1_0.pdf
11. “End-to-end Guidelines for UHD Phase A implementation” Available online at [Link]
12. ETSI TS 101 154 v2.3.1 Digital Video Broadcasting (DVB), “Specification for the use of Video and Audio Coding in Broadcasting Applications based on the
MPEG-2 Transport Stream”, Feb. 2017.
13. Recommendation ITU-T H.265 and ISO/IEC 23008-2, “High efficiency video coding” 2013.
14. SMPTE, Standard ST 425-1, “Source Image Format and Ancillary Data Mapping for the 3 Gb/s Serial Interface”, 2012.
15. T-F Bronner et al. “Evaluation of Color Mapping Algorithms in Different Color Spaces”, Proceedings Volume 9971, Applications of Digital Image
Processing, SPIE, 2016.
16. Recommendation ITU-T Rec H Supplement 15, “Conversion and coding practices for HDR/WCG Y'CbCr 4:2:0 video with PQ transfer characteristics”, 2017.
17. SMPTE Standard, ST 2094-1, “Dynamic Metadata for Color Volume Transform - Core Components”, Jun. 2016.
18. CTA Standard CTA-861-G, “A DTV Profile for Uncompressed High Speed Digital Interfaces”, Nov. 2016.
19. ETSI TS 103 433-1 v1.2.1 Digital Video Broadcasting (DVB), “High-Performance Single Layer High Dynamic Range (HDR) System for use in Consumer
Electronics devices; Part 1: Directly Standard Dynamic Range (SDR)”, Aug. 2017.
20. ETSI TS 103 433-2 v1.1.1 Digital Video Broadcasting (DVB), “High-Performance Single Layer High Dynamic Range (HDR) System for use in Consumer
Electronics devices; Part 2: Enhancements for Perceptual Quantization (PQ) transfer function based High Dynamic Range (HDR) Systems (SL-HDR2)”, Jan.
2018.
© 2018 MediaKind
MediaKind maintains a policy of product improvement and reserves the right to
modify the specifications without prior notice.