Compression Fundamentals

As display resolutions, frame rates, and color depths continue to increase, the amount of raw pixel data in a production grows rapidly. Compression is the set of strategies used to store and transmit this data efficiently — reducing file sizes, bandwidth requirements, and storage costs while preserving as much visual quality as possible.
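To get a feel for the scale involved, the raw data rate of uncompressed video can be estimated from resolution, frame rate, and bit depth. The sketch below is illustrative only: the function name and the two example formats are assumptions chosen for the arithmetic, not WATCHOUT settings.

```python
def raw_data_rate_bytes_per_second(width, height, fps, bits_per_channel, channels=3):
    """Uncompressed data rate, assuming every pixel stores `channels` full samples."""
    bits_per_frame = width * height * channels * bits_per_channel
    return bits_per_frame * fps / 8

# Two hypothetical formats, chosen only to illustrate the growth.
hd = raw_data_rate_bytes_per_second(1920, 1080, 30, 8)
uhd = raw_data_rate_bytes_per_second(3840, 2160, 60, 10)

print(f"1920x1080, 30 fps, 8-bit:  {hd / 1e6:.0f} MB/s")
print(f"3840x2160, 60 fps, 10-bit: {uhd / 1e6:.0f} MB/s")
print(f"Growth factor: {uhd / hd:.1f}x")
```

Quadrupling the pixel count, doubling the frame rate, and moving from 8 to 10 bits multiplies the raw rate by ten in this example.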

Understanding compression fundamentals helps you make informed decisions about source formats, optimization settings, and playback reliability in WATCHOUT. Two key concepts underpin every compression workflow:

Encoding — transforming color data from its original, uncompressed form into a more storage-efficient representation.
Decoding — transforming encoded data back into pixel values that can be rendered and displayed.

During encoding, the accuracy of the original data may decrease depending on the algorithm used:

Lossless compression reconstructs the original data exactly. No information is sacrificed, but file size reductions are modest.
Lossy compression removes data that the algorithm determines to be perceptually less important, achieving much smaller files at the cost of some accuracy.
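The difference between the two approaches can be sketched with Python's standard zlib module. This is a minimal illustration, not how any video codec works: zlib stands in for a lossless coder, and the quantize helper is a hypothetical, deliberately crude lossy step applied to synthetic sample data.

```python
import zlib

def quantize(samples: bytes, step: int) -> bytes:
    """Crude lossy step: snap each 8-bit sample down to a multiple of `step`."""
    return bytes((s // step) * step for s in samples)

# Synthetic 8-bit "scanline": a slow ramp with some fine detail on top.
original = bytes(min(255, i // 16 + i % 4) for i in range(4096))

# Lossless path: compress, decompress, compare.
packed = zlib.compress(original)
restored = zlib.decompress(packed)

# Lossy path: discard fine detail first, then compress the simplified data.
step = 8
approx = quantize(original, step)
packed_lossy = zlib.compress(approx)
restored_lossy = zlib.decompress(packed_lossy)

max_error = max(abs(a - b) for a, b in zip(original, restored_lossy))

print("lossless round trip exact:", restored == original)
print("lossless size:", len(packed), "bytes")
print("lossy round trip exact:", restored_lossy == original)
print("lossy size:", len(packed_lossy), "bytes, max sample error:", max_error)
```

The lossless path returns the input bit for bit. The lossy path cannot, but its error is bounded by the quantization step, and the simplified data generally compresses further.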

Chroma Subsampling

The human visual system perceives variations in brightness (luminance) much more accurately than variations in color (chrominance). Chroma subsampling exploits this asymmetry by storing color information at a lower resolution than brightness information, significantly reducing data volume with minimal perceptible quality loss.

To apply chroma subsampling, the pixel data is first separated into components:

Y' (luma) — the gamma-encoded luminance channel, representing perceived brightness.
Cb and Cr (chroma) — two chrominance channels that together describe the color of each pixel.
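The separation into luma and chroma can be sketched as a weighted sum. The example below assumes the BT.709 luma coefficients and full-range, normalized values; other standards (such as BT.601 or BT.2020) use different weights, and the function name is illustrative.

```python
# Assumed BT.709 luma weights; they sum to 1.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB

def rgb_to_ycbcr(r, g, b):
    """Convert gamma-encoded R'G'B' in [0, 1] to Y' in [0, 1] and Cb, Cr in [-0.5, 0.5]."""
    y = KR * r + KG * g + KB * b          # luma: weighted sum of the three channels
    cb = (b - y) / (2.0 * (1.0 - KB))     # blue-difference chroma
    cr = (r - y) / (2.0 * (1.0 - KR))     # red-difference chroma
    return y, cb, cr

for name, rgb in [("white", (1, 1, 1)), ("mid gray", (0.5, 0.5, 0.5)), ("red", (1, 0, 0))]:
    y, cb, cr = rgb_to_ycbcr(*rgb)
    print(f"{name:8s} Y'={y:.4f} Cb={cb:+.4f} Cr={cr:+.4f}")
```

Neutral colors (white, gray, black) carry no chroma at all, which is why brightness detail survives even when the chroma channels are stored at reduced resolution.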

The a:b:c Notation

Chroma subsampling ratios are described using the notation a:b:c, where:

Parameter	Meaning
a	Width of the sample region in pixels (conventionally 4); the region is two rows high
b	Number of chrominance samples (Cb, Cr) in the first row of the region
c	Number of chrominance samples that change between the first and second row of the region (0 means the second row reuses the first row's chroma)

The most common subsampling formats are:

Format	Description	Data Reduction	Quality Impact
4:4:4	Every pixel has its own unique chrominance values	None	No loss — full color fidelity
4:2:2	Two chrominance samples per row, with each row carrying its own samples (half horizontal, full vertical chroma resolution)	~33%	Minimal — suitable for high-quality production
4:2:0	Two chrominance samples in the first row, reused by the second row (half horizontal, half vertical chroma resolution)	~50%	Moderate — standard for consumer video delivery
4:1:1	One chrominance sample per row, with each row carrying its own sample (quarter horizontal, full vertical chroma resolution)	~50%	Moderate — used in some DV formats
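The Data Reduction column follows from counting samples in a region that is a pixels wide and two rows high. The sketch below assumes one luma sample per pixel and equal bit depth for all three channels; the function name is illustrative.

```python
def data_reduction(a, b, c):
    """Fraction of samples saved by a:b:c subsampling relative to a:a:a (full chroma)."""
    luma = 2 * a                # one Y' sample per pixel, two rows
    chroma = 2 * (b + c)        # Cb and Cr: b samples in row one, c new samples in row two
    full = 2 * a * 3            # Y', Cb and Cr for every pixel
    return 1 - (luma + chroma) / full

for fmt in [(4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)]:
    print("{}:{}:{}".format(*fmt), f"{data_reduction(*fmt):.0%}")
```

For 4:2:0 this gives 8 luma samples plus 4 chroma samples, 12 in total against 24 for 4:4:4, which is the 50% figure in the table.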

For critical content where fine color detail matters (graphics with thin colored lines, chroma-keyed footage, or text overlays), prefer 4:4:4 or 4:2:2 sources. The color smearing introduced by 4:2:0 subsampling is often visible on hard color edges.
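The smearing on hard color edges can be reproduced with a toy chroma plane. This is a simplified sketch: it assumes a plain 2x2 box average for downsampling and pixel replication for upsampling, whereas real encoders and decoders use a variety of filters and chroma sample positions.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]

def upsample_420(small):
    """Rebuild a full-size plane by repeating every sample over a 2x2 block."""
    full = []
    for row in small:
        wide = [value for value in row for _ in range(2)]
        full.append(wide)
        full.append(list(wide))
    return full

# One chroma channel of a 4x2 region with a hard color edge after the first pixel.
cb = [
    [0.5, -0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, -0.5],
]

result = upsample_420(subsample_420(cb))
for row in result:
    print(row)
```

The edge falls inside a 2x2 block, so the two colors are averaged together: the first two columns both come back as 0.0, and neither pixel keeps its original color.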

Temporal Compression

Consecutive frames in a video sequence are typically very similar — most of the image stays the same from one frame to the next. Temporal compression (also called inter-frame compression) takes advantage of this redundancy by analyzing motion between frames and encoding only the differences rather than storing each frame independently.

The core technique is motion compensation: the encoder predicts the current frame based on one or more reference frames and stores only the residual (the difference between prediction and reality). The smaller the difference, the less data needs to be stored.
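Motion compensation can be sketched as a block search followed by a subtraction. The example below is a minimal illustration that assumes an exhaustive search and a sum-of-absolute-differences cost; production encoders use far more elaborate search strategies, and all names here are hypothetical.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block from a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def predict_block(ref, cur, top, left, size, search):
    """Find the best match for a block of `cur` in `ref`; return (motion vector, residual)."""
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the reference frame
            cost = sad(target, get_block(ref, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    predicted = get_block(ref, top + dy, left + dx, size)
    residual = [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(target, predicted)]
    return (dy, dx), residual

# Reference frame, and a current frame in which the same object moved two pixels right.
ref = [[0, 0, 0, 0, 0, 0],
       [0, 9, 5, 0, 0, 0],
       [0, 3, 7, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]]
cur = [[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 9, 5, 0],
       [0, 0, 0, 3, 7, 0],
       [0, 0, 0, 0, 0, 0]]

vector, residual = predict_block(ref, cur, top=1, left=3, size=2, search=2)
print("motion vector (dy, dx):", vector)
print("residual:", residual)
```

Because the object only moved, the encoder finds a perfect match two pixels to the left in the reference frame. The residual is all zeros, so only the motion vector needs to be stored for this block.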

Frame Types

Temporally compressed video uses three types of frames:

Frame Type	Description
I-frame (Intra / Keyframe)	A complete frame encoded independently, with no reference to other frames. Can be decoded on its own.
P-frame (Predicted)	Stores only motion vectors and the residual differences relative to a previous reference frame. Requires that reference frame to decode.
B-frame (Bidirectional)	A predicted frame that references both previous and subsequent frames, looking both forward and backward for changes. Achieves the highest compression but requires multiple reference frames.

Group of Pictures (GOP)

A Group of Pictures (GOP) is the sequence of frames that starts at one I-frame and runs up to, but does not include, the next. It consists of a single keyframe followed by a series of predicted frames (P-frames and B-frames). The GOP length is the distance, in frames, between consecutive I-frames.
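The GOP structure can be illustrated by splitting a sequence of frame types at each I-frame. The sketch below assumes the frame types are already known and written as a string of I, P, and B characters in display order; the function name and the sample sequence are hypothetical.

```python
def split_into_gops(frame_types):
    """Split a string of frame types into groups that each start at an I-frame."""
    gops = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append(frame_type)      # an I-frame opens a new GOP
        else:
            gops[-1] += frame_type       # P- and B-frames belong to the current GOP
    return gops

sequence = "IBBPBBPIBBP"                 # made-up example sequence
gops = split_into_gops(sequence)
print(gops)
print("GOP lengths:", [len(g) for g in gops])
```

Here the first GOP is seven frames long and the second is four, each measured from its I-frame.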

Short GOPs (e.g., 1–15 frames) provide frequent random-access points at the cost of larger files. Long GOPs (e.g., 60–300 frames) achieve better compression but make seeking and scrubbing slower. A typical YouTube video might use a GOP of 300 frames.

Media servers need I-frames to display a complete image. When playback jumps to a position in the middle of a GOP, the system must decode forward from the nearest preceding I-frame to reconstruct the target frame. This causes seek delays and can produce visual artifacts during scrubbing — a critical consideration for live show operation where instant timeline navigation is expected.
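The cost of seeking into the middle of a GOP can be estimated by counting the frames between the nearest preceding I-frame and the target. This is a simplified sketch that assumes a fixed GOP length, an I-frame at frame 0, and no B-frames; the function name is illustrative.

```python
def frames_to_decode(target_frame, gop_length):
    """Frames that must be decoded to display `target_frame` after a seek."""
    keyframe = (target_frame // gop_length) * gop_length   # nearest preceding I-frame
    return target_frame - keyframe + 1

for gop_length in (1, 15, 300):
    print(f"GOP {gop_length:3d}: seeking to frame 299 decodes "
          f"{frames_to_decode(299, gop_length)} frame(s)")
```

With a GOP of 1, every seek decodes exactly one frame. With a GOP of 300, the worst case decodes 300 frames before the requested image can be shown.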

Relevance to WATCHOUT

Understanding compression directly affects how you prepare and manage media in a WATCHOUT production:

Asset Manager optimization — when media is added to a show, the Asset Manager processes it into an optimized format for real-time playback. Knowing how your source material is compressed helps you anticipate optimization time, output file sizes, and quality trade-offs.
Choosing input formats — source files with long GOPs (H.264, HEVC) compress well but require more processing during optimization. Sources already in I-frame-only formats arrive closer to their final playback form.
I-frame-only codecs for instant seeking — codecs such as HAP, NotchLC, and ProRes encode every frame independently (effectively a GOP of 1). Any frame can be decoded without first decoding its predecessors, which removes the GOP-related seek delay and is why WATCHOUT's optimizer targets these formats by default.
Chroma subsampling and visual quality — the subsampling format of your source material sets the upper limit of color detail that survives through the pipeline. Delivering 4:2:0 content when the production requires precise color edges means that quality is lost before WATCHOUT ever processes it.

For the best playback experience, deliver source media in I-frame-only codecs (HAP, NotchLC, or ProRes) with 4:2:2 or 4:4:4 chroma subsampling. This minimizes optimization time and ensures instant seeking on the timeline.