VP9 codec - What is VP9?

What is VP9?

VP9 is an open-source video codec developed by Google after its acquisition of On2 Technologies in February 2010. It was the second codec Google released, after VP8, and it became generally available in June 2013.

In terms of usage, YouTube has been the largest distributor of VP9-encoded content. In 2016, Netflix also announced that it would use VP9 on its platform. Since then, many new use cases have emerged where VP9 has proven to be a viable solution.

VP8 vs VP9

VP8 and VP9 are video compression formats from Google. VP8 was originally developed by On2 Technologies and was released by Google as an open format in 2010. VP9 is its successor, released in 2013. Both formats use block-based motion estimation and compensation to achieve their compression performance.

VP8 was designed to be more efficient than its predecessor, VP7. It has improved compression algorithms and better rate control, which allow better-quality video at lower bitrates. VP8 also supports multiple resolutions, including HD and UHD. VP9 is an improved version of VP8. It has better compression performance and more efficient coding tools, which allow higher-quality video at lower bitrates. VP9 also supports High Dynamic Range (HDR) and 10-bit color depth.

VP9 generally produces better-quality video than VP8 at the same bitrate, and it has better support for 4K and 8K resolutions, making it well suited to streaming high-resolution video. However, VP9 is more computationally demanding to encode and decode. Most modern browsers can play VP9, but older devices that lack hardware VP9 decoding may struggle with it, particularly at high resolutions.

How does VP9 codec work?

The VP9 codec works in much the same way as H.265 (HEVC) and is designed to support parallel processing. It can cut the bitrate roughly in half compared with older codecs such as VP8 and H.264 at comparable visual quality. This makes smoother streaming possible on bandwidth-constrained devices such as tablets and smartphones. The encoder compresses the raw video into a much smaller bitstream that is fit for transmission over the internet. Here is a brief overview of what goes on behind the scenes with VP9.

Picture partitioning

VP9 first divides each picture into superblocks, which are blocks of 64x64 pixels. Superblocks are processed in raster order: from left to right, and top to bottom. This is similar to other codecs. Each superblock can be subdivided further, down to blocks as small as 4x4, using a recursive quadtree similar to the one used in HEVC. Unlike HEVC, however, a block at any level of the quadtree can also be split in just one direction, horizontally or vertically, in which case the subdivision stops there.
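The partitioning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the partition labels ("none", "horizontal", "vertical", "split") are hypothetical and are not taken from any VP9 library.

```python
import math

SUPERBLOCK = 64  # superblock edge length in pixels, as described in the text


def superblock_grid(width, height):
    """Return (columns, rows) of 64x64 superblocks needed to cover a frame."""
    return math.ceil(width / SUPERBLOCK), math.ceil(height / SUPERBLOCK)


def raster_order(cols, rows):
    """Yield superblock (row, col) positions left to right, top to bottom."""
    for row in range(rows):
        for col in range(cols):
            yield row, col


def split_block(x, y, size, partition):
    """Return the (x, y, width, height) sub-blocks for one partition choice.

    Only "split" produces square blocks that may be partitioned again.
    """
    half = size // 2
    if partition == "none":
        return [(x, y, size, size)]
    if partition == "horizontal":  # two wide blocks, one above the other
        return [(x, y, size, half), (x, y + half, size, half)]
    if partition == "vertical":  # two tall blocks, side by side
        return [(x, y, half, size), (x + half, y, half, size)]
    if partition == "split":  # four quadrants
        return [(x, y, half, half), (x + half, y, half, half),
                (x, y + half, half, half), (x + half, y + half, half, half)]
    raise ValueError(f"unknown partition: {partition}")
```

For a 1920x1080 frame, `superblock_grid(1920, 1080)` gives 30 columns and 17 rows; the last row of superblocks extends past the bottom edge of the picture.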

VP9 also supports tiles, where the picture is divided into a grid of tiles along superblock boundaries. Unlike in HEVC, the tiles are always spaced as evenly as possible, and the number of tile columns and tile rows is always a power of two. Tiles must be at least 256 pixels wide and no more than 4096 pixels wide, and there can be no more than four tile rows. The tiles are scanned in raster order, and so are the superblocks within each tile. As a result, the coding order of superblocks depends on the tile structure.
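To make the tile width limits concrete, here is a minimal Python sketch that derives the allowed number of tile columns from the frame width. It is modelled on the limits stated above (256 pixels is 4 superblocks, 4096 pixels is 64 superblocks). The function names are hypothetical, and the even-spacing formula is an assumption about how "as evenly as possible" is realised; the VP9 bitstream specification is the normative reference.

```python
MIN_TILE_WIDTH_SB = 256 // 64    # minimum tile width: 4 superblocks
MAX_TILE_WIDTH_SB = 4096 // 64   # maximum tile width: 64 superblocks


def superblock_cols(width):
    """Number of 64-pixel superblock columns needed to cover the frame width."""
    return (width + 63) // 64


def tile_col_log2_range(width):
    """Return (min_log2, max_log2) for the number of tile columns."""
    sb_cols = superblock_cols(width)
    min_log2 = 0
    # Add columns until no tile would be wider than the maximum.
    while (MAX_TILE_WIDTH_SB << min_log2) < sb_cols:
        min_log2 += 1
    max_log2 = 1
    # Keep doubling the column count while tiles stay at least the minimum width.
    while (sb_cols >> max_log2) >= MIN_TILE_WIDTH_SB:
        max_log2 += 1
    return min_log2, max_log2 - 1


def tile_col_starts(width, log2_cols):
    """Starting superblock column of each tile, spaced as evenly as possible."""
    sb_cols = superblock_cols(width)
    return [(i * sb_cols) >> log2_cols for i in range(1 << log2_cols)]
```

Under these assumptions, a 1920-pixel-wide frame allows 1, 2, or 4 tile columns, and with 4 columns the tiles start at superblock columns 0, 7, 15, and 22.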

At the start of every tile except the last one, a 4-byte size field is transmitted that gives the number of bytes used to code the tile that follows. This allows a multithreaded decoder to skip ahead and start a decoding thread for each tile, which keeps decoding fast.
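The following Python sketch shows how a decoder might use these byte counts to locate each tile without decoding anything. It assumes that every tile except the last is preceded by a 4-byte big-endian size and that the last tile simply runs to the end of the data. The function name `split_tiles` is hypothetical.

```python
import struct


def split_tiles(tile_data, num_tiles):
    """Split the tile portion of a frame into per-tile byte strings.

    Assumption: every tile except the last is preceded by a 4-byte
    big-endian size; the last tile uses all remaining bytes.
    """
    tiles = []
    pos = 0
    for index in range(num_tiles):
        if index < num_tiles - 1:
            (size,) = struct.unpack_from(">I", tile_data, pos)
            pos += 4
        else:
            size = len(tile_data) - pos
        if pos + size > len(tile_data):
            raise ValueError("tile size exceeds the available data")
        tiles.append(tile_data[pos:pos + size])
        pos += size
    return tiles
```

Because each slice is independent, the returned byte strings could be handed to separate worker threads, which is the point of transmitting the sizes up front.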

Bitstream coding

The generated bitstream is normally stored in a container, either WebM or IVF. IVF is extremely simple, and WebM is a subset of Matroska (MKV). Using a container is important: without one, it is difficult to seek to a particular frame without fully decoding the frames that precede it.
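To show why even a simple container helps with seeking, here is a minimal Python sketch of an IVF reader. It assumes the commonly documented IVF layout: a 32-byte little-endian file header starting with the signature "DKIF", followed by frames that each carry a 12-byte header (4-byte size, 8-byte timestamp). The dictionary keys and the function name are illustrative choices, not part of any library.

```python
import struct

# Assumed layout: signature, version, header length, FourCC, width, height,
# timebase rate, timebase scale, frame count, 4 unused bytes.
IVF_HEADER = struct.Struct("<4sHH4sHHIII4x")   # 32 bytes
IVF_FRAME_HEADER = struct.Struct("<IQ")        # frame size, timestamp


def parse_ivf(data):
    """Return (info, frames) where frames is a list of (timestamp, payload)."""
    (signature, version, header_len, fourcc, width, height,
     rate, scale, frame_count) = IVF_HEADER.unpack_from(data, 0)
    if signature != b"DKIF":
        raise ValueError("not an IVF file")
    info = {"fourcc": fourcc, "width": width, "height": height,
            "rate": rate, "scale": scale, "frame_count": frame_count}
    frames = []
    pos = header_len
    while pos + IVF_FRAME_HEADER.size <= len(data):
        size, timestamp = IVF_FRAME_HEADER.unpack_from(data, pos)
        pos += IVF_FRAME_HEADER.size
        # The size field lets a reader jump straight to the next frame.
        frames.append((timestamp, data[pos:pos + size]))
        pos += size
    return info, frames
```

The reader never looks inside a frame payload; it hops from one frame header to the next, which is exactly what a raw, uncontainerized bitstream does not allow.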

Like VP8, VP9 compresses its bitstream using an 8-bit arithmetic coding engine known as the bool-coder. Each frame is coded in three sections, as follows:

Uncompressed header - contains information such as picture size and loop filter strength, and takes a dozen bytes or so.
Compressed header - transmits the probabilities used for the whole frame.
Compressed frame data - contains the data required to reconstruct the frame, including block partition sizes, motion vectors, transform coefficients, and more.

Unlike VP8, VP9 does not have any data partitioning: all data types are interleaved in superblock coding order. This makes things easier for hardware designers.
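As a rough illustration of how the bool-coder uses the transmitted probabilities, the Python sketch below shows a single interval-narrowing step. It follows the split formula used by VP8-style bool-coders, where an 8-bit probability gives the chance that the next bit is 0. This is a simplified view that only tracks the range, not a complete encoder or decoder, and the function names are hypothetical.

```python
def bool_split(range_, prob):
    """Split point for a range in [128, 255], where prob / 256 is P(bit == 0)."""
    return 1 + (((range_ - 1) * prob) >> 8)


def narrow(range_, prob, bit):
    """Narrow the range for one coded bit.

    Returns (new_range, shifts), where shifts is the number of doublings
    needed to bring the range back to at least 128. Each doubling
    corresponds to roughly one bit of output.
    """
    split = bool_split(range_, prob)
    new_range = range_ - split if bit else split
    shifts = 0
    while new_range < 128:
        new_range <<= 1
        shifts += 1
    return new_range, shifts
```

With a probability of 250 (a 0 is very likely), coding a 0 needs no renormalization shift at all, while coding the unlikely 1 needs five. This is how well-chosen probabilities translate into fewer bits.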

Intra prediction

VP9’s intra prediction is similar to that of HEVC and follows the transform block partitions. As a result, intra prediction operations are always square. For instance, a 16x8 block with 8x8 transforms will result in two 8x8 prediction operations.

VP9 has 10 intra prediction modes, 8 of which are directional. They use two 1D arrays containing the reconstructed upper and left pixels of the neighboring blocks.

The above array is twice as long as the current block’s width, and the left array is the same length as the block’s height. However, for intra blocks larger than 4x4, the second half of the above array is simply filled by repeating the last pixel of the first half.

Illustration: https://i.imgur.com/0n7jgj4.png
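The edge arrays can be sketched in Python as follows. This is a simplified illustration, not the exact availability rules of the codec: `build_above` (a hypothetical helper) pads the above array to twice the block width by repeating the last available pixel, and `dc_predict` shows the simplest non-directional mode, which fills the block with the rounded average of the edge pixels.

```python
def build_above(above_pixels, block_width):
    """Return an above-edge array of length 2 * block_width.

    Assumption: when fewer pixels are available, the tail is filled by
    repeating the last available pixel. At least one pixel is required.
    """
    needed = 2 * block_width
    edge = list(above_pixels[:needed])
    edge += [edge[-1]] * (needed - len(edge))
    return edge


def dc_predict(above, left, size):
    """Fill a size x size block with the rounded mean of the edge pixels."""
    total = sum(above[:size]) + sum(left[:size])
    dc = (total + size) // (2 * size)  # adding size rounds to nearest
    return [[dc] * size for _ in range(size)]
```

For example, an above row of `[10, 20, 30, 80]` for a 4-wide block is extended to eight values by repeating the final 80.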

Inter prediction

For inter prediction, VP9 uses 1/8th pixel motion compensation, which is twice the precision of the quarter-pixel motion vectors used in most other standards. Motion compensation is normally unidirectional, with each block predicted from a single reference frame. However, VP9 also offers a compound mode, which is essentially bi-prediction: there are two motion vectors for each block, and the two resulting prediction samples are averaged together.
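Both ideas can be illustrated with a short Python sketch. It assumes motion vector components are stored as integers in units of 1/8 pixel and that compound prediction is a rounded average of two equally sized prediction blocks. The function names are hypothetical, and the sub-pixel interpolation filters themselves are left out.

```python
def split_mv(mv_eighths):
    """Split a motion-vector component given in 1/8-pixel units.

    Returns (whole_pixels, eighths) with 0 <= eighths < 8, so that
    whole_pixels + eighths / 8 equals the original displacement.
    """
    return mv_eighths >> 3, mv_eighths & 7


def compound_average(pred_a, pred_b):
    """Average two prediction blocks sample by sample, rounding half up."""
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_a, pred_b)]
```

For example, a component of 19 means 2 whole pixels plus 3/8 of a pixel, and averaging the samples 100 and 103 gives 102.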

Segmentation

Another feature that sets VP9 apart is segmentation. When it is enabled, the bitstream codes a segment ID for each block, which is an integer from 0 to 7. Each of these eight segments can have any of the following four features enabled:

Skip - blocks in a segment with this feature are assumed to have no residual signal. This is useful for static backgrounds.
Alternate quantizer - blocks in a segment with this feature use a different quantization scale. This is useful for regions that require more detail than the rest of the picture.
Ref - blocks in a segment with this feature are assumed to point to a particular reference frame.
AltLf - blocks in a segment with this feature use a different loop filter (smoothing) strength. This is useful for smoothing areas of the picture that would otherwise look too rough and blocky.
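The segment features above map naturally onto a small data structure. The Python sketch below is an illustration only: the class, its field names, and `block_quantizer` are hypothetical, and the alternate quantizer is treated as an absolute value for simplicity, which is an assumption rather than a statement about the bitstream.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SEGMENTS = 8  # segment IDs run from 0 to 7


@dataclass
class Segment:
    """Per-segment features; None or False means the feature is disabled."""
    skip: bool = False               # no residual signal is coded
    alt_q: Optional[int] = None      # alternate quantizer (assumed absolute)
    ref_frame: Optional[int] = None  # fixed reference frame for the segment
    alt_lf: Optional[int] = None     # alternate loop filter strength


def block_quantizer(base_q, segment_id, segments):
    """Return the quantizer for a block given its segment ID."""
    if not 0 <= segment_id < MAX_SEGMENTS:
        raise ValueError("segment ID must be between 0 and 7")
    segment = segments[segment_id]
    return base_q if segment.alt_q is None else segment.alt_q
```

An encoder could, for instance, assign detailed regions to a segment with a lower `alt_q` and static background blocks to a segment with `skip` enabled.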

All of these features contribute to making the VP9 codec efficient. Owing to its strengths, VP9 is used by Netflix and YouTube and is supported by many tech giants such as Sony, Panasonic, Samsung, Qualcomm, and NVIDIA.

Comparison with other video codecs

VP9 vs H.264

The H.264 codec compresses the large amounts of data in video files so that they can be streamed online. H.264 is most commonly used for HD video, which is 1280x720 pixels (720p) or 1920x1080 pixels (1080p). A 4K frame, on the other hand, is 3840x2160 pixels, four times as many pixels as 1080p. Such a drastic increase in detail demands more efficient compression in order to transmit, store, and use the data. In that context, VP9 is often described as roughly twice as efficient as H.264: it can stream 4K content at comparable quality using about half the data.
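The jump from HD to 4K is easy to quantify. The short Python sketch below computes pixel counts and uncompressed bitrates, assuming 8-bit 4:2:0 video (12 bits per pixel on average) at 30 frames per second. These parameters are illustrative assumptions, not figures from a particular service.

```python
def pixel_count(width, height):
    """Number of pixels in one frame."""
    return width * height


def raw_bitrate_mbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed bitrate in Mbit/s.

    The default of 12 bits per pixel assumes 8-bit 4:2:0 video.
    """
    return width * height * fps * bits_per_pixel / 1e6


hd = pixel_count(1920, 1080)    # 2,073,600 pixels
uhd = pixel_count(3840, 2160)   # 8,294,400 pixels
ratio = uhd / hd                # 4K has four times the pixels of 1080p
```

Under these assumptions, uncompressed 1080p at 30 frames per second is about 746 Mbit/s and 4K is about 2,986 Mbit/s, which is why compression efficiency matters so much more at higher resolutions.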

VP9 vs H.265

When comparing VP9 with H.265, it is important to note that the two have many technical similarities. Even their primary goal is the same: to deliver video at roughly half the bitrate of H.264, so that HD and 4K streaming become practical for people with regular internet bandwidth. That said, the biggest difference between the two is that VP9 is open source and can be used by anyone, whereas H.265 requires a license to be purchased before use. In terms of usage and efficiency, the two codecs are by and large comparable.

VP9 vs AV1

The primary practical difference between AV1 and VP9 is encoding cost. AV1 encoding is considerably more computationally expensive, so it tends to be worthwhile only for videos with view counts in the mid-to-high millions, whereas VP9 is worth considering even for videos with only a few thousand views. Further, since VP9 is free and already enjoys widespread use, it remains a much more viable choice for the near future.

Conclusion

In conclusion, the VP9 codec has proved to be extremely useful for streaming 4K video smoothly, even with limited bandwidth. Because it is open source, anyone can get started with VP9 and use it to compress their own 4K videos.

FAQs

1. Is VP9 better than H.265?

Not necessarily. H.265 generally achieves somewhat better compression efficiency, producing a smaller file at the same quality, but VP9 is free to use and more widely supported in web browsers.

2. What devices support VP9?

VP9 is supported on many devices and browsers, including Windows, Mac, and Linux computers, Android devices, and some Smart TVs. The list of devices supporting VP9 is constantly expanding as more companies support the codec.

3. Is VP9 a good codec?

Yes, VP9 is a good codec. It is an open-source codec developed by Google and is used by many web-based video streaming services. VP9 is designed to provide high-quality video at relatively low bitrates, and it decodes efficiently on devices with hardware support. In addition, VP9 is supported by most modern browsers and operating systems, making it a strong choice for web video.