The three main stages of a video pipeline
Many manufacturers of these new applications are finding that a complete video system is a complex and difficult domain to master. A video pipeline typically contains many different components, each complex in its own right. Explore the three main stages of a video pipeline to learn why no component should be treated as an island if you want to achieve optimal video quality.
Stage 1: Analog
The first stage is in the analog domain, namely the lens system. Different lenses have different properties. For instance, a wide-angle lens behaves very differently from a zoom lens in many respects beyond the obvious difference in focal length. Lenses from different manufacturers also differ considerably with regard to aberrations, distortions and similar phenomena. A further source of complexity is that today’s products usually carry not one but several cameras for different purposes. It is not uncommon to find up to five cameras on a standard smartphone today. Image stabilization is sometimes applied at this stage as well, using optical image stabilization (OIS).
Stage 2: Digital image processing
The second stage is the digital image processing domain, also called the Image Signal Processing (ISP) domain. It begins after light has passed through the lens system, hit the image sensor and been converted into the digital domain by an A/D converter. This stage contains many low-level functions, such as demosaicking, denoising and blur correction, 3A (auto exposure, auto focus and auto white balance), color enhancement, tone mapping, lens distortion correction, and much more. These functions contribute to creating, correcting and perfecting a single frame, but a video consists of many frames per second (FPS) (in extreme cases, up to 960 FPS in today’s smartphones). To perfect a video, more processing is needed. Much of this processing takes place in the third stage.
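To make the idea of per-frame processing concrete, here is a minimal sketch of two of the functions named above, white balancing and tone mapping, applied to a single frame. It is an illustration under stated assumptions, not how any particular ISP works: the gray-world method and the plain gamma curve are simple textbook stand-ins, the function names are hypothetical, and a frame is assumed to be a list of rows of (R, G, B) tuples with values between 0.0 and 1.0.

```python
# Illustrative single-frame sketch; real ISPs do this in dedicated hardware.
# Assumption: a frame is a list of rows, each row a list of (r, g, b) floats in [0, 1].

def gray_world_white_balance(frame):
    """Scale each channel so its mean matches the mean across all channels."""
    pixels = [px for row in frame for px in row]
    means = [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]
    target = sum(means) / 3.0
    # Guard against an all-black channel to avoid dividing by zero.
    gains = [target / m if m > 0 else 1.0 for m in means]
    return [
        [tuple(min(1.0, px[c] * gains[c]) for c in range(3)) for px in row]
        for row in frame
    ]

def gamma_tone_map(frame, gamma=2.2):
    """Apply a simple power-law curve, which lifts dark values."""
    return [[tuple(v ** (1.0 / gamma) for v in px) for px in row] for row in frame]

def process_frame(frame):
    """Run the two steps in sequence on one frame (hypothetical helper)."""
    return gamma_tone_map(gray_world_white_balance(frame))
```

Calling process_frame on a frame with a reddish cast, for example [[(0.4, 0.2, 0.1), (0.8, 0.4, 0.2)]], first equalizes the channel means and then brightens the result. Note that everything here looks at one frame at a time, which is exactly why a later stage is needed to deal with behavior across frames.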
Stage 3: Compute cluster
The third stage is the compute cluster, which contains different processing units with varying degrees of compute power, each optimized for its specific use. Modern compute clusters usually offer one or more CPUs for general computation and GPUs for graphics processing, along with Digital Signal Processors (DSPs) and Neural Processing Units (NPUs) for machine learning and AI computations. This is the stage where video enhancement applications are taken to a higher level. Examples of applications include video stabilization, object identification, smooth transitions between cameras in a multi-camera solution, facial recognition and object tracking.
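As one example of processing that spans many frames, video stabilization can be thought of as smoothing the path the camera followed and then shifting each frame onto that smoother path. The sketch below shows only that core idea in one dimension. It assumes the per-frame camera positions have already been estimated, uses a plain moving average as the smoother, and uses hypothetical function names; production stabilizers are considerably more sophisticated.

```python
# Illustrative sketch of trajectory smoothing for stabilization.
# Assumption: positions[i] is the estimated camera offset (in pixels) at frame i.

def smooth_trajectory(positions, radius=2):
    """Moving average over a window of up to 2 * radius + 1 frames."""
    smoothed = []
    for i in range(len(positions)):
        lo = max(0, i - radius)
        hi = min(len(positions), i + radius + 1)
        window = positions[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

def stabilizing_offsets(positions, radius=2):
    """Per-frame shift that moves each frame onto the smoothed path."""
    smoothed = smooth_trajectory(positions, radius)
    return [s - p for s, p in zip(smoothed, positions)]
```

For a shaky path such as [0.0, 2.0, 0.0, 2.0, 0.0], the smoothed path varies far less than the input, and the returned offsets are the corrections to apply to each frame. A larger radius gives a steadier result but requires larger shifts, which in practice costs more of the image border.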
No component in a camera pipeline can be treated as an isolated island.