Software can do what no small fortune worth of hardware can
There’s no doubt that manufacturing processes keep improving and camera modules get increasingly powerful. But you cannot change the laws of physics, captain. Even if you’re prepared to pay a small fortune, some optical qualities, such as a naturally blurred background or a long optical zoom, simply cannot be achieved at such small scales. Fortunately, with a lot of processing power available, video enhancement software can minimize or even eliminate some defects. We’ve already talked about hardware augmentations such as Optical Image Stabilization (OIS) in a previous post, so we will focus on video enhancement software here.
First, there really is no “objective” way of looking at the world, even for electronic devices. How a scene is captured depends largely on well-known and similar-sounding features such as auto focus, auto white balance and auto exposure, which adjust the camera settings to try to capture a good photo. The result does not necessarily replicate what you see with your own eyes. It is common to tune these algorithms to saturate the colors a bit “extra”, to make them “pop”.
These features are also required for video recording, where the time dimension plays a critical role. In still photography the settings may be retuned (for better or worse) between shots, but in a video they must change smoothly and gently from one frame to the next. Various algorithms for computing these settings exist and are subject to ongoing research.
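One simple way to picture “smooth and gentle” changes is exponential smoothing: each frame moves the applied setting only part of the way toward the value the auto-exposure algorithm asks for. The sketch below is an illustration under that assumption, not how any particular camera does it; the function name, the `alpha` parameter and the exposure values are made up for the example.

```python
def smooth_settings(targets, alpha=0.5):
    """Exponentially smooth a per-frame sequence of target values.

    alpha close to 0 reacts slowly, alpha = 1 jumps straight to each target.
    """
    if not targets:
        return []
    smoothed = [targets[0]]
    for target in targets[1:]:
        previous = smoothed[-1]
        # Move only a fraction of the way toward the new target.
        smoothed.append(previous + alpha * (target - previous))
    return smoothed


# Hypothetical exposure targets in milliseconds: the scene suddenly
# gets brighter at the fourth frame, so the target drops from 10 to 2.
targets = [10.0, 10.0, 10.0, 2.0, 2.0, 2.0]
applied = smooth_settings(targets, alpha=0.5)
print(applied)  # [10.0, 10.0, 10.0, 6.0, 4.0, 3.0]
```

Instead of one abrupt jump, the exposure eases toward the new value over several frames. A still photo could simply use the new target right away.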
Case in point: the bokeh effect
A feature popularized by smartphones is the more-or-less fake bokeh effect in portrait shots. This means the camera distinguishes between the foreground and background and blurs the background, creating an emphasis on the subject (usually a person). Cameras with large sensors and wide-aperture lenses have a shallow depth of field and create this effect naturally, but the small cameras in devices like smartphones and drones cannot.
Instead, machine learning can be used to work out, to a certain degree, what the foreground subject is and where its contours are, so that the rest of the image can be blurred. A better and increasingly common approach is to use a dual camera. The smartphone uses the offset between the two lenses to calculate a depth map of the scene, just as our brain does with a pair of eyes.
The bokeh effect: The subject in the foreground is in focus while the background is blurry, which helps to emphasize the subject. Original photo by carlosluis on Flickr.
A synthetic process like this has some flaws that don’t occur naturally. Individual hairs are sometimes mistakenly blurred together with the background. Glasses worn by the subject are treated as part of the foreground, so the background seen through them stays sharp, whereas it would be blurred in a proper bokeh effect. So it’s not perfect but generally good enough.
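To make the dual-camera idea more concrete, here is a minimal sketch. It assumes the textbook pinhole stereo relation (depth = focal length × baseline / disparity) and a plain box blur standing in for a real lens blur. All names and numbers are hypothetical, and real pipelines are far more sophisticated.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Textbook stereo relation: depth = focal length * baseline / disparity."""
    return [[focal_px * baseline_m / d if d > 0 else float("inf") for d in row]
            for row in disparity_px]


def box_blur(image, radius=1):
    """Average each pixel with its neighbors (a crude stand-in for lens blur)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            values = [image[j][i] for j in ys for i in xs]
            out[y][x] = sum(values) / len(values)
    return out


def fake_bokeh(image, depth_m, max_subject_depth_m, radius=1):
    """Keep pixels nearer than the threshold sharp, blur everything else."""
    blurred = box_blur(image, radius)
    return [[image[y][x] if depth_m[y][x] <= max_subject_depth_m else blurred[y][x]
             for x in range(len(image[0]))]
            for y in range(len(image))]


# Tiny grayscale "photo": the two left columns are the subject, the two
# right columns are the background (all values are made up).
image = [[100, 100, 0, 90] for _ in range(3)]
# Near objects shift more between the two lenses, so disparity is larger.
disparity = [[40, 40, 4, 4] for _ in range(3)]

depth = depth_from_disparity(disparity, focal_px=1000, baseline_m=0.01)
result = fake_bokeh(image, depth, max_subject_depth_m=1.0)
```

Because the decision is made per pixel, anything the depth map gets wrong (fine hairs, the view through glasses) ends up on the wrong side of the threshold, which is where the flaws described above come from. Note that this naive blur also lets subject pixels bleed into the blurred background near the contour.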
Improving image quality with artificial brightness
Some devices put their multiple cameras to additional use. For example, the second sensor can be a high-resolution black-and-white sensor that supplements the primary sensor with additional brightness information, improving image quality. Going beyond a general depth map, a trained AI and dual cameras can construct a 3D map of a face and re-light it with fake studio lighting, highlighting points of the face like the nose, cheeks and chin that would have been emphasized by external studio light. This gives the image a dimensionality you could normally only achieve using external lighting solutions or a lot of post-processing.
Taking zoom to the next level
Zoom functionality is harder to emulate. Although some optical zoom may be available, especially with multiple cameras, most smartphone zoom is digital. Digital zoom just crops the image that has already been captured and scales the crop back up. As no additional information is available, quality is reduced. Not all hope is lost, though, as AI may now be able to essentially recognize what is missing and fill in the details on the cropped image, as an artist would improvise on a canvas, based on prior experience of other photos. Laughing at “zoom and enhance” on TV crime shows may become a thing of the past, after all.
Even with parts of the photo removed, the AI was successful at reproducing the original with decent quality in some examples. Check out more examples.
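To see why plain digital zoom loses quality, here is a minimal sketch that crops the center of an image and scales it back to the original size with nearest-neighbor sampling. The function name and the toy image are made up for illustration. Real cameras use better interpolation, but they face the same lack of information.

```python
def digital_zoom(image, factor):
    """Crop the central 1/factor of the image and scale it back up.

    Uses nearest-neighbor sampling, so every output pixel is a copy
    of some pixel in the crop: no new detail is created.
    """
    h, w = len(image), len(image[0])
    crop_h = max(1, int(h / factor))
    crop_w = max(1, int(w / factor))
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    crop = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    return [[crop[y * crop_h // h][x * crop_w // w] for x in range(w)]
            for y in range(h)]


# A 4x4 toy image with 16 distinct pixel values (0..15).
image = [[4 * y + x for x in range(4)] for y in range(4)]
zoomed = digital_zoom(image, factor=2)
for row in zoomed:
    print(row)
# [5, 5, 6, 6]
# [5, 5, 6, 6]
# [9, 9, 10, 10]
# [9, 9, 10, 10]
```

The zoomed image has the same 16 pixels as before but only 4 distinct values. That gap is what an AI-based approach tries to fill with plausible detail.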
Speaking of zooming, any distortions caused by movement are amplified by the zoom: the more you zoom in, the more visible the shake. Video enhancement software can help by applying video stabilization and creating a smoother zoom experience with live auto zoom. Good quality is not just about clever engineering but about helping the user accomplish tasks, such as zooming accurately and smoothly. This, and more, is the subject of the next section.