The history of object tracking
Object tracking is the process of locating and following one or more objects over time using a camera. It has a variety of uses, including human-computer interaction, security and surveillance, video communication, augmented reality, traffic control, medical imaging, video editing, and even compression – our blog post on compression talked about detecting movement from frame to frame.
This is quite different from object detection, generally used to automatically detect a predetermined set of types of objects in a video, and object recognition, used to recognize and identify objects in a video – distinguishing people from cars and so on. It is interesting to note that machine learning (ML) and deep learning (DL) methods completely dominate detection and recognition algorithms today, whereas they play a smaller role in object tracking, although this is bound to increase over time.
Object tracking, object detection and object recognition are still open problems in image and video analysis, each with many different approaches. Both detection and recognition build upon object tracking techniques, but this blog post will focus only on object tracking.
Trackers are steadily becoming better and better, but common problems remain. Trackers can easily fail in cluttered scenes, occlusion (the tracked object disappears behind another object) is a difficult problem, and good initialization (the tracker’s starting region) still matters a great deal.
Different object tracking approaches
As stated above, there are many approaches to object tracking and many variations within each approach. Which method is “best” depends largely on the specific application. The cutting-edge methods constantly change with new research and breakthroughs in related fields.
Below, four very different approaches to object tracking are outlined, including trackers based on feature clouds and correlation filters. The aim is to visualize the different techniques and make them easier to understand. Typically, finding the location of the object and estimating its new size (if it’s moving towards or away from the camera) are treated as separate processes.
Different trackers are constantly benchmarked. For examples, see the “Object Tracking Benchmark” report in IEEE Transactions on Pattern Analysis and Machine Intelligence, and a more recent unpublished study on GitHub.
Color histogram trackers
Color histogram trackers are commonly used for autofocus, where high precision is not required. They don’t stand a chance in competitions because they can easily drift to something else, but their fast redetection works even if the object has changed shape completely. They are not robust to lighting changes or to backgrounds with similar color patterns.
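To make the idea concrete, here is a minimal sketch of one common way to build a color histogram tracker: back-project a histogram of the target onto each new frame, then run mean shift to move the search window towards the best-matching pixels. Mean shift is an assumption on our part, since there are other ways to use a histogram, and the function names, the 8-bin quantization and the tiny single-channel frames (lists of 0 to 255 values) are purely illustrative.

```python
import math

def histogram(frame, box, bins=8):
    """Normalized histogram of the values inside box = (x, y, w, h)."""
    x, y, w, h = box
    counts = [0.0] * bins
    for row in frame[y:y + h]:
        for value in row[x:x + w]:
            counts[value * bins // 256] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

def back_project(frame, model):
    """Replace every pixel with how common its value is in the target model."""
    bins = len(model)
    return [[model[value * bins // 256] for value in row] for row in frame]

def mean_shift(weights, box, max_iter=20):
    """Shift the window to the weighted centroid until it stops moving."""
    x, y, w, h = box
    height, width = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sx = sy = 0.0
        for j in range(y, min(y + h, height)):
            for i in range(x, min(x + w, width)):
                total += weights[j][i]
                sx += weights[j][i] * i
                sy += weights[j][i] * j
        if total == 0.0:
            break  # nothing in the window resembles the target
        # Re-center the window on the centroid and keep it inside the frame.
        nx = math.floor(sx / total - (w - 1) / 2 + 0.5)
        ny = math.floor(sy / total - (h - 1) / 2 + 0.5)
        nx = max(0, min(nx, width - w))
        ny = max(0, min(ny, height - h))
        if (nx, ny) == (x, y):
            break
        x, y = nx, ny
    return (x, y, w, h)

def make_frame(size, obj_x, obj_y, obj_size=4, background=10, color=200):
    """Synthetic frame: a bright square on a dark background."""
    frame = [[background] * size for _ in range(size)]
    for j in range(obj_y, obj_y + obj_size):
        for i in range(obj_x, obj_x + obj_size):
            frame[j][i] = color
    return frame

first = make_frame(20, 5, 5)
model = histogram(first, (5, 5, 4, 4))   # learn the target's colors
second = make_frame(20, 7, 6)            # the object has moved
tracked = mean_shift(back_project(second, model), (5, 5, 4, 4))
print(tracked)
```

Because the model only stores how often each color occurs, not where, the window can follow the object even if it changes shape. The same property explains the weakness: any other region with similar colors produces equally high weights.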
Motion-based trackers
Motion-based trackers are well suited to mounted cameras in security systems and to following fast-moving objects. The method essentially follows any detected region that differs from the trained background. These trackers are better at handling changes in illumination than color histogram trackers, but they cannot detect objects that are not moving.
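The following is a rough sketch of that idea, assuming a simple running-average background model; production systems typically use more elaborate models. The function names, the learning rate and the threshold are illustrative choices, and frames are again small single-channel lists of 0 to 255 values.

```python
def update_background(background, frame, rate=0.05):
    """Blend the new frame into the background model (running average)."""
    return [
        [(1 - rate) * b + rate * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

def foreground_box(background, frame, threshold=30):
    """Bounding box (x, y, w, h) of pixels that differ from the background."""
    xs, ys = [], []
    for y, (b_row, f_row) in enumerate(zip(background, frame)):
        for x, (b, f) in enumerate(zip(b_row, f_row)):
            if abs(f - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing differs from the background
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)

# A static scene, then a frame in which a bright object has appeared.
background = [[50.0] * 10 for _ in range(10)]
frame = [[50] * 10 for _ in range(10)]
for j in range(6, 9):
    for i in range(3, 5):
        frame[j][i] = 220

moving_box = foreground_box(background, frame)

# If the object then stays still, it is gradually absorbed into the model.
for _ in range(100):
    background = update_background(background, frame)
still_box = foreground_box(background, frame)

print(moving_box, still_box)
```

The slow update is what lets the model adapt to gradual lighting changes, and it is also why an object that stops moving eventually disappears from the tracker's view.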