For the past week, we have been building a Template search system, which lets you select an event, object or person from an image and search for it in a large collection of images.
This system is part of the ibeyonde Bingo AI system, which allows our users to search for an object of interest over any range of dates and across multiple cameras that may be in different locations. This gives them the power to search multiple areas simultaneously without 24/7 monitoring. For example, if the administrator of a factory needs to track an intruder, he can do it by simply cropping the person from an image and searching for that crop across all the cameras. This way, he gets every location the person visited and where the person was last seen. The same system can be used in a shopping mall to track a suspicious person across dates.
The example shown below is from test cameras installed in various lifts in a society. They capture an image whenever motion is detected, which in this case works out to roughly one image every 1-2 seconds. That comes to around 80K images in a day.
1. Image History
Now begins the fun part. Using this system, I will search for this man, who I definitely feel is suspicious... hmm, maybe I am wrong, but it is a society we are talking about, so no risks, right? Alright, so 24 hrs of history for the day, 16 June. First I select the region of interest (ROI) from the image of the man, then the range of dates. The ROI can also be tagged for future searches (I do not need that right now). All set... just need to click the search button. (Bingo)
And there it is... in just a few moments, I know the places this person has visited. Reviewing these pictures, I can get the floors he went to, the time he went, the time he came back, his current location and the people he met. Analysing the search results... this man... I think... hmmm... yeah, he is not a threat.
3. Search Results
That was fun. Now let's dive in. Technically, this system is a type of content-based image retrieval system that uses the k-nearest neighbor (k-NN) algorithm to find the images with the most similar features. The features here are the textures of an image mapped to a histogram with a specified number of bins. The textures of the template, i.e. the cropped image, are extracted and binned into a histogram. This histogram is then compared with the feature histograms of the images in the database. Searching is the process of calculating the distances (differences) between these histograms. Once the calculation is complete, the images are ranked in increasing order of their distance from the template and finally displayed on the screen.
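To make the idea concrete, here is a minimal sketch in Python with NumPy. It is only an illustration, not the Bingo implementation: the texture descriptor (a basic 8-neighbour local binary pattern), the chi-square distance, the default of 32 bins, and the function names `lbp_codes`, `texture_histogram`, `chi_square` and `search` are all assumptions made for this example.

```python
import numpy as np


def lbp_codes(gray):
    """Local binary pattern code per interior pixel (assumed texture descriptor)."""
    g = np.asarray(gray, dtype=np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Offsets of the 8 neighbours, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes


def texture_histogram(gray, bins=32):
    """Map the texture codes (0..255) to a normalised histogram with `bins` bins."""
    hist, _ = np.histogram(lbp_codes(gray), bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)


def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms (assumed distance measure)."""
    return float(0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def search(template, database, k=3, bins=32):
    """Return the k database images nearest to the template, closest first."""
    query = texture_histogram(template, bins)
    scored = [(name, chi_square(query, texture_histogram(image, bins)))
              for name, image in database.items()]
    scored.sort(key=lambda pair: pair[1])  # increasing distance
    return scored[:k]


# Tiny synthetic "database": three grayscale images with different textures.
rng = np.random.default_rng(0)
database = {
    "noisy": rng.integers(0, 256, size=(64, 64)),
    "ramp": np.tile(np.arange(64), (64, 1)),
    "flat": np.full((64, 64), 128),
}
template = database["noisy"][8:40, 8:40]  # the cropped region of interest
for name, distance in search(template, database, k=2):
    print(name, round(distance, 4))
```

The crop of the noisy image should rank the noisy image first, because their texture histograms are nearly identical while the other two images put all their weight in a single bin. With a collection of 80K images, one would presumably compute each image's histogram once and store it, so that a search only has to compute the template's histogram and the distances.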
This system uses the power of artificial intelligence and the cloud to deliver smart surveillance through your cameras with as little time and as few resources as possible. Searching through a day's 80K images manually is a crazy idea, but letting Bingo do it is a smart one.
— Siddharth Gupta