Using MotionDSP’s video processing filters to improve deep learning-based image classification results

In April, MotionDSP announced Ikena® Cloud, our new API for cloud-based processing of video files and live video streams. Since then we’ve tested it with developers in various use cases, one of which is pre-processing video to improve image classification results. The results have been promising.

Amazon Rekognition, Google Cloud Vision API, and Microsoft Azure’s Computer Vision API all offer high quality image classification — you give their APIs an image, and they return tags which describe what’s in it.

One challenge we see with our customers is that a lot of their real-world video doesn’t match the images these classification models were trained on. Take this example:

If you look carefully, you can see what’s in the video, but a classification engine may not. Google, Amazon, and Microsoft could expand their training sets to include dark versions of all their images, but that would increase the complexity of their training, and the complexity only grows if you also want to cover every other possible environmental condition: overly bright scenes, haze, and so on.

Solution: use video pre-processing

Ikena Cloud can apply any combination of MotionDSP’s 20+ GPU-accelerated image processing filters as a video pre-processing step, output enhanced still images at user-selected intervals, and send those enhanced images to various image classification APIs. Here is one result:

MotionDSP video filters used: automatic brightness, contrast, super-resolution


In the above example, a lot more of the scene was recognized — “guitar,” “musician,” etc., enough that you could argue the tags on the right describe the scene better than the tags on the left.
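To make a comparison like this concrete, here is a minimal Python sketch of how one might summarize the difference between two tag sets. The tag names and confidence values below are invented for illustration; they are not actual output from any classification API.

```python
# Sketch: summarizing how classification tags change after pre-processing.
# Each dict maps tag name -> confidence score (0-100); values are illustrative.

def compare_tags(original, enhanced):
    """Report tags gained after enhancement and confidence gains on shared tags."""
    new_tags = sorted(set(enhanced) - set(original))
    shared = set(original) & set(enhanced)
    confidence_gains = {
        tag: round(enhanced[tag] - original[tag], 1)
        for tag in shared
        if enhanced[tag] > original[tag]
    }
    return {"new_tags": new_tags, "confidence_gains": confidence_gains}

# Hypothetical results for a dark concert frame, before and after enhancement.
original = {"Person": 62.1, "Crowd": 55.4}
enhanced = {"Person": 91.3, "Crowd": 80.2, "Guitar": 88.7, "Musician": 85.0}

summary = compare_tags(original, enhanced)
print(summary["new_tags"])  # ['Guitar', 'Musician']
```

A summary like this (new tags plus confidence deltas) is also a convenient way to batch-evaluate a filter preset across many frames.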

Here is another example showing what our super-resolution algorithm does to improve OCR accuracy in the Google Cloud Vision API.

MotionDSP video filters used: super-resolution, de-blurring


In the above example, the OCR result is not perfect, but it is clearly improved: “Oppenheim Schafer” vs. “ForReno sTHAFER.”
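One rough way to quantify an OCR gain like this (our own illustration, not part of any API) is a character-level similarity score against the known sign text. A minimal Python sketch:

```python
# Sketch: scoring OCR output against ground truth with a character-level
# similarity ratio. Strings are the results quoted above.
from difflib import SequenceMatcher

def similarity(ocr_text, truth):
    """Case-insensitive similarity ratio (0.0-1.0) between OCR output and truth."""
    return SequenceMatcher(None, ocr_text.casefold(), truth.casefold()).ratio()

truth = "Oppenheim Schafer"
raw_ocr = "ForReno sTHAFER"         # OCR of the unprocessed frame
enhanced_ocr = "Oppenheim Schafer"  # OCR of the super-resolved frame

# The enhanced frame's OCR scores strictly higher against the ground truth.
assert similarity(enhanced_ocr, truth) > similarity(raw_ocr, truth)
```

For large-scale evaluation, a proper edit-distance metric (e.g., character error rate) would be the more standard choice, but the idea is the same.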

Benefits of video pre-processing

  1. Benefits to image classification APIs: more tags, more accurate results, stronger confidence scores
  2. Benefits to deep learning training: less data required for training and inference/prediction


The workflow

  1. Send video (file or live stream) to the Ikena Cloud API
  2. Process the video with a desired preset (combination of one or many of our 20+ GPU-accelerated image processing filters and their settings)
  3. Output JPEG still images at user-specified time-points
  4. Send these enhanced images to the Amazon Rekognition API or Google’s Cloud Vision API (we also plan to add Microsoft)
  5. Compare the classification tag results of the enhanced images vs. their unenhanced originals
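The steps above could be scripted roughly as follows. The Ikena Cloud endpoint path, preset field names, and base URL here are hypothetical placeholders (the real API may differ); the Rekognition call follows boto3’s documented `detect_labels` shape. The network functions are defined but not executed.

```python
# Sketch of the five-step workflow. Ikena Cloud endpoint paths and parameter
# names are hypothetical placeholders, not the actual API.

IKENA_API = "https://api.example.com/ikena"  # placeholder base URL

def still_timepoints(duration_s, interval_s):
    """Step 3 helper: compute the user-specified time-points for JPEG stills."""
    t, points = 0.0, []
    while t < duration_s:
        points.append(round(t, 3))
        t += interval_s
    return points

def process_video(video_url, preset, interval_s):
    """Steps 1-3: submit a video, apply a filter preset, request enhanced stills."""
    import requests
    resp = requests.post(f"{IKENA_API}/jobs", json={
        "source": video_url,
        "preset": preset,                      # e.g. auto-brightness + super-resolution
        "still_interval_seconds": interval_s,  # JPEG output cadence
    })
    resp.raise_for_status()
    return resp.json()["still_urls"]

def classify(image_bytes):
    """Step 4: send an enhanced frame to Amazon Rekognition for labels."""
    import boto3
    client = boto3.client("rekognition")
    result = client.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=10)
    return {label["Name"]: label["Confidence"] for label in result["Labels"]}
```

Step 5 is then a matter of running `classify` on each enhanced still and its unenhanced counterpart and diffing the two tag dictionaries.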


One last example

This image came from an aircraft using a large gimbal camera capturing standard-definition video from a distance of a few miles. It suffers from atmospheric haze, lighting issues, and compression artifacts. See the difference in the tags reported by Amazon Rekognition.

To better see what our filters are doing, watch the video below. Notice what the super-resolution filter does to the noise and compression artifacts.

We are still in the early stages of testing, but so far the results are promising. We have also seen benefits for other image analysis features; for example, the added detail from super-resolution improves face sentiment analysis, face recognition, and optical character recognition results.

If you have data you are interested in pre-processing with Ikena Cloud, get in touch by signing up for our private beta.
