
From manufacturing to the metaverse: Computer vision will impact every area of our business and social lives in 2022

Written by Appu Shaji, CEO and Chief Scientist at Mobius Labs


In 2022, computer vision will become the dominant category of artificial intelligence, as the technology becomes more scalable and affordable for businesses of all sizes and sectors. It’s hard to encapsulate everything that will happen in the space of a single article, but here are four important areas where I see computer vision evolving over the next 12 months.


A paradigm shift in robotic devices

Although autonomous devices have been with us for several decades on industrial assembly lines, computer vision extends the benefits of the technology much further along the supply chain.

The mining and extraction of raw materials is a good place to start. Using computer vision to classify ore grades and generate terrain maps accelerates exploration and could even support automated mining, given sufficient data.

In factories, computer vision, in the form of object recognition, will play a greater role in helping to sort and distribute parts more efficiently and reliably than the human brain—especially when it comes to the organisation of thousands of miniature components.

Modern cars, for example, have about 30,000 different parts including radars, imaging sensors and computing systems. This next generation of vehicles—autonomous and electric—will require more advanced assembly line technology where computer vision will play an even greater role.

We’re also seeing significant advances in quality control. As well as spotting the tiniest surface defect on a manufactured object, computer vision can count objects and flag a missing component to a human operator. This applies as much to the bodywork of a high-end motor vehicle as to the number of buttons on a shirt, or the charging cables packaged with an electronic device.
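As a rough illustration of the counting idea, here is a minimal sketch in Python. It assumes a hypothetical object detector has already emitted a list of class labels for one inspected item, and simply compares those counts against an expected bill of materials; the class names and counts are illustrative, not taken from any real system.

```python
# Hypothetical quality-control check: compare detected component counts
# against an expected bill of materials and flag any shortfall.

EXPECTED_COUNTS = {"button": 6, "label": 1}  # assumed spec for a shirt

def flag_missing(detections, expected=EXPECTED_COUNTS):
    """detections: list of class labels emitted by an object detector."""
    found = {}
    for label in detections:
        found[label] = found.get(label, 0) + 1
    # Report every class that falls short of its expected count.
    return {cls: n - found.get(cls, 0)
            for cls, n in expected.items()
            if found.get(cls, 0) < n}

# Example: the detector saw only five of the six expected buttons.
print(flag_missing(["button"] * 5 + ["label"]))  # {'button': 1}
```

In practice the detector would be a trained model; the point here is only that once detections exist, flagging a missing part for a human operator is a simple comparison.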


Making sense of the metaverse

While industrial robots have been with us for many years, the past few months have been dominated by the metaverse with several big tech companies jostling for a position in the headlines.

A lot of people have asked me about my view on the technology and I always reply that augmented reality, delivered via smart glasses, will be far more accessible to the majority of people, whether they are following directions on the street or manipulating virtual objects in an engineering workshop.

I’m also very optimistic that this will help solve some of the social issues that go with smartphones and tablets. Handheld smart devices keep us in ‘heads down’ mode whereas smart glasses enable us to maintain eye contact with friends and colleagues while managing our online lives.

It also helps to think about the metaverse as a series of layers: first, a completely transparent display; then ‘mixed reality’, which blends digital information and objects with the real world; and finally, complete immersion, where the participant is free to move around a virtual space, whether meeting online colleagues, playing games or running other simulations.


A victory for video

I believe that 2022 is the year that video catches up with photography when it comes to the application of computer vision. Software on smartphone cameras already helps end-users to choose their favourite photo or apply smart edits to landscapes or portraits. In fact, unless you shoot ‘raw’, it’s probable that every photograph you now take is enhanced by computer vision.

We’re going to see something similar with video. In the same way that a camera can enhance different images depending on the context (landscape, portrait, sports and so on), computer vision will edit a clip depending on the subject.

This is especially useful in social media, where clips last for a matter of seconds. Imagine being an influencer or marketer on TikTok and being able to extract and stitch together a ten-second edit automatically from a two-minute clip.

The software will make the edit automatically, or the creator can choose a tone: tranquillity for a landscape, energy for sports and so on. It will, if required, even offer a choice of edits so that the videographer or social media manager can make a final decision and upload their favourite version.
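One way such an automatic edit could be selected is sketched below. It assumes a video-understanding model has already assigned a per-second ‘interest’ score to the clip (that model is assumed, not shown); a simple sliding-window pass then finds the highest-scoring contiguous ten seconds.

```python
# Hypothetical highlight extraction: given per-second "interest" scores
# for a clip, pick the contiguous window with the highest total score.

def best_window(scores, length=10):
    """Return (start, end) seconds of the top-scoring window."""
    if len(scores) <= length:
        return 0, len(scores)
    total = sum(scores[:length])          # score of the first window
    best_total, best_start = total, 0
    for start in range(1, len(scores) - length + 1):
        # Slide the window one second: add the new entry, drop the old one.
        total += scores[start + length - 1] - scores[start - 1]
        if total > best_total:
            best_total, best_start = total, start
    return best_start, best_start + length

# A 120-second clip whose most interesting stretch runs from 0:50 to 1:00.
scores = [0] * 50 + [5] * 10 + [0] * 60
print(best_window(scores))  # (50, 60)
```

A real editor would combine several signals (faces, motion, audio) and respect shot boundaries, but the underlying selection step can be this simple.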

This is also great news for media organisations. The software can be used to enhance and amplify content from many thousands of hours of archived footage. Media organisations will be able to sift through their content using a choice of filters based on fashion, sports, geography and many more. Once selected, these clips can be licensed to third parties or used as promotional material that attracts customers to the archive to search for similar content.


A reduction in the volume of data to train algorithms

Anyone involved in deep learning knows that one of the biggest challenges is the need for vast quantities of annotated data to train huge neural networks. This has been the traditional way to train computer vision models for some time. But we’re starting to see more innovative approaches that enable machine learning with substantially less training data. Examples include the move away from supervised learning towards self-supervised and weakly supervised learning, where the amount of data is less of an issue.

Another technique, known as few-shot learning, detects objects as well as new concepts with not many more than 20 images. It’s an important breakthrough that widens the application of computer vision, since businesses no longer need to rely on third parties with massive computing infrastructures and data sets to create concepts that are closely tailored to the needs of consumers or business customers.
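A minimal sketch of how few-shot classification can work, assuming embeddings from a pretrained vision backbone are already available: average the handful of labelled examples per class into a ‘prototype’ vector, then assign a new image to the class whose prototype is nearest. The toy arrays below stand in for real embeddings; this is one common approach, not a description of any particular product.

```python
import numpy as np

def build_prototypes(support):
    """support: {class_name: array of shape (n_examples, dim)},
    where each row is the embedding of one labelled example."""
    return {cls: vecs.mean(axis=0) for cls, vecs in support.items()}

def classify(embedding, prototypes):
    """Assign the class whose prototype is closest in Euclidean distance."""
    return min(prototypes,
               key=lambda cls: np.linalg.norm(embedding - prototypes[cls]))

# Toy 2-D "embeddings": two labelled examples per class.
support = {
    "cat": np.array([[0.0, 0.0], [0.2, 0.1]]),
    "dog": np.array([[5.0, 5.0], [4.8, 5.2]]),
}
prototypes = build_prototypes(support)
print(classify(np.array([4.5, 5.2]), prototypes))  # dog
```

Because only the small support set is specific to the new concept, a business can define a tailored class from a couple of dozen images without retraining the backbone.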

Space only allows for the four examples above, but overall I think this is the year when computer vision becomes as widely discussed in the mainstream media as artificial intelligence was in 2021. What else do you think is on the horizon for the technology?