Artificial Intelligence is getting better – latest news and trends in AI concerning image processing

Artificial intelligence is becoming part of new, more useful applications, and it keeps getting better. In this blog post we present some of these new and interesting AI apps. From this post on, every couple of months we will show and discuss news and trends in the image processing field, including new papers, research and applications!

And now, let’s start with news from our favorite, NVIDIA. What is NVIDIA up to?

Image source: https://pixabay.com/

AI can Detect Open Parking Spaces

Image source: https://pixabay.com/

With as many as 2 billion parking spaces in the United States, finding an open spot in a major city can be complicated. To help city planners and drivers more efficiently manage and find open spaces, MIT researchers developed a deep learning-based system that can automatically detect open spots from a video feed.

“Parking spaces are costly to build, parking payments are difficult to enforce, and drivers waste an excessive amount of time searching for empty lots,” the researchers stated in their paper.

Article from:
https://news.developer.nvidia.com/ai-algorithm-aims-to-help-you-find-a-parking-spot/

New AI Imaging Technique Reconstructs Photos with Realistic Results

Researchers from NVIDIA, led by Guilin Liu, introduced a state-of-the-art deep learning method that can reconstruct a corrupted image, one that has holes or is missing pixels. The method can also be used to edit images by removing content and filling in the resulting holes. The method, which performs a process called “image inpainting”, could be implemented in photo editing software to remove unwanted content and fill the gap with a realistic computer-generated alternative.

“Our model can robustly handle holes of any shape, size, location, or distance from the image borders. Previous deep learning approaches have focused on rectangular regions located around the center of the image, and often rely on expensive post-processing,” the NVIDIA researchers stated in their research paper.

Article from:
https://news.developer.nvidia.com/new-ai-imaging-technique-reconstructs-photos-with-realistic-results/

AI Can Now Fix Your Grainy Photos by Only Looking at Grainy Photos

What if you could take your photos that were originally taken in low light and automatically remove the noise and artifacts? Have grainy or pixelated images in your photo library and want to fix them? This deep learning-based approach has learned to fix photos by simply looking at examples of corrupted photos only. The work was developed by researchers from NVIDIA, Aalto University, and MIT, and was presented at the International Conference on Machine Learning in Stockholm, Sweden.

Recent deep learning work in the field has focused on training a neural network to restore images by showing example pairs of noisy and clean images. The AI then learns how to make up the difference. This method differs because it only requires two input images with the noise or grain.
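To get a feel for why learning from noisy examples alone can work, here is a minimal sketch. It is not the researchers' code and it does not train a neural network. It only illustrates one statistical idea, under the assumption (ours, not stated in the article) that the noise is zero-mean: the value that minimizes the squared error against many noisy targets is their average, and that average approaches the clean signal. The signal, the noise level and all names are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical clean "image": 16 pixel values between 0 and 1.
clean = [i / 15 for i in range(16)]


def noisy_copy(signal, sigma):
    """Return the signal with zero-mean Gaussian noise added (an assumption)."""
    return [value + rng.gauss(0.0, sigma) for value in signal]


def l2_best_estimate(targets):
    """Per-pixel value that minimizes squared error against all targets: the mean."""
    count = len(targets)
    return [sum(column) / count for column in zip(*targets)]


# Only noisy observations are used from here on; `clean` is kept just to check the result.
targets = [noisy_copy(clean, sigma=0.5) for _ in range(4000)]
estimate = l2_best_estimate(targets)

max_error = max(abs(e - c) for e, c in zip(estimate, clean))
print(f"largest per-pixel error: {max_error:.4f}")
```

Each single noisy copy is off by about 0.5 per pixel, while the estimate built only from noisy copies ends up much closer to the clean values.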

Without ever being shown what a noise-free image looks like, this AI can remove artifacts, noise, grain, and automatically enhance your photos.

“It is possible to learn to restore signals without ever observing clean ones, at performance sometimes exceeding training using clean exemplars,” the researchers stated in their paper.

Article from:
https://news.developer.nvidia.com/ai-can-now-fix-your-grainy-photos-by-only-looking-at-grainy-photos/

AI Model Can Generate Images from Natural Language Descriptions

Image source: https://pixabay.com/

To potentially improve natural language queries, including the retrieval of images from speech, researchers from IBM and the University of Virginia developed a deep learning model that can generate objects and their attributes from natural language descriptions.

“We show that under minor modifications, the proposed framework can handle the generation of different forms of scene representations, including cartoon-like scenes, object layouts corresponding to real images, and synthetic images,” the researchers stated in their paper.

Article from:
https://news.developer.nvidia.com/ai-model-can-generate-images-from-natural-language-descriptions/

Now, some new research papers from different fields that use AI as well as image processing:

Digital image analysis in breast pathology—from image processing techniques to artificial intelligence 

From: https://www.sciencedirect.com/science/article/pii/S1931524417302955 

Image source: https://pixabay.com/

Abstract: Breast cancer is the most common malignant disease in women worldwide. In recent decades, earlier diagnosis and better adjuvant therapy have substantially improved patient outcome. Diagnosis by histopathology has proven to be instrumental to guide breast cancer treatment, but new challenges have emerged as our increasing understanding of cancer over the years has revealed its complex nature. As patient demand for personalized breast cancer therapy grows, we face an urgent need for more precise biomarker assessment and more accurate histopathologic breast cancer diagnosis to make better therapy decisions. The digitization of pathology data has opened the door to faster, more reproducible, and more precise diagnoses through computerized image analysis. Software to assist diagnostic breast pathology through image processing techniques have been around for years. But recent breakthroughs in artificial intelligence (AI) promise to fundamentally change the way we detect and treat breast cancer in the near future. Machine learning, a subfield of AI that applies statistical methods to learn from data, has seen an explosion of interest in recent years because of its ability to recognize patterns in data with less need for human instruction. One technique in particular, known as deep learning, has produced groundbreaking results in many important problems including image classification and speech recognition. In this review, we will cover the use of AI and deep learning in diagnostic breast pathology, and other recent developments in digital image analysis.

Predicting tool life in turning operations using neural networks and image processing

From: https://www.sciencedirect.com/science/article/pii/S088832701730599X 

Abstract: A two-step method is presented for the automatic prediction of tool life in turning operations. First, experimental data are collected for three cutting edges under the same constant processing conditions. In these experiments, the parameter of tool wear, VB, is measured with conventional methods and the same parameter is estimated using Neural Wear, a customized software package that combines flank wear image recognition and Artificial Neural Networks (ANNs). Second, an ANN model of tool life is trained with the data collected from the first two cutting edges and the subsequent model is evaluated on two different subsets for the third cutting edge: the first subset is obtained from the direct measurement of tool wear and the second is obtained from the Neural Wear software that estimates tool wear using edge images. Although the complete-automated solution, Neural Wear software for tool wear recognition plus the ANN model of tool life prediction, presented a slightly higher error than the direct measurements, it was within the same range and can meet all industrial requirements. These results confirm that the combination of image recognition software and ANN modelling could potentially be developed into a useful industrial tool for low-cost estimation of tool life in turning operations.

Automatic food detection in egocentric images using artificial intelligence technology 

From:
https://www.cambridge.org/core/journals/public-health-nutrition/article/automatic-food-detection-in-egocentric-images-using-artificial-intelligence-technology/CAE3262B945CC45E4B14C06C83A68F42  

Image source: https://pixabay.com/

Abstract:

Objective: To develop an artificial intelligence (AI)-based algorithm which can automatically detect food items from images acquired by an egocentric wearable camera for dietary assessment.

Design: To study human diet and lifestyle, large sets of egocentric images were acquired using a wearable device, called eButton, from free-living individuals. Three thousand nine hundred images containing real-world activities, which formed eButton data set 1, were manually selected from thirty subjects. eButton data set 2 contained 29 515 images acquired from a research participant in a week-long unrestricted recording. They included both food- and non-food-related real-life activities, such as dining at both home and restaurants, cooking, shopping, gardening, housekeeping chores, taking classes, gym exercise, etc. All images in these data sets were classified as food/non-food images based on their tags generated by a convolutional neural network.

Results: A cross data-set test was conducted on eButton data set 1. The overall accuracy of food detection was 91·5 and 86·4 %, respectively, when one-half of data set 1 was used for training and the other half for testing. For eButton data set 2, 74·0 % sensitivity and 87·0 % specificity were obtained if both ‘food’ and ‘drink’ were considered as food images. Alternatively, if only ‘food’ items were considered, the sensitivity and specificity reached 85·0 and 85·8 %, respectively.

Conclusions: The AI technology can automatically detect foods from low-quality, wearable camera-acquired real-world egocentric images with reasonable accuracy, reducing both the burden of data processing and privacy concerns.
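For readers who are new to the terms used in the abstract above, here is a small sketch of how sensitivity, specificity and accuracy are computed from the counts of a food/non-food classifier. The formulas are the standard definitions. The counts are made up for illustration and are not taken from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Share of real food images that were detected as food."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of non-food images that were correctly rejected."""
    return true_neg / (true_neg + false_pos)


def accuracy(true_pos, true_neg, false_pos, false_neg):
    """Share of all images that were classified correctly."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


# Hypothetical counts: 100 food images and 1000 non-food images.
tp, fn = 85, 15      # food images: detected / missed
tn, fp = 858, 142    # non-food images: rejected / wrongly flagged

print(f"sensitivity: {sensitivity(tp, fn):.3f}")
print(f"specificity: {specificity(tn, fp):.3f}")
print(f"accuracy:    {accuracy(tp, tn, fp, fn):.3f}")
```

Reporting sensitivity and specificity separately matters when the classes are unbalanced, as in a week of wearable-camera images where most frames contain no food.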

Bioinformatics and Image Processing—Detection of Plant Diseases 

From:
https://link.springer.com/chapter/10.1007/978-981-13-1580-0_14 

Image source: https://pixabay.com/

Abstract:

This paper gives an idea of how a combination of image processing along with bioinformatics detects deadly diseases in plants and agricultural crops. These kinds of diseases are not recognizable by bare human eyesight. First occurrence of these diseases is microscopic in nature. If plants are affected with such kind of diseases, there is deterioration in the quality of production of the plants. We need to correctly identify the symptoms, treat the diseases, and improve the production quality. Computers can help to make correct decision as well as can support industrialization of the detection work. We present in this paper a technique for image segmentation using HSI algorithm to classify various categories of diseases. This technique can also classify different types of plant diseases as well. GA has always proven itself to be very useful in image segmentation.

And finally, some news from the public sector and applied algorithms:

China Now has Facial Recognition Based Toilets 

Image source: https://pixabay.com/

China has integrated facial recognition into toilets across the country. Citizens now need WeChat or a face scan to get toilet paper. People stand on the yellow recognition spot and bring their face near the face identification machine. After about three seconds, 90 centimeters of toilet paper comes out. People then go in and use the toilet, but only for a limited time, as an alarm will buzz if someone occupies it for too long. Inside the toilet, sensors assess the ammonium level and spray a deodorant if required. The two bathrooms were fitted with face scanners for the sake of being “clean and convenient” and “reducing toilet paper waste.”

Read more here:
https://www.aitechnologies.com/china-now-has-facial-recognition-based-toilets/ 

Apple’s Camera-Toting Watch Band Uses Facial Recognition For Flawless FaceTime Calls 

Image source: https://pixabay.com/

The U.S. Patent and Trademark Office granted Apple a patent which suggests that the tech titan wants to widen the feature set of its wearable by integrating an original camera system with the ability to automatically crop subject matter, track objects such as the user’s face and produce angle-adjusted avatars for FaceTime calls. Apple’s U.S. Patent No. 10,129,503, “Image-capturing watch”, describes a software and hardware solution for a camera-toting Apple Watch that is both handy and feasible. With a camera-equipped Watch, consumers can put aside a heavy handheld device while playing sports, exercising or doing other energetic activities. However, a feasible smartwatch solution is hard to accomplish. The camera captures motion data and the watch processes it, after which it is mapped onto a computer-generated picture that imitates the consumer’s facial movements and expressions in real time. Alternatively, the source movement data can be used to drive the motion of non-human avatars such as Apple’s Memoji and Animoji. It remains unknown whether Apple wants to integrate its camera band tech into the Apple Watch.

Read more here:
https://www.aitechnologies.com/apples-camera-toting-watch-band-uses-facial-recognition-for-flawless-facetime-calls/

Metropolitan Police London is to Integrate Face Recognition Tech 

Image source: https://pixabay.com/

London’s police will integrate face recognition tech as an experiment for two days. In the areas of Leicester Square, Piccadilly Circus, and Soho in London, the technology will examine faces in the crowd and compare them with the database of individuals wanted by the courts and the Metropolitan Police in London. If the tech finds a match, the police officers in that area will analyze it and perform further checks to confirm the identity of that individual.

Read more here:
https://www.aitechnologies.com/metropolitan-police-london-is-to-integrate-face-recognition-tech/

That’s all for now, folks. But tell me, what do you think: in which areas is AI going to bring the most benefits? In which areas, in your opinion, is there room for more research? Can you actually believe that it is possible to have AI solutions in everyday life?

All news items are citations from the sites mentioned, where you can find the full text on each topic.

What is real time processing (online VS offline)

If you are a beginner in the area of image and video processing, you may often hear the term real time processing. In this post, we will try to explain the term and list some typical concerns related to it.

Real time processing – circuit board (image source: https://pixabay.com/)

Real time image processing is tied to the frame rate. The current standard for capture is typically 30 frames per second (FPS). Real time processing requires processing every frame as soon as it is captured. So, broadly speaking, if the capture rate is 30 FPS, then 30 frames need to be processed in one second. That comes to around 33 milliseconds per frame (1000 ms / 30 frames ≈ 33 ms/frame). A similar calculation can be done for any frame rate to get the required processing time per frame.
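The calculation above can be written as a tiny helper. This is just a sketch of the arithmetic; the function name and the list of frame rates are our own choices for illustration.

```python
def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# A few common frame rates, chosen only as examples.
for fps in (24, 30, 60, 120):
    print(f"{fps:>3} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
```

Doubling the frame rate halves the budget, so an algorithm that comfortably runs at 30 FPS may not be real time at 60 FPS.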

In image and video processing, the source of our signal is a camera. So, what real time image processing really means is producing output simultaneously with the input. In practice, the algorithm runs at the rate of the source (e.g. a camera) supplying the images, so it can process images at the frame rate of the camera.
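One simple way to check whether an algorithm keeps up with the source is to time it on a batch of frames and compare the slowest frame with the per-frame budget. The sketch below does this with plain Python timing. The `slow_filter` function and the dummy frames are hypothetical stand-ins, not a real image processing algorithm, and a real measurement should run on the target hardware with real images.

```python
import time


def worst_case_ms(process, frames):
    """Run `process` on every frame and return the slowest time in milliseconds."""
    worst = 0.0
    for frame in frames:
        start = time.perf_counter()
        process(frame)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        worst = max(worst, elapsed_ms)
    return worst


def keeps_up(process, frames, fps):
    """True if even the slowest frame fits in the per-frame budget for `fps`."""
    return worst_case_ms(process, frames) <= 1000.0 / fps


def slow_filter(frame):
    # Hypothetical stand-in for a heavy algorithm: it just sleeps for 20 ms.
    time.sleep(0.02)
    return frame


frames = [[0] * 16 for _ in range(5)]  # dummy "frames", not real images

print(keeps_up(slow_filter, frames, fps=30))   # budget is about 33 ms per frame
print(keeps_up(slow_filter, frames, fps=100))  # budget is 10 ms per frame
```

The sketch uses the worst case rather than the average, because a single slow frame is enough to fall behind a live camera.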

Image source: https://pixabay.com/

Source of image signal is camera

Human vision:

Just out of curiosity, let’s see how human vision works:

The first thing to understand is that we perceive different aspects of vision differently. Detecting motion is not the same as detecting light. Another thing is that different parts of the eye perform differently. The center of vision is good at different stuff than the periphery. And another thing is that there are natural, physical limits to what we can perceive. It takes time for the light that passes through your cornea to become information on which your brain can act, and our brains can only process that information at a certain speed.

Another important concept: the whole of what we perceive is greater than what any one element of our visual system can achieve. This point is fundamental to understanding our perception of vision.

The temporal sensitivity and resolution of human vision varies depending on the type and characteristics of visual stimulus, and it differs between individuals. The human visual system can process 10 to 12 images per second and perceive them individually, while higher rates are perceived as motion. Modulated light (such as a computer display) is perceived as stable by the majority of participants in studies when the rate is higher than 50 Hz through 90 Hz. This perception of modulated light as steady is known as the flicker fusion threshold. However, when the modulated light is non-uniform and contains an image, the flicker fusion threshold can be much higher, in the hundreds of hertz. Regarding image recognition, people have been found to recognize a specific image in an unbroken series of different images, each of which lasts as little as 13 milliseconds. Persistence of vision sometimes accounts for very short single-millisecond visual stimulus having a perceived duration of between 100 ms and 400 ms. Multiple stimuli that are very short are sometimes perceived as a single stimulus, such as a 10 ms green flash of light immediately followed by a 10 ms red flash of light perceived as a single yellow flash of light.

Image source: https://pixabay.com/

Human vision

Applications:

The real-time aspect is critical in many real-world devices or products such as mobile phones, digital still/video/cell-phone cameras, portable media players, personal digital assistants, high-definition television, video surveillance systems, industrial visual inspection systems, medical imaging devices, vision-assisted intelligent robots, spectral imaging systems, and many other embedded image or video processing systems.

As the capabilities of imaging systems increase, for example cameras that capture 16 or more megapixels per frame, it becomes extremely difficult to achieve real time performance for many applications.
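To see why high resolutions are so demanding, it helps to turn resolution and frame rate into a pixel rate and a time budget per pixel. The sketch below does only this arithmetic. The 30 FPS rate and the 3 bytes per pixel (8-bit RGB) are assumptions for illustration, not properties of any particular camera.

```python
def pixels_per_second(megapixels, fps):
    """Number of pixels that arrive from the sensor every second."""
    return megapixels * 1_000_000 * fps


def ns_per_pixel(megapixels, fps):
    """Average time available for each pixel, in nanoseconds."""
    return 1e9 / pixels_per_second(megapixels, fps)


def raw_data_rate_mb_s(megapixels, fps, bytes_per_pixel=3):
    """Uncompressed data rate in megabytes per second (3 bytes/pixel assumed)."""
    return pixels_per_second(megapixels, fps) * bytes_per_pixel / 1e6


full_hd = 1920 * 1080 / 1_000_000  # about 2.07 megapixels

for name, megapixels in (("Full HD", full_hd), ("16 MP", 16.0)):
    print(
        f"{name}: {ns_per_pixel(megapixels, 30):.2f} ns per pixel, "
        f"{raw_data_rate_mb_s(megapixels, 30):.0f} MB/s raw at 30 FPS"
    )
```

At 16 megapixels and 30 FPS the budget is only about 2 nanoseconds per pixel, which explains why such streams usually need parallel hardware or reduced resolution.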

Image source: https://pixabay.com/

Applications

What applications need real time performance and what applications do not:

When talking about the numerous applications of image and video processing, we can say that some applications in some systems need real time processing, and some don’t. That is why we will talk about online (real time) and offline processing.

Image made by author

Offline processing is the processing of an already recorded video sequence or image. Digital video stabilization, video enhancement, video coloring, or any other application can then work with previously prepared video. These applications can be found in marketing, industry, medical imaging, the film industry, or in ordinary commercial applications, such as a user who wants to stabilize and enhance a video from the phone library.

Offline processing allows more complex and computationally demanding algorithms, and therefore usually gives better results than real time processing. That is why offline processing tools are widely used in academic research and in some kinds of challenges.

Some of the deep learning tools for offline processing (on CPU) are:

Image made by author

On the other hand, some applications demand real time processing. For example, traffic monitoring, target tracking in military applications, surveillance and monitoring, real time video games, etc. are applications that require real time feedback and a processed image from the sensor.

Algorithms that work in real time do not have the luxury of high complexity, since the processing time available for each frame is determined by the source frame rate and resolution. New hardware solutions offer better processing speeds, but there are still limitations, depending on the specific application.

Image made by author

Systems with multiple complex applications working in parallel:

Sometimes an application demands multiple complex algorithms working in parallel. Then not only the complexity of the algorithms must be considered, but also which algorithm runs first and how this affects the desired performance of the application. One good example is when video enhancement and digital video stabilization algorithms work in parallel.

Video stabilization and video dehazing algorithms in the same video processing pipeline can affect each other’s results. This interesting topic is described in the paper [Dehazing Algorithms Influence on Video Stabilization Performance], given in the references at the end of the post. When there is no severe haze, noise or low contrast in the scene, it is important to run the video stabilization algorithm before the video dehazing algorithm. On the other hand, when the feature level in the scene is low, which happens because of severe haze or low contrast in the image, the stabilization algorithm cannot perform well, since it cannot calculate global motion accurately. That is why, for the sake of better stabilization performance, the proposed pipeline runs the video dehazing algorithm before video stabilization.

Image source: scientific paper Dehazing Algorithm Influence on Video Stabilization Performance

Dehazing Algorithms Influence on Video Stabilization Performance
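The ordering rule described above, stabilize first in clear scenes and dehaze first when there are too few features, could be sketched as follows. This is only an illustration of the decision, not the method from the paper: the feature count, the threshold of 100 and the two stage functions are hypothetical placeholders.

```python
def choose_stage_order(feature_count, min_features=100):
    """Pick the stage order from the number of trackable features in the scene.

    `min_features` is a hypothetical threshold chosen for illustration.
    """
    if feature_count < min_features:
        # Severe haze or low contrast: recover features before estimating motion.
        return ["dehaze", "stabilize"]
    # Clear scene: motion can be estimated directly, so stabilize first.
    return ["stabilize", "dehaze"]


def run_pipeline(frame, stage_order, stages):
    """Apply the stages to the frame in the given order."""
    for name in stage_order:
        frame = stages[name](frame)
    return frame


# Placeholder stages that only record what was applied, in order.
stages = {
    "dehaze": lambda frame: frame + ["dehazed"],
    "stabilize": lambda frame: frame + ["stabilized"],
}

hazy_order = choose_stage_order(feature_count=12)
clear_order = choose_stage_order(feature_count=450)

print(run_pipeline([], hazy_order, stages))   # ['dehazed', 'stabilized']
print(run_pipeline([], clear_order, stages))  # ['stabilized', 'dehazed']
```

In a real system the feature count would come from a feature detector running on the incoming frames, and the threshold would have to be tuned for the scene and the camera.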

Finally, we mention some of the possible platforms for real time image processing:

  • FPGA – very good for complex parallel operations; an example application is in the paper [High-performance electronic image stabilization for shift and rotation correction], given in the references.
  • Nvidia Jetson TX1, TX2, Xavier –

“Get real-time Artificial Intelligence (AI) performance where you need it most with the high-performance, low-power NVIDIA Jetson AGX systems. Processing of complex data can now be done on-board edge devices. This means you can count on fast, accurate inference in everything from robots and drones to enterprise collaboration devices and intelligent cameras. Bringing AI to the edge unlocks huge potential for devices in network-constrained environments.”  – from Nvidia site, given in references.

References