Machines can already see, process, and interpret the reality around them, thanks to systems that simulate human cognition and extract information from images. So-called computer vision is one of the technologies supporting advanced engineering, used above all to monitor and analyze various processes and activities. Scientists from several countries, including Brazil, are investing in improving this technology in order to expand its use. The goal is to incorporate it into systems that help diagnose diseases, devices that help people with speech impairments communicate, and autonomous cars, among other applications.
“Computer vision uses systems associated with image capture technologies and decision-making support based on analysis algorithms or artificial intelligence”, says mechanical engineer Paulo Gardel Kurka, from the Faculty of Mechanical Engineering at the State University of Campinas (FEM-Unicamp), an important research center in this area in the country. Studies involving image processing systems gained relevance in the 1970s, with the increase in computer processing capacity and the creation of electronic sensors capable of capturing images and digitizing them. In the following decades, the advancement of studies with semiconductor materials and the miniaturization of electronics drove the creation of more sophisticated systems, capable of obtaining, processing and analyzing information from these images more efficiently.
Computer vision technology is based on three image processing steps. The first involves capturing the image with a device such as a camera. The image is recorded on a two-dimensional grid of light-sensitive elements, each capable of storing a numerical value, in binary form, corresponding to the light intensity that falls on it. The elements of this grid correspond to the pixels used to display the image on a screen. “This data is subjected to processing techniques to improve image quality and highlight or eliminate certain characteristics that do not add information to the intended use”, explains Kurka.
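As a rough illustration of the first step and the beginning of the second, the sketch below loads a photograph as a two-dimensional grid of intensity values and applies simple preprocessing. It assumes the OpenCV and NumPy libraries; the file name is a hypothetical stand-in for a frame captured by a camera.

```python
import cv2          # OpenCV: loading and processing the captured image
import numpy as np

# Step 1: load the image as a 2D grid of intensity values (one number per pixel).
# "part.png" is a hypothetical file standing in for a camera frame.
image = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)
print(image.shape)   # (height, width) of the pixel grid
print(image[0, 0])   # intensity of the top-left pixel, 0 (dark) to 255 (bright)

# Step 2 (beginning): processing to improve quality and suppress
# characteristics that add no information, here noise and poor contrast.
denoised = cv2.GaussianBlur(image, (5, 5), 0)   # smooth out sensor noise
enhanced = cv2.equalizeHist(denoised)           # stretch the contrast
```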
This second step is carried out with techniques that identify and select regions of the pixel grid containing relevant data. “During processing, the regions of interest used in subsequent analyses are defined, adapting the characteristics of each image to the objectives to be achieved.” The last step involves analysis, classification, and recognition of the data of interest, carried out according to the type of application for which each computer vision system was developed.
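Continuing the same sketch, the second and third steps can be illustrated by isolating regions of interest and applying a simple rule-based classification. The threshold and the size criterion below are arbitrary, illustrative choices, not the rules of any system described in this article.

```python
# Step 2 (continued): select regions of interest by thresholding the enhanced
# image and keeping the contours of the bright areas it contains.
_, binary = cv2.threshold(enhanced, 128, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Step 3: analyze and classify each region using a quantitative attribute
# (its area). The 50-pixel cutoff is purely illustrative.
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    area = cv2.contourArea(contour)
    label = "relevant" if area > 50 else "noise"
    print(f"region at ({x}, {y}), size {w}x{h}: {label}")
```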
The global market for image processing systems was worth US$11.9 billion in 2018 and could reach US$17.3 billion in 2023, according to North American consultancy Markets and Markets. Seven of the 20 companies that invest the most in this technology are in the United States. Google is one of them. The Silicon Valley giant is focused on applying the technology to autonomous cars, equipping them with cameras and sensors capable of producing, processing, and analyzing images, distinguishing people from objects and helping the vehicle navigate. All in real time.
Advances are also taking place in academia. Two of the world's main research centers in this area, the California Institute of Technology (Caltech) and the Massachusetts Institute of Technology (MIT), work on image processing systems with diverse applications. Part of the knowledge generated at Caltech was used to create georeferencing devices for robots sent to Mars. At MIT, the technology is used to identify objects in dark places, to give robots human-like abilities such as touch sensitivity and dexterity, and in autonomous cars.
In Brazil, one of the most frequent applications of computer vision is the monitoring of industrial processes, one of the pillars of Industry 4.0. This is the case of Autaza, a startup from São José dos Campos (SP), which created an industrial inspection system in which cameras photograph parts on a production line and artificial intelligence is used to detect possible defects (see FAPESP Research No. 259).
These systems are also used in the country by the timber industry. Researchers from São Paulo State University (Unesp), Itapeva campus, and the Institute of Mathematical and Computer Sciences of the University of São Paulo (ICMC-USP), in São Carlos, created a technology capable of indicating the quality of boards and the tree species they come from (see FAPESP Research No. 257). The system, in use by the timber company Sguario, from Itapeva (SP), checks whether the wood is of legal origin and separates it by tree species, which can affect sales prices.
Improved communication
Computer vision is also being used to assist people with physical limitations and hospital patients. São Paulo startup Hoobox Robotics, founded in 2016 by researchers from Unicamp, created a facial recognition system that captures and translates expressions into commands to control the movement of a wheelchair, without the need for body sensors. The solution, called Wheelie 7, can recognize more than 10 expressions, such as an arched eyebrow or a blink. Using a camera aimed at the user's face, the system captures expressions, which are then interpreted by an algorithm. A program turns them into commands, such as go forward or turn left.
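The article does not describe Hoobox's software, but the final stage, turning a recognized expression into a driving command, can be sketched as a simple mapping. The expression names, the command names, and the detector that would feed this function are hypothetical placeholders, not the Wheelie 7 command set.

```python
from typing import Optional

# Hypothetical mapping from recognized facial expressions to chair commands;
# the real Wheelie 7 expressions and commands are not detailed in the article.
COMMANDS = {
    "raise_eyebrows": "forward",
    "blink_left":     "turn_left",
    "blink_right":    "turn_right",
    "half_smile":     "stop",
}

def expression_to_command(expression: str) -> Optional[str]:
    """Translate one detected expression into a wheelchair command."""
    return COMMANDS.get(expression)

# Example: a (hypothetical) facial-analysis model reports an expression
# for the current camera frame.
detected = "raise_eyebrows"
print(expression_to_command(detected))   # -> "forward"
```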
The solution was developed with support from FAPESP's Innovative Research in Small Businesses (Pipe) program. For now, the wheelchair is only sold in the United States, through a monthly subscription of US$300. Hoobox, in partnership with Albert Einstein Hospital, in São Paulo, is already testing the technology to detect human behaviors, such as agitation or spasms, in patients in intensive care units (ICUs).
In the same vein, mechanical engineer Marcus Lima, a researcher at FEM-Unicamp, applied computer vision to create an application for mobile devices that uses the front camera to detect eye commands, allowing communication through text-to-speech conversion. EyeTalk, as it was named, is the result of research supported by Pipe. According to Lima, the initial intention was to create a system to capture eye commands and control the movement of drones. “During the project, I saw that I could adapt the technology so that it could be used by people with speech impairment”, he recalls.
The solution works like a virtual keyboard, with keys that flash in sequence. The idea is simple: a tablet or smartphone running the application is attached to a support in front of the person. The front camera is aimed at the user's eyes; with a blink, the user selects the letters of interest, forming words and sentences, which are then converted to audio by a digital voice. The solution is similar to the one used by British physicist Stephen Hawking (1942-2018), who suffered from amyotrophic lateral sclerosis (ALS) and spent most of his life in a wheelchair, unable to speak. “The problem,” says Lima, “is that the available models can cost up to £15,000 [around R$75,000], unaffordable for most people.”
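The scanning logic described here, keys flashing in sequence until a blink selects the highlighted one, can be sketched roughly as below. Only the scanning loop is illustrated; the blink events are simulated and stand in for the camera-based detector, which is not part of this sketch.

```python
import itertools

KEYS = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ ")

def scan_for_selection(blink_events):
    """Flash keys in sequence; return the key highlighted when a blink arrives.

    blink_events is an iterable of booleans, one per flashed key, standing in
    for the output of a camera-based blink detector (hypothetical here).
    """
    for key, blinked in zip(itertools.cycle(KEYS), blink_events):
        # In the real app, this key would light up on screen for a moment.
        if blinked:
            return key   # the lit key is the user's choice
    return None

# Example: the user blinks while the 8th key is lit, selecting "H".
simulated_blinks = [False] * 7 + [True]
print(scan_for_selection(simulated_blinks))   # -> "H"
# Selected letters accumulate into sentences that a digital voice reads aloud.
```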
Lima currently heads Acta Visio, a company created in 2017 to develop solutions based on computer vision. He and his team are working on a prototype image processing system to monitor the hand hygiene of healthcare professionals. The objective is to reduce hospital infections and provide quantitative data to assist hospital managers in evaluating hygiene procedures. The first tests will be carried out in April at Hospital Universitário Cajuru, in Curitiba, in partnership with the hospital management company 2iM.
More accurate diagnosis
Still in the health sector, image processing systems are used to identify biomarkers that help in the diagnosis, prognosis, and treatment of some types of cancer, such as breast cancer. Many of these tumors are discovered only when the disease is at an advanced stage. Early detection involves clinical examination and mammography, an X-ray exam capable of identifying early lesions with cancerous potential. In many cases, reaching a more accurate diagnosis requires a biopsy, the removal of a fragment of suspicious tissue for analysis. On average, only one out of every eight biopsies performed is positive.
An innovation created by professor of biomedical informatics and medical physics Paulo Mazzoncini de Azevedo Marques, from the Faculty of Medicine of Ribeirão Preto at USP, allows the identification of patterns associated with this type of tumor, potentially reducing the number of biopsies. “We used computer vision to create an algorithm capable of detecting and analyzing microcalcifications, small calcium crystals present in the breast and represented as small light dots in the image, indicating whether they are associated with a benign or malignant lesion [cancer]”, he says. “The algorithm evaluates each pixel of the image within specific neighborhoods and, based on the extraction of quantitative attributes, identifies variations that may be associated with a suspicious pattern.”
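The quoted description, evaluating each pixel within a neighborhood and extracting quantitative attributes, can be illustrated with a generic local-statistics filter. This is a minimal sketch of the general idea using NumPy and SciPy, not the algorithm developed at USP; the window size, the threshold, and the synthetic image are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def bright_spot_map(image: np.ndarray, window: int = 15, k: float = 3.0) -> np.ndarray:
    """Flag pixels much brighter than their local neighborhood.

    A generic illustration of per-pixel neighborhood analysis; microcalcifications
    appear as small bright dots, so local contrast is one plausible attribute.
    """
    img = image.astype(np.float64)
    local_mean = uniform_filter(img, size=window)
    local_sq_mean = uniform_filter(img ** 2, size=window)
    local_std = np.sqrt(np.maximum(local_sq_mean - local_mean ** 2, 0.0))
    # A pixel is suspicious if it exceeds its neighborhood mean by k standard deviations.
    return img > local_mean + k * local_std

# Example on synthetic data: a flat, slightly noisy image with one bright 2x2 spot.
synthetic = np.full((64, 64), 100.0) + np.random.normal(0, 2, (64, 64))
synthetic[30:32, 40:42] += 40
print(np.argwhere(bright_spot_map(synthetic)))   # coordinates near (30, 40)
```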
In the healthcare sector, computer vision can help diagnose cancer and autoimmune diseases
The model can also be trained to recognize patterns that support diagnosis and treatment in cases of lung tumors and autoimmune rheumatic diseases. Based on the variation in shades of gray in images of suspicious lesions captured by X-ray, magnetic resonance and computed tomography devices, and using artificial intelligence algorithms, it is possible to analyze suspicious regions in isolation, comparing their characteristics with those of lesions identified in images of other previously diagnosed patients.
“Our intention is for the system to use this accumulated data and learn through experience, being able to establish patterns that help the doctor to recognize where the significant findings are in the images, their characteristics and whether they are associated with more aggressive tumors”, explains the researcher. The solution is being improved based on information available in local and public clinical databases. The idea is to train the system and refine its ability to recognize patterns associated with these diseases.
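The learning-from-experience idea, comparing a new lesion's quantitative attributes with those of lesions from previously diagnosed patients, can be sketched with a generic nearest-neighbor classifier. The attribute values below are invented for illustration, and the scikit-learn model is an assumption, not the system described by the researcher.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Each row: quantitative attributes of a previously diagnosed lesion
# (mean gray level, gray-level standard deviation, lesion size in pixels).
# All values are invented purely for illustration.
past_lesions = np.array([
    [110.0, 12.0,  80],
    [115.0, 10.0,  95],
    [180.0, 35.0, 300],
    [175.0, 40.0, 260],
])
diagnoses = ["benign", "benign", "malignant", "malignant"]

# "Learning through experience": store past cases and compare new ones to them.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(past_lesions, diagnoses)

# A new suspicious region is described by the same attributes and compared
# with the accumulated cases to suggest which pattern it resembles.
new_lesion = np.array([[170.0, 33.0, 280]])
print(model.predict(new_lesion))   # -> ['malignant'] for this made-up example
```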
Projects
1. Automatic quality inspection of automotive bodies (nº 17/25873-8); Modality Innovative Research in Small Businesses (Pipe); Responsible researcher Jorge Augusto de Bonfim Gripp (Autaza Tecnologia); Investment R$1,452,695.60.
2. Wheelie, innovative technology for driving a wheelchair (nº 17/07367-8); Modality Innovative Research in Small Businesses (Pipe); Responsible researcher Paulo Gurgel Pinheiro (Hoobox Robotics); Investment R$723,814.04.
3. Design, development and assembly of a prototype eye tracking system embedded in first-person vision glasses to control a drone intended for quadriplegic people (nº 16/15351-1); Modality Innovative Research in Small Businesses (Pipe); Responsible researcher Marcus Vinicius Pontes Lima (Acta Visio); Investment R$57,974.44.
4. Development and implementation of radiology and diagnostic imaging domain ontology for clinical practice in a teaching hospital (nº 11/08943-6); Modality Research Assistance – Regular; Responsible researcher Paulo Mazzoncini de Azevedo Marques (USP Ribeirão Preto); Investment R$55,731.32.
This text was originally published by FAPESP Research according to the Creative Commons license CC-BY-NC-ND. Read the original here.