Research applies computing and artificial intelligence resources to solving everyday problems
Unicamp had three projects selected in the 8th edition of the LARA 2020 Prize, the Latin American Research Awards research scholarship program. The initiative is run by the Google Engineering Center in Latin America. This year, 22 projects were selected: 13 Brazilian, four Argentine, two Chilean, one Peruvian, one Colombian and one Mexican. All of them propose technological solutions to everyday problems. Along with the Federal University of Minas Gerais (UFMG), Unicamp was the Brazilian institution with the most proposals selected. Unicamp and UFMG had three projects each, alongside projects from the Federal Universities of Uberlândia (1), Rio de Janeiro (1), Fluminense (1) and Rio Grande do Sul (1), and from the University of São Paulo (1).
Master's and doctoral projects selected by LARA receive research grants for one year. The amounts are US$1,200 for doctoral students, US$750 for doctoral advisors, US$750 for master's students and US$675 for master's advisors. Below is the Unicamp research included in this edition of the award. All three projects are from the Institute of Computing (IC) and are characterized by a dialogue between computing and other everyday demands.
Slow Internet? Call the drone
Every time you access a website or application, or use an internet service, your device sends a request to a data center, which responds by sending the requested data. The time elapsed between sending the request and receiving the response is called latency, and it can increase as network demand grows. This can happen when many people access many services at the same time, such as at large events, or when the network was not designed for the amount of data being transmitted. The usual solution is either to expand the data centers, giving them greater processing capacity, or to decentralize them, bringing them closer to users.
Looking for an alternative, the project "Reduction of service latency through the use of fixed-wing unmanned aerial vehicles", by IC doctoral student Rodrigo Augusto Cardoso da Silva, proposes using drones that operate as data centers. This brings these structures closer to people and gives them the mobility to meet demand in different places and at different times. The research, supervised by Professor Nelson Luis Saldanha da Fonseca, is based on the idea of fog computing.
Rodrigo explains that the difference between this type of computing and so-called cloud computing, a concept better known among non-specialist users, lies in how the network is structured. In cloud computing, data is concentrated in large centers, often located outside the country. In fog computing, the idea is to distribute it across smaller, decentralized centers. "With the concept of fog computing, our goal is to bring these resources closer to the user, within cities or even within neighborhoods, depending on the type of application. With this, we are able to access and process this data faster, in fractions of a second. This is important because it is expected that in the future, with greater processing capacity in mobile devices, there will be more and more applications of this type, which require a quick response", he explains.
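To give a sense of why distance matters, the sketch below compares only the round-trip propagation delay to a distant data center and to a nearby node. It is an illustration, not part of the project: the distances are hypothetical, the signal speed is the usual approximation for optical fiber, and queuing and processing time are ignored.

```python
# Rough comparison of round-trip propagation delay: distant cloud vs. nearby fog node.
# The distances are hypothetical and only propagation delay is modelled.

SIGNAL_SPEED_KM_PER_S = 200_000  # approximate speed of light in optical fiber


def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds (no queuing, no processing)."""
    return 2 * distance_km / SIGNAL_SPEED_KM_PER_S * 1000


cloud_km = 7_000  # assumed: data center in another country
fog_km = 5        # assumed: node inside the same city

cloud_rtt = round_trip_ms(cloud_km)
fog_rtt = round_trip_ms(fog_km)
print(f"cloud: {cloud_rtt:.2f} ms, fog: {fog_rtt:.2f} ms")
```

In practice, congestion and processing time often dominate, which is why the project also looks at where the processing itself takes place.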
The project therefore seeks to expand the ways in which these smaller, decentralized data centers, called "nodes" within the network structure, can be deployed, and evaluates whether drones can meet this need. "Today there are many uses for drones, such as delivering packages and capturing aerial images, and, within the area of network development, there is research that studies how to use a drone instead of a cell tower, for example. In our case, we looked at this processing problem, at how we can use a drone to process data closer to users", says Rodrigo. The researcher clarifies that the objective is not to replace fixed structures with drones, but to evaluate in which situations the equipment can complement them and reduce latency.
To achieve this, the research relies on simulations of hypothetical scenarios in which drones could be used as alternatives. "We vary the number of users, the types of requests made, the amount of data they need to transmit through the infrastructure. Then we evaluate these parameters to see in which scenarios this solution works best and in which it doesn't", points out Rodrigo, who also takes into account factors such as drone flight autonomy, energy consumption and area coverage. For this reason, the research favors fixed-wing drones, similar to model airplanes, as they can fly higher with less energy consumption.
He cites the Unicamp campus itself as an example of these scenarios, with different locations where the demand for internet access can vary throughout the day: "For example, on the Unicamp campus, there is the Basic Cycle, the Hospital de Clínicas and the Rectory. Let's assume that each location has a spike in data traffic at different times of the day. I may be able to serve these three locations with just one drone, instead of a ground infrastructure in each of them".
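The campus example can be sketched as a simple scheduling check: if the traffic peaks at the three locations do not overlap, and there is enough time to fly from one to the next, a single drone could in principle cover all of them. The peak hours and the repositioning time below are invented for illustration and do not come from the research.

```python
# Hypothetical peak-demand windows (start hour, end hour) for three campus sites.
PEAKS = {
    "Basic Cycle": (8, 11),
    "Hospital de Clínicas": (12, 15),
    "Rectory": (16, 18),
}
REPOSITION_HOURS = 0.5  # assumed time the drone needs to move between sites


def one_drone_suffices(peaks: dict[str, tuple[float, float]], gap: float) -> bool:
    """True if consecutive peaks (sorted by start) are at least `gap` hours apart."""
    windows = sorted(peaks.values())
    return all(
        next_start - prev_end >= gap
        for (_, prev_end), (next_start, _) in zip(windows, windows[1:])
    )


print(one_drone_suffices(PEAKS, REPOSITION_HOURS))
```

If two peaks overlapped, the check would fail and the scenario would need a second drone or fixed infrastructure, which is the kind of trade-off the simulations are meant to explore.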
Machine learning at the service of health and inclusion
Data from the Brazilian Society of Dermatology show that skin cancer accounts for 33% of all cancer cases in the country. According to the society, 185 thousand new cases of skin cancer are recorded every year. Of these, 177 thousand (95.6%) are of the non-melanoma type, which is less aggressive and has low mortality but is highly recurrent. The remainder, around 8.4 thousand cases (4.4%), are of the melanoma type, which is more aggressive.
To assist in diagnosing the disease, it is now common to use image banks that record which lesions, such as moles and spots, are cases of cancer and which are not. With artificial intelligence, computer systems can learn the patterns present in each case, which facilitates diagnosis when a new image is analyzed. However, this process faces some limitations. The first is the availability of time, specialists and resources to add the important data to the images, one by one, so that the systems can recognize the patterns and start doing this on their own: whether or not the lesion is a type of cancer, which part of the body it is on, and what the relevant information about the patient is.
Another limitation, a sensitive one in the Brazilian context, is the range of skin types found in these image banks. "These databases come, essentially, from Australia, the United States and Europe. In other words, they have examples of people, most of the time, with white skin. If we want to create a model that can be used by the Brazilian population, and the Brazilian's skin type is not in it, it will not work. There are even techniques to adapt the domain of a bank to our domain, we are already concerned with this, but when we come across melanomas in black skin, for example, the patterns are not the same as melanomas in white skin", details Sandra Avila, professor at IC.
With the aim of expanding the diversity of these databases, improving the diagnosis of skin cancer in the country, and making these banks easier to build, she and doctoral student Alceu Bissoto were awarded by the LARA program for the project "Rethinking the automatic classification of skin cancer with unsupervised representation learning".
The proposal is to increase the autonomy of the systems so that they depend less on so-called annotated images, which carry data added by specialists, and can identify cancer cases in ordinary images. "How about we try to work with unannotated images, with a large set of unannotated images and a small set of annotated images, where it is possible to work with this representation, find these patterns, so that first the machine learns to represent this data and then it is able to represent different data? So first it detects a lesion, within the characteristics of the Brazilian population, and then identifies what is a malignant or benign lesion, makes this differentiation", explains Sandra. This set of images represents different skin types, in a manner consistent with the country's diversity.
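The general idea of combining many unannotated examples with a few annotated ones can be illustrated with a toy "cluster-then-label" sketch. This is a deliberately simple stand-in, not the project's method: the scalar values below are made-up placeholders for features that a real system would learn from images, and the two labels are assumed for illustration.

```python
def nearest_index(value, centers):
    """Index of the center closest to `value`."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values, centers, iterations=20):
    """Plain k-means on scalar features; returns the final cluster centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest_index(v, centers)].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Step 1: find structure in many unannotated examples (toy scalar "features").
unlabeled = [0.1, 0.2, 0.15, 0.3, 0.25, 0.8, 0.9, 0.85, 0.95, 0.7]
centers = kmeans_1d(unlabeled, centers=[min(unlabeled), max(unlabeled)])

# Step 2: use a handful of annotated examples to name each cluster.
annotated = [(0.2, "benign"), (0.9, "malignant")]
cluster_label = {nearest_index(x, centers): label for x, label in annotated}


def classify(value):
    """Assign the label of the nearest cluster found in step 1."""
    return cluster_label[nearest_index(value, centers)]


print(classify(0.22), classify(0.88))
```

Only two annotated examples are needed here because the unannotated data already separates into two groups; real lesion images are far less tidy, which is where learned representations come in.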
According to the professor, enabling the use of ordinary images in these databases facilitates the diagnosis and referral of possible cases of skin cancer. Such cases could be identified in basic health units through photos taken with a cell phone, but they often go unnoticed for lack of medical equipment, such as dermatoscopes, and of specialized professionals. She emphasizes, however, that the tool would support decision-making on diagnosis and referral, without replacing the role of dermatologists and other professionals in the field.
"The idea is not to make a diagnosis, in the sense of taking the place of the specialist. In fact, no one works with artificial intelligence with this objective. The idea is to have augmented intelligence, supporting the specialist. Especially because what comes out of the system is numerical information, not an instruction on what to do with that number. So it really requires the interpretation of a specialist", she comments.
Alceu Bissoto, a PhD candidate in Computer Science, has been working on this project for three years. For him, directing the potential of computing to other areas is an important opportunity. "Computing, for many areas, is seen as a tool, it acts to solve problems. But, in fact, working with professionals from other areas, with this plurality of backgrounds, is very positive, because you end up not just worrying about computing problems and start paying attention to things that wouldn't even cross our minds", reflects the researcher.
"The work of gathering information is becoming increasingly difficult"
Doubting whether a piece of information is false or not has become common in many people's lives. For journalists and other professionals who produce content, this is a fundamental concern. A simple mistake can have major consequences. In search of a tool to help with this work, the project "Combating fake news through authorship attribution and phylogeny analysis" aims to make short texts published on Twitter more transparent. The project is by Anderson Rocha, professor at IC, and doctoral student Antonio Theóphilo.
The work focuses on two areas related to checking whether or not a message was produced by a specific social network user. The first is authorship attribution, which checks whether a message has the same stylistic characteristics as others published by the same person. The second is phylogeny analysis, which indicates whether a message is original or was reproduced by other people and, along the way, had its content modified. To do this, the researchers use machine learning. With it, computer systems can recognize patterns in a person's writing and tell whether a text fits that style.
"When there are different messages from the same person, how can you identify their writing style? It is generally the language the person uses, the most common words and expressions, whether they use abbreviations, whether they use emojis. When this analysis is done automatically to identify the pattern, it is done with artificial intelligence", explains Anderson Rocha. According to the researchers, the study on Twitter messages opens up possibilities for developing resources applicable to other texts, from news reproduced on websites of questionable authorship and credibility to forensic examinations carried out by police agencies.
"Imagine that you see a post on Twitter, supposedly coming from an authority or a journalist. It would be interesting if there was a platform that used these techniques and informed, based on that person's old texts, whether or not this text is from the person in question. Whether the consumer of the news will believe it or not is another story. We do not believe in silver bullet solutions that solve all fake news problems, but we believe that we can develop tools that help people, at the end of the flow of information, to decide whether it is false or not", explains Antonio Theóphilo.
The research is part of a series of projects developed by the RECOD Laboratory - Reasoning for Complex Data, which works on managing digital information. Anderson Rocha explains that the research carried out at the laboratory favors partnerships with a range of institutions, including journalistic fact-checking projects. "We have a large project at the RECOD laboratory called DéjàVu. Its objective is precisely to investigate, in different types of media, whether there has been any falsification or not. We analyze scientific articles, images, videos. There are also works on fake news that try to compare articles with the comments left on them and with the posts on social networks about those articles, so we can look at these three things in the same context, to check if there are signs of falsification in any of these publications", explains the professor.
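The authorship attribution idea described in this section, comparing a new message against the writing style of known authors, can be illustrated with a minimal sketch. It uses character trigram counts and cosine similarity, a classic stylometric baseline chosen here for simplicity; the researchers' own models are not described in the article and are likely more sophisticated. The authors and messages below are invented.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams, a common stylometric feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def style_profile(messages: list[str]) -> Counter:
    """Aggregate the n-gram counts of all known messages by one author."""
    profile = Counter()
    for message in messages:
        profile.update(char_ngrams(message))
    return profile


def most_likely_author(message: str, known: dict[str, list[str]]) -> str:
    """Return the known author whose style profile is closest to `message`."""
    target = char_ngrams(message)
    return max(known, key=lambda author: cosine(target, style_profile(known[author])))


# Hypothetical authors and messages, for illustration only.
known_messages = {
    "alice": ["omg lol cant wait 4 the show 2nite!!", "lol thx 4 the invite!!"],
    "bob": ["I believe the committee will convene tomorrow.",
            "The committee has published its report."],
}
print(most_likely_author("thx lol cant wait!!", known_messages))
```

A similarity score like this is only a signal: as the researchers note, such tools are meant to help people decide, not to settle by themselves whether a message is authentic.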