Unicamp spin-off company makes one of Google's artificial intelligence models available for free in Portuguese


NeuralMind, a Unicamp spin-off company based in the university's Scientific and Technological Park, works with artificial intelligence and has just released, free of charge and for the first time, a Google algorithm trained for Brazilian Portuguese.

The algorithm is Google's open-source Bidirectional Encoder Representations from Transformers (BERT), which was released in December and aims to make searches more accurate through natural language processing. In other words, the tool better understands what users are looking for when they type their keywords.

According to Google, 15% of the searches carried out on its platform each day are new, which justifies developing the algorithm to deliver more accurate results. Search is only one application of the code in artificial intelligence, explains Roberto Lotufo, professor at the Faculty of Electrical and Computer Engineering (FEEC) at Unicamp and technical director of NeuralMind.

“Google’s more precise search is just one of the many applications of BERT. At NeuralMind, for example, we use BERT in other natural language processing tasks, such as extracting data like people's names, addresses, institutions and dates,” explains Lotufo.

Despite the benefits of the code, Google distributed the algorithm pre-trained only in English, in Mandarin and in a multilingual version, a generic model used for the languages not covered. Because the generic version is not as effective as a model trained on a specific language, several organizations around the world decided to train the tool in their own languages.

“In Brazil, we trained a Portuguese BERT, since it gives better results than BERT-multilingual. The algorithm is now available free of charge to spread the technology in Brazil and other Portuguese-speaking countries, which can contribute to the advancement of research and product development in this area, such as chatbots,” says the professor about the feat, unprecedented in the country.

For the training, the spin-off company needed a large body of text in Brazilian Portuguese and used the free text corpus Brazilian Web as Corpus (BrWaC). Lotufo recalls that the training “was a Herculean effort, involving several days of Google Cloud machines, in addition to several weeks of data preparation”, but that the results were positive.

Companies or developers who wish to adopt the solution can now find it in NeuralMind's repository on GitHub, the source code hosting platform used by the spin-off company.
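For developers, a minimal sketch of what adopting the model might look like is shown below. The article does not describe any loading procedure, so everything here is an assumption: the sketch uses the Hugging Face `transformers` library and the model identifier `neuralmind/bert-base-portuguese-cased`, and the helper names `load_bert_portuguese` and `embed_sentence` are hypothetical. NeuralMind's GitHub repository is the place to confirm the recommended usage.

```python
# Sketch only. The model identifier and the use of the Hugging Face
# `transformers` library are assumptions, not details from the article.
MODEL_ID = "neuralmind/bert-base-portuguese-cased"


def load_bert_portuguese(model_id: str = MODEL_ID):
    """Download (on first call) and return the tokenizer and the model."""
    # Imported lazily so this file can be imported without the dependency.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed_sentence(text: str, tokenizer, model):
    """Return one contextual vector per token, special tokens included."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    # Shape: (number_of_tokens, hidden_size)
    return outputs.last_hidden_state[0]


# Example use (needs `transformers`, `torch` and network access):
#   tok, bert = load_bert_portuguese()
#   vectors = embed_sentence("A Unicamp fica em Campinas.", tok, bert)
#   print(vectors.shape)
```

Vectors like these are the usual starting point for downstream tasks such as the extraction of names, addresses, institutions and dates mentioned by Lotufo, which would require an additional task-specific layer trained on labeled data.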

More information is available on the company's website.

Cover image. Audio description: employees in the NeuralMind office.
