Multidisciplinary work results in an unprecedented methodology capable of predicting recurrence of thrombosis with 100% accuracy

Venous thrombosis results from the formation of clots in the veins. Despite treatment, which generally lasts three to six months, two main complications can follow: fatal pulmonary embolism, caused by the detachment of a clot, and recurrent thrombosis, which affects around 30% of individuals in the first five years after the first thrombotic episode. Determining the likelihood of recurrence in a given patient is still uncertain. Several studies have tried to identify patients prone to recurrence by identifying the factors that predispose them to it. However, because this is a multifactorial disease, the task is not easy, and it is made harder by the fact that the interaction between the various factors may vary from one population to another. Even so, in the last decade three *score*-based methods were proposed to calculate the risk of recurrence from clinical data, but they all have a series of limitations.

Multivariate statistical techniques, such as Principal Component Analysis, and artificial intelligence techniques, such as artificial neural networks, have gained great prominence in the medical literature. The former, as the name suggests, detects patterns and important factors in a phenomenon by analyzing the variance of the variables that influence it. The latter can learn from the examples provided and mathematically model a variety of phenomena.

Given this framework of limitations and possibilities, chemical engineer Tiago Dias Martins, professor at the Federal University of São Paulo (Unifesp), Diadema campus, carried out a study with the Department of Chemical Processes at the Faculty of Chemical Engineering (FEQ) at Unicamp with two main objectives: 1) to determine the main predictive factors of recurrent thrombosis and 2) to obtain mathematical models to predict it using artificial neural networks. Such models, which take only clinical and laboratory data as input, could be used in the future to predict this complication, helping doctors decide whether or not to continue anticoagulant treatment, which has its own limitations because blood thinning can lead to hemorrhages.

The study, developed together with the Laboratory of Optimization, Design and Advanced Control, coordinated by supervising professor Rubens Maciel Filho, and co-supervised by professor Joyce Maria Annichino-Bizzacchini, a physician at the Hemocentro, which belongs to the Unicamp hospital complex, constitutes an alternative to the existing *score*-based models. It assumes that recurrent thrombosis can be described by an equation that, given several predetermined factors, provides one of two possible answers, yes or no, regarding the possibility of disease recurrence.
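The idea of an equation that takes a set of factors and returns a yes or no answer can be sketched as a small feed-forward network. The following is a minimal illustration, not the study's model: the function name `predict_recurrence`, the three input factors, the network size, and all weights are hypothetical values chosen only to make the example run.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_recurrence(factors, hidden_weights, hidden_biases,
                       output_weights, output_bias, threshold=0.5):
    """Evaluate a one-hidden-layer network and return (yes/no, score)."""
    # Each hidden neuron computes a weighted sum of the factors plus a bias.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, factors)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # The output neuron combines the hidden activations into a single score.
    score = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    # The yes/no answer comes from comparing the score with a threshold.
    return score >= threshold, score


# Hypothetical network: 3 scaled input factors, 2 hidden neurons.
W = [[0.8, -0.4, 1.2], [-0.5, 0.9, 0.3]]
b = [0.1, -0.2]
v = [1.5, -1.0]
c = -0.3

answer, score = predict_recurrence([0.6, 0.2, 0.9], W, b, v, c)
print("recurrence predicted:", answer, "score:", round(score, 3))
```

In a trained network the weights are not chosen by hand as they are here; they are adjusted from examples of patients whose outcome is already known.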

The multidisciplinary work, carried out by one group from medicine and another from chemical engineering and based on a database of 235 patients treated at the University's Blood Center (Hemocentro), shows that artificial neural networks are capable of predicting the recurrence of thrombosis with 100% accuracy. The methodology is unprecedented, as there is no mention in the international literature of using this tool to predict recurrent thrombosis.

**Equations and parameters adopted**

Artificial neural networks were developed by a mathematician and a doctor in the 1950s, at MIT, in Boston, with the idea of creating a mathematical tool capable of simulating the human brain and which, like the brain, could go through a learning stage in order to make decisions about new situations related to what it has learned. To do so, these networks need to be supplied with examples that show how the system under study works.

On this point, the researcher explains: “Our focus was to determine whether the patient is likely to have recurrent thrombosis. But as it depends on many factors, we initially chose 39 of them, which were later refined with a view to achieving practicality and reducing costs.” As venous thrombosis can affect various parts of the body, the study focused on the legs, the most common site, and the brain, the rarest site and a highly complex situation.

Initially, the idea was to feed the equation that describes the artificial neural network with data from the previous thrombosis and from blood tests taken after the end of treatment, which together made up the 39 selected factors, in order to predict the possibility of a second thrombosis. To this end, the study considered, among others, information about sex, age, time of occurrence of the event, association with pulmonary embolism, affected limb or organ, duration of anticoagulant treatment, whether the episode was spontaneous or provoked, the latter understood here as one resulting from long trips without movement of the lower limbs, diabetes, use of contraceptives, and cancer. The blood test, carried out in the first weeks after the end of treatment, provided counts of red and white blood cells, the number and mean volume of platelets, the size distribution of red cells, HDL and LDL cholesterol, triglycerides, and glucose, among other factors mentioned in the literature that may be related to thrombosis. This model, which the researcher calls the complete model, was followed by three variants, that is, by the development of three other mathematical equations.

In parallel, the author used the statistical technique called Principal Component Analysis, which makes it possible to select the most relevant factors within a data set. With this mathematical algorithm he reduced the number of factors to be considered from 39 to 18, which led him to propose a second, new equation, whose results agreed with those obtained with the complete model.
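One common way to use Principal Component Analysis for this kind of selection is to rank the factors by the size of their loadings on the leading component. The sketch below illustrates that idea on synthetic numbers, not on patient data, and is not necessarily the procedure followed in the study: the factor names, the sample size, and the use of power iteration to find the leading component are all assumptions made for the example.

```python
import math
import random


def standardize(rows):
    """Center each column at 0 and scale it to unit variance."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(k)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]


def covariance(centered_rows):
    """Covariance matrix of data that is already centered."""
    n, k = len(centered_rows), len(centered_rows[0])
    return [
        [sum(r[i] * r[j] for r in centered_rows) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def first_component(cov, iterations=500):
    """Approximate the leading eigenvector of cov by power iteration."""
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Synthetic data: factors a and b share a common signal,
# factors c and d are independent noise.
rng = random.Random(0)
names = ["factor_a", "factor_b", "factor_c", "factor_d"]
data = []
for _ in range(200):
    signal = rng.gauss(0, 1)
    data.append([
        signal + rng.gauss(0, 0.2),
        signal + rng.gauss(0, 0.2),
        rng.gauss(0, 1),
        rng.gauss(0, 1),
    ])

loadings = first_component(covariance(standardize(data)))
# Rank factors by the absolute size of their loading on the first component.
ranked = sorted(names, key=lambda name: abs(loadings[names.index(name)]), reverse=True)
print(ranked)
```

Factors that vary together dominate the leading component, so they rise to the top of the ranking, while factors that contribute little shared variance fall to the bottom and become candidates for removal.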

But because in medicine the distinction between provoked and spontaneous thrombosis is a watershed, Tiago set out to develop a third equation that took this variable into account, increasing the number of factors to 19. In this case, the results were equally compatible.

There was yet another problem. Among the factors considered were the levels of proteins S and C and of antithrombin, low levels of which indicate a tendency to thrombosis. Determining these levels is expensive, so the ideal would be a mathematical model that allows these three parameters to be excluded. The researcher achieved this with a fourth equation, which reduces the number of necessary factors to 16.

The four models proved equally efficient, allowing a patient's susceptibility to recurrent thrombosis to be predicted with practically 100% accuracy, a conclusion reached by comparing the predictions of the equations with what was actually observed in the 235 patients studied. In other words, the mathematical treatment matched what had actually happened to the patients treated at the Hemocentro over the last five years.
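Comparing predictions with what was actually observed comes down to counting agreements and disagreements between two lists of yes/no answers. The sketch below shows that calculation on six invented cases; the function name `evaluate` and the example values are hypothetical and do not come from the study's data.

```python
def evaluate(predicted, observed):
    """Compare yes/no predictions with observed outcomes, case by case."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # recurrence predicted and observed
    tn = sum(1 for p, o in pairs if not p and not o)  # no recurrence predicted or observed
    fp = sum(1 for p, o in pairs if p and not o)      # predicted but not observed
    fn = sum(1 for p, o in pairs if not p and o)      # observed but not predicted
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Invented example: True means recurrence, False means no recurrence.
predicted = [True, False, False, True, False, True]
observed = [True, False, False, True, False, False]

metrics = evaluate(predicted, observed)
print(metrics)
```

An accuracy of 100% corresponds to the case in which the two lists agree for every patient, so that there are no false positives and no false negatives.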

**Scope of the work**

Regarding the scope of the work, Tiago states: “Although we were successful, we still intend to monitor a new group of patients treated at the Hemocentro to carry out new checks so that this tool can be used safely as a predictive element. Furthermore, checks still need to be carried out in other populations to verify whether the equations need to be adjusted to different situations, since the artificial neural network is in itself an equation that can be adapted to each case.”

The researcher emphasizes that the mathematical equation associated with artificial neural networks has great potential for public use, especially in centers that need to work with lower costs, as it uses variables determined through blood tests. Furthermore, it addresses a disease that affects many people and whose recurrence is quite serious, because it can return more aggressively or appear in more sensitive sites, such as the lungs and the brain, where the consequences are more severe. In the future, the equation could be used in a simple spreadsheet editor or in a cell phone application that doctors themselves could use.