A data-mining based process to early identify breast cancer from metabolomic data

Abstract of our work presented at EURO 2018, the largest and most important conference for Operational Research, co-authored by Víctor M. Rivas Santos jointly with researchers from the Complejo Hospitalario de Jaén and Fundación Medina.

This paper was presented on 9 July 2018 in Valencia, as part of the stream Data Mining and Statistics.


We present the results obtained by our multidisciplinary group in the task of discriminating between blood samples from breast cancer patients and those from healthy people. The models used to classify samples have been built using data mining techniques; the data have been collected by means of liquid chromatography-mass spectrometry, a technique that detects and quantifies the metabolites present in blood samples.

Different algorithms have been tested under 10-fold cross-validation (10-CV) and 75/25 train/test split scenarios. Our experiments showed that IBk, J48, and Logistic Model Trees yielded rates greater than 90% only for healthy people. Naive Bayes and Random Forest improved on those results in the 10-CV approach, but they did not yield more than 85% of true positives for patients in the 75/25 one. Finally, the Bayesian network turned out to be the best algorithm, as it yielded rates greater than 90% for both patients and healthy people.

Many statistics, as well as confusion matrices, have been computed, showing that the model built by the Bayesian network can effectively be used to solve this problem. Currently, the metabolites used to build the model are being identified by biochemists. This last step will be decisive for considering them a valid biomarker for breast cancer.
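
To make the evaluation vocabulary above concrete, here is a minimal Python sketch of how 10-fold splits, a confusion matrix, and per-class rates can be computed. It is an illustration only: the function names are hypothetical, the labels are made up and are not data from the study, and the folds are neither shuffled nor stratified, which a real experiment would normally do.

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def confusion_matrix(actual, predicted, labels):
    """Return counts[a][p]: samples of true class a predicted as class p."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return counts


def per_class_rate(matrix):
    """Fraction of each true class that was classified correctly."""
    rates = {}
    for label, row in matrix.items():
        total = sum(row.values())
        if total:
            rates[label] = row[label] / total
    return rates


# Made-up labels for illustration only; NOT results from the paper.
actual = ["patient"] * 10 + ["healthy"] * 10
predicted = ["patient"] * 9 + ["healthy"] + ["healthy"] * 8 + ["patient"] * 2

matrix = confusion_matrix(actual, predicted, ["patient", "healthy"])
rates = per_class_rate(matrix)
print(matrix)
print(rates)
```

The per-class rate is the figure the abstract refers to when it reports rates "for patients" and "for healthy people" separately: a classifier can score well overall while still missing too many patients.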


On Volunteer-Computing and Self-driving car fuzzy controllers in sunny Cádiz

Every two years, the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU) brings together the most important researchers in the area of uncertainty and fuzzy systems. As I am working in Cádiz, it was a great opportunity to present some of the latest work that the GeNeura group has developed.

The first of these has been developed together with the Technical Institute of Tijuana and describes the social behaviour of users of a volunteer computing system. It is very interesting to discover how the use of a leaderboard makes users spend more time collaborating. Take a look at the presentation:

Mario García Valdez, Juan Julián Merelo Guervós, Lucero Lara, Pablo García-Sánchez:
Increasing Performance via Gamification in a Volunteer-Based Evolutionary Computation System. IPMU (3) 2018: 342-353

Here is the abstract:

Distributed computing systems can be created using volunteers, users who spontaneously, after receiving an invitation, decide to provide their own resources or storage to contribute to a common effort. They can, for instance, run a script embedded in a web page; thus, collaboration is straightforward, but also ephemeral, with resources depending on the amount of time the user decides to lend. This implies that the user has to be kept engaged so as to obtain as many computing cycles as possible. In this paper, we analyze a volunteer-based evolutionary computing system called NodIO with the objective of discovering design decisions that encourage volunteer participation, thus increasing the overall computing power. We present the results of an experiment in which a gamification technique is applied by adding a leader-board showing the top scores achieved by registered contributors. In NodIO, volunteers can participate without creating an account, so one of the questions we wanted to address was if the need to register would have a negative impact on user participation. The experiment results show that even if only a small percentage of users created an account, those participating in the competition provided around 90% of the work, thus effectively increasing the performance of the overall system.


The second work continues our previous research: it uses an evolutionary algorithm to optimize the parameters of a fuzzy controller that drives a car in the TORCS video game. We have been collaborating with Mohammed Salem of the University of Mascara along this line for a while.

Mohammed Salem, Antonio Miguel Mora, Juan Julián Merelo Guervós, Pablo García-Sánchez: Applying Genetic Algorithms for the Improvement of an Autonomous Fuzzy Driver for Simulated Car Racing. IPMU (3) 2018: 236-247

Games offer a suitable testbed where new methodologies and algorithms can be tested in a near-real life environment. For example, in a car driving game, using transfer learning or other techniques, results can be generalized to autonomous driving environments. In this work, we use evolutionary algorithms to optimize a fuzzy autonomous driver for the open simulated car racing game TORCS. The Genetic Algorithm applied improves the fuzzy systems to set an optimal target speed as well as the instantaneous steering angle during the race. Thus, the approach offers an automatic way to define the membership functions, instead of a manual or hill-climbing descent method. However, the main issue with this kind of algorithm is to define a proper fitness function that best delivers the desired result, which is eventually to win as many races as possible. In this paper we define two different evaluation functions, and prove that fine-tuning the controller via evolutionary algorithms robustly finds good results and that, in many cases, they are able to play very competitively against other published results, with a more reliable approach that needs very few parameters to tune. The optimized fuzzy-controllers (one per fitness) yield a very good performance, mainly in tracks that have many turning points, which are, in turn, the most difficult for any autonomous agent. Experimental results show that the enhanced controllers are very competitive with respect to the embedded TORCS drivers, and much more efficient in driving than the original fuzzy-controller.
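
As a rough illustration of what "defining the membership functions" means, the Python sketch below computes a target speed from triangular fuzzy sets. Everything in it is an assumption made for the example: the single input (free distance ahead), the three sets, the numbers, and the weighted-average defuzzification are hypothetical and are not taken from the paper. The point is only that the tuples in `rules` are the kind of parameters a genetic algorithm could tune instead of a person.

```python
def triangular(x, left, peak, right):
    """Degree of membership (0..1) of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def target_speed(distance, rules):
    """Weighted average of rule outputs, weighted by membership degree.

    Each rule is (left, peak, right, speed): a triangular set over the
    distance ahead plus the speed suggested when that set fully applies.
    """
    weights = [triangular(distance, l, p, r) for l, p, r, _ in rules]
    total = sum(weights)
    if total == 0:
        # No rule fires: fall back to the most cautious speed.
        return min(speed for _, _, _, speed in rules)
    return sum(w * rule[3] for w, rule in zip(weights, rules)) / total


# Hypothetical "near", "medium" and "far" sets; a GA genome could simply
# be the flattened list of these twelve numbers.
rules = [
    (0, 20, 60, 50),
    (20, 60, 120, 120),
    (60, 120, 200, 250),
]

print(target_speed(40, rules))
```

With such an encoding, a mutation only needs to nudge one of the numbers, and the fitness function decides whether the resulting driver is better.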


Self-organized criticality in code repositories

The GeNeura team is spread all over the world, and Dr. Juanlu Jiménez is in Le Havre as an associate professor. He has been kind enough to invite us for a visit, and here is the presentation we gave there.

Equipe Réseaux d’interactions et Intelligence Collective

During the last two weeks, we have been enjoying the visit of JJ Merelo to the Ri2C team. On May 19th, he delivered a seminar entitled Self-organized criticality in code repositories, of which you can find the abstract and the presentation below.


It has been known for some time that work in code repositories tends to self-organize, and is possibly in a self-organized critical state. What was not known are the conditions for this to happen, and what kind of description of the repository is needed to find these properties. In this talk we describe how a self-organized critical state has been found in a wide variety of repositories, whether or not they contain code.

The slides of the presentation are available at: https://jj.github.io/soc-code-repos/#/


Creating Hearthstone decks by using Genetic Algorithms

I’m glad you’re here, friend! There’s a chill outside, so pull up a chair by the hearth of our inn and prepare to learn how the Ancient Gods use the power of the secret and ancient branch of Evolution to generate Hearthstone decks by means of magic and mystery!!


Several months ago, my colleague Alberto Tonda and I were discussing our latest adventures playing the Digital Collectible Card Game Hearthstone, when one of us said “Uhm, Genetic Algorithms usually work well with combinatorial problems, and solutions are usually a vector of elements. Elements such as cards. Such as cards of Hearthstone, the game we are playing right now while we are talking. Are you thinking what I’m thinking?”

Five minutes later we had found an open-source Hearthstone simulator and started to think about how to automatically evolve Hearthstone decks.

The idea is quite simple: Hearthstone is played using a deck of 30 cards (from a pool of thousands available), so it is easy to model the candidate solution. With the simulator, we can perform several matches using different enemy decks, and obtain the number of victories. Therefore, we have a number that can be used to model the performance (fitness) of the deck.
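
The encoding just described can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `play_match` stands in for whatever the real simulator exposes, `toy_match` is a meaningless placeholder so the example runs, the card names and pool size are invented, and deck-building rules (such as limits on repeated cards) are ignored.

```python
import random

DECK_SIZE = 30  # a Hearthstone deck has 30 cards


def random_deck(card_pool, rng):
    """A candidate solution: a list of DECK_SIZE cards drawn from the pool."""
    return [rng.choice(card_pool) for _ in range(DECK_SIZE)]


def fitness(deck, enemy_decks, play_match, matches_per_enemy=4):
    """Number of victories obtained against every enemy deck.

    play_match(deck, enemy) must return True when `deck` wins; here it is
    a hypothetical hook standing in for a call to the game simulator.
    """
    return sum(
        1
        for enemy in enemy_decks
        for _ in range(matches_per_enemy)
        if play_match(deck, enemy)
    )


def toy_match(deck, enemy_deck):
    """Placeholder rule, NOT game logic: more distinct cards wins."""
    return len(set(deck)) > len(set(enemy_deck))


rng = random.Random(42)
card_pool = [f"card_{i}" for i in range(200)]  # invented card names
enemies = [random_deck(card_pool, rng) for _ in range(3)]
candidate = random_deck(card_pool, rng)
print(fitness(candidate, enemies, toy_match))
```

Because real matches involve randomness, playing several matches per enemy deck, as `matches_per_enemy` does here, gives a less noisy fitness value than a single game.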

Soooo, it’s easy to see: one and one makes two, two and one makes three, and it was destiny that we created a genetic algorithm that generates decks for Hearthstone for free.

Our preliminary results were discussed here, but we wanted to continue testing our method, so we tested it using all available classes of the game, with the help of JJ, Giovanny and Antonio. All the best human-made decks were outperformed by our approach! And not only that, we applied a new operator called Smart Mutation that is based on what we do when we test new decks in Hearthstone: we remove a card and put another in its place, but one that costs +/-1 mana crystals, and not a completely random one from the pool. The results were even better. Neat!
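
A possible reading of the Smart Mutation idea, sketched in Python. The details are assumptions: "+/-1 mana crystals" is interpreted here as "any other card whose cost differs by at most one", the card names and costs are invented, and the operator in the paper may differ.

```python
import random


def smart_mutation(deck, mana_cost, rng):
    """Return a copy of the deck with one card swapped for a similar-cost card.

    mana_cost maps every card in the pool to its cost. The replacement is
    chosen among cards whose cost is within 1 of the removed card's cost,
    rather than uniformly from the whole pool.
    """
    mutated = list(deck)
    index = rng.randrange(len(mutated))
    old_card = mutated[index]
    candidates = [
        card
        for card, cost in mana_cost.items()
        if card != old_card and abs(cost - mana_cost[old_card]) <= 1
    ]
    if candidates:  # leave the deck unchanged if nothing similar exists
        mutated[index] = rng.choice(candidates)
    return mutated


# Invented cards and costs, for illustration only.
mana_cost = {"card_a": 1, "card_b": 2, "card_c": 3, "card_d": 7}
deck = ["card_a"] * 30
mutated = smart_mutation(deck, mana_cost, random.Random(1))
changed = [i for i in range(len(deck)) if mutated[i] != deck[i]]
print(changed, [mutated[i] for i in changed])
```

Keeping the replacement close in cost tends to preserve the deck's mana curve, which is what a human player usually does when swapping a single card.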

Maybe you prefer to read the abstract, which is written in a more formal way than this post. You know, using the language of science.

Collectible card games have been among the most popular and profitable products of the entertainment industry since the early days of Magic: The Gathering in the nineties. Digital versions have also appeared, with HearthStone: Heroes of WarCraft being one of the most popular. In Hearthstone, every player can play as a hero, from a set of nine, and build his/her deck before the game from a big pool of available cards, including both neutral and hero-specific cards.
This kind of game offers several challenges for researchers in artificial intelligence since it involves hidden information, unpredictable behaviour, and a large and rugged search space. Besides, an important part of player engagement in such games is a periodical input of new cards in the system, which mainly opens the door to new strategies for the players. Playtesting is the method used to check the new card sets for possible design flaws, and it is usually performed manually or via exhaustive search; in the case of Hearthstone, such test plays must take into account the chosen hero, with its specific kind of cards.
In this paper, we present a novel idea to improve and accelerate the playtesting process, systematically exploring the space of possible decks using an Evolutionary Algorithm (EA). This EA creates HearthStone decks which are then played by an AI versus established human-designed decks. Since the space of possible combinations that are play-tested is huge, search through the space of possible decks has been shortened via a new heuristic mutation operator, which is based on the behaviour of human players modifying their decks.
Results show the viability of our method for exploring the space of possible decks and automating the play-testing phase of game design. The resulting decks, which have been examined for balancedness by an expert player, outperform human-made ones when played by the AI; the introduction of the new heuristic operator helps to improve the obtained solutions, and basing the study on the whole set of heroes shows its validity through the whole range of decks.

You can download the complete paper from the journal Knowledge-Based Systems: https://www.sciencedirect.com/science/article/pii/S0950705118301953

See you in future adventures!!!

A better TORCS driving controller presented at EvoStar 2018

Amazing bench
Last year, together with Mohammed Salem, from the University of Mascara in Algeria, we presented our TORCS driving controller. This controller effectively drives a simulated vehicle, taking input from its sensors and deciding on a target speed and how to turn the steering wheel.
Poster session, with our poster in the first position
This year, at EvoStar 2018 in Parma, our paper was again accepted for the poster session, which took place in the incredible corridor shown in the picture. The poster included interactive elements, such as a small car used to demonstrate how the driver worked.

And it works really well, or at least better than the previous versions. The key element was the design of a new fitness function that includes damage, as well as terms related to speed. There is still some way to go; in the near future we will be posting our new results in this area.

The book of proceedings can be downloaded from Springer. Our paper is on page 342 and you can also download just the paper from here, but we do open science, so you can follow our writing process and download the paper from this GitHub repository too.



Detection and prediction of flows of people and vehicles

As part of the CIMAS 21 conference, which will be held in Granada, I will give a presentation on the possibilities of our WiFi and Bluetooth frame detection system, which we have already talked about several times.

The presentation will focus on the more analytical aspects of the platform, looking at the possibilities it can offer a tourist destination with an emphasis on sport.



Early prediction of the outcome of Starcraft Games

As a result of Antonio Álvarez Caballero’s master’s thesis, we’ll be presenting a poster on the early prediction of Starcraft games tomorrow at the IJCCI 2017 conference.
The basic idea behind this line of research is to try and find a model of the game so that we can do fast fitness evaluation of strategies without playing the whole game, which can take up to 60 minutes. That way, we can optimize those strategies in an evolutionary algorithm and find the best ones.
In our usual open science style, paper and data are available in a repository.
Our conclusions say that we might be able to pull that off, using the k-nearest neighbor algorithm. But we might have to investigate a bit further if we really want to find a model that gives us some insight into what makes a strategy a winner.
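
For readers unfamiliar with it, k-nearest neighbor prediction can be sketched in a few lines of Python. The features below (two made-up numbers describing an early-game snapshot) and the data are invented for illustration; they are not the features or data used in the paper.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest samples.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented early-game snapshots: (feature_1, feature_2) -> final outcome.
train = [
    ((10, 9), "win"),
    ((8, 8), "win"),
    ((9, 10), "win"),
    ((2, 3), "loss"),
    ((3, 1), "loss"),
    ((1, 2), "loss"),
]

print(knn_predict(train, (9, 8)))
print(knn_predict(train, (2, 2)))
```

This also shows why the model gives little insight: the prediction comes from stored examples rather than from an explicit description of what makes a strategy win.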


Dark clouds allow early prediction of heavy rain in Funchal, near where IJCCI is taking place