P2P technology for computing tasks does not always mean P2P computing

March 30, 2009

What would you think of a Bugatti Veyron limited to a maximum speed of 20 Km/h?? mmmm… YES!! Give me back the money!!

… well, that’s roughly my feeling when I read a paper on P2P computing restricting the scalability analysis to some few nodes. Of course, the analogy is not completely fair since performing a real massively distributed (and decentralized) experiment presents some challenges that, in most of the cases so far, remain out of the scope of the state of the art research. That happens, for instance, to P2P environments applied to optimization, and more precisely to Evolutionary Computation.

Usually, you can find two different approches for P2P optimization testing, either using simulations or using a “GRID-like style” in real environments, each case presents its own advantanges and drawbacks:

  • Using a real P2P environment ( we performed a study like that using a 8×2 cluster). The adventage here is that the design has to face real restrictions, as communication or evaluation times, message passing restrictions or fault tolerance. Nevertheless, there are extreme difficulties to set a real and proper environment to test a model.

When you face a real environment, you find that:

  • Large number of resources are hard to grab
  • Scalability is hard to study. In the case of having few nodes, no real P2P computing is going on since no conclusions about large scalability can be drawn. On the other hand, if there are some good dozens of peers, other questions such as fault tolerance arise.

In the last friday paper seminar our team was discussing the following paper that uses the “grid-like style” for testing:

in which the authors propose a hybrid model combining islands with cellular EAs. Every peer holds an island and every island a cellular EA. As previously commented for grid-like testing, the scalability analysis is quite poor (up to 10 peers), additionally, the algorithm yields the best performance in the five peers’ scenario, pointing to a really poor scalability of the model. Nevertheless, the fact is that P-CAGE outperforms canonical EAs making of it a valid solution based on P2P technology, just that, such a solution is not really scalable and therefore, can not be really understood as P2P computing.

To conclude, I do think that simulations can benefit the understanding of complex interactions in P2P EAs at this stage of research, preventing situations as the one shown above, afterwards, there will be time to validate the models in real environments, letting that theory meets practice.


Pervasive evolutionary algorithms on mobile devices

March 25, 2009

Last week I received the paper aceptance notification for the DCAI conference in Salamanca. This time we are going to present a distributed algorithm framework for mobile devices using Bluetooth. Ok, today I am quite lazy (today, and all the days, actually), so I think the best idea is to copy the abstract.

Abstract. This work presents a Java framework which allows to implement easily connectivity applications via Bluetooth. Nowadays it is difficult to program Bluetooth devices, so it is necessary to use a high-level Application Programming Interface (API) to make easy the creation of applications in Java ME and Java SE platforms, the most extended ones. As a solution we show the development of a distributed computing environment using a layered, client-server, and event-based with asynchronous communication architecture. In addition we solve two well-known evolutionary computation problems (the Traveler Salesman Problem and the Wave Function Problem), as an example of use.

The most interesting thing is that we have used real mobiles in order to execute the experiments, with all associated problems. It is difficult to find a compatible mobile phone with a Bluetooth stack that works properly. Even is not easy communicate two phones of the same fabricant but different model! But finally we managed to start the experiment, as you can see in the next photo.

Two Nokias executing our distributed algorithm. Photo by DraXus.

You can download the draft from here.


Immigrants do all the work

March 16, 2009

Genetic Agorithms with Memory- and Elitism-based Immigrants in Dynamic Environments was the paper selected for presentation and discussion in last Friday’s seminar. The article, authored by Shengxiang Yang, was published in the last Fall edition of Evolutionary Computation and addresses evolutionary optimization in dynamic environments.

Yang proposes two Genetic Algorithms (GAs) for dynamic optimization based on Grefenstette’s classical Random Immigrants GA (RIGA). RIGA tackles (or tries to) changing optima by inserting, in each and every generation, a certain number of randomly generated individuals that replace the worst individuals in the population (or randomly selected individuals, in another version). This way, genetic novelty is constantly being introduced and the population is expected to have enough diversity to react to changes in the environments. However, RIGA suffers from a major “weakness”: the raw building-blocks introduced by the randomly generated individuals are quickly removed from the population because their fitness is usually below average. RIGA is frequently chosen as a peer-algorithm for comparison purposes in evolutionary dynamic optimization studies, but, due to this “weak spot”, it may be questioned if a Standard GA is not better suited to assess the efficiency of a new method (in fact, studies currently being developed in our lab reinforce this hypothesis). In order to improve RIGA’s performance, several alternative RIGA-based methods have been proposed in the past few years.

The two GAs described in Yang’s paper try to overcome the problem with random immigrants by inserting in the population mutated copies of the elite — Elitism-based Immigrants Genetic Algorithm (EIGA) — or mutated copies of the chromosomes kept in a memory — Memory-based Immigrants Genetic Algorithm (MIGA). Memory-based approaches for dynamic optimization use a memory to store good solutions, which are retrieved later, periodically, or when the environments changes. Memory GAs are known to improve traditional GAs performance when the dynamics of changes are cyclic, that is, the fitness function returns to previous “shapes” from time to time; on the other hand, memory schemes are not that effective when the changes are not periodic. Therefore, and as expected, MIGA outperforms other GAs when the changes are cyclic. EIGA is better when the changes in the environment are not severe. This behaviour is explained by the fact that introducing mutated copies of the best individual in the population provides the GA with means to tackle small changes because the algorithm is maintaining a kind of sub-population around the optimal solution, and small shifts in the environment are easily traceable by those near-optimal individuals.

Summarizing, the study shows that MIGA and EIGA are able to outperform other GAs under the conditions of the test set. However, there is one question that remains unanswered: what happens when changing the parameter values? For instance, diversity maintenance schemes for dynamic optimization deal with non-stationary environments by maintaining the diversity at a higher level. This means that maybe the optimal mutation probability of these algorithms is different from those of Standard GAs. Shouldn’t a proper comparison between the algorithms consider a range of possible mutation probabilities (Yang’s studies used the traditional pm = 1/l, where l is the chromosome length)? And what about population size? Isn’t population size the major bottleneck for GAs’ performance in stationary environments? Is it possible that a variation in the population size of the GA for dynamic optimization conduces the research to different conclusions? Studies currently under way will try to answer to some of these questions.


Genetic algorithm with rough set theory

March 12, 2009

Last Friday we discussed the paper “The generic genetic algorithm incorporates with rough-set theory – An application of the web services composition” of Liang and Huang.  This is the standard mixture paper: a kind of algorithm + another soft computing technique + an application of the real world = a complete paper. I am not kidding, I think that combining several techniques, and more important, using them in real applications, should be a basis of research.

Rought set theory provides a way to create a set of decission rules that can be selected in every problem with functional requirements. For example in the extensive area of web services composition. We can provide this information to a genetic algorithm to compose services avoiding constrained solutions and initial population using that decission rules. The authors also use non-functional requirements, such QoS, cost or avilability in the fitness function. They conclude that the usage of rough set in a GA could increase the convergence (but it is necessary to keep some unfeasible solutions during the process, just in case).

It is a easy-to-read paper, so probably you would like to read it instead my summary ;) Moreover, they present some ideas in the web service composition area.


So you want a summer internship in Granada, Spain

February 23, 2009

The GeNeura team is composed by an international team of doctors and graduate students, mainly focused in bioinspired algoritms such as ant colony optimization algorithms, also in multiobjective versions, distributed and sequential EAs applied to neural net training, along with other applications.
So, if you have knowledge of

  • Evolutionary algorithms, or
  • neural nets, or
  • P2P or parallel or distributed computing and
  • Java, Perl, Ruby, Javascript or Python, and
  • your own source of funding, such as Erasmus, a Foreign Ministry scholarship, or anything like that

please send your resumé to JJ Merelo, specifying what you request from us, and the support you will have, and we’ll be back to you ASAP.


KohonAnts Slides (ALIFE XI)

February 17, 2009

Hello again to everyone!

These are the slides of the presentation of KohonAnts algorithm in ALIFE XI conference. ;)

It is an hybrid Ant Colony and Self-organizing Map algorithm for clustering and pattern classification.

A bit late, but I had some troubles with slideshare…

In any case…….Enjoy it. ;) :D

Here you can see an example of the evolution of the ants in the grid for the IRIS dataset:

Ants movement in the toroidal grid

Ants movement in the toroidal grid

Green class is quite similar to the other two classes, so it is difficult to get a fine cluster with it.

Thanks to Dave Oranchak. ;)


Decentralized Genetic Algorithms and Multiobjective Optimization

February 9, 2009

Last friday we were discussing the issues of decentralization and multiobjective optimization in Genetic Algorithms, in more detail:

  • Decentralizing GAs is a radical way of parallelizing GAs in which the parallel infraestructure is devoid of any central server. Hence, individuals have no global but local information provided by the neighbourhood.
  • Multiobjective GAs are GAs optimizing several conflicting objectives. In such a way the output of the GA is not a single but a set of solutions (usually named pareto-front) and it is based on some criterion of dominance.

Matching both issues is not straightforward since inserting a new solution within the set of solutions requires of a global comparison through the pareto-front and, talking about a decentralized GA, the pareto-front should be decentralized in advance. The consequence is that there are few works tackling such challenge. One of them is the paper A Cellular Genetic Algorithm for Multiobjective Optimization by Nebro et al.which can be considered as an step towards, although not a definitive solution to decentralized multiobjective optimization. The authors propose:

  • A decentralized GA: the Cellular Genetic Algorithm in which every individual is a single process.
  • An external archive (and centralized) to store the non-dominated solutions.
  • A simulation of the parallel environment.

In spite of the drawback of having a centralized archive which does not allow a real parallel/decentralized execution, results show to be algorithmically competitive against the state of the art algorithms NSGA-II and SPEA2. Therefore, one of the possible reading from the paper is that decentralized GAs are promising, although there is no a pure decentralized GA for Multiobjective optimization yet.


The Quest for the Holy Grail

February 2, 2009

Last week’s seminar was focused on Genetic Algorithms with varying population size. Like other GA’s parameters, population size was the subject of some studies in 1990s that aimed at exploring the established notion that different stages of the search require populations with different sizes. In general, we may say that larger populations are needed in the beggining of the search, and as the search proceeds towards the end, the GA is able to work with fewer individuals. One of the main objectives of the research on dynamic population GAs was to eliminate the population size parameter and design a GA that could adapt the size independently of the number of individuals in the first generation. That way, we could not only speed up the search process but also diminish the tuning effort. (On the other hand, improvements on scalability are, in my opinion, impossible to achieve with just a plain dynamic scheme.)

Some schemes only aim at overcoming the hard task of determining an optimal population size, and they do that by mimicking the process of tuning the population size by a GA’s expert (see also bisection method, K. Sastry). The parameter-less GA (Lobo et al.) works that way by evolving simultaneous and increasingly larger populations until the convergence criterion is met. This method does not speed up a standard GA. Instead, it may slow it down, but the worst-case scenario is of such a magnitude that is well worth using the parameter-less GA to avoid the tuning of the initial population size. Other techniques try to speed up the algorithms by dynamically changing the population size during the run, in a deterministic or adaptive manner.

Back in 1994, the authors (Arabas et al.) of the Genetic Algorithm with Varying Population Size (GAVaPS) claimed that the algorithm was able to adapt its population size during the run, according to the average fitness of the population and to the quality of the best or worst solutions. However, further studies could no replicate that described behaviour. Instead, GAVaPS population easily evolved towards extinction or experienced demographic burst that would increase its size and computational effort. The Adaptive Population Size GA (APGA) (Eiben et al.) tries to overcome this problem by engaging in a traditional steady state reproduction, that is, the population only generates two offspring in each generation. Although the authors presented a study were APGA outperforms several dynamic population sizing methods, F. Lobo and C. Lima later demonstrated that the algorithm does not really adapts its population size, but instead it remains below a specific values after a certain number of generation (In addition, they show that the test set were APGA apparently excels is probably not suited to evaluate GAs with dynamic population).

PRoFIGA (Valkó) is an example of deterministic control of the population (GAVaPS and APGA are adaptive). Unfortunately, it is also an example of an extremely complex algorithm when it comes to tuning it. PROFIGA removes the population size from the usual set of parameters, but it adds six (!) parameters. One of the main guidelines to parameter control is: make it simple. If we replace a parameter by another one (or more!), then the first objective – simplify the tuning for a non-expert user – is not met.

In 2001 (at GECCO), De Jong identified some of the open problems in Evolutionary Computation. Dynamic populations was one of them. Since then, not much has been done that we could consider as major breakthrough. In fact, varying population size is a serious candidate for the title of Evolutionary Computation Holy Grail! Take for instance the self-adaptation of the optimal population size. If the algorithm starts with a smaller population than the required, it has to grow its size, but it has to add genetic diversity along the way, because the initial genetic pool will no be able to converge to a solution (if it did, then the optimal population size would be expected one, but instead a smaller population, of course). Is it possible to design an effective dynamic scheme (that is, a scheme that could start with a population with random size, an then adapt in the first generations to the proper size)? Or should we start considering other applications for dynamic population GAs, and abandon the quest the for a GA which is independent of the initial population size?


Pherographia in SIGEvolution

January 28, 2009

Our partner Carlos Fernandes has published an article on the latest number of SIG evolution, called A Camera Obscura for Ants. This article describes from the scientific point of view his Pherography method, that uses ant colony algorithms for creating curious and nice effects on photography. A bit like our Kohonants, but without the Kohonen part.


How to find Wynona Ryder’s images in a content-based image system

January 21, 2009

Last Friday I presented the paper “PicSOM – content-based image retrieval with self-organizing maps ” of Laaksonen et al. This authors uses T-SOMS (Tree-based SOMs) to classify images based in their content: using distinct measure types, like colour, sFFT (shape Fast Fourier Transform) and others several SOM maps are created (one per measure). The solution to how to give weight to that measures is simple: the user feedback. A set of images is presented to the user in a web-based application, so he can select the interested ones (example: Wynona Ryder’s face) to obtain the next set of images (an example can be seen in Figure 1). The algorithm learns with the selected images and gives weight to select the images of an specific map (in our example, shape measures are more important than colour measure).

winona

Figure 1: Winona Ryder’s face in PicSom interface

To test the performance of every map and the global performance of the whole system the authors use sumatory things to stablish if the selected images belongs to a determined class (i.e. Planes, dinosaurs of faces). The conclusion is that it is necessary to use the whole set of measures at the same time to acquire the best performance. But the most interesting part is that you can read the paper and test the algorithm by yourself, a not so common practice.