Finding self-organized criticality in collaborative work via repository mining (IWANN’2017)


I would like to draw your attention to three points:

  • Development teams eventually become complex systems, mainly in collaborative work environments.
  • Relations and collaborations take place through the environment.
  • Pattern mining and analysing social-based information is a complex problem.

Thus, our main objective was to study new methodologies for analysing patterns of collaboration in collaborative work environments, a complex problem that needs new tools for exploring and analysing relation-based data.

We also wanted to explore and analyse relation-based data, for example to answer the question “Do developers self-organize?”, and finally to contribute to open science tools and methodologies.

In Statistical Physics, criticality is defined as a type of behaviour observed when a system undergoes a phase transition. A state on the edge between two different types of behaviour is called the critical state, and in this state the system is at criticality.

A clear example is the sandpile model: each grain added to the pile increases, on average, the steepness of its slopes. Eventually the slopes may evolve to a critical state in which a single grain of sand may either settle on the pile or trigger an avalanche.
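To make the sandpile picture concrete, here is a minimal sketch of a toy sandpile on a square grid. It assumes the usual textbook toppling rule (a cell holding 4 grains gives one grain to each of its four neighbours, and grains falling off the edge are lost); the names `drop_grain`, `simulate` and `THRESHOLD` are ours, chosen only for illustration.

```python
import random

THRESHOLD = 4  # assumed rule: a cell holding this many grains topples


def drop_grain(grid, row, col):
    """Add one grain at (row, col), relax the pile, return the avalanche size.

    The avalanche size is the number of topplings triggered by this grain.
    Grains pushed over the edge of the grid are lost.
    """
    rows, cols = len(grid), len(grid[0])
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < THRESHOLD:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= THRESHOLD
        topplings += 1
        if grid[r][c] >= THRESHOLD:
            unstable.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                grid[nr][nc] += 1
                if grid[nr][nc] >= THRESHOLD:
                    unstable.append((nr, nc))
    return topplings


def simulate(size=10, grains=2000, seed=1):
    """Drop grains at random sites and return the list of avalanche sizes."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    return [drop_grain(grid, rng.randrange(size), rng.randrange(size))
            for _ in range(grains)]
```

Most grains simply settle (avalanche size 0), while a few trigger long chains of topplings. That mix of long quiet periods and sudden bursts is what we look for in the repositories.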

In this report we work with repositories of scientific papers: we examined 4 repositories in which papers are written collaboratively using GitHub. Only repositories with a certain “length”, more than 50 commits (changes), were chosen. We could thus analyse the changes to their files, looking for the existence of:

  • a scale-free structure
  • long-distance correlations
  • pink noise

Several macro measures extracted from the size of changes to the files in the repository were obtained:
1. Sequence of changes
2. Timeline of commit sizes
3. Change sizes ranked in descending order
4. Long-distance correlations
5. Presence of pink noise (1/f)
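As an illustration of where such a series of change sizes can come from, the sketch below parses the text printed by `git log --numstat --pretty=format:"commit %H %at"`. It is not the tooling used for this report (that lives in the repository linked at the end of the post). The definition of commit size as added plus deleted lines, and the function name `commit_sizes`, are assumptions made for this example.

```python
def commit_sizes(numstat_text):
    """Return commit sizes in chronological order (oldest first).

    Expects the output of:
        git log --numstat --pretty=format:"commit %H %at"
    Size is taken to be added + deleted lines (an assumption of this sketch).
    Binary files, which git reports as "-\t-\tpath", are ignored.
    """
    sizes = []
    for line in numstat_text.splitlines():
        if line.startswith("commit "):
            sizes.append(0)  # start counting a new commit
            continue
        parts = line.split("\t", 2)
        if len(parts) == 3 and sizes:
            added, deleted, _path = parts
            if added.isdigit() and deleted.isdigit():
                sizes[-1] += int(added) + int(deleted)
    sizes.reverse()  # git log lists the newest commit first
    return sizes
```

The resulting list can be plotted directly as the sequence of changes (1). Pairing it with the timestamps printed by `%at` would give the timeline (2).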

Looking at the sequence of changes and the timeline of commit sizes (1, 2), no particular “rhythm” can be seen, neither a daily one nor one in the changes themselves. Repositories can remain static for a long time and then suddenly experience a burst of changes (an avalanche), which is a symptom of an underlying self-organized critical state.

After plotting change sizes ranked in descending order (3), it can be seen that some authors send atomic changes, while others write whole paragraphs or sections before committing them. We can also see a tail corresponding to big changes made at the end (just before the paper is sent).
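The ranked plot can be sketched as follows. The helper names `rank_size` and `loglog_slope` are hypothetical, and fitting a straight line on log-log axes is only a rough first check: a roughly linear rank-size plot is suggestive of a scale-free structure, not proof of one.

```python
import math


def rank_size(sizes):
    """Return (rank, size) pairs, largest change first; empty changes are dropped."""
    ranked = sorted((s for s in sizes if s > 0), reverse=True)
    return list(enumerate(ranked, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(size) against log(rank)."""
    xs = [math.log(x) for x, _ in pairs]
    ys = [math.log(y) for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

For a list of commit sizes `sizes`, `loglog_slope(rank_size(sizes))` gives the exponent of the fitted power law. Zero-size changes are dropped because their logarithm is undefined.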

The long-distance correlation plots show that long-distance autocorrelations appear in different places depending on the repository, but they are present in most cases.
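A minimal sketch of how such correlations can be measured, assuming the series is the list of commit sizes in chronological order. The function names are ours, and the band of ±1.96/√n is the usual approximate 95% limit for a series with no correlation at all.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of the series with itself shifted by `lag` steps."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a constant series carries no correlation information
    covariance = sum((series[t] - mean) * (series[t + lag] - mean)
                     for t in range(n - lag))
    return covariance / variance


def significant_lags(series, max_lag):
    """Lags from 1 to max_lag whose autocorrelation leaves the ~95% white-noise band."""
    band = 1.96 / math.sqrt(len(series))
    return [lag for lag in range(1, max_lag + 1)
            if abs(autocorrelation(series, lag)) > band]
```

Significant values at large lags are what we call long-distance correlations: the size of a change is still related to the size of changes made many commits earlier.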

Finally, pink noise refers to any noise with a power spectral density of the form 1/f. For pink noise to be clearly present, the spectrum, plotted on log-log axes, should show a slope equal to -1. However, there is no clear downward trend. Maybe it would appear later on in development, or could be seen using repositories with a longer history. In any case, the fact that this third characteristic is not present does not obscure the other two, which appear clearly.
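The slope test can be sketched as below: compute the power at each positive frequency and fit a line to log(power) against log(frequency). This is an illustrative sketch with names of our own choosing (`periodogram`, `spectral_slope`), using a direct discrete Fourier transform that is only practical for short series.

```python
import cmath
import math


def periodogram(series):
    """Return (frequency, power) pairs for the positive frequencies of a real series.

    Uses a direct O(n^2) discrete Fourier transform on the mean-centred series.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    pairs = []
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        pairs.append((k / n, abs(coeff) ** 2 / n))
    return pairs


def spectral_slope(series):
    """Least-squares slope of log(power) against log(frequency)."""
    points = [(math.log(f), math.log(p)) for f, p in periodogram(series) if p > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx
```

A slope near -1 points to pink noise, while a slope near 0 would correspond to white noise. With only a few dozen commits the estimate is very noisy, which is one reason a longer history might help.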

In conclusion, after analysing several repositories used for writing scientific papers, we have shown that they are in a critical state, as (1) changes have a scale-free form, (2) there are long-distance correlations, and (3) pink noise has been detected (only in some cases).

For the sake of reproducibility, and because we support open science, both the programs and the data related to this report are available online in the repository “Measuring progress in literature and in other creative endeavours, like programming”:
http://github.com/JJ/literaturame

The slides used to present this work at the IWANN’2017 Congress are available at:
https://es.slideshare.net/pacvslideshare/finding-selforganized-criticality-in-collaborative-work-via-repository-mining


Workshop on Smart Cities and Mobility (Jornada sobre Smart Cities y Movilidad)

On Thursday, 26 November 2015, we held the Jornada sobre Smart Cities y Movilidad (Workshop on Smart Cities and Mobility) in the Sala de Usos Múltiples (multi-purpose room) of the CITIC-UGR (C/ Periodista Rafael Gómez, nº 2), as part of the Programa De Ayudas Genil Para Realización De Actividades Por Grupos De Investigación Interdisciplinares (RAGII-2015).

The focus of the workshop was research in the areas of mobility management, the Internet of Things and smart cities.

Over the course of the morning we attended several talks, given by heads of the Área de Movilidad (Mobility Department) of the Ayuntamiento de Granada (Granada City Council), by representatives of several companies, and by researchers from the Universidad de Granada working in this field.

The ultimate goals were to create synergy among the various research groups and companies in this area, and to facilitate contact with a view to promoting collaborations, such as applying for projects or transferring knowledge based on research results.

The workshop consisted of presentations of about 40 minutes each, in which speakers from the Área de Movilidad of the Ayuntamiento de Granada, Nazaríes, UXMobile, Geokeda, and the research groups discussed the projects they are currently working on in the area of smart cities, as well as the problems and challenges they face.
