"Reproducibility in Data Sciences: What, Why, and How"
One of the key aspects of modern technological research is its reliance on personal computers (PCs), either for the simulation of known phenomena or for the evaluation of data collected from natural observations. Mashups of these data, organized in tables and figures, are attached to textual descriptions, leading to scientific publications. In current practice, the data sets, code, and executable software that produced those results are left out when articles are recorded and preserved.
This situation slows down scientific development in at least two major ways: (1) re-using ideas from different sources normally requires re-developing the software that led to the original results, and (2) the review of candidate ideas is based on trust rather than on hard, verifiable evidence that can be thoroughly analyzed. In this talk, I'll discuss Reproducible Research (RR) for scientists and engineers working with software applications in Pattern Recognition (PR) and Machine Learning (ML). I'll motivate and explain the concepts behind RR, a growing trend in scientific publications in this niche, along with its implications and the tools for implementing it at an individual or group level.