# Alberto Riccardo Martinelli

PhD Student
Computer Science Department, University of Turin
Parallel Computing group
Via Pessinetto 12, 10149 Torino – Italy
E-mail: alberto.martinelli AT edu.unito.it

## Short Bio

Alberto Riccardo Martinelli is a PhD student in Computer Science at the University of Turin, where he also received his master’s degree in Computer Science with honors.

## Fields of interest

• Parallel computing
• Distributed computing
• High performance computing

## Publications

### 2021

• G. Agosta, W. Fornaciari, A. Galimberti, G. Massari, F. Reghenzani, F. Terraneo, D. Zoni, C. Brandolese, M. Celino, F. Iannone, P. Palazzari, G. Zummo, M. Bernaschi, P. D’Ambra, S. Saponara, M. Danelutto, M. Torquati, M. Aldinucci, Y. Arfat, B. Cantalupo, I. Colonnelli, R. Esposito, A. R. Martinelli, G. Mittone, O. Beaumont, B. Bramas, L. Eyraud-Dubois, B. Goglin, A. Guermouche, R. Namyst, S. Thibault, A. Filgueras, M. Vidal, C. Alvarez, X. Martorell, A. Oleksiak, M. Kulczewski, A. Lonardo, P. Vicini, F. L. Cicero, F. Simula, A. Biagioni, P. Cretaro, O. Frezza, P. S. Paolucci, M. Turisini, F. Giacomini, T. Boccali, S. Montangero, and R. Ammendola, “TEXTAROSSA: towards extreme scale technologies and accelerators for EuroHPC HW/SW supercomputing applications for exascale,” in Proc. of the 24th Euromicro Conference on Digital System Design (DSD), Palermo, Italy, 2021. doi:10.1109/DSD53832.2021.00051

To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.

@inproceedings{21:DSD:textarossa,
abstract = {To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.},
author = {Giovanni Agosta and William Fornaciari and Andrea Galimberti and Giuseppe Massari and Federico Reghenzani and Federico Terraneo and Davide Zoni and Carlo Brandolese and Massimo Celino and Francesco Iannone and Paolo Palazzari and Giuseppe Zummo and Massimo Bernaschi and Pasqua D'Ambra and Sergio Saponara and Marco Danelutto and Massimo Torquati and Marco Aldinucci and Yasir Arfat and Barbara Cantalupo and Iacopo Colonnelli and Roberto Esposito and Alberto Riccardo Martinelli and Gianluca Mittone and Olivier Beaumont and Berenger Bramas and Lionel Eyraud-Dubois and Brice Goglin and Abdou Guermouche and Raymond Namyst and Samuel Thibault and Antonio Filgueras and Miquel Vidal and Carlos Alvarez and Xavier Martorell and Ariel Oleksiak and Michal Kulczewski and Alessandro Lonardo and Piero Vicini and Francesco Lo Cicero and Francesco Simula and Andrea Biagioni and Paolo Cretaro and Ottorino Frezza and Pier Stanislao Paolucci and Matteo Turisini and Francesco Giacomini and Tommaso Boccali and Simone Montangero and Roberto Ammendola},
booktitle = {Proc. of the 24th Euromicro Conference on Digital System Design ({DSD})},
date-modified = {2021-09-04 12:23:41 +0200},
doi = {10.1109/DSD53832.2021.00051},
keywords = {textarossa, streamflow},
month = aug,
publisher = {IEEE},
title = {{TEXTAROSSA}: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale},
year = {2021}
}

• M. Aldinucci, V. Cesare, I. Colonnelli, A. R. Martinelli, G. Mittone, and B. Cantalupo, “Practical parallelization of a Laplace solver with MPI,” in ENEA CRESCO in the fight against COVID-19, 2021, pp. 21–24.

This work exposes a practical methodology for the semi-automatic parallelization of existing code. We show how a scientific sequential code can be parallelized through our approach. The obtained parallel code is only slightly different from the starting sequential one, providing an example of how little re-designing our methodology involves. The performance of the parallelized code, executed on the CRESCO6 cluster, is then exposed and discussed. We also believe in the educational value of this approach and suggest its use as a teaching device for students.

@inproceedings{21:laplace:enea,
abstract = {This work exposes a practical methodology for the semi-automatic parallelization of existing code. We show how a scientific sequential code can be parallelized through our approach. The obtained parallel code is only slightly different from the starting sequential one, providing an example of how little re-designing our methodology involves. The performance of the parallelized code, executed on the CRESCO6 cluster, is then exposed and discussed. We also believe in the educational value of this approach and suggest its use as a teaching device for students.},
author = {Aldinucci, Marco and Cesare, Valentina and Colonnelli, Iacopo and Martinelli, Alberto Riccardo and Mittone, Gianluca and Cantalupo, Barbara},
booktitle = {ENEA CRESCO in the fight against COVID-19},
optdoi = {},
pages = {21--24},
editor = {Francesco Iannone},
publisher = {ENEA},
title = {Practical Parallelization of a {Laplace} Solver with {MPI}},
opturl = {},
year = {2021},
keywords = {hpc4ai}
}

• M. Aldinucci, V. Cesare, I. Colonnelli, A. R. Martinelli, G. Mittone, B. Cantalupo, C. Cavazzoni, and M. Drocco, “Practical parallelization of scientific applications with OpenMP, OpenACC and MPI,” Journal of Parallel and Distributed Computing, vol. 157, pp. 13–29, 2021. doi:10.1016/j.jpdc.2021.05.017

This work aims at distilling a systematic methodology to modernize existing sequential scientific codes with a little re-designing effort, turning an old codebase into *modern* code, i.e., parallel and robust code. We propose a semi-automatic methodology to parallelize scientific applications designed with a purely sequential programming mindset, possibly using global variables, aliasing, random number generators, and stateful functions. We demonstrate that the same methodology works for the parallelization in the shared memory model (via OpenMP), message passing model (via MPI), and General Purpose Computing on GPU model (via OpenACC). The method is demonstrated parallelizing four real-world sequential codes in the domain of physics and material science. The methodology itself has been distilled in collaboration with MSc students of the Parallel Computing course at the University of Torino, that applied it for the first time to the project works that they presented for the final exam of the course. Every year the course hosts some special lectures from industry representatives, who present how they use parallel computing and offer codes to be parallelized.

@article{21:jpdc:loop,
abstract = {This work aims at distilling a systematic methodology to modernize existing sequential scientific codes with a little re-designing effort, turning an old codebase into \emph{modern} code, i.e., parallel and robust code. We propose a semi-automatic methodology to parallelize scientific applications designed with a purely sequential programming mindset, possibly using global variables, aliasing, random number generators, and stateful functions. We demonstrate that the same methodology works for the parallelization in the shared memory model (via OpenMP), message passing model (via MPI), and General Purpose Computing on GPU model (via OpenACC). The method is demonstrated parallelizing four real-world sequential codes in the domain of physics and material science. The methodology itself has been distilled in collaboration with MSc students of the Parallel Computing course at the University of Torino, that applied it for the first time to the project works that they presented for the final exam of the course. Every year the course hosts some special lectures from industry representatives, who present how they use parallel computing and offer codes to be parallelized. },
author = {Aldinucci, Marco and Cesare, Valentina and Colonnelli, Iacopo and Martinelli, Alberto Riccardo and Mittone, Gianluca and Cantalupo, Barbara and Cavazzoni, Carlo and Drocco, Maurizio},
date-modified = {2021-06-10 22:30:05 +0200},
doi = {10.1016/j.jpdc.2021.05.017},
journal = {Journal of Parallel and Distributed Computing},
keywords = {saperi},
pages = {13--29},
title = {Practical Parallelization of Scientific Applications with {OpenMP, OpenACC and MPI}},
url = {https://iris.unito.it/retrieve/handle/2318/1792557/770851/Practical_Parallelization_JPDC_preprint.pdf},
volume = {157},
year = {2021},
bdsk-url-1 = {https://iris.unito.it/retrieve/handle/2318/1792557/770851/Practical_Parallelization_JPDC_preprint.pdf},
bdsk-url-2 = {https://doi.org/10.1016/j.jpdc.2021.05.017}
}

### 2018

• C. Misale, M. Drocco, G. Tremblay, A. R. Martinelli, and M. Aldinucci, “PiCo: high-performance data analytics pipelines in modern C++,” Future Generation Computer Systems, vol. 87, pp. 392–403, 2018. doi:10.1016/j.future.2018.05.030

In this paper, we present a new C++ API with a fluent interface called PiCo (Pipeline Composition). PiCo’s programming model aims at making easier the programming of data analytics applications while preserving or enhancing their performance. This is attained through three key design choices: (1) unifying batch and stream data access models, (2) decoupling processing from data layout, and (3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to reuse the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.). Preliminary results show that PiCo, when compared to Spark and Flink, can attain better performances in terms of execution times and can hugely improve memory utilization, both for batch and stream processing.

@article{18:fgcs:pico,
abstract = {In this paper, we present a new C++ API with a fluent interface called PiCo (Pipeline Composition). PiCo's programming model aims at making easier the programming of data analytics applications while preserving or enhancing their performance. This is attained through three key design choices: (1) unifying batch and stream data access models, (2) decoupling processing from data layout, and (3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to reuse the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.). Preliminary results show that PiCo, when compared to Spark and Flink, can attain better performances in terms of execution times and can hugely improve memory utilization, both for batch and stream processing.},
author = {Claudia Misale and Maurizio Drocco and Guy Tremblay and Alberto R. Martinelli and Marco Aldinucci},
journal = {Future Generation Computer Systems},
}