Alberto Riccardo Martinelli

PhD Student
Computer Science Department, University of Turin
Parallel Computing group
Via Pessinetto 12, 10149 Torino – Italy
E-mail: alberto.martinelli AT edu.unito.it

Short Bio

Alberto Riccardo Martinelli is a Ph.D. student in Computer Science at the University of Turin, where he also received his master’s degree with honors in Computer Science.

Fields of interest:

  • Parallel computing
  • Distributed computing
  • High-performance computing

Publications

2023

  • J. Garcia-Blas, G. Sanchez-Gallegos, C. Petre, A. R. Martinelli, M. Aldinucci, and J. Carretero, “Hercules: scalable and network portable in-memory ad-hoc file system for data-centric and high-performance applications,” in Euro-Par 2023: Parallel Processing, Cham, 2023, pp. 679–693.

    The growing demands for data processing by new data-intensive applications are putting pressure on the performance and capacity of HPC storage systems. Advancements in storage technologies, such as NVMe and persistent memory, are aimed at meeting these demands. However, relying solely on ultra-fast storage devices is not cost-effective, leading to the need for multi-tier storage hierarchies to move data based on its usage. To address this issue, ad-hoc file systems have been proposed as a solution. They utilise the available storage of compute nodes, such as memory and persistent storage, to create a temporary file system that adapts to the application behaviour in the HPC environment. This work presents the design, implementation, and evaluation of a distributed ad-hoc in-memory storage system (Hercules), highlighting the new communication model included in Hercules. This communication model takes advantage of the Unified Communication X (UCX) framework. This solution leverages the capabilities of RDMA protocols, including InfiniBand, Omni-Path, shared memory, and zero-copy transfers. The preliminary evaluation results show excellent network utilisation compared with other existing technologies.

    @inproceedings{10.1007/978-3-031-39698-4_46,
    abstract = {The growing demands for data processing by new data-intensive applications are putting pressure on the performance and capacity of HPC storage systems. Advancements in storage technologies, such as NVMe and persistent memory, are aimed at meeting these demands. However, relying solely on ultra-fast storage devices is not cost-effective, leading to the need for multi-tier storage hierarchies to move data based on its usage. To address this issue, ad-hoc file systems have been proposed as a solution. They utilise the available storage of compute nodes, such as memory and persistent storage, to create a temporary file system that adapts to the application behaviour in the HPC environment. This work presents the design, implementation, and evaluation of a distributed ad-hoc in-memory storage system (Hercules), highlighting the new communication model included in Hercules. This communication model takes advantage of the Unified Communication X (UCX) framework. This solution leverages the capabilities of RDMA protocols, including InfiniBand, Omni-Path, shared memory, and zero-copy transfers. The preliminary evaluation results show excellent network utilisation compared with other existing technologies.},
    address = {Cham},
    author = {Garcia-Blas, Javier and Sanchez-Gallegos, Genaro and Petre, Cosmin and Martinelli, Alberto Riccardo and Aldinucci, Marco and Carretero, Jesus},
    booktitle = {Euro-Par 2023: Parallel Processing},
    editor = {Cano, Jos{\'e} and Dikaiakos, Marios D. and Papadopoulos, George A. and Peric{\`a}s, Miquel and Sakellariou, Rizos},
    isbn = {978-3-031-39698-4},
    pages = {679--693},
    publisher = {Springer Nature Switzerland},
    title = {Hercules: Scalable and Network Portable In-Memory Ad-Hoc File System for Data-Centric and High-Performance Applications},
    year = {2023}
    }
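
    Interception note: ad-hoc file systems of this kind are typically slotted under an unmodified application by intercepting its POSIX I/O calls. The sketch below illustrates only that general interception mechanism, using LD_PRELOAD and a hypothetical /adhoc/ path prefix; it is not Hercules code, and the UCX-based transport described in the paper is not shown.

    // interpose.cpp -- illustrative LD_PRELOAD interposer (generic technique,
    // not Hercules code): redirects open() for paths under a chosen prefix.
    // Build: g++ -shared -fPIC interpose.cpp -o libinterpose.so -ldl
    // Run:   LD_PRELOAD=./libinterpose.so ./app
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <cstdarg>
    #include <cstring>
    #include <string>

    static const char *kPrefix = "/adhoc/";  // hypothetical ad-hoc FS prefix

    extern "C" int open(const char *path, int flags, ...) {
        // Look up the real libc open() once and cache it.
        using open_fn = int (*)(const char *, int, ...);
        static open_fn real_open = (open_fn)dlsym(RTLD_NEXT, "open");

        mode_t mode = 0;
        if (flags & O_CREAT) {  // the mode argument exists only with O_CREAT
            va_list ap;
            va_start(ap, flags);
            mode = va_arg(ap, mode_t);
            va_end(ap);
        }

        std::string p(path);
        if (p.rfind(kPrefix, 0) == 0) {
            // A real ad-hoc FS would route this request to node-local
            // memory/storage; this sketch just rewrites the path.
            p = "/tmp/adhoc_staging/" + p.substr(std::strlen(kPrefix));
        }
        return real_open(p.c_str(), flags, mode);
    }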

2022

  • I. Colonnelli, B. Casella, G. Mittone, Y. Arfat, B. Cantalupo, R. Esposito, A. R. Martinelli, D. Medić, and M. Aldinucci, “Federated learning meets HPC and cloud,” in Astrophysics and Space Science Proceedings, Catania, Italy, 2022.

    HPC and AI are fated to meet for several reasons. This article will discuss some of them and argue why this will happen through the set of methods and technologies that underpin cloud computing. As a paradigmatic example, we present a new federated learning system that collaboratively trains a deep learning model in different supercomputing centers. The system is based on the StreamFlow workflow manager designed for hybrid cloud-HPC infrastructures.

    @inproceedings{22:ml4astro,
    abstract = {HPC and AI are fated to meet for several reasons. This article will discuss some of them and argue why this will happen through the set of methods and technologies that underpin cloud computing. As a paradigmatic example, we present a new federated learning system that collaboratively trains a deep learning model in different supercomputing centers. The system is based on the StreamFlow workflow manager designed for hybrid cloud-HPC infrastructures.},
    address = {Catania, Italy},
    author = {Iacopo Colonnelli and Bruno Casella and Gianluca Mittone and Yasir Arfat and Barbara Cantalupo and Roberto Esposito and Alberto Riccardo Martinelli and Doriana Medi\'{c} and Marco Aldinucci},
    booktitle = {Astrophysics and Space Science Proceedings},
    keywords = {across, eupilot, streamflow},
    publisher = {Springer},
    title = {Federated Learning meets {HPC} and cloud},
    url = {https://iris.unito.it/retrieve/5631da1c-96a0-48c0-a48e-2cdf6b84841d/main.pdf},
    year = {2022},
    bdsk-url-1 = {https://iris.unito.it/retrieve/5631da1c-96a0-48c0-a48e-2cdf6b84841d/main.pdf}
    }
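
    For context, the aggregation step at the core of such a federated learning system is often plain federated averaging: each site trains locally, then a coordinator computes a sample-weighted average of the site models. The sketch below shows only that generic step, with hypothetical names (SiteUpdate, fed_avg); it is not the paper's StreamFlow-based system.

    // fedavg.cpp -- minimal federated-averaging step (generic FedAvg sketch,
    // not the paper's system). Each site contributes its locally trained
    // weights plus the number of samples it trained on.
    #include <cstddef>
    #include <iostream>
    #include <vector>

    struct SiteUpdate {
        std::vector<double> weights;  // flattened model parameters
        std::size_t n_samples;        // local training-set size
    };

    // Weighted average of site models, proportional to local sample counts.
    // Assumes a non-empty list of sites with equally sized weight vectors.
    std::vector<double> fed_avg(const std::vector<SiteUpdate> &sites) {
        std::vector<double> global(sites.front().weights.size(), 0.0);
        double total = 0.0;
        for (const auto &s : sites) total += static_cast<double>(s.n_samples);
        for (const auto &s : sites) {
            const double w = static_cast<double>(s.n_samples) / total;
            for (std::size_t i = 0; i < global.size(); ++i)
                global[i] += w * s.weights[i];
        }
        return global;
    }

    int main() {
        // Two hypothetical centers with tiny two-parameter "models".
        std::vector<SiteUpdate> sites = {{{1.0, 2.0}, 100}, {{3.0, 4.0}, 300}};
        for (double v : fed_avg(sites)) std::cout << v << ' ';  // prints: 2.5 3.5
    }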

  • G. Agosta, M. Aldinucci, C. Alvarez, R. Ammendola, Y. Arfat, O. Beaumont, M. Bernaschi, A. Biagioni, T. Boccali, B. Bramas, C. Brandolese, B. Cantalupo, M. Carrozzo, D. Cattaneo, A. Celestini, M. Celino, I. Colonnelli, P. Cretaro, P. D’Ambra, M. Danelutto, R. Esposito, L. Eyraud-Dubois, A. Filgueras, W. Fornaciari, O. Frezza, A. Galimberti, F. Giacomini, B. Goglin, D. Gregori, A. Guermouche, F. Iannone, M. Kulczewski, F. Lo Cicero, A. Lonardo, A. R. Martinelli, M. Martinelli, X. Martorell, G. Massari, S. Montangero, G. Mittone, R. Namyst, A. Oleksiak, P. Palazzari, P. S. Paolucci, F. Reghenzani, C. Rossi, S. Saponara, F. Simula, F. Terraneo, S. Thibault, M. Torquati, M. Turisini, P. Vicini, M. Vidal, D. Zoni, and G. Zummo, “Towards EXtreme scale technologies and accelerators for euROhpc hw/Sw supercomputing applications for exascale: the TEXTAROSSA approach,” Microprocessors and Microsystems, vol. 95, p. 104679, 2022. doi:10.1016/j.micpro.2022.104679

    In the near future, Exascale systems will need to bridge three technology gaps to achieve high performance while remaining under tight power constraints: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetic; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA addresses these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models, and tools derived from European research.

    @article{textarossa2022micpro:,
    abstract = {In the near future, Exascale systems will need to bridge three technology gaps to achieve high performance while remaining under tight power constraints: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetic; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA addresses these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models, and tools derived from European research.},
    author = {Giovanni Agosta and Marco Aldinucci and Carlos Alvarez and Roberto Ammendola and Yasir Arfat and Olivier Beaumont and Massimo Bernaschi and Andrea Biagioni and Tommaso Boccali and Berenger Bramas and Carlo Brandolese and Barbara Cantalupo and Mauro Carrozzo and Daniele Cattaneo and Alessandro Celestini and Massimo Celino and Iacopo Colonnelli and Paolo Cretaro and Pasqua D'Ambra and Marco Danelutto and Roberto Esposito and Lionel Eyraud-Dubois and Antonio Filgueras and William Fornaciari and Ottorino Frezza and Andrea Galimberti and Francesco Giacomini and Brice Goglin and Daniele Gregori and Abdou Guermouche and Francesco Iannone and Michal Kulczewski and Francesca {Lo Cicero} and Alessandro Lonardo and Alberto R. Martinelli and Michele Martinelli and Xavier Martorell and Giuseppe Massari and Simone Montangero and Gianluca Mittone and Raymond Namyst and Ariel Oleksiak and Paolo Palazzari and Pier Stanislao Paolucci and Federico Reghenzani and Cristian Rossi and Sergio Saponara and Francesco Simula and Federico Terraneo and Samuel Thibault and Massimo Torquati and Matteo Turisini and Piero Vicini and Miquel Vidal and Davide Zoni and Giuseppe Zummo},
    doi = {10.1016/j.micpro.2022.104679},
    issn = {0141-9331},
    journal = {Microprocessors and Microsystems},
    keywords = {textarossa},
    pages = {104679},
    title = {Towards EXtreme scale technologies and accelerators for euROhpc hw/Sw supercomputing applications for exascale: The TEXTAROSSA approach},
    volume = {95},
    year = {2022},
    bdsk-url-1 = {https://doi.org/10.1016/j.micpro.2022.104679}
    }

2021

  • G. Agosta, W. Fornaciari, A. Galimberti, G. Massari, F. Reghenzani, F. Terraneo, D. Zoni, C. Brandolese, M. Celino, F. Iannone, P. Palazzari, G. Zummo, M. Bernaschi, P. D’Ambra, S. Saponara, M. Danelutto, M. Torquati, M. Aldinucci, Y. Arfat, B. Cantalupo, I. Colonnelli, R. Esposito, A. R. Martinelli, G. Mittone, O. Beaumont, B. Bramas, L. Eyraud-Dubois, B. Goglin, A. Guermouche, R. Namyst, S. Thibault, A. Filgueras, M. Vidal, C. Alvarez, X. Martorell, A. Oleksiak, M. Kulczewski, A. Lonardo, P. Vicini, F. L. Cicero, F. Simula, A. Biagioni, P. Cretaro, O. Frezza, P. S. Paolucci, M. Turisini, F. Giacomini, T. Boccali, S. Montangero, and R. Ammendola, “TEXTAROSSA: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale,” in Proc. of the 24th Euromicro Conference on Digital System Design (DSD), Palermo, Italy, 2021. doi:10.1109/DSD53832.2021.00051

    To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.

    @inproceedings{21:DSD:textarossa,
    abstract = {To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.},
    address = {Palermo, Italy},
    author = {Giovanni Agosta and William Fornaciari and Andrea Galimberti and Giuseppe Massari and Federico Reghenzani and Federico Terraneo and Davide Zoni and Carlo Brandolese and Massimo Celino and Francesco Iannone and Paolo Palazzari and Giuseppe Zummo and Massimo Bernaschi and Pasqua D'Ambra and Sergio Saponara and Marco Danelutto and Massimo Torquati and Marco Aldinucci and Yasir Arfat and Barbara Cantalupo and Iacopo Colonnelli and Roberto Esposito and Alberto Riccardo Martinelli and Gianluca Mittone and Olivier Beaumont and Berenger Bramas and Lionel Eyraud-Dubois and Brice Goglin and Abdou Guermouche and Raymond Namyst and Samuel Thibault and Antonio Filgueras and Miquel Vidal and Carlos Alvarez and Xavier Martorell and Ariel Oleksiak and Michal Kulczewski and Alessandro Lonardo and Piero Vicini and Francesco Lo Cicero and Francesco Simula and Andrea Biagioni and Paolo Cretaro and Ottorino Frezza and Pier Stanislao Paolucci and Matteo Turisini and Francesco Giacomini and Tommaso Boccali and Simone Montangero and Roberto Ammendola},
    booktitle = {Proc. of the 24th Euromicro Conference on Digital System Design ({DSD})},
    date-added = {2021-09-04 12:07:42 +0200},
    date-modified = {2021-09-04 12:23:41 +0200},
    doi = {10.1109/DSD53832.2021.00051},
    keywords = {textarossa, streamflow},
    month = aug,
    publisher = {IEEE},
    title = {{TEXTAROSSA}: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale},
    year = {2021},
    bdsk-url-1 = {https://doi.org/10.1109/DSD53832.2021.00051}
    }

  • M. Aldinucci, V. Cesare, I. Colonnelli, A. R. Martinelli, G. Mittone, and B. Cantalupo, “Practical parallelization of a Laplace solver with MPI,” in ENEA CRESCO in the fight against COVID-19, 2021, pp. 21–24.

    This work exposes a practical methodology for the semi-automatic parallelization of existing code. We show how a scientific sequential code can be parallelized through our approach. The obtained parallel code is only slightly different from the starting sequential one, providing an example of how little re-designing our methodology involves. The performance of the parallelized code, executed on the CRESCO6 cluster, is then exposed and discussed. We also believe in the educational value of this approach and suggest its use as a teaching device for students.

    @inproceedings{21:laplace:enea,
    abstract = {This work exposes a practical methodology for the semi-automatic parallelization of existing code. We show how a scientific sequential code can be parallelized through our approach. The obtained parallel code is only slightly different from the starting sequential one, providing an example of how little re-designing our methodology involves. The performance of the parallelized code, executed on the CRESCO6 cluster, is then exposed and discussed. We also believe in the educational value of this approach and suggest its use as a teaching device for students.},
    author = {Aldinucci, Marco and Cesare, Valentina and Colonnelli, Iacopo and Martinelli, Alberto Riccardo and Mittone, Gianluca and Cantalupo, Barbara},
    booktitle = {ENEA CRESCO in the fight against COVID-19},
    editor = {Francesco Iannone},
    keywords = {hpc4ai},
    pages = {21--24},
    publisher = {ENEA},
    title = {Practical Parallelization of a {Laplace} Solver with {MPI}},
    year = {2021}
    }
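
    The standard pattern behind an MPI parallelization of a Laplace solver is a 1-D domain decomposition with halo exchange between neighbouring ranks. The sketch below illustrates that textbook pattern under simplifying assumptions (square grid, rows evenly divisible among ranks, fixed iteration count); it is not the code from the paper.

    // laplace_mpi.cpp -- minimal 1-D decomposition of a Jacobi sweep for the
    // 2-D Laplace equation (textbook halo-exchange pattern, not the paper's code).
    // Build: mpicxx laplace_mpi.cpp -o laplace && mpirun -n 4 ./laplace
    #include <mpi.h>
    #include <vector>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int N = 256;          // global rows/cols (assume N % size == 0)
        const int rows = N / size;  // local rows per rank
        // Local grid with two halo rows: row 0 (from above), row rows+1 (from below).
        std::vector<double> u((rows + 2) * N, 0.0);
        if (rank == 0)              // fixed hot boundary on the global top edge
            for (int j = 0; j < N; ++j) u[1 * N + j] = 100.0;
        std::vector<double> unew(u);

        const int up = (rank == 0) ? MPI_PROC_NULL : rank - 1;
        const int down = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

        for (int iter = 0; iter < 1000; ++iter) {
            // Exchange halo rows with neighbouring ranks.
            MPI_Sendrecv(&u[1 * N], N, MPI_DOUBLE, up, 0,
                         &u[(rows + 1) * N], N, MPI_DOUBLE, down, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u[rows * N], N, MPI_DOUBLE, down, 1,
                         &u[0], N, MPI_DOUBLE, up, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            // Jacobi update on interior points (rank 0 keeps its boundary row fixed).
            const int ibeg = (rank == 0) ? 2 : 1;
            for (int i = ibeg; i <= rows; ++i)
                for (int j = 1; j < N - 1; ++j)
                    unew[i * N + j] = 0.25 * (u[(i - 1) * N + j] + u[(i + 1) * N + j] +
                                              u[i * N + j - 1] + u[i * N + j + 1]);
            u.swap(unew);
        }
        MPI_Finalize();
    }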

  • M. Aldinucci, V. Cesare, I. Colonnelli, A. R. Martinelli, G. Mittone, B. Cantalupo, C. Cavazzoni, and M. Drocco, “Practical parallelization of scientific applications with OpenMP, OpenACC and MPI,” Journal of Parallel and Distributed Computing, vol. 157, pp. 13-29, 2021. doi:10.1016/j.jpdc.2021.05.017

    This work aims at distilling a systematic methodology to modernize existing sequential scientific codes with a little re-designing effort, turning an old codebase into modern code, i.e., parallel and robust code. We propose a semi-automatic methodology to parallelize scientific applications designed with a purely sequential programming mindset, possibly using global variables, aliasing, random number generators, and stateful functions. We demonstrate that the same methodology works for the parallelization in the shared memory model (via OpenMP), message passing model (via MPI), and General Purpose Computing on GPU model (via OpenACC). The method is demonstrated parallelizing four real-world sequential codes in the domain of physics and material science. The methodology itself has been distilled in collaboration with MSc students of the Parallel Computing course at the University of Torino, who applied it for the first time to the project works that they presented for the final exam of the course. Every year the course hosts some special lectures from industry representatives, who present how they use parallel computing and offer codes to be parallelized.

    @article{21:jpdc:loop,
    abstract = {This work aims at distilling a systematic methodology to modernize existing sequential scientific codes with a little re-designing effort, turning an old codebase into \emph{modern} code, i.e., parallel and robust code. We propose a semi-automatic methodology to parallelize scientific applications designed with a purely sequential programming mindset, possibly using global variables, aliasing, random number generators, and stateful functions. We demonstrate that the same methodology works for the parallelization in the shared memory model (via OpenMP), message passing model (via MPI), and General Purpose Computing on GPU model (via OpenACC). The method is demonstrated parallelizing four real-world sequential codes in the domain of physics and material science. The methodology itself has been distilled in collaboration with MSc students of the Parallel Computing course at the University of Torino, that applied it for the first time to the project works that they presented for the final exam of the course. Every year the course hosts some special lectures from industry representatives, who present how they use parallel computing and offer codes to be parallelized. },
    author = {Aldinucci, Marco and Cesare, Valentina and Colonnelli, Iacopo and Martinelli, Alberto Riccardo and Mittone, Gianluca and Cantalupo, Barbara and Cavazzoni, Carlo and Drocco, Maurizio},
    date-added = {2021-06-10 22:05:54 +0200},
    date-modified = {2021-06-10 22:30:05 +0200},
    doi = {10.1016/j.jpdc.2021.05.017},
    journal = {Journal of Parallel and Distributed Computing},
    keywords = {saperi},
    pages = {13-29},
    title = {Practical Parallelization of Scientific Applications with {OpenMP, OpenACC and MPI}},
    url = {https://iris.unito.it/retrieve/handle/2318/1792557/770851/Practical_Parallelization_JPDC_preprint.pdf},
    volume = {157},
    year = {2021},
    bdsk-url-1 = {https://iris.unito.it/retrieve/handle/2318/1792557/770851/Practical_Parallelization_JPDC_preprint.pdf},
    bdsk-url-2 = {https://doi.org/10.1016/j.jpdc.2021.05.017}
    }
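
    A key point of the methodology is how small the change from the sequential version can be: in the shared-memory case, an independent loop becomes parallel with a single OpenMP directive. The sketch below shows that kind of minimal change on a toy saxpy loop; it is an illustrative example, not one of the four application codes from the paper.

    // saxpy_omp.cpp -- the kind of minimal change the methodology targets:
    // a sequential loop made parallel by adding one OpenMP directive
    // (illustrative toy example, not one of the paper's application codes).
    // Build: g++ -fopenmp saxpy_omp.cpp -o saxpy
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 1 << 20;
        std::vector<double> x(n, 1.0), y(n, 2.0);
        const double a = 3.0;

        // The only difference from the sequential code is the pragma below:
        // iterations are independent, so they can be split across threads.
        #pragma omp parallel for
        for (std::size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];

        std::printf("y[0] = %.1f (expected 5.0)\n", y[0]);
    }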

2018

  • C. Misale, M. Drocco, G. Tremblay, A. R. Martinelli, and M. Aldinucci, “PiCo: high-performance data analytics pipelines in modern C++,” Future Generation Computer Systems, vol. 87, pp. 392-403, 2018. doi:10.1016/j.future.2018.05.030

    In this paper, we present a new C++ API with a fluent interface called PiCo (Pipeline Composition). PiCo’s programming model aims at making the programming of data analytics applications easier while preserving or enhancing their performance. This is attained through three key design choices: (1) unifying batch and stream data access models, (2) decoupling processing from data layout, and (3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to reuse the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.). Preliminary results show that PiCo, when compared to Spark and Flink, can attain better performances in terms of execution times and can hugely improve memory utilization, both for batch and stream processing.

    @article{18:fgcs:pico,
    abstract = {In this paper, we present a new C++ API with a fluent interface called PiCo (Pipeline Composition). PiCo's programming model aims at making easier the programming of data analytics applications while preserving or enhancing their performance. This is attained through three key design choices: (1) unifying batch and stream data access models, (2) decoupling processing from data layout, and (3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to reuse the same algorithms and pipelines on different data models (e.g., streams, lists, sets, etc.). Preliminary results show that PiCo, when compared to Spark and Flink, can attain better performances in terms of execution times and can hugely improve memory utilization, both for batch and stream processing.},
    author = {Claudia Misale and Maurizio Drocco and Guy Tremblay and Alberto R. Martinelli and Marco Aldinucci},
    date-added = {2018-05-18 21:24:31 +0000},
    date-modified = {2020-11-15 17:22:30 +0100},
    doi = {10.1016/j.future.2018.05.030},
    journal = {Future Generation Computer Systems},
    keywords = {toreador, bigdata, fastflow},
    pages = {392-403},
    title = {PiCo: High-performance data analytics pipelines in modern C++},
    url = {https://iris.unito.it/retrieve/handle/2318/1668444/414280/fgcs_pico.pdf},
    volume = {87},
    year = {2018},
    bdsk-url-1 = {https://iris.unito.it/retrieve/handle/2318/1668444/414280/fgcs_pico.pdf},
    bdsk-url-2 = {https://doi.org/10.1016/j.future.2018.05.030}
    }
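
    To give a flavor of the fluent style the abstract refers to: each pipeline stage returns a new pipeline object, so operators chain left to right. The toy C++ sketch below illustrates only that style; the names (Pipe, map, filter, run) are hypothetical and do not reproduce PiCo's actual API or its stream-oriented runtime.

    // fluent_pipeline.cpp -- a toy fluent pipeline in modern C++, showing the
    // chaining style of a pipeline-composition API; the names are hypothetical
    // and do NOT match PiCo's real interface.
    #include <functional>
    #include <iostream>
    #include <utility>
    #include <vector>

    template <typename T>
    class Pipe {
        std::vector<T> data_;
    public:
        explicit Pipe(std::vector<T> d) : data_(std::move(d)) {}

        // Each stage returns a new Pipe, so stages chain fluently.
        template <typename F>
        auto map(F f) const {
            using U = decltype(f(std::declval<T>()));
            std::vector<U> out;
            out.reserve(data_.size());
            for (const auto &v : data_) out.push_back(f(v));
            return Pipe<U>(std::move(out));
        }

        template <typename P>
        Pipe filter(P pred) const {
            std::vector<T> out;
            for (const auto &v : data_) if (pred(v)) out.push_back(v);
            return Pipe(std::move(out));
        }

        void run(std::function<void(const T &)> sink) const {
            for (const auto &v : data_) sink(v);
        }
    };

    int main() {
        Pipe<int>({1, 2, 3, 4, 5})
            .map([](int x) { return x * x; })
            .filter([](int x) { return x % 2 == 1; })
            .run([](const int &x) { std::cout << x << '\n'; });  // prints 1, 9, 25
    }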