Papers | Parallel Computing

2024

Simone Leo, Michael R. Crusoe, Laura Rodríguez-Navas, Raül Sirvent, Alexander Kanitz, Paul De Geest, Rudolf Wittner, Luca Pireddu, Daniel Garijo, José M. Fernández, Iacopo Colonnelli, Matej Gallo, Tazro Ohta, Hirotaka Suetake, Salvador Capella-Gutierrez, Renske Wit, Bruno P. Kinoshita, Stian Soiland-Reyes

Recording provenance of workflow runs with RO-Crate Journal Article

In: PLoS ONE, vol. 19, no. 9, pp. 1–35, 2024.

Tags: across, eupex, icsc, streamflow

@article{24:pone:wfrunrocrate,

title = {Recording provenance of workflow runs with RO-Crate},

author = {Simone Leo and Michael R. Crusoe and Laura Rodríguez-Navas and Raül Sirvent and Alexander Kanitz and Paul De Geest and Rudolf Wittner and Luca Pireddu and Daniel Garijo and José M. Fernández and Iacopo Colonnelli and Matej Gallo and Tazro Ohta and Hirotaka Suetake and Salvador Capella-Gutierrez and Renske Wit and Bruno P. Kinoshita and Stian Soiland-Reyes},

url = {https://iris.unito.it/retrieve/57752f7b-9f8f-4013-8cef-5b498703d882/journal.pone.0309210.pdf},

doi = {10.1371/journal.pone.0309210},

year  = {2024},

date = {2024-09-01},

journal = {PLoS ONE},

volume = {19},

number = {9},

pages = {1--35},

publisher = {Public Library of Science},

abstract = {Recording the provenance of scientific computation results is key to the support of traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object Crate) and Schema.org to capture the provenance of the execution of computational workflows at different levels of granularity and bundle together all their associated objects (inputs, outputs, code, etc.). The model is supported by a diverse, open community that runs regular meetings, discussing development, maintenance and adoption aspects. Workflow Run RO-Crate is already implemented by several workflow management systems, allowing interoperable comparisons between workflow runs from heterogeneous systems. We describe the model, its alignment to standards such as W3C PROV, and its implementation in six workflow systems. Finally, we illustrate the application of Workflow Run RO-Crate in two use cases of machine learning in the digital image analysis domain.},

keywords = {across, eupex, icsc, streamflow},

pubstate = {published},

tppubtype = {article}

}

Iacopo Colonnelli, Doriana Medić, Alberto Mulone, Viviana Bono, Luca Padovani, Marco Aldinucci

Introducing SWIRL: An Intermediate Representation Language for Scientific Workflows Proceedings Article

In: Platzer, André, Rozier, Kristin Yvonne, Pradella, Matteo, Rossi, Matteo (Ed.): Formal Methods. FM 2024, pp. 226–244, Springer Nature Switzerland, Milano, Italy, 2024.

Tags: eupex, icsc

Alberto Mulone, Doriana Medić, Marco Aldinucci

A Fault Tolerance mechanism for Hybrid Scientific Workflows Proceedings Article

In: 1st workshop about High-Performance e-Science (HiPES), Madrid, Spain, 2024.

Tags: eupex, icsc, streamflow

Giulio Malenza, Valentina Cesare, Marco Aldinucci, Ugo Becciani, Alberto Vecchiato

Toward HPC application portability via C++ PSTL: the Gaia AVU-GSR code assessment Journal Article

In: The Journal of Supercomputing, 2024, ISSN: 09208542.

Tags: eupex, HPC, icsc

Giulio Malenza, Valentina Cesare, Marco Edoardo Santimaria, Robert Birke, Alberto Vecchiato, Ugo Becciani, Marco Aldinucci

Performance portability via C++ PSTL, SYCL, OpenMP, and HIP: the Gaia AVU-GSR case study Proceedings Article

In: SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1152–1163, IEEE, 2024, ISBN: 979-8-3503-5554-3.

Tags: eupex, icsc

@inproceedings{Malenza_P3HPC_24,

title = {Performance portability via C++ PSTL, SYCL, OpenMP, and HIP: the Gaia AVU-GSR case study},

author = {Giulio Malenza and Valentina Cesare and Marco Edoardo Santimaria and Robert Birke and Alberto Vecchiato and Ugo Becciani and Marco Aldinucci},

url = {https://conferences.computer.org/sc-wpub/pdfs/SC-W2024-6oZmigAQfgJ1GhPL0yE3pS/555400b152/555400b152.pdf},

doi = {10.1109/SCW63240.2024.00157},

isbn = {979-8-3503-5554-3},

year  = {2024},

date = {2024-01-01},

booktitle = {SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis},

pages = {1152--1163},

publisher = {IEEE},

abstract = {Applications that analyze data from modern scientific experiments will soon require a computing capacity of ExaFLOPs. The current trend to achieve such performance is to employ GPU-accelerated supercomputers and design applications to optimally exploit this hardware. Since each supercomputer is typically a one-off project, the necessity of having computational languages portable across diverse CPU and GPU architectures without performance losses is increasingly compelling. Here, we study the performance portability of the LSQR algorithm as found in the AVU-GSR code of the ESA Gaia mission. This code computes the astrometric parameters of the ~10^8 stars in our Galaxy. The LSQR algorithm is widely used across a broad range of high-performance computing (HPC) applications, elevating the study's relevance beyond the astrophysical domain. We developed different GPU-accelerated ports based on CUDA, C++ PSTL, SYCL, OpenMP, and HIP. We carefully verified the correctness of each port and tuned them to five different GPU-accelerated platforms from NVIDIA and AMD to evaluate the performance portability (PP) in terms of the harmonic mean of the application's performance efficiency across the tested hardware. HIP was demonstrated to be the most portable solution with a 0.94 average PP across the tested problem sizes, closely followed by SYCL coupled with AdaptiveCpp (ACPP) with 0.93. If we only consider NVIDIA platforms, CUDA would be the winner with 0.97. The tuning-oblivious C++ PSTL achieves 0.62 when coupled with vendor-specific compilers.},

keywords = {eupex, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

2023

Alberto Riccardo Martinelli, Massimo Torquati, Marco Aldinucci, Iacopo Colonnelli, Barbara Cantalupo

CAPIO: a Middleware for Transparent I/O Streaming in Data-Intensive Workflows Proceedings Article

In: 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), IEEE, Goa, India, 2023.

Tags: admire, capio, eupex, icsc

Iacopo Colonnelli

Workflow Models for Heterogeneous Distributed Systems Proceedings Article

In: Bena, Nicola, Martino, Beniamino Di, Maratea, Antonio, Sperduti, Alessandro, Nardo, Emanuel Di, Ciaramella, Angelo, Montella, Raffaele, Ardagna, Claudio A. (Ed.): Proceedings of the 2nd Italian Conference on Big Data and Data Science (ITADATA 2023), Naples, Italy, September 11-13, 2023, CEUR-WS.org, 2023.

Tags: across, eupex, icsc, jupyter-workflow, streamflow

Doriana Medić, Marco Aldinucci

Towards formal model for location aware workflows Proceedings Article

In: Shahriar, Hossain, Teranishi, Yuuichi, Cuzzocrea, Alfredo, Sharmin, Moushumi, Towey, Dave, Majumder, A. K. M. Jahangir Alam, Kashiwazaki, Hiroki, Yang, Ji-Jiang, Takemoto, Michiharu, Sakib, Nazmus, Banno, Ryohei, Ahamed, Sheikh Iqbal (Ed.): 47th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2023, pp. 1864–1869, IEEE, Torino, Italy, 2023.

Tags: eupex, icsc, semantics

2022

Iacopo Colonnelli, Marco Aldinucci

Hybrid Workflows For Large-Scale Scientific Applications Proceedings Article

In: Sixth EAGE High Performance Computing Workshop, pp. 1–5, European Association of Geoscientists & Engineers, Milano, Italy, 2022, ISSN: 2214-4609.

Tags: across, eupex

@inproceedings{22:eage-hpc-workshop,

title = {Hybrid Workflows For Large-Scale Scientific Applications},

author = {Iacopo Colonnelli and Marco Aldinucci},

url = {https://iris.unito.it/retrieve/d79ddabb-f9d7-4a55-9f84-1528b1533ba3/Extended_Abstract.pdf},

doi = {10.3997/2214-4609.2022615029},

issn = {2214-4609},

year  = {2022},

date = {2022-09-01},

booktitle = {Sixth EAGE High Performance Computing Workshop},

pages = {1--5},

publisher = {European Association of Geoscientists \& Engineers},

address = {Milano, Italy},

abstract = {Large-scale scientific applications are facing an irreversible transition from monolithic, high-performance oriented codes to modular and polyglot deployments of specialised (micro-)services. The reasons behind this transition are many: coupling of standard solvers with Deep Learning techniques, offloading of data analysis and visualisation to Cloud, and the advent of specialised hardware accelerators. Topology-aware Workflow Management Systems (WMSs) play a crucial role. In particular, topology-awareness allows an explicit mapping of workflow steps onto heterogeneous locations, allowing automated executions on top of hybrid architectures (e.g., cloud+HPC or classical+quantum). Plus, topology-aware WMSs can offer nonfunctional requirements OOTB, e.g. components' life-cycle orchestration, secure and efficient data transfers, fault tolerance, and cross-cluster execution of urgent workloads. Augmenting interactive Jupyter Notebooks with distributed workflow capabilities allows domain experts to prototype and scale applications using the same technological stack, while relying on a feature-rich and user-friendly web interface. This abstract will showcase how these general methodologies can be applied to a typical geoscience simulation pipeline based on the Full Wavefront Inversion (FWI) technique. In particular, a prototypical Jupyter Notebook will be executed interactively on Cloud. Preliminary data analyses and post-processing will be executed locally, while the computationally demanding optimisation loop will be scheduled on a remote HPC cluster.},

keywords = {across, eupex},

pubstate = {published},

tppubtype = {inproceedings}

}

Valentina Cesare, Ugo Becciani, Alberto Vecchiato, Mario Gilberto Lattanzi, Fabio Pitari, Mario Raciti, Giuseppe Tudisco, Marco Aldinucci, Beatrice Bucciarelli

The Gaia AVU-GSR parallel solver: Preliminary studies of a LSQR-based application in perspective of exascale systems Journal Article

In: Astronomy and Computing, pp. 100660, 2022, ISSN: 2213-1337.

Tags: eupex

@article{CESARE2022100660,

title = {The Gaia AVU-GSR parallel solver: Preliminary studies of a LSQR-based application in perspective of exascale systems},

author = {Valentina Cesare and Ugo Becciani and Alberto Vecchiato and Mario Gilberto Lattanzi and Fabio Pitari and Mario Raciti and Giuseppe Tudisco and Marco Aldinucci and Beatrice Bucciarelli},

url = {https://openaccess.inaf.it/handle/20.500.12386/32451},

doi = {10.1016/j.ascom.2022.100660},

issn = {2213-1337},

year  = {2022},

date = {2022-01-01},

journal = {Astronomy and Computing},

pages = {100660},

abstract = {The Gaia Astrometric Verification Unit–Global Sphere Reconstruction (AVU–GSR) Parallel Solver aims to find the astrometric parameters for circa 10^8 stars in the Milky Way, the attitude and the instrumental specifications of the Gaia satellite, and the global parameter γ of the post Newtonian formalism. The code iteratively solves a system of linear equations, A×x=b, where the coefficient matrix A is large (circa 10^11×10^8 elements) and sparse. To solve this system of equations, the code exploits a hybrid implementation of the iterative PC-LSQR algorithm, where the computation related to different horizontal portions of the coefficient matrix is assigned to separate MPI processes. In the original code, each matrix portion is further parallelized over the OpenMP threads. To further improve the code performance, we ported the application to the GPU, replacing the OpenMP parallelization language with OpenACC. In this port, ∼95% of the data is copied from the host to the device at the beginning of the entire cycle of iterations, making the code compute bound rather than data-transfer bound. The OpenACC code presents a speedup of circa 1.5 over the OpenMP version but further optimizations are in progress to obtain higher gains. The code runs on multiple GPUs and it was tested on the CINECA supercomputer Marconi100, in anticipation of a port to the pre-exascale system Leonardo, that will be installed at CINECA in 2022.},

keywords = {eupex},

pubstate = {published},

tppubtype = {article}

}

2021

Marco Aldinucci, Giovanni Agosta, Antonio Andreini, Claudio A. Ardagna, Andrea Bartolini, Alessandro Cilardo, Biagio Cosenza, Marco Danelutto, Roberto Esposito, William Fornaciari, Roberto Giorgi, Davide Lengani, Raffaele Montella, Mauro Olivieri, Sergio Saponara, Daniele Simoni, Massimo Torquati

The Italian research on HPC key technologies across EuroHPC Proceedings Article

In: ACM Computing Frontiers, pp. 279–286, ACM, Virtual Conference, Italy, 2021.

Tags: admire, eupex, eupilot, textarossa

@inproceedings{21:CINI_acm_CF,

title = {The Italian research on HPC key technologies across EuroHPC},

author = {Marco Aldinucci and Giovanni Agosta and Antonio Andreini and Claudio A. Ardagna and Andrea Bartolini and Alessandro Cilardo and Biagio Cosenza and Marco Danelutto and Roberto Esposito and William Fornaciari and Roberto Giorgi and Davide Lengani and Raffaele Montella and Mauro Olivieri and Sergio Saponara and Daniele Simoni and Massimo Torquati},

url = {https://iris.unito.it/retrieve/handle/2318/1783118/744641/preprint.pdf},

doi = {10.1145/3457388.3458508},

year  = {2021},

date = {2021-05-01},

booktitle = {ACM Computing Frontiers},

pages = {279--286},

publisher = {ACM},

address = {Virtual Conference, Italy},

abstract = {High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed of three pillars: infrastructures, applications, and key technologies and tools. While infrastructures are by construction centralized in large-scale HPC centers, and applications are generally within the purview of domain-specific organizations, key technologies fall in an intermediate case where coordination is needed, but design and development are often decentralized. A large group of Italian researchers has started a dedicated laboratory within the National Interuniversity Consortium for Informatics (CINI) to address this challenge. The laboratory, albeit young, has managed to succeed in its first attempts to propose a coordinated approach to HPC research within the EuroHPC Joint Undertaking, participating in the calls 2019-20 to five successful proposals for an aggregate total cost of 95M Euro. In this paper, we outline the working group's scope and goals and provide an overview of the five funded projects, which become fully operational in March 2021, and cover a selection of key technologies provided by the working group partners, highlighting their usage development within the projects.},

keywords = {admire, eupex, eupilot, textarossa},

pubstate = {published},

tppubtype = {inproceedings}

}
