Papers | Parallel Computing

2024

Gianluca Mittone, Giulio Malenza, Marco Aldinucci, Robert Birke

Distributed Edge Inference: an Experimental Study on Multiview Detection Proceedings Article

In: Proc. of the 16th IEEE/ACM Intl. Conference on Utility and Cloud Computing Companion (UCC), pp. 1-6, ACM, Taormina, Italy, 2024, (eupilot, icsc).

Abstract | Links | BibTeX | Tags: ai, eupilot, icsc

Bruno Casella, Alessio Barbaro Chisari, Marco Aldinucci, Sebastiano Battiato, Mario Valerio Giuffrida

Federated Learning in a Semi-Supervised Environment for Earth Observation Data Proceedings Article

In: Proceedings of the 32nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN, Bruges, Belgium, 2024.

Abstract | Links | BibTeX | Tags: ai, epi, icsc

Bruno Casella, Matthias Jakobs, Marco Aldinucci, Sebastian Buschjäger

Federated Time Series Classification with ROCKET features Proceedings Article

In: Proceedings of the 32nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN, Bruges, Belgium, 2024.

Abstract | Links | BibTeX | Tags: ai, epi, icsc

Samuele Fonio, Mirko Polato, Roberto Esposito

FedHP: Federated Learning with Hyperspherical Prototypical Regularization Proceedings Article

In: 32nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, 2024.

Abstract | Links | BibTeX | Tags: ai, icsc

Simone Leo, Michael R. Crusoe, Laura Rodríguez-Navas, Raül Sirvent, Alexander Kanitz, Paul De Geest, Rudolf Wittner, Luca Pireddu, Daniel Garijo, José M. Fernández, Iacopo Colonnelli, Matej Gallo, Tazro Ohta, Hirotaka Suetake, Salvador Capella-Gutierrez, Renske Wit, Bruno P. Kinoshita, Stian Soiland-Reyes

Recording provenance of workflow runs with RO-Crate Journal Article

In: PLoS ONE, vol. 19, no. 9, pp. 1–35, 2024.

Abstract | Links | BibTeX | Tags: across, eupex, icsc, streamflow

@article{24:pone:wfrunrocrate,

title = {Recording provenance of workflow runs with RO-Crate},

author = {Simone Leo and Michael R. Crusoe and Laura Rodríguez-Navas and Raül Sirvent and Alexander Kanitz and Paul De Geest and Rudolf Wittner and Luca Pireddu and Daniel Garijo and José M. Fernández and Iacopo Colonnelli and Matej Gallo and Tazro Ohta and Hirotaka Suetake and Salvador Capella-Gutierrez and Renske Wit and Bruno P. Kinoshita and Stian Soiland-Reyes},

url = {https://iris.unito.it/retrieve/57752f7b-9f8f-4013-8cef-5b498703d882/journal.pone.0309210.pdf},

doi = {10.1371/journal.pone.0309210},

year  = {2024},

date = {2024-09-01},

journal = {PLoS ONE},

volume = {19},

number = {9},

pages = {1–35},

publisher = {Public Library of Science},

abstract = {Recording the provenance of scientific computation results is key to the support of traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object Crate) and Schema.org to capture the provenance of the execution of computational workflows at different levels of granularity and bundle together all their associated objects (inputs, outputs, code, etc.). The model is supported by a diverse, open community that runs regular meetings, discussing development, maintenance and adoption aspects. Workflow Run RO-Crate is already implemented by several workflow management systems, allowing interoperable comparisons between workflow runs from heterogeneous systems. We describe the model, its alignment to standards such as W3C PROV, and its implementation in six workflow systems. Finally, we illustrate the application of Workflow Run RO-Crate in two use cases of machine learning in the digital image analysis domain.},

keywords = {across, eupex, icsc, streamflow},

pubstate = {published},

tppubtype = {article}

}

Iacopo Colonnelli, Doriana Medić, Alberto Mulone, Viviana Bono, Luca Padovani, Marco Aldinucci

Introducing SWIRL: An Intermediate Representation Language for Scientific Workflows Proceedings Article

In: Platzer, André, Rozier, Kristin Yvonne, Pradella, Matteo, Rossi, Matteo (Ed.): Formal Methods. FM 2024, pp. 226–244, Springer Nature Switzerland, Milano, Italy, 2024.

Abstract | Links | BibTeX | Tags: eupex, icsc

Alberto Mulone, Doriana Medić, Marco Aldinucci

A Fault Tolerance mechanism for Hybrid Scientific Workflows Proceedings Article

In: 1st workshop about High-Performance e-Science (HiPES), Madrid, Spain, 2024.

Abstract | BibTeX | Tags: eupex, icsc, streamflow

Alessia Antelmi, Massimo Torquati, Giacomo Corridori, Daniele Gregori, Francesco Polzella, Gianmarco Spinatelli, Marco Aldinucci

Analyzing FOSS license usage in publicly available software at scale via the SWH-analytics framework Journal Article

In: The Journal of Supercomputing, vol. 80, no. 11, pp. 15799-15833, 2024, ISSN: 1573-0484.

Abstract | Links | BibTeX | Tags: analytics, icsc

Miruna Bețianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen

DALLMi: Domain Adaption for LLM-based Multi-label Classifier Proceedings Article

In: Yang, De-Nian, Xie, Xing, Tseng, Vincent S., Pei, Jian, Huang, Jen-Wei, Lin, Jerry Chun-Wei (Ed.): Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 277–289, Springer, Taipei, Taiwan, 2024.

Abstract | Links | BibTeX | Tags: ai, eupilot, icsc

@inproceedings{24:betianu:llm,

title = {DALLMi: Domain Adaption for LLM-based Multi-label Classifier},

author = {Miruna Bețianu and Abele Mălan and Marco Aldinucci and Robert Birke and Lydia Chen},

editor = {De-Nian Yang and Xing Xie and Vincent S. Tseng and Jian Pei and Jen-Wei Huang and Jerry Chun-Wei Lin},

url = {https://hdl.handle.net/2318/1976672},

doi = {10.1007/978-981-97-2259-4_21},

year  = {2024},

date = {2024-05-01},

booktitle = {Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining},

volume = {14647},

pages = {277–289},

publisher = {Springer},

address = {Taipei, Taiwan},

series = {Lecture Notes in Computer Science},

abstract = {Large language models (LLMs) increasingly serve as the backbone for classifying text associated with distinct domains and simultaneously several labels (classes). When encountering domain shifts, e.g., classifier of movie reviews from IMDb to Rotten Tomatoes, adapting such an LLM-based multi-label classifier is challenging due to incomplete label sets at the target domain and daunting training overhead. The existing domain adaptation methods address either image multi-label classifiers or text binary classifiers. In this paper, we design DALLMi, Domain Adaptation Large Language Model interpolator, a first-of-its-kind semi-supervised domain adaptation method for text data models based on LLMs, specifically BERT. The core of DALLMi is the novel variation loss and MixUp regularization, which jointly leverage the limited positively labeled and large quantity of unlabeled text and, importantly, their interpolation from the BERT word embeddings. DALLMi also introduces a label-balanced sampling strategy to overcome the imbalance between labeled and unlabeled data. We evaluate DALLMi against the partial-supervised and unsupervised approach on three datasets under different scenarios of label availability for the target domain. Our results show that DALLMi achieves higher mAP than unsupervised and partially-supervised approaches by 19.9% and 52.2%, respectively.},

keywords = {ai, eupilot, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Chi Hong, Robert Birke, Pin-Yu Chen, Lydia Chen

On Dark Knowledge for Distilling Generators Proceedings Article

In: Yang, De-Nian, Xie, Xing, Tseng, Vincent S., Pei, Jian, Huang, Jen-Wei, Lin, Jerry Chun-Wei (Ed.): Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 235–247, Springer, Taipei, Taiwan, 2024.

Abstract | Links | BibTeX | Tags: ai, epi, icsc

@inproceedings{24:chen:llm,

title = {On Dark Knowledge for Distilling Generators},

author = {Chi Hong and Robert Birke and Pin-Yu Chen and Lydia Chen},

editor = {De-Nian Yang and Xing Xie and Vincent S. Tseng and Jian Pei and Jen-Wei Huang and Jerry Chun-Wei Lin},

url = {https://hdl.handle.net/2318/1976671},

doi = {10.1007/978-981-97-2253-2_19},

year  = {2024},

date = {2024-05-01},

booktitle = {Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining},

volume = {14646},

pages = {235–247},

publisher = {Springer},

address = {Taipei, Taiwan},

series = {Lecture Notes in Computer Science},

abstract = {Knowledge distillation has been applied on generative models, such as Variational Autoencoder (VAE) and Generative Adversarial Networks (GANs). To distill the knowledge, the synthetic outputs of a teacher generator are used to train a student model. While the dark knowledge, i.e., the probabilistic output, is well explored in distilling classifiers, little is known about the existence of an equivalent dark knowledge for generative models and its extractability. In this paper, we derive the first kind of empirical risk bound for distilling generative models from a Bayesian perspective. Through our analysis, we show the existence of the dark knowledge for generative models, i.e., Bayes probability distribution of a synthetic output from a given input, which achieves lower empirical risk bound than merely using the synthetic output of the generators. Furthermore, we propose a Dark Knowledge based Distillation, DKtill, which trains the student generator based on the (approximate) dark knowledge. Our extensive evaluation on distilling VAE, conditional GANs, and translation GANs on Facades and CelebA datasets shows that the FID of student generators trained by DKtill combining dark knowledge is lower than that of student generators trained only by the synthetic outputs by up to 42.66%, and 78.99%, respectively.},

keywords = {ai, epi, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Bruno Casella, Iacopo Colonnelli, Gianluca Mittone, Robert Birke, Walter Riviera, Antonio Sciarappa, Carlo Cavazzoni, Marco Aldinucci

A Performance Analysis for Confidential Federated Learning Proceedings Article

In: Proceedings of the 2024 Deep Learning Security and Privacy Workshop, IEEE Symposium on Security and Privacy 2024, San Francisco, CA, 2024.

Abstract | Links | BibTeX | Tags: ai, confidential, epi, icsc

@inproceedings{24:casella:sgx,

title = {A Performance Analysis for Confidential Federated Learning},

author = {Bruno Casella and Iacopo Colonnelli and Gianluca Mittone and Robert Birke and Walter Riviera and Antonio Sciarappa and Carlo Cavazzoni and Marco Aldinucci},

url = {https://iris.unito.it/retrieve/b5877a97-2d8d-4e95-8791-0aa4a1b953b3/DLSP___CONFIDENTIAL_FL.pdf},

doi = {10.1109/SPW63631.2024.00009},

year  = {2024},

date = {2024-05-01},

booktitle = {Proceedings of the 2024 Deep Learning Security and Privacy Workshop, IEEE Symposium on Security and Privacy 2024},

address = {San Francisco, CA},

abstract = {Federated Learning (FL) has emerged as a solution to preserve data privacy by keeping the data locally on each participant's device. However, FL alone is still vulnerable to attacks that can cause privacy leaks. Therefore, it becomes necessary to take additional security measures at the cost of increasing runtimes. The Trusted Execution Environment (TEE) approach promises to offer the highest degree of security during execution. However, TEEs suffer from memory limits which prevent safe end-to-end FL training of modern deep models. State-of-the-art approaches limit secure training to selected layers, failing to avert the full spectrum of attacks or adopt layer-wise training affecting model performance. We benchmark the usage of a library OS (LibOS) to run the full, unmodified end-to-end FL training inside the TEE. We extensively evaluate and model the overhead of the different security mechanisms needed to protect the data and model during computation (TEE), communication (TLS), and storage (disk encryption). The obtained results across three datasets and two models demonstrate that LibOSes are a viable way to seamlessly inject security into FL with limited overhead (at most 2x), offering valuable guidance for researchers and developers aiming to apply FL in data-security-focused contexts.},

keywords = {ai, confidential, epi, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Giulio Malenza, Valentina Cesare, Marco Aldinucci, Ugo Becciani, Alberto Vecchiato

Toward HPC application portability via C++ PSTL: the Gaia AVU-GSR code assessment Journal Article

In: The Journal of Supercomputing, 2024, ISSN: 0920-8542.

Abstract | Links | BibTeX | Tags: eupex, HPC, icsc

Marco Edoardo Santimaria, Samuele Fonio, Giulio Malenza, Iacopo Colonnelli, Marco Aldinucci

Benchmarking Parallelization Models through Karmarkar Interior-point method Proceedings Article

In: González-Vélez, Horacio, Chis, Adriana E. (Ed.): Proc. of 32nd Euromicro Intl. Conference on Parallel, Distributed and Network-based Processing (PDP), pp. 1-8, IEEE, Dublin, Ireland, 2024, ISSN: 2377-5750.

Abstract | Links | BibTeX | Tags: HPC, icsc

Giulio Malenza, Valentina Cesare, Marco Edoardo Santimaria, Robert Birke, Alberto Vecchiato, Ugo Becciani, Marco Aldinucci

Performance portability via C++ PSTL, SYCL, OpenMP, and HIP: the Gaia AVU-GSR case study Proceedings Article

In: SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1152-1163, IEEE, 2024, ISBN: 979-8-3503-5554-3.

Abstract | Links | BibTeX | Tags: eupex, icsc

@inproceedings{Malenza_P3HPC_24,

title = {Performance portability via C++ PSTL, SYCL, OpenMP, and HIP: the Gaia AVU-GSR case study},

author = {Giulio Malenza and Valentina Cesare and Marco Edoardo Santimaria and Robert Birke and Alberto Vecchiato and Ugo Becciani and Marco Aldinucci},

url = {https://conferences.computer.org/sc-wpub/pdfs/SC-W2024-6oZmigAQfgJ1GhPL0yE3pS/555400b152/555400b152.pdf},

doi = {10.1109/SCW63240.2024.00157},

isbn = {979-8-3503-5554-3},

year  = {2024},

date = {2024-01-01},

booktitle = {SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis},

pages = {1152-1163},

publisher = {IEEE},

abstract = {Applications that analyze data from modern scientific experiments will soon require a computing capacity of ExaFLOPs. The current trend to achieve such performance is to employ GPU-accelerated supercomputers and design applications to optimally exploit this hardware. Since each supercomputer is typically a one-off project, the necessity of having computational languages portable across diverse CPU and GPU architectures without performance losses is increasingly compelling. Here, we study the performance portability of the LSQR algorithm as found in the AVU-GSR code of the ESA Gaia mission. This code computes the astrometric parameters of the ~10^8 stars in our Galaxy. The LSQR algorithm is widely used across a broad range of high-performance computing (HPC) applications, elevating the study's relevance beyond the astrophysical domain. We developed different GPU-accelerated ports based on CUDA, C++ PSTL, SYCL, OpenMP, and HIP. We carefully verified the correctness of each port and tuned them to five different GPU-accelerated platforms from NVIDIA and AMD to evaluate the performance portability (PP) in terms of the harmonic mean of the application's performance efficiency across the tested hardware. HIP was demonstrated to be the most portable solution with a 0.94 average PP across the tested problem sizes, closely followed by SYCL coupled with AdaptiveCpp (ACPP) with 0.93. If we only consider NVIDIA platforms, CUDA would be the winner with 0.97. The tuning-oblivious C++ PSTL achieves 0.62 when coupled with vendor-specific compilers.},

keywords = {eupex, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Daniele De Vinco, Andrea Tranquillo, Alessia Antelmi, Carmine Spagnuolo, Vittorio Scarano

High-Performance Computation on a Rust-based distributed ABM engine Proceedings Article

In: Antelmi, Alessia, Carlini, Emanuele, Dazzi, Patrizio (Ed.): Proceedings of BigHPC2024: Special Track on Big Data and High-Performance Computing, co-located with the 3rd Italian Conference on Big Data and Data Science, ITADATA2024, CEUR-WS.org, 2024.

Abstract | Links | BibTeX | Tags: analytics, icsc

Oussama Harrak, Bruno Casella, Samuele Fonio, Piero Fariselli, Gianluca Mittone, Tiziana Sanavia, Marco Aldinucci

Federated AdaBoost for Survival Analysis Proceedings Article

In: Proceedings of the ECML-PKDD Workshop, 2nd workshop on advancements in Federated Learning, Vilnius, Lithuania, 2024.

Abstract | BibTeX | Tags: epi, icsc

Adriano Marques Garcia, Giulio Malenza, Robert Birke, Marco Aldinucci

Assessing Large Language Models Inference Performance on a 64-core RISC-V CPU with Silicon-Enabled Vectors Proceedings Article

Abstract | Links | BibTeX | Tags: eupilot, icsc

Lorenzo Brescia, Iacopo Colonnelli, Marco Aldinucci

Performance Analysis on DNA Alignment Workload with Intel SGX Multithreading Proceedings Article

Abstract | Links | BibTeX | Tags: confidential, icsc

Sunwoo Kim, Soo Yong Lee, Yue Gao, Alessia Antelmi, Mirko Polato, Kijung Shin

A Survey on Hypergraph Neural Networks: An In-Depth and Step-By-Step Guide Proceedings Article

In: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 6534–6544, Association for Computing Machinery, Barcelona, Spain, 2024, ISBN: 9798400704901.

Abstract | Links | BibTeX | Tags: ai, analytics, icsc

Iacopo Colonnelli, Robert Birke, Giulio Malenza, Gianluca Mittone, Alberto Mulone, Jeroen Galjaard, Lydia Y. Chen, Sanzio Bassini, Gabriella Scipione, Jan Martinovič, Vit Vondrák, Marco Aldinucci

Cross-Facility Federated Learning Journal Article

In: Procedia Computer Science, vol. 240, pp. 3–12, 2024, ISSN: 1877-0509.

Abstract | Links | BibTeX | Tags: icsc, space, streamflow

@article{24:eurohpc:xffl,

title = {Cross-Facility Federated Learning},

author = {Iacopo Colonnelli and Robert Birke and Giulio Malenza and Gianluca Mittone and Alberto Mulone and Jeroen Galjaard and Lydia Y. Chen and Sanzio Bassini and Gabriella Scipione and Jan Martinovič and Vit Vondrák and Marco Aldinucci},

url = {https://www.sciencedirect.com/science/article/pii/S1877050924016909},

doi = {10.1016/j.procs.2024.07.003},

issn = {1877-0509},

year  = {2024},

date = {2024-01-01},

booktitle = {Proceedings of the First EuroHPC user day},

journal = {Procedia Computer Science},

volume = {240},

pages = {3–12},

publisher = {Elsevier},

address = {Bruxelles, Belgium},

abstract = {In a decade, AI frontier research transitioned from the researcher's workstation to thousands of high-end hardware-accelerated compute nodes. This rapid evolution shows no signs of slowing down in the foreseeable future. While top cloud providers may be able to keep pace with this growth rate, obtaining and efficiently exploiting computing resources at that scale is a daunting challenge for universities and SMEs. This work introduces the Cross-Facility Federated Learning (XFFL) framework to bridge this compute divide, extending the opportunity to efficiently exploit multiple independent data centres for extreme-scale deep learning tasks to data scientists and domain experts. XFFL relies on hybrid workflow abstractions to decouple tasks from environment-specific technicalities, reducing complexity and enhancing reusability. In addition, Federated Learning (FL) algorithms eliminate the need to move large amounts of data between different facilities, reducing time-to-solution and preserving data privacy. The XFFL approach is empirically evaluated by training a full LLaMAv2 7B instance on two facilities of the EuroHPC JU, showing how the increased computing power completely compensates for the additional overhead introduced by two data centres.},

keywords = {icsc, space, streamflow},

pubstate = {published},

tppubtype = {article}

}

Simon Queyrut, Robert Birke, Pascal Felber, Valerio Schiavoni

CLUES: Collusive Theft of Conditional Generative Adversarial Networks Proceedings Article

In: 43rd International Symposium on Reliable Distributed Systems SRDS, 2024.

BibTeX | Tags: ai, icsc

Lorenzo Brescia, Marco Aldinucci

Secure Generic Remote Workflow Execution with TEEs Proceedings Article

In: Proc. of the 2nd Workshop on Workflows in Distributed Environments (WiDE), pp. 8-13, ACM, Athens, Greece, 2024.

Abstract | Links | BibTeX | Tags: confidential, icsc

Bruno Casella, Roberto Esposito, Antonio Sciarappa, Carlo Cavazzoni, Marco Aldinucci

Experimenting With Normalization Layers in Federated Learning on Non-IID Scenarios Journal Article

In: IEEE Access, vol. 12, pp. 47961-47971, 2024.

Links | BibTeX | Tags: epi, icsc

Daniele De Vinco, Alessia Antelmi, Carmine Spagnuolo, Luca Maria Aiello

Deciphering Conversational Networks: Stance Detection via Hypergraphs and LLMs Proceedings Article

In: Companion Publication of the 16th ACM Web Science Conference, pp. 3–4, Association for Computing Machinery, Stuttgart, Germany, 2024, ISBN: 9798400704536.

Abstract | Links | BibTeX | Tags: analytics, icsc

Alessia Antelmi, Daniele De Vinco, Carmine Spagnuolo

HypergraphRepository: A Community-Driven and Interactive Hypernetwork Data Collection Proceedings Article

In: Dewar, Megan, Kamiński, Bogumił, Kaszyński, Daniel, Kraiński, Łukasz, Prałat, Paweł, Théberge, François, Wrzosek, Małgorzata (Ed.): Modelling and Mining Networks, pp. 159–173, Springer Nature Switzerland, Cham, 2024, ISBN: 978-3-031-59205-8.

Abstract | Links | BibTeX | Tags: analytics, icsc

Alessia Antelmi, Pasquale Caramante, Gennaro Cordasco, Giuseppe D'Ambrosio, Daniele De Vinco, Francesco Foglia, Luca Postiglione, Carmine Spagnuolo

Reliable and Efficient Agent-Based Modeling and Simulation Journal Article

In: Journal of Artificial Societies and Social Simulation, vol. 27, no. 2, pp. 4, 2024, ISSN: 1460-7425.

Abstract | Links | BibTeX | Tags: analytics, icsc

Bruno Casella, Walter Riviera, Marco Aldinucci, Gloria Menegaz

Protocol for training MERGE: A federated multi-input neural network for COVID-19 prognosis Journal Article

In: STAR Protocols, 2024, (https://prod-shared-star-protocols.s3.amazonaws.com/protocols/3225.pdf).

Abstract | Links | BibTeX | Tags: epi, icsc

2023

Alberto Riccardo Martinelli, Massimo Torquati, Marco Aldinucci, Iacopo Colonnelli, Barbara Cantalupo

CAPIO: a Middleware for Transparent I/O Streaming in Data-Intensive Workflows Proceedings Article

In: 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), IEEE, Goa, India, 2023.

Abstract | Links | BibTeX | Tags: admire, capio, eupex, icsc

Marco Aldinucci, Elena Maria Baralis, Valeria Cardellini, Iacopo Colonnelli, Marco Danelutto, Sergio Decherchi, Giuseppe Di Modica, Luca Ferrucci, Marco Gribaudo, Francesco Iannone, Marco Lapegna, Doriana Medić, Giuseppa Muscianisi, Francesca Righetti, Eva Sciacca, Nicola Tonellotto, Mauro Tortonesi, Paolo Trunfio, Tullio Vardanega

A Systematic Mapping Study of Italian Research on Workflows Proceedings Article

In: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023, pp. 2065–2076, ACM, Denver, CO, USA, 2023.

Abstract | Links | BibTeX | Tags: icsc, jupyter-workflow, streamflow

@inproceedings{WORKS2023,

title = {A Systematic Mapping Study of Italian Research on Workflows},

author = {Marco Aldinucci and Elena Maria Baralis and Valeria Cardellini and Iacopo Colonnelli and Marco Danelutto and Sergio Decherchi and Giuseppe Di Modica and Luca Ferrucci and Marco Gribaudo and Francesco Iannone and Marco Lapegna and Doriana Medić and Giuseppa Muscianisi and Francesca Righetti and Eva Sciacca and Nicola Tonellotto and Mauro Tortonesi and Paolo Trunfio and Tullio Vardanega},

url = {https://doi.org/10.1145/3624062.3624285},

doi = {10.1145/3624062.3624285},

year  = {2023},

date = {2023-11-01},

booktitle = {Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023},

pages = {2065–2076},

publisher = {ACM},

address = {Denver, CO, USA},

abstract = {An entire ecosystem of methodologies and tools revolves around scientific workflow management. They cover crucial non-functional requirements that standard workflow models fail to target, such as interactive execution, energy efficiency, performance portability, Big Data management, and intelligent orchestration in the Computing Continuum. Characterizing and monitoring this ecosystem is crucial to develop an informed view of current and future research directions. This work conducts a systematic mapping study of the Italian workflow research community, collecting and analyzing 25 tools and 10 applications from several scientific domains in the context of the ``National Research Centre for HPC, Big Data, and Quantum Computing'' (ICSC). The study aims to outline the main current research directions and determine how they address the critical needs of modern scientific applications. The findings highlight a variegated research ecosystem of tools, with a prominent interest in advanced workflow orchestration and still immature but promising efforts toward energy efficiency.},

keywords = {icsc, jupyter-workflow, streamflow},

pubstate = {published},

tppubtype = {inproceedings}

}

Zilong Zhao, Robert Birke, Lydia Y. Chen

FCT-GAN: Enhancing Global Correlation of Table Synthesis via Fourier Transform Proceedings Article

In: 32nd ACM International Conference on Information and Knowledge Management (CIKM '23), ACM, Birmingham, United Kingdom, 2023.

Abstract | Links | BibTeX | Tags: icsc

@inproceedings{23:zhao:fctgan,

title = {FCT-GAN: Enhancing Global Correlation of Table Synthesis via Fourier Transform},

author = {Zilong Zhao and Robert Birke and Lydia Y. Chen},

url = {https://iris.unito.it/retrieve/966ba767-dbbd-41e1-b4e3-7ab7ba09303f/FCT-GAN.pdf},

doi = {10.1145/3583780.3615202},

year  = {2023},

date = {2023-10-01},

booktitle = {32nd ACM International Conference on Information and Knowledge Management (CIKM '23)},

publisher = {ACM},

address = {Birmingham, United Kingdom},

abstract = {An alternative method for sharing knowledge while complying with strict data access regulations, such as the European General Data Protection Regulation (GDPR), is the emergence of synthetic tabular data. Mainstream table synthesizers utilize methodologies derived from Generative Adversarial Networks (GAN). Although several state-of-the-art (SOTA) tabular GAN algorithms inherit Convolutional Neural Network (CNN)-based architectures, which have proven effective for images, they tend to overlook two critical properties of tabular data: (i) the global correlation across columns, and (ii) the semantic invariance to the column order. Permuting columns in a table does not alter the semantic meaning of the data, but features extracted by CNNs can change significantly due to their limited convolution filter kernel size. To address the above problems, we propose FCT-GAN, the first conditional tabular GAN to adopt Fourier networks into table synthesis. FCT-GAN enhances permutation invariant GAN training by strengthening the learning of global correlations via Fourier layers. Extensive evaluation on benchmarks and real-world datasets shows that FCT-GAN can synthesize tabular data with better (up to 27.8%) machine learning utility (i.e. a proxy of global correlations) and higher (up to 26.5%) statistical similarity to real data. FCT-GAN also has the least variation on synthetic data quality among 7 SOTA baselines on 3 different training-data column orders.},

keywords = {icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Samuele Fonio, Lorenzo Paletto, Mattia Cerrato, Dino Ienco, Roberto Esposito

Hierarchical priors for Hyperspherical Prototypical Networks Proceedings Article

In: 31st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN, Bruges, Belgium, 2023, (In print).

Abstract | Links | BibTeX | Tags: ai, icsc

Samuele Fonio

Benchmarking Federated Learning Frameworks for Medical Imaging Tasks Proceedings Article

In: Foresti, G. L., Fusiello, A., Hancock, E. (Ed.): Image Analysis and Processing - ICIAP 2023 Workshops. ICIAP 2023, Springer, Cham, Udine, Italy, 2023, (In print).

Abstract | Links | BibTeX | Tags: ai, eupilot, icsc

Gianluca Mittone, Samuele Fonio

Benchmarking Federated Learning Scalability Proceedings Article

In: Proceedings of the 2nd Italian Conference on Big Data and Data Science, ITADATA 2023, September 11-13, 2023, CEUR, Naples, Italy, 2023.

Abstract | Links | BibTeX | Tags: eupilot, HPC, icsc

Chi Hong, Jiyue Huang, Robert Birke, Lydia Y. Chen

Exploring and Exploiting Data-Free Model Stealing Proceedings Article

In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Turin, Italy, 2023.

Abstract | Links | BibTeX | Tags: eupilot, icsc

@inproceedings{23:hong:datafree,

title = {Exploring and Exploiting Data-Free Model Stealing},

author = {Chi Hong and Jiyue Huang and Robert Birke and Lydia Y. Chen},

url = {https://iris.unito.it/retrieve/ce44dec6-12c9-443d-99e7-f1141e50aa3a/Data-free%20Model%20Stealing.pdf},

doi = {10.1007/978-3-031-43424-2_2},

year  = {2023},

date = {2023-09-01},

booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)},

address = {Turin, Italy},

abstract = {Deep machine learning models, e.g., image classifiers, are increasingly deployed in the wild to provide services to users. Adversaries are shown capable of stealing the knowledge of these models by sending inference queries and then training substitute models based on query results. The availability and quality of adversarial query inputs are undoubtedly crucial in the stealing process. The recent prior art demonstrates the feasibility of replacing real data by exploring the synthetic adversarial queries, so-called data-free attacks, under strong adversarial assumptions, i.e., the deployed classifier returns not only class labels but also class probabilities. In this paper, we consider a general adversarial model and propose an effective data-free stealing algorithm, TandemGAN, which not only explores synthetic queries but also explicitly exploits the high quality ones. The core of TandemGAN is composed of (i) substitute model which imitates the target model through synthetic queries and their inferred labels; and (ii) a tandem generator consisting of two networks, Gx and Ge, which first explores the synthetic data space via Gx and then exploits high-quality examples via Ge to maximize the knowledge transfer from the target to the substitute model. Our results on four datasets show that the accuracy of our trained substitute model ranges between 96-67% of the target model and outperforms the existing state-of-the-art data-free model stealing approach by up to 2.5X.},

keywords = {eupilot, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Gianluca Mittone, Walter Riviera, Iacopo Colonnelli, Robert Birke, Marco Aldinucci

Model-Agnostic Federated Learning Proceedings Article

In: Euro-Par 2023: Parallel Processing, pp. 383–396, Springer, Limassol, Cyprus, 2023.

Abstract | Links | BibTeX | Tags: ai, confidential, eupilot, icsc, riscv

Iacopo Colonnelli, Robert Birke, Marco Aldinucci

Experimenting with PyTorch on RISC-V Proceedings Article

In: RISC-V Summit Europe 2023, Barcelona, Spain, 2023, (Poster).

Abstract | Links | BibTeX | Tags: eupilot, icsc, riscv

Marco Aldinucci, Robert Birke, Antonio Brogi, Emanuele Carlini, Massimo Coppola, Marco Danelutto, Patrizio Dazzi, Luca Ferrucci, Stefano Forti, Hanna Kavalionak, Gabriele Mencagli, Matteo Mordacchini, Marcelo Pasin, Federica Paganelli, Massimo Torquati

A Proposal for a Continuum-aware Programming Model: From Workflows to Services Autonomously Interacting in the Compute Continuum Proceedings Article

In: 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), IEEE, Turin, Italy, 2023.

Abstract | Links | BibTeX | Tags: icsc

@inproceedings{23:aldinucci:continuum,

title = {A Proposal for a Continuum-aware Programming Model: From Workflows to Services Autonomously Interacting in the Compute Continuum},

author = {Marco Aldinucci and Robert Birke and Antonio Brogi and Emanuele Carlini and Massimo Coppola and Marco Danelutto and Patrizio Dazzi and Luca Ferrucci and Stefano Forti and Hanna Kavalionak and Gabriele Mencagli and Matteo Mordacchini and Marcelo Pasin and Federica Paganelli and Massimo Torquati},

url = {https://iris.unito.it/retrieve/2ae13a33-5814-43da-8ea6-2d3e8b122384/Continuum-aware-PM.pdf},

doi = {10.1109/COMPSAC57700.2023.00287},

year  = {2023},

date = {2023-06-01},

booktitle = {2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC)},

publisher = {IEEE},

address = {Turin, Italy},

abstract = {This paper proposes a continuum-aware programming model enabling the execution of application workflows across the compute continuum: cloud, fog and edge resources. It simplifies the management of heterogeneous nodes while alleviating the burden of programmers and unleashing innovation. This model optimizes the continuum through advanced development experiences by transforming workflows into autonomous service collaborations. It reduces complexity in positioning/interconnecting services across the continuum. A meta-model introduces high-level workflow descriptions as service networks with defined contracts and quality of service, thus enabling the deployment/management of workflows as first-class entities. It also provides automation based on policies, monitoring and heuristics. Tailored mechanisms orchestrate/manage services across the continuum, optimizing performance, cost, data protection and sustainability while managing risks. This model facilitates incremental development with visibility of design impacts and seamless evolution of applications and infrastructures. In this work, we explore this new computing paradigm showing how it can trigger the development of a new generation of tools to support the compute continuum progress.},

keywords = {icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Gianluca Mittone, Nicolò Tonci, Robert Birke, Iacopo Colonnelli, Doriana Medić, Andrea Bartolini, Roberto Esposito, Emanuele Parisi, Francesco Beneventi, Mirko Polato, Massimo Torquati, Luca Benini, Marco Aldinucci

Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning Proceedings Article

In: 20th ACM International Conference on Computing Frontiers (CF '23), ACM, Bologna, Italy, 2023, ISBN: 979-8-4007-0140-5/23/05, (https://arxiv.org/abs/2302.07946).

Abstract | Links | BibTeX | Tags: ai, confidential, eupilot, HPC, icsc, riscv

Gianluca Mittone, Filip Svoboda, Marco Aldinucci, Nicholas D. Lane, Pietro Lio

A Federated Learning Benchmark for Drug-Target Interaction Proceedings Article

In: Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), ACM, Austin, Texas, 2023, ISBN: 978-1-4503-9419-2/23/04, (https://arxiv.org/abs/2302.07684).

Abstract | Links | BibTeX | Tags: ai, confidential, eupilot, icsc

Bruno Casella, Roberto Esposito, Antonio Sciarappa, Carlo Cavazzoni, Marco Aldinucci

Experimenting with Normalization Layers in Federated Learning on non-IID scenarios Technical Report

Computer Science Department, University of Torino 2023.

Abstract | Links | BibTeX | Tags: confidential, epi, icsc

Alessia Antelmi, Luca La Cava, Arianna Pera

Tell Me Who You Are and I Will Predict Your Vulnerability to Political Persuasion Techniques Proceedings Article

In: The 12th International Conference on Complex Networks and their Applications-Book of Abstracts, 2023.

Abstract | Links | BibTeX | Tags: analytics, icsc

Alessia Antelmi, Luca La Cava, Arianna Pera

Finding Hidden Swingers in the 2022 Italian Elections Twitter Discourse Proceedings Article

In: The 12th International Conference on Complex Networks and their Applications-Book of Abstracts, 2023.

Abstract | Links | BibTeX | Tags: analytics, icsc

Alessia Antelmi, Massimo Torquati, Daniele Gregori, Francesco Polzella, Gianmarco Spinatelli, Marco Aldinucci

The SWH-Analytics Framework Proceedings Article

In: Bena, Nicola, Martino, Beniamino Di, Maratea, Antonio, Sperduti, Alessandro, Nardo, Emanuel Di, Ciaramella, Angelo, Montella, Raffaele, Ardagna, Claudio A. (Ed.): Proceedings of the 2nd Italian Conference on Big Data and Data Science (ITADATA 2023), Naples, Italy, September 11-13, 2023, CEUR-WS.org, 2023.

Abstract | Links | BibTeX | Tags: admire, analytics, icsc

Iacopo Colonnelli

Workflow Models for Heterogeneous Distributed Systems Proceedings Article

Abstract | Links | BibTeX | Tags: across, eupex, icsc, jupyter-workflow, streamflow

Bruno Casella, Lorenzo Paletto

Predicting Cryptocurrencies Market Phases through On-Chain Data Long-Term Forecasting Proceedings Article

In: Proceedings of the 2023 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 1-5 May 2023, Dubai, 2023, (https://ieeexplore.ieee.org/document/10174989).

Abstract | Links | BibTeX | Tags: epi, icsc

Bruno Casella, Samuele Fonio

Architecture-Based FedAvg for Vertical Federated Learning Proceedings Article

In: Proceedings of the 3rd Workshop on Distributed Machine Learning for the Intelligent Computing Continuum (DML-ICC), IEEE/ACM UCC 2023, Taormina, Italy, 4 December 2023, 2023, (https://iris.unito.it/bitstream/2318/1949730/1/HALF_HVL_for_DML_ICC23___Taormina-2.pdf).

Abstract | Links | BibTeX | Tags: ai, epi, icsc

@inproceedings{23:casella:architecturalfedavg,

title = {Architecture-Based FedAvg for Vertical Federated Learning},

author = {Bruno Casella and Samuele Fonio},

url = {https://iris.unito.it/retrieve/173d9960-8531-419d-9bd5-5acce6694c4e/Aggregation%20Based%20VFL.pdf},

doi = {10.1145/3603166.3632559},

year  = {2023},

date = {2023-01-01},

booktitle = {Proceedings of the 3rd Workshop on Distributed Machine Learning for the Intelligent Computing Continuum (DML-ICC), IEEE/ACM UCC 2023, Taormina, Italy, 4 December 2023},

abstract = {Federated Learning (FL) has emerged as a promising solution to address privacy concerns by collaboratively training Deep Learning (DL) models across distributed parties. This work proposes an architecture-based aggregation strategy in Vertical FL, where parties hold data with different attributes but shared instances. Our approach leverages the identical architectural parts, i.e., neural network layers, of different models to selectively aggregate weights, which is particularly relevant when collaborating with institutions holding different types of datasets, i.e., image, text, or tabular datasets. In a scenario where two entities train DL models, such as a Convolutional Neural Network (CNN) and a Multi-Layer Perceptron (MLP), our strategy computes the average only for architecturally identical segments. This preserves data-specific features learned from demographic and clinical data. We tested our approach on two clinical datasets, i.e., the COVID-CXR dataset and the ADNI study. Results show that our method performs comparably to the centralized scenario, in which all the data are collected in a single data lake, and benefits from FL generalizability. In particular, compared to the non-federated models, our proposed proof-of-concept model exhibits a slight performance loss on the COVID-CXR dataset (less than 8%), but outperforms ADNI models by up to 12%. Moreover, communication costs between training rounds are minimized by exchanging only the dense layer parameters.},

note = {https://iris.unito.it/bitstream/2318/1949730/1/HALF_HVL_for_DML_ICC23___Taormina-2.pdf},

keywords = {ai, epi, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Bruno Casella, Walter Riviera, Marco Aldinucci, Gloria Menegaz

MERGE: A model for multi-input biomedical federated learning Journal Article

In: Patterns, pp. 100856, 2023, ISSN: 2666-3899.

Abstract | Links | BibTeX | Tags: ai, epi, icsc

Pedro Ângelo, Viviana Bono, Mariangiola Dezani-Ciancaglini, Mário Florido

Gradual Guarantee for FJ with lambda-Expressions Proceedings Article

In: Tomb, Aaron (Ed.): Proceedings of the 25th ACM International Workshop on Formal Techniques for Java-like Programs, FTfJP 2023, Seattle, WA, USA, 18 July 2023, pp. 32–38, ACM, 2023.

Links | BibTeX | Tags: admire, icsc

William Fornaciari, Federico Reghenzani, Federico Terraneo, Davide Baroffio, Cecilia Metra, Martin Omana, Josie E. Rodriguez Condia, Matteo Sonza Reorda, Robert Birke, Iacopo Colonnelli, Gianluca Mittone, Marco Aldinucci, Gabriele Mencagli, Francesco Iannone, Filippo Palombi, Giuseppe Zummo, Daniele Cesarini, Federico Tesser

RISC-V-based Platforms for HPC: Analyzing Non-functional Properties for Future HPC and Big-Data Clusters Proceedings Article

In: Embedded Computer Systems: Architectures, Modeling, and Simulation - 23rd International Conference, SAMOS 2023, Samos, Greece, 2023, (icsc).

Abstract | Links | BibTeX | Tags: icsc, riscv

Alessia Antelmi, Daniele De Vinco, Gennaro Cordasco, Carmine Spagnuolo

Towards Unraveling Developers Communities in Stack Overflow and Reddit Proceedings Article

In: International Conference on Computational Social Science 2023, 2023.

Abstract | Links | BibTeX | Tags: analytics, icsc

Alessia Antelmi

Engagement in Open Data Workshops: The dark side of remote settings Proceedings Article

In: Methodologies and Intelligent Systems for Technology Enhanced Learning, 12th International Conference, Springer International Publishing, Cham, 2023.

Abstract | Links | BibTeX | Tags: analytics, icsc

Doriana Medić, Marco Aldinucci

Towards formal model for location aware workflows Proceedings Article

In: Shahriar, Hossain, Teranishi, Yuuichi, Cuzzocrea, Alfredo, Sharmin, Moushumi, Towey, Dave, Majumder, A. K. M. Jahangir Alam, Kashiwazaki, Hiroki, Yang, Ji-Jiang, Takemoto, Michiharu, Sakib, Nazmus, Banno, Ryohei, Ahamed, Sheikh Iqbal (Ed.): 47th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2023, pp. 1864–1869, IEEE, Torino, Italy, 2023.

Abstract | Links | BibTeX | Tags: eupex, icsc, semantics

Alberto Mulone, Sherine Awad, Davide Chiarugi, Marco Aldinucci

Porting the Variant Calling Pipeline for NGS data in cloud-HPC environment Proceedings Article

Abstract | Links | BibTeX | Tags: across, icsc, streamflow

WE ARE HIRING! If you are Research Engineers, Ph.D. Candidates and Post-Doctoral Researchers send your CV to alpha@di.unito.it
