Papers | Parallel Computing

2024

Gianluca Mittone, Giulio Malenza, Marco Aldinucci, Robert Birke

Distributed Edge Inference: an Experimental Study on Multiview Detection Proceedings Article

In: Proc. of the 16th IEEE/ACM Intl. Conference on Utility and Cloud Computing Companion (UCC), pp. 1-6, ACM, Taormina, Italy, 2024, (eupilot, icsc).

Tags: ai, eupilot, icsc

Miruna Bețianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen

DALLMi: Domain Adaption for LLM-based Multi-label Classifier Proceedings Article

In: Yang, De-Nian, Xie, Xing, Tseng, Vincent S., Pei, Jian, Huang, Jen-Wei, Lin, Jerry Chun-Wei (Ed.): Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 277–289, Springer, Taipei, Taiwan, 2024.

Tags: ai, eupilot, icsc

@inproceedings{24:betianu:llm,

title = {DALLMi: Domain Adaption for LLM-based Multi-label Classifier},

author = {Miruna Bețianu and Abele Mălan and Marco Aldinucci and Robert Birke and Lydia Chen},

editor = {De-Nian Yang and Xing Xie and Vincent S. Tseng and Jian Pei and Jen-Wei Huang and Jerry Chun-Wei Lin},

url = {https://hdl.handle.net/2318/1976672},

doi = {10.1007/978-981-97-2259-4_21},

year  = {2024},

date = {2024-05-01},

booktitle = {Proceedings of the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining},

volume = {14647},

pages = {277--289},

publisher = {Springer},

address = {Taipei, Taiwan},

series = {Lecture Notes in Computer Science},

abstract = {Large language models (LLMs) increasingly serve as the backbone for classifying text associated with distinct domains and simultaneously several labels (classes). When encountering domain shifts, e.g., classifier of movie reviews from IMDb to Rotten Tomatoes, adapting such an LLM-based multi-label classifier is challenging due to incomplete label sets at the target domain and daunting training overhead. The existing domain adaptation methods address either image multi-label classifiers or text binary classifiers. In this paper, we design DALLMi, Domain Adaptation Large Language Model interpolator, a first-of-its-kind semi-supervised domain adaptation method for text data models based on LLMs, specifically BERT. The core of DALLMi is the novel variation loss and MixUp regularization, which jointly leverage the limited positively labeled and large quantity of unlabeled text and, importantly, their interpolation from the BERT word embeddings. DALLMi also introduces a label-balanced sampling strategy to overcome the imbalance between labeled and unlabeled data. We evaluate DALLMi against the partial-supervised and unsupervised approach on three datasets under different scenarios of label availability for the target domain. Our results show that DALLMi achieves higher mAP than unsupervised and partially-supervised approaches by 19.9% and 52.2%, respectively.},

keywords = {ai, eupilot, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Adriano Marques Garcia, Giulio Malenza, Robert Birke, Marco Aldinucci

Assessing Large Language Models Inference Performance on a 64-core RISC-V CPU with Silicon-Enabled Vectors Proceedings Article

In: Antelmi, Alessia, Carlini, Emanuele, Dazzi, Patrizio (Ed.): Proceedings of BigHPC2024: Special Track on Big Data and High-Performance Computing, co-located with the 3rd Italian Conference on Big Data and Data Science, ITADATA2024, pp. 1-9, CEUR-WS.org, Pisa, Italy, 2024.

Tags: eupilot, icsc

2023

Samuele Fonio

Benchmarking Federated Learning Frameworks for Medical Imaging Tasks Proceedings Article

In: Foresti, G. L., Fusiello, A., Hancock, E. (Ed.): Image Analysis and Processing - ICIAP 2023 Workshops. ICIAP 2023, Springer, Cham, Udine, Italy, 2023, (In print).

Tags: ai, eupilot, icsc

Gianluca Mittone, Samuele Fonio

Benchmarking Federated Learning Scalability Proceedings Article

In: Proceedings of the 2nd Italian Conference on Big Data and Data Science, ITADATA 2023, September 11-13, 2023, CEUR, Naples, Italy, 2023.

Tags: eupilot, HPC, icsc

Chi Hong, Jiyue Huang, Robert Birke, Lydia Y. Chen

Exploring and Exploiting Data-Free Model Stealing Proceedings Article

In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Turin, Italy, 2023.

Tags: eupilot, icsc

@inproceedings{23:hong:datafree,

title = {Exploring and Exploiting Data-Free Model Stealing},

author = {Chi Hong and Jiyue Huang and Robert Birke and Lydia Y. Chen},

url = {https://iris.unito.it/retrieve/ce44dec6-12c9-443d-99e7-f1141e50aa3a/Data-free%20Model%20Stealing.pdf},

doi = {10.1007/978-3-031-43424-2_2},

year  = {2023},

date = {2023-09-01},

booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)},

address = {Turin, Italy},

abstract = {Deep machine learning models, e.g., image classifiers, are increasingly deployed in the wild to provide services to users. Adversaries are shown capable of stealing the knowledge of these models by sending inference queries and then training substitute models based on query results. The availability and quality of adversarial query inputs are undoubtedly crucial in the stealing process. The recent prior art demonstrates the feasibility of replacing real data by exploring the synthetic adversarial queries, so-called data-free attacks, under strong adversarial assumptions, i.e., the deployed classifier returns not only class labels but also class probabilities. In this paper, we consider a general adversarial model and propose an effective data-free stealing algorithm, TandemGAN, which not only explores synthetic queries but also explicitly exploits the high quality ones. The core of TandemGAN is composed of (i) a substitute model which imitates the target model through synthetic queries and their inferred labels; and (ii) a tandem generator consisting of two networks, Gx and Ge, which first explores the synthetic data space via Gx and then exploits high-quality examples via Ge to maximize the knowledge transfer from the target to the substitute model. Our results on four datasets show that the accuracy of our trained substitute model ranges between 67% and 96% of the target model and outperforms the existing state-of-the-art data-free model stealing approach by up to 2.5X.},

keywords = {eupilot, icsc},

pubstate = {published},

tppubtype = {inproceedings}

}

Gianluca Mittone, Walter Riviera, Iacopo Colonnelli, Robert Birke, Marco Aldinucci

Model-Agnostic Federated Learning Proceedings Article

In: Euro-Par 2023: Parallel Processing, pp. 383–396, Springer, Limassol, Cyprus, 2023.

Tags: ai, confidential, eupilot, icsc, riscv

Iacopo Colonnelli, Robert Birke, Marco Aldinucci

Experimenting with PyTorch on RISC-V Proceedings Article

In: RISC-V Summit Europe 2023, Barcelona, Spain, 2023, (Poster).

Tags: eupilot, icsc, riscv

Gianluca Mittone, Nicolò Tonci, Robert Birke, Iacopo Colonnelli, Doriana Medić, Andrea Bartolini, Roberto Esposito, Emanuele Parisi, Francesco Beneventi, Mirko Polato, Massimo Torquati, Luca Benini, Marco Aldinucci

Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning Proceedings Article

In: 20th ACM International Conference on Computing Frontiers (CF '23), ACM, Bologna, Italy, 2023, ISBN: 979-8-4007-0140-5/23/05, (https://arxiv.org/abs/2302.07946).

Tags: ai, confidential, eupilot, HPC, icsc, riscv

Gianluca Mittone, Filip Svoboda, Marco Aldinucci, Nicholas D. Lane, Pietro Lio

A Federated Learning Benchmark for Drug-Target Interaction Proceedings Article

In: Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), ACM, Austin, Texas, 2023, ISBN: 978-1-4503-9419-2/23/04, (https://arxiv.org/abs/2302.07684).

Tags: ai, confidential, eupilot, icsc

Marco Aldinucci, Mirko Polato, Roberto Esposito

Boosting Methods for Federated Learning Proceedings Article

In: Calvanese, Diego, Diamantini, Claudia, Ferro, Nicola, Marchesin, Stefano, Silvello, Gianmaria, Tanca, Letizia (Ed.): Proc. of the 31st Italian Symposium on Advanced Database Systems, SEBD 2023, pp. 439–448, CEUR-WS.org, 2023.

Tags: eupilot

@inproceedings{DBLP:conf/sebd/Esposito23,

title = {Boosting Methods for Federated Learning},

author = {Marco Aldinucci and Mirko Polato and Roberto Esposito},

editor = {Diego Calvanese and Claudia Diamantini and Nicola Ferro and Stefano Marchesin and Gianmaria Silvello and Letizia Tanca},

url = {https://ceur-ws.org/Vol-3478/paper48.pdf},

year  = {2023},

date = {2023-01-01},

booktitle = {Proc. of the 31st Italian Symposium on Advanced Database Systems, SEBD 2023},

pages = {439--448},

publisher = {CEUR-WS.org},

series = {CEUR Workshop Proceedings},

abstract = {Federated Learning (FL) has been proposed to develop better AI systems without compromising the privacy of final users and the legitimate interests of private companies. Initially deployed by Google to predict text input on mobile devices, FL has been deployed in many other industries. Since its introduction, Federated Learning mainly exploited the inner working of neural networks and other gradient descent-based algorithms by either exchanging the weights of the model or the gradients computed during learning. While this approach has been very successful, it rules out applying FL in contexts where other models are preferred, e.g., easier to interpret or known to work better. This paper proposes to leverage distributed versions of the AdaBoost algorithm to acquire strong federated models. In contrast with previous approaches, our proposal does not put any constraint on the client-side learning models and does not rely on inner workings of the learning algorithms used in the clients. We perform a large set of experiments on ten UCI datasets, comparing the algorithms in six non-iidness settings. Results show that the approach is effective, in the case of an IID setting, results are often near to the theoretical optimum (i.e., the performances of AdaBoost on the complete dataset). In case of non-IID settings, results very much depend on the severity of the non-IIDness.},

keywords = {eupilot},

pubstate = {published},

tppubtype = {inproceedings}

}

Iacopo Colonnelli, Bruno Casella, Gianluca Mittone, Yasir Arfat, Barbara Cantalupo, Roberto Esposito, Alberto Riccardo Martinelli, Doriana Medić, Marco Aldinucci

Federated Learning meets HPC and cloud Proceedings Article

In: Bufano, Filomena, Riggi, Simone, Sciacca, Eva, Schillirò, Francesco (Ed.): Astrophysics and Space Science Proceedings, pp. 193–199, Springer, Catania, Italy, 2023, ISBN: 978-3-031-34167-0, (Keynote talk).

Tags: across, eupilot, streamflow

2022

Mirko Polato, Roberto Esposito, Marco Aldinucci

Boosting the Federation: Cross-Silo Federated Learning without Gradient Descent Proceedings Article

In: Intl. Joint Conference on Neural Networks (IJCNN), pp. 1–10, IEEE, Padua, Italy, 2022.

Tags: eupilot, hpc4ai

Bruno Casella, Roberto Esposito, Carlo Cavazzoni, Marco Aldinucci

Benchmarking FedAvg and FedCurv for Image Classification Tasks Proceedings Article

In: Anisetti, Marco, Bonifati, Angela, Bena, Nicola, Ardagna, Claudio, Malerba, Donato (Ed.): Proceedings of the 1st Italian Conference on Big Data and Data Science, ITADATA 2022, September 20-21, 2022, CEUR-WS.org, 2022.

Tags: eupilot

@inproceedings{casella2022benchmarking,

title = {Benchmarking FedAvg and FedCurv for Image Classification Tasks},

author = {Bruno Casella and Roberto Esposito and Carlo Cavazzoni and Marco Aldinucci},

editor = {Marco Anisetti and Angela Bonifati and Nicola Bena and Claudio Ardagna and Donato Malerba},

url = {https://ceur-ws.org/Vol-3340/paper40.pdf},

year  = {2022},

date = {2022-01-01},

booktitle = {Proceedings of the 1st Italian Conference on Big Data and Data Science, ITADATA 2022, September 20-21, 2022},

volume = {3340},

publisher = {CEUR-WS.org},

series = {CEUR Workshop Proceedings},

abstract = {Classic Machine Learning (ML) techniques require training on data available in a single data lake (either centralized or distributed). However, aggregating data from different owners is not always convenient for different reasons, including security, privacy and secrecy. Data carry a value that might vanish when shared with others; the ability to avoid sharing the data enables industrial applications where security and privacy are of paramount importance, making it possible to train global models by implementing only local policies which can be run independently and even on air-gapped data centres. Federated Learning (FL) is a distributed machine learning approach which has emerged as an effective way to address privacy concerns by only sharing local AI models while keeping the data decentralized. Two critical challenges of Federated Learning are managing the heterogeneous systems in the same federated network and dealing with real data, which are often not independently and identically distributed (non-IID) among the clients. In this paper, we focus on the second problem, i.e., the problem of statistical heterogeneity of the data in the same federated network. In this setting, local models might be strayed far from the local optimum of the complete dataset, thus possibly hindering the convergence of the federated model. Several Federated Learning algorithms, such as FedAvg, FedProx and Federated Curvature (FedCurv), aiming at tackling the non-IID setting, have already been proposed. This work provides an empirical assessment of the behaviour of FedAvg and FedCurv in common non-IID scenarios. Results show that the number of epochs per round is an important hyper-parameter that, when tuned appropriately, can lead to significant performance gains while reducing the communication cost. As a side product of this work, we release the non-IID version of the datasets we used so to facilitate further comparisons from the FL community.},

keywords = {eupilot},

pubstate = {published},

tppubtype = {inproceedings}

}

2021

Marco Aldinucci, Giovanni Agosta, Antonio Andreini, Claudio A. Ardagna, Andrea Bartolini, Alessandro Cilardo, Biagio Cosenza, Marco Danelutto, Roberto Esposito, William Fornaciari, Roberto Giorgi, Davide Lengani, Raffaele Montella, Mauro Olivieri, Sergio Saponara, Daniele Simoni, Massimo Torquati

The Italian research on HPC key technologies across EuroHPC Proceedings Article

In: ACM Computing Frontiers, pp. 279–286, ACM, Virtual Conference, Italy, 2021.

Tags: admire, eupex, eupilot, textarossa

@inproceedings{21:CINI_acm_CF,

title = {The Italian research on HPC key technologies across EuroHPC},

author = {Marco Aldinucci and Giovanni Agosta and Antonio Andreini and Claudio A. Ardagna and Andrea Bartolini and Alessandro Cilardo and Biagio Cosenza and Marco Danelutto and Roberto Esposito and William Fornaciari and Roberto Giorgi and Davide Lengani and Raffaele Montella and Mauro Olivieri and Sergio Saponara and Daniele Simoni and Massimo Torquati},

url = {https://iris.unito.it/retrieve/handle/2318/1783118/744641/preprint.pdf},

doi = {10.1145/3457388.3458508},

year  = {2021},

date = {2021-05-01},

booktitle = {ACM Computing Frontiers},

pages = {279--286},

publisher = {ACM},

address = {Virtual Conference, Italy},

abstract = {High-Performance Computing (HPC) is one of the strategic priorities for research and innovation worldwide due to its relevance for industrial and scientific applications. We envision HPC as composed of three pillars: infrastructures, applications, and key technologies and tools. While infrastructures are by construction centralized in large-scale HPC centers, and applications are generally within the purview of domain-specific organizations, key technologies fall in an intermediate case where coordination is needed, but design and development are often decentralized. A large group of Italian researchers has started a dedicated laboratory within the National Interuniversity Consortium for Informatics (CINI) to address this challenge. The laboratory, albeit young, has managed to succeed in its first attempts to propose a coordinated approach to HPC research within the EuroHPC Joint Undertaking, participating in the calls 2019-20 to five successful proposals for an aggregate total cost of 95M Euro. In this paper, we outline the working group's scope and goals and provide an overview of the five funded projects, which become fully operational in March 2021, and cover a selection of key technologies provided by the working group partners, highlighting their usage development within the projects.},

keywords = {admire, eupex, eupilot, textarossa},

pubstate = {published},

tppubtype = {inproceedings}

}
