
Ph.D. student in Modeling and Data Science, University of Turin
Parallel Computing group
Via Pessinetto 12, 10149 Torino – Italy
Email: bruno.casella@unito.it
Short Bio
Bruno Casella is a Ph.D. student in Modeling and Data Science at the University of Turin (UniTO), funded by Leonardo Company.
He graduated in Computer Engineering in 2020 with a thesis on the performance of AlphaZero in different scenarios.
He received his Master's Degree in Data Science for Management in 2021 with a thesis on Federated Transfer Learning.
Fields of interest
- Federated Learning
- Deep Learning
- High Performance Computing
Publications
2023
- O. de Filippo, F. Bruno, T. H. Pinxterhuis, M. Gąsior, L. Perl, L. Gaido, D. Tuttolomondo, A. Greco, R. Verardi, G. Lo Martire, M. Iannaccone, A. Leone, G. Liccardo, S. Caglioni, R. González Ferreiro, G. Rodinò, G. Musumeci, G. Patti, I. Borzillo, G. Tarantini, W. Wańha, B. Casella, E. H. Ploumen, Ł. Pyka, R. Kornowski, A. Gagnor, R. Piccolo, S. R. Roubin, D. Capodanno, P. Zocca, F. Conrotto, G. M. De Ferrari, C. von Birgelen, and F. D’Ascenzo, “Predictors of target lesion failure after treatment of left main, bifurcation, or chronic total occlusion lesions with ultrathin-strut drug-eluting coronary stents in the ULTRA registry,” Catheterization and Cardiovascular Interventions, 2023. doi:10.1002/ccd.30696
[BibTeX] [Abstract] [Download PDF]
Background: Data about the long-term performance of new-generation ultrathin-strut drug-eluting stents (DES) in challenging coronary lesions, such as left main (LM), bifurcation, and chronic total occlusion (CTO) lesions are scant. Methods: The international multicenter retrospective observational ULTRA study included consecutive patients treated from September 2016 to August 2021 with ultrathin-strut (<70µm) DES in challenging de novo lesions. Primary endpoint was target lesion failure (TLF): composite of cardiac death, target-lesion revascularization (TLR), target-vessel myocardial infarction (TVMI), or definite stent thrombosis (ST). Secondary endpoints included all-cause death, acute myocardial infarction (AMI), target vessel revascularization, and TLF components. TLF predictors were assessed with Cox multivariable analysis. Results: Of 1801 patients (age: 66.6±11.2 years; male: 1410 [78.3%]), 170 (9.4%) experienced TLF during follow-up of 3.1±1.4 years. In patients with LM, CTO, and bifurcation lesions, TLF rates were 13.5%, 9.9%, and 8.9%, respectively. Overall, 160 (8.9%) patients died (74 [4.1%] from cardiac causes). AMI and TVMI rates were 6.0% and 3.2%, respectively. ST occurred in 11 (1.1%) patients while 77 (4.3%) underwent TLR. Multivariable analysis identified the following predictors of TLF: age, STEMI with cardiogenic shock, impaired left ventricular ejection fraction, diabetes, and renal dysfunction. Among the procedural variables, total stent length increased TLF risk (HR: 1.01, 95% CI: 1-1.02 per mm increase), while intracoronary imaging reduced the risk substantially (HR: 0.35, 95% CI: 0.12-0.82). Conclusions: Ultrathin-strut DES showed high efficacy and satisfactory safety, even in patients with challenging coronary lesions. Yet, despite using contemporary gold-standard DES, the association persisted between established patient- and procedure-related features of risk and impaired 3-year clinical outcome.
@article{23:casella:ultra, abstract = {Background: Data about the long-term performance of new-generation ultrathin-strut drug-eluting stents (DES) in challenging coronary lesions, such as left main (LM), bifurcation, and chronic total occlusion (CTO) lesions are scant. Methods: The international multicenter retrospective observational ULTRA study included consecutive patients treated from September 2016 to August 2021 with ultrathin-strut (<70µm) DES in challenging de novo lesions. Primary endpoint was target lesion failure (TLF): composite of cardiac death, target-lesion revascularization (TLR), target-vessel myocardial infarction (TVMI), or definite stent thrombosis (ST). Secondary endpoints included all-cause death, acute myocardial infarction (AMI), target vessel revascularization, and TLF components. TLF predictors were assessed with Cox multivariable analysis. Results: Of 1801 patients (age: 66.6±11.2 years; male: 1410 [78.3%]), 170 (9.4%) experienced TLF during follow-up of 3.1±1.4 years. In patients with LM, CTO, and bifurcation lesions, TLF rates were 13.5%, 9.9%, and 8.9%, respectively. Overall, 160 (8.9%) patients died (74 [4.1%] from cardiac causes). AMI and TVMI rates were 6.0% and 3.2%, respectively. ST occurred in 11 (1.1%) patients while 77 (4.3%) underwent TLR. Multivariable analysis identified the following predictors of TLF: age, STEMI with cardiogenic shock, impaired left ventricular ejection fraction, diabetes, and renal dysfunction. Among the procedural variables, total stent length increased TLF risk (HR: 1.01, 95% CI: 1-1.02 per mm increase), while intracoronary imaging reduced the risk substantially (HR: 0.35, 95% CI: 0.12-0.82). Conclusions: Ultrathin-strut DES showed high efficacy and satisfactory safety, even in patients with challenging coronary lesions. Yet, despite using contemporary gold-standard DES, the association persisted between established patient- and procedure-related features of risk and impaired 3-year clinical outcome.}, author = {de Filippo, Ovidio and Bruno, Francesco and Pinxterhuis, Tineke H. and Gąsior, Mariusz and Perl, Leor and Gaido, Luca and Tuttolomondo, Domenico and Greco, Antonio and Verardi, Roberto and Lo Martire, Gianluca and Iannaccone, Mario and Leone, Attilio and Liccardo, Gaetano and Caglioni, Serena and González Ferreiro, Rocio and Rodinò, Giulio and Musumeci, Giuseppe and Patti, Giuseppe and Borzillo, Irene and Tarantini, Giuseppe and Wańha, Wojciech and Casella, Bruno and Ploumen, Eline H and Pyka, Łukasz and Kornowski, Ran and Gagnor, Andrea and Piccolo, Raffaele and Roubin, Sergio Raposeiras and Capodanno, Davide and Zocca, Paolo and Conrotto, Federico and De Ferrari, Gaetano M and von Birgelen, Clemens and D'Ascenzo, Fabrizio}, doi = {10.1002/ccd.30696}, journal = {Catheterization and Cardiovascular Interventions}, title = {Predictors of target lesion failure after treatment of left main, bifurcation, or chronic total occlusion lesions with ultrathin-strut drug-eluting coronary stents in the ULTRA registry}, url = {https://onlinelibrary.wiley.com/doi/full/10.1002/ccd.30696}, year = {2023} }
- B. Casella, W. Riviera, M. Aldinucci, and G. Menegaz, "MiFL: multi-input neural networks in federated learning," Computer Science Department, University of Torino, 2023. https://www.techrxiv.org/articles/preprint/MiFL_Multi-Input_Neural_Networks_in_Federated_Learning/22492732
[BibTeX] [Abstract] [Download PDF]
Driven by the Deep Learning (DL) revolution, Artificial intelligence (AI) has become a fundamental tool for many Bio-Medical tasks, including AI-assisted diagnosis. These include analysing and classifying images (2D and 3D), where, for some tasks, DL exhibits superhuman performance. Diagnostic imaging, however, is not the only diagnostic tool. Tabular data, such as personal data, vital signs, and genomic/blood tests, are commonly collected for every patient entering a clinical institution. However, it is rarely considered in DL pipelines, although it carries diagnostic information. The training of DL models requires large datasets, so large that every institution might need more data that should be pooled from different sites. Data pooling generates newfound concerns about data access and movement across other institutions spawning multiple dimensions, such as performance, energy efficiency, privacy, criticality, and security. Federated Learning (FL) is a cooperative learning paradigm aiming at addressing these concerns by moving models instead of data across different institutions. This paper proposes a Federated multi-input model that leverages images and tabular data, providing a proof of concept of the feasibility of multi-input FL architectures. The proposed model was evaluated on two showcases: the prognosis of CoViD-19 disease and the patients’ stratification for Alzheimer’s disease. Results show that enabling multi-input architectures in the FL framework allows for improving the performance regarding both accuracy and generalizability with respect to non-federated models while ensuring security and data protection peculiar to FL.
@techreport{23:casella:mifl, abstract = {Driven by the Deep Learning (DL) revolution, Artificial intelligence (AI) has become a fundamental tool for many Bio-Medical tasks, including AI-assisted diagnosis. These include analysing and classifying images (2D and 3D), where, for some tasks, DL exhibits superhuman performance. Diagnostic imaging, however, is not the only diagnostic tool. Tabular data, such as personal data, vital signs, and genomic/blood tests, are commonly collected for every patient entering a clinical institution. However, it is rarely considered in DL pipelines, although it carries diagnostic information. The training of DL models requires large datasets, so large that every institution might need more data that should be pooled from different sites. Data pooling generates newfound concerns about data access and movement across other institutions spawning multiple dimensions, such as performance, energy efficiency, privacy, criticality, and security. Federated Learning (FL) is a cooperative learning paradigm aiming at addressing these concerns by moving models instead of data across different institutions. This paper proposes a Federated multi-input model that leverages images and tabular data, providing a proof of concept of the feasibility of multi-input FL architectures. The proposed model was evaluated on two showcases: the prognosis of CoViD-19 disease and the patients’ stratification for Alzheimer’s disease. Results show that enabling multi-input architectures in the FL framework allows for improving the performance regarding both accuracy and generalizability with respect to non-federated models while ensuring security and data protection peculiar to FL.}, author = {Casella, Bruno and Riviera, Walter and Aldinucci, Marco and Menegaz, Gloria}, institution = {Computer Science Department, University of Torino}, keywords = {epi}, note = {https://www.techrxiv.org/articles/preprint/MiFL_Multi-Input_Neural_Networks_in_Federated_Learning/22492732}, title = {MiFL: Multi-Input Neural Networks in Federated Learning}, url = {https://www.techrxiv.org/articles/preprint/MiFL_Multi-Input_Neural_Networks_in_Federated_Learning/22492732}, year = 2023 }
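The multi-input idea described in the abstract above fuses an image branch and a tabular branch before classification. Below is a minimal PyTorch sketch of such an architecture; all layer sizes, the single-channel image input, and the fusion-by-concatenation head are illustrative assumptions, not the exact architecture of the paper.

import torch
import torch.nn as nn

class MultiInputNet(nn.Module):
    # Two encoders (images and tabular data) fused by concatenation.
    def __init__(self, n_tabular_features: int, n_classes: int = 2):
        super().__init__()
        # Image branch: a tiny CNN encoder (1-channel 2D input assumed).
        self.image_branch = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 32)
        )
        # Tabular branch: a small MLP over clinical/tabular features.
        self.tabular_branch = nn.Sequential(
            nn.Linear(n_tabular_features, 32), nn.ReLU(),
        )
        # Fusion head: concatenate both feature vectors, then classify.
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, image, tabular):
        feats = torch.cat([self.image_branch(image),
                           self.tabular_branch(tabular)], dim=1)
        return self.head(feats)

model = MultiInputNet(n_tabular_features=10)
logits = model(torch.randn(4, 1, 64, 64), torch.randn(4, 10))
print(logits.shape)  # torch.Size([4, 2])

In a federated setting, each institution would train such a model on its local image/tabular pairs, and only the model parameters would be exchanged for aggregation.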
- B. Casella and L. Paletto, "Predicting cryptocurrencies market phases through on-chain data long-term forecasting," in IEEE International Conference on Blockchain and Cryptocurrency (ICBC 2023), Dubai, United Arab Emirates, 2023.
[BibTeX] [Abstract] [Download PDF]
Blockchain, the underlying technology of Bitcoin and several other cryptocurrencies, like Ethereum, produces a massive amount of open-access data that can be analyzed, providing important information about the network’s activity and its respective token. The on-chain data have extensively been used as input to Machine Learning algorithms for predicting cryptocurrencies’ future prices; however, there is a lack of study in predicting the future behaviour of on-chain data. This study aims to show how on-chain data can be used to detect cryptocurrency market regimes, like minimum and maximum, bear and bull market phases, and how forecasting these data can provide an optimal asset allocation for long-term investors.
@inproceedings{23:casella:blockchain, title = {Predicting Cryptocurrencies Market Phases through On-Chain Data Long-Term Forecasting}, author = {Casella, Bruno and Paletto, Lorenzo}, abstract = {Blockchain, the underlying technology of Bitcoin and several other cryptocurrencies, like Ethereum, produces a massive amount of open-access data that can be analyzed, providing important information about the network’s activity and its respective token. The on-chain data have extensively been used as input to Machine Learning algorithms for predicting cryptocurrencies’ future prices; however, there is a lack of study in predicting the future behaviour of on-chain data. This study aims to show how on-chain data can be used to detect cryptocurrency market regimes, like minimum and maximum, bear and bull market phases, and how forecasting these data can provide an optimal asset allocation for long-term investors.}, booktitle = {IEEE International Conference on Blockchain and Cryptocurrency, {ICBC} 2023}, address = {Dubai, United Arab Emirates}, publisher = {{IEEE}}, year = {2023}, keywords = {epi}, url = {https://iris.unito.it/retrieve/2e845fe4-8562-4898-bb97-184675c455c0/6.%20ICBC23%20-%20PREDICTING%20BTC.pdf}, note = {https://iris.unito.it/retrieve/2e845fe4-8562-4898-bb97-184675c455c0/6.%20ICBC23%20-%20PREDICTING%20BTC.pdf} }
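As a toy illustration of the regime-detection idea above, the following sketch labels bull and bear phases by comparing a series with its long-run moving average. The synthetic random-walk data and the 200-day window are assumptions made only to keep the example runnable, not the forecasting method of the paper.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic daily on-chain metric modeled as a random walk.
metric = pd.Series(np.cumsum(rng.normal(0, 1, 1000)) + 100.0)

# Long-run moving average as a trend proxy; above it = "bull" regime.
ma_long = metric.rolling(window=200).mean()
regime = np.where(metric > ma_long, "bull", "bear")

df = pd.DataFrame({"metric": metric, "ma200": ma_long, "regime": regime})
print(df.dropna().tail())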
- B. Casella, R. Esposito, A. Sciarappa, C. Cavazzoni, and M. Aldinucci, "Experimenting with normalization layers in federated learning on non-IID scenarios," Computer Science Department, University of Torino, 2023.
[BibTeX] [Abstract] [Download PDF]
Training Deep Learning (DL) models requires large, high-quality datasets, often assembled with data from different institutions. Federated Learning (FL) has been emerging as a method for privacy-preserving pooling of datasets employing collaborative training from different institutions by iteratively globally aggregating locally trained models. One critical performance challenge of FL is operating on datasets not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be debunked by a suitable optimization of two hyperparameters: layer normalization methods and collaboration frequency selection. In this work, we benchmark five different normalization layers for training Neural Networks (NNs), two families of non-IID data skew, and two datasets. Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform Batch Normalization. Similarly, frequent model aggregation decreases convergence speed and model quality.
@techreport{23:casella:normalization, abstract = {Training Deep Learning (DL) models requires large, high-quality datasets, often assembled with data from different institutions. Federated Learning (FL) has been emerging as a method for privacy-preserving pooling of datasets employing collaborative training from different institutions by iteratively globally aggregating locally trained models. One critical performance challenge of FL is operating on datasets not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be debunked by a suitable optimization of two hyperparameters: layer normalization methods and collaboration frequency selection. In this work, we benchmark five different normalization layers for training Neural Networks (NNs), two families of non-IID data skew, and two datasets. Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform Batch Normalization. Similarly, frequent model aggregation decreases convergence speed and model quality.}, author = {Casella, Bruno and Esposito, Roberto and Sciarappa, Antonio and Cavazzoni, Carlo and Aldinucci, Marco}, institution = {Computer Science Department, University of Torino}, keywords = {futurehpc, epi}, title = {Experimenting with Normalization Layers in Federated Learning on non-IID scenarios}, url = {https://arxiv.org/pdf/2303.10630.pdf}, year = {2023} }
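The comparison described above can be reproduced in miniature by instantiating the same network with different normalization layers. The sketch below is a minimal PyTorch illustration with assumed layer sizes; the paper benchmarks five normalization variants inside a full federated training loop.

import torch
import torch.nn as nn

def make_cnn(norm: str, channels: int = 32, n_classes: int = 10):
    # The same small CNN with a swappable normalization layer.
    norms = {
        "batch": nn.BatchNorm2d(channels),
        "group": nn.GroupNorm(num_groups=4, num_channels=channels),
        # GroupNorm with a single group normalizes over C, H, W,
        # i.e., LayerNorm for convolutional feature maps.
        "layer": nn.GroupNorm(num_groups=1, num_channels=channels),
    }
    return nn.Sequential(
        nn.Conv2d(3, channels, kernel_size=3, padding=1),
        norms[norm],
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(channels, n_classes),
    )

x = torch.randn(8, 3, 32, 32)
for norm in ("batch", "group", "layer"):
    print(norm, make_cnn(norm)(x).shape)  # each -> torch.Size([8, 10])

One intuition behind the paper's finding: Group and Layer Normalization compute statistics per sample and keep no running averages, whereas Batch Normalization estimates running statistics on skewed local batches, which mix poorly when non-IID client models are aggregated.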
2022
- M. Pennisi, F. Proietto Salanitri, G. Bellitto, B. Casella, M. Aldinucci, S. Palazzo, and C. Spampinato, "FedER: federated learning through experience replay and privacy-preserving data synthesis," Computer Science Department, University of Torino, 2022. doi:10.48550/arXiv.2206.10048
[BibTeX] [Download PDF]
@techreport{22:casella:deepfakes, author = {Pennisi, Matteo and Proietto Salanitri, Federica and Bellitto, Giovanni and Casella, Bruno and Aldinucci, Marco and Palazzo, Simone and Spampinato, Concetto}, doi = {10.48550/arXiv.2206.10048}, institution = {Computer Science Department, University of Torino}, note = {https://arxiv.org/abs/2206.10048}, title = {FedER: Federated Learning through Experience Replay and Privacy-Preserving Data Synthesis}, url = {https://arxiv.org/abs/2206.10048}, year = 2022 }
- I. Colonnelli, B. Casella, G. Mittone, Y. Arfat, B. Cantalupo, R. Esposito, A. R. Martinelli, D. Medić, and M. Aldinucci, "Federated learning meets HPC and cloud," in Astrophysics and Space Science Proceedings, Catania, Italy, 2022.
[BibTeX] [Abstract] [Download PDF]
HPC and AI are fated to meet for several reasons. This article will discuss some of them and argue why this will happen through the set of methods and technologies that underpin cloud computing. As a paradigmatic example, we present a new federated learning system that collaboratively trains a deep learning model in different supercomputing centers. The system is based on the StreamFlow workflow manager designed for hybrid cloud-HPC infrastructures.
@inproceedings{22:ml4astro, abstract = {HPC and AI are fated to meet for several reasons. This article will discuss some of them and argue why this will happen through the set of methods and technologies that underpin cloud computing. As a paradigmatic example, we present a new federated learning system that collaboratively trains a deep learning model in different supercomputing centers. The system is based on the StreamFlow workflow manager designed for hybrid cloud-HPC infrastructures.}, address = {Catania, Italy}, author = {Iacopo Colonnelli and Bruno Casella and Gianluca Mittone and Yasir Arfat and Barbara Cantalupo and Roberto Esposito and Alberto Riccardo Martinelli and Doriana Medi\'{c} and Marco Aldinucci}, booktitle = {Astrophysics and Space Science Proceedings}, keywords = {across, eupilot, streamflow}, publisher = {Springer}, title = {Federated Learning meets {HPC} and cloud}, url = {https://iris.unito.it/retrieve/5631da1c-96a0-48c0-a48e-2cdf6b84841d/main.pdf}, year = {2022} }
- B. Casella, R. Esposito, C. Cavazzoni, and M. Aldinucci, "Benchmarking FedAvg and FedCurv for image classification tasks," in Proceedings of the 1st Italian Conference on Big Data and Data Science, ITADATA 2022, September 20-21, 2022, 2022.
[BibTeX] [Download PDF]
@inproceedings{casella2022benchmarking, author = {Bruno Casella and Roberto Esposito and Carlo Cavazzoni and Marco Aldinucci}, booktitle = {Proceedings of the 1st Italian Conference on Big Data and Data Science, {ITADATA} 2022, September 20-21, 2022}, editor = {Marco Anisetti and Angela Bonifati and Nicola Bena and Claudio Ardagna and Donato Malerba}, keywords = {eupilot}, publisher = {CEUR-WS.org}, series = {{CEUR} Workshop Proceedings}, title = {Benchmarking FedAvg and FedCurv for Image Classification Tasks}, url = {https://iris.unito.it/bitstream/2318/1870961/1/Benchmarking_FedAvg_and_FedCurv_for_Image_Classification_Tasks.pdf}, year = {2022}, bdsk-url-1 = {https://iris.unito.it/bitstream/2318/1870961/1/Benchmarking_FedAvg_and_FedCurv_for_Image_Classification_Tasks.pdf} }
- B. Casella, A. Chisari, S. Battiato, and M. Giuffrida, "Transfer learning via test-time neural networks aggregation," in Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2022, Volume 5: VISAPP, Online Streaming, February 6-8, 2022, 2022, pp. 642-649. doi:10.5220/0010907900003124
[BibTeX] [Abstract] [Download PDF]
It has been demonstrated that deep neural networks outperform traditional machine learning. However, deep networks lack generalisability, that is, they will not perform as well on a new (testing) set drawn from a different distribution due to the domain shift. In order to tackle this known issue, several transfer learning approaches have been proposed, where the knowledge of a trained model is transferred into another to improve performance with different data. However, most of these approaches require additional training steps, or they suffer from catastrophic forgetting that occurs when a trained model has overwritten previously learnt knowledge. We address both problems with a novel transfer learning approach that uses network aggregation. We train dataset-specific networks together with an aggregation network in a unified framework. The loss function includes two main components: a task-specific loss (such as cross-entropy) and an aggregation loss. The proposed aggregation loss allows our model to learn how trained deep network parameters can be aggregated with an aggregation operator. We demonstrate that the proposed approach learns model aggregation at test time without any further training step, reducing the burden of transfer learning to a simple arithmetical operation. The proposed approach achieves comparable performance w.r.t. the baseline. Besides, if the aggregation operator has an inverse, we show that our model also inherently allows for selective forgetting, i.e., the aggregated model can forget one of the datasets it was trained on, retaining information on the others.
@inproceedings{22:VISAPP:transferlearning, abstract = {It has been demonstrated that deep neural networks outperform traditional machine learning. However, deep networks lack generalisability, that is, they will not perform as well on a new (testing) set drawn from a different distribution due to the domain shift. In order to tackle this known issue, several transfer learning approaches have been proposed, where the knowledge of a trained model is transferred into another to improve performance with different data. However, most of these approaches require additional training steps, or they suffer from catastrophic forgetting that occurs when a trained model has overwritten previously learnt knowledge. We address both problems with a novel transfer learning approach that uses network aggregation. We train dataset-specific networks together with an aggregation network in a unified framework. The loss function includes two main components: a task-specific loss (such as cross-entropy) and an aggregation loss. The proposed aggregation loss allows our model to learn how trained deep network parameters can be aggregated with an aggregation operator. We demonstrate that the proposed approach learns model aggregation at test time without any further training step, reducing the burden of transfer learning to a simple arithmetical operation. The proposed approach achieves comparable performance w.r.t. the baseline. Besides, if the aggregation operator has an inverse, we show that our model also inherently allows for selective forgetting, i.e., the aggregated model can forget one of the datasets it was trained on, retaining information on the others.}, author = {Bruno Casella and Alessio Chisari and Sebastiano Battiato and Mario Giuffrida}, booktitle = {Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, {VISIGRAPP} 2022, Volume 5: VISAPP, Online Streaming, February 6-8, 2022}, doi = {10.5220/0010907900003124}, editor = {Giovanni Maria Farinella and Petia Radeva and Kadi Bouatouch}, isbn = {978-989-758-555-5}, organization = {INSTICC}, pages = {642-649}, publisher = {SciTePress}, title = {Transfer Learning via Test-time Neural Networks Aggregation}, url = {https://iris.unito.it/retrieve/handle/2318/1844159/947123/TRANSFER_LEARNING_VIA_TEST_TIME_NEURAL_NETWORKS_AGGREGATION.pdf}, year = {2022}, bdsk-url-1 = {https://iris.unito.it/retrieve/handle/2318/1844159/947123/TRANSFER_LEARNING_VIA_TEST_TIME_NEURAL_NETWORKS_AGGREGATION.pdf}, bdsk-url-2 = {https://doi.org/10.5220/0010907900003124} }
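To make the aggregation mechanism above concrete, the following sketch trains two linear models with a task loss plus an aggregation loss evaluated on their element-wise mean, then shows the selective-forgetting property of an invertible operator. The mean operator, the 0.1 weighting, and the shared toy batch are assumptions for illustration; the paper's actual networks and losses differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

net_a, net_b = nn.Linear(16, 4), nn.Linear(16, 4)

def merged_forward(x):
    # Aggregation operator: element-wise mean of the two parameter
    # sets, applied functionally so gradients reach both networks.
    w = (net_a.weight + net_b.weight) / 2
    b = (net_a.bias + net_b.bias) / 2
    return F.linear(x, w, b)

x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
task = F.cross_entropy(net_a(x), y) + F.cross_entropy(net_b(x), y)
agg = F.cross_entropy(merged_forward(x), y)  # aggregation loss
loss = task + 0.1 * agg  # 0.1 is an assumed weighting factor
loss.backward()

# At test time, merging is one arithmetic operation. Because the mean
# is invertible given one operand, a model's contribution can also be
# removed again (selective forgetting): 2 * w_merged - w_a == w_b.
w_merged = (net_a.weight + net_b.weight) / 2
assert torch.allclose(2 * w_merged - net_a.weight, net_b.weight)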