Alberto Mulone


PhD Student at Computer Science Department, University of Turin
Parallel Computing group
Via Pessinetto 12, 10149 Torino – Italy 
Email: alberto.mulone@unito.it

Short Bio

Alberto Mulone is a PhD Student at the University of Turin.
He received his Master’s degree with honours in Computer Science from the University of Turin in 2023, with a thesis on porting a bioinformatics pipeline, Variant Calling, to hybrid cloud-HPC execution. The main goal of the thesis was to assess hybrid workflows using StreamFlow, a Workflow Management System that implements this paradigm.

Publications

2023

  • A. Mulone, S. Awad, D. Chiarugi, and M. Aldinucci, “Porting the Variant Calling Pipeline for NGS data in cloud-HPC environment,” in 47th IEEE Annual Computers, Software, and Applications Conference, COMPSAC 2023, Torino, Italy, 2023, pp. 1858–1863. doi:10.1109/COMPSAC57700.2023.00288

    In recent years we have understood the importance of analyzing and sequencing human genetic variation. A relevant aspect that emerged from the Covid-19 pandemic was the need to obtain results very quickly; this involved using High-Performance Computing (HPC) environments to execute the Next Generation Sequencing (NGS) pipeline. However, HPC is not always the most suitable environment for the entire execution of a pipeline, especially when it involves many heterogeneous tools. The ability to execute parts of the pipeline on different environments can lead to higher performance but also cheaper executions. This work shows the design and optimization process that led us to a state-of-the-art Variant Calling hybrid workflow based on the StreamFlow Workflow Management System (WfMS). We also compare StreamFlow with Snakemake, an established WfMS targeting HPC facilities, observing comparable performance on single environments and satisfactory improvements with a hybrid cloud-HPC configuration.

    @inproceedings{23:mulone:wide:vcp,
    abstract = {In recent years we have understood the importance of analyzing and sequencing human genetic variation.
    A relevant aspect that emerged from the Covid-19 pandemic was the need to obtain results very quickly; this involved using High-Performance Computing (HPC) environments to execute the Next Generation Sequencing (NGS) pipeline.
    However, HPC is not always the most suitable environment for the entire execution of a pipeline, especially when it involves many heterogeneous tools.
    The ability to execute parts of the pipeline on different environments can lead to higher performance but also cheaper executions.
    This work shows the design and optimization process that led us to a state-of-the-art Variant Calling hybrid workflow based on the StreamFlow Workflow Management System (WfMS).
    We also compare StreamFlow with Snakemake, an established WfMS targeting HPC facilities, observing comparable performance on single environments and satisfactory improvements with a hybrid cloud-HPC configuration.},
    address = {Torino, Italy},
    author = {Alberto Mulone and Sherine Awad and Davide Chiarugi and Marco Aldinucci},
    booktitle = {47th {IEEE} Annual Computers, Software, and Applications Conference, {COMPSAC} 2023},
    editor = {Hossain Shahriar and Yuuichi Teranishi and Alfredo Cuzzocrea and Moushumi Sharmin and Dave Towey and A. K. M. Jahangir Alam Majumder and Hiroki Kashiwazaki and Ji{-}Jiang Yang and Michiharu Takemoto and Nazmus Sakib and Ryohei Banno and Sheikh Iqbal Ahamed},
    keywords = {across, icsc, streamflow},
    pages = {1858--1863},
    publisher = {{IEEE}},
    year = {2023},
    title = {Porting the {V}ariant {C}alling {P}ipeline for {NGS} data in cloud-{HPC} environment},
    url = {https://iris.unito.it/bitstream/2318/1919364/1/paper.pdf},
    doi = {10.1109/COMPSAC57700.2023.00288},
    }