Stay foolish, build your own supercomputer at UNITO

The new Competence Center for Scientific Computing at the University of Torino and INFN Torino (C3S@UNITO) is finally opening this week. The inauguration workshop will take place on October 7, 2016 in the main theatre of the Campus Luigi Einaudi. The center involves over 16 departments of the University of Torino and hosts the brand new OCCAM platform.

The program and the (free) registration form are available here. Everybody is invited.

The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multi-purpose, flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Torino branch of the Istituto Nazionale di Fisica Nucleare. It aims to provide a flexible, reconfigurable and extendable infrastructure that caters to a wide range of scientific computing needs, as well as a platform for R&D activities on computational technologies themselves. Extending it with CPUs of a novel architecture, accelerators or hybrid microarchitectures (such as the forthcoming Intel Xeon Phi Knights Landing) should be as simple as plugging a node into a rack.

The initial system counts slightly more than 1100 CPU cores and includes different types of computing nodes (standard dual-socket nodes, large quad-socket nodes with 768 GB RAM, and multi-GPU nodes) and two separate disk storage subsystems: a smaller high-performance scratch area, based on the Lustre file system, intended for direct computational I/O, and a larger one, of the order of 1 PB, for archiving near-line data. All the components of the system are interconnected through a 10 Gb/s Ethernet layer with one-level topology and an InfiniBand FDR 56 Gb/s layer in fat-tree topology.

A system of this kind, heterogeneous and reconfigurable by design, poses a number of challenges related to the frequency at which heterogeneous hardware resources might change their availability and shareability status, which in turn affects the methods and means used to allocate, manage, optimize, bill and monitor VMs, virtual farms, jobs and interactive bare-metal sessions.

The workshop will describe some of the use cases that prompted the design and construction of the HPC cluster, its architecture, and a first characterisation of its performance through some synthetic benchmark tools and a few realistic use-case tests.

More technical details at CHEP 2016: M. Aldinucci, S. Bagnasco, S. Lusso, P. Pasteris, and S. Rabellino, “The Open Computing Cluster for Advanced data Manipulation (OCCAM),” in The 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP), San Francisco, USA, 2016. 



About Marco Aldinucci

Marco Aldinucci has been an assistant professor at the Computer Science Department of the University of Torino since 2008. Previously, he was a researcher at the University of Pisa and the Italian National Research Agency. He is the author of over a hundred papers in international journals and conference proceedings (Google Scholar h-index 21). He has participated in over 20 national and international research projects concerning parallel and autonomic computing. He is the recipient of the HPC Advisory Council University Award 2011 and the NVidia Research award 2013. He has led the “Low-Level Virtualization and Platform-Specific Deployment” workpackage within the EU-STREP FP7 ParaPhrase (Parallel Patterns for Adaptive Heterogeneous Multicore Systems) project and the GPGPU workpackage within the IMPACT project (Innovative Methods for Particle Colliders at the Terascale), and he is the contact person for the University of Torino in the European Network of Excellence on High Performance and Embedded Architecture and Compilation. Between March 2012 and March 2013 he delivered 5 invited talks at international workshops. He co-designed, together with Massimo Torquati, the FastFlow programming framework and several other programming frameworks and libraries for parallel computing. His research is focused on parallel and distributed computing.