The Merlin HPC Cluster

Merlin is a series of high-performance computing clusters provided centrally by PSI.

The Merlin resources are available to all PSI staff and external collaborators. The Merlin clusters currently available include:

  • Merlin 7 (next-generation, currently in pre-production, expected to be generally available by January 2025)
  • Merlin 6 (production system, will be decommissioned in 2025)
  • Merlin 5 (legacy system with best-effort support, will be decommissioned by 2025)

Merlin 7 is the newest generation of the Merlin HPC clusters available to PSI staff and collaborators. Hosted at CSCS in Lugano on the Alps infrastructure, the cluster offers significant advantages for distributed workloads and GPU jobs. The system is made up of several different types of Cray Shasta nodes in a heterogeneous configuration. It is currently being built up and is in pre-production.

Our production cluster is Merlin 6, which was designed to be extensible so that compute nodes and storage can be added over time. In addition to the main cluster's CPU-based resources, the system also contains a smaller partition of GPU resources for biology research (cryo-EM analysis) and machine learning applications.

We also maintain a legacy cluster, called Merlin 5, which is provided on a best-effort basis for workloads that do not have large resource needs or that can run over long periods.

Since 2019, the service has been maintained by the High-Performance Computing & Emerging Technologies group (HPCE).
 

All PSI staff and external collaborators can request access to Merlin; please follow the instructions in the documentation.

Various resources and support articles are provided in the documentation, which is available only through the PSI intranet. It also includes details on how to get help from the administrators.

Resource-specific documentation:

  • For Merlin 7, please go here.
  • For Merlin 5 and 6, please go here.

The clusters rely on a number of supporting services to help users get the most out of the resources. These include:

AFS

The Andrew File System (AFS) is available at PSI under the 'psi.ch' cell. It is mounted on the Merlin clusters through the AuriStor client. AFS holds personal user data as well as the software stack used on the Merlin clusters, and it is mounted over the standard Ethernet network.
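
As a rough sketch (the /afs/psi.ch path follows the usual /afs/&lt;cell&gt; convention for a cell of that name, and the software directory below is purely a hypothetical example), a job can verify that AFS is reachable before relying on software installed there:

    # Minimal sketch: check that the psi.ch AFS cell is visible on this node
    # before using software installed there. The cell path follows the usual
    # /afs/<cell> convention; SOFTWARE_DIR is a hypothetical example path.
    import os
    import sys

    AFS_CELL = "/afs/psi.ch"                       # conventional mount point of the psi.ch cell
    SOFTWARE_DIR = os.path.join(AFS_CELL, "sys")   # hypothetical software location

    if not os.path.isdir(AFS_CELL):
        sys.exit(f"AFS cell {AFS_CELL} is not visible; is the AuriStor client running?")

    print(f"AFS cell available: {AFS_CELL}")
    print(f"Software directory present: {os.path.isdir(SOFTWARE_DIR)}")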

Network-mounted home directories

Home directories are provided by the PSI Central NFS service, with up to 10GB of capacity per user and daily snapshots kept for one week. They are mounted over the standard Ethernet network.
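
Since the quota is modest, it can be useful to check how much of it is already in use before writing large files. A minimal sketch in Python (only the 10GB figure comes from this page; the rest is generic and only approximates what the NFS quota accounting reports):

    # Minimal sketch: estimate home-directory usage against the ~10GB quota by
    # summing file sizes under $HOME. This is only an approximation: snapshots,
    # sparse files and filesystem overhead are not accounted for.
    import os

    QUOTA_BYTES = 10 * 1024**3          # 10GB quota mentioned above
    home = os.path.expanduser("~")

    total = 0
    for root, _dirs, files in os.walk(home):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size  # lstat: do not follow symlinks
            except OSError:
                pass                     # file vanished or is unreadable

    print(f"Approximate usage of {home}: {total / 1024**3:.2f} GiB "
          f"of {QUOTA_BYTES / 1024**3:.0f} GiB quota")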

HPC storage

The main storage at PSI is based on IBM Spectrum Scale (formerly the General Parallel File System, GPFS), which is well suited to HPC environments. It is mounted over the InfiniBand network for high performance and low latency.

For Merlin 7, we operate an HPE ClusterStor array that uses the Lustre file system, which is designed for high-throughput, low-latency data operations. The storage is connected to the cluster nodes over the Cray Slingshot network fabric.
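
Whether a directory lives on Spectrum Scale (Merlin 6) or Lustre (Merlin 7), its capacity can be inspected with standard POSIX calls. A minimal sketch; the path below is a placeholder, not an official Merlin mount point:

    # Minimal sketch: report the capacity of the parallel filesystem backing a
    # given directory (works for GPFS/Spectrum Scale and Lustre alike, since
    # both expose the POSIX statvfs interface). DATA_DIR is a placeholder.
    import os

    DATA_DIR = "/path/to/your/project"   # placeholder, replace with a real project or scratch path

    st = os.statvfs(DATA_DIR)
    total_tib = st.f_blocks * st.f_frsize / 1024**4
    avail_tib = st.f_bavail * st.f_frsize / 1024**4

    print(f"{DATA_DIR}: {total_tib:.1f} TiB total, {avail_tib:.1f} TiB available")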

Linux OS

All Merlin 5 and Merlin 6 nodes and servers run Red Hat Enterprise Linux. Merlin 7 uses the Cray Operating System (based on SUSE Linux Enterprise).

Remote Desktop

For remote desktop access, the newest login nodes are running NoMachine Terminal Server.

Batch system

All the Merlin clusters use the Slurm Workload Manager. The Merlin Slurm configuration supports anything from single-core jobs up to MPI jobs that scale across multiple nodes. Mixed-resource workloads, such as jobs that combine GPUs and CPUs, are also supported.
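
As a minimal illustration of a multi-node MPI workload (assuming mpi4py is available in the software stack; module names, partitions and submission details are site-specific and not documented here), the following sketch can be launched under Slurm with srun:

    # Minimal MPI sketch with mpi4py: every rank reports where it runs and
    # contributes its rank number to a sum gathered on rank 0. Launch it with
    # srun across one or more nodes; mpi4py availability is an assumption.
    from mpi4py import MPI
    import socket

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # The reduction involves every rank, regardless of which node it runs on.
    total = comm.reduce(rank, op=MPI.SUM, root=0)

    print(f"rank {rank}/{size} running on {socket.gethostname()}")
    if rank == 0:
        print(f"sum of ranks 0..{size - 1} = {total}")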
 

Merlin 7

The Merlin 7 system is built on the Cray Shasta platform and will include the following hardware; exact details are still being determined, and specifics may change until the system is in production (a sketch for detecting the node type from within a job follows the list):

  • Multi-core CPU node (x86): 2 x AMD EPYC 7742 (x86_64 Rome, 64 cores, 3.2GHz) with 512GB DDR4 3200MHz RAM
  • Multi-core CPU + GPU node (x86): 2 x AMD EPYC 7713 (x86_64 Milan, 64 cores, 3.2GHz) with 512GB DDR4 3200MHz RAM, and 4 x NVIDIA A100 (Ampere, 80GB)
  • Grace Hopper node (arm64): 4 x NVIDIA Grace CPU (SBSA arm64, 72 cores, 3.44GHz) with 128GB DDR5 6400MHz RAM, and 4 x NVIDIA H200 (Hopper, 96GB)
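
Because the machine mixes x86_64 and arm64 nodes with different GPU types, a job may want to check what it landed on before choosing binaries or kernels. A rough sketch (it only assumes that nvidia-smi is on the PATH of GPU nodes; nothing here is a Merlin-specific tool):

    # Minimal sketch: report the CPU architecture and, where nvidia-smi is
    # available, the GPU model(s) of the current node. Handy on a heterogeneous
    # system that mixes x86_64 (EPYC + A100) and arm64 (Grace Hopper) nodes.
    import platform
    import shutil
    import subprocess

    print(f"CPU architecture: {platform.machine()}")   # 'x86_64' or 'aarch64'

    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
        print("GPUs on this node:")
        print(result.stdout.strip() or result.stderr.strip())
    else:
        print("nvidia-smi not found; this is probably a CPU-only node")
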
Merlin 6

Networking


The Merlin 6 network uses InfiniBand and is based on EDR (100Gbps) technology for MPI communication as well as for storage access. InfiniBand bandwidth between chassis provides up to 1200Gbps.

Hardware

Computing nodes

  • Hardware solution: 4 x HPE Apollo k6000 chassis
  • Blades: 96 x HPE Apollo XL230K Gen10 (24 blades per chassis)
  • 72 nodes with two Intel Xeon Gold 6152 Scalable processors @ 2.10GHz (2 x 22 cores per node, HT enabled, 384GB RAM, NVMe /scratch)
  • 24 nodes with two Intel Xeon Gold 6240R processors @ 2.40GHz (2 x 24 cores per node, 18 x 768GB + 6 x 384GB RAM, NVMe /scratch)
  • HPC network based on dual-port InfiniBand ConnectX-5 EDR (1 x 100Gbps); standard network 1 x 10Gbps

Login nodes

  • Hardware solution: single blade
  • 2 x HPE ProLiant DL380 Gen10
  • Two Intel Xeon Gold 6152 Scalable processors @ 2.10GHz (2 x 22 cores, HT enabled, 384GB RAM, NVMe /scratch)
  • HPC network based on dual-port InfiniBand ConnectX-5 EDR (2 x 100Gbps); standard network 2 x 10Gbps

Merlin 5

Networking

The Merlin 5 InfiniBand network is based on QDR (40Gbps) and FDR (56Gbps) technologies for MPI communication as well as for storage access. Merlin 5 is connected to Merlin 6 through FDR (56Gbps) for MPI traffic.

Hardware

Computing nodes

  • Hardware solution: 2 x HPE BladeSystem c7000 chassis
  • Blades: 32 x HPE ProLiant DL380 Gen8 (16 blades per chassis)
  • Two Intel Xeon E5-2670 processors @ 2.60GHz (2 x 8 cores, no HT, 64GB RAM, SAS /scratch)
  • HPC network based on single-port InfiniBand ConnectX-3 QDR (1 x 40Gbps); standard network 1Gbps

Login nodes

  • Hardware solution: single blade
  • 1 x HPE ProLiant DL380 Gen9
  • Two Intel Xeon E5-2697A v4 processors @ 2.60GHz (2 x 16 cores, HT enabled, 512GB RAM, SAS /scratch)
  • HPC network based on dual-port InfiniBand Connect-IB FDR (1 x 56Gbps); standard network 1 x 1Gbps

Storage nodes

  • Hardware solution: Lenovo Distributed Storage Solution for IBM Spectrum Scale
  • 1 x Lenovo DSS G240 building block, consisting of:
    • 1 x ThinkSystem SR630 (management node)
    • 2 x ThinkSystem SR650 (I/O nodes)

ThinkSystem SR630: Two Intel Xeon Gold 5118 Scalable processors @ 2.30GHz (2 x 12 cores, HT enabled, 96GB RAM)

  • Support/management node with xCAT
  • 1 x Dual Port Infiniband ConnectX-5 EDR-100Gbps (low latency network).
  • 1 x Dual Port Infiniband ConnectX-4 EDR-100Gbps (low latency network).
  • Standard network 2 x 10Gbps.

Building block 1:

  • 2 x ThinkSystem SR650: Two Intel Xeon Gold 6142 Scalable processors @ 2.60GHz (2 x 16 cores, HT enabled, 384GB RAM)
    • IO Node
    • 2 x Dual Port Infiniband ConnectX-5 EDR-100Gbps (low latency network).
    • 2 x Dual Port Infiniband ConnectX-4 EDR-100Gbps (low latency network).
    • Standard network 2 x 10Gbps.
    • ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter
  • 4 x Lenovo Storage D3284 High Density Expansion Enclosure, each one:
    • Holds 84 x 3.5” hot-swap drive bays in two drawers. Each drawer has three rows of drives, and each row has 14 drives.
    • Each drive bay contains a 10TB Helium 7.2K NL-SAS HDD.