Keeping up with High Performance Computing’s ever-expanding horizon, with a chance of clouds. ☁️
Google Cloud HPC Toolkit
Google’s Cloud HPC Toolkit allows admins to define a cluster in YAML; its ghpc
command-line tool then combines various Google-maintained and community Terraform modules to produce a static directory of Terraform to apply, tear down, and apply again! Modules already exist for a fully working Slurm batch scheduler, optionally federated with your on-prem cluster, with a wide array of software support available, including
Spack
and Intel’s oneAPI suite.
Taking advantage of the best of the cloud, these clusters can autoscale and integrate with Google Cloud’s services and partners.
All using industry-standard tooling: HashiCorp’s Terraform & Packer, implementing Google’s SRE best practices.
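As a flavour of the workflow, here is a minimal blueprint sketch; the module paths and settings are illustrative (the toolkit’s own examples, e.g. hpc-cluster-small, are the canonical starting point), so treat it as the shape rather than a copy-paste config:

```yaml
# blueprint.yaml -- shape only; module paths and settings are illustrative
blueprint_name: hpc-demo

vars:
  project_id: my-gcp-project     # hypothetical project ID
  deployment_name: hpc-demo
  region: us-central1
  zone: us-central1-a

deployment_groups:
  - group: primary
    modules:
      # Cluster networking
      - id: network
        source: modules/network/vpc

      # Slurm partition and controller, wired to the network above;
      # the real module paths live under community/modules/ in the toolkit repo
      - id: compute_partition
        source: community/modules/compute/SchedMD-slurm-on-gcp-partition
        use: [network]
        settings:
          partition_name: compute
          max_node_count: 20

      - id: slurm_controller
        source: community/modules/scheduler/SchedMD-slurm-on-gcp-controller
        use: [network, compute_partition]
```

Running ghpc create blueprint.yaml expands this into a deployment directory of plain Terraform, which you then apply, destroy and re-apply with the usual terraform commands.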
Accelerate your High Performance Computing journey with new Google Cloud HPC Toolkit | Google Cloud
Google Cloud: Cloud HPC Toolkit, an open source tool that enables users to easily create repeatable, turnkey HPC clusters based on proven best practices.
WITH CLOUD HPC TOOLKIT, GOOGLE PURSUES HPC, INTEL PUSHES ONEAPI | Next Platform
Intel and Google Cloud Announce Cloud HPC Toolkit | HPC Wire
Google Cloud flexes Arms with Ampere Altra
Expanding the Tau VM family with Arm-based processors | Google Cloud
Google Cloud enters the Arm Neoverse with Ampere Altra Arm processors!
Scaling up to 48 vCPUs, Tau T2A VMs can be deployed in Google Kubernetes Engine, as Compute Engine instances, and in Dataflow.
Google Cloud Podcast discusses Arm Servers on GCP with Jon Masters and Emma Haruka Iwao
AWS ParallelCluster 3.2 adds Slurm memory-aware scheduling and expands filesystem support
Slurm-based memory-aware scheduling in AWS ParallelCluster 3.2 by Oliver Perks
tl;dr: the --mem and --mem-per-cpu job flags are now available!
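Memory-aware scheduling is opt-in at the cluster level; a minimal sketch of the config change, assuming the 3.2 YAML schema as I read it (queue name, instance type and subnet ID are hypothetical):

```yaml
# Fragment of a ParallelCluster 3.2 cluster config -- illustrative only
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    EnableMemoryBasedScheduling: true    # let Slurm treat node memory as a consumable resource
  SlurmQueues:
    - Name: compute
      ComputeResources:
        - Name: c6i-4xlarge
          InstanceType: c6i.4xlarge
          MinCount: 0
          MaxCount: 16
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0     # hypothetical subnet
```

With that enabled, jobs can request memory, e.g. sbatch --mem=8G job.sh or --mem-per-cpu=2G, and Slurm packs them onto nodes with enough free memory.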
Expanded filesystems support in AWS ParallelCluster 3.2 by Oliver Perks
ParallelCluster already has support for Amazon Elastic File System (EFS), Amazon Elastic Block Store (EBS) and Amazon FSx for Lustre. In this release we added support for the FSx for NetApp ONTAP and FSx for OpenZFS filesystems.
With ParallelCluster 3.2, you can now mount up to 20 Amazon FSx file systems and up to 20 Amazon EFS file systems.
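The new filesystems are attached to the cluster as existing volumes via the SharedStorage section of the config; a rough sketch, with hypothetical volume IDs and field names as I understand the 3.2 schema:

```yaml
# SharedStorage fragment for a ParallelCluster 3.2 config -- illustrative only
SharedStorage:
  - Name: ontap-data
    MountDir: /shared/ontap
    StorageType: FsxOntap
    FsxOntapSettings:
      VolumeId: fsvol-0123456789abcdef0    # hypothetical existing ONTAP volume
  - Name: openzfs-data
    MountDir: /shared/openzfs
    StorageType: FsxOpenZfs
    FsxOpenZfsSettings:
      VolumeId: fsvol-0fedcba987654321a    # hypothetical existing OpenZFS volume
```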
→ Brendan Bouffler’s “HPC guy @awscloud” Twitter thread on the release
Rocky Linux on Google Cloud as a fully supported alternative to CentOS
→ Moving off CentOS? Introducing Rocky Linux Optimized for Google Cloud
As CentOS 7 reaches end of life, many enterprises are considering their options for an enterprise-grade, downstream Linux distribution on which to run their production applications. Rocky Linux has emerged as a strong alternative that, like CentOS, is 100% compatible with Red Hat Enterprise Linux.
Also This Month
John C. Linford publishes a comprehensive getting-started guide for the NVIDIA Arm HPC Developer Kit
→ Getting started with HPC on Arm64 | GitHub
This guide includes how-to guides, sample code, recommendations, and technical best practices to help new users get started with Arm-based systems like the NVIDIA Arm HPC Developer Kit. While it is intended for users and administrators of NVIDIA’s Arm-based platforms, it is also broadly useful for anyone running HPC applications on Arm CPUs, with or without GPUs. The focus is mostly on the CPU, since GPUs hosted by an Arm CPU behave just like GPUs hosted by any other CPU.
Fujitsu to provide A64FX HPC as a service via a new cloud offering
→ Japanese space agency to put massive HPC cloud to the test | The Register
Atos sees Federated HPC Cloud as the future, building on its acquisition of Nimbix and its HPC cloud experience
→ PULLING ALL THE LEVERS FOR HPC IN THE CLOUD | Next Platform
CXL further cements itself as the leading CPU interconnect standard
Having previously absorbed AMD/Arm’s CCIX and HPE’s Gen-Z, CXL has now taken in IBM’s OpenCAPI. CXL builds on PCIe, letting devices cache main memory and, most interestingly for HPC, providing coherent memory access.
→ OpenCAPI to Be Folded into CXL | HPCWire
An interesting paper that quantifies the carbon footprint of HDD/SSD drives in CAPEX/OPEX terms, i.e. the carbon to produce a drive versus the carbon to run it.
→ The Dirty Secret of SSDs: Embodied Carbon | arxiv
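Loosely, the framing splits a drive’s lifetime emissions into an embodied term paid at manufacture and an operational term paid while it runs; the notation below is mine, not the paper’s:

```latex
% Rough CAPEX/OPEX carbon split -- notation mine, not the paper's
C_{\text{lifetime}} \;\approx\;
  \underbrace{C_{\text{embodied}}}_{\text{manufacture (CAPEX)}}
  \;+\;
  \underbrace{P_{\text{avg}} \cdot T_{\text{use}} \cdot I_{\text{grid}}}_{\text{operation (OPEX)}}
```

where P_avg is the drive’s average power draw, T_use its operating lifetime and I_grid the carbon intensity of the supplying grid; the title gives away which term dominates for SSDs.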
More by Joe Heaton at @Heaton_dev & LinkedIn