Supported by the National Science Foundation. Collaborators: University of Michigan, Michigan State University, Wayne State University, and Merit.

HORUS provides three distinct types of computational nodes, co-located with existing OSiRIS storage infrastructure and connected to an existing 100 Gbps research network. Open-source software makes the resource broadly available, including to researchers outside the region via the OSG/PATh sharing described below.

Hardware

We have identified three types of computational servers that let us support a broad range of use cases: compute servers for high-throughput work, large-memory servers, and GPU servers. The table below summarizes the characteristics of each.

HORUS Planned Computing System Details

Horus Systems     | Model       | Mem (GB)/Host | CPU                                | CPUs/Host | GPU  | GPU Mem | GPUs/Host | NICs  | Host Cnt | HT Job Slots (total) | Mem (GB)/Slot
GPU Node          | Dell R750xa | 512           | Xeon Gold 6334 (8C/16T)            | 2         | A100 | 80 GB   | 4         | 2x25G | 4        | 128                  | 16.0
Large Memory Node | Dell R6525  | 1024          | AMD EPYC 7F72 (3.2 GHz, 24C/48T)   | 2         | -    | -       | -         | 2x25G | 10       | 960                  | 10.7
Compute Node      | Dell R6525  | 512           | AMD EPYC 7H12 (2.60 GHz, 64C/128T) | 2         | -    | -       | -         | 2x25G | 10       | 2560                 | 2.0

Dynamic partitioning of server resources is expected to be a key feature. For example, some of our GPU users need only a single CPU per GPU, with host memory matching the GPU's memory. On that basis, a server with four GPUs would require only four hyper-threaded CPUs and about 320 GB of memory, leaving 28 CPUs and 192 GB of memory idle. We need to be able to dynamically make those CPUs and that memory available to other jobs that require only one or more CPUs and a smaller amount of memory.
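
To illustrate, the sketch below assumes HTCondor partitionable slots and the htcondor Python bindings (both part of the software stack described later); the payload script train.py and the file names are hypothetical. The job asks for exactly one hyper-threaded CPU, one GPU, and 80 GB of host memory, so the scheduler can carve that slice out of a GPU node and leave the remaining cores and memory available to other work.

    # Minimal sketch using the HTCondor Python bindings ("pip install htcondor").
    # Assumes HORUS GPU nodes expose partitionable slots; train.py is a
    # hypothetical GPU workload, not part of the HORUS software itself.
    import htcondor

    job = htcondor.Submit({
        "executable": "/usr/bin/python3",
        "arguments": "train.py",
        "request_cpus": "1",        # one hyper-threaded core
        "request_gpus": "1",        # a single A100
        "request_memory": "80GB",   # roughly match the GPU's memory
        "output": "train.out",
        "error": "train.err",
        "log": "train.log",
    })

    schedd = htcondor.Schedd()            # local scheduler
    result = schedd.submit(job, count=1)  # returns a SubmitResult
    print("Submitted cluster", result.cluster())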

OSiRIS storage combined with HORUS compute is a strong enabler for many of the science domains in the project. Currently, OSiRIS has approximately 6 PB of raw storage available for use, out of 12 PB deployed. With the 8+3 erasure-coding profile in use, those 6 PB translate into roughly 4.3 PB of usable space that HORUS does not need to purchase.
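
As a quick sanity check of those figures, an 8+3 erasure-coding profile stores every object as 8 data chunks plus 3 coding chunks, so usable capacity is 8/11 of raw capacity:

    # 8+3 erasure coding: usable capacity is k / (k + m) of raw capacity.
    raw_pb = 6.0              # raw OSiRIS capacity currently available (PB)
    k, m = 8, 3               # data chunks, coding chunks per object
    usable_pb = raw_pb * k / (k + m)
    print(f"{usable_pb:.2f} PB usable")   # ~4.36 PB, i.e. roughly 4.3 PB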

High-Performance Network

A high-speed research network was created as part of the OSiRIS project (see Figure 1 below). This network, combined with Merit’s regional network, will support access to HORUS compute resources by educational institutions throughout Michigan and beyond. This network is highly resilient and features multiple 100 Gbps links internally and 100 Gbps connectivity to Merit and the wide-area network.

Merit’s core strength is enabling researchers, educators, and learners to transfer critical data across its high-capacity optical network in a safe, secure environment. Merit’s network is designed for resiliency, from its core through our campuses and out to the national and global research communities. With Merit’s next-generation network, HORUS creates a high-capacity compute research platform across three campuses and makes it available to researchers throughout Michigan and the region.

[Figure 1: HORUS network diagram]

Figure 1: The network used by HORUS features multiple resilient 100 Gbps links. It is well connected to both the Merit network and the broader global research and education networks. Note that HORUS equipment will be directly connected to the “gray” boxes at the top and bottom of the diagram (the MSU, UM, or WSU site switches, e.g., um-sw01 or wsu-sw02).

Software

The HORUS team has extensive experience operating cyberinfrastructure for researchers. Besides the work on OSiRIS, team members have designed and operated infrastructures such as the ATLAS Great Lakes Tier-2 for high-energy physics, the ICER infrastructure at Michigan State University, and Wayne State University’s computing center.

The HORUS team builds on readily available open-source software of the kind typically deployed at grid sites and data centers. The system is designed with researchers in mind:

  • Easy to submit work and track jobs (see the tracking sketch after this list)
  • Fair sharing of resources
  • Dynamic partitioning of resources to suit varying job sets
  • Resource accounting and system metrics
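
As a sketch of the first point, the snippet below uses the HTCondor Python bindings to check the status of a previously submitted job cluster; the cluster ID 1234 is a placeholder, and the status codes are HTCondor's standard JobStatus values.

    # Sketch: track jobs in a submitted cluster with the HTCondor Python bindings.
    # The cluster ID (1234) is a placeholder for a real submission.
    import htcondor

    STATUS = {1: "Idle", 2: "Running", 3: "Removed", 4: "Completed", 5: "Held"}

    schedd = htcondor.Schedd()
    for ad in schedd.query(
            constraint="ClusterId == 1234",
            projection=["ClusterId", "ProcId", "JobStatus"]):
        print(ad["ClusterId"], ad["ProcId"], STATUS.get(ad["JobStatus"], "Other"))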

Building Blocks / Open Source Components

HORUS selects software that supports its guiding principles: sharing resources fairly among users and maximizing resource utilization. The HORUS software architecture builds on a number of open-source tools and applications, grouped below by role.

Authentication and Authorization

  • InCommon, a set of community-designed identity and access management services
  • CoManage, identity lifecycle management
  • Grouper for creating and managing roles, groups, and permissions
  • CILogon, for logging on to cyberinfrastructure

Resource allocation and management

  • HTCondor, used to manage fair-share access for computational tasks.
  • HTCondor-CE, a meta-scheduler used as a “door” to a set of resources.
  • SLURM, used to schedule jobs and interface with Open OnDemand.
  • Open OnDemand (OOD), which provides a web-based user interface to HORUS resources.
  • NVIDIA MIG, used to subdivide a GPU (cores and memory) into up to seven smaller instances, allowing more jobs to share a GPU (see the sketch after this list).
  • Ceph, with quotas to manage storage use in OSiRIS.
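
To make the GPU-sharing point concrete, the sketch below wraps the standard nvidia-smi MIG commands in Python. The GPU index and the choice of two 1g.10gb slices (the smallest of the seven-way profiles on an 80 GB A100) are illustrative assumptions, and the commands require administrative privileges on the node.

    # Sketch: partition GPU 0 into MIG instances via nvidia-smi (run as root).
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # Enable MIG mode on GPU 0 (takes effect after a GPU reset).
    run(["nvidia-smi", "-i", "0", "-mig", "1"])

    # List the GPU instance profiles this card supports (1g.10gb ... 7g.80gb).
    run(["nvidia-smi", "mig", "-lgip"])

    # Create two 1g.10gb GPU instances plus matching compute instances (-C),
    # leaving room on the card for additional or larger slices.
    run(["nvidia-smi", "mig", "-i", "0", "-cgi", "1g.10gb,1g.10gb", "-C"])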

Monitoring and Accounting

  • Elasticsearch, which gathers data from syslogs and other sources for aggregation, visualization, analytics, and correlation (see the query sketch after this list).
  • CheckMK, intelligent server and host monitoring system capable of validating service states and tracking resource usage.
  • perfSONAR, used to test and monitor network behavior across infrastructure.
  • AlmaLinux 9, whose built-in accounting and auditing facilities augment usage and security information.
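
As a small example of the aggregation described above, the sketch below uses the official elasticsearch Python client to count syslog events per host over the last hour; the endpoint, credentials, index pattern (syslog-*), and field name (host.name) are assumptions that depend on how the logs are actually ingested.

    # Sketch: per-host syslog event counts for the last hour.
    # Endpoint, API key, index pattern, and field names are assumptions.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("https://elastic.example.org:9200", api_key="...")

    resp = es.search(
        index="syslog-*",
        size=0,                               # aggregations only, no raw hits
        query={"range": {"@timestamp": {"gte": "now-1h"}}},
        aggs={"by_host": {"terms": {"field": "host.name", "size": 10}}},
    )

    for bucket in resp["aggregations"]["by_host"]["buckets"]:
        print(bucket["key"], bucket["doc_count"])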

Deployments of many of these tools (CoManage, Grouper, Elasticsearch, CheckMK, perfSONAR) were already in place for OSiRIS and have been reconfigured to accommodate HORUS. HTCondor is the HORUS batch scheduler; working with OSG/PATh, we configure the appropriate connection to users’ hosted Compute Elements (CEs), which are based on HTCondor-CE. The HTCondor services, along with all other required HORUS services, will rely on the virtualization platform already in place for OSiRIS, which includes four powerful virtualization hosts running libvirt at each of our sites (UM, MSU, WSU). In addition, we have access to SLATE infrastructure, which can orchestrate containers via Kubernetes when particular tools or jobs would benefit from that.