The HPE Performance Cluster Manager (HPCM) administration course provides knowledge and practice installing HPCM, managing data networks, provisioning servers, creating and modifying server images, working with software repositories and image version control, automating post installation tasks, configuring services, reviewing security features, and troubleshooting.
TARGET AUDIENCE:
• Attend this class if you need to learn to
install, configure and administer clusters
managed with the HPE Performance Cluster
Manager (HPCM)
• Experienced Linux system administrators
COURSE PREREQUISITES:
• H8PE8S: HPE Performance Cluster
Management Foundations
• The following Linux system administration
skills are prerequisites for this course:
– Edit text with the vi editor
– Recognize regular expression syntax
– Access documentation with man and info file
viewers
– Monitor, manage and maintain log files
– Enter common commands at the bash
command line; create and interpret basic
bash shell scripts
– Install and configure standard software
components, services and security features
– Configure basic communication protocols
– Create and modify crontabs
– Monitor resource usage; be familiar with
basic monitoring tools
– Install and configure a Linux distribution on
a server
– Create, modify, and delete user accounts
and group accounts
– Partition disks, manage filesystems and
logical volumes
– Use RPM package management
– Install and use virtualized systems
– Understand basic hardware and hardware
troubleshooting
COURSE CONTENT:
Module 1: Install Cluster
• Describe HPCM features
• Define operating system slots
• Build cluster from ground up
• Provision node with GUI
• Provision node with command line
• Add nodes to the cluster
• Explore auto installation tools
Module 2: Discover
• Discover nodes
• Interpret cluster configuration files
• Review cluster services
Module 3: Data Networks
• Describe technologies
• Describe InfiniBand configuration
• Describe Intel Omni-Path configuration
• Describe software components
• Use diagnostic commands
Module 4: Manage Images
• Manage software repositories
• List software repositories
• Add software repositories
• Remove software repositories
• Create repository groups
• Customize an image by using RPM lists
• Create a compute node image
• Create an ICE-compute node image
• Manage image version control
• Check an image into version control
• Compare differences between two versions of an
image
• List the versions of an image
• Deploy a specific version of an image
• Push an ICE-compute image to a rack
• Use parallel tools and built-in functionality to check
differences between nodes
• Enable hyperthreading
• Disable hyperthreading
• Configure array services
• Install batch scheduler server on a compute node
• Install batch scheduler client on a compute node and on
an ICE compute node
• Configure HPCM connectors to job schedulers
• Capture an image from a node (golden)
• Add RPMs to, remove RPMs from, and version control
compute images
• Add and remove RPMs from running compute nodes
• Clone an ICE-compute image
• Clean up old images on the lead node
• Add RPMs to ICE compute image
• Compare when and when not to use tmpfs root
• Determine which nodes use tmpfs root
• Configure nodes to use tmpfs root
• List tmpfs quota difference (rack leader quotas do not
apply when ICE-compute nodes are in tmpfs)
• Set tmpfs mode
• Set disk mode
• Show which mode a node has booted with
• Show which mode a node is scheduled to boot into
• Perform a clone operating system slot operation
Module 5: Automate Post Installation Tasks
• Review conf.d scripts
• Exclude a conf.d script
• Use pre_reconf.sh
• Use reconfig.sh
• Develop post install and per-host customization scripts
Module 6: Configure Shared Filesystem, User Accounts,
Applications, and Updates
• Export a filesystem over NFS on a compute node
• Mount an NFS filesystem and create a user on an ICE
compute node
• Manage user accounts
• Synchronize UIDs and GIDs, LDAP, etc.
• Run an application on compute and ICE compute
nodes
• Display BIOS settings
• Upgrade firmware
• Update kernel
• Update distribution
• Update HPCM
Module 7: Troubleshoot Cluster
• Back up cluster configuration
• Back up managed network switch configuration
• Use the central log repository
• Investigate log files
• Gather system information
• Interrogate iLOs, BMCs
• Confirm resources
• Create pdsh groups
• Investigate bond devices
• Inspect VLAN devices
• Capture a node crash dump
• Transfer an image from another slot or another system
and confirm that the image can be used
• Inject faults
Module 8: Review Cluster Security
• Describe system administrator configurable security
tasks
• Describe what makes cluster security different from
standalone security (how would change X break the
cluster)
• List ports used for each node role and for which
interfaces
• List components with passwords
– Admin node
– Flat compute nodes
– Rack leader nodes
– ICE compute nodes
– BMCs
– CMCs
– Ethernet network switches
– InfiniBand and Omni-Path switches
– IB/OPA switch BMCs
– Storage controllers
• List components that can have passwords applied
COURSE OBJECTIVE:
At the conclusion of this course, you should be
able to:
• Install HPCM
• Add servers to the cluster
• Manage data networks
• Provision nodes
• Create and modify images and software repositories
• Use image version control
• Automate post installation tasks
• Configure shared filesystem, user accounts,
applications and updates
• Troubleshoot cluster services
• Review cluster security features
FOLLOW ON COURSES:
Not available. Please contact.