ethhostadmin(8) | EFSFFCLIRG (Man Page) | ethhostadmin(8) |
NAME
ethhostadmin
Performs a number of multi-step host initialization and verification operations, including upgrading software, rebooting hosts, and other operations. In general, operations performed by ethhostadmin involve a login to one or more host systems.
Syntax
ethhostadmin [-c] [-e] [-f hostfile] [-h 'hosts']
[-r release] [-I install_options] [-U upgrade_options]
[-d dir]
[-T product] [-P packages] [-S] operation ...
Options
- --help
-
Produces full help text.
- -c
-
Overwrites the result files from any previous run before starting this run.
- -e
-
Exits after the first operation that fails.
- -f hostfile
-
Specifies the file with the names of hosts in a cluster. Default is the /etc/eth-tools/hosts file.
- -h hosts
-
Specifies the list of hosts to execute the operation against.
- -r release
-
Specifies the software version to load/upgrade to. Default is the version of Intel(R) Ethernet Fabric Suite Software presently being run on the server.
- -d dir
-
Specifies the directory from which to retrieve product.release.tgz for load or upgrade.
- -I install_options
-
Specifies the software install options.
- -U upgrade_options
-
Specifies the software upgrade options.
- -T product
-
Specifies the product type to install. Options include:
- IntelEth-Basic.<distro> (default)
- IntelEth-FS.<distro>
where <distro> is the distribution and CPU, such as RHEL81-x86_64.
- -P packages
-
Specifies the packages to install. Default is eth eth_rdma. Refer to INSTALL -C for the complete list of packages.
- -S
-
Securely prompts for user password on remote system.
- operation
-
Performs the specified operation, which can be one or more of the following:
load Starts initial installation of all hosts.
upgrade Upgrades installation of all hosts.
reboot Reboots hosts, ensures they go down and come back.
rping Verifies this host can ping each host through RDMA.
pfctest Verifies PFC works on all hosts.
mpiperf Verifies latency and bandwidth for each host.
mpiperfdeviation Verifies latency and bandwidth for each host against a defined threshold (or relative to average host performance).
Example
ethhostadmin -c reboot
ethhostadmin upgrade
ethhostadmin -h 'elrond arwen' reboot
HOSTS='elrond arwen' ethhostadmin reboot
Details
ethhostadmin provides detailed logging of its results. During each run, the following files are produced:
- test.res : Appended with summary results of run.
- test.log : Appended with detailed results of run.
- save_tmp/ : Contains a directory per failed test with detailed logs.
- test_tmp*/ : Intermediate result files while test is running.
The -c option removes all log files.
Results from ethhostadmin are grouped into test suites, test cases, and test items. A given run of ethhostadmin represents a single test suite. Within a test suite, multiple test cases occur; typically one test case per host being operated on. Some of the more complex operations may have multiple test items per test case. Each test item represents a major step in the overall test case.
Each ethhostadmin run appends to test.res and test.log, and creates temporary files in test_tmp$PID in the current directory. test.res provides an overall summary of operations performed and their results; the same information is also displayed while ethhostadmin is executing. test.log contains detailed information about what was performed, including the specific commands executed and the resulting output. The test_tmp directories contain temporary files that reflect tests in progress (or killed). Logs for any failures are saved in the save_tmp directory, with one directory per failed test case. If the same test case fails more than once, save_tmp retains the information from the first failure. Subsequent runs of ethhostadmin append to test.log. Intel recommends reviewing failures and using the -c option to remove old logs before subsequent runs of ethhostadmin.
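The result layout described above can be inspected with ordinary shell tools. The following is a minimal sketch, not part of ethhostadmin: the function name count_failed_cases is hypothetical, and it assumes the failure directory is named save_tmp (as in the file list above) with one subdirectory per failed test case.

```shell
# Hypothetical helper: count failed test cases left by earlier runs.
# Usage: count_failed_cases [results_dir]   (defaults to the current directory)
# Assumption: each failed test case has its own subdirectory under save_tmp/.
count_failed_cases() {
    results_dir="${1:-.}"
    if [ ! -d "$results_dir/save_tmp" ]; then
        echo 0
        return 0
    fi
    # Count only directories directly below save_tmp; tr strips the padding
    # that some wc implementations add.
    find "$results_dir/save_tmp" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}
```

A nonzero count suggests reviewing the saved logs before using the -c option, since -c removes them.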
ethhostadmin performs its operations in parallel by default. As with the other tools, FF_MAX_PARALLEL can be exported to change the degree of parallelism. The default is 1000 parallel operations.
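As a sketch of how the degree of parallelism might be lowered for a single run, the hypothetical wrapper below sets FF_MAX_PARALLEL only for one invocation. The function name run_limited and the value 20 in the usage note are illustrative, not recommendations.

```shell
# Hypothetical wrapper: run ethhostadmin with a limited degree of parallelism.
# Usage: run_limited <max_parallel> <ethhostadmin arguments...>
run_limited() {
    max_parallel="$1"
    shift
    # Prefixing the assignment passes the value to this one invocation
    # instead of exporting it for the rest of the session.
    FF_MAX_PARALLEL="$max_parallel" ethhostadmin "$@"
}
```

For example, run_limited 20 -c reboot would reboot the hosts with at most 20 concurrent operations.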
Environment Variables
The following environment variables are also used by this command:
- HOSTS
-
List of hosts, used if -h option not supplied.
- HOSTS_FILE
-
File containing list of hosts, used in absence of -f and -h.
- FF_MAX_PARALLEL
-
Maximum number of concurrent operations to perform.
- FF_SERIALIZE_OUTPUT
-
Serialize output of parallel operations (yes or no).
- FF_TIMEOUT_MULT
-
Multiplier for all timeouts associated with this command. Used if the systems are slow for some reason.
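The variables above can be combined on a single command line. The sketch below is illustrative only: the function name reboot_slow_cluster and the default multiplier of 2 are assumptions, and it assumes that HOSTS is not also set in the environment, since the precedence between HOSTS and HOSTS_FILE is not described here.

```shell
# Hypothetical helper: reboot the hosts listed in a file on a slow cluster.
# Usage: reboot_slow_cluster <hosts_file> [timeout_multiplier]
reboot_slow_cluster() {
    hosts_file="$1"
    mult="${2:-2}"   # illustrative default, not a documented value
    # HOSTS_FILE is consulted because neither -f nor -h is passed below.
    HOSTS_FILE="$hosts_file" \
    FF_TIMEOUT_MULT="$mult" \
    FF_SERIALIZE_OUTPUT=yes \
        ethhostadmin reboot
}
```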
ethhostadmin Operation Details
(Host) Intel recommends that you set up passwordless SSH or SCP for use during this operation. Alternatively, the -S option can be used to securely prompt for a password, in which case the same password is used for all hosts. As a further alternative, the password may be put in the environment or the ethfastfabric.conf file using FF_PASSWORD and FF_ROOTPASS.
- load
-
Performs an initial installation of Intel(R) Ethernet Fabric Suite Software on a group of hosts. Any existing installation is uninstalled and existing configuration files are removed. Subsequently, the hosts are installed with a default Intel(R) Ethernet Fabric Suite Software configuration. The -I option can be used to select different install packages. Default is eth_tools eth_rdma mpi. The -r option can be used to specify a release to install other than the one that this host is presently running. The FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example, IntelEth-Basic.version.tgz) is expected to exist in the directory specified by -d. Default is the current working directory. The specified software is copied to all the selected hosts and installed.
- upgrade
-
Upgrades all selected hosts without modifying existing configurations. This operation is comparable to the -U option when running ./INSTALL manually. The -r option can be used to upgrade to a release different from the one on this host. The default is to upgrade to the same release as this host. The FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example, IntelEth-Basic.version.tgz) is expected to exist in the directory specified by -d. The default is the current working directory. The specified software is copied to all the end nodes and installed.
NOTE: Only components that are currently installed are upgraded. This operation fails for hosts that do not have Intel(R) Ethernet Fabric Suite Software installed.
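Because both load and upgrade expect the package file to be present in the -d directory, a pre-flight check can catch a missing file before any host is touched. The following is a minimal sketch under stated assumptions: the function name run_install_op is hypothetical, and it assumes the value passed to -r matches the version portion of the file name.

```shell
# Hypothetical pre-flight wrapper for the load and upgrade operations.
# Usage: run_install_op <load|upgrade> <product> <version> [dir]
run_install_op() {
    op="$1"; product="$2"; version="$3"; dir="${4:-.}"
    case "$op" in
        load|upgrade) ;;
        *) echo "unsupported operation: $op" >&2; return 2 ;;
    esac
    # Expected name: FF_PRODUCT.FF_PRODUCT_VERSION.tgz in the -d directory.
    pkg="$dir/${product}.${version}.tgz"
    if [ ! -f "$pkg" ]; then
        echo "expected package not found: $pkg" >&2
        return 1
    fi
    ethhostadmin -d "$dir" -r "$version" "$op"
}
```

The wrapper returns 1 without invoking ethhostadmin when the file is missing, and 2 for any operation other than load or upgrade.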
- reboot
-
Reboots the given hosts and ensures they go down and come back up by pinging them during the reboot process. The ping rate is slow (5 seconds), so if the servers boot faster than this, false failures may be seen.
- rping
-
Verifies basic RDMA operation by ensuring that the nodes can ping each other through RDMA. To run this command, Intel(R) Ethernet Fabric software must be installed, RDMA must be configured and running on the host, and the given hosts and switches must be up.
- pfctest
-
Performs an empirical test that verifies PFC is working correctly. To run this command, Intel(R) Ethernet Fabric software must be installed, PFC must be configured on both hosts and switches, and the given hosts and switches must be up.
- mpiperf
-
Verifies that MPI is operational and checks MPI end-to-end latency and bandwidth between pairs of nodes (for example, 1-2, 3-4, 5-6). Use this to verify switch latency/hops, PCI bandwidth, and overall MPI performance. The test.res file contains the results of each pair of nodes tested.
NOTE: This option is available for the Intel(R) Ethernet Host Software OFA Delta packaging, but is not presently available for other packagings of OFED.
- To obtain accurate results, this test should be run at a time when no other stressful applications (for example, MPI jobs or high stress file system operations) are running on the given hosts.
- Bandwidth issues typically indicate server configuration issues (for example, incorrect slot used, incorrect BIOS settings, or incorrect NIC model), or fabric issues (for example, symbol errors, incorrect link width, or speed). Assuming ethreport has previously been used to check for link errors and link speed issues, the server configuration should be verified.
- Note that BIOS settings and differences between server models can account for 10-20% differences in bandwidth. For more details about BIOS settings, consult the documentation from the server supplier and/or the server PCI chipset manufacturer.
- mpiperfdeviation
-
Runs an enhanced version of mpiperf that verifies MPI performance. It can be used to verify switch latency/hops, PCI bandwidth, and overall MPI performance. It performs assorted pair-wise bandwidth and latency tests, and reports pairs outside an acceptable tolerance range. The tool identifies specific nodes that have problems and provides a concise summary of results. The test.res file contains the results of each pair of nodes tested.
- By default, concurrent mode is used to quickly analyze the fabric and host performance. Pairs that have 20% less bandwidth or 50% more latency than the average pair are reported as failures.
- The tool can be run in a sequential or a concurrent mode. Sequential mode runs each host against a reference host. By default, the reference host is selected based on the best performance from a quick test of the first 40 hosts. In concurrent mode, hosts are paired up and all pairs are run concurrently. Since there may be fabric contention during such a run, any poor performing pairs are then rerun sequentially against the reference host.
- Concurrent mode runs the tests in the shortest amount of time; however, the results could be slightly less accurate due to switch contention. In heavily oversubscribed fabric designs, if concurrent mode is producing unexpectedly low performance, try sequential mode.
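The mode is selected through the FF_DEVIATION_ARGS parameter described later in this section. The fragment below is a sketch only: it assumes that ethfastfabric.conf accepts shell-style variable assignments and that the -c flag of the deviation application selects concurrent mode. The tolerance values mirror the 20% bandwidth and 50% latency figures given above. The host name elrond is borrowed from the Example section.

```shell
# Assumption: ethfastfabric.conf accepts shell-style variable assignments.

# Concurrent mode (-c) with a 20% bandwidth and 50% latency tolerance.
export FF_DEVIATION_ARGS="-bwtol 20 -lattol 50 -c"

# Sequential mode: omit -c. Left commented out so only one setting applies.
# export FF_DEVIATION_ARGS="-bwtol 20 -lattol 50"

# Sequential mode against an explicitly chosen reference host.
# export FF_DEVIATION_ARGS="-bwtol 20 -lattol 50 -h elrond"
```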
NOTE: This option is available for the Intel(R) Ethernet Host Software OFA Delta packaging, but is not presently available for other packagings of OFED.
- To obtain accurate results, this test should be run at a time when no other stressful applications (for example, MPI jobs or high stress file system operations) are running on the given hosts.
- Bandwidth issues typically indicate server configuration issues (for example, incorrect slot used, incorrect BIOS settings, or incorrect NIC model), or fabric issues (for example, symbol errors, incorrect link width, or speed). Assuming ethreport has previously been used to check for link errors and link speed issues, the server configuration should be verified.
- Note that BIOS settings and differences between server models can account for 10-20% differences in bandwidth. A result 5-10% below the average is typically not cause for serious alarm, but may reflect limitations in the server design or the chosen BIOS settings.
- For more details about BIOS settings, consult the documentation from the server supplier and/or the server PCI chipset manufacturer.
- The deviation application supports a number of parameters that allow for more precise control over the mode, benchmark, and pass/fail criteria. The parameters to use can be selected using the FF_DEVIATION_ARGS configuration parameter in ethfastfabric.conf.
- Available parameters for deviation application:
-
[-bwtol bwtol] [-bwdelta MBs] [-bwthres MBs]
[-bwloop count] [-bwsize size] [-lattol lattol]
[-latdelta usec] [-latthres usec] [-latloop count]
[-latsize size] [-c] [-b] [-v] [-vv]
[-h reference_host]
-bwtol Specifies the percent of bandwidth degradation allowed below average value.
-bwbidir Performs a bidirectional bandwidth test.
-bwunidir Performs a unidirectional bandwidth test (Default).
-bwdelta Specifies the limit in MB/s of bandwidth degradation allowed below average value.
-bwthres Specifies the lower limit in MB/s of bandwidth allowed.
-bwloop Specifies the number of loops to execute each bandwidth test.
-bwsize Specifies the size of message to use for bandwidth test.
-lattol Specifies the percent of latency degradation allowed above average value.
-latdelta Specifies the limit in µsec of latency degradation allowed above average value.
-latthres Specifies the upper limit in µsec of latency allowed.
-latloop Specifies the number of loops to execute each latency test.
-latsize Specifies the size of message to use for latency test.
-c Runs test pairs concurrently instead of the default of sequential.
-b When comparing results against tolerance and delta, uses best instead of average.
-v Enables verbose output.
-vv Enables very verbose output.
-h Specifies the reference host to use for sequential pairing.
- Both bwtol and bwdelta must be exceeded to fail bandwidth test.
- When bwthres is supplied, bwtol and bwdelta are ignored.
- Both lattol and latdelta must be exceeded to fail latency test.
- When latthres is supplied, lattol and latdelta are ignored.
- For consistency with OSU benchmarks, MB/s is defined as 1000000 bytes/s.
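The bandwidth pass/fail rules above can be worked through with a small calculation. The function below is a sketch of those rules as stated here, not the deviation application's actual code: the name bw_check is hypothetical, "exceeded" is assumed to mean strictly greater than, and the reference value stands for the average (or the best, when -b is used).

```shell
# Hypothetical illustration of the bandwidth pass/fail rules.
# Usage: bw_check <reference_MBs> <measured_MBs> <bwtol_percent> <bwdelta_MBs> [bwthres_MBs]
bw_check() {
    ref="$1"; measured="$2"; tol="$3"; delta="$4"; thres="${5:-}"
    awk -v ref="$ref" -v m="$measured" -v tol="$tol" -v delta="$delta" -v thres="$thres" '
    BEGIN {
        if (thres != "") {
            # When bwthres is supplied, bwtol and bwdelta are ignored.
            fail = (m < thres)
        } else {
            # Both bwtol and bwdelta must be exceeded to fail.
            drop = ref - m
            fail = (drop > ref * tol / 100 && drop > delta)
        }
        print (fail ? "FAIL" : "PASS")
    }'
}
```

With a reference of 12000 MB/s, bwtol 20, and bwdelta 150, a measured 9000 MB/s fails because the 3000 MB/s drop exceeds both the 2400 MB/s tolerance and the delta. The same measurement passes if bwdelta is 5000, or if a bwthres of 8000 is supplied instead. The latency rules follow the same pattern with the comparison reversed.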
Intel Corporation | Copyright(C) 2020-2022 |