Provided by: perftest_24.01.0+0.38-1build2_amd64 

NAME
ib_write_bw, ib_read_bw, ib_send_bw, ib_atomic_bw, ib_write_lat, ib_read_lat, ib_send_lat, ib_atomic_lat,
raw_ethernet_bw, raw_ethernet_lat, raw_ethernet_burst_lat, raw_ethernet_fs_rate - benchmarks for various
types of InfiniBand performance
DESCRIPTION
Perftest is a package of benchmarks that measure different metrics and verbs
performance, with many different options and modes.
RUNNING TESTS
Server:
./<test name> <options>
Client:
./<test name> <options> <server IP address>
Examples:
1- Running a bidirectional bandwidth test using the Write verb for 5 seconds with a message
size of 8388608 bytes and 3 QPs:
Server: ./ib_write_bw -s 8388608 -b -D 5 -q 3
Client: ./ib_write_bw -s 8388608 -b -D 5 -q 3 1.1.1.2
2- Running a latency test using the Read verb for 5000 iterations with a message size of 32 bytes:
Server: ./ib_read_lat -s 32 -n 5000
Client: ./ib_read_lat -s 32 -n 5000 192.168.0.1
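Since the client command line is the server command line plus the server address, both can be derived from one option list. The following is a minimal Python sketch of that idea; the helper name build_commands is hypothetical and not part of perftest, and it only assembles argument lists without launching anything.

```python
def build_commands(test, options, server_ip):
    """Return (server_argv, client_argv) built from one shared option list.

    build_commands is an illustrative helper, not part of perftest.
    """
    base = [f"./{test}", *options]
    # The client uses the same options, with the server address appended last.
    return base, base + [server_ip]


server, client = build_commands(
    "ib_write_bw", ["-s", "8388608", "-b", "-D", "5", "-q", "3"], "1.1.1.2"
)
print(" ".join(server))
print(" ".join(client))
```

Keeping a single option list makes it harder for the two sides to drift apart when a flag is changed.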
IMPORTANT NOTES
1- Options that are specific to a mode in perftest must be the same on both server and client.
2- Perftest applications may need to be run with sudo when invoked by a non-root user.
3- Perftest applications are usually installed in /usr/bin/.
4- Perftest may print some failures with syndromes to stderr; perftest gets those errors from
rdma-core.
OPTIONS
-h, --help
Lists the available options to the screen.
-a, --all
Run sizes from 2 to 2^23.
Not relevant for Atomic and RawEth.
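To make the range concrete, the sketch below lists the sizes covered by a sweep from 2 to 2^23. It assumes the size doubles at each step (powers of two only); the function name all_sizes is hypothetical and used only for illustration.

```python
def all_sizes(max_exp=23):
    """Message sizes 2^1 .. 2^max_exp, assuming the sweep doubles the size each step."""
    return [2 ** e for e in range(1, max_exp + 1)]


sizes = all_sizes()
print(len(sizes), sizes[0], sizes[-1])
```

The largest value, 8388608, is the same message size used in the first example under RUNNING TESTS.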
-A, --atomic_type=<type>
Type of atomic operation from {CMP_AND_SWAP,FETCH_AND_ADD} (default FETCH_AND_ADD).
Relevant only for Atomic.
-b, --bidirectional
Measure bidirectional bandwidth (default unidirectional).
Relevant only for BW.
-c, --connection=<RC/XRC/UC/UD/DC/SRD>
Connection type RC/XRC/UC/UD/DC/SRD (default RC).
UD relevant only for Send verb.
SRD relevant only for Read, Write and Send verbs.
UC relevant only for Write and Send verbs.
Not relevant for RawEth.
--log_dci_streams=<log_num_dci_stream_channels> (default 0)
Run DC initiator as DCS instead of DCI with <log_num_dci_stream_channels>.
Not relevant for RawEth.
System support required.
--log_active_dci_streams=<log_num_active_dci_stream_channels> (default log_num_dci_stream_channels)
Not relevant for RawEth.
System support required.
--aes_xts
Runs traffic with AES_XTS feature (encryption).
Not relevant for RawEth and Write latency.
System support required.
--encrypt_on_tx
Runs traffic with encryption on tx (default decryption on tx).
Not relevant for RawEth and Write latency.
System support required.
--sig_before
Puts signature on data before encrypting it (default after).
Not relevant for RawEth and Write latency.
System support required.
--aes_block_size=<512,520,4048,4096,4160> (default 512)
Not relevant for RawEth and Write latency.
System support required.
--data_enc_keys_number=<number of data encryption keys> (default 1)
Not relevant for RawEth and Write latency.
System support required.
--kek_path <path to the key encryption key file>
Not relevant for RawEth and Write latency.
System support required.
--credentials_path <path to the credentials file>
Not relevant for RawEth and Write latency.
System support required.
--data_enc_key_app_path <path to the data encryption key app>
Not relevant for RawEth and Write latency.
System support required.
-C, --report-cycles
Report times in cpu cycle units (default microseconds).
Relevant only for latency.
-d, --ib-dev=<dev>
Use IB device <dev> (default first device found).
-D, --duration
Run test for a customized period of seconds.
-e, --events
Sleep on CQ events (default poll).
Not relevant for Write and RawEth.
-X, --vector=<completion vector>
Set <completion vector> used for events.
Not relevant for Write and RawEth.
-f, --margin
Measure results within margins (default 2 sec).
-F, --CPU-freq
Do not show a warning even if the cpufreq_ondemand module is loaded and cpu-freq is not at max.
-g, --mcg
Send messages to multicast group with 1 QP attached to it.
When there is no multicast gid specified, a default IPv6 typed gid
'255:1:0:0:0:2:201:133:0:0:0:0:0:0:0:0' will be used.
Relevant only for send non fsRate.
-H, --report-histogram
Print out all results (default print summary only).
Relevant only for latency and raw_ethernet_fs_rate.
-i, --ib-port=<port>
Use port <port> of IB device (default 1).
-I, --inline_size=<size>
Max size of message to be sent inline.
Not relevant for Read and Atomic.
-l, --post_list=<list size>
Post list of send WQEs of <list size> size (instead of single post).
Relevant only for BW and raw_ethernet_burst_lat.
--recv_post_list=<list size>
Post list of receive WQEs of <list size> size (instead of single post).
Relevant only for BW and raw_ethernet_burst_lat.
-L, --hop_limit=<hop_limit>
Set hop limit value (ttl for IPv4 RawEth QP). Values 0-255 (default 64).
Relevant only for RawEth.
Not relevant for raw_ethernet_fs_rate.
-m, --mtu=<mtu>
MTU size: 64 - 9600 (default port MTU) for RawEth, otherwise 256 - 4096.
Not relevant for raw_ethernet_fs_rate.
-M, --MGID=<multicast_gid>
In multicast, uses <multicast_gid> as the group MGID.
<multicast_gid> can be either decimal or hexadecimal, e.g. for the IPv4 address 224.0.0.30:
Decimal: 0:0:0:0:0:0:0:0:0:0:255:255:224:0:0:30
Hexadecimal: 0:0:0:0:0:0:0:0:0:0:0xff:0xff:0xe0:0:0:0x1e
Relevant only for send non fsRate.
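The 224.0.0.30 example follows the IPv4-mapped layout: ten zero bytes, two 0xff bytes, then the four address bytes. The Python sketch below reproduces both notations from an IPv4 address; ipv4_to_mgid is a hypothetical helper, not something perftest provides, and the hexadecimal style (bare 0 for zero bytes) is simply copied from the example in this page.

```python
import ipaddress


def ipv4_to_mgid(addr, hexadecimal=False):
    """Format an IPv4 address as a 16-byte, colon-separated MGID string."""
    octets = ipaddress.IPv4Address(addr).packed
    # IPv4-mapped layout: 10 zero bytes, two 0xff bytes, then the 4 address bytes.
    raw = bytes(10) + b"\xff\xff" + octets
    if hexadecimal:
        # Zero bytes are printed as a bare 0, as in the example in this page.
        return ":".join(format(b, "#x") if b else "0" for b in raw)
    return ":".join(str(b) for b in raw)


print(ipv4_to_mgid("224.0.0.30"))
print(ipv4_to_mgid("224.0.0.30", hexadecimal=True))
```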
-n, --iters=<iters>
Number of exchanges (at least 5; default 5000 for Write, else 1000).
-N, --noPeak
Cancel peak-bw calculation (default with peak up to iters=20000).
Relevant only for bandwidth.
-o, --outs=<num>
Relevant only for Read and Atomic.
-O, --dualport
Run test in dual-port mode.
Not relevant for RawEth.
Relevant only for bandwidth.
System support required.
-p, --port=<port>
Listen on/connect to port <port> (default 18515).
-q, --qp=<num of qp's>
Number of QPs (default 1).
Relevant only for bandwidth.
-Q, --cq-mod
Generate a CQE only after <--cq-mod> completions.
Relevant only for bandwidth.
-r, --rx-depth=<dep>
Rx queue size (default 512). If using SRQ, rx-depth controls the max-wr size of the SRQ.
Relevant only for send non fsRate.
-R, --rdma_cm
Connect QPs with rdma_cm and run test on those QPs.
Not relevant for RawEth.
-s, --size=<size>
Size of message to exchange (default 65536 for bw, 2 for lat).
Not relevant for Atomic.
-S, --sl=<sl>
SL (default 0).
Not relevant for raw_ethernet_fs_rate.
-t, --tx-depth=<dep>
Size of tx queue (default 128 for bw else 1).
Relevant only for bw and raw_ethernet_burst_lat.
-T, --tos=<tos value>
Set <tos value> on RDMA-CM QPs. Available only with the -R flag. Values 0-256 (default off).
Not relevant for RawEth.
-u, --qp-timeout=<timeout>
QP timeout. The timeout value is 4 usec * 2^(timeout) (default 14).
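As a worked instance of the formula above, the following Python sketch converts a timeout exponent into microseconds. It uses the 4 usec base exactly as stated in this page; the function name qp_timeout_usec is hypothetical.

```python
def qp_timeout_usec(timeout):
    """Timeout in microseconds for a given exponent: 4 usec * 2^timeout."""
    return 4 * 2 ** timeout


print(qp_timeout_usec(14))  # default exponent
```

With the default exponent of 14 this gives 65536 usec, roughly 65.5 msec.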
-U, --report-unsorted
Print out unsorted results (default sorted). Implies -H.
Relevant only for latency, raw_ethernet_burst_lat and raw_ethernet_fs_rate.
-V, --version
Display perftest version number.
-W, --report-counters=<list of counter names>
Report performance counter change (example: counters/port_xmit_data,hw_counters/out_of_buffer).
-x, --gid-index=<index>
Test uses the GID at GID index <index>.
Not relevant for RawEth.
-z, --comm_rdma_cm
Communicate with rdma_cm module to exchange data - use regular QPs.
Not relevant for RawEth.
--out_json
Save the report in a json file.
--out_json_file=<file>
Name of the report json file. (Default: "perftest_out.json" in the working directory).
--cpu_util
Show CPU Utilization in report, valid only in Duration mode.
--dlid
Set a Destination LID instead of getting it from the other side.
Not relevant for raw_ethernet_fs_rate.
--dont_xchg_versions
Do not exchange versions and MTU with other side.
Not relevant for RawEth.
--force-link=<value>
Force the link(s) to a specific type: IB or Ethernet.
Not relevant for raw_ethernet_fs_rate.
--use-srq
Use a Shared Receive Queue. --rx-depth controls max-wr size of the SRQ.
Relevant only for Send.
--ipv6
Use IPv6 GID. Default is IPv4.
Not relevant for RawEth.
--ipv6-addr=<IPv6>
Use IPv6 address for parameters negotiation. Default is IPv4.
Not relevant for RawEth.
--bind_source_ip
Source IP of the interface used for connection establishment. By default taken from routing
table.
Not relevant for RawEth.
--latency_gap=<delay_time>
Delay time between each post send.
Relevant only for latency.
--mmap=<file>
Use an mmap'd file as the buffer for testing P2P transfers.
Not relevant for RawEth.
--mmap-offset=<offset>
The mmap offset.
Not relevant for RawEth.
--mr_per_qp
Create memory region for each qp.
Relevant only for bandwidth.
--odp
Use On Demand Paging instead of Memory Registration.
System support required.
--output=<units>
Set verbosity output level: bandwidth, message_rate, latency.
Latency measurement is Average calculation.
bw (bandwidth / message_rate), latency (latency).
--payload_file_path=<payload_txt_file_path>
Set the payload by passing a txt file containing a pattern in the following form (little endian):
'0xaaaaaaaa, 0xbbbbbbbb, ...'
Not relevant for RawEth and Write latency.
--use_old_post_send
Use old post send flow (ibv_post_send).
--perform_warm_up
Perform some iterations before measuring starts in order to warm up the memory cache.
Not relevant for raw_ethernet_fs_rate.
--pkey_index=<pkey index>
PKey index to use for QP.
Not relevant for raw_ethernet_fs_rate.
--report-both
Report RX & TX results separately on Bidirectional BW tests.
Relevant only for bidirectional bandwidth.
--report_gbits
Report Max/Average BW of test in Gbit/sec (instead of MiB/sec).
Relevant only for bandwidth.
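When comparing results reported in the two units, the conversion can be sketched as below. This assumes 1 MiB = 2^20 bytes and 1 Gbit = 10^9 bits; the function name is hypothetical and the snippet is only an arithmetic aid, not perftest code.

```python
MIB = 2 ** 20  # bytes per MiB


def mib_per_sec_to_gbit_per_sec(mib_per_sec):
    """Convert MiB/sec to Gbit/sec, assuming 1 Gbit = 10^9 bits."""
    return mib_per_sec * MIB * 8 / 1e9


print(mib_per_sec_to_gbit_per_sec(1000))
```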
--report-per-port
Report BW data on both ports when running Dualport and Duration mode.
Not relevant for RawEth.
System support required.
--reversed
Reverse traffic direction - Server sends to client.
--run_infinitely
Run test forever, print results every <duration> seconds.
--retry_count=<value>
Set retry count value in rdma_cm mode.
Relevant only for rdma_cm mode.
Not relevant for RawEth.
--tclass=<value>
Set the Traffic Class in GRH (if GRH is in use).
Not relevant for raw_ethernet_fs_rate.
--use-null-mr
Allocate a null memory region for the client with ibv_alloc_null_mr(3).
--use_cuda=<cuda device id>
Use CUDA specific device for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.
--use_cuda_bus_id=<cuda full BUS id>
Use CUDA specific device, based on its full PCIe address, for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.
--use_cuda_dmabuf
Use CUDA DMA-BUF for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.
--use_hl=<hl device id>
Use HabanaLabs specific device for HW accelerator direct RDMA testing.
System support required.
--use_neuron=<logical neuron core id>
Use Neuron specific device for HW accelerator direct RDMA testing.
System support required.
--use_neuron_dmabuf
Use Neuron DMA-BUF for HW accelerator direct RDMA testing.
System support required.
--use_rocm=<rocm device id>
Use selected ROCm device for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.
--use_hugepages
Use Hugepages instead of contig, memalign allocations.
Not relevant for raw_ethernet_fs_rate.
--wait_destroy=<seconds>
Wait <seconds> before destroying allocated resources (QP/CQ/PD/MR..).
Relevant only for bandwidth and raw_ethernet_burst_lat.
--disable_pcie_relaxed
Disable PCIe relaxed ordering.
Relevant only for bandwidth and raw_ethernet_burst_lat.
System support required.
--burst_size=<size>
Set the number of messages to send in a burst when using rate limiter.
Relevant only for bandwidth and raw_ethernet_burst_lat.
--typical_pkt_size=<bytes>
Set the size of packet to send in a burst. Only supports PP rate limiter.
Relevant only for bandwidth and raw_ethernet_burst_lat.
--rate_limit=<rate>
Set the maximum rate of sent packets. Default unit is [Gbps]; use --rate_units to change that.
Relevant only for bandwidth and raw_ethernet_burst_lat.
--rate_units=<units>
[Mgp] Set the units for rate limit to MiBps (M), Gbps (g) or pps (p). Default is Gbps (g).
Relevant only for bandwidth and raw_ethernet_burst_lat.
--rate_limit_type=<type>
[HW/SW/PP] Limit the QPs by HW, PP or SW. Disabled by default. When rate_limit is not
specified, HW limit is the default.
Relevant only for bandwidth and raw_ethernet_burst_lat.
--use_ooo
Use out of order data placement.
System support required.
--write_with_imm
Use write-with-immediate verb instead of write.
Write tests only.
RawEth only options
-B, --source_mac
Source MAC address in the format XX:XX:XX:XX:XX:XX. **MUST** be entered.
-E, --dest_mac
Destination MAC address in the format XX:XX:XX:XX:XX:XX. **MUST** be entered.
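Since both MAC addresses are mandatory, a quick format check before launching a RawEth test can save a failed run. The Python sketch below validates the XX:XX:XX:XX:XX:XX form, assuming each X is a hexadecimal digit; is_valid_mac is a hypothetical helper and not part of perftest.

```python
import re

# Six colon-separated pairs of hexadecimal digits.
MAC_RE = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")


def is_valid_mac(mac):
    """Return True if mac matches the XX:XX:XX:XX:XX:XX form."""
    return MAC_RE.fullmatch(mac) is not None


print(is_valid_mac("00:11:22:aa:bb:cc"))
print(is_valid_mac("00:11:22:aa:bb"))
```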
-G, --use_rss
Use RSS on the server side. Requires opening 2^x QPs (using the -q flag; default is -q 2). Open 2^x
clients that transmit to this server.
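The 2^x requirement on the QP count can be checked up front. The following Python sketch tests whether a value is a power of two; is_power_of_two is a hypothetical helper, and whether 1 (2^0) is accepted by the tool is not stated in this page, so treat that case as an assumption.

```python
def is_power_of_two(n):
    """Return True if n is a positive integer of the form 2^x."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return n > 0 and (n & (n - 1)) == 0


for qps in (2, 3, 4, 8):
    print(qps, is_power_of_two(qps))
```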
-J, --dest_ip
Destination IP address in the format X.X.X.X for IPv4 or X:X:X:X:X:X for IPv6 (used to send
packets with an IP header).
System support required for IPv6.
-j, --source_ip
Source IP address in the format X.X.X.X for IPv4 or X:X:X:X:X:X for IPv6 (used to send packets
with an IP header).
System support required for IPv6.
-K, --dest_port
Destination port number (used to send packets with a UDP header by default; use the --tcp
flag to send a TCP header instead).
-k, --source_port
Source port number (used to send packets with a UDP header by default; use the --tcp flag
to send a TCP header instead).
-Y, --ethertype
Ethertype value in the Ethernet frame, in the format 0xXXXX.
-Z, --server
Choose server side for the current machine (--server/--client must be selected).
--vlan_en
Insert vlan tag in ethernet header.
--vlan_pcp
Specify vlan_pcp value for vlan tag, 0~7. 8 means different vlan_pcp for each packet.
-P, --client
Choose client side for the current machine (--server/--client must be selected).
Not relevant for raw_ethernet_fs_rate.
-v, --mac_fwd
Run mac forwarding test.
Not relevant for raw_ethernet_fs_rate.
--flows
Set number of TCP/UDP flows, starting from <src_port, dst_port>.
Not relevant for raw_ethernet_fs_rate.
--flows_burst
Set number of burst size per TCP/UDP flow.
Not relevant for raw_ethernet_fs_rate.
--promiscuous
Run promiscuous mode.
Not relevant for raw_ethernet_fs_rate.
--reply_every
In latency tests, the receiver sends a pong after this number of received pings.
Not relevant for raw_ethernet_fs_rate.
--sniffer
Run sniffer mode.
Not relevant for raw_ethernet_fs_rate.
System support required.
--flow_label
IPv6 flow label.
Not relevant for raw_ethernet_fs_rate.
--tcp
Send TCP packets. Must include IP and port information.
--raw_ipv6
Send IPv6 Packets.
System support required.
--raw_mcast
Relevant only for bandwidth.
AUTHORS
Hassan Khadour <hkhadour@nvidia.com>
Talat Batheesh <talatb@nvidia.com>
perftest(1)