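I am running some DPDK crypto development tests with Intel QAT. When using the DPDK application dpdk-test-crypto-perf, I noticed that the throughput is much lower than what is reported in http://fast.dpdk.org/doc/perf/DPDK_20_11_Intel_crypto_performance_report.pdf.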
[root]# ./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem -w 0000:3d:01.0 -w 0000:3d:01.7 -w 0000:3d:02.7 -l 4,5,13,6,14 -n 4 -- --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 --devtype crypto_qat --cipher-iv-sz 16 --auth-op generate --burst-sz 32 --total-ops 30000000 --silent --digest-sz 20 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt
EAL: Detected 36 lcore(s)
EAL: Detected 2 NUMA nodes
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: Probe PCI driver: qat (8086:37c9) device: 0000:3d:01.0 (socket 0)
CRYPTODEV: Creating cryptodev 0000:3d:01.0_qat_sym
CRYPTODEV: Initialisation parameters - name: 0000:3d:01.0_qat_sym,socket id: 0, max queue pairs: 0
CRYPTODEV: Creating cryptodev 0000:3d:01.0_qat_asym
CRYPTODEV: Initialisation parameters - name: 0000:3d:01.0_qat_asym,socket id: 0, max queue pairs: 0
EAL: Probe PCI driver: qat (8086:37c9) device: 0000:3d:01.7 (socket 0)
CRYPTODEV: Creating cryptodev 0000:3d:01.7_qat_sym
CRYPTODEV: Initialisation parameters - name: 0000:3d:01.7_qat_sym,socket id: 0, max queue pairs: 0
CRYPTODEV: Creating cryptodev 0000:3d:01.7_qat_asym
CRYPTODEV: Initialisation parameters - name: 0000:3d:01.7_qat_asym,socket id: 0, max queue pairs: 0
EAL: Probe PCI driver: qat (8086:37c9) device: 0000:3d:02.7 (socket 0)
CRYPTODEV: Creating cryptodev 0000:3d:02.7_qat_sym
CRYPTODEV: Initialisation parameters - name: 0000:3d:02.7_qat_sym,socket id: 0, max queue pairs: 0
CRYPTODEV: Creating cryptodev 0000:3d:02.7_qat_asym
CRYPTODEV: Initialisation parameters - name: 0000:3d:02.7_qat_asym,socket id: 0, max queue pairs: 0
EAL: No legacy callbacks, legacy socket not created
Allocated pool "priv_sess_mp_0" on socket 0
CRYPTODEV: elt_size 0 is expanded to 240
Allocated pool "sess_mp_0" on socket 0
lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq    MOps     Gbps  Cycles/Buf
      13        64          32  30000000  30000000   608024205   590361292  1.4151   0.7245     2190.68
      14        64          32  30000000  30000000   657459751   640717849  1.4149   0.7244     2190.99
       5        64          32  30000000  30000000   605264521   587610893  1.4148   0.7244     2191.05
       6        64          32  30000000  30000000   657948166   641046769  1.4148   0.7244     2191.08
       6       128          32  30000000  30000000   656326323   639492353  1.4120   1.4459     2195.45
       5       128          32  30000000  30000000   603124442   585311776  1.4116   1.4455     2196.03
      13       128          32  30000000  30000000   606420080   588546448  1.4116   1.4455     2196.11
      14       128          32  30000000  30000000   656535199   639674449  1.4115   1.4454     2196.28
       5       256          32  30000000  30000000   612548897   594307290  1.3874   2.8413     2234.44
       6       256          32  30000000  30000000   661625698   644285343  1.3871   2.8407     2234.91
      14       256          32  30000000  30000000   661348938   644025757  1.3869   2.8403     2235.23
      13       256          32  30000000  30000000   615542904   597231595  1.3868   2.8403     2235.29
      13       512          32  30000000  30000000   648544131   629851341  1.3264   5.4330     2337.10
       6       512          32  30000000  30000000   690157406   672252121  1.3264   5.4328     2337.21
      14       512          32  30000000  30000000   688334849   670537999  1.3260   5.4312     2337.88
       5       512          32  30000000  30000000   646173682   627438866  1.3256   5.4297     2338.55
       5      1024          32  30000000  30000000   787708296   768071019  1.1471   9.3970     2702.48
      13      1024          32  30000000  30000000   792311812   772652910  1.1461   9.3888     2704.84
       6      1024          32  30000000  30000000   829463330   810543251  1.1458   9.3866     2705.48
      14      1024          32  30000000  30000000   828377296   809370537  1.1457   9.3860     2705.65
      13      2048          32  30000000  30000000  1399981574  1379798969  0.7602  12.4550     4077.90
      14      2048          32  30000000  30000000  1437924813  1418541434  0.7601  12.4533     4078.45
       6      2048          32  30000000  30000000  1441794521  1422292353  0.7600  12.4514     4079.11
       5      2048          32  30000000  30000000  1399004014  1378687062  0.7600  12.4512     4079.15
Is this expected, or is something wrong with my configuration? There are also a lot of failed enqueues and dequeues (the Failed Enq and Failed Deq columns).
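As a reading aid for the results table above, the following sketch shows how the MOps, Gbps, and Cycles/Buf columns relate to each other. The helper names `gbps_from_mops` and `timer_mhz` are hypothetical, and the second helper assumes that Cycles/Buf counts timer (TSC) cycles, so its result is a timer frequency and not necessarily the clock the core actually ran at.

```shell
#!/bin/sh
# Gbps = MOps (1e6 ops/s) * buffer size in bytes * 8 bits / 1000
gbps_from_mops() {
    awk -v mops="$1" -v bytes="$2" 'BEGIN { printf "%.4f\n", mops * bytes * 8 / 1000 }'
}

# Implied timer frequency in MHz = MOps * cycles per buffer
# (assumption: Cycles/Buf is measured in TSC cycles)
timer_mhz() {
    awk -v mops="$1" -v cyc="$2" 'BEGIN { printf "%.0f\n", mops * cyc }'
}

# Row for lcore 13, 64-byte buffers, from the QAT run
gbps_from_mops 1.4151 64       # prints 0.7245, matching the Gbps column
timer_mhz 1.4151 2190.68       # prints 3100, i.e. about 3.1 GHz
```

Each Gbps value is therefore per lcore, so a run with several lcores has to be summed to get the total for the device.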
[root]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-1127.19.1.rt56.1116.el7.x86_64 root=/dev/mapper/centos-root ro processor.max_cstate=1 intel_idle.max_cstate=0 intel_pstate=disable idle=poll default_hugepagesz=1G hugepagesz=1G hugepages=100 intel_iommu=on iommu=pt selinux=0 enforcing=0 nmi_watchdog=0 audit=0 mce=off kthread_cpus=0,35 irqaffinity=0,35 skew_tick=1 isolcpus=1-34 intel_pstate=disable nosoftlockup nohz=on nohz_full=1-34 rcu_nocbs=1-34
If I use the SW crypto PMD instead, I get better results.
[root] # ./dpdk-test-crypto-perf --socket-mem 2048,0 --legacy-mem --vdev crypto_aesni_mb_pmd_1 -l 4,5 -n 4 -- --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 --devtype crypto_aesni_mb --cipher-iv-sz 16 --auth-op generate --burst-sz 32 --total-ops 10000000 --silent --digest-sz 12 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt
EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
CRYPTODEV: Creating cryptodev crypto_aesni_mb_pmd_1
CRYPTODEV: Initialisation parameters - name: crypto_aesni_mb_pmd_1,socket id: 0, max queue pairs: 8
EAL: No legacy callbacks, legacy socket not created
Allocated pool "priv_sess_mp_0" on socket 0
CRYPTODEV: elt_size 0 is expanded to 320
Allocated pool "sess_mp_0" on socket 0
lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq    MOps     Gbps  Cycles/Buf
       5        64          32  10000000  10000000           0           0  8.4491   4.3259      295.89
       5       128          32  10000000  10000000           0           0  6.9138   7.0798      361.59
       5       256          32  10000000  10000000           0           0  5.2645  10.7816      474.88
       5       512          32  10000000  10000000           0           0  3.5384  14.4931      706.54
       5      1024          32  10000000  10000000           0           0  2.1451  17.5729     1165.43
       5      2048          32  10000000  10000000           0           0  1.1839  19.3968     2111.69
Posted on 2021-11-16 07:44:18
Based on the logs and a live debug session, the conclusion is that the performance is in line with the expected values for both SW and HW. The reasons for the variance are:
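1. HW crypto was run on a Xeon Cascade Lake core, while SW crypto was run on a Core i7.
2. The CPU core running the HW crypto test and the one running the SW crypto test run at different frequencies.
3. The HW crypto run was restricted in its use of the memory controller.
4. The total number of operations differed: 30000000 for HW, while for SW crypto it was 10000000.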
The right commands to use are:
HW:./dpdk-test-crypto-perf --socket-mem 2048,1 --legacy-mem -l 4,6 -w 0000:3d:01.0 -- --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 --cipher-iv-sz 16 --auth-op generate --burst-sz 32 --total-ops 30000000 --silent --digest-sz 20 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt --devtype crypto_qat
SW:./dpdk-test-crypto-perf --socket-mem 2048,1 --legacy-mem -l 4,6 --vdev crypto_aesni_mb_pmd_1 -a 0000:00:00.0 -- --buffer-sz 64,128,256,512,1024,2048 --optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 --cipher-iv-sz 16 --auth-op generate --burst-sz 32 --total-ops 30000000 --silent --digest-sz 20 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt --devtype crypto_aesni_mb
With 64 B buffers, HW on the Xeon (3.1 GHz) reaches 3.2 Gbps, while SW on the Core i7 (5 GHz) reaches 4.2 Gbps. With 2048 B buffers, the Xeon reaches 50 Gbps, while SW reaches 19.2 Gbps.
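For comparison with the question's QAT output, where four lcores each report their own Gbps, the per-lcore values can be summed per buffer size. This is only a sketch: `aggregate_gbps` is a hypothetical helper, it assumes each result row has the ten fields of the throughput table, and it is not necessarily how the figures above were obtained.

```shell
#!/bin/sh
# Sum the Gbps column (field 9) per buffer size (field 2) over all lcores.
# Input on stdin: result rows with 10 whitespace-separated fields.
aggregate_gbps() {
    awk 'NF == 10 { sum[$2] += $9 }
         END { for (b in sum) printf "%d %.2f\n", b, sum[b] }' | sort -n
}

# 64-byte and 2048-byte rows copied from the question's QAT run
aggregate_gbps <<'EOF'
13 64 32 30000000 30000000 608024205 590361292 1.4151 0.7245 2190.68
14 64 32 30000000 30000000 657459751 640717849 1.4149 0.7244 2190.99
5 64 32 30000000 30000000 605264521 587610893 1.4148 0.7244 2191.05
6 64 32 30000000 30000000 657948166 641046769 1.4148 0.7244 2191.08
13 2048 32 30000000 30000000 1399981574 1379798969 0.7602 12.4550 4077.90
14 2048 32 30000000 30000000 1437924813 1418541434 0.7601 12.4533 4078.45
6 2048 32 30000000 30000000 1441794521 1422292353 0.7600 12.4514 4079.11
5 2048 32 30000000 30000000 1399004014 1378687062 0.7600 12.4512 4079.15
EOF
# prints:
# 64 2.90
# 2048 49.81
```

Summed this way, the question's own QAT run already gives about 2.9 Gbps at 64 B and about 49.8 Gbps at 2048 B, against 4.3 Gbps and 19.4 Gbps on the single SW lcore.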
Note: on the Xeon there are failed enqueues and dequeues, which can be reduced further through platform and BIOS settings.
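To put the failure counters in proportion, they can be expressed per completed operation. The sketch below uses a hypothetical helper, `retries_per_op`. It assumes that Failed Enq and Failed Deq count enqueue or dequeue attempts that could not be served in full, rather than lost operations. That reading fits the question's output, where Enqueued and Dequeued both equal the requested 30000000.

```shell
#!/bin/sh
# Failed attempts per completed operation
retries_per_op() {
    awk -v failed="$1" -v ops="$2" 'BEGIN { printf "%.1f\n", failed / ops }'
}

retries_per_op 608024205 30000000   # QAT, lcore 13, 64 B, Failed Enq: prints 20.3
retries_per_op 590361292 30000000   # QAT, lcore 13, 64 B, Failed Deq: prints 19.7
retries_per_op 0 10000000           # AESNI-MB, lcore 5, 64 B: prints 0.0
```

Under that assumption, a ratio of about 20 means the lcore spends most of its time polling the device.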
https://stackoverflow.com/questions/69968932