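This is a test report on enabling profile-guided optimization (PGO) for MySQL. When PGO is enabled correctly, the performance gain is clear: write throughput improved by about 13% and read throughput by about 20% in these tests (per-run gains ranged from roughly 12% to 17% for writes and from roughly 16% to 26% for reads).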
Environment setup
The test environment consists of ARM ECS servers provisioned on Alibaba Cloud, as shown in the following table:
Role | Operating system | Kernel version | GCC version | MySQL version | Boost version | CPU cores | Memory | NUMA nodes |
---|---|---|---|---|---|---|---|---|
Server | Ubuntu 22.04.2 LTS | 5.15.0-76-generic | 11.4.0 | 8.0.33 | 1.77.0 | 8 | 32G | 1 |
Client | Alibaba Cloud Linux 3 | 5.10.134-14.al8.aarch64 | 10.2.1 | 8.0.33 | 1.77.0 | 32 | 128G | 1 |
MySQL PGO build steps
MySQL was built with PGO as follows:
- Pass "-DFPROFILE_GENERATE=ON" to cmake, then build and install the instrumented server:
$ cd mysql-server-8.0.33
$ mkdir build && cd build
$ cmake -DCMAKE_C_FLAGS="-g -O3 -march=native -mcpu=native -flto" -DCMAKE_CXX_FLAGS="-g -O3 -march=native -mcpu=native -flto" -DCMAKE_INSTALL_PREFIX=/mysql_data/mysql_8.0.33_gcc_11.3.0_profile -DWITH_BOOST=/build/boost_1_77_0 -DFPROFILE_GENERATE=ON ..
$ make -j $(nproc)
$ make install
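- Start the MySQL server installed in /mysql_data/mysql_8.0.33_gcc_11.3.0_profile and run the write and read performance tests from the client. The profile data is written to this directory: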
mysql-server-8.0.33/build-profile-data
- Rebuild MySQL with "-DFPROFILE_USE=ON":
$ cd mysql-server-8.0.33
$ rm -rf build && mkdir build && cd build # must be the same build directory as in the -DFPROFILE_GENERATE=ON build, or the build will fail to find the profile data under build-profile-data
$ cmake -DCMAKE_C_FLAGS="-g -O3 -march=native -mcpu=native -flto" -DCMAKE_CXX_FLAGS="-g -O3 -march=native -mcpu=native -flto" -DCMAKE_INSTALL_PREFIX=/mysql_data/mysql_8.0.33_gcc_11.3.0_pgo -DWITH_BOOST=/build/boost_1_77_0 -DFPROFILE_USE=ON ..
$ make -j $(nproc)
$ make install
- Start the MySQL server installed in /mysql_data/mysql_8.0.33_gcc_11.3.0_pgo, run the performance tests from the client, and compare the results with the non-PGO build of MySQL.
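Between the profiling run and the second build in the steps above, it may be worth confirming that profile data was actually written. Binaries instrumented by GCC write their counters to *.gcda files when the process exits, so the instrumented server has to be shut down cleanly before rebuilding. The following is a minimal sketch, not part of the original procedure: the function name count_gcda is hypothetical, and the default directory is the one named in the steps above.

```shell
#!/bin/sh
# Count GCC profile data files (*.gcda) under a directory.
# count_gcda is a hypothetical helper; the default path is an assumption
# taken from the build steps and should be adjusted to your source tree.
count_gcda() {
    dir="${1:-mysql-server-8.0.33/build-profile-data}"
    if [ ! -d "$dir" ]; then
        # A missing directory means no profile data was recorded.
        echo 0
        return 1
    fi
    # wc -l may pad its output with spaces on some systems; strip them.
    find "$dir" -type f -name '*.gcda' | wc -l | tr -d ' '
}
```

A count of 0 suggests that the server was not exercised, was not stopped cleanly, or that the directory is not the one the build used.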
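Test results
The test procedure was as follows:
1. Reboot the server machine
2. Start the non-PGO build of MySQL
3. Run the write test
4. Run the read test
5. Repeat steps 1 to 4 for a total of three rounds
6. Reboot the server machine
7. Start the PGO build of MySQL
8. Run the write test
9. Run the read test
10. Repeat steps 6 to 9 for a total of three rounds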
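The procedure above could be driven from the client by a small loop. This is only a sketch under stated assumptions: prepare_server, run_write_test and run_read_test are hypothetical hooks that you would define yourself (for example, to reboot the server, start the chosen MySQL build, and invoke your benchmark client); none of them is part of MySQL or of the original test setup.

```shell
#!/bin/sh
# Run N rounds of "prepare server, write test, read test" and save each
# test's output to its own log file.
# prepare_server, run_write_test and run_read_test are hypothetical hooks
# that must be defined by the caller before run_rounds is invoked.
run_rounds() {
    label="$1"            # e.g. "nonpgo" or "pgo"
    rounds="${2:-3}"      # the report uses three rounds
    outdir="${3:-results}"
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$rounds" ]; do
        prepare_server || return 1
        run_write_test > "$outdir/${label}_write_${i}.log" || return 1
        run_read_test  > "$outdir/${label}_read_${i}.log"  || return 1
        i=$((i + 1))
    done
}
```

Keeping one log file per run makes it straightforward to compare rounds afterwards instead of relying on copied terminal output.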
Non-PGO
Write test results
The following are the results of the three write test rounds for the non-PGO build:
Throughput:
events/s (eps): 7837.0387
time elapsed: 300.0977s
total number of events: 2351876
Throughput:
events/s (eps): 7616.9932
time elapsed: 300.0844s
total number of events: 2285740
Throughput:
events/s (eps): 7893.9496
time elapsed: 300.0817s
total number of events: 2368829
Read test results
The following are the results of the three read test rounds for the non-PGO build:
Throughput:
events/s (eps): 3768.8060
time elapsed: 300.1503s
total number of events: 1131208
Throughput:
events/s (eps): 3688.1464
time elapsed: 300.1504s
total number of events: 1106998
Throughput:
events/s (eps): 3774.9087
time elapsed: 300.1509s
total number of events: 1133042
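To summarize three rounds like the ones above, the throughput lines can be extracted and averaged. The sketch below assumes the output has been saved to a text file in the same format as shown here; mean_eps is a hypothetical helper name, not a tool used in the original tests.

```shell
#!/bin/sh
# Read benchmark output on stdin and print the mean of every
# "events/s (eps):" value, rounded to two decimals.
# mean_eps is a hypothetical helper; it exits non-zero when no
# throughput line is found.
mean_eps() {
    awk '/events\/s \(eps\):/ { sum += $NF; n++ }
         END {
             if (n == 0) exit 1
             printf "%.2f\n", sum / n
         }'
}

# Usage (write_runs.log is a hypothetical file holding the three write runs):
#   mean_eps < write_runs.log
```

Only lines containing "events/s (eps):" are counted, so the elapsed-time and event-count lines can stay in the input.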
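PGO
Write test results (+13.5%, +16.7%, +11.9%)
The following are the results of the three write test rounds for the PGO build. Compared with the corresponding round of the non-PGO build, throughput improved by 13.5%, 16.7%, and 11.9% respectively: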
Throughput:
events/s (eps): 8891.5023
time elapsed: 300.0943s
total number of events: 2668288
Throughput:
events/s (eps): 8892.7030
time elapsed: 300.0876s
total number of events: 2668589
Throughput:
events/s (eps): 8831.1063
time elapsed: 300.0857s
total number of events: 2650088
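The per-round gain is the ratio of the PGO throughput to the non-PGO throughput of the same round, minus one. A minimal sketch of that arithmetic follows; improvement is a hypothetical helper name. It rounds to the nearest tenth of a percent, so the last digit can differ by 0.1 from figures that were truncated instead.

```shell
#!/bin/sh
# Print the percentage gain of "new" over "base", rounded to one decimal.
# improvement is a hypothetical helper; it exits non-zero for a
# non-positive baseline.
improvement() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        if (base + 0 <= 0) exit 1
        printf "%.1f\n", (new / base - 1) * 100
    }'
}

# Usage with the first write round (non-PGO eps, then PGO eps):
#   improvement 7837.0387 8891.5023
```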
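Read test results (+25.9%, +20.9%, +16.4%)
The following are the results of the three read test rounds for the PGO build. Compared with the corresponding round of the non-PGO build, throughput improved by 25.9%, 20.9%, and 16.4% respectively: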
Throughput:
events/s (eps): 4746.7576
time elapsed: 300.1492s
total number of events: 1424735
Throughput:
events/s (eps): 4460.4811
time elapsed: 300.1489s
total number of events: 1338808
Throughput:
events/s (eps): 4395.7754
time elapsed: 300.1699s
total number of events: 1319479
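Because the per-round gains vary noticeably, a single summary figure can also be derived by comparing the mean throughput of the two builds rather than pairing individual rounds. The sketch below shows one way to do this; mean_gain is a hypothetical helper name, and each argument is a space-separated list of events/s values.

```shell
#!/bin/sh
# Print the percentage gain of the mean of the "new" values over the mean
# of the "base" values, rounded to one decimal.
# mean_gain is a hypothetical helper; each argument is a space-separated
# list of throughput numbers.
mean_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        nb = split(base, b, " ")
        nn = split(new, p, " ")
        if (nb == 0 || nn == 0) exit 1
        for (i = 1; i <= nb; i++) sb += b[i]
        for (i = 1; i <= nn; i++) sn += p[i]
        if (sb <= 0) exit 1
        printf "%.1f\n", ((sn / nn) / (sb / nb) - 1) * 100
    }'
}

# Usage with the read results of this report (non-PGO list, then PGO list):
#   mean_gain "3768.8060 3688.1464 3774.9087" "4746.7576 4460.4811 4395.7754"
```

With the figures in this report, this prints 21.1 for the read tests and 14.0 for the write tests.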