Description
Could you please consider adding an option to use non-Alpine-based haproxy ingress images?
Alpine's pthread implementation (musl libc) has a drastic CPU overhead (internals/details can be found here: https://stackoverflow.com/questions/73807754/how-one-pthread-waits-for-another-to-finish-via-futex-in-linux/73813907#73813907 ).
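To see the effect in isolation, here is a minimal sketch of my own (not from the linked answer — the file name, thread count, and iteration count are all illustrative): a tiny contended-mutex workload that can be built once against glibc and once against musl, then compared under `strace -c -f`. In my understanding of the linked answer, the musl build should show a markedly higher futex count for the same amount of work.

```c
/* futex_demo.c - illustrative contended-mutex workload (hypothetical example).
 *
 * Build (same command on a glibc or a musl/Alpine system):
 *   gcc -O2 -pthread futex_demo.c -o futex_demo
 * Observe the per-syscall summary:
 *   strace -c -f ./futex_demo
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Two threads hammer one mutex so that lock hand-off goes through the
 * libc's futex path; the interesting output is the strace summary,
 * not the counter value. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock); /* musl tends to issue far more
                                        futex(FUTEX_WAKE_PRIVATE) here */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);
    return 0;
}
```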
Here are two strace statistics samples for the same load profile (25K RPS via 3 haproxy ingress pods) over an equal period of time (about 1 minute); see the note after the second table for how a comparable sample can be captured:
1. GLIBC-based haproxy:

```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 47.55  147.946790          53   2787268    880506 recvfrom
 26.33   81.933249          88    929414           sendto
 16.81   52.295309          54    962217           epoll_ctl
  3.37   10.486387          51    203040           getpid
  1.48    4.597493          51     90048           clock_gettime
  1.41    4.380619          97     44924           epoll_wait
  0.64    2.003053          54     36497           getsockopt
  0.56    1.731618          97     17829           close
  0.51    1.582058          56     28118           setsockopt
  0.39    1.207813          66     18144      8945 accept4
  0.38    1.188416         116     10223     10223 connect
  0.29    0.903808          88     10223           socket
  0.18    0.548180          53     10223           fcntl
  0.10    0.299368          79      3785      1130 futex
  0.00    0.011658          60       193           timer_settime
  0.00    0.010546          54       193        30 rt_sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00  311.126365             5152339    900834 total
```
2. MUSL-based haproxy:

```
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 68.24  412.454997          96   4259899    419280 futex
 10.00   60.440537         120    502107           madvise
  8.74   52.833292         111    472438    121948 recvfrom
  4.22   25.477060         166    152913           sendto
  2.80   16.921311         107    157293           getpid
  2.26   13.680361         109    125062           epoll_ctl
  1.38    8.351141         119     69682           writev
  0.54    3.254861         106     30535           clock_gettime
  0.37    2.255775         148     15187           epoll_pwait
  0.34    2.033282         178     11419           close
  0.31    1.844610         117     15724      5964 accept4
  0.25    1.530881         110     13850           setsockopt
  0.25    1.509742         107     14001           getsockopt
  0.08    0.466851         157      2966           munmap
  0.06    0.392208         170      2294      2294 connect
  0.06    0.378519         107      3505           mmap
  0.05    0.287839         125      2294           socket
  0.04    0.234976         102      2294           fcntl
  0.00    0.014530          94       154           timer_settime
  0.00    0.014262          92       154        15 rt_sigreturn
  0.00    0.006613         143        46        23 read
  0.00    0.003571         148        24           write
  0.00    0.003377         241        14           shutdown
------ ----------- ----------- --------- --------- ----------------
100.00  604.390596             5853855    549524 total
```
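For reference, the summaries above are in `strace -c` format; a sketch of how a comparable sample can be captured from a running pod follows (the PID placeholder and the 60-second window are illustrative, and the exact flags of the original runs are an assumption):

```sh
# Attach to a running haproxy process, follow all of its threads (-f),
# and print a per-syscall summary (-c) after ~60 seconds.
# <haproxy-pid> is a placeholder for the actual process ID.
timeout 60 strace -c -f -p <haproxy-pid>
```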
As you can see, the MUSL-based sample spends 60+% of its time in futex (FUTEX_WAKE_PRIVATE, to be exact) system calls.
As a result, CPU utilisation is more than twice as high for the same load profile, accompanied by spikes in the number of upstream sessions:
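Coming back to the request itself: as a rough sketch of what a non-Alpine variant could look like, the image could be based on the Debian flavour of the official haproxy image instead of the `-alpine` one. Everything below is an assumption about the build (base tag, controller binary name, entrypoint) rather than this repo's actual Dockerfile:

```dockerfile
# Hypothetical glibc-based variant: swap the Alpine base for the official
# Debian-based haproxy image (tags like 2.8-bookworm are glibc-based).
FROM haproxy:2.8-bookworm

# Assumption: the ingress controller is a statically linked Go binary,
# so the same binary should run unchanged on a glibc base.
# (binary name/path below are placeholders)
COPY haproxy-ingress-controller /haproxy-ingress-controller

ENTRYPOINT ["/haproxy-ingress-controller"]
```

If the controller binary is instead linked dynamically against musl, a multi-stage rebuild would be needed; the point is only that the haproxy process itself ends up linked against glibc.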