Since binutils 2.38 is available, we can patch binutils to flip the default for `-muse-unaligned-vector-move` rather than invasively patching GCC.
A one-line binutils patch is much more maintainable than the ~300-line GCC patch. It is also a slight improvement at patching aligned vector moves out of the release binaries. For comparison:
Master:
objdump -D bin/*.exe | rg "vmova|vmovdqa|vmovaps|vmovapd|vmovdqa64|vmovdqa32"
141b8be20: c5 f8 28 1a vmovaps (%rdx), %xmm3
1420564b3: c5 79 29 36 vmovapd %xmm14, (%rsi)
1403060f3: c5 79 29 36 vmovapd %xmm14, (%rsi)
140792b13: c5 79 29 36 vmovapd %xmm14, (%rsi)
140cb0693: c5 79 29 36 vmovapd %xmm14, (%rsi)
1415ea0f3: c5 79 29 36 vmovapd %xmm14, (%rsi)
This PR:
objdump -D bin/*.exe | rg "vmova|vmovdqa|vmovaps|vmovapd|vmovdqa64|vmovdqa32"
141b8be20: c5 f8 28 1a vmovaps (%rdx), %xmm3
1420564b3: c5 79 29 36 vmovapd %xmm14, (%rsi)
1403060f3: c5 79 29 36 vmovapd %xmm14, (%rsi)
140792b13: c5 79 29 36 vmovapd %xmm14, (%rsi)
140cb0693: c5 79 29 36 vmovapd %xmm14, (%rsi)
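
To compare builds without eyeballing the listings, the remaining aligned moves can be counted. This is only a sketch: `count_aligned_moves` is a hypothetical helper name that is not part of this PR, and it uses plain `grep -E` instead of `rg` so it runs where ripgrep is not installed.

```shell
# Count aligned vector moves in disassembly read from stdin.
# count_aligned_moves is a hypothetical helper, not part of this PR.
count_aligned_moves() {
    # The pattern covers vmovaps, vmovapd and vmovdqa (including the
    # vmovdqa32/vmovdqa64 forms). grep -c prints the number of matching
    # lines but exits non-zero when that number is 0, so "|| true" keeps
    # the function usable under "set -e".
    grep -cE 'vmov(aps|apd|dqa)' || true
}

# Usage (assumes the same bin/*.exe layout as the commands above):
#   objdump -D bin/*.exe | count_aligned_moves
```

Running it on master and on this branch gives one number per build, which is easier to track across future toolchain updates than the raw listings.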