Add SIMD to LowerCallMemcmp #84530
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak

Issue Details

Add SIMD to unroll length [16..64] (can be enabled for [64..128] with avx512), [16..32] on arm64.

bool Test(Span<byte> s) => s.SequenceEqual(
    "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND"u8);

Old codegen:

; Method Prog:Test(System.Span`1[ubyte]):bool:this
G_M52730_IG01:              
       4883EC28             sub      rsp, 40
G_M52730_IG02:              
       49B8882A908BDA010000 mov      r8, 0x1DA8B902A88
       488B0A               mov      rcx, bword ptr [rdx]
       8B5208               mov      edx, dword ptr [rdx+08H]
       4C89442420           mov      bword ptr [rsp+20H], r8
       83FA3E               cmp      edx, 62
       7513                 jne      SHORT G_M52730_IG04
G_M52730_IG03:              
       41B83E000000         mov      r8d, 62
       488B542420           mov      rdx, bword ptr [rsp+20H]
       FF1591FE1600         call     [System.SpanHelpers:SequenceEqual(byref,byref,ulong):bool]
       EB02                 jmp      SHORT G_M52730_IG05
G_M52730_IG04:              
       33C0                 xor      eax, eax
G_M52730_IG05:              
       4883C428             add      rsp, 40
       C3                   ret      
; Total bytes of code: 56

New codegen:

; Method Prog:Test(System.Span`1[ubyte]):bool:this
G_M52730_IG01:              
       C5F877               vzeroupper 
G_M52730_IG02:              
       48B8882A7D01B3020000 mov      rax, 0x2B3017D2A88
       488B0A               mov      rcx, bword ptr [rdx]
       8B5208               mov      edx, dword ptr [rdx+08H]
       4883FA3E             cmp      rdx, 62
       752B                 jne      SHORT G_M52730_IG04
G_M52730_IG03:              
       C5FC1001             vmovups  ymm0, ymmword ptr[rcx]
       C5FC1008             vmovups  ymm1, ymmword ptr[rax]
       C5FC10511E           vmovups  ymm2, ymmword ptr[rcx+1EH]
       C5FC10581E           vmovups  ymm3, ymmword ptr[rax+1EH]
       C5FDEFC1             vpxor    ymm0, ymm0, ymm1
       C5EDEFCB             vpxor    ymm1, ymm2, ymm3
       C5FDEBC1             vpor     ymm0, ymm0, ymm1
       C4E27D17C0           vptest   ymm0, ymm0
       0F94C0               sete     al
       0FB6C0               movzx    rax, al
       EB02                 jmp      SHORT G_M52730_IG05
G_M52730_IG04:              
       33C0                 xor      eax, eax
G_M52730_IG05:              
       C5F877               vzeroupper 
       C3                   ret      
; Total bytes of code: 74
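
The new codegen compares 62 bytes with two overlapping 32-byte loads per side (offsets 0 and 0x1E), XORs each pair, ORs the two results, and tests for zero. The following is a minimal portable sketch of that idea, not the JIT's implementation: `Vec256`, `load_unaligned`, and `equal62` are hypothetical names, and a plain struct of four 64-bit lanes stands in for a `ymm` register.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 32-byte SIMD register (ymm).
struct Vec256 {
    std::uint64_t lane[4];
};

// Unaligned 32-byte load, analogous to vmovups.
static Vec256 load_unaligned(const unsigned char* p) {
    Vec256 v;
    std::memcpy(v.lane, p, sizeof v.lane);
    return v;
}

// Lane-wise XOR, analogous to vpxor.
static Vec256 vxor(Vec256 a, const Vec256& b) {
    for (int i = 0; i < 4; ++i) a.lane[i] ^= b.lane[i];
    return a;
}

// Lane-wise OR, analogous to vpor.
static Vec256 vor(Vec256 a, const Vec256& b) {
    for (int i = 0; i < 4; ++i) a.lane[i] |= b.lane[i];
    return a;
}

// True when every bit is zero, analogous to vptest + sete.
static bool all_zero(const Vec256& v) {
    return (v.lane[0] | v.lane[1] | v.lane[2] | v.lane[3]) == 0;
}

// Compares exactly 62 bytes without a loop or a call.
bool equal62(const unsigned char* a, const unsigned char* b) {
    const std::size_t len = 62;
    const std::size_t tail = len - sizeof(Vec256); // 30 == 0x1E
    // The second load overlaps the first by 2 bytes; re-checking
    // those bytes is harmless and avoids reading past the buffer.
    Vec256 x1 = vxor(load_unaligned(a), load_unaligned(b));
    Vec256 x2 = vxor(load_unaligned(a + tail), load_unaligned(b + tail));
    return all_zero(vor(x1, x2));
}
```

Any differing byte leaves at least one set bit in `x1` or `x2`, so the OR is zero only when all 62 bytes match.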
GenTree* rXor = newBinaryOp(comp, GT_XOR, actualLoadType, l2Indir, r2Indir);
GenTree* resultOr = newBinaryOp(comp, GT_OR, actualLoadType, lXor, rXor);
Can you log an issue tracking us fixing this to opportunistically use vpternlog on AVX-512 hardware?
> Can you log an issue tracking us fixing this to opportunistically use `vpternlog` on AVX-512 hardware?
Good idea, done: #84534
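
For context on the `vpternlog` suggestion: the instruction evaluates an arbitrary three-input boolean function selected by an 8-bit truth table, so `lXor | (l2 ^ r2)` could in principle be a single instruction instead of a `vpxor` followed by a `vpor`. The sketch below derives the immediate for `A | (B ^ C)` and checks it against a bitwise software model. The names `kOrXorImm` and `ternlog_model` are hypothetical, and this is only an illustration of the encoding, not what the JIT emits.

```cpp
#include <cstdint>

// Conventional truth-table columns for the three vpternlog inputs.
// Bit i of each constant is the value of that input at table index
// i = (A << 2) | (B << 1) | C.
constexpr std::uint8_t kA = 0xF0;
constexpr std::uint8_t kB = 0xCC;
constexpr std::uint8_t kC = 0xAA;

// Applying the desired expression to the columns yields the imm8.
constexpr std::uint8_t kOrXorImm = static_cast<std::uint8_t>(kA | (kB ^ kC));
static_assert(kOrXorImm == 0xF6, "imm8 for A | (B ^ C)");

// Software model of ternary logic on 64-bit values: for every bit
// position, look up the result in the 8-entry truth table `imm`.
inline std::uint64_t ternlog_model(std::uint64_t a, std::uint64_t b,
                                   std::uint64_t c, std::uint8_t imm) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 64; ++bit) {
        unsigned idx = static_cast<unsigned>((((a >> bit) & 1) << 2) |
                                             (((b >> bit) & 1) << 1) |
                                             ((c >> bit) & 1));
        result |= static_cast<std::uint64_t>((imm >> idx) & 1) << bit;
    }
    return result;
}
```

With `a` holding the first XOR result and `b`, `c` holding the second pair of loads, `ternlog_model(a, b, c, kOrXorImm)` equals `a | (b ^ c)`.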
#84536 is the SPMI replay failure

PTAL @jakobbotsch since you reviewed the previous impl of 
Add SIMD to unroll length `[16..64]` (can be enabled for `[64..128]` with avx512), `[16..32]` on arm64.
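
The length ranges are consistent with covering a buffer using two loads of vector width W, which works for any length in [W..2W]. As a rough sketch under that assumption (the PR's actual selection logic is not shown here, and `LoadPlan` and `plan_two_loads` are hypothetical names), picking the width and the offset of the second load might look like this:

```cpp
#include <cstddef>

// Hypothetical description of a two-load comparison.
struct LoadPlan {
    std::size_t vecBytes;     // width of each load: 16, 32 or 64
    std::size_t secondOffset; // start of the second, overlapping load
};

// Finds a vector width W <= maxVecBytes such that two W-byte loads
// cover `len` bytes exactly (W <= len <= 2 * W). Returns false when
// no such width exists, in which case a caller would fall back to
// the regular helper call.
// Assumption: maxVecBytes is 16 (arm64 / SSE), 32 (AVX2) or 64 (AVX-512).
bool plan_two_loads(std::size_t len, std::size_t maxVecBytes, LoadPlan& plan) {
    for (std::size_t w = maxVecBytes; w >= 16; w /= 2) {
        if (len >= w && len <= 2 * w) {
            plan.vecBytes = w;
            plan.secondOffset = len - w; // second load ends at the last byte
            return true;
        }
    }
    return false;
}
```

For the 62-byte example with 32-byte vectors this gives a second offset of 30, matching the `+1EH` displacement in the disassembly. When `len` equals the vector width, the second offset is 0 and the two loads coincide; a real implementation would likely emit a single load in that case.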