JIT: Stack zeroed with rep stosd

C# source in this gist: https://gist.github.com/Zhentar/4ffb0a5d597c4c1e788d6007f1602b21

According to VTune, 5% of my execution time is spent in my function's prologue. This was unexpected because the prologue had not shown up in previous iterations (and my function body had unfortunately not improved at all).
Looking at the disassembly, I see:
```asm
LineEnumerator.MoveNext()
	push    rdi
	push    rsi
	sub     rsp,48h
	mov     rsi,rcx
	lea     rdi,[rsp+28h]
	mov     ecx,8
	xor     eax,eax
	rep     stos dword ptr [rdi]
	mov     rcx,rsi
	mov     rax,0F1CD0434ED23h
	mov     qword ptr [rsp+40h],rax
```

The `rep stos dword` in there seems rather odd. With `ecx` set to 8 it zeroes 8 dwords, i.e. the 32 bytes starting at `[rsp+28h]`. At the very least, it should be a `rep stos qword` with half as many iterations (although I'm not sure that would be any faster on my Skylake). Beyond that, I don't think there is any x86 microarchitecture on which a 32-byte `rep stos` is faster than a reasonable unrolled sequence of stores, and the unrolled version wouldn't even be particularly large. Some of the comments in the JIT code also seem to suggest that `rep stos` should never be emitted.
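
For illustration, an unrolled version of the same zeroing could look roughly like the sketch below. It is hand-written, not actual JIT output, and it assumes the same frame layout as the listing above (32 bytes to clear, starting at `[rsp+28h]`).

```asm
	; hypothetical prologue sketch, not produced by the JIT
	push    rdi
	push    rsi
	sub     rsp,48h
	xor     eax,eax                      ; writing eax also clears the upper half of rax
	mov     qword ptr [rsp+28h],rax      ; bytes 28h..2Fh
	mov     qword ptr [rsp+30h],rax      ; bytes 30h..37h
	mov     qword ptr [rsp+38h],rax      ; bytes 38h..3Fh
	mov     qword ptr [rsp+40h],rax      ; bytes 40h..47h
```

Four qword stores cover the same 32 bytes as the eight `rep stos` iterations. Because this sequence uses neither `rcx` as a counter nor `rdi` as a destination pointer, the `mov rsi,rcx` / `mov rcx,rsi` pair that preserves the first argument around the `rep stos` would presumably not be needed for the zeroing itself.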

category:cq
theme:optimization
skill-level:intermediate
cost:medium

JIT: Stack zeroed with rep stosd #10744
