|
|
Subscribe / Log in / New account

Re: [PATCH 00/10] sched: EEVDF using latency-nice

Thread information [Search the linux-kernel archive]

From:  K Prateek Nayak <kprateek.nayak-AT-amd.com>
To:  Peter Zijlstra <peterz-AT-infradead.org>, mingo-AT-kernel.org, vincent.guittot-AT-linaro.org
Subject:  Re: [PATCH 00/10] sched: EEVDF using latency-nice
Date:  Wed, 22 Mar 2023 12:19:11 +0530
Message-ID:  <18199afa-16e2-8e76-c96b-9507f92befdc@amd.com>
Cc:  linux-kernel-AT-vger.kernel.org, juri.lelli-AT-redhat.com, dietmar.eggemann-AT-arm.com, rostedt-AT-goodmis.org, bsegall-AT-google.com, mgorman-AT-suse.de, bristot-AT-redhat.com, corbet-AT-lwn.net, qyousef-AT-layalina.io, chris.hyser-AT-oracle.com, patrick.bellasi-AT-matbug.net, pjt-AT-google.com, pavel-AT-ucw.cz, qperret-AT-google.com, tim.c.chen-AT-linux.intel.com, joshdon-AT-google.com, timj-AT-gnu.org, yu.c.chen-AT-intel.com, youssefesmat-AT-chromium.org, joel-AT-joelfernandes.org

Hello Peter,

Leaving some results from my testing on a dual socket Zen3 machine
(2 x 64C/128T) below.

tl;dr

o I've not tested workloads with nice and latency nice yet focusing more
  on the out of the box performance. No changes to sched_feat were made
  for the same reason. 

o Except for hackbench (m:n communication relationship), I do not see any
  regression for other standard benchmarks (mostly 1:1 or 1:n) relation
  when system is below fully loaded.

o At fully loaded scenario, schbench seems to be unhappy. Looking at the
  data from /proc/<pid>/sched for the tasks with schedstats enabled,
  there is an increase in number of context switches and the total wait
  sum. When system is overloaded, things flip and the schbench tail
  latency improves drastically. I suspect the involuntary
  context-switches help workers make progress much sooner after wakeup
  compared to tip thus leading to lower tail latency.

o For the same reason as above, tbench throughput takes a hit with
  number of involuntary context-switches increasing drastically for the
  tbench server. There is also an increase in wait sum noticed.

o Couple of real world workloads were also tested. DeathStarBench
  throughput tanks much more with the updated version in your tree
  compared to this series as is.
  SpecJBB Max-jOPS sees large improvements but comes at a cost of
  drop in Critical-jOPS signifying an increase in either wait time
  or an increase in involuntary context-switches which can lead to
  transactions taking longer to complete.

o Apart from DeathStarBench, the all the trends reported remain same
  comparing the version in your tree and this series, as is, applied
  on the same base kernel.

I'll leave the detailed results below and some limited analysis. 

On 3/6/2023 6:55 PM, Peter Zijlstra wrote:
> Hi!
> 
> Ever since looking at the latency-nice patches, I've wondered if EEVDF would
> not make more sense, and I did point Vincent at some older patches I had for
> that (which is here his augmented rbtree thing comes from).
> 
> Also, since I really dislike the dual tree, I also figured we could dynamically
> switch between an augmented tree and not (and while I have code for that,
> that's not included in this posting because with the current results I don't
> think we actually need this).
> 
> Anyway, since I'm somewhat under the weather, I spend last week desperately
> trying to connect a small cluster of neurons in defiance of the snot overlord
> and bring back the EEVDF patches from the dark crypts where they'd been
> gathering cobwebs for the past 13 odd years.
> 
> By friday they worked well enough, and this morning (because obviously I forgot
> the weekend is ideal to run benchmarks) I ran a bunch of hackbenck, netperf,
> tbench and sysbench -- there's a bunch of wins and losses, but nothing that
> indicates a total fail.
> 
> ( in fact, some of the schbench results seem to indicate EEVDF schedules a lot
>   more consistent than CFS and has a bunch of latency wins )
> 
> ( hackbench also doesn't show the augmented tree and generally more expensive
>   pick to be a loss, in fact it shows a slight win here )
> 
> 
>   hackbech load + cyclictest --policy other results:
> 
> 
> 			EEVDF			 CFS
> 
> 		# Min Latencies: 00053
>   LNICE(19)	# Avg Latencies: 04350
> 		# Max Latencies: 76019
> 
> 		# Min Latencies: 00052		00053
>   LNICE(0)	# Avg Latencies: 00690		00687
> 		# Max Latencies: 14145		13913
> 
> 		# Min Latencies: 00019
>   LNICE(-19)	# Avg Latencies: 00261
> 		# Max Latencies: 05642
> 

Following are the results from testing the series on a dual socket
Zen3 machine (2 x 64C/128T):

NPS Modes are used to logically divide single socket into
multiple NUMA region.
Following is the NUMA configuration for each NPS mode on the system:

NPS1: Each socket is a NUMA node.
    Total 2 NUMA nodes in the dual socket machine.

    Node 0: 0-63,   128-191
    Node 1: 64-127, 192-255

NPS2: Each socket is further logically divided into 2 NUMA regions.
    Total 4 NUMA nodes exist over 2 socket.
   
    Node 0: 0-31,   128-159
    Node 1: 32-63,  160-191
    Node 2: 64-95,  192-223
    Node 3: 96-127, 223-255

NPS4: Each socket is logically divided into 4 NUMA regions.
    Total 8 NUMA nodes exist over 2 socket.
   
    Node 0: 0-15,    128-143
    Node 1: 16-31,   144-159
    Node 2: 32-47,   160-175
    Node 3: 48-63,   176-191
    Node 4: 64-79,   192-207
    Node 5: 80-95,   208-223
    Node 6: 96-111,  223-231
    Node 7: 112-127, 232-255

Kernel versions:
- tip:          6.2.0-rc6 tip sched/core
- eevdf: 	6.2.0-rc6 tip sched/core
		+ eevdf commits from your tree
		(https://git.kernel.org/pub/scm/linux/kernel/git/peterz/qu...)

- eevdf prev:	6.2.0-rc6 tip sched/core + this series as is

When the testing started, the tip was at:
commit 7c4a5b89a0b5 "sched/rt: pick_next_rt_entity(): check list_entry"

Benchmark Results:

~~~~~~~~~~~~~
~ hackbench ~
~~~~~~~~~~~~~

o NPS1

Test:			tip			eevdf
 1-groups:	   4.63 (0.00 pct)	   4.52 (2.37 pct)
 2-groups:	   4.42 (0.00 pct)	   5.41 (-22.39 pct)	*
 4-groups:	   4.21 (0.00 pct)	   5.26 (-24.94 pct)	*
 8-groups:	   4.95 (0.00 pct)	   5.01 (-1.21 pct)
16-groups:	   5.43 (0.00 pct)	   6.24 (-14.91 pct)	*

o NPS2

Test:			tip			eevdf
 1-groups:	   4.68 (0.00 pct)	   4.56 (2.56 pct)
 2-groups:	   4.45 (0.00 pct)	   5.19 (-16.62 pct)	*
 4-groups:	   4.19 (0.00 pct)	   4.53 (-8.11 pct)	*
 8-groups:	   4.80 (0.00 pct)	   4.81 (-0.20 pct)
16-groups:	   5.60 (0.00 pct)	   6.22 (-11.07 pct)	*

o NPS4

Test:			tip			eevdf
 1-groups:	   4.68 (0.00 pct)	   4.57 (2.35 pct)
 2-groups:	   4.56 (0.00 pct)	   5.19 (-13.81 pct)	*
 4-groups:	   4.50 (0.00 pct)	   4.96 (-10.22 pct)	*
 8-groups:	   5.76 (0.00 pct)	   5.49 (4.68 pct)
16-groups:	   5.60 (0.00 pct)	   6.53 (-16.60 pct)	*

~~~~~~~~~~~~
~ schbench ~
~~~~~~~~~~~~

o NPS1

#workers:	tip			eevdf
  1:	  36.00 (0.00 pct)	  36.00 (0.00 pct)
  2:	  37.00 (0.00 pct)	  37.00 (0.00 pct)
  4:	  38.00 (0.00 pct)	  39.00 (-2.63 pct)
  8:	  52.00 (0.00 pct)	  50.00 (3.84 pct)
 16:	  66.00 (0.00 pct)	  68.00 (-3.03 pct)
 32:	 111.00 (0.00 pct)	 109.00 (1.80 pct)
 64:	 213.00 (0.00 pct)	 212.00 (0.46 pct)
128:	 502.00 (0.00 pct)	 637.00 (-26.89 pct)	*
256:	 45632.00 (0.00 pct)	 24992.00 (45.23 pct)	^
512:	 78720.00 (0.00 pct)	 44096.00 (43.98 pct)	^

o NPS2

#workers:	tip			eevdf
  1:	  31.00 (0.00 pct)	  23.00 (25.80 pct)
  2:	  32.00 (0.00 pct)	  33.00 (-3.12 pct)
  4:	  39.00 (0.00 pct)	  37.00 (5.12 pct)
  8:	  52.00 (0.00 pct)	  49.00 (5.76 pct)
 16:	  67.00 (0.00 pct)	  68.00 (-1.49 pct)
 32:	 113.00 (0.00 pct)	 112.00 (0.88 pct)
 64:	 213.00 (0.00 pct)	 214.00 (-0.46 pct)
128:	 508.00 (0.00 pct)	 491.00 (3.34 pct)
256:	 46912.00 (0.00 pct)	 22304.00 (52.45 pct)	^
512:	 76672.00 (0.00 pct)	 42944.00 (43.98 pct)	^

o NPS4

#workers:	tip			eevdf
  1:	  33.00 (0.00 pct)	  30.00 (9.09 pct)
  2:	  40.00 (0.00 pct)	  36.00 (10.00 pct)
  4:	  44.00 (0.00 pct)	  41.00 (6.81 pct)
  8:	  73.00 (0.00 pct)	  73.00 (0.00 pct)
 16:	  71.00 (0.00 pct)	  71.00 (0.00 pct)
 32:	 111.00 (0.00 pct)	 115.00 (-3.60 pct)
 64:	 217.00 (0.00 pct)	 211.00 (2.76 pct)
128:	 509.00 (0.00 pct)	 553.00 (-8.64 pct)	*
256:	 44352.00 (0.00 pct)	 26848.00 (39.46 pct)	^
512:	 75392.00 (0.00 pct)	 44352.00 (41.17 pct)	^


~~~~~~~~~~
~ tbench ~
~~~~~~~~~~

o NPS1

Clients:	tip			eevdf
    1	 483.10 (0.00 pct)	 476.46 (-1.37 pct)
    2	 956.03 (0.00 pct)	 943.12 (-1.35 pct)
    4	 1786.36 (0.00 pct)	 1760.64 (-1.43 pct)
    8	 3304.47 (0.00 pct)	 3105.19 (-6.03 pct)
   16	 5440.44 (0.00 pct)	 5609.24 (3.10 pct)
   32	 10462.02 (0.00 pct)	 10416.02 (-0.43 pct)
   64	 18995.99 (0.00 pct)	 19317.34 (1.69 pct)
  128	 27896.44 (0.00 pct)	 28459.38 (2.01 pct)
  256	 49742.89 (0.00 pct)	 46371.44 (-6.77 pct)	*
  512	 49583.01 (0.00 pct)	 45717.22 (-7.79 pct)	*
 1024	 48467.75 (0.00 pct)	 43475.31 (-10.30 pct)	*

o NPS2

Clients:	tip			eevdf
    1	 472.57 (0.00 pct)	 475.35 (0.58 pct)
    2	 938.27 (0.00 pct)	 942.19 (0.41 pct)
    4	 1764.34 (0.00 pct)	 1783.50 (1.08 pct)
    8	 3043.57 (0.00 pct)	 3205.85 (5.33 pct)
   16	 5103.53 (0.00 pct)	 5154.94 (1.00 pct)
   32	 9767.22 (0.00 pct)	 9793.81 (0.27 pct)
   64	 18712.65 (0.00 pct)	 18601.10 (-0.59 pct)
  128	 27691.95 (0.00 pct)	 27542.57 (-0.53 pct)
  256	 47939.24 (0.00 pct)	 43401.62 (-9.46 pct)	*
  512	 47843.70 (0.00 pct)	 43971.16 (-8.09 pct)	*
 1024	 48412.05 (0.00 pct)	 42808.58 (-11.57 pct)	*

o NPS4

Clients:	tip			eevdf
    1	 486.74 (0.00 pct)	 484.88 (-0.38 pct)
    2	 950.50 (0.00 pct)	 950.04 (-0.04 pct)
    4	 1778.58 (0.00 pct)	 1796.03 (0.98 pct)
    8	 3106.36 (0.00 pct)	 3180.09 (2.37 pct)
   16	 5139.81 (0.00 pct)	 5139.50 (0.00 pct)
   32	 9911.04 (0.00 pct)	 10086.37 (1.76 pct)
   64	 18201.46 (0.00 pct)	 18289.40 (0.48 pct)
  128	 27284.67 (0.00 pct)	 26947.19 (-1.23 pct)
  256	 46793.72 (0.00 pct)	 43971.87 (-6.03 pct)	*
  512	 48841.96 (0.00 pct)	 44255.01 (-9.39 pct)	*
 1024	 48811.99 (0.00 pct)	 43118.99 (-11.66 pct)	*

~~~~~~~~~~
~ stream ~
~~~~~~~~~~

o NPS1

- 10 Runs:

Test:		tip			eevdf
 Copy:	 321229.54 (0.00 pct)	 332975.45 (3.65 pct)
Scale:	 207471.32 (0.00 pct)	 212534.83 (2.44 pct)
  Add:	 234962.15 (0.00 pct)	 243011.39 (3.42 pct)
Triad:	 246256.00 (0.00 pct)	 256453.73 (4.14 pct)

- 100 Runs:

Test:		tip			eevdf
 Copy:	 332714.94 (0.00 pct)	 333183.42 (0.14 pct)
Scale:	 216140.84 (0.00 pct)	 212160.53 (-1.84 pct)
  Add:	 239605.00 (0.00 pct)	 233168.69 (-2.68 pct)
Triad:	 258580.84 (0.00 pct)	 256972.33 (-0.62 pct)

o NPS2

- 10 Runs:

Test:		tip			eevdf
 Copy:	 324423.92 (0.00 pct)	 340685.20 (5.01 pct)
Scale:	 215993.56 (0.00 pct)	 217895.31 (0.88 pct)
  Add:	 250590.28 (0.00 pct)	 257495.12 (2.75 pct)
Triad:	 261284.44 (0.00 pct)	 261373.49 (0.03 pct)

- 100 Runs:

Test:		tip			eevdf
 Copy:	 325993.72 (0.00 pct)	 341244.18 (4.67 pct)
Scale:	 227201.27 (0.00 pct)	 227255.98 (0.02 pct)
  Add:	 256601.84 (0.00 pct)	 258026.75 (0.55 pct)
Triad:	 260222.19 (0.00 pct)	 269878.75 (3.71 pct)

o NPS4

- 10 Runs:

Test:		tip			eevdf
 Copy:	 356850.80 (0.00 pct)	 371230.27 (4.02 pct)
Scale:	 247219.39 (0.00 pct)	 237846.20 (-3.79 pct)
  Add:	 268588.78 (0.00 pct)	 261088.54 (-2.79 pct)
Triad:	 272932.59 (0.00 pct)	 284068.07 (4.07 pct)

- 100 Runs:

Test:		tip			eevdf
 Copy:	 365965.18 (0.00 pct)	 371186.97 (1.42 pct)
Scale:	 246068.58 (0.00 pct)	 245991.10 (-0.03 pct)
  Add:	 263677.73 (0.00 pct)	 269021.14 (2.02 pct)
Triad:	 273701.36 (0.00 pct)	 280566.44 (2.50 pct)

~~~~~~~~~~~~~
~ Unixbench ~
~~~~~~~~~~~~~

o NPS1

Test			Metric	  Parallelism			tip		          eevdf
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-1      49077561.21 (   0.00%)    49144835.64 (
0.14%)
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-512  6285373890.61 (   0.00%)  6270537933.92 (
-0.24%)
unixbench-syscall       Amean     unixbench-syscall-1        2664815.40 (   0.00%)     2679289.17 *
-0.54%*
unixbench-syscall       Amean     unixbench-syscall-512      7848462.70 (   0.00%)     7456802.37 *
4.99%*
unixbench-pipe          Hmean     unixbench-pipe-1           2531131.89 (   0.00%)     2475863.05 *
-2.18%*
unixbench-pipe          Hmean     unixbench-pipe-512       305171024.40 (   0.00%)   301182156.60 (
-1.31%)
unixbench-spawn         Hmean     unixbench-spawn-1             4058.05 (   0.00%)        4284.38 *
5.58%*
unixbench-spawn         Hmean     unixbench-spawn-512          79893.24 (   0.00%)       78234.45 *
-2.08%*
unixbench-execl         Hmean     unixbench-execl-1             4148.64 (   0.00%)        4086.73 *
-1.49%*
unixbench-execl         Hmean     unixbench-execl-512          11077.20 (   0.00%)       11137.79 (
0.55%)

o NPS2

Test			Metric	  Parallelism			tip		          eevdf
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-1      49394822.56 (   0.00%)    49175574.26 (
-0.44%)
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-512  6267817215.36 (   0.00%)  6282838979.08 *
0.24%*
unixbench-syscall       Amean     unixbench-syscall-1        2663675.03 (   0.00%)     2677018.53 *
-0.50%*
unixbench-syscall       Amean     unixbench-syscall-512      7342392.90 (   0.00%)     7443264.00 *
-1.37%*
unixbench-pipe          Hmean     unixbench-pipe-1           2533194.04 (   0.00%)     2475969.01 *
-2.26%*
unixbench-pipe          Hmean     unixbench-pipe-512       303588239.03 (   0.00%)   302217597.98 *
-0.45%*
unixbench-spawn         Hmean     unixbench-spawn-1             5141.40 (   0.00%)        4862.78 (
-5.42%)    *
unixbench-spawn         Hmean     unixbench-spawn-512          82993.79 (   0.00%)       79139.42 *
-4.64%*    *
unixbench-execl         Hmean     unixbench-execl-1             4140.15 (   0.00%)        4084.20 *
-1.35%*
unixbench-execl         Hmean     unixbench-execl-512          12229.25 (   0.00%)       11445.22 (
-6.41%)    *

o NPS4

Test			Metric	  Parallelism			tip		          eevdf
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-1      48970677.27 (   0.00%)    49070289.56 (
0.20%)
unixbench-dhry2reg      Hmean     unixbench-dhry2reg-512  6297506696.81 (   0.00%)  6311038905.07 (
0.21%)
unixbench-syscall       Amean     unixbench-syscall-1        2664715.13 (   0.00%)     2677752.20 *
-0.49%*
unixbench-syscall       Amean     unixbench-syscall-512      7938670.70 (   0.00%)     7972291.60 (
-0.42%)
unixbench-pipe          Hmean     unixbench-pipe-1           2527605.54 (   0.00%)     2476140.77 *
-2.04%*
unixbench-pipe          Hmean     unixbench-pipe-512       305068507.23 (   0.00%)   304114548.50 (
-0.31%)
unixbench-spawn         Hmean     unixbench-spawn-1             5207.34 (   0.00%)        4964.39 (
-4.67%)    *
unixbench-spawn         Hmean     unixbench-spawn-512          81352.38 (   0.00%)       74467.00 *
-8.46%*    *
unixbench-execl         Hmean     unixbench-execl-1             4131.37 (   0.00%)        4044.09 *
-2.11%*
unixbench-execl         Hmean     unixbench-execl-512          13025.56 (   0.00%)       11124.77 *
-14.59%*    *

~~~~~~~~~~~
~ netperf ~
~~~~~~~~~~~

o NPS1

                        tip                     eevdf
 1-clients:      107932.22 (0.00 pct)    106167.39 (-1.63 pct)
 2-clients:      106887.99 (0.00 pct)    105304.25 (-1.48 pct)
 4-clients:      106676.11 (0.00 pct)    104328.10 (-2.20 pct)
 8-clients:      98645.45 (0.00 pct)     94076.26 (-4.63 pct)
16-clients:      88881.23 (0.00 pct)     86831.85 (-2.30 pct)
32-clients:      86654.28 (0.00 pct)     86313.80 (-0.39 pct)
64-clients:      81431.90 (0.00 pct)     74885.75 (-8.03 pct)
128-clients:     55993.77 (0.00 pct)     55378.10 (-1.09 pct)
256-clients:     43865.59 (0.00 pct)     44326.30 (1.05 pct)

o NPS2

                        tip                     eevdf
 1-clients:      106711.81 (0.00 pct)    108576.27 (1.74 pct)
 2-clients:      106987.79 (0.00 pct)    108348.24 (1.27 pct)
 4-clients:      105275.37 (0.00 pct)    105702.12 (0.40 pct)
 8-clients:      103028.31 (0.00 pct)    96250.20 (-6.57 pct)
16-clients:      87382.43 (0.00 pct)     87683.29 (0.34 pct)
32-clients:      86578.14 (0.00 pct)     86968.29 (0.45 pct)
64-clients:      81470.63 (0.00 pct)     75906.15 (-6.83 pct)
128-clients:     54803.35 (0.00 pct)     55051.90 (0.45 pct)
256-clients:     42910.29 (0.00 pct)     44062.33 (2.68 pct)

~~~~~~~~~~~
~ SpecJBB ~
~~~~~~~~~~~

o NPS1

			tip		    eevdf
Max-jOPS	       100%		  115.71%	(+15.71%)  ^
Critical-jOPS	       100%		   93.59%	 (-6.41%)  *

~~~~~~~~~~~~~~~~~~
~ DeathStarBench ~
~~~~~~~~~~~~~~~~~~

o NPS1

  #CCX                      	 1 CCX    2 CCX   3 CCX   4 CCX
o eevdf compared to tip		-10.93	 -14.35	  -9.74	  -6.07
o eevdf prev (this sries as is)    
  compared to tip                -1.99    -6.64   -4.99   -3.87

Note: #CCX is the number of LLCs the the services are pinned to.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Some Preliminary Analysis ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tl;dr

- There seems to be an increase in number of involuntary context switches
  when the system is overloaded. This probably allows newly waking task to
  make progress benefiting latency sensitive workload like schbench in
  overloaded scenario compared to tip but hurts tbench performance.
  When system is fully loaded, the larger average wait time seems to hurt
  the schbench performance.
  More analysis is needed to get to the bottom of the problem.

- For hackbench 2 groups scenario, there seems the wait time seems to go up
  drastically.

Scheduler statistics of interest are listed in detail below.

Note: Units of all metrics denoting time is ms. They are processed from
      per-task schedstats in /proc/<pid>/sched.

o Hackbench (2 Groups) (NPS1)

				tip		eevdf			%diff
Comm				sched-messaging	sched-messaging		N/A
Sum of avg_atom			282.0024818	19.04355233		-93.24702669
Average of avg_atom		3.481512121	0.235105584		-93.24702669
Sum of avg_per_cpu		1761.949461	61.52537145		-96.50810805
Average of avg_per_cpu		21.75246248	0.759572487		-96.50810805
Average of avg_wait_time	0.007239228	0.012899105		78.18343632
Sum of nr_switches		4897740		4728784			-3.449672706
Sum of nr_voluntary_switches	4742512		4621606			-2.549408415
Sum of nr_involuntary_switches	155228		107178			-30.95446698
Sum of nr_wakeups		4742648		4623175			-2.51912012
Sum of nr_migrations		1263925		930600			-26.37221354
Sum of sum_exec_runtime		288481.15	262255.2574		-9.091024712
Sum of sum_idle_runtime		2576164.568	2851759.68		10.69788457
Sum of sum_sleep_runtime	76890.14753	78632.31679		2.265789982
Sum of wait_count		4897894		4728939			-3.449543824
Sum of wait_sum			3041.78227	24167.4694		694.5167422

o schbench (2 messengers, 128 workers - fully loaded) (NPS1)

				tip		eevdf		%diff
Comm				schbench	schbench	N/A
Sum of avg_atom			7538.162897	7289.565705	-3.297848503
Average of avg_atom		29.10487605	28.14504133	-3.297848503
Sum of avg_per_cpu		630248.6079	471215.3671	-25.23341406
Average of avg_per_cpu		2433.392309	1819.364352	-25.23341406
Average of avg_wait_time	0.054147456	25.34304285	46703.75524
Sum of nr_switches		85210		88176		3.480812111
Sum of nr_voluntary_switches	83165		83457		0.351109241
Sum of nr_involuntary_switches	2045		4719		130.7579462
Sum of nr_wakeups		83168		83459		0.34989419
Sum of nr_migrations		3265		3025		-7.350689127
Sum of sum_exec_runtime		2476504.52	2469058.164	-0.300680129
Sum of sum_idle_runtime		110294825.8	132520924.2	20.15153321
Sum of sum_sleep_runtime	5293337.741	5297778.714	0.083897408
Sum of sum_block_runtime	56.043253	15.12936	-73.00413664
Sum of wait_count		85615		88606		3.493546692
Sum of wait_sum			4653.340163	9605.221964	106.4156418

o schbench (2 messengers, 256 workers - overloaded) (NPS1)

				tip		eevdf		%diff
Comm				schbench	schbench	N/A
Sum of avg_atom			11676.77306	4803.485728	-58.8629007
Average of avg_atom		22.67334574	9.327156753	-58.8629007
Sum of avg_per_cpu		55235.68013	38286.47722	-30.68524343
Average of avg_per_cpu		107.2537478	74.34267421	-30.68524343
Average of avg_wait_time	2.23189096	2.58191945	15.68304621
Sum of nr_switches		202862		425258		109.6292061
Sum of nr_voluntary_switches	163079		165058		1.213522281
Sum of nr_involuntary_switches	39783		260200		554.0482115
Sum of nr_wakeups		163082		165058		1.211660392
Sum of nr_migrations		44199		54894		24.19738003
Sum of sum_exec_runtime		4586675.667	3963846.024	-13.57910801
Sum of sum_idle_runtime		201050644.2	195126863.7	-2.946412087
Sum of sum_sleep_runtime	10418117.66	10402686.4	-0.148119407
Sum of sum_block_runtime	1548.979156	516.115078	-66.68030838
Sum of wait_count		203377		425792		109.3609405
Sum of wait_sum			455609.3122	1100885.201	141.6292142

o tbench (256 clients - overloaded) (NPS1)

- tbench client
				tip		eevdf		% diff
comm				tbench		tbench		N/A
Sum of avg_atom			3.594587941	5.112101854	42.21663064
Average of avg_atom		0.013986724	0.019891447	42.21663064
Sum of avg_per_cpu		392838.0975	142065.4206	-63.83613975
Average of avg_per_cpu		1528.552909	552.7837377	-63.83613975
Average of avg_wait_time	0.010512441	0.006861579	-34.72895916
Sum of nr_switches		692845080	511780111	-26.1335433
Sum of nr_voluntary_switches	178151085	371234907	108.3820635
Sum of nr_involuntary_switches	514693995	140545204	-72.69344399
Sum of nr_wakeups		178151085	371234909	108.3820646
Sum of nr_migrations		45279		71177		57.19649286
Sum of sum_exec_runtime		9192343.465	9624025.792	4.69610746
Sum of sum_idle_runtime		7125370.721	16145736.39	126.5950365
Sum of sum_sleep_runtime	2222469.726	5792868.629	160.650058
Sum of sum_block_runtime	68.60879	446.080476	550.1797743
Sum of wait_count		692845479	511780543	-26.13352349
Sum of wait_sum			7287852.246	3297894.139	-54.7480653

- tbench server

				tip		eevdf		% diff
Comm				tbench_srv	tbench_srv	N/A
Sum of avg_atom			5.077837807	5.447267364	7.275331971
Average of avg_atom2		0.019758124	0.021195593	7.275331971
Sum of avg_per_cpu		538586.1634	87925.51225	-83.67475471
Average of avg_per_cpu2		2095.666006	342.1226158	-83.67475471
Average of avg_wait_time	0.000827346	0.006505748	686.3392261
Sum of nr_switches		692980666	511838912	-26.13951051
Sum of nr_voluntary_switches	690367607	390304935	-43.46418762
Sum of nr_involuntary_switches	2613059		121533977	4551.023073
Sum of nr_wakeups		690367607	390304935	-43.46418762
Sum of nr_migrations		39486		84474		113.9340526
Sum of sum_exec_runtime		9176708.278	8734423.401	-4.819646259
Sum of sum_idle_runtime		413900.3645	447180.3879	8.040588086
Sum of sum_sleep_runtime	8966201.976	6690818.107	-25.37734345
Sum of sum_block_runtime	1.776413	1.617435	-8.949382829
Sum of wait_count		692980942	511839229	-26.13949418
Sum of wait_sum			565739.6984	3295519.077	482.5150836

> 
> The nice -19 numbers aren't as pretty as Vincent's, but at the end I was going
> cross-eyed from staring at tree prints and I just couldn't figure out where it
> was going side-ways.
> 
> There's definitely more benchmarking/tweaking to be done (0-day already
> reported a stress-ng loss), but if we can pull this off we can delete a whole
> much of icky heuristics code. EEVDF is a much better defined policy than what
> we currently have.
> 

DeathStarBench and SpecJBB and slightly more complex to analyze. I'll
get the schedstat data for both soon. I'll rerun some of the above
workloads with NO_PRESERVE_LAG to see if that makes any difference.
In the meantime, if you need more data from the test system for any
particular workload, please do let me know. I will collect the per-task
and system-wide schedstat data for the workload as it is rather
inexpensive to collect and gives good insights but if you need any
other data, I'll be more than happy to get those too for analysis.

--
Thanks and Regards,
Prateek


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds