
bpo-46823: Implement LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE #31484


Merged: 8 commits, Feb 24, 2022

Conversation

sweeneyde
Member

@sweeneyde sweeneyde commented Feb 22, 2022

I still need to benchmark, but this avoids an INCREF/DECREF pair for "owner", and LOAD_FAST followed by LOAD_ATTR_INSTANCE_VALUE is currently the most common opcode pair.

See faster-cpython/ideas#291

https://bugs.python.org/issue46823
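
To see where the pair comes from, here is a minimal sketch using the standard `dis` module. The function `read_x` is made up for illustration. Exact opcode names vary between CPython versions (some releases show a variant of `LOAD_FAST`), so the check only matches on name prefixes. The specialized form `LOAD_ATTR_INSTANCE_VALUE` only appears at run time after the interpreter has adapted the code, so it is not expected in this static listing.

```python
import dis

def read_x(a):
    # `a` is a local variable, so reading `a.x` compiles to a LOAD_FAST-style
    # instruction immediately followed by a LOAD_ATTR-style instruction.
    return a.x

opnames = [ins.opname for ins in dis.get_instructions(read_x)]
print(opnames)

# Locate the first LOAD_ATTR-style instruction and the one just before it.
attr_index = next(
    i for i, name in enumerate(opnames) if name.startswith("LOAD_ATTR")
)
predecessor = opnames[attr_index - 1]
print(predecessor, "->", opnames[attr_index])
```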

@sweeneyde
Member Author

pyperformance (gcc with --enable-optimizations --with-lto):

Slower (11):
- scimark_sparse_mat_mult: 5.17 ms +- 0.26 ms -> 5.34 ms +- 0.26 ms: 1.03x slower
- unpickle_list: 5.00 us +- 0.09 us -> 5.14 us +- 0.11 us: 1.03x slower
- xml_etree_parse: 159 ms +- 6 ms -> 163 ms +- 6 ms: 1.03x slower
- regex_effbot: 2.98 ms +- 0.07 ms -> 3.06 ms +- 0.03 ms: 1.03x slower
- float: 81.0 ms +- 1.9 ms -> 82.9 ms +- 1.8 ms: 1.02x slower
- go: 149 ms +- 3 ms -> 151 ms +- 3 ms: 1.01x slower
- unpack_sequence: 47.7 ns +- 0.6 ns -> 48.2 ns +- 1.2 ns: 1.01x slower
- dulwich_log: 81.6 ms +- 1.5 ms -> 82.3 ms +- 1.6 ms: 1.01x slower
- unpickle_pure_python: 259 us +- 5 us -> 261 us +- 5 us: 1.01x slower
- python_startup: 9.10 ms +- 0.14 ms -> 9.16 ms +- 0.16 ms: 1.01x slower
- sympy_sum: 186 ms +- 2 ms -> 187 ms +- 2 ms: 1.01x slower

Faster (16):
- deltablue: 4.38 ms +- 0.13 ms -> 4.09 ms +- 0.14 ms: 1.07x faster
- raytrace: 335 ms +- 39 ms -> 314 ms +- 6 ms: 1.07x faster
- scimark_monte_carlo: 72.5 ms +- 2.4 ms -> 69.5 ms +- 0.9 ms: 1.04x faster
- unpickle: 16.1 us +- 1.3 us -> 15.4 us +- 1.2 us: 1.04x faster
- logging_silent: 105 ns +- 4 ns -> 101 ns +- 2 ns: 1.04x faster
- scimark_sor: 120 ms +- 1 ms -> 117 ms +- 1 ms: 1.03x faster
- chaos: 79.3 ms +- 2.5 ms -> 76.8 ms +- 2.8 ms: 1.03x faster
- 2to3: 281 ms +- 24 ms -> 272 ms +- 7 ms: 1.03x faster
- fannkuch: 412 ms +- 8 ms -> 400 ms +- 5 ms: 1.03x faster
- pickle_dict: 28.1 us +- 0.5 us -> 27.3 us +- 0.7 us: 1.03x faster
- pidigits: 188 ms +- 1 ms -> 184 ms +- 2 ms: 1.02x faster
- scimark_lu: 117 ms +- 2 ms -> 114 ms +- 2 ms: 1.02x faster
- nbody: 102 ms +- 4 ms -> 100 ms +- 4 ms: 1.02x faster
- chameleon: 7.23 ms +- 0.19 ms -> 7.13 ms +- 0.13 ms: 1.01x faster
- regex_dna: 212 ms +- 2 ms -> 209 ms +- 2 ms: 1.01x faster
- xml_etree_process: 56.6 ms +- 0.9 ms -> 56.2 ms +- 0.9 ms: 1.01x faster

Benchmark hidden because not significant (32): crypto_pyaes, django_template, hexiom, html5lib, json_dumps, json_loads, logging_format, logging_simple, mako, meteor_contest, nqueens, pathlib, pickle, pickle_list, pickle_pure_python, pyflate, python_startup_no_site, regex_compile, regex_v8, richards, scimark_fft, spectral_norm, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_expand, sympy_integrate, sympy_str, telco, tornado_http, xml_etree_iterparse, xml_etree_generate

Geometric mean: 1.01x faster
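
As a sanity check on the summary line, the geometric mean can be approximated from the rounded means listed above. This is only a sketch: pyperf works from unrounded data and its figure may also cover the benchmarks hidden as not significant, so the optional `hidden` parameter (an assumption of this example) counts those as a ratio of exactly 1.0.

```python
import math

# (benchmark, mean before, mean after) copied from the lists above.
# Units cancel in the ratio, so they are omitted.
RESULTS = [
    ("scimark_sparse_mat_mult", 5.17, 5.34),
    ("unpickle_list", 5.00, 5.14),
    ("xml_etree_parse", 159, 163),
    ("regex_effbot", 2.98, 3.06),
    ("float", 81.0, 82.9),
    ("go", 149, 151),
    ("unpack_sequence", 47.7, 48.2),
    ("dulwich_log", 81.6, 82.3),
    ("unpickle_pure_python", 259, 261),
    ("python_startup", 9.10, 9.16),
    ("sympy_sum", 186, 187),
    ("deltablue", 4.38, 4.09),
    ("raytrace", 335, 314),
    ("scimark_monte_carlo", 72.5, 69.5),
    ("unpickle", 16.1, 15.4),
    ("logging_silent", 105, 101),
    ("scimark_sor", 120, 117),
    ("chaos", 79.3, 76.8),
    ("2to3", 281, 272),
    ("fannkuch", 412, 400),
    ("pickle_dict", 28.1, 27.3),
    ("pidigits", 188, 184),
    ("scimark_lu", 117, 114),
    ("nbody", 102, 100),
    ("chameleon", 7.23, 7.13),
    ("regex_dna", 212, 209),
    ("xml_etree_process", 56.6, 56.2),
]

def geometric_mean_speedup(results, hidden=0):
    """Geometric mean of before/after ratios (values above 1.0 mean faster).

    `hidden` extra benchmarks are counted with a ratio of exactly 1.0.
    """
    log_sum = sum(math.log(before / after) for _, before, after in results)
    return math.exp(log_sum / (len(results) + hidden))

print(f"{geometric_mean_speedup(RESULTS):.3f}x")
print(f"{geometric_mean_speedup(RESULTS, hidden=32):.3f}x")
```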

@sweeneyde
Member Author

With a very nice microbenchmark:

Mean +- std dev: [main_micro_load_attr2] 6.49 ns +- 0.08 ns -> [pr_micro_load_attr2] 4.67 ns +- 0.12 ns: 1.39x faster
from pyperf import Runner, perf_counter
from itertools import repeat

class A:
    pass

def bench(loops):
    a = A()
    a.x = 42
    it = repeat(None, loops)
    t0 = perf_counter()
    for x in it:
        a.x; a.x; a.x; a.x; a.x
        a.x; a.x; a.x; a.x; a.x
        a.x; a.x; a.x; a.x; a.x
        a.x; a.x; a.x; a.x; a.x
        a.x; a.x; a.x; a.x; a.x
    return perf_counter() - t0

runner = Runner()
runner.bench_time_func("a.x", bench, inner_loops=25)

@sweeneyde sweeneyde changed the title from "Implement LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE" to "bpo-46823: Implement LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE" Feb 22, 2022
@sweeneyde sweeneyde marked this pull request as ready for review February 22, 2022 05:15
@sweeneyde
Member Author

Execution counts

execution counts for all instructions
Name Count Self Cumulative Miss ratio
LOAD_FAST 3135247057 11.1% 11.1%
PRECALL 1549055641 5.5% 16.6%
LOAD_CONST 1395011117 4.9% 21.6%
STORE_FAST__LOAD_FAST 1231670572 4.4% 25.9%
LOAD_FAST__LOAD_FAST 1175265042 4.2% 30.1%
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 1041285280 3.7% 33.8% 2.7%
RESUME_QUICK 919158398 3.3% 37.1%
RETURN_VALUE 839050574 3.0% 40.0%
PUSH_NULL 824171929 2.9% 43.0%
STORE_FAST__STORE_FAST 774141609 2.7% 45.7%
POP_JUMP_IF_FALSE 728285003 2.6% 48.3%
LOAD_FAST__LOAD_CONST 671553119 2.4% 50.7%
FOR_ITER 621270950 2.2% 52.9%
BINARY_OP_ADD_INT 547698900 1.9% 54.8% 0.0%
JUMP_ABSOLUTE_QUICK 529933639 1.9% 56.7%
LOAD_GLOBAL_BUILTIN 522653482 1.9% 58.6% 0.6%
COMPARE_OP_INT_JUMP 514680578 1.8% 60.4% 0.0%
LOAD_GLOBAL_MODULE 514381725 1.8% 62.2% 0.3%
STORE_FAST 484205064 1.7% 63.9%
POP_TOP 4805377 1.7% 65.6%
CALL_PY_EXACT_ARGS 473740790 1.7% 67.3% 4.0%
BINARY_SUBSCR_ADAPTIVE 461644394 1.6% 68.9%
BINARY_SUBSCR_LIST_INT 434332854 1.5% 70.5% 1.4%
BINARY_OP_ADAPTIVE 397001088 1.4% 71.9%
SWAP 395325039 1.4% 73.3%
COPY 390257442 1.4% 74.7%
LOAD_METHOD_CACHED 364232595 1.3% 76.0% 2.8%
BINARY_OP_MULTIPLY_FLOAT 362589777 1.3% 77.3% 1.0%
LOAD_ATTR_ADAPTIVE 266600111 0.9% 78.2%
CALL_ADAPTIVE 255423647 0.9% 79.1%
LOAD_CONST__LOAD_FAST 241299240 0.9% 80.0%
STORE_ATTR_INSTANCE_VALUE 232645358 0.8% 80.8% 1.3%
POP_JUMP_IF_TRUE 230860405 0.8% 81.6%
STORE_SUBSCR_ADAPTIVE 224938290 0.8% 82.4%
BINARY_OP_ADD_FLOAT 223079628 0.8% 83.2% 1.2%
LOAD_METHOD_NO_DICT 197942047 0.7% 83.9% 1.2%
BINARY_OP_SUBTRACT_INT 194779975 0.7% 84.6% 0.5%
LOAD_ATTR_INSTANCE_VALUE 182555077 0.6% 85.2% 15.5%
BUILD_SLICE 172720155 0.6% 85.9%
LOAD_DEREF 163282128 0.6% 86.4%
UNPACK_SEQUENCE_TWO_TUPLE 157035080 0.6% 87.0%
CALL_NO_KW_BUILTIN_O 150406692 0.5% 87.5% 0.4%
BINARY_OP_SUBTRACT_FLOAT 145989837 0.5% 88.0% 2.8%
STORE_SUBSCR_LIST_INT 144335936 0.5% 88.6%
JUMP_FORWARD 143526704 0.5% 89.1%
CALL_NO_KW_ISINSTANCE 125516747 0.4% 89.5%
LOAD_METHOD_ADAPTIVE 118383429 0.4% 89.9%
LOAD_ATTR_WITH_HINT 109587149 0.4% 90.3% 5.0%
CONTAINS_OP 107671805 0.4% 90.7%
BUILD_TUPLE 103783270 0.4% 91.1%
CALL_NO_KW_LEN 102395356 0.4% 91.4%
GET_ITER 9954202 0.4% 91.8%
EXTENDED_ARG 99524564 0.4% 92.1%
LOAD_ATTR_SLOT 98277363 0.3% 92.5% 10.9%
IS_OP 98213606 0.3% 92.8%
BINARY_OP_MULTIPLY_INT 92842665 0.3% 93.2% 0.8%
CALL_NO_KW_METHOD_DESCRIPTOR_FAST 90002285 0.3% 93.5% 0.0%
CALL_NO_KW_BUILTIN_FAST 89699624 0.3% 93.8% 0.1%
NOP 88359245 0.3% 94.1%
UNPACK_SEQUENCE_TUPLE 88195090 0.3% 94.4% 0.9%
YIELD_VALUE 83397812 0.3% 94.7%
COMPARE_OP 74593437 0.3% 95.0%
POP_JUMP_IF_NONE 74004156 0.3% 95.3%
UNPACK_SEQUENCE_LIST 71541151 0.3% 95.5% 0.8%
POP_JUMP_IF_NOT_NONE 69019455 0.2% 95.8%
COMPARE_OP_ADAPTIVE 59993110 0.2% 96.0%
BINARY_SUBSCR_GETITEM 57264255 0.2% 96.2% 0.0%
BINARY_SUBSCR_DICT 56164502 0.2% 96.4% 0.0%
CALL_NO_KW_LIST_APPEND 51490590 0.2% 96.6%
LOAD_ATTR_MODULE 46506516 0.2% 96.7% 2.2%
BINARY_SUBSCR_TUPLE_INT 45410261 0.2% 96.9% 2.0%
CALL_BUILTIN_CLASS 43620514 0.2% 97.0% 0.0%
CALL_NO_KW_METHOD_DESCRIPTOR_NOARGS 41964475 0.1% 97.2% 0.0%
STORE_ATTR_SLOT 41254649 0.1% 97.3% 1.8%
COPY_FREE_VARS 31296865 0.1% 97.4%
COMPARE_OP_STR_JUMP 27673292 0.1% 97.5% 0.5%
LIST_APPEND 26127032 0.1% 97.6%
BUILD_LIST 26032774 0.1% 97.7%
MAKE_FUNCTION 25056208 0.1% 97.8%
BINARY_OP_ADD_UNICODE 2376171 0.1% 97.9% 0.4%
BUILD_MAP 23631604 0.1% 98.0%
CALL_BUILTIN_FAST_WITH_KEYWORDS 23509051 0.1% 98.1% 0.7%
JUMP_IF_FALSE_OR_POP 23130144 0.1% 98.1%
MAKE_CELL 23078612 0.1% 98.2%
KW_NAMES 22982976 0.1% 98.3%
CALL_NO_KW_STR_1 22476717 0.1% 98.4%
CALL 22428157 0.1% 98.5%
LOAD_METHOD_CLASS 21890454 0.1% 98.5% 0.1%
COMPARE_OP_FLOAT_JUMP 21780609 0.1% 98.6% 0.1%
CALL_NO_KW_TYPE_1 20795948 0.1% 98.7%
LOAD_GLOBAL 20519349 0.1% 98.8%
STORE_ATTR_WITH_HINT 18603689 0.1% 98.8% 2.1%
CALL_NO_KW_METHOD_DESCRIPTOR_O 18361233 0.1% 98.9% 0.0%
SEND 17129747 0.1% 99.0%
RETURN_GENERATOR 17076965 0.1% 99.0%
CALL_FUNCTION_EX 16791338 0.1% 99.1%
CALL_PY_WITH_DEFAULTS 16339506 0.1% 99.1% 0.1%
STORE_SUBSCR_DICT 15750582 0.1% 99.2%
LOAD_METHOD 15581713 0.1% 99.3%
STORE_ATTR_ADAPTIVE 15516098 0.1% 99.3%
JUMP_NO_INTERRUPT 15448218 0.1% 99.4%
JUMP_IF_TRUE_OR_POP 14162847 0.1% 99.4%
LOAD_CLOSURE 13698510 0.0% 99.5%
DICT_MERGE 13262955 0.0% 99.5%
UNARY_NOT 13221544 0.0% 99.6%
STORE_DEREF 11425403 0.0% 99.6%
STORE_NAME 9655507 0.0% 99.6%
LOAD_ATTR 9039495 0.0% 99.7%
LOAD_METHOD_MODULE 6771510 0.0% 99.7% 0.8%
MAP_ADD 6355289 0.0% 99.7%
CALL_NO_KW_TUPLE_1 6003868 0.0% 99.7% 0.0%
UNARY_NEGATIVE 5844036 0.0% 99.7%
LOAD_NAME 5836594 0.0% 99.8%
UNARY_INVERT 5401008 0.0% 99.8%
RESUME 4790535 0.0% 99.8%
IMPORT_FROM 4652118 0.0% 99.8%
IMPORT_NAME 4070659 0.0% 99.8%
LOAD_GLOBAL_ADAPTIVE 3536018 0.0% 99.8%
DELETE_SUBSCR 3512441 0.0% 99.9%
STORE_GLOBAL 3448720 0.0% 99.9%
PUSH_EXC_INFO 2870491 0.0% 99.9%
POP_EXCEPT 2870491 0.0% 99.9%
JUMP_IF_NOT_EXC_MATCH 2839678 0.0% 99.9%
FORMAT_VALUE 2727392 0.0% 99.9%
LIST_EXTEND 2552915 0.0% 99.9%
LIST_TO_TUPLE 2272461 0.0% 99.9%
STORE_ATTR 2235704 0.0% 99.9%
UNPACK_SEQUENCE_ADAPTIVE 1921047 0.0% 99.9%
BUILD_STRING 1521435 0.0% 100.0%
BEFORE_WITH 1511733 0.0% 100.0%
GET_YIELD_FROM_ITER 1493429 0.0% 100.0%
BINARY_SUBSCR 1484375 0.0% 100.0%
BINARY_OP 1402274 0.0% 100.0%
BINARY_OP_INPLACE_ADD_UNICODE 1390876 0.0% 100.0% 96.7%
BUILD_CONST_KEY_MAP 1149258 0.0% 100.0%
JUMP_ABSOLUTE 778834 0.0% 100.0%
DELETE_ATTR 724588 0.0% 100.0%
DELETE_FAST 682011 0.0% 100.0%
UNPACK_SEQUENCE 595633 0.0% 100.0%
BUILD_SET 546441 0.0% 100.0%
LOAD_BUILD_CLASS 504368 0.0% 100.0%
STORE_SUBSCR 405518 0.0% 100.0%
RERAISE 309644 0.0% 100.0%
RAISE_VARARGS 281812 0.0% 100.0%
GET_AWAITABLE 188100 0.0% 100.0%
DICT_UPDATE 158464 0.0% 100.0%
SET_ADD 87062 0.0% 100.0%
DELETE_NAME 80572 0.0% 100.0%
IMPORT_STAR 21793 0.0% 100.0%
WITH_EXCEPT_START 9865 0.0% 100.0%
SET_UPDATE 7470 0.0% 100.0%
SETUP_ANNOTATIONS 1372 0.0% 100.0%
LOAD_CLASSDEREF 1092 0.0% 100.0%
DELETE_DEREF 660 0.0% 100.0%
MATCH_CLASS 10 0.0% 100.0%
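
The Self and Cumulative columns appear to be each count's share of all executed instructions and the running sum of those shares. The sketch below reproduces the first four rows under one assumption: the denominator is taken to be the sum of the three counts in the "Specialization effectiveness" table further down.

```python
# Counts copied from the first rows of the table above.
ROWS = [
    ("LOAD_FAST", 3135247057),
    ("PRECALL", 1549055641),
    ("LOAD_CONST", 1395011117),
    ("STORE_FAST__LOAD_FAST", 1231670572),
]

# Assumption: total executed instructions equals
# Basic + Not specialized + Specialized from "Specialization effectiveness".
TOTAL = 15754525268 + 4232687411 + 8199896680

def with_shares(rows, total):
    """Return (name, self share, cumulative share) with shares as strings."""
    out, running = [], 0
    for name, count in rows:
        running += count
        out.append((name, f"{count / total:.1%}", f"{running / total:.1%}"))
    return out

for row in with_shares(ROWS, TOTAL):
    print(*row)
```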

Pair counts

Pair counts for top 100 pairs
Pair Count Self Cumulative
JUMP_ABSOLUTE_QUICK FOR_ITER 497186523 1.8% 1.8%
PRECALL CALL_PY_EXACT_ARGS 473126644 1.7% 3.4%
STORE_FAST__STORE_FAST STORE_FAST__STORE_FAST 466647964 1.7% 5.1%
LOAD_FAST PRECALL 461867391 1.6% 6.7%
CALL_PY_EXACT_ARGS RESUME_QUICK 452877527 1.6% 8.3%
PUSH_NULL LOAD_GLOBAL_BUILTIN 406307141 1.4% 9.8%
STORE_FAST__LOAD_FAST LOAD_FAST 284566127 1.0% 10.8%
LOAD_GLOBAL_BUILTIN LOAD_FAST 279336040 1.0% 11.8%
RESUME_QUICK LOAD_FAST 278961132 1.0% 12.8%
FOR_ITER STORE_FAST__LOAD_FAST 267570174 0.9% 13.7%
LOAD_FAST LOAD_METHOD_CACHED 254307403 0.9% 14.6%
POP_JUMP_IF_FALSE LOAD_FAST 253727412 0.9% 15.5%
LOAD_CONST RETURN_VALUE 253065026 0.9% 16.4%
PRECALL CALL_ADAPTIVE 232638785 0.8% 17.2%
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE LOAD_FAST 209779510 0.7% 18.0%
LOAD_FAST__LOAD_CONST BINARY_OP_ADD_INT 196247505 0.7% 18.7%
RESUME_QUICK LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 189415594 0.7% 19.4%
LOAD_FAST__LOAD_FAST LOAD_CONST 177165901 0.6% 20.0%
PUSH_NULL LOAD_GLOBAL_MODULE 174707541 0.6% 20.6%
STORE_FAST__LOAD_FAST LOAD_CONST 170005026 0.6% 21.2%
STORE_FAST__STORE_FAST LOAD_FAST 168400773 0.6% 21.8%
LOAD_CONST BINARY_OP_ADD_INT 168174713 0.6% 22.4%
RETURN_VALUE POP_TOP 164739690 0.6% 23.0%
STORE_FAST PUSH_NULL 159494672 0.6% 23.6%
BINARY_OP_MULTIPLY_FLOAT BINARY_OP_ADD_FLOAT 159210848 0.6% 24.1%
FOR_ITER STORE_FAST 153219078 0.5% 24.7%
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 151926011 0.5% 25.2%
PRECALL CALL_NO_KW_BUILTIN_O 150313397 0.5% 25.7%
POP_TOP JUMP_ABSOLUTE_QUICK 146101581 0.5% 26.3%
UNPACK_SEQUENCE_TWO_TUPLE STORE_FAST__STORE_FAST 144955157 0.5% 26.8%
LOAD_FAST BINARY_OP_ADD_INT 136304657 0.5% 27.3%
COPY COPY 136210665 0.5% 27.7%
SWAP SWAP 136210665 0.5% 28.2%
LOAD_FAST__LOAD_FAST BINARY_OP_MULTIPLY_FLOAT 135705074 0.5% 28.7%
RESUME_QUICK PUSH_NULL 134594699 0.5% 29.2%
LOAD_METHOD_CACHED PRECALL 131204944 0.5% 29.6%
LOAD_FAST RETURN_VALUE 129721621 0.5% 30.1%
LOAD_CONST LOAD_CONST 127883194 0.5% 30.6%
LOAD_METHOD_CACHED LOAD_FAST 126560129 0.4% 31.0%
LOAD_FAST BINARY_SUBSCR_LIST_INT 125832435 0.4% 31.5%
PRECALL CALL_NO_KW_ISINSTANCE 125452051 0.4% 31.9%
LOAD_FAST__LOAD_FAST PRECALL 123490315 0.4% 32.3%
COMPARE_OP_INT_JUMP LOAD_FAST 123159075 0.4% 32.8%
LOAD_CONST BINARY_OP_ADAPTIVE 113870633 0.4% 33.2%
LOAD_FAST LOAD_ATTR_ADAPTIVE 111234207 0.4% 33.6%
CALL_ADAPTIVE RESUME_QUICK 110764177 0.4% 34.0%
BUILD_SLICE BINARY_SUBSCR_ADAPTIVE 109642865 0.4% 34.4%
FOR_ITER UNPACK_SEQUENCE_TWO_TUPLE 109375622 0.4% 34.7%
BINARY_SUBSCR_LIST_INT STORE_FAST__LOAD_FAST 109131813 0.4% 35.1%
LOAD_FAST BINARY_SUBSCR_ADAPTIVE 108590531 0.4% 35.5%
LOAD_FAST LOAD_GLOBAL_MODULE 106454131 0.4% 35.9%
LOAD_FAST__LOAD_CONST PRECALL 103237215 0.4% 36.3%
PRECALL CALL_NO_KW_LEN 102340275 0.4% 36.6%
PUSH_NULL LOAD_FAST__LOAD_CONST 101990755 0.4% 37.0%
RETURN_VALUE STORE_FAST__LOAD_FAST 100750603 0.4% 37.3%
LOAD_GLOBAL_MODULE PRECALL 98532528 0.3% 37.7%
CONTAINS_OP POP_JUMP_IF_FALSE 97557008 0.3% 38.0%
LOAD_CONST PRECALL 96410303 0.3% 38.4%
COMPARE_OP_INT_JUMP LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 95059190 0.3% 38.7%
CALL_NO_KW_BUILTIN_O POP_TOP 94649930 0.3% 39.1%
BINARY_SUBSCR_LIST_INT LOAD_CONST 93984037 0.3% 39.4%
POP_JUMP_IF_FALSE PUSH_NULL 93342131 0.3% 39.7%
CALL_NO_KW_ISINSTANCE POP_JUMP_IF_FALSE 92979827 0.3% 40.0%
LOAD_FAST COMPARE_OP_INT_JUMP 92825566 0.3% 40.4%
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE COMPARE_OP_INT_JUMP 92792577 0.3% 40.7%
LOAD_CONST COMPARE_OP_INT_JUMP 90636084 0.3% 41.0%
LOAD_CONST BUILD_SLICE 90006493 0.3% 41.3%
PRECALL CALL_NO_KW_METHOD_DESCRIPTOR_FAST 89940548 0.3% 41.7%
PRECALL CALL_NO_KW_BUILTIN_FAST 89568082 0.3% 42.0%
BINARY_OP_ADD_INT STORE_FAST__LOAD_FAST 88471836 0.3% 42.3%
LOAD_FAST LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 88009871 0.3% 42.6%
COPY BINARY_SUBSCR_LIST_INT 87503953 0.3% 42.9%
SWAP STORE_SUBSCR_LIST_INT 87503953 0.3% 43.2%
PUSH_NULL LOAD_FAST__LOAD_FAST 86754157 0.3% 43.5%
LOAD_FAST BINARY_OP_MULTIPLY_FLOAT 85469982 0.3% 43.8%
COMPARE_OP_INT_JUMP LOAD_FAST__LOAD_CONST 85111197 0.3% 44.1%
LOAD_FAST__LOAD_FAST STORE_ATTR_INSTANCE_VALUE 84364353 0.3% 44.4%
STORE_FAST__LOAD_FAST POP_JUMP_IF_TRUE 84110321 0.3% 44.7%
LOAD_CONST BINARY_OP_SUBTRACT_INT 83412559 0.3% 45.0%
LOAD_FAST LOAD_GLOBAL_BUILTIN 83191924 0.3% 45.3%
LOAD_CONST BINARY_SUBSCR_LIST_INT 81500222 0.3% 45.6%
GET_ITER FOR_ITER 80980344 0.3% 45.9%
BINARY_SUBSCR_ADAPTIVE LOAD_FAST__LOAD_FAST 80562864 0.3% 46.2%
LOAD_CONST__LOAD_FAST LOAD_FAST 79896372 0.3% 46.5%
LOAD_FAST LOAD_ATTR_SLOT 77989254 0.3% 46.8%
POP_JUMP_IF_FALSE LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE 76924170 0.3% 47.0%
BINARY_SUBSCR_ADAPTIVE LOAD_FAST 76725887 0.3% 47.3%
LOAD_FAST LOAD_METHOD_NO_DICT 76618148 0.3% 47.6%
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE POP_JUMP_IF_FALSE 7661684 0.3% 47.8%
POP_TOP LOAD_FAST 75003774 0.3% 48.1%
BINARY_OP_ADAPTIVE STORE_FAST__LOAD_FAST 74842807 0.3% 48.4%
BINARY_OP_ADD_INT BUILD_SLICE 74707640 0.3% 48.6%
LOAD_FAST__LOAD_FAST LOAD_FAST 74650843 0.3% 48.9%
STORE_FAST__LOAD_FAST LOAD_METHOD_NO_DICT 7394707 0.3% 49.2%
STORE_FAST JUMP_ABSOLUTE_QUICK 73890021 0.3% 49.4%
CALL_NO_KW_METHOD_DESCRIPTOR_FAST STORE_FAST__LOAD_FAST 72155558 0.3% 49.7%
LOAD_FAST__LOAD_FAST BINARY_SUBSCR_ADAPTIVE 71843908 0.3% 49.9%
LOAD_GLOBAL_BUILTIN PRECALL 71740406 0.3% 50.2%
UNPACK_SEQUENCE_LIST STORE_FAST__STORE_FAST 71483229 0.3% 50.4%
RETURN_VALUE LOAD_FAST 71160867 0.3% 50.7%
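
The table above counts pairs of instructions as they execute, which needs an interpreter built with stats collection. A rough static approximation, shown here only to illustrate the tallying, counts adjacent opcode names in compiled bytecode instead. The sample functions `norm` and `shift` are hypothetical, and static counts ignore how often each instruction actually runs.

```python
import dis
from collections import Counter

def static_pair_counts(*funcs):
    """Count adjacent opcode-name pairs across the bytecode of funcs."""
    counts = Counter()
    for func in funcs:
        names = [ins.opname for ins in dis.get_instructions(func)]
        counts.update(zip(names, names[1:]))
    return counts

# Hypothetical sample functions to scan.
def norm(v):
    return (v.x * v.x + v.y * v.y) ** 0.5

def shift(v, d):
    v.x += d
    v.y += d
    return v

counts = static_pair_counts(norm, shift)
total = sum(counts.values())

# Same layout as the table: pair, count, share of all pairs.
for (first, second), n in counts.most_common(5):
    print(f"{first} {second} {n} {n / total:.1%}")
```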

Specialization stats

specialization stats by family

BINARY_SUBSCR

specialization stats for BINARY_SUBSCR family
Kind Count Ratio
unquickened 1484375 0.1%
pair_count[100] 144283 0.0%
pair_count[101] 4500 0.0%
pair_count[102] 40666 0.0%
pair_count[103] 431 0.0%
pair_count[104] 44 0.0%
pair_count[105] 77 0.0%
pair_count[106] 18181 0.0%
pair_count[107] 4968 0.0%
pair_count[110] 1631 0.0%
pair_count[111] 118 0.0%
pair_count[112] 12 0.0%
pair_count[114] 3074 0.0%
pair_count[115] 498 0.0%
pair_count[116] 54909 0.0%
pair_count[118] 1473 0.0%
pair_count[120] 168 0.0%
pair_count[122] 49008 0.0%
pair_count[124] 111393 0.0%
pair_count[125] 172687 0.0%
pair_count[128] 13584 0.0%
pair_count[129] 498 0.0%
pair_count[12] 8 0.0%
pair_count[136] 33 0.0%
pair_count[137] 25478 0.0%
pair_count[138] 11510 0.0%
pair_count[142] 432 0.0%
pair_count[144] 1817 0.0%
pair_count[145] 2247 0.0%
pair_count[146] 123 0.0%
pair_count[147] 7195 0.0%
pair_count[155] 7129 0.0%
pair_count[160] 146362 0.0%
pair_count[162] 24 0.0%
pair_count[164] 12 0.0%
pair_count[166] 372140 0.0%
pair_count[1] 3463 0.0%
pair_count[25] 3154 0.0%
pair_count[2] 427 0.0%
pair_count[35] 81732 0.0%
pair_count[60] 13218 0.0%
pair_count[65] 69 0.0%
pair_count[66] 3514 0.0%
pair_count[68] 14195 0.0%
pair_count[69] 135 0.0%
pair_count[72] 1758 0.0%
pair_count[79] 45139 0.0%
pair_count[83] 58549 0.0%
pair_count[86] 4169 0.0%
pair_count[90] 18245 0.0%
pair_count[92] 22720 0.0%
pair_count[95] 330 0.0%
pair_count[97] 6 0.0%
pair_count[99] 16519 0.0%
deferred 454363297 45.8%
deopt 128880 0.0%
hit 529028934 53.3%
miss 6880762 0.7%

Specialization attempts

Count Ratio
Success 273074 3.8%
Failure 7008023 96.2%
Failure kind Count Ratio
array int 4221379 60.2%
list slice 1392689 19.9%
buffer int 650519 9.3%
string int 210622 3.0%
buffer slice 203033 2.9%
other 186743 2.7%
string slice 104371 1.5%
tuple slice 29957 0.4%
sequence int 8643 0.1%
array slice 67 0.0%
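
The Ratio columns in these per-family blocks can be checked with simple arithmetic. The sketch below reproduces the BINARY_SUBSCR figures, assuming the family denominator is the sum of the unquickened, deferred, hit and miss counts (deopt and the pair_count entries are left out; that choice is an assumption of this example).

```python
# Figures copied from the BINARY_SUBSCR block above.
FAMILY = {
    "unquickened": 1484375,
    "deferred": 454363297,
    "hit": 529028934,
    "miss": 6880762,
}
ATTEMPTS = {"Success": 273074, "Failure": 7008023}

def shares(counts):
    """Map each key to its share of the dict's total, as a percent string."""
    total = sum(counts.values())
    return {key: f"{value / total:.1%}" for key, value in counts.items()}

print(shares(FAMILY))
print(shares(ATTEMPTS))
```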

STORE_SUBSCR

specialization stats for STORE_SUBSCR family
Kind Count Ratio
unquickened 405518 0.1%
pair_count[100] 69658 0.0%
pair_count[101] 52037 0.0%
pair_count[103] 1126 0.0%
pair_count[105] 1877 0.0%
pair_count[110] 6584 0.0%
pair_count[113] 33314 0.0%
pair_count[116] 11455 0.0%
pair_count[124] 51866 0.0%
pair_count[125] 6103 0.0%
pair_count[136] 67 0.0%
pair_count[137] 3684 0.0%
pair_count[144] 17343 0.0%
pair_count[1] 1298 0.0%
pair_count[2] 55722 0.0%
pair_count[89] 13470 0.0%
pair_count[91] 5 0.0%
pair_count[9] 79909 0.0%
deferred 221384285 58.0%
deopt 1 0.0%
hit 160086465 41.9%
miss 53 0.0%

Specialization attempts

Count Ratio
Success 89426 2.5%
Failure 3464579 97.5%
Failure kind Count Ratio
array int 1712010 49.4%
list slice 977000 28.2%
bytearray int 523117 15.1%
dict subclass no override 129402 3.7%
py simple 72354 2.1%
out of range 46071 1.3%
other 4558 0.1%
array slice 55 0.0%
py other 12 0.0%

UNPACK_SEQUENCE

specialization stats for UNPACK_SEQUENCE family
Kind Count Ratio
unquickened 595633 0.2%
pair_count[101] 225 0.0%
pair_count[124] 9113 0.0%
pair_count[125] 518363 0.2%
pair_count[138] 21968 0.0%
pair_count[35] 180 0.0%
pair_count[90] 44506 0.0%
pair_count[92] 1278 0.0%
deferred 1809256 0.6%
deopt 26800 0.0%
hit 315350921 98.8%
miss 1420400 0.4%

Specialization attempts

Count Ratio
Success 110060 98.5%
Failure 1731 1.5%
Failure kind Count Ratio
sequence 1403 81.1%
iterator 328 18.9%

FOR_ITER

specialization stats for FOR_ITER family
Kind Count Ratio
unquickened 621270950 100.0%
pair_count[100] 18392549 3.0%
pair_count[101] 12057 0.0%
pair_count[103] 1119048 0.2%
pair_count[105] 31178 0.0%
pair_count[110] 524014 0.1%
pair_count[113] 17486 0.0%
pair_count[116] 6105 0.0%
pair_count[124] 21960304 3.5%
pair_count[125] 153219078 24.7%
pair_count[126] 531813 0.1%
pair_count[136] 2615 0.0%
pair_count[137] 385228 0.1%
pair_count[138] 3059906 0.5%
pair_count[143] 37896 0.0%
pair_count[144] 22103 0.0%
pair_count[150] 3233 0.0%
pair_count[153] 5558574 0.9%
pair_count[154] 109375622 17.6%
pair_count[158] 1448837 0.2%
pair_count[159] 267570174 43.1%
pair_count[161] 1142752 0.2%
pair_count[167] 1045582 0.2%
pair_count[169] 2320562 0.4%
pair_count[1] 395 0.0%
pair_count[2] 14297729 2.3%
pair_count[35] 62 0.0%
pair_count[57] 10688766 1.7%
pair_count[65] 5325 0.0%
pair_count[66] 171346 0.0%
pair_count[67] 72998 0.0%
pair_count[83] 6756344 1.1%
pair_count[89] 66 0.0%
pair_count[90] 160073 0.0%
pair_count[91] 7449 0.0%
pair_count[92] 255949 0.0%
pair_count[9] 1067732 0.2%

Specialization attempts

Count Ratio
Success 0 0.0%
Failure 621270950 100.0%
Failure kind Count Ratio
list 231430416 37.3%
range 216796730 34.9%
other 44495887 7.2%
enumerate 35430571 5.7%
set 27126722 4.4%
dict items 2455692 4.0%
tuple 24122749 3.9%
generator 13179833 2.1%
itertools 2768156 0.4%
string 513906 0.1%
dict keys 474425 0.1%
bytes 270608 0.0%
dict values 104025 0.0%

STORE_ATTR

specialization stats for STORE_ATTR family
Kind Count Ratio
unquickened 2235704 0.7%
pair_count[100] 645741 0.2%
pair_count[101] 7751 0.0%
pair_count[103] 65523 0.0%
pair_count[105] 20151 0.0%
pair_count[110] 96085 0.0%
pair_count[113] 4037 0.0%
pair_count[116] 41780 0.0%
pair_count[120] 179 0.0%
pair_count[124] 1141733 0.4%
pair_count[125] 6338 0.0%
pair_count[136] 13916 0.0%
pair_count[137] 14876 0.0%
pair_count[139] 77 0.0%
pair_count[144] 7 0.0%
pair_count[1] 978 0.0%
pair_count[2] 111881 0.0%
pair_count[83] 253 0.0%
pair_count[89] 5005 0.0%
pair_count[90] 24 0.0%
pair_count[92] 319 0.0%
pair_count[9] 59050 0.0%
deferred 15028412 4.9%
deopt 75725 0.0%
hit 288293161 93.1%
miss 4210535 1.4%

Specialization attempts

Count Ratio
Success 272912 56.0%
Failure 214774 44.0%
Failure kind Count Ratio
overriding descriptor 68259 31.8%
overridden 66498 31.0%
out of range 35533 16.5%
not managed dict 18224 8.5%
method 10716 5.0%
mutable class 7392 3.4%
non object slot 4785 2.2%
property 3367 1.6%

LOAD_ATTR

specialization stats for LOAD_ATTR family
Kind Count Ratio
unquickened 9039495 0.5%
pair_count[100] 640930 0.0%
pair_count[101] 68650 0.0%
pair_count[102] 52858 0.0%
pair_count[103] 32590 0.0%
pair_count[104] 5306 0.0%
pair_count[105] 1520 0.0%
pair_count[106] 350809 0.0%
pair_count[107] 43945 0.0%
pair_count[110] 1076 0.0%
pair_count[111] 2125 0.0%
pair_count[112] 26441 0.0%
pair_count[114] 243883 0.0%
pair_count[115] 102242 0.0%
pair_count[116] 149674 0.0%
pair_count[117] 17818 0.0%
pair_count[118] 196190 0.0%
pair_count[11] 6011 0.0%
pair_count[120] 9971 0.0%
pair_count[121] 72 0.0%
pair_count[122] 49872 0.0%
pair_count[124] 1317894 0.1%
pair_count[125] 3207821 0.2%
pair_count[128] 123474 0.0%
pair_count[129] 164398 0.0%
pair_count[12] 2143 0.0%
pair_count[133] 203 0.0%
pair_count[136] 1005 0.0%
pair_count[137] 20530 0.0%
pair_count[138] 9023 0.0%
pair_count[142] 551 0.0%
pair_count[144] 23545 0.0%
pair_count[145] 15527 0.0%
pair_count[146] 315 0.0%
pair_count[147] 874 0.0%
pair_count[155] 25499 0.0%
pair_count[15] 220 0.0%
pair_count[160] 762548 0.0%
pair_count[162] 349 0.0%
pair_count[164] 975 0.0%
pair_count[165] 90 0.0%
pair_count[166] 801949 0.0%
pair_count[1] 359 0.0%
pair_count[25] 9683 0.0%
pair_count[2] 38800 0.0%
pair_count[35] 52458 0.0%
pair_count[53] 53431 0.0%
pair_count[60] 57837 0.0%
pair_count[61] 47 0.0%
pair_count[68] 82149 0.0%
pair_count[69] 705 0.0%
pair_count[83] 120911 0.0%
pair_count[86] 4732 0.0%
pair_count[90] 104945 0.0%
pair_count[92] 2869 0.0%
pair_count[95] 25193 0.0%
pair_count[96] 156 0.0%
pair_count[97] 6 0.0%
pair_count[99] 4219 0.0%
pair_count[9] 79 0.0%
deferred 260788474 14.9%
deopt 1371546 0.1%
hit 1404524258 80.3%
miss 73687127 4.2%

Specialization attempts

Count Ratio
Success 1890214 32.5%
Failure 3921423 67.5%
Failure kind Count Ratio
overridden 1383074 35.3%
out of range 791644 20.2%
overriding descriptor 677757 17.3%
not managed dict 395137 10.1%
property 289336 7.4%
method 253281 6.5%
non object slot 86294 2.2%
mutable class 41738 1.1%
module attr not found 3162 0.1%

COMPARE_OP

specialization stats for COMPARE_OP family
Kind Count Ratio
unquickened 74593437 10.7%
pair_count[100] 288 0.0%
pair_count[107] 55 0.0%
pair_count[110] 127540 0.0%
pair_count[111] 1932367 0.3%
pair_count[112] 335824 0.0%
pair_count[114] 979254 0.1%
pair_count[115] 305755 0.0%
pair_count[120] 288 0.0%
pair_count[122] 765 0.0%
pair_count[124] 106941 0.0%
pair_count[125] 67986 0.0%
pair_count[12] 1435212 0.2%
pair_count[144] 31811650 4.6%
pair_count[145] 336 0.0%
pair_count[158] 3191447 0.5%
pair_count[159] 140452 0.0%
pair_count[161] 1333134 0.2%
pair_count[166] 82302 0.0%
pair_count[17] 17072 0.0%
pair_count[21] 21454 0.0%
pair_count[25] 39 0.0%
pair_count[2] 1955538 0.3%
pair_count[3] 5147107 0.7%
pair_count[83] 24292046 3.5%
pair_count[86] 1293550 0.2%
pair_count[90] 15035 0.0%
deferred 58885540 8.4%
deopt 6235 0.0%
hit 563758561 80.8%
miss 375918 0.1%

Specialization attempts

Count Ratio
Success 117595 10.6%
Failure 989975 89.4%
Failure kind Count Ratio
float long 368737 37.2%
set 227629 23.0%
different types 88408 8.9%
not followed by cond jump 59490 6.0%
tuple 51648 5.2%
other 50170 5.1%
bool 44637 4.5%
big int 41012 4.1%
baseobject 29838 3.0%
bytes 18027 1.8%
list 8753 0.9%
long float 1571 0.2%
string 55 0.0%

LOAD_GLOBAL

specialization stats for LOAD_GLOBAL family
Kind Count Ratio
unquickened 20519349 1.9%
pair_count[100] 727815 0.1%
pair_count[102] 115256 0.0%
pair_count[103] 801 0.0%
pair_count[104] 98 0.0%
pair_count[105] 176 0.0%
pair_count[106] 1551399 0.1%
pair_count[107] 99327 0.0%
pair_count[110] 79 0.0%
pair_count[111] 106 0.0%
pair_count[112] 740 0.0%
pair_count[114] 11596623 1.1%
pair_count[115] 2444 0.0%
pair_count[116] 451195 0.0%
pair_count[117] 267221 0.0%
pair_count[118] 58555 0.0%
pair_count[11] 2186 0.0%
pair_count[120] 515 0.0%
pair_count[121] 170555 0.0%
pair_count[122] 68391 0.0%
pair_count[124] 3458003 0.3%
pair_count[125] 51179 0.0%
pair_count[128] 48640 0.0%
pair_count[129] 19343 0.0%
pair_count[136] 43641 0.0%
pair_count[137] 73638 0.0%
pair_count[138] 3364 0.0%
pair_count[144] 8580 0.0%
pair_count[147] 687 0.0%
pair_count[155] 271 0.0%
pair_count[160] 638544 0.1%
pair_count[166] 850521 0.1%
pair_count[25] 33553 0.0%
pair_count[2] 70684 0.0%
pair_count[35] 12 0.0%
pair_count[53] 4267 0.0%
pair_count[58] 276 0.0%
pair_count[60] 173 0.0%
pair_count[64] 7496 0.0%
pair_count[68] 26294 0.0%
pair_count[69] 9 0.0%
pair_count[83] 40440 0.0%
pair_count[86] 5 0.0%
pair_count[90] 933 0.0%
pair_count[95] 24606 0.0%
pair_count[97] 2 0.0%
pair_count[99] 490 0.0%
pair_count[9] 216 0.0%
deferred 2274473 0.2%
deopt 46606 0.0%
hit 1032579008 97.4%
miss 4456199 0.4%

Specialization attempts

Count Ratio
Success 1261545 100.0%
Failure 0 0.0%
Failure kind Count Ratio

BINARY_OP

specialization stats for BINARY_OP family
Kind Count Ratio
unquickened 1402274 0.1%
pair_count[100] 141162 0.0%
pair_count[101] 8209 0.0%
pair_count[102] 7888 0.0%
pair_count[103] 773 0.0%
pair_count[104] 66 0.0%
pair_count[105] 44 0.0%
pair_count[106] 592 0.0%
pair_count[107] 49249 0.0%
pair_count[110] 17 0.0%
pair_count[112] 3432 0.0%
pair_count[114] 76724 0.0%
pair_count[115] 20642 0.0%
pair_count[116] 6291 0.0%
pair_count[117] 231 0.0%
pair_count[118] 137 0.0%
pair_count[11] 711 0.0%
pair_count[120] 468 0.0%
pair_count[121] 48 0.0%
pair_count[122] 68425 0.0%
pair_count[124] 182715 0.0%
pair_count[125] 362740 0.0%
pair_count[12] 48 0.0%
pair_count[133] 43789 0.0%
pair_count[137] 940 0.0%
pair_count[138] 4802 0.0%
pair_count[142] 454 0.0%
pair_count[144] 3707 0.0%
pair_count[145] 3107 0.0%
pair_count[146] 938 0.0%
pair_count[147] 7569 0.0%
pair_count[15] 154 0.0%
pair_count[160] 785 0.0%
pair_count[162] 185 0.0%
pair_count[166] 143421 0.0%
pair_count[1] 6559 0.0%
pair_count[25] 6605 0.0%
pair_count[2] 8879 0.0%
pair_count[60] 7781 0.0%
pair_count[68] 1201 0.0%
pair_count[83] 34178 0.0%
pair_count[86] 2650 0.0%
pair_count[90] 39010 0.0%
pair_count[92] 164 0.0%
pair_count[97] 289 0.0%
pair_count[99] 154495 0.0%
deferred 390718278 19.7%
deopt 258962 0.0%
hit 1578336813 79.5%
miss 13796563 0.7%

Specialization attempts

Count Ratio
Success 373812 5.9%
Failure 5908998 94.1%
Failure kind Count Ratio
and int 787753 13.3%
lshift 644761 10.9%
rshift 569945 9.6%
true divide different types 501680 8.5%
add other 495942 8.4%
remainder 474207 8.0%
true divide float 378062 6.4%
floor divide 377067 6.4%
xor 360464 6.1%
subtract different types 291534 4.9%
multiply different types 290135 4.9%
subtract other 256181 4.3%
power 132920 2.2%
add different types 129534 2.2%
or 115896 2.0%
multiply other 66272 1.1%
and other 28667 0.5%
true divide other 5353 0.1%
and different types 2625 0.0%

LOAD_METHOD

specialization stats for LOAD_METHOD family
Kind Count Ratio
unquickened 15581713 2.2%
pair_count[100] 1104165 0.2%
pair_count[101] 141817 0.0%
pair_count[103] 9060 0.0%
pair_count[105] 83 0.0%
pair_count[116] 186301 0.0%
pair_count[124] 1507102 0.2%
pair_count[136] 23754 0.0%
pair_count[137] 51299 0.0%
pair_count[144] 11 0.0%
pair_count[166] 12481875 1.7%
pair_count[2] 56819 0.0%
pair_count[35] 4489 0.0%
pair_count[65] 414 0.0%
pair_count[66] 14524 0.0%
deferred 116124632 16.1%
deopt 234116 0.0%
hit 578280645 80.0%
miss 12555961 1.7%

Specialization attempts

Count Ratio
Success 628779 27.8%
Failure 1630018 72.2%
Failure kind Count Ratio
has managed dict 499798 30.7%
has dict 436160 26.8%
instance attribute 332193 20.4%
overridden 114431 7.0%
class method obj 111587 6.8%
metaclass attribute 81731 5.0%
non overriding descriptor 21464 1.3%
builtin class method 9937 0.6%
not descriptor 6959 0.4%
mutable class 5446 0.3%
other 3638 0.2%
property 2838 0.2%
object slot 1711 0.1%
is attr 1193 0.1%
overriding descriptor 909 0.1%
non object slot 23 0.0%

PRECALL

specialization stats for PRECALL family
Kind Count Ratio
unquickened 1549055641 100.0%
pair_count[144] 90 0.0%
pair_count[171] 22045179 1.4%
pair_count[172] 22982886 1.5%
pair_count[34] 232638785 15.0%
pair_count[36] 43512298 2.8%
pair_count[37] 150313397 9.7%
pair_count[38] 89568082 5.8%
pair_count[39] 19870099 1.3%
pair_count[40] 102340275 6.6%
pair_count[41] 125452051 8.1%
pair_count[42] 473126644 30.5%
pair_count[43] 16311960 1.1%
pair_count[44] 51454664 3.3%
pair_count[45] 18312469 1.2%
pair_count[46] 41937071 2.7%
pair_count[47] 22470997 1.5%
pair_count[48] 5995650 0.4%
pair_count[55] 20782496 1.3%
pair_count[56] 89940548 5.8%

Specialization attempts

Count Ratio
Success 0 0.0%
Failure 1549055641 100.0%
Failure kind Count Ratio
pyfunction 573694427 37.0%
pycfunction 522639165 33.7%
method descriptor 229870636 14.8%
class 121891395 7.9%
bound method 58749623 3.8%
python class 31122813 2.0%
other 5617365 0.4%
cmethod 4307282 0.3%
method wrapper 1103739 0.1%
operator wrapper 59196 0.0%

CALL

specialization stats for CALL family
Kind Count Ratio
unquickened 22428157 2.1%
pair_count[100] 339390 0.0%
pair_count[101] 21612 0.0%
pair_count[102] 25279 0.0%
pair_count[103] 10340 0.0%
pair_count[104] 103 0.0%
pair_count[105] 230 0.0%
pair_count[106] 46459 0.0%
pair_count[107] 109395 0.0%
pair_count[110] 776 0.0%
pair_count[111] 25388 0.0%
pair_count[112] 33777 0.0%
pair_count[114] 659688 0.1%
pair_count[115] 295009 0.0%
pair_count[116] 57551 0.0%
pair_count[117] 1190 0.0%
pair_count[118] 17377 0.0%
pair_count[11] 656 0.0%
pair_count[120] 7845 0.0%
pair_count[122] 48613 0.0%
pair_count[124] 327655 0.0%
pair_count[125] 1061701 0.1%
pair_count[126] 77 0.0%
pair_count[128] 96486 0.0%
pair_count[129] 532 0.0%
pair_count[12] 1083 0.0%
pair_count[130] 29063 0.0%
pair_count[133] 77 0.0%
pair_count[135] 307740 0.0%
pair_count[136] 52 0.0%
pair_count[137] 4637 0.0%
pair_count[138] 16599 0.0%
pair_count[142] 1482 0.0%
pair_count[144] 60435 0.0%
pair_count[145] 38428 0.0%
pair_count[146] 194 0.0%
pair_count[147] 5708 0.0%
pair_count[149] 294156 0.0%
pair_count[151] 1909019 0.2%
pair_count[155] 456 0.0%
pair_count[15] 70 0.0%
pair_count[160] 118726 0.0%
pair_count[162] 398 0.0%
pair_count[164] 7335 0.0%
pair_count[166] 258900 0.0%
pair_count[1] 1087871 0.1%
pair_count[25] 2746 0.0%
pair_count[2] 120242 0.0%
pair_count[35] 44862 0.0%
pair_count[53] 70656 0.0%
pair_count[60] 1506 0.0%
pair_count[61] 77 0.0%
pair_count[68] 97395 0.0%
pair_count[69] 2158 0.0%
pair_count[75] 47669 0.0%
pair_count[80] 12882219 1.2%
pair_count[83] 458986 0.0%
pair_count[86] 27637 0.0%
pair_count[90] 1212446 0.1%
pair_count[92] 109403 0.0%
pair_count[95] 408 0.0%
pair_count[97] 4038 0.0%
pair_count[99] 15864 0.0%
pair_count[9] 88 0.0%
deferred 250304088 23.7%
deopt 366797 0.0%
hit 764585970 72.3%
miss 19929048 1.9%

Specialization attempts

Count Ratio
Success 1277470 25.0%
Failure 3842089 75.0%
Failure kind Count Ratio
bound method 955436 24.9%
complex parameters 511752 13.3%
python class 508788 13.2%
pycfunction with keywords 361624 9.4%
class no vectorcall 307273 8.0%
kwnames 273192 7.1%
pycfunction 244743 6.4%
pycfunction noargs 165014 4.3%
class mutable 147305 3.8%
other 92221 2.4%
bad call flags 85958 2.2%
pycfunction fast with keywords 68441 1.8%
cmethod 67441 1.8%
str 32111 0.8%
method wrapper 18451 0.5%
operator wrapper 2339 0.1%

Specialization effectiveness

specialization effectiveness
Instructions Count Ratio
Basic 15754525268 55.9%
Not specialized 4232687411 15.0%
Specialized 8199896680 29.1%

Call stats

Inlined calls and frame stats
Count Ratio
Calls to PyEval_EvalDefault 254821643 27.0%
Calls to Python functions inlined 689705537 73.0%
Frames pushed 844052434 89.4%
Frame objects created 9037863 1.0%

Object stats

allocations, frees and dict materializations
Count Ratio
Allocations 1622062277
Frees 1586809566
New values 28822093
Materialize dict (on request) 1208299 4.2%
Materialize dict (new key) 525518 1.8%
Materialize dict (too big) 0 0.0%

Stats gathered on: 2022-02-22

@sweeneyde sweeneyde added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 22, 2022
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit b6e0b7b 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 22, 2022
Member

@markshannon markshannon left a comment

I wanted to keep the superinstructions separate from the specialized instructions, but this isn't much extra code and the speedup looks solid.

Pragmatism beats purity, it seems.

A couple of minor changes, otherwise looks good.

@@ -5430,6 +5460,53 @@ MISS_WITH_CACHE(BINARY_SUBSCR)
MISS_WITH_CACHE(UNPACK_SEQUENCE)
MISS_WITH_OPARG_COUNTER(STORE_SUBSCR)

LOAD_ATTR_INSTANCE_VALUE_miss:
Member

This shouldn't need special casing. If the LOAD_ATTR_INSTANCE_VALUE gets de-optimized, the preceding LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE is still valid, just less likely to hit.
Given how unlikely a branch to the LOAD_ATTR is, don't worry about this case.

Member Author

Since the combined instruction borrows the cache from the following instruction, I was worried about the combined instruction misinterpreting some of the cache data, so that's why I was attempting to keep them in sync.

But it seems that the instructions in the LOAD_ATTR family all interpret the cache entries in the same way, so for now at least, whenever cache1->tp_version is correct, cache0->index should be correct too: you can't turn a class whose instances have a _PyObject_ValuesPointer into a class with __slots__ without modifying tp_version_tag. (right?)

This might not be true though if we ever specialize instance.class_attribute. In that scenario, a LOAD_ATTR_INSTANCE_VALUE cache entry could have a cache1->tp_version and a cache0->index into the values array of the instance, but then later get de-optimized and re-optimized to the hypothetical LOAD_ATTR_CLASS_ATTRIBUTE, which could (?) have the same cache1->tp_version but a cache0->index indexing into the type's values array.

It feels less fragile to me to keep the adjacent instructions in sync, so that cache is only interpreted in one way at a time -- never as a LOAD_ATTR_INSTANCE_VALUE cache from one instruction's perspective but a LOAD_ATTR_ADAPTIVE cache from a different perspective. Is that worth worrying about?
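
The "keep adjacent instructions in sync" idea can be sketched with a toy bytecode array. This is only an illustration under stated assumptions: the `Toy*` names, the opcode numbering, and the instruction layout are hypothetical and do not match CPython's real code units or cache entries. The sketch shows one rule: whenever the specialized LOAD_ATTR is de-optimized, a superinstruction directly before it is rewritten back to plain LOAD_FAST, so the shared cache is never read under two interpretations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch; not CPython's real numbering. */
enum ToyOpcode {
    TOY_LOAD_FAST,
    TOY_LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE, /* borrows the next instruction's cache */
    TOY_LOAD_ATTR_ADAPTIVE,
    TOY_LOAD_ATTR_INSTANCE_VALUE
};

typedef struct {
    enum ToyOpcode opcode;
    uint8_t oparg;
} ToyInstr;

/* De-optimize the specialized LOAD_ATTR at position i and, if the
   instruction before it is the superinstruction, demote that one too. */
void toy_deopt_load_attr(ToyInstr *code, size_t i)
{
    assert(code[i].opcode == TOY_LOAD_ATTR_INSTANCE_VALUE);
    code[i].opcode = TOY_LOAD_ATTR_ADAPTIVE;
    if (i > 0 && code[i - 1].opcode == TOY_LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE) {
        code[i - 1].opcode = TOY_LOAD_FAST; /* oparg (the local index) is unchanged */
    }
}

/* Invariant: every superinstruction is directly followed by the
   specialized LOAD_ATTR whose cache it reads. Returns 1 if it holds. */
int toy_pairs_in_sync(const ToyInstr *code, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (code[i].opcode == TOY_LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE) {
            if (i + 1 >= n || code[i + 1].opcode != TOY_LOAD_ATTR_INSTANCE_VALUE) {
                return 0;
            }
        }
    }
    return 1;
}
```

With a two-instruction array holding the superinstruction followed by the specialized LOAD_ATTR, calling `toy_deopt_load_attr(code, 1)` leaves a plain LOAD_FAST followed by the adaptive LOAD_ATTR, and `toy_pairs_in_sync` still reports the invariant as holding.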

Member

Yes, it does make sense to special case LOAD_ATTR_INSTANCE_VALUE_miss.
I didn't properly consider the case where LOAD_ATTR_INSTANCE_VALUE gets deoptimized before LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE.

LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE_miss can be simplified by letting LOAD_ATTR_INSTANCE_VALUE_miss do the work.

Something like:

LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE_miss:
        {
            // This is special-cased because we have a superinstruction
            // that includes a specialized instruction.
            // We execute the first instruction, then perform a miss for
            // the second instruction as usual.

            // Do LOAD_FAST
            {
                PyObject *value = GETLOCAL(oparg);
                Py_INCREF(value);
                PUSH(value);
                NEXTOPARG();
                next_instr++;
            }

            // Now we are in the correct state for LOAD_ATTR
            goto LOAD_ATTR_INSTANCE_VALUE_miss;
        }
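
The control flow suggested above can be tried outside the interpreter with a small stand-alone model. This is a sketch under assumptions, not the code in the PR: `ToyObj`, `ToyCache`, `toy_load_attr_miss`, and `toy_exec_pair` are hypothetical names, the "type version" is a plain integer, and a function call stands in for the `goto`. On a hit the owner is only borrowed from the local slot. On a miss the model first does what LOAD_FAST would do, then hands over to the ordinary LOAD_ATTR miss path.

```c
#include <assert.h>

/* Toy object: a type version tag, an inline values array, and a refcount. */
typedef struct {
    unsigned version;
    int values[4];
    int refcnt;
    int incref_calls; /* how often this sketch incremented refcnt */
} ToyObj;

/* Cache entry owned by the LOAD_ATTR half of the pair. */
typedef struct {
    unsigned cached_version;
    int cached_index;
} ToyCache;

/* Stands in for LOAD_ATTR_INSTANCE_VALUE_miss: expects the owner on top of
   the stack, performs an uncached lookup, and releases the owner. */
static int toy_load_attr_miss(ToyObj **stack, int *sp, int attr_index)
{
    ToyObj *owner = stack[--(*sp)];
    int value = owner->values[attr_index];
    owner->refcnt--; /* plays the role of Py_DECREF(owner) */
    return value;
}

/* Stands in for the superinstruction together with its miss label. */
int toy_exec_pair(ToyObj *local, const ToyCache *cache, int attr_index)
{
    ToyObj *stack[1];
    int sp = 0;

    /* Hit: the owner is borrowed from the local slot and never pushed,
       so no refcount traffic happens. */
    if (local->version == cache->cached_version) {
        return local->values[cache->cached_index];
    }

    /* Miss: first do what a plain LOAD_FAST would have done. */
    local->refcnt++; /* plays the role of Py_INCREF(value) */
    local->incref_calls++;
    stack[sp++] = local; /* plays the role of PUSH(value) */

    /* The stack now looks the way LOAD_ATTR expects, so hand over to its
       usual miss handling (a goto in the interpreter, a call here). */
    return toy_load_attr_miss(stack, &sp, attr_index);
}
```

Both paths return the same attribute value; only the miss path touches the reference count, and it restores it before returning.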

Python/ceval.c Outdated
LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE_miss:
{
// Special-cased because the standard opcode does not push
// "owner" to the stack, but we need to push it if we DEOPT.
Member

This seems misleading. We don't need to push the local here because the jump to LOAD_FAST does it.
Maybe:
"Special cased as this is a specialization of LOAD_FAST, but de-optimization occurs during the following LOAD_ATTR instruction."

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Python/ceval.c Outdated
{
// This is special-cased because we have a specialization of
// LOAD_FAST that borrows cache from the following instruction.
STAT_INC(LOAD_FAST__LOAD_ATTR_INSTANCE_VALUE, miss);
Member

@markshannon markshannon Feb 23, 2022

I don't think we want to count this as a miss. It is LOAD_ATTR_INSTANCE_VALUE that misses and we don't want to count the miss twice.
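
A small model of the counters shows why the miss is charged once. The names here (`ToyOpStats`, `ToyStats`, `toy_record_pair`, `toy_total_misses`) are hypothetical and only loosely modelled on the per-opcode hit/miss counters in the stats above. The assumption is that the failing guard belongs to the LOAD_ATTR half, so the superinstruction records hits but never misses.

```c
#include <assert.h>

/* Hypothetical per-opcode counters for this sketch. */
typedef struct {
    unsigned long hit;
    unsigned long miss;
} ToyOpStats;

typedef struct {
    ToyOpStats load_attr_instance_value;
    ToyOpStats load_fast__load_attr_instance_value;
} ToyStats;

/* Record one execution of the superinstruction.
   guard_ok is nonzero when the cached type version still matches. */
void toy_record_pair(ToyStats *s, int guard_ok)
{
    if (guard_ok) {
        s->load_fast__load_attr_instance_value.hit++;
        return;
    }
    /* The failing guard belongs to the LOAD_ATTR half, so only that
       instruction is charged; the superinstruction just falls back. */
    s->load_attr_instance_value.miss++;
}

/* Total misses across both counters; equals the number of failed guards
   as long as each failure is recorded exactly once. */
unsigned long toy_total_misses(const ToyStats *s)
{
    return s->load_attr_instance_value.miss
         + s->load_fast__load_attr_instance_value.miss;
}
```

After one successful and two failed executions, the total is two misses, matching the two failed guards. Incrementing both counters on the fallback path would report four.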

@sweeneyde
Member Author

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 24, 2022
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 6450e4b 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 24, 2022
@markshannon
Member

Looks good

@markshannon
Member

Failures are either timeouts or pre-existing test_gdb failures. All unrelated to this PR.


4 participants