Description
Bug report
Bug description:
When switching a Python project from the default multiprocessing fork mode to forkserver, I've noticed the processes in the pool got significantly slower to start. Specifying preload modules didn't help.
Debugging internals of multiprocessing showed that most of the time (50 ms in each process in my test) is spent in _fixup_main_from_path(data['init_main_from_path']):
cpython/Lib/multiprocessing/spawn.py
Line 246 in eae9d7d
_fixup_main_from_path(data['init_main_from_path'])
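To make the per-process cost concrete, here is a minimal sketch of what re-running a main script costs. It assumes that _fixup_main_from_path re-executes the script through runpy.run_path under the module name __mp_main__ (my reading of spawn.py, not something stated above). The script contents, the file name main_script.py, and the helper time_main_rerun are hypothetical stand-ins.

```python
import os
import runpy
import tempfile
import time

# Hypothetical main script whose top-level code is deliberately slow.
SCRIPT = """
import time
time.sleep(0.05)  # stands in for expensive imports or setup
VALUE = 42
"""


def time_main_rerun(script_text):
    """Execute script_text from a file under the name __mp_main__ and time it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "main_script.py")
        with open(path, "w") as f:
            f.write(script_text)
        start = time.perf_counter()
        # Assumption: this approximates what each child does when __main__
        # was not already loaded in the fork server.
        namespace = runpy.run_path(path, run_name="__mp_main__")
        elapsed = time.perf_counter() - start
    return namespace, elapsed


namespace, elapsed = time_main_rerun(SCRIPT)
print(f"re-running the script took {elapsed * 1000:.1f} ms")
```

If the fork server had already loaded the main module, each forked child would presumably inherit it and skip this step.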
This seemed surprising because __main__ was mentioned in the preload parameter; however, I've noticed that forkserver.py tries to populate the main_path parameter from spawn.get_preparation_data():
cpython/Lib/multiprocessing/forkserver.py
Lines 149 to 151 in eae9d7d
desired_keys = {'main_path', 'sys_path'}
data = spawn.get_preparation_data('ignore')
main_kws = {x: y for x, y in data.items() if x in desired_keys}
However, the latter only writes the path to the init_main_from_path parameter:
cpython/Lib/multiprocessing/spawn.py
Line 202 in eae9d7d
d['init_main_from_path'] = os.path.normpath(main_path)
The end effect is that the __main__ module wasn't preloaded in practice, and every child process had to re-run the main script. Unless I'm missing something, the logic in forkserver.py needs to get main_path from the value of init_main_from_path?
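The key mismatch can be sketched without touching multiprocessing internals. The first function mirrors the filter quoted from forkserver.py. The second is a hypothetical remapping along the lines suggested above, not an actual patch. The function names and the sample dictionary, including its paths and the extra log_to_stderr key, are illustrative stand-ins for what spawn.get_preparation_data() returns.

```python
def select_main_kws_current(data):
    """Mirror the key filter quoted from forkserver.py."""
    desired_keys = {"main_path", "sys_path"}
    return {k: v for k, v in data.items() if k in desired_keys}


def select_main_kws_remapped(data):
    """Hypothetical variant: forward init_main_from_path as main_path."""
    main_kws = {}
    if "sys_path" in data:
        main_kws["sys_path"] = data["sys_path"]
    if "init_main_from_path" in data:
        main_kws["main_path"] = data["init_main_from_path"]
    return main_kws


# Stand-in preparation data for a script started by path; values are made up.
data = {
    "sys_path": ["/srv/app"],
    "init_main_from_path": "/srv/app/main.py",
    "log_to_stderr": False,
}

current = select_main_kws_current(data)
remapped = select_main_kws_remapped(data)

print("main_path" in current)   # False: the filter never sees a main_path key
print(remapped["main_path"])    # /srv/app/main.py
```

With the current filter only sys_path survives, so the fork server has no main path to preload. The remapped variant passes the script path through under the name the filter expects.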
CPython versions tested on:
3.13
Operating systems tested on:
Linux