gh-136421: Load _datetime static types during interpreter initialization #136583

Open
wants to merge 15 commits into
base: main
Changes from 9 commits
1 change: 1 addition & 0 deletions Include/internal/pycore_pylifecycle.h
@@ -41,6 +41,7 @@ extern PyStatus _Py_HashRandomization_Init(const PyConfig *);

 extern PyStatus _PyGC_Init(PyInterpreterState *interp);
 extern PyStatus _PyAtExit_Init(PyInterpreterState *interp);
+extern PyStatus _PyDateTime_InitTypes(PyInterpreterState *interp);

 /* Various internal finalizers */

26 changes: 26 additions & 0 deletions Lib/test/datetimetester.py
@@ -3651,6 +3651,32 @@ def test_repr_subclass(self):
         td = SubclassDatetime(2010, 10, 2, second=3)
         self.assertEqual(repr(td), "SubclassDatetime(2010, 10, 2, 0, 0, 3)")

+    @support.cpython_only
+    def test_concurrent_initialization_subinterpreter(self):
+        # Run in a subprocess to ensure we get a clean version of _datetime
Contributor

Please put an anchor instead of the well-known explanation of assert_python_ok(). Also, move the test to ExtensionModuleTests (@support.cpython_only is redundant there). TestDateTime is the place to test the datetime class.

Member Author

Sorry, I'm not sure what you mean by "an anchor".

Contributor

Sorry about that. I meant the gh-issue number or the url.

Member Author

Ok, did both.

+        script = """if True:
+        from concurrent.futures import InterpreterPoolExecutor
+        def func():
+            import _datetime
+            print('a', end='')
+        with InterpreterPoolExecutor() as executor:
+            for _ in range(8):
Contributor

It's up to you, but range(10) is probably better than range(8) for the Windows x64 CI to reproduce the issue.

Member Author

Eh, I think this is fine. Many other systems have 8 cores, not 10.

Contributor

Okay, but it sounds like you are talking about the max_workers argument rather than the submit count here.

Member Author

@ZeroIntensity, Jul 20, 2025

max_workers just chooses the number of threads on the system when not set. We could try to submit os.cpu_count() number of futures, but we don't need to overcomplicate this; we just need something that stresses several subinterpreters trying to import _datetime concurrently.
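
To make the distinction in this thread concrete, here is a minimal sketch that is not part of the PR. It uses `ThreadPoolExecutor` as a stand-in (an assumption, so the snippet also runs on Python versions without `InterpreterPoolExecutor`), and `run_tasks` is a hypothetical helper. It shows that the number of submitted tasks and `max_workers` are separate knobs: every submitted task runs, while the pool never uses more workers than `max_workers`.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_tasks(submit_count, max_workers=None):
    """Submit `submit_count` tasks and report how many workers ran them.

    Hypothetical helper for illustration only; returns a tuple of
    (task results in submission order, number of distinct worker threads).
    """
    def task(i):
        # Record which worker thread picked up this task.
        return i, threading.get_ident()

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(task, i) for i in range(submit_count)]
        outcomes = [future.result() for future in futures]

    results = [i for i, _ in outcomes]
    workers = {ident for _, ident in outcomes}
    return results, len(workers)


if __name__ == "__main__":
    results, worker_count = run_tasks(8, max_workers=2)
    # All 8 tasks complete even though at most 2 workers exist.
    print(len(results), worker_count)
```

In the test under review, the loop count plays the role of `submit_count`; leaving `max_workers` unset lets the executor pick its own default.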

+                executor.submit(func)
+        """
+        rc, out, err = script_helper.assert_python_ok("-c", script)
+        self.assertEqual(rc, 0)
+        self.assertEqual(out, b"a" * 8)
+        self.assertEqual(err, b"")
+
+        # Now test against concurrent reinitialization
+        script = "import _datetime\n" + script
+        rc, out, err = script_helper.assert_python_ok("-c", script)
+        self.assertEqual(rc, 0)
+        self.assertEqual(out, b"a" * 8)
+        self.assertEqual(err, b"")
+

 class TestSubclassDateTime(TestDateTime):
     theclass = SubclassDatetime
@@ -0,0 +1 @@
+Fix crash when initializing :mod:`datetime` concurrently.
2 changes: 2 additions & 0 deletions Modules/Setup.bootstrap.in
@@ -12,6 +12,8 @@ posix posixmodule.c
 _signal signalmodule.c
 _tracemalloc _tracemalloc.c
 _suggestions _suggestions.c
+# needs libm and on some platforms librt
+_datetime _datetimemodule.c

 # modules used by importlib, deepfreeze, freeze, runpy, and sysconfig
 _codecs _codecsmodule.c
3 changes: 0 additions & 3 deletions Modules/Setup.stdlib.in
@@ -56,9 +56,6 @@
 @MODULE_CMATH_TRUE@cmath cmathmodule.c
 @MODULE__STATISTICS_TRUE@_statistics _statisticsmodule.c

-# needs libm and on some platforms librt
-@MODULE__DATETIME_TRUE@_datetime _datetimemodule.c
-
 # _decimal uses libmpdec
 # either static libmpdec.a from Modules/_decimal/libmpdec or libmpdec.so
 # with ./configure --with-system-libmpdec
154 changes: 71 additions & 83 deletions Modules/_datetimemodule.c
@@ -14,6 +14,7 @@
 #include "pycore_object.h" // _PyObject_Init()
 #include "pycore_time.h" // _PyTime_ObjectToTime_t()
 #include "pycore_unicodeobject.h" // _PyUnicode_Copy()
+#include "pycore_initconfig.h" // _PyStatus_OK()

 #include "datetime.h"

@@ -124,10 +125,9 @@ get_module_state(PyObject *module)
 #define INTERP_KEY ((PyObject *)&_Py_ID(cached_datetime_module))

 static PyObject *
-get_current_module(PyInterpreterState *interp, int *p_reloading)
+get_current_module(PyInterpreterState *interp)
 {
     PyObject *mod = NULL;
-    int reloading = 0;

     PyObject *dict = PyInterpreterState_GetDict(interp);
     if (dict == NULL) {
@@ -138,7 +138,6 @@ get_current_module(PyInterpreterState *interp, int *p_reloading)
         goto error;
     }
     if (ref != NULL) {
-        reloading = 1;
         if (ref != Py_None) {
             (void)PyWeakref_GetRef(ref, &mod);
             if (mod == Py_None) {
@@ -147,9 +146,6 @@ get_current_module(PyInterpreterState *interp, int *p_reloading)
             Py_DECREF(ref);
         }
     }
-    if (p_reloading != NULL) {
-        *p_reloading = reloading;
-    }
     return mod;

 error:
@@ -163,7 +159,7 @@ static datetime_state *
 _get_current_state(PyObject **p_mod)
 {
     PyInterpreterState *interp = PyInterpreterState_Get();
-    PyObject *mod = get_current_module(interp, NULL);
+    PyObject *mod = get_current_module(interp);
     if (mod == NULL) {
         assert(!PyErr_Occurred());
         if (PyErr_Occurred()) {
@@ -7329,13 +7325,9 @@ clear_state(datetime_state *st)
 }


-static int
-init_static_types(PyInterpreterState *interp, int reloading)
+PyStatus
+_PyDateTime_InitTypes(PyInterpreterState *interp)
 {
-    if (reloading) {
-        return 0;
-    }
-
     // `&...` is not a constant expression according to a strict reading
     // of C standards. Fill tp_base at run-time rather than statically.
     // See https://bugs.python.org/issue40777
Contributor

This PR does not address the possible races in PyDateTime_*.tp_base = &PyDateTime_*Type; below, right?

Member Author

I guess not. I don't think there's an easy way to do this here, because atomically storing tp_base will continue to race with all the non-atomic reads elsewhere.

Do we even need to load it at runtime like this? We have other examples of directly storing it in the PyTypeObject structure:

.tp_base = &PyCFunction_Type,

Contributor

It is not needed now because the module is statically linked. That issue happens only with dynamically loaded modules, so you can define it statically now.

Contributor

I guess a _Py_IsMainInterpreter() check will suffice in this PR?

Member Author

> It is not needed now because the module is statically linked. That issue happens only with dynamically loaded modules, so you can define it statically now.

Ah, TIL. That's definitely the best option here then.

> I guess a _Py_IsMainInterpreter() check will suffice in this PR?

For what?

Contributor

To ensure tp_base is set only once after Py_Initialize(). But I missed Kumar's comment.

Contributor

Please remove PyDateTime_DateTimeType.tp_base = &PyDateTime_DateType; as well.

@@ -7347,11 +7339,74 @@ init_static_types(PyInterpreterState *interp, int reloading)
     for (size_t i = 0; i < Py_ARRAY_LENGTH(capi_types); i++) {
         PyTypeObject *type = capi_types[i];
         if (_PyStaticType_InitForExtension(interp, type) < 0) {
-            return -1;
Contributor

Not for this PR, but as a follow-up I think it would be better to remove _PyStaticType_InitForExtension and just use _PyStaticType_InitBuiltin here; there's a lot of special casing that could be removed.

+            return _PyStatus_ERR("could not initialize static types");
         }
     }

-    return 0;
+#define DATETIME_ADD_MACRO(dict, c, value_expr) \
+    do { \
+        assert(!PyErr_Occurred()); \
+        PyObject *value = (value_expr); \
+        if (value == NULL) { \
+            goto error; \
+        } \
+        if (PyDict_SetItemString(dict, c, value) < 0) { \
+            Py_DECREF(value); \
+            goto error; \
+        } \
+        Py_DECREF(value); \
+    } while(0)
+
+    /* timedelta values */
+    PyObject *d = _PyType_GetDict(&PyDateTime_DeltaType);
+    DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
+    DATETIME_ADD_MACRO(d, "min", new_delta(-MAX_DELTA_DAYS, 0, 0, 0));
+    DATETIME_ADD_MACRO(d, "max",
+                       new_delta(MAX_DELTA_DAYS, 24*3600-1, 1000000-1, 0));
+
+    /* date values */
+    d = _PyType_GetDict(&PyDateTime_DateType);
+    DATETIME_ADD_MACRO(d, "min", new_date(1, 1, 1));
+    DATETIME_ADD_MACRO(d, "max", new_date(MAXYEAR, 12, 31));
+    DATETIME_ADD_MACRO(d, "resolution", new_delta(1, 0, 0, 0));
+
+    /* time values */
+    d = _PyType_GetDict(&PyDateTime_TimeType);
+    DATETIME_ADD_MACRO(d, "min", new_time(0, 0, 0, 0, Py_None, 0));
+    DATETIME_ADD_MACRO(d, "max", new_time(23, 59, 59, 999999, Py_None, 0));
+    DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
+
+    /* datetime values */
+    d = _PyType_GetDict(&PyDateTime_DateTimeType);
+    DATETIME_ADD_MACRO(d, "min",
+                       new_datetime(1, 1, 1, 0, 0, 0, 0, Py_None, 0));
+    DATETIME_ADD_MACRO(d, "max", new_datetime(MAXYEAR, 12, 31, 23, 59, 59,
+                                              999999, Py_None, 0));
+    DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
+
+    /* timezone values */
+    d = _PyType_GetDict(&PyDateTime_TimeZoneType);
+    if (PyDict_SetItemString(d, "utc", (PyObject *)&utc_timezone) < 0) {
+        goto error;
+    }
+
+    /* bpo-37642: These attributes are rounded to the nearest minute for backwards
+     * compatibility, even though the constructor will accept a wider range of
+     * values. This may change in the future.*/
+
+    /* -23:59 */
+    DATETIME_ADD_MACRO(d, "min", create_timezone_from_delta(-1, 60, 0, 1));
+
+    /* +23:59 */
+    DATETIME_ADD_MACRO(
+        d, "max", create_timezone_from_delta(0, (23 * 60 + 59) * 60, 0, 0));
+
+#undef DATETIME_ADD_MACRO
+
+    return _PyStatus_OK();
+
+error:
+    return _PyStatus_NO_MEMORY();
 }


@@ -7369,20 +7424,15 @@ _datetime_exec(PyObject *module)
 {
     int rc = -1;
     datetime_state *st = get_module_state(module);
-    int reloading = 0;

     PyInterpreterState *interp = PyInterpreterState_Get();
-    PyObject *old_module = get_current_module(interp, &reloading);
+    PyObject *old_module = get_current_module(interp);
     if (PyErr_Occurred()) {
         assert(old_module == NULL);
         goto error;
     }
     /* We actually set the "current" module right before a successful return. */

-    if (init_static_types(interp, reloading) < 0) {
-        goto error;
-    }
-
     for (size_t i = 0; i < Py_ARRAY_LENGTH(capi_types); i++) {
         PyTypeObject *type = capi_types[i];
         const char *name = _PyType_Name(type);
@@ -7396,68 +7446,6 @@ _datetime_exec(PyObject *module)
         goto error;
     }

-#define DATETIME_ADD_MACRO(dict, c, value_expr) \
-    do { \
-        assert(!PyErr_Occurred()); \
-        PyObject *value = (value_expr); \
-        if (value == NULL) { \
-            goto error; \
-        } \
-        if (PyDict_SetItemString(dict, c, value) < 0) { \
-            Py_DECREF(value); \
-            goto error; \
-        } \
-        Py_DECREF(value); \
-    } while(0)
-
-    if (!reloading) {
-        /* timedelta values */
-        PyObject *d = _PyType_GetDict(&PyDateTime_DeltaType);
-        DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
-        DATETIME_ADD_MACRO(d, "min", new_delta(-MAX_DELTA_DAYS, 0, 0, 0));
-        DATETIME_ADD_MACRO(d, "max",
-                           new_delta(MAX_DELTA_DAYS, 24*3600-1, 1000000-1, 0));
-
-        /* date values */
-        d = _PyType_GetDict(&PyDateTime_DateType);
-        DATETIME_ADD_MACRO(d, "min", new_date(1, 1, 1));
-        DATETIME_ADD_MACRO(d, "max", new_date(MAXYEAR, 12, 31));
-        DATETIME_ADD_MACRO(d, "resolution", new_delta(1, 0, 0, 0));
-
-        /* time values */
-        d = _PyType_GetDict(&PyDateTime_TimeType);
-        DATETIME_ADD_MACRO(d, "min", new_time(0, 0, 0, 0, Py_None, 0));
-        DATETIME_ADD_MACRO(d, "max", new_time(23, 59, 59, 999999, Py_None, 0));
-        DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
-
-        /* datetime values */
-        d = _PyType_GetDict(&PyDateTime_DateTimeType);
-        DATETIME_ADD_MACRO(d, "min",
-                           new_datetime(1, 1, 1, 0, 0, 0, 0, Py_None, 0));
-        DATETIME_ADD_MACRO(d, "max", new_datetime(MAXYEAR, 12, 31, 23, 59, 59,
-                                                  999999, Py_None, 0));
-        DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
-
-        /* timezone values */
-        d = _PyType_GetDict(&PyDateTime_TimeZoneType);
-        if (PyDict_SetItemString(d, "utc", (PyObject *)&utc_timezone) < 0) {
-            goto error;
-        }
-
-        /* bpo-37642: These attributes are rounded to the nearest minute for backwards
-         * compatibility, even though the constructor will accept a wider range of
-         * values. This may change in the future.*/
-
-        /* -23:59 */
-        DATETIME_ADD_MACRO(d, "min", create_timezone_from_delta(-1, 60, 0, 1));
-
-        /* +23:59 */
-        DATETIME_ADD_MACRO(
-            d, "max", create_timezone_from_delta(0, (23 * 60 + 59) * 60, 0, 0));
-    }
-
-#undef DATETIME_ADD_MACRO
-
     /* Add module level attributes */
     if (PyModule_AddIntMacro(module, MINYEAR) < 0) {
         goto error;
1 change: 1 addition & 0 deletions PCbuild/_freeze_module.vcxproj
@@ -106,6 +106,7 @@
   </ItemGroup>
   <ItemGroup>
     <ClCompile Include="..\Modules\atexitmodule.c" />
+    <ClCompile Include="..\Modules\_datetimemodule.c" />
     <ClCompile Include="..\Modules\faulthandler.c" />
     <ClCompile Include="..\Modules\gcmodule.c" />
     <ClCompile Include="..\Modules\getbuildinfo.c" />
5 changes: 5 additions & 0 deletions Python/pylifecycle.c
@@ -760,6 +760,11 @@ pycore_init_types(PyInterpreterState *interp)
         return status;
     }

+    status = _PyDateTime_InitTypes(interp);
+    if (_PyStatus_EXCEPTION(status)) {
+        return status;
+    }
Comment on lines +763 to +766
Contributor

@neonene, Jul 14, 2025

Suggested change
-    status = _PyDateTime_InitTypes(interp);
-    if (_PyStatus_EXCEPTION(status)) {
-        return status;
-    }
+    if (!_Py_IsMainInterpreter(interp)) {
+        status = _PyDateTime_InitTypes(interp);
+        if (_PyStatus_EXCEPTION(status)) {
+            return status;
+        }
+    }

Reply from #136620 (comment)

Can you run test_concurrent_initialization() with this change? Based on the crashes that come from the change, I said this PR and my example are "almost equivalent." I'm not sure right now what this PR ensures.

Member Author

Hm, what are you trying to achieve here? This will just break the types for the main interpreter.

Contributor

@neonene, Jul 14, 2025

> This will just break the types for the main interpreter.

Note that test_concurrent_initialization() does not load _datetime in the main interpreter at all.

Correction: Run the test's script without running test_datetime.


     return _PyStatus_OK();
 }