🐛 Bug Description
Does qlib's execution process use caching? Hyperparameters optimized with Optuna do not reproduce the same results when the model is run on its own.
After ruling out environmental differences, multi-threading effects, seed effects, and other factors, I found that when several runs execute in succession in the same environment, the later runs are affected by the earlier ones.
To Reproduce
Steps to reproduce the behavior:
The simplified reproduction process is as follows: run two different models consecutively and print the results of Model 1 and Model 2.
--------------- both.py -----------------
--------------- model1.py -----------------
--------------- model2.py -----------------
Expected Behavior
Running python both.py prints:
----- python both.py --------
model 1 : -0.7482357621192932
model 2 : -0.7483096122741699
Running python model1.py prints:
----- python model1.py --------
model 1 : -0.7482357621192932 # same as both.py
Running python model2.py prints:
----- python model2.py --------
model 2 : -0.7509312033653259 # differs from both.py despite identical parameters
Screenshot
With identical parameters, Model 1 produces the same output in model1.py and in both.py, while Model 2 produces a different output in model2.py than in both.py. The only difference is that in both.py, Model 2 runs immediately after Model 1, which suggests that some cached state may be affecting the results.
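One pattern that can produce exactly this picture without any cache is shared global random state: if a seed is set once at the top of a script, the first run consumes random numbers and the second run starts from a different generator state than it would in its own process. The sketch below illustrates that effect with the standard library only. It is not qlib code; `train_stub`, `run_both`, and `run_model2_alone` are hypothetical stand-ins for the real training scripts, and whether this is what happens in the actual reproduction is an assumption that would need checking.

```python
import random


def train_stub(n_draws):
    """Hypothetical stand-in for a training run: it consumes numbers from
    the global RNG and returns a score that depends on them."""
    return sum(random.random() for _ in range(n_draws))


def run_both(reseed_between):
    """Mimics both.py: seed once, run 'model 1', then run 'model 2'."""
    random.seed(0)
    score1 = train_stub(3)  # "model 1" advances the global RNG state
    if reseed_between:
        random.seed(0)  # restore the state "model 2" would see when run alone
    score2 = train_stub(5)  # "model 2"
    return score1, score2


def run_model2_alone():
    """Mimics model2.py: seed, then run only 'model 2'."""
    random.seed(0)
    return train_stub(5)


alone = run_model2_alone()
_, shared = run_both(reseed_between=False)
_, reseeded = run_both(reseed_between=True)

print("model 2 alone:           ", alone)
print("model 2 after model 1:   ", shared)
print("model 2 after reseeding: ", reseeded)
```

If reseeding every random number generator in use immediately before the second model makes both.py match model2.py, the difference likely comes from leftover generator state rather than from a cache. If the mismatch remains, some other shared state is the more likely explanation.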
Environment
Note: User could run
cd scripts && python collect_info.py all
under the project directory to get system information and paste it here directly.
Linux
x86_64
Linux-5.15.0-112-generic-x86_64-with-glibc2.17
#122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024
Python version: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
Qlib version: 0.9.6
numpy==1.23.5
pandas==1.5.3
scipy==1.10.1
requests==2.31.0
Additional Notes