I ran TPOT on a dataset today and it hard-crashed at roughly 30% completion (code and full log below). Any tips on how to avoid this sort of thing? Worst case, should I just increase verbosity so I can see the best models as they print out?
import numpy as np
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# X and y are the feature matrix and target, loaded earlier (not shown)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

xgb_config = {
    'xgboost.XGBRegressor': {
        'n_estimators': range(200, 2500, 200),  # 200 to 2400 trees in steps of 200
        'learning_rate': np.linspace(0.005, 0.2, 10),  # Learning rate from 0.005 to 0.2
        'max_depth': range(3, 12, 1),  # Depth from 3 to 11
        'min_child_weight': range(1, 20, 2),  # Minimum sum of weights in a leaf (1, 3, ..., 19)
        'subsample': np.linspace(0.5, 1.0, 6),  # Subsampling between 50% and 100%
        'colsample_bytree': np.linspace(0.5, 1.0, 6),  # Column sampling
        'gamma': np.linspace(0, 0.5, 6),  # Gamma for pruning
        'reg_alpha': np.logspace(-4, 1, 6),  # L1 regularization
        'reg_lambda': np.logspace(-4, 1, 6),  # L2 regularization
        'device': ['cuda'],
        'tree_method': ['hist'],
        'n_jobs': [10]  # 10 threads per XGBoost model
    }
}

# Initialize TPOT with expanded XGBoost hyperparameter search
tpot = TPOTRegressor(
    generations=20,  # Number of generations (increase for better tuning)
    population_size=50,  # Population size (larger values improve exploration)
    verbosity=2,
    config_dict=xgb_config,  # Use only XGBoost with hyperparameter tuning
    n_jobs=2  # 2 pipelines evaluated in parallel
)
tpot.fit(X_train[:5000], y_train[:5000])
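For a sense of scale, here is a rough sketch of how many hyperparameter combinations the config above spans. It uses only the standard library; the `grid_sizes` name is made up for illustration, and the counts for the `np.linspace`/`np.logspace` entries are simply their `num` arguments.

```python
from math import prod

# Number of candidate values per tuned hyperparameter in xgb_config.
# range() excludes its stop value, so n_estimators tops out at 2400
# and max_depth at 11.
grid_sizes = {
    "n_estimators": len(range(200, 2500, 200)),
    "learning_rate": 10,   # np.linspace(0.005, 0.2, 10)
    "max_depth": len(range(3, 12, 1)),
    "min_child_weight": len(range(1, 20, 2)),
    "subsample": 6,        # np.linspace(0.5, 1.0, 6)
    "colsample_bytree": 6,
    "gamma": 6,
    "reg_alpha": 6,        # np.logspace(-4, 1, 6)
    "reg_lambda": 6,
}

# device, tree_method and n_jobs each have a single value, so they do not
# change the product.
total_combinations = prod(grid_sizes.values())
print(grid_sizes)
print(f"{total_combinations:,} possible XGBRegressor settings")
```

That comes to roughly 84 million combinations, so any single run only samples a tiny slice of the grid.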
>>> tpot.fit(X_train[:5000], y_train[:5000])
is_classifier
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1230: FutureWarning: passing a class to None is deprecated and will be removed in 1.8. Use an instance of the class instead.
warnings.warn(
is_regressor
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1270: FutureWarning: passing a class to None is deprecated and will be removed in 1.8. Use an instance of the class instead.
warnings.warn(
Optimization Progress: 0%| | 0/1050 [00:00<?, ?pipeline/s]
Optimization Progress: 6%|_____________ | 61/1050 [58:18<17:23:28, 63.30s/pipeline]
Optimization Progress: 7%|______________ | 69/1050 [1:04:36<15:35:33, 57.22s/pipeline]
Generation 1 - Current best internal CV score: -0.2588038376471536
Generation 2 - Current best internal CV score: -0.2588038376471536
Generation 3 - Current best internal CV score: -0.2583334021699287
Generation 4 - Current best internal CV score: -0.2583334021699287
Generation 5 - Current best internal CV score: -0.25807251507673135
Generation 6 - Current best internal CV score: -0.25589519236631997
Optimization Progress: 34%|_______________________________________________________________________ | 353/1050 [5:37:53<11:20:12, 58.55s/pipeline]
Exception ignored on calling ctypes callback function: <bound method DataIter._next_wrapper of <xgboost.data.SingleBatchInternalIter object at 0x76285c795a50>>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/xgboost/core.py", line 582, in _next_wrapper
def _next_wrapper(self, this: None) -> int: # pylint: disable=unused-argument
stopit.utils.TimeoutException:
corrupted size vs. prev_size
Aborted (core dumped)
root@d80360eab95d:/workspace/api#
root@d80360eab95d:/workspace/api# /usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/resource_tracker.py:314: UserWarning: resource_tracker: There appear to be 8 leaked semlock objects to clean up at shutdown
warnings.warn(
/usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/resource_tracker.py:314: UserWarning: resource_tracker: There appear to be 44 leaked folder objects to clean up at shutdown
warnings.warn(
root@d80360eab95d:/workspace/api#
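For reference, a small sketch of where the 1050 in the progress bar comes from and how far the run got before the abort. It assumes `offspring_size` was left unset and therefore equals `population_size`, which is TPOT's documented default; the variable names are just for illustration.

```python
# Settings passed to TPOTRegressor in the snippet above.
generations = 20
population_size = 50
# Assumption: offspring_size was not set, so it defaults to population_size.
offspring_size = population_size

# TPOT evaluates population_size + generations * offspring_size pipelines in total.
total_evaluations = population_size + generations * offspring_size

# Last count shown in the progress bar before "Aborted (core dumped)".
completed = 353
fraction_done = completed / total_evaluations

print(f"total pipelines: {total_evaluations}")
print(f"completed before crash: {completed} ({fraction_done:.1%})")
```

So the crash landed about a third of the way through, after six completed generations.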