Describe the bug
I have two models. tabular_binary_model was trained on the target binay_label, and tabular_multi_cls_model was trained on the target label.
Both label and binay_label are columns in df_test.
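I ran the following code:
from pytorch_tabular import TabularModel
# Binary model: this prediction succeeds
tabular_binary_model = TabularModel.load_model("gandalf_emb_exp_22_3_binary_010")
df_pred = tabular_binary_model.predict(df_test)
# Multi-class model: this prediction raises the ValueError shown below
tabular_multi_cls_model = TabularModel.load_model("gandalf_exp_22_1")
df_multi_pred = tabular_multi_cls_model.predict(df_test)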
The second predict call raised this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[27], line 1
----> 1 df_multi_pred = tabular_multi_cls_model.predict(df_test)
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py:1514, in TabularModel.predict(self, test, quantiles, n_samples, ret_logits, include_input_features, device, progress_bar, test_time_augmentation, num_tta, alpha_tta, aggregate_tta, tta_seed)
1512 handle.remove()
1513 else:
-> 1514 pred_df = self._predict(
1515 test,
1516 quantiles,
1517 n_samples,
1518 ret_logits,
1519 include_input_features,
1520 device,
1521 progress_bar,
1522 )
1523 return pred_df
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py:1372, in TabularModel._predict(self, test, quantiles, n_samples, ret_logits, include_input_features, device, progress_bar)
1370 model = self.model.to(device)
1371 model.eval()
-> 1372 inference_dataloader = self.datamodule.prepare_inference_dataloader(test)
1373 is_probabilistic = hasattr(model.hparams, "_probabilistic") and model.hparams._probabilistic
1375 if progress_bar == "rich":
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:861, in TabularDatamodule.prepare_inference_dataloader(self, df, batch_size, copy_df)
859 if copy_df:
860 df = df.copy()
--> 861 df = self._prepare_inference_data(df)
862 dataset = TabularDataset(
863 task=self.config.task,
864 data=df,
(...)
867 target=(self.target if all(col in df.columns for col in self.target) else None),
868 )
869 return DataLoader(
870 dataset,
871 batch_size or self.batch_size,
(...)
874 **self.config.dataloader_kwargs,
875 )
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:843, in TabularDatamodule._prepare_inference_data(self, df)
841 else:
842 df.loc[:, self.target] = np.zeros((len(df), len(self.target)))
--> 843 df, _ = self.preprocess_data(df, stage="inference")
844 return df
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:463, in TabularDatamodule.preprocess_data(self, data, stage)
461 data = self._normalize_continuous_columns(data, stage)
462 # Converting target labels to a 0 indexed label
--> 463 data = self._label_encode_target(data, stage)
464 # Target Transforms
465 data = self._target_transform(data, stage)
File ~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:404, in TabularDatamodule._label_encode_target(self, data, stage)
402 for i in range(len(self.config.target)):
403 if self.config.target[i] in data.columns:
--> 404 data[self.config.target[i]] = self.label_encoder[i].transform(data[self.config.target[i]])
405 return data
File /usr/local/lib/python3.8/dist-packages/sklearn/preprocessing/_label.py:137, in LabelEncoder.transform(self, y)
134 if _num_samples(y) == 0:
135 return np.array([])
--> 137 return _encode(y, uniques=self.classes_)
File /usr/local/lib/python3.8/dist-packages/sklearn/utils/_encode.py:232, in _encode(values, uniques, check_unknown)
230 diff = _check_unknown(values, uniques)
231 if diff:
--> 232 raise ValueError(f"y contains previously unseen labels: {str(diff)}")
233 return np.searchsorted(uniques, values)
ValueError: y contains previously unseen labels: [1.0]
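The last two frames of the traceback are in scikit-learn's LabelEncoder, which rejects any value it did not see during fit. The snippet below is a minimal sketch of that behavior in isolation. The class values are made up for illustration and are not taken from the saved model, so it shows only the mechanism, not why 1.0 is missing from the multi-class model's encoder.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical training labels: note that 1.0 is deliberately absent.
encoder = LabelEncoder()
encoder.fit(np.array([0.0, 2.0, 3.0]))

# Transforming a column that contains 1.0 fails the same way as in the traceback.
try:
    encoder.transform(np.array([0.0, 1.0]))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If the fitted encoder of the loaded model can be inspected, comparing its classes_ attribute with the unique values (and dtype) of df_test["label"] may help narrow down where the mismatch comes from.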
I have checked that label is not in continuous_cols.
If I drop the label column from df_test before calling predict, it works.
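The workaround described above can be sketched as follows. The helper name without_columns and the toy DataFrame are assumptions made for illustration; only the column names label and binay_label and the predict call come from this report.

```python
import pandas as pd


def without_columns(df, columns):
    """Return a copy of df without the named columns; names not present are ignored."""
    present = [col for col in columns if col in df.columns]
    return df.drop(columns=present)


# Toy stand-in for df_test (the real data is not part of this report).
df_test = pd.DataFrame(
    {
        "feature_a": [0.1, 0.2, 0.3],
        "label": [1.0, 2.0, 0.0],
        "binay_label": [1, 1, 0],
    }
)

# Remove the target column before inference; df_test itself is left unchanged.
df_features = without_columns(df_test, ["label"])
print(list(df_features.columns))

# With the loaded model from the report, the call would then be:
# df_multi_pred = tabular_multi_cls_model.predict(df_features)
```

Because the helper returns a new DataFrame, the original label column stays available in df_test for evaluating the predictions afterwards.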

Desktop (please complete the following information):
- OS: Amazon Linux
- Browser: Chrome