ValueError: y contains previously unseen labels #564

Open
@syuu1987

Describe the bug
I have two models: tabular_binary_model was trained on the target binay_label, and tabular_multi_cls_model was trained on the target label.
Both label and binay_label are present as columns in df_test.

I ran the following code:

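from pytorch_tabular import TabularModel

# Binary model (target: binay_label)
tabular_binary_model = TabularModel.load_model("gandalf_emb_exp_22_3_binary_010")
df_pred = tabular_binary_model.predict(df_test)
# Multi-class model (target: label); this predict call raises the ValueError
tabular_multi_cls_model = TabularModel.load_model("gandalf_exp_22_1")
df_multi_pred = tabular_multi_cls_model.predict(df_test)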

I got the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[27], line 1
----> 1 df_multi_pred = tabular_multi_cls_model.predict(df_test)

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py:1514](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py#line=1513), in TabularModel.predict(self, test, quantiles, n_samples, ret_logits, include_input_features, device, progress_bar, test_time_augmentation, num_tta, alpha_tta, aggregate_tta, tta_seed)
   1512     handle.remove()
   1513 else:
-> 1514     pred_df = self._predict(
   1515         test,
   1516         quantiles,
   1517         n_samples,
   1518         ret_logits,
   1519         include_input_features,
   1520         device,
   1521         progress_bar,
   1522     )
   1523 return pred_df

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py:1372](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_model.py#line=1371), in TabularModel._predict(self, test, quantiles, n_samples, ret_logits, include_input_features, device, progress_bar)
   1370         model = self.model.to(device)
   1371 model.eval()
-> 1372 inference_dataloader = self.datamodule.prepare_inference_dataloader(test)
   1373 is_probabilistic = hasattr(model.hparams, "_probabilistic") and model.hparams._probabilistic
   1375 if progress_bar == "rich":

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:861](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py#line=860), in TabularDatamodule.prepare_inference_dataloader(self, df, batch_size, copy_df)
    859 if copy_df:
    860     df = df.copy()
--> 861 df = self._prepare_inference_data(df)
    862 dataset = TabularDataset(
    863     task=self.config.task,
    864     data=df,
   (...)
    867     target=(self.target if all(col in df.columns for col in self.target) else None),
    868 )
    869 return DataLoader(
    870     dataset,
    871     batch_size or self.batch_size,
   (...)
    874     **self.config.dataloader_kwargs,
    875 )

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:843](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py#line=842), in TabularDatamodule._prepare_inference_data(self, df)
    841     else:
    842         df.loc[:, self.target] = np.zeros((len(df), len(self.target)))
--> 843 df, _ = self.preprocess_data(df, stage="inference")
    844 return df

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:463](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py#line=462), in TabularDatamodule.preprocess_data(self, data, stage)
    461     data = self._normalize_continuous_columns(data, stage)
    462 # Converting target labels to a 0 indexed label
--> 463 data = self._label_encode_target(data, stage)
    464 # Target Transforms
    465 data = self._target_transform(data, stage)

File [~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py:404](http://10.253.0.240:8883/lab/tree/dl_demo/~/.local/lib/python3.8/site-packages/pytorch_tabular/tabular_datamodule.py#line=403), in TabularDatamodule._label_encode_target(self, data, stage)
    402     for i in range(len(self.config.target)):
    403         if self.config.target[i] in data.columns:
--> 404             data[self.config.target[i]] = self.label_encoder[i].transform(data[self.config.target[i]])
    405 return data

File [/usr/local/lib/python3.8/dist-packages/sklearn/preprocessing/_label.py:137](http://10.253.0.240:8883/usr/local/lib/python3.8/dist-packages/sklearn/preprocessing/_label.py#line=136), in LabelEncoder.transform(self, y)
    134 if _num_samples(y) == 0:
    135     return np.array([])
--> 137 return _encode(y, uniques=self.classes_)

File [/usr/local/lib/python3.8/dist-packages/sklearn/utils/_encode.py:232](http://10.253.0.240:8883/usr/local/lib/python3.8/dist-packages/sklearn/utils/_encode.py#line=231), in _encode(values, uniques, check_unknown)
    230     diff = _check_unknown(values, uniques)
    231     if diff:
--> 232         raise ValueError(f"y contains previously unseen labels: {str(diff)}")
    233 return np.searchsorted(uniques, values)

ValueError: y contains previously unseen labels: [1.0]
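
The last two frames of the traceback are plain scikit-learn, so the failure can be reproduced without pytorch_tabular. The snippet below is a minimal sketch, assuming the fitted encoder's classes do not include a value that appears in the test column. The class values here are made up for illustration; the real classes stored in the saved model are not shown in this issue.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical training labels (assumption: the real fitted classes are unknown).
train_labels = np.array([0.0, 2.0, 3.0])
# Hypothetical test labels containing a value the encoder never saw.
test_labels = np.array([0.0, 1.0, 2.0])

encoder = LabelEncoder().fit(train_labels)

# Values present in the test column but missing from the fitted classes.
unseen = sorted(set(test_labels.tolist()) - set(encoder.classes_.tolist()))
print("unseen values:", unseen)

# transform() raises ValueError when it meets a value absent at fit time.
try:
    encoder.transform(test_labels)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Comparing the unique values of the target column in df_test against the classes the encoder was fit on, as the `unseen` line does, is one way to find out which values trigger the error.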

I have checked that label is not in continuous_cols.
If I drop the label column from df_test, prediction works.
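
The workaround described above can be sketched as follows. This is only an illustration: `drop_target_columns` is a hypothetical helper, not part of pytorch_tabular, and the small DataFrame (including the `feature_a` column) is made up to stand in for df_test.

```python
import pandas as pd


def drop_target_columns(df, target_cols):
    """Return a copy of df without the given target columns.

    Hypothetical helper; columns that are not present are ignored.
    """
    present = [col for col in target_cols if col in df.columns]
    return df.drop(columns=present)


# Stand-in for df_test (assumption: the real data is not shown in the issue).
df_test = pd.DataFrame(
    {
        "feature_a": [0.1, 0.2, 0.3],
        "label": [1.0, 2.0, 3.0],
        "binay_label": [0, 1, 0],
    }
)

# Remove the target columns so no label encoding is attempted at inference.
df_features = drop_target_columns(df_test, ["label", "binay_label"])
print(list(df_features.columns))

# With the loaded model from the issue, prediction would then be called as:
# df_multi_pred = tabular_multi_cls_model.predict(df_features)
```

Because `DataFrame.drop` returns a new object, df_test keeps its target columns and can still be used for evaluation afterwards.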

Desktop (please complete the following information):

  • OS: Amazon Linux
  • Browser chrome


Labels: bug (Something isn't working)
