Test set length problem #561

@jettzhan

import numpy as np
from paddlets import TSDataset
from paddlets.models.forecasting import RNNBlockRegressor, MLPRegressor, LSTNetRegressor
import matplotlib.pyplot as plt

data = TSDataset.load_from_csv("demo.csv", target_cols=['Q'], time_col='datetime',
                               fillna_method="pre", fill_missing_dates=True)

model = LSTNetRegressor(in_chunk_len=30, out_chunk_len=3)

train, val_test = data.split(0.8)
val, test = val_test.split(0.5)

model.fit(train_tsdataset=train, valid_tsdataset=val)

predicted_dataset = model.predict(test)
D:\AnacondaEnvs\energy-prediction\python.exe D:\02sourcecode\PycharmProjects\energy-prediction\demo.py 
[2025-08-15 09:53:33,557] [paddlets] [ERROR] ValueError: if time_window[1] (5) <= len(TSDataset.target) - 1 (5), then 32 <= time_window[0] (32) <= time_window[1] (5) <= len(TSDataset.target) - 1 (5) must be True.
Traceback (most recent call last):
  File "D:\02sourcecode\PycharmProjects\energy-prediction\demo.py", line 14, in <module>
    model.fit(train_tsdataset=train, valid_tsdataset=val)
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\models\forecasting\dl\paddle_base_impl.py", line 345, in fit
    train_dataloader, valid_dataloaders = self._init_fit_dataloaders(train_tsdataset, valid_tsdataset)
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\models\forecasting\dl\paddle_base_impl.py", line 225, in _init_fit_dataloaders
    dataset = data_adapter.to_sample_dataset(
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\models\data_adapter.py", line 1507, in to_sample_dataset
    return SampleDataset(
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\models\data_adapter.py", line 275, in __init__
    self._validate_time_window()
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\models\data_adapter.py", line 925, in _validate_time_window
    raise_if_not(
  File "C:\Users\YPRJ\AppData\Roaming\Python\Python39\site-packages\paddlets\logger\logger.py", line 135, in raise_if_not
    raise ValueError(message)
ValueError: if time_window[1] (5) <= len(TSDataset.target) - 1 (5), then 32 <= time_window[0] (32) <= time_window[1] (5) <= len(TSDataset.target) - 1 (5) must be True.

Process finished with exit code 1

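demo.csv contains 61 rows in total. According to the API docs, in_chunk_len is the length of the input (training) window and out_chunk_len is the length of the generated forecast. So my understanding is that the first 30 rows are used to predict the following 3 rows.
However, for training I split the data into training, validation, and test sets using ratios of 0.8 and 0.5. That corresponds to 61 × 0.8, 61 × 0.2 × 0.5, and 61 × 0.2 × 0.5 rows, so the training, validation, and test sets have [48, 6, 6] rows respectively. Why does this raise the error above, and how should these lengths be set?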
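
A rough sanity check of the arithmetic, as a sketch rather than PaddleTS's actual logic. The error message reports `time_window[0]` as 32 and `len(TSDataset.target) - 1` as 5, which suggests each dataset passed to `fit` needs at least `in_chunk_len + out_chunk_len` = 33 rows to form one sample. The helper `min_required_length` is hypothetical (not a PaddleTS function), `skip_chunk_len=0` is assumed, and the floor rounding in the split is an assumption; the real `split` may round differently.

```python
# Hypothetical helper, not part of PaddleTS: the smallest series length that
# can hold one input window plus one output window.
def min_required_length(in_chunk_len, out_chunk_len, skip_chunk_len=0):
    return in_chunk_len + skip_chunk_len + out_chunk_len

total = 61

# Assumed floor rounding for the ratio-based splits.
train_len = int(total * 0.8)
val_len = int((total - train_len) * 0.5)
test_len = total - train_len - val_len

needed = min_required_length(in_chunk_len=30, out_chunk_len=3)

for name, n in [("train", train_len), ("val", val_len), ("test", test_len)]:
    status = "ok" if n >= needed else "too short"
    print(f"{name}: {n} rows, need >= {needed}: {status}")
```

Under these assumptions only the training split is long enough. The validation split of about 6 rows cannot hold a 30-row input window plus a 3-row output window, which matches the failing check in the traceback.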

@jzhang533 @ZeyuChen @yangs16 @kuizhiqing
