AttributeError: 'CogVLMForCausalLM' object has no attribute '_extract_past_from_model_output' #218

@XuJianzhi

Description

System Info

(venv) (base) root@job-4043-1747191079-8jjxn:/data/try/CogVLM2/basic_demo# pip freeze
accelerate==1.7.0
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.13
aiosignal==1.3.2
annotated-types==0.7.0
anthropic==0.54.0
anyio==4.9.0
async-timeout==5.0.1
asyncer==0.0.7
attrs==25.3.0
backoff==2.2.1
bidict==0.23.1
bitsandbytes==0.46.0
certifi==2025.6.15
chainlit==2.5.5
charset-normalizer==3.4.2
chevron==0.14.0
click==8.2.1
colorama==0.4.6
dataclasses-json==0.6.7
Deprecated==1.2.18
distro==1.9.0
einops==0.8.1
exceptiongroup==1.3.0
fastapi==0.115.13
filelock==3.18.0
filetype==1.2.0
frozenlist==1.7.0
fsspec==2025.5.1
googleapis-common-protos==1.70.0
grpcio==1.73.0
h11==0.16.0
hf-xet==1.1.4
httpcore==1.0.9
httpx==0.28.1
httpx-sse==0.4.0
huggingface-hub==0.33.0
idna==3.10
importlib_metadata==8.6.1
inflection==0.5.1
Jinja2==3.1.6
jiter==0.10.0
Lazify==0.4.0
literalai==0.1.201
loguru==0.7.3
MarkupSafe==3.0.2
marshmallow==3.26.1
mcp==1.9.4
monotonic==1.6
mpmath==1.3.0
multidict==6.5.0
mypy_extensions==1.1.0
nest-asyncio==1.6.0
networkx==3.4.2
numpy==2.2.6
nvidia-cublas-cu12==12.6.4.1
nvidia-cuda-cupti-cu12==12.6.80
nvidia-cuda-nvrtc-cu12==12.6.77
nvidia-cuda-runtime-cu12==12.6.77
nvidia-cudnn-cu12==9.5.1.17
nvidia-cufft-cu12==11.3.0.4
nvidia-cufile-cu12==1.11.1.6
nvidia-curand-cu12==10.3.7.77
nvidia-cusolver-cu12==11.7.1.2
nvidia-cusparse-cu12==12.5.4.2
nvidia-cusparselt-cu12==0.6.3
nvidia-nccl-cu12==2.26.2
nvidia-nvjitlink-cu12==12.6.85
nvidia-nvtx-cu12==12.6.77
openai==1.88.0
opentelemetry-api==1.31.1
opentelemetry-exporter-otlp==1.31.1
opentelemetry-exporter-otlp-proto-common==1.31.1
opentelemetry-exporter-otlp-proto-grpc==1.31.1
opentelemetry-exporter-otlp-proto-http==1.31.1
opentelemetry-instrumentation==0.52b1
opentelemetry-instrumentation-alephalpha==0.40.11
opentelemetry-instrumentation-anthropic==0.40.11
opentelemetry-instrumentation-bedrock==0.40.11
opentelemetry-instrumentation-chromadb==0.40.11
opentelemetry-instrumentation-cohere==0.40.11
opentelemetry-instrumentation-crewai==0.40.11
opentelemetry-instrumentation-google-generativeai==0.40.11
opentelemetry-instrumentation-groq==0.40.11
opentelemetry-instrumentation-haystack==0.40.11
opentelemetry-instrumentation-lancedb==0.40.11
opentelemetry-instrumentation-langchain==0.40.11
opentelemetry-instrumentation-llamaindex==0.40.11
opentelemetry-instrumentation-logging==0.52b1
opentelemetry-instrumentation-marqo==0.40.11
opentelemetry-instrumentation-mcp==0.40.11
opentelemetry-instrumentation-milvus==0.40.11
opentelemetry-instrumentation-mistralai==0.40.11
opentelemetry-instrumentation-ollama==0.40.11
opentelemetry-instrumentation-openai==0.40.11
opentelemetry-instrumentation-pinecone==0.40.11
opentelemetry-instrumentation-qdrant==0.40.11
opentelemetry-instrumentation-replicate==0.40.11
opentelemetry-instrumentation-requests==0.52b1
opentelemetry-instrumentation-sagemaker==0.40.11
opentelemetry-instrumentation-sqlalchemy==0.52b1
opentelemetry-instrumentation-threading==0.52b1
opentelemetry-instrumentation-together==0.40.11
opentelemetry-instrumentation-transformers==0.40.11
opentelemetry-instrumentation-urllib3==0.52b1
opentelemetry-instrumentation-vertexai==0.40.11
opentelemetry-instrumentation-watsonx==0.40.11
opentelemetry-instrumentation-weaviate==0.40.11
opentelemetry-proto==1.31.1
opentelemetry-sdk==1.31.1
opentelemetry-semantic-conventions==0.52b1
opentelemetry-semantic-conventions-ai==0.4.9
opentelemetry-util-http==0.52b1
packaging==25.0
pillow==11.2.1
posthog==3.25.0
propcache==0.3.2
protobuf==5.29.5
psutil==7.0.0
pydantic==2.11.7
pydantic-settings==2.9.1
pydantic_core==2.33.2
PyJWT==2.10.1
python-dateutil==2.9.0.post0
python-dotenv==1.1.0
python-engineio==4.12.2
python-multipart==0.0.18
python-socketio==5.13.0
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.4
safetensors==0.5.3
simple-websocket==1.1.0
six==1.17.0
sniffio==1.3.1
sse-starlette==2.3.6
starlette==0.41.3
sympy==1.14.0
syncer==2.0.3
tenacity==9.1.2
tiktoken==0.9.0
timm==1.0.15
tokenizers==0.21.1
tomli==2.2.1
torch==2.7.0
torchvision==0.22.0
tqdm==4.67.1
traceloop-sdk==0.40.11
transformers==4.52.4
triton==3.3.0
typing-inspect==0.9.0
typing-inspection==0.4.1
typing_extensions==4.14.0
uptrace==1.31.0
urllib3==2.4.0
uv==0.7.13
uvicorn==0.34.3
watchfiles==0.20.0
wrapt==1.17.2
wsproto==1.2.0
xformers==0.0.30
yarl==1.20.1
zipp==3.23.0

The web page opens, but the following error is thrown:

(venv) (base) root@job-4043-1747191079-8jjxn:/data/try/CogVLM2/basic_demo# chainlit run web_demo.py --host 0.0.0.0 --port 80
Quant = 4
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
/data/try/CogVLM2/basic_demo/venv/lib/python3.10/site-packages/transformers/quantizers/auto.py:222: UserWarning: You passed quantization_config or equivalent parameters to from_pretrained but the model you're loading already has a quantization_config attribute. The quantization_config from the model will be used.
warnings.warn(warning_msg)
2025-06-18 19:24:19 - Your app is available at http://0.0.0.0:80
2025-06-18 19:24:34 - Translated markdown file for zh-CN not found. Defaulting to chainlit.md.
2025-06-18 19:24:55 - Skipping data after last boundary
Exception in thread Thread-1 (generate):
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/data/try/CogVLM2/basic_demo/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/try/CogVLM2/basic_demo/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 2597, in generate
    result = self._sample(
  File "/data/try/CogVLM2/basic_demo/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 3563, in _sample
    model_kwargs = self._update_model_kwargs_for_generation(
  File "/root/.cache/huggingface/modules/transformers_modules/cogvlm2-llama3-chinese-chat-19B-int4/modeling_cogvlm.py", line 710, in _update_model_kwargs_for_generation
    model_kwargs["past_key_values"] = self._extract_past_from_model_output(
  File "/data/try/CogVLM2/basic_demo/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1940, in __getattr__
    raise AttributeError(
AttributeError: 'CogVLMForCausalLM' object has no attribute '_extract_past_from_model_output'

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Reproduction

As the title says: run `chainlit run web_demo.py` with the int4 model (see the log above) and send a request; generation fails with the traceback shown.

Expected behavior

As the title says: generation should complete and return a response instead of raising the AttributeError.
