Checked other resources
- This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
- I added a clear and descriptive title that summarizes this issue.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
- I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
- I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
Example Code
from langchain_qdrant import QdrantVectorStore, RetrievalMode, FastEmbedSparse
from langchain_ollama import OllamaEmbeddings
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.cross_encoders import HuggingFaceCrossEncoder
from langchain.retrievers.document_compressors import CrossEncoderReranker

embed_model = OllamaEmbeddings(model="Qwen3")
sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25")

# Any dummy vector store will work
vector_store = QdrantVectorStore.from_existing_collection(
    collection_name='collection',
    embedding=embed_model,
    sparse_embedding=sparse_embeddings,
    path='qdrantDB',
    vector_name='dense',
    sparse_vector_name='bm25',
)

reranker = HuggingFaceCrossEncoder(model_name="Qwen/Qwen3-Reranker-0.6B")
compressor = CrossEncoderReranker(model=reranker, top_n=3)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=vector_store.as_retriever(k=10)
)

context = compression_retriever.invoke('any query text to invoke it')
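For what it's worth, the Qdrant store isn't actually needed to reproduce this; the failure happens entirely inside HuggingFaceCrossEncoder.score. A minimal sketch (assuming the same package versions listed under System Info) that should raise the identical ValueError:

from langchain_community.cross_encoders import HuggingFaceCrossEncoder

reranker = HuggingFaceCrossEncoder(model_name="Qwen/Qwen3-Reranker-0.6B")

# Any batch with more than one (query, document) pair reaches
# GenericForSequenceClassification.forward with batch_size > 1, which raises
# because the Qwen3 model config defines no pad_token_id.
reranker.score([
    ("any query text", "first dummy document"),
    ("any query text", "second dummy document"),
])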
Error Message and Stack Trace (if applicable)
ValueError Traceback (most recent call last)
Cell In[7], line 7
----> 7 context = compression_retriever.invoke('any query text to invoke it')
File virtualenvs\GenAI\Lib\site-packages\langchain_core\retrievers.py:261, in BaseRetriever.invoke(self, input, config, **kwargs)
259 kwargs_ = kwargs if self._expects_other_args else {}
260 if self._new_arg_supported:
--> 261 result = self.get_relevant_documents(
262 input, run_manager=run_manager, **kwargs
263 )
264 else:
265 result = self.get_relevant_documents(input, **kwargs)
File virtualenvs\GenAI\Lib\site-packages\langchain\retrievers\contextual_compression.py:46, in ContextualCompressionRetriever._get_relevant_documents(self, query, run_manager, **kwargs)
40 docs = self.base_retriever.invoke(
41 query,
42 config={"callbacks": run_manager.get_child()},
43 **kwargs,
44 )
45 if docs:
---> 46 compressed_docs = self.base_compressor.compress_documents(
47 docs,
48 query,
49 callbacks=run_manager.get_child(),
50 )
51 return list(compressed_docs)
52 return []
File virtualenvs\GenAI\Lib\site-packages\langchain\retrievers\document_compressors\cross_encoder_rerank.py:45, in CrossEncoderReranker.compress_documents(self, documents, query, callbacks)
28 def compress_documents(
29 self,
30 documents: Sequence[Document],
31 query: str,
32 callbacks: Optional[Callbacks] = None,
33 ) -> Sequence[Document]:
34 """
35 Rerank documents using CrossEncoder.
36
(...) 43 A sequence of compressed documents.
44 """
---> 45 scores = self.model.score([(query, doc.page_content) for doc in documents])
46 docs_with_scores = list(zip(documents, scores))
47 result = sorted(docs_with_scores, key=operator.itemgetter(1), reverse=True)
File virtualenvs\GenAI\Lib\site-packages\langchain_community\cross_encoders\huggingface.py:59, in HuggingFaceCrossEncoder.score(self, text_pairs)
50 def score(self, text_pairs: List[Tuple[str, str]]) -> List[float]:
51 """Compute similarity scores using a HuggingFace transformer model.
52
53 Args:
(...) 57 List of scores, one for each pair.
58 """
---> 59 scores = self.client.predict(text_pairs)
60 # Some models e.g bert-multilingual-passage-reranking-msmarco
61 # gives two score not_relevant and relevant as compare with the query.
62 if len(scores.shape) > 1: # we are going to get the relevant scores
File virtualenvs\GenAI\Lib\site-packages\torch\utils\_contextlib.py:120, in context_decorator.<locals>.decorate_context(*args, **kwargs)
117 @functools.wraps(func)
118 def decorate_context(*args, **kwargs):
119 with ctx_factory():
--> 120 return func(*args, **kwargs)
File virtualenvs\GenAI\Lib\site-packages\sentence_transformers\cross_encoder\util.py:68, in cross_encoder_predict_rank_args_decorator.<locals>.wrapper(self, *args, **kwargs)
63 kwargs.pop(deprecated_arg)
64 logger.warning(
65 f"The CrossEncoder.predict {deprecated_arg}
argument is deprecated and has no effect. It will be removed in a future version."
66 )
---> 68 return func(self, *args, **kwargs)
File virtualenvs\GenAI\Lib\site-packages\sentence_transformers\cross_encoder\CrossEncoder.py:651, in CrossEncoder.predict(self, sentences, batch_size, show_progress_bar, activation_fn, apply_softmax, convert_to_numpy, convert_to_tensor)
644 features = self.tokenizer(
645 batch,
646 padding=True,
647 truncation=True,
648 return_tensors="pt",
649 )
650 features.to(self.model.device)
--> 651 model_predictions = self.model(**features, return_dict=True)
652 logits = self.activation_fn(model_predictions.logits)
654 if apply_softmax and logits.ndim > 1:
File virtualenvs\GenAI\Lib\site-packages\torch\nn\modules\module.py:1773, in Module._wrapped_call_impl(self, *args, **kwargs)
1771 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1772 else:
-> 1773 return self._call_impl(*args, **kwargs)
File virtualenvs\GenAI\Lib\site-packages\torch\nn\modules\module.py:1784, in Module._call_impl(self, *args, **kwargs)
1779 # If we don't have any hooks, we want to skip the rest of the logic in
1780 # this function, and just call forward.
1781 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1782 or _global_backward_pre_hooks or _global_backward_hooks
1783 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1784 return forward_call(*args, **kwargs)
1786 result = None
1787 called_always_called_hooks = set()
File virtualenvs\GenAI\Lib\site-packages\transformers\utils\generic.py:959, in can_return_tuple.<locals>.wrapper(self, *args, **kwargs)
957 if return_dict_passed is not None:
958 return_dict = return_dict_passed
--> 959 output = func(self, *args, **kwargs)
960 if not return_dict and not isinstance(output, tuple):
961 output = output.to_tuple()
File virtualenvs\GenAI\Lib\site-packages\transformers\modeling_layers.py:142, in GenericForSequenceClassification.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, **kwargs)
139 batch_size = inputs_embeds.shape[0]
141 if self.config.pad_token_id is None and batch_size != 1:
--> 142 raise ValueError("Cannot handle batch sizes > 1 if no padding token is defined.")
143 if self.config.pad_token_id is None:
144 last_non_pad_token = -1
ValueError: Cannot handle batch sizes > 1 if no padding token is defined.
Description
I'm trying to use Qwen/Qwen3-Reranker-0.6B to rerank the documents retrieved from the vector store, but invoking the compression retriever fails with the ValueError shown above: "Cannot handle batch sizes > 1 if no padding token is defined."
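From the traceback, the tokenizer call inside CrossEncoder.predict succeeds with padding=True, so a pad token is defined on the tokenizer; the check that raises lives in transformers' GenericForSequenceClassification.forward and looks only at the model config's pad_token_id, which Qwen3-Reranker apparently leaves unset. A possible workaround sketch, assuming reranker.client is the sentence_transformers CrossEncoder from the traceback and exposes .model and .tokenizer:

# Hypothetical workaround: copy the tokenizer's pad token id into the model
# config so the batch-size check in GenericForSequenceClassification.forward
# passes. `reranker` is the HuggingFaceCrossEncoder from the example above.
hf_model = reranker.client.model
hf_tokenizer = reranker.client.tokenizer
if hf_model.config.pad_token_id is None:
    hf_model.config.pad_token_id = hf_tokenizer.pad_token_id

context = compression_retriever.invoke('any query text to invoke it')

If that diagnosis is right, it would be nice for HuggingFaceCrossEncoder to handle models whose config omits pad_token_id, so decoder-style rerankers like Qwen3 work out of the box.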
System Info
System Information
OS: Windows
OS Version: 10.0.26100
Python Version: 3.12.11 (main, Jun 12 2025, 12:44:17) [MSC v.1943 64 bit (AMD64)]
Package Information
langchain_core: 0.3.74
langchain: 0.3.27
langchain_community: 0.3.27
langsmith: 0.4.17
langchain_chroma: 0.2.4
langchain_docling: 1.0.0
langchain_experimental: 0.3.4
langchain_huggingface: 0.3.0
langchain_ollama: 0.3.6
langchain_openai: 0.3.25
langchain_pymupdf4llm: 0.4.1
langchain_qdrant: 0.2.0
langchain_tavily: 0.2.11
langchain_text_splitters: 0.3.9
langchain_unstructured: 0.1.6
langgraph_sdk: 0.2.3
langgraph_supervisor: 0.0.29