
Error while using Qwen/Qwen3-Reranker-0.6B with Cross Encoder Reranker #32686

@amanchaudhary-95


Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

from langchain_qdrant import QdrantVectorStore, RetrievalMode, FastEmbedSparse
from langchain_ollama import OllamaEmbeddings
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.cross_encoders import HuggingFaceCrossEncoder
from langchain.retrievers.document_compressors import CrossEncoderReranker

embed_model = OllamaEmbeddings(model="Qwen3")
sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25") 


### any dummy vector store will work
vector_store = QdrantVectorStore.from_existing_collection(
    collection_name='collection',
    embedding=embed_model,
    sparse_embedding=sparse_embeddings,
    path='qdrantDB',
    vector_name='dense',
    sparse_vector_name='bm25'
)

# Wrap the cross-encoder as a document compressor on top of the base retriever
reranker = HuggingFaceCrossEncoder(model_name="Qwen/Qwen3-Reranker-0.6B")
compressor = CrossEncoderReranker(model=reranker, top_n=3)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=vector_store.as_retriever(k=10)
)
context = compression_retriever.invoke('any query text to invoke it')
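
The Qdrant setup is not actually needed to hit the failure. As far as I can tell, calling HuggingFaceCrossEncoder.score directly with more than one pair goes down the same self.client.predict path shown in the traceback and raises the same error (the query/document strings below are just placeholders):

from langchain_community.cross_encoders import HuggingFaceCrossEncoder

reranker = HuggingFaceCrossEncoder(model_name="Qwen/Qwen3-Reranker-0.6B")

# Two (query, document) pairs -> batch size > 1 in the underlying model,
# which is the condition that trips the missing-pad-token check.
scores = reranker.score([
    ("what does the reranker do?", "A reranker scores query/document pairs."),
    ("what does the reranker do?", "Unrelated text about something else."),
])
print(scores)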

Error Message and Stack Trace (if applicable)


ValueError Traceback (most recent call last)
Cell In[7], line 7
----> 7 context = compression_retriever.invoke('any query text to invoke it')

File virtualenvs\GenAI\Lib\site-packages\langchain_core\retrievers.py:261, in BaseRetriever.invoke(self, input, config, **kwargs)
259 kwargs_ = kwargs if self._expects_other_args else {}
260 if self._new_arg_supported:
--> 261 result = self.get_relevant_documents(
262 input, run_manager=run_manager, **kwargs
263 )
264 else:
265 result = self.get_relevant_documents(input, **kwargs)

File virtualenvs\GenAI\Lib\site-packages\langchain\retrievers\contextual_compression.py:46, in ContextualCompressionRetriever._get_relevant_documents(self, query, run_manager, **kwargs)
40 docs = self.base_retriever.invoke(
41 query,
42 config={"callbacks": run_manager.get_child()},
43 **kwargs,
44 )
45 if docs:
---> 46 compressed_docs = self.base_compressor.compress_documents(
47 docs,
48 query,
49 callbacks=run_manager.get_child(),
50 )
51 return list(compressed_docs)
52 return []

File virtualenvs\GenAI\Lib\site-packages\langchain\retrievers\document_compressors\cross_encoder_rerank.py:45, in CrossEncoderReranker.compress_documents(self, documents, query, callbacks)
28 def compress_documents(
29 self,
30 documents: Sequence[Document],
31 query: str,
32 callbacks: Optional[Callbacks] = None,
33 ) -> Sequence[Document]:
34 """
35 Rerank documents using CrossEncoder.
36
(...) 43 A sequence of compressed documents.
44 """
---> 45 scores = self.model.score([(query, doc.page_content) for doc in documents])
46 docs_with_scores = list(zip(documents, scores))
47 result = sorted(docs_with_scores, key=operator.itemgetter(1), reverse=True)

File virtualenvs\GenAI\Lib\site-packages\langchain_community\cross_encoders\huggingface.py:59, in HuggingFaceCrossEncoder.score(self, text_pairs)
50 def score(self, text_pairs: List[Tuple[str, str]]) -> List[float]:
51 """Compute similarity scores using a HuggingFace transformer model.
52
53 Args:
(...) 57 List of scores, one for each pair.
58 """
---> 59 scores = self.client.predict(text_pairs)
60 # Some models e.g bert-multilingual-passage-reranking-msmarco
61 # gives two score not_relevant and relevant as compare with the query.
62 if len(scores.shape) > 1: # we are going to get the relevant scores

File virtualenvs\GenAI\Lib\site-packages\torch\utils\_contextlib.py:120, in context_decorator.<locals>.decorate_context(*args, **kwargs)
117 @functools.wraps(func)
118 def decorate_context(*args, **kwargs):
119 with ctx_factory():
--> 120 return func(*args, **kwargs)

File virtualenvs\GenAI\Lib\site-packages\sentence_transformers\cross_encoder\util.py:68, in cross_encoder_predict_rank_args_decorator.<locals>.wrapper(self, *args, **kwargs)
63 kwargs.pop(deprecated_arg)
64 logger.warning(
65 f"The CrossEncoder.predict {deprecated_arg} argument is deprecated and has no effect. It will be removed in a future version."
66 )
---> 68 return func(self, *args, **kwargs)

File virtualenvs\GenAI\Lib\site-packages\sentence_transformers\cross_encoder\CrossEncoder.py:651, in CrossEncoder.predict(self, sentences, batch_size, show_progress_bar, activation_fn, apply_softmax, convert_to_numpy, convert_to_tensor)
644 features = self.tokenizer(
645 batch,
646 padding=True,
647 truncation=True,
648 return_tensors="pt",
649 )
650 features.to(self.model.device)
--> 651 model_predictions = self.model(**features, return_dict=True)
652 logits = self.activation_fn(model_predictions.logits)
654 if apply_softmax and logits.ndim > 1:

File virtualenvs\GenAI\Lib\site-packages\torch\nn\modules\module.py:1773, in Module._wrapped_call_impl(self, *args, **kwargs)
1771 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1772 else:
-> 1773 return self._call_impl(*args, **kwargs)

File virtualenvs\GenAI\Lib\site-packages\torch\nn\modules\module.py:1784, in Module._call_impl(self, *args, **kwargs)
1779 # If we don't have any hooks, we want to skip the rest of the logic in
1780 # this function, and just call forward.
1781 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1782 or _global_backward_pre_hooks or _global_backward_hooks
1783 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1784 return forward_call(*args, **kwargs)
1786 result = None
1787 called_always_called_hooks = set()

File virtualenvs\GenAI\Lib\site-packages\transformers\utils\generic.py:959, in can_return_tuple.<locals>.wrapper(self, *args, **kwargs)
957 if return_dict_passed is not None:
958 return_dict = return_dict_passed
--> 959 output = func(self, *args, **kwargs)
960 if not return_dict and not isinstance(output, tuple):
961 output = output.to_tuple()

File virtualenvs\GenAI\Lib\site-packages\transformers\modeling_layers.py:142, in GenericForSequenceClassification.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, labels, use_cache, **kwargs)
139 batch_size = inputs_embeds.shape[0]
141 if self.config.pad_token_id is None and batch_size != 1:
--> 142 raise ValueError("Cannot handle batch sizes > 1 if no padding token is defined.")
143 if self.config.pad_token_id is None:
144 last_non_pad_token = -1

ValueError: Cannot handle batch sizes > 1 if no padding token is defined.

Description

I'm trying to use Qwen/Qwen3-Reranker-0.6B with HuggingFaceCrossEncoder and CrossEncoderReranker to rerank documents retrieved from a vector store. As soon as the compression retriever passes more than one (query, document) pair to the cross-encoder, the underlying transformers model raises ValueError: Cannot handle batch sizes > 1 if no padding token is defined, because pad_token_id is not set on the model config that HuggingFaceCrossEncoder loads.
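
The check that raises is in transformers' GenericForSequenceClassification.forward, which refuses batches larger than one when self.config.pad_token_id is None. A workaround that gets past the error for me is to set the pad token on the wrapped sentence_transformers CrossEncoder (exposed as .client on HuggingFaceCrossEncoder, per the traceback) before building the compressor. This is only a sketch: the attribute names are the ones visible in the stack trace, falling back to the EOS token as padding is my assumption, and I haven't verified that reranking quality is unaffected.

from langchain_community.cross_encoders import HuggingFaceCrossEncoder

reranker = HuggingFaceCrossEncoder(model_name="Qwen/Qwen3-Reranker-0.6B")

# HuggingFaceCrossEncoder wraps a sentence_transformers CrossEncoder as .client,
# which in turn exposes the HF tokenizer and model (see the traceback above).
ce = reranker.client
if ce.tokenizer.pad_token is None:
    # Assumption: reusing the EOS token as the padding token is acceptable here.
    ce.tokenizer.pad_token = ce.tokenizer.eos_token
if ce.model.config.pad_token_id is None:
    ce.model.config.pad_token_id = ce.tokenizer.pad_token_id

# Build CrossEncoderReranker / ContextualCompressionRetriever with this
# patched `reranker` before invoking it.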

System Info

System Information

OS: Windows
OS Version: 10.0.26100
Python Version: 3.12.11 (main, Jun 12 2025, 12:44:17) [MSC v.1943 64 bit (AMD64)]

Package Information

langchain_core: 0.3.74
langchain: 0.3.27
langchain_community: 0.3.27
langsmith: 0.4.17
langchain_chroma: 0.2.4
langchain_docling: 1.0.0
langchain_experimental: 0.3.4
langchain_huggingface: 0.3.0
langchain_ollama: 0.3.6
langchain_openai: 0.3.25
langchain_pymupdf4llm: 0.4.1
langchain_qdrant: 0.2.0
langchain_tavily: 0.2.11
langchain_text_splitters: 0.3.9
langchain_unstructured: 0.1.6
langgraph_sdk: 0.2.3
langgraph_supervisor: 0.0.29
