# From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection

[ICML 2025] Official code of ABS

Lincan Cai, Jingxuan Kang, Shuang Li, Wenxuan Ma, Binhui Xie, Zhida Qin, Jian Liang

- We propose Attention-Based Selection (ABS) to guide the cropping process, focusing on the main objects in the image and minimizing the risk of cropping background objects.
- We introduce a feature selection strategy that crops on the feature map of the original image, supplementing the cropped images with global information so that the model retains semantic understanding while focusing on local features.
- We propose a soft matching approach that enables targeted matching of text descriptions to different patches. ABS achieves state-of-the-art performance on zero-shot classification and out-of-distribution benchmarks, even outperforming methods that require fine-tuning.
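As a rough illustration of the ideas above, the sketch below shows (a) deriving a crop box from a ViT attention map and (b) soft matching, where each text description's score is a softmax-weighted aggregate over crops rather than a hard assignment. All names, shapes, and hyperparameters here are our own illustrative assumptions, not the actual ABS implementation; see the source code for the real logic.

```python
import numpy as np

def attention_crop_box(attn, patch_size=16, keep_ratio=0.05):
    """Illustrative only: turn a [CLS]-to-patch attention map of shape
    (H, W) into a pixel-space box covering the highest-attention patches."""
    thresh = np.quantile(attn, 1.0 - keep_ratio)
    ys, xs = np.where(attn >= thresh)
    top, left = ys.min() * patch_size, xs.min() * patch_size
    bottom, right = (ys.max() + 1) * patch_size, (xs.max() + 1) * patch_size
    return top, left, bottom, right

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_match(crop_feats, text_feats, temperature=0.07):
    """Illustrative only: weight every crop-text similarity by a softmax
    over crops, so each class attends to the crops that match it best."""
    sims = crop_feats @ text_feats.T               # (n_crops, n_classes)
    weights = softmax(sims / temperature, axis=0)  # attention over crops
    return (weights * sims).sum(axis=0)            # (n_classes,) scores
```

With a 14x14 attention map concentrated on a region, `attention_crop_box` returns the pixel box around that region, and `soft_match` pools crop-text similarities without discarding weaker crops entirely.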
## Environment

Please ensure Python is installed, then configure all required dependencies by running the following command from the project root directory:

```
conda create --name <env> --file requirements.txt
```
## Datasets

To run the experiments, please follow these steps:

- Download the required datasets; please refer to Dataset.md.
- Update the corresponding `data_path` parameter in the configuration files located in the `cfgs` folder to point to your local dataset directory.
- Run the following command:

```
python main.py --dataset_name imagenet
```

where `dataset_name` specifies the dataset to be used.
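For reference, a dataset entry in one of the `cfgs` files might look roughly like the following. This is a hypothetical illustration; the actual keys and layout are defined by the files shipped in the `cfgs` folder, and only the `data_path` value needs to be edited:

```yaml
# hypothetical config fragment -- only data_path needs changing
dataset_name: imagenet
data_path: /path/to/your/imagenet
```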
This project is based on WCA. We thank the authors for making their source code publicly available.