Implementation of Vision Transformers (ViT) with a token merging mechanism

Ctrl408/ViT-implementations

Vision Transformer with Token Merging using Bipartite Soft Merging

This repository implements a Vision Transformer (ViT) with token merging based on Bipartite Soft Merging, the algorithm the paper "Token Merging: Your ViT But Faster" (https://arxiv.org/abs/2210.09461) calls bipartite soft matching. The goal is to increase ViT throughput by merging similar tokens as they pass through the network, so that later layers process fewer tokens. Training code is included.

Introduction

Features

  • Vision Transformer (ViT) Implementation: Based on the original ViT architecture.
  • Bipartite Soft Merging: Merges similar tokens to reduce the computational load.
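
To make the merging step concrete, here is a minimal NumPy sketch of bipartite soft matching. It is not this repository's code. The function name `bipartite_soft_merge`, the parameter `r` (number of tokens to merge), and the use of raw token features for similarity are assumptions made for illustration; the paper measures similarity on attention keys and also tracks how many original tokens each merged token represents.

```python
import numpy as np

def bipartite_soft_merge(x, r):
    """Merge up to r tokens of x (shape [n, d]); returns shape [n - r, d].

    Hypothetical helper for illustration, not the repository's API.
    """
    x = np.asarray(x, dtype=float)
    a, b = x[0::2], x[1::2]            # alternate tokens into sets A and B
    r = min(r, a.shape[0])             # at most every A token can be merged
    if r <= 0 or b.shape[0] == 0:
        return x.copy()

    # Cosine similarity between every A token and every B token.
    eps = 1e-12
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    scores = a_n @ b_n.T               # shape [len(A), len(B)]

    # Each A token proposes one edge to its most similar B token.
    best_dst = scores.argmax(axis=1)
    best_score = scores.max(axis=1)

    # Keep the r strongest edges; the remaining A tokens stay unmerged.
    order = np.argsort(-best_score)
    merged_src = order[:r]
    kept_src = np.sort(order[r:])      # preserve original order of kept tokens

    # Average each B token with the A tokens merged into it.
    sums = b.copy()
    counts = np.ones(b.shape[0])
    np.add.at(sums, best_dst[merged_src], a[merged_src])
    np.add.at(counts, best_dst[merged_src], 1)
    merged_b = sums / counts[:, None]

    return np.concatenate([a[kept_src], merged_b], axis=0)

tokens = np.random.default_rng(0).normal(size=(8, 4))
reduced = bipartite_soft_merge(tokens, r=2)
print(reduced.shape)
```

Because each call removes `r` tokens, applying the step once in each of `L` transformer blocks leaves roughly `n - r * L` tokens at the output, which is where the throughput gain comes from. The sketch returns unmerged A tokens followed by the B tokens, so it does not preserve the original token order, and a class token would need to be excluded from merging separately.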

Installation

To get started, clone the repository and install the necessary dependencies:

git clone https://github.com/Ctrl408/ViT-implementations.git
cd ViT-implementations