This repository contains the code and models for the following papers:
Baijiong Lin, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, and Ying-Cong Chen. MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders. In European Conference on Computer Vision, 2024.
Baijiong Lin, Weisen Jiang, Pengguang Chen, Shu Liu, and Ying-Cong Chen. MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
- PyTorch 2.0.0
- timm 0.9.16
- mmsegmentation 1.2.2
- mamba-ssm 1.1.2
- CUDA 11.8
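To confirm that an environment matches the versions listed above, a small check along the following lines may help. It is only a sketch: the distribution names (`torch`, `timm`, `mmsegmentation`, `mamba-ssm`) are assumptions about how each package is published, the helper names are hypothetical, and the CUDA version is not checked.

```python
from importlib import metadata

# Versions listed in the requirements above. The distribution names on the
# left are assumptions about how each package is named when installed.
REQUIRED = {
    "torch": "2.0.0",
    "timm": "0.9.16",
    "mmsegmentation": "1.2.2",
    "mamba-ssm": "1.1.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(required=REQUIRED, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, expected in required.items():
        found = lookup(name)
        # A local build tag such as "2.0.0+cu118" still counts as a match.
        matches = found is not None and found.split("+")[0] == expected
        report[name] = (expected, found, matches)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_requirements().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:15s} expected {expected:8s} found {found}  [{status}]")
```

A mismatch does not necessarily mean the code will fail; the list above only records the versions named in this README.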
- Prepare the pretrained Swin-Large checkpoint by running the following commands:
cd pretrained_ckpts
bash run.sh
cd ../
- Prepare the data.
  - PASCAL-Context and NYUD-v2: download PASCALContext.tar.gz and NYUDv2.tar.gz, then extract them.
  - Cityscapes: sign up on the official Cityscapes website and download the following files: leftImg8bit_trainvaltest.zip, gtFine_trainvaltest.zip, and disparity_trainvaltest.zip.
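The downloaded archives can be unpacked by hand or with a short script. The sketch below assumes that all archives sit in one download folder and should be extracted into a single data directory; the function name and both paths are hypothetical, and the directory layout that the data loaders expect is not specified here.

```python
import shutil
from pathlib import Path


def unpack_all(download_dir, data_root):
    """Unpack every .tar.gz and .zip archive in download_dir into data_root."""
    download_dir, data_root = Path(download_dir), Path(data_root)
    data_root.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for archive in sorted(download_dir.iterdir()):
        if archive.name.endswith((".tar.gz", ".zip")):
            # shutil infers the archive format from the file extension.
            shutil.unpack_archive(str(archive), str(data_root))
            unpacked.append(archive.name)
    return unpacked
```

For example, `unpack_all("/path/to/downloads", "/path/to/data")` would extract the archives and return the names of the files it processed.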
  You need to set the `db_root` variable in `configs/mypath.py` to your dataset directory.
- Train the model. Taking NYUD-v2 as an example, you can run the following command:
python -m torch.distributed.launch --nproc_per_node 8 main.py --run_mode train --config_exp ./configs/mtmamba_nyud.yml
You can download our trained models:

| NYUD-v2 | PASCAL-Context | Cityscapes |
|---|---|---|
| mtmamba_nyud.pth.tar | mtmamba_pascal.pth.tar | mtmamba_cityscapes.pth.tar |
| mtmamba_plus_nyud.pth.tar | mtmamba_plus_pascal.pth.tar | mtmamba_plus_cityscapes.pth.tar |
- Evaluation. You can run the following command:
python -m torch.distributed.launch --nproc_per_node 1 main.py --run_mode infer --config_exp ./configs/mtmamba_nyud.yml --trained_model ./ckpts/mtmamba_nyud.pth.tar
The boundary detection task is evaluated with an external MATLAB-based codebase.
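The training and evaluation commands in the steps above differ only in the run mode, the number of processes, and the optional checkpoint path. As a convenience, they can be assembled from Python as sketched below. The helper names `build_command` and `run` are hypothetical and not part of this repository; only the flags that appear in the commands above are used, and the sketch uses the current interpreter instead of a bare `python`.

```python
import subprocess
import sys


def build_command(run_mode, config_exp, nproc_per_node=1, trained_model=None):
    """Assemble the launcher command from the steps above as an argv list."""
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node", str(nproc_per_node),
        "main.py",
        "--run_mode", run_mode,
        "--config_exp", config_exp,
    ]
    if trained_model is not None:
        # Only the evaluation command above passes a checkpoint path.
        cmd += ["--trained_model", trained_model]
    return cmd


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: prints the two commands without launching anything.
    run(build_command("train", "./configs/mtmamba_nyud.yml", nproc_per_node=8))
    run(build_command("infer", "./configs/mtmamba_nyud.yml",
                      trained_model="./ckpts/mtmamba_nyud.pth.tar"))
```

Calling `run(..., dry_run=False)` from the repository root would launch the job. Whether a process count other than the ones shown above works without further changes to the configuration is not stated here.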
We would like to thank the authors who released the following public repositories: Multi-Task-Transformer, mamba, and VMamba.
If you find this code or work useful in your own research, please cite the following:
@inproceedings{lin2024mtmamba,
title={{MTMamba}: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders},
author={Lin, Baijiong and Jiang, Weisen and Chen, Pengguang and Zhang, Yu and Liu, Shu and Chen, Ying-Cong},
booktitle={European Conference on Computer Vision},
year={2024}
}
@article{lin2025mtmambaplus,
title={{MTMamba++}: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders},
author={Lin, Baijiong and Jiang, Weisen and Chen, Pengguang and Liu, Shu and Chen, Ying-Cong},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2025}
}