Description
Hello everyone,
I recently stumbled across this repo while trying to tune a diffusion model. I'm not sure if this is still relevant, as there seem to have been no commits for the last 2 years, but the environment configured here is very outdated. Updating it is not straightforward, because conda is not good at dependency resolution and the Anaconda repository usually lags behind the mainstream PyPI (accessed via pip) in terms of package versions.
I have tried to solve it myself in different ways and will share my experience here.
Problem
If you just take the environment here out of the box and install it via conda (conda env create -f environment.yaml), you will not be able to load any model from OpenAI. For example, this simple script tries to load an OpenAI CLIP model and its processor.
from transformers import CLIPModel, CLIPProcessor

def main():
    model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
    print("model", model)
    print("processor", processor)

if __name__ == "__main__":
    main()
It reports the error already specified here. The root cause is simply that the transformers package is outdated, so its connection to the model hub is broken.
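To confirm which versions an environment actually ended up with, a minimal sketch like the following can help. It only uses the standard library (importlib.metadata, available from Python 3.8); the helper name installed_versions and the list of packages checked are my own choices for illustration, not something from this repo.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print the interpreter version first, since it caps what can be installed.
    print("python", sys.version.split()[0])
    # Illustrative package list (an assumption, adjust as needed).
    report = installed_versions(["transformers", "torch", "torchvision", "numpy"])
    for name, ver in report.items():
        print(name, ver or "not installed")
```

Running it inside the conda environment before and after an update makes it easy to see whether transformers actually moved.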
Why can't we just put a newer version in environment.yaml?
Well, if it were that simple, there would be no frustration and there would not be a dozen tools for resolving dependencies. As a simple example, transformers depends on torch, and torchvision also depends on torch. torch in turn depends on numpy and on the Python version. With the Python 3.8 configured here, we can at best get versions that are about two years old.
We could resolve all of that manually, but there are better tools for it.
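The chain described above can be inspected directly from the metadata of the installed packages. This is only a rough sketch using the standard library; declared_constraints and mentions are hypothetical helper names, and the requirement strings it prints may carry extras or environment markers depending on how each package declares them.

```python
from importlib.metadata import PackageNotFoundError, metadata, requires


def declared_constraints(package):
    """Return (Requires-Python, requirement strings), or None if not installed."""
    try:
        meta = metadata(package)
        reqs = requires(package) or []
    except PackageNotFoundError:
        return None
    return meta.get("Requires-Python"), reqs


def mentions(reqs, names):
    """Keep only the requirement strings that start with one of the given names."""
    return [r for r in reqs if any(r.lower().startswith(n) for n in names)]


if __name__ == "__main__":
    for package in ("transformers", "torchvision", "torch"):
        info = declared_constraints(package)
        if info is None:
            print(package, "is not installed")
            continue
        requires_python, reqs = info
        print(package, "requires Python", requires_python)
        # Show only the links of the chain discussed here: torch and numpy.
        for req in mentions(reqs, ("torch", "numpy")):
            print("   ", req)
```

Each package only states its own constraints; finding one set of versions that satisfies all of them at once is the job of the resolver.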
Updating packages
I have tried 2 main ways of doing this.
- Using conda, but starting afresh in the hope of getting the newest versions. This means I create a new conda environment and just conda install each of the packages in environment.yaml, hoping to get the newest one possible. Short answer: newer versions are installed, but they are still heavily outdated because of the inter-dependencies between packages. This can be solved by cherry-picking versions, but that is a lot of manual work.
- Converting everything to a more mainstream Python package manager. Here I chose uv (but poetry should work just fine too). This works perfectly: the above script runs again thanks to the updated transformers. The only catch is that, to structure a project using uv or poetry, we can't just throw everything in the home directory. There needs to be some sort of standard Python project structure. An example PR is here.
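As a rough illustration of the second approach, the sketch below pulls the package names out of a conda environment file and drops the version pins, so that the resolver is free to pick current releases. The SAMPLE text is made up for this example and is not the actual environment.yaml of this repo, package_names is a hypothetical helper, and the parsing is deliberately naive (no YAML library). Some conda package names may not exist on PyPI under the same name, so the output would still need a manual review.

```python
import re

# Hypothetical file contents, for illustration only.
SAMPLE = """\
name: example
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.8.5
  - pip=20.3
  - numpy=1.19.2
  - pip:
    - transformers==4.19.2
    - -e .
"""


def package_names(env_text):
    """Collect unpinned package names from the dependencies section."""
    names = []
    in_deps = False
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not raw.startswith((" ", "-")):
            # A top-level key: track whether we are inside "dependencies:".
            in_deps = line.startswith("dependencies:")
            continue
        if not in_deps or not line.startswith("- "):
            continue
        entry = line[2:].strip()
        if entry.endswith(":") or entry.startswith("-"):
            continue  # skip the nested "pip:" key and pip flags such as "-e ."
        name = re.split(r"[=<>!~ ]", entry, maxsplit=1)[0]
        if name not in ("python", "pip"):
            names.append(name)
    return names


if __name__ == "__main__":
    # Prints a command line that could be run inside a uv project.
    print("uv add " + " ".join(package_names(SAMPLE)))
```

The printed command is meant to be run after uv init has created a pyproject.toml; uv then resolves the versions and records them in its lock file.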
Please let me know what you think.