Creating the template:
- Sanitize and/or give guidance on appropriate project names
- Ask the user to create a repo named for the project, and supply the repo URL
- Don't ask for lease name at template creation time.
- Instead, direct people to use the project name as the prefix for the lease name.
- Ask which site to put data buckets at, have a sane default if user doesn't know or care
- Some people may be GPU-agnostic (don't care what type of GPU); they should be able to use both
- Should give some additional guidance in general, to help people make good decisions
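One way the project-name sanitizing could look is sketched below. The rules themselves (lowercase letters, digits, and hyphens only, with a length cap) are assumptions chosen for illustration, as is the helper name `sanitize_project_name`; the actual constraints would depend on what the repo, lease, and bucket naming allow.

```python
import re


def sanitize_project_name(raw: str, max_length: int = 40) -> str:
    """Return a lowercase, hyphen-separated project name.

    The character set and max_length are illustrative assumptions,
    not rules taken from the template itself.
    """
    name = raw.strip().lower()
    # Collapse every run of characters outside [a-z0-9] into one hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", name)
    # Drop leading/trailing hyphens, cap the length, and re-trim.
    name = name.strip("-")[:max_length].rstrip("-")
    if not name:
        raise ValueError(f"project name {raw!r} has no usable characters")
    return name
```

Rejecting an empty result with an error, rather than silently substituting a default, gives the template a natural place to show naming guidance to the user.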
Workflow:
- After creating the template, we should advise the user to put the newly generated project in a GitHub repo
- User is going to run the template generator once. EVERYTHING they will need goes in it then.
- General comment: want to minimize friction by having the user run each step only at the time it is required.
`chi` materials:
- In notebook 0, remove angle brackets around project name
- At the end of notebook 0: remove "next steps: launch GPU instance"
- At the end of notebook 0: add a section that shows how to check on the newly created buckets using the Horizon GUI
- Notebook 1: instead of a single lease name, let the lease name start with the project name; write code to get the active lease name that starts with the project name.
- Notebook 1: didn't substitute project name when mounting buckets
- Notebook 1: instance name should have project name instead of `mltrain`
- Should have notebook 1 equivalent for no GPU, also for VM instance (with NVIDIA GPU and with no GPU)
- Need to be able to mount object store buckets even if they are on different site than compute instance
- Might need to adjust mount settings for data bucket (e.g. cache etc.) so it's not very slow to load data
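The lease lookup for Notebook 1 could be sketched roughly as follows. This is only the selection logic: it assumes the leases have already been fetched as a list of dicts with `name` and `status` keys and that an active lease reports the status `ACTIVE`. Those field names, the status value, and the helper name `find_active_lease` are assumptions, and how the list is obtained is left to the notebook.

```python
from typing import Iterable, Mapping


def find_active_lease(leases: Iterable[Mapping[str, str]], project_name: str) -> str:
    """Return the name of the single active lease whose name starts with project_name.

    Assumes each lease is a mapping with "name" and "status" keys
    (hypothetical representation; adapt to the real lease objects).
    """
    matches = sorted(
        lease["name"]
        for lease in leases
        if lease.get("status", "").upper() == "ACTIVE"
        and lease.get("name", "").startswith(project_name)
    )
    if not matches:
        raise LookupError(f"no active lease starts with {project_name!r}")
    if len(matches) > 1:
        raise LookupError(f"several active leases start with {project_name!r}: {matches}")
    return matches[0]
```

A plain prefix match also accepts leases from a project whose name merely begins with the same characters, so matching on the project name plus a separator may be safer if project names can overlap.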
Post-`chi` materials:
- Need a new notebook after Notebook 1, for setting up env variable, building container images, and starting containers
- If I don't have/need a HF token, it should do something sane
- Specify `HF_HOME` where I mounted the object store bucket for datasets, but must also specify `HF_TOKEN_PATH` in an ephemeral location
- The whole Docker workflow assumes I am in NVIDIA land and using PyTorch
- Don't assume Lightning automatically
- When installing the MLflow Python client in the Dockerfile, match the MLflow server version
- Only need to log in to GitHub when I need to push something
- `mlflow.set_experiment` - use project name - Git context
- in examples, don't put everything in one cell
- should have notebook and src examples for each
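The Hugging Face items above (a sane no-token path, `HF_HOME` on the mounted bucket, `HF_TOKEN_PATH` somewhere ephemeral) could be handled along these lines. This is a minimal sketch: the helper name `hf_environment`, its parameters, and the temporary-directory location are assumptions. The point of always setting `HF_TOKEN_PATH` is that the token would otherwise be stored under `HF_HOME`, which here is the bucket mount.

```python
import os
import tempfile
from typing import Dict, Optional


def hf_environment(
    datasets_mount: str,
    token: Optional[str] = None,
    ephemeral_dir: Optional[str] = None,
) -> Dict[str, str]:
    """Build the Hugging Face environment variables for a container.

    HF_HOME points at the mounted bucket; HF_TOKEN_PATH always points at an
    ephemeral location. The token file is written only if a token is given.
    """
    if ephemeral_dir is None:
        # Hypothetical choice: a fresh temp directory outside the bucket mount.
        ephemeral_dir = tempfile.mkdtemp(prefix="hf-token-")
    token_path = os.path.join(ephemeral_dir, "token")
    env = {"HF_HOME": datasets_mount, "HF_TOKEN_PATH": token_path}
    if token:
        # Create the file with owner-only permissions before writing the token.
        fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(token)
    return env
```

When no token is supplied, nothing is written and the returned variables can still be passed to the container as-is, so users who only need public models and datasets are not prompted for credentials.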