From fd14cb506e5b9b5aa691785a6ddac3cacfaa7cc6 Mon Sep 17 00:00:00 2001 From: Sylvain Lesage Date: Tue, 17 Sep 2024 15:09:06 +0200 Subject: [PATCH] update developer guide --- DEVELOPER_GUIDE.md | 328 +++++++++++++++++++++++---------------------- 1 file changed, 170 insertions(+), 158 deletions(-) diff --git a/DEVELOPER_GUIDE.md b/DEVELOPER_GUIDE.md index dcfd1e964..561c0d9ed 100644 --- a/DEVELOPER_GUIDE.md +++ b/DEVELOPER_GUIDE.md @@ -2,7 +2,163 @@ This document is intended for developers who want to install, test or contribute to the code. -## Install +## Set up development environment + +### Linux + +Install [rust](https://www.rust-lang.org/tools/install): + +```bash +$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +$ source $HOME/.cargo/env +``` + +Install [pyenv](https://github.com/pyenv/pyenv): + +```bash +$ curl https://pyenv.run | bash +``` + +Install Python 3.9.18: + +```bash +$ pyenv install 3.9.18 +``` + +Check that the expected local version of Python is used: + +```bash +$ cd services/worker +$ python --version +Python 3.9.18 +``` + +Install Poetry with [pipx](https://pipx.pypa.io/stable/installation/): + +- Either a single version: +```bash +pipx install poetry==1.8.2 +poetry --version +``` +- Or a parallel version (with a unique suffix): +```bash +pipx install poetry==1.8.2 --suffix=@1.8.2 +poetry@1.8.2 --version +``` + +Set the Python version to use with Poetry: + +```bash +poetry env use 3.9.18 +``` +or +```bash +poetry@1.8.2 env use 3.9.18 +``` + +Install the dependencies: + +```bash +make install +``` + +### Mac OS + +To install the [worker](./services/worker) on Mac OS, you can follow the next steps. 
+ +#### First: as an administrator + +Install brew: + +```bash +$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" +``` + +#### Then: as a normal user + +Install pyenv: + +```bash +$ curl https://pyenv.run | bash +``` + +Append the following lines to ~/.zshrc: + +```bash +export PYENV_ROOT="$HOME/.pyenv" +command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH" +eval "$(pyenv init -)" +``` + +Log out and log in again. + +Install Python 3.9.18: + +```bash +$ pyenv install 3.9.18 +``` + +Check that the expected local version of Python is used: + +```bash +$ cd services/worker +$ python --version +Python 3.9.18 +``` + +Install Poetry with [pipx](https://pipx.pypa.io/stable/installation/): + +- Either a single version: +```bash +pipx install poetry==1.8.2 +poetry --version +``` +- Or a parallel version (with a unique suffix): +```bash +pipx install poetry==1.8.2 --suffix=@1.8.2 +poetry@1.8.2 --version +``` + +Append the following lines to ~/.zshrc: + +```bash +export PATH="$HOME/.local/bin:$PATH" +``` + +Install Rust: + +```bash +$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +$ source $HOME/.cargo/env +``` + +Set the Python version to use with Poetry: + +```bash +poetry env use 3.9.18 +``` +or +```bash +poetry@1.8.2 env use 3.9.18 +``` + +Avoid an issue with Apache Beam (https://github.com/python-poetry/poetry/issues/4888#issuecomment-1208408509): + +```bash +poetry config experimental.new-installer false +``` +or +```bash +poetry@1.8.2 config experimental.new-installer false +``` + +Install the dependencies: + +```bash +make install +``` + +## Install dataset-viewer To start working on the project: @@ -11,6 +167,12 @@ git clone git@github.com:huggingface/dataset-viewer.git cd dataset-viewer ``` +Install all the packages: + +```bash +make install +``` + Install docker (see https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository and 
https://docs.docker.com/engine/install/linux-postinstall/) Run the project locally: @@ -19,6 +181,8 @@ Run the project locally: make start ``` +Once the docker containers have started, open http://localhost:8100/healthcheck in a browser: it should show `ok`. + Run the project in development mode: ```bash @@ -28,7 +192,7 @@ make dev-start In development mode, you don't need to rebuild the docker images to apply a change in a worker. You can just restart the worker's docker container and it will apply your changes. -To install a single job (in [jobs](./jobs)), library (in [libs](./libs)) or service (in [services](./services)), go to their respective directory, and install Python 3.9 (consider [pyenv](https://github.com/pyenv/pyenv)) and [poetry](https://python-poetry.org/docs/master/#installation) (don't forget to add `poetry` to the `PATH` environment variable). +To install a single job (in [jobs](./jobs)), library (in [libs](./libs)) or service (in [services](./services)), go to their respective directory, and install Python 3.9 (consider [pyenv](https://github.com/pyenv/pyenv)) and [poetry](https://python-poetry.org/docs/main/#installation) (don't forget to add `poetry` to the `PATH` environment variable). 
If you use pyenv: @@ -101,8 +265,8 @@ The following environments contain all the modules: reverse proxy, API server, a | Environment | URL | Type | How to deploy | | ----------- | ---------------------------------------------------- | ----------------- | --------------------------------------- | -| Production | https://datasets-server.huggingface.co | Helm / Kubernetes | `make upgrade-prod` in [chart](./chart) | -| Development | https://datasets-server.us.dev.moon.huggingface.tech | Helm / Kubernetes | `make upgrade-dev` in [chart](./chart) | +| Production | https://datasets-server.huggingface.co | Helm / Kubernetes | Argo CD | +| Development | https://datasets-server.us.dev.moon.huggingface.tech | Helm / Kubernetes | Argo CD | | Local build | http://localhost:8100 | Docker compose | `make start` (builds docker images) | ## Jobs queue @@ -143,11 +307,9 @@ To launch the end to end tests: make e2e ``` -## Poetry - -### Versions +## Versions -If service is updated, we don't update its version in the `pyproject.yaml` file. But we have to update the [helm chart](./chart/) with the new image tag, corresponding to the last build docker published on docker.io by the CI. +We don't use the package versions (in `pyproject.toml` files), so there is no need to update them. 
## Pull requests @@ -170,153 +332,3 @@ DOCKERHUB_USERNAME=xxx DOCKERHUB_PASSWORD=xxx GITHUB_TOKEN=xxx ``` - -## Set up development environment - -### Linux - -Install pyenv: - -```bash -$ curl https://pyenv.run | bash -``` - -Install Python 3.9.18: - -```bash -$ pyenv install 3.9.18 -``` - -Check that the expected local version of Python is used: - -```bash -$ cd services/worker -$ python --version -Python 3.9.18 -``` - -Install Poetry with [pipx](https://pipx.pypa.io/stable/installation/): - -- Either a single version: -```bash -pipx install poetry==1.8.2 -poetry --version -``` -- Or a parallel version (with a unique suffix): -```bash -pipx install poetry==1.8.2 --suffix=@1.8.2 -poetry@1.8.2 --version -``` - -Set the Python version to use with Poetry: - -```bash -poetry env use 3.9.18 -``` -or -```bash -poetry@1.8.2 env use 3.9.18 -``` - -Install the dependencies: - -```bash -make install -``` - - -### Mac OS - -To install the [worker](./services/worker) on Mac OS, you can follow the next steps. - -#### First: as an administrator - -Install brew: - -```bash -$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" -``` - -#### Then: as a normal user - -Install pyenv: - -```bash -$ curl https://pyenv.run | bash -``` - -append the following lines to ~/.zshrc: - -```bash -export PYENV_ROOT="$HOME/.pyenv" -command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH" -eval "$(pyenv init -)" -``` - -Logout and login again. 
- -Install Python 3.9.18: - -```bash -$ pyenv install 3.9.18 -``` - -Check that the expected local version of Python is used: - -```bash -$ cd services/worker -$ python --version -Python 3.9.18 -``` - -Install Poetry with [pipx](https://pipx.pypa.io/stable/installation/): - -- Either a single version: -```bash -pipx install poetry==1.8.2 -poetry --version -``` -- Or a parallel version (with a unique suffix): -```bash -pipx install poetry==1.8.2 --suffix=@1.8.2 -poetry@1.8.2 --version -``` - -append the following lines to ~/.zshrc: - -```bash -export PATH="/Users/slesage2/.local/bin:$PATH" -``` - -Install rust: - -```bash -$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -$ source $HOME/.cargo/env -``` - -Set the python version to use with poetry: - -```bash -poetry env use 3.9.18 -``` -or -```bash -poetry@1.8.2 env use 3.9.18 -``` - -Avoid an issue with Apache beam (https://github.com/python-poetry/poetry/issues/4888#issuecomment-1208408509): - -```bash -poetry config experimental.new-installer false -``` -or -```bash -poetry@1.8.2 config experimental.new-installer false -``` - -Install the dependencies: - -```bash -make install -```