This application allows you to upload files into a MinIO storage bucket and push files from MinIO storage directly to Colab. Furthermore, you can remotely execute scripts on Colab and dynamically download script output files back to MinIO storage in synchronization mode (an rsync-like approach).
- Make sure that you have the latest versions of `python` and `pip` installed on your computer. You also have to install Docker and Docker Compose.
- By default, this project uses poetry for dependency and virtual environment management. Make sure to install it too.
- Make sure to provide all required environment variables (via a `.env` file, the `export` command, secrets, etc.) before running the application.
- For managing pre-commit hooks this project uses pre-commit.
- For import sorting this project uses isort.
- For code format checking this project uses black.
- For type checking this project uses mypy.
- For creating commits and linting commit messages this project uses commitizen. Run `make commit` to use commitizen during commits.
- There is a special `build_dev` stage in the Dockerfile to build the dev version of the application image.
- This project uses GitHub Actions to run all checks and unit tests on push to the remote repository.
- There are lots of useful commands in the Makefile included in this project's repo. Use the `make <some_command>` syntax to run each of them. If your system doesn't support `make` commands, you may copy commands from the Makefile directly into the terminal.
- To install all required dependencies and set up a virtual environment, run `poetry install` in the cloned repository directory. You can also install project dependencies using `pip install -r requirements.txt`.
- To configure pre-commit hooks for code linting, code format checking and commit message linting, run `poetry run pre-commit install` in the cloned directory.
- Build the app image using `make build`. To run a reloadable application locally, use `make build_dev` to build the image in the development environment.
- Run Docker containers using `make up`. Note: this will also create and attach a persistent named volume `logs` for the Docker container. The container will use this volume to store the application's `app.log` file.
- Stop and remove Docker containers using `make down`. If you also want to remove the log volume, use `make down_volume`.
- By default, the application will be accessible at http://localhost:8080 and the MinIO storage console at http://localhost:9001. You can try all endpoints with the Swagger documentation at http://localhost:8080/docs.
- Use the `/files/upload_minio` resource to upload files to MinIO storage (see the sketch below). You should provide the MinIO bucket name in a header. The request uses multipart/form-data to upload one or multiple files to MinIO storage. You should also specify a key prefix that will be added to all uploaded files (in a directory-like way), e.g. `files/main/`. If there are existing files with the same prefix, they will be removed from storage.
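  A minimal sketch of calling this endpoint with Python's `requests`. The header name (`bucket-name`) and the parameter/field names here are assumptions, not the confirmed contract; check the Swagger docs or `documentation/openapi.yaml` for the exact names:

  ```python
  import requests

  # Hypothetical header and field names -- verify against /docs or openapi.yaml.
  response = requests.post(
      "http://localhost:8080/files/upload_minio",
      headers={"bucket-name": "my-bucket"},        # target MinIO bucket
      params={"keys_prefix": "files/main/"},       # directory-like prefix for all files
      files=[                                      # multipart/form-data payload
          ("files", open("test_script.py", "rb")),
          ("files", open("input_data.csv", "rb")),
      ],
  )
  print(response.status_code, response.json())
  ```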
- Copy the script from `documentation/colab_ssh_config_script.ipynb` to your Colab account. Register at ngrok. Run this cell on Colab and provide your ngrok auth token at the prompt. A tunnel will be created, with connection credentials in the `/content/ssh_config/credentials` file (a rough sketch of the tunnel step follows below).
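  For orientation only, the tunnel-opening part of such a cell might look like the sketch below. This assumes the pyngrok package; the shipped notebook is the source of truth and also has to configure the SSH server and write the credentials file, which this fragment does not do:

  ```python
  # Sketch of opening an ngrok TCP tunnel to Colab's SSH port (assumes pyngrok).
  from pyngrok import ngrok

  ngrok.set_auth_token(input("Enter your ngrok auth token: "))
  tunnel = ngrok.connect(22, "tcp")  # expose the SSH server through ngrok
  print(tunnel.public_url)           # tcp://<host>:<port> to use as connection credentials
  ```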
- Use `/files/upload_colab` to send files from MinIO storage to Colab (see the sketch below). Files will be saved in Colab's `/content/uploaded/` directory. Specify the file key prefix in the request body's `keys_prefix` field (in a directory-like way); all files with that prefix will be uploaded to Colab. To connect to Colab you must provide all credentials (i.e. username, password, host and port) from the `/content/ssh_config/credentials` file. Files will be streamed to Colab directly. If `script_name` is specified, this script will be executed on Colab. If a Jupyter notebook is provided as the script (i.e. a file with the `.ipynb` extension), it will be converted to a Python script (a file with the `.py` extension) on Colab before execution. Note: make sure to save script outputs at `/content/uploaded/output/` in order to make them available for download from Colab to MinIO.
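  A minimal sketch of this call with `requests`. The body field names besides `keys_prefix` and `script_name` (which the description above mentions) are assumptions; consult `documentation/openapi.yaml` for the real schema:

  ```python
  import requests

  # Placeholder values; take username/password/host/port from Colab's
  # /content/ssh_config/credentials file.
  payload = {
      "keys_prefix": "files/main/",     # upload every MinIO object with this prefix
      "script_name": "test_script.py",  # optional: script to execute on Colab
      "username": "root",
      "password": "<password-from-credentials-file>",
      "host": "0.tcp.ngrok.io",
      "port": 12345,
  }
  response = requests.post("http://localhost:8080/files/upload_colab", json=payload)
  print(response.status_code, response.json())
  ```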
- Use `/files/download_colab` to download script results from the Colab directory `/content/uploaded/output/` to MinIO storage (see the sketch below). The specified key prefix will be added to each object in MinIO (in a directory-like way). Files are streamed directly from Colab to MinIO (i.e. without being stored in the application's local storage) using the sshfs protocol, which mounts the remote Colab directory to a temporarily created local directory in the application. This action uses AWS CLI sync under the hood, so it can be used to synchronize MinIO storage files with files dynamically created/updated/deleted by the Colab script (i.e. if a new file was created/updated/deleted in the Colab directory, it will be uploaded/updated/deleted in MinIO respectively; if a file didn't change, it won't be modified in MinIO).
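  A minimal sketch of this call with `requests`; as above, the exact field names are assumptions to be checked against `documentation/openapi.yaml`:

  ```python
  import requests

  # Placeholder values; credentials come from /content/ssh_config/credentials.
  response = requests.post(
      "http://localhost:8080/files/download_colab",
      json={
          "keys_prefix": "results/run_1/",  # prefix added to each synced MinIO object
          "username": "root",
          "password": "<password-from-credentials-file>",
          "host": "0.tcp.ngrok.io",
          "port": 12345,
      },
  )
  print(response.status_code, response.json())
  ```

  Because of the sync semantics, calling this endpoint repeatedly while the script runs keeps the MinIO prefix mirroring the Colab output directory.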
- A description of all the project's endpoints and API may be viewed without running any services in the `documentation/openapi.yaml` file.
- You can update the `openapi.yaml` API documentation at any time using the `make openapi` command.
- All warnings and info messages will be shown in the container's stdout and saved in the `app.log` file.
- Use `colab_ssh_config_script.ipynb` on the Colab session side to open an SSH connection tunnel.
- You may use the `test_script.py` or `test_script.ipynb` files from `documentation/` as examples of scripts to upload, run on Colab and get remote outputs from (a sketch of such a script follows below).
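  If you write your own script, the one invariant from the description above is where outputs must land. The shipped `test_script.py` may look different; this is just a minimal shape that works with `/files/download_colab`:

  ```python
  import os

  # Scripts run on Colab out of /content/uploaded/; anything you want synced
  # back to MinIO must be written under /content/uploaded/output/.
  OUTPUT_DIR = "/content/uploaded/output"
  os.makedirs(OUTPUT_DIR, exist_ok=True)

  with open(os.path.join(OUTPUT_DIR, "result.txt"), "w") as f:
      f.write("hello from colab\n")
  ```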
- Use `make test` to build the image and run all linter checks and unit tests.
- After all tests, a coverage report will also be shown.
- Staged changes will be checked during commits via the pre-commit hook.
- All checks and tests will run on code push to the remote repository as part of GitHub Actions.