Data to Science

What is D2S?

The Data to Science (D2S) platform at Purdue University is an innovative, open-source initiative designed to facilitate data sharing and collaboration among researchers. Developed by Jinha Jung, an associate professor of civil engineering, and his team, the platform primarily focuses on housing data from unmanned aerial vehicles (UAVs) used in agricultural and forestry research.

The D2S platform aims to create a data-driven open science community that promotes sustained innovation. Researchers can upload, manage, and share their UAV data, making it accessible to a broader audience. This collaborative approach helps in advancing research by providing a centralized repository of valuable datasets from various projects worldwide.

Overview of D2S System

🌟 What Makes D2S Unique?

The Data to Science (D2S) platform at Purdue University stands out from other data-sharing platforms due to several unique features and approaches:

- Specialization in UAV Data: Unlike many general data-sharing platforms, D2S is specifically designed to manage and share data from unmanned aerial vehicles (UAVs), making it particularly valuable for agricultural and forestry research.
- Open-Source and Free Access: D2S is an open-source platform, so researchers worldwide can access and contribute to the data repository without cost barriers.
- Focus on Collaboration: The platform emphasizes building a community of researchers who can collaborate and share insights, fostering a more interactive and cooperative research environment.
- Alignment with Open Science Mandates: D2S aligns with the White House Office of Science and Technology Policy mandates on openness in the scientific enterprise, which call for federally funded research and supporting data to be made available to the public at no cost.
- User-Centric Development: The platform is developed with input from its users, so that the tools and features meet the specific needs of the research community.
- Training and Support: D2S offers training workshops and support to help researchers get acquainted with the platform's tools and capabilities.

These aspects make D2S a powerful tool for researchers looking to manage, share, and collaborate on UAV data, particularly in the fields of agriculture and forestry.

⚙️ Getting started

📋 Prerequisites

Docker Engine and Docker Compose are required to run the containers using the instructions below. If you can successfully run docker --version and docker compose --version from a terminal, you are ready to proceed to the next section.
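
The check described above can also be scripted. The following is a minimal sketch, assuming a POSIX-style shell; the function names have_cmd and check_prereqs are made up for illustration and are not part of the repository.

```shell
# Report whether a command (or shell function) with this name is available.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Check for Docker Engine and the Compose plugin.
# Returns non-zero and prints a message if either is missing.
check_prereqs() {
  if ! have_cmd docker; then
    echo "docker not found on PATH" >&2
    return 1
  fi
  if ! docker compose version >/dev/null 2>&1; then
    echo "docker compose is not available" >&2
    return 1
  fi
  echo "Docker and Docker Compose are available"
}
```

After defining the functions, run check_prereqs in the same shell session before continuing.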

📝 Copy env example files

Navigate to the root directory of the repository.
Copy backend.example.env to a new file named backend.env.
```
cp backend.example.env backend.env
```
Copy db.example.env to a new file named db.env.
```
cp db.example.env db.env
```
Copy .env.example to a new file named .env.
```
cp .env.example .env
```
Copy frontend.example.env to a new file named frontend.env.
```
cp frontend.example.env frontend.env
```
Copy frontend/.env.example to a new file named frontend/.env.
```
cp frontend/.env.example frontend/.env
```
Copy frontend/example.env.development to a new file named frontend/.env.development.
```
cp frontend/example.env.development frontend/.env.development
```
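
The six copy steps above can be combined into one loop. This is a sketch only: the helper name copy_env_examples is hypothetical, and the source and target names are taken from the steps above. Unlike a plain cp, it leaves an existing target file untouched, which may be useful when re-running the setup.

```shell
# Copy each example env file to its target name.
# $1 is the repository root (defaults to the current directory).
copy_env_examples() {
  root="${1:-.}"
  # Each entry is "source:target", relative to the repository root.
  for pair in \
    "backend.example.env:backend.env" \
    "db.example.env:db.env" \
    ".env.example:.env" \
    "frontend.example.env:frontend.env" \
    "frontend/.env.example:frontend/.env" \
    "frontend/example.env.development:frontend/.env.development"
  do
    src="$root/${pair%%:*}"
    dst="$root/${pair##*:}"
    if [ ! -f "$src" ]; then
      echo "missing $src" >&2
    elif [ -e "$dst" ]; then
      echo "keeping existing $dst"
    else
      cp "$src" "$dst" && echo "created $dst"
    fi
  done
}
```

Run copy_env_examples from the repository root, or pass the path to the root as the first argument.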

✏️ Customize env files

Open .env. Below is a list of the environment variables that can be set inside .env.

Environment variables

- EXTERNAL_STORAGE: Location where raw image zips and metadata will be sent for image processing jobs. It could be a mapped network drive or any other directory on the host machine. This should be left empty unless you have set up an image processing backend that works with the D2S image processing Celery task.
- TUSD_STORAGE: Location of Docker managed volume or mapped host directory that stores user uploaded datasets.
- TILE_SIGNING_SECRET: Secret key used for creating a signed URL that the client can use to access raster tiles and MVT tiles.

Open frontend.env. Below is a list of the environment variables that can be set inside frontend.env.

Environment variables

- VITE_MAPBOX_ACCESS_TOKEN: Mapbox access token for satellite imagery (optional).
- VITE_MAPTILER_API_KEY: MapTiler API key for OSM labels (optional).

Open backend.env in a text editor. Below is a list of the environment variables that can be set inside backend.env. You may use the default values or change them as needed.

You must provide a value for SECRET_KEY in your backend.env file. Use a cryptographically secure random string of at least 32 characters.
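
One way to produce such a string is to read random bytes from the operating system. The sketch below assumes a Linux or macOS host where /dev/urandom, head, od, and tr are available; generate_secret is a hypothetical helper name, not something the repository provides.

```shell
# Read 32 random bytes and print them as 64 hexadecimal characters.
generate_secret() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

SECRET_KEY="$(generate_secret)"
echo "SECRET_KEY=$SECRET_KEY"
```

Paste the printed line into backend.env. The result is 64 characters long, which satisfies the 32-character minimum stated above.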

Environment variables
- API_PROJECT_NAME: Name that will appear in the FastAPI docs.
- API_DOMAIN: Domain used for accessing the application (e.g., http://localhost or https://customdomain).
- CELERY_BROKER_URL: Address for local Redis service.
- CELERY_RESULT_BACKEND: Address for local Redis service.
- EXTENSIONS: Can be used to enable extensions. Typically left blank.
- EXTERNAL_STORAGE: Internal mount point for external storage. Leave blank unless you have a bind mount for external storage.
- MAIL_ENABLED: Enable SMTP email by changing value from 0 to 1.
- MAIL_SERVER: SMTP server address.
- MAIL_USERNAME: Username for SMTP server.
- MAIL_PASSWORD: Password for SMTP server.
- MAIL_FROM: Sender email address.
- MAIL_FROM_NAME: Name of sender.
- MAIL_ADMINS: List of emails that should receive admin mail separated by commas.
- MAIL_PORT: SMTP server port.
- MAPBOX_ACCESS_TOKEN: Mapbox access token for satellite imagery (optional).
- POINT_LIMIT: Total number of points to be used when generating point cloud preview images.
- RABBITMQ_HOST: RabbitMQ hostname. Leave blank.
- RABBITMQ_USERNAME: RabbitMQ username. Leave blank.
- RABBITMQ_PASSWORD: RabbitMQ password. Leave blank.
- SECRET_KEY: Secret key for signing and verifying JWT tokens.
- STAC_API_KEY: Secret key that can be used for verification by STAC API.
- STAC_API_URL: URL for a STAC API.
- STAC_API_TEST_URL: URL for a STAC API that can be used for testing.
- STAC_BROWSER_URL: URL for STAC Browser site connected to the STAC API.
- HTTP_COOKIE_SECURE: Set to 1 to only send cookies over HTTPS, 0 to allow HTTP.
- LIMIT_MAX_REQUESTS: Maximum number of requests a worker will handle before being restarted.
- UVICORN_WORKERS: Number of uvicorn workers.

Open db.env in a text editor. Assign POSTGRES_PASSWORD a secure password. The other environment variables can be left at their default values. POSTGRES_HOST should always be set to db unless the database service is renamed from db to another name in docker-compose.yml.

If you change POSTGRES_USER or POSTGRES_HOST, you must also update these environment variables with the new values under the db service in docker-compose.yml.

Open frontend/.env in a text editor. You may use the default values or change them as needed.

Environment variables
- VITE_API_V1_STR: Path for API endpoints. Do not change from default value unless the path has been changed in the backend.
- VITE_BRAND_FULL: Full name of application.
- VITE_BRAND_SHORT: Abbreviated name of application.
- VITE_BRAND_SLOGAN: Slogan that appears on landing page.
- VITE_TITLE: Page title.
- VITE_META_DESCRIPTION: Description for search results and browser tabs.
- VITE_META_OG_TITLE: Title for social media shares.
- VITE_META_OG_DESCRIPTION: Description for social media shares.
- VITE_META_OG_TYPE: Content type (e.g., 'website', 'article').
- VITE_SHOW_CONTACT_FORM: Boolean (0 or 1) to indicate if Contact Form link should be shown (requires email service).

Open frontend/.env.development in a text editor. You may use the default values or change them as needed.

Environment variables
- VITE_META_OG_IMAGE: Preview image URL for social media shares.
- VITE_META_OG_URL: Hostname for site.

🛠️ Build Docker images for services

In the root directory of the repository, copy docker-compose.example.yml to a new file named docker-compose.yml.
```
cp docker-compose.example.yml docker-compose.yml
```
Build Docker images for the frontend, backend, and proxy services with the following command:
```
docker compose build
```

▶️ Start the containers

Use the following command to run the service containers in the background:
```
docker compose up -d
```

⏹️ Stop the containers

Use the following command to stop the containers:
```
docker compose stop
```

🌍 Accessing the web application

The Data to Science web application can be accessed from http://localhost:8000. Replace localhost with the domain set in the API_DOMAIN environment variable if it was changed to a different value. If port 8000 is already in use, or you want to use a different port, change the port in docker-compose.yml under the proxy service's ports setting.
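
The containers may need a moment after startup before the application answers. As an illustration only, the following sketch polls the address given above until it responds. It assumes curl is installed on the host; wait_for_url is a hypothetical helper, and the attempt count and two-second delay are arbitrary choices.

```shell
# Poll a URL until it returns a successful response or the attempts run out.
# $1 is the URL, $2 is the maximum number of attempts (default 30).
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      echo "$url is responding"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

For the default setup, this would be called as wait_for_url http://localhost:8000, with the port adjusted if it was changed in docker-compose.yml.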

📖 Additional information

The sections above should provide all the steps needed to get Data to Science up and running. The following sections cover accessing the FastAPI documentation, running the backend tests, and managing database migrations with Alembic.

🧪 Accessing the API

After running docker compose up -d, you should be able to access the web API from http://localhost:8000/docs or http://localhost:8000/redoc. The first URL displays the Swagger UI documentation for the API and the second URL displays the ReDoc documentation. The API endpoints can be tried out interactively from the Swagger UI page.

🧪 Running backend tests

The pytest library can be used to run tests for the FastAPI backend. Use the following command to run the full test suite:

```
docker compose exec backend pytest
```
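
When working on one area of the backend, it can be convenient to pass extra options through to pytest. The wrapper below is a small sketch built on the command above; run_backend_tests is a hypothetical name and not part of the repository.

```shell
# Run pytest inside the running backend container.
# Any arguments are passed straight through to pytest.
run_backend_tests() {
  docker compose exec backend pytest "$@"
}
```

For example, run_backend_tests -x stops at the first failure, and run_backend_tests -k "keyword" runs only the tests whose names match the keyword. Both are standard pytest options.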

🗃️ Database migrations with Alembic

If you make any changes to the database models, run the following command to create a new migration:

```
docker compose exec backend alembic revision --autogenerate -m "migration comment"
```

After creating the new migration, use the following command to update the tables in the database:

```
docker compose exec backend alembic upgrade head
```
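
To confirm that an upgrade actually moved the database forward, the revision can be printed before and after. The following sketch uses Alembic's current command for this; the helper names alembic_in_backend and upgrade_with_report are assumptions made for illustration.

```shell
# Run an Alembic command inside the running backend container.
alembic_in_backend() {
  docker compose exec backend alembic "$@"
}

# Show the database revision, upgrade to the latest migration, then show it again.
upgrade_with_report() {
  echo "Revision before upgrade:"
  alembic_in_backend current
  alembic_in_backend upgrade head || return 1
  echo "Revision after upgrade:"
  alembic_in_backend current
}
```

The same wrapper can run other Alembic commands, such as alembic_in_backend history to list the known migrations.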

📘 Documentation

For detailed documentation, visit the project documentation.

Repository: gdslab/data-to-science