Commit 47fed9a

fix: ML processing that is causing jobs to fail for medium-large jobs (issue #782) (#934)
* fix: JobLogHandlers added multiple times
* Fix linting
* Make celeryworker debuggable for local
1 parent 9127ffb commit 47fed9a

5 files changed: +42 -9 lines changed

.vscode/launch.json

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+{
+    "version": "0.2.0",
+    "configurations": [
+        {
+            "name": "Python Debugger: Remote Attach",
+            "type": "debugpy",
+            "request": "attach",
+            "connect": {
+                "host": "localhost",
+                "port": 5678
+            },
+            "pathMappings": [
+                {
+                    "localRoot": "${workspaceFolder}",
+                    "remoteRoot": "."
+                }
+            ]
+        }
+    ]
+}

README.md

Lines changed: 6 additions & 2 deletions
@@ -8,7 +8,7 @@ Platform for processing and reviewing images from automated insect monitoring st
 
 Antenna uses [Docker](https://docs.docker.com/get-docker/) & [Docker Compose](https://docs.docker.com/compose/install/) to run all services locally for development.
 
-1) Install Docker for your host operating (Linux, macOS, Windows)
+1) Install Docker for your host operating (Linux, macOS, Windows). Docker Compose `v2.38.2` or later recommended.
 
 2) Add the following to your `/etc/hosts` file in order to see and process the demo source images. This makes the hostname `minio` and `django` alias for `localhost` so the same image URLs can be viewed in the host machine's web browser and be processed by the ML services. This can be skipped if you are using an external image storage service.
 
@@ -24,6 +24,7 @@ Antenna uses [Docker](https://docs.docker.com/get-docker/) & [Docker Compose](ht
 docker compose logs -f django celeryworker ui
 # Ctrl+c to close the logs
 ```
+NOTE: If you see docker build errors such as `At least one invalid signature was encountered`, these could happen if docker runs out of space. Commands like `docker image prune -f` and `docker system prune` can be helpful to clean up space.
 
 3) Optionally, run additional ML processing services: `processing_services` defines ML backends which wrap detections in our FastAPI response schema. The `example` app demos how to add new pipelines, algorithms, and models. See the detailed instructions in `processing_services/README.md`.
 
@@ -32,12 +33,15 @@ docker compose -f processing_services/example/docker-compose.yml up -d
 # Once running, in Antenna register a new processing service called: http://ml_backend_example:2000
 ```
 
-4) Access the platform the following URLs:
+4) Access the platform with the following URLs:
 
 - Primary web interface: http://localhost:4000
 - API browser: http://localhost:8000/api/v2/
 - Django admin: http://localhost:8000/admin/
 - OpenAPI / Swagger documentation: http://localhost:8000/api/v2/docs/
+- Minio UI: http://minio:9001, Minio service: http://minio:9000
+
+NOTE: If one of these services is not working properly, it could be due another process is using the port. You can check for this with `lsof -i :<PORT_NUMBER>`.
 
 A default user will be created with the following credentials. Use these to log into the web UI or the Django admin.
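
The port-conflict NOTE added to the README can also be checked from Python when `lsof` is not available. This is a minimal sketch, not part of the commit: `port_in_use` is a hypothetical helper, and the port list is taken from the URLs in the README hunk. Unlike `lsof`, it only reports whether something is accepting connections, not which process owns the port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


# Ports listed in the README: web UI, Django API, Minio service, Minio UI
for port in (4000, 8000, 9000, 9001):
    print(port, "in use" if port_in_use(port) else "free")
```

A port reported as "in use" before `docker compose up` has been run suggests another process is already bound to it.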

ami/jobs/models.py

Lines changed: 10 additions & 5 deletions
@@ -404,6 +404,7 @@ def run(cls, job: "Job"):
         chunk_size = config.get("request_source_image_batch_size", 1)
         chunks = [images[i : i + chunk_size] for i in range(0, image_count, chunk_size)]  # noqa
         request_failed_images = []
+        job.logger.info(f"Processing {image_count} images in {len(chunks)} batches of up to {chunk_size}")
 
         for i, chunk in enumerate(chunks):
             request_sent = time.time()
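
The batching expression in this hunk can be tried in isolation to see what the new log line reports. A minimal sketch, assuming a plain list stands in for the job's images; `split_into_chunks` is a hypothetical helper name that does not exist in the codebase.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(items: Sequence[T], chunk_size: int) -> list[Sequence[T]]:
    """Split items into consecutive slices of at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]


images = list(range(10))  # stand-in for the job's source images
chunk_size = 4
chunks = split_into_chunks(images, chunk_size)
print(f"Processing {len(images)} images in {len(chunks)} batches of up to {chunk_size}")
```

The last batch may be shorter than `chunk_size`, which is why the log message says "up to".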
@@ -946,11 +947,15 @@ def default_progress(cls) -> JobProgress:
 
     @property
     def logger(self) -> logging.Logger:
-        logger = logging.getLogger(f"ami.jobs.{self.pk}")
-        # Also log output to a field on thie model instance
-        logger.addHandler(JobLogHandler(self))
-        logger.propagate = False
-        return logger
+        _logger = logging.getLogger(f"ami.jobs.{self.pk}")
+
+        # Only add JobLogHandler if not already present
+        if not any(isinstance(h, JobLogHandler) for h in _logger.handlers):
+            # Also log output to a field on thie model instance
+            logger.info("Adding JobLogHandler to logger for job %s", self.pk)
+            _logger.addHandler(JobLogHandler(self))
+        _logger.propagate = False
+        return _logger
 
     class Meta:
         ordering = ["-created_at"]
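
The handler fix in this hunk relies on `logging.getLogger` returning the same logger object for a given name, so every access to the old property attached one more handler and each message was written once per handler. The sketch below reproduces that effect outside Django. `ListHandler` and both helper functions are hypothetical stand-ins, not the project's `JobLogHandler`.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for JobLogHandler: collects messages in a list."""

    def __init__(self, sink: list[str]) -> None:
        super().__init__()
        self.sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        self.sink.append(record.getMessage())


def get_logger_unguarded(name: str, sink: list[str]) -> logging.Logger:
    """Old behaviour: attach a new handler on every call."""
    log = logging.getLogger(name)
    log.addHandler(ListHandler(sink))
    log.propagate = False
    return log


def get_logger_guarded(name: str, sink: list[str]) -> logging.Logger:
    """New behaviour: attach a handler only if none of that type is present."""
    log = logging.getLogger(name)
    if not any(isinstance(h, ListHandler) for h in log.handlers):
        log.addHandler(ListHandler(sink))
    log.propagate = False
    return log


unguarded_sink: list[str] = []
guarded_sink: list[str] = []

# Simulate four accesses to the `logger` property before one message is logged
for _ in range(4):
    unguarded = get_logger_unguarded("demo.unguarded", unguarded_sink)
    guarded = get_logger_guarded("demo.guarded", guarded_sink)
unguarded.setLevel(logging.INFO)
guarded.setLevel(logging.INFO)

unguarded.info("hello")
guarded.info("hello")
print(len(unguarded_sink), len(guarded_sink))  # prints "4 1"
```

With the unguarded version the cost of each log call grows with the number of property accesses, which is one plausible way larger jobs would be hit harder than small ones.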

docker-compose.yml

Lines changed: 5 additions & 2 deletions
@@ -89,8 +89,11 @@ services:
     <<: *django
     image: ami_local_celeryworker
     scale: 1
-    ports: []
-    command: /start-celeryworker
+    # For remote debugging with debugpy, should get overridden for production
+    # Also make sure to install debugpy in your requirements/local.txt
+    ports:
+      - "5678:5678"
+    command: python -m debugpy --listen 0.0.0.0:5678 -m celery -A config.celery_app worker -l INFO
 
   celerybeat:
     <<: *django
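
The compose comment notes that the debugpy command should be overridden for production. One alternative, sketched here under assumptions and not taken from the commit, is to start the listener from Python only when an environment variable is set. `ENABLE_DEBUGPY` and `maybe_enable_debugger` are hypothetical names; `debugpy.listen` is the library's call for opening the listener on a host and port.

```python
import os
from typing import Mapping, Optional


def maybe_enable_debugger(env: Optional[Mapping[str, str]] = None, port: int = 5678) -> bool:
    """Start a debugpy listener only when ENABLE_DEBUGPY=1 (hypothetical variable)."""
    env = os.environ if env is None else env
    if env.get("ENABLE_DEBUGPY") != "1":
        return False
    # Imported lazily so images without debugpy installed are unaffected
    import debugpy

    debugpy.listen(("0.0.0.0", port))
    return True


# With the variable unset nothing is started and the call returns False
print(maybe_enable_debugger(env={}))
```

The port default matches the `5678` used in `.vscode/launch.json` and the compose port mapping, so the "Remote Attach" configuration would connect unchanged.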

requirements/local.txt

Lines changed: 1 addition & 0 deletions
@@ -1 +1,2 @@
 -r base.txt
+debugpy  # For remote debugging with debugpy
