Meeting notes (AI-Summarized):
Review of Carlos's Work: Anshul, Carlos, and Francis discussed the current state of the project, focusing on Carlos's work to get the application running and identifying areas for stress testing. Carlos had questions about the production environment and the implementation of ML backends.
Current State: Anshul, Carlos, and Francis reviewed the current state of the project, emphasizing Carlos's progress in getting the application running. They discussed the next steps, including identifying areas for stress testing.
Questions Raised: Carlos raised questions about the production environment and the implementation of ML backends. He sought clarification on how the application runs in production and the specific ML backends used.
Stress Testing: The team discussed the importance of stress testing the application to ensure its robustness and reliability. They planned to focus on this aspect in the upcoming phases of the project.
Production Environment: Carlos and Francis discussed the production environment, confirming that the application runs on multiple VMs in an OpenStack cluster, with the database and other components spread out.
VM Setup: Francis confirmed that the application runs on multiple VMs within an OpenStack cluster. The database and other components are distributed across different VMs to optimize performance and reliability.
Component Distribution: Francis explained that the database is hosted on its own instance, while other components are spread out across various VMs. This setup helps in managing the load and ensuring efficient operation.
ML Backends: Carlos and Francis discussed the ML backends, specifically the AMI Data Companion, which is the main backend used in production. Francis explained that the code in the Antenna repository is a cleaned-up version and is not yet running in production.
AMI Data Companion: Francis identified the AMI Data Companion as the primary ML backend used in production. This backend is central to the application's functionality and scalability.
Antenna Repository: Francis clarified that the code in the Antenna repository is a cleaned-up version intended as an example. It is not yet running in production but serves as a reference for developers.
Production Use: The team discussed the importance of the AMI Data Companion in production and the need to ensure its scalability and reliability. This backend provides the application's ML capabilities.
API Definitions: Carlos and Francis clarified that the API definitions are in production but are served through the AMI Data Companion; the ML backend example in the Antenna repository is not yet deployed on its own.
Deployment and Refactoring: Francis explained the need for refactoring due to the transition from a desktop application to a platform that supports multiple cameras. Michael added an API to the AMI Data Companion, which now doubles as a service.
Refactoring Need: Francis highlighted the need to refactor the application for the transition from a desktop application to a platform supporting multiple cameras, which is essential for researchers operating many cameras in the field.
API Addition: Michael added an API to the AMI Data Companion, enabling it to function as both a service and a desktop application and increasing its versatility.
Service Deployment: Francis explained that the AMI Data Companion is deployed as a service, allowing it to handle multiple cameras and provide robust data processing. This deployment is key to the application's scalability.
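The desktop-app-to-service pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the actual API Michael added: the `run_pipeline` function, the request shape, and the `/process` route are all hypothetical stand-ins for how a local processing entry point might be exposed over HTTP.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_pipeline(image_urls):
    """Hypothetical stand-in for the desktop pipeline's entry point."""
    return [{"image": url, "detections": []} for url in image_urls]

class PipelineHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body of the assumed shape {"images": [...]}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        results = run_pipeline(payload.get("images", []))
        body = json.dumps({"results": results}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep request logging quiet for this sketch
        pass

def serve(port=0):
    """Start the service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), PipelineHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The same code path can then serve both the desktop workflow (calling `run_pipeline` directly) and remote cameras (posting to the service).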
User Stories and Product Requirements: Anshul emphasized the importance of creating user stories and a product requirements document, focusing on backend scalability and specific features. Carlos suggested considering personas such as developers writing new ML backend services.
User Stories: Anshul stressed the importance of creating user stories to capture the requirements and expectations of different users. These stories will guide the development process and ensure that the application meets user needs.
Product Requirements: Anshul emphasized the need for a comprehensive product requirements document that outlines the application's features, scalability goals, and technical specifications. This document will serve as a roadmap for the development team.
Personas: Carlos suggested considering personas such as developers writing new ML backend services. These personas will help in understanding the specific needs and challenges faced by different users, ensuring that the application is user-centric.
Data Upload and Validation: Anshul and Michael discussed the challenges of asynchronous data uploading and validation. Michael explained that users can set up their own object stores, but it requires specific knowledge. They considered the possibility of streamlining this process.
Asynchronous Uploading: Anshul and Michael discussed the challenges associated with asynchronous data uploading. Michael explained that users can set up their own object stores, but this process requires specific technical knowledge.
Validation Challenges: Michael highlighted the need for data validation to ensure that uploaded data is in the correct format and does not cause processing errors. This validation is crucial for maintaining data integrity.
Streamlining Process: The team considered the possibility of streamlining the data upload and validation process to make it more user-friendly. This could involve creating tutorials or developing tools to simplify the setup of object stores.
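A small validation pass over object-store keys is one way the upload step could be made friendlier. This sketch assumes a hypothetical filename convention in which each capture begins with a timestamp; the extension set, the `%Y%m%d%H%M%S` format, and the timestamp-prefix rule are illustrative assumptions, not the platform's actual convention.

```python
from datetime import datetime
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def validate_key(key, timestamp_format="%Y%m%d%H%M%S"):
    """Return (ok, reason) for one object-store key.

    Assumes filenames start with a capture timestamp, e.g.
    '20230615213005-snapshot.jpg' -- a hypothetical convention.
    """
    path = PurePosixPath(key)
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported extension: {path.suffix!r}"
    stamp = path.stem.split("-")[0]
    try:
        datetime.strptime(stamp, timestamp_format)
    except ValueError:
        return False, f"no parseable timestamp in: {path.stem!r}"
    return True, "ok"
```

Running a check like this during indexing would surface format problems before any processing time is spent.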
Indexing and Sessions: Michael explained the indexing process and how sessions are created based on time gaps between captures. This helps in quickly browsing and validating the data before processing.
Indexing Process: Michael described the indexing process, which involves scanning the object storage and creating records for each image. This process helps in organizing and managing the data efficiently.
Session Creation: Sessions are created based on time gaps between captures. If there is a gap of more than two hours, a new session is created. This helps in grouping related images together for easier browsing and validation.
Data Validation: The indexing process allows users to quickly browse and validate the data before processing. This step is crucial for identifying any issues with the data and ensuring that it is ready for analysis.
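The two-hour session rule described above reduces to a simple grouping over sorted capture timestamps. This is a minimal sketch of that rule, not the indexer's actual implementation; only the two-hour threshold comes from the discussion.

```python
from datetime import timedelta

def group_into_sessions(timestamps, max_gap=timedelta(hours=2)):
    """Group capture timestamps into sessions.

    A gap larger than max_gap between consecutive captures starts
    a new session, per the two-hour rule described above.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            sessions[-1].append(ts)  # continue the current session
        else:
            sessions.append([ts])    # gap too large: start a new session
    return sessions
```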
Collections and Sampling: Michael described how collections are created and populated based on sampling parameters. Collections can be deterministic or random, and they are used for processing and comparing results.
Collection Creation: Michael explained that collections are created based on specific sampling parameters. These parameters can be deterministic, such as interval-based sampling, or random, depending on the user's requirements.
Sampling Methods: Sampling methods include interval-based sampling, where images are selected at regular intervals, and random sampling, where images are selected randomly. These methods help in creating representative subsets of the data for analysis.
Processing Results: Collections are used for processing and comparing results. By creating collections, users can run different models on the same data and compare the outcomes, helping in evaluating the performance of the models.
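The two sampling modes mentioned above can be sketched in a few lines. The function names and the seeding choice are illustrative; the point is that interval sampling is deterministic while random sampling becomes reproducible only when seeded.

```python
import random

def interval_sample(items, every_nth):
    """Deterministic sampling: keep every n-th item, starting at the first."""
    return items[::every_nth]

def random_sample(items, k, seed=None):
    """Random sampling: draw k items; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))
```

A seeded random collection lets two pipelines run on the same subset, which is what makes the model-comparison use case above work.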
Batch Processing and Pipelines: Anshul and Michael discussed the batch processing workflow, including the selection of pipelines and the role of detectors and classifiers. Michael explained the challenges with long-running tasks and the use of Celery for background processing.
Batch Processing: Michael described the batch processing workflow, which involves selecting pipelines and running them on collections of data. This process helps in automating the analysis and processing of large datasets.
Pipelines: Pipelines consist of detectors and classifiers. Detectors identify objects in the images, while classifiers determine the type of objects. These components work together to analyze the data and generate results.
Long-Running Tasks: Michael highlighted the challenges associated with long-running tasks, which can cause the system to crash. To address this, they use Celery for background processing, allowing tasks to be managed more efficiently.
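The chunk-and-dispatch pattern behind that design can be illustrated without a message broker. Production uses Celery tasks; here a `ThreadPoolExecutor` stands in for the worker pool so the sketch is self-contained, and the chunk size, worker count, and `run_pipeline_on_chunk` stub are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_pipeline_on_chunk(chunk):
    """Hypothetical stand-in for one pipeline task: detect, then classify."""
    return [{"image": img, "label": "processed"} for img in chunk]

def process_collection(images, chunk_size=4, workers=2):
    """Split a collection into chunks and process them in background workers.

    Splitting one long-running job into many small tasks is what lets a
    queue like Celery retry or rebalance work instead of blocking the app.
    """
    chunks = [images[i:i + chunk_size] for i in range(0, len(images), chunk_size)]
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_pipeline_on_chunk, c) for c in chunks]
        for fut in as_completed(futures):
            results.extend(fut.result())
    return results
```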
Load Balancing and Workers: Carlos and Michael discussed the load balancing across workers and the use of a single endpoint for processing services. Michael explained that each processing service hosts specific models and pipelines.
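A single endpoint fanning work out to several processing services can be sketched as a round-robin dispatcher. This is only one possible balancing policy and the worker names are hypothetical; the real setup may route by queue depth or by which service hosts the requested model.

```python
import itertools

class RoundRobinDispatcher:
    """Minimal sketch: one endpoint hands requests to workers in turn."""

    def __init__(self, workers):
        self._cycle = itertools.cycle(workers)

    def dispatch(self, request):
        # Pick the next worker in rotation and pair it with the request
        worker = next(self._cycle)
        return worker, request
```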
Occurrences and Detections: Anshul and Michael discussed the concept of occurrences and detections, explaining how multiple detections can make up a single occurrence based on the similarity of frames.
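Chaining per-frame detections into a single occurrence can be sketched with a frame-adjacency and overlap check. The IoU threshold used here is a hypothetical similarity measure standing in for whatever frame-similarity criterion the system actually applies (it may also compare image features).

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def group_occurrences(detections, min_iou=0.5):
    """Chain detections from consecutive frames into occurrences.

    detections: list of (frame_index, box) tuples sorted by frame.
    A detection extends an occurrence when it sits in the next frame
    and overlaps that occurrence's latest box; otherwise it starts a
    new occurrence.
    """
    occurrences = []
    for frame, box in detections:
        for occ in occurrences:
            last_frame, last_box = occ[-1]
            if frame == last_frame + 1 and iou(box, last_box) >= min_iou:
                occ.append((frame, box))
                break
        else:
            occurrences.append([(frame, box)])
    return occurrences
```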
Administrative Update: Anshul informed Michael about the next meeting on the 26th and the plan to host him for a day of discussions and diagramming.
Next Steps: Anshul outlined the next steps, including sending the user stories document, stress testing by Carlos, and preparing for the meeting on the 26th to finalize the product requirements document.
Follow-up tasks:
User Stories Documentation: Send the user stories template to the team via SharePoint for completion. (Anshul)
Optional Meeting Invitations: Add Ana and Mohammed as optional attendees to the meeting invite. (Anshul)
Repository Access: Send Carlos the link to the separate AMI Data Companion repository via Slack. (Francis)
System Architecture Diagram: Create and share a diagram of the current processing service architecture for review and clarification. (Carlos)
Stress Testing: Perform stress testing on the batch processing service and share findings with the team over the next couple of weeks. (Carlos)
Product Requirements Document: Start filling out the product requirements document using current knowledge and prepare to review and refine it during the meeting on the 26th. (Anshul)