| 1 | ---- |
| 2 | -sidebar_position: 1 |
| 3 | -title: Introduction |
| 4 | -slug: /2024/scheduler/ |
| 5 | ---- |
| 6 | -<!-- |
| 7 | -SPDX-License-Identifier: CC-BY-SA-4.0 |
| 8 | - |
| 9 | -SPDX-FileCopyrightText: 2024 Aditya Singh <email.here> |
| 10 | ---> |
| 11 | - |
| 12 | -## Author |
| 13 | - |
| 14 | -[Aaditya Singh](https://github.com/aadsingh) |
| 15 | - |
| 16 | -## Contact info |
| 17 | - |
| 18 | -- [Email](mailto:email.here) |
| 19 | -- [LinkedIn](https://linkedin.com/in/my-user) |
| 20 | - |
| 21 | -## Project title |
| 22 | - |
| 23 | -Scheduler overhaul |
| 24 | - |
| 25 | -## What's the project about? |
| 26 | - |
| 27 | -Insert Text Here |
| 28 | - |
| 29 | -## What should be done? |
| 30 | - |
| 31 | -What are the plans for the project? |
| 1 | +--- |
| 2 | +sidebar_position: 1 |
| 3 | +title: Introduction |
| 4 | +slug: /2024/scheduler/ |
| 5 | +--- |
| 6 | +<!-- |
| 7 | +SPDX-License-Identifier: CC-BY-SA-4.0 |
| 8 | + |
| 9 | +SPDX-FileCopyrightText: 2024 Aaditya Singh <email.here> |
| 10 | +--> |
| 11 | + |
| 12 | +## Author |
| 13 | + |
| 14 | +[Aaditya Singh](https://github.com/Aaditya-Singh78) |
| 15 | + |
| 16 | +## Contact info |
| 17 | + |
| 18 | +- [Email](mailto:singh.aaditya889@gmail.com) |
| 19 | +- [LinkedIn](https://linkedin.com/in/aadi-singh) |
| 20 | +- [Twitter](https://twitter.com/__Aadityasingh) |
| 21 | + |
| 22 | +## Project title |
| 23 | + |
| 24 | +Scheduler overhaul |
| 25 | + |
| 26 | +## What's the project about? |
| 27 | + |
| 28 | +This project aims to enhance the job scheduling capabilities of [FOSSology](https://github.com/fossology/fossology) by transitioning from a C-based implementation to a Go-based system. The overhaul focuses on leveraging Go's modern language features to improve concurrency, performance, and maintainability. This transition addresses the scalability and system *throughput* challenges in the current scheduler. |
| 29 | + |
| 30 | + |
| 31 | +### Architecture Overview |
| 32 | + |
| 33 | + |
| 34 | +**The current architecture** uses a multi-threaded approach to manage job scheduling and execution. It is structured around several key *components*: |
| 35 | + |
| 36 | +1. **Main Thread**: Acts as the scheduler's control unit, managing worker threads and overseeing system operations like resource allocation and health monitoring. |
| 37 | + |
| 38 | +2. **Job Execution Queue**: Holds and manages incoming job requests, facilitating efficient job processing and priority control. |
| 39 | + |
| 40 | +3. **Worker Threads**: Execute jobs from the queue under the main thread’s management, optimizing resource use and performance. |
| 41 | + |
| 42 | +4. **Scheduler Logic**: Determines the execution order of jobs based on priority and resource availability, ensuring systematic and efficient processing. |
| 43 | + |
| 44 | +5. **Database Interaction**: Handles storage of job logs and results, supporting tracking, auditing, and data persistence. |
| 45 | + |
| 46 | +6. **Error Handling Mechanism**: Manages job execution errors to ensure stability and prevent system-wide impacts from failures. |
| 47 | + |
| 48 | +7. **Resource Allocation**: Distributes resources across jobs and threads to avoid contention and ensure efficient execution. |
| 49 | + |
| 50 | +**Key Challenges**: |
| 51 | + |
| 52 | +1. *Concurrency and Synchronization*: Ensuring that multiple worker threads operate without interfering with each other requires meticulous management of resources and synchronization. |
| 53 | + |
| 54 | +2. *Efficiency and Throughput*: The system must optimize the processing of jobs to minimize wait times and maximize the throughput of the scheduler. |
| 55 | + |
| 56 | +3. *Scalability*: As the number of jobs increases, the system must scale appropriately to handle the increased load without degradation in performance. |
| 57 | + |
| 58 | +4. *Flexibility*: Adapting to varied job types and changing operational conditions while maintaining performance and reliability. |
| 59 | + |
| 60 | +## What should be done? |
| 61 | + |
| 62 | +What are the plans for the project? |
| 63 | + |
| 64 | +1. **Refactor Existing Code**: Transitioning the existing C codebase to Go, restructuring components to fit the Go idiom. |
| 65 | + |
| 66 | + > **Why Go?** |
| 67 | + |
| 68 | + - *Concurrency and Performance*: Go's native goroutine and channel-based concurrency model is highly efficient for processes that require concurrent execution, which is critical for job scheduling. |
| 69 | + |
| 70 | + - *Memory Safety*: Automatic memory management and garbage collection in Go reduce the risk of memory-related errors, a common challenge in C due to its manual memory handling. |
| 71 | + |
| 72 | + - *Simplicity and Productivity*: Go's clean and concise syntax, along with its powerful standard library, enables rapid development and easier maintenance compared to the verbose and complex C code. |
| 73 | + |
| 74 | + - *Robust Tooling*: The Go toolchain provides out-of-the-box support for testing, formatting, and documentation, enhancing development workflow and product quality. |
| 75 | + |
| 76 | + - *Cross-Platform Compatibility*: Go simplifies the build process with its strong support for cross-platform compilation, making it easier to manage and deploy on various systems without code changes. |
| 77 | + |
| 78 | +2. **Optimize Concurrency Handling**: Implementing a robust concurrency model using goroutines and channels to handle multiple jobs efficiently. |
| 79 | + |
| 80 | +    > **How will this be achieved?** |
| 81 | + |
| 82 | + - The *new scheduler architecture* will utilise: |
| 83 | + |
| 84 | +  |
| 85 | + |
| 86 | +        - **Goroutines for Task Management**: Efficiently handling multiple jobs in parallel to optimize resource usage. |
| 87 | + |
| 88 | + - **Channels for Communication**: Using channels to manage job queues and worker communication, ensuring thread-safe operations. |
| 89 | + |
| 90 | + - **Modular Design**: Structuring the scheduler with clear separation of concerns, allowing for easier updates and maintenance. |
| 91 | + |
| 92 | + - To ensure consistency and maintainability of the codebase, the following *coding standards* will be applied: |
| 93 | + |
| 94 | +        - *Format and Style*: Use `gofmt` for formatting and `golint` for linting the code. |
| 95 | + |
| 96 | + - *Error Handling*: Follow Go's idiomatic way of error handling. Always check for errors where they can occur and handle them gracefully. |
| 97 | + |
| 98 | + - *Commenting and Documentation*: Write clear comments for all public functions and methods, using Godoc conventions. Document all packages and provide examples where necessary. |
| 99 | + |
| 100 | + - *Concurrency Practices*: Use goroutines and channels appropriately. Avoid common pitfalls like race conditions by using synchronization primitives from the `sync` package when needed. |
| 101 | + |
| 102 | + - *Testing*: Write comprehensive unit tests for all components using Go's built-in `testing` package. Aim for a high level of test coverage to ensure reliability and facilitate future changes. |
| 103 | + |
| 104 | +3. **Enhance Error Handling**: Utilizing Go's built-in error handling to create a more reliable and fault-tolerant scheduler. |
| 105 | + |
| 106 | +4. **Integrate with Existing Systems**: Ensuring the new Go-based scheduler integrates seamlessly with the current FOSSology ecosystem. |
| 107 | + |
| 108 | +5. **Test and Deploy**: Thoroughly test the new system for performance and reliability before full deployment. |
| 109 | + |
| 110 | +6. **Document the System**: Provide comprehensive documentation to support future development and use of the scheduler. |
| 111 | + |
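
The goroutine-and-channel model in item 2 of the plan could look roughly like the worker pool below. This is a minimal sketch and not the proposal's actual design: `Job`, `Result`, `runJob`, and `RunPool` are hypothetical names invented for illustration, and `runJob` merely stands in for launching an agent. It shows a channel acting as the job queue, a `sync.WaitGroup` tracking workers, and errors returned as values and wrapped with the job ID.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work; the proposal does not define a job type.
type Job struct {
	ID   int
	Fail bool // set to simulate a failing run
}

// Result pairs a job ID with the error (nil on success) from running it.
type Result struct {
	JobID int
	Err   error
}

// runJob is a placeholder for whatever actually executes a job.
func runJob(j Job) error {
	if j.Fail {
		return errors.New("agent exited with an error")
	}
	return nil
}

// RunPool starts the given number of worker goroutines, feeds them jobs
// over a channel, and returns one Result per job (in completion order).
func RunPool(workers int, jobs []Job) []Result {
	if workers < 1 {
		workers = 1 // avoid blocking forever with no receivers
	}
	jobCh := make(chan Job)
	// Buffered so workers never block when reporting results.
	resCh := make(chan Result, len(jobs))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				err := runJob(j)
				if err != nil {
					// Wrap the error so callers can see which job failed.
					err = fmt.Errorf("job %d: %w", j.ID, err)
				}
				resCh <- Result{JobID: j.ID, Err: err}
			}
		}()
	}

	for _, j := range jobs {
		jobCh <- j
	}
	close(jobCh) // lets the workers' range loops finish
	wg.Wait()
	close(resCh)

	results := make([]Result, 0, len(jobs))
	for r := range resCh {
		results = append(results, r)
	}
	return results
}

func main() {
	results := RunPool(3, []Job{{ID: 1}, {ID: 2, Fail: true}, {ID: 3}})
	failed := 0
	for _, r := range results {
		if r.Err != nil {
			failed++
			fmt.Println("error:", r.Err)
		}
	}
	fmt.Printf("%d jobs run, %d failed\n", len(results), failed)
}
```

Because a failed job only produces a `Result` with a non-nil `Err`, one failure does not stop the other workers, which is one possible way to approach the fault tolerance mentioned in item 3.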
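
The "Scheduler Logic" component is described as ordering jobs by priority. One way that ordering might be expressed in Go is with the standard `container/heap` package, sketched below. The `QueuedJob` and `JobQueue` types, the rule that a higher number means higher priority, and the first-in-first-out tie-break are all assumptions made for this example; the proposal does not specify them, and resource availability is not modelled here.

```go
package main

import (
	"container/heap"
	"fmt"
)

// QueuedJob is a hypothetical queue entry; a higher Priority runs first.
type QueuedJob struct {
	ID       int
	Priority int
	seq      int // insertion order, keeps equal priorities FIFO
}

// jobHeap implements heap.Interface over QueuedJob values.
type jobHeap []QueuedJob

func (h jobHeap) Len() int { return len(h) }

func (h jobHeap) Less(i, j int) bool {
	if h[i].Priority != h[j].Priority {
		return h[i].Priority > h[j].Priority
	}
	return h[i].seq < h[j].seq
}

func (h jobHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }

func (h *jobHeap) Push(x any) { *h = append(*h, x.(QueuedJob)) }

func (h *jobHeap) Pop() any {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// JobQueue wraps the heap. It is not safe for concurrent use on its own;
// a real scheduler would guard it with a mutex or own it in one goroutine.
type JobQueue struct {
	h    jobHeap
	next int
}

// Add enqueues a job with the given priority.
func (q *JobQueue) Add(id, priority int) {
	heap.Push(&q.h, QueuedJob{ID: id, Priority: priority, seq: q.next})
	q.next++
}

// Next removes and returns the highest-priority job, or false if empty.
func (q *JobQueue) Next() (QueuedJob, bool) {
	if q.h.Len() == 0 {
		return QueuedJob{}, false
	}
	return heap.Pop(&q.h).(QueuedJob), true
}

func main() {
	q := &JobQueue{}
	q.Add(1, 0)
	q.Add(2, 5)
	q.Add(3, 5)
	for {
		j, ok := q.Next()
		if !ok {
			break
		}
		fmt.Println(j.ID) // prints 2, then 3, then 1
	}
}
```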