README.md: 13 additions & 0 deletions
@@ -25,6 +25,8 @@ The objectives were:
- **CI/CD Automation**: Implement an **automated CI/CD pipeline** using `GitHub Actions` to ensure continuous testing and code quality management.
- **Dockerization**: Develop a **Dockerized pipeline** for ease of use, incorporating `Docker volumes` for persistent data management.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 🛠️ Preparation & Prototyping in Notebooks

Before I started making `Kedro pipelines`, I tried out my ideas in Jupyter notebooks. Check the `notebooks` folder to see how I did it:
@@ -51,6 +53,8 @@ The `Kedro Viz tool` provides an interactive canvas to visualize and **understan
With this tool, the understanding of data progression, outputs, and interactivity is greatly simplified. Kedro Viz allows users to inspect samples of data, view parameters, analyze figures, and much more, enriching the user experience with enhanced transparency and interactivity.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 📜 Logging and Monitoring

Logging is integral to understanding and troubleshooting pipelines. This project leverages Kedro's logging capabilities to provide real-time insights into pipeline execution, highlighting progress, warnings, and errors. This GIF demonstrates the use of the `kedro run` or `make run` command, showcasing the logging output in action:
@@ -61,6 +65,8 @@ Logging is integral to understanding and troubleshooting pipelines. This project
Notice how the nodes are executed sequentially, and observe the **RMSE outputs during validation** for the **XGBoost model**. Logging in Kedro is highly customizable, allowing for tailored monitoring that meets the user's specific needs.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 📁 Project Structure

A _simplified_ overview of the Kedro project's structure:
@@ -104,6 +110,8 @@ Kedro-Energy-Forecasting/
└── requirements.txt # Project dependencies
```

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 🚀 Getting Started

First, **Clone the Repository** to download a copy of the code onto your local machine, and before diving into transforming **raw data** into a **trained pickle Machine Learning model**, please note:
@@ -127,6 +135,7 @@ Here is an example of the available targets: (you type `make` in the command lin
- For **production** environments, initialize your setup by executing `make prep-doc` or using `pip install -r docker-requirements.txt` to install the production dependencies.
- For a **development** environment, where you may want to use **Kedro Viz**, work with **Jupyter notebooks**, or test everything thoroughly, run `make prep-dev` or `pip install -r dev-requirements.txt` to install all the development dependencies.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>

### 🌿 Standard Method (Conda / venv)
@@ -154,11 +163,15 @@ Prefer this method for a containerized approach, ensuring a consistent developme
For additional assistance or to explore more command options, refer to the **Makefile** or consult `kedro --help`.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 🌌 Next Steps?
With our **Kedro Pipeline** 🏗 now capable of efficiently **transforming raw** data 🔄 into **trained models** 🤖, and the introduction of a Dockerized environment 🐳 for our code, the next phase involves _advancing beyond the current repository scope_ 🚀 to `orchestrate data updates automatically` using tools like **Databricks**, **Airflow**, **Azure Data Factory**... This progression allows for the seamless integration of fresh data into our models.

Moreover, implementing `experiment tracking and versioning` with **MLflow** 📊 or leveraging **Kedro Viz**'s versioning capabilities 📈 will significantly enhance our project's management and reproducibility. These steps are pivotal for maintaining a clean machine learning workflow that not only achieves our goal of simplifying model training processes 🛠 but also ensures our system remains dynamic and scalable with **minimal effort**.

+<div id="top"></div> </div><p align="right">(<a href="#top">back to top</a>)</p>
+
## 🌐 Let's Connect!

You can connect with me on **LinkedIn** or check out my **GitHub repositories**: