This is a Flask-based web application that allows users to search for datasets on Kaggle and download them directly.
The app provides a user-friendly interface to interact with Kaggle's dataset repository, making it easier to find and retrieve datasets.
- Search for Datasets: Enter a search term to find datasets hosted on Kaggle.
- View Search Results: See a list of datasets that match your query, along with their titles.
- Download Datasets: Download selected datasets as `.zip` files directly from the application.
- `App.py`:
  - The main Flask application that handles routes and renders templates.
  - Routes:
    - `/`: Home page.
    - `/search`: Search for datasets.
    - `/download/<path:dataset_ref>`: Download the selected dataset.
- `kaggle_connect.py`:
  - Handles interaction with the Kaggle API.
  - Functions:
    - `search_datasets(search_term)`: Searches Kaggle for datasets matching the provided term.
    - `download_dataset(dataset_ref)`: Downloads a dataset by its reference.
- `index.html`:
  - Home page with a welcome message and a link to the search page.
- `search.html`:
  - A form to input a search term for finding datasets.
- `search_results.html`:
  - Displays the search results and provides download links for each dataset.
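
The three routes listed above could be wired up roughly as follows. This is a minimal sketch, not the project's actual `App.py`: it returns plain text instead of rendering the templates, it assumes the search term arrives as a `term` query parameter (a hypothetical name), and it replaces the `kaggle_connect.py` helpers with in-memory stand-ins so it runs without Kaggle credentials.

```python
from flask import Flask, request


# Hypothetical stand-ins for the helpers in kaggle_connect.py, so the
# sketch runs without Kaggle credentials. The catalog entries are made up.
def search_datasets(search_term):
    catalog = [
        {"ref": "owner/college-basketball", "title": "College Basketball Dataset"},
        {"ref": "owner/city-weather", "title": "City Weather"},
    ]
    return [d for d in catalog if search_term.lower() in d["title"].lower()]


def download_dataset(dataset_ref):
    # The real helper would fetch the archive; here we only build a file name.
    return dataset_ref.split("/")[-1] + ".zip"


app = Flask(__name__)


@app.route("/")
def home():
    return "Welcome. Go to /search to look for datasets."


@app.route("/search")
def search():
    # "term" is an assumed query parameter name.
    term = request.args.get("term", "")
    results = search_datasets(term) if term else []
    # One line per match: the title followed by its download path.
    lines = [f'{d["title"]} -> /download/{d["ref"]}' for d in results]
    return "\n".join(lines) or "No results."


@app.route("/download/<path:dataset_ref>")
def download(dataset_ref):
    # The path converter lets dataset_ref contain "/", as in "owner/slug".
    return f"Downloaded {download_dataset(dataset_ref)}"
```

The `path:` converter matters because Kaggle dataset references have the form `owner/dataset-slug`; Flask's default string converter would stop at the first slash and the route would not match.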
- Python 3.8+
- Kaggle API credentials (download your `kaggle.json` from Kaggle and place it in `~/.kaggle/` or the project root).
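
Before starting the app, it can help to confirm that the credentials file is where it is expected. The following is a small sketch using only the standard library; the function names `find_kaggle_credentials` and `load_credentials` are hypothetical and not part of this project or of the Kaggle package. It checks the two locations mentioned above and verifies that the file contains the `username` and `key` fields of a Kaggle API token.

```python
import json
from pathlib import Path


def find_kaggle_credentials(project_root=".", home=None):
    """Return the first existing kaggle.json path, or None if none is found."""
    home = Path(home) if home is not None else Path.home()
    candidates = [
        home / ".kaggle" / "kaggle.json",      # default location
        Path(project_root) / "kaggle.json",    # project root
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


def load_credentials(path):
    """Read the token file and check that it holds the two expected keys."""
    data = json.loads(Path(path).read_text())
    missing = {"username", "key"} - data.keys()
    if missing:
        raise ValueError(f"kaggle.json is missing: {sorted(missing)}")
    return data
```

Calling `find_kaggle_credentials()` with no arguments checks the current user's home directory first and then the current working directory.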
- Clone the repository:
  `git clone https://github.com/yourusername/dataset-search-download.git`
  `cd dataset-search-download`
- Install the required dependencies:
  `pip install -r requirements.txt`
- Set up Kaggle API credentials:
  - Place your `kaggle.json` file in the `~/.kaggle/` directory or in the root of the project.
- Run the Flask application:
  `python App.py`
- Open your web browser and navigate to:
  `http://127.0.0.1:5000/`
- Use the application to search for datasets, view results, and download datasets.
- After opening `http://127.0.0.1:5000/`, you should see the home page. Click the "Search for Datasets" button.
- On the search page, enter the term you want to find. This example uses "College".

- A list of matching datasets appears. Select one of them; this example selects the first dataset, "College Basketball Dataset".

- The application saves the `.zip` file of the selected dataset to the Downloads folder.

- After the download, you can return to the home page to download another dataset.

- The home page then shows a short message naming the last `.zip` file you downloaded.
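
How the app tracks the last downloaded file is not described here. One possible approach, shown only as a sketch, is to look for the most recently modified `.zip` in the download directory. The function names, the message wording, and the default `dataset` directory are all assumptions.

```python
from pathlib import Path


def latest_zip(directory="dataset"):
    """Return the most recently modified .zip in `directory`, or None."""
    folder = Path(directory)
    if not folder.is_dir():
        return None
    zips = list(folder.glob("*.zip"))
    # max() with default=None handles an empty directory.
    return max(zips, key=lambda p: p.stat().st_mtime, default=None)


def last_download_message(directory="dataset"):
    """Build a short message for the home page (wording is hypothetical)."""
    newest = latest_zip(directory)
    if newest is None:
        return "No datasets downloaded yet."
    return f"Last downloaded file: {newest.name}"
```

Relying on modification time keeps the sketch stateless; an app could equally store the file name in the session after each download.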

.
├── App.py                  # Main Flask application
├── kaggle_connect.py       # Kaggle API integration
├── templates/              # HTML templates
│   ├── index.html
│   ├── search.html
│   └── search_results.html
├── dataset/                # Directory for downloaded datasets
└── README.md               # Project documentation
- Ensure that the Kaggle API is properly authenticated before using this application.
- Downloaded datasets are saved in the `dataset/` directory as `.zip` files.
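
The two notes above can be illustrated with a sketch of what `kaggle_connect.py` might look like when built on the official `kaggle` package. This is an assumption about the implementation, not the project's actual code. The optional `api` and `target_dir` parameters are additions made here so the functions can be exercised with a stand-in client, and `get_api` is a hypothetical helper name.

```python
from pathlib import Path

DATASET_DIR = Path("dataset")  # download directory named in the notes above


def get_api():
    """Create an authenticated client from the official `kaggle` package."""
    # Imported lazily so this module can be loaded without the package.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()  # fails if no valid credentials are configured
    return api


def search_datasets(search_term, api=None):
    """Return (ref, title) pairs for datasets matching `search_term`."""
    api = api or get_api()
    return [(d.ref, d.title) for d in api.dataset_list(search=search_term)]


def download_dataset(dataset_ref, api=None, target_dir=DATASET_DIR):
    """Download `dataset_ref` ("owner/slug") as a .zip into `target_dir`."""
    api = api or get_api()
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # unzip=False keeps the archive as a single .zip file.
    api.dataset_download_files(dataset_ref, path=str(target_dir), unzip=False)
    # The Kaggle client names the archive after the dataset slug.
    return target_dir / (dataset_ref.split("/")[-1] + ".zip")
```

With valid credentials in place, `download_dataset("owner/slug")` would leave `dataset/slug.zip` on disk and return its path.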