GitHub repository of Jupyter notebooks and Python files made for the IBM 'Data Science Capstone' course project scenario. Made in December 2023.

sav-1305/IBM-Data-Science-Capstone

IBM-Data-Science-Capstone

BACKGROUND

SpaceX, a leader in the space industry, strives to make space travel affordable for everyone. Its accomplishments include sending spacecraft to the International Space Station, launching a satellite constellation that provides internet access, and sending manned missions to space. SpaceX can do this because its rocket launches are relatively inexpensive ($62 million per launch), thanks to its novel reuse of the first stage of its Falcon 9 rocket. Other providers, which cannot reuse the first stage, cost upwards of $165 million per launch. Whether the first stage lands therefore largely determines the price of a launch. To estimate this, we use public data and machine learning models to predict whether SpaceX (or a competing company) can reuse the first stage.

Executive Summary

This project attempts to identify the factors that contribute to a successful first-stage landing.

Research Methodologies:

  • Collect data using the SpaceX REST API and web scraping techniques
  • Wrangle the data to create a success/fail outcome variable
  • Explore the data with visualization techniques, considering the following factors: payload, launch site, flight number, and yearly trend
  • Analyze the data with SQL, calculating the following statistics: total payload, payload range for successful launches, and total number of successful and failed outcomes
  • Explore launch site success rates and proximity to geographical markers
  • Visualize the launch sites with the most success and the successful payload ranges
  • Build models to predict landing outcomes using logistic regression, support vector machine (SVM), decision tree, and K-nearest neighbors (KNN)
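
The wrangling step above can be illustrated with a minimal sketch. It assumes each raw outcome is a string of the form "<landed> <landing type>" (for example "True ASDS" or "None None"). That format, the function name `landing_class`, and the sample values are assumptions made for illustration; they are not taken from the notebooks in this repository.

```python
def landing_class(outcome: str) -> int:
    """Map a raw outcome string to 1 (first stage landed) or 0 (it did not).

    Assumes the first token is "True", "False", or "None"; anything other
    than "True" is treated as a failed or unattempted landing.
    """
    landed, _, _landing_type = outcome.partition(" ")
    return 1 if landed == "True" else 0


# Hypothetical sample values, used only to exercise the function.
sample_outcomes = ["True ASDS", "False Ocean", "None None", "True RTLS"]
classes = [landing_class(o) for o in sample_outcomes]
success_rate = sum(classes) / len(classes)
```

A binary column built this way can serve both as the target for the classifiers and, through its mean, as a success rate.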

Exploratory Data Analysis:

  • Launch success has improved over time
  • KSC LC-39A has the highest success rate among launch sites
  • Orbits ES-L1, GEO, HEO, and SSO have a 100% success rate
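
Success rates like those above are group-wise means of a binary outcome. The sketch below shows one way to compute them with the standard library only. The helper `success_rate_by`, the field names `LaunchSite` and `Class`, and the placeholder records are all hypothetical; the numbers are not results from this project.

```python
from collections import defaultdict


def success_rate_by(records, key):
    """Return {group: fraction of records with Class == 1} grouped by `key`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [successes, launches]
    for record in records:
        totals[record[key]][0] += record["Class"]
        totals[record[key]][1] += 1
    return {group: wins / count for group, (wins, count) in totals.items()}


# Placeholder records with made-up site names, for illustration only.
records = [
    {"LaunchSite": "SITE_A", "Class": 1},
    {"LaunchSite": "SITE_A", "Class": 0},
    {"LaunchSite": "SITE_B", "Class": 1},
    {"LaunchSite": "SITE_B", "Class": 1},
]
rates = success_rate_by(records, "LaunchSite")
```

The same helper works for any categorical field, such as orbit type. A 100% rate for a group with very few launches should be read with caution.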

Visualization / Analytics:

  • Most launch sites are near the equator, and all are close to the coast
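
Proximity to a geographical marker such as a coastline can be quantified with a great-circle distance. The following is a minimal sketch using the haversine formula. Whether the notebooks use this exact formula is an assumption; the function name and the Earth radius of 6371 km are illustrative choices.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# One degree of longitude along the equator, as a sanity check.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Given a launch site's coordinates and the nearest coastline point, the same call would return the site's distance to the coast.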

Predictive Analytics

  • All models performed similarly on the test set. The decision tree model scored slightly higher than the others on .best_score_, the mean cross-validated score of the best estimator found during hyperparameter search
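
The distinction between test-set accuracy and .best_score_ can be seen in a short scikit-learn sketch. It runs on synthetic data from `make_classification`, not the SpaceX dataset. The parameter grids, the split sizes, and the pipeline step names are assumptions chosen for brevity, not the settings used in this repository.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch features and landing labels.
X, y = make_classification(n_samples=90, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Model name -> (estimator, illustrative hyperparameter grid).
candidates = {
    "logistic regression": (LogisticRegression(max_iter=1000),
                            {"model__C": [0.01, 0.1, 1.0]}),
    "svm": (SVC(), {"model__C": [0.1, 1.0], "model__kernel": ["linear", "rbf"]}),
    "decision tree": (DecisionTreeClassifier(random_state=0),
                      {"model__max_depth": [2, 4, 6]}),
    "knn": (KNeighborsClassifier(), {"model__n_neighbors": [3, 5, 7]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # Scaling lives inside the pipeline so each CV fold is scaled on its own.
    pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
    search = GridSearchCV(pipe, grid, cv=5)
    search.fit(X_train, y_train)
    # best_score_: mean CV accuracy on the training folds.
    # score(...): accuracy of the refit best estimator on the held-out test set.
    results[name] = (search.best_score_, search.score(X_test, y_test))
```

Because .best_score_ is computed on cross-validation folds of the training data while the second number comes from a small held-out set, models can tie on the test set yet differ on .best_score_.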
