aws-glue-data-catalog

Here are 24 public repositories matching this topic...

j3-signalroom / apache_flink-kickstarter

Examples of Apache Flink® v2.1 applications showcasing the DataStream API, Table API in Java and Python, and Flink SQL, featuring AWS, GitHub, Terraform, Streamlit, and Apache Iceberg.

Updated Nov 29, 2025
Java

aws-samples / automated-datastore-discovery-with-aws-glue

Star

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated Apr 10, 2025
Python

DivineSamOfficial / SmartCityProject

Star

Smart City Realtime Data Engineering Project

python aws kafka aws-s3 pyspark spark-streaming aws-ec2 aws-athena aws-redshift aws-glue aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

BahBosque / delta-to-iceberg-aws-glue

Star

Tool to migrate Delta Lake tables to Apache Iceberg using AWS Glue and S3

open-source aws spark data-lake migration-tool apache-iceberg delta-lake aws-glue-data-catalog

Updated May 22, 2025

shiv-rna / Youtube-Data-Engineering-Pipeline

Star

This project repo 📺 offers a robust solution meticulously crafted to efficiently manage, process, and analyze YouTube video data leveraging the power of AWS services. Whether you're diving into structured statistics or exploring the nuances of trending key metrics, this pipeline is engineered to handle it all with finesse.

aws youtube aws-lambda aws-s3 aws-cli data-engineering aws-iam aws-athena aws-glue data-engineering-pipeline aws-quicksight aws-glue-data-catalog

Updated Mar 20, 2024
Python

ShubhamMohanty680 / Spotify_end_to_end_data_engineering

Star

It is a project build using ETL(Extract, Transform, Load) pipeline using Spotify API on AWS.

python aws aws-lambda aws-s3 spotify-api data-engineering aws-athena data-engineering-pipeline spotipy-library aws-glue-crawler awscloudwatch aws-glue-data-catalog aws-trigger

Updated Jan 22, 2025
Jupyter Notebook

j3-signalroom / supercharge_streamlit-apache_flink

Star

Engaging, interactive visualizations crafted with Streamlit, seamlessly powered by Apache Flink in batch mode to reveal deep insights from data.

kafka apache-flink flink iceberg apache-iceberg flink-sql streamlit streamlit-dashboard pyflink aws-glue-data-catalog

Updated Dec 1, 2024
Python

ablange / aws-data-lake

Star

Prototype of AWS data lake reference implementation written in Python and Spark: https://aws.amazon.com/solutions/implementations/data-lake-solution/

python aws sql spark aws-s3 aws-sns aws-cloudformation aws-dynamodb aws-athena aws-lambda-python aws-glue aws-glue-data-catalog

Updated Apr 13, 2025
Python

subhamay-cloudworks / 0090-deutzia-cft

Sponsor

Star

Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Stream, Lambda, S3, Glue and Athena and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

subhamay-cloudworks / 0052-agapanthus-cft

Sponsor

Star

Working with Glue Data Catalog and Running the Glue Crawler On Demand

aws-cloudformation aws-glue aws-glue-crawler aws-iam-roles aws-iam-policies aws-glue-data-catalog

Updated May 11, 2023

SadafAsad / LinkedIn-Jobs-Analysis

Star

Unveiling job market trends with Scrapy and AWS

python aws-s3 scrapy aws-ec2 aws-athena aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated Apr 5, 2024
Python

j3-signalroom / ccaf-tableflow-aws_glue-snowflake-kickstarter

Star

This project demonstrates how to use Terraform to enable Tableflow in Kafka to generate and store the Iceberg Table files in an AWS S3 bucket. Then, configure Snowflake to read the Iceberg Tables using AWS Glue Data Catalog and the AWS S3 bucket where Tableflow produces the Iceberg files.

snowflake amazon-s3 confluent-kafka confluent-cloud aws-glue-data-catalog confluent-flink confluent-tableflow

Updated Nov 22, 2025
HCL

ev2900 / Iceberg_Glue_register_table

Star

Example using the Iceberg register_table command with AWS Glue and Glue Data Catalog

aws glue iceberg aws-glue apache-iceberg aws-glue-data-catalog

Updated Nov 24, 2025
Python

ShreyasLengade / serverless_etl_pipeline

Star

Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.

aws-lambda aws-s3 data-engineering aws-kinesis aws-glue data-engineering-pipeline aws-glue-crawler aws-grafana aws-glue-data-catalog

Updated Jun 25, 2024
Python

deept-agl / Youtube-data-ETL-Analysis-using-AWS

Star

This project creates a scalable data pipeline to analyze YouTube data from Kaggle using AWS services: S3, Glue, Lambda, Athena, and QuickSight. It processes raw JSON and CSV files into cleansed, partitioned datasets, integrates them with ETL workflows, and catalogs data for querying. Final insights are visualized in QuickSight dashboards.

aws-lambda athena aws-s3 aws-glue quicksight aws-glue-data-catalog

Updated Jan 25, 2025
Python

joaovnovais / pipeline_AWSGlue_PySpark

Star

This project showcases a complete data engineering pipeline on AWS, following best practices in data ingestion, transformation, and analytics — ready for real-world production use or integration with BI tools such as QuickSight or Power BI.

aws-s3 pyspark aws-athena aws-glue-crawler aws-glue-data-catalog

Updated Oct 18, 2025

Tomaslopera / E_Commerce_DW

Star

sql etl crontab aws-s3 aws-ec2 aws-rds powerbi aws-redshift prefect aws-glue-crawler aws-glue-data-catalog

Updated Oct 23, 2025
Jupyter Notebook

harika-majji / aws-stock-market-analysis

Star

python kafka aws-s3 aws-ec2 aws-glue-crawler aws-glue-data-catalog

Updated Apr 1, 2025
Jupyter Notebook

joaovnovais / terraform-aws-data-infra

Star

This project uses Terraform and GitHub Actions to build and validate a data infrastructure on AWS. The CI pipeline automates code verification for provisioning an S3 bucket and a Glue catalog, establishing a solid, version-controlled foundation for data engineering projects.

git ubuntu aws-s3 ci-cd github-actions wsl2 iac-terraform aws-glue-data-catalog

Updated Oct 19, 2025
HCL

SAGE-Rebirth / aws-glue-sample

Star

AWS Glue ETL Pipeline automates data extraction, transformation, and loading using AWS Glue and S3. It ingests raw data from an S3 source bucket, processes it via Glue ETL jobs, and stores the transformed data in a destination bucket. This solution enables efficient serverless data processing.

aws etl aws-s3 etl-pipeline aws-glue aws-glue-crawler aws-glue-data-catalog

Updated Mar 26, 2025

Improve this page

Add a description, image, and links to the aws-glue-data-catalog topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the aws-glue-data-catalog topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

aws-glue-data-catalog

Here are 24 public repositories matching this topic...

j3-signalroom / apache_flink-kickstarter

aws-samples / automated-datastore-discovery-with-aws-glue

DivineSamOfficial / SmartCityProject

BahBosque / delta-to-iceberg-aws-glue

shiv-rna / Youtube-Data-Engineering-Pipeline

ShubhamMohanty680 / Spotify_end_to_end_data_engineering

j3-signalroom / supercharge_streamlit-apache_flink

ablange / aws-data-lake

subhamay-cloudworks / 0090-deutzia-cft

subhamay-cloudworks / 0052-agapanthus-cft

SadafAsad / LinkedIn-Jobs-Analysis

j3-signalroom / ccaf-tableflow-aws_glue-snowflake-kickstarter

ev2900 / Iceberg_Glue_register_table

ShreyasLengade / serverless_etl_pipeline

deept-agl / Youtube-data-ETL-Analysis-using-AWS

joaovnovais / pipeline_AWSGlue_PySpark

Tomaslopera / E_Commerce_DW

harika-majji / aws-stock-market-analysis

joaovnovais / terraform-aws-data-infra

SAGE-Rebirth / aws-glue-sample

Improve this page

Add this topic to your repo