@jerryzhou196 commented on May 14, 2025

Mysteriously convenient endpoint if we had professor data...

(about 50% done) - still need to run it and also test on staging

Adds professor data ingestion and fuzzy matching

Implements endpoints and logic to ingest and process professor-course data, including JSON flattening and fuzzy matching.
Introduces middleware for admin authentication and updates routing for new admin functionality.
Adds support for environment variable to manage admin secret.
Fixes PDF parsing build flags for compatibility.
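
For reviewers, here is a rough sketch of what the JSON flattening step might look like. It is written in Python purely for illustration, and the payload shape (`professors`, `name`, `courses`) and the function name `flatten_prof_courses` are assumptions, not taken from this PR's code.

```python
from typing import Any, Iterator


def flatten_prof_courses(payload: dict[str, Any]) -> Iterator[tuple[str, str]]:
    """Yield one (professor_name, course_code) pair per taught course.

    The nested input shape is hypothetical: a top-level "professors" list
    whose entries carry a "name" and a list of "courses".
    """
    for entry in payload.get("professors", []):
        # Collapse runs of whitespace and trim the ends of the name.
        name = " ".join(str(entry.get("name", "")).split())
        if not name:
            # Skip entries with no usable professor name.
            continue
        for code in entry.get("courses", []):
            # Normalize course codes: drop spaces and lowercase.
            yield name, str(code).replace(" ", "").lower()


# Hypothetical sample payload, for illustration only.
sample = {
    "professors": [
        {"name": "  Jane   Doe ", "courses": ["CS 135", "CS 136"]},
        {"name": "", "courses": ["MATH 135"]},
    ]
}
rows = list(flatten_prof_courses(sample))
```

Flattening into plain pairs up front keeps the later matching step independent of how the source JSON happens to be nested.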

Relates to data synchronization and matching improvements.
…ofessors

Implement migration to replace materialized view with new table for professor-course associations

This migration introduces a new table `public.parsed_prof_taught_course` and replaces the materialized view `materialized.prof_teaches_course` with data seeded from reviews and the new table. It also updates the `course_search_index` and `prof_search_index` materialized views to utilize the new table structure. Additionally, it includes functions for searching courses and professors, along with triggers for refreshing related views.
Introduces normalized processing of professor-course data with JSON staging and SQL-based fuzzy matching for accuracy. Adds categorization logic to distinguish between new, ambiguous, and existing entries. Implements a new SQL table and enum for handling insertion and similarity scoring. Cleans up redundant code and simplifies data flow.
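
The new / ambiguous / existing categorization described above can be sketched roughly as follows. This is an illustration only: the PR does the fuzzy matching in SQL, whereas this sketch swaps in Python's `difflib.SequenceMatcher`, and the thresholds (`existing_at`, `ambiguous_at`) and the names `MatchCategory` and `categorize` are hypothetical.

```python
from difflib import SequenceMatcher
from enum import Enum


class MatchCategory(Enum):
    NEW = "new"
    AMBIGUOUS = "ambiguous"
    EXISTING = "existing"


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def categorize(
    name: str,
    known_names: list[str],
    existing_at: float = 0.9,   # assumed threshold, not from the PR
    ambiguous_at: float = 0.6,  # assumed threshold, not from the PR
) -> MatchCategory:
    """Bucket an incoming professor name by its best match against known names."""
    best = max((similarity(name, known) for known in known_names), default=0.0)
    if best >= existing_at:
        return MatchCategory.EXISTING
    if best >= ambiguous_at:
        # Close enough to need review, not close enough to auto-merge.
        return MatchCategory.AMBIGUOUS
    return MatchCategory.NEW
```

Keeping an explicit ambiguous bucket means near-matches can be reviewed by hand instead of being silently merged or duplicated.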

Relates to #1745912089107
@jerryzhou196 changed the title from "[FLOW - 42] add ingest endpoint" to "[FLOW-42] add ingest endpoint" on Sep 8, 2025