V6 #1100
Draft
dboskovic wants to merge 65 commits into master from v6
PapaParse V6 Refactoring Plan
Implementation Progress
The modern TypeScript implementation is now complete and ready for production use.
Overview
This document outlines the migration plan from the legacy single-file format (legacy/papaparse.js) to a modern, modular TypeScript architecture while maintaining 100% API compatibility and ensuring all tests pass.
Goals
Refactoring Strategy
Phase 1: Foundation & Performance Infrastructure
Create the foundation with performance and compatibility safeguards from day one:
File: src/types/index.ts (Legacy reference: lines 60-86)
File: src/constants/index.ts (Legacy reference: lines 65-75)
File: ci/performance-benchmark.ts
Phase 2: Core Parsing Engine (Split for Maintainability)
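To make the lexer/parser split in this phase concrete, here is a minimal sketch of the idea: one function only scans characters and emits tokens, and a second only assembles tokens into rows. It is not the code from this PR. The names CsvToken, lexCsvSketch and buildRowsSketch are hypothetical, and the sketch assumes a single-character delimiter, "\n" line endings and doubled-quote escaping.

```typescript
// Hypothetical token shape: the lexer reports fields and row boundaries only.
type CsvToken = { kind: "field"; value: string } | { kind: "eol" };

// Lexer: character scanning only; it knows nothing about rows or headers.
function lexCsvSketch(input: string, delimiter = ",", quote = '"'): CsvToken[] {
  const tokens: CsvToken[] = [];
  let field = "";
  let inQuotes = false;
  let i = 0;
  while (i < input.length) {
    const ch = input.charAt(i);
    if (inQuotes) {
      if (ch === quote && input.charAt(i + 1) === quote) {
        field += quote; // a doubled quote inside quotes is an escaped quote
        i += 2;
        continue;
      }
      if (ch === quote) inQuotes = false;
      else field += ch;
    } else if (ch === quote && field === "") {
      inQuotes = true; // a quote at the start of a field opens a quoted field
    } else if (ch === delimiter) {
      tokens.push({ kind: "field", value: field });
      field = "";
    } else if (ch === "\n") {
      tokens.push({ kind: "field", value: field }, { kind: "eol" });
      field = "";
    } else {
      field += ch;
    }
    i++;
  }
  // Flush the final field and close the final row.
  tokens.push({ kind: "field", value: field }, { kind: "eol" });
  return tokens;
}

// Parser: row construction only; it never looks at raw characters.
function buildRowsSketch(tokens: CsvToken[]): string[][] {
  const rows: string[][] = [];
  let current: string[] = [];
  for (const token of tokens) {
    if (token.kind === "field") {
      current.push(token.value);
    } else {
      rows.push(current);
      current = [];
    }
  }
  return rows;
}
```

Keeping the scanning loop free of row logic is what lets each half be tested and tuned separately, which appears to be the motivation for splitting the files.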
File: src/core/lexer.ts (Legacy reference: lines 1414-1683)
File: src/core/parser.ts (Legacy reference: lines 1684-1819)
File: src/core/errors.ts (Legacy reference: error handling throughout)
File: src/core/parser-handle.ts (Legacy reference: lines 1027-1406)
Phase 3: Heuristics & Algorithms
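Delimiter guessing is the first heuristic this phase extracts into a pure function. The sketch below shows the general shape of a field-count analysis: split a few sample lines by each candidate and prefer the candidate whose field count is most consistent. It is a simplified illustration, not the legacy algorithm: the name guessDelimiterSketch, the candidate list and the 10-line preview are assumptions, and the sketch is not quote-aware.

```typescript
// Simplified, hypothetical delimiter guess based on field-count consistency.
function guessDelimiterSketch(
  sample: string,
  candidates: string[] = [",", "\t", "|", ";"]
): string | undefined {
  // Look at a small preview of non-empty lines only.
  const lines = sample
    .split("\n")
    .filter((line) => line.length > 0)
    .slice(0, 10);

  let best: string | undefined;
  let bestDelta = Infinity;

  for (const candidate of candidates) {
    const counts = lines.map((line) => line.split(candidate).length);
    if (counts.length === 0) continue;

    const first = counts[0] ?? 0;
    const average = counts.reduce((sum, n) => sum + n, 0) / counts.length;
    // Total deviation from the first line's field count: 0 means every line agrees.
    const delta = counts.reduce((sum, n) => sum + Math.abs(n - first), 0);

    // Require more than one field on average, then prefer the most consistent candidate.
    // On ties, the earlier candidate in the list wins.
    if (average > 1 && delta < bestDelta) {
      best = candidate;
      bestDelta = delta;
    }
  }
  return best;
}
```

Because the function takes a string and returns a value without touching shared state, it can be unit-tested in isolation, which is the property the plan asks of the heuristics modules.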
File: src/heuristics/guess-delimiter.ts (Legacy reference: lines 1340-1392)
File: src/heuristics/dynamic-typing.ts (Legacy reference: lines 1253-1277)
File: src/heuristics/line-endings.ts (Legacy reference: lines 1161-1185)
Phase 4: Streaming Infrastructure
File: src/streamers/chunk-streamer.ts (Legacy reference: lines 487-563)
File: src/streamers/string-streamer.ts (Legacy reference: lines 564+)
File: src/streamers/file-streamer.ts (Legacy reference: lines 564+)
File: src/streamers/network-streamer.ts (Legacy reference: lines 564+)
File: src/streamers/readable-stream-streamer.ts (Legacy reference: lines 564+)
File: src/streamers/duplex-stream-streamer.ts (Legacy reference: lines 564-1024)
Phase 5: Core Functions
File: src/csv-to-json/index.ts (Legacy reference: lines 196-257)
CsvToJson function
File: src/json-to-csv/index.ts (Legacy reference: lines 264-484)
JsonToCsv function
Phase 6: Workers & Concurrency
File: src/workers/host.ts (Legacy reference: lines 1821-1888, 49-58)
File: src/workers/worker-entry.ts (Legacy reference: lines 1894-1920)
Phase 7: Plugin System
File: src/plugins/jquery.ts (Legacy reference: lines 88-180)
papaparse/jquery for tree-shaking
Phase 8: Public API & Compatibility
File: src/public/papa.ts - Papa object construction
File: src/utils/index.ts (Legacy reference: lines 1922-1943, 189, 1408-1412)
File: src/index.ts - Main export
Implementation Checklist
Foundation & Safety Infrastructure ✅ COMPLETED
"target": "es2018", "module": "commonjs" (updated for compatibility)
src/types/ for public API
src/constants/
src/utils/
bun run ci:foundation (passing)
Core Engine Implementation ✅ COMPLETED
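One item in this group is a module of standardized error types and factories. As a rough illustration of what such a factory could look like, the sketch below builds a field-mismatch error object with type, code, message and row properties, loosely modeled on the error objects the legacy library reports. The names CsvParseError and fieldMismatchError, and the exact message wording, are assumptions rather than the contents of src/core/errors.ts.

```typescript
// Hypothetical error categories for illustration.
type CsvErrorType = "Quotes" | "Delimiter" | "FieldMismatch";

// Hypothetical error shape: a category, a machine-readable code,
// a human-readable message, and the row the problem was found on.
interface CsvParseError {
  type: CsvErrorType;
  code: string;
  message: string;
  row?: number;
}

// Factory: builds one consistent object instead of ad hoc literals at each call site.
function fieldMismatchError(expected: number, actual: number, row: number): CsvParseError {
  const tooFew = actual < expected;
  return {
    type: "FieldMismatch",
    code: tooFew ? "TooFewFields" : "TooManyFields",
    message: `Too ${tooFew ? "few" : "many"} fields: expected ${expected} fields but parsed ${actual}`,
    row,
  };
}
```

Centralizing construction this way keeps codes and message formats identical wherever an error is raised, which matters when tests compare error output between versions.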
src/core/lexer.ts - Pure byte/character scanning with tight loops
src/core/parser.ts - Row construction and field validation
src/core/errors.ts - Standardized error types and factories
src/core/parser-handle.ts - High-level orchestration
Algorithms & Coordination ✅ COMPLETED
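Dynamic typing, one of the heuristics in this group, converts string fields into booleans, numbers or null. The sketch below is a hedged illustration of that kind of conversion, not the module's code: convertDynamicSketch and FLOAT_PATTERN are hypothetical names, date detection is left out to keep it short, and the safe-integer guard is an assumption about how precision loss might be avoided.

```typescript
type DynamicValue = string | number | boolean | null;

// Matches plain decimal numbers with an optional sign and exponent.
const FLOAT_PATTERN = /^\s*-?(\d+\.?|\.\d+|\d+\.\d+)([eE][-+]?\d+)?\s*$/;

// Hypothetical conversion of a single field (date detection omitted).
function convertDynamicSketch(value: string): DynamicValue {
  if (value === "true" || value === "TRUE") return true;
  if (value === "false" || value === "FALSE") return false;
  if (value === "") return null;

  if (FLOAT_PATTERN.test(value)) {
    const parsed = parseFloat(value);
    // Keep very large numbers as strings so no digits are silently lost.
    if (Number.isFinite(parsed) && Math.abs(parsed) <= Number.MAX_SAFE_INTEGER) {
      return parsed;
    }
  }
  return value; // anything unrecognized stays a string
}
```

For example, under these assumptions "42" becomes the number 42, "TRUE" becomes true, an empty field becomes null, and "12abc" is returned unchanged.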
src/heuristics/guess-delimiter.ts - Pure function for field count analysis
src/heuristics/dynamic-typing.ts - Boolean, numeric, date, and null detection
src/heuristics/line-endings.ts - Quote-aware line ending detection
Streaming Infrastructure ✅ COMPLETED
src/streamers/chunk-streamer.ts - Base class and coordination
src/streamers/string-streamer.ts - String input processing
src/streamers/file-streamer.ts - File input with FileReader
src/streamers/network-streamer.ts - Remote file downloading
src/streamers/readable-stream-streamer.ts - Node.js streams
src/streamers/duplex-stream-streamer.ts - Node.js duplex streams
Core Functions ✅ COMPLETED
src/csv-to-json/index.ts - Main CsvToJson function (lines 196-257)
src/json-to-csv/index.ts - Main JsonToCsv function (lines 264-484)
Workers & Advanced Features ✅ COMPLETED
src/workers/host.ts - Main thread orchestration
src/workers/worker-entry.ts - Standalone worker entry
src/core/errors.ts - Standardized error types and factories
Public API & Integration
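The public object in this group is built as a static property bag with Object.assign. The sketch below illustrates why that pattern suits a compatibility-focused rewrite: functions and mutable settings live on one plain object, and a function that reads a setting at call time sees later changes. PapaSketch, PapaSketchApi and planChunks are hypothetical stand-ins; only the LocalChunkSize and DefaultDelimiter property names are taken from the library's public surface.

```typescript
// Hypothetical slice of the public surface, for illustration only.
interface PapaSketchApi {
  LocalChunkSize: number;
  DefaultDelimiter: string;
  planChunks(byteLength: number): number;
}

// Object.assign copies the settings onto the object that carries the functions,
// producing a single mutable "property bag".
const PapaSketch: PapaSketchApi = Object.assign(
  {
    // Reads LocalChunkSize when called, not when the object is created,
    // so a caller who changes the setting affects subsequent work.
    planChunks(byteLength: number): number {
      return Math.ceil(byteLength / PapaSketch.LocalChunkSize);
    },
  },
  {
    LocalChunkSize: 10 * 1024 * 1024, // 10 MB
    DefaultDelimiter: ",",
  }
);
```

With these values, a 25 MB input plans 3 chunks; after setting PapaSketch.LocalChunkSize to 5 MB the same call plans 5. Capturing the setting in a closure or a frozen module constant would break that behavior.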
src/public/papa.ts - Static property bag pattern with Object.assign
src/index.ts - UMD wrapper adaptation
Plugin System ✅ COMPLETED
src/plugins/jquery.ts - Optional integration as sub-package with exact legacy behavior
src/plugins/index.ts - Tree-shakable plugin registry for extensibility
File Structure
Testing Strategy
Compatibility Testing
(tests/test-cases.js, tests/node-tests.js) against new implementation
Migration Testing
Integration Testing
Migration Path for Users
Phase A: Parallel Implementation
Phase B: Soft Migration
Phase C: Full Migration
Success Criteria
Safeguards
Performance Protection
API Compatibility Protection
Object.keys(Papa) must match between versions
require('papaparse').parse === require('papaparse').parse
Papa.parse('', {dynamicTyping: true}).data returns [[""]]
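The key-matching check above can be sketched as a small comparison helper. This is an illustration under assumptions, not the PR's test code: diffApiSurface and SurfaceDiff are hypothetical names, and how the legacy and modern builds are loaded side by side is not shown in the plan.

```typescript
// Hypothetical result shape: keys present in only one of the two objects.
interface SurfaceDiff {
  missing: string[]; // in legacy but not in modern
  added: string[]; // in modern but not in legacy
}

// Compares the enumerable own keys of two API objects.
function diffApiSurface(legacy: object, modern: object): SurfaceDiff {
  const legacyKeys = Object.keys(legacy);
  const modernKeys = Object.keys(modern);
  return {
    missing: legacyKeys.filter((key) => modernKeys.indexOf(key) === -1).sort(),
    added: modernKeys.filter((key) => legacyKeys.indexOf(key) === -1).sort(),
  };
}

// Convenience predicate: true only when both key sets are identical.
function surfacesMatch(legacy: object, modern: object): boolean {
  const diff = diffApiSurface(legacy, modern);
  return diff.missing.length === 0 && diff.added.length === 0;
}
```

Reporting both directions is deliberate: a key added by accident changes Object.keys(Papa) just as much as a key that was dropped.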
Breaking Change Traps to Avoid
Papa.WORKER_ID global
Papa.LocalChunkSize after parse() starts must affect subsequent files
Architecture Benefits
Success Metrics
=== comparison passes
🧪 CI Testing Infrastructure (Phase 1 Complete)
The following testing infrastructure has been implemented and is ready for use:
Performance Benchmarking
bun run ci:benchmark # Run performance benchmarks
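A regression benchmark of this kind usually comes down to timing the same workload on both implementations and comparing against a tolerance. The sketch below shows that shape only. medianMs and isRegression are hypothetical helpers, and the 5% tolerance is an assumption, since the plan's actual threshold is not visible here.

```typescript
// Runs a workload several times and returns the median wall-clock time in milliseconds.
function medianMs(run: () => void, iterations = 5): number {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    run();
    samples.push(Date.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)] ?? 0;
}

// Flags a regression when the candidate is slower than the baseline by more
// than the tolerance. The 5% default is an assumed value, not the plan's.
function isRegression(baselineMs: number, candidateMs: number, tolerance = 0.05): boolean {
  return candidateMs > baselineMs * (1 + tolerance);
}
```

The median is used instead of the mean so that a single slow run, for example one interrupted by garbage collection, does not decide the result.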
Golden Output Snapshots
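Golden-output testing can be sketched as two steps: record the serialized output of the reference implementation for a set of inputs, then replay the same inputs through the new implementation and list any that differ. The helpers below are a hypothetical illustration; ParseFn, generateSnapshots and validateSnapshots are not names from this PR, and the snapshots are kept in memory here instead of on disk.

```typescript
// Any function that turns CSV text into a JSON-serializable result.
type ParseFn = (input: string) => unknown;

// Step 1: record the reference implementation's output for each input.
function generateSnapshots(parseFn: ParseFn, inputs: string[]): Record<string, string> {
  const snapshots: Record<string, string> = {};
  for (const input of inputs) {
    snapshots[input] = JSON.stringify(parseFn(input));
  }
  return snapshots;
}

// Step 2: replay each recorded input and return the inputs whose output changed.
function validateSnapshots(parseFn: ParseFn, baseline: Record<string, string>): string[] {
  return Object.keys(baseline).filter(
    (input) => JSON.stringify(parseFn(input)) !== baseline[input]
  );
}
```

An empty array from validateSnapshots means the new implementation reproduced every recorded output exactly.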
API Surface Reflection Testing
bun run ci:api-test # Run API compatibility tests
Object.keys(Papa) matches exactly between versions
Papa.parse('', {dynamicTyping: true}).data returns [[""]]
Foundation Testing
npm Scripts Available
bun run ci:foundation - Foundation infrastructure tests (✅ passing)
bun run ci:benchmark - Performance regression testing
bun run ci:snapshots:generate - Create baseline snapshots
bun run ci:snapshots:validate - Validate compatibility
bun run ci:api-test - API surface testing
bun run ci:all - Complete test suite
bun run refactor:test - Alias for foundation tests