Gofeed v2 - Proposed Changes & Implementation Progress
This issue pulls together thoughts from a number of existing issues into a plan for a v2 of gofeed. It outlines ideas for improving the library: addressing current limitations, making data access easier, and refreshing the API. Since this is a v2, breaking changes are on the table if they lead to a better overall experience.
The aim is to focus on the following core themes:
- Enhanced Data Access & Preservation: Ensuring users can access all data present in a feed, including format-specific fields, the original parsed structure, and relevant HTTP response metadata.
- Comprehensive & Unified Custom/Extension Element Handling: Robustly parsing and exposing all XML/JSON elements, including those not part of standard feed specifications or common extensions, in a structured and navigable way.
- Improved Parser Configuration & Control: Providing users with more granular control over the parsing process, including performance tuning, strictness, HTTP request parameters, and conditional fetching.
Implementation Checklist
Core API Changes
Related issues: #244, #251, #205, #82, #246, #229, #235, #228
- Remove Item.Custom field, enhance Extensions for structured data
- Implement ParseOptions foundation with RequestOptions sub-struct
- Context-first ParseURL API - removed ParseURLWithContext
- Expose original format-specific feed data via KeepOriginalFeed
Parser Improvements
- Update format-specific parsers with public constructors
- Add strictness and robustness controls
- ParseDates toggle implementation
Streaming Support
Related issues: #256
- Update feed detection to use fixed buffer instead of reading entire file
- Implement streaming parse methods that return an iterator/channel of items
- Add MaxItems support that actually stops reading (not just skipping)
Network & HTTP
Related issues: #247, #111, #165
- HTTP response metadata (ETag, Last-Modified, Cache-Control)
- Conditional request support (If-None-Match, If-Modified-Since)
- Rate limiting support via Retry-After header
- Custom HTTP client configuration
Architecture
- Refactor translator interfaces for type safety
- Comprehensive error handling system with typed errors
Dependencies & Module Structure
- Keep ftest in main module based on community feedback
- Remove unnecessary dependencies (json-iterator, goquery)
Key Design Decisions
1. ParseOptions Structure
Implemented as a single struct with sub-structs for organization:
type ParseOptions struct {
	// Core parsing options
	KeepOriginalFeed  bool
	ParseDates        bool
	StrictnessOptions StrictnessOptions

	// HTTP request configuration
	RequestOptions RequestOptions
}

// RequestOptions groups the settings that only apply when fetching over HTTP.
type RequestOptions struct {
	UserAgent       string
	Timeout         time.Duration
	IfNoneMatch     string    // For conditional requests
	IfModifiedSince time.Time // For conditional requests
	Client          *http.Client
	AuthConfig      *Auth
}
All parsing methods accept *ParseOptions, which can be nil for defaults. Decided against variadic options for simplicity.
2. Extension System
The new extension system replaces the limited map[string]string with a structured approach:
// Access custom elements
weight := item.GetCustomValue("weight")
// Access with attributes
ext := item.GetExtension("_custom", "priority")
if len(ext) > 0 {
	value := ext[0].Value
	level := ext[0].Attrs["level"]
}
Non-namespaced elements in RSS/Atom are stored under the "_custom" namespace to avoid conflicts.
3. API Consistency
All parse methods now have consistent signatures:
Parse(reader io.Reader, opts *ParseOptions) (*Feed, error)
ParseString(str string, opts *ParseOptions) (*Feed, error)
ParseURL(ctx context.Context, url string, opts *ParseOptions) (*Feed, error)
Context is required for ParseURL to follow modern Go practices.
4. Streaming Considerations
The current feed type detection loads the entire feed into memory. The plan is to:
- Use fixed-size buffer (e.g., 8KB) for type detection
- Reconstruct complete reader using io.MultiReader
- Enable true streaming for large feeds
- Implement iterator pattern for processing items without loading all into memory
- Add MaxItems support that actually stops reading when limit reached
5. Error Handling Philosophy
Moving toward typed errors with context:
- Parse location (line/column when available)
- Field that caused the error
- Strictness-aware (warnings in lenient mode become errors in strict mode)
- Categories: Parse, Validation, Network, Format, Extension errors
Migration Guide
Key breaking changes for v2:
// Old (v1)
parser.UserAgent = "MyApp"
feed, _ := parser.Parse(reader)
feed, _ := parser.ParseURL(url)
value := item.Custom["field"]
// New (v2)
opts := &gofeed.ParseOptions{
	RequestOptions: gofeed.RequestOptions{
		UserAgent: "MyApp",
	},
}
feed, _ := parser.Parse(reader, opts) // or nil
feed, _ := parser.ParseURL(context.Background(), url, opts)
value := item.GetCustomValue("field")
Feedback
Please give feedback or suggested changes to the plan for gofeed v2.
@infogulch @cristoper @spacecowboy and others, if you have suggestions for other changes, or comments about the above, let me know.