RFC: gofeed v2 – Proposed Changes #241

@mmcdole

Gofeed v2 - Proposed Changes & Implementation Progress

I'm trying to pull together thoughts from a number of issues into a plan for gofeed v2. This document outlines ideas for improving the library: addressing current limitations, making data access easier, and refreshing the API. Since it's a v2, breaking changes are on the table if they lead to a better overall experience.

The aim is to focus on the following core themes:

  1. Enhanced Data Access & Preservation: Ensuring users can access all data present in a feed, including format-specific fields, the original parsed structure, and relevant HTTP response metadata.
  2. Comprehensive & Unified Custom/Extension Element Handling: Robustly parsing and exposing all XML/JSON elements, including those not part of standard feed specifications or common extensions, in a structured and navigable way.
  3. Improved Parser Configuration & Control: Providing users with more granular control over the parsing process, including performance tuning, strictness, HTTP request parameters, and conditional fetching.

Implementation Checklist

Core API Changes

Related issues: #244, #251, #205, #82, #246, #229, #235, #228

  • Remove Item.Custom field, enhance Extensions for structured data
  • Implement ParseOptions foundation with RequestOptions sub-struct
  • Context-first ParseURL API - removed ParseURLWithContext
  • Expose original format-specific feed data via KeepOriginalFeed

Parser Improvements

Related issues: #248, #250

  • Update format-specific parsers with public constructors
  • Add strictness and robustness controls
  • ParseDates toggle implementation

Streaming Support

Related issues: #256

  • Update feed detection to use fixed buffer instead of reading entire file
  • Implement streaming parse methods that return an iterator/channel of items
  • Add MaxItems support that actually stops reading (not just skipping)
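As one possible shape for the streaming items above, here is a minimal sketch built on the standard library's encoding/xml token decoder. The names streamItems, rssItem, and collectTitles are hypothetical and not part of gofeed; the callback form is used as a stand-in for whatever iterator or channel API v2 ends up with.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// rssItem is a hypothetical, trimmed-down item type used only in this sketch.
type rssItem struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// streamItems decodes <item> elements one at a time and passes each to yield.
// It returns as soon as maxItems items have been delivered (maxItems <= 0
// means no limit) or yield returns false; no further tokens are decoded.
func streamItems(r io.Reader, maxItems int, yield func(rssItem) bool) error {
	dec := xml.NewDecoder(r)
	count := 0
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != "item" {
			continue
		}
		var it rssItem
		if err := dec.DecodeElement(&it, &start); err != nil {
			return err
		}
		count++
		if !yield(it) || (maxItems > 0 && count >= maxItems) {
			return nil // stop early instead of skipping the remaining items
		}
	}
}

// collectTitles is a demo helper: it gathers the titles of at most maxItems items.
func collectTitles(feed string, maxItems int) ([]string, error) {
	var titles []string
	err := streamItems(strings.NewReader(feed), maxItems, func(it rssItem) bool {
		titles = append(titles, it.Title)
		return true
	})
	return titles, err
}

const sampleFeed = `<rss version="2.0"><channel><title>Demo</title>
<item><title>One</title><link>http://example.com/1</link></item>
<item><title>Two</title><link>http://example.com/2</link></item>
<item><title>Three</title><link>http://example.com/3</link></item>
</channel></rss>`

func main() {
	titles, err := collectTitles(sampleFeed, 2)
	fmt.Println(titles, err)
}
```

Because the loop returns before the third item is decoded, anything after the limit is never parsed, which is the difference between stopping and merely skipping.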

Network & HTTP

Related issues: #247, #111, #165

  • HTTP response metadata (ETag, Last-Modified, Cache-Control)
  • Conditional request support (If-None-Match, If-Modified-Since)
  • Rate limiting support via Retry-After header
  • Custom HTTP client configuration
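To make the conditional-request item concrete, here is a rough sketch using only net/http. The names fetchConditional, responseMeta, and demoRoundTrip are hypothetical; how gofeed v2 will actually surface this metadata is not decided here.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// responseMeta holds the caching headers worth persisting between fetches.
// The type and its fields are illustrative, not part of gofeed.
type responseMeta struct {
	ETag         string
	LastModified string
	CacheControl string
}

// fetchConditional issues a GET with optional validators and reports
// whether the server answered 304 Not Modified.
func fetchConditional(ctx context.Context, c *http.Client, url, etag string, since time.Time) ([]byte, responseMeta, bool, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, responseMeta{}, false, err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	if !since.IsZero() {
		req.Header.Set("If-Modified-Since", since.UTC().Format(http.TimeFormat))
	}
	resp, err := c.Do(req)
	if err != nil {
		return nil, responseMeta{}, false, err
	}
	defer resp.Body.Close()

	meta := responseMeta{
		ETag:         resp.Header.Get("ETag"),
		LastModified: resp.Header.Get("Last-Modified"),
		CacheControl: resp.Header.Get("Cache-Control"),
	}
	switch resp.StatusCode {
	case http.StatusNotModified:
		return nil, meta, true, nil // nothing to parse; the cached copy is still valid
	case http.StatusOK:
		body, err := io.ReadAll(resp.Body)
		return body, meta, false, err
	default:
		return nil, meta, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// demoRoundTrip starts a throwaway test server and performs two fetches:
// one unconditional, then one that replays the ETag from the first response.
func demoRoundTrip() (int, bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const tag = `"v1"`
		w.Header().Set("ETag", tag)
		if r.Header.Get("If-None-Match") == tag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		io.WriteString(w, "<rss/>")
	}))
	defer srv.Close()

	ctx := context.Background()
	body, meta, _, err := fetchConditional(ctx, srv.Client(), srv.URL, "", time.Time{})
	if err != nil {
		return 0, false, err
	}
	_, _, notModified, err := fetchConditional(ctx, srv.Client(), srv.URL, meta.ETag, time.Time{})
	return len(body), notModified, err
}

func main() {
	n, notModified, err := demoRoundTrip()
	fmt.Println(n, notModified, err)
}
```

A caller would store the returned metadata alongside the parsed feed and pass the validators back on the next poll.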

Architecture

Related issues: #249, #255

  • Refactor translator interfaces for type safety
  • Comprehensive error handling system with typed errors

Dependencies & Module Structure

Related issues: #128, #254

  • Keep ftest in main module based on community feedback
  • Remove unnecessary dependencies (json-iterator, goquery)

Key Design Decisions

1. ParseOptions Structure

Implemented as a single struct with sub-structs for organization:

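import (
    "net/http"
    "time"
)

type ParseOptions struct {
    // Core parsing options
    KeepOriginalFeed  bool
    ParseDates        bool
    StrictnessOptions StrictnessOptions

    // HTTP request configuration
    RequestOptions RequestOptions
}

// RequestOptions is declared separately; Go does not allow a named
// struct type to be defined inline inside another struct's field list.
type RequestOptions struct {
    UserAgent       string
    Timeout         time.Duration
    IfNoneMatch     string // For conditional requests
    IfModifiedSince time.Time
    Client          *http.Client
    AuthConfig      *Auth
}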

All parsing methods accept a *ParseOptions, which can be nil to use the defaults. I decided against variadic (functional) options for simplicity.
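One way the nil-means-defaults rule could be handled internally is sketched below. The helper resolveOptions and every default value in it (user agent string, 30-second timeout, date parsing on) are assumptions for illustration, not gofeed's actual defaults, and the structs are trimmed to the fields the sketch needs.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Trimmed copies of the option structs, just enough for this sketch.
type RequestOptions struct {
	UserAgent string
	Timeout   time.Duration
	Client    *http.Client
}

type ParseOptions struct {
	KeepOriginalFeed bool
	ParseDates       bool
	RequestOptions   RequestOptions
}

// resolveOptions is a hypothetical helper: it returns a copy of opts with
// unset request fields filled in. A nil opts yields all defaults.
func resolveOptions(opts *ParseOptions) ParseOptions {
	if opts == nil {
		opts = &ParseOptions{ParseDates: true} // assumed default: date parsing on
	}
	resolved := *opts // work on a copy so the caller's struct is never mutated
	req := &resolved.RequestOptions
	if req.UserAgent == "" {
		req.UserAgent = "gofeed-sketch/2.0" // placeholder value
	}
	if req.Timeout == 0 {
		req.Timeout = 30 * time.Second // placeholder value
	}
	if req.Client == nil {
		req.Client = &http.Client{Timeout: req.Timeout}
	}
	return resolved
}

func main() {
	d := resolveOptions(nil)
	fmt.Println(d.ParseDates, d.RequestOptions.UserAgent, d.RequestOptions.Timeout)

	c := resolveOptions(&ParseOptions{RequestOptions: RequestOptions{UserAgent: "MyApp"}})
	fmt.Println(c.ParseDates, c.RequestOptions.UserAgent)
}
```

One wrinkle worth noting: with a plain bool, a non-nil struct cannot distinguish "unset" from "false", so this sketch can only apply the ParseDates default when opts is nil.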

2. Extension System

The new extension system replaces the limited map[string]string of Item.Custom with a structured approach:

// Access custom elements
weight := item.GetCustomValue("weight")

// Access with attributes
ext := item.GetExtension("_custom", "priority")
if len(ext) > 0 {
    value := ext[0].Value
    level := ext[0].Attrs["level"]
}

Non-namespaced elements in RSS/Atom are stored under the "_custom" namespace to avoid conflicts.
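For readers who want to see how these accessors could sit on top of a nested extension map, here is a self-contained sketch. The Extension shape (Name, Value, Attrs, Children) follows the general layout of the existing extensions package, but the types are redeclared locally and the method bodies are assumptions, not the actual v2 implementation.

```go
package main

import "fmt"

// Extension is a local stand-in for a parsed extension element.
type Extension struct {
	Name     string
	Value    string
	Attrs    map[string]string
	Children map[string][]Extension
}

// Extensions maps namespace prefix -> element name -> occurrences.
type Extensions map[string]map[string][]Extension

// Item is trimmed to the one field this sketch needs.
type Item struct {
	Extensions Extensions
}

// customNS is the namespace used for non-namespaced elements.
const customNS = "_custom"

// GetExtension returns every occurrence of ns:name, or nil if there is none.
// Indexing a nil map is safe in Go, so missing namespaces need no special case.
func (i *Item) GetExtension(ns, name string) []Extension {
	if i == nil {
		return nil
	}
	return i.Extensions[ns][name]
}

// GetCustomValue returns the text of the first non-namespaced element
// with the given name, or "" if it is absent.
func (i *Item) GetCustomValue(name string) string {
	ext := i.GetExtension(customNS, name)
	if len(ext) == 0 {
		return ""
	}
	return ext[0].Value
}

// sampleItem builds the item that a feed entry such as
//   <weight>12</weight><priority level="high">urgent</priority>
// might produce under this scheme.
func sampleItem() *Item {
	return &Item{Extensions: Extensions{
		customNS: {
			"weight":   {{Name: "weight", Value: "12"}},
			"priority": {{Name: "priority", Value: "urgent", Attrs: map[string]string{"level": "high"}}},
		},
	}}
}

func main() {
	item := sampleItem()
	fmt.Println(item.GetCustomValue("weight"))
	if ext := item.GetExtension(customNS, "priority"); len(ext) > 0 {
		fmt.Println(ext[0].Value, ext[0].Attrs["level"])
	}
}
```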

3. API Consistency

All parse methods now have consistent signatures:

  • Parse(reader io.Reader, opts *ParseOptions) (*Feed, error)
  • ParseString(str string, opts *ParseOptions) (*Feed, error)
  • ParseURL(ctx context.Context, url string, opts *ParseOptions) (*Feed, error)

Context is required for ParseURL to follow modern Go practices.
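The practical benefit of the context parameter is that callers can cancel or time-box a fetch. The sketch below shows the idea with plain net/http; fetch is a hypothetical stand-in for the network half of ParseURL, and the URL is a placeholder that is never actually contacted.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
)

// fetch is a hypothetical stand-in for the network half of ParseURL.
// Binding ctx to the request lets the caller abort it at any time.
func fetch(ctx context.Context, client *http.Client, url string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// canceledBeforeStart reports whether a fetch made with an already-canceled
// context fails with context.Canceled instead of touching the network.
func canceledBeforeStart() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel up front to keep the demo deterministic
	_, err := fetch(ctx, http.DefaultClient, "http://example.invalid/feed.xml")
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println(canceledBeforeStart())
}
```

In real use the caller would more likely pass a context from context.WithTimeout or an incoming request, so that feed fetches are abandoned when the surrounding work is.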

4. Streaming Considerations

Detection currently loads the entire feed into memory. The plan is to:

  1. Use fixed-size buffer (e.g., 8KB) for type detection
  2. Reconstruct complete reader using io.MultiReader
  3. Enable true streaming for large feeds
  4. Implement iterator pattern for processing items without loading all into memory
  5. Add MaxItems support that actually stops reading when limit reached
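Steps 1 and 2 above can be sketched with the standard library alone. The names sniffFeedType, classify, sniffString, and bigFeed are hypothetical, and the classification logic is a simplified assumption (first start element, or a leading '{' for JSON), not gofeed's actual detector.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

const sniffSize = 8 * 1024 // fixed detection buffer, per step 1

// sniffFeedType reads at most sniffSize bytes, guesses the format, and
// returns a reader that replays those bytes followed by the unread rest.
func sniffFeedType(r io.Reader) (string, io.Reader, error) {
	buf := make([]byte, sniffSize)
	n, err := io.ReadFull(r, buf)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return "", nil, err
	}
	head := buf[:n] // short inputs simply yield fewer bytes
	return classify(head), io.MultiReader(bytes.NewReader(head), r), nil
}

// classify is a deliberately simple heuristic for this sketch.
func classify(head []byte) string {
	trimmed := bytes.TrimSpace(bytes.TrimPrefix(head, []byte("\xef\xbb\xbf")))
	if len(trimmed) > 0 && trimmed[0] == '{' {
		return "json"
	}
	dec := xml.NewDecoder(bytes.NewReader(trimmed))
	// Element names are ASCII, so pass other declared encodings through untouched.
	dec.CharsetReader = func(_ string, in io.Reader) (io.Reader, error) { return in, nil }
	for {
		tok, err := dec.Token()
		if err != nil {
			return "unknown"
		}
		if se, ok := tok.(xml.StartElement); ok {
			switch se.Name.Local {
			case "rss", "RDF":
				return "rss"
			case "feed":
				return "atom"
			}
			return "unknown"
		}
	}
}

// sniffString is a demo helper: it sniffs s and drains the returned reader
// to show that no bytes were lost to detection.
func sniffString(s string) (string, string, error) {
	kind, r, err := sniffFeedType(strings.NewReader(s))
	if err != nil {
		return "", "", err
	}
	all, err := io.ReadAll(r)
	return kind, string(all), err
}

// bigFeed builds a feed well over sniffSize bytes.
func bigFeed() string {
	return `<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel>` +
		strings.Repeat("<item><title>x</title></item>", 1000) + "</channel></rss>"
}

func main() {
	feed := bigFeed()
	kind, replayed, err := sniffString(feed)
	fmt.Println(kind, len(feed), replayed == feed, err)
}
```

The reader returned by sniffFeedType can be handed directly to a streaming parser, since only the first 8KB was ever buffered.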

5. Error Handling Philosophy

Moving toward typed errors with context:

  • Parse location (line/column when available)
  • Field that caused the error
  • Strictness-aware (warnings in lenient mode become errors in strict mode)
  • Categories: Parse, Validation, Network, Format, Extension errors
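A rough sketch of what such typed errors might look like follows. FeedError, ErrorCategory, reporter, checkDate, and fieldOf are all hypothetical names; the point is only to show location and field context, errors.As compatibility, and the strict/lenient switch working together.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrorCategory groups errors; only two of the proposed categories are shown.
type ErrorCategory string

const (
	CategoryParse      ErrorCategory = "parse"
	CategoryValidation ErrorCategory = "validation"
)

// FeedError is a hypothetical typed error carrying parse context.
type FeedError struct {
	Category ErrorCategory
	Field    string
	Line     int // 0 when the location is unknown
	Column   int
	Err      error
}

func (e *FeedError) Error() string {
	loc := ""
	if e.Line > 0 {
		loc = fmt.Sprintf(" at %d:%d", e.Line, e.Column)
	}
	return fmt.Sprintf("%s error in %q%s: %v", e.Category, e.Field, loc, e.Err)
}

// Unwrap exposes the underlying cause to errors.Is and errors.As.
func (e *FeedError) Unwrap() error { return e.Err }

// reporter applies the strictness rule: in strict mode a problem is returned
// as an error; in lenient mode it is recorded as a warning and parsing continues.
type reporter struct {
	strict   bool
	warnings []*FeedError
}

func (r *reporter) report(e *FeedError) error {
	if r.strict {
		return e
	}
	r.warnings = append(r.warnings, e)
	return nil
}

// checkDate validates an RFC 3339 date and routes failures through the reporter.
func checkDate(r *reporter, field, value string, line int) error {
	if _, err := time.Parse(time.RFC3339, value); err != nil {
		return r.report(&FeedError{Category: CategoryValidation, Field: field, Line: line, Column: 1, Err: err})
	}
	return nil
}

// fieldOf extracts the offending field from any error chain containing a *FeedError.
func fieldOf(err error) string {
	var fe *FeedError
	if errors.As(err, &fe) {
		return fe.Field
	}
	return ""
}

func main() {
	strict := &reporter{strict: true}
	err := checkDate(strict, "published", "not-a-date", 12)
	fmt.Println(fieldOf(err), err != nil)

	lenient := &reporter{}
	err = checkDate(lenient, "published", "not-a-date", 12)
	fmt.Println(err == nil, len(lenient.warnings))
}
```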

Migration Guide

Key breaking changes for v2:

// Old (v1)
parser.UserAgent = "MyApp"
feed, _ := parser.Parse(reader)
feed, _ = parser.ParseURL(url)
value := item.Custom["field"]

// New (v2)
opts := &gofeed.ParseOptions{
    RequestOptions: gofeed.RequestOptions{
        UserAgent: "MyApp",
    },
}
feed, _ := parser.Parse(reader, opts)  // or nil
feed, _ = parser.ParseURL(context.Background(), url, opts)
value := item.GetCustomValue("field")

Feedback

Please give feedback or suggested changes to the plan for gofeed v2.

@infogulch @cristoper @spacecowboy and others, if you have suggestions for other changes, or comments about the above, let me know.
