Skip to content

Conversation

Henry8192
Copy link
Contributor

@Henry8192 Henry8192 commented Aug 20, 2025

Description

Resolves #272.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Compress hello.txt to hello.txt.zst and open in the log viewer. Log viewer opens the file successfully.
Remove the file extension or rename it to other file extensions, log viewer still opens the file with the right decoder.

Summary by CodeRabbit

  • New Features

    • Added support for Zstandard-compressed files (.zst, .zstd). Files are auto-detected and decompressed on import, enabling seamless viewing and search like plain text.
  • Chores

    • Added a new dependency to enable Zstandard decoding.

@coderabbitai
Copy link

coderabbitai bot commented Aug 20, 2025

Walkthrough

Introduces Zstandard (.zst/.zstd) file support by adding a Zstd decoder using the zstddec library, registering the new file type with signature and extensions, and updating the decoder factory types. Adds the zstddec dependency.

Changes

Cohort / File(s) Summary
Dependency update
package.json
Adds dependency zstddec@^0.2.0-alpha.3.
Zstd decoder implementation
src/services/decoders/ZstdDecoder/index.ts
Adds ZstdDecoder extending PlainTextDecoder with static create(dataArray) that initializes ZSTDDecoder from zstddec, decodes input, and returns a PlainTextDecoder with decompressed data.
File type registration and typing
src/typings/file.ts
Imports ZstdDecoder; extends FileTypeDef.DecoderFactory union to include it; adds Zstandard entry to FILE_TYPE_DEFINITIONS with signature [0x28, 0xb5, 0x2f, 0xfd], extensions [".zst", ".zstd"], checkIsStructured: () => false.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant App as App
  participant Detector as FileTypeDetector
  participant Zstd as ZstdDecoder
  participant PT as PlainTextDecoder

  User->>App: Open .zst/.zstd file
  App->>Detector: Detect file type (extensions/signature)
  Detector-->>App: Zstandard
  App->>Zstd: create(Uint8Array)
  activate Zstd
  Zstd->>Zstd: init ZSTDDecoder (zstddec)
  Zstd->>Zstd: decode(compressed)
  Zstd-->>App: PlainTextDecoder(decompressed)
  deactivate Zstd
  App->>PT: read text/lines
  PT-->>App: content
  App-->>User: Display logs
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Assessment against linked issues

Objective Addressed Explanation
Integrate a Zstandard decompression library (#272)
Detect .zstd files and automatically decompress on load (#272)
Ensure seamless integration with existing parsing/display (#272) No changes in loader/parsing flow shown; relies on existing integration.

Possibly related PRs

Suggested reviewers

  • junhaoliao

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Member

@junhaoliao junhaoliao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to submit the plain text decoder and zstd decoder in separate PRs?

@junhaoliao
Copy link
Member

#271 should be related to this as well. still let's explore the possibility of submitting the decoders separately

@Henry8192 Henry8192 marked this pull request as ready for review August 21, 2025 20:42
@Henry8192 Henry8192 requested a review from a team as a code owner August 21, 2025 20:42
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (1)
src/services/decoders/PlainTextDecoder/index.ts (1)

1-2: Avoid file-wide ESLint disables; add targeted suppressions only where needed.

Global disables can hide real issues as the class evolves.

Apply this diff:

-/* eslint-disable @typescript-eslint/no-unused-vars */
-/* eslint-disable class-methods-use-this */
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2e0b823 and 2ddcdae.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (6)
  • package.json (1 hunks)
  • src/services/LogFileManager/decodeUtils.ts (0 hunks)
  • src/services/decoders/PlainTextDecoder/index.ts (1 hunks)
  • src/services/decoders/ZstdDecoder/index.ts (1 hunks)
  • src/typings/file.ts (3 hunks)
  • vite.config.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • src/services/LogFileManager/decodeUtils.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • vite.config.ts
  • src/typings/file.ts
  • src/services/decoders/PlainTextDecoder/index.ts
  • src/services/decoders/ZstdDecoder/index.ts
🧠 Learnings (3)
📚 Learning: 2024-10-28T18:40:29.816Z
Learnt from: Henry8192
PR: y-scope/yscope-log-viewer#107
File: src/components/CentralContainer/Sidebar/SidebarTabs/SearchTabPanel/Result.tsx:1-7
Timestamp: 2024-10-28T18:40:29.816Z
Learning: In the y-scope/yscope-log-viewer project, importing components via destructuring from 'mui/joy' is acceptable, as the package size impact has been analyzed and found acceptable.

Applied to files:

  • package.json
📚 Learning: 2025-05-23T15:19:35.710Z
Learnt from: junhaoliao
PR: y-scope/yscope-log-viewer#285
File: package-lock.json:33-33
Timestamp: 2025-05-23T15:19:35.710Z
Learning: For the y-scope/yscope-log-viewer and y-scope/clp repositories, the team has decided to lock the "eslint-config-yscope" dependency at a specific version (not "latest" or "*"). When a new version is released, the version string will be updated explicitly in each referencing repository.

Applied to files:

  • package.json
📚 Learning: 2025-08-21T20:31:15.954Z
Learnt from: Henry8192
PR: y-scope/yscope-log-viewer#370
File: src/services/decoders/PlainTextDecoder/index.ts:14-17
Timestamp: 2025-08-21T20:31:15.954Z
Learning: For PlainTextDecoder in yscope-log-viewer, preserve the exact text content as-is without normalizing trailing newlines or empty lines, as users want to see the file structure exactly as it appears.

Applied to files:

  • src/services/decoders/PlainTextDecoder/index.ts
🧬 Code graph analysis (1)
src/services/decoders/PlainTextDecoder/index.ts (3)
src/typings/decoders.ts (6)
  • Decoder (138-138)
  • DecoderOptions (140-140)
  • FilteredLogEventMap (141-141)
  • Metadata (143-143)
  • LogEventCount (142-142)
  • DecodeResult (139-139)
src/typings/logs.ts (1)
  • LogLevelFilter (51-51)
src/typings/common.ts (1)
  • Nullable (6-6)
🔇 Additional comments (2)
package.json (1)

30-30: Zstd WASM dependency: licensing and packaging confirmed

  • License: MIT (compatible with Apache-2.0), no transitive dependencies.
  • Packaging: the *.wasm binary is shipped as a separate asset in dist/... (~251.8 kB) and isn’t inlined into JS.

Next steps:

  • Run your bundle visualizer (e.g. via npm run analyze:size) to compare pre- and post-integration sizes.
  • Verify the production bundle continues to load the external .wasm file rather than embedding it.

If you’d like help drafting a benchmark harness or automating the size comparison, just let me know.

src/typings/file.ts (1)

25-26: Type union update for DecoderFactory looks correct.

The union now includes the new PlainTextDecoder and ZstdDecoder factories; this aligns with how creation is attempted elsewhere.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (5)
src/services/decoders/ZstdDecoder/index.ts (2)

14-18: Initialise WASM once, add explicit return types, and wrap decompression errors for clearer fallback.

Calling init() per create() is wasteful and can regress startup latency. Also, the variable name logArrayBuffer is misleading (it is a Uint8Array), and the method lacks an explicit return type. Wrap decompress(...) so failures surface a decoder-specific error to aid diagnostics and allow the caller to continue testing other decoders.

Apply this diff within the method:

-    static override async create (dataArray: Uint8Array) {
-        await init();
-        const logArrayBuffer = decompress(dataArray);
-        return new ZstdDecoder(logArrayBuffer);
-    }
+    static override async create (dataArray: Uint8Array): Promise<ZstdDecoder> {
+        await ensureZstdInit();
+        try {
+            const logArray: Uint8Array = decompress(dataArray);
+            return new ZstdDecoder(logArray);
+        } catch (err) {
+            throw new ZstdDecompressionError("Zstd decompression failed", err);
+        }
+    }

Add these supporting declarations near the imports to avoid re-initialising and to standardise error reporting (kept compatible with ES2020 targets):

let zstdInitPromise: Promise<void> | null = null;
const ensureZstdInit = (): Promise<void> => {
    if (null === zstdInitPromise) {
        zstdInitPromise = init();
    }
    return zstdInitPromise;
};

class ZstdDecompressionError extends Error {
    public cause?: unknown;
    constructor (message: string, cause?: unknown) {
        super(message);
        this.name = "ZstdDecompressionError";
        this.cause = cause;
    }
}

16-17: Rename misleading variable to reflect actual type.

The result of decompress(...) is a Uint8Array, not an ArrayBuffer. Rename for clarity and future maintainability.

-        const logArrayBuffer = decompress(dataArray);
-        return new ZstdDecoder(logArrayBuffer);
+        const logArray: Uint8Array = decompress(dataArray);
+        return new ZstdDecoder(logArray);
src/services/decoders/PlainTextDecoder/index.ts (3)

27-33: Targeted ESLint suppressions look good.

Per prior feedback, using next-line suppressions avoids masking unrelated issues across the file.

Also applies to: 37-41, 49-52, 64-66


14-17: Preserve exact EOLs and trailing-newline semantics; do not normalise lines.

Per team learning (2025-08-21), the viewer must reflect file content byte-for-byte. Splitting on line breaks discards original EOLs and trailing-newline presence.

Apply this diff to keep original EOLs and drop the spurious terminal empty match:

     constructor (dataArray: Uint8Array) {
         const textDecoder = new TextDecoder();
-        this.#logs = textDecoder.decode(dataArray).split(/\r\n|\r|\n/);
+        const text = textDecoder.decode(dataArray);
+        const matches = text.match(/.*?(?:\r\n|\r|\n|$)/gs) ?? [];
+        if (0 < matches.length && "" === matches.at(-1)) {
+            matches.pop();
+        }
+        this.#logs = matches;
     }

54-61: Do not append a newline during decode; message should be the exact original slice.

Appending “\n” reintroduces normalisation and creates a phantom newline for files without a trailing EOL.

         return this.#logs.slice(beginIdx, endIdx).map((log, i) => ({
             logEventNum: beginIdx + i,
             logLevel: 0,
-            message: `${log}\n`,
+            message: log,
             timestamp: BigInt(0),
             utcOffset: BigInt(0),
         }));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2ddcdae and 50aa871.

📒 Files selected for processing (2)
  • src/services/decoders/PlainTextDecoder/index.ts (1 hunks)
  • src/services/decoders/ZstdDecoder/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • src/services/decoders/ZstdDecoder/index.ts
  • src/services/decoders/PlainTextDecoder/index.ts
🧠 Learnings (1)
📚 Learning: 2025-08-21T20:31:15.954Z
Learnt from: Henry8192
PR: y-scope/yscope-log-viewer#370
File: src/services/decoders/PlainTextDecoder/index.ts:14-17
Timestamp: 2025-08-21T20:31:15.954Z
Learning: For PlainTextDecoder in yscope-log-viewer, preserve the exact text content as-is without normalizing trailing newlines or empty lines, as users want to see the file structure exactly as it appears.

Applied to files:

  • src/services/decoders/PlainTextDecoder/index.ts
🧬 Code graph analysis (1)
src/services/decoders/PlainTextDecoder/index.ts (2)
src/typings/decoders.ts (5)
  • Decoder (138-138)
  • FilteredLogEventMap (141-141)
  • Metadata (143-143)
  • LogEventCount (142-142)
  • DecodeResult (139-139)
src/typings/common.ts (1)
  • Nullable (6-6)
🔇 Additional comments (1)
src/services/decoders/PlainTextDecoder/index.ts (1)

23-26: Minor consistency check: counts align with preserved EOL logic.

With the EOL-preserving change, getEstimatedNumEvents() and build() remain correct since #logs mirrors the original line segmentation (including any EOL on each element). No action required.

Also applies to: 42-47

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/typings/file.ts (1)

51-57: Signature overlap with CLP IR — keep fallback order robust and explain why.

CLP IR and Zstandard share the Zstd magic bytes (0x28, 0xb5, 0x2f, 0xfd). Ensure the detection loop continues to the Zstandard entry when CLP IR’s factory throws after a signature match. Also make extension matching case-insensitive, and ensure an empty signature (Plain Text) never short-circuits signature-based detection.

  • Add a clarifying comment inline for future maintainers:
     {
+        // Note: Shares Zstd magic bytes with CLP IR; decoder selection relies on
+        // tryCreateDecoderBySignature continuing to the next candidate on factory failure.
         DecoderFactory: ZstdDecoder,
         checkIsStructured: () => false,
         extensions: [".zst", ".zstd"],
         name: "Zstandard",
         signature: [0x28, 0xb5, 0x2f, 0xfd],
     },
  • Verify detection behaviour and case-insensitive extension matching:
#!/bin/bash
# Verify tryCreateDecoderBySignature continues on failure and doesn't break early.
rg -nPU -C3 'tryCreateDecoderBySignature\s*\(' src | sed -n '1,200p'
rg -nPU -C2 '\btryCreateDecoderBySignature\b.*?\{.*?\}' src/services/LogFileManager/decodeUtils.ts
rg -nP -C3 '(break|return|throw).*?(on|after)\s+signature' src/services/LogFileManager/decodeUtils.ts

# Extension matching should normalise case (toLowerCase) before comparison.
rg -nP -C2 '(extension|ext|endsWith|\.zst|\.zstd)' src | sed -n '1,200p'
rg -nP -C2 'toLowerCase\(\).*endsWith' src

# Plain Text (empty signature) should not short-circuit signature-based detection.
rg -nP -C3 'signature\s*:\s*\[\s*\]' src/typings/file.ts
rg -nP -C3 'if\s*\(\s*signature\.length\s*===\s*0\s*\)' src/services/LogFileManager/decodeUtils.ts

Optionally consider placing the Zstandard entry immediately after CLP IR for locality, since they share a signature. Not required if the loop filters by signature before ordering.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 50aa871 and bf412c8.

📒 Files selected for processing (1)
  • src/typings/file.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • src/typings/file.ts
🔇 Additional comments (1)
src/typings/file.ts (1)

5-5: Import wired correctly for the new decoder.

ZstdDecoder import path and default export usage look good.

Comment on lines +25 to +26
DecoderFactory: typeof ClpIrDecoder | typeof JsonlDecoder |
typeof ZstdDecoder | typeof PlainTextDecoder;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid ever-growing unions for DecoderFactory; use a structural static type.

Every new decoder requires editing this union. Define a structural type for the static side (create()) and reference that here to improve maintainability.

Apply this diff within the selected lines:

-    DecoderFactory: typeof ClpIrDecoder | typeof JsonlDecoder |
-        typeof ZstdDecoder | typeof PlainTextDecoder;
+    DecoderFactory: DecoderStatic;

Add this type alias near the top of the file (e.g., above interface FileTypeDef):

type DecoderStatic = {
  // Matches the static factory shape used by decodeUtils
  create: (dataArray: Uint8Array) => Promise<Decoder>;
};
🤖 Prompt for AI Agents
In src/typings/file.ts around lines 25 to 26, the DecoderFactory union grows
with every new decoder; define a structural static type instead and use it here:
add a type alias named DecoderStatic near the top of the file (e.g., above
interface FileTypeDef) that declares a create: (dataArray: Uint8Array) =>
Promise<Decoder> method to represent the decoder's static factory shape, then
replace the current DecoderFactory union with the new DecoderStatic type so new
decoders no longer require editing this union.

Comment on lines +51 to +57
{
DecoderFactory: ZstdDecoder,
checkIsStructured: () => false,
extensions: [".zst", ".zstd"],
name: "Zstandard",
signature: [0x28, 0xb5, 0x2f, 0xfd],
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

DRY the Zstd magic bytes into a shared constant to prevent drift.

Both CLP IR and Zstandard rely on the same magic bytes. Centralise them to avoid subtle mismatches later.

Apply within the selected lines:

-        signature: [0x28, 0xb5, 0x2f, 0xfd],
+        signature: ZSTD_MAGIC,

Add this near the top of the file (close to other typings) and use it for the CLP IR entry as well:

const ZSTD_MAGIC = [0x28, 0xb5, 0x2f, 0xfd] as const;
🤖 Prompt for AI Agents
In src/typings/file.ts around lines 51 to 57, the Zstandard magic bytes are
duplicated; DRY them by adding a shared constant near the top of the file with
the other typings (e.g., declare ZSTD_MAGIC as a readonly/const tuple of the
four bytes) and then replace the inline signature arrays for both the Zstandard
entry and the CLP IR entry to reference that constant instead of repeating the
literal bytes.

💡 Verification agent

🧩 Analysis chain

Guard against decompression bombs; confirm streaming/lazy WASM init.

Zstandard content can expand massively. Ensure ZstdDecoder enforces a sane max-out size or uses streaming, and lazily imports the WASM so non-Zstd users don’t pay the startup cost.

Run:

If limits/streaming aren’t present, I can draft a guard that rejects when the projected decompressed size exceeds a threshold and documents the UX.


🏁 Script executed:

#!/bin/bash
# ZstdDecoder should import the WASM lazily (dynamic import) inside create().
rg -nP -C3 '@bokuweb/zstd-wasm' src/services/decoders/ZstdDecoder
rg -nP -C3 'import\(\s*[\'"]@bokuweb/zstd-wasm[\'"]\s*\)' src/services/decoders/ZstdDecoder

# Check for size limits / streaming usage.
rg -nP -C3 '(max|limit|cap).*bytes' src/services/decoders/ZstdDecoder
rg -nP -C3 'decompress(Stream|ing)?' src/services/decoders/ZstdDecoder
rg -nP -C3 '(AbortSignal|controller|cancel|abort)' src/services/decoders/ZstdDecoder

# Confirm bundler config excludes the WASM from pre-bundling (as mentioned in the PR).
rg -nP -C3 'optimizeDeps\s*:\s*\{[^}]*exclude[^}]*@bokuweb/zstd-wasm' vite.config.ts

Length of output: 751


This will confirm whether the decoder still uses a static import, lacks size guards, and if the Vite config properly excludes the Zstd WASM from eager bundling.


🏁 Script executed:

#!/bin/bash
# Show static imports from @bokuweb/zstd-wasm in ZstdDecoder
rg -n "@bokuweb/zstd-wasm" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Check for any dynamic imports (should be none if static import present)
rg -n "import(" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Inspect the create() method to see where WASM is initialized
rg -n "create" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Look for any cap/limit/max guards in the decompression logic
rg -nE "(limit|max|cap)" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Search for streaming or chunked decompress calls
rg -n "stream" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Check for AbortSignal/controller usage in decoder (to allow cancellation)
rg -nE "(AbortSignal|controller|cancel|abort)" -C3 src/services/decoders/ZstdDecoder/index.ts || true

# Locate vite config file(s) to verify exclude of @bokuweb/zstd-wasm
rg -l "optimizeDeps" -g "vite.config.*" || echo "No vite.config.* found"
rg -n "optimizeDeps" -C3 vite.config.ts vite.config.js || true
rg -n "exclude" -C3 vite.config.ts vite.config.js || true

Length of output: 2080


ZstdDecoder eagerly loads WASM and lacks decompression safeguards

Our verification shows the decoder still statically imports @bokuweb/zstd-wasm and calls its APIs without any output‐size limits or streaming/abort support:

  • Static import at module top (src/services/decoders/ZstdDecoder/index.ts:1–4)
  • create() calls await init() and decompress(dataArray) directly (index.ts:14–17)
  • No checks for maximum decompressed size (no max/limit guards)
  • No streaming or chunked decompression use
  • No AbortSignal or cancellation mechanism
  • Vite config does exclude the WASM (optimizeDeps.exclude: ["@bokuweb/zstd-wasm"] in vite.config.ts:24–30), but that only prevents pre-bundling—it does not defer runtime loading for non-Zstd paths

Recommendations to address both performance and safety concerns:

  • Lazy-load the WASM: move the import into create(), e.g.:

    const { init, decompress } = await import('@bokuweb/zstd-wasm');
    await init();

    This ensures non-Zstd consumers don’t pay the WASM startup cost.

  • Enforce a max-output threshold: after decompression, reject if the size exceeds a sane limit:

    const output = decompress(dataArray);
    if (output.byteLength > MAX_DECOMPRESSED_BYTES) {
      throw new Error('Decompressed size exceeds safe limit');
    }

    Document the user-facing error/UX for size rejections.

  • Consider streaming (if supported by the WASM API) to process chunks and cap memory growth.

  • Add cancellation support via an AbortSignal to allow callers to abort long-running decompressions.

These changes are critical to prevent decompression-bomb attacks and optimize startup performance for users who never decode Zstandard payloads.

🤖 Prompt for AI Agents
In src/typings/file.ts around lines 51 to 57 the Zstd decoder entry references a
decoder that currently eagerly loads @bokuweb/zstd-wasm and lacks
safety/abort/streaming measures; update the decoder implementation (not this
index) so create() lazy-imports the WASM runtime at call-time (use dynamic
import inside create), initialize it there, enforce a configurable
MAX_DECOMPRESSED_BYTES check and throw/reject if exceeded, add optional
AbortSignal support so callers can cancel long-running decompression, and if the
WASM API allows, implement chunked/streaming decompression to bound memory
growth; ensure Vite exclusion remains but runtime load is deferred and document
the size-limit error for callers.

Copy link
Member

@junhaoliao junhaoliao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the PR title, how about:

feat(decoder): Add support for loading Zstandard-compressed text files (resolves #272)

@Henry8192 Henry8192 changed the title feat: Add Support for loading zstd compressed text log files (resolves #272). feat(decoder): Add support for loading Zstandard-compressed text files (resolves #272). Aug 26, 2025
@@ -0,0 +1,15 @@
import {ZSTDDecoder} from "zstddec";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the issue discussed at https://github.com/y-scope/yscope-log-viewer/pull/368/files#r2301199409 , can we try again with

Suggested change
import {ZSTDDecoder} from "zstddec";
import {ZSTDDecoder} from "zstddec/stream";

Comment on lines 8 to 9
const decoder = new ZSTDDecoder();
await decoder.init();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about decompressor to avoid confusion?

Suggested change
const decoder = new ZSTDDecoder();
await decoder.init();
const decompressor = new ZSTDDecoder();
await decompressor.init();

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/services/decoders/ZstdDecoder/index.ts (2)

11-11: Consider architectural consistency.

Based on the past review comment, there's a question about whether ZstdDecoder should extend PlainTextDecoder or if the factory method should directly return a PlainTextDecoder instance. The current implementation creates a ZstdDecoder class that extends PlainTextDecoder but then returns a PlainTextDecoder instance from the factory method, which seems inconsistent.


10-10: Use more descriptive variable name.

The variable name could be more descriptive to clearly indicate it contains decompressed data rather than just any data array.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bf412c8 and 6b669d8.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (2)
  • package.json (1 hunks)
  • src/services/decoders/ZstdDecoder/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • src/services/decoders/ZstdDecoder/index.ts
🔇 Additional comments (1)
package.json (1)

42-42: Verified zstddec version and security

  • Latest published version is 0.1.0 (npmjs.com)
  • No known security vulnerabilities in v0.1.0 (0 critical, 0 high, 0 medium, 0 low) as per Snyk (security.snyk.io)
  • License: MIT AND BSD-3-Clause (dual-licensed) (npmjs.com)

No further action required on this dependency.



class ZstdDecoder extends PlainTextDecoder {
static override async create (dataArray: Uint8Array) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing DecoderOptions parameter.

The method signature is missing the decoderOptions parameter that should match the interface expected by other decoders in the codebase.

Apply this diff to add the missing parameter:

-    static override async create (dataArray: Uint8Array) {
+    static override async create (dataArray: Uint8Array, _decoderOptions: DecoderOptions) {

You'll also need to import the DecoderOptions type:

+import {DecoderOptions} from "../../../typings/decoders";
+
 import PlainTextDecoder from "../PlainTextDecoder";
🤖 Prompt for AI Agents
In src/services/decoders/ZstdDecoder/index.ts around line 7, the static create
method is missing the decoderOptions parameter expected by other decoders;
update the signature to accept a second parameter named decoderOptions typed as
DecoderOptions and add an import for the DecoderOptions type at the top of the
file; ensure the implementation forwards or uses decoderOptions consistently
with other decoder implementations in the codebase.

Comment on lines 8 to 11
const decoder = new ZSTDDecoder();
await decoder.init();
const decompressedDataArray = decoder.decode(dataArray);
return new PlainTextDecoder(decompressedDataArray);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Optimize decoder initialization and improve error handling.

Creating a new decoder instance for every decompression operation is inefficient. Additionally, decompression errors should be wrapped to provide better diagnostic information for decoder fallback logic.

Apply this diff to optimize initialization and add error handling:

+let decoderInstance: ZSTDDecoder | null = null;
+
+const getDecoderInstance = async (): Promise<ZSTDDecoder> => {
+    if (null === decoderInstance) {
+        decoderInstance = new ZSTDDecoder();
+        await decoderInstance.init();
+    }
+    return decoderInstance;
+};
+
 class ZstdDecoder extends PlainTextDecoder {
     static override async create (dataArray: Uint8Array, _decoderOptions: DecoderOptions) {
-        const decoder = new ZSTDDecoder();
-        await decoder.init();
-        const decompressedDataArray = decoder.decode(dataArray);
-        return new PlainTextDecoder(decompressedDataArray);
+        try {
+            const decoder = await getDecoderInstance();
+            const decompressedDataArray = decoder.decode(dataArray);
+            return new PlainTextDecoder(decompressedDataArray);
+        } catch (error) {
+            throw new Error("Zstd decompression failed", {cause: error as Error});
+        }
     }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const decoder = new ZSTDDecoder();
await decoder.init();
const decompressedDataArray = decoder.decode(dataArray);
return new PlainTextDecoder(decompressedDataArray);
let decoderInstance: ZSTDDecoder | null = null;
const getDecoderInstance = async (): Promise<ZSTDDecoder> => {
if (null === decoderInstance) {
decoderInstance = new ZSTDDecoder();
await decoderInstance.init();
}
return decoderInstance;
};
class ZstdDecoder extends PlainTextDecoder {
static override async create (dataArray: Uint8Array, _decoderOptions: DecoderOptions) {
try {
const decoder = await getDecoderInstance();
const decompressedDataArray = decoder.decode(dataArray);
return new PlainTextDecoder(decompressedDataArray);
} catch (error) {
throw new Error("Zstd decompression failed", { cause: error as Error });
}
}
}
🤖 Prompt for AI Agents
In src/services/decoders/ZstdDecoder/index.ts around lines 8-11, the code
currently creates and inits a new ZSTDDecoder for every decode call and does not
wrap decode errors; change this to reuse a single decoder instance (e.g.,
module-level or singleton) with lazy initialization so init() is awaited only
once, and wrap the decoder.decode(dataArray) call in a try/catch that throws a
new Error containing contextual details (decoder type, input size) while
preserving the original error (via message or cause) so the caller can use the
wrapped error for fallback logic.

{
DecoderFactory: ZstdDecoder,
checkIsStructured: () => false,
extensions: [".zst", ".zstd"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a good mindset to brainstorm different file extensions, though in the IANA registry only .zst is claimed: https://www.iana.org/assignments/media-types/application/zstd

Suggested change
extensions: [".zst", ".zstd"],
extensions: [".zst"],

package.json Outdated
"monaco-editor": "0.50.0",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"zstddec": "^0.1.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need the version discussed at
#368 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems some dependencies are getting updated / removed unexpectedly. can we

git checkout 7c008d95de908cd73e5fc1090e24157a9d614a18 -- package-lock.json
npm i zstddec

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
package.json (1)

52-52: Lock eslint-config-yscope to a specific version (team convention).

Per prior repo guidance, avoid "latest"; pin a specific version to ensure consistent lint results across developers and CI.

Run this to discover the currently resolved version and update package.json accordingly:

#!/bin/bash
set -euo pipefail
# Show the resolved version from package-lock (if present)
if [ -f package-lock.json ]; then
  jq -r '.. | objects | select(.name? == "eslint-config-yscope") | .version' package-lock.json | head -n1
else
  echo "package-lock.json not found; install deps first (npm ci) or check other repos for the pinned version."
fi
src/services/decoders/ZstdDecoder/index.ts (2)

3-3: Add missing DecoderOptions import to match other decoders.

The create signature should accept the DecoderOptions parameter (even if unused) for consistency with the decoder factory contract.

Apply this diff:

-import PlainTextDecoder from "../PlainTextDecoder";
+import {DecoderOptions} from "../../../typings/decoders";
+import PlainTextDecoder from "../PlainTextDecoder";

7-12: Align factory signature, reuse initialised decoder, and wrap errors with cause.

  • Add _decoderOptions: DecoderOptions to the signature and explicitly return Promise<PlainTextDecoder> to match the base override.
  • Avoid creating and initialising a new WASM decoder for every call; reuse a lazily-initialised singleton.
  • Wrap decode errors to aid upstream fallback diagnostics.

Apply this diff:

 class ZstdDecoder extends PlainTextDecoder {
-    static override async create (dataArray: Uint8Array) {
-        const decompressor = new ZSTDDecoder();
-        await decompressor.init();
-        const decompressedDataArray = decompressor.decode(dataArray);
-        return new PlainTextDecoder(decompressedDataArray);
-    }
+    static override async create (
+        dataArray: Uint8Array,
+        _decoderOptions: DecoderOptions,
+    ): Promise<PlainTextDecoder> {
+        // Lazily reuse a single decoder instance to avoid repeated WASM init.
+        // Note: Using explicit null checks per repo guidelines.
+        // Module-level singleton:
+        //   let decoderInstance: ZSTDDecoder | null = null;
+        //   const getDecoderInstance = async (): Promise<ZSTDDecoder> => {
+        //       if (null === decoderInstance) {
+        //           decoderInstance = new ZSTDDecoder();
+        //           await decoderInstance.init();
+        //       }
+        //       return decoderInstance;
+        //   };
+        try {
+            const decoder = await getDecoderInstance();
+            const decompressedDataArray: Uint8Array = decoder.decode(dataArray);
+            return new PlainTextDecoder(decompressedDataArray);
+        } catch (error) {
+            // Preserve root cause for upstream logging and fallback logic.
+            throw new Error("Zstandard decompression failed.", {cause: error as Error});
+        }
+    }
 }

Add the module-level helpers at the top of this file (under imports):

+let decoderInstance: ZSTDDecoder | null = null;
+const getDecoderInstance = async (): Promise<ZSTDDecoder> => {
+    if (null === decoderInstance) {
+        decoderInstance = new ZSTDDecoder();
+        await decoderInstance.init();
+    }
+    return decoderInstance;
+};
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6b669d8 and 928f5f7.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (2)
  • package.json (1 hunks)
  • src/services/decoders/ZstdDecoder/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • src/services/decoders/ZstdDecoder/index.ts
🧠 Learnings (2)
📚 Learning: 2025-05-23T15:19:35.710Z
Learnt from: junhaoliao
PR: y-scope/yscope-log-viewer#285
File: package-lock.json:33-33
Timestamp: 2025-05-23T15:19:35.710Z
Learning: For the y-scope/yscope-log-viewer and y-scope/clp repositories, the team has decided to lock the "eslint-config-yscope" dependency at a specific version (not "latest" or "*"). When a new version is released, the version string will be updated explicitly in each referencing repository.

Applied to files:

  • package.json
📚 Learning: 2024-11-28T19:27:38.124Z
Learnt from: davemarco
PR: y-scope/yscope-log-viewer#134
File: src/components/modals/SettingsModal/SettingsDialog.tsx:41-46
Timestamp: 2024-11-28T19:27:38.124Z
Learning: The documentation for the yscope-log-viewer is not yet hosted, so documentation URLs may not be accessible.

Applied to files:

  • package.json
🔇 Additional comments (2)
src/services/decoders/ZstdDecoder/index.ts (2)

7-12: Unable to locate any ZSTDDecoder.create(…) invocations – please verify manually

I ran a ripgrep search for both ZSTDDecoder.create and ZstdDecoder.create throughout src/ but found no matches, so I cannot confirm that your Zstd decompression path is actually executed within a Worker. To ensure UI responsiveness and correct multi-frame handling, please:

  • Audit your codebase for any usage of ZSTDDecoder.create(...) (or whichever exact class name you’re exporting) and confirm each call lives inside a Web Worker or off the main thread.
  • Validate that large or multi-frame Zstd inputs (e.g. from your sample ZIP) decompress fully and perform acceptably without blocking the UI.
  • Cross-check other decoders in src/services/decoders/ to ensure you’re using a consistent static override async create(...) signature and handling them similarly.

1-1: Verify WASM Asset Handling and Subpath Resolution

It appears that the Vite config isn’t including any .wasm assets, and the zstddec/stream folder couldn’t be found under node_modules—you’ll need to confirm both the runtime export and bundling setup:

• Confirm the zstddec dependency and subpath export
– In package.json, ensure you have "zstddec" listed under dependencies.
– Verify the installed package layout contains stream.js or stream/index.js, e.g.:

find node_modules -type f -path "*zstddec/stream*.js"

If you’re using pnpm or Yarn PnP, confirm how Vite resolves the subpath or add an alias:

// vite.config.ts
export default defineConfig({
  resolve: {
    alias: { 'zstddec/stream': 'zstddec/stream/index.js' }
  },
});

• Add .wasm to Vite’s asset pipeline
You scanned the first 200 lines of vite.config.ts and saw no assetsInclude or WASM entries. To ensure the decoder’s .wasm is bundled and served, add:

 export default defineConfig({
   // …existing config…
+  assetsInclude: ["**/*.wasm"],
 });

• Verify production bundle and runtime loading
– Run npm run build (or equivalent)
– Inspect the dist/ folder for the decoder’s .wasm file
– Serve the production build locally and check the network tab for any 404s when instantiating ZSTDDecoder

• (If using optimizeDeps) Ensure Vite pre-bundles the stream entry:

 export default defineConfig({
   optimizeDeps: {
-    exclude: ["clp-ffi-js/worker"],
+    exclude: ["clp-ffi-js/worker"],
+    include: ["zstddec/stream"]
   }
 });

Once you’ve confirmed the subpath resolves in both dev and prod builds and the .wasm asset is correctly emitted and fetched without errors, this decoder integration will be solid.

"monaco-editor": "0.50.0",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"zstddec": "^0.2.0-alpha.3",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Pin the pre-release zstddec version; avoid caret on alpha.

Using a caret with a pre-release (^0.2.0-alpha.3) can float to newer pre-releases and introduce breakage unexpectedly. Pin exactly to the known-good version for reproducible builds.

Apply this diff:

-    "zstddec": "^0.2.0-alpha.3",
+    "zstddec": "0.2.0-alpha.3",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"zstddec": "^0.2.0-alpha.3",
"zstddec": "0.2.0-alpha.3",
🤖 Prompt for AI Agents
In package.json around line 42, the dependency "zstddec" is specified with a
caret pre-release "^0.2.0-alpha.3" which allows floating to newer pre-releases;
change it to the exact pinned version "0.2.0-alpha.3" (remove the leading caret)
and then run your package manager (npm install or yarn install) to update the
lockfile so builds are reproducible.

Comment on lines +6 to +13
class ZstdDecoder extends PlainTextDecoder {
static override async create (dataArray: Uint8Array) {
const decompressor = new ZSTDDecoder();
await decompressor.init();
const decompressedDataArray = decompressor.decode(dataArray);
return new PlainTextDecoder(decompressedDataArray);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Double-check type expectations for override and factory union.

This class extends PlainTextDecoder but returns a PlainTextDecoder instance from create. If the factory contract expects typeof PlainTextDecoder-like behaviour, this is fine; otherwise, either:

  • add a private constructor and return new ZstdDecoder(decompressedDataArray), or
  • stop extending PlainTextDecoder and make this a pure factory type if allowed.

If you want the class to return itself, here’s a compatible pattern:

 class ZstdDecoder extends PlainTextDecoder {
+    private constructor (logArray: Uint8Array) {
+        super(logArray);
+    }
     static override async create (
         dataArray: Uint8Array,
         _decoderOptions: DecoderOptions,
-    ): Promise<PlainTextDecoder> {
+    ): Promise<ZstdDecoder> {
         try {
             const decoder = await getDecoderInstance();
             const decompressedDataArray: Uint8Array = decoder.decode(dataArray);
-            return new PlainTextDecoder(decompressedDataArray);
+            return new ZstdDecoder(decompressedDataArray);
         } catch (error) {
             throw new Error("Zstandard decompression failed.", {cause: error as Error});
         }
     }
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/services/decoders/ZstdDecoder/index.ts around lines 6 to 13, the static
create method currently returns a PlainTextDecoder instance while the class
extends PlainTextDecoder, which can break type/ factory expectations; either
make the class actually construct and return itself by adding a constructor that
accepts the decompressed Uint8Array (make it private if you want construction
only via create) and change the return to new
ZstdDecoder(decompressedDataArray), or convert the class into a pure factory
(remove the extends PlainTextDecoder) and ensure the static create signature and
return type are declared as PlainTextDecoder (or the expected factory union
type); update the method signature (remove or keep override) and any exported
types to match the chosen approach.

@junhaoliao
Copy link
Member

as discussed offline. let's put this on hold until we publish our own zstd js lib, which should be possible in a few days with the review process

@junhaoliao junhaoliao marked this pull request as draft August 27, 2025 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support loading Zstandard (.zstd) compressed files

2 participants