Skip to content

feat(pm): integrate git dependency support into install flow (3/3)#2624

Open
killagu wants to merge 6 commits intopr2/bfs-git-resolutionfrom
pr3/install-git-deps
Open

feat(pm): integrate git dependency support into install flow (3/3)#2624
killagu wants to merge 6 commits intopr2/bfs-git-resolutionfrom
pr3/install-git-deps

Conversation

@killagu
Copy link
Contributor

@killagu killagu commented Feb 25, 2026

Summary

Split from #2603 — Part 3 of 3 for git dependency resolution. Depends on #2623.

  • Enable native-git feature for utoo-ruborist dependency
  • Add git_resolver module with resolve_git_spec/resolve_github_spec wrappers
  • Add is_git_url, git_cache_lookup, resolve_cache_path to downloader
  • Route git packages through resolve_cache_path in cloner
  • Skip download pipeline for git packages (pre-cached during BFS)
  • Update resolve_package_spec with PackageSpec::parse() and exhaustive match
  • Add format_save_spec for git-aware package.json writing

Pipeline flow for git packages

  1. BFS resolution clones git repo → caches at <cache_dir>/<name>/<sha>/
  2. Download worker sees git+ URL → skips (already cached)
  3. Clone worker calls resolve_cache_path → routes to git_cache_lookup → returns cached path
  4. Package cloned to node_modules/ from cache

Architectural fixes applied

  • Single is_git_url in downloader.rs — no duplicate function
  • Exhaustive match on PackageSpec variants — no _ => catch-all
  • format_save_spec handles git URLs (written as-is to package.json)

Test plan

  • cargo check -p utoo-pm passes
  • cargo test -p utoo-pm — 202 pass (1 pre-existing network failure unrelated)
  • End-to-end: utoo install git+https://github.com/user/repo.git
  • Verify utoo clean handles git-cached packages

Part of git dependency resolution: #2622#2623PR3

🤖 Generated with Claude Code

Add typed PackageSpec enum to utoo-ruborist for parsing dependency
specifications into Registry, Git, and GitHub variants. This is the
foundation for git dependency resolution support.

- PackageSpec::parse() as idiomatic associated method
- is_http_tarball_spec() for routing HTTP tarball specs
- Re-export via utoo_ruborist::spec module
- 10 unit tests + 2 doc-tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 25, 2026 02:22
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @killagu, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the package manager's capability by integrating comprehensive support for Git dependencies. It establishes a robust pipeline for resolving, caching, and installing packages directly from Git repositories, including GitHub shorthand. The changes streamline the dependency resolution process, ensure proper handling of Git URLs in package.json, and optimize the download workflow by leveraging pre-caching for Git-based packages.

Highlights

  • Git Dependency Support: Enabled the native-git feature for the utoo-ruborist dependency to allow native Git operations.
  • New Git Resolver Module: Introduced a new git_resolver module to handle Git and GitHub package specifications, including cloning and caching.
  • Intelligent Caching for Git Packages: Implemented is_git_url, git_cache_lookup, and resolve_cache_path in the downloader utility for intelligent caching and retrieval of Git packages.
  • Optimized Download Pipeline: Optimized the download pipeline to skip Git packages, as they are pre-cached during the BFS resolution phase, improving efficiency.
  • Refactored Package Specification Resolution: Refactored resolve_package_spec to use PackageSpec::parse() and an exhaustive match, improving handling of various package specification types.
  • Git-Aware package.json Writing: Added format_save_spec to correctly write Git-aware package specifications (e.g., resolved URLs with pinned commits) into package.json files.
Changelog
  • crates/pm/Cargo.toml
    • Enabled the native-git feature for utoo-ruborist.
  • crates/pm/src/helper/lock.rs
    • Imported resolve_git_spec and resolve_github_spec from git_resolver.
    • Replaced downloader::download_to_cache with downloader::resolve_cache_path.
    • Replaced utoo_ruborist::util::parse_package_spec with utoo_ruborist::spec::PackageSpec.
    • Introduced format_save_spec function to handle writing git-aware versions to package.json.
    • Updated resolve_package_spec to use PackageSpec::parse() and an exhaustive match for different package types (Registry, Git, GitHub).
    • Changed download_to_cache call to resolve_cache_path in prepare_global_package_json.
  • crates/pm/src/service/pipeline/worker.rs
    • Imported is_git_url from downloader.
    • Added logic to skip the download pipeline for Git packages, relying on pre-caching.
  • crates/pm/src/util/cloner.rs
    • Replaced downloader::download_to_cache with downloader::resolve_cache_path.
  • crates/pm/src/util/downloader.rs
    • Added is_git_url function to identify Git URLs.
    • Implemented git_cache_lookup for retrieving pre-cached Git packages.
    • Introduced resolve_cache_path to route requests to either git_cache_lookup or download_to_cache.
    • Updated documentation for download_to_cache to clarify its use for registry tarballs.
  • crates/pm/src/util/git_resolver.rs
    • Added a new module.
    • Implemented resolve_git_spec to clone Git repositories and cache results.
    • Implemented resolve_github_spec to convert GitHub shorthand to Git URLs and resolve them.
  • crates/pm/src/util/mod.rs
    • Exported the new git_resolver module.
Activity
  • This pull request is the third part of a three-part series for Git dependency resolution, split from feat(pm): internalize git dependency resolution into ruborist #2603, and depends on feat(pm): add native-git BFS resolution (2/3) #2623.
  • All cargo check -p utoo-pm tests passed.
  • cargo test -p utoo-pm resulted in 202 passing tests, with one pre-existing network failure unrelated to these changes.
  • The PR description includes a test plan for end-to-end verification of utoo install git+https://github.com/user/repo.git and utoo clean handling of git-cached packages.
  • The summary was generated using Claude Code.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request successfully integrates git dependency support into the package manager's installation flow. The changes are well-structured, enabling the native-git feature, adding a git_resolver module, and correctly updating the download and cloning pipeline to handle git packages by pre-caching them during dependency resolution. I've provided a couple of suggestions to improve code consistency and promote idiomatic Rust, which will enhance overall maintainability.

/// Wildcard specs (`*`, `latest`, empty) are pinned to `^<resolved_version>`.
/// Everything else (semver ranges, exact versions) passes through unchanged.
fn format_save_spec(version_spec: &str, resolved_version: &str) -> String {
if version_spec.starts_with("git+") || version_spec.starts_with("git://") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To adhere to the PR's goal of having a single source of truth for identifying git URLs, please use the newly introduced is_git_url function from the downloader module here. This will improve maintainability by avoiding logic duplication.

You will need to add use crate::util::downloader::is_git_url; to the imports at the top of the file.

Suggested change
if version_spec.starts_with("git+") || version_spec.starts_with("git://") {
if is_git_url(version_spec) {

Comment on lines +72 to +75
if is_git_url(tarball_url) {
return git_cache_lookup(name, version, tarball_url).await;
}
download_to_cache(name, version, tarball_url).await
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function's body can be simplified into a single if-else expression. This is more idiomatic in Rust and improves readability by making the two execution paths clearer.

    if is_git_url(tarball_url) {
        git_cache_lookup(name, version, tarball_url).await
    } else {
        download_to_cache(name, version, tarball_url).await
    }

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Integrates git-based dependency resolution into the utoo-pm install pipeline by leveraging utoo-ruborist’s native git clone/cache support, ensuring git packages are pre-cached during BFS and then cloned from cache during install.

Changes:

  • Enable utoo-ruborist’s native-git feature and add a PM-side git_resolver wrapper that injects the PM cache dir.
  • Add git-aware cache routing in the downloader (is_git_url, git_cache_lookup, resolve_cache_path) and route cloner + pipeline behavior accordingly (skip downloads for git URLs).
  • Update spec resolution and package.json writing to be git-aware via PackageSpec::parse() and format_save_spec.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
crates/pm/src/util/mod.rs Exposes the new git_resolver utility module.
crates/pm/src/util/git_resolver.rs Adds thin wrappers around ruborist git clone, injecting PM cache dir.
crates/pm/src/util/downloader.rs Introduces git URL detection + git cache lookup and a unified resolve_cache_path.
crates/pm/src/util/cloner.rs Switches from download_to_cache to resolve_cache_path to support git-cached packages.
crates/pm/src/service/pipeline/worker.rs Skips the download worker for git URLs since BFS pre-caches them.
crates/pm/src/helper/lock.rs Moves to typed PackageSpec::parse(), adds git-aware save formatting, and resolves git specs via the new wrappers.
crates/pm/Cargo.toml Enables native-git feature for utoo-ruborist.

Comment on lines +49 to +52
let commit_sha = tarball_url.split_once('#').map(|(_, frag)| frag)?;
let cache_dir = get_cache_dir();
let cache_path = cache_dir.join(name).join(commit_sha);
if crate::fs::try_exists(&cache_path.join("_resolved"))
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git_cache_lookup trusts the URL fragment as a path segment (cache_dir.join(name).join(commit_sha)). If tarball_url comes from a user-edited/compromised lockfile, a fragment like #../.. could cause path traversal and let the cloner read/copy from unexpected directories. Consider validating commit_sha (e.g., only [0-9a-f]{7,40}) and rejecting anything containing path separators or .. before constructing cache_path.

Copilot uses AI. Check for mistakes.
}
download_to_cache(name, version, tarball_url).await
}

Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New git-aware helpers (is_git_url / git_cache_lookup / resolve_cache_path) add a distinct control-flow path (skip download + cache lookup), but there are no unit tests covering git URL detection and cache-hit/miss behavior. Since this file already has async tests, please add tests that cover at least: (1) is_git_url for git+https, git+ssh, git:// and non-git URLs; (2) resolve_cache_path returning the expected cache path when _resolved exists for a git URL.

Suggested change
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
#[test]
fn is_git_url_recognizes_git_and_non_git_urls() {
assert!(is_git_url("git+https://github.com/example/repo.git#abcdef"));
assert!(is_git_url("git+ssh://git@github.com/example/repo.git#123456"));
assert!(is_git_url("git://github.com/example/repo.git#deadbeef"));
assert!(!is_git_url("https://registry.npmjs.org/example/-/example-1.0.0.tgz"));
assert!(!is_git_url("http://example.com/not-a-git-url"));
assert!(!is_git_url("file:///tmp/local-package.tgz"));
}
#[tokio::test]
async fn resolve_cache_path_returns_git_cache_hit_when_resolved_exists() {
let name = "gitpkg-cache-hit";
let version = "1.0.0";
let commit_sha = "cafebabecafebabe";
let tarball_url = format!("git+https://example.com/{}.git#{}", name, commit_sha);
let cache_dir = get_cache_dir();
let cache_path = cache_dir.join(name).join(commit_sha);
// Prepare the cache directory and _resolved marker file to simulate a pre-resolved git package.
fs::create_dir_all(&cache_path).expect("failed to create git cache directory");
let resolved_marker = cache_path.join("_resolved");
fs::File::create(&resolved_marker).expect("failed to create _resolved marker file");
let resolved = resolve_cache_path(name, version, &tarball_url).await;
assert_eq!(resolved, Some(cache_path));
}
#[tokio::test]
async fn resolve_cache_path_returns_none_when_git_cache_miss() {
let name = "gitpkg-cache-miss";
let version = "1.0.0";
let commit_sha = "0123456789abcdef";
let tarball_url = format!("git+ssh://git@example.com/{}.git#{}", name, commit_sha);
let cache_dir = get_cache_dir();
let cache_path = cache_dir.join(name).join(commit_sha);
// Ensure the cache directory exists but without the _resolved marker,
// or clean up any leftover state from previous runs.
if cache_path.exists() {
// Best-effort cleanup; ignore errors if the directory is in use.
let _ = fs::remove_dir_all(&cache_path);
}
let resolved = resolve_cache_path(name, version, &tarball_url).await;
assert!(resolved.is_none());
}
}

Copilot uses AI. Check for mistakes.
Comment on lines 209 to +230
pub async fn resolve_package_spec(spec: &str) -> Result<(String, String, String)> {
let (name, version_spec) = parse_package_spec(spec);
let resolved = resolve_package(&Context::registry(), name, version_spec)
.await
.map_err(|e| anyhow!("{}", e))?;
Ok((name.to_string(), resolved.version, version_spec.to_string()))
let parsed = PackageSpec::parse(spec);
match parsed {
PackageSpec::Registry { name, version_spec } => {
let resolved = resolve_package(&Context::registry(), &name, &version_spec)
.await
.map_err(|e| anyhow!("{}", e))?;
Ok((name, resolved.version, version_spec))
}
PackageSpec::Git { url, commit_ish } => {
let resolved = resolve_git_spec(&url, commit_ish.as_deref()).await?;
Ok((resolved.name, resolved.version, resolved.resolved_url))
}
PackageSpec::GitHub {
owner,
repo,
commit_ish,
} => {
let resolved = resolve_github_spec(&owner, &repo, commit_ish.as_deref()).await?;
Ok((resolved.name, resolved.version, resolved.resolved_url))
}
}
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve_package_spec now returns resolved.resolved_url as the third tuple element for git/GitHub specs, but returns a semver/range string for registry specs. This makes the meaning of the third element ambiguous and can break callers that assume it’s always a registry version spec (e.g. prepare_global_package_json later passes it into resolve_package). Consider returning a typed struct/enum (or include the parsed PackageSpec) so callers can handle git vs registry explicitly, or split into separate resolve_registry_spec vs resolve_non_registry_spec helpers.

Copilot uses AI. Check for mistakes.
Comment on lines +199 to +202
fn format_save_spec(version_spec: &str, resolved_version: &str) -> String {
if version_spec.starts_with("git+") || version_spec.starts_with("git://") {
return version_spec.to_string();
}
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_save_spec re-implements git URL detection inline (starts_with("git+") || starts_with("git://")), which duplicates the predicate introduced in util::downloader::is_git_url. Using the shared helper would keep git routing rules consistent if supported prefixes expand (and reduces drift between save-formatting vs download/clone routing).

Copilot uses AI. Check for mistakes.
@killagu killagu force-pushed the pr3/install-git-deps branch from e0c4b71 to 426024d Compare February 25, 2026 02:39
@killagu killagu force-pushed the pr2/bfs-git-resolution branch 2 times, most recently from e9b77e0 to dfda143 Compare February 25, 2026 03:08
@killagu killagu force-pushed the pr3/install-git-deps branch 2 times, most recently from 86840e8 to f260cda Compare February 25, 2026 03:21
@killagu killagu force-pushed the pr2/bfs-git-resolution branch from dfda143 to ad7962f Compare February 25, 2026 03:21
killagu and others added 3 commits February 25, 2026 11:34
… GitCloneResult

Extend the git protocol spec parsing layer:

- Add is_non_registry_spec() covering git+, git://, github: prefixes
  alongside existing local specs (file:, link:, workspace:, portal:)
- Deprecate is_local_spec() with alias pointing to new function
- Add ResolveError::Git variant with Display/Error impls
- Add GitCloneResult struct in traits/git.rs (data type only)
- Re-export GitCloneResult via utoo_ruborist::git module

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add gix/tempfile as optional deps behind `native-git` feature flag
- Add resolver/git.rs with clone_repo (async), clone_repo_blocking,
  resolve_non_registry_dep, build_manifest_from_git_cache
- Route non-registry specs through git resolver in BFS process_dependency
- Add GIT_CLONE_CACHE with tokio::sync::OnceCell for concurrent dedup
- Use tokio::fs for non-blocking package.json reads (arch review fix)
- Add BuildDepsConfig.cache_dir with builder pattern
- Update ResolveError::Git to structured { url, message } fields
- Keep BuildDepsOptions::new() (no public API removal)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Enable native-git feature for utoo-ruborist dependency
- Add git_resolver module with resolve_git_spec/resolve_github_spec
- Add is_git_url, git_cache_lookup, resolve_cache_path to downloader
- Route git packages through resolve_cache_path in cloner
- Skip download pipeline for git packages (pre-cached during BFS)
- Update resolve_package_spec to use PackageSpec::parse() with
  exhaustive match for Registry/Git/GitHub variants
- Add format_save_spec for git-aware package.json writing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@killagu killagu force-pushed the pr3/install-git-deps branch from f260cda to 930313c Compare February 25, 2026 03:35
@killagu killagu force-pushed the pr2/bfs-git-resolution branch from ad7962f to 52e1b13 Compare February 25, 2026 03:35
killagu and others added 2 commits February 25, 2026 13:01
- Add Local and HttpTarball variants to PackageSpec enum
- Handle bare user/repo GitHub shorthand in PackageSpec::parse
- Handle file:/link:/workspace:/portal: as Local in PackageSpec::parse
- Handle HTTP tarball URLs as HttpTarball in PackageSpec::parse
- Rewrite is_non_registry_spec to delegate to PackageSpec::parse,
  eliminating duplicated logic and ensuring consistency

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@killagu killagu force-pushed the pr2/bfs-git-resolution branch 7 times, most recently from d5364c7 to 0120a57 Compare February 27, 2026 04:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants