Skip to content

Conversation

Greyeye
Copy link
Collaborator

@Greyeye Greyeye commented Sep 12, 2024

add url parse validator.
add gitlab as one of prefix

add url parse validator.
add gitlab as one of prefix
@zapier-sre-bot
Copy link
Collaborator

Mergecat's Review

Click to read mergecats review!

😼 Mergecat review of cmd/locations.go

@@ -2,6 +2,7 @@ package cmd
 
 import (
 	"context"
+	"net/url"
 	"path/filepath"
 	"regexp"
 	"strings"
@@ -83,11 +84,12 @@ func maybeCloneGitUrl(ctx context.Context, repoManager cloner, repoRefreshDurati
 	return path, nil
 }
 
-func isGitURL(str string) bool {
-	if IsURL(str) && urlPathWithFragmentSuffix.MatchString(str) {
+func isGitURL(url string) bool {
+	str := strings.ToLower(url)
+	if isValidURL(str) && urlPathWithFragmentSuffix.MatchString(str) {
 		return true
 	}
-	for _, prefix := range []string{"git://", "github.com/", "git@"} {
+	for _, prefix := range []string{"git://", "github.com/", "gitlab.com/", "git@"} {
 		if strings.HasPrefix(str, prefix) {
 			return true
 		}
@@ -99,9 +101,8 @@ func isGitURL(str string) bool {
 // context from the Git repository. See IsGitURL for details.
 var urlPathWithFragmentSuffix = regexp.MustCompile(`\.git(?:#.+)?$`)
 
-// IsURL returns true if the provided str is an HTTP(S) URL by checking if it
-// has a http:// or https:// scheme. No validation is performed to verify if the
-// URL is well-formed.
-func IsURL(str string) bool {
-	return strings.HasPrefix(str, "https://") || strings.HasPrefix(str, "http://")
+// isValidURL returns true if the provided str is a well-formed HTTP(S) URL.
+func isValidURL(str string) bool {
+	u, err := url.Parse(str)
+	return err == nil && (u.Scheme == "http" || u.Scheme == "https")
 }

Feedback & Suggestions:

  1. Security Improvement: The change to use url.Parse in isValidURL is a good improvement for validating URLs. However, consider adding a check for the Host field to ensure it's not empty, which would further validate the URL.

    func isValidURL(str string) bool {
        u, err := url.Parse(str)
        return err == nil && (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
    }
  2. Performance Improvement: Converting the URL to lowercase in isGitURL might not be necessary since URL schemes are case-insensitive but the rest of the URL can be case-sensitive. This could potentially alter the behavior for some URLs.

    func isGitURL(url string) bool {
        if isValidURL(url) && urlPathWithFragmentSuffix.MatchString(url) {
            return true
        }
        for _, prefix := range []string{"git://", "github.com/", "gitlab.com/", "git@"} {
            if strings.HasPrefix(url, prefix) {
                return true
            }
        }
        return false
    }
  3. Consistency: The function name isValidURL is more descriptive than IsURL, which is a good change. Ensure that all references to IsURL are updated to isValidURL to maintain consistency.

  4. Documentation: Update the comment for isValidURL to reflect the additional check for the Host field if you decide to implement it.

    // isValidURL returns true if the provided str is a well-formed HTTP(S) URL with a non-empty host.

😼 Mergecat review of cmd/locations_test.go

@@ -19,7 +19,7 @@ type fakeCloner struct {
 	err                  error
 }
 
-func (f *fakeCloner) Clone(ctx context.Context, cloneUrl, branchName string) (*git.Repo, error) {
+func (f *fakeCloner) Clone(_ context.Context, cloneUrl, branchName string) (*git.Repo, error) {
 	f.cloneUrl = cloneUrl
 	f.branchName = branchName
 	return f.result, f.err
@@ -199,3 +199,76 @@ func TestMaybeCloneGitUrl_CloneError(t *testing.T) {
 		})
 	}
 }
+
+func Test_isGitURL(t *testing.T) {
+	type args struct {
+		str string
+	}
+	tests := []struct {
+		name string
+		args args
+		want bool
+	}{
+		{
+			name: "git url 1",
+			args: args{
+				str: "https://gitlab.com/org/team/project.git",
+			},
+			want: true,
+		},
+		{
+			name: "git url 2",
+			args: args{
+				str: "git://github.com/org/team/project.git",
+			},
+			want: true,
+		},
+		{
+			name: "git url 3",
+			args: args{
+				str: "http://github.com/org/team/project.git",
+			},
+			want: true,
+		},
+		{
+			name: "git url 4",
+			args: args{
+				str: "git://test.local/org/team/project.git",
+			},
+			want: true,
+		},
+		{
+			name: "git url invalid 1",
+			args: args{
+				str: "scp://whatever.com/org/team/project.git",
+			},
+			want: false,
+		},
+		{
+			name: "git url invalid 2",
+			args: args{
+				str: "ftp://github.com/org/team/project.git",
+			},
+			want: false,
+		},
+		{
+			name: "git url invalid 3",
+			args: args{
+				str: "thisisnoturl",
+			},
+			want: false,
+		},
+		{
+			name: "git url invalid 4",
+			args: args{
+				str: "http://zapier.com",
+			},
+			want: false,
+		},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			assert.Equalf(t, tt.want, isGitURL(tt.args.str), "isGitURL(%v)", tt.args.str)
+		})
+	}
+}

Feedback & Suggestions:

  1. Context Parameter Ignored: Changing ctx to _ in the Clone method signature indicates that the context is not used. If the context is genuinely not needed, this is fine. However, if the context should be used for cancellation or timeout purposes, consider utilizing it to ensure proper resource management.

  2. Test Coverage for isGitURL: The new test function Test_isGitURL is a good addition to ensure the isGitURL function works correctly. However, consider adding edge cases such as:

    • URLs with unusual but valid characters.
    • URLs with different protocols that might be valid in some contexts (e.g., ssh://).
  3. Test Naming Consistency: The test names like "git url 1", "git url 2", etc., are not very descriptive. Consider using more descriptive names to indicate what each test case is verifying, e.g., "valid https git url", "valid git protocol url", etc.

  4. Error Messages in Tests: Ensure that the error messages in the tests are clear and provide enough context. For example, if a test fails, it should be easy to understand why it failed based on the error message.

  5. Security Consideration: Ensure that the isGitURL function is robust against potential security issues such as URL injection or malformed URLs. This might involve additional validation or sanitization steps.



Dependency Review

Click to read mergecats review!

No suggestions found

Copy link

github-actions bot commented Sep 12, 2024

Temporary image deleted.

@polyrain polyrain merged commit dd1ca0f into main Sep 12, 2024
5 checks passed
@polyrain polyrain deleted the unitest-url-check branch September 12, 2024 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants