Skip to content

Unapply to real branches #4025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Jul 1, 2024
Merged

Unapply to real branches #4025

merged 21 commits into from
Jul 1, 2024

Conversation

Caleb-T-Owens
Copy link
Contributor

No description provided.

@Caleb-T-Owens Caleb-T-Owens changed the title Don't return optional Unapply to real branches Jun 6, 2024
@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch 8 times, most recently from c86953d to 8c6b180 Compare June 11, 2024 17:52
@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch 2 times, most recently from d99e8e0 to 56b516e Compare June 16, 2024 14:49
Copy link
Collaborator

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2k lines removed looks absolutely great, thank you :)!

Since you are digging into the belly of the beast to refactor virtual branches, I wanted to share some thoughts. Ideally, for refactored and new code, we can keep a couple of things in mind.

Type Safety: Avoid UTF8 where it's not guaranteed

Instead, use BString.
I definitely like to get away from stringly-typing as well which is attempted in this PR. There should be more fitting types in gix and git2, even though a TaggedString certainly is a step up already.

Maintainability: Write knowledge down as doc-strings

As a lot of code is rewritten and maybe code is better understood, this knowledge could be written down in the form of doc-strings for everyone to benefit. This would help scaling the team, and sharing knowledge across the existing team. For instance, I would benefit from knowing what you learned by means of doc-strings.

Please do let me know if I am missing something that would make my comments unnecessary or unwanted at this time.

@@ -26,6 +26,10 @@ pub trait RepositoryExt {
fn in_memory<T, F>(&self, f: F) -> Result<T>
where
F: FnOnce(&git2::Repository) -> Result<T>;
fn integration_commit(&self) -> Result<git2::Commit<'_>>;
fn sign_buffer(&self, buffer: &CommitBuffer) -> Result<String>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my mind, it's very important to (eventually) be able to add documentation to these methods. Here, it's unclear if it returns a signed version of buffer, or the signature. I'd lean towards thinking it's the signature, but I'd be confused by the String type which implies it might be hex-encoded.
That's too many questions for me.

Personally, I tend to document as precisely as possible what I know this method is supposed to do while implementing it, because that's when I am the expert. 2 weeks later, I won't be that anymore and happy about some text that explains the intent of what seemed so obvious a couple of weeks ago.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! String is a really bad type there.

I'm still not very happy with the whole CommitBuffer thing because it doesn't support parsing a commit that has been signed either.


pub struct _ReferenceName;
/// The name of a reference ie. `refs/heads/master`
pub type ReferenceName = TaggedString<_ReferenceName>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It certainly depends on where the reference name is coming from, but there is types in git2 and gix which would assure the user that it is a valid reference name as well, and if it is partial (like main) or full (like HEAD or refs/heads/main.

Nonetheless, it's probably good to start with this type, but try to keep in mind that ideally, even more specific types are used. My take on it is that internally, only fully qualified reference names should be used, and with them, probably the gix::refs::FullName or gix::refs::Reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've returned either git2::Branch or git2::Reference from the functions inside virtual.rs because they're just easier to consume again; I've only used ReferenceName inside the controller.rs files as those values need to be serialisable.

We do have our internal Refname struct, but there is a major logical flaw in it and ideally it should be removed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git2 types might have the disadvantage that sometimes they carry a reference to the 'repo , which might make them unusable. gix has owned types for that, but maybe it's too early to introduce them even.

Regarding serializing these types, that's usually quite straightforward with one of these core::serde:: modules, and bstr has serde-support as well.

If in doubt, it's certainly easier to keep it as is and there is merit in that. If the comment could then make clear what values are allowed or usually in there, that would help as well.

@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch from 7609f23 to 761abb9 Compare June 17, 2024 09:41
@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch from ef1df52 to 0fdde59 Compare June 26, 2024 13:29
@github-actions github-actions bot added rust Pull requests that update Rust code svelte labels Jun 26, 2024
@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch 2 times, most recently from 1c019a8 to df83d28 Compare June 30, 2024 20:41
@krlvi
Copy link
Member

krlvi commented Jul 1, 2024

:shipit:

@Caleb-T-Owens Caleb-T-Owens force-pushed the unapply-virtual-branch branch from 037fbe9 to 87297c9 Compare July 1, 2024 13:12
@Caleb-T-Owens Caleb-T-Owens marked this pull request as ready for review July 1, 2024 13:12
@Caleb-T-Owens Caleb-T-Owens enabled auto-merge (squash) July 1, 2024 14:02
@Caleb-T-Owens Caleb-T-Owens merged commit 82c3987 into master Jul 1, 2024
26 checks passed
@Caleb-T-Owens Caleb-T-Owens deleted the unapply-virtual-branch branch July 1, 2024 14:13
@bjeanes
Copy link

bjeanes commented Sep 7, 2024

Is it possible to create a commit that includes the custom headers introduced here using standard Git tools? GitButler has introduced commits to our repository history which have these non-standard gitbutler-headers-version and gitbutler-change-id headers but other tooling is not handling them well.

I want to contribute a failing test case to this other tool, but in order to do so I need to be able to create a commit, but git-commit-tree can't do this, so it feels quite awkward.

Any thoughts would be appreciated. This single commit with these non-standard headers has become quite a headache for us...

@Byron
Copy link
Collaborator

Byron commented Sep 7, 2024

I think I have just the 'technique' for you to do arbitrary edits on commits, without the need for git (CLI) to do any of it. Here is the example. In short, it's showing a commit, then patching it with any change, and creating a new commit object from that by merely hashing it - commits can be treated as text-files.

I hope that helps.

@bjeanes
Copy link

bjeanes commented Sep 7, 2024

That looks like a fantastic starting point. Thank you!

This should fit really nicely into their test framework/approach (e.g.: https://github.com/josh-project/josh/blob/master/tests/filter/subtree_prefix.t)!

@bjeanes
Copy link

bjeanes commented Sep 7, 2024

Also TIL git-hash-object. Thanks for sharing this :)

@Caleb-T-Owens
Copy link
Contributor Author

Hi @bjeanes,

I wanted to just apologies for causing you trouble. Please let us know if there is a way we might be able help you with any issues our headers might have caused.

There are a small number of other tools which also write their own headers, though their names escape me at the moment. It should also be noted that the git CLI itself may include more headers at some point in the future, so I would recommend being able to parse them generically.

@bjeanes
Copy link

bjeanes commented Sep 7, 2024

There are a small number of other tools which also write their own headers, though their names escape me at the moment.

If you think of them later, I'd love to know. It might help substantiate changes to support this in JOSH, or help any patch I might come up with (🤞🏻) be accepted.

I would recommend being able to parse them generically.

Agreed. I hope I can find a nice way to contribute a patch to allow JOSH to pass-through and preserve all unknown headers when it rewrites commit data.

I wanted to just apologies for causing you trouble. Please let us know if there is a way we might be able help you with any issues our headers might have caused.

No sweat at all! Thank you for that offer :).

@bjeanes
Copy link

bjeanes commented Sep 19, 2024

I just wanted to share that I was able to fix this in the other project and it seems like the maintainers are warm to accepting the fix.

I'm sharing here because my first approach josh-project/josh#1387 lifted the CommitBuffer approach introduced in this PR. However, it felt slightly risky to inline a bespoke implementation for writing commits that could miss nuances of how Git works.

I took a second stab using gix-object which already has handling for custom headers. I thought the GitButler maintainers might be interested in doing something similar here, whether themselves or perhaps by accepting a patch (I may or may not give it a go if so).

It seems prudent to have something like gix-object "own" knowledge of the commit objects byte representation rather than GitButler directly, since it is purpose-built for this and can be tested more easily in a variety of scenarios. Further, gix already appears to be a dependency and indeed @Byron is its maintainer anyway...

@Byron
Copy link
Collaborator

Byron commented Sep 20, 2024

I don't have the context how gix-object would apply here, but left a comment over at the submitted josh PR.

@Caleb-T-Owens
Copy link
Contributor Author

Reading through, I want to improve my CommitBuffer implementation as it's not ideal as it doesn't understand how to parse signing headers.

I'd like to make a library for this which can parse any commit correctly.

@bjeanes
Copy link

bjeanes commented Sep 20, 2024

I don't have the context how gix-object would apply here

To replace CommitBuffer introduced here. It parses commit bytes, modifies headers, and outputs new commit bytes.

left a comment over at the submitted josh PR.

I appreciate it! I'll respond to that there.

Reading through, I want to improve my CommitBuffer implementation as it's not ideal as it doesn't understand how to parse signing headers.

Possibly any multi-line header yeah...

I'd like to make a library for this which can parse any commit correctly.

This already is pretty robustly covered in gix-object, including inserting new headers. Is there something more you want to do in a custom library?

@Caleb-T-Owens
Copy link
Contributor Author

This already is pretty robustly covered in gix-object, including inserting new headers. Is there something more you want to do in a custom library?

I've not looked at gix-object much, but I'll take a read through tomorrow. Most of our committing is done via libgit2 at the minute, but if we can change over (or even make use of gix-object simply to format the buffer and keep our existing committing functions) and have a better API, that would be fantastic.

@bjeanes
Copy link

bjeanes commented Sep 20, 2024

if we can change over (or even make use of gix-object simply to format the buffer and keep our existing committing functions) and have a better API, that would be fantastic.

Yep, that's exactly what I did in the JOSH PR. They also use git2 predominantly. It was quite straightforward to surgically use git_object::CommitRef::from_bytes() to parse the commit bytes, manipulate headers, and write it back out with git2's Object DB APIs.

@Byron
Copy link
Collaborator

Byron commented Sep 21, 2024

I also think that using gix-obect (or gix::objs as re-export) would offer anything one would need to manipulate commits. There is also a function that helps with extracting signatures, in case it matters at all (I kind of lost track, apologies).

And as always, if anything seems to be missing or could be more convenient, please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rust Pull requests that update Rust code svelte
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants