Skip to content

Conversation

@samestep
Copy link
Collaborator

@samestep samestep commented Feb 19, 2021

Closes #23.

Let's say we have a stack with two commits (listed here in reverse chronological order):

  • gh/user/Y
  • gh/user/X

Let's say that origin/gh/user/Y/base is out of date, and we run ghstack to update these on GitHub.

  • Currently, if origin/gh/user/Y/base is an ancestor of (local) gh/user/X/head, then we update origin/gh/user/Y/base to be gh/user/X/head. Otherwise, we update origin/gh/user/Y/base to be a merge commit whose parents are:
    • (the previous) origin/gh/user/Y/base
    • (the new) gh/user/X/head
  • After this PR, regardless of any ancestral relationship, if origin/gh/user/Y/base is not equal to gh/user/X/head, we update origin/gh/user/Y/base to be a new commit whose only parent is (the previous) origin/gh/user/Y/base.

This reduces the number of cases in which base commits end up as descendants for head commits from other parts of the stack.


This PR adds test_ghstack.py::TestGh::test_reorder to check a commit reordering scenario similar to the one given in the linked issue. In the test, this is the repository state before reordering:

* rMRG2 (gh/ezyang/2/head) Commit 2
* rMRG1 (gh/ezyang/2/base, gh/ezyang/1/head) Commit 1
* rINI0 (HEAD -> master, gh/ezyang/1/base) Initial commit

If you run the test without the changes introduced in this PR, this is what the final "Repository state" looks like:

*   rMRG1A (gh/ezyang/1/head) Reorder on "Commit 1"
|\
| *   rMRG2A (gh/ezyang/2/head, gh/ezyang/1/base) Reorder on "Commit 2"
| |\
| | * rINI2A (gh/ezyang/2/base) Update base for Reorder on "Commit 2"
| |/|
|/| |
| * | rMRG2 Commit 2
|/ /
* / rMRG1 Commit 1
|/
* rINI0 (HEAD -> master) Initial commit

The order of operations to get to this point is important:

  1. we reorder the commits and run ghstack
  2. ghstack pushes the base branches first
  3. then ghstack pushes the head branches
  4. finally, ghstack pushes the orig branches (not shown here)

This means that after step 2 (but before step 3), we have the following:

  • gh/ezyang/1/head points to rMRG1
  • gh/ezyang/1/base points to rMRG2A

But rMRG1 is an ancestor of rMRG2A! Thus from GitHub's perspective, gh/ezyang/1/head has been merged into gh/ezyang/1/base, so it marks the PR as merged. Updating gh/ezyang/1/head after this point doesn't actually do anything to the PR.

In contrast, this is the final repository state of the test using the changes in this PR:

*   rMRG1A (gh/ezyang/1/head) Reorder on "Commit 1"
|\
| * rINI1A (gh/ezyang/1/base) Update base for Reorder on "Commit 1"
| | *   rMRG2A (gh/ezyang/2/head) Reorder on "Commit 2"
| | |\
| | | * rINI2A (gh/ezyang/2/base) Update base for Reorder on "Commit 2"
| |_|/
|/| |
| | * rMRG2 Commit 2
| |/
|/|
* | rMRG1 Commit 1
|/
* rINI0 (HEAD -> master) Initial commit

As shown, neither new base commit is a descendant of its respective former head commit, so neither PR gets prematurely marked as merged.


This PR also changes a few existing tests in test_ghstack.py::TestGh:

  • test_amend_bottom
  • test_amend_all
  • test_rebase
  • test_cherry_pick
  • test_remove_bottom_commit

Most of these seem fine, but the test_rebase and test_cherry_pick changes are a bit concerning, because the result is that after we update master, the updated master commit still isn't actually an ancestor of our base or head branches.

@samestep
Copy link
Collaborator Author

Given the issues shown by test_rebase and test_cherry_pick with the current version of this PR, I'm thinking we'll need to make the logic in this PR a bit more complicated, to allow for the base of the whole stack to be updated, without accidentally making head commits ancestors of their corresponding base commits.

@ezyang
Copy link
Owner

ezyang commented Mar 1, 2021

So, while I agree that rINI2 not showing up in the final merge graph in test_rebase is suboptimal (and the "fancy but buggy" behavior in the original version of ghstack was engineered around making sure rINI2 made sense), I'm not as convinced as you are that it is wrong.

To state the problem differently: previously, ghstack tried to accurately represent the rebase workflow as a series of incremental merges. But if I do an interactive, reordering rebase, what is the desired mergey version of the history?

Rebase A1 and B1 onto M2:

-- M1 -- M2
    \    \
     \   A2 -- B2
      \ /     /
       A1 -- B1

Reorder A1 and B1

-- M1 -----    ???
    \      \     \
     \      B2 -- A2
      \      \
       A1 -- B1

I'd argue that there is just no way to do it in the merge only view fo the universe, so we might as well not even try.

A simple patch to solve the "but I want master commits to be reachable" might just be to add an extra master merge base if you detect that master has advanced in the rebased stack but the merge base isn't there yet. But remember that this merge base is still going to be inaccurate if you rebase your commits backwards in time!

@samestep
Copy link
Collaborator Author

samestep commented Mar 3, 2021

@ezyang That makes sense I think; I'm still pondering your second example but I see the gist of what you're saying. What I've been wondering is: assuming that it's not possible to always construct a consistent mergey history, why not just use an actual stack of commits (e.g. have head and base just point to commits on orig) and force-push when we need to? I'm guessing there's a good reason for this, but I've been having trouble seeing what the reason is.

@ezyang
Copy link
Owner

ezyang commented Mar 3, 2021

The keyword here is "force push" :) If we force push, GitHub loses track of all the previous commits (they no longer show up in the PR description page). By never force pushing, we can maintain consistent history.

@samestep samestep marked this pull request as ready for review March 3, 2021 20:38
@samestep
Copy link
Collaborator Author

samestep commented Mar 3, 2021

@ezyang Makes sense to me. So, should we merge this PR as-is?

@samestep samestep requested a review from ezyang March 3, 2021 20:40
@ezyang
Copy link
Owner

ezyang commented Mar 3, 2021

I guess we might as well dogfood it for a while and see if it works.

@samestep samestep merged commit fd5da8b into ezyang:master Mar 5, 2021
@samestep samestep deleted the fix-reorder-merge branch March 5, 2021 17:18
@samestep
Copy link
Collaborator Author

samestep commented Mar 8, 2021

@ezyang Thinking about this more, I actually think we do want to put in some additional logic to allow rebasing onto a more recent version of master. If we don't allow that, wouldn't this prevent people from resolving merge conflicts in some cases?

@ezyang
Copy link
Owner

ezyang commented Mar 8, 2021

Err, people don't ever actually attempt merges on the head branch, it's just to make GitHub happy.

@samestep
Copy link
Collaborator Author

samestep commented Mar 8, 2021

Ah, I see what you mean; makes sense!

@ezyang
Copy link
Owner

ezyang commented Mar 15, 2021

It turns out, while HUMANS don't merge with master, CI SCRIPTS do. Test failures at https://app.circleci.com/pipelines/github/pytorch/pytorch/285473/workflows/5709135d-0ef7-448f-8757-fde30622f343/jobs/11551386 due to inaccurate master information

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PR unexpectedly merged when commits are reordered

2 participants