Wireshark-dev: Re: [Wireshark-dev] Migrate to GitLab?

From: João Valverde <joao.valverde@xxxxxxxxxxxxxxxxxx>
Date: Sat, 12 Oct 2019 17:40:35 +0100


On 12/10/19 14:59, Pascal Quantin wrote:


Le sam. 12 oct. 2019 à 15:08, João Valverde <joao.valverde@xxxxxxxxxxxxxxxxxx> a écrit :


On 12/10/19 13:33, Peter Wu wrote:
On Sat, Oct 12, 2019 at 12:33:48PM +0100, João Valverde wrote:
On 12/10/19 11:48, Roland Knall wrote:
tl;dr - I am also -2 on merge commits, not entirely sure about ff
either, they tend to be work, cherry-pick would be preferable.

Long version:

Currently we do have a strategy in place, that is called "Cherry-Pick".
Basically it means, that Gerrit resolves any branch conflicts (the patch
had been developed on an older commit then the current master has) and
just cherry-picks the change. It has two basic advantages: 1. It
prevents "merge-branch" commit messages and therefore keeps the master
history clean, but also 2. it prevents the developer from rebasing on
master before merging the change.

Different to cherry-pick, merge branch strategy will also try to merge
the commit together with the master branch. But in the scenario
described above, it can lead to scenarios where the two branches are out
of sync, and now git* will create a merge-commit to move them on the
same track again. There are two major issues with that strategy: 1. it
pollutes history. It will end up creating a bunch of those merge
requests, as long as the developers don't take rebase seriously and 2.
it increases the risk that a merge will overwrite newer changes, an
absolut no-no. It usually is deployed either in small projects or
together with a very complete CI/CD integration with nearly 100%
automatic functionality testing. In bigger projects it is a very bad
idea in my opinion.
How can a merge commit overwrite changes? I don't understand what you are
saying.

If you merge a feature branch with n commits and there are conflicts you get
a merge conflict and need to resolve 1 conflict.

If you rebase a feature branch with n commits and there are conflicts you
get a rebase conflict and need to resolve up to n conflicts.

Pretty much the same, except the second is potentially more tedious work.

It didn't seem he was saying that but conflict resolution is absolutely easier and more obvious with merge commits.

When a merge conflict happens, it has to be resolved. When you rebase
it, any conflict resolutions will appear in the individual commits. With
a merge conflict, the resolution happens in the merge commit. Any wrong
conflict resolution will be more obvious with linear history. I think
that is what Roland is referring to.

Roland first concern (history pollution) is reasonable. Commits such as
"fixed typo in previous commit" and "Merge branch 'master' into xyz" are
not really helpful. The first commit should have been squashed into an
earlier one whereas the second commit could be avoided by a rebase to
allow fast-forward merging.

A project that does rely on merge commits is Linux:

 - Individual patches from mailing lists are applied.  Subsystem
   maintainers then send pull requests to other maintainers, and
   ultimately end up at Linus.
 - Linus will be very upset if you rebase your branch on current master
   before sending a pull request.
 - The reasoning behind this is that branches are expected to be tested.
   If a branch has just been rebased before submitting, chances are that
   it lacked testing.

If you want more background on the model in Linux, be sure to check
https://www.kernel.org/doc/html/latest/maintainer/rebasing-and-merging.html

That's another problem with rebases. You have to manually retest every single commit (or squash and lose history), which I'm sure almost none will do. The original branch is assumed to be tested already before a merge.

I can't recall a case where we discovered a bug due to a rebase.

I'm not even talking about subtle bugs, just broken builds from normal conflict resolution mistakes and whatever else. It doesn't happen now because with Gerrit we test every commit that goes in the branch (except trivial one liners). With Gitlab rebases that won't happen. Only HEAD is tested with CI/CD.

I could be wrong but I have the feeling that with our current workflow, the size of our code base and the fact that most of the time people do not work on the same code part at the same time, doing a rebase was not an issue.
Moreover the definition of testing is subject to interpretation. I guess almost everybody verifies that his change answers its own need, but do not necessarily check the side effects (which I can easily understand as you do not have a real non regression system that would take ages to execute anyway).
I'm not against merge and will accept it if we decide to go this way, but my humain brain prefers a linear history when looking at changes done in a file. So I would really enjoy keeping our current status quo if possible.

Sure, that's understandable. Like I said, I agree a linear history is nicer. I'm not going to argue against your preference. I think it's a worse tradeoff but I'm fine with it anyway.


Now fast-forward is the strictest of it all, although a little bit
similar to cherry-picking. It is used to absolutely ensure, that the
commit history is clean. It basically means, that your merge request
will have to happen on the very top of the commit history of the branch
you want to merge into. This means, that the developer has to ensure
that the patchset can be merged, no automatic resolving takes place. It
is more cumbersome to work with, especially in projects where a lot of
people commit. To my recollection the collisions have to be resolved
only for the files touching the merge request, making it a little bit
less strict, but still it requires additional effort on the side of the
developer. (see https://docs.gitlab.com/ee/user/project/merge_requests/fast_forward_merge.html
for gitlab's explanation, at the bottom they have a very good overview
of what it entails).

Personally I have ever worked with fast-forward and cherry-pick. The
first is an absolut must if code traceability has to be ensured (e.g.
for machine safety applications), the latter for projects which want a
very clean main history (e.g. for release notes) and also avoid extra
checks for feature branch merges (due to the nature of overwriting
existing changes on chance with merge-only strategies).

Therefore I am very strongly opposed on merge commits and would prefer
fast-forward. If we go with merge commits we would need other features
and workflows to ensure, that no overwrite can take place or it is
detected properly if it happens.
If we do forbid merge commits, it might still be worth editing the
commit message with a reference to the merge request where the review
has happened. For example
https://github.com/curl/curl/wiki/push-access-guidelines#updating-a-commit-message-before-pushing

FWIW, the LLVM project is another interesting case. They used to rely on
subversion for pushes where Git was a read-only mirror of it. Now they
are moving to Git on Github, but still forbid merges:
https://llvm.org/docs/Proposals/GitHubMove.html
https://llvm.org/GitHubMigrationStatus.html

Kind regards,
Peter

regards
Roland

Am Sa., 12. Okt. 2019 um 12:09 Uhr schrieb Graham Bloice:

    As one who has never used GitLab, I'm uncertain about the changes.
    To educate me can anyone point to an instance of a merge commit in
    another project?

    If I understand them correctly (which might not be true) then I'm
    a -2 for merge commits.  I really do NOT want to see master commit
    history polluted with the details of the sausage making, just the
    effective change.

    To clarify discussion on this I would like to see detailed
    workflows of both approaches (ff only and merge commits), i.e.
    intial change creation, amendment, approval and merge to master,
    along with any tooling (e.g. similar to git-review) that makes the
    process easier.

    When we have the workflows laid out then we then have a basis for
    discussion as to which fits the project needs better.


    On Sat, 12 Oct 2019 at 02:28, João Valverde wrote:


        On 11/10/19 22:54, Gerald Combs wrote:
        The reactions to migrating so far have been pretty
        favorable, so I've started documenting the process at
        https://gitlab.com/wireshark/gitlab-migration/wikis/home.
        There are still a lot of things to figure out, but I'm
        hoping that we can start preparation and testing some time
        in November and cut over the repository in mid January
        (the 14th will be the 6th anniversary of the
        Subersion-to-Gerrit cutover).
        Awesome. Also +1 for merge commits from me. Sadly I haven't
        really seen anyone else advocating for this.

        I agree it's nicer to look at a linear history but the
        workflow is better with merges.
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives:    https://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://www.wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe

___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives:    https://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://www.wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe

___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives:    https://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://www.wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe