Top merge Questions

1353
J. Pablo Fernández

What's the difference between git pull and git fetch?

Answered By: Greg Hewgill ( 1221)

In the simplest terms, git pull does a git fetch followed by a git merge.

You can do a git fetch at any time to update your local copy of a remote branch. This operation never changes any of your own branches and is safe to do without changing your working copy. I have even heard of people running git fetch periodically in a cron job in the background (although I wouldn't recommend doing this).

A git pull is what you would do to bring your repository up to date with a remote repository.
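In other words, a rough sketch of the equivalence (the remote and branch names are just the usual defaults, and this assumes the current branch tracks origin/master):

git fetch origin              # update your remote-tracking branches; touches nothing local
git merge origin/master       # merge the updated remote-tracking branch into your current branch
# ...which is roughly what a plain "git pull" does in a single step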

Is there a good way to explain how to resolve merge conflicts in Git?

Answered By: Peter Burns ( 655)

Try: git mergetool

It opens a GUI that steps you through each conflict and you get to choose how to merge. Sometimes it requires a bit of hand editing afterwards, but usually it's enough by itself. Much better than doing the whole thing by hand certainly.
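Which GUI opens depends on your configuration. As a sketch (vimdiff is just one example of a tool you might have installed), you can pick one explicitly with:

git config merge.tool vimdiff   # set the default tool used by git mergetool
git mergetool                   # step through each conflicted file in that tool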

I've been using git now for a couple months on a project with one other developer. I have several years of experience with svn, so I guess I bring a lot of baggage to the relationship.

I have heard that git is excellent for branching and merging, and so far, I just don't see it. Sure, branching is dead simple, but when I try to merge, everything goes all to hell. Now, I'm used to that from svn, but it seems to me that I just traded one sub-par versioning system for another.

My partner tells me that my problems stem from my desire to merge willy-nilly, and that I should be using rebase instead of merge in many situations. For example, here's the workflow that he's laid down:

clone the remote repo
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature
git checkout master
git merge my_new_feature

Essentially, create a feature branch, ALWAYS rebase from master to the branch, and merge from the branch back to master. Important to note is that the branch always stays local.

Here is the workflow that I started with:

clone remote repo
create my_new_feature branch on remote repo
git checkout --track -b my_new_feature origin/my_new_feature
..work, commit, push to origin/my_new_feature
git merge master (to get some changes that my partner added)
..work, commit, push to origin/my_new_feature
git merge master
..finish my_new_feature, push to origin/my_new_feature
git checkout master
git merge my_new_feature
delete remote branch
delete local branch

There are 2 essential differences (I think): I use merge always instead of rebasing, and I push my feature branch (and my feature branch commits) to the remote repo.

My reasoning for the remote branch is that I want my work backed up as I'm working. Our repo is automatically backed up and can be restored if something goes wrong. My laptop is not, or not as thoroughly. Therefore, I hate to have code on my laptop that's not mirrored somewhere else.

My reasoning for the merge instead of rebase is that merge seems to be standard and rebase seems to be an advanced feature. My gut feeling is that what I'm trying to do is not an advanced setup, so rebase should be unnecessary. I've even perused the new Pragmatic Programming book on git, and they cover merge extensively and barely mention rebase.

Anyways, I was following my workflow on a recent branch, and when I tried to merge it back to master, it all went to hell. There were tons of conflicts with things that should not have mattered. The conflicts just made no sense to me. It took me a day to sort everything out, and eventually culminated in a forced push to the remote master, since my local master had all conflicts resolved, but the remote one still wasn't happy.

What is the "correct" workflow for something like this? Git is supposed to make branching and merging super-easy, and I'm just not seeing it.

Update 2011-04-15

This seems to be a very popular question, so I thought I'd update with my 2 years experience since I first asked.

It turns out that the original workflow is correct, at least in our case. In other words, this is what we do and it works:

clone the remote repo
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature
git checkout master
git merge my_new_feature

In fact, our workflow is a little different, as we tend to do squash merges instead of raw merges. This allows us to turn our entire feature branch into a single commit on master. Then we delete our feature branch. This allows us to logically structure our commits on master, even if they're a little messy on our branches. So, this is what we do:

clone the remote repo
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature
git checkout master
git merge --squash my_new_feature
git branch -D my_new_feature

I've come to love git and never want to go back to SVN. If you're struggling, just stick with it and eventually you'll see the light at the end of the tunnel.

Answered By: VonC ( 143)

"Conflicts" mean "parallel evolutions of a same content". So if it goes "all to hell" during a merge, it means you have massive evolutions on the same set of files.

The reason why a rebase is then better than a merge is that:

  • you rewrite your local commit history on top of that of master (and then reapply your work, resolving any conflicts as you go)
  • the final merge will certainly be a "fast forward" one, because master's history is already included, with only your changes reapplied on top.

I confirm that the correct workflow in that case (evolutions on common set of files) is rebase first, then merge.
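In command form, a minimal sketch of that rebase-then-merge sequence (branch names are taken from the workflow above; the --ff-only flag is an extra safety check, not part of the original workflow):

git checkout my_new_feature
git rebase master                     # replay the feature commits on top of master, resolving conflicts here
git checkout master
git merge --ff-only my_new_feature    # succeeds as a fast-forward because the feature now descends from master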

However, that means that, if you push your local branch (for backup reason), that branch should not be pulled (or at least used) by anyone else (since the commit history will be rewritten by the successive rebase).
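If you do keep such a backup branch on the remote, note that after each rebase its history no longer matches what was previously pushed, so the next push has to be forced; for example (branch name again from the workflow above):

git push --force-with-lease origin my_new_feature   # force-push, but refuse to overwrite work you haven't seen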


On that topic (rebase then merge workflow), barraponto mentions in the comments two interesting posts, both from randyfay.com:

Using this technique, your work always goes on top of the public branch like a patch that is up-to-date with current HEAD.

(a similar technique exists for bazaar)

286
David Joyner

I'm using git on a new project that has two parallel -- but currently experimental -- development branches:

  • master: import of existing codebase plus a few mods that I'm generally sure of
  • exp1: experimental branch #1
  • exp2: experimental branch #2

exp1 and exp2 represent two very different architectural approaches. Until I get further along I have no way of knowing which one (if either) will work. As I make progress in one branch I sometimes have edits that would be useful in the other branch and would like to merge just those.

What is the best way to merge selective files from one development branch to another while leaving behind everything else?

Approaches I've considered:

  1. git merge --no-commit followed by manual unstaging of a large number of edits that I don't want to make common between the branches.

  2. Manual copying of common files into a temp directory followed by git checkout to move to the other branch and then more manual copying out of the temp directory into the working tree.

  3. A variation on the above. Abandon the exp branches for now and use two additional local repositories for experimentation. This makes the manual copying of files much more straightforward.

All three of these approaches seem tedious and error-prone. I'm hoping there is a better approach; something akin to a filter path parameter that would make git-merge more selective.

Answered By: 1800 INFORMATION ( 103)

You use the cherry-pick command to get individual commits from one branch.

If the change(s) you want are not in individual commits, then use the method shown here to split the commit into individual commits. Roughly speaking, you use git rebase -i to get the original commit to edit, then git reset HEAD^ to selectively revert changes, then git commit to commit that bit as a new commit in the history.

There is another nice method here in Red Hat Magazine, where they use git add --patch or possibly git add --interactive which allows you to add just parts of a hunk, if you want to split different changes to an individual file (search in that page for "split").

Having split the changes, you can now cherry-pick just the ones you want.
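Putting those pieces together, here is a hedged sketch (the commit references and messages are hypothetical) of splitting a commit on exp1 and then picking one half into exp2:

git checkout exp1
git rebase -i <commit-to-split>^      # mark the commit as "edit" in the todo list
git reset HEAD^                       # undo the commit, keeping its changes in the working tree
git add --patch                       # stage only the hunks that belong in the first half
git commit -m "first half"
git add -A                            # stage the remainder
git commit -m "second half"
git rebase --continue

git checkout exp2
git cherry-pick <sha-of-first-half>   # bring just the wanted half across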

247
Florian Pilz

Coming from Mercurial, I'm using branches to organize features. Naturally I want to see this workflow in my history as well.

I started my new project using git and finished my first feature. While merging the feature I realized that git uses fast-forward, i.e. applies my changes directly to the master branch if possible and forgets about my branch.

So to think into the future: I'm the only one working on this project. If I use the default approach of git (fast-forward merging) my history would result in one giant master branch. Nobody knows that I used a separate branch for every feature, because in the end I only have that giant master branch. Doesn't that look unprofessional?

Due to this reasoning I don't want fast-forward merging as the default behaviour of git and can't see a good reason why it is the default. Maybe there are reasons, but what's so striking about it that it has to be the default action?

Answered By: VonC ( 321)

Fast-forward merging makes sense for short-lived branches, but in a more complex history, non-fast-forward merging may make the history easier to understand, and make it easier to revert a group of commits.

(From nvie.com, Vincent Driessen, post "A successful Git branching model")

Incorporating a finished feature on develop

Finished features may be merged into the develop branch to add them to the upcoming release:

$ git checkout develop
Switched to branch 'develop'
$ git merge --no-ff myfeature
Updating ea1b82a..05e9557
(Summary of changes)
$ git branch -d myfeature
Deleted branch myfeature (was 05e9557).
$ git push origin develop

The --no-ff flag causes the merge to always create a new commit object, even if the merge could be performed with a fast-forward. This avoids losing information about the historical existence of a feature branch and groups together all commits that together added the feature.


Fast-forward is the default because:

  • short-lived branches are very easy to create and use in Git
  • short-lived branches often isolate many commits that can be reorganized freely within that branch
  • those commits are actually part of the main branch: once reorganized, the main branch is fast-forwarded to include them.

But if you anticipate an iterative workflow on one topic/feature branch (i.e., I merge, then I go back to this feature branch and add some more commits), then it is useful to include only the merge in the main branch, rather than all the intermediate commits of the feature branch.

In this case, you can end up setting this kind of config file:

[branch "master"]
# This is the list of cmdline options that should be added to git-merge 
# when I merge commits into the master branch.

# The option --no-commit instructs git not to commit the merge
# by default. This allows me to do some final adjustment to the commit log
# message before it gets committed. I often use this to add extra info to
# the merge message or rewrite my local branch names in the commit message
# to branch names that are more understandable to the casual reader of the git log.

# Option --no-ff instructs git to always record a merge commit, even if
# the branch being merged into can be fast-forwarded. This is often the
# case when you create a short-lived topic branch which tracks master, do
# some changes on the topic branch and then merge the changes into the
# master which remained unchanged while you were doing your work on the
# topic branch. In this case the master branch can be fast-forwarded (that
# is the tip of the master branch can be updated to point to the tip of
# the topic branch) and this is what git does by default. With --no-ff
# option set, git creates a real merge commit which records the fact that
# another branch was merged. I find this easier to understand and read in
# the log.

mergeoptions = --no-commit --no-ff
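If you prefer not to edit the config file by hand, the same option can be set from the command line; a minimal sketch:

git config branch.master.mergeoptions "--no-commit --no-ff"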

The OP adds in the comments:

I see some sense in fast-forward for [short-lived] branches, but making it the default action means that git assumes you... often have [short-lived] branches. Reasonable?

Jefromi answers:

I think the lifetime of branches varies greatly from user to user. Among experienced users, though, there's probably a tendency to have far more short-lived branches.

To me, a short-lived branch is one that I create in order to make a certain operation easier (rebasing, likely, or quick patching and testing), and then immediately delete once I'm done.
That means it likely should be absorbed into the topic branch it forked from, and the topic branch will be merged as one branch. No one needs to know what I did internally in order to create the series of commits implementing that given feature.

More generally, I add:

it really depends on your development workflow:

  • If it is linear, one branch makes sense.
  • If you need to isolate features and work on them for a long period of time and repeatedly merge them, several branches make sense.

See "When should you branch?"

Actually, when you consider the Mercurial branch model, it is at its core one branch per repository (even though you can create anonymous heads, bookmarks and even named branches)
See "Git and Mercurial - Compare and Contrast".

Mercurial, by default, uses anonymous lightweight codelines, which in its terminology are called "heads".
Git uses lightweight named branches, with injective mapping to map names of branches in the remote repository to names of remote-tracking branches.
Git "forces" you to name branches (well, with the exception of a single unnamed branch, which is a situation called a "detached HEAD"), but I think this works better with branch-heavy workflows such as topic branch workflow, meaning multiple branches in a single repository paradigm.

241
Aaron Digulla

EntityManager.merge() can insert new objects and update existing ones.

Why would one want to use persist() (which can only create new objects)?

Answered By: Mike ( 455)

Either way will add an entity to a PersistenceContext; the difference is in what you do with the entity afterwards.

Persist takes an entity instance, adds it to the context and makes that instance managed (i.e. future updates to the entity will be tracked).

Merge creates a new instance of your entity, copies the state from the supplied entity, and makes the new copy managed. The instance you pass in will not be managed (any changes you make will not be part of the transaction - unless you call merge again).

Maybe a code example will help.

MyEntity e = new MyEntity();

// scenario 1
// tran starts
em.persist(e); 
e.setSomeField(someValue); 
// tran ends, and the row for someField is updated in the database

// scenario 2
// tran starts
e = new MyEntity();
em.merge(e);
e.setSomeField(anotherValue); 
// tran ends but the row for someField is not updated in the database (you made the changes *after* merging)

// scenario 3
// tran starts
e = new MyEntity();
MyEntity e2 = em.merge(e);
e2.setSomeField(anotherValue); 
// tran ends and the row for someField is updated (the changes were made to e2, not e)

Scenario 1 and 3 are roughly equivalent, but there are some situations where you'd want to use Scenario 2.

I've heard in a few places that one of the main ways distributed version control systems shine is much better merging than traditional tools like SVN. Is this actually due to inherent differences in how the two systems work, or do specific DVCS implementations like Git/Mercurial just have cleverer merging algorithms than SVN?

Answered By: Spoike ( 322)

The claim that merging is better in a DVCS than in Subversion was largely based on how branching and merging worked in Subversion a while ago. Subversion prior to 1.5.0 didn't store any information about when branches were merged, so when you wanted to merge you had to specify which range of revisions to merge.

So why did Subversion merges suck?

Ponder this example:

      1   2   4     6     8
trunk o-->o-->o---->o---->o
       \
        \   3     5     7
b1       +->o---->o---->o

When we want to merge b1's changes into the trunk we'd issue the following command, while standing in a folder that has trunk checked out:

svn merge -r 3:7 {link to branch b1}

… which will attempt to merge the changes from b1 into your local working directory. You then commit the changes after you have resolved any conflicts and tested the result. When you commit, the revision tree will look like this:

      1   2   4     6     8   9
trunk o-->o-->o---->o---->o-->o      "the merge commit is at r9"
       \
        \   3     5     7
b1       +->o---->o---->o

However, this way of specifying revision ranges quickly gets out of hand as the version tree grows, because Subversion had no metadata about when and which revisions had been merged together. Ponder what happens later:

           12        14
trunk  …-->o-------->o
                                     "Okay, so when did we merge last time?"
              13        15
b1     …----->o-------->o

This is largely a consequence of Subversion's repository design: in order to create a branch you create a new virtual directory in the repository which houses a copy of the trunk, but no information is stored about when and what got merged back in. That will lead to nasty merge conflicts at times. What was even worse is that Subversion used two-way merging by default, which has some crippling limitations in automatic merging because the two branch heads are not compared with their common ancestor.

To mitigate this, Subversion now stores metadata for branches and merges. That should solve all the problems, right?

And oh, by the way, Subversion still sucks…

On a centralized system, like Subversion, virtual directories suck. Why? Because everyone has access to view them… even the garbage experimental ones. Branching is good if you want to experiment, but you don't want to see everyone's (and their aunt's) experimentation. This is serious cognitive noise. The more branches you add, the more crap you'll get to see.

The more public branches you have in a repository, the harder it becomes to keep track of them all. The question you'll face is whether a branch is still in development or really dead, which is hard to tell in any centralized version control system.

Most of the time, from what I've seen, an organization will default to using one big branch anyway. Which is a shame, because that in turn makes it difficult to keep track of testing and release versions, and forfeits whatever else good comes from branching.

So why are DVCS, such as Git, Mercurial and Bazaar, better than Subversion at branching and merging?

There is a very simple reason why: branching is a first-class concept. There are no virtual directories by design, and branches are hard objects in a DVCS, which they need to be in order for synchronization of repositories (i.e. push and pull) to work simply.

The first thing you do when you work with a DVCS is to clone repositories (git's clone, hg's clone and bzr's branch). Cloning is conceptually the same thing as creating a branch in version control. Some call this forking or branching (although the latter is often also used to refer to co-located branches), but it's just the same thing. Every user runs their own repository, which means you have per-user branching going on.

The version structure is not a tree, but rather a graph; more specifically, a directed acyclic graph (DAG, meaning a graph that doesn't have any cycles). You really don't need to dwell on the specifics of a DAG other than that each commit has one or more parent references (the commits it was based on). Because of this, the following graphs show the arrows between revisions in reverse.

A very simple example of merging would be this: imagine a central repository called origin and a user, Alice, cloning the repository to her machine.

         a…   b…   c…
origin   o<---o<---o
                   ^master
         |
         | clone
         v

         a…   b…   c…
alice    o<---o<---o
                   ^master
                   ^origin/master

What happens during a clone is that every revision is copied to Alice exactly as it was (which is validated by the uniquely identifiable hash IDs), and a record is kept of where origin's branches point.

Alice then works on her repo, committing in her own repository, and decides to push her changes:

         a…   b…   c…
origin   o<---o<---o
                   ^ master

              "what'll happen after a push?"


         a…   b…   c…   d…   e…
alice    o<---o<---o<---o<---o
                             ^master
                   ^origin/master

The solution is rather simple: the only thing the origin repository needs to do is take in all the new revisions and move its branch to the newest revision (which git calls a "fast-forward"):

         a…   b…   c…   d…   e…
origin   o<---o<---o<---o<---o
                             ^ master

         a…   b…   c…   d…   e…
alice    o<---o<---o<---o<---o
                             ^master
                             ^origin/master

The use case illustrated above doesn't even need to merge anything. So the issue really isn't with merging algorithms, since the three-way merge algorithm is pretty much the same between all version control systems. The issue is more about structure than anything else.

So how about you show me an example that has a real merge?

Admittedly the above example is a very simple use case, so let's do a much more twisted one, albeit a more common one. Remember that origin started out with three revisions? Well, the guy who did them, let's call him Bob, has been working on his own and made a commit in his own repository:

         a…   b…   c…   f…
bob      o<---o<---o<---o
                        ^ master
                   ^ origin/master

                   "can Bob push his changes?" 

         a…   b…   c…   d…   e…
origin   o<---o<---o<---o<---o
                             ^ master

Now Bob can't push his changes directly to the origin repository. The system detects this by checking whether Bob's revisions directly descend from origin's, which in this case they don't. Any attempt to push will result in the system saying something akin to "Uh... I'm afraid I can't let you do that, Bob."

So Bob has to pull in and then merge the changes (with git's pull; or hg's pull and merge; or bzr's merge). This is a two-step process. First Bob has to fetch the new revisions, which will copy them as they are from the origin repository. We can now see that the graph diverges:

                        v master
         a…   b…   c…   f…
bob      o<---o<---o<---o
                   ^
                   |    d…   e…
                   +----o<---o
                             ^ origin/master

         a…   b…   c…   d…   e…
origin   o<---o<---o<---o<---o
                             ^ master

The second step of the pull process is to merge the diverging tips and make a commit of the result:

                                 v master
         a…   b…   c…   f…       1…
bob      o<---o<---o<---o<-------o
                   ^             |
                   |    d…   e…  |
                   +----o<---o<--+
                             ^ origin/master

Hopefully the merge won't run into conflicts (if you anticipate them, you can do the two steps manually in git with fetch and merge). What needs to be done later is to push those changes again to origin, which will result in a fast-forward merge since the merge commit is a direct descendant of the latest commit in the origin repository:

                                 v origin/master
                                 v master
         a…   b…   c…   f…       1…
bob      o<---o<---o<---o<-------o
                   ^             |
                   |    d…   e…  |
                   +----o<---o<--+

                                 v master
         a…   b…   c…   f…       1…
origin   o<---o<---o<---o<-------o
                   ^             |
                   |    d…   e…  |
                   +----o<---o<--+

There is another option for merging in git and hg, called rebase, which will move Bob's changes on top of the newest changes (a brief sketch of both options follows below). Since I don't want this answer to be any more verbose, I'll let you read the git, mercurial or bazaar docs about that instead.
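For completeness, here is a hedged sketch of Bob's two options in git (not from the original answer; it assumes Bob is on his local master and the remote is called origin, as in the diagrams above):

# option 1: fetch and merge, producing the merge commit shown as "1" above
git fetch origin
git merge origin/master
git push origin master

# option 2: fetch and rebase, replaying Bob's commit f on top of d and e instead
git fetch origin
git rebase origin/master
git push origin master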

As an exercise for the reader, try drawing out how it would work with another user involved. It is done much like the example above with Bob. Merging between repositories is easier than you might think, because all the revisions/commits are uniquely identifiable.

There is also the issue of sending patches between developers; that was a huge problem in Subversion, and it is mitigated in git, hg and bzr by uniquely identifiable revisions. Once someone has merged their changes (i.e. made a merge commit) and sent the result for everyone else in the team to consume, either by pushing to a central repository or by sending patches, they don't have to worry about the merge, because it has already happened. Martin Fowler calls this way of working promiscuous integration.

Because the structure is different from Subversion's, instead employing a DAG, branching and merging can be done more easily, not only for the system but for the user as well.

179
Jason

Possible Duplicate:
Change the current branch to master in git

I have two branches in my git repo:

  1. master
  2. seotweaks (created originally from master)

I created "seotweaks" with the intention of quickly merging it back into master. However, that was 3 months ago, and the code in this branch is now 13 versions ahead of "master"; it has effectively become our working master branch, as all the code in "master" is more or less obsolete now.

Very bad practice I know, lesson learnt.

Do you know how I can replace all of the contents of the "master" branch with those in "seotweaks"?

I could just delete everything in "master" and merge, but this does not feel like best practice.

Answered By: ergosys ( 239)

You should be able to use the "ours" merge strategy to overwrite master with seotweaks like this:

git checkout seotweaks
git merge -s ours master
git checkout master
git merge seotweaks

The result should be that your master is now essentially seotweaks.

178
Greg

I had a feature branch off my trunk and was periodically merging changes from my trunk into my branch, and everything was working fine. Today I went to merge the branch back down into the trunk, and any files that had been added to my trunk after the creation of my branch were flagged as a "tree conflict". Is there any way to avoid this in the future? I don't think these are being flagged properly.

Thanks.

Answered By: gicappa ( 225)

I found the solution by reading the link that Gary gave (and I suggest following it).

To summarize: to resolve the tree conflict and commit your working directory with SVN client 1.6.x, you can use:

svn resolve --accept working -R .

where . is the directory in conflict.

144
netawater

I have forked a branch from a repository on GitHub and committed something for my own use. Now I've found that the original repository has a good feature at HEAD. I want to merge only that feature, without the previous commits. What should I do? I already know how to merge all the commits:

git checkout -b a-good-feature
git pull repository master
git checkout master
git merge a-good-feature
git commit -a
git push

Answered By: VonC ( 188)

'git cherry-pick' should be your answer here.

Apply the change introduced by an existing commit.
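For the fork scenario in the question, a minimal sketch (the remote name "upstream", the URL placeholder and the commit hash are hypothetical):

git remote add upstream <url-of-original-repository>
git fetch upstream                          # get the original repository's commits
git cherry-pick <sha-of-the-good-feature>   # apply only that one commit to your branch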

Do not forget to read bdonlan's answer about the consequence of cherry-picking in this post:
"Pull all commits from a branch, push specified commits to another", where:

A-----B------C
 \
  \
   D

becomes:

A-----B------C
 \
  \
   D-----C'

where C' has a different SHA-1 ID. The problem with these commits is that git considers a commit to include all of the history before it.

Likewise, cherry-picking a commit from one branch to another basically involves generating a patch and then applying it, thus losing history that way as well.

This changing of commit IDs breaks git's merging functionality among other things (though if used sparingly there are heuristics that will paper over this).
More importantly though, it ignores functional dependencies - if C actually used a function defined in B, you'll never know.