gitlab - Git submodule detached state warning

I have a GitLab repository with several branches. I have also added a submodule to this project (pinned at, say, commit v2). Yesterday I updated the submodule's master branch (from v2 to v3), and I ran git submodule update to download these changes into my project. In my project I'm working on the develop branch, not master.

But this command only gave me the version of the submodule already recorded in the main project (v2). Then I ran git pull inside the submodule to get its changes, and I got them (v3).

But now I get the following message in git status:

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   shared-module (new commits)

And when I try to commit, I receive this warning:

The Git repository at the following path is in the detached HEAD state: C:\projects\display-uploader\shared-module

I tried:

git branch temp
git checkout temp
git branch -f develop temp
git checkout develop

It didn't help.

Then I ran git submodule update again to fix the detached state (the "Changes not staged for commit" message disappeared, but the submodule is back on v2). Then I tried both solutions from a post:

git submodule update --remote --merge
git submodule update --remote --rebase

But it also didn't help. What can I do?

Fight dragons too long and you become a dragon yourself; gaze into the abyss too long and the abyss gazes back…

To a first approximation, all submodules are always in detached-HEAD mode. Expect this, and use it. Remember, a detached HEAD repository simply has some commit checked out by its raw hash ID. All commits have a raw hash ID; that hash ID is the true name of the commit. Git normally finds the hash ID through a name—typically a branch name—but when using submodules, Git finds the hash ID for each submodule from a commit in the superproject, without using a branch name in the submodule.
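To see this for yourself, here is a minimal sketch that builds throwaway repositories in a temporary directory. The directory names (sub-upstream, super, clone) are made up for the demo; only shared-module comes from the question. It assumes a reasonably recent Git and is meant to be run as a script, since it points HOME at the sandbox so that your real Git configuration stays untouched.

```shell
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"; unset XDG_CONFIG_HOME    # keep the demo away from your real Git config
git config --global user.name demo
git config --global user.email demo@example.invalid
git config --global protocol.file.allow always   # newer Gits block local-path submodule clones by default
git config --global init.defaultBranch master

# A stand-in for the submodule's upstream repository, with one commit.
git init -q "$sandbox/sub-upstream"
git -C "$sandbox/sub-upstream" commit -q --allow-empty -m "v2"

# A superproject that adds it as a submodule at path shared-module.
git init -q "$sandbox/super"
cd "$sandbox/super"
git submodule --quiet add "$sandbox/sub-upstream" shared-module
git commit -q -m "add shared-module"

# In a fresh clone, Git checks the submodule out by hash, not by branch name.
git clone -q --recurse-submodules "$sandbox/super" "$sandbox/clone"
cd "$sandbox/clone"
recorded=$(git rev-parse HEAD:shared-module)                    # hash stored in the superproject commit
checked_out=$(git -C shared-module rev-parse HEAD)              # hash the submodule actually has checked out
head_state=$(git -C shared-module rev-parse --abbrev-ref HEAD)  # the literal word HEAD means detached
echo "recorded=$recorded"
echo "checked_out=$checked_out"
echo "state=$head_state"
```

The two hashes match, and the submodule reports a detached HEAD: the superproject commit, not a submodule branch, decided which commit got checked out.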

Now, obviously there are some cases where some submodule(s) is/are not in that mode. In particular, you—as a human being trying to use Git to make new commits—may want to enter some submodule and take it out of the detached-HEAD mode and get some work done. That's all perfectly fine; you just need to be aware that, while you're in there doing work, this submodule is not being used normally.

It is, in effect, taken out of operation and put on a test workbench. It's possible that, in order to use and test it—even on the test workbench—you need it to be surrounded by its superproject. That's fine too. Just remember that while you work on it, it's in this "test-bed" environment, even if everything else looks the same. That test-bed-environment is why it's on some branch, instead of being in detached-HEAD mode.

When your various tests are done and the submodule is working the way you wish it to work, you can make a new commit in the submodule (though perhaps at this point you already did that). That new commit has updated some branch name in this submodule—that's why you put it on a branch in the first place—and you're now ready to git push this new commit out to some other copy of this submodule repository. It's time to talk about this other clone of the submodule. In fact, we should talk about all the clones of all the Git repositories.

How to think about the problem

In setups like this, there tend to be a lot of Git repositories involved. It's pretty typical to have a minimum of six different Git repositories, and there might be even more of them. I'm going to mention GitHub here instead of GitLab, but the principles are pretty similar. When we have six, eight, fifty, or however many repositories involved, it can get hard to keep them all straight, so for discussion purposes, let's give them names. Here are the six repositories we might be concerned with at this point. Three of them are various clones of the superproject, and three of them are various clones of the submodule, so let's come up with three names for superproject P:

Paul is the GitHub repository from which people clone the superproject.
Peter is your non-testbed clone of Paul, on your operations machine.
Pickle is your testbed clone of Paul, on your development machine.

And three names for submodule S:

Sam is the GitHub repository from which people clone the submodule.
Seldon is the non-testbed clone of Sam, on your operations machine.
Sidwell is your testbed clone of Sam, on your development machine.

(Mnemonic: first letter: P = superProject; S = Submodule. Second letter, A = master repo; E = mid-level repo; I = testbed / development repo. Hence Paul and Sam vs Pickle and Sidwell.)

Steps, one by one

You have made new commit(s) on/in Sidwell. You will soon need to send these new commit(s) to Sam, so that they're available to Seldon.

You may have already tested this thoroughly. Or, you may want to make a commit first, then test things. Let's assume the latter (though if all of this breaks you might want to throw away the new superproject commit). If you've already tested, this is just the stuff you do after testing. If you have a fancy CI system, you may need to do all this before testing, so as to run the CI system over the new commits.

So: in order to test these new Sidwell commit(s), you now wish to update Pickle so that he has, as his current commit on some branch, a reference to the new commit in Sidwell. To do this part, you don't actually need Sidwell's commit(s) to be on any branch. You just need Sidwell to have the correct commit checked out.¹

We are now going to create a new commit in Pickle. We tell Pickle that, whenever he checks out this particular (new) commit, the detached-HEAD commit he should call for, in Sidwell, is the current commit in Sidwell. The way we do that is to run:

git add path/to/sidwell

in Pickle. The next git commit, and all subsequent git commits until someone changes this yet again, will now call for that hash ID (see footnote 1 for one way to see that hash ID). We can run any additional git adds we like, if we want other files updated; then we run:

git commit

and supply an appropriate commit message, and do all that stuff. This makes a new commit in repository Pickle; the new commit in Pickle refers to the new commit in Sidwell.
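As a rough sketch of that step, the script below plays it out in a sandbox. Pickle and Sam are ordinary throwaway repositories here, and the submodule path shared-module is borrowed from the question; the commit messages v2 and v3 are only labels. It is meant to be run as a script, because it redirects HOME to keep your real Git configuration out of it.

```shell
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"; unset XDG_CONFIG_HOME    # keep the demo away from your real Git config
git config --global user.name demo
git config --global user.email demo@example.invalid
git config --global protocol.file.allow always   # allow local-path submodule clones
git config --global init.defaultBranch master

# Sam (stand-in upstream for the submodule) and Pickle (development superproject).
git init -q "$sandbox/sam"
git -C "$sandbox/sam" commit -q --allow-empty -m "v2"
git init -q "$sandbox/pickle"
cd "$sandbox/pickle"
git submodule --quiet add "$sandbox/sam" shared-module
git commit -q -m "use shared-module at v2"

# New work in Sidwell: the submodule's HEAD moves, Pickle's record does not.
git -C shared-module commit -q --allow-empty -m "v3"
before=$(git rev-parse HEAD:shared-module)
git status --short                               # lists shared-module as modified

# Record Sidwell's current commit in Pickle.
git add shared-module
git commit -q -m "use shared-module at v3"
after=$(git rev-parse HEAD:shared-module)
current=$(git -C shared-module rev-parse HEAD)
echo "before=$before"
echo "after=$after"
echo "sidwell=$current"
```

This is also what clears the "modified: shared-module (new commits)" entry from the question: once the superproject commit records the submodule's current hash, the two agree again.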

We're ready to push some commits, but now there's an ordering issue.

¹Remember, each Git repository always has some commit checked out, perhaps with some uncommitted updates sitting around as well. At this point, you think everything's ready, so it's probably committed in Sidwell. It needs to be committed in Sidwell, otherwise Pickle can't refer to the commit! In the Sidwell repository, running:

git rev-parse HEAD

will produce the hash ID of the desired submodule commit. Whether there is a branch name for this is irrelevant at this particular point.

Pushing commits to Sam and Paul

Before anyone else—such as Seldon, or a CI system—can use the new commit you made on Pickle (in some branch), you have to send the Pickle commit to Paul. Paul is, after all, where every other Git repository—including Peter—goes to get new commits.

But if you just send the new Pickle commit to Paul, that commit, once received at Peter, will call for a commit in Seldon that Seldon doesn't have and can't get from Sam. You made that commit in Sidwell (your workbench submodule), and so far it exists only there. So you must git push the Sidwell commit first.

To git push this commit, you will need to enter the Sidwell repository and run a git push command that sends this commit to Sam. To do that, you (a human being, not a computer program) don't want to mess around with raw hash IDs. You want to use a name. In fact, you have to use a name in Sam, because you have to tell Sam to set some name to remember a hash ID. That's what git push is about: it sends commits you have that "they" don't, whoever "they" are, and then asks them to set some name to remember the hash ID, so that humans don't have to remember raw hash IDs. This is why you put Sidwell on a branch earlier: you wanted to be ready for this step! So, since you're on a branch (say, feature/tall or whatever), you can, while working in Sidwell, just run:

git push origin feature/tall

and Sidwell will send your new commit to Sam and ask Sam to set his feature/tall name to hold the hash ID of the new commit. (Whew!)

Now that Sam has the commit, it is safe to git push from Pickle to Paul. Enter your Pickle repository and run:

git push origin develop

or whatever branch name you are using in Pickle here, to (a) send the new commit(s) to Paul, then (b) ask Paul to update Paul's develop.
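Here is the whole ordering played out in a sandbox, as a sketch rather than a recipe. sam.git and paul.git are local bare repositories standing in for the hosted ones, and seed is a helper repository that exists only to give Sam a first commit; all of these paths are assumptions of the demo. The second push also uses git push's --recurse-submodules=check option, which asks Git to refuse the superproject push if the submodule commit it refers to has not reached any of the submodule's remotes.

```shell
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"; unset XDG_CONFIG_HOME    # keep the demo away from your real Git config
git config --global user.name demo
git config --global user.email demo@example.invalid
git config --global protocol.file.allow always   # allow local-path submodule clones
git config --global init.defaultBranch master

# Sam and Paul: bare central repositories. Sam starts out with one commit.
git init -q --bare "$sandbox/sam.git"
git init -q --bare "$sandbox/paul.git"
git init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "v2"
git -C "$sandbox/seed" push -q "$sandbox/sam.git" master

# Pickle: development superproject on develop, with Sidwell at shared-module.
git init -q "$sandbox/pickle"
cd "$sandbox/pickle"
git checkout -q -b develop
git remote add origin "$sandbox/paul.git"
git submodule --quiet add "$sandbox/sam.git" shared-module
git commit -q -m "use shared-module at v2"
git push -q origin develop

# New work in Sidwell on a branch, then recorded in Pickle.
git -C shared-module checkout -q -b feature/tall
git -C shared-module commit -q --allow-empty -m "v3"
git add shared-module
git commit -q -m "use shared-module at v3"

# 1. Submodule first: Sidwell -> Sam.
git -C shared-module push -q origin feature/tall
# 2. Superproject second: Pickle -> Paul, with the safety check enabled.
git push -q --recurse-submodules=check origin develop

wanted=$(git -C "$sandbox/paul.git" rev-parse develop:shared-module)  # hash Paul's develop calls for
published=$(git -C "$sandbox/sam.git" rev-parse feature/tall)         # hash Sam can hand out
echo "wanted=$wanted"
echo "published=$published"
```

If the two pushes were swapped, the check option would stop the superproject push instead of letting Paul advertise a commit that nobody can fetch from Sam.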

You are now ready to ask other Gits to update

If you now want to run this new commit on your operations system, you just tell Peter (that's the superproject on your operations system) to fetch the new commit from Paul as usual. Peter, or the machine running him anyway, gets the commit from Paul, checks it out, sees that the submodule hash ID in the new commit calls for an update to Seldon, and issues the git submodule update command (in Peter) to achieve that. This causes Seldon to get the new commit from Sam and check it out (as a detached HEAD as usual).
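A sketch of the operations side, again in a sandbox with local bare repositories standing in for Sam and Paul (the paths and the seed helper repository are assumptions of the demo, and Sidwell works directly on master here to keep it short). Peter is cloned while everything is still at v2, development then publishes v3 in the right order, and Peter catches up with a pull followed by git submodule update.

```shell
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"; unset XDG_CONFIG_HOME    # keep the demo away from your real Git config
git config --global user.name demo
git config --global user.email demo@example.invalid
git config --global protocol.file.allow always   # allow local-path submodule clones
git config --global init.defaultBranch master

# Sam and Paul (central), seeded via Pickle (development).
git init -q --bare "$sandbox/sam.git"
git init -q --bare "$sandbox/paul.git"
git init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "v2"
git -C "$sandbox/seed" push -q "$sandbox/sam.git" master
git init -q "$sandbox/pickle"
cd "$sandbox/pickle"
git checkout -q -b develop
git submodule --quiet add "$sandbox/sam.git" shared-module
git commit -q -m "use shared-module at v2"
git push -q "$sandbox/paul.git" develop

# Peter: the operations clone, made while everything was still at v2.
git clone -q --recurse-submodules --branch develop "$sandbox/paul.git" "$sandbox/peter"

# Development publishes v3: submodule first, then superproject.
git -C shared-module commit -q --allow-empty -m "v3"
git -C shared-module push -q origin master
git add shared-module
git commit -q -m "use shared-module at v3"
git push -q "$sandbox/paul.git" develop

# Operations: get the new superproject commit, then bring Seldon in line with it.
cd "$sandbox/peter"
git pull -q --ff-only
git submodule --quiet update --init
wanted=$(git rev-parse HEAD:shared-module)
seldon=$(git -C shared-module rev-parse HEAD)
state=$(git -C shared-module rev-parse --abbrev-ref HEAD)   # the literal word HEAD means detached
echo "wanted=$wanted seldon=$seldon state=$state"
```

Note that Seldon ends up detached at exactly the hash Peter's commit records, which is the normal, expected state.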

This is messy but straightforward

You now have all the pieces to be able to think about what's going on. You make a new submodule commit, and then you inform some superproject about the commit hash ID and make a new superproject commit. This can take place on a development machine, with or without local testing. But once it's ready to be distributed, then the new submodule commit must become available to everyone—which happens through a branch name in some other repository, so now you send the submodule's new commit upstream. This upstream submission takes some time: maybe just a few milliseconds, maybe days if you have to get it through someone's review process.

Once the submodule commit is available to anyone who will need it, now you can send the new superproject commit upstream too. Now anyone who gets the new superproject commit can use git submodule update to update their own submodule clone to get the new submodule commit, and hence have their superproject and submodule in sync.

The only use for a branch name in the submodule was that brief period to get the new commit sent upstream to the submodule's origin. Even there, the only real need for a branch name is to make the git push step happen: you can supply it during the git push itself. It's just too error prone to do it that way, if you're a human; but if you're automating all of this for a computer to do, you could skip the "put submodule on branch" step entirely.
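For the automated case, a sketch of pushing straight from a detached HEAD by naming the destination branch in the refspec. To keep it short, Sidwell is a plain clone of a local bare sam.git here rather than a real submodule, and the seed repository and branch name feature/tall are just demo choices; the push works the same way inside a submodule.

```shell
set -eu
sandbox=$(mktemp -d)
export HOME="$sandbox"; unset XDG_CONFIG_HOME    # keep the demo away from your real Git config
git config --global user.name demo
git config --global user.email demo@example.invalid
git config --global init.defaultBranch master

# Sam: bare central repository with one commit on master.
git init -q --bare "$sandbox/sam.git"
git init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "v2"
git -C "$sandbox/seed" push -q "$sandbox/sam.git" master

# Sidwell, put into the detached state that git submodule update leaves behind.
git clone -q "$sandbox/sam.git" "$sandbox/sidwell"
cd "$sandbox/sidwell"
git checkout -q --detach
git commit -q --allow-empty -m "v3"              # a commit with no local branch name at all

# Supply the name only at push time: send HEAD, ask Sam to call it feature/tall.
git push -q origin HEAD:refs/heads/feature/tall

local_head=$(git rev-parse HEAD)
remote_branch=$(git -C "$sandbox/sam.git" rev-parse feature/tall)
echo "local_head=$local_head"
echo "remote_branch=$remote_branch"
```

The full refs/heads/ prefix matters when the branch does not exist on the remote yet, since Git cannot otherwise tell whether you mean a branch or a tag.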

Because of the way we tend to build systems—with centralized repositories—we end up with six repositories involved at the minimum: the super-and-sub central repo, the super-and-sub clones on the development system, and the super-and-sub clones on the deployment or CI or whatever system. The nature of submodules requires that these be updated in pairs, but git push is inherently one-repository-at-a-time. That's why it is messy.
