Skip to content

Conversation

@kalexand-rh
Copy link
Contributor

@kalexand-rh kalexand-rh added this to the Future Release milestone Sep 26, 2019
@kalexand-rh kalexand-rh self-assigned this Sep 26, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 26, 2019
@openshift-docs-preview-bot

The preview will be available shortly at:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should already have an ImageContentSourcePolicy. If you mirror from the same repos to the same mirror repo, you won't need to update it. Basically, folks should run oc adm release mirror ... and then merge the ImageContentSourcePolicy it suggests with the values that are already in their cluster, and only creating a fresh ImageContentSourcePolicy if none already exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking, when is the ImageContentSourcePolicy created? I don't see an explicit step to do it. Is the policy created when you drop the imageContentSources data into your install-config.yaml during installation?

If you can share the command to edit the existing policy, I'm happy to document that instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when is the ImageContentSourcePolicy created?

The installer creates it if you set imageContentSources. Otherwise, you'd have to create it from scratch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you trust your mirror, this is fine. If you aren't sure you can trust your mirror, you will want to use a by-digest pullspec here with the digest whose signature you verified earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster-version operator aborts that particular upgrade round, and starts over, attempting to apply it again. Maybe just say:

If an upgrade fails, ClusterVersion will include a condition explaining why.

@vikram-redhat
Copy link
Contributor

@kalexand-rh ready for QE?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want release_digest instead of release_version? Per my past testing, release_digest does not work. Refer to https://jira.coreos.com/browse/CORS-1105?focusedCommentId=121819&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-121819. Since this discussion, QE always use release_version for testing, but not release_digest!

Copy link

@jianlinliu jianlinliu Oct 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If just review, I think oc get ImageContentSourcePolicy -o yaml will be more safe. In QE's testing, we suggest to create new ImageContentSourcePolicy, but not edit old ImageContentSourcePolicy, That would be more easier for track and management.
There is some discussion in https://jira.coreos.com/browse/CORS-1105?focusedCommentId=115606&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-115606.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are managing ImageContentSourcePolicy by creating a new object, there we should use oc delete ImageContentSourcePolicy xxxx to remove old ones.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to replace this command with the following command:
$ oc delete ImageContentSourcePolicy

@kalexand-rh
Copy link
Contributor Author

@jianlinliu, I think I've captured all of your updates. I'd rather publish what you've been testing. Will you PTAL?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this command would get a correct DIGEST.
`
$ oc adm release info quay.io/openshift-release-dev/ocp-release:4.2.0-rc.2 | sed -n 's/Pull From: //p'

quay.io/openshift-release-dev/ocp-release@sha256:9a1a38ce4d5fa79c785801eca3679a759bede9f40081b9e1b9f0cca2baa673a
`

The DIGEST should be 'sha256:9a1a38ce4d5fa79c785801eca3679a759bede9f40081b9e1b9f0cca2baa673a2', right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Could use s/Pull From: .*@//p to jump straight to the digest

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run this command, I get this:
$ gpg --verify signature-1 gpg: Signature made Tue 08 Oct 2019 10:12:39 AM EDT using RSA key ID F21541EB gpg: Can't check signature: No public key
Seem like here miss some steps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, prob need to import at least beta2 from the release key page.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://access.redhat.com/security/team/key links to https://access.redhat.com/articles/3530471 as "a manual process has been established". Maybe just lean on that for downloading a key and verifying with that? For releases, the signatures are so far all by the beta2 key, which is https://www.redhat.com/security/data/f21541eb.txt .

Copy link

@jianlinliu jianlinliu Oct 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to follow the procedure mentioned in the doc, but still failed, any idea?
`
$ wget https://www.redhat.com/security/data/f21541eb.txt

$ gpg --no-default-keyring --keyring ./temp.keyring --import ./f21541eb.txt
gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key F21541EB: public key "Red Hat, Inc. (beta key 2) [email protected]" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)

$ gpg --verify signature-1
gpg: Signature made Tue 08 Oct 2019 10:12:39 AM EDT using RSA key ID F21541EB
gpg: Can't check signature: No public key
`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wking, will you PTAL?

cc: @abhinavdahiya since Trevor's not working on install work this sprint

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the issue with @jianlinliu's earlier attempt was 4.2.2's signature-1 coming from release 2 instead of beta 2:

$ DIGEST="$(oc adm release info quay.io/openshift-release-dev/ocp-release:4.2.2 | sed -n 's/Pull From: .*@//p')"
$ echo "${DIGEST}"
sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0
$ URI="https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST/:/=}/signature-1"
$ echo "${URI}"
https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/sha256=dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0/signature-1
$ wget https://www.redhat.com/security/data/fd431d51.txt
$ gpg --no-default-keyring --keyring ./temp.keyring --import fd431d51.txt
$ gpg --no-default-keyring --keyring ./temp.keyring --verify signature-1
gpg: Signature made Fri 25 Oct 2019 08:47:11 AM PDT using RSA key ID FD431D51
gpg: Good signature from "Red Hat, Inc. (release key 2) <[email protected]>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 567E 347A D004 4ADE 55BA  8A5F 199E 2F91 FD43 1D51

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we're going to document which keys to import, probably best to say "both f21541eb and fd431d51" to cover our bases.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kalexand-rh Great. @wking's steps works well now. pls update those steps in our doc.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the following cluster upgrade command, we decide to use release_version, then here we should NOT mirror release payload image with DIGEST

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we need a tag here to keep the mirrored release from being garbage collected.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to use release_version, here should revert 'DIGEST' back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want @${DIGEST} here to guard against mirror-registry mutation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this step, upgrade is not started yet, before your cluster is totally upgraded, I do not suggest to remove the old ones. It maybe risky.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove operation should happened in the end - 'If you modified the contents of the repositoryDigestMirrors section' line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Union here. Prune later, if you want, certainly after upgrade completes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove operation should happened in the end - 'If you modified the contents of the repositoryDigestMirrors section' line.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I really like this section, check if there is difference, if yes, create a new ImageContentSourcePolicy, if not, no need create new one.

My last comment was suggesting some minor change based on this section, no need comment them out, maybe need withdraw your last change.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/add them to that section/create a new ImageContentSourcePolicy resource/, then add following operations:
$ cat test.yaml
$ oc create -f test.yaml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to replace this command with the following command:
$ oc delete ImageContentSourcePolicy

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If new ImageContentSourcePolicy is created, we need do some more check before start the cluster update.
The new ImageContentSourcePolicy settings will be deployed to each node and shortly start using the mirrored repository for requests to the source repository.

To check that the mirrored configuration worked, go to one of your nodes. For example:
`
$ oc get node

NAME STATUS ROLES AGE VERSION

ip-10-0-137-44.ec2.internal Ready worker 7m v1.14.6+90fadebfa

ip-10-0-138-148.ec2.internal Ready master 11m v1.14.6+90fadebfa

ip-10-0-139-122.ec2.internal Ready master 11m v1.14.6+90fadebfa

ip-10-0-147-35.ec2.internal Ready,SchedulingDisabled worker 7m v1.14.6+90fadebfa

ip-10-0-153-12.ec2.internal Ready worker 7m v1.14.6+90fadebfa

ip-10-0-154-10.ec2.internal Ready master 11m v1.14.6+90fadebfa

`

You can see that scheduling on each worker node is disabled as the change is being applied.

If no ImageContentSourcePolicy resource is created, we could skip this check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If new ImageContentSourcePolicy is created, we need do some more check before start the cluster update...

Ugh. What if you miss a SchedulingDisabled? This really needs to be represented in status on some object so we can use a level-check instead of SchedulingDisabled edge checks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I am not sure what unexpected behaviour will happen when upgrade is started but new mirror registry configuration is not laid down onto node yet. Maybe upgrade will reconcile until new mirror registry configuration is totally laid down. I think waiting for new ImageContentSourcePolicy setting onto nodes before upgrade would be safer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that upgrading after mirror propagation is safer. I just wish we had something better than hoping to catch all the SchedulingDisabled as they flit past.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. If use other way instead of SchedulingDisabled checks, I would suggest to use oc describe node | grep machineconfig to ensure each node is in 'Done' state, that means new ImageContentSourcePolicy mirror setting is applied to node via MCO, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...to ensure each node is in 'Done' state...

But presumably it was in Done before you added the new ImageContentSourcePolicy. I don't know what sort of delay there is before you start seeing node-side effects. And there's no status on ImageContentSourcePolicy... I'll see if I can get something more clear, but checking for Done status on the nodes after a brief wait seems like the best we have at the moment.

@kalexand-rh
Copy link
Contributor Author

@wking, @jianlinliu, I think I've captured all of your changes. Will you please let me know what else needs to change?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be some optional operation. Not a must.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part is already covered by "oc delete ImageContentSourcePolicy" command in the above section, should be removed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a product release, I think here no need --force option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that for disconnected installations you still need --force at least with 4.2.0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding some words like - "Wait for some mins, then confirm that the new ImageContentSoucePolicy is in effect"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the following lines:

<5> For <path_to_pull_secret>, specify the absolute path to and file name of the pull secret for your mirror registry that you created.

The file not only need the pull secret for your mirror registry that you created, but also the pull secret from the OpenShift Infrastructure Providers page. So the description here is not accurate. The description in 'Prerequisites' of disconnected install doc maybe more clear:

You downloaded the pull secret from the OpenShift Infrastructure Providers page and modified it to include authentication to your mirror repository.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the mirror steps to use both the local registry and the registry.redhat.io pull secrets.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For disconnected clusters, we'll need different authorities in the pull secret for different steps (some previous discussion here). For mirroring from Quay to the mirror registry (which may be what this step is?), you'll need quay.io and your mirror token. For launching the cluster from your mirror registry, you'll only need the mirror token. For pushing telemetry from the cluster (only when you have network access to us, possibly via a proxy), you'll need ... cloud.openshift.com? Etc. But it may be easier to just have a secret with the mirror token and the cloud.openshift.com-sourced tokens, and use that everywhere without trying to break it down to minimal sets for each action.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.openshift.com/container-platform/4.2/installing/installing_restricted_networks/installing-restricted-networks-preparations.html#installation-local-registry-pull-secret_installing-restricted-networks-preparations looks good to me, Should NOT be replaced. Just like what wking explained, At this step, user is going to mirror release image, so need auth for both Quay and mirror registry token.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jianlinliu, doesn't that just give you the mirror registry token? I'm proposing changing to https://osdocs628--ocpdocs.netlify.com/openshift-enterprise/latest/installing/installing_restricted_networks/installing-restricted-networks-preparations.html#installation-adding-registry-pull-secret_installing-restricted-networks-preparations, which has the same step to generate the mirror registry token and then add it to the pull secret that you get from the OpenShift Infrastructure Providers page.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I missed your new change. Thanks for pointing the url to me. Most looks good to me except registry.redhat.io pull secret name looks like a bit weird.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we name the pull secret registry.redhat.io? The pull secret including several auths for different registries, registry.redhat.io is only one of them. Personally I think this is easy to confuse user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To differentiate it from the other pull secret. Will you please take a look at the new commit and let me know if it's ok now?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@kalexand-rh kalexand-rh added the peer-review-needed Signifies that the peer review team needs to review this PR label Nov 6, 2019
@kalexand-rh
Copy link
Contributor Author

@openshift/team-documentation, will you PTAL?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to give folks instructions on constructing this name? It should be signature-1, falling back to signature-2, etc. until you either get an acceptable signature ("good sig by Red Hat's f21541eb; the release is fine") or a 404 and give up ("I guess this release isn't signed by anyone I trust").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something. Isn't this just the name of the signature that you're storing? What would you be falling back to?

Copy link
Member

@wking wking Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs for signature-lookup here and here. Yes, <signature_name> is going to be the local filename, but you can't pick any filename, it needs to follow signature- to match a signature that exists on our signature mirrors.

@ahardin-rh ahardin-rh added peer-review-done Signifies that the peer review team has reviewed this PR and removed peer-review-needed Signifies that the peer review team needs to review this PR labels Nov 6, 2019
@ahardin-rh
Copy link
Contributor

Only 2 small suggestions from me. Looks good!

@kalexand-rh
Copy link
Contributor Author

@wking, I've attempted to address your most recent feedback. Thank you! Will you please take another look?

$ URI="https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${DIGEST/:/=}/signature-<n>" <1>
----
<1> Specify the name of the signature to store.
<1> Specify the name of the signature to store. The signature name must be in the format of `signature-<n>`, where `<n>` is a number. If you have multiple signatures, the signature that includes the lowest number in its name is applied first, and other signatures are applied in increasing numerical order if one fails.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe:

<1> Specify the name of the signature to store. The signature name must be in the format of signature-<n>, where <n> is a number beginning with 1. If a signature is invalid, increment <n> until you find a valid signature or the requested signature-<n> does not exist.
Or some such. That adds "start at 1" and drop "is applied first", because this is more about "has anyone I trust signed this image?" and not about "who did something first?".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... If a signature is invalid, increment <n> until you find a valid signature or the requested signature-<n> does not exist.

Where do these other certificates come from? Is this a hidden step to say "If you get to $ gpg --no-default-keyring --keyring ./temp.keyring --verify <signature_name> and that doesn't work, then name the next one signature_n+1, and it'll just run through all the signature_n files in that directory until it finds a functioning one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do these other certificates come from?

We don't have any at the moment, but may grow some during a future key-rotation or some such. In that case, we'd have signature-1 by the new key and signature-2 by the old key (order insignificant).

Is this a hidden step...

I was hoping that these docs made it explicit, but yeah "keep going until you find a satisfactory sig or run out of options" is the intended behavior. Hopefully we can teach oc to do this soon...

Copy link
Contributor Author

@kalexand-rh kalexand-rh Nov 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do these other certificates come from?

We don't have any at the moment, but may grow some during a future key-rotation or some such. In that case, we'd have signature-1 by the new key and signature-2 by the old key (order insignificant).

"We" meaning OCP devs or "we" meaning the person making these certs?

How and why would they have more than one, and why would they be checking multiples at this point? If I'm reading the process right, they generate a very specific certificate and then check it. Is it actually relevant that you can check multiples until you find a valid one?

Is this a hidden step...

I was hoping that these docs made it explicit, but yeah "keep going until you find a satisfactory sig or run out of options" is the intended behavior. Hopefully we can teach oc to do this soon...

That's what I'm trying to figure out. I'm not seeing a point in the current flow where a user would have a failed certificate or multiple certificates. There's not an explicit step to make another one. I'm trying to figure out if I need to another step around line 173 and what it would say or if this is a cool detail that's not helpful in this flow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... or "we" meaning the person making these certs?

This one.

Is it actually relevant that you can check multiples until you find a valid one?

Not at the moment, but I'd be nervous if our manual-verification docs only supported signature-1 when all the software implementations were set up to walk available sigs. But maybe that would be fine and we'll have oc tooling for this by the time it matters?

I'm trying to figure out if I need to another step around line 173...

Yeah, if you think the wording here isn't enough, you'd want a "if the sig doesn't check out, increment signature-<index> , return to defining URI, and try again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if you think the wording here isn't enough, you'd want a "if the sig doesn't check out, increment signature-<index> , return to defining URI, and try again.

Yup. I'm adding a commit with another step. (If I took your suggestion as-is, I would expect to see a bug about the note not being clear enough and having to add the step later.) Thank you for working through this with me, and please let me know if you see more changes after the update.

.. Verify the signature:
+
----
$ gpg --no-default-keyring --keyring ./temp.keyring --verify <signature_name> <1>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that this is checking for a valid sig, but not checking the signed content (so a single valid sig could be applied to multiple images). I'll get instructions for checking content when I get in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so what we want is --decrypt, not --verify.

$ gpg --no-default-keyring --keyring ./temp.keyring --decrypt signature-1 
{"critical": {"image": {"docker-manifest-digest": "sha256:a7e97365d16d8d920fedd3684b018b780337e069deb1dd8500e866c0d6110334"}, "type": "atomic container signature", "identity": {"docker-reference": "quay.io/openshift-release-dev/ocp-release:4.1.20"}}}gpg: Signature made Tue 15 Oct 2019 02:21:18 PM PDT using RSA key ID F21541EB
gpg: Good signature from "Red Hat, Inc. (beta key 2) <[email protected]>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: B08B 659E E86A F623 BC90  E8DB 938A 80CA F215 41EB

We want:

$ gpg --no-default-keyring --keyring ./temp.keyring --decrypt signature-1 | jq -r '.critical.image["docker-manifest-digest"]'
gpg: Signature made Tue 15 Oct 2019 02:21:18 PM PDT using RSA key ID F21541EB
gpg: Good signature from "Red Hat, Inc. (beta key 2) <[email protected]>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: B08B 659E E86A F623 BC90  E8DB 938A 80CA F215 41EB
sha256:a7e97365d16d8d920fedd3684b018b780337e069deb1dd8500e866c0d6110334

to match ${DIGEST}. Docs on the full structure here, altough I dunno how formal we want to get. I'll work on oc tooling for this next week so I'm ok cutting some corners in the docs here in the meantime, as long as we're confirming that the expected digest is in the signed content.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not even that.

This code, which resolves ${DIGEST} from the :${OCP_RELEASE} tag above, and then uses :${OCP_RELEASE} again in oc adm release mirror, is INSECURE and MUST NOT be used; the registry can maliciously modify the :${OCP_RELEASE} tag in the meantime. (Alternatively, if you trust the registry never to do that, and to always point at legitimate OCP images, you don’t need to bother with signatures, by definition.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs on the full structure here, altough I dunno how formal we want to get. I'll work on oc tooling for this next week

Please use c/image/copy.Image, to do copying with signature verification integrated, if at all possible. Feel free to ping me anytime if you need help with that.

I don’t want copy&pasted partial implementations of that format all over OCP if I can help it.

@LalatenduMohanty
Copy link
Member

@wking @kalexand-rh Can merge this asap. So that we can PRS for the enhancements we are trying to do in the workflow.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2020
@openshift-ci-robot
Copy link

@kalexand-rh: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@shawn-hurley
Copy link

@kalexand-rh @LalatenduMohanty @sdodson wondering if we have this doc for 4.4?

@LalatenduMohanty
Copy link
Member

@kalexand-rh @LalatenduMohanty @sdodson wondering if we have this doc for 4.4?

Yeah, I am trying to get it in for 4.4

@rheinzma
Copy link

rheinzma commented Mar 30, 2020

I tried the procedure in this pull request with 4.3.8 in an air gapped environment (no proxy). The node running the "oc adm mirror" command has access to both, quay.io as well as my local registry mirror from which the installation was performed.

Actually the procedure did not work, as after performing

oc adm upgrade --allow-explicit-upgrade --to-image ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}@${DIGEST}

The version operator reported

NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.8     True        True          41m     Unable to apply x.x.x.x:443/ocp4/openshift4@sha256:a414f6308db72f88e9d2e95018f0cc4db71c6b12b2ec0f44587488f0a16efc42: the image may not be safe to use

The log shows:

W0330 12:18:10.807420       1 updatepayload.go:100] An image was retrieved from "192.168.100.98:443/ocp4/openshift4@sha256:a414f6308db72f88e9d2e95018f0cc4db71c6b12b2ec0f44587488f0a16efc42" that failed verification: The update cannot be verified: context deadline exceeded
E0330 12:18:10.836413       1 sync_worker.go:329] unable to synchronize image (waiting 2m52.525702462s): Unable to download and prepare the update: timed out waiting for the condition

When I ran with --force it worked:

oc adm upgrade --allow-explicit-upgrade --force --to-image ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}@${DIGEST}

Some other notes regarding the PR:

Section "Update the contents of the OpenShift Container Platform image repository"

  • 2: comment 1 - "<release_version>" now needs a platform (e.g. OCP_RELEASE=4.3.8-x86_64) also
  • 3+4: We mirror the registry in Step (3) as well as (6) which is redundant and a little risky, as we should FIRST verify the signature and THEN mirror the image right ? So (3) and (4) has no value in this command chain as it is repeated in step (6).
  • 5b: "signature-n" needs explaination, as I had to "guess" what to do here.

Section "Update the restricted network cluster"

  • 3: This should reuse the same variables as above, to make it easier for the user
oc adm upgrade --allow-explicit-upgrade --to-image ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}@${DIGEST}

@LalatenduMohanty
Copy link
Member

FYI, we created a separate PR for disconnected upgrades and it is merged now . #21993

@kalexand-rh
Copy link
Contributor Author

FYI, we created a separate PR for disconnected upgrades and it is merged now . #21993

The method in this PR is obsolete, and I don't see any context that would still be helpful. Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. peer-review-done Signifies that the peer review team has reviewed this PR size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.