After a lot of pain, I learned that pachctl delete all will delete secrets, too!

If you’re reusing a Pachyderm deployment (aka cluster) for, say, automatically looping through conditions to test various inputs against pipelines, you’ll need these secrets to stick around.

pachctl deploy local
pachctl create secret -f whatever.json
# the secret now exists

If you create and trigger a pipeline that needs access to a private Docker registry, its spec needs the image_pull_secrets option. When Pachyderm creates a new Kubernetes pod to handle the job, it relays this to k8s via imagePullSecrets.
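For reference, here is a rough sketch of where image_pull_secrets sits in a pipeline spec. The helper name write_pipeline_spec, the "data" input repo, the cmd, and the secret and image arguments are all placeholders I made up for illustration; only the pipeline name comes from the output later in this post.

```shell
# Sketch only: print a pipeline spec that references an image pull secret.
# $1 = name of the secret, $2 = private image to run (both hypothetical).
write_pipeline_spec() {
  cat <<EOF
{
  "pipeline": { "name": "sweet-pipeline" },
  "input": { "pfs": { "repo": "data", "glob": "/*" } },
  "transform": {
    "image": "$2",
    "cmd": ["/bin/sh", "-c", "ls /pfs/data"],
    "image_pull_secrets": ["$1"]
  }
}
EOF
}

# Example use (not run here):
#   write_pipeline_spec my-registry-secret registry.example.com/team/image:tag > pipeline.json
#   pachctl create pipeline -f pipeline.json
```

The name in image_pull_secrets has to match the name of the secret you created, otherwise the pod has nothing to authenticate with.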

Right now, everything should work. The container will spring to life, fetch the image using your secrets to auth, and do its thing.

pachctl delete all
# secrets are gone, but the cluster remains

If you try to recreate the same thing you just did, your pipeline will fail, because the secret it depends on is gone. You will see something like

container "user" in pod "sweet-pipeline-v1-lpx8k" is waiting to start: trying and failing to pull image

Inspecting the job, you’ll see something like

Reason: rpc error: code = Unknown desc = failed to pull and unpack image
"": failed to resolve
reference "": failed to
authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized

If you check Kubernetes, you’ll see the pipeline pod stuck in ImagePullBackOff:

kubectl get pods
NAME                                      READY   STATUS             RESTARTS   AGE
dash-866fd997-cpwh4                       2/2     Running            0          51m
etcd-58c9bf64b8-ld5l9                     1/1     Running            0          51m
pachd-7fb999d99c-g8b4d                    1/1     Running            1          51m
sweet-pipeline-v1-lpx8k                   1/2     ImagePullBackOff   0          30m

If you’re nasty, you may look at the events for the pod.

kubectl describe pod sweet-pipeline-v1-lpx8k

  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Pulling    31m (x3 over 31m)     kubelet            Pulling image ""
  Warning  Failed     31m (x3 over 31m)     kubelet            Failed to pull image "": rpc error: code...
  Warning  Failed     31m (x3 over 31m)     kubelet            Error: ErrImagePull
  Warning  Failed     30m (x5 over 31m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    107s (x133 over 31m)  kubelet            Back-off pulling image ""

The kubelet couldn’t pull the image: ErrImagePull. After repeated failures it started waiting longer and longer between retries: ImagePullBackOff.
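If you want to confirm that a missing secret is the culprit, one possible starting point is to ask the pod which pull secrets it expects and then check whether each one still exists. This is only a sketch: check_pull_secrets is a name I made up, and it assumes the pod and the secrets live in your current kubectl namespace.

```shell
# Sketch only: report whether each imagePullSecret a pod references exists.
# $1 = pod name, e.g. sweet-pipeline-v1-lpx8k
check_pull_secrets() {
  pod="$1"
  # Names of the secrets the pod spec asks for (space separated).
  wanted="$(kubectl get pod "$pod" -o jsonpath='{.spec.imagePullSecrets[*].name}')"
  if [ -z "$wanted" ]; then
    echo "no imagePullSecrets on $pod"
    return 0
  fi
  for name in $wanted; do
    # kubectl exits non-zero when the secret is not found.
    if kubectl get secret "$name" >/dev/null 2>&1; then
      echo "$name: present"
    else
      echo "$name: MISSING"
    fi
  done
}

# Example use (not run here):
#   check_pull_secrets sweet-pipeline-v1-lpx8k
```

A MISSING line here lines up with the 401 Unauthorized in the job output: the pod is asking for credentials that are no longer in the cluster.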

Now, how do you actually check what secrets you have in place? That’s a big pain in the ass and Google will help you. This blog post is just a friendly reminder to trust the error messages.

  • It’s an access issue!
  • The secrets aren’t working.
  • Yes, you did copy the secrets correctly.
  • But pachctl delete all trashed them!
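If you are looping over test conditions on the same cluster, one way around this is to put the secret back every time you wipe things. A minimal sketch, assuming pachctl delete all asks for confirmation on your version (hence the piped "y") and that whatever.json is the same file used at the top of this post; reset_cluster is a hypothetical helper name.

```shell
# Sketch only: wipe Pachyderm state, then immediately restore the secret.
# $1 = secret file (defaults to whatever.json from earlier in this post)
reset_cluster() {
  secret_file="${1:-whatever.json}"
  # Answer the confirmation prompt non-interactively; bail out if the wipe fails.
  echo y | pachctl delete all || return 1
  # Recreate the secret so the next pipeline can pull its image.
  pachctl create secret -f "$secret_file"
}

# Example use (not run here):
#   reset_cluster whatever.json
```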