Pachyderm: Digging out Dynamic Job ID

How do you get the job ID for a Pachyderm job dynamically so you can debug it?

The docs suggest you can omit the job ID, but it’s a lie!

TL;DR

# bash: the outer ( ) builds an array, and ${job_id} expands to its first element
job_id=($(pachctl list job --no-pager | awk '{ print $1 }' | grep -E '[[:alnum:]]{10,}'))
pachctl inspect job "${job_id}"

Explanation

List the Jobs

pachctl list job will give you something like this.

ID                               PIPELINE            STARTED        DURATION  RESTART PROGRESS      DL UL STATE
67132a672c5d462888b254848224638a super_sexy_pipeline 22 seconds ago 5 seconds 0       0 + 0 + 1 / 1 0B 0B success

If you leave paging on, you may end up with a fun error from less.

less: unrecognized option: X

For details, check out this issue. To skip it, use the --no-pager option.

Grab the First Column

| awk '{ print $1 }'

This will pipe the job info into awk and grab the first column, which holds the job ID.
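If you want to see what awk does without a cluster handy, here is a minimal sketch that runs the same awk command over the sample listing from above. The sample and first_column variables are just illustrative names, not anything pachctl provides.

```shell
# Sample listing from above, stored in a variable so no Pachyderm cluster is needed.
sample='ID                               PIPELINE            STARTED        DURATION  RESTART PROGRESS      DL UL STATE
67132a672c5d462888b254848224638a super_sexy_pipeline 22 seconds ago 5 seconds 0       0 + 0 + 1 / 1 0B 0B success'

# awk splits each line on runs of whitespace; $1 is the first field.
first_column=$(printf '%s\n' "$sample" | awk '{ print $1 }')
printf '%s\n' "$first_column"
```

Note that the header word ID comes along for the ride, which is why the next step filters it out.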

Grep for a Job ID

| grep -E '[[:alnum:]]{10,}'

Pipe that into grep and look for any alphanumeric strings at least 10 characters long.
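A quick sketch of the filter on its own, assuming the first column looks like the sample listing above (the column and job_ids names are made up for this example). The header ID is only two characters long, so the 10-character minimum drops it.

```shell
# First column of the sample listing: the header plus one job ID.
column='ID
67132a672c5d462888b254848224638a'

# Keep only lines containing a run of 10 or more alphanumeric characters.
job_ids=$(printf '%s\n' "$column" | grep -E '[[:alnum:]]{10,}')
printf '%s\n' "$job_ids"
```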

Put it All Together

Glue it all, eval and assign, wine and dine, bingo bango. Did you really read this far? Dude, you already have the answer. Go home.
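Still here? Fine. In bash, the outer parentheses in the TL;DR turn job_id into an array, and ${job_id} expands to its first element, which is how one ID comes out of a whole list. If you would rather make that explicit, here is one possible sketch: first_job_id is a hypothetical helper (not part of pachctl) that reads a listing on stdin and prints only the first ID it finds. It runs against the sample listing here; the real pachctl calls are left as comments.

```shell
# Hypothetical helper: reads a job listing on stdin, prints the first job ID.
first_job_id() {
  awk '{ print $1 }' | grep -E '[[:alnum:]]{10,}' | head -n 1
}

# Sample listing from above, so this runs without a cluster.
sample='ID                               PIPELINE            STARTED        DURATION  RESTART PROGRESS      DL UL STATE
67132a672c5d462888b254848224638a super_sexy_pipeline 22 seconds ago 5 seconds 0       0 + 0 + 1 / 1 0B 0B success'

job_id=$(printf '%s\n' "$sample" | first_job_id)

# Guard against an empty listing before handing the ID to anything else.
if [ -z "$job_id" ]; then
  echo "no job ID found" >&2
else
  echo "$job_id"
fi

# Against a real cluster (not run here):
#   job_id=$(pachctl list job --no-pager | first_job_id)
#   pachctl inspect job "$job_id"
```

Using head -n 1 instead of the array trick also works in shells that do not treat ${job_id} as the first array element.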