Blog

$ less blog.txt
This page is still kind of buggy with the pagination and format. If you prefer, see an archive of all of my blog posts.

SSH Commit Signing Part 2: Automation and Multi-Machine Setup

2025-10-17

In my previous post, I covered the basics of signing Git commits with SSH keys instead of GPG. This post covers the automation and multi-machine setup I built to make SSH signing seamless across 12+ machines.

The Challenge

Managing SSH commit signing across multiple machines introduces several challenges:

  1. Multiple keys - Each machine has its own SSH key
  2. Multiple emails - Different projects use different commit emails
  3. Verification - Git needs to verify signatures from any of your keys
  4. GitHub - All keys need to be registered as signing keys
  5. Consistency - Configuration should be identical across machines

The Solution: Centralized Configuration

I created three interconnected repositories:

  • dotfiles (public) - Git configuration and aliases
  • bash-ftw (public) - Bash utilities and installation helpers
  • pubkeys (private) - SSH public keys and automation scripts

Key Components

1. Dynamic Key Selection

Instead of hardcoding a specific key per machine, use ssh-agent to automatically select the first available key:

# In ~/.gitconfig or ~/code/dotfiles/.gitconfig
[gpg]
    format = ssh

[gpg "ssh"]
    allowedSignersFile = ~/.ssh/allowed_signers
    defaultKeyCommand = ssh-add -L | head -n1

[commit]
    gpgsign = true

Benefits:

  • No per-machine configuration needed
  • Works with any key loaded in ssh-agent
  • Portable across all your machines

2. The allowed_signers File

Git’s allowed_signers file verifies commit signatures. The format is:

email key-type key-data comment

The key insight: Create a cross-product of all your emails × all your keys:

hello@jontsai.com ssh-ed25519 AAAAC3... laptop-key
hello@jontsai.com ssh-rsa AAAAB3... desktop-key
hello@jontsai.com ssh-ed25519 AAAAC3... server-key
user@example.com ssh-ed25519 AAAAC3... laptop-key
user@example.com ssh-rsa AAAAB3... desktop-key
user@example.com ssh-ed25519 AAAAC3... server-key

This allows Git to verify commits signed by any of your keys with any of your email addresses.

3. Automated Generation Script

Create scripts/generate_allowed_signers.sh:

#!/bin/bash
# Generate allowed_signers file for Git SSH commit signing

set -e

EMAILS_FILE="${EMAILS_FILE:-emails.txt}"
OUTPUT="${OUTPUT:-allowed_signers}"

# Read emails (filter out comments and empty lines)
emails=$(grep -v '^#' "$EMAILS_FILE" | grep -v '^[[:space:]]*$' || true)

# Clear output file
> "$OUTPUT"

# Enable nullglob so unmatched globs (e.g. an empty delegates/) expand to nothing
shopt -s nullglob

# Process all .pub files
for pubkey in *.pub delegates/*.pub; do
    if [ -f "$pubkey" ]; then
        key_content=$(cat "$pubkey")

        # For each key, add entry for each email
        echo "$emails" | while IFS= read -r email; do
            echo "$email $key_content" >> "$OUTPUT"
        done
    fi
done

shopt -u nullglob

echo "Generated $OUTPUT with $(wc -l < "$OUTPUT") keys"

Create an emails.txt file:

# Email addresses used for git commits
hello@jontsai.com
jontsai@users.noreply.github.com

4. Makefile for Easy Management

Create a Makefile to orchestrate everything:

.PHONY: help install github-signing-keys

## help - Display available targets
help:
	@cat Makefile | grep '^## ' --color=never | cut -c4- | \
	  sed -e "`printf 's/ - /\t- /;'`" | column -s "`printf '\t'`" -t

## authorized_keys - Generate authorized_keys file
authorized_keys: $(wildcard *.pub) $(wildcard delegates/*.pub)
	cat *.pub delegates/*.pub > authorized_keys
	chmod 600 authorized_keys

## allowed_signers - Generate allowed_signers file
allowed_signers: emails.txt scripts/generate_allowed_signers.sh $(wildcard *.pub) $(wildcard delegates/*.pub)
	scripts/generate_allowed_signers.sh
	chmod 600 allowed_signers

## install - Install authorized_keys and allowed_signers to ~/.ssh
install: authorized_keys allowed_signers
	cp -v authorized_keys ~/.ssh/authorized_keys
	cp -v allowed_signers ~/.ssh/allowed_signers
	chmod 600 ~/.ssh/authorized_keys ~/.ssh/allowed_signers

## github-signing-keys - Add all keys to GitHub as signing keys
github-signing-keys:
	scripts/add_github_signing_keys.sh

5. Automated GitHub Key Upload

Create scripts/add_github_signing_keys.sh:

#!/bin/bash
# Add all public keys to GitHub as signing keys using gh CLI

set -e

# Check if gh is installed
if ! command -v gh &> /dev/null; then
    echo "ERROR: gh CLI is not installed"
    echo "Install from: https://cli.github.com/"
    exit 1
fi

# Check authentication
if ! gh auth status &> /dev/null; then
    echo "ERROR: Not authenticated with GitHub"
    echo "Run: gh auth login"
    exit 1
fi

# Check for required permissions
echo "Checking for required permissions..."
if ! gh ssh-key list &> /dev/null; then
    echo "ERROR: Missing required permission scope"
    echo ""
    echo "To grant this permission, run:"
    echo "    gh auth refresh -h github.com -s admin:ssh_signing_key"
    exit 1
fi

echo "Adding all public keys to GitHub as signing keys..."

success_count=0
skip_count=0
error_count=0

for pubkey in *.pub delegates/*.pub; do
    if [ -f "$pubkey" ]; then
        title=$(basename "$pubkey" .pub)
        echo -n "Adding $title... "

        # Capture output inside the if-test so a failure doesn't trip `set -e`
        if output=$(gh ssh-key add --type signing "$pubkey" --title "$title" 2>&1); then
            echo "done"
            success_count=$((success_count + 1))
        elif echo "$output" | grep -q "already exists"; then
            echo "already exists (skipped)"
            skip_count=$((skip_count + 1))
        else
            echo "FAILED"
            echo "  Error: $output"
            error_count=$((error_count + 1))
        fi
    fi
done

echo ""
echo "Summary: $success_count added, $skip_count skipped, $error_count errors"

6. Git Aliases for Viewing Signatures

Add to your ~/.gitconfig:

[alias]
    # Compact log with signature status
    slog = log --pretty=format:\"%C(auto)%h %G? %C(blue)%an%C(reset) %s %C(dim)(%ar)%C(reset)\"

    # Full signature details
    logs = log --show-signature

Signature status codes:

  • G = Good signature
  • B = Bad signature
  • U = Good signature, unknown validity
  • N = No signature

7. Bash Installation Helper

Add to your ~/.bashrc or bash-ftw:

# GitHub CLI installation with OS detection
function install-gh {
    KERNEL=$(uname -s)

    if [[ $KERNEL == 'Darwin' ]]; then
        echo "Installing GitHub CLI via Homebrew..."
        brew install gh
    elif [[ $KERNEL == 'Linux' ]]; then
        echo "Installing GitHub CLI via apt..."
        curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | \
          sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg && \
        sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg && \
        echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | \
          sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null && \
        sudo apt update && sudo apt install gh -y
    else
        echo "Visit https://cli.github.com for installation instructions"
        return 1
    fi

    echo "GitHub CLI installed! Run 'gh auth login' to authenticate"
}

Complete Setup Workflow

Initial Setup (One Time)

  1. Clone your dotfiles:
    cd ~/code
    git clone https://github.com/yourusername/dotfiles
    
  2. Create your pubkeys repository structure:
    mkdir -p ~/code/pubkeys/{scripts,delegates,obsolete}
    cd ~/code/pubkeys
    
    # Copy all your .pub files here
    cp ~/.ssh/*.pub .
    
    # Create emails.txt
    cat > emails.txt <<EOF
    # Your git commit emails
    you@example.com
    you@users.noreply.github.com
    EOF
    
  3. Copy the scripts (from examples above) into scripts/

  4. Install GitHub CLI:
    install-gh  # or manually from https://cli.github.com
    gh auth login
    gh auth refresh -h github.com -s admin:ssh_signing_key
    
  5. Install configuration:
    cd ~/code/pubkeys
    make install
    
    cd ~/code/dotfiles
    cp .gitconfig ~/.gitconfig
    
  6. Upload keys to GitHub:
    cd ~/code/pubkeys
    make github-signing-keys
    

Per-Machine Setup

On each new machine:

# 1. Clone repos
cd ~/code
git clone https://github.com/yourusername/dotfiles
git clone your-pubkeys-repo  # if you made it a git repo

# 2. Install
cd ~/code/pubkeys && make install
cd ~/code/dotfiles && cp .gitconfig ~/.gitconfig

# 3. Configure ssh-agent (if needed)
ssh-add ~/.ssh/id_ed25519

# 4. Test it
cd some-repo
git commit -m "test signed commit"
git log --show-signature -1

Benefits

  1. Zero per-machine configuration - Same setup everywhere
  2. Automatic key selection - Works with any key in ssh-agent
  3. Multi-email support - All your commit emails are verified
  4. One-command GitHub sync - make github-signing-keys
  5. Easy verification - git slog shows signature status inline
  6. Makefile dependencies - Auto-regenerates when keys/emails change

Lessons Learned

1. Mac Compatibility

macOS still ships the ancient Bash 3.2, and the <<< here-string syntax caused problems in my scripts there. Use a pipe instead:

# Don't do this (fails on Mac)
while read line; do ...; done <<< "$var"

# Do this (works everywhere)
echo "$var" | while read line; do ...; done

2. Makefile Dependencies

Use $(wildcard *.pub) to track file dependencies:

allowed_signers: emails.txt scripts/generate.sh $(wildcard *.pub)

3. Error Handling in Scripts

Always check exit codes and provide remediation:

output=$(command 2>&1)
exit_code=$?

if [ $exit_code -ne 0 ]; then
    echo "ERROR: $output"
    echo "To fix: <remedy steps>"
    exit 1
fi

4. GitHub CLI Permissions

The admin:ssh_signing_key scope is required for managing signing keys:

gh auth refresh -h github.com -s admin:ssh_signing_key

5. Verification is Separate from Signing

  • user.signingkey or gpg.ssh.defaultKeyCommand - Which key to sign with
  • gpg.ssh.allowedSignersFile - Which keys are trusted for verification

Verification

Check if commits are signed:

# Quick check
git slog -10

# Full details
git log --show-signature -1

# Specific commit
git verify-commit abc123

Public Resources

Feel free to adapt these scripts and configurations for your own setup!

Conclusion

SSH-based commit signing is simpler than GPG, but managing it across multiple machines requires automation. With centralized configuration, automated scripts, and proper dependency tracking, you can maintain a seamless signing setup across all your machines.

The key principles:

  1. Automate everything - Scripts eliminate manual steps and errors
  2. Centralize configuration - Dotfiles repos ensure consistency
  3. Use cross-products - All emails × all keys for maximum flexibility
  4. Make it idempotent - Safe to run commands multiple times
  5. Provide clear errors - Always show how to fix issues

Now all my commits are automatically signed, verified, and visible on GitHub with that coveted “Verified” badge. 🎉

Signing Git commits using SSH instead of GPG

2024-11-21

TIL you can sign Git commits using SSH instead of GPG

This tip is 🏆, learned from my really smart colleague.

tl;dr;

To configure Git to use your key:

  1. Configure Git to use SSH for commit signing:
    git config --global gpg.format ssh
  2. Specify which public SSH key to use as the signing key and change the filename (~/.ssh/examplekey.pub) to the location of your key. The filename might differ, depending on how you generated your key:
    git config --global user.signingkey ~/.ssh/examplekey.pub

To sign a commit:

  1. Use the -S flag when signing your commits:
    git commit -S -m "My commit msg"
  2. Optional. If you don’t want to type the -S flag every time you commit, tell Git to sign your commits automatically:
    git config --global commit.gpgsign true

Source: https://docs.gitlab.com/ee/user/project/repository/signed_commits/ssh.html

Embrace the power of Regex

2023-07-13

Too often, while reviewing code, I’ll see examples like:

def extract_id_and_env(key: str) -> dict:
    """Extracts the object ID from `key`

    `key` is a string like 'namespace_prefix_12345'
    In some cases, `key` could also look like `namespace_prefix_12345_environment`

    Returns a dict with the object ID, an integer
    """
    parts = key.split('_')

    parsed = {
        'id': int(parts[2]),
        'environment': parts[3] if len(parts) == 4 else None
    }
    return parsed

When I see this, I ask, “Why?”

Instead, my preferred way of handling this is to use a regex with named capture groups:

import re

KEY_PATTERN = re.compile(
    r'(?P<namespace>[a-z]+)_(?P<prefix>[a-z]+)_(?P<object_id>\d+)(?:_(?P<environment>[a-z]+))?'
)

def extract_key_components(key: str):
    m = KEY_PATTERN.match(key)
    parts = ['namespace', 'prefix', 'object_id', 'environment', ]
    values = [m.group(part) for part in parts]
    return values
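
For example, using the pattern above (illustrative output):

key_components = extract_key_components('namespace_prefix_12345_environment')
# -> ['namespace', 'prefix', '12345', 'environment']

key_components = extract_key_components('namespace_prefix_12345')
# -> ['namespace', 'prefix', '12345', None]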

Another example (contrived, but modified from a real-world application) comes from a Django app which serves both students and educators, and displays one of two different landing pages depending on the intent:

def login_view(request):
    url = request.GET.get('next')
    last_word = url.split("/")[-1]
    is_student = True if last_word == 'scholarship' else False

    template = 'login/student.html' if is_student else 'login/educator.html'

    response = render(request, template)
    return response

The problem with this code is not immediately apparent. It works. However, this code lacks robustness.

An arguably better approach:

import re

STUDENT_LOGIN_INTENT_PATTERNS = [
    re.compile(r'^/path/to/(?P<some_id>\d+)/scholarship$'),
]

def is_login_intent_student(request):
    is_student = False
    next_url = request.GET.get('next') or ''
    for pattern in STUDENT_LOGIN_INTENT_PATTERNS:
        if pattern.match(next_url):
            is_student = True
            break
    return is_student
    

def login_view(request):
    is_student = is_login_intent_student(request)
    template = 'login/student.html' if is_student else 'login/educator.html'

    response = render(request, template)
    return response

In addition to the readability and maintainability of the regex approach, it is more robust overall, allowing the programmer to extract multiple components from the string all at once! This reduces the need to update the function later if other parts of the string are needed (and quite often they are!).

My preference for Regex over Split stems from:

  • Somewhat related to the principle of https://www.joelonsoftware.com/2005/05/11/making-wrong-code-look-wrong/
  • If code is wrong, it should fail catastrophically and loudly, not subtly or obscurely
  • It’s hard to write a regex that looks only maybe right: either a regex is right, or it is obviously wrong. (It could also be that I have lots of experience using regexes, and can write them without looking up references.)
  • OTOH, while split is conceptually easier to learn, for me it’s hard or nearly impossible to see at a glance whether split-based code is right or wrong. For example, if you look at a block of code using split and various indexes, how would you instantly detect a possible OB1 (aka off-by-one error; https://en.wikipedia.org/wiki/Off-by-one_error)? Not possible. OB1 bugs are prevalent in software because the learning curve, and therefore the barrier to entry, is low, so such bugs are easily introduced. (See the sketch after this list.)
  • Regexes, OTOH, have a slightly higher learning curve and a slightly higher barrier to entry, so those who use them tend not to make trivial mistakes
  • If the code never has to be updated again, then great! split is sufficient. But if the next engineer has to update it, they would not necessarily benefit from the existing code, and would have to re-evaluate all of it in their head to make sure the indexes are right.
  • Maintaining a list of patterns, or regexes, encourages a Solve for N mentality, whereas using split encourages a “solve it quick and dirty” mindset
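
To make the “fail loudly” point concrete, here is a minimal sketch (my own illustration, reusing the key pattern from above but anchored with ^ and $): an unexpected key shape makes the regex version blow up immediately, while the split version quietly returns a wrong value.

import re

KEY_PATTERN = re.compile(
    r'^(?P<namespace>[a-z]+)_(?P<prefix>[a-z]+)_(?P<object_id>\d+)(?:_(?P<environment>[a-z]+))?$'
)

key = 'namespace_prefix_12345_prod_useast'  # an unexpected, newer key format

# Split-based: silently "succeeds" with a wrong answer
parts = key.split('_')
environment = parts[3] if len(parts) == 4 else None
print(environment)  # None -- the 'prod' component is silently dropped

# Regex-based: fails loudly, forcing the pattern to be updated
m = KEY_PATTERN.match(key)
print(m)  # None
# m.group('environment') would raise AttributeError: 'NoneType' object has no attribute 'group'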

Use Fully Qualified datetime in Python

2022-11-02

Whenever using the datetime module in Python, a highly recommended practice is to just import datetime at the top of the file, and use the fully-qualified module name in the code, as much as possible:

  • datetime.datetime
  • datetime.timedelta
  • datetime.date

If one does from datetime import datetime, it’s hard to figure out at-a-glance what datetime is referring to in the middle of a several-hundred-lines-of-code file.
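
A quick illustration of the difference:

import datetime

# Fully qualified: at the call site it is obvious which "datetime" this is
now = datetime.datetime.now(datetime.timezone.utc)
deadline = now + datetime.timedelta(days=7)
today = datetime.date.today()

# Compare with:
#
#     from datetime import datetime
#     now = datetime.now()  # module or class? you have to scroll up to find out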

For similar reasons, another common best practice in Python when using the typing module (https://docs.python.org/3/library/typing.html) is to import it as import typing as T or import typing as t (e.g. https://github.com/pallets/flask/blob/cc66213e579d6b35d9951c21b685d0078f373c44/src/flask/app.py#L7; https://github.com/pallets/werkzeug/blob/3115aa6a6276939f5fd6efa46282e0256ff21f1a/src/werkzeug/wrappers/request.py#L4).
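
A minimal sketch of that convention (my own example, not from the linked code):

import typing as T

def first_or_none(items: T.Sequence[str]) -> T.Optional[str]:
    # T.Sequence / T.Optional read unambiguously as typing constructs
    return items[0] if items else None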

Get Is the Worst Function Prefix Ever

2022-07-14

tl;dr;

Unless the function you are writing is a getter (which complements a setter), please avoid naming methods get_.

get_ lacks descriptiveness, precision, and is boring.

Rationale

Software engineers are supposed to be creative, and get is possibly the least creative function name there is.

Too often, I see codebases cluttered with get_ methods throughout, when the implementations of those methods do things far more complex than simply reading or getting a value from an object.

When half of the methods in a file with hundreds of lines of code are named get_*, the code becomes difficult to read, scan, and reason about. (Future blog posts will address code that is easy vs difficult to reason about.)

Alternatives

Since much of the world’s software (historically) has been produced from within English-speaking countries and by English-speaking programmers and software engineering teams, please allow me to introduce to you the robustness of the English language.

While my hope and expectation is not for every software engineer to have a Shakespearean command of English vocabulary, I do think that it is quite feasible to learn a few prefixes to help codebases become more manageable and pleasing to read.

Without further ado, these are my suggestions:

  • build_
  • calculate_
  • extract_
  • fetch_
  • look_up_ / retrieve_
  • format_ / transform_

Below are examples and sample code, in Python (my language of choice).

build_

When to use it: Use this prefix for a method which takes in some data, and builds a more complex structure as a result.

Analogy: You have multiple loose LEGO bricks, and want to assemble those pieces to build a structure out of that.

Bad

def get_response(color, food, location):
    response = {
        'color': color,
        'food': food,
        'location': location,
    }
    return response

Good

class Location(dict):
    """Minimal example type: wraps a location name in a dict."""
    def __init__(self, name):
        super().__init__(name=name)


def build_response(color, food, location):
    response = {
        'color': color,
        'food': food,
        'location': Location(location),
    }
    return response

Usage

response = build_response('green', 'eggs and ham', 'in a car')

# Result:
# {
#     'color': 'green',
#     'food': 'eggs and ham',
#     'location': {
#         'name': 'in a car'
#     },
# }

calculate_

When to use it: When you have some data, and some formula is applied to calculate a result.

Analogy: If it’s not doable via “mental math,” and needs to be calculated.

Setup

items = [
    {
        'color': 'green',
        'food': 'eggs and ham',
        'location': {
            'name': 'in a car'
        },
        'price_cents': 1525,
    },
    {
        'color': 'red',
        'food': 'hot chili peppers',
        'location': {
            'name': 'with a mouse'
        },
        'price_cents': 299,
    },
    {
        'color': 'orange',
        'food': 'carrots',
        'location': {
            'name': 'here or there'
        },
        'price_cents': 399,
    },
]

Bad:

from decimal import Decimal

def get_total(items, unit='cents'):
    total_cents = sum([item['price_cents'] for item in items])

    if unit == 'cents':
        total = total_cents
    elif unit == 'dollars':
        total = float((Decimal(total_cents) / Decimal(100)).quantize(Decimal('1.00')))
    
    return total

Good:

from decimal import Decimal

def calculate_total(items, unit='cents'):
    total_cents = sum([item['price_cents'] for item in items])

    if unit == 'cents':
        total = total_cents
    elif unit == 'dollars':
        total = float((Decimal(total_cents) / Decimal(100)).quantize(Decimal('1.00')))
    
    return total

A method named calculate_ will mentally prepare the engineer to be extra careful and meticulous when maintaining this code, because the goal is to be accurate and precise.

extract_

When to use it: When you need one, or a few pieces of information, from a more complex structure.

Analogy: You have a plain rock (diamond ore, gold ore) which is relatively useless on the surface, and want to extract the valuable contents (diamonds, gold).

Setup

response = {
    'color': 'green',
    'food': 'eggs and ham',
    'location': {
        'name': 'in a car'
    }
}

Bad

def get_color(response):
    return response['color']


def get_location_name(response):
    return response['location']['name']

# Usage

color = get_color(response)
location_name = get_location_name(response)

Better

response = {
    'color': 'green',
    'food': 'eggs and ham',
    'location': {
        'name': 'in a car'
    }
}

def extract_color(response):
    return response['color']


def extract_location_name(response):
    return response['location']['name']

# Usage

color = extract_color(response)
location_name = extract_location_name(response)

Great

Consider using object-oriented programming:

class Response:
    def __init__(self, raw_response):
        self.raw_response = raw_response

    @property
    def color(self):
        return self.raw_response['color']

    @property
    def location_name(self):
        return self.raw_response['location']['name']

# Usage

r = Response(response)

color = r.color
location_name = r.location_name

fetch_

When to use it: Use this prefix when the method makes an HTTP call or other API call. Inspired by fetch from JavaScript (https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch)

Analogy: If you have a warm, cuddly, friendly dog and you’re at the park playing the good ol’ game of fetch: the object you’re intending to retrieve is accessible to you, but not immediately within reach.

Bad

import requests

def get_story_details(story_id):
    response = requests.get(f'https://api.example.com/story/{story_id}/details')
    return response

If the method is named get_, there’s no way to distinguish at a glance whether this function calls out to another server/service.

Whenever the flow of your code leaves your control (like making an API call), there is inherent risk and potential for errors to occur (e.g. “What if the remote API goes down?”)

Good

import requests

def fetch_story_details(story_id):
    response = requests.get(f'https://api.example.com/story/{story_id}/details')
    return response

By naming methods that make API calls to remote resources with fetch_, you allow engineers to quickly identify risky sections of code at a glance, without requiring them to read the details of a function. This saves time and creates a flywheel effect: by reading and writing code faster, teams produce more code and fix bugs more quickly, delivering more business value and enabling those teams to hire more team members and build more products.

So, if I see a method named fetch_, I can immediately make a mental note to make these sections of code more resilient (such as with try and except error handling, retry logic with exponential backoffs, etc).
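
For instance, here is a rough sketch of that kind of hardening (my own illustration, reusing the hypothetical https://api.example.com endpoint from above):

import time

import requests

def fetch_story_details(story_id, max_attempts=3):
    """Fetch story details from the remote API, retrying with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            response = requests.get(
                f'https://api.example.com/story/{story_id}/details',
                timeout=5,
            )
            response.raise_for_status()
            return response
        except requests.RequestException:
            if attempt == max_attempts - 1:
                raise  # out of retries; fail loudly
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...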

look_up_

When to use it: Use this prefix when the goal of the method is to retrieve data that was previously stored in local storage, like a database.

Analogy: There is a single piece of information you want to retrieve from among a larger collection, like looking up the definition of a word in a glossary.

Setup

# MySQL

| id | item              | price_cents |
----------------------------------------
|  1 | eggs and ham      |        1525 |
|  2 | hot chili peppers |         299 |
|  3 | carrots           |         399 |

Bad

def get_price(item):
    sql = connect()
    q = sql.query('items').where(item=item)
    price = q.execute()['price_cents']
    return price

Better

def look_up_price(item):
    sql = connect()
    q = sql.query('items').where(item=item)
    price = q.execute()['price_cents']
    return price

By naming a method look_up_, you signal to the next engineer who reviews this code that it involves some form of database retrieval, and they can keep the performance characteristics of database retrieval in mind.

Best

Use the database repository design pattern.

# repos/items.py

class ItemRepo:
    def get(self, item: str) -> Record:
        sql = connect()
        q = sql.query('items').where(item=item)
        record = q.execute()
        return record

    def look_up_price(self, item: str) -> float:
        record = self.get(item)
        price = record['price_cents']
        return price

format_ / transform_

When to use it: When the desired output is a derivative of the inputs, or a metamorphosis such that the output is not recognizable from its original form except to a connoisseur.

Analogy: When the input is uncooked potatoes and the output is mashed potatoes, you are transforming the raw ingredients into a useful end-product.

Bad

def get_mashed_potato(raw_potato):
    boiled_potato = boil(raw_potato)
    mashed_potato = mash(boiled_potato)
    return mashed_potato

Good

def transform_potato(raw_potato, form_factor):
    result = raw_potato

    if form_factor == 'raw':
        result = raw_potato
    elif form_factor == 'baked':
        result = bake(raw_potato)
    elif form_factor == 'boiled':
        result = boil(raw_potato)
    elif form_factor == 'mashed':
        result = mash(transform_potato(raw_potato, 'boiled'))

    return result

Alternatively, format_potato with the same function body above would work.

Conclusion

Please, for the love of all things proper, think twice before creating another method with the prefix name get_, and use one of the better alternatives: build_, calculate_, extract_, fetch_, look_up_, retrieve_, format_, transform_.

I promise you – your future self and your teammates will thank you!

Feedback

Agree? Disagree? Love it? Hate it?

Please leave comments or drop me a line!

