This document is for adopters who want to understand how and why this repo is built the way it is. It covers the internal architecture, design decisions, and the reasoning behind patterns you will encounter when reading or modifying the scripts.
If you are setting up a new machine for the first time, start with GettingStarted.md instead.
- Design Principles
- Repository Layout
- Shell Architecture:
.shellrcvs.aliases - Logging System
- Exit Code Safety:
ifvs&& - Deferred Error and Warning Collection
- Zsh Startup Optimisation
- Cron Safety Mechanisms
install-dotfiles.rbMechanics- Per-Project Script Overrides
capture-prefs.shArchitectureosx-defaults.shandcapture-prefs.sh— Two-Phase Preference Architecture
Every decision in this codebase is governed by four priorities, applied in order. A higher priority always wins over a lower one.
The shell must be usable immediately after opening a terminal. Every $(...) command substitution forks a new process — on a loaded system this can add tens of milliseconds each. The startup hot path (.zshenv → .zshrc → .zlogin) avoids all subshell forks. Zsh built-in parameter expansions (${MACHTYPE%%-*} instead of $(uname -m), ${COLUMNS} instead of $(tput cols)) are used throughout.
Performance optimisations that obscure intent are always commented. A reader with basic zsh knowledge should understand any section without needing to re-derive the intent.
Where possible, syntax is kept POSIX-compatible so that scripts remain portable in bash contexts (.envrc files run in a bash subshell via direnv, git alias bodies use sh). When POSIX syntax would require a subshell fork or degrade readability, zsh built-ins are used with a comment explaining why.
fresh-install-of-osx.sh — the main bootstrap — must work correctly both on a vanilla macOS and on a fully configured machine. Every section has a guard at the very top that short-circuits the entire section when its work is already done. Re-running on a configured machine is therefore fast: only sections with outstanding work execute their logic.
The FIRST_INSTALL environment variable signals a vanilla OS run. Logic that only makes sense on a blank slate (e.g. downloading .shellrc via curl before the dotfiles repo exists) is guarded with is_first_install (a utility function in .shellrc that wraps [[ -n "${FIRST_INSTALL:-}" ]]). Two occurrences in fresh-install-of-osx.sh use the raw form directly because they run before .shellrc has been sourced.
files/
--HOME--/ symlinked into $HOME
--ZDOTDIR--/ symlinked into $ZDOTDIR (defaults to $HOME)
--XDG_CONFIG_HOME--/ symlinked into $XDG_CONFIG_HOME
--PERSONAL_PROFILES_DIR--/ .envrc for direnv
scripts/
fresh-install-of-osx.sh
capture-prefs.sh
osx-defaults.sh
utilities/ shared Ruby modules (logging, string, cli_parser, …)
data/ plain-text data files read by scripts at runtime
Subdirectories under files/ use the pattern --ENV_VAR_NAME--. The install-dotfiles.rb script resolves each directory name to the env var it names (--HOME-- → $HOME, --XDG_CONFIG_HOME-- → $XDG_CONFIG_HOME) and symlinks every file inside into the resolved directory.
To add a new dotfile, drop it under the appropriate files/--VAR--/ directory. The script handles nested paths, conflict resolution, and the difference between symlinks and copies (see § 9).
| Variable | Default path | Purpose |
|---|---|---|
$DOTFILES_DIR |
~/.config/dotfiles |
This repo |
$PERSONAL_BIN_DIR |
~/personal/dev/bin |
Private scripts and per-project overrides |
$PERSONAL_CONFIGS_DIR |
~/personal/dev/configs |
Private config files (repo catalog YAML, exported prefs, etc.) |
$PERSONAL_BIN_DIR and $PERSONAL_CONFIGS_DIR live outside the dotfiles repo. They are never present on a vanilla OS before the dotfiles repo is cloned, so any function that only these scripts need belongs in .aliases — not .shellrc. They are kept separate intentionally: they hold private data (credentials, personal scripts, site-specific configs) that must never appear in a public repository. The only personal identifiers that belong in this repo are GH_USERNAME (a public GitHub username, inherently non-sensitive) and references to Keybase (which handles its own encryption for private data).
.shellrc is downloaded via curl early in fresh-install-of-osx.sh, before the dotfiles repo has been cloned and before install-dotfiles.rb has created symlinks. It must therefore stay lean — it holds only the functions that are genuinely needed in that pre-clone window.
.aliases is available only after the dotfiles repo is cloned and install-dotfiles.rb has run. Everything that does not need to exist before that point lives in .aliases.
A function belongs in .shellrc if it is called during a vanilla-OS fresh-install before install-dotfiles.rb runs. Everything else belongs in .aliases.
There is a second constraint: .envrc files are evaluated by direnv in a bash subshell. They must source .shellrc (not .aliases) because .aliases contains zsh-only syntax (:h, :A, associative array subscripts, zsh-specific glob qualifiers) that bash cannot parse. Any function called from a .envrc file — even one that is not needed during the pre-clone window — must therefore live in .shellrc, with a comment explaining the bash-compat reason.
- Each script sources only the tightest file that provides what it needs.
- If a script needs only
.shellrcfunctions, it sources.shellrc. - If it needs
.aliasesfunctions, it sources.aliases— which already sources.shellrcinternally. - Never source both in the same script.
.shellrc defines a sentinel function is_shellrc_sourced. Other scripts source .shellrc unconditionally; the guard inside .shellrc itself prevents double-loading. The same applies to .aliases via is_aliases_sourced. Scripts add a comment — # Re-source guard is inside .shellrc/.aliases itself — safe to call unconditionally. — to make this explicit without repeating the guard logic.
Six levels are defined in .shellrc, each with a distinct visual treatment:
| Level | Prefix/colour | When to use |
|---|---|---|
debug |
**DEBUG** (light purple) |
Expected-absent optional tools, silently-skipped steps. Hidden by default; visible with DEBUG=true. |
info |
no prefix (plain) | Normal progress and idempotency guards ("already installed — skipping"). Suppressed when DIRENV_IN_ENVRC is set (direnv subshell). |
success |
**SUCCESS** (green) |
An operation completed successfully. Suppressed when DIRENV_IN_ENVRC is set (direnv subshell). |
warn |
**WARN** (light red) |
Argument-parse failures (followed by usage + return 1); non-fatal recoverable failures. |
error |
**ERROR** (red) |
Unexpected mid-script failures. In shell: prints and fires a macOS notification via osascript. In Ruby: raises RuntimeError. |
user_action |
➡️ (bold yellow) |
Manual steps the user must perform after the script exits (restart an app, run a command). |
The key distinction between warn and error: warn is for expected failure modes (bad flag, recoverable skip). error triggers a macOS notification pop-up — using it for a typo in a flag argument would be annoying UX, so warn is always correct for argument-parse failures.
Two levels of section header provide a visual indent stack:
══════════════ ⏳ Top-level section ══════════════ ← section_header
──────── 🔷 Sub-step inside section ──────── ← section_header2
section_header uses = padding, light_blue colour, and no indent.
section_header2 uses - padding, cyan colour, and 2-space indent.
The padding width is computed from ${COLUMNS:-80} (the terminal width) minus the header text length, split evenly on both sides. The fallback to 80 is deliberate: cron jobs run with no terminal attached, so zsh sets COLUMNS to 0 — without the fallback, the arithmetic yields zero or negative lengths and printf produces no output.
Color functions (blue, red, cyan, yellow, etc.) wrap their argument in ANSI escape codes and — importantly — apply ${1//${HOME}/~} inline to replace absolute home paths with ~. This substitution is a no-fork operation: it uses zsh parameter expansion, not a subshell call. Logging functions (info, warn, etc.) do not apply the substitution themselves; they rely on the color functions to do it.
This means: never call replace_home_with_tilde before passing a path to a color function — the substitution happens inside and calling it twice is redundant.
A && B means "run B only if A succeeds". Under set -e or an ERR trap, if A returns non-zero, the entire && expression also returns non-zero — which triggers set -e abort or fires the ERR trap, even if A returning false is a completely normal, expected outcome.
For example:
is_file "${optional_config}" && cp "${optional_config}" "${dest}"is_file returns 1 when the file is absent. That is the common case (the file is optional). But the && expression returns 1, which under set -e aborts the script as if an error occurred.
Always use an explicit if for predicates whose "false" branch is a normal outcome:
if is_file "${optional_config}"; then
cp "${optional_config}" "${dest}"
fiAn if statement never propagates a non-zero exit code from the condition to the enclosing scope.
A && B || C is safe when C always returns 0. The overall expression resolves to C's exit code (0), so the ERR trap never fires. This pattern is used intentionally for success/failure dispatch:
git pull -r && success "Updated: ${folder}" || _record_warning "Failed: ${folder}"_record_warning always returns 0, making this safe. Do not use this pattern if C can return non-zero.
When editing any script that uses set -e or an ERR trap, scan every standalone A && B line and verify that A returning false is a genuine error (not a normal case). If it is expected, convert to if A; then B; fi.
Scripts that process many items (e.g. iterating over 100+ repos, 100+ preference domains) should not fire a macOS notification for each individual failure. A single summary notification at the end is far less disruptive.
Every script that uses this pattern declares three locals and increments the nesting counter at the top of main():
main() {
local _current_section='(init)'
local -a _step_warnings=()
local -a _step_errors=()
export _DOTFILES_SCRIPT_DEPTH=$((${_DOTFILES_SCRIPT_DEPTH:-0} + 1))
trap '_decrement_script_depth' EXIT # or chained into an existing EXIT trap
...
}The helpers _record_warning and _record_error (defined in .shellrc) append to these arrays via zsh dynamic scoping — they write into the caller's local variables without needing them passed as arguments. They also emit an inline warn/error log line immediately so failures are visible in the log stream as they happen.
print_script_summary is called once at the end of main(). It prints all collected warnings then errors, and if any errors were collected it fires exactly one _dotfiles_notify with a consolidated message.
_DOTFILES_SCRIPT_DEPTH is an exported integer that tracks call depth across subprocess boundaries. It defaults to 0 when unset; each main() increments it on entry and decrements it on exit via an EXIT trap.
is_outermost_script (shell) and outermost_script? (Ruby) check _DOTFILES_SCRIPT_DEPTH <= 1. Both print_script_start and print_script_summary (and print_script_duration via the latter) call this predicate and return early when depth > 1. This ensures that only the outermost script prints its start/finish banners and final summary — nested subprocess scripts stay silent so their output does not interleave with or duplicate the outer script's.
All scripts with the counter are currently invoked as subprocesses (not sourced). Subprocess environments are copies — a child's increment does not affect the parent, and the parent's depth is unaffected when the child exits. The decrement therefore has no practical effect on correctness today.
The decrement is applied anyway as a defensive, symmetric pattern:
- If a script is ever converted from subprocess to sourced invocation, the decrement immediately becomes load-bearing without any further change.
- It establishes a consistent visual contract: every increment has a matching decrement, making audits straightforward.
The trap string must work both as a standalone trap (trap '_decrement_script_depth' EXIT) and chained into existing traps (trap '...; _decrement_script_depth' EXIT). Extracting it to a named helper keeps the arithmetic and underflow guard ([[ depth -gt 0 ]]) in one place and avoids repeating the logic across nine trap strings.
Ruby scripts call Logging.increment_script_depth once before print_script_start. It increments ENV['_DOTFILES_SCRIPT_DEPTH'] and registers an at_exit hook that calls Logging.decrement_script_depth — the exact mirror of the shell's increment + EXIT trap pair.
Logging.increment_script_depth
script_start_time = Logging.print_script_startat_exit handlers in Ruby run on both clean exit and uncaught exceptions, mirroring the shell EXIT trap's behaviour on both normal and error exits.
- Argument-parse failures (
?/:getopts cases, missing required flags): always usewarn+usage+return 1. These are interactive typos; a pop-up notification is inappropriate. - Runtime failures (missing env vars, failed remote calls, unexpected file states): use
_record_error. These are unexpected environment problems worth surfacing as a notification.
The startup sequence (.zshenv → .zshrc → .zlogin) is measured in milliseconds. Every $(...) call forks a new process. Common substitutions are replaced with zero-fork equivalents wherever possible:
| Naive (forks) | Fast (no fork) |
|---|---|
$(whoami) |
${USER} |
$(tput cols) |
${COLUMNS} |
type func > /dev/null |
(( $+functions[func] )) |
Architecture detection is a known exception: ${MACHTYPE%%-*} is the ideal no-fork form, but MACHTYPE is unreliable on a vanilla macOS before Homebrew's zsh is active. .shellrc therefore uses $(uname -m) with a comment, accepting the fork cost at this one site in exchange for correctness across all environments.
brew shellenv sets PATH, MANPATH, and a handful of other variables. It is slow (~100 ms). The output is cached to $XDG_CONFIG_HOME/zsh/homebrew-shellenv-cache.zsh and sourced from there on subsequent starts. The cache is regenerated only when the brew binary is newer than the cache file.
Zsh can compile .zsh files to .zwc bytecode for faster loading. .zlogin triggers background compilation of all startup files and autoload directories after the interactive shell is ready. The delete_caches function removes all .zwc files; zsh regenerates them on the next startup.
Antidote replaces oh-my-zsh as the plugin manager. Key properties:
- Static bundle: plugins are resolved once and written to a bundle file (
.zsh_plugins.zsh) checked into the home git repo. The bundle is sourced at startup — no network call. ANTIDOTE_HOMEis set to~/Library/Caches/antidote(macOS-specific cache location).- The bundle is regenerated in a clean subshell (
zsh --no-rcs -c "antidote bundle < ...") to prevent ANSI escape codes from the interactive session leaking into the bundle file. - Plugin option variables (e.g.
ZSH_AUTOSUGGEST_STRATEGY) must be set before the bundle is sourced — plugins read them at load time.
compinit (the completion system initialiser) runs a filesystem security scan (compaudit) that can add ~50 ms. On subsequent starts, compinit -C is passed to skip the scan — the dump file at $XDG_CACHE_HOME/zcompdump serves as evidence that the scan already ran.
Scripts invoked from cron run in a minimal, non-interactive environment. Several things that work in a terminal silently fail or hang in cron.
load_zsh_configs sources .zshrc, which sources .zlogin. .zlogin triggers background ZWC compilation jobs. Launching these from a cron job with no terminal attached is disruptive and wasteful.
Rule: call load_zsh_configs from a cron script only if the script uses variables or functions defined in .zshrc (e.g. PROJECTS_BASE_DIR, mise shims). Most cron scripts only need vars from .shellrc/.aliases — these are available after load_file_if_exists "${HOME}/.aliases" without calling load_zsh_configs.
sudo prompts for a password when credentials are not cached. In a non-interactive cron context there is no terminal to type into — the process hangs indefinitely.
Any function callable from cron that uses sudo must call has_sudo_credentials (defined in .shellrc) first and return early with a warn if it fails:
if ! has_sudo_credentials; then
warn "sudo credentials not available — skipping."
return 0
fi
sudo some-commandhas_sudo_credentials encapsulates the sudo -n true 2>/dev/null check so call sites do not need to know the raw invocation.
is_running_in_tty returns true when stdin is a TTY ([[ -t 0 ]]) or FORCE_COLOR is set. In cron — no TTY, FORCE_COLOR unset — it returns false.
Operations that interact with the running desktop (killing apps, re-launching them via open -a) must be gated:
if [[ "${operation}" == 'import' ]] || is_running_in_tty; then
kill_login_item_apps
trap 'restart_login_item_apps; cleanup' EXIT
else
trap 'cleanup' EXIT
fiIn capture-prefs.sh: export from cron must not kill login-item apps — the user might be actively using them. Import always kills/restarts because the user explicitly triggered it.
Zsh sets COLUMNS to 0 when no terminal is attached. Any code that uses COLUMNS for arithmetic (header padding widths, display lengths) must use ${COLUMNS:-80} to avoid zero-division or negative lengths.
.shellrc suppresses info and success output when DIRENV_IN_ENVRC is set — the variable direnv injects during .envrc evaluation. DIRENV_DIR is intentionally not used as the guard: it does not survive direnv's strict_env mode. warn and error always print. This means .envrc files need no special log-suppression logic.
Each subdirectory under files/ named --VAR-- is resolved by reading the environment variable named VAR and expanding it to an absolute path. Files inside are then symlinked into that resolved directory.
Plain subdirectory names without the --VAR-- pattern are resolved literally from / (e.g. files/etc/ → /etc/). The --VAR-- form is preferred for any path that may differ between machines.
If a real file (not a symlink) already exists at a symlink target, install-dotfiles.rb moves the existing file into the repo before creating the symlink. This "adopt" behaviour ensures existing configs are never silently discarded. The --force flag overrides this and deletes the existing file instead.
Files matching custom.git* (e.g. custom.gitignore, custom.gitattributes) are copied rather than symlinked. Git itself does not handle symlinks reliably for its own core config files. Conflict resolution when both the source (repo) and destination (home) exist as real files:
FIRST_INSTALLset → destination always wins (moved into repo, then copied back).- Otherwise → the file with the newer mtime wins. On a tie, the repo wins.
After symlinking, install-dotfiles.rb ensures the line Include "./global_config" is present in ${SSH_CONFIGS_DIR}/config. This is a post-symlink step — do not add it manually or duplicate the guard elsewhere.
Autoload functions (cc, count, pull, push, st, upreb) support per-project overrides. The public function is a thin wrapper:
push() { dispatch_or_fallback push _push "$@"; }When called, dispatch_or_fallback looks for ${PERSONAL_BIN_DIR}/<cmd>-<cwd-basename>.sh. If the file exists and is executable, it is sourced in the current shell (so it inherits all functions and env vars). Otherwise the default _push implementation is called.
This means you can have a file ~/personal/dev/bin/push-my-project.sh that overrides the push behaviour specifically when you are inside a directory named my-project, without touching the shared push function.
status_all_repos and update_all_repos intentionally do not use this pattern — they operate on a fixed set of repos and a cwd-based override would not be meaningful.
The same dispatch mechanism applies to launch_me, debug_me, and build_me — they look for launch-<dir>.sh, debug-<dir>.sh, and build-<dir>.sh respectively.
defaults export produces binary plist by default. capture-prefs.sh immediately converts to XML (plutil -convert xml1) for two reasons:
- Diffability: XML plist is human-readable and produces meaningful git diffs. Binary plist diffs are useless.
- Round-trip fidelity:
defaults importreads XML plist natively. Converting via JSON is lossy —<data>blobs become base64 strings and<date>values become RFC 3339 strings thatdefaults importcannot round-trip. Some domains contain<data>-encoded values for legitimate portable keys (e.g. split-view frame strings).
| File | Purpose |
|---|---|
capture-prefs-allowed-list.txt |
Domains to export/import. One domain per line. |
capture-prefs-denied-list.txt |
Domains that must never be exported/imported (machine UUIDs, credentials, sync cursors, display geometry). Each entry has an inline comment explaining why. |
capture-prefs-excluded-keys.txt |
Individual keys within allowed domains that must be stripped before export/import. Format: `domain |
After exporting a domain to XML, _strip_excluded_keys reads the top-level keys via python3 (null-byte separated to handle keys with spaces), then for each key tests it against the patterns for that domain. Matched keys are deleted by PlistBuddy. Deletions are non-fatal — a missing key is silently skipped.
On import, stripping runs on a temp copy of the .plist file. The source file in the git repo is never modified during import.
The script-scoped _excluded_by_domain associative array is populated at startup (not per-domain) because zsh cannot pass associative arrays by value to functions.
capture-prefs.sh -e (export) is called from software-updates-cron.sh. In this context:
- Killing login-item apps is wrong — the user may be actively using them.
- Re-launching them via
open -ais worse — apps restart mid-session without the user's knowledge.
Kill/restart is therefore scoped to: always on import; interactive (TTY) export only. The is_running_in_tty check provides this gate. The EXIT trap in cron export only calls resume_softwareupdate_schedule.
macOS preferences are managed in two distinct, ordered phases. The order is load-bearing: phase 2 always wins over phase 1 by design. Both phases are invoked automatically by fresh-install-of-osx.sh in sequence.
osx-defaults.sh writes a curated, partial baseline of defaults write calls. "Partial" is intentional — it only codifies settings where a known-good starting value is worth establishing on a fresh machine. It does not attempt to replicate every preference the user has ever configured.
The baseline is appropriate for:
- System settings the user has never changed via the UI (Dock behaviour, Finder display options, keyboard shortcuts).
- App settings that are purely scriptable and have no meaningful UI-side equivalent (disabling analytics, enabling developer menus).
The baseline is not appropriate for:
- Any setting the user configures through the app's UI over time. Writing those here means
osx-defaults.sh -swould reset them to stale values on every fresh-install, defeating the purpose of phase 2. - Ephemeral state (window coordinates, last-opened directory, migration sentinels, A/B experiment assignments). Apps manage these themselves.
capture-prefs.sh -i imports the .plist files previously exported from the user's old machine via capture-prefs.sh -e. Because this runs after phase 1, every imported value overwrites the corresponding baseline value. The user's deliberate, UI-configured choices always win.
osx-defaults.sh -s # phase 1 — write baseline
capture-prefs.sh -i # phase 2 — overwrite with UI-configured valuesReversing the order causes osx-defaults.sh to overwrite the user's restored preferences with stale baseline values — exactly the wrong outcome. fresh-install-of-osx.sh encodes this order and must not be changed without understanding this constraint.
| Preference type | Where it goes |
|---|---|
| One-time baseline the user will not change via UI | osx-defaults.sh |
| Something the user configures through the app's UI | capture-prefs-allowed-list.txt — not osx-defaults.sh |
| Ephemeral state the app manages itself | capture-prefs-excluded-keys.txt or -denied-list.txt — nowhere else |
See Extras.md — osx-defaults.sh for the adopter-facing summary.
Back to README.md