Skip to content

Latest commit

 

History

History
32 lines (25 loc) · 2.49 KB

File metadata and controls

32 lines (25 loc) · 2.49 KB

AGENTS.md — vykar

Fast, encrypted, deduplicated backup tool in Rust. Two crates: vykar-core (library) and vykar-cli (thin CLI). Build with make pre-commit to run fmt-check + clippy + tests.

Repository on-disk layout

<repo>/
  config              # write-once, unencrypted msgpack: version, chunker params, pack size limits
  keys/repokey        # write-once, Argon2id-wrapped master key
  index               # mutable, encrypted: IndexBlob {generation, chunks} — the chunk index
  index.gen           # mutable, unencrypted, advisory: u64 generation hint for local cache (never trusted for writes)
  snapshots/<id>      # write-once, encrypted: SnapshotMeta — source of truth for snapshot listing
  sessions/<id>.json  # ephemeral: session presence markers (concurrent backups)
  sessions/<id>.index # ephemeral: per-session crash-recovery journals
  packs/<xx>/<id>     # write-once: compressed+encrypted chunks (256 shard dirs)
  locks/*.json        # advisory locks

Architecture notes

  • Two-phase backup enables concurrent uploads. Phase 1 (no lock) registers a session and uploads packs. Phase 2 (brief exclusive lock) reconciles the index and writes the snapshot blob as the commit point. See backup/mod.rs for the implementation.
  • Manifest is runtime-only: populated from snapshots/ blobs on open, never serialized. A local AEAD-encrypted snapshot cache avoids O(n) GETs.
  • Delete/prune ordering: delete snapshots/<id> first (must succeed, failure aborts), then decrement refcounts and persist index. Crash between the two leaves inflated refcounts (safe).
  • Maintenance lock (with_maintenance_lock()): acquires advisory lock, cleans stale sessions (72h), refuses if any active sessions remain (VykarError::ActiveSessions).

Conventions & gotchas

  • Serialization: all wire formats use rmp_serde (msgpack) with positional arrays. Do not use #[serde(skip_serializing_if)] on Item fields — breaks positional deserialization.
  • blake2 trait ambiguity: Blake2bMac<U32> has ambiguous trait methods. Always use fully-qualified Mac::update(&mut hasher, data) and <KeyedBlake2b256 as KeyInit>::new_from_slice().
  • PlaintextEngine: still needs a chunk_id_key for deterministic dedup. For unencrypted repos it's derived as BLAKE2b(repo_id).
  • Output split: vykar-core never prints — it communicates via return values and Option<&mut dyn FnMut(Event)> progress callbacks. vykar-cli uses stdout for results/tables and stderr for status/errors.