diff --git a/.github/instructions/upgrade-gitleaks.instructions.md b/.github/instructions/upgrade-gitleaks.instructions.md new file mode 100644 index 0000000..76087ad --- /dev/null +++ b/.github/instructions/upgrade-gitleaks.instructions.md @@ -0,0 +1,102 @@ +--- +description: Upgrade the gitleaks package to a newer version +globs: go.mod, go.sum, scanner.go, config.go +alwaysApply: false +--- + +# Skill: Upgrade Gitleaks Package + +Use this procedure when upgrading the `github.com/zricethezav/gitleaks/v8` dependency. + +## Step 1: Check Available Versions + +```bash +go list -m -versions github.com/zricethezav/gitleaks/v8 +``` + +## Step 2: Upgrade + +```bash +# Latest +go get github.com/zricethezav/gitleaks/v8@latest +go mod tidy + +# Or specific version +go get github.com/zricethezav/gitleaks/v8@v8.X.Y +go mod tidy +``` + +## Step 3: Check for Breaking Changes + +Review https://github.com/gitleaks/gitleaks/releases for the new version. + +**Watch for:** +- `detect.Fragment` removal (deprecated in v8, may be `sources.Fragment` in v9) +- `config.ViperConfig` or `Translate()` API changes +- `report.Finding` field changes + +## Step 4: Files That Import Gitleaks + +| File | Imports | +|------|---------| +| `config.go` | `config` | +| `scanner.go` | `config`, `detect`, `report` | +| `scanner_test.go` | `config`, `report` | + +## Step 5: Run Tests + +```bash +go test ./... +``` + +**If tests fail:** + +1. **Secret detection tests** - Rule patterns may have changed. Validate: + ```bash + echo 'key = "AKIATESTKEYEXAMPLE7A"' | gitleaks detect --no-git --source=/dev/stdin + ``` + +2. **Column/line tests** - If `diagnostics_test.go` fails, gitleaks may have changed column indexing. Check `adjustColumn()` logic. + +3. **Finding struct changes** - Check if `report.Finding` fields were renamed/removed. + +## Step 6: Run Linter + +```bash +golangci-lint run +``` + +Add suppressions to `.golangci.yml` if new deprecation warnings appear for APIs that still work. 
+ +## Step 7: Run Benchmarks + +```bash +go test -bench=. -benchmem ./... +``` + +Check for performance regressions. + +## Step 8: Manual Test + +```bash +./test.sh +``` + +Verify in Neovim: +- Diagnostics appear correctly +- Hover works +- Code actions work + +## Step 9: Commit + +```bash +git add go.mod go.sum +git commit -m "chore: upgrade gitleaks to vX.Y.Z" +``` + +## Rollback + +```bash +go get github.com/zricethezav/gitleaks/v8@vPREVIOUS +go mod tidy +``` diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..07db1e1 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,263 @@ +# AGENTS.md - AI Agent Guidelines + +This document provides guidance for AI agents working on the gitleaks-ls codebase. + +## Project Overview + +gitleaks-ls is a Language Server Protocol (LSP) implementation for [Gitleaks](https://github.com/gitleaks/gitleaks) that provides real-time secret detection in code editors. Written in Go, it uses stdio for LSP communication. + +**Key Features:** +- Real-time scanning on file open/change/save +- Content-hash caching for performance +- `.gitleaks.toml` and `.gitleaksignore` support with file watching +- Hover documentation with remediation guidance +- Code actions for adding `gitleaks:allow` comments +- Workspace-wide scanning command with progress reporting +- Configurable diagnostic severity + +## Quick Start + +```bash +# Build +go build -o gitleaks-ls + +# Test +go test ./... 
+ +# Lint +golangci-lint run + +# Test manually with Neovim +./test.sh +``` + +## Architecture + +Flat package structure - all source files in root (single `main` package): + +| File | Purpose | +|------|---------| +| `main.go` | Entry point, LSP server setup, `initialize` handler | +| `handlers.go` | Document handlers, `Server` struct, `DocumentStore`, `scanAndPublish()` | +| `scanner.go` | Gitleaks wrapper, `Finding` type, `.gitleaksignore` loading | +| `diagnostics.go` | `Finding` → LSP `Diagnostic` conversion, column adjustment | +| `config.go` | `.gitleaks.toml` loading via Viper, file watching | +| `cache.go` | SHA256 content-hash → findings cache | +| `hover.go` | Markdown hover documentation for findings | +| `actions.go` | Code actions, comment syntax for 40+ languages | +| `workspace.go` | `gitleaks.scanWorkspace` command, parallel file scanning | +| `settings.go` | LSP settings (`diagnosticSeverity`) | +| `uri.go` | Cross-platform `file://` URI ↔ filesystem path | + +**Global State:** `globalServer *Server` holds scanner, documents, config, and cache. + +## Critical Gotchas + +### Gitleaks Import Path + +```go +// CORRECT - use zricethezav, not gitleaks +import "github.com/zricethezav/gitleaks/v8/config" +import "github.com/zricethezav/gitleaks/v8/detect" +``` + +The module redirects from `gitleaks/gitleaks` but declares itself as `zricethezav/gitleaks`. 
+ +### Gitleaks Config Loading (No Constructor) + +```go +// There is NO config.NewConfig() function +v := viper.New() +v.SetConfigType("toml") +v.SetConfigFile(path) +v.ReadInConfig() +var vc config.ViperConfig +v.Unmarshal(&vc) +cfg, _ := vc.Translate() // This creates config.Config +``` + +### LSP Types Require Pointers + +```go +severity := protocol.DiagnosticSeverityWarning +diag.Severity = &severity // Must be pointer + +source := "gitleaks" +diag.Source = &source // Must be pointer +``` + +### LSP Indexing & Column Quirks + +| Source | Lines | Columns | +|--------|-------|---------| +| LSP | 0-indexed | 0-indexed | +| Gitleaks | 1-indexed | **inconsistent** | + +**Gitleaks column numbering is quirky:** +- Line 0: `StartColumn` is 1-indexed, `EndColumn` is 0-indexed (exclusive) +- Line >0: `StartColumn` is 2-indexed, `EndColumn` is 1-indexed (exclusive) + +The `adjustColumn()` function in `diagnostics.go` handles this. Don't try to simplify it. + +### Cross-Platform URIs + +Windows: `file:///C:/path` → `C:\path` +Unix: `file:///path` → `/path` + +Use `uri.go` functions (`uriToPath`, `pathToURI`), not string manipulation. + +### File Size Limit + +Files >1MB are silently skipped (returns empty findings, no error). See `ScanContent()` in `scanner.go`. 
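A minimal sketch of the skip behaviour described under File Size Limit. The `scanWithLimit` wrapper, the `maxScanSize` name, and the 1,000,000-byte threshold are assumptions for illustration; the real check lives in `ScanContent()` and may differ.

```go
package main

import "fmt"

// maxScanSize approximates the 1MB limit; the exact value and name are assumptions.
const maxScanSize = 1_000_000

// Finding is a simplified stand-in for the Finding type in scanner.go.
type Finding struct {
	RuleID string
}

// scanWithLimit is a hypothetical wrapper: content larger than maxScanSize
// is skipped (no findings, no error); otherwise scan is called.
func scanWithLimit(content string, scan func(string) ([]Finding, error)) ([]Finding, error) {
	if len(content) > maxScanSize {
		return nil, nil
	}
	return scan(content)
}

func main() {
	// fakeScan reports one finding for any input so the skip is observable.
	fakeScan := func(string) ([]Finding, error) {
		return []Finding{{RuleID: "demo-rule"}}, nil
	}

	small, _ := scanWithLimit("key = value", fakeScan)
	large, err := scanWithLimit(string(make([]byte, maxScanSize+1)), fakeScan)

	fmt.Println(len(small), len(large), err) // prints: 1 0 <nil>
}
```

Because an oversized file is indistinguishable from a clean one at the call site, test fixtures that expect findings should stay under the limit.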
+ +## Valid Test Secrets + +Use these for tests - invalid secrets won't be detected: + +**AWS Access Key:** `AKIATESTKEYEXAMPLE7A` +- Must be: `AKIA` + 16 chars from `[A-Z2-7]` (Base32 alphabet) +- **Invalid:** `AKIAIOSFODNN7EXAMPLE` (not detected; O and I are valid Base32, so the likely cause is the default rule's allowlist for keys ending in `EXAMPLE`) + +**GitHub PAT:** `ghp_1234567890abcdefghijklmnopqrstuvwxyz` +- Must be: `ghp_` + exactly 36 alphanumeric chars + +Validate with CLI: +```bash +echo 'key = "AKIATESTKEYEXAMPLE7A"' | gitleaks detect --no-git --source=/dev/stdin +``` + +## Testing Patterns + +**Create a test scanner:** +```go +func newTestScanner(t testing.TB) *Scanner { + v := viper.New() + v.SetConfigType("toml") + require.NoError(t, v.ReadConfig(strings.NewReader(config.DefaultConfig))) + + var vc config.ViperConfig + require.NoError(t, v.Unmarshal(&vc)) + + cfg, err := vc.Translate() + require.NoError(t, err) + + return NewScanner(cfg) +} +``` + +**Mock LSP context for integration tests:** +```go +var notifications []protocol.PublishDiagnosticsParams +ctx := &glsp.Context{ + Notify: func(method string, params any) { + if method == "textDocument/publishDiagnostics" { + if p, ok := params.(protocol.PublishDiagnosticsParams); ok { + notifications = append(notifications, p) + } + } + }, +} +``` + +## Code Quality Requirements + +- **Linting:** `golangci-lint run` (config in `.golangci.yml`) +- **Tests:** `go test ./...` (maintain 70%+ coverage) +- **Formatting:** `go fmt ./...` + +## Performance Targets + +| Operation | Target | +|-----------|--------| +| Scan small file (<100 lines) | <10ms | +| Scan medium file (~1K lines) | <50ms | +| Scan large file (~500KB) | <200ms | +| Cache hit | <1µs | +| Server startup | <500ms | + +Run benchmarks: `go test -bench=. -benchmem ./...` + +## Common Mistakes to Avoid + +1. **Invalid test secrets** - Always validate with `gitleaks detect` CLI first +2. **Wrong import path** - Use `zricethezav`, not `gitleaks` +3. **Missing pointers** - LSP types need `&value` for optional fields +4. 
**Unchecked type assertions** - Use `val, ok := x.(Type)` pattern +5. **Platform-specific paths** - Use `uri.go` functions +6. **Chasing the `detect.Fragment` deprecation warning** - `detect.Fragment` is deprecated (v8), but the warning is already suppressed in `.golangci.yml` +7. **Simplifying `adjustColumn()`** - The gitleaks column quirks require that exact logic +8. **Using `config.NewConfig()`** - This function doesn't exist; use Viper pattern +9. **Forgetting cache invalidation** - Config/ignore file changes must clear the cache + +## Workspace Scanning + +The `gitleaks.scanWorkspace` command: +- Scans with 10 concurrent goroutines (`maxConcurrent = 10`) +- Respects `.gitignore` patterns +- Skips: hidden files/dirs, `node_modules`, `vendor`, `__pycache__`, `target`, `build`, `dist` +- Skips binary files (by extension and magic bytes via `filetype` library) +- Reports progress via LSP `$/progress` notifications + +## Key Dependencies + +- **[glsp](https://github.com/tliron/glsp)** - LSP server framework (protocol_3_16) +- **[gitleaks/v8](https://github.com/zricethezav/gitleaks)** - Secret detection engine +- **[fsnotify](https://github.com/fsnotify/fsnotify)** - File watching for config/ignore reload +- **[viper](https://github.com/spf13/viper)** - Config loading (required by gitleaks) +- **[go-gitignore](https://github.com/sabhiram/go-gitignore)** - .gitignore pattern matching +- **[filetype](https://github.com/h2non/filetype)** - Binary file detection via magic bytes +- **[testify](https://github.com/stretchr/testify)** - Testing (assert, require) + +## CI/CD + +GitHub Actions workflows in `.github/workflows/`: +- `ci.yml` - Test matrix (Linux/macOS/Windows, Go 1.24-1.25), lint, benchmark, build +- `release.yml` - Cross-platform binary releases on tag push + +**Coverage requirement:** 70% minimum (enforced in CI) + +## Documentation + +- [README.md](./README.md) - Usage and setup + +## LSP Capabilities + +The server advertises these capabilities in `initialize`: +- `textDocumentSync`: Full sync with 
open/close/save +- `hoverProvider`: true +- `codeActionProvider`: true +- `executeCommandProvider`: `["gitleaks.scanWorkspace"]` + +## Design Principles + +1. **Simplicity**: Flat structure, no unnecessary abstractions +2. **Testability**: Each component testable in isolation +3. **Performance**: Cache aggressively, fail fast +4. **Reliability**: Never crash, log errors, degrade gracefully + +## Error Handling + +- **Initialization errors** (fatal): Invalid config, can't init detector → log and exit +- **Scan errors** (recoverable): File too large, detector error → log, return empty findings +- **LSP errors** (recoverable): Invalid params, unknown URI → log warning, return error + +Use `slog` for structured logging to stderr. + +## Memory & CPU Targets + +| Resource | Target | Max | +|----------|--------|-----| +| Server baseline | <10MB | 20MB | +| Per document | <1MB | 5MB | +| Cache (100 files) | <20MB | 50MB | +| Idle CPU | <1% | 2% | +| Active scanning CPU | <20% | 50% | + +## Non-Goals + +These are explicitly out of scope: +- Secret management or rotation +- Git history scanning (use gitleaks CLI) +- Custom rule creation UI +- Editor extensions/plugins (raw LSP only) +- Multi-workspace folder support (uses single rootUri) diff --git a/DESIGN.md b/DESIGN.md deleted file mode 100644 index 0537ae7..0000000 --- a/DESIGN.md +++ /dev/null @@ -1,924 +0,0 @@ -# Technical Design Document: Gitleaks Language Server - -**Version**: 1.4 -**Last Updated**: 2025-12-05 -**References**: PRD.md - ---- - -## 1. Overview - -This document bridges the PRD (what/why) and implementation (code). It defines interfaces, types, message flows, and technical decisions for the gitleaks language server. - -### 1.1 Architecture Summary - -``` -Neovim (LSP Client) <--stdio/JSON-RPC--> gitleaks-ls (LSP Server) - | - +--> Scanner (gitleaks wrapper) - +--> Cache (findings cache) - +--> Config (watch .gitleaks.toml) -``` - -### 1.2 Key Design Principles - -1. 
**Simplicity**: Flat structure, no unnecessary abstractions -2. **Testability**: Each component testable in isolation -3. **Performance**: Cache aggressively, fail fast -4. **Reliability**: Never crash, log errors, degrade gracefully - ---- - -## 2. Package Structure - -``` -gitleaks-ls/ -├── main.go # Entry point, server initialization -├── handlers.go # LSP message handlers -├── scanner.go # Gitleaks integration -├── diagnostics.go # Finding → Diagnostic conversion -├── config.go # Configuration management -├── cache.go # Result caching (content hash) -├── hover.go # Hover provider -├── actions.go # Code actions (40+ languages) -├── workspace.go # Workspace scanning -├── go.mod / go.sum # Dependencies -├── *_test.go # Tests alongside source -└── README.md # Usage documentation -``` - -**Flat structure**: All source files in root to minimize complexity. - ---- - -## 3. Core Types and Interfaces - -### 3.1 Scanner Interface - -```go -// scanner.go - -import ( - "context" - "github.com/gitleaks/gitleaks/v8/detect" - "github.com/gitleaks/gitleaks/v8/report" -) - -// Scanner wraps gitleaks detection engine -type Scanner struct { - detector *detect.Detector - config *detect.Config -} - -// NewScanner creates a scanner with default or custom config -func NewScanner(configPath string) (*Scanner, error) - -// ScanContent scans the provided content and returns findings -func (s *Scanner) ScanContent(ctx context.Context, filename, content string) ([]Finding, error) - -// Finding represents a detected secret -type Finding struct { - RuleID string // e.g. "aws-access-key" - Description string // e.g. 
"AWS Access Key" - Match string // The matched secret (may be redacted) - Secret string // Extracted secret value (redacted) - StartLine int // 1-indexed - EndLine int // 1-indexed - StartColumn int // 1-indexed - EndColumn int // 1-indexed - Entropy float64 // Shannon entropy score - File string // File path/name - Commit string // Empty for LSP usage - Fingerprint string // Unique identifier for this finding -} -``` - -**Implementation Notes**: -- Wrap `gitleaks/v8/detect.Detector` -- Convert `report.Finding` to our `Finding` type -- Use gitleaks default config if no `.gitleaks.toml` found -- For LSP, we scan strings not files, so use `detector.DetectString()` - -### 3.2 Diagnostic Conversion - -```go -// diagnostics.go - -import ( - "github.com/tliron/glsp" - protocol "github.com/tliron/glsp/protocol_3_16" -) - -// FindingsToDiagnostics converts scanner findings to LSP diagnostics -func FindingsToDiagnostics(findings []Finding) []protocol.Diagnostic - -// FindingToDiagnostic converts a single finding -func FindingToDiagnostic(f Finding) protocol.Diagnostic { - return protocol.Diagnostic{ - Range: protocol.Range{ - Start: protocol.Position{ - Line: uint32(f.StartLine - 1), // LSP is 0-indexed - Character: uint32(f.StartColumn - 1), - }, - End: protocol.Position{ - Line: uint32(f.EndLine - 1), - Character: uint32(f.EndColumn), - }, - }, - Severity: protocol.DiagnosticSeverityWarning, // Configurable in Phase 2 - Source: "gitleaks", - Message: formatMessage(f), - Code: f.RuleID, - } -} - -// formatMessage creates a human-readable diagnostic message -func formatMessage(f Finding) string { - // Example: "Detected AWS Access Key (entropy: 4.2)" -} -``` - -**Key Decisions**: -- LSP uses 0-indexed lines, gitleaks uses 1-indexed -- Default severity: Warning (not Error, to avoid being too noisy) -- Include entropy in message for educational value -- Store RuleID in diagnostic.Code for reference - -### 3.3 Document State Management - -```go -// handlers.go - -// 
DocumentStore tracks open documents and their diagnostics -type DocumentStore struct { - mu sync.RWMutex - documents map[string]*Document // URI -> Document -} - -// Document represents an open file -type Document struct { - URI string - Version int32 - Content string - Diagnostics []protocol.Diagnostic -} - -// UpdateDocument updates document content and version -func (ds *DocumentStore) UpdateDocument(uri string, version int32, content string) - -// GetDocument retrieves a document by URI -func (ds *DocumentStore) GetDocument(uri string) (*Document, bool) -``` - -**Design Decision**: Simple in-memory map, no persistence needed. - -### 3.4 Server State - -```go -// main.go - -// Server holds the language server state -type Server struct { - glspServer *glsp.Server - scanner *Scanner - documents *DocumentStore - config *Config - logger *slog.Logger -} - -// NewServer creates and initializes the language server -func NewServer() (*Server, error) - -// Start begins serving LSP requests over stdio -func (s *Server) Start() error -``` - ---- - -## 4. 
LSP Message Flows - -### 4.1 Initialize Sequence - -``` -Neovim gitleaks-ls - | | - |-- initialize request ----------->| - | |-- Load config (.gitleaks.toml) - | |-- Initialize scanner - | |-- Setup document store - | | - |<- initialize result -------------| - | (capabilities) | - | | - |-- initialized notification ----->| - | |-- Ready to serve -``` - -**Server Capabilities** (Phase 1): -```go -capabilities := protocol.ServerCapabilities{ - TextDocumentSync: protocol.TextDocumentSyncOptions{ - OpenClose: true, - Change: protocol.TextDocumentSyncKindFull, - Save: &protocol.SaveOptions{IncludeText: true}, - }, -} -``` - -### 4.2 Document Open Flow - -``` -Neovim gitleaks-ls - | | - |-- textDocument/didOpen --------->| - | {uri, text, version} | - | |-- Store document - | |-- Scan content - | |-- Convert findings to diagnostics - | | - |<- textDocument/publishDiagnostics| - | {uri, diagnostics[]} | -``` - -**Handler Implementation**: -```go -func (s *Server) handleDidOpen(ctx *glsp.Context, params *protocol.DidOpenTextDocumentParams) error { - uri := params.TextDocument.URI - content := params.TextDocument.Text - version := params.TextDocument.Version - - // Store document - s.documents.UpdateDocument(uri, version, content) - - // Scan for secrets - findings, err := s.scanner.ScanContent(ctx, uri, content) - if err != nil { - s.logger.Error("scan failed", "uri", uri, "error", err) - return nil // Don't propagate error to client - } - - // Convert to diagnostics - diagnostics := FindingsToDiagnostics(findings) - - // Publish - s.glspServer.PublishDiagnostics(uri, diagnostics) - - return nil -} -``` - -### 4.3 Document Change Flow (Phase 1: Simple) - -``` -Neovim gitleaks-ls - | | - |-- textDocument/didChange ------->| - | {uri, text, version} | - | |-- Update document - | |-- (No scan - wait for save) -``` - -**Phase 1 Decision**: Only scan on save, not on change. Simpler, sufficient. 
- -### 4.4 Document Save Flow - -``` -Neovim gitleaks-ls - | | - |-- textDocument/didSave --------->| - | {uri, text} | - | |-- Scan content - | |-- Publish diagnostics -``` - -### 4.5 Hover Flow (Phase 2) - -``` -Neovim gitleaks-ls - | | - |-- textDocument/hover ----------->| - | {uri, position} | - | |-- Find diagnostic at position - | |-- Format finding details - | | - |<- hover result ------------------| - | {markdown content} | -``` - -**Hover Content Example**: -```markdown -### Detected Secret: AWS Access Key - -**Rule ID**: `aws-access-key` -**Entropy**: 4.2 -**Confidence**: High - -This pattern matches AWS access keys. - -**Recommendation**: -- Store credentials in environment variables -- Use AWS IAM roles instead of hardcoded keys -- Add to `.gitleaksignore` if this is a false positive - -**To ignore**: Add `// gitleaks:allow` on the line above -``` - ---- - -## 5. Configuration Management - -### 5.1 Config Type - -```go -// config.go - -import ( - "github.com/fsnotify/fsnotify" - "github.com/gitleaks/gitleaks/v8/config" -) - -// Config manages gitleaks configuration -type Config struct { - path string - config *config.ViperConfig - watcher *fsnotify.Watcher - onReload func() // Callback when config changes -} - -// NewConfig loads config from path or uses defaults -func NewConfig(workspaceRoot string) (*Config, error) { - // Look for .gitleaks.toml in workspace root - // If not found, use gitleaks default config -} - -// Watch starts watching the config file for changes -func (c *Config) Watch(ctx context.Context) error - -// GetConfig returns the current gitleaks config -func (c *Config) GetConfig() *config.ViperConfig -``` - -**Implementation**: -1. On startup, search for `.gitleaks.toml` in workspace root -2. If found, load it; otherwise use `config.DefaultConfig()` -3. Start fsnotify watcher on the config file -4. On file change event, reload config and clear cache - -### 5.2 Config File Search Order - -1. `.gitleaks.toml` in workspace root -2. 
Fall back to gitleaks default config - -**No cascading search**: Keep it simple, just check workspace root. - ---- - -## 6. Caching Strategy (Phase 2) - -### 6.1 Cache Type - -```go -// cache.go - -import ( - "crypto/sha256" - "sync" -) - -// Cache stores scan results keyed by content hash -type Cache struct { - mu sync.RWMutex - entries map[[32]byte][]Finding // hash -> findings -} - -// NewCache creates a new result cache -func NewCache() *Cache - -// Get retrieves cached findings for content -func (c *Cache) Get(content string) ([]Finding, bool) { - hash := sha256.Sum256([]byte(content)) - c.mu.RLock() - defer c.mu.RUnlock() - findings, ok := c.entries[hash] - return findings, ok -} - -// Put stores findings for content -func (c *Cache) Put(content string, findings []Finding) { - hash := sha256.Sum256([]byte(content)) - c.mu.Lock() - defer c.mu.Unlock() - c.entries[hash] = findings -} - -// Clear empties the cache (e.g., on config reload) -func (c *Cache) Clear() -``` - -**Cache Invalidation**: -- On config file change: clear entire cache -- No TTL: content hash is sufficient -- No size limit in Phase 2 (assume reasonable workspace size) - -**Performance Target**: Cache hit should be <1ms vs ~50ms for full scan. - ---- - -## 7. Error Handling Strategy - -### 7.1 Error Categories - -1. **Initialization Errors** (fatal): Can't start server - - Invalid config file - - Can't initialize gitleaks detector - - **Action**: Log and exit with error code - -2. **Scan Errors** (recoverable): Problem scanning a file - - File too large - - Invalid content - - Gitleaks detector error - - **Action**: Log error, publish empty diagnostics, continue - -3. 
**LSP Protocol Errors** (recoverable): Malformed request - - Invalid parameters - - Unknown document URI - - **Action**: Log warning, return error to client - -### 7.2 Error Logging Pattern - -```go -// All errors logged with structured logging -slog.Error("scan failed", - "uri", uri, - "error", err, - "duration_ms", elapsed.Milliseconds()) - -// Non-critical issues as warnings -slog.Warn("config file not found, using defaults", - "path", configPath) - -// Important events as info -slog.Info("scanner initialized", - "config", configPath, - "rules", len(rules)) -``` - -### 7.3 Graceful Degradation - -```go -// Example: Handle large files gracefully -func (s *Scanner) ScanContent(ctx context.Context, filename, content string) ([]Finding, error) { - const maxSize = 1_000_000 // 1MB - - if len(content) > maxSize { - slog.Warn("file too large, skipping scan", - "filename", filename, - "size", len(content)) - return nil, nil // Return empty findings, not error - } - - // ... continue with scan -} -``` - ---- - -## 8. Concurrency Model - -### 8.1 Threading Model (Phase 1) - -**Simple approach**: Handle each LSP request synchronously -- glsp handles concurrent requests via goroutines -- Our handlers run sequentially per document -- No explicit goroutine management needed - -**Why**: Scanning is fast enough (<50ms target), no need for async. - -### 8.2 Synchronization Points - -1. **DocumentStore**: RWMutex for concurrent access - ```go - type DocumentStore struct { - mu sync.RWMutex - documents map[string]*Document - } - ``` - -2. **Cache** (Phase 2): RWMutex for concurrent reads - ```go - type Cache struct { - mu sync.RWMutex - entries map[[32]byte][]Finding - } - ``` - -3. 
**Config Watcher**: Single goroutine, uses channel for events - ```go - go func() { - for { - select { - case event := <-watcher.Events: - handleConfigChange(event) - case <-ctx.Done(): - return - } - } - }() - ``` - -### 8.3 Concurrency (Phase 3: Workspace Scan) - -```go -// Scan multiple files in parallel -func (s *Server) ScanWorkspace(ctx context.Context, files []string) error { - var wg sync.WaitGroup - sem := make(chan struct{}, 10) // Limit to 10 concurrent scans - - for _, file := range files { - wg.Add(1) - go func(f string) { - defer wg.Done() - sem <- struct{}{} // Acquire - defer func() { <-sem }() // Release - - s.scanFile(ctx, f) - }(file) - } - - wg.Wait() - return nil -} -``` - ---- - -## 9. Testing Strategy - -### 9.1 Unit Tests - -**scanner_test.go**: -```go -func TestScanner_DetectsAWSKey(t *testing.T) { - scanner, err := NewScanner("") - require.NoError(t, err) - - content := ` - package main - const key = "AKIAIOSFODNN7EXAMPLE" - ` - - findings, err := scanner.ScanContent(context.Background(), "test.go", content) - require.NoError(t, err) - assert.Len(t, findings, 1) - assert.Equal(t, "aws-access-key", findings[0].RuleID) -} - -func TestScanner_HandlesLargeFile(t *testing.T) { - scanner, err := NewScanner("") - require.NoError(t, err) - - // 2MB file - content := strings.Repeat("x", 2_000_000) - - findings, err := scanner.ScanContent(context.Background(), "large.txt", content) - require.NoError(t, err) - assert.Empty(t, findings) // Should skip, not error -} -``` - -**diagnostics_test.go**: -```go -func TestFindingToDiagnostic(t *testing.T) { - finding := Finding{ - RuleID: "test-rule", - Description: "Test Secret", - StartLine: 10, - StartColumn: 5, - EndLine: 10, - EndColumn: 20, - } - - diag := FindingToDiagnostic(finding) - - assert.Equal(t, uint32(9), diag.Range.Start.Line) // 0-indexed - assert.Equal(t, uint32(4), diag.Range.Start.Character) - assert.Equal(t, "gitleaks", diag.Source) -} -``` - -### 9.2 Integration Tests - 
-**integration_test.go**: -```go -func TestLSPIntegration(t *testing.T) { - // Start server - server := startTestServer(t) - defer server.Stop() - - // Send initialize - resp := server.SendRequest("initialize", initParams) - assert.NotNil(t, resp.Result) - - // Open document with secret - server.SendNotification("textDocument/didOpen", didOpenParams) - - // Expect publishDiagnostics - diag := waitForDiagnostics(t, server, 1*time.Second) - assert.Len(t, diag.Diagnostics, 1) - assert.Contains(t, diag.Diagnostics[0].Message, "AWS") -} -``` - -### 9.3 Benchmark Tests - -**scanner_bench_test.go**: -```go -func BenchmarkScanner_SmallFile(b *testing.B) { - scanner, _ := NewScanner("") - content := readTestFile("small.go") // ~100 lines - - b.ResetTimer() - for i := 0; i < b.N; i++ { - scanner.ScanContent(context.Background(), "test.go", content) - } -} - -func BenchmarkScanner_MediumFile(b *testing.B) { - scanner, _ := NewScanner("") - content := readTestFile("medium.go") // ~1000 lines - - b.ResetTimer() - for i := 0; i < b.N; i++ { - scanner.ScanContent(context.Background(), "test.go", content) - } -} - -func BenchmarkCache_Hit(b *testing.B) { - cache := NewCache() - content := "test content" - cache.Put(content, []Finding{}) - - b.ResetTimer() - for i := 0; i < b.N; i++ { - cache.Get(content) - } -} -``` - -**Performance Targets**: -- `BenchmarkScanner_SmallFile`: <10ms per operation -- `BenchmarkScanner_MediumFile`: <50ms per operation -- `BenchmarkCache_Hit`: <1µs per operation - ---- - -## 10. 
Dependencies and Their Usage - -### 10.1 glsp (LSP Server) - -```go -import ( - "github.com/tliron/glsp" - protocol "github.com/tliron/glsp/protocol_3_16" - "github.com/tliron/glsp/server" -) - -// Create server -handler := protocol.Handler{ - Initialize: handleInitialize, - TextDocumentDidOpen: handleDidOpen, - TextDocumentDidChange: handleDidChange, - TextDocumentDidSave: handleDidSave, -} - -glspServer := server.NewServer(&handler, "gitleaks-ls", false) -glspServer.RunStdio() -``` - -**Key APIs Used**: -- `server.NewServer()` - Create LSP server -- `server.RunStdio()` - Start serving over stdio -- `server.PublishDiagnostics()` - Send diagnostics to client - -### 10.2 gitleaks/v8 (Detection Engine) - -```go -import ( - "github.com/gitleaks/gitleaks/v8/config" - "github.com/gitleaks/gitleaks/v8/detect" - "github.com/gitleaks/gitleaks/v8/report" -) - -// Load config -cfg, err := config.NewConfig("path/to/.gitleaks.toml") -if err != nil { - cfg = config.DefaultConfig() // Use defaults -} - -// Create detector -detector := detect.NewDetector(cfg) - -// Scan content -fragment := detect.Fragment{ - Raw: content, - FilePath: filename, -} - -findings := detector.DetectString(fragment) -``` - -**Key Types**: -- `config.ViperConfig` - Configuration -- `detect.Detector` - Detection engine -- `report.Finding` - Detection result - -### 10.3 fsnotify (File Watching) - -```go -import "github.com/fsnotify/fsnotify" - -watcher, err := fsnotify.NewWatcher() -watcher.Add(configPath) - -go func() { - for { - select { - case event := <-watcher.Events: - if event.Op&fsnotify.Write == fsnotify.Write { - reloadConfig() - } - case err := <-watcher.Errors: - log.Error("watcher error", err) - } - } -}() -``` - -### 10.4 slog (Logging) - -```go -import "log/slog" - -// Setup in main() -logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{ - Level: slog.LevelInfo, -})) -slog.SetDefault(logger) - -// Use throughout code -slog.Info("server started", "version", version) 
-slog.Error("scan failed", "uri", uri, "error", err) -slog.Debug("cache hit", "hash", hash) -``` - ---- - -## 11. Build and Deployment - -### 11.1 Build Configuration - -**Makefile**: -```makefile -.PHONY: build test bench clean - -build: - go build -o gitleaks-ls main.go - -test: - go test -v ./... - -bench: - go test -bench=. -benchmem ./... - -coverage: - go test -coverprofile=coverage.out ./... - go tool cover -html=coverage.out - -lint: - golangci-lint run - -clean: - rm -f gitleaks-ls coverage.out -``` - -### 11.2 Installation - -**For Development**: -```bash -go build -o gitleaks-ls -sudo mv gitleaks-ls /usr/local/bin/ -``` - -**For Users** (Phase 4): -```bash -go install github.com/user/gitleaks-ls@latest -``` - ---- - -## 12. Performance Budgets - -### 12.1 Latency Targets - -| Operation | Target | Measurement | -|-----------|--------|-------------| -| Scan small file (<100 lines) | <10ms | p95 | -| Scan medium file (<1000 lines) | <50ms | p95 | -| Scan large file (<10k lines) | <200ms | p95 | -| Cache hit | <1ms | p95 | -| Diagnostic publish | <5ms | p95 | -| Server startup | <500ms | max | - -### 12.2 Memory Targets - -| Component | Target | Max | -|-----------|--------|-----| -| Server baseline | <10MB | 20MB | -| Per document | <1MB | 5MB | -| Cache (100 files) | <20MB | 50MB | -| Total (typical workspace) | <50MB | 100MB | - -### 12.3 CPU Targets - -| Scenario | Target | Max | -|----------|--------|-----| -| Idle | <1% | 2% | -| Active scanning | <20% | 50% | -| Workspace scan (100 files) | <5 seconds | 10 seconds | - ---- - -## 13. Open Technical Questions - -### 13.1 Resolved for Phase 1 - -✅ **Q**: Use glsp or custom LSP implementation? -**A**: Use glsp - mature, well-documented, actively maintained. - -✅ **Q**: Store documents in memory or read from disk? -**A**: In memory - LSP provides content via didOpen/didChange. - -✅ **Q**: Scan on every keystroke or only on save? -**A**: Only on save for Phase 1. Add debounced onChange in Phase 2 if needed. 
- -✅ **Q**: How to handle gitleaks configuration? -**A**: Auto-detect .gitleaks.toml in workspace root, fall back to defaults. - -### 13.2 Resolved in Phase 2-3 - -✅ **Q**: Should cache be persisted to disk? -**A**: No, in-memory content-hash cache is sufficient and fast. - -✅ **Q**: Should we support custom ignore patterns beyond .gitleaksignore? -**A**: Yes, workspace scanning respects .gitignore patterns. - -### 13.3 To Be Decided in Phase 4+ - -❓ **Q**: How to handle multiple workspace folders? -**Proposal**: LSP provides rootUri, use that. Multi-folder support if needed. - ---- - -## 14. Implementation Checklist - -### Phase 1: Core LSP -- [x] Initialize go.mod with dependencies -- [x] Create file structure (main.go, scanner.go, etc.) -- [x] Implement scanner.go (gitleaks wrapper) -- [x] Implement diagnostics.go with line/column conversion -- [x] Implement main.go (glsp initialization) -- [x] Add didOpen/didChange/didSave handlers -- [x] Implement config.go with .gitleaks.toml support - -### Phase 2: Enhanced Features -- [x] Hover provider with finding details -- [x] Code actions for 40+ languages -- [x] Content-hash caching -- [x] .gitleaksignore support - -### Phase 3: Performance -- [x] Benchmark suite -- [x] Workspace scanning with parallel execution -- [x] .gitignore support for workspace scan -- [x] File size limits and binary file detection - -### Phase 4: Polish & Stability -- [x] Integration tests -- [x] Memory leak testing (automated stress test) -- [x] Enhanced error handling with context -- [x] Structured logging improvements -- [x] Final documentation review - -### Phase 5: CI/CD & Enhancements -- [x] GitHub Actions CI (test matrix, lint, build) -- [x] Release workflow with cross-platform binaries -- [x] Coverage threshold (70% minimum) -- [x] Benchmark CI job -- [x] Progress reporting for workspace scans -- [x] Configurable diagnostic severity -- [x] golangci-lint configuration - ---- - -**Document Version**: 1.4 -**Last Updated**: 2025-12-05 
diff --git a/LESSONS_LEARNED.md b/LESSONS_LEARNED.md deleted file mode 100644 index 8ed59a4..0000000 --- a/LESSONS_LEARNED.md +++ /dev/null @@ -1,92 +0,0 @@ -# AI Assistant Context - Gitleaks Language Server - -**Purpose:** Prevent repeated mistakes and reduce AI exploration paths. - ---- - -## Critical: Library Gotchas - -### Gitleaks v8 Import Path -```go -// CORRECT - use zricethezav, not gitleaks -import "github.com/zricethezav/gitleaks/v8/config" -import "github.com/zricethezav/gitleaks/v8/detect" -``` -The module redirects from `gitleaks/gitleaks` but declares itself as `zricethezav/gitleaks`. - -### Gitleaks Config Loading (Viper Required) -```go -v := viper.New() -v.SetConfigType("toml") -v.SetConfigFile(path) // or v.ReadConfig(strings.NewReader(config.DefaultConfig)) -v.ReadInConfig() -var vc config.ViperConfig -v.Unmarshal(&vc) -cfg, _ := vc.Translate() // This is the only way to get config.Config -``` -**There is no `config.NewConfig()` function.** - -### Gitleaks Scanning -```go -fragment := detect.Fragment{Raw: content, FilePath: filename} -findings := detector.Detect(fragment) -``` -Note: `detect.Fragment` is deprecated (v8), will be `sources.Fragment` in v9. Suppress with golangci-lint. 
- -### glsp LSP Types - Pointers Required -```go -severity := protocol.DiagnosticSeverityWarning -diag.Severity = &severity // Must be pointer - -source := "gitleaks" -diag.Source = &source // Must be pointer - -// WorkDoneProgressBegin.Cancellable is *bool - omit if false -``` - ---- - -## Valid Test Secrets - -**AWS Access Key:** `AKIATESTKEYEXAMPLE7A` -- Must be: `AKIA` + 16 chars from `[A-Z2-7]` (Base32) -- Avoid: `AKIAIOSFODNN7EXAMPLE` (AWS's documentation sample; `O` and `I` are valid in `[A-Z2-7]`, so a miss is more likely due to an allowlist for keys ending in `EXAMPLE`; confirm with the CLI) - -**GitHub PAT:** `ghp_1234567890abcdefghijklmnopqrstuvwxyz` -- Must be: `ghp_` + exactly 36 alphanumeric chars - -**Validate with CLI:** -```bash -echo 'key = "AKIATESTKEYEXAMPLE7A"' | gitleaks detect --no-git --source=/dev/stdin -``` - ---- - -## LSP Indexing - -| Source | Lines | Columns | -|--------|-------|---------| -| LSP | 0-indexed | 0-indexed | -| Gitleaks | 1-indexed | varies by line | - -Conversion in `diagnostics.go:adjustColumn()` handles the off-by-one differences. - ---- - -## Cross-Platform URIs - -```go -// Windows: file:///C:/path → C:\path -// Unix: file:///path → /path -// See uri.go for implementation -``` - ---- - -## Common Mistakes to Avoid - -1. **Invalid test secrets** - Always validate with `gitleaks detect` CLI first -2. **Wrong import path** - Use `zricethezav`, not `gitleaks` -3. **Missing pointers** - LSP types need `&value` for optional fields -4. **Unchecked type assertions** - Use `val, ok := x.(Type)` pattern -5. **Platform-specific paths** - Use `uri.go` functions, not string manipulation diff --git a/PRD.md b/PRD.md deleted file mode 100644 index cb7542a..0000000 --- a/PRD.md +++ /dev/null @@ -1,1084 +0,0 @@ -# Product Requirements Document: Gitleaks Language Server - -## 1. Executive Summary - -### 1.1 Product Vision -Create a Language Server Protocol (LSP) implementation for Gitleaks that provides real-time secret detection capabilities directly within code editors and IDEs, enabling developers to identify and prevent secret leaks before committing code. 
- -### 1.2 Problem Statement -Developers often accidentally commit sensitive information (API keys, passwords, tokens, credentials) to version control systems. Current solutions like pre-commit hooks and CI/CD scanning catch secrets too late in the development workflow, after code has been written and staged. This leads to: -- Security incidents requiring secret rotation -- Delayed development cycles when secrets are caught in CI -- Increased cognitive load from context switching -- Potential exposure of secrets in git history - -### 1.3 Solution -A gitleaks-powered language server that integrates seamlessly with any LSP-compatible editor (VS Code, Neovim, Emacs, Sublime Text, etc.) to provide: -- Real-time secret detection as developers type -- Inline diagnostics and warnings -- Quick fixes and remediation suggestions -- Context-aware intelligence about detected secrets - -## 2. Goals and Objectives - -### 2.1 Primary Goals -1. **Shift-Left Security**: Catch secrets at write-time, not commit-time -2. **Developer Experience**: Provide frictionless, non-intrusive security feedback -3. **Universal Compatibility**: Support all LSP-compatible editors and IDEs -4. **Performance**: Maintain sub-100ms response time for real-time feedback -5. 
**Accuracy**: Leverage gitleaks' proven detection engine with minimal false positives - -### 2.2 Success Metrics -**Performance is the primary success metric for this internal/prototype phase:** -- **Response Time**: <100ms for 95th percentile diagnostic requests -- **Scan Time**: <50ms for typical files (<1000 lines) -- **Memory Usage**: <50MB for typical workspaces (10-100 files) -- **CPU Usage**: <5% average CPU during idle, <20% during active scanning -- **Startup Time**: <500ms language server initialization -- **Throughput**: Handle 100+ file scans per second for workspace scanning - -### 2.3 Non-Goals (Out of Scope for Prototype) -- Secret management or rotation capabilities -- Automated secret remediation -- Cloud-based secret scanning services -- Git history scanning (use gitleaks CLI instead) -- Custom rule creation UI (use gitleaks config files) -- Editor extensions/plugins (raw LSP only) -- Multi-user or enterprise features -- Public distribution or marketplace publishing -- User documentation or marketing materials - -## 3. 
User Personas - -### 3.1 Primary Persona: Security-Conscious Developer -- **Background**: Software engineer working on cloud-native applications -- **Pain Points**: Accidentally committed secrets, slow CI feedback loops -- **Needs**: Immediate feedback, minimal disruption to workflow -- **Technical Level**: Comfortable with CLI tools and editor configuration - -### 3.2 Secondary Persona: Security/DevSecOps Engineer -- **Background**: Responsible for implementing security controls across teams -- **Pain Points**: Difficulty enforcing security practices, lack of visibility -- **Needs**: Centralized configuration, metrics, policy enforcement -- **Technical Level**: Advanced technical knowledge, infrastructure experience - -### 3.3 Tertiary Persona: Junior Developer -- **Background**: New to secure coding practices -- **Pain Points**: Unaware of security risks, unclear about what constitutes a secret -- **Needs**: Educational feedback, clear guidance on remediation -- **Technical Level**: Basic development skills, learning security concepts - -## 4. Core Features - -### 4.1 Real-Time Secret Detection (P0) -**Description**: Scan file contents as the user types and provide immediate feedback. - -**Requirements**: -- Scan on file open, edit, and save events -- Support all file types and languages -- Use gitleaks detection engine and default rule set -- Debounce scanning to avoid performance issues (300ms default) -- Cache scan results to minimize redundant work - -**User Stories**: -- As a developer, I want to see warnings when I type a secret so I can correct it immediately -- As a developer, I want scanning to feel instant so my workflow isn't disrupted - -### 4.2 Inline Diagnostics (P0) -**Description**: Display detected secrets as editor diagnostics with severity levels. 
- -**Requirements**: -- Show diagnostics as warnings or errors (configurable) -- Underline the exact secret location in the editor -- Support diagnostic ranges that span multiple lines if needed -- Clear diagnostics when secrets are removed -- Respect `.gitleaksignore` files and `gitleaks:allow` comments - -**Diagnostic Information**: -- Rule ID and description -- Secret type (e.g., "AWS Access Key", "GitHub Token") -- Line and column numbers -- Entropy score (if applicable) -- Confidence level - -**User Stories**: -- As a developer, I want to see exactly where the secret is so I can fix it quickly -- As a developer, I want to understand what type of secret was detected - -### 4.3 Hover Documentation (P0) -**Description**: Provide detailed information when hovering over a detected secret. - -**Requirements**: -- Show full rule details including pattern -- Display remediation guidance -- Include links to documentation -- Show how to suppress false positives -- Provide entropy analysis details - -**User Stories**: -- As a developer, I want to understand why something was flagged as I hover over it -- As a developer, I want quick access to documentation about the detected secret type - -### 4.4 Code Actions / Quick Fixes (P1) -**Description**: Offer actionable fixes for detected secrets. - -**Requirements**: -- "Ignore this occurrence" - add `gitleaks:allow` comment -- "Add to .gitleaksignore" - append fingerprint to ignore file -- "View rule details" - open documentation -- "Copy fingerprint" - copy to clipboard for allowlisting - -**User Stories**: -- As a developer, I want to quickly suppress false positives without leaving my editor -- As a developer, I want to easily allowlist test/example secrets - -### 4.5 Configuration Support (P0) -**Description**: Support gitleaks configuration files and custom rules. 
- -**Requirements**: -- Auto-detect `.gitleaks.toml` in workspace root -- Support custom config path via LSP settings -- Hot-reload configuration on file changes -- Fallback to gitleaks default config if none found -- Validate configuration and show errors - -**Configuration Options**: -```json -{ - "gitleaks-ls.configPath": "", - "gitleaks-ls.enabled": true, - "gitleaks-ls.severity": "warning", - "gitleaks-ls.scanMode": "onChange", - "gitleaks-ls.debounceMs": 300, - "gitleaks-ls.maxFileSizeMB": 1, - "gitleaks-ls.enabledRules": [], - "gitleaks-ls.disabledRules": [], - "gitleaks-ls.logLevel": "info" -} -``` - -**User Stories**: -- As a security engineer, I want to enforce custom detection rules across my team -- As a developer, I want the language server to respect my project's gitleaks config - -### 4.6 Workspace Scanning (P1) -**Description**: Scan entire workspace/project on demand. - -**Requirements**: -- Command to scan all files in workspace -- Progress reporting for large workspaces -- Generate summary report of findings -- Option to scan only open files vs. all files -- Exclude files based on `.gitignore` patterns - -**User Stories**: -- As a developer, I want to scan my entire project before committing -- As a security engineer, I want to audit a new codebase for existing secrets - -### 4.7 Ignore File Support (P0) -**Description**: Respect `.gitleaksignore` files for suppressing false positives. - -**Requirements**: -- Auto-detect `.gitleaksignore` in workspace root -- Support fingerprint-based ignoring -- Support path-based ignoring -- Hot-reload on ignore file changes -- Provide command to add current finding to ignore file - -**User Stories**: -- As a developer, I want to maintain a list of known false positives -- As a team, we want to share approved exceptions via version control - -## 5. Technical Architecture - -### 5.1 Technology Stack -**Core Principle**: Use modern, popular, well-maintained libraries. Keep code simple. 
- -- **Language**: Go 1.21+ (for consistency with gitleaks, excellent performance) -- **LSP Library**: `github.com/tliron/glsp` (most popular, actively maintained, simple API) - - Alternative: `gopls` libraries if needed for specific features -- **JSON-RPC**: Built into glsp (handles LSP transport layer) -- **Detection Engine**: `github.com/gitleaks/gitleaks/v8` (import as library) -- **Configuration**: Use gitleaks' native TOML parser (`github.com/pelletier/go-toml/v2`) -- **File Watching**: `github.com/fsnotify/fsnotify` (standard library for file events) -- **Logging**: `log/slog` (Go 1.21+ standard library, structured logging) -- **Testing**: Go standard testing + `github.com/stretchr/testify` (assertions) -- **Benchmarking**: Go standard `testing.B` (built-in benchmarking) - -### 5.2 System Components - -**Design Principle**: Keep it simple. Avoid over-engineering. Use standard patterns. - -``` -┌─────────────────────────────────────────────────────┐ -│ Neovim LSP Client │ -└───────────────────┬─────────────────────────────────┘ - │ LSP Protocol (JSON-RPC over stdio) - │ -┌───────────────────▼─────────────────────────────────┐ -│ Gitleaks Language Server │ -│ │ -│ main.go - Entry point, glsp server setup │ -│ │ -│ handlers.go - LSP request handlers │ -│ • didOpen/didChange/didSave → trigger scan │ -│ • hover → return diagnostic details │ -│ • codeAction → return ignore actions │ -│ │ -│ scanner.go - Gitleaks integration (simple wrapper) │ -│ • ScanFile(content) → []Finding │ -│ • LoadConfig() → use gitleaks defaults │ -│ │ -│ cache.go - Simple in-memory cache │ -│ • map[fileURI][]Diagnostic (that's it) │ -│ │ -│ config.go - Configuration loader │ -│ • Find .gitleaks.toml in workspace │ -│ • Watch for changes with fsnotify │ -│ │ -└─────────────────────────────────────────────────────┘ -``` - -**File Structure** (keep it flat, no deep nesting): -``` -gitleaks-ls/ -├── main.go # Entry point, server initialization -├── handlers.go # LSP message handlers -├── 
scanner.go # Gitleaks wrapper -├── cache.go # Simple caching -├── config.go # Config loading/watching -├── diagnostics.go # Convert findings to LSP diagnostics -├── go.mod # Dependencies -├── go.sum -├── README.md -└── *_test.go # Tests alongside source files -``` - -### 5.3 Performance Considerations -**Keep It Simple First, Optimize Later** - -**Phase 1 - Simple approach**: -- Scan entire file on every change (with 300ms debounce) -- Simple map-based cache: `map[uri][]Diagnostic` -- No fancy algorithms - just fast regex matching via gitleaks - -**Phase 2 - Add caching**: -- Cache by content hash: `hash(fileContent) → []Finding` -- Skip scan if hash hasn't changed - -**Phase 3 - Optimize if needed**: -- Parallel workspace scanning (simple `sync.WaitGroup`) -- File size limits (skip if >1MB) -- Consider incremental scanning only if performance targets not met - -**Avoid Premature Optimization**: -- No complex data structures -- No custom scheduling/queuing -- No sophisticated debouncing (use `time.AfterFunc`) -- Rely on gitleaks' optimized regex engine - -### 5.4 Supported LSP Features - -| Feature | Priority | Status | -|---------|----------|--------| -| textDocument/publishDiagnostics | P0 | Required | -| textDocument/didOpen | P0 | Required | -| textDocument/didChange | P0 | Required | -| textDocument/didSave | P0 | Required | -| textDocument/didClose | P1 | Nice-to-have | -| textDocument/hover | P0 | Required | -| textDocument/codeAction | P1 | Required | -| workspace/didChangeConfiguration | P1 | Required | -| workspace/executeCommand | P1 | Optional | -| $/cancelRequest | P2 | Optional | - -## 6. User Experience - -### 6.1 Installation & Setup -1. Build language server binary: `go build -o gitleaks-ls` -2. Add to PATH or note binary location -3. Configure Neovim LSP client in `init.lua`: -```lua -vim.lsp.start({ - name = 'gitleaks-ls', - cmd = {'/path/to/gitleaks-ls'}, - root_dir = vim.fs.dirname(vim.fs.find({'.git'}, { upward = true })[1]), -}) -``` -4. 
Optional: Add `.gitleaks.toml` to project root for custom rules - -### 6.2 Typical Workflow -1. Developer opens file in editor -2. Language server scans file and shows diagnostics -3. Developer hovers over warning to see details -4. Developer either: - - Removes the secret - - Replaces with environment variable - - Adds `gitleaks:allow` comment for false positives - - Adds to `.gitleaksignore` for persistent exceptions -5. Diagnostics clear when issue is resolved - -### 6.3 Example Diagnostic Output -``` -Warning: Detected AWS Access Key [gitleaks:aws-access-key] -Line 42: aws_key = "AKIAIOSFODNN7EXAMPLE" - ^^^^^^^^^^^^^^^^^^^^^^^^ -Entropy: 4.2 | Confidence: High - -🛠️ Quick Fixes: - • Ignore this occurrence (add gitleaks:allow comment) - • Add to .gitleaksignore - • Learn more about AWS secret management - -💡 Recommendation: Store credentials in environment variables or use AWS IAM roles. -``` - -## 7. Editor Integration - -### 7.1 Neovim (Raw LSP Only) -**Target**: Neovim with native LSP client (`:h lsp`) - -**Setup Method**: Manual configuration via `vim.lsp.start()` or `lspconfig` -- No custom plugin required -- Use Neovim's built-in LSP client -- Standard LSP protocol only -- Configuration via `init.lua` - -**Example Configuration**: -```lua -vim.lsp.start({ - name = 'gitleaks-ls', - cmd = {'gitleaks-ls'}, - root_dir = vim.fs.dirname(vim.fs.find({'.git'}, { upward = true })[1]), -}) -``` - -### 7.2 Future Editors (Post-Prototype) -- VS Code extension -- Emacs (lsp-mode) -- Other LSP-compatible editors -- Custom plugins and enhanced integrations - -## 8. 
Configuration Examples - -### 8.1 Basic Editor Configuration (VS Code) -```json -{ - "gitleaks-ls.enabled": true, - "gitleaks-ls.severity": "warning", - "gitleaks-ls.scanMode": "onChange" -} -``` - -### 8.2 Custom Rules Configuration (`.gitleaks.toml`) -```toml -[extend] -useDefault = true - -[[rules]] -id = "custom-api-key" -description = "Custom API Key Pattern" -regex = '''(?i)custom[_-]?api[_-]?key[:\s=]+['"]?([a-zA-Z0-9]{32})['"]?''' -keywords = ["custom_api_key"] -``` - -### 8.3 Ignore Configuration (`.gitleaksignore`) -``` -# Example test credentials -abc123def456:src/tests/fixtures/credentials.txt:test-api-key:10 - -# False positive in documentation -**/docs/examples/** -``` - -## 9. Security & Privacy - -### 9.1 Data Handling -- **No External Communication**: All scanning happens locally -- **No Telemetry by Default**: Optional, opt-in anonymous usage stats -- **No Secret Transmission**: Detected secrets never leave the user's machine -- **Secure Logging**: Redact secrets from logs by default - -### 9.2 Performance & Resource Usage -- **Memory**: Target <100MB for typical workspaces -- **CPU**: Background scanning with low priority threads -- **Disk**: Minimal disk I/O, read-only access to workspace files -- **Network**: None (except for optional update checks) - -## 10. 
Testing Strategy - -### 10.1 Unit Tests -- LSP message handling (request/response parsing) -- Configuration parsing -- Ignore file handling -- Cache invalidation logic -- Diagnostic creation and formatting - -### 10.2 Integration Tests -- Basic LSP protocol compliance (initialize, shutdown) -- File scanning with gitleaks library -- Configuration hot-reload -- Memory leak detection (long-running sessions) - -### 10.3 Performance Tests (CRITICAL) -- Benchmark suite for all performance metrics -- Load testing (100+ files) -- Stress testing (large files, rapid edits) -- Memory profiling -- CPU profiling -- Response time percentiles (p50, p95, p99) - -### 10.4 Test Coverage Goals -- LSP handlers: >80% -- Core scanning logic: >75% -- Configuration: >70% -- Overall: >75% (focus on correctness over coverage) - -## 11. Development Plan - -### 11.1 Phase 1: Core LSP (v0.1.0) - Week 1-2 -**Focus**: Simplest possible working implementation - -**Files to Create** (~500 lines total): -- `main.go`: glsp server setup, stdio transport (50 lines) -- `handlers.go`: didOpen, didChange, didSave handlers (100 lines) -- `scanner.go`: Call gitleaks library, return findings (50 lines) -- `diagnostics.go`: Convert gitleaks findings to LSP diagnostics (50 lines) -- `config.go`: Load .gitleaks.toml if exists, else defaults (50 lines) - -**Key Implementation Details**: -- Use glsp's `NewServer()` with stdio protocol -- Scan on didOpen/didSave immediately (no debouncing yet) -- Store diagnostics in simple `map[string][]protocol.Diagnostic` -- Publish diagnostics with `server.PublishDiagnostics()` - -**Success Criteria**: Can open file in Neovim and see secret warnings - -### 11.2 Phase 2: Enhanced Features (v0.2.0) - Week 3-4 -**Focus**: Essential UX improvements - -**Add** (~300 lines): -- `hover.go`: Return finding details on hover (50 lines) -- `actions.go`: Code actions for adding `gitleaks:allow` comment (80 lines) -- `cache.go`: Simple hash-based cache `map[hash][]Finding` (50 lines) -- 
`debounce.go`: Simple debouncing with `time.AfterFunc` (30 lines) -- `.gitleaksignore` support in scanner.go (50 lines) - -**Keep It Simple**: -- Hover: just format the finding as markdown -- Code actions: single action to insert `// gitleaks:allow` on previous line -- Cache: hash file content, store results, check before scanning -- Debounce: cancel previous timer, start new one - -**Success Criteria**: Productive workflow for handling secrets and false positives - -### 11.3 Phase 3: Performance Optimization (v0.3.0) - Week 5-6 -**Focus**: Measure first, optimize what matters - -**Add** (~200 lines): -- `bench_test.go`: Comprehensive benchmarks (100 lines) -- `workspace.go`: Parallel workspace scanning with WaitGroup (50 lines) -- Profile-guided optimizations based on benchmark results -- File size limits and early returns - -**Methodology**: -1. Write benchmarks for all operations (scan, cache, publish) -2. Run `go test -bench` and `pprof` to find bottlenecks -3. Optimize ONLY the slow paths -4. 
Keep code simple - no clever tricks without proven benefit - -**Simple Optimizations**: -- Add file size check before scanning -- Use `sync.Map` instead of mutex if contention detected -- Reuse gitleaks detector instance instead of recreating - -**Success Criteria**: All performance metrics met consistently - -### 11.4 Phase 4: Polish & Stability (v0.4.0) - Week 7-8 -**Focus**: Make it robust and debuggable - -**Add** (~300 lines): -- Proper error handling with context (wrap errors) -- Structured logging with `slog` (info, debug levels) -- `workspace/executeCommand` for manual scan -- Integration tests using real LSP messages (150 lines) -- README with setup instructions - -**Error Handling Pattern**: -```go -if err != nil { - slog.Error("failed to scan file", "uri", uri, "error", err) - return // fail gracefully, don't crash -} -``` - -**Testing Approach**: -- Unit tests for each package -- Integration test: send didOpen, verify diagnostics received -- Benchmark tests for performance validation -- Manual testing in Neovim - -**Success Criteria**: Stable for daily development use - -## 12. Documentation Requirements (Minimal for Prototype) - -### 12.1 Developer Documentation (Essential) -- Architecture overview -- Build and run instructions -- Neovim setup example (`init.lua` config) -- Performance benchmarking guide -- Debugging guide - -### 12.2 Code Documentation -- Inline code comments for complex logic -- Package-level documentation -- API documentation for key interfaces - -### 12.3 Testing Documentation -- How to run tests -- Performance test suite usage -- Test coverage expectations - -## 13. 
Success Criteria (Prototype Phase) - -### 13.1 Performance Benchmarks (PRIMARY) -**Must achieve consistently:** -- ✅ <100ms response time for 95th percentile diagnostic requests -- ✅ <50ms scan time for files <1000 lines -- ✅ <500ms language server startup time -- ✅ <50MB memory usage for typical workspace (10-100 files) -- ✅ <5% CPU usage when idle -- ✅ <20% CPU usage during active scanning -- ✅ 100+ files/second throughput for workspace scanning - -### 13.2 Functional Success -- ✅ Detects secrets in real-time (on didChange/didSave) -- ✅ Shows accurate diagnostics with proper ranges -- ✅ Hover provides useful information -- ✅ Code actions work for ignoring false positives -- ✅ Respects `.gitleaks.toml` and `.gitleaksignore` -- ✅ Works reliably in Neovim via native LSP client - -### 13.3 Stability Success -- ✅ Zero crashes during normal operation -- ✅ No memory leaks over 8+ hour sessions -- ✅ Handles large files (>10k lines) gracefully -- ✅ Recovers from configuration errors -- ✅ Clean error messages and logging - -## 14. 
Risks & Mitigations - -### 14.1 Technical Risks - -| Risk | Impact | Probability | Mitigation | -|------|--------|-------------|------------| -| Performance degrades on large files | High | Medium | Implement file size limits, incremental scanning | -| High false positive rate | High | Medium | Extensive testing, community feedback loop | -| LSP compatibility issues | Medium | Low | Follow LSP spec strictly, test with multiple editors | -| Memory leaks in long-running sessions | Medium | Low | Regular profiling, leak detection tests | -| Gitleaks library API changes | Low | Medium | Pin to stable version, abstract integration layer | - -### 14.2 Development Risks - -| Risk | Impact | Probability | Mitigation | -|------|--------|-------------|------------| -| Performance targets not achievable | High | Medium | Early prototyping, profiling, alternative algorithms | -| Gitleaks library integration issues | Medium | Medium | Test early, consider vendoring or forking | -| LSP protocol complexity | Medium | Low | Use well-tested LSP libraries, reference implementations | -| Neovim LSP client quirks | Low | Medium | Test thoroughly, consult Neovim docs/community | - -## 15. 
Dependencies - -### 15.1 External Dependencies (Minimal, Modern, Popular) - -**Runtime Dependencies** (minimum versions): -```go -require ( - github.com/zricethezav/gitleaks/v8 v8.18.0 // Detection engine (module declares itself as zricethezav) - github.com/tliron/glsp v0.2.0 // LSP server (5k+ stars) - github.com/fsnotify/fsnotify v1.7.0 // File watching (9k+ stars) - github.com/pelletier/go-toml/v2 v2.1.0 // TOML parsing (1.5k+ stars) -) -``` - -**Development Dependencies** (minimum versions): -```go -require ( - github.com/stretchr/testify v1.8.4 // Test assertions (22k+ stars) -) -``` - -**Rationale for Each Library**: -- **glsp**: Most complete Go LSP library, active maintenance, simple API -- **gitleaks/v8**: Core requirement, battle-tested, comprehensive rules -- **fsnotify**: De facto standard for file watching in Go -- **go-toml/v2**: Fast, compliant, used by gitleaks itself -- **testify**: Industry standard for Go testing - -**No External Tools Required**: -- Use Go standard library for HTTP, JSON, logging (slog) -- No build tools beyond `go build` -- No package managers beyond Go modules - -### 15.2 Development Dependencies -- **Testing**: Go standard library + testify -- **Benchmarking**: Go standard `testing.B` -- **Profiling**: Go standard `pprof` -- **CI/CD**: GitHub Actions (simple Go workflow) -- **Linting**: `golangci-lint` (standard Go linter aggregator) - -## 16. Implementation Decisions (KISS Principle) - -### Resolved for Simplicity: - -1. **Caching Strategy**: Content hash only. Simpler, no path management needed. - ```go - hash := sha256.Sum256([]byte(content)) - if cached, ok := cache[hash]; ok { return cached } - ``` - -2. **Scan Trigger**: On save only for Phase 1. Add debounced onChange in Phase 2. - - Simpler to start, sufficient for most use cases - - Add onChange if users request it - -3. **Large Files**: Hard limit 1MB. Return early with info diagnostic. - ```go - if len(content) > 1_000_000 { - return // skip silently or show "file too large" diagnostic - } - ``` - -4. 
**Configuration Hot Reload**: Automatic with fsnotify. Simple to implement. - - Watch .gitleaks.toml file - - Reload and clear cache on change - -5. **Logging**: Default INFO level, log to stderr (Neovim captures it). - ```go - slog.SetDefault(slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{ - Level: slog.LevelInfo, - }))) - ``` - -6. **Debouncing**: Simple timer pattern, no libraries needed: - ```go - var timer *time.Timer - if timer != nil { timer.Stop() } - timer = time.AfterFunc(300*time.Millisecond, func() { scan() }) - ``` - -### Open Questions for Phase 3+: -- Should we scan git-ignored files? (default: no) -- Incremental scanning worth the complexity? (decide after benchmarks) - -## 17. Appendix - -### 17.1 Competitive Analysis - -| Tool | Type | Pros | Cons | -|------|------|------|------| -| Talisman | Pre-commit hook | Mature, local scanning | Requires git setup, no IDE integration | -| GitGuardian | Cloud service | High accuracy, dashboard | Requires cloud service, privacy concerns | -| detect-secrets | CLI/Pre-commit | Baseline support | Python dependency, no real-time feedback | -| SecretLint | Pre-commit | Rule-based | Node.js dependency, limited rules | - -### 17.2 References -- [LSP Specification](https://microsoft.github.io/language-server-protocol/) -- [Gitleaks GitHub](https://github.com/gitleaks/gitleaks) -- [VS Code Extension API](https://code.visualstudio.com/api) -- [Neovim LSP Documentation](https://neovim.io/doc/user/lsp.html) - -### 17.3 Glossary -- **LSP**: Language Server Protocol - standardized protocol for editor/IDE integrations -- **Diagnostic**: Editor annotation showing warnings/errors -- **Code Action**: Quick fix or refactoring option -- **Fingerprint**: Unique identifier for a specific secret finding -- **Entropy**: Measure of randomness in a string (used for secret detection) -- **False Positive**: Non-secret incorrectly flagged as a secret - ---- - -**Document Version**: 2.1 -**Last Updated**: 2025-12-05 - ---- - 
-## 18. Next Steps: From PRD to Implementation - -### 18.1 Recommended Development Sequence - -**Step 1: Project Initialization** (30 minutes) -```bash -# Initialize Go module -go mod init github.com/yourusername/gitleaks-ls - -# Add dependencies -go get github.com/tliron/glsp@latest -go get github.com/zricethezav/gitleaks/v8@latest -go get github.com/fsnotify/fsnotify@latest -go get github.com/pelletier/go-toml/v2@latest -go get github.com/stretchr/testify@latest - -# Create project structure -mkdir -p cmd/gitleaks-ls -touch main.go handlers.go scanner.go diagnostics.go config.go -touch README.md .gitignore -``` - -**Step 2: Create Technical Design Document** (2-3 hours) -- Document LSP message flows (sequence diagrams) -- Define Go interfaces and types -- Specify gitleaks library integration points -- Design cache data structures -- Map out error handling strategy - -**Step 3: Implement Phase 1 - Core LSP** (1-2 weeks) -Follow the order that minimizes dependencies: -1. `main.go` - Basic glsp server setup, just accept connections -2. `scanner.go` - Gitleaks wrapper (can test standalone) -3. `diagnostics.go` - Convert findings to LSP format (can test standalone) -4. `handlers.go` - Wire up didOpen/didSave handlers -5. `config.go` - Load .gitleaks.toml -6. Integration testing with real Neovim - -**Step 4: Iterate on Phases 2-4** (4-6 weeks) -- Add one feature at a time -- Test each feature before moving to next -- Benchmark continuously - -### 18.2 Documents to Create Next - -#### A. Technical Design Document (TDD) -**Purpose**: Bridge PRD and code. Answers "how" not "what". - -**Contents**: -- LSP protocol flow diagrams -- Go package/type design -- Interface definitions -- Data structure specifications -- Concurrency model -- Error handling strategy - -**For Copilot CLI**: Create TDD by having CLI expand on technical sections: -``` -"Create a technical design document for the LSP handlers. 
-Include Go type definitions for all LSP message handlers, -the diagnostic cache structure, and scanner interface." -``` - -#### B. Implementation Tasks (GitHub Issues/TODOs) -**Purpose**: Break work into discrete, testable chunks. - -**Example Task Breakdown**: -```markdown -## Phase 1 Tasks - -### Task 1.1: Basic LSP Server Setup -- [ ] Create main.go with glsp server initialization -- [ ] Accept stdio connections -- [ ] Handle initialize/shutdown requests -- [ ] Test with Neovim LSP client -- Estimated: 2 hours - -### Task 1.2: Gitleaks Scanner Wrapper -- [ ] Create scanner.go with ScanFile function -- [ ] Import gitleaks detector -- [ ] Convert gitleaks.Finding to internal type -- [ ] Write unit tests with known secrets -- Estimated: 4 hours - -### Task 1.3: Diagnostic Conversion -- [ ] Create diagnostics.go -- [ ] Convert findings to protocol.Diagnostic -- [ ] Map severity levels -- [ ] Format diagnostic messages -- [ ] Unit tests -- Estimated: 3 hours -... -``` - -**For Copilot CLI**: Generate issues from PRD: -``` -"Generate GitHub issues for Phase 1 implementation tasks. -Each issue should be 2-4 hours of work, include acceptance -criteria, and reference the PRD sections." -``` - -#### C. API/Interface Documentation -**Purpose**: Define contracts before implementation. 
- -**Example**: -```go -// pkg/scanner/scanner.go - -// Scanner detects secrets in source code using gitleaks -type Scanner interface { - // ScanContent scans the provided content and returns findings - ScanContent(ctx context.Context, content string) ([]Finding, error) - - // LoadConfig loads gitleaks configuration from path - LoadConfig(path string) error -} - -// Finding represents a detected secret -type Finding struct { - RuleID string - Description string - StartLine int - StartColumn int - EndLine int - EndColumn int - Secret string // redacted - Entropy float64 -} -``` - -**For Copilot CLI**: -``` -"Define Go interfaces for the scanner, cache, and config -packages based on the PRD architecture section." -``` - -### 18.3 Optimal Copilot CLI Workflow - -**Iterative Development Pattern**: - -```bash -# 1. Start with architecture -$ @workspace "Create pkg/scanner/scanner.go with the Scanner -interface and Finding type based on the PRD section 5.2" - -# 2. Implement piece by piece -$ @workspace "Implement the Scanner interface using gitleaks/v8. -Keep it simple, just wrap the gitleaks detector." - -# 3. Add tests as you go -$ @workspace "Write unit tests for scanner.go using testify. -Test with a sample secret string from AWS access keys." - -# 4. Build up incrementally -$ @workspace "Create handlers.go with didOpen handler that -calls scanner and publishes diagnostics." - -# 5. Test integration points -$ @workspace "Create an integration test that sends a -textDocument/didOpen LSP message and verifies diagnostic output." - -# 6. Iterate on features -$ @workspace "Add hover support to handlers.go. Show the -finding details formatted as markdown." - -# 7. Optimize when needed -$ @workspace "Add benchmark tests for ScanContent. We need -to scan 100 files in under 1 second." -``` - -**Key Principles for Copilot CLI**: -1. **Incremental**: One file/function at a time -2. **Contextual**: Reference PRD sections explicitly -3. 
**Testable**: Ask for tests with each implementation -4. **Concrete**: Provide specific examples of input/output -5. **Measurable**: Request benchmarks for performance-critical code - -### 18.4 Development Environment Setup - -**Prerequisites**: -```bash -# Required -- Go 1.21+ installed -- Neovim 0.8+ with native LSP -- git - -# Recommended -- golangci-lint for linting -- delve for debugging -- pprof for profiling -``` - -**Neovim Test Configuration** (`test-init.lua`): -```lua --- Minimal config for testing gitleaks-ls -vim.lsp.start({ - name = 'gitleaks-ls', - cmd = {'./gitleaks-ls'}, - root_dir = vim.fn.getcwd(), - on_attach = function(client, bufnr) - print('gitleaks-ls attached') - end, -}) - --- Enable LSP logging -vim.lsp.set_log_level('debug') -``` - -**Test with**: -```bash -nvim -u test-init.lua test-file-with-secrets.go -``` - -### 18.5 First Implementation Session Checklist - -**Session 1: Project Setup (2-3 hours)** -- [ ] Initialize Go module -- [ ] Add all dependencies -- [ ] Create file structure -- [ ] Write README with build instructions -- [ ] Create .gitignore -- [ ] Set up basic CI (GitHub Actions for `go test`) -- [ ] Create first commit - -**Session 2: Core Scanner (3-4 hours)** -- [ ] Implement scanner.go -- [ ] Write unit tests -- [ ] Test with real secrets -- [ ] Verify gitleaks integration works - -**Session 3: LSP Basics (4-6 hours)** -- [ ] Implement main.go with glsp setup -- [ ] Add initialize/shutdown handlers -- [ ] Test connection with Neovim -- [ ] Implement didOpen handler -- [ ] End-to-end test: see diagnostics in Neovim - -**Deliverable**: Working prototype that shows warnings in Neovim - -### 18.6 Suggested File Creation Order - -**Order optimized for testing and incremental progress**: - -1. `go.mod`, `go.sum` - Dependencies -2. `README.md` - Project documentation -3. `scanner.go` + `scanner_test.go` - Core logic (testable standalone) -4. `diagnostics.go` + `diagnostics_test.go` - Conversion logic (testable standalone) -5. 
`config.go` + `config_test.go` - Config loading (testable standalone) -6. `main.go` - Wiring everything together -7. `handlers.go` - LSP handlers -8. `integration_test.go` - End-to-end testing - -**Rationale**: Build and test independent components first, then integrate. - -### 18.7 Quality Gates Per Phase - -**Phase 1 Checklist**: -- [ ] `go build` succeeds -- [ ] All tests pass (`go test ./...`) -- [ ] Can connect from Neovim -- [ ] Shows diagnostic for AWS key in test file -- [ ] No crashes on invalid input -- [ ] Basic error messages in logs - -**Phase 2 Checklist**: -- [ ] All Phase 1 checks pass -- [ ] Hover shows finding details -- [ ] Code action adds gitleaks:allow comment -- [ ] Cache speeds up second scan (benchmark proves it) -- [ ] Respects .gitleaksignore file - -**Phase 3 Checklist**: -- [ ] All Phase 2 checks pass -- [ ] All performance benchmarks meet targets -- [ ] `pprof` shows no memory leaks -- [ ] Can scan 100+ file workspace -- [ ] CPU usage acceptable - -**Phase 4 Checklist**: -- [ ] All Phase 3 checks pass -- [ ] Test coverage >75% -- [ ] No panics in stress test -- [ ] README has complete setup instructions -- [ ] Handles configuration errors gracefully - -### 18.8 Example First Copilot CLI Commands - -**To kick off implementation**: - -```bash -# 1. Initialize project structure -$ "Initialize a Go project for gitleaks-ls. Create go.mod with -module github.com/user/gitleaks-ls, add dependencies from PRD -section 15.1, create basic file structure from section 5.2" - -# 2. Start with scanner -$ "Create pkg/scanner/scanner.go implementing a Scanner interface -that wraps gitleaks/v8. Include a ScanContent method that takes -a string and returns findings. Keep it under 100 lines." - -# 3. Add tests -$ "Create pkg/scanner/scanner_test.go with testify. Test that -ScanContent detects an AWS access key AKIAIOSFODNN7EXAMPLE" - -# 4. 
Build diagnostics converter -$ "Create pkg/lsp/diagnostics.go that converts scanner.Finding -to protocol.Diagnostic from glsp. Map line/column numbers and -severity levels." - -# 5. Wire up main server -$ "Create cmd/gitleaks-ls/main.go that initializes a glsp server -with stdio transport. Just handle initialize/shutdown for now." - -# 6. Add first handler -$ "Add textDocument/didOpen handler to handlers.go. When a file -opens, scan it with scanner, convert to diagnostics, and publish." - -# 7. Test it -$ "Create an integration test that simulates opening a file with -a secret and verifies we receive a diagnostic message." -``` - -### 18.9 Common Pitfalls to Avoid - -1. **Don't build everything at once** - Build incrementally, test constantly -2. **Don't optimize prematurely** - Get it working first, then optimize -3. **Don't skip tests** - Tests are documentation and safety net -4. **Don't hardcode paths** - Use workspace root detection -5. **Don't ignore errors** - Log all errors with context -6. **Don't block the main goroutine** - Run scans in background -7. **Don't cache without invalidation** - Stale cache is worse than no cache - -### 18.10 Success Criteria for "Real Project" Status - -**Minimum Viable Project** (End of Phase 1): -- ✅ Builds without errors -- ✅ Connects to Neovim LSP client -- ✅ Detects and displays at least one type of secret -- ✅ Has basic tests -- ✅ README explains how to build and use - -**Production-Ready Prototype** (End of Phase 4): -- ✅ All features from PRD implemented -- ✅ Meets all performance targets -- ✅ >75% test coverage -- ✅ Used daily by author without issues -- ✅ Complete documentation - ---- - -## 19. Quick Start Guide for Development - -### For the Very First Implementation Step: - -```bash -# 1. Create the project -mkdir -p gitleaks-ls -cd gitleaks-ls - -# 2. Ask Copilot CLI: -$ @workspace "I'm starting to implement the gitleaks-ls project -from PRD.md. 
First, initialize the Go project structure with go.mod, -basic file stubs (main.go, scanner.go, handlers.go, diagnostics.go, -config.go), and add all dependencies from section 15.1 of the PRD." - -# 3. Then ask: -$ @workspace "Now implement scanner.go based on PRD section 5.2. -It should wrap gitleaks/v8 and provide a simple ScanContent function. -Keep it under 100 lines and include inline comments explaining -how it integrates with gitleaks." - -# 4. Test it: -$ @workspace "Create scanner_test.go with a test that verifies we -can detect the AWS access key pattern. Use testify for assertions." - -# 5. Build up from there... -``` - -**That's it!** The PRD provides the "what" and "why". The TDD (to be created) -provides the "how". Copilot CLI executes the implementation step by step. diff --git a/go.mod b/go.mod index 50b0ca5..b8ae3bf 100644 --- a/go.mod +++ b/go.mod @@ -1,6 +1,6 @@ module github.com/arch-stack/gitleaks-ls -go 1.24.0 +go 1.25.4 require ( github.com/fsnotify/fsnotify v1.9.0 @@ -10,7 +10,7 @@ require ( github.com/stretchr/testify v1.11.1 github.com/tliron/commonlog v0.2.19 github.com/tliron/glsp v0.2.1 - github.com/zricethezav/gitleaks/v8 v8.29.1 + github.com/zricethezav/gitleaks/v8 v8.30.0 ) require ( diff --git a/go.sum b/go.sum index e78542e..cf4afc6 100644 --- a/go.sum +++ b/go.sum @@ -250,8 +250,8 @@ github.com/wasilibs/wazero-helpers v0.0.0-20240620070341-3dff1577cd52/go.mod h1: github.com/xyproto/randomstring v1.0.5 h1:YtlWPoRdgMu3NZtP45drfy1GKoojuR7hmRcnhZqKjWU= github.com/xyproto/randomstring v1.0.5/go.mod h1:rgmS5DeNXLivK7YprL0pY+lTuhNQW3iGxZ18UQApw/E= github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY= -github.com/zricethezav/gitleaks/v8 v8.29.1 h1:6wX6DlXLVkgOcK03Hxw4GiH/wCkVsoEIPaIHIOOd/bI= -github.com/zricethezav/gitleaks/v8 v8.29.1/go.mod h1:dH8vlu3hiQjWJXU1BLzdkfxtWHmJexAY3U2XXaO4gos= +github.com/zricethezav/gitleaks/v8 v8.30.0 h1:5heLlxRQkHfXgTJgdQsJhi/evX1oj6i+xBanDu2XUM8= 
+github.com/zricethezav/gitleaks/v8 v8.30.0/go.mod h1:M5JQW5L+vZmkAqs9EX29hFQnn7uFz9sOQCPNewaZD9E= go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU= go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8= go.opencensus.io v0.22.2/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=