CAO Reports - CrowdStrike Threat Intelligence Fetcher

A modern CLI tool for fetching, managing, and generating CrowdStrike Counter Adversary Operations (CAO) threat intelligence reports.

Overview

CAO Reports provides a streamlined interface to download threat intelligence reports from the CrowdStrike Falcon Intelligence API. The tool saves reports as both JSON (metadata) and PDF (human-readable) formats, with support for bulk downloads, daemon mode for continuous monitoring, and credential validation.

Features

Core functionality includes:

  • Fetch Reports: Download threat intelligence reports from CrowdStrike Falcon CAO Intelligence API
  • Generate PDFs: Recreate PDF files from JSON data where PDFs are missing or never existed
  • Daemon Mode: Run continuously to monitor for new reports
  • ZIP Archives: Optionally create ZIP archives of downloaded reports
  • PDF Repair: Automatically repair PDFs that may fail OCR processing in Ghostscript

Additional visual and usability features:

  • Metadata File Creation: Save report metadata in separate JSON files for audit access
  • Rich CLI Output: Beautiful, colored terminal output using Rich
  • Validate Credentials: Test API credentials and verify required permissions
  • Resume Support: Resume interrupted downloads from where you left off
  • Flexible Configuration: Configure via environment variables, .env files, or CLI arguments

Prerequisites

Important

CrowdStrike Falcon Account: Access to CrowdStrike Falcon Counter Adversary Operations requires a licensed capability on your account.

Note

API Credentials Required: Client ID, Client Secret, and Cloud Region (us1, us2, eu1, usgov1, usgov2, or auto). The API client must have:

  • Intel API Read permissions (Reports - Falcon Intelligence: Read ✓)

System Requirements: Python 3.8+ and tidy-html5 for PDF generation

  • macOS: brew install tidy-html5
  • Ubuntu/Debian: sudo apt-get install tidy
  • Verify: tidy --version (v5.8.0+)
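
The system requirements above can also be checked from Python before installing. This is a minimal sketch, not part of the tool: the function name `check_prerequisites` is hypothetical, and it only checks that a `tidy` executable is on PATH without parsing its version.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def check_prerequisites(min_python=MIN_PYTHON, tidy_command="tidy"):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, found {found}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(tidy_command) is None:
        problems.append(f"'{tidy_command}' was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

Run tidy --version afterwards to confirm the installed version is v5.8.0 or later.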

Installation

From Source

  1. Clone the repository:
git clone https://github.com/cs-shadowbq/cao-report-fetcher.git
cd cao-report-fetcher/cao-report-fetcher
  2. Install the package:
pip install -e .

Using the Shim Script

After installation, you can also use the convenient shim script:

./bin/cao-reports --help

Configuration

Environment Variables (.env)

Create a .env file from the example:

cp .env.example .env

Edit .env with your credentials:

# Required
FALCON_CLIENT_ID=your_client_id_here
FALCON_CLIENT_SECRET=your_client_secret_here
FALCON_CLIENT_CLOUD=auto

# Optional - Directories
DIR_REPORTS=./reports
DIR_ARCHIVES=./archives

# Optional - Logging
LOG_LEVEL=INFO
LOG_FILE=cao_reports.log

See .env.example for all available configuration options.
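
To sanity-check a .env file before running the tool, something like the following could be used. It is a simplified sketch: `parse_env_text`, `load_env_file`, and `missing_required` are hypothetical helpers, the parser handles only plain KEY=VALUE lines, and the tool itself may load its configuration differently.

```python
from pathlib import Path

# The three variables marked "Required" in the example above.
REQUIRED_KEYS = ("FALCON_CLIENT_ID", "FALCON_CLIENT_SECRET", "FALCON_CLIENT_CLOUD")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read and parse a .env file from disk."""
    return parse_env_text(Path(path).read_text(encoding="utf-8"))


def missing_required(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_required(load_env_file())` returns an empty list when all three required variables are set.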

Usage

Basic Commands

Validate Credentials

Test your API credentials and permissions:

cao-reports validate

This will:

  • Verify your credentials are valid
  • Test API connectivity
  • Confirm you have the required Intel API read permissions
  • Display your rate limit information

Fetch Reports

Download reports from the API:

# Fetch all reports
cao-reports fetch

# Fetch with a filter (e.g., only CSA reports)
cao-reports fetch --filter "CSA"

# Fetch the latest 100 reports
cao-reports fetch --limit 100 --reverse

# Fetch and create ZIP archives
cao-reports fetch --create-zip

# Resume an interrupted fetch
cao-reports fetch --resume

# Note: --filter cannot be used with --resume
# The filter is read from the marker file when resuming

Note

When using --resume/-u, the following options cannot be provided on the command line and will be read from the marker file (marker_file.json) that was saved during the previous operation:

  • --filter/-f - Search filter string
  • --skip-pdf-recreation/-x - Skip PDF recreation flag
  • --create-zip/-z - Create ZIP archives flag
  • --remove-after-archive/-y - Remove folders after archiving flag

This ensures continuity when resuming an interrupted download with the exact same settings.
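
The resume rule above can be pictured as follows. This is an illustrative sketch only: the key names inside marker_file.json and the `resolve_options` helper are assumptions, since the actual marker file schema is not documented here.

```python
import json
from pathlib import Path

# Options that cannot be combined with --resume (listed in the note above).
# The key names are hypothetical; the real marker file may use different ones.
RESUME_MANAGED_OPTIONS = (
    "filter",
    "skip_pdf_recreation",
    "create_zip",
    "remove_after_archive",
)


def resolve_options(cli_options, marker_path="marker_file.json"):
    """Return effective options, taking resume-managed ones from the marker file."""
    if not cli_options.get("resume"):
        return dict(cli_options)

    conflicts = [name for name in RESUME_MANAGED_OPTIONS if cli_options.get(name)]
    if conflicts:
        raise ValueError("--resume cannot be combined with: " + ", ".join(conflicts))

    saved = json.loads(Path(marker_path).read_text(encoding="utf-8"))
    effective = dict(cli_options)
    for name in RESUME_MANAGED_OPTIONS:
        # Missing keys fall back to None rather than raising.
        effective[name] = saved.get(name)
    return effective
```

Rejecting the conflicting flags up front, rather than silently overriding them, keeps a resumed run from mixing two different sets of settings.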

Generate PDF

Recreate a PDF from existing JSON data:

cao-reports generate --report CSA-13001

Daemon Mode

Run continuously to monitor for new reports:

cao-reports daemon

The daemon will:

  • Periodically check for new reports
  • Download any new reports found
  • Use exponential backoff when no new reports are available
  • Resume from the last known position on restart
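
The polling behavior described above can be sketched roughly as below. The delays, the cap, and the names `backoff_delays` and `run_daemon` are assumptions for illustration; the daemon's real intervals and reset policy may differ.

```python
import time


def backoff_delays(base_seconds=60, factor=2, max_seconds=3600):
    """Yield a delay that grows by `factor` each time, capped at max_seconds."""
    delay = base_seconds
    while True:
        yield delay
        delay = min(delay * factor, max_seconds)


def run_daemon(fetch_new_reports, max_cycles, sleep=time.sleep, **backoff_kwargs):
    """Poll fetch_new_reports(); restart the backoff whenever it finds reports."""
    delays = backoff_delays(**backoff_kwargs)
    waits = []
    for _ in range(max_cycles):
        new_reports = fetch_new_reports()
        if new_reports:
            # New reports arrived, so go back to the shortest delay.
            delays = backoff_delays(**backoff_kwargs)
        wait = next(delays)
        waits.append(wait)
        sleep(wait)
    return waits
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting, and `max_cycles` bounds the loop for demonstration; a real daemon would run until stopped.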

Command Options

fetch - Fetch reports from API

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)
  -f, --filter TEXT           Search filter string for report names
  -r, --reverse               Reverse sort order (latest reports first)
  -l, --limit INTEGER         Maximum number of reports to fetch
  -o, --offset INTEGER        Starting offset for pagination
  -m, --round-limit INTEGER   Number of reports per pagination round
  -x, --skip-pdf-recreation   Skip recreation of missing PDFs from JSON
  -z, --create-zip            Create ZIP archives after fetching
  -y, --remove-after-archive  Delete report folders after archiving (requires -z)
  -u, --resume                Resume from last marker file (cannot be used with -f)
  -v, --verbose               Enable verbose output

Important

The --resume/-u flag cannot be used together with --filter/-f, --skip-pdf-recreation/-x, --create-zip/-z, or --remove-after-archive/-y. When resuming, these options are automatically read from the marker_file.json that was created during the previous fetch or daemon operation.

generate - Generate PDF from JSON

Options:
  -r, --report TEXT           Report name (e.g., CSA-13001) [required]
  --reports-dir TEXT          Directory containing reports (env: DIR_REPORTS)
  -v, --verbose               Enable verbose output

validate - Validate credentials

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)

daemon - Run as daemon

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)
  -f, --filter TEXT           Search filter string for report names
  -r, --reverse               Reverse sort order (latest reports first)
  -m, --round-limit INTEGER   Number of reports per pagination round
  -x, --skip-pdf-recreation   Skip recreation of missing PDFs from JSON
  -z, --create-zip            Create ZIP archives after each cycle
  -y, --remove-after-archive  Delete report folders after archiving (requires -z)
  -u, --resume                Resume from last marker file (cannot be used with -f)
  -v, --verbose               Enable verbose output

Important

The --resume/-u flag cannot be used together with --filter/-f, --skip-pdf-recreation/-x, --create-zip/-z, or --remove-after-archive/-y. When resuming, these options are automatically read from the marker_file.json that was created during the previous daemon cycle.

zip - Create ZIP archives

Options:
  --reports-dir TEXT          Directory containing reports (env: DIR_REPORTS)
  --archives-dir TEXT         Directory to store archives (env: DIR_ARCHIVES)
  -i, --index INTEGER         Starting index for archive naming
  --min-size INTEGER          Minimum archive size in MB
  --max-size INTEGER          Maximum archive size in MB
  -y, --remove-after-archive  Delete report folders after they have been archived
  -v, --verbose               Enable verbose output
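
One way to think about --max-size is as a cap on how many report folders go into each archive. The sketch below shows a simple greedy grouping under that assumption; `plan_archives` is a hypothetical helper, and the tool's actual archiving logic (including how --min-size is applied) is not described in this README.

```python
def plan_archives(folder_sizes_mb, max_size_mb):
    """Greedily group (name, size_mb) pairs into batches of at most max_size_mb.

    A single folder larger than the limit still gets a batch of its own.
    """
    batches = []
    current, current_size = [], 0
    for name, size in folder_sizes_mb:
        # Start a new batch when adding this folder would exceed the cap.
        if current and current_size + size > max_size_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```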

Examples

Fetch all CSA reports

cao-reports fetch --filter "CSA"

Fetch the latest 50 reports and create archives

cao-reports fetch --limit 50 --reverse --create-zip

Run as daemon with verbose logging

cao-reports daemon --verbose

Validate credentials for US Government cloud

cao-reports validate --cloud usgov1

Data Storage

Storage Requirements: Downloading all ~20,000 reports requires approximately 50-100 GB of disk space for both JSON and PDF files.
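
For a partial download, the figures above can be scaled linearly as a rough guide. This sketch assumes report sizes are evenly distributed, which is a simplification; `estimate_storage_gb` is a hypothetical helper.

```python
def estimate_storage_gb(report_count, total_reports=20_000, total_gb_range=(50, 100)):
    """Scale the full-download estimate linearly to `report_count` reports."""
    low_gb, high_gb = total_gb_range
    return (
        report_count * low_gb / total_reports,
        report_count * high_gb / total_reports,
    )


# 50-100 GB over ~20,000 reports works out to about 2.5-5 MB per report.
low, high = estimate_storage_gb(1_000)
print(f"1,000 reports: roughly {low:.1f}-{high:.1f} GB")
```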

Directory Structure:

reports/
├── CSA-13001/
│   ├── CSA-13001.json
│   ├── CSA-13001.pdf
│   └── CSA-13001.meta.json
├── CSA-13010/
│   ├── CSA-13010.json
│   ├── CSA-13010.pdf
│   └── CSA-13010.meta.json
...
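
Given this layout, report folders that have JSON but no PDF can be located with a short script. This is a sketch based only on the directory structure shown above; `reports_missing_pdf` is a hypothetical helper and not part of the CLI.

```python
from pathlib import Path


def reports_missing_pdf(reports_dir):
    """Return sorted names of report folders with <name>.json but no <name>.pdf."""
    missing = []
    for folder in Path(reports_dir).iterdir():
        if not folder.is_dir():
            continue
        name = folder.name
        has_json = (folder / f"{name}.json").is_file()
        has_pdf = (folder / f"{name}.pdf").is_file()
        if has_json and not has_pdf:
            missing.append(name)
    return sorted(missing)
```

Each returned name can then be passed to cao-reports generate --report to recreate the PDF.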

API Rate Limits

The CrowdStrike Intel API has rate limits:

  • 6,000 requests per minute (default)
  • The tool displays your remaining rate limit after each query
  • Use --round-limit to control batch sizes and manage rate limits
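
To see how --round-limit shapes a fetch, the pagination can be laid out as a list of rounds. This is a sketch under the assumption that each round requests one page of at most --round-limit reports starting at a running offset; `pagination_rounds` is hypothetical, and the number of API requests the tool makes per round is not documented here.

```python
def pagination_rounds(limit, round_limit, offset=0):
    """Return (offset, size) pairs covering `limit` reports in rounds of at most `round_limit`."""
    if round_limit <= 0:
        raise ValueError("round_limit must be positive")
    rounds = []
    remaining = limit
    while remaining > 0:
        size = min(round_limit, remaining)
        rounds.append((offset, size))
        offset += size
        remaining -= size
    return rounds


# 250 reports in rounds of 100 take three rounds, the last one partial.
print(pagination_rounds(250, 100))
```

A smaller round limit means more rounds, each with a smaller burst of requests against the per-minute budget.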

Troubleshooting

Authentication Errors

If you see authentication errors:

  1. Verify your credentials in .env
  2. Run cao-reports validate to test credentials
  3. Ensure your API client has Intel API Read scope

PDF Generation Issues

If PDFs fail to generate:

  1. Verify tidy-html5 is installed: tidy --version
  2. Check logs for specific errors: tail -f cao_reports.log
  3. Use --skip-pdf-recreation to skip PDF generation

Resuming Downloads

If a fetch is interrupted:

cao-reports fetch --resume

This will continue from the last saved position in marker_file.json.

Support

For issues, questions, or contributions, please visit:

License

MIT License - Copyright (c) 2024-2026 CrowdStrike

See LICENSE.md for full license text.