CAO Reports - CrowdStrike Threat Intelligence Fetcher

A modern CLI tool for fetching, managing, and generating CrowdStrike Counter Adversary Operations (CAO) threat intelligence reports.

Overview

CAO Reports provides a streamlined interface to download threat intelligence reports from the CrowdStrike Falcon Intelligence API. The tool saves reports as both JSON (metadata) and PDF (human-readable) formats, with support for bulk downloads, daemon mode for continuous monitoring, and credential validation.

Features

Core functionalities include:

Fetch Reports: Download threat intelligence reports from CrowdStrike Falcon CAO Intelligence API
Generate PDFs: Recreate PDF files from JSON data where PDFs are missing or never existed
Daemon Mode: Run continuously to monitor for new reports
ZIP Archives: Optionally create ZIP archives of downloaded reports
PDF Repair: Automatically repair PDFs that may fail OCR processing in Ghostscript

Additional visual and usability features:

Metadata File Creation: Save report metadata in separate JSON files for audit access
Rich CLI Output: Beautiful, colored terminal output using Rich
Validate Credentials: Test API credentials and verify required permissions
Resume Support: Resume interrupted downloads from where you left off
Flexible Configuration: Configure via environment variables, .env files, or CLI arguments

Prerequisites

Important

CrowdStrike Falcon Account: CrowdStrike Falcon Counter Adversary Operations is a licensed capability, so your Falcon account must include it.

Note

API Credentials Required: Client ID, Client Secret, and Cloud Region (us1, us2, eu1, usgov1, usgov2, or auto), with the API Client having Intel API Read permissions (Reports - Falcon Intelligence: Read ✓)

System Requirements: Python 3.8+ and tidy-html5 for PDF generation

macOS: brew install tidy-html5
Ubuntu/Debian: sudo apt-get install tidy
Verify: tidy --version (should report v5.8.0 or later)

Installation

From Source

Clone the repository:

git clone https://github.com/cs-shadowbq/cao-report-fetcher.git
cd cao-report-fetcher/cao-report-fetcher

Install the package:

pip install -e .

Using the Shim Script

After installation, you can also use the convenient shim script:

./bin/cao-reports --help

Configuration

Environment Variables (.env)

Create a .env file from the example:

cp .env.example .env

Edit .env with your credentials:

# Required
FALCON_CLIENT_ID=your_client_id_here
FALCON_CLIENT_SECRET=your_client_secret_here
FALCON_CLIENT_CLOUD=auto

# Optional - Directories
DIR_REPORTS=./reports
DIR_ARCHIVES=./archives

# Optional - Logging
LOG_LEVEL=INFO
LOG_FILE=cao_reports.log

See .env.example for all available configuration options.
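As an illustration of how the variables above fit together, here is a minimal sketch of reading them with fallbacks. It assumes only the variable names shown in the example; `Settings` and `load_settings` are hypothetical names, and using the example values as defaults for the optional settings is an assumption, not the tool's documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables listed under "# Required" in the .env example.
REQUIRED = ("FALCON_CLIENT_ID", "FALCON_CLIENT_SECRET", "FALCON_CLIENT_CLOUD")


@dataclass
class Settings:  # hypothetical container, not the tool's own class
    client_id: str
    client_secret: str
    cloud: str
    dir_reports: str
    dir_archives: str
    log_level: str
    log_file: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return Settings(
        client_id=env["FALCON_CLIENT_ID"],
        client_secret=env["FALCON_CLIENT_SECRET"],
        cloud=env["FALCON_CLIENT_CLOUD"],
        # Assumed defaults, copied from the example values above.
        dir_reports=env.get("DIR_REPORTS", "./reports"),
        dir_archives=env.get("DIR_ARCHIVES", "./archives"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        log_file=env.get("LOG_FILE", "cao_reports.log"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching the real environment.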

Usage

Basic Commands

Validate Credentials

Test your API credentials and permissions:

cao-reports validate

This will:

Verify your credentials are valid
Test API connectivity
Confirm you have the required Intel API read permissions
Display your rate limit information

Fetch Reports

Download reports from the API:

# Fetch all reports
cao-reports fetch

# Fetch with a filter (e.g., only CSA reports)
cao-reports fetch --filter "CSA"

# Fetch the latest 100 reports
cao-reports fetch --limit 100 --reverse

# Fetch and create ZIP archives
cao-reports fetch --create-zip

# Resume an interrupted fetch
cao-reports fetch --resume

# Note: --filter cannot be used with --resume
# The filter is read from the marker file when resuming

Note

When using --resume/-u, the following options cannot be provided on the command line and will be read from the marker file (marker_file.json) that was saved during the previous operation:

--filter/-f - Search filter string
--skip-pdf-recreation/-x - Skip PDF recreation flag
--create-zip/-z - Create ZIP archives flag
--remove-after-archive/-y - Remove folders after archiving flag

This ensures continuity when resuming an interrupted download with the exact same settings.
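The resume rule above can be sketched as a small option-resolution step. This is an illustration only: the key names inside marker_file.json and the `resolve_options` helper are hypothetical, since the marker file's actual schema is not described here.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Options that are restored from the marker file when --resume is used.
# The key names are assumed for illustration.
RESTORED_ON_RESUME = (
    "filter",
    "skip_pdf_recreation",
    "create_zip",
    "remove_after_archive",
)


def resolve_options(cli: Dict[str, Any], marker_path: Path) -> Dict[str, Any]:
    """Return the effective options for a run.

    Without --resume, the CLI options are used unchanged. With --resume,
    supplying any restored option is an error, and the saved values from
    the marker file are used instead.
    """
    if not cli.get("resume"):
        return dict(cli)
    conflicts = [name for name in RESTORED_ON_RESUME if cli.get(name)]
    if conflicts:
        raise ValueError("cannot combine --resume with: " + ", ".join(conflicts))
    saved = json.loads(marker_path.read_text(encoding="utf-8"))
    effective = dict(cli)
    for name in RESTORED_ON_RESUME:
        effective[name] = saved.get(name)
    return effective
```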

Generate PDF

Recreate a PDF from existing JSON data:

cao-reports generate --report CSA-13001

Daemon Mode

Run continuously to monitor for new reports:

cao-reports daemon

The daemon will:

Periodically check for new reports
Download any new reports found
Use exponential backoff when no new reports are available
Resume from the last known position on restart
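The backoff behavior in the list above can be illustrated with a short simulation. This is a sketch under stated assumptions: the base interval, growth factor, and cap are placeholder values, and `poll_schedule` is a hypothetical helper, not the daemon's actual code.

```python
from typing import Iterable, List


def poll_schedule(
    new_report_counts: Iterable[int],
    base: float = 60.0,    # assumed starting wait, in seconds
    factor: float = 2.0,   # assumed growth factor per empty poll
    cap: float = 3600.0,   # assumed upper bound on the wait
) -> List[float]:
    """Return the wait (in seconds) chosen after each poll.

    A poll that finds new reports resets the wait to `base`; a poll that
    finds none multiplies the previous wait by `factor`, up to `cap`.
    """
    waits = []
    delay = base
    for count in new_report_counts:
        if count > 0:
            delay = base
        else:
            delay = min(delay * factor, cap)
        waits.append(delay)
    return waits


# Three empty polls, one poll with 5 new reports, then another empty poll.
print(poll_schedule([0, 0, 0, 5, 0]))  # [120.0, 240.0, 480.0, 60.0, 120.0]
```

Capping the wait keeps the daemon from drifting into very long sleeps during quiet periods.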

Command Options

`fetch` - Fetch reports from API

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)
  -f, --filter TEXT           Search filter string for report names
  -r, --reverse               Reverse sort order (latest reports first)
  -l, --limit INTEGER         Maximum number of reports to fetch
  -o, --offset INTEGER        Starting offset for pagination
  -m, --round-limit INTEGER   Number of reports per pagination round
  -x, --skip-pdf-recreation   Skip recreation of missing PDFs from JSON
  -z, --create-zip            Create ZIP archives after fetching
  -y, --remove-after-archive  Delete report folders after archiving (requires -z)
  -u, --resume                Resume from last marker file (cannot be used with -f)
  -v, --verbose               Enable verbose output

Important

The --resume/-u flag cannot be used together with --filter/-f, --skip-pdf-recreation/-x, --create-zip/-z, or --remove-after-archive/-y. When resuming, these options are automatically read from the marker_file.json that was created during the previous fetch or daemon operation.

`generate` - Generate PDF from JSON

Options:
  -r, --report TEXT           Report name (e.g., CSA-13001) [required]
  --reports-dir TEXT          Directory containing reports (env: DIR_REPORTS)
  -v, --verbose               Enable verbose output

`validate` - Validate credentials

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)

`daemon` - Run as daemon

Options:
  --client-id TEXT            CrowdStrike API client ID (env: FALCON_CLIENT_ID)
  --client-secret TEXT        CrowdStrike API client secret (env: FALCON_CLIENT_SECRET)
  --cloud [auto|us1|us2|eu1|usgov1|usgov2]
                              Cloud region (env: FALCON_CLIENT_CLOUD)
  -f, --filter TEXT           Search filter string for report names
  -r, --reverse               Reverse sort order (latest reports first)
  -m, --round-limit INTEGER   Number of reports per pagination round
  -x, --skip-pdf-recreation   Skip recreation of missing PDFs from JSON
  -z, --create-zip            Create ZIP archives after each cycle
  -y, --remove-after-archive  Delete report folders after archiving (requires -z)
  -u, --resume                Resume from last marker file (cannot be used with -f)
  -v, --verbose               Enable verbose output

`zip` - Create ZIP archives

Options:
  --reports-dir TEXT          Directory containing reports (env: DIR_REPORTS)
  --archives-dir TEXT         Directory to store archives (env: DIR_ARCHIVES)
  -i, --index INTEGER         Starting index for archive naming
  --min-size INTEGER          Minimum archive size in MB
  --max-size INTEGER          Maximum archive size in MB
  -y, --remove-after-archive  Delete report folders after they have been archived
  -v, --verbose               Enable verbose output

Examples

Fetch all CSA reports

cao-reports fetch --filter "CSA"

Fetch the latest 50 reports and create archives

cao-reports fetch --limit 50 --reverse --create-zip

Run as daemon with verbose logging

cao-reports daemon --verbose

Validate credentials for US Government cloud

cao-reports validate --cloud usgov1

Data Storage

Storage Requirements: Downloading all ~20,000 reports requires approximately 50-100 GB of disk space for both JSON and PDF files.
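For planning purposes, the figures above work out to roughly 2.5 to 5 MB per report. The sketch below shows that arithmetic and a free-space check before a full download; `per_report_mb` and `has_room` are hypothetical helpers, not part of the tool.

```python
import shutil


def per_report_mb(total_gb: float, report_count: int) -> float:
    """Average size per report in MB, using 1 GB = 1024 MB."""
    return total_gb * 1024 / report_count


def has_room(path: str, required_gb: float) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


print(per_report_mb(50, 20_000))   # 2.56
print(per_report_mb(100, 20_000))  # 5.12
```

Checking against the upper estimate, for example `has_room("./reports", 100)` once the directory exists, leaves headroom for ZIP archives.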

Directory Structure:

reports/
├── CSA-13001/
│   ├── CSA-13001.json
│   ├── CSA-13001.pdf
│   └── CSA-13001.meta.json
├── CSA-13010/
│   ├── CSA-13010.json
│   ├── CSA-13010.pdf
│   └── CSA-13010.meta.json
...
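Given the layout above, report folders that have JSON but no PDF can be found with a short scan. This is a minimal sketch that assumes the folder and file naming shown in the tree; `reports_missing_pdf` is a hypothetical helper, not a function provided by the tool.

```python
from pathlib import Path
from typing import List


def reports_missing_pdf(reports_dir: Path) -> List[str]:
    """Return names of report folders that contain <name>.json but no <name>.pdf."""
    missing = []
    for folder in sorted(p for p in reports_dir.iterdir() if p.is_dir()):
        name = folder.name
        has_json = (folder / f"{name}.json").is_file()
        has_pdf = (folder / f"{name}.pdf").is_file()
        if has_json and not has_pdf:
            missing.append(name)
    return missing
```

Each returned name could then be passed to cao-reports generate --report to recreate the PDF.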

API Rate Limits

The CrowdStrike Intel API has rate limits:

6,000 requests per minute (default)
The tool displays your remaining rate limit after each query
Use --round-limit to control batch sizes and manage rate limits
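To see how --round-limit shapes a fetch, the sketch below splits a requested number of reports into pagination rounds. It is illustrative only: `page_windows` is a hypothetical helper, and how the tool maps each round onto API requests is an assumption.

```python
from typing import Iterator, Tuple


def page_windows(total: int, round_limit: int, offset: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (offset, count) pairs covering `total` reports.

    Each round asks for at most `round_limit` reports, starting at `offset`.
    """
    if round_limit <= 0:
        raise ValueError("round_limit must be positive")
    remaining = total
    while remaining > 0:
        count = min(round_limit, remaining)
        yield offset, count
        offset += count
        remaining -= count


# 250 reports in rounds of 100 take three rounds.
print(list(page_windows(250, 100)))  # [(0, 100), (100, 100), (200, 50)]
```

A smaller round limit means more rounds of fewer reports each, which spreads requests out over a longer period.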

Troubleshooting

Authentication Errors

If you see authentication errors:

Verify your credentials in .env
Run cao-reports validate to test credentials
Ensure your API client has Intel API Read scope

PDF Generation Issues

If PDFs fail to generate:

Verify tidy-html5 is installed: tidy --version
Check logs for specific errors: tail -f cao_reports.log
Use --skip-pdf-recreation to skip PDF generation

Resuming Downloads

If a fetch is interrupted:

cao-reports fetch --resume

This will continue from the last saved position in marker_file.json.

Support

For issues, questions, or contributions, please visit:

GitHub Issues: https://github.com/cs-shadowbq/cao-report-fetcher/issues
Developer Documentation: DEV.md

License

See LICENSE.md for full license text.
