buikhacnam/tech-radar

Tech Radar

Better insight into daily trending GitHub repos.

Environment

Create a local env file:

cp .env.example .env

Important:

  • .env.example defaults to local Postgres on localhost:5432.
  • Docker Compose overrides TECHRADAR_DATABASE_URL for the web container so it can reach the postgres service by hostname.

Example local database URL:

TECHRADAR_DATABASE_URL=postgresql+psycopg://techradar:techradar@localhost:5432/techradar

Local Run

Install dependencies:

uv sync --extra dev

Install Playwright Chromium for local crawling and language enrichment:

uv run playwright install chromium

Run migrations:

uv run alembic upgrade head

Start the app:

uv run uvicorn app.main:app --reload --port 8001

Open:

  • App: http://127.0.0.1:8001/dashboard/languages
  • API docs: http://127.0.0.1:8001/docs

Docker Run

Start Postgres and the app:

docker compose up --build

Open:

  • App: http://127.0.0.1:8000/dashboard/languages

Data Collection Commands

Crawl trending pages:

uv run techradar crawl trending
uv run techradar crawl trending --language python --period weekly

Default crawl size is controlled by TECHRADAR_TRENDING_REPO_LIMIT in .env. Trending targets are controlled by TECHRADAR_TRENDING_LANGUAGES.

Run the full pipeline:

uv run techradar sync
uv run techradar sync --period monthly
uv run techradar aggregate daily --date 2026-06-19
uv run techradar aggregate daily --today

sync crawls the configured trending targets, stores only the configured number of repos per snapshot, and then enriches stored repos that still need language refresh. If --language is omitted it crawls every configured language; if --period is omitted it uses daily. For aggregation, --today is a shortcut for the current UTC date if you do not want to type --date YYYY-MM-DD.

Override the default crawl size of 3:

uv run techradar crawl trending --repo-limit 5
uv run techradar sync --repo-limit 5

Refresh language breakdowns:

uv run techradar enrich languages

The enrich commands accept an optional --limit for a manual partial run. By default they process all eligible stored repos.

Target a single repository:

uv run techradar enrich languages --full-name acme/rocket

Daily Operations

Daily metrics are a separate step from crawling and language enrichment. The safe UTC run order is:

uv run techradar crawl trending
uv run techradar enrich languages
uv run techradar aggregate daily --today

Recommended operator guidance:

  • Run aggregate daily only after the crawl and language enrichment for that UTC date have finished.
  • For reruns and backfills, pass an explicit UTC date (--date YYYY-MM-DD) instead of --today so the result does not depend on when the command runs.
  • Schedule the daily run after upstream trending data is expected to be available. Cron and GitHub Actions are both valid deployment options.
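
The run order above can be scripted so that a failing step stops the later ones. This is a minimal sketch, assuming the three CLI commands shown in this section; the helper names daily_commands and run_daily are illustrative and not part of the project:

```python
import subprocess
from datetime import datetime, timezone
from typing import List, Optional


def daily_commands(utc_date: str) -> List[List[str]]:
    """Return the three CLI steps in the safe run order for one UTC date."""
    base = ["uv", "run", "techradar"]
    return [
        base + ["crawl", "trending"],
        base + ["enrich", "languages"],
        # An explicit date keeps reruns deterministic.
        base + ["aggregate", "daily", "--date", utc_date],
    ]


def run_daily(utc_date: Optional[str] = None, runner=subprocess.run) -> str:
    """Run each step in order; check=True raises on the first failing step."""
    if utc_date is None:
        # Same meaning as --today: the current date in UTC.
        utc_date = datetime.now(timezone.utc).date().isoformat()
    for cmd in daily_commands(utc_date):
        runner(cmd, check=True)
    return utc_date
```

Calling run_daily() from a cron job or a scheduled workflow runs crawl, enrichment, and aggregation in sequence. Because subprocess.run is called with check=True, a failed crawl raises before enrichment or aggregation starts.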

Backfills:

uv run techradar crawl trending --language python
uv run techradar enrich languages --full-name acme/rocket
uv run techradar aggregate daily --date 2026-06-18
uv run techradar aggregate daily --date 2026-06-19

  • Backfill one UTC metric date at a time, oldest to newest, so rolling 7-day metrics rebuild in the right order.
  • If a past date is rerun after source data changes, future rolling-window rows may also change because the aggregator recomputes downstream windows.
  • The daily aggregation command is idempotent for unchanged source rows, so rerunning the same date is safe.
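
For a longer backfill, the per-date aggregate commands can be generated oldest to newest instead of typed by hand. This is a small sketch, assuming only the aggregate daily --date form shown above; backfill_dates and backfill_commands are hypothetical helper names:

```python
from datetime import date, timedelta
from typing import List


def backfill_dates(start: date, end: date) -> List[date]:
    """Return every UTC metric date from start to end inclusive, oldest first."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]


def backfill_commands(start: date, end: date) -> List[List[str]]:
    """Build one aggregate command per date, in oldest-to-newest order."""
    return [
        ["uv", "run", "techradar", "aggregate", "daily", "--date", d.isoformat()]
        for d in backfill_dates(start, end)
    ]


for cmd in backfill_commands(date(2026, 6, 18), date(2026, 6, 19)):
    print(" ".join(cmd))
```

The loop prints the same two aggregate commands listed under Backfills. Each command list can be passed to subprocess.run(cmd, check=True) to execute the dates one at a time.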

Retry behavior:

  • If crawl fails, rerun crawl before enrichment or aggregation.
  • If language enrichment fails, rerun the failed enrich command and then rerun aggregate daily for the affected UTC date.
  • If aggregation fails mid-run, fix the issue and rerun aggregate daily --date ...; the command rewrites that date deterministically.

API Endpoints

  • GET /api/snapshots
  • GET /api/snapshots/{snapshot_id}
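
The endpoints can be called from a script with the standard library. This sketch assumes the app is running locally on port 8001 (as in Local Run) and that the endpoints return JSON; the response fields are not documented here, so the code does not assume any. snapshot_url and fetch_json are illustrative names:

```python
import json
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8001"  # local run port; Docker Run uses 8000


def snapshot_url(snapshot_id=None, base_url=BASE_URL):
    """Build the list URL, or the detail URL when a snapshot_id is given."""
    path = "/api/snapshots"
    if snapshot_id is not None:
        path = f"{path}/{snapshot_id}"
    return base_url.rstrip("/") + path


def fetch_json(url, timeout=10):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the app running, fetch_json(snapshot_url()) requests the snapshot list and fetch_json(snapshot_url(some_id)) requests a single snapshot. The interactive API docs at /docs describe the actual response shape.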

Web Pages

  • /dashboard/languages
  • /dashboard/languages/{language}

Language Dashboard Calculation

The language treemap at /dashboard/languages uses the latest aggregated daily metrics and sizes each language tile by that language's total weighted activity.

Calculation steps:

  1. For each repository in the latest daily_repo_metrics snapshot, determine a repo weight.
    • Use stars_gained if it exists and is greater than 0.
    • Otherwise use a rank fallback weight of 101 - rank, with a minimum of 1.
  2. Split that repo weight across the repository's language breakdown.
    • Example: if a repo weight is 20 and its breakdown is Python 80% and Shell 20%, then:
      • Python contribution = 20 * 0.8 = 16
      • Shell contribution = 20 * 0.2 = 4
  3. Sum those contributions across all repositories for each language.
    • That sum is the language's weighted contribution.
  4. Sum all language weighted contributions together.
  5. Compute each language's share:
    • weighted share = language_weighted_contribution / total_weighted_contribution
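
The calculation steps above can be sketched in a few lines of Python. This is an illustration of the described formula, not the project's actual code; the input shape (a list of dicts with stars_gained, rank, and a languages mapping of fractions) and the function names are assumptions:

```python
from collections import defaultdict


def repo_weight(stars_gained, rank):
    """Step 1: use stars_gained when positive, else 101 - rank with a minimum of 1."""
    if stars_gained is not None and stars_gained > 0:
        return stars_gained
    return max(101 - rank, 1)


def language_shares(repos):
    """Steps 2 to 5: return (contributions, shares), both keyed by language.

    Each repo is a dict with 'stars_gained', 'rank', and 'languages',
    where 'languages' maps a language name to its fraction (0.0 to 1.0).
    """
    contributions = defaultdict(float)
    for repo in repos:
        weight = repo_weight(repo["stars_gained"], repo["rank"])
        # Step 2: split the repo weight across its language breakdown.
        for language, fraction in repo["languages"].items():
            # Step 3: accumulate per-language contributions.
            contributions[language] += weight * fraction
    # Step 4: total across all languages.
    total = sum(contributions.values())
    # Step 5: each language's share of the total (empty if there is no data).
    shares = {}
    if total:
        shares = {lang: value / total for lang, value in contributions.items()}
    return dict(contributions), shares


repos = [
    {"stars_gained": 20, "rank": 1, "languages": {"Python": 0.8, "Shell": 0.2}},
    {"stars_gained": None, "rank": 97, "languages": {"Python": 1.0}},
]
contributions, shares = language_shares(repos)
```

In this example the first repo contributes 16 to Python and 4 to Shell, as in the worked example above. The second repo has no stars_gained, so it falls back to a weight of 101 - 97 = 4, all of which goes to Python. Python ends with a contribution of 20 out of a total of 24, and Shell with 4 out of 24.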

Treemap behavior:

  • Tile area is based on the raw weighted contribution.
  • The displayed percentage is the language's weighted share of the total activity across all languages.
  • Clicking a language opens /dashboard/languages/{language}, which shows the repositories contributing to that language and their individual percentages.

Testing

Run tests:

uv run pytest

Run lint:

uv run ruff check .

Notes

  • Trending pages are scraped with Playwright for Python.
  • Repository detail pages are currently fetched with httpx and parsed with BeautifulSoup.
  • The current schema stores snapshot history, language breakdown snapshots, and aggregated daily metrics.
