cagataycali/agi-diy

agi.diy


Build your own AGI. In your browser. Right now.

▶️ Launch • 📖 SDK Docs • 📱 Install as App


Why agi.diy?

| | ChatGPT/Claude.ai | agi.diy |
|---|---|---|
| Privacy | Data on their servers | 100% in your browser |
| Cost | $20-200/month subscription | Pay only for API usage |
| Offline Mode | ❌ | ✅ WebLLM runs locally |
| Custom Tools | Limited plugins | Create unlimited tools |
| Multi-Agent | ❌ | ✅ Coordinated agent teams |
| Self-Modifying | ❌ | ✅ Agent evolves itself |
| Open Source | ❌ | ✅ Fully auditable |

Quick Start

Use the hosted version:

open https://agi.diy

Or self-host:

git clone https://github.com/cagataycali/agi-diy.git
cd agi-diy/docs && python3 -m http.server 8080

Then: Settings → Add API key → Start chatting

Want 100% local? Select WebLLM → Download model once → Works offline forever


How It Works

graph LR
    A[You type message] --> B[Agent selects tools]
    B --> C[Calls AI model]
    C --> D[Executes in browser]
    D --> E[Shows result]
    
    style A fill:#3b82f6,color:#fff
    style E fill:#10b981,color:#fff

Everything runs in your browser. The only external call is to the AI provider you choose (or none with WebLLM).
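The flow in the diagram can be sketched as a small loop. This is a minimal illustration, not the Strands SDK API: `runAgent`, the `model` callback, and the shape of its reply (`toolCall` or `text`) are all hypothetical names chosen for the example.

```javascript
// Hypothetical sketch of the message -> tool -> model -> result loop.
// `model` is any async function that returns either { text } (final answer)
// or { toolCall: { name, input } } (a request to run a tool in the browser).
async function runAgent(message, model, tools, maxSteps = 5) {
  const history = [{ role: "user", content: message }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(history, Object.keys(tools));
    if (!reply.toolCall) return reply.text; // model produced a final answer

    const { name, input } = reply.toolCall;
    const tool = tools[name];
    // Tools run locally; only the model call would leave the browser.
    const result = tool ? await tool(input) : `Unknown tool: ${name}`;
    history.push({ role: "tool", name, content: String(result) });
  }
  return "Stopped: step limit reached";
}
```

The step limit is a design choice for the sketch: it keeps a model that never stops requesting tools from looping indefinitely.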


What Can It Do?

🛠️ Create Custom Tools

Ask the agent to build tools on the fly:

"Create a tool that fetches Bitcoin price from CoinGecko"

The tool is saved to localStorage and persists across sessions until you delete it or clear the site's data. Use it anytime:

"What's the current Bitcoin price?"
→ Uses your custom tool → "$67,432"
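One way such persistence could work is to store each tool's source text under a single localStorage key and rebuild the functions on load. The sketch below assumes a hypothetical key name (`custom_tools`) and record shape; the project's actual storage format may differ.

```javascript
// Hypothetical storage key and record shape, for illustration only.
const TOOLS_KEY = "custom_tools";

// Save a tool definition (name, description, source text of its body).
function saveTool(storage, tool) {
  const tools = JSON.parse(storage.getItem(TOOLS_KEY) ?? "{}");
  tools[tool.name] = { description: tool.description, code: tool.code };
  storage.setItem(TOOLS_KEY, JSON.stringify(tools));
}

// Rebuild callable tools from the stored source text.
function loadTools(storage) {
  const saved = JSON.parse(storage.getItem(TOOLS_KEY) ?? "{}");
  const tools = {};
  for (const [name, { code }] of Object.entries(saved)) {
    tools[name] = new Function("input", code);
  }
  return tools;
}

// In-memory stand-in with the same getItem/setItem shape as localStorage,
// so the sketch also runs outside a browser.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => void data.set(key, String(value)),
  };
}
```

In a browser, pass `window.localStorage` instead of `memoryStorage()`. Because `new Function` executes stored text as code, a tool should only be saved if you trust its source.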

📧 Automate Email

"Every morning at 9am, check my Gmail and notify me of urgent emails"

Agent connects via Google OAuth, schedules a cron job, and sends push notifications.

💻 Pair Programming

"Watch my screen every 30 seconds and help me debug"

Agent captures your screen, spots errors, and suggests fixes in real-time.

👥 Multi-Agent Research

"Spawn researcher, analyst, and writer agents. Research AI safety."

Agents coordinate through ring attention: when one learns something, the others see it.

🌙 Background Thinking

Ask about a topic, then walk away. Agent keeps exploring while you're gone. When you return, findings are injected into your next message.

🗺️ Location Intelligence

"Mark top 5 coffee shops near me and fly me through them"

Interactive Google Maps with GPS tracking and smooth camera animations.


Two Modes

| Mode | File | What it's for |
|---|---|---|
| Single Agent | index.html | Personal assistant, coding help, quick tasks |
| Multi-Agent | agi.html | Research teams, parallel processing, scheduled automation |

Multi-Agent Architecture

graph TD
    You[You] --> Main[Main Agent]
    Main --> R[Researcher]
    Main --> A[Analyst]  
    Main --> W[Writer]
    
    R -.-> Ring[(Ring Buffer)]
    A -.-> Ring
    W -.-> Ring
    Ring -.-> R
    Ring -.-> A
    Ring -.-> W

Ring Attention: Agents share context automatically. When the researcher finds papers, the analyst and writer see that context immediately.
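The shared ring buffer in the diagram could look roughly like the sketch below: a fixed-capacity buffer that every agent writes to and reads from, with the oldest entries overwritten first. The class name, capacity, and entry shape are assumptions for illustration, not the project's implementation.

```javascript
// Hypothetical fixed-size shared context buffer.
class RingBuffer {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = new Array(capacity);
    this.next = 0; // index where the next entry is written
    this.size = 0; // number of valid entries currently stored
  }

  // Record something an agent learned, overwriting the oldest entry when full.
  push(agent, content) {
    this.items[this.next] = { agent, content };
    this.next = (this.next + 1) % this.capacity;
    this.size = Math.min(this.size + 1, this.capacity);
  }

  // Return entries oldest-first, optionally hiding the reader's own entries.
  read(excludeAgent) {
    const start = (this.next - this.size + this.capacity) % this.capacity;
    const out = [];
    for (let i = 0; i < this.size; i++) {
      const entry = this.items[(start + i) % this.capacity];
      if (entry.agent !== excludeAgent) out.push(entry);
    }
    return out;
  }
}
```

For example, after the researcher calls `ring.push("researcher", "Found 3 papers")`, the analyst can call `ring.read("analyst")` to pick up that context before its next model call. The fixed capacity bounds how much shared context is carried into each prompt.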


Models

Cloud Models (API key required)

| Provider | Models | Best for |
|---|---|---|
| Anthropic | Claude Opus, Sonnet, Haiku | Quality reasoning |
| OpenAI | GPT-4o, GPT-4, GPT-3.5 | General tasks |
| Amazon Bedrock | Claude + extended thinking | Deep analysis |

Local Models (free, offline)

| Model | Size | Notes |
|---|---|---|
| Qwen 2.5 3B ⭐ | ~2GB | Recommended for most users |
| Qwen 2.5 1.5B | ~1GB | Faster, less capable |
| Llama 3.2 1B | ~700MB | Smallest, for mobile |
| Hermes 8B | ~4GB | Best tool usage |

WebLLM requires Chrome/Edge 113+ with WebGPU
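A quick way to check this requirement before offering local models is to feature-detect WebGPU. The helper names below are hypothetical; `navigator.gpu` and `requestAdapter()` are part of the standard WebGPU API.

```javascript
// True when the environment exposes the WebGPU entry point at all.
// `nav` defaults to the global navigator, which may be missing outside browsers.
function hasWebGPU(nav = globalThis.navigator) {
  return Boolean(nav && "gpu" in nav);
}

// Stricter check: requestAdapter() resolves to null when no usable GPU is found.
async function canUseWebGPU(nav = globalThis.navigator) {
  if (!hasWebGPU(nav)) return false;
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}
```

If `canUseWebGPU()` resolves to `false`, falling back to a cloud provider is the likely path.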


Tools Reference

Core

  • render_ui: Dynamic HTML components in chat
  • javascript_eval: Execute JS, return results
  • storage_get/set: Persistent localStorage
  • fetch_url: HTTP requests
  • notify: Push notifications (works in background)

Self-Modification

  • create_tool: Define new tools at runtime
  • list_tools: See all available tools
  • delete_tool: Remove tools
  • update_self: Rewrite system prompt

Vision & Context

  • get_user_context: Activity state, mouse position, idle time
  • set_context: Add custom context
  • scan_bluetooth: Find nearby devices and agents

Maps

  • add_map_marker: Place markers with emoji/labels
  • fly_to_location: Smooth camera animations
  • tour_markers: Animated journey through points
  • get_map_location: Current GPS position

Google APIs

  • google_auth: OAuth 2.0 authentication
  • use_google: Access 200+ Google services
  • gmail_send: Send emails directly

Multi-Agent (agi.html only)

  • use_agent: Spawn sub-agents
  • invoke_agent: Call agent, wait for response
  • broadcast_to_agents: Message all agents
  • scheduler: Cron-based recurring tasks

Ambient Mode

Agent thinks while you're away.

| Mode | Trigger | Behavior |
|---|---|---|
| 🌙 Standard | 30s idle | Runs 3 iterations, then pauses |
| 🚀 Autonomous | Click button | Runs until [AMBIENT_DONE] or stopped |

How it works: You ask about quantum computing, go make coffee. Agent explores applications, recent breakthroughs, industry adoption. When you return, those findings auto-inject into your next message.
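The two modes can be sketched as one loop with different stop conditions. Everything here is illustrative: `runAmbient`, the `think` callback, and the injected message format are hypothetical, and the sketch assumes the `[AMBIENT_DONE]` marker ends the loop in either mode.

```javascript
const DONE_MARKER = "[AMBIENT_DONE]";

// `think` is a hypothetical async function that runs one exploration step
// and returns the model's text. Standard mode caps the loop at maxIterations;
// autonomous mode runs until the marker appears or shouldStop() returns true.
async function runAmbient(think, options = {}) {
  const { autonomous = false, maxIterations = 3, shouldStop = () => false } = options;
  const findings = [];
  for (let i = 0; autonomous || i < maxIterations; i++) {
    if (shouldStop()) break;
    const output = await think(findings);
    findings.push(output.replace(DONE_MARKER, "").trim());
    if (output.includes(DONE_MARKER)) break;
  }
  return findings;
}

// Prepend collected findings to the user's next message.
function injectFindings(userMessage, findings) {
  if (findings.length === 0) return userMessage;
  return `Background findings:\n- ${findings.join("\n- ")}\n\n${userMessage}`;
}
```

In autonomous mode, wiring `shouldStop` to the stop button gives the user a way to end the loop if the model never emits the marker.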


Privacy & Security

Your data never leaves your browser (except queries to your chosen AI provider).

| What | Where it's stored |
|---|---|
| API Keys | localStorage (never transmitted) |
| Conversations | localStorage |
| Custom Tools | localStorage |
| Settings Sync | AES-256-GCM encrypted |

With WebLLM: Zero external calls. Everything runs on your GPU.


Install as PWA

| Platform | Steps |
|---|---|
| iOS | Safari → Share → Add to Home Screen |
| Android | Chrome → Menu → Install app |
| Desktop | Click install icon in URL bar |

Features: Home screen icon, background notifications, offline support, settings sync between devices.

Sync Settings

  1. Settings → Sync → Enter password
  2. Copy encrypted URL
  3. On other device: paste URL, enter password
  4. All settings transfer securely
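Password-based AES-256-GCM encryption of this kind can be done with the Web Crypto API. The sketch below is an assumption about how the steps above might be implemented, not the project's code: the key derivation (PBKDF2 with SHA-256, 100,000 iterations), the function names, and the payload shape are all chosen for illustration, and encoding the result into a URL is left out.

```javascript
// Browsers expose Web Crypto as globalThis.crypto; the require() fallback
// only matters for older Node.js versions without that global.
const webCrypto = globalThis.crypto ?? require("node:crypto").webcrypto;

// Derive a 256-bit AES-GCM key from a password (PBKDF2 parameters are assumptions).
async function deriveKey(password, salt) {
  const material = await webCrypto.subtle.importKey(
    "raw", new TextEncoder().encode(password), "PBKDF2", false, ["deriveKey"]
  );
  return webCrypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: 100000, hash: "SHA-256" },
    material,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"]
  );
}

// Encrypt a settings object; salt and IV are random and travel with the ciphertext.
async function encryptSettings(settings, password) {
  const salt = webCrypto.getRandomValues(new Uint8Array(16));
  const iv = webCrypto.getRandomValues(new Uint8Array(12));
  const key = await deriveKey(password, salt);
  const plaintext = new TextEncoder().encode(JSON.stringify(settings));
  const encrypted = await webCrypto.subtle.encrypt({ name: "AES-GCM", iv }, key, plaintext);
  return { salt, iv, ciphertext: new Uint8Array(encrypted) };
}

// Decrypt on the other device; a wrong password makes decrypt() reject.
async function decryptSettings({ salt, iv, ciphertext }, password) {
  const key = await deriveKey(password, salt);
  const plaintext = await webCrypto.subtle.decrypt({ name: "AES-GCM", iv }, key, ciphertext);
  return JSON.parse(new TextDecoder().decode(plaintext));
}
```

AES-GCM authenticates as well as encrypts, so a tampered payload or a wrong password fails loudly instead of yielding corrupted settings.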

Configuration

Get API Keys

Extended Thinking (Bedrock)

Paste in Settings → API → Additional Request Fields:

{
  "thinking": { "type": "enabled", "budget_tokens": 10000 }
}

Google OAuth

  1. Cloud Console → Create OAuth Client
  2. Add authorized origin: https://agi.diy
  3. Settings → Google → Paste Client ID

URL Shortcuts

https://agi.diy/?q=what+time+is+it

Great for iOS Shortcuts: one tap to query.
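To generate such links from a script, the standard `URL` API handles the encoding. The helper names are hypothetical; only the `q` parameter comes from the example above.

```javascript
// Build a shortcut link for a question; spaces are encoded as "+".
function buildQueryUrl(question, base = "https://agi.diy/") {
  const url = new URL(base);
  url.searchParams.set("q", question);
  return url.toString();
}

// Read the question back out of a link.
function readQuery(href) {
  return new URL(href).searchParams.get("q");
}

console.log(buildQueryUrl("what time is it"));
// prints "https://agi.diy/?q=what+time+is+it"
```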


Console API

agi.agent              // Agent instance
agi.clear()            // Clear conversation
agi.tools.list()       // List custom tools
agi.tools.delete(name) // Remove tool

agiContext.getContext()      // All context data
agiContext.scanBluetooth()   // Find nearby devices

Troubleshooting

Issue Solution
No response Settings β†’ Check API key
WebLLM won't load Use Chrome/Edge 113+
Model download stuck Refresh page
Screen capture denied Allow browser permission
No notifications Enable in browser settings

Project Structure

docs/
├── index.html      # Single agent mode
├── agi.html        # Multi-agent mode
├── strands.js      # Strands SDK bundle
├── vision.js       # Screen capture, ambient mode
├── webllm.js       # Local model inference
├── map.js          # Google Maps integration
├── tools/google.js # Google API tools
├── sw.js           # Service worker (PWA)
└── manifest.json   # PWA config

Contributing

PRs welcome for:

  • ✅ New tools
  • ✅ Model providers
  • ✅ Bug fixes
  • ❌ Build systems
  • ❌ Framework dependencies

The project is intentionally minimal: single HTML files, no build step, fully auditable.


License

Apache 2.0


Built with Strands Agents SDK
agi.diy