XD3an/PocketFlow
Pocket Flow – 100-line minimalist LLM framework

Note: This is a fork of the original PocketFlow by Zachary Huang and The Pocket. All credit for the core framework goes to them.

This fork adds extra utilities, including a Universal LLM API Client for easy access to 8+ LLM providers.

English | 中文 | Español | 日本語 | Deutsch | Русский | Português | Français | 한국어

License: MIT | Docs

Pocket Flow is a 100-line minimalist LLM framework

  • Lightweight: Just 100 lines. Zero bloat, zero dependencies, zero vendor lock-in.

  • Expressive: Everything you love—(Multi-)Agents, Workflow, RAG, and more.

  • Agentic Coding: Let AI Agents (e.g., Cursor AI) build Agents—10x productivity boost!

Get started with Pocket Flow:

Why Pocket Flow?

Current LLM frameworks are bloated... You only need 100 lines for an LLM framework!

| Framework | Abstraction | App-Specific Wrappers | Vendor-Specific Wrappers | Lines | Size |
|---|---|---|---|---|---|
| LangChain | Agent, Chain | Many (e.g., QA, Summarization) | Many (e.g., OpenAI, Pinecone, etc.) | 405K | +166MB |
| CrewAI | Agent, Chain | Many (e.g., FileReadTool, SerperDevTool) | Many (e.g., OpenAI, Anthropic, Pinecone, etc.) | 18K | +173MB |
| SmolAgent | Agent | Some (e.g., CodeAgent, VisitWebTool) | Some (e.g., DuckDuckGo, Hugging Face, etc.) | 8K | +198MB |
| LangGraph | Agent, Graph | Some (e.g., Semantic Search) | Some (e.g., PostgresStore, SqliteSaver, etc.) | 37K | +51MB |
| AutoGen | Agent | Some (e.g., Tool Agent, Chat Agent) | Many [Optional] (e.g., OpenAI, Pinecone, etc.) | 7K (core-only) | +26MB (core-only) |
| PocketFlow | Graph | None | None | 100 | +56KB |

How does Pocket Flow work?

The 100 lines capture the core abstraction of LLM frameworks: Graph!


From there, it's easy to implement popular design patterns like (Multi-)Agents, Workflow, RAG, etc.
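To make the graph idea concrete, here is a minimal sketch in plain Python. This is an illustration of the abstraction, not the actual PocketFlow source: each node does its work against a shared store, then returns an action name that selects the next node in the graph.

```python
# Illustrative sketch of a graph-of-nodes abstraction (NOT the real
# PocketFlow code): nodes run against a shared dict and return an
# action string that picks the successor node.

class Node:
    def __init__(self):
        self.successors = {}

    def next(self, node, action="default"):
        """Wire this node to a successor for the given action."""
        self.successors[action] = node
        return node

    def run(self, shared):
        raise NotImplementedError


class Flow:
    def __init__(self, start):
        self.start = start

    def run(self, shared):
        node = self.start
        while node is not None:
            action = node.run(shared) or "default"
            node = node.successors.get(action)
        return shared


# A two-step "workflow" pattern built from the abstraction above.
class Outline(Node):
    def run(self, shared):
        shared["outline"] = ["intro", "body"]

class Write(Node):
    def run(self, shared):
        shared["draft"] = " / ".join(shared["outline"])

outline, write = Outline(), Write()
outline.next(write)
result = Flow(outline).run({})
```

Branching (agents), looping (retries), and fan-out (batch) all fall out of the same "node returns an action, flow follows the edge" loop.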


✨ Below are basic tutorials:
| Name | Difficulty | Description |
|---|---|---|
| Chat | ☆☆☆ Dummy | A basic chat bot with conversation history |
| Structured Output | ☆☆☆ Dummy | Extracting structured data from resumes by prompting |
| Workflow | ☆☆☆ Dummy | A writing workflow that outlines, writes content, and applies styling |
| Agent | ☆☆☆ Dummy | A research agent that can search the web and answer questions |
| RAG | ☆☆☆ Dummy | A simple Retrieval-Augmented Generation process |
| Batch | ☆☆☆ Dummy | A batch processor that translates markdown into multiple languages |
| Streaming | ☆☆☆ Dummy | A real-time LLM streaming demo with user interrupt capability |
| Chat Guardrail | ☆☆☆ Dummy | A travel advisor chatbot that only processes travel-related queries |
| Majority Vote | ☆☆☆ Dummy | Improve reasoning accuracy by aggregating multiple solution attempts |
| Map-Reduce | ☆☆☆ Dummy | Batch resume qualification using the map-reduce pattern |
| CLI HITL | ☆☆☆ Dummy | A command-line joke generator with human-in-the-loop feedback |
| Multi-Agent | ★☆☆ Beginner | A Taboo word game with async communication between 2 agents |
| Supervisor | ★☆☆ Beginner | The research agent is getting unreliable... Let's build a supervision process |
| Parallel | ★☆☆ Beginner | A parallel execution demo that shows a 3x speedup |
| Parallel Flow | ★☆☆ Beginner | A parallel image-processing demo showing an 8x speedup |
| Thinking | ★☆☆ Beginner | Solve complex reasoning problems through Chain-of-Thought |
| Memory | ★☆☆ Beginner | A chat bot with short-term and long-term memory |
| Text2SQL | ★☆☆ Beginner | Convert natural language to SQL queries with an auto-debug loop |
| Code Generator | ★☆☆ Beginner | Generate test cases, implement solutions, and iteratively improve code |
| MCP | ★☆☆ Beginner | An agent using the Model Context Protocol for numerical operations |
| A2A | ★☆☆ Beginner | An agent wrapped with the A2A protocol for inter-agent communication |
| Streamlit FSM | ★☆☆ Beginner | A Streamlit app with a finite state machine for HITL image generation |
| FastAPI WebSocket | ★☆☆ Beginner | A real-time chat interface with streaming LLM responses via WebSocket |
| FastAPI Background | ★☆☆ Beginner | A FastAPI app with background jobs and real-time progress via SSE |
| Voice Chat | ★☆☆ Beginner | An interactive voice chat application with VAD, STT, LLM, and TTS |

👀 Want to see other tutorials for dummies? Create an issue!

How to Use Pocket Flow?

🚀 Through Agentic Coding, the fastest LLM app development paradigm, where humans design and agents code!



✨ Below are examples of more complex LLM Apps:

| App Name | Difficulty | Topics | Human Design | Agent Code |
|---|---|---|---|---|
| Website Chatbot: Turn your website into a 24/7 customer support genius | ★★☆ Medium | Agent, RAG | Design Doc | Flow Code |
| Danganronpa Simulator: Forget the Turing test. Danganronpa, the ultimate AI experiment! | ★★★ Advanced | Workflow, Agent | Design Doc | Flow Code |
| Codebase Knowledge Builder: Life's too short to stare at others' code in confusion | ★★☆ Medium | Workflow | Design Doc | Flow Code |
| Build Cursor with Cursor: We'll reach the singularity soon ... | ★★★ Advanced | Agent | Design Doc | Flow Code |
| Ask AI Paul Graham: Ask AI Paul Graham, in case you don't get in | ★★☆ Medium | RAG, Map Reduce, TTS | Design Doc | Flow Code |
| Youtube Summarizer: Explain YouTube Videos to you like you're 5 | ★☆☆ Beginner | Map Reduce | Design Doc | Flow Code |
| Cold Opener Generator: Instant icebreakers that turn cold leads hot | ★☆☆ Beginner | Map Reduce, Web Search | Design Doc | Flow Code |
  • Want to learn Agentic Coding?

    • Check out my YouTube channel for video tutorials on how some of the apps above are made!

    • Want to build your own LLM App? Read this post! Start with this template!


🆕 Additional Feature: Universal LLM API Client

This fork includes an enhanced utils/llm_api.py module that provides a unified interface for multiple LLM providers:

Features

  • Multiple Providers: OpenAI, Azure OpenAI, Anthropic, Google Gemini, DeepSeek, SiliconFlow, Local models, and Ollama
  • Image Support: Attach images to prompts across compatible providers
  • Easy Configuration: Simple environment variable setup with .env support
  • Consistent Interface: Same API regardless of the underlying provider

Quick Start

```python
from utils.llm_api import LLMClient

# Simple query with any provider
llm = LLMClient(provider="openai")
response = llm.query("What's the meaning of life?")

# Query with an image (supported providers only)
llm = LLMClient(provider="openai")
response = llm.query("Describe this image", image_path="photo.jpg")

# Backward compatibility with existing code
llm = LLMClient(provider="ollama")
messages = [{"role": "user", "content": "Hello!"}]
response = llm.call(messages=messages)
```

Supported Providers

| Provider | Model | Notes |
|---|---|---|
| openai | GPT-4o | Supports images |
| anthropic | Claude 3.5 Sonnet | Supports images |
| gemini | Gemini 2.0 Flash | Supports images |
| azure | Azure OpenAI | Enterprise solution |
| deepseek | DeepSeek Chat | Cost-effective |
| siliconflow | DeepSeek R1 | High performance |
| local | Self-hosted | Custom endpoint |
| ollama | Local models | Privacy-first |
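A "consistent interface" over this many providers usually comes down to a dispatch table: one adapter per provider name, one public `query` method. The sketch below illustrates that pattern only; it is not the actual `utils/llm_api.py` code, and the adapter functions are stand-ins rather than real API calls.

```python
# Illustrative provider-dispatch pattern, not the actual utils/llm_api.py.
# Each provider gets a small adapter; the client exposes one query() method.

def _fake_openai(prompt):
    # Stand-in for a real OpenAI API call.
    return f"[openai] {prompt}"

def _fake_anthropic(prompt):
    # Stand-in for a real Anthropic API call.
    return f"[anthropic] {prompt}"

PROVIDERS = {
    "openai": _fake_openai,
    "anthropic": _fake_anthropic,
}

class UnifiedClient:
    def __init__(self, provider):
        if provider not in PROVIDERS:
            raise ValueError(f"Unknown provider: {provider}")
        self._call = PROVIDERS[provider]

    def query(self, prompt):
        # Same interface regardless of the underlying provider.
        return self._call(prompt)

resp = UnifiedClient(provider="anthropic").query("Hello!")
```

Swapping providers then only means changing the `provider` string, which is the property the real client advertises.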

Environment Setup

Create a .env file with your API keys:

```
# OpenAI
OPENAI_API_KEY=your_openai_api_key

# Anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key

# Google Gemini
GOOGLE_API_KEY=your_google_api_key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_api_key
AZURE_OPENAI_MODEL_DEPLOYMENT=gpt-4o-ms

# DeepSeek
DEEPSEEK_API_KEY=your_deepseek_api_key

# SiliconFlow
SILICONFLOW_API_KEY=your_siliconflow_api_key
```
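Python does not read `.env` files on its own; the ".env support" mentioned above is typically provided by a helper such as python-dotenv that copies the entries into the process environment. As a hedged illustration of what that loading step does, here is a minimal stand-in parser:

```python
import os

def load_env(text):
    """Parse simple KEY=value lines into os.environ, skipping blanks
    and # comments. A minimal stand-in for python-dotenv, for
    illustration only."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        os.environ[key.strip()] = value.strip()

load_env("# OpenAI\nOPENAI_API_KEY=your_openai_api_key\n")
```

In practice you would call the real loader once at startup, before constructing the client, so the provider code can find its key in the environment.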

This makes it incredibly easy to switch between different LLM providers in your PocketFlow applications!

About

Pocket Flow: 100-line LLM framework. Let Agents build Agents!