Convert text2sql example to German Pokemon MySQL database
- Updated from SQLite ecommerce DB to MySQL Pokemon database
- Added German Pokemon database setup instructions with Docker
- Created comprehensive German Pokemon sample queries
- Added complex queries demonstrating evolution chains, type effectiveness, statistical analysis
- Updated README.md with proper setup guide and advanced query examples
- Added .env.example template for configuration
- Cleaned up debugging artifacts while preserving functional debugging workflow
- Updated all code comments and documentation for MySQL implementation
-A PocketFlow example demonstrating a text-to-SQL workflow that converts natural language questions into executable SQL queries for an SQLite database, including an LLM-powered debugging loop for failed queries.
+A PocketFlow example demonstrating a text-to-SQL workflow that converts natural language questions into executable SQL queries for a MySQL Pokemon database, including an LLM-powered debugging loop for failed queries.

 - Check out the [Substack Post Tutorial](https://zacharyhuang.substack.com/p/text-to-sql-from-scratch-tutorial) for more!

 ## Features

 - **Schema Awareness**: Automatically retrieves the database schema to provide context to the LLM.
-- **LLM-Powered SQL Generation**: Uses an LLM (GPT-4o) to translate natural language questions into SQLite queries (using YAML structured output).
+- **LLM-Powered SQL Generation**: Uses an LLM (GPT-4o) to translate natural language questions into MySQL queries (using YAML structured output).
 - **Automated Debugging Loop**: If SQL execution fails, an LLM attempts to correct the query based on the error message. This process repeats up to a configurable number of times.
+- **German Pokemon Database**: Works with a comprehensive German Pokemon database containing 898 Pokemon, types, moves, and relationships.
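
The "YAML structured output" mentioned in the feature list could be consumed roughly as follows. This is a minimal, stdlib-only sketch, not the example's actual parser: it assumes the LLM replies with a fenced `yaml` block holding the query under a `sql: |` key (the key name, the helper `extract_sql`, and the `name` column are all hypothetical). Real code would more likely hand the block to a YAML library.

```python
import textwrap

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def extract_sql(llm_reply: str) -> str:
    """Return the SQL stored under an assumed `sql: |` key in a fenced yaml block."""
    opener = FENCE + "yaml"
    if opener not in llm_reply:
        raise ValueError("no fenced yaml block in LLM reply")
    # Keep only the text between the opening fence and the next closing fence.
    body = llm_reply.split(opener, 1)[1].split(FENCE, 1)[0]
    lines = body.splitlines()
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == "sql: |")
    except StopIteration:
        raise ValueError("no `sql: |` key in yaml block") from None
    block = []
    for line in lines[start + 1:]:
        # The block scalar ends at the first non-blank line that is not indented.
        if line.strip() and not line.startswith((" ", "\t")):
            break
        block.append(line)
    sql = textwrap.dedent("\n".join(block)).strip().rstrip(";")
    if not sql:
        raise ValueError("empty SQL in LLM reply")
    return sql


# A made-up reply in the assumed format.
reply = (
    "Here is the query:\n"
    + FENCE + "yaml\n"
    + "sql: |\n"
    + "  SELECT name\n"
    + "  FROM pokemon\n"
    + "  LIMIT 3;\n"
    + FENCE
)
print(extract_sql(reply))
```

Raising `ValueError` on a malformed reply keeps parsing failures separate from database errors, which is one way to decide whether a retry should regenerate the query or debug it.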
 ## Getting Started

-1. **Install Packages:**
+### Prerequisites
+
+1. **Pokemon Database Setup:**
+
+Set up the German Pokemon database from the [matse-spicker-db](https://github.com/pblan/matse-spicker-db/) repository:
+- MySQL database on port 3308 with the Pokemon data
+- phpMyAdmin interface on port 8080 for database management
+
+2. **Install Packages:**
 ```bash
 pip install -r requirements.txt
 ```
+*or using uv:*
+```bash
+uv sync
+```

-2. **Set API Key:**
-Set the environment variable for your OpenAI API key.
+3. **Set Environment Variables:**
+
+Copy the example environment file and configure your settings:
 ```bash
-export OPENAI_API_KEY="your-api-key-here"
+cp .env.example .env
+```
+
+Edit the `.env` file with your OpenAI API key:
+```
+OPENAI_API_KEY=your-openai-api-key-here
+```
+
+The MySQL database configuration should match the Docker setup:
+```
+MYSQL_HOST=localhost
+MYSQL_PORT=3308
+MYSQL_USER=root
+MYSQL_PASSWORD=root
+MYSQL_DB=db_pokemon
 ```
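
As a rough illustration of how the `MYSQL_*` settings above might be read at runtime, here is a small sketch. It assumes the variables are already present in the process environment (for example exported in the shell, or loaded from `.env` by a loader such as python-dotenv). The helper name `load_mysql_config` and the dictionary keys are hypothetical; the keys resemble the keyword arguments common MySQL drivers accept, but the diff does not show which driver the example uses.

```python
import os


def load_mysql_config(env=None):
    """Collect MySQL settings, falling back to the values shown in the README."""
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        # Environment values are strings, so the port is converted explicitly.
        "port": int(env.get("MYSQL_PORT", "3308")),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", "root"),
        "database": env.get("MYSQL_DB", "db_pokemon"),
    }


config = load_mysql_config()
# Print only non-secret fields.
print(config["host"], config["port"], config["database"])
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to check without touching the real environment.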
-*(Replace `"your-api-key-here"` with your actual key)*

-3. **Verify API Key (Optional):**
+4. **Verify API Key (Optional):**
 Run a quick check using the utility script. If successful, it will print a short joke.
 ```bash
-python utils.py
+python utils/call_llm.py
+```
+*or using uv:*
+```bash
+uv run python utils/call_llm.py
 ```
 *(Note: This requires a valid API key to be set.)*

-4. **Run Default Example:**
-Execute the main script. This will create the sample `ecommerce.db` if it doesn't exist and run the workflow with a default query.
+5. **Run Default Example:**
+Execute the main script with a sample query:
 ```bash
-python main.py
+python main.py "Show me 5 Feuer type Pokemon"
 ```
-The default query is:
-> Show me the names and email addresses of customers from New York
-
-5. **Run Custom Query:**
-Provide your own natural language query as command-line arguments after the script name.
+*or using uv:*
 ```bash
-python main.py What is the total stock quantity for products in the 'Accessories' category?
+uv run python main.py "Show me 5 Feuer type Pokemon"
 ```
-Or, for queries with spaces, ensure they are treated as a single argument by the shell if necessary (quotes might help depending on your shell):
+
+6. **Run Custom Queries:**
+Try different queries in German or English:
 ```bash
-python main.py "List orders placed in the last 30 days with status 'shipped'"
+python main.py "Show me 3 Pokemon names"
+python main.py "Find all Pokemon with Wasser type"
+python main.py "List Pokemon from Generation 1"
 ```
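
The commands above pass the question to `main.py` on the command line. One plausible way to turn those arguments into a single query string is sketched below; the function name `query_from_argv` and the fallback query are assumptions for illustration, and the real `main.py` may handle this differently.

```python
import sys

# Assumed fallback, borrowed from the README's sample command.
DEFAULT_QUERY = "Show me 5 Feuer type Pokemon"


def query_from_argv(argv):
    """Join everything after the script name into one natural language query."""
    words = argv[1:]
    # Works for a single quoted argument and for several unquoted words alike.
    return " ".join(words) if words else DEFAULT_QUERY


if __name__ == "__main__":
    print(query_from_argv(sys.argv))
```

Joining the arguments means a quoted and an unquoted invocation yield the same string, although quoting is still the safer choice when the question contains characters the shell would interpret.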

 ## How It Works
@@ -57,7 +95,7 @@ graph LR
 A[Get Schema] --> B[Generate SQL]
 B --> C[Execute SQL]
 C -- Success --> E[End]
-C -- SQLite Error --> D{Debug SQL Attempt}
+C -- MySQL Error --> D{Debug SQL Attempt}
 D -- Corrected SQL --> C
 C -- Max Retries Reached --> F[End with Error]
@@ -68,11 +106,11 @@ graph LR

 **Node Descriptions:**

-1. **`GetSchema`**: Connects to the SQLite database (`ecommerce.db` by default) and extracts the schema (table names and columns).
-2. **`GenerateSQL`**: Takes the natural language query and the database schema, prompts the LLM to generate an SQLite query (expecting YAML output with the SQL), and parses the result.
-3. **`ExecuteSQL`**: Attempts to run the generated SQL against the database.
+1. **`GetSchema`**: Connects to the MySQL Pokemon database and extracts the schema (table names and columns) including tables like `pokemon`, `typ`, `attacke`, etc.
+2. **`GenerateSQL`**: Takes the natural language query and the database schema, prompts the LLM to generate a MySQL query (expecting YAML output with the SQL), and parses the result.
+3. **`ExecuteSQL`**: Attempts to run the generated SQL against the Pokemon database.
     * If successful, the results are stored, and the flow ends successfully.
-    * If an `sqlite3.Error` occurs (e.g., syntax error), it captures the error message and triggers the debug loop.
+    * If a MySQL error occurs (e.g., syntax error), it captures the error message and triggers the debug loop.
 4. **`DebugSQL`**: If `ExecuteSQL` failed, this node takes the original query, schema, failed SQL, and error message, prompts the LLM to generate a *corrected* SQL query (again, expecting YAML).
 5. **(Loop)**: The corrected SQL from `DebugSQL` is passed back to `ExecuteSQL` for another attempt.
 6. **(End Conditions)**: The loop continues until `ExecuteSQL` succeeds or the maximum number of debug attempts (default: 3) is reached.
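
The execute/debug cycle described in the node list can be sketched in plain Python, independent of PocketFlow. Everything here is illustrative: `run_with_debug_loop`, `fake_execute`, and `fake_fix` are hypothetical stand-ins for the `ExecuteSQL` node, the database, and the LLM-backed `DebugSQL` node, and the sketch assumes that "3 debug attempts" means one initial run plus up to three corrected runs.

```python
from typing import Any, Callable


def run_with_debug_loop(
    sql: str,
    execute: Callable[[str], Any],
    fix: Callable[[str, str], str],
    max_debug_attempts: int = 3,
) -> Any:
    """Run `sql`; on failure ask `fix` for a corrected query and try again."""
    debug_attempts = 0
    while True:
        try:
            return execute(sql)
        except Exception as err:  # a real node would catch the driver's error type
            if debug_attempts >= max_debug_attempts:
                raise RuntimeError(
                    f"still failing after {debug_attempts} debug attempts: {err}"
                ) from err
            debug_attempts += 1
            # Hand the failed SQL and the error message to the "debugger".
            sql = fix(sql, str(err))


# Stand-ins for the database and the LLM, with made-up rows.
def fake_execute(sql: str):
    if "FORM" in sql:
        raise ValueError("syntax error near 'FORM'")
    return [("Glumanda",), ("Glutexo",)]


def fake_fix(sql: str, error: str) -> str:
    return sql.replace("FORM", "FROM")


rows = run_with_debug_loop("SELECT name FORM pokemon LIMIT 2", fake_execute, fake_fix)
print(rows)
```

Injecting `execute` and `fix` as callables keeps the loop testable without a database or an API key.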
@@ -82,81 +120,145 @@ graph LR
 - [`main.py`](./main.py): Main entry point to run the workflow. Handles command-line arguments for the query.
 - [`flow.py`](./flow.py): Defines the PocketFlow `Flow` connecting the different nodes, including the debug loop logic.
 - [`nodes.py`](./nodes.py): Contains the `Node` classes for each step (`GetSchema`, `GenerateSQL`, `ExecuteSQL`, `DebugSQL`).
-- [`utils.py`](./utils.py): Contains the minimal `call_llm` utility function.
-- [`populate_db.py`](./populate_db.py): Script to create and populate the sample `ecommerce.db` SQLite database.
+- [`utils/call_llm.py`](./utils/call_llm.py): Contains the `call_llm` utility function for OpenAI API interactions.
+- [`populate_db.py`](./populate_db.py): Utility script with MySQL connection helper function.