Excel to SQLite REPL

A Python-based interactive shell that allows you to load Excel files into a SQLite database and query them using SQL with a beautiful, colorful interface. Features persistent caching, auto-loading, and both interactive and non-interactive modes.

Features

High Performance: Optimized Python implementation using calamine (Rust-based) for blazing fast Excel loading (3.3x faster than standard loaders).
Dual Backend: Support for both DuckDB (default, high performance) and SQLite (compatibility).
Load Excel Files: Load a single .xlsx file or an entire directory of files in parallel.
Automatic Sanitization:
- Sheet names are converted to valid SQL table names (e.g., "Sales Data (2024)" -> Sales_Data__2024_).
- Column headers are sanitized to be valid SQL identifiers (e.g., "Salary ($)" -> Salary____).
SQL Querying: Full SQL support via DuckDB/SQLite. You can JOIN tables derived from different Excel files.
Large Dataset Support: Handles large Excel files (tested with 1M+ rows) efficiently.
Rich UI: Colorful, formatted output using the rich library.
Multi-line Queries: Write SQL queries across multiple lines. Press Enter to continue, end with ; to execute.
Smart History: Multi-line queries are stored as a single history entry (use Up arrow to recall).
Detailed Metadata: The tables command shows row count, column count, and all column names with types.

Supported SQL Features

Since this tool uses the standard SQLite engine, all standard SQLite SQL syntax is supported. This includes but is not limited to:

Standard Clauses: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.
Joins: INNER JOIN, LEFT JOIN, CROSS JOIN (Right and Full Outer joins are not natively supported by SQLite, but can be emulated).
Aggregations: COUNT(), SUM(), AVG(), MIN(), MAX().
Subqueries & CTEs: Common Table Expressions (WITH ... AS) and nested queries.
Built-in Functions:
- String: UPPER(), LOWER(), SUBSTR(), TRIM(), REPLACE().
- Date/Time: DATE(), TIME(), DATETIME(), STRFTIME().
- Math: ABS(), ROUND(), RANDOM().
Set Operations: UNION, UNION ALL, INTERSECT, EXCEPT.

Note: The database is persistent (stored in ~/.sql_excel_data.db). Changes made with INSERT, UPDATE, or DELETE will persist across sessions. Use the refresh command to reload from Excel files.

Installation

This project uses uv for fast dependency management.

Install uv (if not already installed):
```
pip install uv
```

Set up the environment:

uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
# Installs: pandas, openpyxl, rich, prompt_toolkit, fastmcp, duckdb, python-calamine, sqlparse

Usage

Quick Start

Generate Test Data (optional):
```
uv run python create_test_data.py
```
Auto-load and start REPL (recommended):
```
uv run python main.py test_data
```
This loads the data folder on first run and caches it. Subsequent runs use the cache automatically!

Interactive Mode

Start the REPL with auto-loading:

uv run python main.py test_data

Or start empty and load manually:

uv run python main.py

Then inside the REPL:

(sql-excel) load test_data

View loaded tables:

(sql-excel) tables

Refresh data from Excel files:

(sql-excel) refresh

This clears the cache and reloads from the original Excel files.

Non-Interactive Mode

Execute queries directly without entering the REPL:

# Query using cached data (defaults to test_data folder)
uv run python main.py --query "SELECT * FROM users_Sheet1"

# Short form
uv run python main.py -q "SELECT name, email FROM users_Sheet1 WHERE id = 1"

# Specify a different source folder
uv run python main.py --source my_data -q "SELECT * FROM my_table"

# Complex query with joins (uses default test_data)
uv run python main.py -q "SELECT u.name, o.product_name FROM users_Sheet1 u JOIN orders_Sheet1 o ON u.id = o.user_id"

Query Examples

Simple Select:

SELECT * FROM users_Sheet1;

Multi-line Join Query:

SELECT u.name, o.product_name, o.amount 
FROM users_Sheet1 u 
JOIN orders_Sheet1 o ON u.id = o.user_id;

Note: Press Enter to add new lines. End with ; and press Enter to execute.

Querying with Sanitized Names:

SELECT Full_Name, Salary____ 
FROM complex_data_Employee_Records 
WHERE Salary____ > 60000;

Aggregation with Group By:

SELECT u.name, SUM(o.amount) as total_spent 
FROM users_Sheet1 u 
JOIN orders_Sheet1 o ON u.id = o.user_id 
GROUP BY u.name 
ORDER BY total_spent DESC;

Available Commands

load <path> - Load Excel file(s) from a path
tables - List all tables with metadata (rows, columns, types)
schema <table> - Show CREATE TABLE statement
refresh - Clear cache and reload from Excel files
exit / quit - Exit the REPL
<sql query> - Execute any SQL query (end with ;)

Command-Line Arguments

python main.py [data_folder] [--query QUERY] [--source SOURCE] [--db DB_PATH]

data_folder - Optional positional argument for data folder path
--query, -q - Execute a query and exit (non-interactive)
--source, -s - Default data source folder (default: test_data)
--db - Custom database path (default: ~/.sql_excel_data.db)

Note: The positional data_folder takes priority over --source. If neither is provided, it defaults to test_data.

Additional Resources

See TEST_QUERIES.md for more example queries you can copy and paste.
Use the help command inside the REPL to see all available commands.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
excel_loader_rs		excel_loader_rs
test_data		test_data
tests		tests
.gitignore		.gitignore
DEVELOPMENT_SUMMARY.md		DEVELOPMENT_SUMMARY.md
KEYBOARD_CONTROLS.md		KEYBOARD_CONTROLS.md
MCP_SETUP.md		MCP_SETUP.md
README.md		README.md
TEST_QUERIES.md		TEST_QUERIES.md
api_server.py		api_server.py
conftest.py		conftest.py
create_test_data.py		create_test_data.py
loader.py		loader.py
main.py		main.py
mcp_config.json		mcp_config.json
mcp_rest_client.py		mcp_rest_client.py
mcp_server.py		mcp_server.py
requirements.txt		requirements.txt
show_schemas.py		show_schemas.py
test_ctrlc.py		test_ctrlc.py
test_speedup_v3.duckdb		test_speedup_v3.duckdb
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Excel to SQLite REPL

Features

Supported SQL Features

Installation

Usage

Quick Start

Interactive Mode

Non-Interactive Mode

Query Examples

Available Commands

Command-Line Arguments

Additional Resources

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Excel to SQLite REPL

Features

Supported SQL Features

Installation

Usage

Quick Start

Interactive Mode

Non-Interactive Mode

Query Examples

Available Commands

Command-Line Arguments

Additional Resources

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages