Orso is not intended to compete with Polars or Pandas (or your favorite bear DataFrame technology); instead, it is developed as a common layer for Mabel and Opteryx.
Key Use Cases:
- In Opteryx, Orso provides most of the database Cursor functionality
- In Mabel, Orso provides the data schema and validation functionality
 
Orso DataFrames are row-based, reflecting their original use cases as the write-ahead log (WAL) for Mabel and the Cursor for Opteryx. Each row in an Orso DataFrame can be quickly converted to a tuple of values, a dictionary, or a byte representation.
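To illustrate what a row-based layout makes cheap, here is a minimal sketch in plain Python. It does not use Orso and is not how Orso implements its rows: the `COLUMNS` tuple, the helper names, and the JSON byte encoding are all assumptions chosen for illustration.

```python
import json

# Hypothetical column names shared by every row (an assumption for this
# sketch, not Orso's internal layout).
COLUMNS = ("name", "age", "city")


def row_to_dict(row):
    """Pair each value in the row with its column name."""
    return dict(zip(COLUMNS, row))


def row_to_bytes(row):
    """Serialize a row to bytes; JSON is used here purely for illustration."""
    return json.dumps(row).encode("utf-8")


def row_from_bytes(data):
    """Rebuild the tuple form of a row from its byte representation."""
    return tuple(json.loads(data.decode("utf-8")))


row = ("Alice", 30, "New York")   # a row is just a tuple of values
as_dict = row_to_dict(row)        # dictionary view of the same row
as_bytes = row_to_bytes(row)      # byte representation
restored = row_from_bytes(as_bytes)
```

Because each row is self-contained, every conversion touches only that row's values, which is the property that suits log-style and cursor-style access.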
Install Orso from PyPI:
pip install orso
import orso
# Create from list of dictionaries
df = orso.DataFrame([
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'San Francisco'},
    {'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
])
print(f"Created DataFrame with {df.rowcount} rows and {df.columncount} columns")
# Display the DataFrame
print(df.display())
# Convert to different formats
arrow_table = df.arrow()  # PyArrow Table
pandas_df = df.pandas()   # Pandas DataFrame
# Access column names
print("Columns:", df.column_names)
# Access schema information  
print("Schema:", df.schema)
# From PyArrow
import pyarrow as pa
arrow_table = pa.table({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
orso_df = orso.DataFrame.from_arrow(arrow_table)
# To Pandas
pandas_df = orso_df.pandas()
- Lightweight: Minimal overhead for tabular data operations
- Row-based: Optimized for row-oriented operations
- Interoperable: Easy conversion to/from PyArrow, Pandas
- Schema-aware: Built-in data validation and type checking
- Fast serialization: Efficient conversion to bytes, tuples, and dictionaries
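
As a rough picture of what schema-aware validation involves, the following plain-Python sketch checks a record against a mapping of column names to expected types. It is an illustration under stated assumptions, not Orso's schema API: the `SCHEMA` mapping and the `validate` helper are hypothetical names.

```python
# Hypothetical schema: column name -> expected Python type.
# This mapping is an assumption for illustration, not Orso's schema format.
SCHEMA = {"name": str, "age": int, "city": str}


def validate(record, schema):
    """Return a list of problems found in one record (empty means valid)."""
    problems = []
    # Check every column the schema expects.
    for column, expected in schema.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected):
            actual = type(record[column]).__name__
            problems.append(f"{column}: expected {expected.__name__}, got {actual}")
    # Flag columns the schema does not know about.
    for column in record:
        if column not in schema:
            problems.append(f"unexpected column: {column}")
    return problems


good = validate({"name": "Alice", "age": 30, "city": "New York"}, SCHEMA)
bad = validate({"name": "Bob", "age": "25"}, SCHEMA)
```

Collecting every problem instead of stopping at the first one makes it easier to report all issues with a record in a single pass.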
 
The main DataFrame class provides the following key methods:
- DataFrame(dictionaries=None, *, rows=None, schema=None) - Constructor
- display(limit=5, colorize=True, show_types=True) - Pretty print the DataFrame
- arrow(size=None) - Convert to PyArrow Table
- pandas(size=None) - Convert to Pandas DataFrame
- from_arrow(tables) - Create DataFrame from PyArrow Table(s)
- fetchall() - Get all rows as list of Row objects
- collect() - Materialize the DataFrame
- append(other) - Append another DataFrame
- distinct() - Get unique rows

Key properties:
- rowcount - Number of rows
- columncount - Number of columns
- column_names - List of column names
- schema - Schema information
# Clone the repository
git clone https://github.com/mabel-dev/orso.git
cd orso
# Install dependencies
pip install -r requirements.txt
pip install -r tests/requirements.txt
# Build Cython extensions
make compile
# Run tests
make test
Orso is part of the Mabel ecosystem. Contributions are welcome! Please ensure:
- All tests pass: make test
- Code follows the project style: make lint
- New features include appropriate tests
- Documentation is updated for API changes
 
Orso includes a comprehensive performance benchmark suite to compare different versions:
# Run full benchmark suite
python tests/test_benchmark_suite.py
# Compare two versions
python tests/test_benchmark_suite.py -o baseline.json
# <switch version>
python tests/test_benchmark_suite.py -o current.json -c baseline.json
See BENCHMARK_SUITE.md for detailed documentation.
Orso is licensed under Apache 2.0 unless explicitly indicated otherwise.
Orso is in beta. Beta means different things to different people; to us, being in beta means:
- Interfaces are generally stable but may still have breaking changes
- Unit tests are not reliable enough to catch every break in functionality
- Bugs are likely to exist in edge cases
- Code may not be tuned for performance
 
As such, we really don't recommend using Orso in critical applications.
