Deeplake 4.3.0
Deeplake 4.3.0 is a major update bringing many new features to the Deeplake ecosystem.
New Data and Index Types
- Complete revisit of Sequence types to support visual and structured data
- Video type support is now available in Deeplake, supporting MP4 and MKV videos with H264 codec and providing fast random access to video frames
- Indexing for numeric types, enabling fast queries for numeric comparisons in TQL, including IN and BETWEEN operations
- Significant improvements to textual index types, providing faster search without requiring index regeneration
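To illustrate why an index helps numeric comparisons such as IN and BETWEEN, here is a minimal conceptual sketch in plain Python. It is not Deeplake's implementation or API: the `SortedNumericIndex` class and its methods are hypothetical, and it assumes SQL-style inclusive bounds for BETWEEN. The idea is that keeping values sorted lets a range lookup use binary search instead of scanning every row.

```python
import bisect


class SortedNumericIndex:
    """Toy sorted index over one numeric column (illustrative only)."""

    def __init__(self, values):
        # Keep (value, row_id) pairs sorted by value so lookups can binary-search.
        self._pairs = sorted((v, i) for i, v in enumerate(values))
        self._keys = [v for v, _ in self._pairs]

    def between(self, low, high):
        """Row ids with low <= value <= high (inclusive bounds assumed)."""
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_right(self._keys, high)
        return sorted(row for _, row in self._pairs[start:stop])

    def isin(self, candidates):
        """Row ids whose value equals any of the candidates."""
        rows = []
        for c in set(candidates):
            start = bisect.bisect_left(self._keys, c)
            stop = bisect.bisect_right(self._keys, c)
            rows.extend(row for _, row in self._pairs[start:stop])
        return sorted(rows)


ages = [31, 18, 47, 25, 18, 60]
index = SortedNumericIndex(ages)
print(index.between(18, 31))
print(index.isin([18, 60]))
```

Each lookup costs two binary searches plus the size of the result, whereas a linear scan touches every row regardless of how selective the filter is.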
Data Import/Export
- Fully rewritten from_csv function with support for large CSV files
- New to_csv API to export Deeplake datasets/views to CSV format
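As a rough illustration of what handling large CSV files usually involves, the sketch below reads a CSV in fixed-size batches and writes rows back out using only the Python standard library. It does not call Deeplake's from_csv or to_csv, whose signatures are not given in these notes. The helper names `iter_csv_batches` and `rows_to_csv` are hypothetical.

```python
import csv
import io
from itertools import islice


def iter_csv_batches(handle, batch_size):
    """Yield lists of row dicts, holding at most batch_size rows in memory."""
    reader = csv.DictReader(handle)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def rows_to_csv(rows, fieldnames):
    """Serialize row dicts back to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A small in-memory file stands in for a large CSV on disk.
source = "id,label\n1,cat\n2,dog\n3,bird\n"
batches = list(iter_csv_batches(io.StringIO(source), batch_size=2))
all_rows = [row for batch in batches for row in batch]
exported = rows_to_csv(all_rows, ["id", "label"])
print([len(b) for b in batches])
print(exported)
```

Batching keeps peak memory proportional to the batch size rather than the file size, which is the usual reason an importer is reworked for large inputs.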
Python Typing
- Support for specifying Python builtin types when defining dataset schemas
- Support for using Pydantic Models as dataset schemas
- Richer typing for async operations, for better integration with linters and IDEs
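The idea behind defining a schema from Python builtin types can be sketched with standard type annotations. The example below is a conceptual illustration only, not Deeplake's API: the `BUILTIN_TO_COLUMN_TYPE` mapping, the column type names in it, and `schema_from_annotations` are all hypothetical. A Pydantic model could plausibly serve the same role, since its fields are also declared through annotations.

```python
from dataclasses import dataclass
from typing import get_type_hints

# Hypothetical mapping from Python builtins to column type names (placeholders).
BUILTIN_TO_COLUMN_TYPE = {
    int: "int64",
    float: "float64",
    str: "text",
    bool: "bool",
    bytes: "binary",
}


@dataclass
class ImageRecord:
    id: int
    caption: str
    score: float
    is_valid: bool


def schema_from_annotations(record_type):
    """Derive a {column_name: column_type} mapping from class annotations."""
    hints = get_type_hints(record_type)
    unsupported = [name for name, tp in hints.items() if tp not in BUILTIN_TO_COLUMN_TYPE]
    if unsupported:
        raise TypeError(f"No column type for fields: {unsupported}")
    return {name: BUILTIN_TO_COLUMN_TYPE[tp] for name, tp in hints.items()}


schema = schema_from_annotations(ImageRecord)
print(schema)
```

Declaring the schema through annotations lets linters and IDEs check row construction against the same types the dataset expects.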
Improvements and Bug Fixes
- Improved TQL data fetching and linear scan performance for non-indexed columns
- Better memory usage tracking to prevent out-of-memory errors
- Various stability improvements and bug fixes
Compatibility Notice
Deeplake 4.3.0 is backward compatible with datasets created in v4.2.x. However, datasets created or modified with v4.3.0 cannot be opened with v4.2.x versions due to internal format enhancements. We recommend upgrading all environments to v4.3.0 when working with shared datasets.