
v4.3.0 🌈

khustup2 released this 29 Aug 08:50
17bef74

Deeplake 4.3.0

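Deeplake 4.3.0 is a major update to the Deeplake ecosystem. It adds new data and index types, rewritten CSV import and export, Python typing support for dataset schemas, and performance and stability improvements.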

New Data and Index Types

  • Sequence types completely reworked to support visual and structured data
  • New Video type, supporting MP4 and MKV videos with the H.264 codec and providing fast random access to video frames
  • Indexing for numeric types, enabling fast queries for numeric comparisons in TQL, including IN and BETWEEN operations
  • Significant improvements to textual index types, providing faster search without requiring index regeneration
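
To illustrate why an index helps numeric comparisons, here is a minimal conceptual sketch in plain Python. It is not Deeplake's implementation or API: the `SortedNumericIndex` class and its methods are hypothetical names, and the sketch only shows the general idea that a sorted structure lets BETWEEN and IN lookups use binary search instead of scanning every row.

```python
from bisect import bisect_left, bisect_right


class SortedNumericIndex:
    """Toy sorted index over one numeric column (illustrative only)."""

    def __init__(self, values):
        # Keep (value, row_id) pairs sorted by value so ranges are contiguous.
        self._pairs = sorted((value, row) for row, value in enumerate(values))
        self._keys = [value for value, _ in self._pairs]

    def between(self, low, high):
        """Row ids with low <= value <= high, like `x BETWEEN low AND high`."""
        start = bisect_left(self._keys, low)
        stop = bisect_right(self._keys, high)
        return sorted(row for _, row in self._pairs[start:stop])

    def isin(self, candidates):
        """Row ids whose value equals any candidate, like `x IN (...)`."""
        rows = []
        for candidate in set(candidates):
            start = bisect_left(self._keys, candidate)
            stop = bisect_right(self._keys, candidate)
            rows.extend(row for _, row in self._pairs[start:stop])
        return sorted(rows)


# Hypothetical column of scores, one per row.
scores = [7, 2, 9, 4, 4, 1]
index = SortedNumericIndex(scores)
between_rows = index.between(2, 4)
in_rows = index.isin([9, 1])
```

Each lookup costs two binary searches plus the size of the result, whereas a linear scan touches every row regardless of how many match.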

Data Import/Export

  • Fully rewritten from_csv function with support for large CSV files
  • New to_csv API to export Deeplake datasets/views to CSV format
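
The usual way to handle CSV files too large to fit in memory is to stream them in fixed-size batches. The sketch below shows that pattern with Python's standard `csv` module only. It does not call `from_csv` or `to_csv`, whose signatures are not given in these notes, and the helper names `iter_csv_batches` and `write_csv` are hypothetical.

```python
import csv
import io
from itertools import islice


def iter_csv_batches(handle, batch_size):
    """Yield lists of row dicts, at most batch_size rows at a time."""
    reader = csv.DictReader(handle)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def write_csv(rows, handle, fieldnames):
    """Write row dicts back out as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# In-memory stand-in for a large file on disk.
source = io.StringIO("id,label\n1,cat\n2,dog\n3,bird\n")
batches = list(iter_csv_batches(source, batch_size=2))

out = io.StringIO()
write_csv([row for batch in batches for row in batch], out, ["id", "label"])
```

Only one batch is held in memory at a time while reading, so peak memory depends on `batch_size` rather than on the file size.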

Python Typing

  • Support for specifying Python built-in types when defining dataset schemas
  • Support for using Pydantic models as dataset schemas
  • Richer type annotations for async operations, for better integration with linters and IDEs
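
As a rough illustration of schema definition from Python types, the sketch below reads the annotations of a class and maps each built-in type to a column type name. Everything here is an assumption made for illustration: the `COLUMN_TYPES` mapping, the type names in it, and `schema_from_annotations` are hypothetical and are not Deeplake's API. A standard-library dataclass stands in for a Pydantic model, which exposes its fields through annotations in a similar way.

```python
from dataclasses import dataclass
from typing import get_type_hints

# Hypothetical mapping from Python built-in types to column type names.
COLUMN_TYPES = {
    int: "int64",
    float: "float64",
    str: "text",
    bool: "bool",
    bytes: "bytes",
}


def schema_from_annotations(model):
    """Build a {column_name: column_type} dict from a class's annotations."""
    schema = {}
    for name, hint in get_type_hints(model).items():
        if hint not in COLUMN_TYPES:
            raise TypeError(f"unsupported type for column {name!r}: {hint!r}")
        schema[name] = COLUMN_TYPES[hint]
    return schema


@dataclass
class Document:
    id: int
    title: str
    score: float


schema = schema_from_annotations(Document)
```

Declaring the schema as an annotated class keeps column names and types in one place that linters and IDEs can also check.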

Improvements and Bug Fixes

  • Improved TQL data fetching and linear scan performance for non-indexed columns
  • Better memory usage tracking to prevent out-of-memory errors
  • Various stability improvements and bug fixes

Compatibility Notice

Deeplake 4.3.0 is backward compatible with datasets created in v4.2.x. However, datasets created or modified with v4.3.0 cannot be opened with v4.2.x versions due to internal format enhancements. We recommend upgrading all environments to v4.3.0 when working with shared datasets.
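
The compatibility rule above can be expressed as a small check, for example to guard a shared pipeline. This is a sketch under a stated assumption: it treats the (major, minor) pair as the format version, so a reader can open a dataset only if its own pair is at least as new as the writer's. The notice confirms this only for v4.2.x and v4.3.0; the function names are hypothetical.

```python
def parse_version(text):
    """Turn a string like 'v4.3.0' or '4.3.0' into (major, minor, patch)."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, patch


def can_open(reader_version, writer_version):
    """Assumption: format changes are tied to the (major, minor) pair."""
    reader = parse_version(reader_version)
    writer = parse_version(writer_version)
    return reader[:2] >= writer[:2]


new_reads_old = can_open("4.3.0", "4.2.1")
old_reads_new = can_open("4.2.1", "4.3.0")
```

Here `new_reads_old` is `True` and `old_reads_new` is `False`, matching the notice.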