Releases · google/langextract
v1.1.1
What's New
Improvements
- Multi-language tokenizer support with Unicode & Regex (#284)
  - Significantly improves support for CJK (Chinese, Japanese, Korean) languages
  - Better handling of non-Latin scripts
Bug Fixes
- Fix Gemini Batch API project parameter passing (#286)
  - Resolves the "Required parameter: project" error when using Vertex AI (see the sketch below)
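For context, a minimal sketch of the Vertex AI path this fix unblocks. The `language_model_params` keys (`vertexai`, `project`, `location`) are assumptions carried over from the Vertex AI authentication support added in v1.0.9, so verify them against the docs:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [
    lx.data.ExampleData(
        text="Patient was given 250 mg amoxicillin.",
        extractions=[
            lx.data.Extraction(extraction_class="medication",
                               extraction_text="amoxicillin"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Ibuprofen 200 mg was administered.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    language_model_params={
        "vertexai": True,
        "project": "my-gcp-project",  # the parameter this fix passes through
        "location": "us-central1",
    },
)
```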
Full Changelog: v1.1.0...v1.1.1
v1.1.0
What's New
Features
- Vertex AI Batch API Support (#279)
  - Cost-effective processing with automatic chunking, GCS caching, and fault tolerance
  - Automatic fallback to standard online prediction if a batch job fails
- FormatHandler and schema validation framework (#239)
- Independent progress bar control (`show_progress`) (#227)
- Zenodo DOI support (#218)
- Alignment parameter support via `resolver_params` (#211) (see the sketch after this list)
- Community Providers
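A minimal sketch of the two new knobs together. The alignment key names inside `resolver_params` (`enable_fuzzy_alignment`, `fuzzy_alignment_threshold`) are assumptions about the resolver's signature, not confirmed API:

```python
import langextract as lx

prompt = "Extract character names."
examples = [
    lx.data.ExampleData(
        text="Romeo loves Juliet.",
        extractions=[
            lx.data.Extraction(extraction_class="character",
                               extraction_text="Romeo"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Hamlet speaks to Ophelia.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    show_progress=False,  # new in #227: silence the progress bar
    resolver_params={
        # Alignment tuning via resolver_params (#211); key names assumed.
        "enable_fuzzy_alignment": True,
        "fuzzy_alignment_threshold": 0.75,
    },
)
```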
Improvements
- Streamlined annotation layer with lazy streaming (#276)
- Diverse text type benchmark with tokenization quality metrics (#272)
- Enable `suppress_parse_errors` parameter in `resolver_params` (#261) (see the sketch after this list)
- Resolve pylint naming convention warnings in provider modules (#273)
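And a sketch of the new parse-error switch from #261, assuming the flag is forwarded straight to the resolver and that a failed chunk is skipped rather than raising:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [lx.data.ExampleData(
    text="Patient took aspirin.",
    extractions=[lx.data.Extraction(
        extraction_class="medication", extraction_text="aspirin")])]

# Malformed model output is tolerated instead of raising a parse error.
result = lx.extract(
    text_or_documents="Ibuprofen 200 mg daily.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    resolver_params={"suppress_parse_errors": True},
)
```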
Full Changelog: v1.0.9...v1.1.0
v1.0.9
What's New
Features
- Prompt alignment validation for few-shot examples (#215)
  - Validates that example extractions exist in their source text
  - Three modes: OFF, WARNING (default), ERROR
  - New parameters: `prompt_validation_level` and `prompt_validation_strict` (see the sketch after this list)
- Vertex AI authentication support for Gemini provider (#60)
- llama-cpp-python community provider added (#202)
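A sketch of the validation knobs. Whether the level is passed as a string or an enum, and the exact behavior of the strict flag, are assumptions to check against the release:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [
    lx.data.ExampleData(
        text="Patient took aspirin.",
        extractions=[
            # "naproxen" never appears in the example text above, so prompt
            # alignment validation will flag this example.
            lx.data.Extraction(extraction_class="medication",
                               extraction_text="naproxen"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Ibuprofen was administered.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    prompt_validation_level="ERROR",  # OFF | WARNING (default) | ERROR;
                                      # string vs. enum form is assumed
    prompt_validation_strict=True,    # strictness toggle added in #215
)
```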
Improvements
- Made `debug=False` the default in `extract()` for cleaner output
- Fixed router typings for provider plugins (#190)
- Allow T-prefixed TypeVars in pylint (#194)
Full Changelog: v1.0.8...v1.0.9
v1.0.8
What's Changed
Features
- Ollama timeout improvements (#154)
  - Increased default timeout from 30s to 120s
  - Made timeout configurable via ModelConfig (see the sketch after this list)
  - Fixed kwargs not being passed through
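A sketch of the configurable timeout via the ModelConfig pattern referenced in the v1.0.6 notes below. The `factory` module path, the `create_model` helper, and the `timeout` kwarg name are assumptions to verify:

```python
from langextract import factory

# Raise the Ollama HTTP timeout beyond the new 120s default.
config = factory.ModelConfig(
    model_id="gemma2:2b",
    provider_kwargs={
        "model_url": "http://localhost:11434",
        "timeout": 300,  # seconds; kwarg name assumed from #154
    },
)
model = factory.create_model(config)
# The model (or the config itself) can then be handed to lx.extract().
```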
Documentation
- Improved visualization examples for Jupyter/Colab (#153)
  - Added Romeo & Juliet Colab notebook
Full Changelog: v1.0.7...v1.0.8
v1.0.7
What's New
- Debug logging support when `debug=True` in `lx.extract()` (#142)
- GPT-5 model registration fixes (#143)
- Improved documentation for provider plugins and schema support
- Automated plugin generator script for external providers
- Base URL support for OpenAI-compatible endpoints (#138)
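Taken together, a sketch of a local OpenAI-compatible endpoint with debug logging on. Passing `base_url` through `language_model_params` is an assumption; per #138 it can also go to the OpenAILanguageModel constructor directly:

```python
import langextract as lx

prompt = "Extract city names."
examples = [lx.data.ExampleData(
    text="She flew to Paris.",
    extractions=[lx.data.Extraction(
        extraction_class="city", extraction_text="Paris")])]

result = lx.extract(
    text_or_documents="He moved to Berlin.",
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OpenAILanguageModel,
    model_id="my-served-model",
    language_model_params={
        "base_url": "http://localhost:8000/v1",  # e.g. a vLLM server
        "api_key": "unused-for-local-endpoints",
    },
    debug=True,  # verbose logging from #142
    fence_output=True,
    use_schema_constraints=False,
)
```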
See the full changelog for details.
v1.0.6 - Custom Model Provider Plugins & Schema System Refactor
Major Features
Custom Model Provider Plugin Support
- New provider registry infrastructure for extending LangExtract with custom LLM providers
- Plugin discovery via entry points allows third-party packages to register providers
- Example implementation available at examples/custom_provider_plugin
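A compressed sketch of what a plugin provider can look like. The base class and ScoredOutput usage follow langextract's inference module, while the discovery side (entry-point group name, registry hookup) is deliberately omitted; treat this as an outline and see examples/custom_provider_plugin for the canonical version:

```python
from langextract import inference

class EchoProvider(inference.BaseLanguageModel):
    """Toy provider: returns a canned JSON payload for every prompt."""

    def __init__(self, model_id: str = "echo-model", **kwargs):
        self.model_id = model_id
        super().__init__()

    def infer(self, batch_prompts, **kwargs):
        # Yield one list of scored candidates per input prompt.
        for _ in batch_prompts:
            yield [inference.ScoredOutput(score=1.0,
                                          output='{"extractions": []}')]

# A third-party package would expose this class through a Python entry point
# so the new provider registry can discover it at import time.
```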
Schema System Refactor
- Refactored schema system to support provider-specific schema implementations
- Providers can now define their own schema constraints and validation
- Better separation of concerns between core schema logic and provider implementations
Enhancements
- Ollama Provider: Added support for Hugging Face style model IDs (e.g., `meta-llama/Llama-3.2-1B-Instruct`)
- Extract API: Added `model` and `config` parameters to `extract()` for more flexible model configuration (see the sketch after this list)
- Examples: Updated Ollama quickstart to demonstrate ModelConfig pattern with JSON mode
- Testing: Improved test infrastructure for provider registry and plugin system
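A sketch of the new `config` path. The `provider_kwargs` key for JSON mode (`"format"`) is an assumption based on the Ollama API:

```python
import langextract as lx
from langextract import factory

prompt = "Extract character names."
examples = [lx.data.ExampleData(
    text="Romeo loves Juliet.",
    extractions=[lx.data.Extraction(
        extraction_class="character", extraction_text="Romeo")])]

config = factory.ModelConfig(
    model_id="meta-llama/Llama-3.2-1B-Instruct",  # HF-style ID, new for Ollama
    provider_kwargs={"format": "json"},  # JSON mode; key name assumed
)

result = lx.extract(
    text_or_documents="Hamlet speaks to Ophelia.",
    prompt_description=prompt,
    examples=examples,
    config=config,  # new in this release, alongside model=
)
```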
Bug Fixes
- Fixed lazy loading for provider pattern registration
- Fixed unicode escaping in example generation
- Fixed test failures related to provider registry initialization
Installation
```
pip install langextract==1.0.6
```
Full Changelog: v1.0.5...v1.0.6
LangExtract v1.0.5
What's Changed
Bug Fixes
- Fix chunking bug when newlines fall at chunk boundaries (#88)
  - Resolves issue where content was incorrectly chunked when newline characters appeared at chunk boundaries
- Fix IPython import warnings and improve notebook detection (#86)
  - Eliminates import warnings in Jupyter notebooks and improves compatibility
New Features
- Add `base_url` parameter to OpenAILanguageModel (#51)
  - Enables using custom OpenAI-compatible endpoints for alternative LLM providers (see the sketch below)
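A minimal sketch; the endpoint URL and model name are illustrative:

```python
import langextract as lx

# Point the OpenAI client at any OpenAI-compatible server.
model = lx.inference.OpenAILanguageModel(
    model_id="my-local-model",
    base_url="http://localhost:8000/v1",  # e.g. vLLM or LiteLLM
    api_key="unused-for-local-endpoints",
)
```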
Full Changelog: v1.0.4...v1.0.5
v1.0.4 - Ollama integration and improvements
What's Changed
- Added Ollama language model integration – Full support for local LLMs via Ollama
- Docker deployment support – Production-ready docker-compose setup with health checks
- Comprehensive examples – Quickstart script and detailed documentation in `examples/ollama/`
- Fixed OllamaLanguageModel parameter – Changed from `model` to `model_id` for consistency (#57)
- Enhanced CI/CD – Added Ollama integration tests that run on every PR
- Improved documentation – Consistent API examples across all language models
Technical Details
- Supports all Ollama models (gemma2:2b, llama3.2, mistral, etc.)
- Secure setup with localhost-only binding by default
- Integration tests use lightweight models for faster CI runs
- Docker setup includes automatic model pulling and health checks
Usage Example
```python
import langextract as lx

result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OllamaLanguageModel,
    model_id="gemma2:2b",
    model_url="http://localhost:11434",
    fence_output=False,
    use_schema_constraints=False,
)
```
Quick setup: Install Ollama from ollama.com, run `ollama pull gemma2:2b`, then `ollama serve`.
For detailed installation, Docker setup, and more examples, see examples/ollama/.
Full Changelog: v1.0.3...v1.0.4
v1.0.3 – OpenAI language model support
What's Changed
- Added OpenAI language model integration – Support for GPT-4o, GPT-4o-mini, and other OpenAI models
- Enhanced documentation – Added OpenAI usage examples and API key setup instructions to README
- Comprehensive test coverage – Added unit tests for OpenAI backend
Technical Details
- Uses modern OpenAI v1.x client API with parallel processing support
- Note: Schema constraints for OpenAI are not yet implemented (use `use_schema_constraints=False`; see the sketch below)
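A sketch mirroring the v1.0.4 Ollama example above, adapted for OpenAI per the note:

```python
import os
import langextract as lx

prompt = "Extract medication names."
examples = [lx.data.ExampleData(
    text="Patient took aspirin.",
    extractions=[lx.data.Extraction(
        extraction_class="medication", extraction_text="aspirin")])]

result = lx.extract(
    text_or_documents="Ibuprofen 200 mg was given.",
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OpenAILanguageModel,
    model_id="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
    fence_output=True,             # OpenAI output arrives fenced
    use_schema_constraints=False,  # not yet implemented for OpenAI
)
```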
Full Changelog: v1.0.2...v1.0.3
v1.0.2 – Slimmer install, Windows fix, OpenAI v1.x support
What’s Changed
- Removed `langfun` and `pylibmagic` dependencies – lighter install; no `libmagic` needed on Windows
- Fixed Windows installation failure (#25)
- Restored compatibility with modern OpenAI SDK v1.x (#16)
- Updated README and Dockerfile to match the new, slimmer dependency set
Note
LangFunLanguageModel has been removed.
If you still need LangFun support, please open a new issue so we can discuss re-adding it in a cross-platform way.
Full Changelog: v1.0.1...v1.0.2