Releases · google/langextract
v1.1.1
What's New
Improvements
- Multi-language tokenizer support with Unicode & Regex (#284)
  - Significantly improves support for CJK (Chinese, Japanese, Korean) languages
  - Better handling of non-Latin scripts
Bug Fixes
- Fix Gemini Batch API project parameter passing (#286)
  - Resolves the "Required parameter: project" error when using Vertex AI (see the sketch below)
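For context, a minimal sketch of the Vertex AI path this fix unblocks. The `language_model_params` keys (`vertexai`, `project`, `location`) are assumptions carried over from the Vertex AI authentication support added in v1.0.9, so verify them against the docs:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [
    lx.data.ExampleData(
        text="Patient was given 250 mg amoxicillin.",
        extractions=[
            lx.data.Extraction(extraction_class="medication",
                               extraction_text="amoxicillin"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Ibuprofen 200 mg was administered.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    language_model_params={
        "vertexai": True,
        "project": "my-gcp-project",  # the parameter this fix passes through
        "location": "us-central1",
    },
)
```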
Full Changelog: v1.1.0...v1.1.1
v1.1.0
What's New
Features
- Vertex AI Batch API Support (#279)
  - Cost-effective processing with automatic chunking, GCS caching, and fault tolerance
  - Automatic fallback to standard online prediction if a batch job fails
- FormatHandler and schema validation framework (#239)
- Independent progress bar control (`show_progress`) (#227)
- Zenodo DOI support (#218)
- Alignment parameter support via `resolver_params` (#211) (see the sketch after this list)
- Community Providers
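A minimal sketch of the two new knobs together. The alignment key names inside `resolver_params` (`enable_fuzzy_alignment`, `fuzzy_alignment_threshold`) are assumptions about the resolver's signature, not confirmed API:

```python
import langextract as lx

prompt = "Extract character names."
examples = [
    lx.data.ExampleData(
        text="Romeo loves Juliet.",
        extractions=[
            lx.data.Extraction(extraction_class="character",
                               extraction_text="Romeo"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Hamlet speaks to Ophelia.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    show_progress=False,  # new in #227: silence the progress bar
    resolver_params={
        # Alignment tuning via resolver_params (#211); key names assumed.
        "enable_fuzzy_alignment": True,
        "fuzzy_alignment_threshold": 0.75,
    },
)
```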
Improvements
- Streamlined annotation layer with lazy streaming (#276)
- Diverse text type benchmark with tokenization quality metrics (#272)
- Enable `suppress_parse_errors` parameter in `resolver_params` (#261) (see the sketch after this list)
- Resolve pylint naming convention warnings in provider modules (#273)
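And a sketch of the new parse-error switch from #261, assuming the flag is forwarded straight to the resolver and that a failed chunk is skipped rather than raising:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [lx.data.ExampleData(
    text="Patient took aspirin.",
    extractions=[lx.data.Extraction(
        extraction_class="medication", extraction_text="aspirin")])]

# Malformed model output is tolerated instead of raising a parse error.
result = lx.extract(
    text_or_documents="Ibuprofen 200 mg daily.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    resolver_params={"suppress_parse_errors": True},
)
```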
Full Changelog: v1.0.9...v1.1.0
v1.0.9
What's New
Features
- Prompt alignment validation for few-shot examples (#215)
  - Validates that example extractions exist in their source text
  - Three modes: OFF, WARNING (default), ERROR
  - New parameters: `prompt_validation_level` and `prompt_validation_strict` (see the sketch after this list)
- Vertex AI authentication support for Gemini provider (#60)
- llama-cpp-python community provider added (#202)
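A sketch of the validation knobs. Whether the level is passed as a string or an enum, and the exact behavior of the strict flag, are assumptions to check against the release:

```python
import langextract as lx

prompt = "Extract medication names."
examples = [
    lx.data.ExampleData(
        text="Patient took aspirin.",
        extractions=[
            # "naproxen" never appears in the example text above, so prompt
            # alignment validation will flag this example.
            lx.data.Extraction(extraction_class="medication",
                               extraction_text="naproxen"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Ibuprofen was administered.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    prompt_validation_level="ERROR",  # OFF | WARNING (default) | ERROR;
                                      # string vs. enum form is assumed
    prompt_validation_strict=True,    # strictness toggle added in #215
)
```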
Improvements
- Made `debug=False` the default in `extract()` for cleaner output
- Fixed router typings for provider plugins (#190)
- Allow T-prefixed TypeVars in pylint (#194)
Full Changelog: v1.0.8...v1.0.9
v1.0.8
What's Changed
Features
- Ollama timeout improvements (#154)
  - Increased default timeout from 30s to 120s
  - Made timeout configurable via ModelConfig (see the sketch after this list)
  - Fixed kwargs not being passed through
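A sketch of the configurable timeout via the ModelConfig pattern referenced in the v1.0.6 notes below. The `factory` module path, the `create_model` helper, and the `timeout` kwarg name are assumptions to verify:

```python
from langextract import factory

# Raise the Ollama HTTP timeout beyond the new 120s default.
config = factory.ModelConfig(
    model_id="gemma2:2b",
    provider_kwargs={
        "model_url": "http://localhost:11434",
        "timeout": 300,  # seconds; kwarg name assumed from #154
    },
)
model = factory.create_model(config)
# The model (or the config itself) can then be handed to lx.extract().
```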
Documentation
- Improved visualization examples for Jupyter/Colab (#153)
  - Added Romeo & Juliet Colab notebook
Full Changelog: v1.0.7...v1.0.8
v1.0.7
What's New
- Debug logging support when `debug=True` in `lx.extract()` (#142)
- GPT-5 model registration fixes (#143)
- Improved documentation for provider plugins and schema support
- Automated plugin generator script for external providers
- Base URL support for OpenAI-compatible endpoints (#138)
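Taken together, a sketch of a local OpenAI-compatible endpoint with debug logging on. Passing `base_url` through `language_model_params` is an assumption; per #138 it can also go to the OpenAILanguageModel constructor directly:

```python
import langextract as lx

prompt = "Extract city names."
examples = [lx.data.ExampleData(
    text="She flew to Paris.",
    extractions=[lx.data.Extraction(
        extraction_class="city", extraction_text="Paris")])]

result = lx.extract(
    text_or_documents="He moved to Berlin.",
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OpenAILanguageModel,
    model_id="my-served-model",
    language_model_params={
        "base_url": "http://localhost:8000/v1",  # e.g. a vLLM server
        "api_key": "unused-for-local-endpoints",
    },
    debug=True,  # verbose logging from #142
    fence_output=True,
    use_schema_constraints=False,
)
```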
See the full changelog for details.
v1.0.6 - Custom Model Provider Plugins & Schema System Refactor
Major Features
Custom Model Provider Plugin Support
- New provider registry infrastructure for extending LangExtract with custom LLM providers
- Plugin discovery via entry points allows third-party packages to register providers
- Example implementation available at examples/custom_provider_plugin
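A compressed sketch of what a plugin provider can look like. The base class and ScoredOutput usage follow langextract's inference module, while the discovery side (entry-point group name, registry hookup) is deliberately omitted; treat this as an outline and see examples/custom_provider_plugin for the canonical version:

```python
from langextract import inference

class EchoProvider(inference.BaseLanguageModel):
    """Toy provider: returns a canned JSON payload for every prompt."""

    def __init__(self, model_id: str = "echo-model", **kwargs):
        self.model_id = model_id
        super().__init__()

    def infer(self, batch_prompts, **kwargs):
        # Yield one list of scored candidates per input prompt.
        for _ in batch_prompts:
            yield [inference.ScoredOutput(score=1.0,
                                          output='{"extractions": []}')]

# A third-party package would expose this class through a Python entry point
# so the new provider registry can discover it at import time.
```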
Schema System Refactor
- Refactored schema system to support provider-specific schema implementations
- Providers can now define their own schema constraints and validation
- Better separation of concerns between core schema logic and provider implementations
Enhancements
- Ollama Provider: Added support for Hugging Face style model IDs (e.g., `meta-llama/Llama-3.2-1B-Instruct`)
- Extract API: Added `model` and `config` parameters to `extract()` for more flexible model configuration (see the sketch after this list)
- Examples: Updated Ollama quickstart to demonstrate ModelConfig pattern with JSON mode
- Testing: Improved test infrastructure for provider registry and plugin system
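A sketch of the new `config` path. The `provider_kwargs` key for JSON mode (`"format"`) is an assumption based on the Ollama API:

```python
import langextract as lx
from langextract import factory

prompt = "Extract character names."
examples = [lx.data.ExampleData(
    text="Romeo loves Juliet.",
    extractions=[lx.data.Extraction(
        extraction_class="character", extraction_text="Romeo")])]

config = factory.ModelConfig(
    model_id="meta-llama/Llama-3.2-1B-Instruct",  # HF-style ID, new for Ollama
    provider_kwargs={"format": "json"},  # JSON mode; key name assumed
)

result = lx.extract(
    text_or_documents="Hamlet speaks to Ophelia.",
    prompt_description=prompt,
    examples=examples,
    config=config,  # new in this release, alongside model=
)
```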
Bug Fixes
- Fixed lazy loading for provider pattern registration
- Fixed unicode escaping in example generation
- Fixed test failures related to provider registry initialization
Installation
```
pip install langextract==1.0.6
```
Full Changelog: v1.0.5...v1.0.6
LangExtract v1.0.5
What's Changed
Bug Fixes
- Fix chunking bug when newlines fall at chunk boundaries (#88)
  - Resolves issue where content was incorrectly chunked when newline characters appeared at chunk boundaries
- Fix IPython import warnings and improve notebook detection (#86)
  - Eliminates import warnings in Jupyter notebooks and improves compatibility
New Features
- Add `base_url` parameter to OpenAILanguageModel (#51)
  - Enables using custom OpenAI-compatible endpoints for alternative LLM providers (see the sketch below)
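A minimal sketch; the endpoint URL and model name are illustrative:

```python
import langextract as lx

# Point the OpenAI client at any OpenAI-compatible server.
model = lx.inference.OpenAILanguageModel(
    model_id="my-local-model",
    base_url="http://localhost:8000/v1",  # e.g. vLLM or LiteLLM
    api_key="unused-for-local-endpoints",
)
```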
Full Changelog: v1.0.4...v1.0.5
v1.0.4 - Ollama integration and improvements
What's Changed
- Added Ollama language model integration – Full support for local LLMs via Ollama
- Docker deployment support – Production-ready docker-compose setup with health checks
- Comprehensive examples – Quickstart script and detailed documentation in `examples/ollama/`
- Fixed OllamaLanguageModel parameter – Changed from `model` to `model_id` for consistency (#57)
- Enhanced CI/CD – Added Ollama integration tests that run on every PR
- Improved documentation – Consistent API examples across all language models
Technical Details
- Supports all Ollama models (gemma2:2b, llama3.2, mistral, etc.)
- Secure setup with localhost-only binding by default
- Integration tests use lightweight models for faster CI runs
- Docker setup includes automatic model pulling and health checks
Usage Example
```python
import langextract as lx

result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OllamaLanguageModel,
    model_id="gemma2:2b",
    model_url="http://localhost:11434",
    fence_output=False,
    use_schema_constraints=False,
)
```
Quick setup: Install Ollama from ollama.com, run `ollama pull gemma2:2b`, then `ollama serve`.
For detailed installation, Docker setup, and more examples, see examples/ollama/.
Full Changelog: v1.0.3...v1.0.4
v1.0.3 – OpenAI language model support
What's Changed
- Added OpenAI language model integration – Support for GPT-4o, GPT-4o-mini, and other OpenAI models
- Enhanced documentation – Added OpenAI usage examples and API key setup instructions to README
- Comprehensive test coverage – Added unit tests for OpenAI backend
Technical Details
- Uses modern OpenAI v1.x client API with parallel processing support
- Note: Schema constraints for OpenAI are not yet implemented (use `use_schema_constraints=False`; see the sketch below)
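A sketch mirroring the v1.0.4 Ollama example above, adapted for OpenAI per the note:

```python
import os
import langextract as lx

prompt = "Extract medication names."
examples = [lx.data.ExampleData(
    text="Patient took aspirin.",
    extractions=[lx.data.Extraction(
        extraction_class="medication", extraction_text="aspirin")])]

result = lx.extract(
    text_or_documents="Ibuprofen 200 mg was given.",
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OpenAILanguageModel,
    model_id="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
    fence_output=True,             # OpenAI output arrives fenced
    use_schema_constraints=False,  # not yet implemented for OpenAI
)
```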
Full Changelog: v1.0.2...v1.0.3
v1.0.2 – Slimmer install, Windows fix, OpenAI v1.x support
What’s Changed
- Removed `langfun` and `pylibmagic` dependencies – lighter install; no `libmagic` needed on Windows
- Fixed Windows installation failure (#25)
- Restored compatibility with modern OpenAI SDK v1.x (#16)
- Updated README and Dockerfile to match the new, slimmer dependency set
Note
LangFunLanguageModel has been removed.
If you still need LangFun support, please open a new issue so we can discuss re-adding it in a cross-platform way.
Full Changelog: v1.0.1...v1.0.2