Merged
Changes from 3 commits
@@ -0,0 +1,20 @@
# Release History

## 10.0.0-preview.1

- Initial preview release of Microsoft.Extensions.DataIngestion.Abstractions
- Introduced `IngestionDocument` class for representing format-agnostic document containers
- Introduced `IngestionDocumentElement` abstract base class for document elements
- Introduced document element types:
  - `IngestionDocumentSection` - Represents a section or page in a document
  - `IngestionDocumentParagraph` - Represents a paragraph
  - `IngestionDocumentHeader` - Represents a header with optional level
  - `IngestionDocumentFooter` - Represents a footer
  - `IngestionDocumentTable` - Represents a table with 2D cell array
  - `IngestionDocumentImage` - Represents an image with optional binary content and alternative text
- Introduced `IngestionChunk<T>` class for representing content chunks
- Introduced `IngestionChunker<T>` abstract base class for splitting documents into chunks
- Introduced `IngestionDocumentReader` abstract base class for reading source content and converting to documents
- Introduced `IngestionDocumentProcessor` abstract base class for processing documents
- Introduced `IngestionChunkProcessor<T>` abstract base class for processing chunks
- Introduced `IngestionChunkWriter<T>` abstract base class for writing chunks to storage
@@ -0,0 +1,7 @@
# Release History

## 10.0.0-preview.1

- Initial preview release of Microsoft.Extensions.DataIngestion.MarkItDown
- Introduced `MarkItDownReader` class for converting documents to markdown using the MarkItDown CLI
- Introduced `MarkItDownMcpReader` class for converting documents to markdown using a MarkItDown Model Context Protocol (MCP) server
@@ -0,0 +1,6 @@
# Release History

## 10.0.0-preview.1

- Initial preview release of Microsoft.Extensions.DataIngestion.Markdig
- Introduced `MarkdownReader` class for reading markdown documents and converting them to `IngestionDocument`
25 changes: 25 additions & 0 deletions src/Libraries/Microsoft.Extensions.DataIngestion/CHANGELOG.md
@@ -0,0 +1,25 @@
# Release History

## 10.1.0-preview.1

- Introduced `SectionChunker` class for treating each document section as a separate entity (https://github.com/dotnet/extensions/pull/7015)

## 10.0.0-preview.1

- Initial preview release of Microsoft.Extensions.DataIngestion
- Introduced `IngestionPipeline<T>` class for orchestrating document ingestion workflows
- Introduced `IngestionPipelineOptions` class for configuring pipeline behavior
- Introduced `IngestionResult` class for representing ingestion operation results
- Introduced chunker implementations:
  - `HeaderChunker` - Splits documents based on headers and their levels
  - `SemanticSimilarityChunker` - Splits documents based on semantic similarity using embeddings
- Introduced `IngestionChunkerOptions` class for configuring chunker behavior (token limits, overlap, etc.)
- Introduced document processors/enrichers:
  - `ClassificationEnricher` - Enriches document metadata with classifications
  - `KeywordEnricher` - Enriches document metadata with keywords
  - `SentimentEnricher` - Enriches document metadata with sentiment analysis
  - `SummaryEnricher` - Enriches document metadata with summaries
  - `ImageAlternativeTextEnricher` - Enriches images with alternative text descriptions
- Introduced `EnricherOptions` class for configuring enricher behavior
- Introduced `VectorStoreWriter<T>` class for writing chunks to vector stores
- Introduced `VectorStoreWriterOptions<T>` class for configuring vector store writing behavior
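
For readers new to header-based chunking, the idea behind an entry such as `HeaderChunker` can be sketched in a few lines. The Python below is an illustration only: it is not the library's implementation or API, and the function name `chunk_by_headers` and the `level`/`title`/`body` output shape are assumptions made for this example. It also ignores the token limits and overlap mentioned above.

```python
import re

# ATX-style markdown header: 1-6 '#' characters, whitespace, then the title.
HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headers(markdown: str) -> list[dict]:
    """Split markdown text into chunks, starting a new chunk at every header.

    Hypothetical helper for illustration; not part of any real library.
    """
    chunks = []
    current = {"level": 0, "title": "", "lines": []}

    def has_content(chunk: dict) -> bool:
        # A chunk is worth keeping if it has a header or any non-blank line.
        return bool(chunk["title"]) or any(l.strip() for l in chunk["lines"])

    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            # A header closes the chunk in progress and opens a new one.
            if has_content(current):
                chunks.append(current)
            current = {
                "level": len(match.group(1)),
                "title": match.group(2).strip(),
                "lines": [],
            }
        else:
            current["lines"].append(line)

    if has_content(current):
        chunks.append(current)

    return [
        {"level": c["level"], "title": c["title"], "body": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Applied to a changelog like the ones in this change, such a function would yield one chunk per version heading, with the bullet list as the chunk body.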