Closed
66 commits
c1e8ba2
feat: implement auto-translation task for multilingual content
subhash-0000 Sep 9, 2025
1ab9320
fix: address code review feedback
subhash-0000 Sep 9, 2025
8feab1d
docs: update cognify pipeline documentation
subhash-0000 Sep 9, 2025
ea035d9
Address CodeRabbit review feedback
subhash-0000 Sep 9, 2025
cb7f9c3
Address final CodeRabbit review comments
subhash-0000 Sep 9, 2025
d945381
Address maintainer feedback - implement pluggable translation system
subhash-0000 Sep 10, 2025
376aaa1
Address CodeRabbit review feedback
subhash-0000 Sep 10, 2025
9f2079c
Enhance docstrings and fix remaining search call
subhash-0000 Sep 10, 2025
8d83c54
Address remaining CodeRabbit feedback
subhash-0000 Sep 10, 2025
59c53e2
Final CodeRabbit fixes - no more iterations needed
subhash-0000 Sep 10, 2025
5c2fd67
Complete remaining CodeRabbit fixes
subhash-0000 Sep 10, 2025
ab002a4
Address all CodeRabbit feedback - final cleanup
subhash-0000 Sep 10, 2025
11c6dcf
Address final CodeRabbit nitpicks for robustness
subhash-0000 Sep 10, 2025
5097abd
FINAL: Complete comprehensive audit addressing ALL issues
subhash-0000 Sep 10, 2025
4e21f03
Apply final robustness improvements per CodeRabbit review
subhash-0000 Sep 10, 2025
f309817
feat: address final CodeRabbit feedback for production readiness
subhash-0000 Sep 10, 2025
08f62f5
refactor: address priority CodeRabbit nitpicks for code quality
subhash-0000 Sep 10, 2025
c7ccc1d
fix: Address all CodeRabbit feedback and add comprehensive tests
subhash-0000 Sep 11, 2025
286aff7
fix: Address final 2 CodeRabbit actionable comments
subhash-0000 Sep 11, 2025
cca3396
fix: Address key CodeRabbit nitpicks for approval
subhash-0000 Sep 11, 2025
115d99b
Address CodeRabbit feedback: Fix imports, validation, and clean up code
subhash-0000 Sep 12, 2025
18f70b1
fix: Address remaining CodeRabbit feedback
subhash-0000 Sep 12, 2025
98af2ed
fix: resolve all CodeRabbit feedback for auto-translation feature
subhash-0000 Sep 12, 2025
bff993d
fix: Address code review comments for translation system
subhash-0000 Sep 12, 2025
b11fef9
fix: resolve critical language normalization and test validation issues
subhash-0000 Sep 12, 2025
94a013c
Improve translation functionality and code structure
subhash-0000 Sep 12, 2025
0bac83b
chore: consolidate cognify and polish translation\n- Remove duplicate…
subhash-0000 Sep 12, 2025
d304b0c
chore(api): remove duplicate api/v1/cognify/cognify.py; standardize o…
subhash-0000 Sep 12, 2025
4f6a541
translation: normalize detection confidence, guard empty translations…
subhash-0000 Sep 13, 2025
5fa262e
translation: allow fallback by returning None from NoOp.detect_langua…
subhash-0000 Sep 13, 2025
88efbd5
translation: normalize language codes and use math.isnan; OpenAI: set…
subhash-0000 Sep 13, 2025
9187fd8
translation: refactor translate_content into helpers to reduce statem…
subhash-0000 Sep 13, 2025
f6f9dc4
translation: canonical provider name in metadata; preflight init erro…
subhash-0000 Sep 13, 2025
8d446bc
translation: remove stray duplicate return in translate_content; fix …
subhash-0000 Sep 13, 2025
003ec79
fix: remove duplicate tasks/translation module to resolve import shad…
subhash-0000 Sep 13, 2025
82985c9
new updates
subhash-0000 Sep 13, 2025
9f6b2dc
📝 Add docstrings to `auto-translate-task`
coderabbitai[bot] Sep 13, 2025
94c53c1
refactor: Refactor translation task for maintainability and determinism
subhash-0000 Sep 13, 2025
c01b63e
feat: Add docstrings from bot PR
subhash-0000 Sep 13, 2025
9867b35
feat: Merge docstrings and resolve conflicts
subhash-0000 Sep 13, 2025
888f745
fix: Address review comments and resolve merge conflicts
subhash-0000 Sep 13, 2025
4741a9f
feat: implement all review suggestions
subhash-0000 Sep 13, 2025
2ff6f94
fix: address nitpick comments from review
subhash-0000 Sep 13, 2025
8af2f4a
fix: address final round of review comments
subhash-0000 Sep 13, 2025
5c71402
fix: address all outstanding code review comments
subhash-0000 Sep 13, 2025
0817bd1
fix: comprehensive code review improvements and provider fallback
subhash-0000 Sep 14, 2025
05f7986
final fixes
subhash-0000 Sep 14, 2025
5aa9a74
Fix TRY300 linting warnings - move statements to else blocks
subhash-0000 Sep 14, 2025
6efb9df
Comprehensive translation system improvements
subhash-0000 Sep 14, 2025
2b72e1e
refactor: apply cleaner helper function approach to resolve R0914 com…
subhash-0000 Sep 14, 2025
f8b6e0e
fix: address security and linting issues in translation module
subhash-0000 Sep 14, 2025
52aca87
fix: resolve technical issues and optimize provider performance
subhash-0000 Sep 14, 2025
c89eda0
fix: address function complexity and prompt optimization
subhash-0000 Sep 14, 2025
ce343c2
Fix recurring patterns: async cancellation, timeouts, and defensive p…
subhash-0000 Sep 14, 2025
67fb9aa
Apply systematic pattern fixes: prevent metadata conflicts and race c…
subhash-0000 Sep 15, 2025
c256b67
Refactor translation models, fix DataPoint import, and retarget PR to…
subhash-0000 Sep 20, 2025
d80f51e
Merge branch 'dev' into auto-translate-task
subhash-0000 Sep 20, 2025
615000a
Fix import and undefined references in cognify.py for clean execution
subhash-0000 Sep 20, 2025
085d505
observability: restore get_observe to return langfuse observe or no-o…
subhash-0000 Sep 30, 2025
2f14ec2
observability: remove redundant observe None fallback; add translatio…
subhash-0000 Sep 30, 2025
d480614
translation: clean imports, add provider error classes, and tidy tran…
subhash-0000 Sep 30, 2025
858173b
translation: finalize detection_provider wiring and exports
subhash-0000 Sep 30, 2025
f70a2a1
chore: pin httpx/httpcore for googletrans compatibility; add smoke Ge…
subhash-0000 Oct 1, 2025
cb4bc17
chore: unregister untested translation providers (google/azure/llm) p…
subhash-0000 Oct 3, 2025
aa1c17c
chore: remove untested translation providers (google/azure/llm) and s…
subhash-0000 Oct 3, 2025
03488f3
fix: remove duplicate imports and resolve import errors in translatio…
subhash-0000 Oct 3, 2025
docs: update cognify pipeline documentation
- Add translation step to processing pipeline documentation
- Document translation_provider parameter with accepted values
- Explain OpenAI API key requirement for openai provider
- Clarify cross-language search capability

Addresses remaining documentation feedback from code review.
subhash-0000 committed Sep 9, 2025
commit 8feab1d65b6fa6a41730df275939984b7474fc3d
14 changes: 10 additions & 4 deletions cognee/api/v1/cognify/cognify.py
@@ -69,10 +69,11 @@ async def cognify(
 1. **Document Classification**: Identifies document types and structures
 2. **Permission Validation**: Ensures user has processing rights
 3. **Text Chunking**: Breaks content into semantically meaningful segments
-4. **Entity Extraction**: Identifies key concepts, people, places, organizations
-5. **Relationship Detection**: Discovers connections between entities
-6. **Graph Construction**: Builds semantic knowledge graph with embeddings
-7. **Content Summarization**: Creates hierarchical summaries for navigation
+4. **Translation**: Auto-translates non-English chunks to English and attaches metadata
+5. **Entity Extraction**: Identifies key concepts, people, places, organizations
+6. **Relationship Detection**: Discovers connections between entities
+7. **Graph Construction**: Builds semantic knowledge graph with embeddings
+8. **Content Summarization**: Creates hierarchical summaries for navigation

 Graph Model Customization:
 The `graph_model` parameter allows custom knowledge structures:
@@ -108,6 +109,11 @@ async def cognify(
 If provided, this prompt will be used instead of the default prompts for
 knowledge graph extraction. The prompt should guide the LLM on how to
 extract entities and relationships from the text content.
+translation_provider: Translation service to use for multilingual content.
+- "noop": No translation (default, safe fallback)
+- "langdetect": Local language detection without translation
+- "openai": OpenAI-powered translation (requires OPENAI_API_KEY)
+Enables cross-language search by translating non-English content to English.

 Returns:
 Union[dict, list[PipelineRunInfo]]:
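The documented `translation_provider` values and the `OPENAI_API_KEY` requirement could be checked up front, before the pipeline runs. The following is a minimal sketch of such a check, not code from this PR: the function name `validate_translation_provider`, the `ACCEPTED_PROVIDERS` tuple, the case and whitespace normalization, and the choice to raise (rather than silently fall back to "noop") are all assumptions made for illustration.

```python
from typing import Mapping, Optional

# The three values come from the docstring in this commit;
# the tuple name itself is hypothetical.
ACCEPTED_PROVIDERS = ("noop", "langdetect", "openai")


def validate_translation_provider(
    name: Optional[str], env: Mapping[str, str]
) -> str:
    """Return a normalized provider name, or raise if it cannot be used."""
    # Assumption: a missing value means the documented default, "noop".
    normalized = (name or "noop").strip().lower()
    if normalized not in ACCEPTED_PROVIDERS:
        raise ValueError(f"Unknown translation_provider: {name!r}")
    # The docstring states that "openai" requires OPENAI_API_KEY.
    if normalized == "openai" and not env.get("OPENAI_API_KEY"):
        raise RuntimeError(
            "translation_provider='openai' requires OPENAI_API_KEY"
        )
    return normalized
```

In practice the second argument would likely be `os.environ`; passing it explicitly keeps the check easy to test without touching the real environment.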