Closed
Changes from 1 commit
Commits
70 commits
8afbecf
merge
Vasilije1990 Apr 18, 2025
7bdb2ab
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 18, 2025
b35e047
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 19, 2025
2a485f9
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 19, 2025
f072e8d
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 20, 2025
98a1b79
fix: run cognee in Docker [COG-1961] (#775)
dexters1 Apr 23, 2025
17a77c5
Merge remote-tracking branch 'origin/main' into dev
borisarzentar Apr 24, 2025
80e5edc
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 25, 2025
5aca3f0
fix: Doesn't drop entire PG database, just cleans public schema - Cog…
Vasilije1990 Apr 25, 2025
0a9e1a4
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 Apr 26, 2025
79921f8
Merge remote-tracking branch 'origin/main' into dev
Vasilije1990 Apr 26, 2025
6109bf5
feat: Add uv and poetry support to Cognee [COG-1572] (#780)
dexters1 Apr 28, 2025
a627841
fix: networkx id type change [COG-1876] (#786)
dexters1 Apr 28, 2025
c4915a4
Mcp SSE support [COG-1781] (#785)
dexters1 Apr 28, 2025
773752a
feat: Add detailed log handling options for Cognee exceptions [COG-19…
dexters1 Apr 28, 2025
66ecd35
fix: s3fs version fix [COG-2025] (#798)
dexters1 Apr 30, 2025
ad943d0
docs: add cognee UI (#799)
hande-k Apr 30, 2025
cd9c489
feat: remove get_distance_from_collection_names and adapt search (#766)
borisarzentar Apr 30, 2025
7db7422
docs: update colab demo (#795)
hande-k Apr 30, 2025
5970d96
feat: pass context argument to tasks that require it (#788)
borisarzentar Apr 30, 2025
9729547
feat: abstract logging tool integration (#787)
borisarzentar Apr 30, 2025
d417c71
merged
Vasilije1990 May 8, 2025
5d415dc
feat: Add Memgraph integration (#751)
matea16 May 10, 2025
34b95b6
refactor: Handle boto3 s3fs dependencies better (#809)
dexters1 May 10, 2025
a78fec3
fix: Fixes collection search limit in brute force triplet search (#814)
hajdul88 May 12, 2025
9c131f0
refactor: Update lanceDB and change delete to work async (#770)
dexters1 May 12, 2025
f93463e
fix: make onnxruntime flexible (#815)
borisarzentar May 13, 2025
8ea0097
fix: graphiti example (#816)
soobrosa May 13, 2025
13bb244
feat: Create notebook to show how to compute ranks from graph (#771)
diegoabt May 13, 2025
966e337
feat: add MCP check status tool [COG-1784] (#793)
dexters1 May 13, 2025
e3121f5
docs: Update log level of CollectionNotFoundError (#819)
dexters1 May 13, 2025
91f3cd9
fix: notebooks (#818)
soobrosa May 13, 2025
1e7b56f
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 May 13, 2025
0f3522e
fix: cognee docker image (#820)
borisarzentar May 15, 2025
badd73c
Merge branch 'dev' of github.com:topoteretes/cognee into dev
Vasilije1990 May 15, 2025
c058219
Clean up core cognee repo
Vasilije1990 May 15, 2025
729cb9b
Revert "Clean up core cognee repo"
Vasilije1990 May 15, 2025
ad0bb0c
version: v0.1.40 (#825)
borisarzentar May 15, 2025
7ac5761
Merge branch 'main' into dev
Vasilije1990 May 15, 2025
f9f18d1
feat: Add columns as nodes in relational db migration (#826)
dexters1 May 15, 2025
8178b72
fix: exclude files from build (#828)
borisarzentar May 15, 2025
1dd179b
feat: OpenAI compatible route /api/v1/responses (#792)
dm1tryG May 16, 2025
3b07f3c
feat: Test db examples (#817)
hande-k May 16, 2025
4371b9d
fix: 812 anthropic fix (#822)
Vasilije1990 May 16, 2025
5cf14eb
fix: Mcp small updates (#831)
Vasilije1990 May 16, 2025
86efeee
fix: pipeline run status migration (#836)
borisarzentar May 19, 2025
3ed9504
feat: Add developer rules (#833)
Vasilije1990 May 19, 2025
a874988
fix: Fixes pipeline run status migration (#838)
hajdul88 May 19, 2025
f8f7877
Fix: Fixes graph completion search limit (#839)
hajdul88 May 19, 2025
5c36a5d
feat: Adds modal parallel evaluation for retriever development (#844)
hajdul88 May 20, 2025
9d9ea63
fix: use default threading in Fastembed (#846)
lxobr May 20, 2025
4c52ef6
feat: added util logger OS (#841)
Vasilije1990 May 20, 2025
7eee769
Feat: Adds dashboard application to parallel modal evals (#847)
hajdul88 May 21, 2025
94c785d
fix: hotfix the file uploader in the delete router. (#842)
soobrosa May 21, 2025
08bc472
Feat: Removes hardcoded user prompts from adapters
hajdul88 May 21, 2025
e0798ff
Feat: Adds chain of thought retriever (#864)
hajdul88 May 22, 2025
d663921
Feat: Adds context extension search (#865)
hajdul88 May 22, 2025
b71b704
chore: Move files (#848)
Vasilije1990 May 22, 2025
4650c9c
chore: add neo4j to mcp dependencies (#867)
hande-k May 23, 2025
834d959
Readme local install (#872)
dexters1 May 26, 2025
965033e
Feat: Adds subgraph retriever to graph based completion searches (#874)
hajdul88 May 27, 2025
ec68e99
Fix: removes ontology resolver initialization at import (#876)
hajdul88 May 27, 2025
bb68d6a
Docstring tasks. (#878)
soobrosa May 27, 2025
ff997f4
Docstring modules. (#877)
soobrosa May 27, 2025
b5ebed1
Docstring infrastructure. (#880)
soobrosa May 28, 2025
b94c846
Fix: Disable faulty graph metrics calculation in demos (#888)
hajdul88 May 29, 2025
d8ef290
feat: removes unused properies from node and edge pydantic models (#884)
hajdul88 May 30, 2025
d91602e
0.1.41 Release fixes (#889)
borisarzentar May 30, 2025
5a04421
version: v0.1.41 (#890)
borisarzentar May 30, 2025
57b0e0e
Merge with main (#892)
borisarzentar May 30, 2025
Docstring modules. (#877)
<!-- .github/pull_request_template.md -->

## Description
Adds docstrings to classes, methods, and functions under `cognee/modules` (chunking, engine, and retrieval). The hunks shown here change docstrings only.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
soobrosa and Vasilije1990 authored May 27, 2025
commit ff997f48b5dbc8b05b83296f48b9d4bc2c3eeaf3
19 changes: 19 additions & 0 deletions cognee/modules/chunking/models/DocumentChunk.py
Expand Up @@ -6,6 +6,25 @@


class DocumentChunk(DataPoint):
"""
Represents a chunk of text from a document with associated metadata.

Public methods include:

- None; the class only declares fields.

Instance variables include:

- text: The textual content of the chunk.
- chunk_size: The size of the chunk.
- chunk_index: The index of the chunk in the original document.
- cut_type: The type of cut that defined this chunk.
- is_part_of: The document to which this chunk belongs.
- contains: A list of entities contained within the chunk (default is None).
- metadata: A dictionary to hold meta information related to the chunk, including index
fields.
"""

text: str
chunk_size: int
chunk_index: int
Expand Down
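Reviewer note: the field list in the docstring above can be pictured with a minimal sketch. This is a hypothetical stand-in built on `dataclasses`, not the cognee `DataPoint` model; the field types for `cut_type` and `is_part_of`, and the `index_fields` value, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DocumentChunkSketch:
    """Hypothetical stand-in for DocumentChunk; not the cognee class."""

    text: str
    chunk_size: int
    chunk_index: int
    cut_type: str  # assumed to be a plain string label
    is_part_of: str  # assumed: the real field points at a document object
    contains: Optional[List[Any]] = None
    # Assumed value: the docstring only says metadata includes index fields.
    metadata: Dict[str, Any] = field(
        default_factory=lambda: {"index_fields": ["text"]}
    )


chunk = DocumentChunkSketch(
    text="Cognee builds a knowledge graph.",
    chunk_size=5,
    chunk_index=0,
    cut_type="sentence_end",
    is_part_of="doc-1",
)
```

Using `default_factory` gives each chunk its own metadata dictionary, so mutating one instance does not leak into another.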
8 changes: 8 additions & 0 deletions cognee/modules/engine/models/EntityType.py
Expand Up @@ -2,6 +2,14 @@


class EntityType(DataPoint):
"""
Represents a type of entity with a name and description.

This class inherits from DataPoint and includes two primary attributes: `name` and
`description`. Additionally, it contains a metadata dictionary that specifies
`index_fields` for indexing purposes.
"""

name: str
description: str

Expand Down
6 changes: 6 additions & 0 deletions cognee/modules/engine/operations/setup.py
Expand Up @@ -7,5 +7,11 @@


async def setup():
"""
Set up the necessary databases and tables.

This function asynchronously creates a relational database and its corresponding tables,
followed by creating a PGVector database and its tables.
"""
await create_relational_db_and_tables()
await create_pgvector_db_and_tables()
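Reviewer note: the ordering described in the `setup` docstring can be checked with a small sketch. The two helpers below are stubs that only record the call order; they are assumptions standing in for the real cognee functions of the same name.

```python
import asyncio

calls = []


async def create_relational_db_and_tables():
    # Stub: the real helper creates the relational database and tables.
    calls.append("relational")


async def create_pgvector_db_and_tables():
    # Stub: the real helper creates the PGVector database and tables.
    calls.append("pgvector")


async def setup():
    # Awaited one after the other, so the relational step always finishes first.
    await create_relational_db_and_tables()
    await create_pgvector_db_and_tables()


asyncio.run(setup())
```

Because the two calls are awaited sequentially rather than gathered, the second step never starts before the first completes.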
56 changes: 53 additions & 3 deletions cognee/modules/retrieval/EntityCompletionRetriever.py
Expand Up @@ -11,7 +11,21 @@


class EntityCompletionRetriever(BaseRetriever):
"""Retriever that uses entity-based completion for generating responses."""
"""
Retriever that uses entity-based completion for generating responses.

Public methods:

- get_context
- get_completion

Instance variables:

- extractor
- context_provider
- user_prompt_path
- system_prompt_path
"""

def __init__(
self,
Expand All @@ -26,7 +40,24 @@ def __init__(
self.system_prompt_path = system_prompt_path

async def get_context(self, query: str) -> Any:
"""Get context using entity extraction and context provider."""
"""
Get context using entity extraction and context provider.

Logs the processing of the query and retrieves entities. If entities are extracted, it
attempts to retrieve the corresponding context using the context provider. Returns None
if no entities or context are found, or if an exception occurs (the error is logged).

Parameters:
-----------

- query (str): The query string for which context is being retrieved.

Returns:
--------

- Any: The context retrieved from the context provider or None if not found or an
error occurred.
"""
try:
logger.info(f"Processing query: {query[:100]}")

Expand All @@ -47,7 +78,26 @@ async def get_context(self, query: str) -> Any:
return None

async def get_completion(self, query: str, context: Optional[Any] = None) -> List[str]:
"""Generate completion using provided context or fetch new context."""
"""
Generate completion using provided context or fetch new context.

If context is not provided, it fetches context using the query. If no context is
available, it returns an error message. Logs an error if completion generation fails due
to an exception.

Parameters:
-----------

- query (str): The query string for which completion is being generated.
- context (Optional[Any]): Optional context to be used for generating completion;
fetched if not provided. (default None)

Returns:
--------

- List[str]: A list containing the generated completion or an error message if no
relevant entities were found.
"""
try:
if context is None:
context = await self.get_context(query)
Expand Down
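Reviewer note: the control flow the two docstrings above describe can be sketched as follows. This is a hypothetical stand-in, not the cognee class: `extractor` and `context_provider` are plain callables here, the error message text is a placeholder, and the final return replaces the LLM call the real method makes.

```python
import asyncio
import logging
from typing import Any, List, Optional

logger = logging.getLogger("entity_retriever_sketch")


class EntityRetrieverSketch:
    """Hypothetical stand-in for EntityCompletionRetriever."""

    def __init__(self, extractor, context_provider):
        self.extractor = extractor
        self.context_provider = context_provider

    async def get_context(self, query: str) -> Any:
        try:
            logger.info("Processing query: %s", query[:100])
            entities = self.extractor(query)
            if not entities:
                return None
            context = self.context_provider(entities, query)
            return context or None
        except Exception as error:
            # Errors are logged and reported to the caller as "no context".
            logger.error("Context retrieval failed: %s", error)
            return None

    async def get_completion(self, query: str, context: Optional[Any] = None) -> List[str]:
        if context is None:
            context = await self.get_context(query)
        if context is None:
            return ["No relevant entities found."]  # placeholder message
        return [f"Answer based on: {context}"]  # stand-in for the LLM call


retriever = EntityRetrieverSketch(
    extractor=lambda q: [word for word in q.split() if word.istitle()],
    context_provider=lambda entities, q: ", ".join(entities),
)
found = asyncio.run(retriever.get_completion("Where is Paris"))
missing = asyncio.run(retriever.get_completion("where is it"))
```

Both paths return a list of strings, so callers do not need to special-case the "nothing found" branch.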
48 changes: 45 additions & 3 deletions cognee/modules/retrieval/chunks_retriever.py
Expand Up @@ -7,7 +7,16 @@


class ChunksRetriever(BaseRetriever):
"""Retriever for handling document chunk-based searches."""
"""
Handles document chunk-based searches by retrieving relevant chunks and generating
completions from them.

Public methods:

- get_context: Retrieves document chunks based on a query.
- get_completion: Generates a completion using provided context or retrieves context if
not given.
"""

def __init__(
self,
Expand All @@ -16,7 +25,22 @@ def __init__(
self.top_k = top_k

async def get_context(self, query: str) -> Any:
"""Retrieves document chunks context based on the query."""
"""
Retrieves document chunks context based on the query.

Searches for document chunks relevant to the specified query using a vector engine.
Raises a NoDataError if no data is found in the system.

Parameters:
-----------

- query (str): The query string to search for relevant document chunks.

Returns:
--------

- Any: A list of document chunk payloads retrieved from the search.
"""
vector_engine = get_vector_engine()

try:
Expand All @@ -27,7 +51,25 @@ async def get_context(self, query: str) -> Any:
return [result.payload for result in found_chunks]

async def get_completion(self, query: str, context: Optional[Any] = None) -> Any:
"""Generates a completion using document chunks context."""
"""
Generates a completion using document chunks context.

If the context is not provided, it retrieves the context based on the query. Returns the
context, which can be used for further processing or generation of outputs.

Parameters:
-----------

- query (str): The query string to be used for generating a completion.
- context (Optional[Any]): Optional pre-fetched context to use for generating the
completion; if None, it retrieves the context for the query. (default None)

Returns:
--------

- Any: The context used for the completion or the retrieved context if none was
provided.
"""
if context is None:
context = await self.get_context(query)
return context
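Reviewer note: a minimal sketch of the pass-through behaviour documented above, where `get_completion` returns the chunk payloads themselves rather than generated text. The vector engine is a fake, and the collection name `"DocumentChunk_text"` and the `ScoredResult` shape are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class ScoredResult:
    payload: dict
    score: float


class FakeVectorEngine:
    """Hypothetical engine that returns canned results."""

    def __init__(self, results: List[ScoredResult]):
        self.results = results

    async def search(self, collection_name: str, query: str, limit: int):
        return self.results[:limit]


class ChunksRetrieverSketch:
    def __init__(self, vector_engine: FakeVectorEngine, top_k: int = 5):
        self.vector_engine = vector_engine
        self.top_k = top_k

    async def get_context(self, query: str) -> Any:
        found_chunks = await self.vector_engine.search(
            "DocumentChunk_text", query, limit=self.top_k  # assumed collection name
        )
        return [result.payload for result in found_chunks]

    async def get_completion(self, query: str, context: Optional[Any] = None) -> Any:
        # No generation step: the context is the completion.
        if context is None:
            context = await self.get_context(query)
        return context


engine = FakeVectorEngine(
    [ScoredResult({"text": "first"}, 0.1), ScoredResult({"text": "second"}, 0.4)]
)
retriever = ChunksRetrieverSketch(engine, top_k=1)
fetched = asyncio.run(retriever.get_completion("anything"))
passed_through = asyncio.run(retriever.get_completion("anything", context=["cached"]))
```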
7 changes: 6 additions & 1 deletion cognee/modules/retrieval/code_retriever.py
Expand Up @@ -14,7 +14,12 @@ class CodeRetriever(BaseRetriever):
"""Retriever for handling code-based searches."""

class CodeQueryInfo(BaseModel):
"""Response model for information extraction from the query"""
"""
Model for representing the result of a query related to code files.

This class holds a list of filenames and the corresponding source code extracted from a
query. It is used to encapsulate response data in a structured format.
"""

filenames: List[str] = []
sourcecode: str
Expand Down
46 changes: 43 additions & 3 deletions cognee/modules/retrieval/completion_retriever.py
Expand Up @@ -8,7 +8,13 @@


class CompletionRetriever(BaseRetriever):
"""Retriever for handling LLM-based completion searches."""
"""
Retriever for handling LLM-based completion searches.

Public methods:
- get_context(query: str) -> str
- get_completion(query: str, context: Optional[Any] = None) -> Any
"""

def __init__(
self,
Expand All @@ -22,7 +28,24 @@ def __init__(
self.top_k = top_k if top_k is not None else 1

async def get_context(self, query: str) -> str:
"""Retrieves relevant document chunks as context."""
"""
Retrieves relevant document chunks as context.

Fetches document chunks based on a query from a vector engine and combines their text.
Returns an empty string if no chunks are found. Raises NoDataError if the collection is not
found.

Parameters:
-----------

- query (str): The query string used to search for relevant document chunks.

Returns:
--------

- str: A string containing the combined text of the retrieved document chunks, or an
empty string if none are found.
"""
vector_engine = get_vector_engine()

try:
Expand All @@ -38,7 +61,24 @@ async def get_context(self, query: str) -> str:
raise NoDataError("No data found in the system, please add data first.") from error

async def get_completion(self, query: str, context: Optional[Any] = None) -> Any:
"""Generates an LLM completion using the context."""
"""
Generates an LLM completion using the context.

Retrieves context if not provided and generates a completion based on the query and
context using an external completion generator.

Parameters:
-----------

- query (str): The input query for which the completion is generated.
- context (Optional[Any]): Optional context to use for generating the completion; if
not provided, it will be retrieved using get_context. (default None)

Returns:
--------

- Any: A list containing the generated completion from the LLM.
"""
if context is None:
context = await self.get_context(query)

Expand Down
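Reviewer note: the three outcomes listed in the `get_context` docstring (combined text, empty string, `NoDataError`) can be sketched as below. The exception classes and the engine are local stand-ins, and joining chunk texts with a newline is an assumption; the hunk does not show how the real method combines them.

```python
import asyncio


class CollectionNotFoundError(Exception):
    """Local stand-in for the vector engine's missing-collection error."""


class NoDataError(Exception):
    """Local stand-in for the error the retriever raises."""


class FakeEngine:
    def __init__(self, chunks=None, missing=False):
        self.chunks = chunks or []
        self.missing = missing

    async def search(self, query, limit):
        if self.missing:
            raise CollectionNotFoundError("collection not found")
        return self.chunks[:limit]


async def get_context(engine, query: str, top_k: int = 1) -> str:
    try:
        found_chunks = await engine.search(query, limit=top_k)
    except CollectionNotFoundError as error:
        raise NoDataError("No data found in the system, please add data first.") from error
    if not found_chunks:
        return ""
    # Assumed separator: one chunk text per line.
    return "\n".join(chunk["text"] for chunk in found_chunks)


combined = asyncio.run(
    get_context(FakeEngine([{"text": "alpha"}, {"text": "beta"}]), "q", top_k=2)
)
empty = asyncio.run(get_context(FakeEngine([]), "q"))
```

Chaining with `from error` keeps the original exception available as `__cause__` for debugging.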
43 changes: 40 additions & 3 deletions cognee/modules/retrieval/cypher_search_retriever.py
Expand Up @@ -10,7 +10,13 @@


class CypherSearchRetriever(BaseRetriever):
"""Retriever for handling cypher-based search"""
"""
Retriever for handling cypher-based search.

Public methods include:
- get_context: Retrieves relevant context using a cypher query.
- get_completion: Returns the graph connections context.
"""

def __init__(
self,
Expand All @@ -22,7 +28,22 @@ def __init__(
self.system_prompt_path = system_prompt_path

async def get_context(self, query: str) -> Any:
"""Retrieves relevant context using a cypher query."""
"""
Retrieves relevant context using a cypher query.

If the graph engine is an instance of NetworkXAdapter, raises SearchTypeNotSupported. If
any error occurs during execution, logs the error and raises CypherSearchError.

Parameters:
-----------

- query (str): The cypher query used to retrieve context.

Returns:
--------

- Any: The result of the cypher query execution.
"""
try:
graph_engine = await get_graph_engine()

Expand All @@ -38,7 +59,23 @@ async def get_context(self, query: str) -> Any:
return result

async def get_completion(self, query: str, context: Optional[Any] = None) -> Any:
"""Returns the graph connections context."""
"""
Returns the graph connections context.

If no context is provided, it retrieves the context using the specified query.

Parameters:
-----------

- query (str): The query to retrieve context.
- context (Optional[Any]): Optional context to use, otherwise fetched using the
query. (default None)

Returns:
--------

- Any: The context, either provided or retrieved.
"""
if context is None:
context = await self.get_context(query)
return context
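Reviewer note: a sketch of the error handling described in the `get_context` docstring. All classes are local stand-ins. One assumption to flag: the hunk does not show whether the real method lets `SearchTypeNotSupported` propagate or wraps it in `CypherSearchError`; this sketch lets it propagate, following the docstring's wording.

```python
import asyncio
import logging

logger = logging.getLogger("cypher_search_sketch")


class SearchTypeNotSupported(Exception):
    """Local stand-in for the cognee exception."""


class CypherSearchError(Exception):
    """Local stand-in for the cognee exception."""


class NetworkXAdapterStub:
    """Stands in for a graph engine without cypher support."""


class GraphDbStub:
    async def query(self, cypher: str):
        if not cypher.strip().upper().startswith("MATCH"):
            raise ValueError("unsupported statement")
        return [{"name": "Alice"}]


async def get_context(graph_engine, query: str):
    try:
        if isinstance(graph_engine, NetworkXAdapterStub):
            raise SearchTypeNotSupported("Cypher search is not supported here.")
        return await graph_engine.query(query)
    except SearchTypeNotSupported:
        raise  # assumed: surfaced unchanged, as the docstring suggests
    except Exception as error:
        logger.error("Cypher search failed: %s", error)
        raise CypherSearchError("Cypher query execution failed.") from error


rows = asyncio.run(get_context(GraphDbStub(), "MATCH (n) RETURN n"))
```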
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,21 @@


class GraphCompletionContextExtensionRetriever(GraphCompletionRetriever):
"""
Handles graph context completion for question answering tasks, extending context based
on retrieved triplets.

Public methods:
- get_completion

Instance variables:
- user_prompt_path
- system_prompt_path
- top_k
- node_type
- node_name
"""

def __init__(
self,
user_prompt_path: str = "graph_context_for_question.txt",
Expand All @@ -28,6 +43,30 @@ def __init__(
async def get_completion(
self, query: str, context: Optional[Any] = None, context_extension_rounds=4
) -> List[str]:
"""
Extends the context for a given query by retrieving related triplets and generating new
completions based on them.

The method runs for a specified number of rounds to enhance context until no new
triplets are found or the maximum rounds are reached. It retrieves triplet suggestions
based on a generated completion from previous iterations, logging the process of context
extension.

Parameters:
-----------

- query (str): The input query for which the completion is generated.
- context (Optional[Any]): The existing context to use for enhancing the query; if
None, it will be initialized from triplets generated for the query. (default None)
- context_extension_rounds (int): The maximum number of rounds to extend the context with
new triplets before halting. (default 4)

Returns:
--------

- List[str]: A list containing the generated answer based on the query and the
extended context.
"""
triplets = []

if context is None:
Expand Down
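Reviewer note: the stopping rule in the docstring above (halt when a round yields no new triplets, or when the round limit is reached) can be sketched as a plain loop. This is a simplification under stated assumptions: it is synchronous, `fetch_triplets` is a hypothetical callable, and it looks up neighbours from the current triplets instead of from an LLM completion as the real method does.

```python
from typing import Callable, Iterable, List, Tuple

Triplet = Tuple[str, str, str]


def extend_context(
    initial: Iterable[Triplet],
    fetch_triplets: Callable[[List[Triplet]], Iterable[Triplet]],
    context_extension_rounds: int = 4,
) -> List[Triplet]:
    triplets: List[Triplet] = []
    for triplet in initial:
        if triplet not in triplets:
            triplets.append(triplet)

    for _ in range(context_extension_rounds):
        new: List[Triplet] = []
        for candidate in fetch_triplets(triplets):
            if candidate not in triplets and candidate not in new:
                new.append(candidate)
        if not new:
            break  # nothing new was found, so further rounds cannot help
        triplets.extend(new)
    return triplets


# Toy graph: each node maps to the triplets that start at it.
GRAPH = {
    "A": [("A", "knows", "B")],
    "B": [("B", "knows", "C")],
    "C": [("C", "knows", "D")],
    "D": [],
}


def fetch(current: List[Triplet]) -> List[Triplet]:
    found: List[Triplet] = []
    for _, _, target in current:
        found.extend(GRAPH.get(target, []))
    return found


full = extend_context([("A", "knows", "B")], fetch)
one_round = extend_context([("A", "knows", "B")], fetch, context_extension_rounds=1)
```

With the default of four rounds the loop here stops early in round three, because the graph has no further edges to add.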