fix(pinecone): issue #10512 Pinecone vector store returns zero results #10602
Open: jovicdev97 wants to merge 4 commits into langflow-ai:main from jovicdev97:main
+389 −6
Commits:
- 5e6cf75 fixes issue #10512 where Pinecone vector store returns zero results … (jovicdev97)
- 0f076ec [autofix.ci] apply automated fixes (autofix-ci[bot])
- 2591c2e [autofix.ci] apply automated fixes (attempt 2/3) (autofix-ci[bot])
- 8c1faf4 [autofix.ci] apply automated fixes (attempt 3/3) (autofix-ci[bot])
src/backend/tests/unit/components/vectorstores/test_pinecone_vector_store_component.py (352 changes: 352 additions & 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,352 @@ | ||
| from typing import Any | ||
| from unittest.mock import MagicMock, Mock, patch | ||
|
|
||
| import pytest | ||
| from lfx.components.pinecone import PineconeVectorStoreComponent | ||
| from lfx.schema.data import Data | ||
|
|
||
| from tests.base import ComponentTestBaseWithoutClient, VersionComponentMapping | ||
|
|
||
|
|
||
| @pytest.mark.api_key_required | ||
| class TestPineconeVectorStoreComponent(ComponentTestBaseWithoutClient): | ||
| @pytest.fixture | ||
| def component_class(self) -> type[Any]: | ||
| """Return the component class to test.""" | ||
| return PineconeVectorStoreComponent | ||
|
|
||
| @pytest.fixture | ||
| def default_kwargs(self) -> dict[str, Any]: | ||
| """Return the default kwargs for the component.""" | ||
| from lfx.components.openai.openai import OpenAIEmbeddingsComponent | ||
|
|
||
| from tests.api_keys import get_openai_api_key | ||
|
|
||
| try: | ||
| api_key = get_openai_api_key() | ||
| except ValueError: | ||
| pytest.skip("OPENAI_API_KEY is not set") | ||
|
|
||
| return { | ||
| "embedding": OpenAIEmbeddingsComponent(openai_api_key=api_key).build_embeddings(), | ||
| "index_name": "test-index", | ||
| "namespace": "test-namespace", | ||
| "pinecone_api_key": "test-pinecone-key", | ||
| "text_key": "text", | ||
| } | ||
|
|
||
| @pytest.fixture | ||
| def file_names_mapping(self) -> list[VersionComponentMapping]: | ||
| """Return the file names mapping for different versions.""" | ||
| return [] | ||
|
|
||
| def test_search_documents_with_namespace( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that search_documents properly passes namespace parameter. | ||
|
|
||
| This test verifies the fix for issue #10512 where namespace wasn't being | ||
| properly passed to Pinecone queries, resulting in zero results. | ||
| """ | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"text": "test result", "source": "test"} | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| results = component.search_documents() | ||
|
|
||
| # Verify Pinecone was called correctly | ||
| mock_pinecone.Index.assert_called_once_with("test-index") | ||
| mock_index.query.assert_called_once() | ||
|
|
||
| # Verify namespace was passed | ||
| call_kwargs = mock_index.query.call_args[1] | ||
| assert "namespace" in call_kwargs | ||
| assert call_kwargs["namespace"] == "test-namespace" | ||
| assert call_kwargs["top_k"] == 4 | ||
| assert call_kwargs["include_metadata"] is True | ||
|
|
||
| # Verify results are returned | ||
| assert len(results) == 1 | ||
| assert isinstance(results[0], Data) | ||
| assert results[0].text == "test result" | ||
|
|
||
| def test_search_documents_without_namespace( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that search_documents works without namespace.""" | ||
| # Remove namespace from kwargs | ||
| default_kwargs.pop("namespace", None) | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"text": "test result"} | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| results = component.search_documents() | ||
|
|
||
| # Verify namespace was NOT passed when not set | ||
| call_kwargs = mock_index.query.call_args[1] | ||
| assert "namespace" not in call_kwargs | ||
|
|
||
| # Verify results are still returned | ||
| assert len(results) == 1 | ||
|
|
||
| def test_search_documents_empty_query( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that empty query returns empty results.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
| component.set(search_query="") | ||
| results = component.search_documents() | ||
| assert results == [] | ||
|
|
||
| def test_search_documents_with_custom_text_key( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that custom text_key is properly used to extract content.""" | ||
| default_kwargs["text_key"] = "chunk_text" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results with custom text key | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"chunk_text": "custom text content", "source": "test"} | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| # Mock Float32Embeddings | ||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| results = component.search_documents() | ||
|
|
||
| # Verify the custom text_key was used | ||
| assert len(results) == 1 | ||
| assert results[0].text == "custom text content" | ||
|
|
||
| def test_search_documents_pinecone_api_error( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that Pinecone API errors are properly handled.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client to raise an error | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_index.query.side_effect = Exception("Pinecone API error") | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
|
|
||
| # Verify error is raised | ||
| with pytest.raises(ValueError, match="Error searching documents"): | ||
| component.search_documents() | ||
|
|
||
| def test_search_documents_embedding_error( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that embedding errors are properly handled.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock Float32Embeddings to raise an error | ||
| with patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32: | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.side_effect = Exception("Embedding error") | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
|
|
||
| # Verify error is raised | ||
| with pytest.raises(ValueError, match="Error searching documents"): | ||
| component.search_documents() | ||
|
|
||
| def test_float32_embeddings_wrapper_usage( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that Float32Embeddings wrapper is correctly instantiated and used.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"text": "test result"} | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| component.search_documents() | ||
|
|
||
| # Verify Float32Embeddings was instantiated with the correct embeddings | ||
| mock_float32.assert_called_once_with(default_kwargs["embedding"]) | ||
|
|
||
| # Verify embed_query was called with the search query | ||
| mock_embeddings_instance.embed_query.assert_called_once_with("test query") | ||
|
|
||
| def test_pinecone_client_initialization( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that Pinecone client is initialized with correct parameters.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"text": "test result"} | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone) as mock_pc_class, | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| component.search_documents() | ||
|
|
||
| # Verify Pinecone was initialized with correct API key | ||
| mock_pc_class.assert_called_once_with(api_key="test-pinecone-key") | ||
|
|
||
| # Verify Index was called with correct index name | ||
| mock_pinecone.Index.assert_called_once_with("test-index") | ||
|
|
||
| def test_search_documents_with_missing_text_key_in_metadata( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that documents with missing text_key in metadata are handled gracefully.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results with metadata missing the text key | ||
| mock_match = Mock() | ||
| mock_match.metadata = {"source": "test"} # Missing "text" key | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| results = component.search_documents() | ||
|
|
||
| # Verify results are returned with empty text | ||
| assert len(results) == 1 | ||
| assert results[0].text == "" | ||
|
|
||
| def test_search_documents_with_none_metadata( | ||
| self, component_class: type[PineconeVectorStoreComponent], default_kwargs: dict[str, Any] | ||
| ) -> None: | ||
| """Test that documents with None metadata are handled gracefully.""" | ||
| component: PineconeVectorStoreComponent = component_class().set(**default_kwargs) | ||
|
|
||
| # Mock the Pinecone client and index | ||
| mock_pinecone = MagicMock() | ||
| mock_index = MagicMock() | ||
| mock_pinecone.Index.return_value = mock_index | ||
|
|
||
| # Mock query results with None metadata | ||
| mock_match = Mock() | ||
| mock_match.metadata = None | ||
| mock_results = Mock() | ||
| mock_results.matches = [mock_match] | ||
| mock_index.query.return_value = mock_results | ||
|
|
||
| with ( | ||
| patch("lfx.components.pinecone.pinecone.Pinecone", return_value=mock_pinecone), | ||
| patch("lfx.components.pinecone.pinecone.Float32Embeddings") as mock_float32, | ||
| ): | ||
| mock_embeddings_instance = MagicMock() | ||
| mock_embeddings_instance.embed_query.return_value = [0.1] * 3072 | ||
| mock_float32.return_value = mock_embeddings_instance | ||
|
|
||
| component.set(search_query="test query") | ||
| results = component.search_documents() | ||
|
|
||
| # Verify results are returned with empty text and metadata | ||
| assert len(results) == 1 | ||
| assert results[0].text == "" | ||
Large diffs are not rendered by default.
Pinecone mocking target doesn’t match implementation import; tests may call the real client
In `search_documents`, the implementation does a local import, `from pinecone import Pinecone`. This binds `Pinecone` as a local name from the external `pinecone` module. The tests, however, patch `lfx.components.pinecone.pinecone.Pinecone`, which is a different symbol and won't intercept that local import. As a result, these tests can end up using the real Pinecone client (or failing with `ImportError`) instead of the mocked one.

To ensure the tests actually mock what's used at runtime, you have two main options:

- Option A: Update all `patch(...)` contexts that target `lfx.components.pinecone.pinecone.Pinecone` to patch the external `pinecone` module instead. This way `from pinecone import Pinecone` pulls in the mocked class.
- Option B: If you prefer to keep patching `lfx.components.pinecone.pinecone.Pinecone`, move the import to module scope with a guarded pattern and have `search_documents` reference the global `Pinecone`. That's a broader behavioral change (and may affect optional-dependency semantics), so Option A is safer for this PR.

Also note that `test_search_documents_embedding_error` doesn't patch Pinecone at all; if `pinecone` isn't installed, the resulting `ValueError("Error searching documents")` would be due to an import error, not the embedding error. The assertion still passes, but the failure mode is different from what the test name suggests.
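
The guarded module-scope import described in Option B could look roughly like the sketch below. `make_client` is a hypothetical helper standing in for the part of `search_documents` that builds the client, and the error message is invented for illustration. The two `demo_*` functions swap the module-level name directly so the sketch runs whether or not `pinecone` is installed.

```python
from unittest.mock import MagicMock, patch

# Guarded module-scope import: the dependency stays optional, but `Pinecone`
# becomes a module-level name that tests can patch on this module.
try:
    from pinecone import Pinecone
except ImportError:  # package not installed
    Pinecone = None


def make_client(api_key: str):
    """Hypothetical helper: build a client from the module-level name."""
    if Pinecone is None:
        msg = "pinecone is not installed"  # assumed wording
        raise ImportError(msg)
    return Pinecone(api_key=api_key)


def demo_patched() -> bool:
    """Replace the module-level name and confirm the helper picks it up."""
    fake_cls = MagicMock()
    with patch.dict(make_client.__globals__, {"Pinecone": fake_cls}):
        client = make_client("test-key")
    fake_cls.assert_called_once_with(api_key="test-key")
    return client is fake_cls.return_value


def demo_missing() -> bool:
    """Simulate the package being absent and confirm the guard raises."""
    with patch.dict(make_client.__globals__, {"Pinecone": None}):
        try:
            make_client("test-key")
        except ImportError:
            return True
    return False
```

In the actual test suite, the existing `patch("lfx.components.pinecone.pinecone.Pinecone", ...)` calls would then resolve, because the name would exist at module scope.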
Pinecone mocking target doesn't match implementation import; tests may call the real client
In `search_documents()` (line 100), the implementation does a local import, `from pinecone import Pinecone`. This creates a local binding to the class from the external `pinecone` module. The tests, however, patch `lfx.components.pinecone.pinecone.Pinecone`, which doesn't exist at module scope (`Pinecone` is never imported there). As a result, these tests won't intercept the local import and can end up using the real Pinecone client instead of the mocked one.

To fix, update all patch calls that target `lfx.components.pinecone.pinecone.Pinecone` so that they target `Pinecone` on the external `pinecone` module instead. This ensures `from pinecone import Pinecone` pulls in the mocked class.

Also, the test file is missing a trailing newline at EOF (W292).
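
The difference between the two patch targets can be reproduced in isolation. The sketch below uses two in-memory stand-in modules, `fake_sdk` (playing the role of `pinecone`) and `fake_component` (playing the role of the component module); both names are hypothetical and exist only for this illustration. It also shows one detail worth checking against the real test run: with `patch`'s default settings, targeting an attribute that does not exist on the module raises `AttributeError` when the patch starts.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for the external SDK (the role `pinecone` plays).
fake_sdk = types.ModuleType("fake_sdk")


class Client:
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


fake_sdk.Client = Client
sys.modules["fake_sdk"] = fake_sdk

# Hypothetical stand-in for the component module. It never imports
# Client at module scope, so it has no `Client` attribute.
fake_component = types.ModuleType("fake_component")


def search_documents():
    from fake_sdk import Client  # local import, resolved at call time

    return Client(api_key="test-key")


fake_component.search_documents = search_documents
sys.modules["fake_component"] = fake_component


def patched_at_source() -> bool:
    """Patch where the name is looked up: the local import sees the mock."""
    with patch("fake_sdk.Client") as mock_cls:
        return search_documents() is mock_cls.return_value


def patched_at_consumer() -> str:
    """Patch a name the consumer module never defines."""
    try:
        with patch("fake_component.Client"):
            search_documents()
    except AttributeError:
        return "AttributeError"
    return "no error"
```

Here `patched_at_source()` returns `True`, while `patched_at_consumer()` returns `"AttributeError"` because `fake_component` has no `Client` attribute to replace.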