fix: changed embedding model to have api base and watsonx api endpoint #10524
Conversation
Important: Review skipped. Auto incremental reviews are disabled on this repository; check the settings in the CodeRabbit UI.

Walkthrough: The changes add IBM watsonx.ai support to the EmbeddingModelComponent by introducing a new base_url_ibm_watsonx configuration field with six regional endpoints. A new watsonx constants module is created, the embedding component logic is extended with provider-specific URL handling, and a starter project template is updated with the new configuration.
Sequence Diagram

sequenceDiagram
participant User
participant Component as EmbeddingModelComponent
participant Config as update_build_config
participant Build as build_embeddings
User->>Component: Select Provider (OpenAI/Ollama/IBM watsonx.ai)
Component->>Config: Trigger config update
alt Provider is OpenAI
Config->>Config: Hide base_url_ibm_watsonx
Config->>Config: Hide api_base
else Provider is Ollama
Config->>Config: Hide base_url_ibm_watsonx
Config->>Config: Show api_base (not advanced)
else Provider is IBM watsonx.ai
Config->>Config: Show base_url_ibm_watsonx<br/>(with IBM_WATSONX_URLS options)
Config->>Config: Hide api_base
Config->>Config: Ensure project_id required
end
User->>Build: Trigger build_embeddings
Build->>Build: Check provider
alt Provider is IBM watsonx.ai
Build->>Build: Use base_url_ibm_watsonx as URL
else Other providers
Build->>Build: Use existing URL resolution
end
Build->>User: Return configured embeddings
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches

Important: Pre-merge checks failed. Please resolve all errors before merging; addressing warnings is optional.

❌ Failed checks (1 error, 1 warning, 3 inconclusive)
✅ Passed checks (2 passed)
edwinjosechittilappilly left a comment:
LGTM
Codecov Report

❌ Patch coverage is 0.00%. Your patch status has failed because the patch coverage (0.00%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

@@ Coverage Diff @@
## main #10524 +/- ##
==========================================
- Coverage 31.40% 31.40% -0.01%
==========================================
Files 1325 1325
Lines 59994 59995 +1
Branches 8983 8983
==========================================
- Hits 18844 18843 -1
- Misses 40244 40245 +1
- Partials 906 907 +1
Flags with carried forward coverage won't be shown.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (1)
1942-1971: Watsonx endpoint URLs are currently in sync; however, address broader duplication in language_model.py.

Verification confirms the JSON dropdown options match IBM_WATSONX_URLS in watsonx_constants.py exactly. However, there is a systemic issue: language_model.py defines its own inline copy of IBM_WATSONX_URLS instead of importing from the constant. This creates two independent sources of truth.

Actions:
- Update language_model.py to import IBM_WATSONX_URLS from watsonx_constants, replacing the inline definition (a sketch follows this list).
- For the JSON file: add build-time validation to compare against the constant, or document that manual sync is required when endpoints change.
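A minimal sketch of the first action, assuming language_model.py builds its endpoint dropdown the same way the embedding component in this PR does (the variable name base_url_input is illustrative):

# language_model.py (sketch): import the shared constant instead of redefining it inline.
from lfx.base.models.watsonx_constants import IBM_WATSONX_URLS
from lfx.io import DropdownInput

# Reusing the shared list means a region added to watsonx_constants.py
# automatically shows up in this component's dropdown too.
base_url_input = DropdownInput(
    name="base_url_ibm_watsonx",
    display_name="watsonx API Endpoint",
    options=IBM_WATSONX_URLS,
    value=IBM_WATSONX_URLS[0],
    show=False,
    real_time_refresh=True,
)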
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/backend/base/langflow/initial_setup/starter_projects/News Aggregator.json (1 hunk)
- src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (3 hunks)
- src/lfx/src/lfx/base/models/watsonx_constants.py (1 hunk)
- src/lfx/src/lfx/components/models/embedding_model.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)
src/lfx/src/lfx/inputs/inputs.py (1)
DropdownInput (465-490)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Run Frontend Unit Tests / Frontend Jest Unit Tests
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
- GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
- GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
- GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
- GitHub Check: Test Starter Templates
- GitHub Check: test-starter-projects
- GitHub Check: Optimize new Python code in this PR
- GitHub Check: Update Component Index
🔇 Additional comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (1)
1852-1852: No action needed for code hash update. Metadata bump only; nothing to review here.
| "title_case": false, | ||
| "type": "code", | ||
| "value": "from typing import Any\n\nfrom langchain_openai import OpenAIEmbeddings\n\nfrom lfx.base.embeddings.model import LCEmbeddingsModel\nfrom lfx.base.models.ollama_constants import OLLAMA_EMBEDDING_MODELS\nfrom lfx.base.models.openai_constants import OPENAI_EMBEDDING_MODEL_NAMES\nfrom lfx.base.models.watsonx_constants import WATSONX_EMBEDDING_MODEL_NAMES\nfrom lfx.field_typing import Embeddings\nfrom lfx.io import (\n BoolInput,\n DictInput,\n DropdownInput,\n FloatInput,\n IntInput,\n MessageTextInput,\n SecretStrInput,\n)\nfrom lfx.schema.dotdict import dotdict\n\n\nclass EmbeddingModelComponent(LCEmbeddingsModel):\n display_name = \"Embedding Model\"\n description = \"Generate embeddings using a specified provider.\"\n documentation: str = \"https://docs.langflow.org/components-embedding-models\"\n icon = \"binary\"\n name = \"EmbeddingModel\"\n category = \"models\"\n\n inputs = [\n DropdownInput(\n name=\"provider\",\n display_name=\"Model Provider\",\n options=[\"OpenAI\", \"Ollama\", \"IBM watsonx.ai\"],\n value=\"OpenAI\",\n info=\"Select the embedding model provider\",\n real_time_refresh=True,\n options_metadata=[{\"icon\": \"OpenAI\"}, {\"icon\": \"Ollama\"}, {\"icon\": \"WatsonxAI\"}],\n ),\n DropdownInput(\n name=\"model\",\n display_name=\"Model Name\",\n options=OPENAI_EMBEDDING_MODEL_NAMES,\n value=OPENAI_EMBEDDING_MODEL_NAMES[0],\n info=\"Select the embedding model to use\",\n ),\n SecretStrInput(\n name=\"api_key\",\n display_name=\"OpenAI API Key\",\n info=\"Model Provider API key\",\n required=True,\n show=True,\n real_time_refresh=True,\n ),\n MessageTextInput(\n name=\"api_base\",\n display_name=\"API Base URL\",\n info=\"Base URL for the API. Leave empty for default.\",\n advanced=True,\n ),\n # Watson-specific inputs\n MessageTextInput(\n name=\"project_id\",\n display_name=\"Project ID\",\n info=\"IBM watsonx.ai Project ID (required for IBM watsonx.ai)\",\n show=False,\n ),\n IntInput(\n name=\"dimensions\",\n display_name=\"Dimensions\",\n info=\"The number of dimensions the resulting output embeddings should have. 
\"\n \"Only supported by certain models.\",\n advanced=True,\n ),\n IntInput(name=\"chunk_size\", display_name=\"Chunk Size\", advanced=True, value=1000),\n FloatInput(name=\"request_timeout\", display_name=\"Request Timeout\", advanced=True),\n IntInput(name=\"max_retries\", display_name=\"Max Retries\", advanced=True, value=3),\n BoolInput(name=\"show_progress_bar\", display_name=\"Show Progress Bar\", advanced=True),\n DictInput(\n name=\"model_kwargs\",\n display_name=\"Model Kwargs\",\n advanced=True,\n info=\"Additional keyword arguments to pass to the model.\",\n ),\n ]\n\n def build_embeddings(self) -> Embeddings:\n provider = self.provider\n model = self.model\n api_key = self.api_key\n api_base = self.api_base\n dimensions = self.dimensions\n chunk_size = self.chunk_size\n request_timeout = self.request_timeout\n max_retries = self.max_retries\n show_progress_bar = self.show_progress_bar\n model_kwargs = self.model_kwargs or {}\n\n if provider == \"OpenAI\":\n if not api_key:\n msg = \"OpenAI API key is required when using OpenAI provider\"\n raise ValueError(msg)\n return OpenAIEmbeddings(\n model=model,\n dimensions=dimensions or None,\n base_url=api_base or None,\n api_key=api_key,\n chunk_size=chunk_size,\n max_retries=max_retries,\n timeout=request_timeout or None,\n show_progress_bar=show_progress_bar,\n model_kwargs=model_kwargs,\n )\n\n if provider == \"Ollama\":\n try:\n from langchain_ollama import OllamaEmbeddings\n except ImportError:\n try:\n from langchain_community.embeddings import OllamaEmbeddings\n except ImportError:\n msg = \"Please install langchain-ollama: pip install langchain-ollama\"\n raise ImportError(msg) from None\n\n return OllamaEmbeddings(\n model=model,\n base_url=api_base or \"http://localhost:11434\",\n **model_kwargs,\n )\n\n if provider == \"IBM watsonx.ai\":\n try:\n from langchain_ibm import WatsonxEmbeddings\n except ImportError:\n msg = \"Please install langchain-ibm: pip install langchain-ibm\"\n raise ImportError(msg) from None\n\n if not api_key:\n msg = \"IBM watsonx.ai API key is required when using IBM watsonx.ai provider\"\n raise ValueError(msg)\n\n project_id = self.project_id\n\n if not project_id:\n msg = \"Project ID is required for IBM watsonx.ai provider\"\n raise ValueError(msg)\n\n params = {\n \"model_id\": model,\n \"url\": api_base or \"https://us-south.ml.cloud.ibm.com\",\n \"apikey\": api_key,\n }\n\n params[\"project_id\"] = project_id\n\n return WatsonxEmbeddings(**params)\n\n msg = f\"Unknown provider: {provider}\"\n raise ValueError(msg)\n\n def update_build_config(self, build_config: dotdict, field_value: Any, field_name: str | None = None) -> dotdict:\n if field_name == \"provider\":\n if field_value == \"OpenAI\":\n build_config[\"model\"][\"options\"] = OPENAI_EMBEDDING_MODEL_NAMES\n build_config[\"model\"][\"value\"] = OPENAI_EMBEDDING_MODEL_NAMES[0]\n build_config[\"api_key\"][\"display_name\"] = \"OpenAI API Key\"\n build_config[\"api_key\"][\"required\"] = True\n build_config[\"api_key\"][\"show\"] = True\n build_config[\"api_base\"][\"display_name\"] = \"OpenAI API Base URL\"\n build_config[\"api_base\"][\"advanced\"] = True\n build_config[\"project_id\"][\"show\"] = False\n\n elif field_value == \"Ollama\":\n build_config[\"model\"][\"options\"] = OLLAMA_EMBEDDING_MODELS\n build_config[\"model\"][\"value\"] = OLLAMA_EMBEDDING_MODELS[0]\n build_config[\"api_key\"][\"display_name\"] = \"API Key (Optional)\"\n build_config[\"api_key\"][\"required\"] = False\n build_config[\"api_key\"][\"show\"] = False\n 
build_config[\"api_base\"][\"display_name\"] = \"Ollama Base URL\"\n build_config[\"api_base\"][\"value\"] = \"http://localhost:11434\"\n build_config[\"api_base\"][\"advanced\"] = True\n build_config[\"project_id\"][\"show\"] = False\n\n elif field_value == \"IBM watsonx.ai\":\n build_config[\"model\"][\"options\"] = WATSONX_EMBEDDING_MODEL_NAMES\n build_config[\"model\"][\"value\"] = WATSONX_EMBEDDING_MODEL_NAMES[0]\n build_config[\"api_key\"][\"display_name\"] = \"IBM watsonx.ai API Key\"\n build_config[\"api_key\"][\"required\"] = True\n build_config[\"api_key\"][\"show\"] = True\n build_config[\"api_base\"][\"display_name\"] = \"IBM watsonx.ai URL\"\n build_config[\"api_base\"][\"value\"] = \"https://us-south.ml.cloud.ibm.com\"\n build_config[\"api_base\"][\"advanced\"] = False\n build_config[\"project_id\"][\"show\"] = True\n\n return build_config\n" | ||
| "value": "from typing import Any\n\nfrom langchain_openai import OpenAIEmbeddings\n\nfrom lfx.base.embeddings.model import LCEmbeddingsModel\nfrom lfx.base.models.ollama_constants import OLLAMA_EMBEDDING_MODELS\nfrom lfx.base.models.openai_constants import OPENAI_EMBEDDING_MODEL_NAMES\nfrom lfx.base.models.watsonx_constants import IBM_WATSONX_URLS, WATSONX_EMBEDDING_MODEL_NAMES\nfrom lfx.field_typing import Embeddings\nfrom lfx.io import (\n BoolInput,\n DictInput,\n DropdownInput,\n FloatInput,\n IntInput,\n MessageTextInput,\n SecretStrInput,\n)\nfrom lfx.schema.dotdict import dotdict\n\n\nclass EmbeddingModelComponent(LCEmbeddingsModel):\n display_name = \"Embedding Model\"\n description = \"Generate embeddings using a specified provider.\"\n documentation: str = \"https://docs.langflow.org/components-embedding-models\"\n icon = \"binary\"\n name = \"EmbeddingModel\"\n category = \"models\"\n\n inputs = [\n DropdownInput(\n name=\"provider\",\n display_name=\"Model Provider\",\n options=[\"OpenAI\", \"Ollama\", \"IBM watsonx.ai\"],\n value=\"OpenAI\",\n info=\"Select the embedding model provider\",\n real_time_refresh=True,\n options_metadata=[{\"icon\": \"OpenAI\"}, {\"icon\": \"Ollama\"}, {\"icon\": \"WatsonxAI\"}],\n ),\n MessageTextInput(\n name=\"api_base\",\n display_name=\"API Base URL\",\n info=\"Base URL for the API. Leave empty for default.\",\n advanced=True,\n ),\n DropdownInput(\n name=\"base_url_ibm_watsonx\",\n display_name=\"watsonx API Endpoint\",\n info=\"The base URL of the API (IBM watsonx.ai only)\",\n options=IBM_WATSONX_URLS,\n value=IBM_WATSONX_URLS[0],\n show=False,\n real_time_refresh=True,\n ),\n DropdownInput(\n name=\"model\",\n display_name=\"Model Name\",\n options=OPENAI_EMBEDDING_MODEL_NAMES,\n value=OPENAI_EMBEDDING_MODEL_NAMES[0],\n info=\"Select the embedding model to use\",\n ),\n SecretStrInput(\n name=\"api_key\",\n display_name=\"OpenAI API Key\",\n info=\"Model Provider API key\",\n required=True,\n show=True,\n real_time_refresh=True,\n ),\n # Watson-specific inputs\n MessageTextInput(\n name=\"project_id\",\n display_name=\"Project ID\",\n info=\"IBM watsonx.ai Project ID (required for IBM watsonx.ai)\",\n show=False,\n ),\n IntInput(\n name=\"dimensions\",\n display_name=\"Dimensions\",\n info=\"The number of dimensions the resulting output embeddings should have. 
\"\n \"Only supported by certain models.\",\n advanced=True,\n ),\n IntInput(name=\"chunk_size\", display_name=\"Chunk Size\", advanced=True, value=1000),\n FloatInput(name=\"request_timeout\", display_name=\"Request Timeout\", advanced=True),\n IntInput(name=\"max_retries\", display_name=\"Max Retries\", advanced=True, value=3),\n BoolInput(name=\"show_progress_bar\", display_name=\"Show Progress Bar\", advanced=True),\n DictInput(\n name=\"model_kwargs\",\n display_name=\"Model Kwargs\",\n advanced=True,\n info=\"Additional keyword arguments to pass to the model.\",\n ),\n ]\n\n def build_embeddings(self) -> Embeddings:\n provider = self.provider\n model = self.model\n api_key = self.api_key\n api_base = self.api_base\n base_url_ibm_watsonx = self.base_url_ibm_watsonx\n dimensions = self.dimensions\n chunk_size = self.chunk_size\n request_timeout = self.request_timeout\n max_retries = self.max_retries\n show_progress_bar = self.show_progress_bar\n model_kwargs = self.model_kwargs or {}\n\n if provider == \"OpenAI\":\n if not api_key:\n msg = \"OpenAI API key is required when using OpenAI provider\"\n raise ValueError(msg)\n return OpenAIEmbeddings(\n model=model,\n dimensions=dimensions or None,\n base_url=api_base or None,\n api_key=api_key,\n chunk_size=chunk_size,\n max_retries=max_retries,\n timeout=request_timeout or None,\n show_progress_bar=show_progress_bar,\n model_kwargs=model_kwargs,\n )\n\n if provider == \"Ollama\":\n try:\n from langchain_ollama import OllamaEmbeddings\n except ImportError:\n try:\n from langchain_community.embeddings import OllamaEmbeddings\n except ImportError:\n msg = \"Please install langchain-ollama: pip install langchain-ollama\"\n raise ImportError(msg) from None\n\n return OllamaEmbeddings(\n model=model,\n base_url=api_base or \"http://localhost:11434\",\n **model_kwargs,\n )\n\n if provider == \"IBM watsonx.ai\":\n try:\n from langchain_ibm import WatsonxEmbeddings\n except ImportError:\n msg = \"Please install langchain-ibm: pip install langchain-ibm\"\n raise ImportError(msg) from None\n\n if not api_key:\n msg = \"IBM watsonx.ai API key is required when using IBM watsonx.ai provider\"\n raise ValueError(msg)\n\n project_id = self.project_id\n\n if not project_id:\n msg = \"Project ID is required for IBM watsonx.ai provider\"\n raise ValueError(msg)\n\n params = {\n \"model_id\": model,\n \"url\": base_url_ibm_watsonx or \"https://us-south.ml.cloud.ibm.com\",\n \"apikey\": api_key,\n }\n\n params[\"project_id\"] = project_id\n\n return WatsonxEmbeddings(**params)\n\n msg = f\"Unknown provider: {provider}\"\n raise ValueError(msg)\n\n def update_build_config(self, build_config: dotdict, field_value: Any, field_name: str | None = None) -> dotdict:\n if field_name == \"provider\":\n if field_value == \"OpenAI\":\n build_config[\"model\"][\"options\"] = OPENAI_EMBEDDING_MODEL_NAMES\n build_config[\"model\"][\"value\"] = OPENAI_EMBEDDING_MODEL_NAMES[0]\n build_config[\"api_key\"][\"display_name\"] = \"OpenAI API Key\"\n build_config[\"api_key\"][\"required\"] = True\n build_config[\"api_key\"][\"show\"] = True\n build_config[\"api_base\"][\"display_name\"] = \"OpenAI API Base URL\"\n build_config[\"api_base\"][\"advanced\"] = True\n build_config[\"project_id\"][\"show\"] = False\n build_config[\"base_url_ibm_watsonx\"][\"show\"] = False\n\n elif field_value == \"Ollama\":\n build_config[\"model\"][\"options\"] = OLLAMA_EMBEDDING_MODELS\n build_config[\"model\"][\"value\"] = OLLAMA_EMBEDDING_MODELS[0]\n build_config[\"api_key\"][\"display_name\"] = 
\"API Key (Optional)\"\n build_config[\"api_key\"][\"required\"] = False\n build_config[\"api_key\"][\"show\"] = False\n build_config[\"api_base\"][\"display_name\"] = \"Ollama Base URL\"\n build_config[\"api_base\"][\"value\"] = \"http://localhost:11434\"\n build_config[\"api_base\"][\"advanced\"] = False\n build_config[\"project_id\"][\"show\"] = False\n build_config[\"base_url_ibm_watsonx\"][\"show\"] = False\n\n elif field_value == \"IBM watsonx.ai\":\n build_config[\"model\"][\"options\"] = WATSONX_EMBEDDING_MODEL_NAMES\n build_config[\"model\"][\"value\"] = WATSONX_EMBEDDING_MODEL_NAMES[0]\n build_config[\"api_key\"][\"display_name\"] = \"IBM watsonx.ai API Key\"\n build_config[\"api_key\"][\"required\"] = True\n build_config[\"api_key\"][\"show\"] = True\n build_config[\"api_base\"][\"show\"] = False\n build_config[\"base_url_ibm_watsonx\"][\"show\"] = True\n build_config[\"project_id\"][\"show\"] = True\n\n return build_config\n" |
🧩 Analysis chain
Provider logic looks solid; make project_id required for IBM to match runtime check.
Build raises if project_id is missing, but the UI doesn’t mark it required when IBM watsonx.ai is selected.
Apply inside the Python code string’s update_build_config IBM branch:
elif field_value == "IBM watsonx.ai":
build_config["model"]["options"] = WATSONX_EMBEDDING_MODEL_NAMES
build_config["model"]["value"] = WATSONX_EMBEDDING_MODEL_NAMES[0]
build_config["api_key"]["display_name"] = "IBM watsonx.ai API Key"
build_config["api_key"]["required"] = True
build_config["api_key"]["show"] = True
build_config["api_base"]["show"] = False
build_config["base_url_ibm_watsonx"]["show"] = True
build_config["project_id"]["show"] = True
+ build_config["project_id"]["required"] = TrueAlso, consider passing request_timeout/max_retries to Ollama if supported by the version in use. Low priority.
🌐 Web query:
Confirm for langchain_ibm==0.3.19: WatsonxEmbeddings constructor kwargs are `model_id`, `url`, `apikey`, and optionally `project_id`. Also confirm for langchain_openai==0.3.23 that OpenAIEmbeddings accepts `timeout` and `show_progress_bar`.
💡 Result:
Yes.
- langchain_ibm==0.3.19: WatsonxEmbeddings constructor accepts model_id, url, apikey (and optional project_id/space_id). [1][2]
- langchain_openai==0.3.23: OpenAIEmbeddings accepts a request_timeout/timeout parameter and a show_progress_bar boolean. [3][4]

Sources:
[1] langchain_ibm WatsonxEmbeddings API reference / module docs.
[2] LangChain IBM watsonx.ai integration docs (instantiation examples showing model_id, url, project_id).
[3] OpenAIEmbeddings API reference (request_timeout, alias timeout; show_progress_bar).
[4] LangChain OpenAI embeddings docs (show_progress_bar and timeout described).
Mark project_id required in UI for IBM watsonx.ai to match runtime validation.
The build_embeddings() method raises ValueError if project_id is missing for IBM watsonx.ai, but update_build_config() only sets show=True without marking it required=True.
Apply inside the Python code string's update_build_config IBM branch:
elif field_value == "IBM watsonx.ai":
build_config["model"]["options"] = WATSONX_EMBEDDING_MODEL_NAMES
build_config["model"]["value"] = WATSONX_EMBEDDING_MODEL_NAMES[0]
build_config["api_key"]["display_name"] = "IBM watsonx.ai API Key"
build_config["api_key"]["required"] = True
build_config["api_key"]["show"] = True
build_config["api_base"]["show"] = False
build_config["base_url_ibm_watsonx"]["show"] = True
build_config["project_id"]["show"] = True
+ build_config["project_id"]["required"] = TrueWatsonxEmbeddings constructor accepts model_id, url, apikey, and optionally project_id, and OpenAIEmbeddings accepts both timeout and show_progress_bar parameters—the code passes these correctly.
🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json
around line 2006, the update_build_config IBM watsonx.ai branch sets project_id
to show=True but does not mark it required; add
build_config["project_id"]["required"] = True in that IBM branch so the UI
enforces the Project ID requirement to match the runtime validation that raises
if project_id is missing.
IBM_WATSONX_URLS = [
    "https://us-south.ml.cloud.ibm.com",
    "https://eu-de.ml.cloud.ibm.com",
    "https://eu-gb.ml.cloud.ibm.com",
    "https://au-syd.ml.cloud.ibm.com",
    "https://jp-tok.ml.cloud.ibm.com",
    "https://ca-tor.ml.cloud.ibm.com",
]
Add the Mumbai watsonx.ai endpoint
IBM’s current watsonx.ai endpoint list includes the Mumbai (ap-south-1) region. Omitting it means users in that region can’t configure embeddings with this dropdown, even though the service is available there.(cloud.ibm.com)
Apply this diff to include the missing endpoint:
IBM_WATSONX_URLS = [
"https://us-south.ml.cloud.ibm.com",
"https://eu-de.ml.cloud.ibm.com",
"https://eu-gb.ml.cloud.ibm.com",
"https://au-syd.ml.cloud.ibm.com",
"https://jp-tok.ml.cloud.ibm.com",
"https://ca-tor.ml.cloud.ibm.com",
+ "https://ap-south-1.aws.wxai.ibm.com",
 ]
🤖 Prompt for AI Agents
In src/lfx/src/lfx/base/models/watsonx_constants.py around lines 14 to 21, the
IBM_WATSONX_URLS list is missing the Mumbai (ap-south-1) endpoint; add
"https://ap-south-1.ml.cloud.ibm.com" to the list (ideally near other regional
entries) so users in the Mumbai region can select it from the dropdown.
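The nitpick above also suggested build-time validation to keep the starter-project JSON in sync with this constant. A minimal pytest sketch, assuming it runs from the repository root (plain substring matching sidesteps the template's nested JSON structure):

# Hypothetical sync check: fails when the starter project's endpoint dropdown
# drifts from IBM_WATSONX_URLS. Path taken from this PR's file list; adjust it
# if tests run from a different working directory.
from pathlib import Path

from lfx.base.models.watsonx_constants import IBM_WATSONX_URLS

def test_starter_project_watsonx_urls_in_sync():
    template = Path("src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json")
    text = template.read_text(encoding="utf-8")
    for url in IBM_WATSONX_URLS:
        # Every endpoint in the constant must appear verbatim in the template.
        assert url in text, f"{url} missing from the starter project dropdown"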
 elif field_value == "IBM watsonx.ai":
     build_config["model"]["options"] = WATSONX_EMBEDDING_MODEL_NAMES
     build_config["model"]["value"] = WATSONX_EMBEDDING_MODEL_NAMES[0]
     build_config["api_key"]["display_name"] = "IBM watsonx.ai API Key"
     build_config["api_key"]["required"] = True
     build_config["api_key"]["show"] = True
-    build_config["api_base"]["display_name"] = "IBM watsonx.ai URL"
-    build_config["api_base"]["value"] = "https://us-south.ml.cloud.ibm.com"
-    build_config["api_base"]["advanced"] = False
+    build_config["api_base"]["show"] = False
+    build_config["base_url_ibm_watsonx"]["show"] = True
     build_config["project_id"]["show"] = True
Preserve IBM watsonx base URL when upgrading existing flows
Previously, Watsonx embeddings read the endpoint from api_base. After this change, we hide that field and fall back to the dropdown, which defaults to US-South. When a saved flow that targets (say) eu-de loads, base_url_ibm_watsonx keeps the default and every request jumps to the wrong region, breaking auth for those users. Migrate any stored api_base value into the new field and clear the hidden input so legacy flows keep working.
Apply this diff to copy the existing value:
elif field_value == "IBM watsonx.ai":
build_config["model"]["options"] = WATSONX_EMBEDDING_MODEL_NAMES
build_config["model"]["value"] = WATSONX_EMBEDDING_MODEL_NAMES[0]
build_config["api_key"]["display_name"] = "IBM watsonx.ai API Key"
build_config["api_key"]["required"] = True
build_config["api_key"]["show"] = True
build_config["api_base"]["show"] = False
build_config["base_url_ibm_watsonx"]["show"] = True
+ api_base_value = build_config.get("api_base", {}).get("value")
+ if api_base_value:
+ build_config["base_url_ibm_watsonx"]["value"] = api_base_value
+ build_config["api_base"]["value"] = ""
build_config["project_id"]["show"] = True🤖 Prompt for AI Agents
In src/lfx/src/lfx/components/models/embedding_model.py around lines 196-204,
the migration logic must copy any existing saved api_base into
base_url_ibm_watsonx and then clear the legacy api_base so upgrades don’t
silently switch regions; update the flow-load / build_config initialization to:
if build_config.get("api_base") is truthy and not
build_config.get("base_url_ibm_watsonx"), set
build_config["base_url_ibm_watsonx"]=build_config["api_base"] and then set
build_config["api_base"]="" (or delete the key), ensuring you do not overwrite
an existing base_url_ibm_watsonx and that the hidden api_base is cleared for
backward compatibility.
langflow-ai#10524:
- Changed embedding model to have api base and watsonx api endpoint
- Updated tests
- Fixed tests
- Update component_index.json

Co-authored-by: Edwin Jose <[email protected]>
This pull request introduces improved support for configuring IBM watsonx.ai embedding models in the starter projects and core embedding model component. The main change is the addition of a dedicated dropdown input for selecting the watsonx API endpoint, making configuration more explicit and user-friendly. The logic for building embeddings and updating configuration now uses this new input, and the available endpoints are centralized in a constant.
Enhancements for IBM watsonx.ai configuration:
- Added a new constant IBM_WATSONX_URLS to watsonx_constants.py containing all supported IBM watsonx.ai API endpoints.
- Added a dropdown input base_url_ibm_watsonx to the EmbeddingModelComponent and the starter project config, allowing users to select the watsonx API endpoint directly. [1] [2]
- The embedding build logic now uses base_url_ibm_watsonx for the endpoint when building Watsonx embeddings, instead of relying on api_base. [1] [2] [3]

Other updates:
- Updated the google package version in the News Aggregator starter project for dependency freshness.