fix: Smart Transform component fails with ibm watsonx ai language model #11066
Conversation
* Revert "Revert "docs: update component documentation links to individual pages"" (this reverts commit 0bc27d6)
* [autofix.ci] apply automated fixes
* llm-selector-renamed
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* Apply suggestions from code review
* [autofix.ci] apply automated fixes
* Apply suggestions from code review
* [autofix.ci] apply automated fixes
* rebuild-component-index
* update-component-index
* [autofix.ci] apply automated fixes
* build-index
* [autofix.ci] apply automated fixes

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
…10586)
* fix: resolved merge conflict
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* fix: create a new message to avoid mutating shared instances
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* fix: resolved merge conflict
* [autofix.ci] apply automated fixes
* fix: resolved merge conflict
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* fix: added a check for using existing message object
* fix: remove unwanted import
* fix: resolve merge conflict
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* fix: add None checks to prevent errors
* fix: resolve merge conflict
* [autofix.ci] apply automated fixes
* fix: backend unit test
* fix: resolve merge conflict
* [autofix.ci] apply automated fixes
* fix: ruff styling errors
* [autofix.ci] apply automated fixes

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* feat: optimize dropdown filtering and output resolution
* misc: remove commented out code
* feat: add refresh button and sort flows by updated_at date from most to least recent
* ruff (flow.py imports)
* improve fn contracts in runflow and improve flow id retrieval logic based on graph exec context
* add dynamic outputs and optimize db lookups
* add flow cache and db query for getting a single flow by id or name
* cache run outputs and add refresh context to build config
* misc
* use ids for flow retrieval
* fix missing flow_id bug
* add unit and integration tests
* add input field flag to persist hidden fields at runtime
* move unit tests and change input and output display names
* chore: update component index
* fix: fix tool mode when flow has multiple inputs by dynamically creating resolvers
* ruff (run_flow and tests)
* add resolvers to outputs map for non tool mode runtime
* fix tests (current flow excluded in db fetch)
* mypy (helpers/flow.py)
* remove unused code and clean up comments
* fix: persist user messages in chat-based flows via session injection
* empty string fallback for sessionid in chat.py
* cache invalidation with timestamps
* add cache invalidation
* chore: update component index
* ruff (run_flow.py)
* change session_id input type to MessageTextInput
* sync starter projects with main
* remove dead code + impl coderabbit suggestions
* clear options metadata before updating
* default param val (list flows)
* chore: update component index
* add integration tests
* [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: Cristhian Zanforlin <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
upgrade altk:
…ls (#10806) * use existing event loop instead of recreating when calling mcp tools * component index * [autofix.ci] apply automated fixes * starter projects * [autofix.ci] apply automated fixes --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* removed unnecessary buttons on the flows page
* added the asChild prop and hid button so they are not accessible by tabbing
* added tab index to ensure that buttons are not selectable using the tab
* made sure that accessibility is possible once bulk selection is enabled
* made sure that accessibility is possible once bulk selection is enabled
* Fix: added testcases and refactor
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* [autofix.ci] apply automated fixes

Co-authored-by: Olayinka Adelakun <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* remove console warnings * [autofix.ci] apply automated fixes --------- Co-authored-by: Olayinka Adelakun <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: mask value to hide null field being returned * [autofix.ci] apply automated fixes * fix: added testcase and updated functionality --------- Co-authored-by: Olayinka Adelakun <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Coelho <[email protected]> Co-authored-by: Olayinka Adelakun <[email protected]>
#10827) Fix: Allow refresh list button to stay stagnant while zoom (Safari) (#10777) * remove sticky as it was causing the refresh list to float on safari * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes --------- Co-authored-by: Olayinka Adelakun <[email protected]> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
This reverts commit 423419e.
* fix: Ollama model list fails to load in Agent and Ollama components * [autofix.ci] apply automated fixes * [autofix.ci] apply automated fixes (attempt 2/3) --------- Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
* fix: made sure the tab is visible
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* Fix: added typing
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* fix: added testcases
* fix: added handleOnValueChange function and created a helper file

Co-authored-by: Olayinka Adelakun <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Olayinka Adelakun <[email protected]>
Co-authored-by: Carlos Coelho <[email protected]>
Co-authored-by: Sami Marreed <[email protected]>
Remove DataFrameToToolsetComponent and related tests Deleted the DataFrameToToolsetComponent implementation, its import/registration in the processing module, and all associated unit tests. This cleans up unused code and test files related to converting DataFrame rows into toolset actions.
fix: Proper parsing of GCP credentials JSON (#10828)
* fix: Proper parsing of GCP credentials JSON
* Update save_file.py
* [autofix.ci] apply automated fixes (attempts 1-3)
* Update test_save_file_component.py
* [autofix.ci] apply automated fixes (attempts 1-3)
* Fix GCP issues
* [autofix.ci] apply automated fixes (attempts 1-2)
* Update test_save_file_component.py
* Update save_file.py
* [autofix.ci] apply automated fixes (attempts 1-3)
* [autofix.ci] apply automated fixes (attempts 1-2)
* Update save_file.py
* [autofix.ci] apply automated fixes (attempts 1-3)
* Update save_file.py
* Fix ruff errors
* [autofix.ci] apply automated fixes (attempts 1-3)
* [autofix.ci] apply automated fixes (attempts 1-2)

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
fix: Refine data logging in LambdaFilterComponent to capture actual payload
Walkthrough

Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches❌ Failed checks (3 warnings)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (1)
405-414: Version update is incomplete and does not achieve the stated objective. FastAPI 0.124.2 (PyPI latest) and 0.124.4 (GitHub latest) are available, but the metadata shows a bump to only 0.123.0. This represents an outdated dependency that falls 2-3 patch versions behind current releases. If the PR's objective is to update dependencies, the fastapi version should be bumped to at least 0.124.0 or newer, not 0.123.0.
src/backend/base/langflow/initial_setup/starter_projects/Meeting Summary.json (1)
3150-3176: Provider options in template don't match embedded component code. The template `options` array lists only `["OpenAI", "Anthropic", "Google"]` with 3 `options_metadata` entries, but the embedded component code declares 5 providers (`["OpenAI", "Anthropic", "Google", "IBM watsonx.ai", "Ollama"]`) with full handler implementations. The template should be updated to include all 5 providers with corresponding metadata entries, or the embedded code should be reduced to match the template. The mismatch creates an inconsistency where users cannot select IBM watsonx.ai or Ollama from the dropdown despite the code supporting them.
src/backend/base/langflow/initial_setup/starter_projects/Instagram Copywriter.json (1)
1044-1120: Remove unused `_serialize_data` method from ChatOutput. FastAPI 0.123.0 is a valid release. However, the `_serialize_data` method defined in the ChatOutput class is not invoked anywhere in `message_response()` or `convert_to_string()` and should be removed to eliminate dead code.
♻️ Duplicate comments (9)
src/backend/base/langflow/initial_setup/starter_projects/Price Deal Finder.json (2)
419-495: ChatOutput: same shared implementation as in SaaS Pricing. This `ChatOutput` block is identical to the implementation already reviewed in `SaaS Pricing.json`; the previous approval and reasoning apply here as well.
1606-1797: AgentComponent: shared provider-aware agent implementation. This embedded `AgentComponent` matches the shared version reviewed in `SaaS Pricing.json`; it reuses the same provider-agnostic LLM wiring and structured-output logic, so the earlier approval (including the IBM watsonx.ai considerations) applies here too.
src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json (1)
235-310: ChatOutput: shared implementation reused here. This `ChatOutput` definition is the same shared implementation already approved in the other starters (session/source handling, input validation, serialization); no additional concerns specific to this flow.
src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json (3)
237-237: ChatInput starter matches other flows; behavior looks consistent. This `ChatInput` implementation is the same as in the Invoice Summarizer starter, with proper file normalization and `session_id`-based storage gating. The earlier note about optionally guarding `self.graph.session_id` with an `hasattr` check applies here as well if you ever expect to run this component without an attached graph.

Also applies to: 285-285
511-511: ChatOutput starter aligns with shared implementation; no additional issues spotted. This `ChatOutput` matches the shared implementation reviewed in the other starter (conversion, validation, session handling, and source metadata). No extra issues here beyond the optional graph-presence guard already mentioned.

Also applies to: 520-520, 585-585
1560-1560: Second LanguageModelComponent block mirrors the first. The second `LanguageModelComponent` code block is identical to the first one, so all the observations about provider handling and external interactions apply here as well.
src/backend/base/langflow/initial_setup/starter_projects/Document Q&A.json (1)
414-489: ChatOutput here matches the reviewed implementation in Blog Writer. This `ChatOutput` definition is identical to the one already reviewed in `Blog Writer.json` (including `_build_source`, `_validate_input`, and `convert_to_string`). The same comments about defensive `graph` handling apply; otherwise, the implementation looks good.
src/backend/base/langflow/initial_setup/starter_projects/Basic Prompting.json (2)
120-169: ChatInput matches the implementation reviewed in Document Q&A. This `ChatInput` definition is the same as in `Document Q&A.json` (including the updated docs URL and file handling). The same suggestion applies to make `session_id` derivation robust against missing `graph`:

```diff
- session_id = self.session_id or self.graph.session_id or ""
+ graph = getattr(self, "graph", None)
+ session_id = self.session_id or (graph.session_id if graph else "")
```

Otherwise, the component looks good.
587-662: ChatOutput identical to previously reviewed implementation. This `ChatOutput` block (including `_build_source`, `_validate_input`, `convert_to_string`, and the new docs URL) is identical to the version already reviewed in `Blog Writer.json`. The same notes apply:
- Behavior and error handling are good and should work with different model providers, including IBM watsonx.ai.
- Consider guarding `self.graph` when computing `message.session_id` and `message.flow_id`, as suggested there.

No additional issues beyond that.
🧹 Nitpick comments (11)
src/backend/base/langflow/initial_setup/starter_projects/Basic Prompt Chaining.json (1)
709-709: Type annotation mismatch in `_build_source` method. The method signature declares `source: str | None` but the implementation handles objects with `model_name` or `model` attributes. Consider updating the type hint:

```diff
-def _build_source(self, id_: str | None, display_name: str | None, source: str | None) -> Source:
+def _build_source(self, id_: str | None, display_name: str | None, source: Any) -> Source:
```

Also note: In `convert_to_string`, the `clean_data` parameter is passed to `safe_convert` only when `input_value` is a list, but not for single items. If this inconsistency is intentional, consider adding a comment to clarify.
src/backend/base/langflow/helpers/flow.py (1)
102-117: Redundant None check on `order_params`. After lines 102-103, `order_params` is guaranteed to be a dict, so the check on line 114 (`if order_params is not None`) is always true and can be removed.

```diff
 if order_params is None:
     order_params = {"column": "updated_at", "direction": "desc"}
 try:
     async with session_scope() as session:
         uuid_user_id = UUID(user_id) if isinstance(user_id, str) else user_id
         uuid_folder_id = UUID(folder_id) if isinstance(folder_id, str) else folder_id
         stmt = (
             select(Flow.id, Flow.name, Flow.updated_at)
             .where(Flow.user_id == uuid_user_id)
             .where(Flow.folder_id == uuid_folder_id)
         )
-        if order_params is not None:
-            sort_col = getattr(Flow, order_params.get("column", "updated_at"), Flow.updated_at)
-            sort_dir = SORT_DISPATCHER.get(order_params.get("direction", "desc"), desc)
-            stmt = stmt.order_by(sort_dir(sort_col))
+        sort_col = getattr(Flow, order_params.get("column", "updated_at"), Flow.updated_at)
+        sort_dir = SORT_DISPATCHER.get(order_params.get("direction", "desc"), desc)
+        stmt = stmt.order_by(sort_dir(sort_col))
```
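The `SORT_DISPATCHER` lookup discussed above, mapping a direction string to an ordering callable, can be sketched in plain Python. This is an illustrative stand-in, not the repository's actual helpers; the flow records here are simple dicts rather than SQLModel rows:

```python
# Minimal sketch of dispatching a sort direction via a dict of callables,
# mirroring the SORT_DISPATCHER.get(direction, default) pattern above.
from operator import itemgetter

SORT_DISPATCHER = {
    "asc": lambda rows, col: sorted(rows, key=itemgetter(col)),
    "desc": lambda rows, col: sorted(rows, key=itemgetter(col), reverse=True),
}

def list_flows(rows, order_params=None):
    # Default the params up front so later code can assume a dict --
    # this is why the second `is not None` check was redundant.
    if order_params is None:
        order_params = {"column": "updated_at", "direction": "desc"}
    column = order_params.get("column", "updated_at")
    order = SORT_DISPATCHER.get(order_params.get("direction", "desc"), SORT_DISPATCHER["desc"])
    return order(rows, column)

rows = [
    {"name": "a", "updated_at": 1},
    {"name": "b", "updated_at": 3},
    {"name": "c", "updated_at": 2},
]
print([r["name"] for r in list_flows(rows)])  # most recently updated first
```

Defaulting once at the top keeps the rest of the function branch-free, which is exactly the simplification the diff above applies.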
308-309: ChatOutput session handling & validation look correct; only a minor defensive option. The new `ChatOutput` implementation (conversion + validation + gated persistence) looks solid and consistent with other starters. The only small thing you might consider is guarding `self.graph.session_id` the same way you already guard `self.graph.flow_id` (e.g., via `hasattr(self, "graph")`) if there is any chance `ChatOutput` can be used outside a graph context; if not, the current code is fine.

Also applies to: 317-317, 383-383
864-864: ChatInput files cleanup and session gating are good; same note about `graph` availability. The updated `ChatInput` correctly normalizes `files` to a clean list and only persists messages when a `session_id` is present. As with `ChatOutput`, this assumes `self.graph` is always set when resolving `session_id`; if `ChatInput` can ever execute without an attached graph, adding an `hasattr(self, "graph")` guard around `self.graph.session_id` would prevent an `AttributeError`.

Also applies to: 914-914
src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json (1)
1778-1778: StructuredOutputComponent trustcall + LangChain fallback design is sound; double-check result shapes from `get_chat_result`. The new `StructuredOutputComponent` builds a Pydantic list-wrapper model from `output_schema`, then:
- Tries a Trustcall‑based extractor first.
- Falls back to `llm.with_structured_output(schema)` if Trustcall isn't available or fails.
- Normalizes results into either a list of objects (for `Data`/`DataFrame` outputs) or raises when nothing usable is returned.

Overall the flow and error handling look good and should improve robustness with different providers (including IBM watsonx.ai). The only thing to be careful about is that both `_extract_output_with_trustcall` and `_extract_output_with_langchain` rely on the concrete shape returned by `get_chat_result` (list vs. dict with `responses`, BaseModel vs. plain dict). It would be worth adding/confirming tests that cover:
- Trustcall success and failure paths.
- The LangChain fallback for providers where tool calling/Trustcall isn’t supported.
- Using at least one IBM watsonx.ai model through this component to ensure structured outputs parse as expected.
If those tests pass, the implementation here should be in good shape.
Also applies to: 1849-1849
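The trustcall-then-LangChain design described above reduces to a try-primary-then-secondary extraction pattern. A minimal, library-free sketch of that shape (the two extractor callables are illustrative stand-ins for `_extract_output_with_trustcall` and `_extract_output_with_langchain`; the real component calls trustcall and LangChain's `with_structured_output`):

```python
# Hedged sketch of the two-stage structured-output fallback described above.
# Neither trustcall nor LangChain is imported; plain callables stand in for
# the component's extractor methods.

def extract_structured(text, primary, fallback):
    """Try the tool-calling extractor first; fall back for models without tool support."""
    try:
        result = primary(text)
        if result:  # treat an empty result as a miss, not a success
            return result
    except Exception:
        pass  # e.g. the provider rejects tool calls (some watsonx.ai models)
    result = fallback(text)
    if not result:
        raise ValueError("No structured output could be extracted")
    return result

def trustcall_style(text):
    # Stand-in for a provider whose model doesn't support tool calling.
    raise RuntimeError("model does not support tool calling")

def langchain_style(text):
    # Stand-in for llm.with_structured_output(schema) returning parsed rows.
    return [{"text": text}]

print(extract_structured("invoice #42", trustcall_style, langchain_style))
```

The tests suggested in the comment above map directly onto this shape: primary success, primary failure with fallback success, and both paths returning nothing.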
src/backend/base/langflow/initial_setup/starter_projects/Blog Writer.json (3)
480-555: ChatOutput: behavior is good, but session handling should guard missing `graph`. The new `ChatOutput` implementation (conversion, validation, and `_build_source`) is solid and fixes issues when `source` is a model object (e.g., non-OpenAI) by always coercing it to a string before building `Source`.

One edge case: both `message_response` implementations use `self.graph.session_id` without checking whether `self.graph` is set/non-None, which can raise `AttributeError` when the component is used outside a fully wired graph (tests, programmatic usage, tools).

Consider hardening this:

```diff
- message.session_id = self.session_id or self.graph.session_id or ""
- message.flow_id = self.graph.flow_id if hasattr(self, "graph") else None
+ graph = getattr(self, "graph", None)
+ message.session_id = self.session_id or (graph.session_id if graph else "")
+ message.flow_id = graph.flow_id if graph else None
```

The same pattern applies to the identical ChatOutput definitions in other starter projects.
780-831: ParserComponent: logic is sound; minor config simplification possible. The new `ParserComponent` correctly handles `DataFrame` vs `Data`, structured dicts, and raises clear errors for unsupported types. The `Stringify` path via `convert_to_string` and the default-value handling for missing keys are reasonable.

Two minor points you may want to tighten:

- `update_build_config` uses `self.mode` instead of `field_value`, and `if field_value:` will always be truthy for `"Parser"`/`"Stringify"`. You can simplify and make it more explicit:

```diff
-        if field_name == "mode":
-            build_config["pattern"]["show"] = self.mode == "Parser"
-            build_config["pattern"]["required"] = self.mode == "Parser"
-            if field_value:
+        if field_name == "mode":
+            is_parser = field_value == "Parser"
+            build_config["pattern"]["show"] = is_parser
+            build_config["pattern"]["required"] = is_parser
+            if is_parser:
                 ...
-            else:
-                build_config.pop("clean_data", None)
+            else:
+                build_config.pop("clean_data", None)
```

- In `convert_to_string`, `safe_convert(self.input_data or False)` will pass `False` if `input_data` is falsy, which is probably not intended for `Data`/`DataFrame`; you might just drop the `or False`.

These are non-blocking and can be deferred.
976-1089: URLComponent: implementation is robust; watch for loader exceptions outside `RequestException`. The new `URLComponent` adds good validation (`ensure_url`, `validate_url`), clear configuration, and structured output with safe conversion and metadata, which improves reliability and downstream use.

One thing to keep in mind: inside `fetch_url_contents`, only `requests.exceptions.RequestException` is handled per-URL; any other exception from `RecursiveUrlLoader.load()` will bubble to the outer `except Exception`, aborting the whole operation instead of just skipping the bad URL. If `RecursiveUrlLoader` can raise non-`RequestException` errors (e.g., parsing issues), you may want to broaden the inner `except` or adjust your logging/continue behavior.

Otherwise this component looks solid.
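Broadening the inner handler as suggested amounts to isolating failures per URL so one bad page cannot abort the batch. A library-free sketch of that loop shape (the fetcher and URLs are illustrative; the real code wraps `RecursiveUrlLoader.load()`):

```python
# Sketch: per-item error isolation in a fetch loop, as suggested above.
# `fetch` stands in for the loader call; any per-URL failure is recorded
# and skipped instead of bubbling to an outer handler.

def fetch_all(urls, fetch):
    contents, errors = [], []
    for url in urls:
        try:
            contents.append(fetch(url))
        except Exception as exc:  # broad on purpose: skip this URL, keep going
            errors.append((url, str(exc)))
    return contents, errors

def fake_fetch(url):
    # Stand-in loader: fails on one URL to exercise the skip path.
    if "bad" in url:
        raise ValueError("unparseable page")
    return f"<html>{url}</html>"

contents, errors = fetch_all(["https://a.example", "https://bad.example"], fake_fetch)
print(len(contents), len(errors))
```

The trade-off is the usual one: a broad per-item `except` keeps the batch alive but must log enough context (URL plus error) that silent data loss is visible.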
src/backend/base/langflow/initial_setup/starter_projects/Document Q&A.json (3)
148-199: ChatInput: good file handling; consider safer `graph` access. The updated `ChatInput` correctly normalizes `files` into a list, filters out empty values, and includes them when creating the `Message`. This should avoid previous None/empty entries issues.

As with `ChatOutput`, `session_id` is derived via `self.session_id or self.graph.session_id or ""`. If `self.graph` is ever unset or None (e.g., in tests or tool calls), this will raise. A more defensive pattern would be:

```diff
- session_id = self.session_id or self.graph.session_id or ""
+ graph = getattr(self, "graph", None)
+ session_id = self.session_id or (graph.session_id if graph else "")
```

Behavior in normal flows stays the same but becomes safer in edge contexts.
1190-1227: FileComponent metadata/dependencies: description clarified; ensure dependency alignment. The File component description is clearer ("Loads and returns the content from uploaded files"), and metadata now declares `langchain_core` and `pydantic` alongside `lfx`. That matches the new use of `StructuredTool` and `BaseModel` in the component code.

Please double-check that:

- `langchain_core==0.3.80` and `pydantic==2.11.10` are compatible with the versions pinned in `pyproject.toml` and elsewhere.
- Runtime environments don't have conflicting `pydantic` v1/v2 usages.

If everything is already aligned, this metadata change is fine.
1286-1615: FileComponent: strong design with Docling isolation; a few edge cases to consider. The new `FileComponent` is a substantial improvement: tool-friendly (`_get_tools`), dynamic description that names files, content-type validation for images, Docling processing isolated to a subprocess, and clear advanced-mode behavior. Overall it's thoughtfully implemented.

A few non-blocking considerations:
**Handling `advanced_data is None` from the Docling path**

In `process_files`, advanced path:

```python
advanced_data: Data | None = self._process_docling_in_subprocess(file_path)
payload = getattr(advanced_data, "data", {}) or {}
doc_rows = payload.get("doc")
...
else:
    final_return.extend(self.rollup_data(file_list, [advanced_data]))
```

If `_process_docling_in_subprocess` returns None (e.g., empty `file_path` or unexpected failure), `advanced_data` is None and gets passed into `rollup_data`, which may not expect None. Consider short-circuiting when `advanced_data` is falsy and logging/raising instead of rolling it up.

**Standard path concurrency flag semantics**
You've marked `use_multithreading` as deprecated and documented that "Processing Concurrency > 1" enables multithreading, but the code still gates on `use_multithreading`:

```python
concurrency = 1 if not self.use_multithreading else max(1, self.concurrency_multithreading)
```

If the intent is "any `concurrency_multithreading > 1` enables parallelism", you could simplify to:

```python
concurrency = max(1, self.concurrency_multithreading)
```

and ignore `use_multithreading`, or auto-derive it from the concurrency value.

**Docling compatibility and formats**
`_is_docling_compatible` includes a wide set of extensions (including some structured ones). Combined with `advanced_mode` gates, this is probably intentional, but it means that if someone somehow forces `advanced_mode=True` for a `.csv`/`.json` file (e.g., via API), Docling will be tried and any failure will only surface via the `Data(error=...)` path. That's acceptable, just be aware of the behavior.

**Subprocess path safety**
You already sanitize `args["file_path"]` for shell metacharacters and pass config via stdin to `python -c`, which is good. Since you're not invoking a shell and not interpolating the path into the command, the risk is low; the extra filter is mainly defense-in-depth and looks fine.

None of these are blockers; the component should work well as written.
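The short-circuit suggested for the Docling path is a small guard before aggregation. A hedged, library-free sketch of that guard (the processor, `rollup`, and the dict results are simplified stand-ins for the component's `Data` types):

```python
# Sketch: guard a possibly-None subprocess result before rolling it up,
# as suggested for _process_docling_in_subprocess above. The callables
# and dict payloads are illustrative stand-ins, not the component's API.

def process_file(file_path, run_docling, rollup):
    advanced_data = run_docling(file_path)  # may return None on failure
    if advanced_data is None:
        # Short-circuit: surface the failure instead of passing None downstream.
        return {"error": f"Docling processing failed for {file_path!r}"}
    return rollup([advanced_data])

fake_docling = lambda p: None if not p else {"doc": p}
fake_rollup = lambda items: {"rows": items}

print(process_file("", fake_docling, fake_rollup))
print(process_file("a.pdf", fake_docling, fake_rollup))
```

The point is only the placement of the check: failing early keeps `rollup_data` free of None-handling logic.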
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `src/frontend/package-lock.json` is excluded by `!**/package-lock.json`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (21)
- `.secrets.baseline` (1 hunks)
- `pyproject.toml` (1 hunks)
- `src/backend/base/langflow/helpers/flow.py` (3 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Basic Prompt Chaining.json` (5 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Basic Prompting.json` (5 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Blog Writer.json` (9 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Custom Component Generator.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Document Q&A.json` (8 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json` (9 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Instagram Copywriter.json` (9 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Invoice Summarizer.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Knowledge Ingestion.json` (5 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json` (6 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Meeting Summary.json` (13 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Memory Chatbot.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Price Deal Finder.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Research Agent.json` (7 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json` (11 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/SEO Keyword Generator.json` (3 hunks)
- `src/backend/base/langflow/initial_setup/starter_projects/SaaS Pricing.json` (7 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
{pyproject.toml,uv.lock}
📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)
Use `uv` (>=0.4) for Python dependency management in backend development
Files:
pyproject.toml
src/backend/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)
`src/backend/**/*.py`:
- Use FastAPI async patterns with `await` for async operations in component execution methods
- Use `asyncio.create_task()` for background tasks and implement proper cleanup with try/except for `asyncio.CancelledError`
- Use `queue.put_nowait()` for non-blocking queue operations and `asyncio.wait_for()` with timeouts for controlled get operations
Files:
src/backend/base/langflow/helpers/flow.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-11-24T19:46:09.104Z
Learning: Applies to src/backend/base/langflow/components/**/*.py : Add new components to the appropriate subdirectory under `src/backend/base/langflow/components/` (agents/, data/, embeddings/, input_output/, models/, processing/, prompts/, tools/, or vectorstores/)
📚 Learning: 2025-11-24T19:46:09.104Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-11-24T19:46:09.104Z
Learning: Applies to src/backend/base/langflow/components/**/__init__.py : Update `__init__.py` with alphabetically sorted imports when adding new components
Applied to files:
- src/backend/base/langflow/initial_setup/starter_projects/Custom Component Generator.json
- src/backend/base/langflow/helpers/flow.py
📚 Learning: 2025-06-26T19:43:18.260Z
Learnt from: ogabrielluiz
Repo: langflow-ai/langflow PR: 0
File: :0-0
Timestamp: 2025-06-26T19:43:18.260Z
Learning: In langflow custom components, the `module_name` parameter is now propagated through template building functions to add module metadata and code hashes to frontend nodes for better component tracking and debugging.
Applied to files:
- src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json
- src/backend/base/langflow/initial_setup/starter_projects/Knowledge Ingestion.json
📚 Learning: 2025-11-24T19:47:28.997Z
Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-24T19:47:28.997Z
Learning: Applies to src/backend/tests/**/*.py : Use predefined JSON flows and utility functions from `tests.unit.build_utils` (create_flow, build_flow, get_build_events, consume_and_assert_stream) for flow execution testing
Applied to files:
src/backend/base/langflow/helpers/flow.py
🧬 Code graph analysis (1)
src/backend/base/langflow/helpers/flow.py (3)
src/lfx/src/lfx/schema/data.py (1)
- `Data` (26-288)

src/lfx/src/lfx/helpers/flow.py (3)
- `list_flows_by_flow_folder` (88-131)
- `list_flows_by_folder_id` (134-169)
- `get_flow_by_id_or_name` (172-202)

src/backend/tests/conftest.py (1)
- `flow` (549-569)
🔇 Additional comments (48)
src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json (3)
1163-1163: Good fallback mechanism for model compatibility. The implementation correctly handles the IBM watsonx.ai compatibility issue by first attempting extraction with trustcall, then falling back to LangChain's `with_structured_output` when trustcall fails. The warning message appropriately explains that some models don't support tool calling, which aligns with the PR objective.
229-229: Consistent session ID fallback pattern.

The session handling logic `self.session_id or self.graph.session_id or ""` provides a robust fallback chain. This pattern is consistently applied in both ChatOutput and ChatInput components.
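The fallback chain is easy to illustrate in isolation. The classes below are minimal stand-ins for the Langflow component and graph objects, not the real implementations:

```python
class Graph:
    """Illustrative stand-in for a flow graph carrying a session ID."""

    def __init__(self, session_id=None):
        self.session_id = session_id


class ChatComponent:
    """Illustrative stand-in for a chat input/output component."""

    def __init__(self, session_id=None, graph=None):
        self.session_id = session_id
        self.graph = graph

    def resolve_session_id(self) -> str:
        # Component-level ID wins; otherwise fall back to the graph's,
        # and finally to an empty string so callers always get a str.
        return self.session_id or self.graph.session_id or ""
```

A component with no explicit ID inherits the graph's session; when both are unset, the empty-string default keeps `None` from leaking into message persistence.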
160-163: FastAPI version 0.123.0 is available and valid.

The dependency version bump is not an issue; this version exists and is established in the release timeline.
src/backend/base/langflow/initial_setup/starter_projects/Basic Prompt Chaining.json (4)
364-376: LGTM!

Metadata updates (code_hash, dependencies, module path) correctly reflect the updated ChatInput component code.
413-413: Improved session ID handling looks good.

The fallback chain `self.session_id or self.graph.session_id or ""` ensures consistent session resolution across the flow. The async `Message.create()` pattern and file filtering logic are correct.
1346-1361: Provider options in template don't include IBM watsonx.ai or Ollama.

The template's `options` array only lists `["OpenAI", "Anthropic", "Google"]`, but the embedded code supports 5 providers including "IBM watsonx.ai" and "Ollama". Given the PR title mentions fixing IBM watsonx.ai issues, verify this mismatch is intentional and that users can still select IBM watsonx.ai from the dropdown in the UI.

The same pattern appears in the other two LanguageModelComponent nodes (lines 1668-1693, 1989-2014).
643-645: FastAPI 0.123.0 is a valid PyPI release. Consider whether pinning to a more recent 0.123.x patch version (0.123.9 or 0.123.10) may be preferred for bug fixes and stability.

src/backend/base/langflow/initial_setup/starter_projects/Custom Component Generator.json (3)
2240-2250: FastAPI 0.123.0 dependency bump is safe to proceed with.

The upgrade from FastAPI 0.120.0 to 0.123.0 does not introduce breaking changes affecting `fastapi.encoders.jsonable_encoder` or standard FastAPI imports. Changes between these versions focus on dependency handling improvements and Pydantic v1 deprecation notices, not on core API modifications.
302-302: Embedded Python code syntax validates correctly.

The three embedded Python code blocks have been verified and all pass Python syntax validation:
- Memory Component: 10,363 characters, 12 import statements, 4 async methods
- ChatInput Component: 3,483 characters, 6 import statements, 1 async method
- ChatOutput Component: 7,003 characters, 13 import statements, 1 async method
All imports are valid, method signatures are compatible with dependencies, and Session ID handling logic is properly implemented.
240-240: Code hash metadata is correctly updated and consistent.

All three code_hash values match their embedded code blocks:
- Memory Component (line 240): efd064ef48ff ✓
- ChatInput (line 1960): 7a26c54d89ed ✓
- ChatOutput (line 2240): cae45e2d53f6 ✓
src/backend/base/langflow/initial_setup/starter_projects/Knowledge Ingestion.json (3)
92-92: SplitTextComponent updates look consistent.

The code_hash update and documentation URL change to https://docs.langflow.org/split-text are consistent with the embedded code modifications.

Also applies to: 181-181
356-356: URLComponent updates look consistent.

The code_hash update and documentation URL change to https://docs.langflow.org/url are consistent with the embedded code modifications.

Also applies to: 467-467
773-773: Dependency version bump for langchain_cohere.

The update from version 0.3.3 to 0.3.5 aligns with the broader dependency updates in this PR.
.secrets.baseline (1)
768-883: Secrets baseline updates for starter project code hashes.

These additions document known false positives for Hex High Entropy String detections in starter project JSON files. The flagged values are code_hash identifiers (12-character hex strings) generated for component tracking, not actual secrets. All entries correctly have "is_secret": false.

src/backend/base/langflow/initial_setup/starter_projects/SEO Keyword Generator.json (2)
639-639: ChatOutput session handling improvements.

The updated `message_response` method includes improved session_id resolution with a fallback chain: `message.session_id = self.session_id or self.graph.session_id or ""`. The guard for Message reuse (`not self.is_connected_to_chat_input()`) ensures proper message creation when connected to a ChatInput component. The documentation URL update to `chat-input-and-output` consolidates the reference.
565-574: Metadata updates for ChatOutput component.

The code_hash and fastapi dependency version bump (0.120.0 → 0.123.0) are consistent with the embedded code changes.
pyproject.toml (1)
138-139: Dependency version updates are reasonable.

The version bumps for `cuga` (from ~=0.1.11) and `agent-lifecycle-toolkit` (to ~=0.4.4) are minor updates. The change in `cuga` from exact pinning (==) to compatible release (~=) allows patch-level updates within the 0.1.x series, which is appropriate for minor versions.
480-480: ChatOutput embedded code looks correct.

The embedded ChatOutput component code includes proper session handling with fallback logic (`self.session_id or self.graph.session_id or ""`), source building with model attribute detection, and input validation. The implementation follows established patterns.
687-736: ChatInput embedded code properly handles session context.

The ChatInput component correctly:
- Handles file input normalization
- Falls back to `graph.session_id` when `session_id` is not provided
- Uses the `Message.create()` factory method for async message creation
- Properly filters None/empty file values
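The async-factory shape behind `Message.create()` can be sketched generically; the class below is an illustrative stand-in, not the lfx `Message` itself:

```python
import asyncio


class Message:
    """Illustrative stand-in for an lfx-style message object."""

    def __init__(self, text="", session_id="", files=None):
        self.text = text
        self.session_id = session_id
        # Filter out None/empty file entries, mirroring the normalization
        # described in the review comment above.
        self.files = [f for f in (files or []) if f]

    @classmethod
    async def create(cls, **kwargs):
        # An async classmethod lets construction await I/O (e.g. persisting
        # the message) before the instance is handed back to the caller.
        return cls(**kwargs)


msg = asyncio.run(Message.create(text="hi", files=["a.txt", None, ""]))
```

The factory keeps `__init__` synchronous while giving callers a single awaitable entry point for creation-plus-side-effects.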
987-1036: ParserComponent code update looks correct.

The ParserComponent properly handles both "Parser" and "Stringify" modes, with appropriate type validation for DataFrame and Data inputs. The `update_build_config` method correctly toggles the `pattern` field visibility based on mode selection.
1171-1237: LoopComponent code correctly handles Message to Data conversion.

The LoopComponent:
- Properly validates input data types (DataFrame, Data, Message, list)
- Converts Message objects to Data for consistent processing
- Manages loop state via context (`ctx`) with proper initialization checks
- Updates run dependencies for loop iteration
1624-1699: TypeConverterComponent code is well-structured.

The TypeConverterComponent provides clean conversion functions between Message, Data, and DataFrame types with proper auto-parsing support for JSON/CSV content. The `update_outputs` method correctly manages dynamic output display based on selected output type.

src/backend/base/langflow/initial_setup/starter_projects/Memory Chatbot.json (3)
151-199: ChatInput component metadata and code updates are consistent.

The ChatInput component correctly handles:
- Session ID fallback chain (`self.session_id or self.graph.session_id or ""`)
- Async message creation with `Message.create()`
- File normalization and filtering
- Context ID for memory layering
426-500: ChatOutput component properly handles source attribution.

The `_build_source` method correctly handles different source object types (checking for `model_name` and `model` attributes), and the `message_response` method properly sets message properties including session/context IDs and flow ID.
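The attribute-detection order described above can be sketched as follows; the helper name and shapes come from the review text, and the exact Langflow signature may differ:

```python
def source_display_name(source_obj) -> str:
    """Prefer `model_name`, then `model`, then fall back to str()."""
    if source_obj is None:
        return ""
    model_name = getattr(source_obj, "model_name", None)
    if model_name:
        return model_name
    model = getattr(source_obj, "model", None)
    if model:
        # Some model wrappers expose `model` as a non-string object.
        return str(model)
    return str(source_obj)


class FakeLLM:
    # Hypothetical stand-in for a chat-model object exposing model_name.
    model_name = "gpt-4o-mini"
```

Using `getattr` with a default keeps the helper safe for arbitrary source objects, including plain strings and `None`.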
902-966: Memory component session_id handling is correct and already consistent with other chat components.

The concern about missing fallback logic is unfounded. By default, if the Session ID value is empty, it is set to the same value as the Flow ID, and all components in the flow automatically use this session_id value. The Memory component's info text confirms this: "If empty, the current session ID parameter will be used." This behavior is managed by the framework, not by individual component code. ChatInput and ChatOutput components work the same way; they do not implement explicit fallback logic because the framework handles session_id resolution uniformly across all components. The empty default for session_id in the Memory component is intentional and correct.
src/backend/base/langflow/initial_setup/starter_projects/Research Agent.json (3)
480-528: Updated ChatInput component with improved session handling.

The session ID resolution now correctly falls back through multiple sources: `self.session_id or self.graph.session_id or ""`. This ensures robust session handling when the component-level session ID is not explicitly set.
2579-2770: Agent component updated with langchain_core 0.3.80 and improved exception handling.

The Agent component code shows proper exception handling patterns and structured output validation. The dependency version bump for langchain_core aligns with the coordinated update.
1644-1720: ChatOutput component updated with FastAPI 0.123.0.

The `_build_source` helper method properly handles different model object types (checking for `model_name`, then `model`, then falling back to `str()`). The code is compatible with the specified FastAPI version.

Note: FastAPI 0.124.4 is now available as the latest version. Consider upgrading if this is part of a broader dependency update initiative.
src/backend/base/langflow/helpers/flow.py (4)
9-10: New SQLAlchemy imports for aliased joins and sorting.

The imports for `aliased`, `asc`, and `desc` support the new query patterns for flow listing. This is a clean approach for dynamic ordering.
30-33: SORT_DISPATCHER provides clean dynamic ordering.

Using a dictionary dispatcher for sort direction is a good pattern that avoids repetitive if-else chains and provides safe defaults.
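A plain-Python analogue of the dispatcher pattern, using `sorted` in place of SQLAlchemy's `asc`/`desc` to show the shape (names here are illustrative, not the helper's actual code):

```python
# Map sort-direction strings to ordering callables instead of if/else chains.
SORT_DISPATCHER = {
    "asc": lambda rows, key: sorted(rows, key=key),
    "desc": lambda rows, key: sorted(rows, key=key, reverse=True),
}


def list_sorted(rows, key, direction="asc"):
    # Unknown directions fall back to ascending, giving a safe default.
    order = SORT_DISPATCHER.get(direction, SORT_DISPATCHER["asc"])
    return order(rows, key)
```

For example, `list_sorted([3, 1, 2], key=lambda x: x, direction="desc")` returns `[3, 2, 1]`, and an unrecognized direction silently degrades to ascending rather than raising.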
52-89: New `list_flows_by_flow_folder` retrieves flows in the same folder as a given flow.

The self-join via `aliased(Flow)` correctly identifies the folder of the specified flow and returns sibling flows. The `noqa: B006` for the mutable default dict is acceptable since it's only read, not mutated. The `_mapping` access (line 86) is a SQLAlchemy Row attribute; consider using `._asdict()` for cleaner conversion, though this is minor.
126-163: New `get_flow_by_id_or_name` correctly prioritizes flow_id over flow_name.

The logic properly handles both lookup methods and converts string IDs to UUIDs when needed. The implementation aligns with the lfx stub interface shown in the relevant code snippets.
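The id-over-name priority can be sketched against an in-memory list; the function name matches the review, while the `flows` store and dict shape are illustrative stand-ins for the database layer:

```python
import uuid


def get_flow_by_id_or_name(flows, flow_id=None, flow_name=None):
    if flow_id is None and flow_name is None:
        raise ValueError("Provide either flow_id or flow_name")
    if flow_id is not None:
        # Normalize string IDs to UUID before comparing, then prefer the
        # id lookup even when a name is also supplied.
        key = uuid.UUID(flow_id) if isinstance(flow_id, str) else flow_id
        return next((f for f in flows if f["id"] == key), None)
    return next((f for f in flows if f["name"] == flow_name), None)
```

When both arguments are given, the id wins, which is the prioritization the review comment confirms.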
One minor observation: the check on lines 148-150 is defensive but redundant, since lines 142-147 guarantee `attr` and `val` are set if we pass the validation on lines 135-137. However, this defensive coding is acceptable.

src/backend/base/langflow/initial_setup/starter_projects/SaaS Pricing.json (3)
371-448: ChatOutput: session/source handling and input validation look solid

The updated ChatOutput implementation (validated inputs, `convert_to_string`, session ID fallback, and `_build_source` usage) is consistent and robust for Data/DataFrame/Message inputs and aligns with the shared component version used elsewhere. No issues from this starter embedding.
689-742: CalculatorComponent: safe AST evaluation with clear error handling

The calculator implementation safely restricts allowed AST node types and operators, handles legacy `ast.Num`, and returns a cleanly formatted numeric result or structured error in `Data`. This looks correct and appropriate for a starter tool.
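A minimal sketch of the AST-restricted evaluation approach; names are illustrative, and the starter component's actual code additionally handles legacy `ast.Num` and wraps errors in `Data`:

```python
import ast
import operator

# Whitelist of binary operators the evaluator will accept.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def safe_eval(expression: str):
    """Evaluate a pure-arithmetic expression, rejecting everything else."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Names, calls, attribute access, etc. are all rejected here.
        raise ValueError(f"Unsupported expression: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))
```

Because only numeric constants and whitelisted binary operators are walked, inputs like `__import__('os')` fail at the `Call` node instead of executing.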
868-1059: AgentComponent: provider-agnostic wiring suitable for IBM watsonx.ai

The embedded AgentComponent correctly delegates provider-specific details to `MODEL_PROVIDERS_DICT`/`MODEL_PROVIDERS_LIST`, filters out `json_mode` from OpenAI, and centralizes LLM construction in `_build_llm_model`/`get_llm`. This should work uniformly for providers like IBM watsonx.ai so long as their entries are present in `MODEL_PROVIDERS_DICT`, which matches the intent of the PR.

Please double-check that the IBM watsonx.ai provider entry in `MODEL_PROVIDERS_DICT` exposes the expected `component_class`, `inputs`, and `prefix` so this generic logic wires it correctly.

src/backend/base/langflow/initial_setup/starter_projects/Price Deal Finder.json (1)
140-191: ChatInput: improved file handling and session resolution

The new `ChatInput.message_response` correctly normalizes `files`, ignores empty/None entries, derives `session_id` with a sensible fallback to the graph session, and only persists when a non-empty session is present. This is a clear improvement with no obvious regressions.

src/backend/base/langflow/initial_setup/starter_projects/Knowledge Retrieval.json (2)
109-159: TextInputComponent: docs URL and implementation look correct

The TextInputComponent wrapper cleanly maps `input_value` into a `Message` and updates the documentation URL to the new `text-input-and-output` page. No issues here.
514-609: KnowledgeRetrievalComponent: end-to-end retrieval pipeline is consistent and safe

The new KnowledgeRetrievalComponent correctly:
- Resolves the per-user knowledge base path and validates user/session context.
- Loads and decrypts embedding metadata, building an appropriate embedding model for OpenAI/HuggingFace/Cohere.
- Uses Chroma similarity search (with and without scores) and optionally enriches results with stored embeddings keyed by `_id`.
- Wraps rows into `Data` instances and returns a `DataFrame(data=data_list)` compatible with the existing Data/DataFrame schemas.

This wiring looks coherent and should behave well for Cohere given the explicit API-key enforcement and the bumped `langchain_cohere` version.

It would still be good to run a quick manual retrieval against a Cohere-backed knowledge base on this branch to confirm there are no runtime issues (e.g., API-parameter mismatches) with `langchain_cohere==0.3.5`.

src/backend/base/langflow/initial_setup/starter_projects/Invoice Summarizer.json (1)
1154-1160: AgentComponent provider plumbing (incl. IBM/Ollama) looks coherent; rely on tests for build_config edge cases

The reworked AgentComponent uses `MODEL_PROVIDERS_DICT`/`MODEL_PROVIDERS_LIST` consistently to inject and prune provider-specific fields, and the `update_build_config` logic for `agent_llm`/dynamic fields looks internally consistent (including the `connect_other_models` path and required-keys check). Given the branching and external dependency on per-provider component classes, it would be good to exercise this with tests that:
- Switch `agent_llm` across all providers (including IBM watsonx.ai / Ollama) and back.
- Verify `build_config` still contains the expected default keys and that provider-specific fields are added/removed as intended.

This looks correct, but those tests will guard against regressions as the provider matrix evolves.
Also applies to: 1345-1345
src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json (1)
1235-1235: LanguageModelComponent IBM/Ollama integration looks reasonable; please validate provider flows against real backends

The updated LanguageModelComponent cleanly wires in IBM watsonx.ai and Ollama: `build_model` enforces provider-specific required fields and instantiates `ChatWatsonx`/`ChatOllama`, while `update_build_config` dynamically updates model options using `fetch_ibm_models`, `get_ollama_models`, and URL validation. The control flow and fallbacks (defaults + warnings on failures) look coherent, so I don't see an obvious correctness bug in the component itself. Given the reliance on external HTTP calls and third-party client libraries, I'd strongly recommend:
- Exercising provider switches (OpenAI/Anthropic/Google/IBM/Ollama) via tests or manual runs.
- Verifying that IBM model list fetching and Ollama model discovery behave as expected in your target environments (and degrade gracefully when endpoints are unreachable).
src/backend/base/langflow/initial_setup/starter_projects/Blog Writer.json (1)
353-427: TextInputComponent update looks correct and self-contained

The updated TextInputComponent logic and docs URL are straightforward; it simply wraps `input_value` into a `Message` without side effects. No issues from a correctness or UX standpoint.
718-718: Embedded ChatOutput code updates are well-structured.

The updated component code includes:
- New `_build_source` helper for constructing `Source` objects with proper null handling
- Enhanced session handling with fallback chain: `self.session_id or self.graph.session_id or ""`
- Improved input validation via the `_validate_input` method

The same code is consistently applied to all three ChatOutput nodes in this flow.
1656-1719: Memory component updates align with the session handling improvements.

The Memory component code includes:
- Proper async methods for `store_message` and `retrieve_messages`
- Consistent `lfx` imports matching other updated components
- Mode-based dynamic output configuration
2018-2067: ChatInput component updates are consistent with ChatOutput session handling.

The ChatInput component now uses the same session ID fallback pattern: `session_id = self.session_id or self.graph.session_id or ""`. This ensures consistent session handling across input and output components in the flow.
642-660: Metadata and dependency version updates are consistent.

The `code_hash` update and FastAPI version bump from 0.120.0 to 0.123.0 are correctly applied, with 0.123.0 being a valid released version in the 0.123.x series. The changes align with the updated embedded component code that includes enhanced session handling and the new `_build_source` helper method.

src/backend/base/langflow/initial_setup/starter_projects/Instagram Copywriter.json (3)
318-369: LGTM - Improved session handling in ChatInput.

The session_id fallback chain (`self.session_id or self.graph.session_id or ""`) provides robust handling when no explicit session ID is configured. The async `Message.create()` pattern is appropriate.
760-810: LGTM - TextInput component with updated documentation URL.

The component is straightforward and the documentation URL update aligns with the project's documentation structure.
1965-2174: Agent component supports multiple LLM providers including IBM watsonx.ai.

The embedded Agent code includes IBM watsonx.ai as one of the supported MODEL_PROVIDERS alongside OpenAI, Anthropic, Google, and Ollama. The implementation shows proper multi-provider handling with dynamic build_config updates and schema validation for structured outputs. Async error handling is appropriate with specific exception types.
The code is well-structured with proper async/await patterns and comprehensive error handling. The component properly integrates memory management via MemoryComponent and tool calling functionality.
| "title_case": false, | ||
| "type": "code", | ||
| "value": "from lfx.custom.custom_component.component import Component\nfrom lfx.helpers.data import safe_convert\nfrom lfx.inputs.inputs import BoolInput, HandleInput, MessageTextInput, MultilineInput, TabInput\nfrom lfx.schema.data import Data\nfrom lfx.schema.dataframe import DataFrame\nfrom lfx.schema.message import Message\nfrom lfx.template.field.base import Output\n\n\nclass ParserComponent(Component):\n display_name = \"Parser\"\n description = \"Extracts text using a template.\"\n documentation: str = \"https://docs.langflow.org/components-processing#parser\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"input_data\",\n display_name=\"Data or DataFrame\",\n input_types=[\"DataFrame\", \"Data\"],\n info=\"Accepts either a DataFrame or a Data object.\",\n required=True,\n ),\n TabInput(\n name=\"mode\",\n display_name=\"Mode\",\n options=[\"Parser\", \"Stringify\"],\n value=\"Parser\",\n info=\"Convert into raw string instead of using a template.\",\n real_time_refresh=True,\n ),\n MultilineInput(\n name=\"pattern\",\n display_name=\"Template\",\n info=(\n \"Use variables within curly brackets to extract column values for DataFrames \"\n \"or key values for Data.\"\n \"For example: `Name: {Name}, Age: {Age}, Country: {Country}`\"\n ),\n value=\"Text: {text}\", # Example default\n dynamic=True,\n show=True,\n required=True,\n ),\n MessageTextInput(\n name=\"sep\",\n display_name=\"Separator\",\n advanced=True,\n value=\"\\n\",\n info=\"String used to separate rows/items.\",\n ),\n ]\n\n outputs = [\n Output(\n display_name=\"Parsed Text\",\n name=\"parsed_text\",\n info=\"Formatted text output.\",\n method=\"parse_combined_text\",\n ),\n ]\n\n def update_build_config(self, build_config, field_value, field_name=None):\n \"\"\"Dynamically hide/show `template` and enforce requirement based on `stringify`.\"\"\"\n if field_name == \"mode\":\n build_config[\"pattern\"][\"show\"] = self.mode == \"Parser\"\n build_config[\"pattern\"][\"required\"] = 
self.mode == \"Parser\"\n if field_value:\n clean_data = BoolInput(\n name=\"clean_data\",\n display_name=\"Clean Data\",\n info=(\n \"Enable to clean the data by removing empty rows and lines \"\n \"in each cell of the DataFrame/ Data object.\"\n ),\n value=True,\n advanced=True,\n required=False,\n )\n build_config[\"clean_data\"] = clean_data.to_dict()\n else:\n build_config.pop(\"clean_data\", None)\n\n return build_config\n\n def _clean_args(self):\n \"\"\"Prepare arguments based on input type.\"\"\"\n input_data = self.input_data\n\n match input_data:\n case list() if all(isinstance(item, Data) for item in input_data):\n msg = \"List of Data objects is not supported.\"\n raise ValueError(msg)\n case DataFrame():\n return input_data, None\n case Data():\n return None, input_data\n case dict() if \"data\" in input_data:\n try:\n if \"columns\" in input_data: # Likely a DataFrame\n return DataFrame.from_dict(input_data), None\n # Likely a Data object\n return None, Data(**input_data)\n except (TypeError, ValueError, KeyError) as e:\n msg = f\"Invalid structured input provided: {e!s}\"\n raise ValueError(msg) from e\n case _:\n msg = f\"Unsupported input type: {type(input_data)}. 
Expected DataFrame or Data.\"\n raise ValueError(msg)\n\n def parse_combined_text(self) -> Message:\n \"\"\"Parse all rows/items into a single text or convert input to string if `stringify` is enabled.\"\"\"\n # Early return for stringify option\n if self.mode == \"Stringify\":\n return self.convert_to_string()\n\n df, data = self._clean_args()\n\n lines = []\n if df is not None:\n for _, row in df.iterrows():\n formatted_text = self.pattern.format(**row.to_dict())\n lines.append(formatted_text)\n elif data is not None:\n # Use format_map with a dict that returns default_value for missing keys\n class DefaultDict(dict):\n def __missing__(self, key):\n return data.default_value or \"\"\n\n formatted_text = self.pattern.format_map(DefaultDict(data.data))\n lines.append(formatted_text)\n\n combined_text = self.sep.join(lines)\n self.status = combined_text\n return Message(text=combined_text)\n\n def convert_to_string(self) -> Message:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n result = \"\"\n if isinstance(self.input_data, list):\n result = \"\\n\".join([safe_convert(item, clean_data=self.clean_data or False) for item in self.input_data])\n else:\n result = safe_convert(self.input_data or False)\n self.log(f\"Converted to string with length: {len(result)}\")\n\n message = Message(text=result)\n self.status = message\n return message\n" | ||
| "value": "from lfx.custom.custom_component.component import Component\nfrom lfx.helpers.data import safe_convert\nfrom lfx.inputs.inputs import BoolInput, HandleInput, MessageTextInput, MultilineInput, TabInput\nfrom lfx.schema.data import Data\nfrom lfx.schema.dataframe import DataFrame\nfrom lfx.schema.message import Message\nfrom lfx.template.field.base import Output\n\n\nclass ParserComponent(Component):\n display_name = \"Parser\"\n description = \"Extracts text using a template.\"\n documentation: str = \"https://docs.langflow.org/parser\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"input_data\",\n display_name=\"Data or DataFrame\",\n input_types=[\"DataFrame\", \"Data\"],\n info=\"Accepts either a DataFrame or a Data object.\",\n required=True,\n ),\n TabInput(\n name=\"mode\",\n display_name=\"Mode\",\n options=[\"Parser\", \"Stringify\"],\n value=\"Parser\",\n info=\"Convert into raw string instead of using a template.\",\n real_time_refresh=True,\n ),\n MultilineInput(\n name=\"pattern\",\n display_name=\"Template\",\n info=(\n \"Use variables within curly brackets to extract column values for DataFrames \"\n \"or key values for Data.\"\n \"For example: `Name: {Name}, Age: {Age}, Country: {Country}`\"\n ),\n value=\"Text: {text}\", # Example default\n dynamic=True,\n show=True,\n required=True,\n ),\n MessageTextInput(\n name=\"sep\",\n display_name=\"Separator\",\n advanced=True,\n value=\"\\n\",\n info=\"String used to separate rows/items.\",\n ),\n ]\n\n outputs = [\n Output(\n display_name=\"Parsed Text\",\n name=\"parsed_text\",\n info=\"Formatted text output.\",\n method=\"parse_combined_text\",\n ),\n ]\n\n def update_build_config(self, build_config, field_value, field_name=None):\n \"\"\"Dynamically hide/show `template` and enforce requirement based on `stringify`.\"\"\"\n if field_name == \"mode\":\n build_config[\"pattern\"][\"show\"] = self.mode == \"Parser\"\n build_config[\"pattern\"][\"required\"] = self.mode == 
\"Parser\"\n if field_value:\n clean_data = BoolInput(\n name=\"clean_data\",\n display_name=\"Clean Data\",\n info=(\n \"Enable to clean the data by removing empty rows and lines \"\n \"in each cell of the DataFrame/ Data object.\"\n ),\n value=True,\n advanced=True,\n required=False,\n )\n build_config[\"clean_data\"] = clean_data.to_dict()\n else:\n build_config.pop(\"clean_data\", None)\n\n return build_config\n\n def _clean_args(self):\n \"\"\"Prepare arguments based on input type.\"\"\"\n input_data = self.input_data\n\n match input_data:\n case list() if all(isinstance(item, Data) for item in input_data):\n msg = \"List of Data objects is not supported.\"\n raise ValueError(msg)\n case DataFrame():\n return input_data, None\n case Data():\n return None, input_data\n case dict() if \"data\" in input_data:\n try:\n if \"columns\" in input_data: # Likely a DataFrame\n return DataFrame.from_dict(input_data), None\n # Likely a Data object\n return None, Data(**input_data)\n except (TypeError, ValueError, KeyError) as e:\n msg = f\"Invalid structured input provided: {e!s}\"\n raise ValueError(msg) from e\n case _:\n msg = f\"Unsupported input type: {type(input_data)}. 
Expected DataFrame or Data.\"\n raise ValueError(msg)\n\n def parse_combined_text(self) -> Message:\n \"\"\"Parse all rows/items into a single text or convert input to string if `stringify` is enabled.\"\"\"\n # Early return for stringify option\n if self.mode == \"Stringify\":\n return self.convert_to_string()\n\n df, data = self._clean_args()\n\n lines = []\n if df is not None:\n for _, row in df.iterrows():\n formatted_text = self.pattern.format(**row.to_dict())\n lines.append(formatted_text)\n elif data is not None:\n # Use format_map with a dict that returns default_value for missing keys\n class DefaultDict(dict):\n def __missing__(self, key):\n return data.default_value or \"\"\n\n formatted_text = self.pattern.format_map(DefaultDict(data.data))\n lines.append(formatted_text)\n\n combined_text = self.sep.join(lines)\n self.status = combined_text\n return Message(text=combined_text)\n\n def convert_to_string(self) -> Message:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n result = \"\"\n if isinstance(self.input_data, list):\n result = \"\\n\".join([safe_convert(item, clean_data=self.clean_data or False) for item in self.input_data])\n else:\n result = safe_convert(self.input_data or False)\n self.log(f\"Converted to string with length: {len(result)}\")\n\n message = Message(text=result)\n self.status = message\n return message\n" |
Bug: Missing clean_data parameter and incorrect fallback in convert_to_string.

In the convert_to_string method, the else branch has an issue:

result = safe_convert(self.input_data or False)

Two problems:
- `self.input_data or False` converts falsy inputs (like an empty string/list) to the boolean `False`, which is likely unintended
- The `clean_data` parameter is missing, unlike the list branch which correctly passes `clean_data=self.clean_data or False`
Apply this fix within the embedded code string:

- result = safe_convert(self.input_data or False)
+ result = safe_convert(self.input_data, clean_data=self.clean_data or False)

🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Financial Report
Parser.json around line 1437, the convert_to_string else branch incorrectly
calls safe_convert(self.input_data or False) and omits the clean_data argument;
change it to call safe_convert(self.input_data, clean_data=self.clean_data or
False) (i.e., pass the actual input_data without coercing falsy values to False
and include the same clean_data parameter used in the list branch) so behavior
and parameters mirror the list handling.
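The falsy-coercion half of this bug is easy to demonstrate in isolation; `stringify` below is a stand-in for `safe_convert`, used only to reveal what each call site actually receives:

```python
def stringify(value, clean_data=False):
    # Stand-in for safe_convert: just reveal what it was given.
    return repr(value)


empty_text = ""
buggy = stringify(empty_text or False)           # `or` collapses "" to False
fixed = stringify(empty_text, clean_data=False)  # the real input survives

print(buggy)  # False
print(fixed)  # ''
```

With the buggy pattern, an empty message body would be converted as the literal boolean `False` rather than as an empty string, which is why the fix passes `self.input_data` through unmodified.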
Summary by CodeRabbit
New Features
Improvements
Chores