-
Notifications
You must be signed in to change notification settings - Fork 2
Roam: Eng-749 hyde functionality #344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This pull request has been ignored for the connected project Preview Branches by Supabase. |
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the hyde.ts code (until line 342) was reviewed in
#131
new changes are
- ESLint Directive Added
typescript/* eslint-disable @typescript-eslint/naming-convention */
- New Imports Added
typescriptimport { getLoggedInClient } from "./supabaseContext";
import normalizePageTitle from "roamjs-components/queries/normalizePageTitle";
import findDiscourseNode from "./findDiscourseNode"
- New Type Definitions
typescripttype ResultItemMin = { uid: string };
type ExistingResultGroup = {
label: string;
results: Record<string, ResultItemMin>;
};
- API Configuration Simplified
Old version had a more complex structure:
typescriptBASE_URL: { DEV: "http://localhost:3000", PROD: "https://discoursegraphs.com" },
EMBEDDINGS: { PATH: "/api/embeddings/openai/small" },
SUPABASE: { MATCH_EMBEDDINGS_PATH: "/api/supabase/rpc/search-content" }
New version is simpler:
typescriptEMBEDDINGS_URL: "https://discoursegraphs.com/api/embeddings/openai/small"
- Removed Functions
getBaseUrl()
function was removed (no longer needed with simplified config)
- Modified searchEmbeddings Function
The biggest change - it now uses direct Supabase client instead of REST API:
Old version:
typescriptconst fullApiUrl = `${getBaseUrl()}${API_CONFIG.SUPABASE.MATCH_EMBEDDINGS_PATH}`;
const response = await fetch(fullApiUrl, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
queryEmbedding: queryEmbedding,
subsetPlatformIds: subsetRoamUids,
}),
});
New version:
typescriptconst supabaseClient = await getLoggedInClient();
const { data, error } = await supabaseClient.rpc(
"match_embeddings_for_subset_nodes",
{
p_query_embedding: JSON.stringify(queryEmbedding),
p_subset_roam_uids: subsetRoamUids,
},
);
- Error Handling Changes
errorData type changed from implicit to unknown
Removed conditional error message construction in handleApiError
Simplified error handling in searchEmbeddings to use Supabase's error object
- Simplified createEmbedding
Now directly uses API_CONFIG.EMBEDDINGS_URL instead of constructing URL with getBaseUrl()
✅ Actions performedFull review triggered. |
📝 WalkthroughWalkthroughAdds a new Hyde-based node discovery utility for Roam, implementing hypothetical node generation, embedding, and Supabase similarity search, with Roam API helpers. Also adjusts supabaseContext to lazily resolve the settings page UID within getOrCreateSpacePassword instead of at module load. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant UI as Caller
participant HYDE as hyde.ts
participant LLM as LLM (hypothetical gen)
participant EMB as Embedding Service
participant SB as Supabase (embedding index)
participant RA as Roam API
UI->>HYDE: performHydeSearch(params)
HYDE->>RA: Fetch candidate pages/refs (per flags)
RA-->>HYDE: Candidate nodes
loop For each relation triplet
HYDE->>LLM: Generate hypothetical related node texts
LLM-->>HYDE: Hypothetical texts
HYDE->>EMB: Embed hypothetical texts
EMB-->>HYDE: Embedding vectors
HYDE->>SB: Similarity search (per embedding)
SB-->>HYDE: Matching nodes + scores
end
HYDE->>HYDE: Merge, dedupe, rank by best score
HYDE-->>UI: SuggestedNode[]
note over HYDE: Errors/timeouts caught with logging
sequenceDiagram
autonumber
participant Caller
participant SC as supabaseContext.ts
participant RA as Roam API
Caller->>SC: getOrCreateSpacePassword()
SC->>RA: getPageUidByPageTitle(settings)
RA-->>SC: settingsConfigPageUid
SC->>RA: getBlockProps(space-user-password)
RA-->>SC: existing or missing
alt missing
SC->>SC: generate UUID password
SC->>RA: setBlockProps(space-user-password)
end
SC-->>Caller: password
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (6)
apps/roam/src/utils/supabaseContext.ts (2)
31-33
: Optional: cache the config page UID after the first lookupYou now pay the lookup cost on every call. You can keep lazy initialization while avoiding repeated queries by caching the UID.
Apply this change within the function:
- const settingsConfigPageUid = getPageUidByPageTitle( - DISCOURSE_CONFIG_PAGE_TITLE, - ); + const settingsConfigPageUid = + _settingsConfigPageUid || + (_settingsConfigPageUid = getPageUidByPageTitle( + DISCOURSE_CONFIG_PAGE_TITLE, + ));And add this module-level cache (outside the function):
// cache lazily-resolved config page UID let _settingsConfigPageUid: string | null = null;
35-36
: Nit: simplify type ofexisting
string | unknown
collapses tounknown
in TypeScript. Either rely on inference or annotate asunknown
for clarity:const existing = props["space-user-password"] as unknown;apps/roam/src/utils/hyde.ts (4)
113-121
: Ensure AbortSignal.timeout is supported or provide a fallback
AbortSignal.timeout(...)
isn’t available in all browsers/environments. Roam runs in the browser; compatibility can vary. Consider a manual AbortController-based timeout.Example replacement:
const controller = new AbortController(); const timeoutId = setTimeout(() => { // Create a TimeoutError-like abort controller.abort(new DOMException("Timeout", "TimeoutError")); }, API_CONFIG.LLM.TIMEOUT_MS); try { response = await fetch(API_CONFIG.LLM.URL, { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify(requestBody), signal: controller.signal, }); } finally { clearTimeout(timeoutId); }Would you like me to patch this in?
58-67
: Consider externalizing API endpoints/configsHardcoding API URLs/models makes testing and environment switching harder. Consider reading from environment or a centralized config.
Example: read from
process.env
with sane defaults.
21-23
: Naming nit:CandidateNodeWithEmbedding
does not carry an embeddingThe type only has
{ type: string }
and aResult
shape. Consider renaming toCandidateNode
(or add anembedding?: number[]
if you plan to use client-side matching).
344-350
: All pages query: LGTMThe datalog looks correct for fetching
[pageName, pageUid]
pairs. Keep an eye on performance on very large graphs; you may need pagination or Supabase-backed listing (your TODO hints this).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
apps/roam/src/utils/hyde.ts
(1 hunks)apps/roam/src/utils/supabaseContext.ts
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-25T22:56:17.522Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-06-25T22:56:17.522Z
Learning: In the Roam discourse-graph system, the existence of the configuration page (identified by DISCOURSE_CONFIG_PAGE_TITLE) and its corresponding UID is a system invariant. The code can safely assume this page will always exist, so defensive null checks are not needed when using `getPageUidByPageTitle(DISCOURSE_CONFIG_PAGE_TITLE)`.
Applied to files:
apps/roam/src/utils/supabaseContext.ts
🔇 Additional comments (4)
apps/roam/src/utils/supabaseContext.ts (1)
31-33
: Lazy resolution of settings page UID: LGTMMoving
getPageUidByPageTitle(DISCOURSE_CONFIG_PAGE_TITLE)
intogetOrCreateSpacePassword
is sound and matches the PR objective. Given the invariant that the config page always exists, this is safe.apps/roam/src/utils/hyde.ts (3)
127-127
: Confirm LLM endpoint returns plain textYou’re using
await response.text()
. If the endpoint returns JSON (e.g.,{ content: "..." }
), this will produce incorrect output downstream. Please confirm the API contract. If it returns JSON, parse and extract the text field instead.Example:
const { content } = (await response.json()) as { content: string }; return content;
181-187
: Double-check RPC param types (embedding marshaling)You
JSON.stringify
the embedding vector. If the Postgres functionmatch_embeddings_for_subset_nodes
expects a numeric array orvector
, stringify-ing may be incorrect. If it expects JSON, this is fine. Please verify the function signature and adjust accordingly.Potential adjustment:
const { data, error } = await supabaseClient.rpc( "match_embeddings_for_subset_nodes", { p_query_embedding: queryEmbedding, // pass as array if function expects numeric[] p_subset_roam_uids: subsetRoamUids, } );
498-507
: HYDE ranking pipeline: LGTMThe flow (generate hypotheticals → per-hypo embedding search → max aggregation → ranking) is coherent and robust to partial failures. Good error handling and early exits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💃
Summary by CodeRabbit
New Features
Refactor