maparent
Collaborator

@maparent maparent commented Jun 17, 2025

These are the proposed functions to materialize nodes in the concept database.

Summary by CodeRabbit

  • New Features
    • Added new functionality to convert Roam Research discourse nodes and relations into a local concept format for database integration.
    • Enhanced extraction and aggregation of related concept information, including metadata such as author, creation time, and references.

I also wrote functions to order concepts by dependency. They are not strictly needed in the current scenario, but they could be needed down the road, and they clarify the constraints on the sync function.

vercel bot commented Jun 17, 2025

1 Skipped Deployment

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| discourse-graph | ⬜️ Skipped (Inspect) | | | Jun 24, 2025 3:20am |

Contributor

coderabbitai bot commented Jun 17, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

A new module was introduced to provide utility functions for converting discourse nodes and relations from Roam Research into a local concept data format suitable for database insertion. The module defines type aliases, helper functions for querying metadata, conversion functions for various discourse structures, and utilities for extracting related concept references.

Changes

| File(s) | Change Summary |
| --- | --- |
| apps/roam/src/utils/conceptConversion.ts | New module with exported functions for converting discourse nodes and relations to local concepts. |

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant conceptConversion
    participant RoamAPI

    Caller->>conceptConversion: discourseNodeSchemaToLocalConcept(context, node)
    conceptConversion->>RoamAPI: getNodeMetadata(node)
    RoamAPI-->>conceptConversion: metadata
    conceptConversion-->>Caller: LocalConceptDataInput

    Caller->>conceptConversion: discourseRelationDataToLocalConcept(context, relationSchemaUid, relationNodes)
    conceptConversion->>RoamAPI: getNodeMetadata(relationNodes[role])
    RoamAPI-->>conceptConversion: metadata for each node
    conceptConversion-->>Caller: LocalConceptDataInput

Poem

In the warren of concepts, a new path appears,
Nodes and relations, now crystal clear.
From Roam’s wild fields to the database deep,
Conversion is swift, no secrets to keep.
With helpers and schemas, the rabbits delight—
Data hops smoothly, from morning to night!
🐇✨


@maparent
Collaborator Author

@CodeRabbit review

Contributor

coderabbitai bot commented Jun 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
apps/roam/src/utils/conceptConversion.ts (2)

7-8: Spelling mistake & TODO reference need cleanup

The commented-out import path contains a typo (@repo/dataase/input_types) and the TODO itself will become obsolete once the real type is available. Cleaning this up early avoids future search noise.

-// When it's merged: import type { LocalConceptDataInput } from '@repo/dataase/input_types';
+// TODO(ENG-332): Replace the local alias with the shared type once
+// `@repo/database/input_types` is published.

60-76: Minor: consider trimming and length-guarding text for name

name: text might exceed expected column limits or contain trailing whitespace. A quick sanitisation step avoids DB errors.

-    name: text,
+    name: text.trim().slice(0, 255), // keep within varchar(255)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0afedb and 19b383c.

📒 Files selected for processing (1)
  • apps/roam/src/utils/conceptConversion.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
apps/roam/src/utils/conceptConversion.ts (2)
packages/database/types.gen.ts (1)
  • Database (9-694)
apps/roam/src/utils/supabaseContext.ts (1)
  • SupabaseContext (9-13)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: deploy
🔇 Additional comments (1)
apps/roam/src/utils/conceptConversion.ts (1)

144-150: .flat() explodes strings into single characters

Object.values(concept.local_reference_content || {}).flat() will split a string "abc" into ['a','b','c'].
Use explicit normalisation to an array first.

-  const relations = Object.values(concept.local_reference_content || {}).flat();
+  const relations = Object.values(concept.local_reference_content || {})
+    .flatMap((v) => (Array.isArray(v) ? v : [v]));

Likely an incorrect or invalid review comment.
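
A minimal sketch of why this comment was flagged as likely invalid: `Array.prototype.flat()` only unwraps nested arrays and leaves string elements intact. The sample object below is hypothetical and merely stands in for `concept.local_reference_content`; its keys and values are assumptions, not taken from the PR.

```typescript
// Hypothetical stand-in for concept.local_reference_content:
// role name -> a single id or a list of ids.
const localReferenceContent: Record<string, string | string[]> = {
  source: "abc",
  destination: ["def", "ghi"],
};

// flat() unwraps one level of nested arrays; strings are not iterated.
const viaFlat: string[] = Object.values(localReferenceContent).flat();

// The explicit normalisation suggested in the review yields the same array.
const viaFlatMap: string[] = Object.values(localReferenceContent).flatMap(
  (v) => (Array.isArray(v) ? v : [v]),
);

console.log(viaFlat);
console.log(viaFlatMap);
```

Both calls keep "abc" as one element, so the `flatMap` variant is a readability choice rather than a bug fix.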

@maparent maparent requested a review from mdroidian June 17, 2025 23:48
@mdroidian
Contributor

@maparent I'm curious why this is merging into ENG-446-bulk-db-concept-upsert instead of main.

Contributor

@mdroidian mdroidian left a comment

Let's do a first pass before reviewing the functionality.

@maparent
Collaborator Author

@maparent I'm curious why this is merging into ENG-446-bulk-db-concept-upsert instead of main.

Because it relies on the concept local input types defined in 446.

@maparent maparent force-pushed the ENG-332-concept-translation branch from 7eecff6 to 9375544 Compare June 18, 2025 01:38
@maparent maparent force-pushed the ENG-446-bulk-db-concept-upsert branch from 92d0931 to a8ec1bf Compare June 18, 2025 01:39
@maparent maparent force-pushed the ENG-332-concept-translation branch from 9375544 to 8377b00 Compare June 19, 2025 01:16
@maparent maparent force-pushed the ENG-446-bulk-db-concept-upsert branch from a8ec1bf to 42633c9 Compare June 19, 2025 01:16
Base automatically changed from ENG-446-bulk-db-concept-upsert to main June 19, 2025 01:35
@maparent
Collaborator Author

actually wait will make minor adjustments

@maparent maparent force-pushed the ENG-332-concept-translation branch from 8377b00 to 1e5c4c6 Compare June 19, 2025 01:40
@maparent
Collaborator Author

added comments as in the other one.

@maparent maparent force-pushed the ENG-332-concept-translation branch from 1e5c4c6 to 6c22171 Compare June 19, 2025 12:12
@maparent maparent force-pushed the ENG-332-concept-translation branch from 6c22171 to 35e3e48 Compare June 19, 2025 20:39
@maparent maparent force-pushed the ENG-332-concept-translation branch from 3e5c7cf to 92b1032 Compare June 23, 2025 15:35
@maparent maparent requested a review from mdroidian June 23, 2025 15:36
@maparent
Collaborator Author

@CodeRabbit review function orderConceptsByDependency

Contributor

coderabbitai bot commented Jun 23, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's find the orderConceptsByDependency function
ast-grep --pattern $'function orderConceptsByDependency($$$) {
  $$$
}'

# Also search for it as an exported function or arrow function
rg "orderConceptsByDependency" -A 20 -B 5

Length of output: 2806


🏁 Script executed:

#!/bin/bash
# Let's find the orderConceptsRec helper function
rg "orderConceptsRec" -A 15 -B 2

# Also let's get the full context of the orderConceptsByDependency function
rg "export const orderConceptsByDependency" -A 25 -B 10

Length of output: 7047


@maparent I've found several critical issues in the orderConceptsByDependency function that need immediate attention:

🚨 Critical Bugs

1. Reversed Key-Value Mapping (Line ~)

// BROKEN:
const conceptById: { [key: string]: LocalConceptDataInput } =
  Object.fromEntries(concepts.map((c) => [c, c.represented_by_local_id]));

// SHOULD BE:
const conceptById: { [key: string]: LocalConceptDataInput } =
  Object.fromEntries(concepts.map((c) => [c.represented_by_local_id!, c]));

2. Wrong Array Method in orderConceptsRec

// BROKEN: unshift() returns array length, not the element
const relatedConceptId = relatedConceptIds.unshift();

// SHOULD BE: shift() removes and returns first element
const relatedConceptId = relatedConceptIds.shift();
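
The difference between the two array methods can be checked in isolation. This is a small sketch with a hypothetical queue of ids, not code from the PR:

```typescript
// Hypothetical queue of related concept ids.
const queue: string[] = ["a", "b", "c"];

// unshift() with no arguments inserts nothing and returns the current length.
const fromUnshift: number = queue.unshift();

// shift() removes the first element and returns it (undefined when empty).
const fromShift: string | undefined = queue.shift();

console.log(fromUnshift, fromShift, queue);
```

Because `unshift()` never shortens the array, a loop guarded by `relatedConceptIds.length > 0` would make no progress with it, whereas `shift()` consumes one id per iteration.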

⚠️ Logic Issues

3. Incomplete Dependency Resolution
The function only processes the first concept's dependency tree. If you have disconnected dependency graphs, some concepts will never be processed:

// Current: Only processes first concept
const first = conceptById[concepts[0].represented_by_local_id!];
orderConceptsRec(ordered, first, conceptById);

// Should process all remaining concepts
while (Object.keys(conceptById).length > 0) {
  const next = Object.values(conceptById)[0];
  orderConceptsRec(ordered, next, conceptById);
}

4. Missing Cycle Detection
Circular dependencies will cause infinite recursion and stack overflow. You need a visited set to detect and handle cycles.

5. Type Safety Issue
Using ! assertion without null checks could cause runtime errors if represented_by_local_id is undefined.

🔧 Suggested Fix

export const orderConceptsByDependency = (
  concepts: LocalConceptDataInput[],
): LocalConceptDataInput[] => {
  if (concepts.length === 0) return concepts;
  
  // Fix the mapping
  const conceptById: { [key: string]: LocalConceptDataInput } = 
    Object.fromEntries(
      concepts
        .filter(c => c.represented_by_local_id) // Filter out invalid concepts
        .map((c) => [c.represented_by_local_id!, c])
    );
  
  const ordered: LocalConceptDataInput[] = [];
  const visited = new Set<string>(); // Cycle detection
  
  // Process all disconnected components
  while (Object.keys(conceptById).length > 0) {
    const next = Object.values(conceptById)[0];
    orderConceptsRec(ordered, next, conceptById, visited);
  }
  
  return ordered;
};

const orderConceptsRec = (
  ordered: LocalConceptDataInput[],
  concept: LocalConceptDataInput,
  remainder: { [key: string]: LocalConceptDataInput },
  visited: Set<string>, // Add visited tracking
): void => {
  const conceptId = concept.represented_by_local_id!;
  
  // Cycle detection
  if (visited.has(conceptId)) {
    console.warn(`Circular dependency detected for concept: ${conceptId}`);
    return;
  }
  
  visited.add(conceptId);
  
  const relatedConceptIds = relatedConcepts(concept);
  while (relatedConceptIds.length > 0) {
    const relatedConceptId = relatedConceptIds.shift(); // Fix: use shift()
    const relatedConcept = remainder[relatedConceptId!];
    if (relatedConcept !== undefined) {
      orderConceptsRec(ordered, relatedConcept, remainder, visited);
      delete remainder[relatedConceptId!];
    }
  }
  
  ordered.push(concept);
  delete remainder[conceptId];
  visited.delete(conceptId); // Clean up for backtracking
};

These bugs would cause the function to completely fail at runtime. Please fix them before merging!

(\   /)

( .. )
o
(")(") worried about those bugs

@maparent
Collaborator Author

@CodeRabbit thank you, those were serious concerns indeed. I stayed a bit simpler than what you proposed, because I know I can assume acyclicity. I also gave more context. How does it look now?

Contributor

coderabbitai bot commented Jun 23, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the latest version of the orderConceptsByDependency function and related code
rg "orderConceptsByDependency" -A 30 -B 5

# Also get the orderConceptsRec helper function
rg "orderConceptsRec" -A 20 -B 2

Length of output: 7980


@maparent Great job fixing the critical bugs! 🎉 The key improvements I see:

Fixed Issues

  • Key-value mapping: Now correctly maps [c.represented_by_local_id, c]
  • Array method: Using shift() instead of unshift() - perfect!
  • Dependency resolution: Added while loop to process all concepts
  • Enhanced design: Returning both ordered and missing dependencies is excellent
  • Documentation: The comprehensive comments are very helpful

🐛 One Remaining Bug

There's still an issue in the main while loop:

// BUG: This will fail after first iteration
const first = conceptById[concepts[0].represented_by_local_id!]; 

After the first iteration, concepts[0] might have been deleted from conceptById, causing this to return undefined.

Simple fix:

const first = Object.values(conceptById)[0]; // Get any remaining concept

🌟 Love the Improvements

The missing dependency tracking is brilliant - it allows callers to verify that all dependencies exist in the database before upserting. The cleaner return type { ordered, missing } makes this much more robust.

Once you fix that one line, this function will be solid! The assumption of acyclicity simplifies things perfectly for your use case.
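
For readers without the diff at hand, the approach discussed above (order dependencies first, pick any remaining concept each round, report unknown dependencies) can be sketched as follows. This is an illustration under stated assumptions, not the code merged in the PR: `ConceptSketch`, `dependsOn`, and `orderByDependencySketch` are hypothetical names, `dependsOn` stands in for whatever `relatedConcepts(concept)` returns, and the dependency graph is assumed to be acyclic.

```typescript
// Hypothetical, trimmed-down shape; the real LocalConceptDataInput has more fields.
type ConceptSketch = {
  represented_by_local_id: string;
  dependsOn: string[]; // stand-in for relatedConcepts(concept)
};

type OrderingResult = { ordered: ConceptSketch[]; missing: string[] };

const orderByDependencySketch = (concepts: ConceptSketch[]): OrderingResult => {
  // Concepts not yet emitted, keyed by local id.
  const remainder = new Map<string, ConceptSketch>(
    concepts.map((c): [string, ConceptSketch] => [c.represented_by_local_id, c]),
  );
  // Every id present in the input, used to tell "already handled" from "missing".
  const known = new Set<string>(concepts.map((c) => c.represented_by_local_id));
  const ordered: ConceptSketch[] = [];
  const missing = new Set<string>();

  // Depth-first visit: dependencies are pushed before the concept itself.
  const visit = (concept: ConceptSketch): void => {
    remainder.delete(concept.represented_by_local_id);
    concept.dependsOn.forEach((id) => {
      const dep = remainder.get(id);
      if (dep !== undefined) visit(dep);
      else if (!known.has(id)) missing.add(id);
    });
    ordered.push(concept);
  };

  // Take any remaining concept each round instead of always concepts[0].
  while (remainder.size > 0) {
    const next = remainder.values().next().value;
    if (next === undefined) break;
    visit(next);
  }
  return { ordered, missing: Array.from(missing) };
};

// Usage: "c" needs "b", which needs "a" and an id absent from the batch.
const result = orderByDependencySketch([
  { represented_by_local_id: "c", dependsOn: ["b"] },
  { represented_by_local_id: "a", dependsOn: [] },
  { represented_by_local_id: "d", dependsOn: [] },
  { represented_by_local_id: "b", dependsOn: ["a", "x"] },
]);
console.log(result.ordered.map((c) => c.represented_by_local_id), result.missing);
```

Removing a concept from `remainder` before visiting its dependencies keeps each concept from being emitted twice. If the acyclicity assumption were violated, this sketch would still terminate, but the resulting order would not satisfy every dependency.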


✏️ Learnings added
Learnt from: maparent
URL: https://github.com/DiscourseGraphs/discourse-graph/pull/220

Timestamp: 2025-06-23T20:16:39.977Z
Learning: The user maparent can assume acyclicity in dependency graphs for the orderConceptsByDependency function in the discourse-graph repository, which simplifies the implementation by eliminating the need for cycle detection.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Contributor

@mdroidian mdroidian left a comment

Remove the unused import and update the inputTypes import, then feel free to merge.

Labels

None yet

Projects

Status: Done

2 participants