Skip to content

Conversation

maparent
Copy link
Collaborator

@maparent maparent commented Aug 23, 2025

https://linear.app/discourse-graphs/issue/ENG-764/bulk-account-uploading

Summary by CodeRabbit

  • New Features

    • Bulk account upsert: add or update multiple space members in one request.
    • Inline account support in document and content upserts, auto-creating/updating authors/creators as needed.
    • Backward-compatible single-account creation preserved.
  • Refactor

    • Unified composite account input across APIs for consistent name/email/editor handling.
    • Client types updated to include the new account input.
    • Small change to embedding match response field ordering.

Copy link

linear bot commented Aug 23, 2025

Copy link

supabase bot commented Aug 23, 2025

Updates to Preview Branch (eng-764-bulk-account-uploading) ↗︎

Deployments Status Updated
Database Sat, 23 Aug 2025 17:27:24 UTC
Services Sat, 23 Aug 2025 17:27:24 UTC
APIs Sat, 23 Aug 2025 17:27:24 UTC

Tasks are run on every commit but only new migration files are pushed.
Close and reopen this PR if you want to apply changes from existing seed or migration files.

Tasks Status Updated
Configurations Sat, 23 Aug 2025 17:27:25 UTC
Migrations Sat, 23 Aug 2025 17:27:26 UTC
Seeding Sat, 23 Aug 2025 17:27:29 UTC
Edge Functions Sat, 23 Aug 2025 17:27:32 UTC

View logs for this Workflow Run ↗︎.
Learn more about Supabase for Git ↗︎.

@maparent
Copy link
Collaborator Author

@CodeRabbit review

Copy link
Contributor

coderabbitai bot commented Aug 23, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@maparent maparent requested review from mdroidian and sid597 August 23, 2025 14:28
Copy link
Contributor

coderabbitai bot commented Aug 23, 2025

📝 Walkthrough

Walkthrough

Adds a new composite type public.account_local_input, introduces single and bulk account upsert RPCs, updates document/content local input types to use the new account type, and extends upsert flows and mappers to resolve and insert inline/local accounts during document/content upserts.

Changes

Cohort / File(s) Summary
Migration: account/content upsert
packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql
Adds public.account_local_input; adds upsert_account_in_space, upsert_accounts_in_space, legacy create_account_in_space wrapper; updates document_local_input/content_local_input inline account types; adds _local_document_to_db_document, _local_content_to_db_content, upsert_documents, and upsert_content.
Schema: account APIs
packages/database/supabase/schemas/account.sql
Declares public.account_local_input; implements upsert_account_in_space and upsert_accounts_in_space (SECURITY DEFINER); adds SQL wrapper create_account_in_space delegating to the composite-based upsert.
Schema: content/doc types & flows
packages/database/supabase/schemas/content.sql
Changes author_inline/creator_inline to public.account_local_input; updates local→DB mappers to resolve inline accounts; modifies upsert_documents/upsert_content to upsert inline accounts when IDs are missing and sets explicit search_path in RPCs.
Types: generated API surface
packages/database/types.gen.ts
Adds account_local_input composite type; adds TS signatures for upsert_account_in_space and upsert_accounts_in_space; updates document_local_input/content_local_input inline account types; adjusts some embedding return shapes ordering.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant RPC as DB RPC
  participant PA as PlatformAccount
  participant SA as SpaceAccess
  participant AI as AgentIdentifier

  Client->>RPC: upsert_account_in_space(space_id, local_account)
  Note over RPC: Resolve platform from space_id
  RPC->>PA: INSERT ... ON CONFLICT (platform_id, account_local_id) DO UPDATE
  RPC->>SA: INSERT ... ON CONFLICT (space_id, account_id) DO UPDATE (editor from local_account.space_editor)
  alt email provided
    RPC->>AI: INSERT ... ON CONFLICT (account_id, type='email', value) DO UPDATE (trusted)
  end
  RPC-->>Client: account_id
Loading
sequenceDiagram
  autonumber
  actor Client
  participant RPC as DB RPC
  participant Map as Local→DB Mapper
  participant Acc as upsert_account_in_space
  participant Doc as Document
  participant Cnt as Content

  Client->>RPC: upsert_documents(space_id, docs_json)
  loop each document
    RPC->>Map: _local_document_to_db_document(local)
    alt author_inline present and id missing
      RPC->>Acc: upsert_account_in_space(space_id, author_inline)
      Acc-->>RPC: author_id
    end
    RPC->>Doc: INSERT ... ON CONFLICT DO UPDATE
  end
  RPC-->>Client: [document_id...]

  Client->>RPC: upsert_content(space_id, content_json, creator_id?, as_document?)
  loop each content
    RPC->>Map: _local_content_to_db_content(local)
    alt creator_inline present and id missing
      RPC->>Acc: upsert_account_in_space(space_id, creator_inline)
      Acc-->>RPC: creator_id
    end
    alt author_inline present and id missing
      RPC->>Acc: upsert_account_in_space(space_id, author_inline)
      Acc-->>RPC: author_id
    end
    RPC->>Cnt: INSERT ... ON CONFLICT DO UPDATE
  end
  RPC-->>Client: [content_id...]
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/database/supabase/schemas/account.sql (1)

162-170: Singular upsert: STRICT placement and AgentIdentifier column alias

Mirror the fixes from the bulk path.

Apply this diff:

-    SELECT platform INTO platform_ STRICT FROM public."Space" WHERE id = space_id_;
+    SELECT platform INTO STRICT platform_ FROM public."Space" WHERE id = space_id_;
@@
-        ON CONFLICT (value, identifier_type, account_id)
-        DO UPDATE SET trusted = COALESCE(local_account.email_trusted, ai.email_trusted, false);
+        ON CONFLICT (value, identifier_type, account_id)
+        DO UPDATE SET trusted = COALESCE(local_account.email_trusted, ai.trusted, false);

Also applies to: 171-178

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2bd3690 and 7bb5f41.

📒 Files selected for processing (4)
  • packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql (1 hunks)
  • packages/database/supabase/schemas/account.sql (2 hunks)
  • packages/database/supabase/schemas/content.sql (7 hunks)
  • packages/database/types.gen.ts (5 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-07-13T16:47:14.352Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-07-13T16:47:14.352Z
Learning: In the discourse-graph codebase, types.gen.ts contains automatically generated database function type definitions that may have reordered signatures between versions. This reordering is expected behavior from the code generation process and should not be flagged as an issue requiring fixes.

Applied to files:

  • packages/database/types.gen.ts
📚 Learning: 2025-05-20T14:04:19.632Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#165
File: packages/database/supabase/schemas/embedding.sql:66-95
Timestamp: 2025-05-20T14:04:19.632Z
Learning: In the `match_embeddings_for_subset_nodes` SQL function in packages/database/supabase/schemas/embedding.sql, the number of results is implicitly limited by the length of the input array parameter `p_subset_roam_uids` since the function filters content using `WHERE c.source_local_id = ANY(p_subset_roam_uids)`.

Applied to files:

  • packages/database/types.gen.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Supabase Preview
  • GitHub Check: Supabase Preview
  • GitHub Check: Supabase Preview
🔇 Additional comments (15)
packages/database/supabase/schemas/content.sql (7)

178-179: Type switch to account_local_input for author_inline looks correct

This aligns inline account handling with the new composite type surface. No issues spotted.


231-234: Graceful fallback: resolve author_id from inline account local_id

Good addition. If author_local_id is absent but an inline account is provided, the author_id is derived. This keeps the mapping consistent with local inputs.


270-280: Inline creator/author resolution for Content mirrors Document logic

The ELSIF branches ensure creator_id/author_id are resolved from inline account local IDs when explicit local IDs are absent. This is consistent with the new inline account model.


327-329: Upsert missing document author from inline account before insert/update

Nice safety net: if db_document.author_id is still NULL but an inline author is present, you upsert and assign. This avoids NOT NULL violations down the line.


389-389: Good: Explicit search_path for SECURITY DEFINER posture

Setting SET search_path = '' avoids search_path hijacking during the function’s execution.


508-510: Embedding upsert hook is guarded and non-failing—good

The embedding write is optional and won’t break the main upsert path. Looks good.


389-389: Ensure inline account upserts occur before computing db_content

Currently, you call

db_content := public._local_content_to_db_content(local_content);

before running the two inline-account upsert blocks. That means even if author_inline(local_content) or creator_inline(local_content) produces a new account, db_content.author_id and db_content.creator_id stay NULL. You need to make sure resolved IDs flow into db_content.

Possible fixes:

• Move the inline-account upserts above the _local_content_to_db_content call and then compute db_content from the updated local_content:

-    db_content := public._local_content_to_db_content(local_content);
+    -- First resolve inline accounts so their IDs end up in the final record
+    IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
+      -- upsert author and set local_content.author_id
+
+    END IF;
+    IF account_local_id(creator_inline(local_content)) IS NOT NULL THEN
+      -- upsert creator and set local_content.creator_id
+
+    END IF;
+    -- Now build db_content with the populated IDs
+    db_content := public._local_content_to_db_content(local_content);

• If you’d rather avoid recomputing, simply patch db_content after the upserts to include any newly resolved IDs:

     IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
       …
       local_content.author_id := upsert_id;
     END IF;
     IF account_local_id(creator_inline(local_content)) IS NOT NULL THEN
       …
       local_content.creator_id := upsert_id;
     END IF;
+    -- Back-fill any NULL IDs in db_content with the inline-upserted values
+    db_content.author_id  := COALESCE(db_content.author_id,  local_content.author_id);
+    db_content.creator_id := COALESCE(db_content.creator_id, local_content.creator_id);

This change is required to guarantee that Content.author_id and Content.creator_id aren’t left NULL when only inline accounts are provided.

⛔ Skipped due to learnings
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-06-07T02:56:57.093Z
Learning: In the discourse-graph project's upsert_content function, null creator_id values are tolerated as an acceptable trade-off, even though they may be annoying, rather than failing the entire operation.
packages/database/types.gen.ts (3)

675-685: New RPCs exposed in typings match schema intent

upsert_account_in_space and upsert_accounts_in_space are present with the new account_local_input composite. Note: argument key ordering in generated types may differ from SQL declaration; per prior learnings in this repo, that’s expected and safe when using named args with Supabase RPC.


759-765: Composite type account_local_input correctly mirrors SQL

Fields name, account_local_id, email, email_trusted, space_editor are present and nullable, matching the SQL type.


642-646: No downstream destructuring of embedding fields detected; change is safe to merge

I searched for any direct uses of match_content_embeddings or match_embeddings_for_subset_nodes and for occurrences of the returned fields (content_id, roam_uid, similarity, text_content) in TS/TSX files. The only consumer is the Next.js API route, which simply passes the RPC response through unchanged—there’s no destructuring or aliasing of those fields in client code:

  • apps/website/app/api/supabase/similarity-rank/route.ts calls supabase.rpc("match_embeddings_for_subset_nodes", …) and returns the full response via createApiResponse without touching individual fields.
  • No other .ts/.tsx file calls either RPC or accesses the shape’s properties directly (e.g., via destructuring).
  • The match_content_embeddings RPC is currently unused in the codebase.

Since no client code expects renamed properties or destructures these fields, the updated return shape with content_id and text_content is fully compatible.

packages/database/supabase/schemas/account.sql (4)

101-109: account_local_input composite looks good

The composite captures core account attributes and local flags. No issues.


183-198: Legacy wrapper correctly maps to composite input

Row construction order matches the composite and preserves editor_ into space_editor. Good for backward compatibility.


200-291: Security/RLS posture remains consistent

RLS policies and SECURITY DEFINER utilities retain the intended access model. No changes required.


111-149: Fix STRICT placement & attribute/alias mismatches in upsert functions

The two upsert functions in packages/database/supabase/schemas/account.sql (lines 123–149 and 150–182) currently contain three critical SQL bugs that will cause runtime errors or incorrect data writes:

STRICT placement
PostgreSQL’s PL/pgSQL syntax requires STRICT to appear immediately after INTO, not after the variable name.
Replace

SELECT platform
  INTO platform_ STRICT
  FROM public."Space" WHERE id = space_id_;

with

SELECT platform
  INTO STRICT platform_
  FROM public."Space" WHERE id = space_id_;

Wrong composite‐type field
The composite type public.account_local_input defines the fifth attribute as space_editor, not editor. All references to local_account.editor must be updated:

  • In the INSERT … SpaceAccess statements, replace COALESCE(local_account.editor, …) with COALESCE(local_account.space_editor, …).
  • In the ROW(…)::public.account_local_input casts at the bottom of each function, ensure the fifth element aligns (e.g., ROW(name_, account_local_id_, email_, email_trusted, space_editor_)).

Wrong AgentIdentifier column alias
The AgentIdentifier table column is named trusted, not email_trusted. In all ON CONFLICT … DO UPDATE clauses, replace ai.email_trusted with ai.trusted:

DO UPDATE SET trusted =
  COALESCE(local_account.email_trusted, ai.trusted, false);

Locations needing fixes (in schemas/account.sql):

  • upsert_accounts_in_space (lines 123–149)
  • upsert_account_in_space (lines 150–182)

Migration files (e.g. 20250823135620_bulk_account_upsert.sql) already contain the same issues. Per our migration‐history policy, do not edit existing migrations; instead, create a new corrective migration to apply the same three fixes there.

packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql (1)

291-296: Remove invalid reference to db_document.creator_id

Document has no creator_id; referencing it will fail. Remove the block.

-      IF (db_document.creator_id IS NULL AND creator_inline(local_content) IS NOT NULL) THEN
-        db_document.creator_id := upsert_account_in_space(v_space_id, creator_inline(local_content));
-      END IF;
⛔ Skipped due to learnings
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-06-07T02:56:57.093Z
Learning: In the discourse-graph project's upsert_content function, null creator_id values are tolerated as an acceptable trade-off, even though they may be annoying, rather than failing the entire operation.

Comment on lines +248 to +252
DECLARE
v_platform public."Platform";
db_document public."Document"%ROWTYPE;
document_id BIGINT;
local_content public.content_local_input;
db_content public."Content"%ROWTYPE;
content_row JSONB;
upsert_id BIGINT;
BEGIN
SELECT platform INTO STRICT v_platform FROM public."Space" WHERE id=v_space_id;
FOR content_row IN SELECT * FROM jsonb_array_elements(data)
LOOP
local_content := jsonb_populate_record(NULL::public.content_local_input, content_row);
local_content.space_id := v_space_id;
db_content := public._local_content_to_db_content(local_content);
IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
SELECT public.create_account_in_space(
v_space_id,
account_local_id(author_inline(local_content)),
name(author_inline(local_content))
) INTO STRICT upsert_id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Consider moving inline account upserts before building db_content

Same concern as in the schema file: build db_content after upserting inline author/creator so IDs propagate. See the suggested variants in the other comment.


🏁 Script executed:

#!/bin/bash
# Show the snippet around lines 248-268 in the migration file
sed -n '230,300p' packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql

# Find the definition of _local_content_to_db_content
rg -n "_local_content_to_db_content" -g "packages/database/supabase/**/*.sql"

Length of output: 5449


Reorder inline account upserts before converting to db_content

In packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql, inside the upsert_content function (around line 260), the code currently builds db_content before doing the inline author/creator upserts—so any newly-created author_id/creator_id won’t be reflected in the content row. To fix this, move the two IF-blocks that call create_account_in_space above the call to _local_content_to_db_content.

• File:
packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql
• Location:
In upsert_content, swap the order of these lines:

-    db_content := public._local_content_to_db_content(local_content);
     IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
       SELECT public.create_account_in_space(
         v_space_id,
         account_local_id(author_inline(local_content)),
         name(author_inline(local_content))
       ) INTO STRICT upsert_id;
       local_content.author_id := upsert_id;
     END IF;
     IF account_local_id(creator_inline(local_content)) IS NOT NULL THEN
       SELECT public.create_account_in_space(
         v_space_id,
         account_local_id(creator_inline(local_content)),
         name(creator_inline(local_content))
       ) INTO STRICT upsert_id;
       local_content.creator_id := upsert_id;
     END IF;
+    db_content := public._local_content_to_db_content(local_content);

This ensures that local_content.author_id and local_content.creator_id are set before the conversion utility reads them.

🤖 Prompt for AI Agents
In packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql
around lines 248 to 268, the function builds db_content before performing inline
author/creator account upserts so newly-created author_id/creator_id won't be
present in the converted row; move the two IF blocks that call
public.create_account_in_space (the inline author and creator upserts) to run
immediately after local_content is populated and before the call to db_content
:= public._local_content_to_db_content(local_content) so local_content.author_id
and local_content.creator_id are set prior to conversion.

@maparent maparent force-pushed the eng-764-bulk-account-uploading branch from 7bb5f41 to ec8109d Compare August 23, 2025 14:53
@maparent
Copy link
Collaborator Author

@CodeRabbit review again

Copy link
Contributor

coderabbitai bot commented Aug 23, 2025

@maparent I'll conduct a comprehensive review of all the changes in this PR for bulk account uploading functionality. This includes the database migration, schema updates, and TypeScript type definitions.

✅ Actions performed

Full review triggered.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
packages/database/supabase/schemas/content.sql (1)

435-437: Duplicate comment: Document has no creator_id column; referencing db_document.creator_id will fail

public."Document" does not define creator_id. Accessing db_document.creator_id in PL/pgSQL raises "record ... has no field 'creator_id'". Remove this block.

Apply this diff:

-      IF (db_document.author_id IS NULL AND author_inline(local_content) IS NOT NULL) THEN
-        db_document.author_id := upsert_account_in_space(v_space_id, author_inline(local_content));
-      END IF;
packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql (2)

62-62: Variable name inconsistency in bulk function

Line 62 uses space_id instead of space_id_ when calling upsert_account_in_space. This should match the function parameter.

Apply this diff:

-        RETURN NEXT public.upsert_account_in_space(space_id, local_account);
+        RETURN NEXT public.upsert_account_in_space(space_id_, local_account);

246-262: Account creation assignments may not affect database operations

Similar to the issue in the schema file, the assignments to db_content.author_id and db_content.creator_id after conversion may not be effective since db_content was already built from local_content.

Consider reordering the operations:

    local_content := jsonb_populate_record(NULL::public.content_local_input, content_row);
    local_content.space_id := v_space_id;
-   db_content := public._local_content_to_db_content(local_content);
    IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
      SELECT public.create_account_in_space(
        v_space_id,
        account_local_id(author_inline(local_content)),
        name(author_inline(local_content))
      ) INTO STRICT upsert_id;
-     db_content.author_id := upsert_id;
+     local_content.author_id := upsert_id;
    END IF;
    IF account_local_id(creator_inline(local_content)) IS NOT NULL THEN
      SELECT public.create_account_in_space(
        v_space_id,
        account_local_id(creator_inline(local_content)),
        name(creator_inline(local_content))
      ) INTO STRICT upsert_id;
-     db_content.creator_id := upsert_id;
+     local_content.creator_id := upsert_id;
    END IF;
+   db_content := public._local_content_to_db_content(local_content);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2bd3690 and ec8109d.

📒 Files selected for processing (4)
  • packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql (1 hunks)
  • packages/database/supabase/schemas/account.sql (2 hunks)
  • packages/database/supabase/schemas/content.sql (8 hunks)
  • packages/database/types.gen.ts (5 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-13T16:47:14.352Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-07-13T16:47:14.352Z
Learning: In the discourse-graph codebase, types.gen.ts contains automatically generated database function type definitions that may have reordered signatures between versions. This reordering is expected behavior from the code generation process and should not be flagged as an issue requiring fixes.

Applied to files:

  • packages/database/types.gen.ts
📚 Learning: 2025-05-20T14:04:19.632Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#165
File: packages/database/supabase/schemas/embedding.sql:66-95
Timestamp: 2025-05-20T14:04:19.632Z
Learning: In the `match_embeddings_for_subset_nodes` SQL function in packages/database/supabase/schemas/embedding.sql, the number of results is implicitly limited by the length of the input array parameter `p_subset_roam_uids` since the function filters content using `WHERE c.source_local_id = ANY(p_subset_roam_uids)`.

Applied to files:

  • packages/database/types.gen.ts
📚 Learning: 2025-06-07T02:56:57.093Z
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-06-07T02:56:57.093Z
Learning: In the discourse-graph project's upsert_content function, null creator_id values are tolerated as an acceptable trade-off, even though they may be annoying, rather than failing the entire operation.

Applied to files:

  • packages/database/supabase/schemas/content.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Supabase Preview
  • GitHub Check: Supabase Preview
🔇 Additional comments (25)
packages/database/supabase/schemas/content.sql (7)

178-178: LGTM! Correct type conversion for inline account input

The change from public."PlatformAccount" to public.account_local_input for author_inline field in document_local_input type is correct and aligns with the new bulk account uploading functionality.


207-210: LGTM! Consistent inline account type updates

The updates to use public.account_local_input for both author_inline and creator_inline fields in content_local_input, plus the addition of the variant field, are consistent with the bulk account uploading requirements.


231-234: Enhanced inline account resolution logic

The addition of inline account resolution logic in _local_document_to_db_document correctly checks if an inline account is available and resolves its account_local_id to populate document.author_id. This provides a fallback when author_local_id is not directly provided.


270-280: Enhanced inline account resolution for content

The similar inline account resolution logic for both creator and author in _local_content_to_db_content maintains consistency with the document function and correctly handles the fallback scenarios.


327-329: Effective inline account upsert integration

The addition of inline account upserting in upsert_documents using the new upsert_account_in_space function properly handles the case where db_document.author_id is NULL but inline account data is available. This complements the existing inline account creation logic.


389-389: Good practice: Explicit search path

Adding SET search_path = '' to the upsert_content function ensures explicit schema resolution during operations, which is a good security and reliability practice.


413-413: Potential issue: Assignments may not propagate to database operations

The code assigns upsert_id to db_content.author_id and db_content.creator_id after calling public._local_content_to_db_content(local_content). However, these assignments to the already-converted db_content record may not affect subsequent database operations since the local record was already processed.

Consider moving the inline account creation logic before the _local_content_to_db_content call, or alternatively, add additional upsert logic similar to what's done for documents:

    db_content := public._local_content_to_db_content(local_content);
    IF account_local_id(author_inline(local_content)) IS NOT NULL THEN
      SELECT public.create_account_in_space(
        v_space_id,
        account_local_id(author_inline(local_content)),
        name(author_inline(local_content))
      ) INTO STRICT upsert_id;
-     db_content.author_id := upsert_id;
+     local_content.author_id := upsert_id;
    END IF;
    IF account_local_id(creator_inline(local_content)) IS NOT NULL THEN
      SELECT public.create_account_in_space(
        v_space_id,
        account_local_id(creator_inline(local_content)),
        name(creator_inline(local_content))
      ) INTO STRICT upsert_id;
-     db_content.creator_id := upsert_id;
+     local_content.creator_id := upsert_id;
    END IF;
+   -- Rebuild db_content after setting local_content IDs
+   db_content := public._local_content_to_db_content(local_content);

Or add fallback upsert logic after the conversion:

+   -- Handle remaining NULL author/creator IDs from inline data
+   IF (db_content.author_id IS NULL AND author_inline(local_content) IS NOT NULL) THEN
+     db_content.author_id := upsert_account_in_space(v_space_id, author_inline(local_content));
+   END IF;
+   IF (db_content.creator_id IS NULL AND creator_inline(local_content) IS NOT NULL) THEN
+     db_content.creator_id := upsert_account_in_space(v_space_id, creator_inline(local_content));
+   END IF;

Also applies to: 421-421

⛔ Skipped due to learnings
Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#0
File: :0-0
Timestamp: 2025-06-07T02:56:57.093Z
Learning: In the discourse-graph project's upsert_content function, null creator_id values are tolerated as an acceptable trade-off, even though they may be annoying, rather than failing the entire operation.
packages/database/types.gen.ts (5)

759-765: New composite type definition looks correct

The account_local_input composite type definition with nullable fields for name, account_local_id, email, email_trusted, and space_editor aligns with the database schema changes and supports the bulk account uploading functionality.


675-685: New account upsert function signatures are well-defined

Both upsert_account_in_space and upsert_accounts_in_space function signatures correctly reflect the database functions with proper parameter types and return types.


806-806: Inline account type updates are consistent

The updates to content_local_input.author_inline and content_local_input.creator_inline to use the new account_local_input composite type maintain consistency across the type system.

Also applies to: 809-810


828-828: Document inline account type updated correctly

The document_local_input.author_inline field correctly uses the new account_local_input composite type, maintaining consistency with the content types.


642-646: Updated embedding function return types

The return type updates for match_content_embeddings and match_embeddings_for_subset_nodes to include content_id, roam_uid, similarity, and text_content appear to be improvements to the embedding search functionality, though they're not directly related to the bulk account uploading feature.

Also applies to: 654-654

packages/database/supabase/schemas/account.sql (4)

101-109: New composite type is well-structured

The account_local_input composite type appropriately separates PlatformAccount columns from local values, providing a clean interface for bulk account operations.


111-142: Robust account upsert implementation

The upsert_account_in_space function correctly handles the complete account creation workflow including PlatformAccount insertion, SpaceAccess management, and optional AgentIdentifier creation. The use of SECURITY DEFINER and explicit search path is appropriate.


144-165: Efficient bulk upsert implementation

The upsert_accounts_in_space function properly delegates to the single account upsert function, providing clean bulk functionality without code duplication.


168-181: Effective legacy wrapper maintains backward compatibility

The legacy create_account_in_space function provides a smooth transition path by wrapping the new composite-based function while maintaining the original signature and default values.

packages/database/supabase/migrations/20250823135620_bulk_account_upsert.sql (9)

1-9: Composite type definition is consistent

The account_local_input composite type definition matches the schema file and provides the necessary fields for bulk account operations.


11-42: Account upsert function implementation is solid

The upsert_account_in_space function correctly handles all aspects of account creation including platform resolution, account insertion, space access management, and email identifier handling.


44-65: Bulk upsert function with proper delegation

The upsert_accounts_in_space function correctly processes JSONB arrays and delegates to the single account upsert function.


68-81: Legacy wrapper maintains compatibility

The legacy create_account_in_space function provides backward compatibility while using the new composite type internally.


83-86: Type alterations update existing schemas

The ALTER TYPE statements correctly update the existing composite types to use the new account_local_input type for inline account fields.


88-118: Document conversion function is well-implemented

The _local_document_to_db_document function properly handles both direct local IDs and inline account resolution with appropriate fallbacks.


120-165: Content conversion function maintains consistency

The _local_content_to_db_content function correctly handles document, creator, author, and part relationships with proper inline account resolution.


167-225: Document upsert function handles inline accounts properly

The upsert_documents function correctly processes inline accounts and handles the document upsert logic with proper conflict resolution.


227-351: Comprehensive content upsert implementation

The upsert_content function provides a complete content upsert workflow including inline account creation, document handling, content insertion, and embedding processing. The logic flow is well-structured and handles various scenarios appropriately.

FOR account_row IN SELECT * FROM jsonb_array_elements(accounts)
LOOP
local_account := jsonb_populate_record(NULL::public.account_local_input, account_row);
RETURN NEXT public.upsert_account_in_space(space_id, local_account);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Variable name inconsistency

Line 162 uses space_id instead of space_id_ when calling upsert_account_in_space. This should be space_id_ to match the function parameter.

Apply this diff:

-        RETURN NEXT public.upsert_account_in_space(space_id, local_account);
+        RETURN NEXT public.upsert_account_in_space(space_id_, local_account);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RETURN NEXT public.upsert_account_in_space(space_id, local_account);
RETURN NEXT public.upsert_account_in_space(space_id_, local_account);
🤖 Prompt for AI Agents
In packages/database/supabase/schemas/account.sql around line 162, the call uses
the wrong variable name `space_id` instead of the function parameter
`space_id_`; update the call to pass `space_id_` to
public.upsert_account_in_space so the parameter names match and the correct
value is forwarded.

@maparent maparent force-pushed the eng-764-bulk-account-uploading branch from ec8109d to bdb0180 Compare August 23, 2025 15:06
Add functions to upsert platform accounts (individually or in bulk)
Keep the old create_account_in_space function as a shim for now.
Use the new upsert_account in upsert_documents and upsert_content,
allowing for more complete inline information.
@maparent maparent force-pushed the eng-764-bulk-account-uploading branch from bdb0180 to 18a0d92 Compare August 23, 2025 17:26
@maparent
Copy link
Collaborator Author

@CodeRabbit good catch for space_id vs space_id_ in upsert_accounts_in_space.
The issue with ordering of _local_content_to_db_content is moot; that function does not do database operations. The database operations happen in the INSERT many lines afterwards.

Otherwise, when there is a PR involving projects/database/supabase/schemas and projects/database/supabase/migrations, no need to identify errors twice, as long as the two are consistent. Just point out the errors in projects/database/supabase/schemas only.

Copy link
Contributor

coderabbitai bot commented Aug 23, 2025

Tip

For best results, initiate chat on the files or code changes.

@maparent Thank you for the feedback! I appreciate the clarification on _local_content_to_db_content - you're absolutely right that it's a pure transformation function without database operations, so the ordering concern doesn't apply there.

I'll remember your preference about focusing on schema files when both schemas and migrations are present in a PR to avoid redundant error reporting. This will help keep the reviews more focused and reduce noise.

🐰✨
ears perk up attentively


✏️ Learnings added
Learnt from: maparent
URL: https://github.com/DiscourseGraphs/discourse-graph/pull/357

Timestamp: 2025-08-23T17:30:00.219Z
Learning: For PRs involving both `packages/database/supabase/schemas` and `packages/database/supabase/migrations` files, only report errors in the schema files to avoid duplication, as long as the schema and migration files are consistent.

Learnt from: maparent
URL: https://github.com/DiscourseGraphs/discourse-graph/pull/357

Timestamp: 2025-08-23T17:30:00.219Z
Learning: The `_local_content_to_db_content` function in the database schema is a pure transformation function that doesn't perform database operations - database operations happen later in the INSERT statements.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Contributor

@mdroidian mdroidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maparent What was the specific use case you were envisioning here? When will we want to bulk upload accounts? Is there a new client that is requesting this feature? Or was there some other reason that prompted this work?

@maparent
Copy link
Collaborator Author

The use case: The global node uploading (Sid's work on #343) failed when the accounts are not specified. There's a way to make it work (with inline accounts) but it's verbose. Pre-filling the accounts is way more efficient. Sorry I did not point to it, but most of the conversation is there.

@maparent maparent merged commit 30b932f into main Aug 23, 2025
4 checks passed
@maparent maparent deleted the eng-764-bulk-account-uploading branch August 23, 2025 21:34
@github-project-automation github-project-automation bot moved this to Done in General Aug 23, 2025
@mdroidian
Copy link
Contributor

Do we intend to use upsert_accounts_in_space? Or will the accounts just be automatically created if they don't already exist using create_account_in_space?

@maparent
Copy link
Collaborator Author

the former will be more efficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants