
Conversation

@maparent (Collaborator) commented Oct 8, 2025

Summary by CodeRabbit

  • New Features

    • Added an end-to-end database benchmarking and data seeding tool.
    • Supports configurable generation of spaces, accounts, content, nodes, and relations via YAML.
    • Introduced a client-side concept query benchmark that reports per-query timings and results.
    • Optional verbose logging for detailed run insights.
  • Chores

    • Added a default benchmark configuration file.
    • Added a convenient script to run benchmarks from the command line.

Note: The benchmark is not fully automated, but detailed instructions are provided at the top of the scripts/bench.mts file.
The benchmark as defined here would mostly time out; the optimizations from ENG-950 are needed to make it work.


linear bot commented Oct 8, 2025


supabase bot commented Oct 8, 2025

This pull request has been ignored for the connected project zytfjzqyijgagqxrzbmz because there are no changes detected in packages/database/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@maparent maparent force-pushed the eng-898-create-database-query-functions-for-base-dg-queries branch from ac99941 to 71e4ccf Compare October 12, 2025 17:08
Base automatically changed from eng-898-create-database-query-functions-for-base-dg-queries to main October 12, 2025 23:29
@maparent maparent force-pushed the eng-376-run-simulation-on-db-schema branch 2 times, most recently from a6b0d11 to 946b257 Compare October 13, 2025 14:07
@maparent (Collaborator, Author) commented:

@CodeRabbit review

coderabbitai bot (Contributor) commented Oct 14, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot (Contributor) commented Oct 14, 2025

📝 Walkthrough

Adds a Python PostgreSQL benchmarking/seeding script, a YAML benchmark configuration, a TypeScript bench runner that queries concepts in a Roam-based space, and an npm script to invoke the bench runner.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **PostgreSQL benchmark and data seeding**<br>`packages/database/benchmark.py` | New end-to-end benchmark/seeding script: DB init (drop/create or truncate), space/account/content generation, schema/node/relation upserts, batch helpers, CLI entrypoint, and debug logging. |
| **Benchmark configuration**<br>`packages/database/benchmark.yaml` | New YAML config specifying database_url, account count, node specs, and relation specs for benchmarking. |
| **Bench runner script**<br>`packages/database/scripts/bench.mts` | New script to configure environment, create/fetch space, init authenticated client, fetch schema concepts, execute multiple concept queries, and log per-query timings/results. |
| **NPM script**<br>`packages/database/package.json` | Adds a "bench" script to run bench.mts via tsx. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  actor User
  participant YAML as Config (benchmark.yaml)
  participant Py as benchmark.py
  participant PG as PostgreSQL

  User->>Py: Run main(configPath)
  Py->>YAML: Load database_url, schemas, specs
  alt Schemas provided
    Py->>PG: Drop/Create DB (psql)
  else No schemas
    Py->>PG: Truncate domain tables (psql)
  end
  Py->>PG: Insert Space
  Py->>PG: Upsert Accounts (batch)
  Py->>PG: Upsert Concept Schemata (nodes/relations)
  Py->>PG: Upsert Content (batch)
  Py->>PG: Upsert Concept Nodes (batch, link content)
  Py->>PG: Upsert Relations (batch, link nodes/content)
  PG-->>Py: IDs and upsert results
  Py-->>User: Print created entity IDs
  note over Py,PG: Helpers handle batching, ID propagation, assertions
```
```mermaid
sequenceDiagram
  autonumber
  actor Dev as Developer
  participant TS as scripts/bench.mts
  participant Env as config()
  participant API as Platform API (Roam)
  participant DB as Supabase Client

  Dev->>TS: npm run bench
  TS->>Env: Load env, set platform "Roam"
  TS->>API: fetchOrCreateSpaceDirect(test creds)
  API-->>TS: spaceId or error
  alt spaceId ok
    TS->>DB: createLoggedInClient(spaceId)
    DB-->>TS: authenticated client
    TS->>API: getSchemaConcepts(spaceId)
    loop For each predefined query
      TS->>API: getConcepts(query + {supabase, spaceId})
      API-->>TS: Results
      TS-->>Dev: Log duration, description, results
    end
  else failure
    TS-->>Dev: Exit with error
  end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title mentions concept query benchmarking under ENG-376, which corresponds to the bench.mts script, but overlooks the larger addition of end-to-end database seeding and benchmarking utilities in Python, the YAML configuration, and the npm script integration. It therefore only partially summarizes the main changes in this pull request. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |


@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

🧹 Nitpick comments (6)
packages/database/benchmark.yaml (1)

16-31: Make benchmark scale tunable (seed, batch size, space identifiers).

Add knobs so runs are reproducible and adjustable without code edits.

Consider adding:

  • seed: integer
  • batch_size: integer (defaults to 500)
  • space:
    • url: "test"
    • name: "test"
    • platform: "Roam"

Python can then read these with sensible defaults.
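
A minimal sketch of how benchmark.py could read these knobs with defaults, assuming the script loads its config with PyYAML; the key names and default values below mirror the suggestion above and are illustrative:

```python
import random
import yaml  # PyYAML; assumed since the script is configured via benchmark.yaml

def load_config(path: str) -> dict:
    """Load the benchmark config, filling tunable knobs with sensible defaults."""
    with open(path) as f:
        cfg = yaml.safe_load(f)
    cfg.setdefault("seed", 0)          # reproducible runs
    cfg.setdefault("batch_size", 500)  # rows per upsert batch
    cfg.setdefault("space", {"url": "test", "name": "test", "platform": "Roam"})
    random.seed(cfg["seed"])           # seed once, up front
    return cfg
```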
packages/database/scripts/bench.mts (1)

24-41: Parameterize space/platform/password via env to avoid hard-coding.

Keep defaults but allow overrides for CI and local runs.

Apply this diff:

```diff
-config();
-const platform = "Roam";
-let supabase = createClient();
+config();
+const platform = process.env.BENCH_PLATFORM ?? "Roam";
+const spaceUrl = process.env.BENCH_SPACE_URL ?? "test";
+const spaceName = process.env.BENCH_SPACE_NAME ?? "test";
+const password = process.env.BENCH_PASSWORD ?? "password";
+let supabase = createClient();
 if (!supabase) process.exit(1);
 const { data, error } = await fetchOrCreateSpaceDirect({
-  url: "test",
-  name: "test",
-  platform,
-  password: "password",
+  url: spaceUrl,
+  name: spaceName,
+  platform,
+  password,
 });
 if (error || !data || !data.id) {
   console.error("Could not create space connection", error);
   process.exit(1);
 }
 const spaceId = data.id;
-supabase = await createLoggedInClient(platform, spaceId, "password");
+supabase = await createLoggedInClient(platform, spaceId, password);
 if (!supabase) process.exit(1);
```
packages/database/benchmark.py (4)

132-133: Reduce log volume for large inserts.

Printing tens of thousands of IDs floods stdout; print counts instead.

-    print("Content:", ", ".join(str(c["id"]) for c in all_content))
+    print(f"Content: inserted {len(all_content)} items")

215-216: Reduce log volume for large node inserts.

-    print("Nodes:", ", ".join(str(n["id"]) for n in all_nodes))
+    print(f"Nodes: inserted {len(all_nodes)} items")

266-267: Reduce log volume for large relation inserts.

-    print("Relations:", ", ".join(str(r["id"]) for r in all_relns))
+    print(f"Relations: inserted {len(all_relns)} items")

1-1: Shebang vs executable bit.

Either make the file executable (chmod +x) or drop the shebang to appease linters.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffc313a and cb6e8de.

📒 Files selected for processing (4)
  • packages/database/benchmark.py (1 hunks)
  • packages/database/benchmark.yaml (1 hunks)
  • packages/database/package.json (1 hunks)
  • packages/database/scripts/bench.mts (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)

packages/database/benchmark.yaml

  • [medium] 13-14: Basic Auth Credentials (CKV_SECRET_4)

🪛 Ruff (0.14.0)

packages/database/benchmark.py

  • 1-1: Shebang is present but file is not executable (EXE001)
  • 45-45: subprocess call: check for execution of untrusted input (S603)
  • 91-91: zip() without an explicit strict= parameter; add an explicit value for strict= (B905)
  • 105-105: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
  • 227-227: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
  • 236-236: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
  • 238-238: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
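
For readers unfamiliar with these Ruff codes, a hedged illustration of the usual fixes (not the actual code in benchmark.py): B905 asks that zip() fail loudly on length mismatches, and S311 flags that the random module is fine for synthetic benchmark data but not for secrets; a seeded generator also makes runs reproducible.

```python
import random

ids = [1, 2, 3]
names = ["a", "b", "c"]

# B905: strict=True raises ValueError on a length mismatch instead of
# silently truncating (Python 3.10+)
pairs = list(zip(ids, names, strict=True))

# S311: acceptable for non-cryptographic benchmark data; a dedicated
# seeded generator also keeps runs reproducible
rng = random.Random(42)
pick = rng.choice(names)
```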

🔇 Additional comments (1)
packages/database/package.json (1)

29-29: LGTM: bench entry point added.

Script wiring is correct and tsx is present in devDependencies.

@maparent maparent marked this pull request as ready for review October 14, 2025 23:42
@maparent maparent requested a review from mdroidian October 14, 2025 23:44
@mdroidian (Contributor) left a comment

The main goal of ENG-376 was to "Run Simulation on db schema". Given that, I would expect to see

  • a list of queries that were run
  • the results of the benchmark for those queries

Could you put those in either the body of this PR or in the comment of said task, please?

Feel free to use this table format

| Query Name | Benchmark Result (ms) | Notes |
| --- | --- | --- |
| Discourse context: all nodes in any relation with node X | | |
| All discourse nodes | | |
| All discourse nodes of type X | | |
| All discourse relations | | |
| Relations containing discourse node of type X | | |
| Relations containing specific discourse node X | | |
| Nodes that support a claim OR inform a question (authored by user X) | | |

@maparent (Collaborator, Author) commented:

That table (in text form) is in ENG-950. The last one was not checked; it could be added. (Do you mean any question authored by X, or some specific question authored by X?) Without the optimizations, anything having to do with relations times out, but I'll recalculate.

@mdroidian (Contributor) commented:

It was just an example table. That is fine, I'll add it to the main ticket.

@mdroidian mdroidian self-requested a review October 15, 2025 19:49
@mdroidian (Contributor) left a comment

From the list of primary queries we were running the benchmark on, it looks like some of these benchmarks were changed slightly. Below is my understanding of the mapping; please correct me if I am wrong:

1:1

  • query all discourse nodes
  • query all discourse relations
  • query all discourse nodes of type x

mapped

  • discourse context: query all nodes in any relation with node x, and the relation label
    • In relation to a specific node
    • A specific node's relation
  • query all discourse relations containing discourse node of type x
    • Query all nodes in some relation to another node of a given type
  • query all discourse relations containing specific discourse node x
    • same as discourse context
  • query all nodes that support a claim OR inform a question and are authored by user x
    • not included

Could you please include "query all nodes that support a claim OR inform a question and are authored by user x"?

Stated differently: "return all nodes that ..."

  Authored by user x
  AND (
    Support any claim
    OR
    Inform any question
  )

To clarify "Support any claim": claims could part of two separate relations, in this case defined by target relation destination eg:

  1. evidence supports claim
  2. claim supports question

So the above query is looking for 1), not 2), where claim is the destination not the target of a supports relation
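
Restated as code, a hypothetical in-memory sketch of the requested predicate; the dict shapes and field names are illustrative, not the actual schema:

```python
def matches(node: dict, relations: list[dict], user_x: str) -> bool:
    """Authored by user X AND (supports any claim OR informs any question).

    "Supports any claim" means case 1 above: the node is the source of a
    "supports" relation whose destination is a Claim, not a Claim that
    itself supports a Question (case 2).
    """
    if node["author"] != user_x:
        return False
    for rel in relations:
        if rel["source"] != node["id"]:
            continue
        if rel["label"] == "supports" and rel["destination_type"] == "Claim":
            return True
        if rel["label"] == "informs" and rel["destination_type"] == "Question":
            return True
    return False
```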

@maparent (Collaborator, Author) commented:

Ok, thanks for the clarification, I think I got it. As stated, this is not something the query builder can handle: I can filter on relation type AND on the other node's type independently, but not on a combination yet. I also do not take direction into account, though I suspect that is not yet necessary. To be clear, I see this as an important use case, but I did not see it as part of the initial list.
Do you think it's worth working on now? (No objection either way, but other plans for the week would suffer.)
What I could do: instead of giving two lists of relation types and node types, I could give one list of triples: (relation type, destination type, direction), as sketched below. (Relation type could be a wildcard or a list, I guess? But I think it's easier to have a list of triples.) Not optimistic about performance, to be honest.
Concern: we would be adding ad hoc complexity, for a single use case, to what we have determined is likely a dead end.
I think we should instead look at the more complex queries and think about how we could handle them as a whole; clearly not through this path.
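
For concreteness, the proposed triple-based filter might look like this (shape only; the names are hypothetical):

```python
# One list of (relation type, other node's type, direction) triples
# instead of two independent lists; relation type could be a wildcard.
triples = [
    ("supports", "Claim", "outgoing"),    # node --supports--> Claim
    ("informs", "Question", "outgoing"),  # node --informs--> Question
]
```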

@maparent maparent requested a review from mdroidian October 15, 2025 20:49
@mdroidian (Contributor) commented:

If you are saying that this query cannot be adequately supported via the current implementation, then that conclusion is all we are looking for in this test. This will help determine what steps we need to take moving forward. I will update the table in linear to include this determination.

@maparent maparent force-pushed the eng-376-run-simulation-on-db-schema branch 2 times, most recently from 91926ec to 3a6d663 Compare October 17, 2025 13:27
@maparent maparent force-pushed the eng-376-run-simulation-on-db-schema branch from 3a6d663 to e53c99a Compare October 18, 2025 12:39