@shilman
Member

@shilman shilman commented Nov 20, 2025

Latest prompt, stories, components.json

Copilot AI review requested due to automatic review settings November 20, 2025 10:21
@changeset-bot

changeset-bot bot commented Nov 20, 2025

⚠️ No Changeset found

Latest commit: 4327f47

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

💥 An error occurred when fetching the changed packages and changesets in this PR
Some errors occurred when validating the changesets config:
The package or glob expression "@storybook/mcp-eval*" is specified in the `ignore` option but it is not found in the project. You may have misspelled the package name or provided an invalid glob expression. Note that glob expressions must be defined according to https://www.npmjs.com/package/micromatch.

@pkg-pr-new

pkg-pr-new bot commented Nov 20, 2025

npm i https://pkg.pr.new/@storybook/addon-mcp@87
npm i https://pkg.pr.new/@storybook/mcp@87

commit: 4327f47

@shilman shilman requested review from JReinhold and removed request for Copilot November 20, 2025 10:22
@codecov

codecov bot commented Nov 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.13%. Comparing base (09dad11) to head (4327f47).
⚠️ Report is 1 commits behind head on next.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             next      #87   +/-   ##
=======================================
  Coverage   89.13%   89.13%           
=======================================
  Files          19       19           
  Lines         414      414           
  Branches      116      116           
=======================================
  Hits          369      369           
  Misses          6        6           
  Partials       39       39           


Copilot AI review requested due to automatic review settings November 20, 2025 20:27
@JReinhold JReinhold merged commit 188482e into next Nov 20, 2025
11 checks passed
@JReinhold JReinhold deleted the shilman/update-reshape-flight-booking branch November 20, 2025 20:29
Contributor

Copilot AI left a comment

Pull Request Overview

This PR updates the flight booking evaluation with improved test infrastructure and data-testid requirements. The changes add specific test IDs to the prompt requirements, refactor the stories file to use a more robust element selection strategy, and update the components.json file with new import patterns and error messages.

Key Changes

  • Added data-testid requirements to all interactive elements in the prompt
  • Introduced looseGetInteractiveElements helper for flexible element selection in tests
  • Updated story type annotations from implicit to explicit Story types
  • Modified components.json with consolidated imports and updated error messages

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.

File | Description
prompt.md | Added data-testid requirements to each technical requirement
FlightBooking.stories.tsx | Added helper function for element selection, restructured stories with explicit types
components.json | Updated import statements to use consolidated exports, changed error message text

5. The element for the "Return"-toggle SHOULD have a "Return" as its only content and SHOULD have data-testid="return"
6. The autocomplete to open the From airport picker SHOULD have "From" as its placeholder and SHOULD have data-testid="flight-trigger-from"
7. The autocomplete to open the To airport picker SHOULD have "To" as its placeholder and SHOULD have data-testid="flight-trigger-to"
8. Each element to select an airport in the pickers SHOULD have include both the shortcode and full airport name in its content and SHOULD have data-testid="airport-{SHORTCODE}" (e.g., "airport-MEL", "airport-LAX")

Copilot AI Nov 20, 2025

Corrected grammar error: 'SHOULD have include' should be 'SHOULD include'

Suggested change
- 8. Each element to select an airport in the pickers SHOULD have include both the shortcode and full airport name in its content and SHOULD have data-testid="airport-{SHORTCODE}" (e.g., "airport-MEL", "airport-LAX")
+ 8. Each element to select an airport in the pickers SHOULD include both the shortcode and full airport name in its content and SHOULD have data-testid="airport-{SHORTCODE}" (e.g., "airport-MEL", "airport-LAX")

Copilot uses AI. Check for mistakes.
8. Each element to select an airport in the pickers SHOULD have include both the shortcode and full airport name in its content and SHOULD have data-testid="airport-{SHORTCODE}" (e.g., "airport-MEL", "airport-LAX")
9. The element to open the Departure Date date select SHOULD have "Departure Date" as its initial content and SHOULD have data-testid="date-trigger-departure"
10. The (optional) element to open the Return Date date select SHOULD have "Return Date" as its initial content and SHOULD have data-testid="date-trigger-return"
11. Each date in the date selects SHOULD the day of month as its only content and SHOULD have data-testid="date-{DAY}" (e.g., "date-27", "date-15")

Copilot AI Nov 20, 2025

Missing verb: 'SHOULD the day' should be 'SHOULD have the day' or 'SHOULD show the day'

Suggested change
- 11. Each date in the date selects SHOULD the day of month as its only content and SHOULD have data-testid="date-{DAY}" (e.g., "date-27", "date-15")
+ 11. Each date in the date selects SHOULD have the day of month as its only content and SHOULD have data-testid="date-{DAY}" (e.g., "date-27", "date-15")
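
The data-testid patterns quoted in the requirements above ("flight-trigger-from", "date-trigger-departure", "airport-{SHORTCODE}", "date-{DAY}") could be generated from one place so that stories and components cannot drift apart. The following is a minimal sketch under that assumption; the helper names and the 1 to 31 range check are hypothetical and do not appear in the PR.

```typescript
// Hypothetical helpers (not part of the PR): build the test IDs that the
// prompt requirements describe, so the naming scheme lives in one place.
type FlightSide = "from" | "to";
type DateKind = "departure" | "return";

// "flight-trigger-from" / "flight-trigger-to"
const flightTriggerId = (side: FlightSide): string => `flight-trigger-${side}`;

// "date-trigger-departure" / "date-trigger-return"
const dateTriggerId = (kind: DateKind): string => `date-trigger-${kind}`;

// "airport-{SHORTCODE}", e.g. "airport-MEL"; the shortcode is used as given.
const airportId = (shortcode: string): string => `airport-${shortcode}`;

// "date-{DAY}", e.g. "date-27"; rejecting values outside 1..31 is an
// assumption made for this sketch, not something the prompt specifies.
const dateId = (day: number): string => {
  if (!Number.isInteger(day) || day < 1 || day > 31) {
    throw new RangeError(`Invalid day of month: ${day}`);
  }
  return `date-${day}`;
};
```

A story could then look up `airportId("LAX")` instead of hard-coding the string, which keeps the pattern consistent with the wording of requirement 8.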

}
return false;
});
interactive.push(null as any);

Copilot AI Nov 20, 2025

Pushing null as any to the interactive array appears to be a hack to ensure the array is never empty. This could cause runtime errors when accessing elements. Consider returning an empty array or throwing an error if no elements are found, or document why null is being added.

Suggested change
- interactive.push(null as any);
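
One way to follow the suggestion above (throw when nothing matches rather than pad the array with `null`) is sketched below. This is not the code from FlightBooking.stories.tsx; `requireInteractive`, its parameters, and the error message are hypothetical, and the element type is left generic so the sketch runs without a DOM.

```typescript
// Hypothetical alternative to pushing a `null as any` sentinel: filter the
// candidates and fail loudly with a descriptive error when none qualify.
function requireInteractive<T>(
  candidates: T[],
  isInteractive: (el: T) => boolean,
  label: string,
): T[] {
  const interactive = candidates.filter(isInteractive);
  if (interactive.length === 0) {
    // A clear error here replaces a later, harder-to-read failure that
    // would occur when a test dereferences the null placeholder.
    throw new Error(`No interactive element found for "${label}"`);
  }
  return interactive;
}
```

With this shape the returned array never contains `null`, so callers can index into it without a cast, and a missing element surfaces at the lookup site.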

return byTestId;
}
const candidates = [
...screen.queryAllByTestId(testId),

Copilot AI Nov 20, 2025

Redundant query: screen.queryAllByTestId(testId) is called twice - once at line 37 and again at line 42 in the candidates array. The second call is unnecessary since we already checked if byTestId has results.

Suggested change
- ...screen.queryAllByTestId(testId),
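
The redundant-query point above can be illustrated with a small sketch: run the exact test-ID query once, return early on a hit, and only then consult the looser fallbacks. The function name `findLoosely` and the injected query functions are hypothetical stand-ins for the `screen.queryAllBy…` calls in the story file, chosen so the example runs without a DOM.

```typescript
// A query maps a test ID to zero or more matching elements.
type Query<T> = (testId: string) => T[];

// Hypothetical sketch: the exact query runs exactly once. If it finds
// nothing, its (empty) result is not re-queried; only fallbacks are used.
function findLoosely<T>(
  testId: string,
  byTestId: Query<T>,
  fallbacks: Query<T>[],
): T[] {
  const exact = byTestId(testId);
  if (exact.length > 0) {
    return exact;
  }
  // Collect fallback matches in order, dropping duplicates that several
  // fallback queries may return for the same element.
  const seen = new Set<T>();
  for (const query of fallbacks) {
    for (const el of query(testId)) {
      seen.add(el);
    }
  }
  return Array.from(seen);
}
```

Because the early return already covers the case where the exact query succeeds, including it again in the fallback list would only repeat a lookup that is known to be empty.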

JReinhold added a commit that referenced this pull request Nov 20, 2025
* enter prerelease mode on next branch

* Handle HTML in a separate file (#56)

* extract human-readable /mcp to maintainable html file

* upgrade tsdown, remove json treeshaking workaround

* add changeset

* fix tsdown types

* add changeset release branches to checks

* commit releases with gh api. see https://github.com/changesets/action#inputs

* Version Packages (next) (#57)

Co-authored-by: storybook-app-bot[bot] <175111413+storybook-app-bot[bot]@users.noreply.github.com>

* Replace Storybook canary versions with 10.1.0 prereleases (#59)

* upgrade to storybook 10.1.0-alpha.2

* changesets

* Version Packages (next) (#60)

Co-authored-by: storybook-app-bot[bot] <175111413+storybook-app-bot[bot]@users.noreply.github.com>

* Rename "examples" to "stories" in component manifest format (#61)

* Initial plan

* Rename "examples" to "stories" in component manifest format

- Updated type definitions in types.ts files to rename Example to Story
- Updated format-manifest.ts to use story terminology (story, story_name, story_description, story_code)
- Updated all fixture JSON files to use "stories" instead of "examples"
- Updated test files and descriptions to use "stories" terminology
- Updated test snapshots to reflect the new XML output format
- All tests passing, build and typecheck successful

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: Jeppe Reinhold <[email protected]>

* Support name in manifest errors (#55)

* add name to manifest errors

* add fixtures with errors

* add changeset

* improve test reports in ci

* update fixtures to use stories instead of examples

* more example -> story renaming

* Improve code quality and development setup (#64)

* replace prettier with oxfmt

* add oxlint for linting

* update actions using npx actions-up

* add publint

* fix types

* add check-everything script

* add build-storybook to check all

* split GH Workflows

* explain lint disables

* Version Packages (next) (#63)

Co-authored-by: storybook-app-bot[bot] <175111413+storybook-app-bot[bot]@users.noreply.github.com>

* Replace oxfmt with Prettier (#68)

* replace oxfmt with prettier

* typo

* Update to the latest SB alpha so the internal storybook version works with the server (#71)

Co-authored-by: Jeppe Reinhold <[email protected]>

* Revert Embed demo image from storybook.js.org#21 (#75)

* Evals (#69)

* add initial eval setup

* well, a lot happened here...

* add clack

* Add interactive prompts and styled output to eval CLI (#65)

* Initial plan

* Add interactive prompts and prettier output to eval CLI

Co-authored-by: JReinhold <[email protected]>

* Use tasks API for parallel evaluation steps

Co-authored-by: JReinhold <[email protected]>

* Apply oxfmt formatting to eval.ts

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: JReinhold <[email protected]>

* improve terminal experience

* save environment

* improve terminal experience

* only allow one eval at a time

* add support for custom context

* format

* add support for eval hooks, add log about how to rerun experiments

* prompt to start storybook at the end of the evaluation

* add message about getting into the experiment

* improve experiment dir name

* take screenshots of failed stories too

* cleanup

* improve reshaped stories, improve test+a11y summary, improve mcp server config arg

* support --[no-]storybook flag

* collect experiment description and branch name

* save result summary to google sheets

* improve plain prompt

* prompt for google sheets upload

* fix google sheets upload

* support "Storybook MCP" context, which starts up the docs-only @storybook/mcp server with a given component manifest

* Add basic Radix eval (#66)

* Add Radix eval

* Add Rsuite eval (#67)

---------

Co-authored-by: Jeppe Reinhold <[email protected]>

* format

* fix typechecking

* add reshaped component manifest

* add conversation-viewer.html with approximate token count

* cleanup

* add documentation, fixups

* format

* fix stories not having imports anymore

* fix plain and radix experiments

* experiments will have unique package names

* more eval test fixing

* more story fixes

* fix typecheck and lint summary

* improve conversation viewer

* simplify viewer content

* simplify viewer content

* result visualisations is via storybook

* upload to chromatic

* update google sheet row order

* add Chromatic link to CLI log

* add note about public results

* remove description arg from evals

* Evals: Add Radix UI website prompt (#74)

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: JReinhold <[email protected]>
Co-authored-by: Michael Shilman <[email protected]>

* Review Kasper (#70)

* Start review

* Fix

* More comments

* Fix config files and restructure

* Resolve conflicts

* Fix github actions

* Fix coverage

* Fix type error

* Fix

* Fix

* Dedupe

* Update packages/mcp/src/index.ts

Co-authored-by: Jeppe Reinhold <[email protected]>

* Update .github/workflows/check.yml

Co-authored-by: Jeppe Reinhold <[email protected]>

* Update .github/workflows/check.yml

Co-authored-by: Jeppe Reinhold <[email protected]>

* Improve get/post handling

* Dedupe vite

* lock file

* test perf of check-everything in CI

* rename

* rename

* Add turbo caching

* check cache invalidation

* refactor

* refactor

* refactor

* refactor

* Use node version file

* description

* refactor

* rollback

* use turbo for artifacts

* install node

* optimize

* install offline for faster symlinking

* optimize

* Check ci

* Only upload test results on failure

* Check github reporter

* Fix command

* Fix test

* Remove check everything

* test corepack enable

* test corepack enable

* test corepack enable

* fix

* Check if this is faster

* Check if this is faster

* no cache

* rollback

* Change nothing

* Fix prettier

* Modify changeset for MCP server GET responses

Updated the changeset to handle GET responses in the MCP server.

* Prettier

* use docker

* debug

* use node 24

* Try own caching

* Prune it

* Don't format pnpm lock

* Fix

* again

* use composite

* change

* Revert "change"

This reverts commit 8031a63.

* Revert "use composite"

This reverts commit 7f26a54.

* Revert "again"

This reverts commit 7fdccdf.

* Revert "Fix"

This reverts commit f4dd004.

* Revert "Don't format pnpm lock"

This reverts commit c11c4ec.

* Revert "Prune it"

This reverts commit 1009ad5.

* Revert "Try own caching"

This reverts commit 82eb804.

* Revert "use node 24"

This reverts commit c63f9ee.

* Revert "debug"

This reverts commit d647a91.

* Revert "use docker"

This reverts commit 766462e.

* Address feedback

* Initial plan

* Update README and Copilot instructions for script changes

Co-authored-by: JReinhold <[email protected]>

* Address feedback

* Make it loose

* Watch storybook by default

* Fix command

* Fix

* Add pnpm to ignore

* Fix dev command

* Cleanup

* get CI green

---------

Co-authored-by: Jeppe Reinhold <[email protected]>
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: JReinhold <[email protected]>

* Make `get-component-documentation` tool only accept a single component ID instead of multiple (#79)

* cleanup

* get-component-documentation only accepts a single component id

* Fix evals (#81)

* cleanup

* get-component-documentation only accepts a single component id

* fix versions

* use vitest cli instead of node for evals

* prefix experiment scripts so they are not picked up by turborepo

* Add toolset property to telemetry payloads in addon-mcp (#78)

* Initial plan

* Add toolset property to all telemetry payloads in addon-mcp

Co-authored-by: JReinhold <[email protected]>

* add changeset

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: JReinhold <[email protected]>
Co-authored-by: Jeppe Reinhold <[email protected]>
Co-authored-by: Jeppe Reinhold <[email protected]>

* remove source API and use the request instead (#54)

* remove source API and use the request instead

* cleanup

* add changesets

* add path argument to manifestProvider

* cleanup

* update changeset

* fix serve.ts

* cleanup

* Fix internal stdio-based MCP server (#85)

* allow undefined requests when using custom manifestProvider

* changeset

* add tests for internal stdio transport

* cleanup

* Add end-to-end tests and improve unit test quality (#84)

* add e2e tests

* improve e2e scripting

* add tests for mcp index

* add preset tests

* add telemetry tests

* simplify tool test mocks

* simplify mcp-handler tests, improve disableTelemetry handling

* add tests for manifest availability

* exclude evals from coverage

* cleanup

* changeset

* fix preset registering handlers instead of middlewares

* update tests to match changes in base branch

* cleanup

* await sb process kill

* globally mock storybook deps

* clean lock file

* Output in markdown instead of XML (#86)

* add e2e tests

* improve e2e scripting

* add tests for mcp index

* add preset tests

* add telemetry tests

* simplify tool test mocks

* simplify mcp-handler tests, improve disableTelemetry handling

* add tests for manifest availability

* exclude evals from coverage

* cleanup

* changeset

* fix preset registering handlers instead of middlewares

* update tests to match changes in base branch

* cleanup

* await sb process kill

* refactor formatter, splitting into markdown and xml, configurable, defaulting to markdown

* globally mock storybook deps

* clean lock file

* fix context arg

* fix tests

* fix types

* "Examples" -> "Stories", simplify tests

* simplify tests and types

* simplify

* use ts-like prop type docs format

* add script to clean experiments

* add changeset

* exit pre mode (#88)

* Update reshaped flight booking eval (#87)

* Update reshaped flight booking eval

* format

---------

Co-authored-by: Jeppe Reinhold <[email protected]>
Co-authored-by: Jeppe Reinhold <[email protected]>

* Version Packages (#80)

Co-authored-by: storybook-app-bot[bot] <175111413+storybook-app-bot[bot]@users.noreply.github.com>

---------

Co-authored-by: storybook-app-bot[bot] <175111413+storybook-app-bot[bot]@users.noreply.github.com>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Tom Coleman <[email protected]>
Co-authored-by: Michael Shilman <[email protected]>
Co-authored-by: JReinhold <[email protected]>
Co-authored-by: Kasper Peulen <[email protected]>

3 participants