Add comment about GHA bug
kevin940726 committed Feb 7, 2023
commit 239050c676e5de4925b7fe00f01219596946f409
.github/setup-node/action.yml (2 additions, 0 deletions)

@@ -47,6 +47,8 @@ runs:
shell: bash

- name: Rebuild dependencies
# Comparing against strings is necessary because of a known bug in composite actions.
# See https://github.com/actions/runner/issues/1483#issuecomment-1279933184
if: ${{ inputs.should-rebuild == 'true' }}
run: npm rebuild
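The string comparison in the step above is the safe pattern here. A minimal sketch of why, using the same `should-rebuild` input as this file (the bare-truthiness line is illustrative, not from the source):

```yaml
# Composite action inputs are always strings, and per the linked runner
# issue their handling in step-level `if` conditions is buggy, so a bare
# truthiness check is unreliable:
#
#   if: ${{ inputs.should-rebuild }}   # risky: 'false' is a non-empty string
#
# Comparing against the literal string sidesteps the problem:
- name: Rebuild dependencies
  if: ${{ inputs.should-rebuild == 'true' }}
  run: npm rebuild
  shell: bash
```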
kevin940726 (Member, Author) commented:
This is normally not needed because most of the packages don't have postinstall scripts. However, some react-native related packages do need them so we make this an opt-in.

This is not future-proof, as we might add a package that needs a postinstall script in the future. We can run npx can-i-ignore-scripts to manually inspect these cases.

dmsnell (Member) commented:

Can we just run npm ci above instead of npm ci --ignore-scripts?

Also, I wonder how much of the performance impact is related to --ignore-scripts; I might measure it and find out.

This seems like the bigger danger with this change: is there a way we can make it more obvious to someone whose PR is failing multiple test suites that the reason is that they need to supply should-rebuild: true?

I'm not sure how I would expect someone to start from the failure and work back up to this setting.

kevin940726 (Member, Author) replied:

Yeah, as I said above, it's not entirely "safe", but I think it's good enough for most cases. We will invalidate the cache whenever OS/Node/package-lock.json changes so it's more or less like what we have during local development.
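The cache-invalidation behavior described here could be sketched roughly as a workflow step like the one below (a hypothetical actions/cache step, not the actual one in setup-node; the exact key components are assumptions based on the description):

```yaml
- name: Cache node_modules
  uses: actions/cache@v3
  with:
    path: '**/node_modules'
    # Changing the OS, the Node version, or package-lock.json changes the
    # key, so a stale cache is never restored and `npm ci` runs fresh.
    key: node-modules-${{ runner.os }}-${{ env.NODE_VERSION }}-${{ hashFiles('package-lock.json') }}
```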

> also I wonder how much of the impact on performance is related to --ignore-scripts

Probably not much, because npm ci only runs when there isn't a cache hit, and most runs in our measurements had a cache hit.

However, npm rebuild does contribute some running time. We use it here to trigger any underlying postinstall scripts in the dependencies that get ignored by --ignore-scripts. Tested with can-i-ignore-scripts, most dependencies that have postinstall scripts are react-native-related (and local-development related), hence I think it makes sense to make this an opt-in for when we're building for react-native.

We could try to just use npm ci without --ignore-scripts and always run npm rebuild, though. I think that's unnecessary in most cases and will probably add 20 to 30 seconds of runtime, but it's technically "safer". Caching node_modules won't behave exactly like running npm ci every time, and I think that's okay.

> I'm not sure how I would expect someone to start from the failure and work back up to this setting.

I'd expect it to be pretty visible, as it would probably fail the CI and we can look into it then. I think it will only happen when a new dependency with a postinstall script gets added, though, which should be pretty uncommon.

dmsnell (Member) replied:

> Most runs in our measurements have a cache hit.

Where did you measure this? And what was the ratio for pull_request events in PRs?

> probably will add up 20 to 30 seconds of runtime but it's technically "safer"

It's usually faster to do the wrong thing 😉

I'm really nervous about introducing a very subtle behavior that could break things if you don't already happen to know it exists and remember to apply the workaround when it's appropriate.

> it would probably fail the CI and we can look into that

I agree here, but how would you propose someone moves from "my PR failed and I don't know why" to finding this flag?

dmsnell (Member) commented on Dec 6, 2022:

> I think that's unnecessary in most cases and probably will add up 20 to 30 seconds of runtime

I haven't investigated why yet, but if I did the test right, it doesn't seem to make much of a difference beyond a couple of seconds. There is not enough of an effect in the data collected so far to conclude that there is any measurable impact. I could have done something wrong, or the flag might not be having the impact we expect it to.

Granted, for the unit tests and end-to-end tests there's still the issue of the multi-modal distribution, which could be throwing off the statistics. I haven't investigated the impact that has on the results.

My test branch is #46318

Unit Tests workflow:

[chart: branch-compare-45932-46318]

Performance tests workflow:

[chart: branch-compare-45932-46318]

kevin940726 (Member, Author) replied:

> where did you measure this? and what was the ratio for pull_request events in PRs?

I looked directly into the details of each run and checked whether there was a cache hit. I only looked at this PR's runs, as this is where we're measuring.

> it's usually faster to do the wrong thing 😉

It depends on what you think is "wrong". I'm just saying that, based on my research, this is the trade-off I'm willing to take. It's obviously subjective, and it's totally "right" to think otherwise.

> but how is it that you would propose someone moves from "my PR failed and I don't know why" to finding this flag?

I think we should be able to catch this in code review. Just like how people develop on different OSes than the one CI uses, things are going to be different, and weird things happen in CI. We can also write a guide or add logging about this if necessary. I agree we should minimize breaking changes whenever possible, but I believe this is something that rarely breaks. However, if the added time from removing --ignore-scripts is so little that it doesn't matter, then I'll immediately stand on the other side of the fence 😝.

kevin940726 (Member, Author) commented:

I removed --ignore-scripts and npm rebuild in 1949d5f. Seems like it's still working nicely!

I updated the postinstall step to also run the postinstall scripts in the packages folder. This technically doesn't entirely match the behavior of npm ci, though, as it only runs in the packages folder and not recursively for all dependencies.

If the postinstall scripts only update the node_modules folder (which is what most dependencies do), then it should be safely cached. However, in react-native-editor, the i18n-cache files are generated both in node_modules and in the packages/ folder because of linking. During the tests, the i18n-cache files are read from the packages folder for some reason, hence we need to re-run the postinstall script to populate them again. I think this can be seen as an edge case, and there are possibly other ways to solve it too (for example, reading from node_modules as a fallback).
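As a rough sketch, "run the postinstall scripts in the packages folder" could look something like the workflow step below (the npm workspaces flags are an assumption for illustration, not necessarily how the actual step is implemented):

```yaml
- name: Run package postinstall scripts
  # Re-runs each workspace package's own postinstall after restoring the
  # cached node_modules. Unlike a full `npm ci`, this does not run scripts
  # recursively for all transitive dependencies.
  run: npm run postinstall --workspaces --if-present
  shell: bash
```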

shell: bash