Skip to content

Conversation

@samouri
Copy link
Contributor

@samouri samouri commented Aug 29, 2017

This may be me misunderstanding how to use the createSelector helper function, but my current impression is that we are over-invalidating the cache. Ideally we could invalidate the cache for specific posts when the comments for a post are modified, but thats not possible with the current implementation of createSelector.

Something I am seeing regularly now is:

1. getTree( postId1 )
2. getTree( postId2 ) // this request clears the cache
3. getTree( postId1 ) // this ideally would use the cache if the comments for id1 never changed

This PR seeks to invalidate the cache when state.comments.items changes instead of when the post selected for changes

@samouri samouri added [Feature] Comments Comments on posts and the admin screen for managing them. [Feature] Reader The reader site on Calypso. [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. labels Aug 29, 2017
@samouri samouri self-assigned this Aug 29, 2017
@matticbot
Copy link
Contributor

@matticbot matticbot added the [Size] S Small sized issue label Aug 29, 2017
@samouri samouri requested review from a team August 29, 2017 00:51
@samouri samouri changed the title Comments: invalidate createSelector cache based on state.comments.ite… Comments: invalidate createSelector cache based on state.comments.items instead of posts Aug 29, 2017
@dmsnell
Copy link
Member

dmsnell commented Aug 29, 2017

we are over-invalidating the cache

the simple answer is that no we aren't 😉

a better assessment would be that we are poorly documenting the behavior or choosing a poor wording to describe it. it's not really a memoized function; we have a function which caches its last execution and uses a custom cache invalidator based on reference equality. if it truly were memoized it would balloon in memory.

in Simplenote Electron in a now-closed PR I created a wrapper function memoizeSingle because I wasn't aware of the appropriate name for such a function. in that case too I was still basing the cache off of the inputs, which I think is much closer to the concepts "memoization" normally evokes.


my advice here is to think through and list out the tradeoffs in this design (and without it). those tradeoffs and the reason for choosing this option belong in the JSDoc comment above the selector.

true memoization would balloon in memory because it would store the results of every invocation of this function (and it would use the inputs as the cache key for retrieval, not for invalidation)

using createSelector() on the site and post id flushes its cache too often to save much time (do you know this or assume this?)

are we guarding against re-renders? against updates from the server?

would a capped cache better serve than a single-cache? I'm a vocal critic of createSelector() so be aware of my bias. however, I find that nuances in its behavior lead to situations like this; it's worth considering creating a basic and easily-testable reducer then wrapping it/composing it in a separate and easily testable function which adds the appropriate caching behavior once you've identified and documented what that behavior ought to be and why

@samouri
Copy link
Contributor Author

samouri commented Aug 29, 2017

we are over-invalidating the cache

I stand by my assertion that there exists a cache and we are invalidating it too frequently. I have a feeling that the current dependent was actually a mistake and based on a misunderstanding of how createSelector works.

The change I'm proposing here does involve a classic space vs. time tradeoff. Its a necessary trade for conversations tool...but i'll get to that in a bit

createSelector creates memoized functions with custom cache clearing mechanisms. cache keys are by default based on the args provided (memoization) but the cache is cleared when dependents have changed.


getPostCommentsTree Breakdown

current:

  • cacheKey: [ state, siteId, postId, status ]
  • dependents (cache clear): [ state.comments.items[ siteId + '-' + postId] ]
  • performance characteristics:
    • space: this will always hold exactly zero or one cached result and never grow beyond that.
    • time: repeatedly requesting the comments tree for a specific ( siteId, postId ) is optimized and will be cached as long as the site's comments haven't been modified in any way.
    • weakness: nothing will ever be cached if we don't select for the same postId. so when trying to display something like a stream of commentTrees (hint hint conversations), this doesn't help at all

proposed:

  • cacheKey: same
  • dependents: [ state.comments.items ]
  • performance characteristics:
    • space: this will hold up to as many trees as there are posts in the redux comments store. Since it gets cleared each time comments are added, I can't imagine this getting too large.
    • time: any request for the same ( siteId, postId ) will be cached as long as comment.items is unmodified.

Background on why I'm making this:
while developing the conversations tool over here, my cpu would hang at 100% for over 20 seconds until things calmed down. Turned out it was because of many components rendering and clearing the cache in quick succession. breaking the cache meant this expensive selector needlessly repeated the same work in order to return an equivalent object that was referentially different meaning a rerender would be necessary which in turn repeated this loop ad-nasuea (because PostComments nest other PostComments)

I'd also argue that this PR simplifies the dependent rather than making it more complex. I would have wanted a comment for the old version explaining the counterintuitive behavior whereas the version in this PR follows the fairly standard pattern for using createSelector

Copy link
Contributor

@gwwar gwwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changes here make sense and they test well. 👍

If we still run into perf problems we may want to consider our stance on either normalization of comments (maybe we should add a few other reducers to help us, keeping in mind that we should take care if we ever persist any of this data) or if we need a more custom caching solution.

@samouri
Copy link
Contributor Author

samouri commented Aug 29, 2017

Discussed with @dmsnell on slack, he gave me a verbal OK.

@samouri samouri merged commit d4c4126 into master Aug 29, 2017
@samouri samouri deleted the fix/comments/tree-caching branch August 29, 2017 19:44
@matticbot matticbot removed the [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. label Aug 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Comments Comments on posts and the admin screen for managing them. [Feature] Reader The reader site on Calypso. [Size] S Small sized issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants