Skip to content

Conversation

@mrow4a
Copy link
Contributor

@mrow4a mrow4a commented Jun 30, 2018

Problem Description

The same logic for calling stat in objectstore compared to filesystem implementation is inefficient.

This is due to the fact that objectstore does not implement any filesystem functionality which stores information about file, and thus needs to resort to using filecache.

Solution

statcache

This PR implements stat cache similar to filesystem to temporarly cache results between calls to filesystem. Each request to modify filesystem clears the stat cache. Additionaly, we store only cache for objects, not for file tree (directories).

How did I test

I did:

  • upload new file
  • overwrite new file
  • again overwrite (generate next version)

Master:
diagnostic-master

Obj Stat:
diagnostic-obj

This PR thus cuts 20% of read queries required for objectstore.

Related PRs

Due to the fact there is no decision on introducing "cache of filecache", this PR is just fixing the code logic.

@PVince81 @DeepDiver1975 @butonic

@codecov
Copy link

codecov bot commented Jul 1, 2018

Codecov Report

Merging #31958 into master will increase coverage by 0.01%.
The diff coverage is 98.46%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #31958      +/-   ##
============================================
+ Coverage     64.06%   64.08%   +0.01%     
- Complexity    18306    18313       +7     
============================================
  Files          1192     1192              
  Lines         69135    69163      +28     
  Branches       1277     1277              
============================================
+ Hits          44292    44320      +28     
  Misses        24471    24471              
  Partials        372      372
Flag Coverage Δ Complexity Δ
#javascript 52.91% <ø> (ø) 0 <ø> (ø) ⬇️
#phpunit 65.37% <98.46%> (+0.01%) 18313 <22> (+7) ⬆️
Impacted Files Coverage Δ Complexity Δ
...b/private/Files/ObjectStore/ObjectStoreStorage.php 85.46% <98.46%> (+1.53%) 100 <22> (+7) ⬆️
...eratedfilesharing/lib/Controller/OcmController.php 66.26% <0%> (+0.2%) 30% <0%> (ø) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a3bd931...e436abb. Read the comment docs.

@mrow4a mrow4a force-pushed the obj_meta branch 2 times, most recently from f057ac6 to df769f0 Compare July 1, 2018 17:21
@PVince81
Copy link
Contributor

PVince81 commented Jul 2, 2018

this is likely not enough, you also need logic to invalidate the entries whenever other file operations are called.

remember that many high level file operations might involve more than a single low level FS operation

here's an example stat cache that was added for external storage: https://github.com/owncloud/core/blob/v10.0.8/lib/private/Files/Storage/DAV.php#L78 and https://github.com/owncloud/core/blob/v10.0.8/lib/private/Files/Storage/CacheableFlysystem.php#L32

@mrow4a
Copy link
Contributor Author

mrow4a commented Jul 2, 2018

@PVince81 the stat cache gets set and invalidated in the same function - getMetadata. There is no other place in code to set the cache - so cache will leave only for duration of that function

@PVince81
Copy link
Contributor

PVince81 commented Jul 2, 2018

so are we assuming that all file operations like $objectStore->unlink() and $objectStore->rename() will call getMetadata() again to invalidate and refresh the stat cache info ? it might be the case now but there is no guarantee it will be in the future

@mrow4a
Copy link
Contributor Author

mrow4a commented Jul 3, 2018

@PVince81 so your suggestion is to have a look how stat cache is realized in other implementations?

@PVince81
Copy link
Contributor

PVince81 commented Jul 4, 2018

so your suggestion is to have a look how stat cache is realized in other implementations?

that or think about stat cache invalidation for every file operation that go through the storage

@mrow4a
Copy link
Contributor Author

mrow4a commented Aug 7, 2018

@butonic what do you think about approach suggested by @PVince81 ?

@mrow4a
Copy link
Contributor Author

mrow4a commented Aug 12, 2018

@PVince81 @DeepDiver1975 @butonic @tomneedham all tests passing - I also updated the PR description with the context. I made a stat caching similar to the one for dav.

@butonic
Copy link
Member

butonic commented Aug 20, 2018

We are gluing things to different places, where the right place would be the filecache. That #28166 requires touching core all over the place is caused by other places bypassing the filecache. We can patch around that, but the problem does not go away that way. Fix the places the bypass the filecache first.

I don't have the energy to look into filecache related PRs, sorry.

@mrow4a
Copy link
Contributor Author

mrow4a commented Aug 20, 2018

@butonic this does not touch filecache at all. It is a layer below, storage stat cache - please look a description. It is used ONLY when using storage calls, the same as for filesystem, no filecache involved here.

CC @PVince81 @DeepDiver1975

@mrow4a
Copy link
Contributor Author

mrow4a commented Aug 20, 2018

@butonic and to mention further, this PR is only to make objectstore work with similar performance as filesystem, no filecache "caching" is here in place, I just cache results of the filesystem calls wrapped by objectstorage

$stat = $cacheEntry->getData();
if ($cacheEntry->getMimeType() != 'httpd/unix-directory') {
// Only set stat cache for objects
$this->objectStatCache->set($path, $stat);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DeepDiver1975 @butonic note that I only cache stat calls for objects metadata, not a file-tree (filecache trees)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

additionally, file need to exists to be cached (I don't cache e.g. false)

return null;
}

private function rmObjects($path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why has this method been moved down ? this makes it difficult to review as now we can't see whether you made change to it

if you want to reformat a file please do so in a separate commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use diff -> split. But I can do it in separate commit, should I?

@PVince81
Copy link
Contributor

I agree with @mrow4a that a stat cache on the storage class is useful to avoid repeated real-storage access as higher level OC code tends to repeat operations (like file_exists, etc). We already use this approach for other storages like DAV, Dropbox, etc.

Copy link
Member

@DeepDiver1975 DeepDiver1975 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine in merging this to master and get an understanding how this works.

We shall rebase and rerun the tests ....

@mrow4a
Copy link
Contributor Author

mrow4a commented Sep 18, 2018

@DeepDiver1975 reviewed again, and rebased

@mrow4a
Copy link
Contributor Author

mrow4a commented Oct 10, 2018

@DeepDiver1975 @PVince81 any decision on this?

Copy link
Member

@butonic butonic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the conservative approach. It at least caches stuff for objectstorage.

@DeepDiver1975 DeepDiver1975 self-assigned this Nov 7, 2018
@DeepDiver1975 DeepDiver1975 dismissed butonic’s stale review November 7, 2018 21:53

CappedMemoryCache is used now as requested .... moving on .....

@DeepDiver1975 DeepDiver1975 merged commit 5fbc0cb into master Nov 7, 2018
@DeepDiver1975 DeepDiver1975 deleted the obj_meta branch November 7, 2018 23:05
@mrow4a
Copy link
Contributor Author

mrow4a commented Nov 8, 2018

@DeepDiver1975 @butonic I will keep an eye on smashbox with objectostrage tests for this change.

@lock lock bot locked as resolved and limited conversation to collaborators Nov 8, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants