Skip to content

Conversation

@sakertooth
Copy link
Contributor

@sakertooth sakertooth commented Sep 14, 2024

Adds support for caching resources from files. Currently, we use the new file cache for SampleBuffer's and SampleThumbnail's, but it can be extended to other resources in the future. Uses the LRU eviction policy.

Should supersede #7058 I believe.

Copy link
Member

@messmerd messmerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave it a quick look-over.

Also,

  • Sample's move constructor and move assignment operator should be marked noexcept
  • Sample::s_interpolationMargins should be renamed to Sample::InterpolationMargins since it is public
  • sample_rate_t could be used instead of int for the sample rate (See this commit for what needed to be changed: 10162ec)

@sakertooth
Copy link
Contributor Author

sakertooth commented Nov 13, 2024

Sample's move constructor and move assignment operator should be marked noexcept
Sample::s_interpolationMargins should be renamed to Sample::InterpolationMargins since it is public
sample_rate_t could be used instead of int for the sample rate (See this commit for what needed to be changed: 10162ec)

Thanks, I will make another PR to address these issues separately. However, about the int -> sample_rate_t change, I never really understood why we have those types when basic types like int will work just as well. We already have to convert the sample rate to an int when processing the project file anyways. I can't really see the benefit using sample_rate_t brings, and it adds more complexity than int does (not in terms how long the type is.. well maybe, but more so having to know what is the actual type is that we are dealing with here on top of adding more code that we don't really need in the codebase).

@sakertooth sakertooth changed the title Add a database for samples Add a cache for samples Nov 13, 2024
@sakertooth
Copy link
Contributor Author

Any updates @messmerd?

@messmerd
Copy link
Member

This PR looks pretty good to me code-wise. I'll test it next.

@Spacemagehq
Copy link

I'm testing, and I'm finding no issues or bugs so far on Windows 11. I had 20 samples and samples randomly throughout lmms and the memory and the storage seems fine and not affected. The CPU meter inside lmms is not going up by that much and not heavy on the cpu.

@sakertooth
Copy link
Contributor Author

sakertooth commented Feb 8, 2025

@messmerd, I believe I understand the benefit of using UUIDs better now. If the file path changes, then we only would need to update the mapping from UUIDs to file paths at one place, while all the other clients don't have to worry about anything and can continue using the UUID, which hasn't changed.

As of right now my PRs handles it using the "last modified time", which has two problems. One, this can create a lot of dead entries if they are not being actively "garbage collected". Two, since clients still store the file path, if it changes, they would have to still manage the sample by themselves, and that would have to be done possibly everywhere. This is a problem in #7366, where we have check if the Thumbnail needs to be updated or a new one to be created because we store the file paths directly, which can become invalidated at any point on the file system.

It seems like the overall idea here is to map UUIDs to assets/resources, which can be a sample file, sample Bas64 string, project file, preset file, etc. Assets can then specify if they are loaded from disk, or something else like as a string in the case of Base64 samples. Each asset can be updated as necessary to when the asset manager/cache feels like it should (changes on file system or something else possibly). This will not only bolster the sample caching implementation since it truly centralizes everything, but will also help make forward strides in simplifying and improving asset management (which may be needed for features like showing a popup to the user to load missing assets when loading the project, among other things).

I am going to try to implement some of these ideas in a new PR (actually I might just do it here instead).

@messmerd
Copy link
Member

messmerd commented Feb 8, 2025

@sakertooth Yep, that's exactly it.

This PR doesn't make any changes to the project file as far as I'm aware, so it won't introduce any backwards incompatible changes if we merge it now. And since this PR is very useful as is, I think we should merge it then explore the UUID / resource manager idea in a follow-up PR.

@sakertooth
Copy link
Contributor Author

I'll at least move to using QFileSystemWatcher in the current implementation to fix some of the problems mentioned by actively keeping the table in sync with the file system. Other than that I think this is safe to merge. I agree that the UUID asset idea might need to be explored in a different PR since its implementation is far more lengthy than what I have here, and I remember you were planning to do this already to some extent.

@messmerd messmerd mentioned this pull request Feb 9, 2025
@sakertooth
Copy link
Contributor Author

sakertooth commented Feb 9, 2025

I read up on the docs for QFileSystemWatcher

The act of monitoring files and directories for modifications consumes system resources. This implies there is a limit to the number of files and directories your process can monitor simultaneously. On all BSD variants, for example, an open file descriptor is required for each monitored file. Some system limits the number of open file descriptors to 256 by default. This means that addPath() and addPaths() will fail if your process tries to add more than 256 files or directories to the file system monitor. Also note that your process may have other file descriptors open in addition to the ones for files being monitored, and these other open descriptors also count in the total. macOS uses a different backend and does not suffer from this issue.

I was a bit scared off making the switch because of this and had to weigh the pros and cons, so I'll probably leave the timestamp checking to be safe as it works well enough. A counter argument to having dead paths in the caches is that its rare and have static duration, so it really not that big of a problem as I made it out to be. The main issue is exposing the file path information that has to be made in sync everywhere else in the codebase, but we will deal with this later as already discussed.

One change I should make though is that instead of making completely new entries, update preexisting ones if they are fetched a second time and are still valid on the file system. This should keep the number of invalid entries in the cache down by a fair amount.

@sakertooth sakertooth marked this pull request as draft February 9, 2025 05:36
@sakertooth sakertooth marked this pull request as ready for review February 9, 2025 09:27
@sakertooth sakertooth marked this pull request as draft February 12, 2025 13:21
@sakertooth sakertooth marked this pull request as ready for review February 16, 2025 00:29
@sakertooth sakertooth marked this pull request as draft March 17, 2025 11:51
@sakertooth sakertooth force-pushed the add-sample-database branch from 83ee767 to 90c3916 Compare March 19, 2025 13:56
@sakertooth sakertooth marked this pull request as ready for review March 19, 2025 16:25
@sakertooth sakertooth marked this pull request as draft March 22, 2025 21:49
@sakertooth sakertooth changed the title Add a cache for samples Add a cache for samples, transition to use of std::filesystem::path Mar 22, 2025
@sakertooth
Copy link
Contributor Author

Hi @messmerd, I scrapped the PR again and implemented it similar to how I did it the first time but with a different approach: I made a new class (FileCache) that can cache any kind of type originating from a file. Currently, only SampleBuffer and SampleThumbnail store a FileCache<SampleBuffer> and FileCache<SampleThumbnail> respectively in the private section of their declaration.

Since FileCache uses std::filesystem::path and we eventually want to migrate to it, I figured I wouldn't convert to and from a QString within the core, so I have a few changes that span a couple of files to make them use std::filesystem::path instead. I also added new PathUtil::fsConvert overloads to convert to and from a QString/std::filesystem::path.

I did not cache Base64, but it can be added if necessary. Reason being is that Base64 is the data itself (as you know) and isn't really a key in the cache like a file path is. Since there really isn't any good key for that kind of data (except hashing it, which can be slow) as of right now, I have ignored it.

I don't plan on going against the grain from this point, apologies if there was confusion on my end about this. I suppose I made things more complicated than they had to be. This PR is basically done on my end.

@sakertooth sakertooth marked this pull request as ready for review March 23, 2025 00:25
@sakertooth sakertooth changed the title Add a cache for samples, transition to use of std::filesystem::path Add a cache for samples Mar 28, 2025
@sakertooth sakertooth marked this pull request as draft April 16, 2025 11:56
@sakertooth sakertooth force-pushed the add-sample-database branch from 91aa0eb to ef0b9de Compare April 16, 2025 12:28
@sakertooth sakertooth marked this pull request as ready for review April 16, 2025 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants