Skip to content

Conversation

@ChristophWurst
Copy link
Member

On medium and large installations there can be a lot of login data. Loading it can exhaust the available PHP memory.

This limits the impact in two steps

  1. Limit number of DB rows read
  2. Limit number of datasets generated from DB results (DB results are deduplicated and we re-duplicate in memory)

master: I was not able to run occ suspiciouslogin:train --v6 --dry-run -vvv with a memory limit of 512M (Nextcloud default)
here: I can run occ suspiciouslogin:train --v6 --dry-run -vvv successfully with a memory limit of 256M

@ChristophWurst ChristophWurst added bug Something isn't working 2. developing labels Apr 11, 2025
@ChristophWurst ChristophWurst self-assigned this Apr 11, 2025
@ChristophWurst ChristophWurst marked this pull request as ready for review April 16, 2025 06:59
));
))
->orderBy('last_seen', 'DESC') // Use most recent data in case of limiting
->setMaxResults(3_000); // More data will like exhaust memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL about this notation :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they took inspiration from Rust :)

if ($validationPositives->count() > self::MAX_SAMPLES_VALIDATE_POSITIVES) {
$threshold = (self::MAX_SAMPLES_VALIDATE_POSITIVES / $validationPositives->count()) * 100;
$validationPositives = $validationPositives->filter(fn () => random_int(0, 100) <= $threshold);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probabilistic limit, nice!

@ChristophWurst
Copy link
Member Author

/backport to stable31

@ChristophWurst
Copy link
Member Author

/backport to stable30

@ChristophWurst ChristophWurst merged commit 22c93f5 into master Apr 16, 2025
29 checks passed
@ChristophWurst ChristophWurst deleted the fix/avoid-memory-exhaustion branch April 16, 2025 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4. to release bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants