Skip to content
This repository was archived by the owner on Jan 10, 2023. It is now read-only.

Conversation

@maglnet
Copy link

@maglnet maglnet commented Apr 13, 2014

Hi,

i had some trouble running phpcpd in an environment where memory_limit was not big enough to run phpcpd over a big codebase. So i stored all hashes within a sqlite file to reduce the memory usage.

With the provided changes it is possible to switch the default storage (internal array) to a temporary sqlite database. Of course this makes the detection slower by using a sqlite database on the filesystem, but you can run phpcpd with a very small memory usage.

Here are some examples / comparisons running phpcpd on zendframework2 code:

Running with memory (default) storage (fast but a lot of memory used):

$ php phpcpd --progress --min-tokens=70 --min-lines=5 vendor/zendframework/
phpcpd 2.0.1-3-ge862208 by Sebastian Bergmann.

[...]

2.49% duplicated lines out of 303394 total lines of code.

Time: 12.92 seconds, Memory: 177.25Mb

Running with sqlite storage (slower but less memory used):

$ php phpcpd --progress --min-tokens=70 --min-lines=5 --hash-storage=SQLite vendor/zendframework/
phpcpd 2.0.1-3-ge862208 by Sebastian Bergmann.

[...]

2.49% duplicated lines out of 303394 total lines of code.

Time: 36.47 seconds, Memory: 16.25Mb

Additionally it is now very easy to add another HashStorageAdapter by providing a class that implements HashStorageInterface.

Best regards,
Matthias

@jk
Copy link

jk commented Apr 15, 2014

👍

@jpetitcolas
Copy link
Contributor

Nice! What is the status of this PR? Does it need anything else to be merged? :)

@jk
Copy link

jk commented Aug 28, 2014

@jpetitcolas I'm a colleague of @maglnet and we use this version internally in our ci environment. I can't speak for @maglnet but from my POV it is complete an ready to be merged. Albeit it is debatable if this should be configurable, so when you got a small project you can gain some performance to do all in memory.

@maglnet
Copy link
Author

maglnet commented Aug 28, 2014

@jpetitcolas Yes, from my POV it's ready to be merged, too.
@jk Perhaps i misunderstood you, but it is configurable. See the --hash-storage=SQLite argument passed to phpcpd in my first comment.

@sebastianbergmann could tell us more about the status, he's propably waiting for the next major/minor release? Or should i adjust the PR in some way?

@jpetitcolas
Copy link
Contributor

Yes, indeed, it is configurable and everything seems fine. Would be nice to see this PR merged to get the extra option in the official release. :)

@jk
Copy link

jk commented Aug 28, 2014

Oops, than I didn't say anything. 👍

@jk
Copy link

jk commented Jan 21, 2015

@sebastianbergmann Anything new in regards of your tweet: https://twitter.com/s_bergmann/status/535472638072995841 ?

@maglnet
Copy link
Author

maglnet commented Nov 4, 2015

Just for me, as a reminder:
The build also fails for master branch, so I need to merge, as soon as the problems in master are fixed.

@sebastianbergmann
Copy link
Owner

I am sorry that I did not get a chance to look at this in over two years. If you are still interested in this then please open a new pull request against the current master.

@jk
Copy link

jk commented Jan 24, 2017

@sebastianbergmann Really disappointed by this behaviour. If we will do the work for a new PR, how high is the chance that you will find the time to look at it this time or should we just refrain from the idea since it does not resonate with you?

@sebastianbergmann
Copy link
Owner

I promise to do my best to review this during the next weekend after you send the pull request.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants