Contributor

@jmurty jmurty commented Jun 26, 2014

The '--filenames' option makes it easy to view the filename represented
by a git-fat object reference, at the cost of a slight performance
and memory hit compared to the plain git-fat status command.

This option is most helpful when you are thinking about running
git-fat gc to clean up some garbage/unreferenced objects, so you
can check what you are about to delete.

  • Add referenced_objects_with_filenames() method that (optionally)
    stores file name data while looking up git-fat referenced objects.
  • Refactor referenced_objects() method to use the above method while
    providing existing interface.
  • If '--filenames' option is given to the status command, print
    filename(s) next to git-fat object hash values.
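
A minimal sketch of the structure the list above describes, not the actual git-fat code. It assumes the lookup has already produced `(fat_hash, filename)` pairs; the `entries` parameter, the `with_filenames` flag, and the `format_status_lines()` helper are hypothetical names used only for illustration.

```python
def referenced_objects_with_filenames(entries, with_filenames=False):
    """Map each referenced git-fat hash to the set of filenames that use it.

    `entries` is a hypothetical iterable of (fat_hash, filename) pairs.
    When `with_filenames` is False the sets stay empty, so no name data is kept.
    """
    referenced = {}
    for fat_hash, filename in entries:
        names = referenced.setdefault(fat_hash, set())
        if with_filenames:
            names.add(filename)
    return referenced


def referenced_objects(entries):
    """Existing interface: only the set of referenced hashes."""
    return set(referenced_objects_with_filenames(entries))


def format_status_lines(hashes, referenced):
    """Render each hash, followed by its filename(s) when any are known."""
    lines = []
    for fat_hash in sorted(hashes):
        names = sorted(referenced.get(fat_hash, ()))
        lines.append(" ".join([fat_hash] + names))
    return lines
```

Keeping `referenced_objects()` as a thin wrapper means callers that only need the hashes are unaffected, while the status command can opt in to the extra filename bookkeeping.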

@abraithwaite

Hey @jmurty, I did something similar to this in our fork with git fat list.

Your solution of storing hash->filename strings in a dict is the one I tried first too, but you quickly run out of memory on medium-to-large repositories. We implemented it by running through rev-list twice, which isn't ideal but is better than nothing at all.
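
One possible reading of the two-pass idea, sketched under assumptions rather than taken from the fork: `walk_objects` is a hypothetical zero-argument callable that yields `(git_hash, filename)` pairs and stands in for one run of rev-list, and `is_fat_blob` is a hypothetical predicate for git-fat placeholder blobs.

```python
def list_fat_files(walk_objects, is_fat_blob):
    """Yield (git_hash, filename) for git-fat blobs using two passes.

    Only the hashes of interest are held in memory between passes;
    filenames are streamed out during the second pass.
    """
    # Pass 1: remember only the hashes of interest, no filenames.
    wanted = {git_hash for git_hash, _ in walk_objects() if is_fat_blob(git_hash)}
    # Pass 2: walk the objects again and emit names for just those hashes.
    for git_hash, filename in walk_objects():
        if git_hash in wanted:
            yield git_hash, filename
```

The trade-off is the one the comment names: the object walk runs twice, but the memory held between passes is bounded by the number of git-fat blobs rather than by every path in the history.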

Throw away mappings of git hash value to filename(s) for objects that
are not relevant to git-fat. Since we can do this clean-up during
processing, this change should minimise the memory cost of using the
--filenames option since uninteresting filenames are no longer stored.
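
A rough sketch of that clean-up, assuming the relevance of each object becomes known one result at a time. The names `prune_filenames`, `names_by_hash`, and `check_results` are hypothetical and do not come from the git-fat source.

```python
def prune_filenames(names_by_hash, check_results):
    """Drop filename mappings for objects that are not git-fat blobs.

    `names_by_hash` maps git hash -> list of filenames and is mutated in place.
    `check_results` is a hypothetical iterable of (git_hash, is_fat) pairs,
    consumed one at a time so each irrelevant entry is discarded as soon as
    its result is known.
    """
    for git_hash, is_fat in check_results:
        if not is_fat:
            # pop() with a default tolerates hashes that were never recorded.
            names_by_hash.pop(git_hash, None)
    return names_by_hash
```

With this shape, what remains at the end is only the filenames of interest; how much is held in the meantime depends on how far the recording step runs ahead of the checks.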
Contributor Author

jmurty commented Jun 29, 2014

Thanks for the feedback @abraithwaite.

I have just added a small improvement to the proposed feature that clears out uninteresting filenames during processing. This should massively reduce memory consumption, or at least use no more memory than is really necessary to store the filenames of interest.

Are you able to test this improvement against a medium- or large-size repository to see if it survives?


I haven't tried this yet, but it looks like this part will fail on filenames that contain spaces.
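
The code under review is not shown here, so the following is only a guess at the failure mode. It assumes the code parses `git rev-list --objects` output, where each line is an object hash optionally followed by a space and a path. Splitting such a line on all whitespace would cut a path like `my big file.bin` into pieces; splitting on the first space only avoids that. `parse_object_line` is a hypothetical helper name.

```python
def parse_object_line(line):
    """Split one '<hash> <path>' line into (hash, path).

    Only the first space is treated as a separator, so spaces inside the
    path are preserved. Lines without a path yield an empty path string.
    """
    git_hash, _, path = line.rstrip("\n").partition(" ")
    return git_hash, path
```

For comparison, `"abc123 my big file.bin".split()` produces four fields, so code that takes the second field as the filename would see only `my`.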
