gh-121171: Fix zip64 extract version in local headers #121177
danifus wants to merge 3 commits into python:main
If the total archive size or file count in the archive exceeded the zip64 thresholds, the zip64 minimum extract version was not being written to the local header. The central header had the correct version.
@jaraco are you able to do a review on this bug fix? Thanks!
    self.fp.seek(self.start_dir)
    zinfo.header_offset = self.fp.tell()
    # exceptions raised in _writecheck if _allowZip64 is False
    zip64 = (
You could use |= here and save a line.
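As a rough illustration of that suggestion, the two spellings can be compared side by side. This is a sketch only: the function names are hypothetical, the constant is redefined locally rather than imported, and it assumes `zip64` is already a `bool` at that point in the code.

```python
# Local stand-in for zipfile.ZIP64_LIMIT, redefined here for illustration.
ZIP64_LIMIT = (1 << 31) - 1


def with_or(zip64: bool, header_offset: int) -> bool:
    # Spelling used in the diff: rebuild the flag with an explicit `or`.
    zip64 = (
        zip64
        or header_offset > ZIP64_LIMIT
    )
    return zip64


def with_ior(zip64: bool, header_offset: int) -> bool:
    # Suggested spelling: fold the new condition in with `|=`.
    zip64 |= header_offset > ZIP64_LIMIT
    return zip64


for flag in (False, True):
    for offset in (0, ZIP64_LIMIT, ZIP64_LIMIT + 1):
        assert with_or(flag, offset) == with_ior(flag, offset)
```

The two forms agree when every operand is a `bool`. If the flag could be `None`, `|=` would raise `TypeError`, whereas `or` would not.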
    # exceptions raised in _writecheck if _allowZip64 is False
    zip64 = (
        zip64
        or zinfo.header_offset > ZIP64_LIMIT
I explicitly approve of the following line using >=, because it makes the presence of the zip64 extra unambiguous. You might make this line >= as well, for the same reason. I think the functionality lost (being able to produce this precise file with _allowZip64=False) is minimal.
I don't know if this was copy-pasted from the central directory logic; it would be nice if the two matched, since this PR is fixing the inconsistency.
Yeah, I copied the conditionals from:
cpython/Lib/zipfile/__init__.py
Lines 1860 to 1868 in 182e035
(filesize is handled a few lines above the added code)
This is the line with the different operator that existed before this PR:
cpython/Lib/zipfile/__init__.py
Line 2065 in 182e035
Good catch. I'd be happy either way as long as we use the same one everywhere.
Given ZIP_FILECOUNT_LIMIT = (1 << 16) - 1 (1 less than the actual limit of 0xFFFF), I would be inclined to change it to > to be consistent with how all the other checks against ZIP64_LIMIT and ZIP_FILECOUNT_LIMIT are written (we also get to cram one more file in :p), but I don't feel that strongly about it :)
I don't follow the rest of the logic:
>>> hex((1<<16) - 1)
'0xffff'
ah whoops. (1<<16) - 1 is not one less than the limit - it is the limit :p
I still have a small preference that it should be changed to >, so that we can store the maximum number of files allowed without the zip64 extra field, and for consistency with all the other checks against ZIP64_LIMIT, ZIP_FILECOUNT_LIMIT and ZIP_MAX_COMMENT that use >.
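To make the difference between the two operators concrete, here is a small sketch that compares them at the boundary. The two predicate functions are hypothetical and only model the comparison itself, not where in zipfile the check runs or whether the count includes the entry being added.

```python
import zipfile

limit = zipfile.ZIP_FILECOUNT_LIMIT  # (1 << 16) - 1


def needs_zip64_exclusive(entry_count: int) -> bool:
    # ">" form: exactly `limit` entries still fit without zip64.
    return entry_count > limit


def needs_zip64_inclusive(entry_count: int) -> bool:
    # ">=" form: zip64 is already required at exactly `limit` entries.
    return entry_count >= limit


for count in (limit - 1, limit, limit + 1):
    print(count, needs_zip64_exclusive(count), needs_zip64_inclusive(count))
```

The two forms disagree only at exactly 0xFFFF entries. That is the single case being debated: > allows one more entry without zip64, while >= avoids writing a count equal to 0xFFFF, which is the ambiguity mentioned earlier in the review.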
This PR is stale because it has been open for 30 days with no activity.
Fix: Wrong extract version set in local headers for files located beyond zip64 limit #121171