Skip to content

gh-121171: Fix zip64 extract version in local headers#121177

Open
danifus wants to merge 3 commits intopython:mainfrom
danifus:repro_121171
Open

gh-121171: Fix zip64 extract version in local headers#121177
danifus wants to merge 3 commits intopython:mainfrom
danifus:repro_121171

Conversation

@danifus
Copy link
Copy Markdown
Contributor

@danifus danifus commented Jun 30, 2024

Fix: Wrong extract version set in local headers for files located beyond zip64 limit #121171

danifus added 2 commits June 30, 2024 16:48
If the total archive size or file count in the archive exceeded the
zip64 thresholds, the zip64 minimum extract version was not being
written to the local header. The central header had the correct version
@danifus danifus changed the title gh-121171: Add failing test for reproducing gh-121171: Fix zip64 extract version in local headers Jun 30, 2024
@danifus danifus marked this pull request as ready for review June 30, 2024 11:34
@danifus danifus closed this Jul 1, 2024
@danifus danifus reopened this Jul 1, 2024
@danifus
Copy link
Copy Markdown
Contributor Author

danifus commented Mar 31, 2025

@jaraco are you able to do a review on this bug fix? Thanks!

Comment thread Lib/zipfile/__init__.py
self.fp.seek(self.start_dir)
zinfo.header_offset = self.fp.tell()
# exceptions raised in _writecheck if _allowZip64 is False
zip64 = (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could |= here and save a line

Comment thread Lib/zipfile/__init__.py
# exceptions raised in _writecheck if _allowZip64 is False
zip64 = (
zip64
or zinfo.header_offset > ZIP64_LIMIT
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly approve of the following line being >= because it makes having the extra unambiguous, you might make this line also >= for the same reason. I think the functionality lost by being able to produce this precise file with _allowZip64=False is minimal.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this was copypasted from the central directory logic; it would be nice if they match as this is fixing the inconsistency.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I copied the conditionals from:

if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
requires_zip64 = "Files count"
elif zinfo.file_size > ZIP64_LIMIT:
requires_zip64 = "Filesize"
elif zinfo.header_offset > ZIP64_LIMIT:
requires_zip64 = "Zipfile size"
if requires_zip64:
raise LargeZipFile(requires_zip64 +
" would require ZIP64 extensions")

(filesize is handled a few lines above the added code)

This is the line with the different operator that existed before this PR:

if centDirCount > ZIP_FILECOUNT_LIMIT:

Good catch. I'd be happy either way as long as we use the same one everywhere.

Given ZIP_FILECOUNT_LIMIT = (1 << 16) - 1 (1 less than the actual limit of 0xFFFF) I would be inclined to change if to > to be consistent with how all the other checks against ZIP64_LIMIT and ZIP_FILECOUNT_LIMIT are written (we also get to cram one more file in :p) but I don't feel that strongly about it :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the rest of the logic,

>>> hex((1<<16) - 1)
'0xffff'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah whoops. (1<<16) - 1 is not one less than the limit - it is the limit :p

I still have a small preference that it should be changed to > so that we can store the maximum possible files allowed without the zip64 extra field and for consistency with all the other checks against ZIP64_LIMIT and ZIP_FILECOUNT_LIMIT ZIP_MAX_COMMENT that use >.

@github-actions
Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale Stale PR or inactive for long period of time. label Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting core review stale Stale PR or inactive for long period of time. tests Tests in the Lib/test dir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants