
ZipArchive.GetEntry() doesn't work properly after adding 2 entries with the same path, and removing one of them #51051

@dovisutu

Description

After creating two entries with the same path in a ZipArchive (which the standard allows, though does not recommend) and deleting one of them, ZipArchive.GetEntry() called with that path cannot retrieve the remaining entry. The remaining entry is nevertheless written to the archive, and it can be retrieved if the archive is read again afterwards.

Minimal reproduction:

using System;
using System.IO;
using System.IO.Compression;
public class Program
{
    static void Main(string[] args)
    {
        {
            using var stream = File.Create("..\\..\\..\\testZip.zip");
            using var archive = new ZipArchive(stream, ZipArchiveMode.Update);
            var text1 = "1234567890987654321";
            var text2 = "0987654321234567890";
            var previousEntry = archive.CreateEntry("test.txt");
            using (var writer = new StreamWriter(previousEntry.Open()))
            {
                writer.Write(text1);
                writer.Flush();
            }
            using (var writer = new StreamWriter(archive.CreateEntry("test.txt").Open()))
            {
                writer.Write(text2);
                writer.Flush();
            }
            previousEntry.Delete();
            var firstResult = archive.GetEntry("test.txt");
            Console.WriteLine("Looking for test.txt the first time...");
            ValidateResult(firstResult);
        }
        {
            using var stream = File.OpenRead("..\\..\\..\\testZip.zip");
            using var archive = new ZipArchive(stream);
            var secondResult = archive.GetEntry("test.txt");
            Console.WriteLine("Looking for test.txt the second time...");
            ValidateResult(secondResult);
        }
    }
    public static void ValidateResult(ZipArchiveEntry searchResult)
    {
        if (searchResult == null)
        {
            Console.WriteLine("Did not find test.txt in the archive.");
        }
        else
        {
            using var reader = new StreamReader(searchResult.Open());
            Console.WriteLine("Found test.txt in the archive. Content: {0}", reader.ReadToEnd());
        }
    }
}

Expected output:

Looking for test.txt the first time...
Found test.txt in the archive. Content: 0987654321234567890
Looking for test.txt the second time...
Found test.txt in the archive. Content: 0987654321234567890

Actual output:

Looking for test.txt the first time...
Did not find test.txt in the archive.
Looking for test.txt the second time...
Found test.txt in the archive. Content: 0987654321234567890

Configuration

dotnet version: 5.0.2 (runtime)
OS & version: Windows 20H2 19042.867
architecture: x64

Other information

Possible cause of the issue:
When adding an entry, ZipArchive.CreateEntry() calls ZipArchive.DoCreateEntry(), which then calls ZipArchive.AddEntry(), which looks like this:

private void AddEntry(ZipArchiveEntry entry)
{
    _entries.Add(entry);
    string entryName = entry.FullName;
    if (!_entriesDictionary.ContainsKey(entryName))
    {
        _entriesDictionary.Add(entryName, entry);
    }
}

Here, the addition to _entriesDictionary is skipped when an entry with the same name already exists. This prevents an exception from being thrown, but the dictionary loses track of the second entry, although that entry is still recorded in _entries.
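
The effect of this guard can be reproduced outside ZipArchive with a small stand-in model. The sketch below is not the library's code: `FakeEntry` and `FakeArchive` are hypothetical names that mirror only the list and the dictionary used by AddEntry() above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ZipArchiveEntry; only the name and a payload matter here.
public sealed class FakeEntry
{
    public FakeEntry(string fullName, string content)
    {
        FullName = fullName;
        Content = content;
    }

    public string FullName { get; }
    public string Content { get; }
}

// Hypothetical stand-in that mirrors only the two collections touched by AddEntry().
public sealed class FakeArchive
{
    public List<FakeEntry> Entries { get; } = new List<FakeEntry>();

    public Dictionary<string, FakeEntry> EntriesDictionary { get; } =
        new Dictionary<string, FakeEntry>();

    // Same shape as the AddEntry() quoted in the issue.
    public void AddEntry(FakeEntry entry)
    {
        Entries.Add(entry);
        if (!EntriesDictionary.ContainsKey(entry.FullName))
        {
            EntriesDictionary.Add(entry.FullName, entry);
        }
    }
}

public static class AddEntryDemo
{
    public static void Main()
    {
        var archive = new FakeArchive();
        archive.AddEntry(new FakeEntry("test.txt", "first"));
        archive.AddEntry(new FakeEntry("test.txt", "second"));

        Console.WriteLine(archive.Entries.Count);                         // 2
        Console.WriteLine(archive.EntriesDictionary.Count);               // 1
        Console.WriteLine(archive.EntriesDictionary["test.txt"].Content); // first
    }
}
```

The list keeps both entries, while the dictionary only ever points at the first one added under a given name.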
When deleting an entry, ZipArchiveEntry.Delete() (shown below) calls ZipArchive.RemoveEntry():

public void Delete()
{
    if (_archive == null)
        return;
    if (_currentlyOpenForWrite)
        throw new IOException(SR.DeleteOpenEntry);
    if (_archive.Mode != ZipArchiveMode.Update)
        throw new NotSupportedException(SR.DeleteOnlyInUpdate);
    _archive.ThrowIfDisposed();
    _archive.RemoveEntry(this);
    _archive = null!;
    UnloadStreams();
}

RemoveEntry() then removes the dictionary record for that name, regardless of whether other entries with the same name are still present.
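
One possible direction for a fix is to re-point the dictionary at a surviving entry with the same name when the indexed one is removed. The sketch below is only an illustration under assumptions: `NamedEntry`, `EntryIndex`, `RemoveByName` and `RemoveAndRepoint` are hypothetical names, and `RemoveByName` models the removal as described in this report rather than quoting the library's RemoveEntry().

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ZipArchiveEntry.
public sealed class NamedEntry
{
    public NamedEntry(string fullName, string content)
    {
        FullName = fullName;
        Content = content;
    }

    public string FullName { get; }
    public string Content { get; }
}

// Hypothetical model of the list plus name index kept by the archive.
public sealed class EntryIndex
{
    private readonly List<NamedEntry> _entries = new List<NamedEntry>();
    private readonly Dictionary<string, NamedEntry> _entriesDictionary =
        new Dictionary<string, NamedEntry>();

    public void Add(NamedEntry entry)
    {
        _entries.Add(entry);
        if (!_entriesDictionary.ContainsKey(entry.FullName))
            _entriesDictionary.Add(entry.FullName, entry);
    }

    // Assumed current behaviour: the name is dropped from the index unconditionally.
    public void RemoveByName(NamedEntry entry)
    {
        _entries.Remove(entry);
        _entriesDictionary.Remove(entry.FullName);
    }

    // Possible alternative: keep the index pointing at a surviving same-named entry.
    public void RemoveAndRepoint(NamedEntry entry)
    {
        _entries.Remove(entry);
        if (_entriesDictionary.TryGetValue(entry.FullName, out NamedEntry? indexed)
            && ReferenceEquals(indexed, entry))
        {
            NamedEntry? replacement = _entries.Find(e => e.FullName == entry.FullName);
            if (replacement != null)
                _entriesDictionary[entry.FullName] = replacement;
            else
                _entriesDictionary.Remove(entry.FullName);
        }
    }

    public NamedEntry? Get(string name)
    {
        _entriesDictionary.TryGetValue(name, out NamedEntry? result);
        return result;
    }
}

public static class RemoveDemo
{
    // Adds two same-named entries, removes the first, and reports what Get() sees.
    public static string Lookup(bool repoint)
    {
        var index = new EntryIndex();
        var first = new NamedEntry("test.txt", "first");
        index.Add(first);
        index.Add(new NamedEntry("test.txt", "second"));
        if (repoint) index.RemoveAndRepoint(first); else index.RemoveByName(first);
        return index.Get("test.txt")?.Content ?? "(not found)";
    }

    public static void Main()
    {
        Console.WriteLine(Lookup(repoint: false)); // (not found)
        Console.WriteLine(Lookup(repoint: true));  // second
    }
}
```

Scanning the list on removal costs a linear pass, but only when the removed entry is the one the dictionary points at, so the common case stays a plain dictionary removal.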
When searching for an entry, ZipArchive.GetEntry() looks like this:

public ZipArchiveEntry? GetEntry(string entryName)
{
    if (entryName == null)
        throw new ArgumentNullException(nameof(entryName));
    if (_mode == ZipArchiveMode.Create)
        throw new NotSupportedException(SR.EntriesInCreateMode);
    EnsureCentralDirectoryRead();
    _entriesDictionary.TryGetValue(entryName, out ZipArchiveEntry? result);
    return result;
}

Here, the lookup consults only _entriesDictionary, so it cannot retrieve the entry even though the entry is present in both the archive and _entries.
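
Since the remaining entry is still recorded in _entries, a caller-side workaround may be to scan the public ZipArchive.Entries collection instead of calling GetEntry(). The sketch below assumes that Entries reflects _entries, as the analysis above suggests; `FindByName` and `RemainingEntryIsFound` are hypothetical helper names, and a MemoryStream stands in for the file used in the reproduction.

```csharp
#nullable enable
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class EntryLookupWorkaround
{
    // Hypothetical helper: linear scan over Entries, bypassing the name dictionary.
    public static ZipArchiveEntry? FindByName(ZipArchive archive, string entryName)
    {
        return archive.Entries.FirstOrDefault(e => e.FullName == entryName);
    }

    private static void WriteEntry(ZipArchiveEntry entry, string text)
    {
        using var writer = new StreamWriter(entry.Open());
        writer.Write(text);
    }

    // Rebuilds the scenario from the reproduction in memory and
    // reports whether the scan finds the remaining entry.
    public static bool RemainingEntryIsFound()
    {
        using var stream = new MemoryStream();
        using var archive = new ZipArchive(stream, ZipArchiveMode.Update);

        ZipArchiveEntry previousEntry = archive.CreateEntry("test.txt");
        WriteEntry(previousEntry, "1234567890987654321");
        WriteEntry(archive.CreateEntry("test.txt"), "0987654321234567890");
        previousEntry.Delete();

        return FindByName(archive, "test.txt") != null;
    }

    public static void Main()
    {
        Console.WriteLine(RemainingEntryIsFound());
    }
}
```

The scan is linear in the number of entries, so it trades the dictionary's lookup speed for correctness when duplicate names are involved.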
