Skip to content

Conversation

@jgonz120
Copy link
Contributor

Bug

Fixes: NuGet/Home#12715

Regression? Last working version:

Description

Second PR in this series, this one focuses on the PackageSpec, which is required to complete the LockFile conversion.

FileFormatException

Updated this file to allow us to create this exception without having to provide the line or column where the error occurred. Unfortunately, we do not have access to this information, it's an internal member of the JsonReaderState

JsonPackageSpecReader

Has been split into two different files, one for NewtonsoftJson and another for SystemTextJson

The logic for both of these version's is similar since we were iterating through the JSON using the JsonTextReader.

PR Checklist

  • PR has a meaningful title

  • PR has a linked issue.

  • Described changes

  • Tests

    • Automated tests added
    • OR
    • Test exception
    • OR
    • N/A
  • Documentation

    • Documentation PR or issue filled
    • OR
    • N/A

jgonz120 and others added 5 commits December 12, 2023 10:30
* Moved files over and addressed some PR comments

* added comment

* switched to true and false strings

* Added ctr to specify buffer for testing purposes.

* remove commented code

* switch to use Utf8 preamble for BOM

* Create method for checking complete

* combined code for ReadStringArray

* Updated buffer size to match STJ's default buffer size

* Switch Utf8JsonStreamReader to be disposable.

* Switch to read the value for numbers into a string directly

* revert back to using private var for utf8Bom

* Remove ReadStringArrayAsList

* Avoid referencing buffer after  returning

* Actually avoid referencing _buffer after returning

* Update how buffers are fed into Utf8JsonReader to avoid feeding extra empty data.

* remove extra line

* Reverted back to using try get int for ReadTokenAsString

* Update src/NuGet.Core/NuGet.ProjectModel/Utf8JsonStreamReader.cs

Co-authored-by: Andy Zivkovic <[email protected]>

* Remove ValueTextEquals taking in string

* Switched to Skip instead of TrySkip

* Update src/NuGet.Core/NuGet.ProjectModel/Utf8JsonStreamReader.cs

Co-authored-by: Andy Zivkovic <[email protected]>

* Added some unit tests

* fix Bom

* Switched to using Moq

* Update src/NuGet.Core/NuGet.ProjectModel/Utf8JsonStreamReader.cs

Co-authored-by: Andy Zivkovic <[email protected]>

* loop through stream when reading to ensure reading full bytes or to the end

* update signature comment

* Switched stream back to field and supress warning

---------

Co-authored-by: Andy Zivkovic <[email protected]>
@jgonz120 jgonz120 requested a review from a team as a code owner December 13, 2023 18:21
Copy link
Member

@nkolev92 nkolev92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly cleanup comments, nothing major.

I do have a suggestion for reducing the number of changes, thus making it easier to review.

Copy link
Member

@zivkan zivkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good for the "happy case". But looking at this from a library point of view, I have concerns about the error handling. However, since NuGet should be the only component that writes this file (we don't expect 3rd party tools to write the file, then NuGet to read it), maybe error handling is less important, since it should be so unlikely.

There's always a question of "doing it right" vs "good enough", but also the principal of least surprise. I'm worried about how difficult it might be to diagnose some problems if customers report bugs, and we don't realize that the json parsing code is silently ignoring JSON errors, rather than throwing exceptions and making it abundantly clear that something unexpected happened.

packageSpecPath);
}

if (jsonReader.Read())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no else for this statement. My understanding is that Read only returns false at the end of the file, in which case it means this JSON file is incomplete. Traditionally, parsers would throw an exception at the point of syntax error (like missing value following a property name), but this code appears to return an incomplete value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're right that read should only return false at the end of the file. That said Read() will throw an error if we're at the end of the file at this point. The file needs to be structured to be able to call read on the reader and be at a StartObject. I added a test to confirm this.

Utf8JsonReader throws an exception if we try to read and hit the end of file when it's several layers deep into an object already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So from a practical point of view, Read() is never expected to return false, except for potentially the last read of the root object's deserialization (but from another PR, you found that Newtosoft.Json doesn't even validate that end of an object is also the end of the file).

This makes all the if (jsonReader.Read()) conditions unnecessary nesting.

I'm wondering from an API design point of view, if Utf8JsonStreamReader's Read() method should be changed to return void. We can add a EnsureEndOfFile() if we want to check that there is no contents at the end of the root object.

On the other hand, I feel guilty when my comments feel like scope creep, so this probably really doesn't matter, as Utf8JsonStreamReader is internal, and nuget should write the files that are read, so we have very high confidence there will not be any errors in the json.

{
if (jsonReader.Read() && jsonReader.TokenType == JsonTokenType.StartObject)
{
while (jsonReader.Read() && jsonReader.TokenType == JsonTokenType.PropertyName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like other comments, I'm not sure the error handling here provides a good API experience. If jsonReader.Read() returns false, doesn't that mean the file has an error? If read succeeds, but the token type is wrong, that's deinately an error, but no error is being reported here.

@zivkan
Copy link
Member

zivkan commented Dec 18, 2023

Also, can the title of this PR please be updated to something meaningful? It confuses me every time I see it in my notifications list 😄

@jgonz120 jgonz120 changed the title Dev featurebranch jgonz120 package spec reader migration System.Text.Json Migration - Adding code to parse the PackageSpec using STJ Dec 18, 2023
Copy link
Member

@zivkan zivkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears all my feedback is relevant to the Newtonsoft.Json code, so probably out of scope for this PR.

@nkolev92 nkolev92 self-requested a review December 19, 2023 18:20
@jgonz120 jgonz120 force-pushed the dev-jgonz120-FeatureBranch-Stj-Migration branch from 22f3566 to 1731952 Compare December 20, 2023 18:22
…urebranch-jgonz120-package-spec-reader-migration
@jgonz120 jgonz120 merged commit d99864a into dev-jgonz120-FeatureBranch-Stj-Migration Dec 20, 2023
@jgonz120 jgonz120 deleted the dev-featurebranch-jgonz120-package-spec-reader-migration branch December 20, 2023 18:51
Copy link
Contributor

@kartheekp-ms kartheekp-ms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I apologize for providing feedback after the pull request has been merged. Given that it was merged into a feature branch, I would like you to consider these comments in your future PRs during the migration process.

});
}

[Obsolete]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An obsolete attribute allows to add a short explanation suggesting alternative workarounds. It could be great if we can add that information here. Are these methods marked as obsolete due to their usage in the Newtonsoft.Json API? If that's the case, perhaps we could add a message to inform the caller. In certain instances, the caller might be a customer of the NuGet SDK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is private so we shouldn't have any customers calling it right? The public functions have a message in them already but I can update it if you think a specific call out to STJ would be better.


if (jsonReader.TokenType == JsonTokenType.StartObject)
{
while (jsonReader.Read() && jsonReader.TokenType == JsonTokenType.PropertyName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed that the GetPackageSpec method contains several if-else blocks within this while loop, which might affect the readability and maintainability of the code. To improve this, I'm considering a refactor that involves using smaller, focused classes, each class is responsible for handling a specific property in the package spec. This way, the GetPackageSpec method can simply delegate the handling of each property to these specialized classes. One approach I'm thinking now is to combine the Strategy Pattern with the Abstract Factory Pattern. Implementing these patterns would significantly streamline the GetPackageSpec method, making it more concise and simpler to understand. Each property handler could be encapsulated in its own class, and an abstract factory could be used to instantiate the appropriate handler based on the property being processed. This would not only enhance readability but also make the codebase more modular and easier to extend or modify in the future.

interface IJsonPropertyHandler
{
    void HandleProperty(ref Utf8JsonStreamReader jsonReader, PackageSpec packageSpec);
}
class AuthorsPropertyHandler : IJsonPropertyHandler
{
    public void HandleProperty(ref Utf8JsonStreamReader jsonReader, PackageSpec packageSpec)
    {
        // Implement logic for handling Authors property
    }
}

class DescriptionPropertyHandler : IJsonPropertyHandler
{
    public void HandleProperty(ref Utf8JsonStreamReader jsonReader, PackageSpec packageSpec)
    {
        // Implement logic for handling Description property
    }

    // ... other implementations for different properties
}
interface IJsonPropertyHandlerFactory
{
    IJsonPropertyHandler GetHandler(string propertyName);
}
class JsonPropertyHandlerFactory : IJsonPropertyHandlerFactory
{
    private readonly Dictionary<string, IJsonPropertyHandler> _handlers;

    public JsonPropertyHandlerFactory()
    {
        _handlers = new Dictionary<string, IJsonPropertyHandler>
        {
            { "Authors", new AuthorsPropertyHandler() },
            { "Description", new DescriptionPropertyHandler() },
            // ... other mappings
        };
    }

    public IJsonPropertyHandler GetHandler(string propertyName)
    {
        if (_handlers.TryGetValue(propertyName, out var handler))
        {
            return handler;
        }

        return null; // Or some default handler
    }
}
internal static PackageSpec GetPackageSpec(ref Utf8JsonStreamReader jsonReader, string name, string packageSpecPath, string snapshotValue)
{
    var packageSpec = new PackageSpec();
    var handlerFactory = new JsonPropertyHandlerFactory();

    if (jsonReader.TokenType == JsonTokenType.StartObject)
    {
        while (jsonReader.Read() && jsonReader.TokenType == JsonTokenType.PropertyName)
        {
            string propertyName = jsonReader.GetString();
            var handler = handlerFactory.GetHandler(propertyName);

            if (handler != null)
            {
                handler.HandleProperty(ref jsonReader, packageSpec);
            }
            else
            {
                jsonReader.Skip(); // Skip unknown properties
            }
        }
    }

    // ... finalizing packageSpec
    return packageSpec;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this type of pattern is necessary when many of these properties are just reading a string from the stream?

jgonz120 added a commit that referenced this pull request Jan 27, 2024
…ng STJ (#5541)

Updated the logic for parsing the package spec to use STJ. Using a registry setting to determine if the code should use NJ or STJ for parsing.
jgonz120 added a commit that referenced this pull request Feb 5, 2024
…ng STJ (#5541)

Updated the logic for parsing the package spec to use STJ. Using a registry setting to determine if the code should use NJ or STJ for parsing.
jgonz120 added a commit that referenced this pull request Feb 5, 2024
…ng STJ (#5541)

Updated the logic for parsing the package spec to use STJ. Using a registry setting to determine if the code should use NJ or STJ for parsing.
jgonz120 added a commit that referenced this pull request Feb 12, 2024
Crated a new struct to parse the assets file using System.Text.Json instead of Newtonsoft. It will read the file without loading it completely in memory, reducing memory allocations. 

PRs associated with this:
* Create class for reading Json files in chunks (#5530)
* System.Text.Json Migration - Adding code to parse the PackageSpec using STJ (#5541)
* System.Text.Json Migration - Adding code to parse the Project.Assets.Json file using STJ.  (#5558)
* Create class for reading Json files in chunks (#5530)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants