@anujmodi2021
Contributor

PR in trunk: #7614
Commit CP'd: 810c42f
JIRA: https://issues.apache.org/jira/browse/HADOOP-19543

Description of PR

On FNS-Blob, the List Blobs API is known to return duplicate entries for non-empty explicit directories. One entry corresponds to the directory itself, and the other corresponds to the marker blob that the driver internally creates and maintains to mark that path as a directory. This behaviour was already known and handled: duplicate entries were removed from the set of entries returned within a single list iteration.

However, a partition split can cause the two duplicate entries to be returned in separate iterations. That case was not handled, so the caller could receive a result containing duplicate entries, as happened in this case. The de-duplication logic was designed before the effect of partition splits was understood.

This PR fixes the bug by removing duplicates across list iterations.
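
To illustrate the idea of de-duplicating across iterations rather than within a single page, here is a minimal sketch. It is not the code from this patch: the class `CrossPageDeduplicator`, the method `filterPage`, and the convention that a directory entry ends with `/` while its marker blob does not are all assumptions made for the example. It also keeps every seen path in memory, which a real driver may well avoid.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical helper (not an ABFS class): drops entries whose path was
 * already returned in an earlier list iteration.
 */
final class CrossPageDeduplicator {
  // Paths already handed back to the caller. This state lives across pages,
  // which is what a per-page de-duplication lacks.
  private final Set<String> seenPaths = new HashSet<>();

  // Assumed convention: the directory entry is "dir/" and its marker blob
  // is "dir"; strip one trailing slash so both map to the same key.
  private static String normalize(String name) {
    return name.endsWith("/") ? name.substring(0, name.length() - 1) : name;
  }

  /** Returns the entries of this page that were not seen in any page so far. */
  List<String> filterPage(List<String> page) {
    List<String> unique = new ArrayList<>();
    for (String name : page) {
      String key = normalize(name);
      // Set.add returns false when the key is already present.
      if (seenPaths.add(key)) {
        unique.add(key);
      }
    }
    return unique;
  }
}
```

One instance would be created per listing operation and `filterPage` called once for each page returned by the service, so that a directory entry in one page and its marker blob in the next collapse into a single result.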

How was this patch tested?

A new test for the failing scenario was added, and the existing test suite was run to validate the changes across all combinations.

…t Listing Across Iterations (apache#7614)

Contributed by Anuj Modi
Reviewed by Anmol Asrani, Manish Bhatt, Manika Joshi

Signed off by Anuj Modi<[email protected]>
@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 6m 48s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
| | | | | _ branch-3.4 Compile Tests _ |
| +1 💚 | mvninstall | 22m 25s | | branch-3.4 passed |
| +1 💚 | compile | 0m 25s | | branch-3.4 passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 0m 20s | | branch-3.4 passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | checkstyle | 0m 23s | | branch-3.4 passed |
| +1 💚 | mvnsite | 0m 27s | | branch-3.4 passed |
| +1 💚 | javadoc | 0m 27s | | branch-3.4 passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 22s | | branch-3.4 passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 0m 46s | | branch-3.4 passed |
| +1 💚 | shadedclient | 19m 38s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 20s | | the patch passed |
| +1 💚 | compile | 0m 20s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 0m 20s | | the patch passed |
| +1 💚 | compile | 0m 17s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | javac | 0m 17s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 14s | | the patch passed |
| +1 💚 | mvnsite | 0m 21s | | the patch passed |
| +1 💚 | javadoc | 0m 15s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 18s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 0m 40s | | the patch passed |
| +1 💚 | shadedclient | 19m 6s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 1s | | hadoop-azure in the patch passed. |
| +1 💚 | asflicense | 0m 24s | | The patch does not generate ASF License warnings. |
| | | 77m 18s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7632/1/artifact/out/Dockerfile |
| GITHUB PR | #7632 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 6ddd2eb46e65 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.4 / 2f3cf45 |
| Default Java | Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7632/1/testReport/ |
| Max. process+thread count | 558 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7632/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@anujmodi2021 anujmodi2021 changed the title HADOOP-19543. [ABFS][FnsOverBlob] Remove Duplicates from Blob Endpoin… HADOOP-19543. [ABFS][FnsOverBlob] Remove Duplicates from Blob Endpoint Listing Across Iterations Apr 18, 2025
@anujmodi2021
Contributor Author

============================================================
HNS-OAuth-DFS

[ERROR] testRenamePathRetryIdempotency(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemRename) Time elapsed: 19.281 s <<< ERROR!

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 3
[ERROR] Tests run: 810, Failures: 0, Errors: 1, Skipped: 166
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 31
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 23

============================================================
HNS-SharedKey-DFS

[ERROR] testRenamePathRetryIdempotency(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemRename) Time elapsed: 15.421 s <<< ERROR!

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 4
[ERROR] Tests run: 810, Failures: 0, Errors: 1, Skipped: 116
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 31
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 10

============================================================
NonHNS-SharedKey-DFS

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 366
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 32
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 11

============================================================
AppendBlob-HNS-OAuth-DFS

[ERROR] testRenamePathRetryIdempotency(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemRename) Time elapsed: 16.647 s <<< ERROR!

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 3
[ERROR] Tests run: 810, Failures: 0, Errors: 1, Skipped: 177
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 55
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 23

============================================================
NonHNS-SharedKey-Blob

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 292
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 27
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 11

============================================================
NonHNS-OAuth-DFS

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 371
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 32
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 24

============================================================
NonHNS-OAuth-Blob

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 297
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 27
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 24

============================================================
AppendBlob-NonHNS-OAuth-Blob

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 317
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 51
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 24

============================================================
HNS-OAuth-DFS-IngressBlob

[ERROR] testRenamePathRetryIdempotency(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemRename) Time elapsed: 18.999 s <<< ERROR!

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 3
[ERROR] Tests run: 810, Failures: 0, Errors: 1, Skipped: 299
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 31
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 23

============================================================
NonHNS-OAuth-DFS-IngressBlob

[WARNING] Tests run: 175, Failures: 0, Errors: 0, Skipped: 10
[WARNING] Tests run: 794, Failures: 0, Errors: 0, Skipped: 369
[WARNING] Tests run: 181, Failures: 0, Errors: 0, Skipped: 32
[WARNING] Tests run: 272, Failures: 0, Errors: 0, Skipped: 24

@anujmodi2021 anujmodi2021 merged commit a13f530 into apache:branch-3.4 Apr 18, 2025
3 checks passed
@anujmodi2021 anujmodi2021 deleted the branch-3.4_listDup branch April 29, 2025 17:20