@anuvedverma anuvedverma commented Dec 5, 2025

(cherry picked from commit 984bf78)

Brings Spark 3.3 patch #40 into our Spark 3.5 fork, with a couple of small modifications:

  • to get the build working, spark.sql.sources.readPartitionWithSubdirectory.enabled defaults to false (it is then set to true in the Spark wrapper: https://github.com/lyft/spark-private/pull/1575)
  • OSS Spark had removed the listLeafStatuses and listLeafDirStatuses helpers from SparkHadoopUtil.scala; this PR brings them back for our fork
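
For readers unfamiliar with the restored helpers, here is a minimal sketch of the kind of recursive listing they perform. It is not the fork's code: it uses `java.io.File` on the local filesystem instead of Hadoop's `FileSystem`, the names `LeafDirs`, `listLeafDirs`, and `listLeafFiles` are hypothetical, and it assumes a "leaf directory" means a directory with no child directories.

```scala
import java.io.File

// Illustrative only: a local-filesystem analogue of recursive leaf listing.
// The real helpers operate on Hadoop FileSystem/FileStatus objects.
object LeafDirs {
  // Direct children of `dir`; listFiles returns null on I/O error, so guard it.
  private def children(dir: File): Seq[File] =
    Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty)

  // Directories under `root` (inclusive) that contain no subdirectories.
  def listLeafDirs(root: File): Seq[File] = {
    require(root.isDirectory, s"$root is not a directory")
    val subdirs = children(root).filter(_.isDirectory)
    if (subdirs.isEmpty) Seq(root) else subdirs.flatMap(listLeafDirs)
  }

  // Regular files under `root`, at any nesting depth.
  def listLeafFiles(root: File): Seq[File] = {
    val (dirs, files) = children(root).partition(_.isDirectory)
    files.filter(_.isFile) ++ dirs.flatMap(listLeafFiles)
  }
}
```

With a layout such as `table/ds=1/part-0` and `table/ds=2/hour=00/part-1`, `listLeafFiles` finds both data files even though they sit at different depths. That is the situation the subdirectory-read config is meant to handle.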

This patch is needed because a large number of tables have nested subdirectories (e.g. most tables under the feature schema, generated by featureservice). See thread for more context.

Eventually, we want to fix the underlying tables that rely on nested subdirectories and get rid of this patch and config entirely. But fixing these tables will take a longer, dedicated effort, so I created backlog ticket BATCHCMP-459 to track it.

@github-actions github-actions bot added the SQL label Dec 5, 2025
@semgrep-code-lyft

Legal Risk

The following dependencies were released under a license that has been flagged by your organization for consideration.

Recommendation

While merging is not directly blocked, it's best to pause and consider what it means to use this license before continuing. If you are unsure, reach out to your security team or Semgrep admin to address this issue.

EPL-1.0

EPL-2.0

GPL-2.0

LGPL-2.1

@anuvedverma anuvedverma force-pushed the BATCHCMP-379-support-subdirectory-reads branch from c60b6f9 to dcd8c6d Compare December 5, 2025 08:09
@github-actions github-actions bot added the CORE label Dec 5, 2025
@anuvedverma anuvedverma force-pushed the BATCHCMP-379-support-subdirectory-reads branch from dcd8c6d to 4d917dc Compare December 5, 2025 08:13
@semgrep-code-lyft

Legal Risk

The following dependencies were released under a license that has been flagged by your organization for consideration.

Recommendation

While merging is not directly blocked, it's best to pause and consider what it means to use this license before continuing. If you are unsure, reach out to your security team or Semgrep admin to address this issue.

EPL-1.0

EPL-2.0

@anuvedverma anuvedverma marked this pull request as ready for review December 6, 2025 00:13
@anuvedverma anuvedverma merged commit 51989e6 into v3.5.6-lyft Dec 6, 2025
79 of 86 checks passed
@anuvedverma anuvedverma deleted the BATCHCMP-379-support-subdirectory-reads branch December 6, 2025 00:20

3 participants