[SPARK-53077][CORE][TESTS] Add an env to control the execution of the SparkBloomFilterSuite
#51806
  import java.util.stream.Stream;

- @Disabled("TODO(SPARK-53077): Re-enable with a resonable test time.")
+ @EnabledIfEnvironmentVariable(named = "SPARK_TEST_SPARK_BF_SUITE_ENABLED", matches = "true")
Since this can serve as a model for this kind of environment variable, can we have a naming rule?
For example, with a rule like SPARK_TEST_${CLASSNAME}_ENABLED, this would be SPARK_TEST_SPARK_BLOOM_FILTER_SUITE_ENABLED.
Although BF is a reasonable abbreviation, the name does not look like it was derived by rule from the suite name.
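To illustrate the naming rule suggested above, here is a minimal sketch that derives the variable name from a suite's simple class name. The class `TestEnvNames` and the method `enabledVarFor` are hypothetical names introduced only for this example; they are not part of Spark.

```java
import java.util.Locale;

// Hypothetical helper, not part of Spark: illustrates the proposed
// SPARK_TEST_${CLASSNAME}_ENABLED naming rule.
public final class TestEnvNames {
    private TestEnvNames() {}

    // Converts a CamelCase simple class name to UPPER_SNAKE_CASE and wraps it
    // in the SPARK_TEST_ prefix and _ENABLED suffix.
    public static String enabledVarFor(String simpleClassName) {
        String snake = simpleClassName
            // Insert an underscore at each lowercase-or-digit to uppercase boundary.
            .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
            .toUpperCase(Locale.ROOT);
        return "SPARK_TEST_" + snake + "_ENABLED";
    }

    public static void main(String[] args) {
        System.out.println(enabledVarFor("SparkBloomFilterSuite"));
    }
}
```

This simple conversion does not split runs of capitals, so a suite name containing an acronym would need an extra rule.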
.github/workflows/build_java21.yml
Outdated
  "SKIP_UNIDOC": "true",
- "DEDICATED_JVM_SBT_TESTS": "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV1Suite,org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV2Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV1Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV2Suite"
+ "DEDICATED_JVM_SBT_TESTS": "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV1Suite,org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV2Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV1Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV2Suite",
+ "SPARK_TEST_SPARK_BF_SUITE_ENABLED": "true"
Shall we remove this? Since this is a sketch module, I guess build_non_ansi.yml is enough because it's irrelevant to ANSI/non-ANSI.
Thank you for making an improvement, @LuciferYang.
@dongjoon-hyun Thank you for your review. I'll make the revisions later.
dongjoon-hyun left a comment
+1, LGTM.
Merged into master. Thanks @dongjoon-hyun and @peter-toth
What changes were proposed in this pull request?
This PR adds an environment variable named SPARK_TEST_SPARK_BLOOM_FILTER_SUITE_ENABLED to control whether the test case SparkBloomFilterSuite is executed. It also ensures that this test case is only run for validation in the daily tests specified in build_non_ansi.yml.
Why are the changes needed?
The SparkBloomFilterSuite requires periodic validation, but due to its excessively long execution time (over 10 minutes), it is not suitable for execution in the Change Pipeline.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Manual verification:
Was this patch authored or co-authored using generative AI tooling?
No
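For readers unfamiliar with environment-variable gating, the following is a minimal plain-Java sketch of the check that an annotation such as @EnabledIfEnvironmentVariable(named = ..., matches = "true") expresses: the suite runs only when the variable is set and its value matches the given pattern. The class `EnvGate` and the method `isEnabled` are hypothetical and only approximate the behavior; they are not JUnit's or Spark's implementation.

```java
import java.util.Map;

// Hypothetical sketch, not JUnit's implementation: approximates the condition
// "enabled only if the named environment variable matches the pattern".
public final class EnvGate {
    private EnvGate() {}

    // Returns true when the variable is present and its whole value matches the regex.
    public static boolean isEnabled(Map<String, String> env, String named, String regex) {
        String value = env.get(named);
        return value != null && value.matches(regex);
    }

    public static void main(String[] args) {
        String name = "SPARK_TEST_SPARK_BLOOM_FILTER_SUITE_ENABLED";
        boolean run = isEnabled(System.getenv(), name, "true");
        System.out.println(run ? "suite would run" : "suite would be skipped");
    }
}
```

Under this scheme the suite is skipped by default, so only a job that explicitly sets the variable to true (per the description, the daily job in build_non_ansi.yml) pays the long execution time.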