@JacekLach

In the case of DS v2, InputMetrics are not updated.

Before fix:
[screenshot: inputMetrics]

After fix, we can see that `Input Size / Records` is updated in the UI:
[screenshot]

InputMetrics such as `bytesRead` and `recordsRead` should be updated.

Authored-by: sandeep katta [email protected]
Signed-off-by: Wenchen Fan [email protected]


Cherry-pick conflicts:

  • sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala
    Only import statements conflict, as upstream reorganized packages.
  • sql/core/src/test/scala/org/apache/spark/sql/execution/DataSourceScanExecRedactionSuite.scala
    The test in the original PR sets a SQLConf that doesn't exist yet to an empty string.
    It will need to be reintroduced after pulling from master.

Upstream SPARK-XXXXX ticket and PR link (if not applicable, explain)

https://issues.apache.org/jira/browse/SPARK-30362
apache#27021

What changes were proposed in this pull request?

DataSource RDDs now report records read (and bytes read based on filesystem stats). Previously those metrics were not present.
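
The mechanism described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the upstream patch: `SketchInputMetrics`, `CountingIterator`, `bytesReadSoFar`, and `updateInterval` are hypothetical names invented for this example, and the sketch deliberately avoids any Spark classes. It assumes that records are counted per row, and that bytes read are refreshed from a filesystem-statistics callback periodically and once more when the partition is exhausted.

```scala
// Hypothetical stand-in for a task's input metrics; NOT Spark's real class.
final class SketchInputMetrics {
  private var records = 0L
  private var bytes = 0L
  def recordsRead: Long = records
  def bytesRead: Long = bytes
  def incRecordsRead(n: Long): Unit = { records += n }
  def setBytesRead(n: Long): Unit = { bytes = n }
}

// Wraps a partition's row iterator and updates metrics as rows are consumed.
// `bytesReadSoFar` is an assumed callback returning the bytes read from the
// filesystem so far; `updateInterval` is an assumed refresh period in records.
final class CountingIterator[T](
    underlying: Iterator[T],
    metrics: SketchInputMetrics,
    bytesReadSoFar: () => Long,
    updateInterval: Long = 1000L) extends Iterator[T] {

  // Bytes already recorded before this iterator started reading.
  private val startingBytesRead = metrics.bytesRead

  private def refreshBytesRead(): Unit =
    metrics.setBytesRead(startingBytesRead + bytesReadSoFar())

  override def hasNext: Boolean = {
    val more = underlying.hasNext
    // Refresh once the input is exhausted so the last rows are accounted for.
    if (!more) refreshBytesRead()
    more
  }

  override def next(): T = {
    val row = underlying.next()
    metrics.incRecordsRead(1L)
    // Polling filesystem stats on every row would be wasteful, so refresh periodically.
    if (metrics.recordsRead % updateInterval == 0) refreshBytesRead()
    row
  }
}
```

Wrapping the reader's iterator in something like `new CountingIterator(rows, metrics, () => fsBytes)` keeps the accounting out of the data source itself, which is one plausible reason to do it at the RDD level.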

How was this patch tested?

Unit tests from the upstream PR.

sandeep-katta and others added 2 commits May 13, 2020 14:06
In the case of DS v2, InputMetrics are not updated.

**Before Fix**
![inputMetrics](https://user-images.githubusercontent.com/35216143/71501010-c216df00-288d-11ea-8522-fdd50b13eae1.png)

**After Fix** we can see that `Input Size / Records` is updated in the UI
![image](https://user-images.githubusercontent.com/35216143/71501000-b88d7700-288d-11ea-92fe-a727b2b79908.png)

InputMetrics such as `bytesRead` and `recordsRead` should be updated.

User-facing change: No

Testing: Added UT and also verified manually

Closes apache#27021 from sandeep-katta/dsv2inputmetrics.

Authored-by: sandeep katta <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

----

Cherry-pick conflicts:
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala
  Only import statements conflict, as upstream reorganized packages.
- sql/core/src/test/scala/org/apache/spark/sql/execution/DataSourceScanExecRedactionSuite.scala
  The test in the original PR sets a SQLConf that doesn't exist yet to an empty string.
  It will need to be reintroduced after pulling from master.
@JacekLach
Author


══ Results ═════════════════════════════════════════════════════════════════════
Duration: 129.6 s

OK:       0
Failed:   0
Warnings: 0
Skipped:  2
Had test warnings or failures; see logs.
[error] running /home/circleci/project/R/run-tests.sh ; received return code 255

I don't know how to parse this: it reports 0 warnings and 0 failures, and then fails the test?

@JacekLach JacekLach force-pushed the jl/SPARK-30362 branch 4 times, most recently from a057338 to d1b1ac6 Compare May 14, 2020 12:36
@rshkv rshkv merged this pull request into palantir:master May 21, 2020
rshkv pushed a commit that referenced this pull request May 21, 2020
* [SPARK-30362][CORE] Update InputMetrics in DataSourceRDD

* waitUntilEmpty requires a timeout until 5e92301

Co-authored-by: sandeep katta <[email protected]>