Skip to content

Conversation

@mkesselaers
Copy link

What changes were proposed in this pull request?

Changed the Strings from MegaBytes to MebiBytes (and likewise for other sizes)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Changed the Strings from MegaBytes to MebiBytes (and likewise for other sizes)
@srowen
Copy link
Member

srowen commented Jun 7, 2017

No, this is wrong. The function is still computing megabytes, etc.
Did you see my comment in JIRA? there is more to this change.

Changed the calculation of bytesToString towards MebiBytes instead of Megabytes
Changed the result string to indicate MebiBytes
@mkesselaers
Copy link
Author

@srowen , my apologies, I must have misunderstood you.
I changed the calculation to be MebiBytes, but afterwards saw in the tests that the calculation was already correct
test("bytesToString") { assert(Utils.bytesToString(10) === "10.0 B") assert(Utils.bytesToString(1500) === "1500.0 B") assert(Utils.bytesToString(2000000) === "1953.1 KB") assert(Utils.bytesToString(2097152) === "2.0 MB") assert(Utils.bytesToString(2306867) === "2.2 MB") assert(Utils.bytesToString(5368709120L) === "5.0 GB") assert(Utils.bytesToString(5L * (1L << 40)) === "5.0 TB") assert(Utils.bytesToString(5L * (1L << 50)) === "5.0 PB") assert(Utils.bytesToString(5L * (1L << 60)) === "5.0 EB") assert(Utils.bytesToString(BigInt(1L << 11) * (1L << 60)) === "2.36E+21 B") }

Can you please help me with the one?

Update the test to take into account the newly returned String
@srowen
Copy link
Member

srowen commented Jun 7, 2017

Those tests are wrong, in that 2000000 bytes is not "1953.1 KB". It is "1953.1 KiB". And so on. The numbers should stay the same to make sure behavior doesn't actually change, but the labels should be fixed.

The utils.js correctly computed MB, etc, but, it should compute MiB and label it as such, for consistency. This was the original issue.

formatBytes should calculate MiB instead of MB.
@mkesselaers
Copy link
Author

OK, the tests are fixed to take this into account.
BytesToString, now correctly output "KiB" instead of "KB" and utils.js now also calculates the correct value.

There is still a small difference between BytesToString and FormatBytes, because the first only goes towards a higher size when it is at least 2 MiB, fe.
Should we adapt this as well?

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's looking good. It may be worth searching the code for any other instances of "KB" to see if anything else is formatting numbers inconsistently.

I think this is a safe fix because it is cosmetic.

@mkesselaers
Copy link
Author

@srowen ,
I propose to rewrite the variable names of bytesToString as well, to be consistent.
Other occurences of KB can be found in the ivy-report.xsl and in the ByteSuffixes of JavaUtils.
Since there seems to be a lot of logic behind the JavaUtils, I would not change this.

Changed the variable names to reflect the difference between MegaBytes en MebiBytes
@mkesselaers
Copy link
Author

@srowen , I changed the variable names for bytesToString.
As far as I see, this should be everything.

@SparkQA
Copy link

SparkQA commented Jun 7, 2017

Test build #3783 has finished for PR 18229 at commit 71c2bbe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 7, 2017

@mkesselaers looks like there is at least one more test testing the formatted string that will need to be updated

Fixed filling test
@mkesselaers
Copy link
Author

@srowen, you're right. I changed the strings there as well.

@SparkQA
Copy link

SparkQA commented Jun 8, 2017

Test build #3784 has finished for PR 18229 at commit dd9cc4d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 9, 2017

@kishorvpatil @wzhfy do you have any thoughts on this? it touches some code you created

@wzhfy
Copy link
Contributor

wzhfy commented Jun 9, 2017

I agree with changing the base 1000 to 1024. About MiB and MB, strictly speaking, they are different. But I think in the computing technology world, most people say Megabyte (MB) instead of Mebibytes even though they know the difference. MiB may seem rare for many users.
That's only my personal thoughts, I don't have strong disagreement on this.

@jiangxb1987
Copy link
Contributor

@mkesselaers are you still working on this?

@kiszk
Copy link
Member

kiszk commented Oct 6, 2017

gentle ping @mkesselaers

@vanzin vanzin mentioned this pull request May 11, 2018
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@HyukjinKwon
Copy link
Member

gentle ping @mkesselaers

zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#17422
Closes apache#17619
Closes apache#18034
Closes apache#18229
Closes apache#18268
Closes apache#17973
Closes apache#18125
Closes apache#18918
Closes apache#19274
Closes apache#19456
Closes apache#19510
Closes apache#19420
Closes apache#20090
Closes apache#20177
Closes apache#20304
Closes apache#20319
Closes apache#20543
Closes apache#20437
Closes apache#21261
Closes apache#21726
Closes apache#14653
Closes apache#13143
Closes apache#17894
Closes apache#19758
Closes apache#12951
Closes apache#17092
Closes apache#21240
Closes apache#16910
Closes apache#12904
Closes apache#21731
Closes apache#21095

Added:
Closes apache#19233
Closes apache#20100
Closes apache#21453
Closes apache#21455
Closes apache#18477

Added:
Closes apache#21812
Closes apache#21787

Author: hyukjinkwon <[email protected]>

Closes apache#21781 from HyukjinKwon/closing-prs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants