Closed
Changes from 1 commit
49 commits
d7a06b8
Updated SparkConf class to add getOrCreate method. Started test suite…
Apr 13, 2015
a99032f
Spacing fix
Apr 14, 2015
e92caf7
[SPARK-6703] Added test to ensure that getOrCreate both allows creati…
Apr 14, 2015
8be2f83
Replaced match with if
Apr 14, 2015
733ec9f
Fixed some bugs in test code
Apr 14, 2015
dfec4da
Changed activeContext to AtomicReference
Apr 14, 2015
0e1567c
Got rid of unecessary option for AtomicReference
Apr 14, 2015
15e8dea
Updated comments and added MiMa Exclude
Apr 14, 2015
270cfe3
[SPARK-6703] Documentation fixes
Apr 14, 2015
cb0c6b7
Doc updates and code cleanup
Apr 14, 2015
8c884fa
Made getOrCreate synchronized
Apr 14, 2015
1dc0444
Added ref equality check
Apr 14, 2015
db9a963
Closing second spark context
Apr 17, 2015
5390fd9
Merge remote-tracking branch 'upstream/master' into SPARK-5932
Apr 18, 2015
09ea450
[SPARK-5932] Added byte string conversion to Jav utils
Apr 18, 2015
747393a
[SPARK-5932] Added unit tests for ByteString conversion
Apr 18, 2015
a9f4fcf
[SPARK-5932] Added unit tests for unit conversion
Apr 18, 2015
851d691
[SPARK-5932] Updated memoryStringToMb to use new interfaces
Apr 18, 2015
475370a
[SPARK-5932] Simplified ByteUnit code, switched to using longs. Updat…
Apr 18, 2015
0cdff35
[SPARK-5932] Updated to use bibibytes in method names. Updated spark.…
Apr 18, 2015
b809a78
[SPARK-5932] Updated spark.kryoserializer.buffer.max
Apr 18, 2015
eba4de6
[SPARK-5932] Updated spark.shuffle.file.buffer.kb
Apr 18, 2015
1fbd435
[SPARK-5932] Updated spark.broadcast.blockSize
Apr 18, 2015
2d15681
[SPARK-5932] Updated spark.executor.logs.rolling.size.maxBytes
Apr 18, 2015
ae7e9f6
[SPARK-5932] Updated spark.io.compression.snappy.block.size
Apr 18, 2015
afc9a38
[SPARK-5932] Updated spark.broadcast.blockSize and spark.storage.memo…
Apr 18, 2015
7a6c847
[SPARK-5932] Updated spark.shuffle.file.buffer
Apr 18, 2015
5d29f90
[SPARK-5932] Finished documentation updates
Apr 18, 2015
928469e
[SPARK-5932] Converted some longs to ints
Apr 18, 2015
35a7fa7
Minor formatting
Apr 18, 2015
0f4443e
Merge remote-tracking branch 'upstream/master' into SPARK-5932
Apr 18, 2015
f15f209
Fixed conversion of kryo buffer size
Apr 19, 2015
f32bc01
[SPARK-5932] Fixed error in API in SparkConf.scala where Kb conversio…
Apr 19, 2015
69e2f20
Updates to code
Apr 21, 2015
54b78b4
Simplified byteUnit class
Apr 21, 2015
c7803cd
Empty lines
Apr 21, 2015
fe286b4
Resolved merge conflict
Apr 21, 2015
d3d09b6
[SPARK-5932] Fixing error in KryoSerializer
Apr 21, 2015
84a2581
Added smoother handling of fractional values for size parameters. Thi…
Apr 21, 2015
8b43748
Fixed error in pattern matching for doubles
Apr 21, 2015
e428049
resolving merge conflict
Apr 22, 2015
3dfae96
Fixed some nits. Added automatic conversion of old paramter for kryos…
Apr 22, 2015
22413b1
Made MAX private
Apr 22, 2015
9ee779c
Simplified fraction matches
Apr 22, 2015
852a407
[SPARK-5932] Added much improved overflow handling. Can now handle si…
Apr 23, 2015
fc85733
Got rid of floating point math
Apr 24, 2015
2ab886b
Scala style
Apr 24, 2015
49a8720
Whitespace fix
Apr 24, 2015
11f6999
Nit fixes
Apr 24, 2015
Fixed some nits. Added automatic conversion of old paramter for kryoserializer.mb to new values.
Ilya Ganelin committed Apr 22, 2015
commit 3dfae96ee8f594e4136c6916326ef7e0fe70b4be
5 changes: 3 additions & 2 deletions core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -507,8 +507,9 @@ private[spark] object SparkConf extends Logging {
       translation = s => s"${s.toLong * 10}s")),
     "spark.reducer.maxSizeInFlight" -> Seq(
       AlternateConfig("spark.reducer.maxMbInFlight", "1.4")),
-    "spark.kryoserializer.buffer" -> Seq(
-      AlternateConfig("spark.kryoserializer.buffer.mb", "1.4")),
+    "spark.kryoserializer.buffer" ->
+      Seq(AlternateConfig("spark.kryoserializer.buffer.mb", "1.4",
+        translation = s => s"${s.toDouble * 1000}k")),
Author
This automatic translation may throw a NumberFormatException if someone uses the .mb parameter with a value like "64k" (i.e. the correct new format). Is that a case we should be concerned with? There will be enough warnings and errors thrown for them to readily track down the problem and fix the erroneous config, so this should be ok, but I want to confirm that.

Contributor
I think since you're adding the new config, it's fine not to let the old config use the new style. If you really want to support that, you could try to parse the config using the new API in an exception handler.
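A minimal sketch of that fallback idea (class and method names here are hypothetical, not from the PR): translate a bare legacy MB count into the new suffixed form, and pass through a value that already carries a unit suffix instead of failing. The `* 1000` factor matches the translation in the diff above.

```java
// Hypothetical sketch of the suggested exception-handler fallback.
class LegacyConfSketch {
    static String translateMbValue(String s) {
        try {
            // Old format: a bare number of megabytes, e.g. "64" -> "64000.0k".
            return (Double.parseDouble(s) * 1000) + "k";
        } catch (NumberFormatException e) {
            // Value like "64k" already uses the new suffixed format;
            // pass it through for the new parser to handle.
            return s;
        }
    }
}
```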

Contributor
very small nit, but I would put the Seq( on L510 to be consistent with the rest.

"spark.kryoserializer.buffer.max" -> Seq(
AlternateConfig("spark.kryoserializer.buffer.max.mb", "1.4")),
"spark.shuffle.file.buffer" -> Seq(
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -1082,7 +1082,7 @@ private[spark] object Utils extends Logging {
def memoryStringToMb(str: String): Int = {
// Convert to bytes, rather than directly to MB, because when no units are specified the unit
// is assumed to be bytes
-    (JavaUtils.byteStringAsBytes(str) / 1024.0d / 1024.0d).toInt
+    (JavaUtils.byteStringAsBytes(str) / 1024 / 1024).toInt
Contributor
super nit: you could drop the JavaUtils. prefix in the call.

}
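The arithmetic in that change can be sketched in isolation (hypothetical class name, not the PR's code): going through bytes first and then integer-dividing down to MiB truncates any fractional part, same as the `.toInt` above.

```java
// Sketch of the bytes -> MB conversion used by memoryStringToMb:
// successive integer division truncates toward zero.
class BytesToMbSketch {
    static int bytesToMb(long bytes) {
        return (int) (bytes / 1024 / 1024);
    }
}
```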

/**
@@ -18,11 +18,11 @@

public enum ByteUnit {
BYTE (1),
-  KiB (1024l),
-  MiB ((long) Math.pow(1024l, 2l)),
-  GiB ((long) Math.pow(1024l, 3l)),
-  TiB ((long) Math.pow(1024l, 4l)),
-  PiB ((long) Math.pow(1024l, 5l));
+  KiB (1024L),
+  MiB ((long) Math.pow(1024L, 2L)),
+  GiB ((long) Math.pow(1024L, 3L)),
+  TiB ((long) Math.pow(1024L, 4L)),
+  PiB ((long) Math.pow(1024L, 5L));

private ByteUnit(long multiplier) {
this.multiplier = multiplier;
@@ -39,25 +39,19 @@ public long convert(long d, ByteUnit u) {
return toBytes(d) / u.multiplier;
Contributor
If you want to be really correct here, you could avoid overflows by playing with the multipliers instead of converting things to bytes first.

I think what's bugging me is that the semantics of all these methods are a little weird. It seems like you're trying to cap the maximum amount to be represented to Long.MAX_VALUE bytes (so that having Long.MAX_VALUE PB, for example, would be wrong since you can't convert that to bytes). I'm not sure that's needed, but if you want that, it should be enforced differently (and not here). Otherwise, I'd rework these methods to avoid overflows where possible, and throw exceptions when they would happen.

Contributor
I saw your comment about using double - I don't think that's a great idea because doubles lose precision as you try to work with values at different orders of magnitude.

Regarding the last paragraph of my comment above, I don't think it's going to be an issue in practice; but the code here can be changed to at least avoid overflows where possible. I checked j.u.c.TimeUnit, used in the time functions in this class, and it seems to follow the approach you took: when an overflow is inevitable it caps the value at Long.MAX_VALUE. So that part is fine.

}
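The overflow-avoiding multiply the reviewer describes can be sketched like this (hypothetical class name; the saturation policy mirrors j.u.c.TimeUnit's). It assumes a positive multiplier, which holds for all the byte-unit multipliers, and checks against the multiplier rather than the value so the guard matches the multiplication actually performed.

```java
// Saturating multiply: cap at Long.MAX_VALUE / Long.MIN_VALUE instead of
// silently wrapping. Assumes m > 0 (true for byte-unit multipliers).
class SaturatingMultiplySketch {
    static long saturatedMultiply(long d, long m) {
        if (d == 0) { return 0; }
        long over = Long.MAX_VALUE / m;
        if (d > over) { return Long.MAX_VALUE; }
        if (d < -over) { return Long.MIN_VALUE; }
        return d * m;
    }
}
```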

-  public long toBytes(long d) { return x(d, multiplier); }
+  public long toBytes(long d) {
+    if (d == 0) { return 0; }
+    long over = MAX / d;
Contributor
Doesn't seem worth it to have a private MAX just for this one use. Use Long.MAX_VALUE.

+    if (d > over) return Long.MAX_VALUE;
Contributor
Hmmm... I feel like it would be better to throw an exception instead of truncating.

+    if (d < -over) return Long.MIN_VALUE;
Contributor
Negative byte counts sound a little weird, but well. Same comment as above, though.

+    return d * multiplier;
+  }
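For the exception-throwing alternative the reviewer suggests, `Math.multiplyExact` (Java 8+) already implements the overflow check; a hypothetical sketch, not the PR's code:

```java
// Strict variant of toBytes: Math.multiplyExact throws ArithmeticException
// when the product overflows a long, instead of capping the value.
class StrictToBytesSketch {
    static long toBytesStrict(long d, long multiplier) {
        return Math.multiplyExact(d, multiplier);
    }
}
```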
public long toKiB(long d) { return convert(d, KiB); }
public long toMiB(long d) { return convert(d, MiB); }
public long toGiB(long d) { return convert(d, GiB); }
public long toTiB(long d) { return convert(d, TiB); }
public long toPiB(long d) { return convert(d, PiB); }

-  long multiplier = 0;
+  private long multiplier = 0;
static final long MAX = Long.MAX_VALUE;
Contributor
not used anywhere?

Author
Wanted to still keep these in case they were used down the line. Would you recommend getting rid of them?

Contributor
If you want this to be available, then it should be public. But I'd avoid adding it until it's actually needed.


/**
* Scale d by m, checking for overflow.
* This has a short name to make above code more readable.
*/
static long x(long d, long m) {
if (d == 0) { return 0; }
long over = MAX / d;
if (d > over) return Long.MAX_VALUE;
if (d < -over) return Long.MIN_VALUE;
return d * m;
}
}
@@ -212,7 +212,7 @@ private static long parseByteString(String str, ByteUnit unit) {
Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
Matcher fractionMatcher = Pattern.compile("([0-9]*\\.[0-9]*)([a-z]+)?").matcher(lower);

-    if(m.matches()) {
+    if (m.matches()) {
Contributor
nit:

if (!m.matches()) {
  throw...
}

String suffix = ...;
long val = ...;

long val = Long.parseLong(m.group(1));
String suffix = m.group(2);

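The reviewer's early-return restructuring can be sketched as follows (hypothetical class name; the real parseByteString accepts many more suffixes and delegates the unit math to ByteUnit — only "k" and "m" are handled here for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the suffix parsing, restructured to bail out early on a
// non-matching input as the reviewer suggests.
class ParseByteStringSketch {
    static long parseBytes(String str) {
        String lower = str.toLowerCase().trim();
        Matcher m = Pattern.compile("([0-9]+)([a-z]+)?").matcher(lower);
        if (!m.matches()) {
            throw new NumberFormatException("Failed to parse byte string: " + str);
        }
        long val = Long.parseLong(m.group(1));
        String suffix = m.group(2);          // null when no unit is given
        if (suffix == null) { return val; }  // bare number: assume bytes
        if (suffix.equals("k")) { return val * 1024; }
        if (suffix.equals("m")) { return val * 1024 * 1024; }
        throw new NumberFormatException("Unknown suffix in: " + str);
    }
}
```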