Pull request status: Closed · 24 commits (showing changes from 1 commit)
- b693209 · Ready for Pull request · Nov 11, 2014
- f378e16 · [SPARK-3974] Block Matrix Abstractions ready · Nov 11, 2014
- aa8f086 · [SPARK-3974] Additional comments added · Nov 11, 2014
- 589fbb6 · [SPARK-3974] Code review feedback addressed · Nov 14, 2014
- 19c17e8 · [SPARK-3974] Changed blockIdRow and blockIdCol · Nov 14, 2014
- b05aabb · [SPARK-3974] Updated tests to reflect changes · brkyvz · Nov 14, 2014
- 645afbe · [SPARK-3974] Pull latest master · brkyvz · Nov 14, 2014
- 49b9586 · [SPARK-3974] Updated testing utils from master · brkyvz · Nov 14, 2014
- d033861 · [SPARK-3974] Removed SubMatrixInfo and added constructor without part… · brkyvz · Nov 15, 2014
- 9ae85aa · [SPARK-3974] Made partitioner a variable inside BlockMatrix instead o… · brkyvz · Nov 20, 2014
- ab6cde0 · [SPARK-3974] Modifications cleaning code up, making size calculation … · brkyvz · Jan 14, 2015
- ba414d2 · [SPARK-3974] fixed frobenius norm · brkyvz · Jan 14, 2015
- 239ab4b · [SPARK-3974] Addressed @jkbradley's comments · brkyvz · Jan 19, 2015
- 1e8bb2a · [SPARK-3974] Change return type of cache and persist · brkyvz · Jan 20, 2015
- 1a63b20 · [SPARK-3974] Remove setPartition method. Isn't required · brkyvz · Jan 20, 2015
- eebbdf7 · preliminary changes addressing code review · brkyvz · Jan 21, 2015
- f9d664b · updated API and modified partitioning scheme · brkyvz · Jan 21, 2015
- 1694c9e · almost finished addressing comments · brkyvz · Jan 27, 2015
- 140f20e · Merge branch 'master' of github.com:apache/spark into SPARK-3974 · brkyvz · Jan 27, 2015
- 5eecd48 · fixed gridPartitioner and added tests · brkyvz · Jan 27, 2015
- 24ec7b8 · update grid partitioner · mengxr · Jan 28, 2015
- e1d3ee8 · minor updates · mengxr · Jan 28, 2015
- feb32a7 · update tests · mengxr · Jan 28, 2015
- a8eace2 · Merge pull request #2 from mengxr/brkyvz-SPARK-3974 · brkyvz · Jan 28, 2015
[SPARK-3974] Changed blockIdRow and blockIdCol
Burak Yavuz committed Nov 14, 2014
commit 19c17e8d1594a3f7bd5a973a09b341de3a1c857a
@@ -28,27 +28,27 @@ import org.apache.spark.util.Utils
 /**
  * Represents a local matrix that makes up one block of a distributed BlockMatrix
  *
- * @param blockIdRow The row index of this block
- * @param blockIdCol The column index of this block
+ * @param blockRowIndex The row index of this block
+ * @param blockColIndex The column index of this block
  * @param mat The underlying local matrix
  */
-case class SubMatrix(blockIdRow: Int, blockIdCol: Int, mat: DenseMatrix) extends Serializable
+case class SubMatrix(blockRowIndex: Int, blockColIndex: Int, mat: DenseMatrix) extends Serializable

 /**
  * Information of the submatrices of the BlockMatrix maintained on the driver
  *
  * @param partitionId The id of the partition the block is found in
- * @param blockIdRow The row index of this block
- * @param blockIdCol The column index of this block
+ * @param blockRowIndex The row index of this block
+ * @param blockColIndex The column index of this block
  * @param startRow The starting row index with respect to the distributed BlockMatrix
  * @param numRows The number of rows in this block
  * @param startCol The starting column index with respect to the distributed BlockMatrix
  * @param numCols The number of columns in this block
  */
 case class SubMatrixInfo(
     partitionId: Int,
-    blockIdRow: Int,
-    blockIdCol: Int,
+    blockRowIndex: Int,
+    blockColIndex: Int,
     startRow: Long,
Contributor comment:
Ditto. BlockPartitionInfo -> SubmatrixInfo?

Can startRow be determined from the block row index plus the partitioner? Are we going to support irregular grids?

Contributor Author reply:
That's a good question. I left it in for the following reason: I'm not sure it will ever be required, but I thought it would be good to cover corner cases such as irregular grids.

Assume you have a matrix A with dimensions 280 x d, where each SubMatrix has dimensions 30 x d/3; the last block row will then consist of SubMatrices of size 10 x d/3. If you then vertically append a matrix B with dimensions n x d, you're left with an irregular grid.

Vertical concatenation may not be as common as horizontal concatenation, but being ready to support such operations seems beneficial for users.
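The block-size arithmetic in that example can be checked with a small standalone sketch (the numbers come from the comment above; the `blockHeights` helper is illustrative only, not part of the PR):

```scala
// Splitting 280 rows into blocks of 30 leaves a trailing block of 10 rows,
// which is why the last block row of A is smaller than the others.
val numRows = 280
val rowsPerBlock = 30
val fullBlocks = numRows / rowsPerBlock   // 9 full blocks
val remainder = numRows % rowsPerBlock    // 10 leftover rows
val blockHeights =
  Seq.fill(fullBlocks)(rowsPerBlock) ++ (if (remainder > 0) Seq(remainder) else Nil)

println(blockHeights)      // List(30, 30, 30, 30, 30, 30, 30, 30, 30, 10)
println(blockHeights.sum)  // 280
```

Appending a matrix whose row count is not a multiple of 30 below these blocks then yields a grid with unequal block heights, which is the irregular case discussed above.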

     numRows: Int,
     startCol: Long,
@@ -228,7 +228,7 @@ class BlockMatrix(
     // collect may cause akka frameSize errors
     val blockStartRowColsParts = matrixRDD.mapPartitionsWithIndex { case (partId, iter) =>
       iter.map { case (id, block) =>
-        ((block.blockIdRow, block.blockIdCol), (partId, block.mat.numRows, block.mat.numCols))
+        ((block.blockRowIndex, block.blockColIndex), (partId, block.mat.numRows, block.mat.numCols))
       }
     }.collect()
     val blockStartRowCols = blockStartRowColsParts.sortBy(_._1)
@@ -283,9 +283,9 @@ class BlockMatrix(
   private def keyBy(part: BlockMatrixPartitioner = partitioner): RDD[(Int, SubMatrix)] = {
     rdd.map { block =>
       part match {
-        case r: RowBasedPartitioner => (block.blockIdRow, block)
-        case c: ColumnBasedPartitioner => (block.blockIdCol, block)
-        case g: GridPartitioner => (block.blockIdRow + numRowBlocks * block.blockIdCol, block)
+        case r: RowBasedPartitioner => (block.blockRowIndex, block)
+        case c: ColumnBasedPartitioner => (block.blockColIndex, block)
+        case g: GridPartitioner => (block.blockRowIndex + numRowBlocks * block.blockColIndex, block)
         case _ => throw new IllegalArgumentException("Unrecognized partitioner")
       }
     }
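The GridPartitioner case keys each block with `blockRowIndex + numRowBlocks * blockColIndex`, a column-major linearization of the block grid. A standalone sketch (the 3 x 4 grid dimensions are made up for illustration) shows that this assigns every block a distinct key:

```scala
// Column-major linear index, mirroring the GridPartitioner key in the diff.
val numRowBlocks = 3
val numColBlocks = 4
def gridKey(blockRowIndex: Int, blockColIndex: Int): Int =
  blockRowIndex + numRowBlocks * blockColIndex

// Walking the grid column by column yields the keys 0, 1, ..., 11 in order,
// so each of the 3 x 4 blocks gets a unique key.
val keys = for (c <- 0 until numColBlocks; r <- 0 until numRowBlocks)
  yield gridKey(r, c)
println(keys.toList)  // List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
```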
@@ -304,7 +304,7 @@ class BlockMatrix(

   /** Collect the distributed matrix on the driver. */
   def collect(): DenseMatrix = {
-    val parts = rdd.map(x => ((x.blockIdRow, x.blockIdCol), x.mat)).
+    val parts = rdd.map(x => ((x.blockRowIndex, x.blockColIndex), x.mat)).
       collect().sortBy(x => (x._1._2, x._1._1))
     val nRows = numRows().toInt
Member comment:
I'd check numRows and numCols here before converting to Int and throw an error if the matrix is too large.
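One possible shape for the guard suggested here, as a hedged sketch (the `checkedToInt` helper name and error message are hypothetical, not the PR's code):

```scala
// Fail fast with a clear error instead of silently overflowing on toInt.
def checkedToInt(dim: Long, name: String): Int = {
  require(dim <= Int.MaxValue,
    s"Cannot collect the matrix on the driver: $name ($dim) exceeds Int.MaxValue")
  dim.toInt
}

println(checkedToInt(280L, "numRows"))  // 280
```

`require` throws an IllegalArgumentException with the given message when the dimension does not fit in an Int, so the failure surfaces before any allocation is attempted.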

     val nCols = numCols().toInt