Merged
Changes from 1 commit
52 commits
82fd38d
[SPARK-5200] Disable web UI in Hive ThriftServer tests
JoshRosen Jan 12, 2015
ef9224e
[SPARK-5102][Core]subclass of MapStatus needs to be registered with Kryo
lianhuiwang Jan 12, 2015
13e610b
SPARK-4159 [BUILD] Addendum: improve running of single test after ena…
srowen Jan 12, 2015
a3978f3
[SPARK-5078] Optionally read from SPARK_LOCAL_HOSTNAME
marmbrus Jan 12, 2015
aff49a3
SPARK-5172 [BUILD] spark-examples-***.jar shades a wrong Hadoop distr…
srowen Jan 12, 2015
3aed305
[SPARK-4999][Streaming] Change storeInBlockManager to false by default
jerryshao Jan 12, 2015
5d9fa55
[SPARK-5049][SQL] Fix ordering of partition columns in ParquetTableScan
marmbrus Jan 12, 2015
1e42e96
[SPARK-5138][SQL] Ensure schema can be inferred from a namedtuple
mulby Jan 13, 2015
f7741a9
[SPARK-5006][Deploy]spark.port.maxRetries doesn't work
WangTaoTheTonic Jan 13, 2015
9dea64e
[SPARK-4697][YARN]System properties should override environment varia…
WangTaoTheTonic Jan 13, 2015
39e333e
[SPARK-5131][Streaming][DOC]: There is a discrepancy in WAL implement…
uncleGen Jan 13, 2015
8ead999
[SPARK-5223] [MLlib] [PySpark] fix MapConverter and ListConverter in …
Jan 13, 2015
6463e0b
[SPARK-4912][SQL] Persistent tables for the Spark SQL data sources api
yhuai Jan 13, 2015
14e3f11
[SPARK-5168] Make SQLConf a field rather than mixin in SQLContext
rxin Jan 13, 2015
f996909
[SPARK-5123][SQL] Reconcile Java/Scala API for data types.
rxin Jan 14, 2015
d5eeb35
[SPARK-5167][SQL] Move Row into sql package and make it usable for Java.
rxin Jan 14, 2015
a3f7421
[SPARK-5248] [SQL] move sql.types.decimal.Decimal to sql.types.Decimal
adrian-wang Jan 14, 2015
81f72a0
[SPARK-5211][SQL]Restore HiveMetastoreTypes.toDataType
yhuai Jan 14, 2015
38bdc99
[SQL] some comments fix for GROUPING SETS
adrian-wang Jan 14, 2015
5840f54
[SPARK-2909] [MLlib] [PySpark] SparseVector in pyspark now supports i…
MechCoder Jan 14, 2015
9d4449c
[SPARK-5228][WebUI] Hide tables for "Active Jobs/Completed Jobs/Faile…
sarutak Jan 14, 2015
259936b
[SPARK-4014] Add TaskContext.attemptNumber and deprecate TaskContext.…
JoshRosen Jan 14, 2015
2fd7f72
[SPARK-5235] Make SQLConf Serializable
alexbaretta Jan 14, 2015
76389c5
[SPARK-5234][ml]examples for ml don't have sparkContext.stop
Jan 14, 2015
13d2406
[SPARK-5254][MLLIB] Update the user guide to position spark.ml better
mengxr Jan 15, 2015
cfa397c
[SPARK-5193][SQL] Tighten up SQLContext API
rxin Jan 15, 2015
6abc45e
[SPARK-5254][MLLIB] remove developers section from spark.ml guide
mengxr Jan 15, 2015
4b325c7
[SPARK-5193][SQL] Tighten up HiveContext API
rxin Jan 15, 2015
3c8650c
[SPARK-5224] [PySpark] improve performance of parallelize list/ndarray
Jan 15, 2015
1881431
[SPARK-5274][SQL] Reconcile Java and Scala UDFRegistration.
rxin Jan 16, 2015
65858ba
[Minor] Fix tiny typo in BlockManager
sarutak Jan 16, 2015
96c2c71
[SPARK-4857] [CORE] Adds Executor membership events to SparkListener
Jan 16, 2015
a79a9f9
[SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
Jan 16, 2015
2be82b1
[SPARK-1507][YARN]specify # cores for ApplicationMaster
WangTaoTheTonic Jan 16, 2015
e200ac8
[SPARK-5201][CORE] deal with int overflow in the ParallelCollectionRD…
advancedxy Jan 16, 2015
f6b852a
[DOCS] Fix typo in return type of cogroup
srowen Jan 16, 2015
e8422c5
[SPARK-5231][WebUI] History Server shows wrong job submission time.
sarutak Jan 16, 2015
ecf943d
[WebUI] Fix collapse of WebUI layout
sarutak Jan 16, 2015
d05c9ee
[SPARK-4923][REPL] Add Developer API to REPL to allow re-publishing t…
Jan 16, 2015
fd3a8a1
[SPARK-733] Add documentation on use of accumulators in lazy transfor…
Jan 16, 2015
ee1c1f3
[SPARK-4937][SQL] Adding optimization to simplify the And, Or condit…
scwf Jan 16, 2015
61b427d
[SPARK-5193][SQL] Remove Spark SQL Java-specific API.
rxin Jan 17, 2015
f3bfc76
[SQL][minor] Improved Row documentation.
rxin Jan 17, 2015
c1f3c27
[SPARK-4937][SQL] Comment for the newly optimization rules in `Boolea…
scwf Jan 17, 2015
6999910
[SPARK-5096] Use sbt tasks instead of vals to get hadoop version
marmbrus Jan 18, 2015
e7884bc
[SQL][Minor] Added comments and examples to explain BooleanSimplifica…
rxin Jan 18, 2015
e12b5b6
MAINTENANCE: Automated closing of pull requests.
pwendell Jan 18, 2015
ad16da1
[HOTFIX]: Minor clean up regarding skipped artifacts in build files.
pwendell Jan 18, 2015
1727e08
[SPARK-5279][SQL] Use java.math.BigDecimal as the exposed Decimal type.
rxin Jan 18, 2015
1a200a3
[SQL][Minor] Update sql doc according to data type APIs changes
scwf Jan 18, 2015
1955645
[SQL][minor] Put DataTypes.java in java dir.
rxin Jan 19, 2015
7dbf1fd
[SQL] fix typo in class description
Jan 19, 2015
[SPARK-5167][SQL] Move Row into sql package and make it usable for Java.
Mostly just moving stuff around. This should still be source compatible, since `org.apache.spark.sql.Row` was previously just a type alias.

Added the following APIs to Row:
```scala
def getMap[K, V](i: Int): scala.collection.Map[K, V]
def getJavaMap[K, V](i: Int): java.util.Map[K, V]
def getSeq[T](i: Int): Seq[T]
def getList[T](i: Int): java.util.List[T]
def getStruct(i: Int): Row
```
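
For a sense of how these getters compose, here is a minimal sketch against the API added in this patch (the row contents and variable names are hypothetical; note that `getStruct` returns a nested `Row`):

```scala
import org.apache.spark.sql._

// A row holding an array-typed, a map-typed, and a struct-typed field (illustrative values).
val row = Row(Seq(1, 2, 3), Map("a" -> 1), Row("nested"))

val xs: Seq[Int] = row.getSeq[Int](0)                                  // array field as a Scala Seq
val jxs: java.util.List[Int] = row.getList[Int](0)                     // same field as a java.util.List
val m: scala.collection.Map[String, Int] = row.getMap[String, Int](1)  // map field as a Scala Map
val jm: java.util.Map[String, Int] = row.getJavaMap[String, Int](1)    // same field as a java.util.Map
val nested: Row = row.getStruct(2)                                     // struct field as a nested Row
```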

Author: Reynold Xin <[email protected]>

Closes apache#4030 from rxin/sql-row and squashes the following commits:

6c85c29 [Reynold Xin] Fixed style violation by adding a new line to Row.scala.
82b064a [Reynold Xin] [SPARK-5167][SQL] Move Row into sql package and make it usable for Java.
rxin committed Jan 14, 2015
commit d5eeb35167e1ab72fab7778757163ff0aacaef2c
34 changes: 34 additions & 0 deletions sql/catalyst/src/main/java/org/apache/spark/sql/RowFactory.java
@@ -0,0 +1,34 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql;

import org.apache.spark.sql.catalyst.expressions.GenericRow;

/**
 * A factory class used to construct {@link Row} objects.
 */
public class RowFactory {

  /**
   * Create a {@link Row} from an array of values. Position i in the array becomes position i
   * in the created {@link Row} object.
   */
  public static Row create(Object[] values) {
    return new GenericRow(values);
  }
}
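
A quick sketch of constructing a row through this factory (values are hypothetical; Java callers would pass an `Object[]` directly, but the call is equally usable from Scala):

```scala
import org.apache.spark.sql.{Row, RowFactory}

// Field values are positional: index 0 is a name, index 1 an age (illustrative data only).
val row: Row = RowFactory.create(Array[AnyRef]("Alice", java.lang.Integer.valueOf(30)))
val name = row.getString(0)   // "Alice"
val age = row.getInt(1)       // 30
```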
240 changes: 240 additions & 0 deletions sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala
@@ -0,0 +1,240 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql

import org.apache.spark.sql.catalyst.expressions.GenericRow


object Row {
  /**
   * This method can be used to extract fields from a [[Row]] object in a pattern match. Example:
   * {{{
   * import org.apache.spark.sql._
   *
   * val pairs = sql("SELECT key, value FROM src").rdd.map {
   *   case Row(key: Int, value: String) =>
   *     key -> value
   * }
   * }}}
   */
  def unapplySeq(row: Row): Some[Seq[Any]] = Some(row)

  /**
   * This method can be used to construct a [[Row]] with the given values.
   */
  def apply(values: Any*): Row = new GenericRow(values.toArray)

  /**
   * This method can be used to construct a [[Row]] from a [[Seq]] of values.
   */
  def fromSeq(values: Seq[Any]): Row = new GenericRow(values.toArray)
}


/**
 * Represents one row of output from a relational operator. Allows both generic access by ordinal,
 * which will incur boxing overhead for primitives, and native primitive access.
 *
 * It is invalid to use the native primitive interface to retrieve a value that is null; instead,
 * a user must check `isNullAt` before attempting to retrieve a value that might be null.
 *
 * To create a new Row, use [[RowFactory.create()]] in Java or [[Row.apply()]] in Scala.
 *
 * A [[Row]] object can be constructed by providing field values. Example:
 * {{{
 * import org.apache.spark.sql._
 *
 * // Create a Row from values.
 * Row(value1, value2, value3, ...)
 * // Create a Row from a Seq of values.
 * Row.fromSeq(Seq(value1, value2, ...))
 * }}}
 *
 * A value of a row can be accessed through both generic access by ordinal, which will incur
 * boxing overhead for primitives, and native primitive access.
 * An example of generic access by ordinal:
 * {{{
 * import org.apache.spark.sql._
 *
 * val row = Row(1, true, "a string", null)
 * // row: Row = [1,true,a string,null]
 * val firstValue = row(0)
 * // firstValue: Any = 1
 * val fourthValue = row(3)
 * // fourthValue: Any = null
 * }}}
 *
 * For native primitive access, it is invalid to use the native primitive interface to retrieve
 * a value that is null; instead, a user must check `isNullAt` before attempting to retrieve a
 * value that might be null.
 * An example of native primitive access:
 * {{{
 * // using the row from the previous example.
 * val firstValue = row.getInt(0)
 * // firstValue: Int = 1
 * val isNull = row.isNullAt(3)
 * // isNull: Boolean = true
 * }}}
 *
 * Interfaces related to native primitive access are:
 *
 * `isNullAt(i: Int): Boolean`
 *
 * `getInt(i: Int): Int`
 *
 * `getLong(i: Int): Long`
 *
 * `getDouble(i: Int): Double`
 *
 * `getFloat(i: Int): Float`
 *
 * `getBoolean(i: Int): Boolean`
 *
 * `getShort(i: Int): Short`
 *
 * `getByte(i: Int): Byte`
 *
 * `getString(i: Int): String`
 *
 * In Scala, fields in a [[Row]] object can be extracted in a pattern match. Example:
 * {{{
 * import org.apache.spark.sql._
 *
 * val pairs = sql("SELECT key, value FROM src").rdd.map {
 *   case Row(key: Int, value: String) =>
 *     key -> value
 * }
 * }}}
 *
 * @group row
 */
trait Row extends Seq[Any] with Serializable {
  def apply(i: Int): Any

  /** Returns the value at position i. If the value is null, null is returned. */
  def get(i: Int): Any = apply(i)

  /** Checks whether the value at position i is null. */
  def isNullAt(i: Int): Boolean

  /**
   * Returns the value at position i as a primitive int.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getInt(i: Int): Int

  /**
   * Returns the value at position i as a primitive long.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getLong(i: Int): Long

  /**
   * Returns the value at position i as a primitive double.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getDouble(i: Int): Double

  /**
   * Returns the value at position i as a primitive float.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getFloat(i: Int): Float

  /**
   * Returns the value at position i as a primitive boolean.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getBoolean(i: Int): Boolean

  /**
   * Returns the value at position i as a primitive short.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getShort(i: Int): Short

  /**
   * Returns the value at position i as a primitive byte.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getByte(i: Int): Byte

  /**
   * Returns the value at position i as a String object.
   * Throws an exception if the type mismatches or if the value is null.
   */
  def getString(i: Int): String

  /**
   * Returns the value at position i of array type as a Scala Seq.
   * Throws an exception if the type mismatches.
   */
  def getSeq[T](i: Int): Seq[T] = apply(i).asInstanceOf[Seq[T]]

  /**
   * Returns the value at position i of array type as a [[java.util.List]].
   * Throws an exception if the type mismatches.
   */
  def getList[T](i: Int): java.util.List[T] = {
    scala.collection.JavaConversions.seqAsJavaList(getSeq[T](i))
  }

  /**
   * Returns the value at position i of map type as a Scala Map.
   * Throws an exception if the type mismatches.
   */
  def getMap[K, V](i: Int): scala.collection.Map[K, V] = apply(i).asInstanceOf[Map[K, V]]

  /**
   * Returns the value at position i of map type as a [[java.util.Map]].
   * Throws an exception if the type mismatches.
   */
  def getJavaMap[K, V](i: Int): java.util.Map[K, V] = {
    scala.collection.JavaConversions.mapAsJavaMap(getMap[K, V](i))
  }

  /**
   * Returns the value at position i of struct type as a [[Row]] object.
   * Throws an exception if the type mismatches.
   */
  def getStruct(i: Int): Row = getAs[Row](i)

  /**
   * Returns the value at position i.
   * Throws an exception if the type mismatches.
   */
  def getAs[T](i: Int): T = apply(i).asInstanceOf[T]

  override def toString(): String = s"[${this.mkString(",")}]"

  /**
   * Make a copy of the current [[Row]] object.
   */
  def copy(): Row

  /** Returns true if there are any NULL values in this row. */
  def anyNull: Boolean = {
    val l = length
    var i = 0
    while (i < l) {
      if (isNullAt(i)) { return true }
      i += 1
    }
    false
  }
}
@@ -49,6 +49,10 @@ package org.apache.spark.sql.catalyst
 */
package object expressions {

  type Row = org.apache.spark.sql.Row

  val Row = org.apache.spark.sql.Row

  /**
   * Converts a [[Row]] to another Row given a sequence of expressions that define each column of the
   * new row. If the schema of the input row is specified, then the given expression will be bound
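
Because the catalyst `expressions` package object now re-exports the moved class under its old name (both as a type alias and as a `val` for the companion object), code that referenced `Row` through the old package should keep compiling. A hedged sketch of what that compatibility looks like (the snippet is illustrative and not part of the patch):

```scala
import org.apache.spark.sql.catalyst.expressions.Row  // resolves via the alias added above

// Both construction and pattern matching go through the aliased companion object.
val r: Row = Row(1, "a")
val described = r match {
  case Row(i: Int, s: String) => s"$i -> $s"
}
```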