[SPARK-7943][SQL][WIP]support DataFrame created by hiveContext can create table to specific database by saveAstable #6494

baishuo · 2015-05-29T09:47:04Z

No description provided.

SparkQA · 2015-05-29T11:50:27Z

Test build #33731 has finished for PR 6494 at commit 8939610.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

@deprecated

Author: Cheng Lian <[email protected]> Closes apache#6529 from liancheng/schemardd-deprecation-fix and squashes the following commits: 49765c2 [Cheng Lian] Adds @deprecated Scaladoc entry for SchemaRDD

…tly. Author: Reynold Xin <[email protected]> This patch had conflicts when merged, resolved by Committer: Reynold Xin <[email protected]> Closes apache#6527 from rxin/covariant-equals and squashes the following commits: e7d7784 [Reynold Xin] [SPARK-7975] Enforce CovariantEqualsChecker

Author: Reynold Xin <[email protected]> Closes apache#6533 from rxin/whitespace-2 and squashes the following commits: 038314c [Reynold Xin] [SPARK-3850] Trim trailing spaces for core.

Author: Reynold Xin <[email protected]> Closes apache#6530 from rxin/trim-whitespace-1 and squashes the following commits: 7b7b3a0 [Reynold Xin] Reset again. dc14597 [Reynold Xin] Reset scalastyle. cd556c4 [Reynold Xin] YARN, Kinesis, Flume. 4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming.

Author: Reynold Xin <[email protected]> Closes apache#6535 from rxin/whitespace-sql and squashes the following commits: de50316 [Reynold Xin] [SPARK-3850] Trim trailing spaces for SQL.

Author: Reynold Xin <[email protected]> Closes apache#6536 from rxin/structural-type-checker and squashes the following commits: f833151 [Reynold Xin] Fixed compilation. 633f9a1 [Reynold Xin] Fixed typo. d1fa804 [Reynold Xin] [SPARK-7979] Enforce structural type checker.

Add license for dagre-d3 and graphlib-dot Author: zsxwing <[email protected]> Closes apache#6539 from zsxwing/LICENSE and squashes the following commits: 82b0475 [zsxwing] Add license for dagre-d3 and graphlib-dot

Author: Reynold Xin <[email protected]> Closes apache#6534 from rxin/whitespace-mllib and squashes the following commits: 38926e3 [Reynold Xin] [SPARK-3850] Trim trailing spaces for MLlib.

add save load for examples: KMeansModel PowerIterationClusteringModel Word2VecModel IsotonicRegressionModel Author: Yuhao Yang <[email protected]> Closes apache#6498 from hhbyyh/docSaveLoad and squashes the following commits: 7f9f06d [Yuhao Yang] add missing imports c604cad [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into docSaveLoad 1dd77cc [Yuhao Yang] update document with some missing save/load

Author: Reynold Xin <[email protected]> Closes apache#6541 from rxin/trailing-whitespace-on and squashes the following commits: f72ebe4 [Reynold Xin] [SPARK-3850] Turn style checker on for trailing whitespaces.

Author: Sun Rui <[email protected]> Closes apache#6183 from sun-rui/SPARK-7227 and squashes the following commits: dd6f5b3 [Sun Rui] Rename readEnv() back to readMap(). Add alias na.omit() for dropna(). 41cf725 [Sun Rui] [SPARK-7227][SPARKR] Support fillna / dropna in R DataFrame.

PySpark SQL's `readerwriter` and `window` doctests weren't being run by our test runner script; this patch re-enables them. Author: Josh Rosen <[email protected]> Closes apache#6542 from JoshRosen/enable-more-pyspark-sql-tests and squashes the following commits: 9f46ce4 [Josh Rosen] Enable PySpark SQL readerwriter and window tests.

(1) rules that we enforce. (2) rules that we would like to enforce, but haven't cleaned up the codebase to turn on yet (or we need to make the scalastyle rule more configurable). (3) rules that we don't want to enforce. Author: Reynold Xin <[email protected]> Closes apache#6543 from rxin/scalastyle and squashes the following commits: beefaab [Reynold Xin] [SPARK-7986] Split scalastyle config into 3 sections.

Author: Davies Liu <[email protected]> Closes apache#6532 from davies/decimal and squashes the following commits: c7fcbce [Davies Liu] Update tests.py 1425359 [Davies Liu] DecimalType should not be singleton

… numeric type is broken. The origin code has several problems: * `true <=> 1` will return false as we didn't set a rule to handle it. * `true = a` where `a` is not `Literal` and its value is 1, will return false as we only handle literal values. Author: Wenchen Fan <[email protected]> Closes apache#6505 from cloud-fan/tmp1 and squashes the following commits: 77f0f39 [Wenchen Fan] minor fix b6401ba [Wenchen Fan] add type coercion for CaseKeyWhen and address comments ebc8c61 [Wenchen Fan] use SQLTestUtils and If 625973c [Wenchen Fan] improve 9ba2130 [Wenchen Fan] address comments fc0d741 [Wenchen Fan] fix style 2846a04 [Wenchen Fan] fix 7952

Also cut trailing whitespaces. Author: Reynold Xin <[email protected]> Closes apache#6548 from rxin/readme and squashes the following commits: 630efc3 [Reynold Xin] Update README to include DataFrames and zinc.

…ata receiving pwendell tdas Author: Nishkam Ravi <[email protected]> Author: nishkamravi2 <[email protected]> Author: nravi <[email protected]> Closes apache#6544 from nishkamravi2/master_nravi and squashes the following commits: 46e8c03 [Nishkam Ravi] Slight modification to streaming docs

Increase the duration and timeout in streaming python tests. Author: Davies Liu <[email protected]> Closes apache#6239 from davies/flaky_tests and squashes the following commits: d6aee8f [Davies Liu] fix window tests 26317f7 [Davies Liu] Merge branch 'master' of github.com:apache/spark into flaky_tests 7947db6 [Davies Liu] fix streaming flaky tests

This PR adds a section in the user guide for `VectorAssembler` with code examples in Python/Java/Scala. It also adds a unit test in Java. jkbradley Author: Xiangrui Meng <[email protected]> Closes apache#6556 from mengxr/SPARK-7584 and squashes the following commits: 11313f6 [Xiangrui Meng] simplify Java example 0cd47f3 [Xiangrui Meng] update user guide fd36292 [Xiangrui Meng] update Java unit test ce61ca0 [Xiangrui Meng] add Java unit test for VectorAssembler e399942 [Xiangrui Meng] scala/python example code

…) to prevent leaking of actors StreamingContext.start() can throw exception because DStream.validateAtStart() fails (say, checkpoint directory not set for StateDStream). But by then JobScheduler, JobGenerator, and ReceiverTracker has already started, along with their actors. But those cannot be shutdown because the only way to do that is call StreamingContext.stop() which cannot be called as the context has not been marked as ACTIVE. The solution in this PR is to stop the internal scheduler if start throw exception, and mark the context as STOPPED. Author: Tathagata Das <[email protected]> Closes apache#6559 from tdas/SPARK-7958 and squashes the following commits: 20b2ec1 [Tathagata Das] Added synchronized 790b617 [Tathagata Das] Handled exception in StreamingContext.start()

Currently if a bad log type if specified, then we get blank. We should provide a more informative error message.

This prevents the spark.jars from being cleared while using `--packages` or `--jars` cc pwendell davies brkyvz Author: Shivaram Venkataraman <[email protected]> Closes apache#6568 from shivaram/SPARK-8028 and squashes the following commits: 3a9cf1f [Shivaram Venkataraman] Use addJar instead of setJars in SparkR This prevents the spark.jars from being cleared

…l for pairs that don't appear Author: Reynold Xin <[email protected]> Closes apache#6566 from rxin/crosstab and squashes the following commits: e0ace1c [Reynold Xin] [SPARK-7982][SQL] DataFrame.stat.crosstab should use 0 instead of null for pairs that don't appear

Author: Reynold Xin <[email protected]> Closes apache#6565 from rxin/alias and squashes the following commits: 286d880 [Reynold Xin] [SPARK-8026][SQL] Add Column.alias to Scala/Java DataFrame API

Also use that profile in create-release.sh cc pwendell -- Note that this means that we need `knitr` and `roxygen` installed on the machines used for building the release. Let me know if you need help with that. Author: Shivaram Venkataraman <[email protected]> Closes apache#6567 from shivaram/SPARK-8027 and squashes the following commits: 8dc8ecf [Shivaram Venkataraman] Add maven profile to build R package docs Also use that profile in create-release.sh

…freqItem API Author: Reynold Xin <[email protected]> Closes apache#6569 from rxin/freqItemsWarning and squashes the following commits: 7eec145 [Reynold Xin] [minor doc] Add exploratory data analysis warning for DataFrame.stat.freqItem API.

…onstructed too early https://issues.apache.org/jira/browse/SPARK-8020 Author: Yin Huai <[email protected]> Closes apache#6563 from yhuai/SPARK-8020 and squashes the following commits: 4e5addc [Yin Huai] style bf766c6 [Yin Huai] Failed test. 0398f5b [Yin Huai] First populate the SQLConf and then construct executionHive and metadataHive.

…ve get constructed too early" This reverts commit 91f6be8.

…treaming methods Scala `deprecated` annotation actually doesn't show up in JavaDoc. Author: zsxwing <[email protected]> Closes apache#6564 from zsxwing/SPARK-8025 and squashes the following commits: 2faa2bb [zsxwing] Add JavaDoc style deprecation for deprecated Streaming methods

when i test the following code: hiveContext.sql("""use testdb""") val df = (1 to 3).map(i => (i, s"val_$i", i * 2)).toDF("a", "b", "c") df.write .format("parquet") .mode(SaveMode.Overwrite) .saveAsTable("ttt3") hiveContext.sql("show TABLES in default") found that the table ttt3 will be created under the database "default" Author: baishuo <[email protected]> Closes apache#6695 from baishuo/SPARK-8516-use-database and squashes the following commits: 9e155f9 [baishuo] remove no use comment cb9f027 [baishuo] modify testcase 00a7a2d [baishuo] modify testcase 4df48c7 [baishuo] modify testcase b742e69 [baishuo] modify testcase 3d19ad9 [baishuo] create table to specific database

chenghao-intel adrian-wang Author: dragonli <[email protected]> Author: zhichao.li <[email protected]> Closes apache#6838 from zhichao-li/positive and squashes the following commits: e1032a0 [dragonli] remove useless import and refactor code 624d438 [zhichao.li] add positive identify function

The problem occurs because the position mask `0xEFFFFFF` is incorrect. It has zero 25th bit, so when capacity grows beyond 2^24, `OpenHashMap` calculates incorrect index of value in `_values` array. I've also added a size check in `rehash()`, so that it fails instead of reporting invalid item indices. Author: Vyacheslav Baranov <[email protected]> Closes apache#6763 from SlavikBaranov/SPARK-8309 and squashes the following commits: 8557445 [Vyacheslav Baranov] Resolved review comments 4d5b954 [Vyacheslav Baranov] Resolved review comments eaf1e68 [Vyacheslav Baranov] Fixed failing test f9284fd [Vyacheslav Baranov] Resolved review comments 3920656 [Vyacheslav Baranov] SPARK-8309: Support for more than 12M items in OpenHashMap

JIRA: https://issues.apache.org/jira/browse/SPARK-7199 Author: Liang-Chi Hsieh <[email protected]> Closes apache#5984 from viirya/add_date_timestamp and squashes the following commits: 7f21ce9 [Liang-Chi Hsieh] For comment. 0b89698 [Liang-Chi Hsieh] Add timestamp to settableFieldTypes. c30d490 [Liang-Chi Hsieh] Use default IntUnsafeColumnWriter and LongUnsafeColumnWriter. 672ef17 [Liang-Chi Hsieh] Remove getter/setter for Date and Timestamp and use Int and Long for them. 9f3e577 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp 281e844 [Liang-Chi Hsieh] Fix scala style. fb532b5 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp 80af342 [Liang-Chi Hsieh] Fix compiling error. f4f5de6 [Liang-Chi Hsieh] Fix scala style. a463e83 [Liang-Chi Hsieh] Use Long to store timestamp for rows. 635388a [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp 46946c6 [Liang-Chi Hsieh] Adapt for moved DateUtils. b16994e [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp 752251f [Liang-Chi Hsieh] Support setDate. Fix failed test. fcf8db9 [Liang-Chi Hsieh] Add functions for Date and Timestamp to SpecificRow. e42a809 [Liang-Chi Hsieh] Fix style. 4c07b57 [Liang-Chi Hsieh] Add date and timestamp support to UnsafeRow.

MatrixUDT was recently coded in scala. This has been ported to PySpark Author: MechCoder <[email protected]> Closes apache#6354 from MechCoder/spark-6390 and squashes the following commits: fc4dc1e [MechCoder] Better error message c940a44 [MechCoder] Added test aa9c391 [MechCoder] Add pyUDT to MatrixUDT 62a2a7d [MechCoder] [SPARK-6390] Port MatrixUDT to PySpark

All, this is a first attempt at refactoring `dev/run-tests` into Python. Initially I merely converted all Bash calls over to Python, then moved to a much more modular approach (more functions, moved the calls around, etc.). What is here is the initial culmination and should provide a great base to various downstream issues (e.g. SPARK-7016, modularize / parallelize testing, etc.). Would love comments / suggestions for this initial first step! /cc srowen pwendell nchammas Author: Brennon York <[email protected]> Closes apache#5694 from brennonyork/SPARK-7017 and squashes the following commits: 154ed73 [Brennon York] updated finding java binary if JAVA_HOME not set 3922a85 [Brennon York] removed necessary passed in variable f9fbe54 [Brennon York] reverted doc test change 8135518 [Brennon York] removed the test check for documentation changes until jenkins can get updated 05d435b [Brennon York] added check for jekyll install 22edb78 [Brennon York] add check if jekyll isn't installed on the path 2dff136 [Brennon York] fixed pep8 whitespace errors 767a668 [Brennon York] fixed path joining issues, ensured docs actually build on doc changes c42cf9a [Brennon York] unpack set operations with splat (*) fb85a41 [Brennon York] fixed minor set bug 0379833 [Brennon York] minor doc addition to print the changed modules aa03d9e [Brennon York] added documentation builds as a top level test component, altered high level project changes to properly execute core tests only when necessary, changed variable names for simplicity ec1ae78 [Brennon York] minor name changes, bug fixes b7c72b9 [Brennon York] reverting streaming context 03fdd7b [Brennon York] fixed the tuple () wraps around example lambda 705d12e [Brennon York] changed example to comply with pep3113 supporting python3 60b3d51 [Brennon York] prepend rather than append onto PATH 7d2f5e2 [Brennon York] updated python tests to remove unused variable 2898717 [Brennon York] added a change to streaming test to check if it only runs streaming tests eb684b6 [Brennon York] fixed sbt_test_goals reference error db7ae6f [Brennon York] reverted SPARK_HOME from start of command 1ecca26 [Brennon York] fixed merge conflicts 2fcdfc0 [Brennon York] testing targte branch dump on jenkins 1f607b1 [Brennon York] finalizing revisions to modular tests 8afbe93 [Brennon York] made error codes a global 0629de8 [Brennon York] updated to refactor and remove various small bugs, removed pep8 complaints d90ab2d [Brennon York] fixed merge conflicts, ensured that for regular builds both core and sql tests always run b1248dc [Brennon York] exec python rather than running python and exiting with return code f9deba1 [Brennon York] python to python2 and removed newline 6d0a052 [Brennon York] incorporated merge conflicts with SPARK-7249 f950010 [Brennon York] removed building hive-0.12.0 per SPARK-6908 703f095 [Brennon York] fixed merge conflicts b1ca593 [Brennon York] reverted the sparkR test afeb093 [Brennon York] updated to make sparkR test fail 1dada6b [Brennon York] reverted pyspark test failure 9a592ec [Brennon York] reverted mima exclude issue, added pyspark test failure d825aa4 [Brennon York] revert build break, add mima break f041d8a [Brennon York] added space from commented import to now test build breaking 983f2a2 [Brennon York] comment out import to fail build test 2386785 [Brennon York] Merge remote-tracking branch 'upstream/master' into SPARK-7017 76335fb [Brennon York] reverted rat license issue for sparkconf e4a96cc [Brennon York] removed the import error and added license error, fixed the way run-tests and run-tests.py report their error codes 56d3cb9 [Brennon York] changed test back and commented out import to break compile b37328c [Brennon York] fixed typo and added default return is no error block was found in the environment 7613558 [Brennon York] updated to return the proper env variable for return codes a5bd445 [Brennon York] reverted license, changed test in shuffle to fail 803143a [Brennon York] removed license file for SparkContext b0b2604 [Brennon York] comment out import to see if build fails and returns properly 83e80ef [Brennon York] attempt at better python output when called from bash c095fa6 [Brennon York] removed another wait() call 26e18e8 [Brennon York] removed unnecessary wait() 07210a9 [Brennon York] minor doc string change for java version with namedtuple update ec03bf3 [Brennon York] added namedtuple for java version to add readability 2cb413b [Brennon York] upcased global variables, changes various calling methods from check_output to check_call 639f1e9 [Brennon York] updated with pep8 rules, fixed minor bugs, added run-tests file in bash to call the run-tests.py script 3c53a1a [Brennon York] uncomment the scala tests :) 6126c4f [Brennon York] refactored run-tests into python

…hildren For example large IN clauses Large IN clauses are parsed very slowly. For example SQL below (10K items in IN) takes 45-50s. s"""SELECT * FROM Person WHERE ForeName IN ('${(1 to 10000).map("n" + _).mkString("','")}')""" This is principally due to TreeNode which repeatedly call contains on children, where children in this case is a List that is 10K long. In effect parsing for large IN clauses is O(N squared). A lazily initialised Set based on children for contains reduces parse time to around 2.5s Author: Michael Davies <[email protected]> Closes apache#6673 from MickDavies/SPARK-8077 and squashes the following commits: 38cd425 [Michael Davies] SPARK-8077: Optimization for TreeNodes with large numbers of children d80103b [Michael Davies] SPARK-8077: Optimization for TreeNodes with large numbers of children e6be8be [Michael Davies] SPARK-8077: Optimization for TreeNodes with large numbers of children

start-slave.sh no longer takes a worker # param in 1.4+ Author: Sean Owen <[email protected]> Closes apache#6855 from srowen/SPARK-8395 and squashes the following commits: 300278e [Sean Owen] start-slave.sh no longer takes a worker # param in 1.4+

to make it easier to start & stop http servers in sbt https://issues.apache.org/jira/browse/SPARK-6782 Author: Imran Rashid <[email protected]> Closes apache#5426 from squito/SPARK-6782 and squashes the following commits: dc4fb19 [Imran Rashid] add sbt-revolved plugin, to make it easier to start & stop http servers in sbt

… in non-binary expression of HiveTypeCoercion 1. Given a query `select coalesce(null, 1, '1') from dual` will cause exception: java.lang.RuntimeException: Could not determine return type of Coalesce for IntegerType,StringType 2. Given a query: `select case when true then 1 else '1' end from dual` will cause exception: java.lang.RuntimeException: Types in CASE WHEN must be the same or coercible to a common type: StringType != IntegerType I checked the code, the main cause is the HiveTypeCoercion doesn't do implicit convert when there is a IntegerType and StringType. Numeric types can be promoted to string type Hive will always do this implicit conversion. Author: OopsOutOfMemory <[email protected]> Closes apache#6551 from OopsOutOfMemory/pnts and squashes the following commits: 7a209d7 [OopsOutOfMemory] rebase master 6018613 [OopsOutOfMemory] convert function to method 4cd5618 [OopsOutOfMemory] limit the data type to primitive type df365d2 [OopsOutOfMemory] refine 95cbd58 [OopsOutOfMemory] fix style 403809c [OopsOutOfMemory] promote non-string to string when can not found tighestCommonTypeOfTwo

…rnalBlockStore is initialized externalBlockStoreInitialized is never set to be true, which causes the blocks stored in ExternalBlockStore can not be removed. Author: Mingfei <[email protected]> Closes apache#6702 from shimingfei/SetTrue and squashes the following commits: add61d8 [Mingfei] Set externalBlockStoreInitialized to be true, after ExternalBlockStore is initialized

…on not started The history server may show an incorrect App ID for an incomplete application like <App ID>.inprogress. This app info will never disappear even after the app is completed. ![incorrectappinfo](https://cloud.githubusercontent.com/assets/9278199/8156147/2a10fdbe-137d-11e5-9620-c5b61d93e3c1.png) The cause of the issue is that a log path name is used as the app id when app id cannot be got during replay. Author: Carson Wang <[email protected]> Closes apache#6827 from carsonwang/SPARK-8372 and squashes the following commits: cdbb089 [Carson Wang] Fix code style 3e46b35 [Carson Wang] Update code style 90f5dde [Carson Wang] Add a unit test d8c9cd0 [Carson Wang] Replaying events only return information when app is started

… calling sum on an empty RDD This PR fixes the sum issue and also adds `emptyRDD` so that it's easy to create a test case. Author: zsxwing <[email protected]> Closes apache#6826 from zsxwing/python-emptyRDD and squashes the following commits: b36993f [zsxwing] Update the return type to JavaRDD[T] 71df047 [zsxwing] Add emptyRDD to pyspark and fix the issue when calling sum on an empty RDD

…uffe, PartitionedSerializedPairBuffer and AppendOnlyMap The previous growing strategy is alway doubling the capacity. This PR adjusts the growing strategy: doubling the capacity but if overflow, use the maximum capacity as the new capacity. It increases the maximum capacity of PartitionedPairBuffer from `2 ^ 29` to `2 ^ 30 - 1`, the maximum capacity of PartitionedSerializedPairBuffer from `2 ^ 28` to `(2 ^ 29) - 1`, and the maximum capacity of AppendOnlyMap from `0.7 * (2 ^ 29)` to `(2 ^ 29)`. Author: zsxwing <[email protected]> Closes apache#6456 from zsxwing/SPARK-7913 and squashes the following commits: abcb932 [zsxwing] Address comments e30b61b [zsxwing] Increase the maximum capacity of AppendOnlyMap 05b6420 [zsxwing] Update the exception message 64fe227 [zsxwing] Increase the maximum capacity of PartitionedPairBuffer and PartitionedSerializedPairBuffer

This PR is a improvement for apache#5189. The resolution rule for ORDER BY is: first resolve based on what comes from the select clause and then fall back on its child only when this fails. There are 2 steps. First, try to resolve `Sort` in `ResolveReferences` based on select clause, and ignore exceptions. Second, try to resolve `Sort` in `ResolveSortReferences` and add missing projection. However, the way we resolve `SortOrder` is wrong. We just resolve `UnresolvedAttribute` and use the result to indicate if we can resolve `SortOrder`. But `UnresolvedAttribute` is only part of `GetField` chain(broken by `GetItem`), so we need to go through the whole chain to indicate if we can resolve `SortOrder`. With this change, we can also avoid re-throw GetField exception in `CheckAnalysis` which is little ugly. Author: Wenchen Fan <[email protected]> Closes apache#5659 from cloud-fan/order-by and squashes the following commits: cfa79f8 [Wenchen Fan] update test 3245d28 [Wenchen Fan] minor improve 465ee07 [Wenchen Fan] address comment 1fc41a2 [Wenchen Fan] fix SPARK-7067

…o the HiveConf inside executionHive.state. https://issues.apache.org/jira/browse/SPARK-8306 I will try to add a test later. marmbrus aarondav Author: Yin Huai <[email protected]> Closes apache#6758 from yhuai/SPARK-8306 and squashes the following commits: 1292346 [Yin Huai] [SPARK-8306] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state.

…the tests more reliable KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non-thread-safe collections to collect data in one thread and check it in another thread. It may fail the tests. This PR changes them to thread-safe collections. Note: I cannot reproduce the test failures in my environment. But at least, this PR should make the tests more reliable. Author: zsxwing <[email protected]> Closes apache#6852 from zsxwing/fix-KafkaStreamSuite and squashes the following commits: d464211 [zsxwing] Use thread-safe collections to make the tests more reliable

We encourage people to use TestHive in unit tests, because it's impossible to create more than one HiveContext within one process. The current implementation locks people into using a local[2] SparkContext underlying their HiveContext. We should make it possible to override this using a system property so that people can test against local-cluster or remote spark clusters to make their tests more realistic. Author: Punya Biswal <[email protected]> Closes apache#6844 from punya/feature/SPARK-8397 and squashes the following commits: 97ef394 [Punya Biswal] [SPARK-8397][SQL] Allow custom configuration for TestHive

baishuo · 2015-06-18T04:34:49Z

get error after rebase, close this and create a same one as this : #6868

SparkQA · 2015-06-18T06:04:20Z

Test build #35088 has finished for PR 6494 at commit 8833772.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

liancheng and others added 29 commits May 30, 2015 23:49

[SQL] [MINOR] Adds @deprecated Scaladoc entry for SchemaRDD

8764dcc

Author: Cheng Lian <[email protected]> Closes apache#6529 from liancheng/schemardd-deprecation-fix and squashes the following commits: 49765c2 [Cheng Lian] Adds @deprecated Scaladoc entry for SchemaRDD

[SPARK-3850] Trim trailing spaces for core.

74fdc97

Author: Reynold Xin <[email protected]> Closes apache#6533 from rxin/whitespace-2 and squashes the following commits: 038314c [Reynold Xin] [SPARK-3850] Trim trailing spaces for core.

[SPARK-3850] Trim trailing spaces for SQL.

63a50be

Author: Reynold Xin <[email protected]> Closes apache#6535 from rxin/whitespace-sql and squashes the following commits: de50316 [Reynold Xin] [SPARK-3850] Trim trailing spaces for SQL.

[MINOR] Add license for dagre-d3 and graphlib-dot

d1d2def

Add license for dagre-d3 and graphlib-dot Author: zsxwing <[email protected]> Closes apache#6539 from zsxwing/LICENSE and squashes the following commits: 82b0475 [zsxwing] Add license for dagre-d3 and graphlib-dot

[SPARK-3850] Trim trailing spaces for MLlib.

e1067d0

Author: Reynold Xin <[email protected]> Closes apache#6534 from rxin/whitespace-mllib and squashes the following commits: 38926e3 [Reynold Xin] [SPARK-3850] Trim trailing spaces for MLlib.

[SPARK-3850] Turn style checker on for trailing whitespaces.

866652c

Author: Reynold Xin <[email protected]> Closes apache#6541 from rxin/trailing-whitespace-on and squashes the following commits: f72ebe4 [Reynold Xin] [SPARK-3850] Turn style checker on for trailing whitespaces.

[SPARK-7978] [SQL] [PYSPARK] DecimalType should not be singleton

91777a1

Author: Davies Liu <[email protected]> Closes apache#6532 from davies/decimal and squashes the following commits: c7fcbce [Davies Liu] Update tests.py 1425359 [Davies Liu] DecimalType should not be singleton

Update README to include DataFrames and zinc.

3c01568

Also cut trailing whitespaces. Author: Reynold Xin <[email protected]> Closes apache#6548 from rxin/readme and squashes the following commits: 630efc3 [Reynold Xin] Update README to include DataFrames and zinc.

[MINOR] [UI] Improve error message on log page

15d7c90

Currently if a bad log type if specified, then we get blank. We should provide a more informative error message.

[SPARK-8026][SQL] Add Column.alias to Scala/Java DataFrame API

89f642a

Author: Reynold Xin <[email protected]> Closes apache#6565 from rxin/alias and squashes the following commits: 286d880 [Reynold Xin] [SPARK-8026][SQL] Add Column.alias to Scala/Java DataFrame API

Revert "[SPARK-8020] Spark SQL in spark-defaults.conf make metadataHi…

75dda33

…ve get constructed too early" This reverts commit 91f6be8.

baishuo and others added 27 commits June 16, 2015 16:40

Closes apache#6850.

e3de14d

[HOTFIX] [PROJECT-INFRA] Fix bug in dev/run-tests for MLlib-only PRs

165f52f

modify for rebase

e3a51bb

modify for rebase2

ef0eb17

add test case

6397c0a

correct style

34f010d

modify testcase

58ce973

modfiy for rebase 3

8cb69f3

modify for rebase 4

8833772

baishuo closed this Jun 18, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SPARK-7943][SQL][WIP]support DataFrame created by hiveContext can create table to specific database by saveAstable #6494

[SPARK-7943][SQL][WIP]support DataFrame created by hiveContext can create table to specific database by saveAstable #6494

Uh oh!

baishuo commented May 29, 2015

Uh oh!

SparkQA commented May 29, 2015

Uh oh!

baishuo commented Jun 18, 2015

Uh oh!

SparkQA commented Jun 18, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

[SPARK-7943][SQL][WIP]support DataFrame created by hiveContext can create table to specific database by saveAstable #6494

[SPARK-7943][SQL][WIP]support DataFrame created by hiveContext can create table to specific database by saveAstable #6494

Uh oh!

Conversation

baishuo commented May 29, 2015

Uh oh!

SparkQA commented May 29, 2015

Uh oh!

baishuo commented Jun 18, 2015

Uh oh!

SparkQA commented Jun 18, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants