12 changes: 3 additions & 9 deletions external/docker-integration-tests/pom.xml
@@ -130,15 +130,9 @@
       <artifactId>postgresql</artifactId>
       <scope>test</scope>
     </dependency>
-    <!-- Oracle ojdbc jar, used for oracle integration suite for docker testing.
-      See https://github.com/apache/spark/pull/11306 for background on why we need
-      to use a an ojdbc jar for the testcase. The maven dependency here is commented
-      because currently the maven repository does not contain the ojdbc jar mentioned.
-      Once the jar is available in maven, this could be uncommented. -->
-    <dependency>
-      <groupId>com.oracle</groupId>
-      <artifactId>ojdbc6</artifactId>
-      <version>11.2.0.1.0</version>
+    <dependency>
+      <groupId>com.oracle.database.jdbc</groupId>
+      <artifactId>ojdbc8</artifactId>
       <scope>test</scope>
     </dependency>
@@ -95,6 +95,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
 
   protected val dockerIp = DockerUtils.getDockerIp()
   val db: DatabaseOnDocker
+  val connectionTimeout = timeout(2.minutes)
 
   private var docker: DockerClient = _
   protected var externalPort: Int = _
@@ -155,7 +156,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
     docker.startContainer(containerId)
     jdbcUrl = db.getJdbcUrl(dockerIp, externalPort)
     var conn: Connection = null
-    eventually(timeout(2.minutes), interval(1.second)) {
+    eventually(connectionTimeout, interval(1.second)) {
       conn = getConnection()
     }
     // Run any setup queries:
@@ -21,6 +21,8 @@ import java.math.BigDecimal
 import java.sql.{Connection, Date, Timestamp}
 import java.util.{Properties, TimeZone}
 
+import org.scalatest.time.SpanSugar._
+
 import org.apache.spark.sql.{Row, SaveMode}
 import org.apache.spark.sql.execution.{RowDataSourceScanExec, WholeStageCodegenExec}
 import org.apache.spark.sql.execution.datasources.LogicalRelation
@@ -31,27 +33,27 @@ import org.apache.spark.sql.types._
 import org.apache.spark.tags.DockerTest
 
 /**
- * This patch was tested using the Oracle docker. Created this integration suite for the same.
- * The ojdbc6-11.2.0.2.0.jar was to be downloaded from the maven repository. Since there was
- * no jdbc jar available in the maven repository, the jar was downloaded from oracle site
- * manually and installed in the local; thus tested. So, for SparkQA test case run, the
- * ojdbc jar might be manually placed in the local maven repository(com/oracle/ojdbc6/11.2.0.2.0)
- * while Spark QA test run.
- *
  * The following would be the steps to test this
  * 1. Build Oracle database in Docker, please refer below link about how to.
  *    https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
  * 2. export ORACLE_DOCKER_IMAGE_NAME=$ORACLE_DOCKER_IMAGE_NAME
  *    Pull oracle $ORACLE_DOCKER_IMAGE_NAME image - docker pull $ORACLE_DOCKER_IMAGE_NAME
  * 3. Start docker - sudo service docker start
- * 4. Download oracle 11g driver jar and put it in maven local repo:
- *    (com/oracle/ojdbc6/11.2.0.2.0/ojdbc6-11.2.0.2.0.jar)
- * 5. The timeout and interval parameter to be increased from 60,1 to a high value for oracle test
- *    in DockerJDBCIntegrationSuite.scala (Locally tested with 200,200 and executed successfully).
- * 6. Run spark test - ./build/sbt "test-only org.apache.spark.sql.jdbc.OracleIntegrationSuite"
+ * 4. Run spark test - ./build/sbt -Pdocker-integration-tests
+ *    "test-only org.apache.spark.sql.jdbc.OracleIntegrationSuite"
  *
- * All tests in this suite are ignored because of the dependency with the oracle jar from maven
- * repository.
+ * An actual sequence of commands to run the test is as follows
+ *
+ * $ git clone https://github.com/oracle/docker-images.git
+ * // Head SHA: 3e352a22618070595f823977a0fd1a3a8071a83c
+ * $ cd docker-images/OracleDatabase/SingleInstance/dockerfiles
+ * $ ./buildDockerImage.sh -v 18.4.0 -x
+ * $ export ORACLE_DOCKER_IMAGE_NAME=oracle/database:18.4.0-xe
+ * $ cd $SPARK_HOME
+ * $ ./build/sbt -Pdocker-integration-tests
+ *   "test-only org.apache.spark.sql.jdbc.OracleIntegrationSuite"
+ *
+ * It has been validated with 18.4.0 Express Edition.
  */
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSparkSession {
@@ -60,15 +62,19 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSpark
   override val db = new DatabaseOnDocker {
     override val imageName = sys.env("ORACLE_DOCKER_IMAGE_NAME")
     override val env = Map(
-      "ORACLE_ROOT_PASSWORD" -> "oracle"
+      "ORACLE_PWD" -> "oracle"
     )
     override val usesIpc = false
     override val jdbcPort: Int = 1521
     override def getJdbcUrl(ip: String, port: Int): String =
       s"jdbc:oracle:thin:system/oracle@//$ip:$port/xe"
   }
 
+  override val connectionTimeout = timeout(7.minutes)
+
   override def dataPreparation(conn: Connection): Unit = {
+    // In 18.4.0 Express Edition auto commit is enabled by default.
+    conn.setAutoCommit(false)
     conn.prepareStatement("CREATE TABLE datetime (id NUMBER(10), d DATE, t TIMESTAMP)")
       .executeUpdate()
     conn.prepareStatement(
6 changes: 6 additions & 0 deletions pom.xml
@@ -984,6 +984,12 @@
       <version>8.2.2.jre8</version>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>com.oracle.database.jdbc</groupId>
+      <artifactId>ojdbc8</artifactId>
+      <version>19.6.0.0</version>
+      <scope>test</scope>
+    </dependency>
Member:
We need to update this root pom in this PR? It seems this change is related to the other PR?

Contributor Author:
This change is questionable, but I think rolling it back and adding it in the next commit ends up being more work overall. Apart from that, it's purely personal taste, but I like handling dependencies the dependency-management way (even if only one reference exists).
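
(For context, a brief sketch of the Maven pattern being discussed, abridged from the two pom.xml changes in this diff: the root pom's dependencyManagement section pins the ojdbc8 version once, and the module pom can then declare the dependency without repeating the version.)

<!-- root pom.xml: dependencyManagement pins the driver version in one place -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.oracle.database.jdbc</groupId>
      <artifactId>ojdbc8</artifactId>
      <version>19.6.0.0</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- external/docker-integration-tests/pom.xml: inherits the managed version, so none is declared here -->
<dependencies>
  <dependency>
    <groupId>com.oracle.database.jdbc</groupId>
    <artifactId>ojdbc8</artifactId>
    <scope>test</scope>
  </dependency>
</dependencies>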

Member:
Hmm, but if this PR is reverted for some reason after the two PRs have been merged, will the other PR break? I think each PR should be as independent as possible for backports/reverts.

Contributor Author (@gaborgsomogyi, Jun 23, 2020):
Honestly, I hadn't considered independence, but I agree with the direction. I'd like to understand your point but can't quite yet. Here we're deciding on which side the dependency management should live. We can treat the dependency-management change as additional complexity on top of the actual code base. Moving it from this PR into the other one would make this PR more independent, but on the other hand it would make the other one less independent. I see this as a trade-off.

Member:
> but on the other hand it would make the other one less independent.

Do you mean #28863 could avoid the dependency on this Oracle driver in the root pom? I thought you added the dependency in #28863 for adding some secure connector tests in the core package:
https://github.com/apache/spark/pull/28863/files#diff-0b713a218f4e99665e339404f7b6c6c3R153-R157

Member:
For this one, I believe it's okay because we now have all the SQL drivers in one place. In this root pom, we already have MySQL/Maria/PostgreSQL/DB2/MsSqlServer.

Member:
What do you think about that, @maropu?

Member:
Ah, I see. Looks okay now. Thanks, @dongjoon-hyun and @gaborgsomogyi.

Contributor Author:
Thank you guys for the clarification.

     <dependency>
       <groupId>org.apache.curator</groupId>
       <artifactId>curator-recipes</artifactId>