Clarify the message further with a different exception for file which is ignored
Hirobe Keiichi committed Dec 14, 2018
commit 08850ae2f64449bae5c449e53c00fa5051479380
@@ -554,9 +554,13 @@ case class DataSource(

// Sufficient to check head of the globPath seq for non-glob scenario
// Don't need to check once again if files exist in streaming mode
if (checkFilesExist &&
(!fs.exists(globPath.head) || InMemoryFileIndex.shouldFilterOut(globPath.head.getName))) {
throw new AnalysisException(s"Path does not exist: ${globPath.head}")
if (checkFilesExist) {
val firstPath = globPath.head
Member

Also, does it make sense to check only the first file? It looks like multiple files could be detected.
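
To make the question concrete, here is a minimal sketch of checking every path instead of only the head. It is not the patch's code: it uses `java.nio.file` rather than Hadoop's `FileSystem`, and `isIgnored` is a hypothetical, simplified stand-in for `InMemoryFileIndex.shouldFilterOut` that assumes names starting with `_` or `.` are ignored.

```scala
import java.nio.file.{Files, Path}

object PathChecks {
  // Hypothetical, simplified rule (an assumption, not Spark's actual filter):
  // names starting with "_" or "." count as ignored.
  def isIgnored(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  sealed trait Problem
  final case class Missing(path: Path) extends Problem
  final case class Ignored(path: Path) extends Problem

  // Inspect every path rather than only the first one, and report
  // each missing or ignored path in input order.
  def findProblems(paths: Seq[Path]): Seq[Problem] =
    paths.flatMap { p =>
      // getFileName is null for a root path, so guard against that.
      val name = Option(p.getFileName).map(_.toString).getOrElse("")
      if (!Files.exists(p)) Some(Missing(p))
      else if (isIgnored(name)) Some(Ignored(p))
      else None
    }
}
```

With something along these lines, the caller could decide whether to fail on the first problem or report all of them in one message.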

if (!fs.exists(firstPath)) {
throw new AnalysisException(s"Path does not exist: ${firstPath}")
} else if (InMemoryFileIndex.shouldFilterOut(firstPath.getName)) {
throw new AnalysisException(s"Path exists but is ignored: ${firstPath}")
Member

@HyukjinKwon HyukjinKwon Dec 14, 2018

One thing I'm not sure about, though: this is going to throw an exception for, for instance,

spark.read.text("_text.txt").show()

instead of returning an empty DataFrame, which is kind of a behaviour change.

Member

Also, it looks like it's not going to check children.
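
As an illustration of the case this refers to: a directory can exist and have a non-ignored name while everything beneath it is ignored. The sketch below is only an assumption about how such a check might look. It uses `java.io.File` rather than Hadoop's `FileSystem`, and `isIgnored` is a hypothetical simplification, not Spark's actual filter.

```scala
import java.io.File

object ChildChecks {
  // Hypothetical, simplified rule (an assumption): names starting
  // with "_" or "." count as ignored.
  def isIgnored(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  // True if at least one non-ignored regular file is reachable under `f`,
  // skipping ignored files and ignored directories along the way.
  def hasVisibleFile(f: File): Boolean =
    if (isIgnored(f.getName)) false
    else if (f.isFile) true
    else
      // listFiles() returns null for a missing path or on an I/O error.
      Option(f.listFiles()).getOrElse(Array.empty[File]).exists(hasVisibleFile)
}
```

A check on the top-level name alone would accept such a directory even though reading it would still yield no rows.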

}
}
globPath
}.toSeq