@sumwale sumwale commented Dec 1, 2016

Changes proposed in this pull request

Single-column dictionary optimization that uses the dictionary indexes of a column batch instead of the strings themselves. This is the simpler variant, in which an array of the dictionary size is created. The array is populated on demand with the corresponding MapEntry object of the main HashMap (i.e. look up and fill in if found, else put an EMPTY marker if not found), so that after the first miss the other columns can be updated or fetched directly from the array, skipping the map completely (update for GROUP BY, fetch for JOIN). More details can be found in the class comments of DictionaryOptimizedMapAccessor.
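To make the idea concrete, here is a minimal hand-written sketch of the array-per-dictionary technique. It is not the code emitted by DictionaryOptimizedMapAccessor; the class, method, and field names (`DictionaryLookupSketch`, `Entry`, `aggregateBatch`, `countJoinMatches`) are hypothetical, and a plain `java.util.HashMap` stands in for the main map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names are not taken from the PR.
final class DictionaryLookupSketch {
    // Stands in for the MapEntry of the main map (key plus aggregate state).
    static final class Entry {
        final String key;
        long count;
        Entry(String key) { this.key = key; }
    }

    // Marker cached for dictionary values that are absent from the map (JOIN case).
    private static final Entry EMPTY = new Entry(null);

    // GROUP BY case: count rows per key. Each row carries an index into the
    // batch dictionary instead of the string itself.
    static void aggregateBatch(String[] dictionary, int[] rowDictIndexes,
                               Map<String, Entry> mainMap) {
        // One slot per dictionary value, filled lazily.
        Entry[] byDictIndex = new Entry[dictionary.length];
        for (int dictIndex : rowDictIndexes) {
            Entry e = byDictIndex[dictIndex];
            if (e == null) {
                // First miss for this dictionary value: a single hash lookup
                // (or insert), then the entry is remembered in the array.
                e = mainMap.computeIfAbsent(dictionary[dictIndex], Entry::new);
                byDictIndex[dictIndex] = e;
            }
            e.count++; // later rows with the same index never touch the map
        }
    }

    // JOIN case: count probe rows that have a match on the build side.
    static int countJoinMatches(String[] dictionary, int[] rowDictIndexes,
                                Map<String, Entry> buildSide) {
        Entry[] byDictIndex = new Entry[dictionary.length];
        int matches = 0;
        for (int dictIndex : rowDictIndexes) {
            Entry e = byDictIndex[dictIndex];
            if (e == null) {
                e = buildSide.get(dictionary[dictIndex]);
                if (e == null) e = EMPTY; // remember the miss as well
                byDictIndex[dictIndex] = e;
            }
            if (e != EMPTY) matches++;
        }
        return matches;
    }
}
```

In this sketch the number of hash lookups per batch is bounded by the dictionary size rather than the row count, which is where the benefit for low-cardinality string columns would come from.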

  • new DictionaryOptimizedMapAccessor that checks for the single-column dictionary key case and generates code for it (array creation, fetch from map on miss, and return of the other columns)

  • refactored the map lookup code, both in the code generator ObjectMapAccessor and in the actual generated code, so that it can be invoked for the dictionary case as well as the other cases; this introduces no overhead and is in fact slightly more efficient in some cases, since the JVM can dynamically decide whether or not to inline the method call depending on CPU instruction cache size

  • Changed the pattern in join consume. Earlier it invoked "moveNext" at the start, assuming the iterator is placed before the first row. Now the generated code assumes instead that the iterator is already placed at the first row (with the above changes it is difficult to fit in the "before first row" pattern). Consume code may call "continue" expecting to move to the next row, which would now go into an infinite loop. To circumvent this, the consume code is surrounded with an otherwise useless "do { consume } while(false);" so that a continue breaks out of that block and then goes on to moveNext. The result looks like "while (true) { do { consume } while(false); moveNext }"

  • Use a generic map in SnappySession to keep track of any additional "context" objects during code generation. It is used to pass around dictionary variable names and a new "finallyCode" block, which combines multiple "try {} finally {}" blocks in generated code into a single block.

  • Added a HashedObjectCache for the LocalJoin map that is shared by multiple partitions on the same node. This reduces both the effort to create the map and the memory overhead, and hence gives better CPU cache behaviour. The map is created on first get and removed when the last reference is removed (so it could be created multiple times in a single query, once for each set of scheduled partitions on a node). This behaviour avoids the complexity of invalidation while adding minimal overhead.

  • Handle StartsWith predicate for MAX/MIN by treating it like a range. 'ABC%' is treated as ">= 'ABC' and < 'ABD'"

  • Skip creation of SnappyHashAggregateExec completely if code generation is not possible (due to an ImperativeAggregate). This allows the doExecute of SnappyHashAggregateExec to simply fall back to code generation, assuming it will never fail.

  • Added Utils.metricsMethods and called it from all Snappy optimized plans to allow invoking optimized primitive methods, avoiding boxing/unboxing overhead for SQLMetrics (see the snappy-spark PR linked below)

  • Remove the opt=F case that skipped the optimized implementation for LocalJoin and HashAggregate. It is no longer useful for comparison and does not work for LocalJoin with the changes in this PR. Removed from TPCETrade and disabled in LocalJoin.
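The join consume loop shape described in the list above can be sketched by hand as follows. This is only an illustration of the control flow, not the generated code: `JoinConsumeLoopSketch` and `consumeEvens` are hypothetical names, and a `java.util.Iterator` stands in for the join iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of "while (true) { do { consume } while(false); moveNext }".
final class JoinConsumeLoopSketch {
    // Keeps even values only; the consume body uses `continue` to skip a row.
    static List<Integer> consumeEvens(Iterator<Integer> rows) {
        List<Integer> out = new ArrayList<>();
        if (!rows.hasNext()) return out;
        // The loop starts with the iterator already positioned at the first row.
        int current = rows.next();
        while (true) {
            do {
                // --- consume code, which may call `continue` ---
                if (current % 2 != 0) {
                    // In a do-while, `continue` jumps to the condition check;
                    // the condition is false, so control leaves the inner block.
                    continue;
                }
                out.add(current);
            } while (false);
            // --- moveNext: reached both after normal completion and after `continue` ---
            if (!rows.hasNext()) break;
            current = rows.next();
        }
        return out;
    }
}
```

Without the inner `do { } while (false)`, the `continue` would target the outer `while (true)` and skip the moveNext step, re-processing the same row forever.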

Patch testing

precheckin

ReleaseNotes.txt changes

NA

Other PRs

TIBCOSoftware/snappy-spark#33

Sumedh Wale added 10 commits December 1, 2016 19:39
The cache is created when the first partition asks for it and maintained
only till the last partition references it. This means that the map
can potentially get re-created multiple times if all partitions did
not get scheduled to an executor in one shot.

This is acceptable given that this avoids the complications of
invalidating the cache.
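
The lifecycle described in this commit message (create on first request, drop on last release, rebuild if requested again) can be sketched as a small reference-counted cache. This is an assumption-laden illustration rather than the HashedObjectCache implementation: `RefCountedCacheSketch`, `acquire`, `release`, and `size` are hypothetical names, and building the value under a single lock is a simplification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration of a reference-counted shared cache.
final class RefCountedCacheSketch<K, V> {
    private static final class Holder<V> {
        final V value;
        int refs;
        Holder(V value) { this.value = value; }
    }

    private final Map<K, Holder<V>> holders = new HashMap<>();

    // Returns the shared value, building it only if no one currently holds it.
    synchronized V acquire(K key, Supplier<V> builder) {
        Holder<V> h = holders.get(key);
        if (h == null) {
            // Simplification: the build runs under the cache-wide lock.
            h = new Holder<>(builder.get());
            holders.put(key, h);
        }
        h.refs++;
        return h.value;
    }

    // Drops one reference; the entry is removed when the last holder releases it,
    // so no separate invalidation logic is needed.
    synchronized void release(K key) {
        Holder<V> h = holders.get(key);
        if (h != null && --h.refs == 0) {
            holders.remove(key);
        }
    }

    synchronized int size() { return holders.size(); }
}
```

A caller would pair each `acquire` with a `release` in a `finally` block. If a later wave of partitions arrives after the count has dropped to zero, the value is simply built again, which is the trade-off the commit message accepts.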
- groupBy/join operations much faster on single column dictionary strings:
  base groupBy/join or combination is 2-3X faster
- many other optimizations in groupBy/join generated code; overall single column
  integer groupBy/join is also 1.5X faster
…essions in group by expressions else aggregate expressions can consume and empty the ExprCode.code
- normal createMap() function as before; Callable class is separate since
  code may have "shouldStop()" which cannot be invoked even from sub-class
  so instead the Callable class now calls createMap()
No longer useful in comparison and does not work anymore for LocalJoin, so disabled it
and removed from TPCETrade test (will be completely removed once Hemant's changes are merged)
generated code to skip shouldStop() under certain conditions (aggregations) was causing trouble
in some queries so removed for now

increased the default buckets in local mode now that partition overhead is smaller
(and will become more so with SNAP-1190)
Sumedh Wale added 5 commits December 2, 2016 02:58
Check for column batch skipping for LIKE 'XYZ%' kind of queries.
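
As a rough sketch of how a prefix predicate such as LIKE 'XYZ%' can be turned into a range check against per-batch MIN/MAX statistics: the prefix is the inclusive lower bound, and the prefix with its last character incremented is the exclusive upper bound. The names `PrefixRangeSketch`, `upperBound`, and `canSkipBatch` are hypothetical, and the sketch assumes Java `String.compareTo` ordering, which may differ from the byte-wise comparison used by the engine.

```java
// Hypothetical illustration of treating a StartsWith predicate as a range.
final class PrefixRangeSketch {
    // Smallest string that is greater than every string starting with `prefix`,
    // or null when no such bound exists (empty prefix or all chars at maximum).
    static String upperBound(String prefix) {
        int end = prefix.length();
        // Trailing maximum-valued chars cannot be incremented; drop them.
        while (end > 0 && prefix.charAt(end - 1) == Character.MAX_VALUE) {
            end--;
        }
        if (end == 0) return null;
        char last = (char) (prefix.charAt(end - 1) + 1);
        return prefix.substring(0, end - 1) + last;
    }

    // True when no value in [batchMin, batchMax] can start with `prefix`.
    static boolean canSkipBatch(String batchMin, String batchMax, String prefix) {
        // Every value in the batch sorts before the prefix.
        if (batchMax.compareTo(prefix) < 0) return true;
        // Every value in the batch sorts at or after the exclusive upper bound.
        String upper = upperBound(prefix);
        return upper != null && batchMin.compareTo(upper) >= 0;
    }
}
```

For the prefix 'ABC' this gives the range ">= 'ABC' and < 'ABD'", so a batch whose maximum is below 'ABC' or whose minimum is at or above 'ABD' can be skipped.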

Update UnifiedPartitionerTest as per the new default buckets.

Corrected compiler warnings in CatalogConsistencyDUnitTest and added an assertion
after drop in the proper place where drop table is expected to fail.
Cleaned up the new code added in 1d144f9 to ColumnTableScan and ExistingPlans.scala
@sumwale sumwale merged commit 78bbf14 into master Dec 3, 2016
@sumwale sumwale deleted the SNAP-1194 branch December 5, 2016 22:23