@eguiraud
Contributor

In particular, all creations and deletions of shared_ptrs have been removed from the event loop.

This is a long-overdue optimization that required several changes to the internal behaviour of TDataFrame{Impl,Action,Branch,Filter}. Unfortunately, all the changes are entangled, so the third commit is quite large.

The main change to the internal logic is that TDataFrame{Action,Branch,Filter} now store a tuple of TDataFrameValues rather than (possibly null) shared pointers to TTreeReaderValueBase.
TDataFrameValue offers a transparent, unified interface to the three kinds of values that the nodes must handle: temporary columns, which are evaluated on the fly; TTreeReaderArrays, which must be converted to array_views; and TTreeReaderValues.
TDataFrameValue also incorporates validity checks on the value types, e.g. that arrays read via TTreeReaderArray are actually contiguous in memory and that the type of a temporary column matches the type expected by the node that uses it.
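
To illustrate the idea of a single access point with a type check, here is a minimal, ROOT-free sketch. Every name in it (`FakeBranchReader`, `TemporaryColumnBase`, `TemporaryColumn`, `ColumnValue`) is hypothetical and invented for this example; it is not the TDataFrameValue implementation, and it leaves out the TTreeReaderArray/array_view case entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical stand-in for a value read from a real branch.
template <typename T>
struct FakeBranchReader {
   std::vector<T> entries; // pretend on-disk data, one element per entry
   const T &Get(std::size_t entry) const { return entries.at(entry); }
};

// Hypothetical type-erased interface for a temporary column.
struct TemporaryColumnBase {
   virtual ~TemporaryColumnBase() = default;
   virtual const std::type_info &GetTypeId() const = 0;
   virtual void Update(std::size_t entry) = 0;
   virtual const void *GetValuePtr() const = 0;
};

// A temporary column whose value is computed from the entry number.
template <typename T>
class TemporaryColumn : public TemporaryColumnBase {
   std::function<T(std::size_t)> fExpression;
   std::unique_ptr<T> fValue{new T()}; // allocated once, reused for every entry
public:
   explicit TemporaryColumn(std::function<T(std::size_t)> f) : fExpression(std::move(f)) {}
   const std::type_info &GetTypeId() const override { return typeid(T); }
   void Update(std::size_t entry) override { *fValue = fExpression(entry); }
   const void *GetValuePtr() const override { return fValue.get(); }
};

// The node-facing wrapper: one Get() regardless of where the value comes from.
template <typename T>
class ColumnValue {
   const FakeBranchReader<T> *fReader = nullptr;
   TemporaryColumnBase *fTmpColumn = nullptr;
public:
   void Bind(const FakeBranchReader<T> &reader)
   {
      fReader = &reader;
      fTmpColumn = nullptr;
   }
   void Bind(TemporaryColumnBase &column)
   {
      // Validity check: the column must produce the type the node expects.
      if (column.GetTypeId() != typeid(T))
         throw std::runtime_error("temporary column type does not match the type expected by the node");
      fTmpColumn = &column;
      fReader = nullptr;
   }
   const T &Get(std::size_t entry)
   {
      if (fTmpColumn) {
         fTmpColumn->Update(entry); // evaluate on the fly
         return *static_cast<const T *>(fTmpColumn->GetValuePtr());
      }
      assert(fReader && "ColumnValue used before Bind");
      return fReader->Get(entry);
   }
};
```

In this sketch a node would hold a `ColumnValue<T>` per column, call `Bind` once before the loop, and call `Get(entry)` inside the loop without caring whether the column is a branch or a temporary.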

Values of temporary columns are now stored in unique_ptrs instead
of shared_ptrs, as we do not need to share ownership.
The update of these values is now handled by a separate Update
function. The content of the unique_ptr is updated in place, instead of
allocating a new object for each new evaluation as was done before.

This last change requires the types of the values of temporary columns
to be default-constructible and assignable, rather than
copy-constructible as before. The change is necessary to limit the use
of shared_ptrs and heap allocations inside the event loop.
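
The trade-off described above can be sketched as follows. `CachedValue` and `EvaluateWithAllocation` are hypothetical names made up for this illustration and do not appear in the TDataFrame code; the `static_assert`s spell out the new requirements on the value type under that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <type_traits>
#include <utility>

// New style (sketch): allocate once, then overwrite the pointee at each entry.
template <typename T>
class CachedValue {
   static_assert(std::is_default_constructible<T>::value,
                 "the value type must be default-constructible");
   static_assert(std::is_move_assignable<T>::value || std::is_copy_assignable<T>::value,
                 "the value type must be assignable");
   std::function<T(std::size_t)> fExpression;
   std::unique_ptr<T> fValue; // sole owner, no reference counting
public:
   explicit CachedValue(std::function<T(std::size_t)> expr)
      : fExpression(std::move(expr)), fValue(new T()) // the only heap allocation
   {
   }
   void Update(std::size_t entry)
   {
      assert(fExpression && "no expression set");
      *fValue = fExpression(entry); // assignment, not allocation
   }
   const T *Get() const { return fValue.get(); }
};

// Old style (sketch, for contrast): one heap allocation per evaluation,
// which only needs T to be copy- or move-constructible.
template <typename T>
std::shared_ptr<T> EvaluateWithAllocation(const std::function<T(std::size_t)> &expr, std::size_t entry)
{
   return std::make_shared<T>(expr(entry));
}
```

With `CachedValue`, the address returned by `Get()` stays the same across calls to `Update`, which is one way to see that no allocation happens per entry.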
This is a long-overdue optimization that required several changes to the internal
behaviour of TDataFrame{Impl,Action,Branch,Filter}. Unfortunately, all the changes
are entangled, and I don't see a way to split them into multiple commits that each compile.

- Introduced the class TDataFrameValue, which abstracts the notion of access to column values.
  At compile time, TDataFrameValue chooses between TTreeReaderArray and TTreeReaderValue.
  At runtime, it resolves the column value to either a real TTree branch or a temporary column.
  During the event loop, TDataFrameValue makes sure that values are updated at each entry,
  offering a single interface to handle TTree branches and temporary columns.
- TDataFrame{Action,Branch,Filter} now store tuples of TDataFrameValues instead of shared_ptrs
  to TTreeReaderValueBase.
- TDataFrameValue also checks that the return type of a temporary column expression
  corresponds to the inferred type of the branch that makes use of it.
- TDataFrameImpl was moved to the beginning of the file because the other classes
  now need to call methods on it, e.g. to invoke InitTDFValues.
- The GetBranchValue methods have been eliminated in favour of TDataFrameValue.
- The BuildReaderValue methods now call InitTDFValue to initialize their TDataFrameValues;
  the free function BuildReaderValues has been eliminated.
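
As a rough illustration of a node that stores a tuple of value wrappers and feeds them to its callable, here is a small self-contained sketch (it needs C++14 for `std::index_sequence`). `SimpleValue`, `FilterNode` and `MakeFilter` are hypothetical names for this example only, and plain `std::vector`s stand in for columns; the real classes differ.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical minimal value wrapper: reads entry i of an in-memory column.
template <typename T>
struct SimpleValue {
   const std::vector<T> *column;
   const T &Get(std::size_t entry) const
   {
      assert(column && "value not bound to a column");
      return column->at(entry);
   }
};

// Hypothetical filter node: one wrapper per column, stored by value in a tuple,
// so nothing is allocated or reference-counted while looping over entries.
template <typename F, typename... ColTypes>
class FilterNode {
   F fPredicate;
   std::tuple<SimpleValue<ColTypes>...> fValues;

   // Expand the tuple into the predicate's argument list.
   template <std::size_t... Is>
   bool CheckImpl(std::size_t entry, std::index_sequence<Is...>) const
   {
      return fPredicate(std::get<Is>(fValues).Get(entry)...);
   }

public:
   FilterNode(F predicate, const std::vector<ColTypes> &... columns)
      : fPredicate(std::move(predicate)), fValues(SimpleValue<ColTypes>{&columns}...)
   {
   }
   bool Check(std::size_t entry) const
   {
      return CheckImpl(entry, std::index_sequence_for<ColTypes...>{});
   }
};

// Helper that deduces the column types from the arguments.
template <typename F, typename... ColTypes>
FilterNode<F, ColTypes...> MakeFilter(F predicate, const std::vector<ColTypes> &... columns)
{
   return FilterNode<F, ColTypes...>(std::move(predicate), columns...);
}
```

For example, `MakeFilter([](int x, double y) { return x > 2 && y < 3.0; }, xs, ys).Check(i)` evaluates the predicate on entry `i` of the two columns.
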
@dpiparo
Member

dpiparo commented Mar 19, 2017

Merged after some manual testing. Thanks a lot for this great development!
