Skip to content

Conversation

@guitargeek
Copy link
Contributor

@guitargeek guitargeek commented Aug 20, 2021

This PR implements the possibility to store global observables in RooFit datasets, and makes the necessary changes in the RooAbsPdf::fitTo code path to consider the global observables in the data if available. If one wants to restore the old behavior of taking the global observable values from the model even if they are stored in the data, one can use the new GlobalObservablesSource command argument.

Unit tests for all new functionality is also implemented.

Please find the more information in the commit messages.

Ideas for future developments in future PRs:

  1. make it possible to specify the global observables when generating a dataset:
    model.generate(x, 1000, GlobalObservables(g))
  2. make it possible to also sample global observable values when generating a toy dataset:
    model.generate({x, g}, 1000, GlobalObservables(g))
  3. Add a tutorial to show all the new functionality related to global observables

Closes #8123.

@guitargeek guitargeek self-assigned this Aug 20, 2021
@guitargeek guitargeek force-pushed the GlobalObservables_2 branch from 59d393e to af5ccf7 Compare August 21, 2021 01:30
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@root-project root-project deleted a comment from phsft-bot Aug 21, 2021
@guitargeek guitargeek force-pushed the GlobalObservables_2 branch 2 times, most recently from a4a0fad to 4c1433a Compare August 25, 2021 14:37
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@root-project root-project deleted a comment from phsft-bot Aug 25, 2021
@guitargeek guitargeek force-pushed the GlobalObservables_2 branch 2 times, most recently from 53e2469 to 27b8283 Compare August 26, 2021 09:27
@root-project root-project deleted a comment from phsft-bot Sep 12, 2021
@root-project root-project deleted a comment from phsft-bot Sep 12, 2021
@phsft-bot
Copy link

Build failed on mac11.0/cxx17.
Running on macphsft20.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

Copy link
Member

@hageboeck hageboeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found two more things. 🙂



////////////////////////////////////////////////////////////////////////////////
/// Replace the varaiables in this RooConstraintSum with the global observables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Replace the varaiables in this RooConstraintSum with the global observables
/// Replace the variables in this RooConstraintSum with the global observables

Copy link
Contributor Author

@guitargeek guitargeek Sep 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

ROOT_ADD_GTEST(testNaNPacker testNaNPacker.cxx LIBRARIES RooFitCore)
ROOT_ADD_GTEST(testRooSimultaneous testRooSimultaneous.cxx LIBRARIES RooFitCore RooFit)
ROOT_ADD_GTEST(testRooGradMinimizerFcn testRooGradMinimizerFcn.cxx LIBRARIES RooFitCore)
ROOT_ADD_GTEST(testGlobalObservables testGlobalObservables.cxx LIBRARIES RooFitCore RooFit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ROOT_ADD_GTEST(testGlobalObservables testGlobalObservables.cxx LIBRARIES RooFitCore RooFit)
ROOT_ADD_GTEST(testGlobalObservables testGlobalObservables.cxx LIBRARIES RooFit)

I thought for a long time that you have to mention both, but CMake's transitive dependencies should allow for only mentioning RooFit. If it doesn't work, the CMake config needs to be fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, but on second look:
This introduces a dependency to RooFit, i.e., the entire set of libRooFit.so files needs to be compiled before you can run this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I tried to avoid this now by using RooGenericPdf instead of RooPoisson and RooGaussian. However, the test gets slowed down by the numeric integration: from 185 ms to about 1.5 seconds. So while I agree that it would be nice to get rid of this dependency, I don't think we should do something about it in this PR. There is also testRooAbsPdf and testRooSimultaneous that still depend on RooFit too, and I'll try to solve this dependency problem for all of the tests at a later point.

@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@guitargeek
Copy link
Contributor Author

guitargeek commented Sep 13, 2021

Hi @hageboeck, thanks for the review! I have addressed most of your comments and tried to justify whenever I didn't do something, which was mostly because I think it should be done in separate PRs (new tutorial, removing dependecy of RooFitCore tests on RooFit).

I think this PR implements now all the functionality requested by @will-cern in #8123, so it would be good if you Will could confirm this.

In a later PR, I'm then going to implement what was not required to solve the issue but what is still useful for ROOT nonetheless (also mentioned in the PR description):

  1. make it possible to specify the global observables when generating a dataset:
    model.generate(x, 1000, GlobalObservables(g))
  2. make it possible to also sample global observable values when generating a toy dataset:
    model.generate({x, g}, 1000, GlobalObservables(g))
  3. Add a tutorial to show all the new functionality related to global observables

@phsft-bot
Copy link

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

The `RooAbsPdf::fitTo()` function creates the likelihood function
itself, using the `RooAbsPdf::createNLL()` function. But if one has
already created the NLL manually, it is not possible to easily reproduce
the minimizer configuration that `RooAbsPdf::fitTo()` uses.

This commit refactors `RooAbsPdf::fitTo()` such that the minimizer part
is put into a new function called `RooAbsPdf::minimizeNLL()`. This is
useful for unit tests and debuggings where one wants to do something
with the likelihood function before proceeding with the minimization.
This commit doesn't change any fitting behavior. It only adds the
possibility to attach global observable snapshots to the RooAbsData.
This is to conveniently define the global observables stored in the
`RooDataSet` or `RooDataHist` in the constructor.
Adapt `RooConstraintSum::createConstraintTerm` to consider when the
dataset stores global observables.

The RooFit behavior changes in two ways:

  * If no global observables are specified with `GlobalObservables` in
    the `RooAbsPdf::fitTo()` command arguments, the set of global
    observables is automatically detected from the dataset and the
    values from the dataset are used.
  * If the global observables are specified with `GlobalObservables`,
    their values are taken from the dataset if they are stored in it,
    otherwise they are taken from the model as before. It's also
    supported that only a subset of the specified global observables is
    stored in the data.
Since the parameters of the RooConstraintSum need to be syncronized with
the dataset now when global observables are used from the dataset, the
`RooAbsReal::setData()` function needs to be overridden.
The `GlobalObservablesSource` command argument can take one of two
values:

  * `data`: to take the values from the dataset, falling back to the pdf
    value if a given global observable is not available.  If no
    `GlobalObservables` or `GlobalObservablesTag` command argument is
    given, the set of global observables will be automatically defined
    to be the set stored in the data.

  * `model`: to take all values from the pdf and completely ignore the
    set of global observables stored in the data (not even using it to
    automatically define the set of global observables if the
    `GlobalObservables` or `GlobalObservablesTag` command arguments are
    not given).

The default option is `data`. The `model` option basically means to
ignore the new feature of global observables stored in the dataset.

A unit test to implement that the `model` source works is also
implemented in this commit.
The global observables need to be treated consistently in the constraint
sum and the actual likelihood term, even though it would be unusual that
global observables appear in the likelihood outside of the constraints.
@phsft-bot
Copy link

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

Copy link
Member

@lmoneta lmoneta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
The only comment I have is that it would be nice to show the new functionality in a RooFit tutorial where you show the old way of using global observables and the one having it in the data set.

@guitargeek
Copy link
Contributor Author

Thanks @lmoneta! In #8878 (comment) I have mentioned the developments that will be done related to these global observables after this PR, and creating a tutorial is part of them.

@phsft-bot
Copy link

Build failed on ROOT-ubuntu16/nortcxxmod.
Running on sft-ubuntu-1604-1.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link

Build failed on mac11.0/cxx17.
Running on macphsft20.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RF] Add RooArgSet storage to RooAbsData for GlobalObservables

4 participants