RooFit with Chebyshev and Bernstein polynomials

This is a little demonstration of how to fit a peak plus smooth background with either an empirical Chebyshev polynomial for the background or a Bernstein polynomial. The Bernstein polynomial is technically superior, because it allows us to ensure that the background density is always positive with a simple box constrain on its parameters. That is not possible with Chebyshev.

This example is based on the RooFit Tutorial. I generate a toy model which is an additive mixture of a normal distribution for a peak and a polynomial background. I draw a sample, and fit two variations of the toy model, one with either polynomial. The fitted lines look identical after the fit, as expected. The result of the maximum likelihood method is invariant to reparametrisations.

import ROOT as R

Welcome to JupyROOT 6.30/02

x = R.RooRealVar("x", "x", -10, 10)

mu1 = R.RooRealVar("mu1", "mu", 0, -10, 10)
sigma1 = R.RooRealVar("sigma1", "sigma", 1, 0.01, 1e2)
gauss1 = R.RooGaussian("gauss1", "Gauss", x, mu1, sigma1)

p0 = R.RooRealVar("p0", "p_0", 1, 0, 1e2)
p1 = R.RooRealVar("p1", "p_1", 0.2, 0, 1e2)
cheb = R.RooChebychev("cheb", "Chebyshev", x, R.RooArgList(p0, p1))

mu2 = R.RooRealVar("mu2", "mu", 0, -10, 10)
sigma2 = R.RooRealVar("sigma2", "sigma", 1, 0.01, 1e2)
gauss2 = R.RooGaussian("gauss2", "Gauss", x, mu2, sigma2)

fix = R.RooRealVar("one", "one", 0.1)
q0 = R.RooRealVar("q0", "q_0", 1, 0, 1e2)
q1 = R.RooRealVar("q1", "q_1", 0.2, 0, 1e2)
bern = R.RooBernstein("bern", "Bernstein", x, R.RooArgList(fix, q0, q1))

f1 = R.RooRealVar("f1", "signal fraction", 0.3, 0, 1)
model1 = R.RooAddPdf("sum1", "S+B model", R.RooArgList(gauss1, cheb), f1)

data = model1.generate(R.RooArgSet(x), 1000)

model1.fitTo(data)

f2 = R.RooRealVar("f2", "signal fraction", 0.3, 0, 1)
model2 = R.RooAddPdf("sum2", "S+B model", R.RooArgList(gauss2, bern), f2)
model2.fitTo(data, R.RooFit.PrintLevel(-1))

c = R.TCanvas()
frame = x.frame()
data.plotOn(frame)
model1.plotOn(frame)
model2.plotOn(frame, R.RooFit.LineStyle(R.kDashed), R.RooFit.LineColor(R.kRed))
model2.paramOn(frame)
frame.Draw()

[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization --  The following expressions will be evaluated in cache-and-track mode: (gauss1,cheb)
Minuit2Minimizer: Minimize with max-calls 2500 convergence for edm < 1 strategy 1
Minuit2Minimizer : Valid minimum - status = 0
FVAL  = 2737.75116242620834
Edm   = 0.000269023969144123421
Nfcn  = 100
f1    = 0.275394     +/-  0.0242084 (limited)
mu1   = -0.0175247   +/-  0.091251  (limited)
p0    = 0.990493     +/-  0.0455302 (limited)
p1    = 0.103726     +/-  0.0587728 (limited)
sigma1    = 1.01262  +/-  0.0914058 (limited)
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization --  The following expressions will be evaluated in cache-and-track mode: (gauss2,bern)
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization

Info in <Minuit2>: MnSeedGenerator Computing seed using NumericalGradient calculator
Info in <Minuit2>: MnSeedGenerator Initial state: FCN =       2739.667437 Edm =       1.767793129 NCalls =     21
Info in <Minuit2>: MnSeedGenerator Initial state  
  Minimum value : 2739.667437
  Edm           : 1.767793129
  Internal parameters:  [    -0.4115168461                0     -1.370461484     -1.481323767     -1.371459022] 
  Internal gradient  :  [      15.50719084      34.67327852      -68.9043889      127.4588967     -139.7445175] 
  Internal covariance matrix:
[[   0.0035221149              0              0              0              0]
 [              0   0.0001480147              0              0              0]
 [              0              0  4.3539642e-05              0              0]
 [              0              0              0  0.00023511592              0]
 [              0              0              0              0  0.00010343293]]]
Info in <Minuit2>: VariableMetricBuilder Start iterating until Edm is < 0.001 with call limit = 2500
Info in <Minuit2>: VariableMetricBuilder    0 - FCN =       2739.667437 Edm =       1.767793129 NCalls =     21
Info in <Minuit2>: VariableMetricBuilder    1 - FCN =       2738.073461 Edm =      0.1061417478 NCalls =     33
Info in <Minuit2>: VariableMetricBuilder    2 - FCN =       2737.800519 Edm =     0.03895269349 NCalls =     46
Info in <Minuit2>: VariableMetricBuilder    3 - FCN =       2737.753549 Edm =    0.002339429787 NCalls =     58
Info in <Minuit2>: VariableMetricBuilder    4 - FCN =       2737.751162 Edm =   0.0002368589638 NCalls =     69
Info in <Minuit2>: VariableMetricBuilder After Hessian
Info in <Minuit2>: VariableMetricBuilder    5 - FCN =       2737.751162 Edm =   0.0002690239691 NCalls =    100
Info in <Minuit2>: Minuit2Minimizer::Hesse Using max-calls 2500
Info in <Minuit2>: Minuit2Minimizer::Hesse Hesse is valid - matrix is accurate

model1.getParameters(data).Print("v")

  1) 0x7f7e8f6c9310 RooRealVar::     f1 = 0.275394 +/- 0.0241924  L(0 - 1)  "signal fraction"
  2) 0x7f7e8eff2700 RooRealVar::    mu1 = -0.0175247 +/- 0.0912425  L(-10 - 10)  "mu"
  3) 0x7f7e8f2b7960 RooRealVar::     p0 = 0.990493 +/- 0.0454892  L(0 - 100)  "p_0"
  4) 0x7f7e8f692e30 RooRealVar::     p1 = 0.103726 +/- 0.058683  L(0 - 100)  "p_1"
  5) 0x7f7e8eff2a90 RooRealVar:: sigma1 = 1.01262 +/- 0.0913868  L(0.01 - 100)  "sigma"

model2.getParameters(data).Print("v")

  1) 0x7f7e8fda7940 RooRealVar::     f2 = 0.275478 +/- 0.0241593  L(0 - 1)  "signal fraction"
  2) 0x7f7e8fd050e0 RooRealVar::    mu2 = -0.0183764 +/- 0.0911997  L(-10 - 10)  "mu"
  3) 0x7f7e8fd08f60 RooRealVar::    one = 0.1 C  L(-INF - +INF)  "one"
  4) 0x7f7e8fd092f0 RooRealVar::     q0 = 0.604314 +/- 0.403307  L(0 - 100)  "q_0"
  5) 0x7f7e8fd15aa0 RooRealVar::     q1 = 1.84239 +/- 0.845585  L(0 - 100)  "q_1"
  6) 0x7f7e8fd08bd0 RooRealVar:: sigma2 = 1.01227 +/- 0.0912609  L(0.01 - 100)  "sigma"