Hans Dembinski’s blog
From unstructured to structured: Parsing webpages with a Large Language Model (LLM)
In a recent article, I showed how to set up a simple RAG system based on a locally run Large Language Model. I already praised the ollama library there, which makes it very…
Jan 7, 2025
Running Large Language Models (LLMs) locally for Retrieval-Augmented-Generation (RAG) Systems with full privacy
tl;dr: You can run small LLMs locally on your consumer PC and with ollama that’s very easy to set up. It is fun to chat with an LLM locally, but it gets really interesting…
Dec 6, 2024
Comparison of gof tests in counting experiments
The chi-squared test is the most common way to check the goodness-of-fit (gof) of models that are compared to data. Here, one splits the data into \(m\) categories and then…
Aug 12, 2024
Factorization test
The classic sWeights method is a powerful way to project out a component from a mixture of signal and background, but it is often not obvious whether it can be applied…
Jul 8, 2024
Inelastic cross-sections of proton-proton and pion-proton collisions
According to a simple parton model, the cross-section for an inelastic proton-proton collision should be 3/2 of a pion-proton collision. In this picture, only the valence…
Jun 4, 2024
pp cross-section extrapolation uncertainty
Here we estimate the extrapolated uncertainty of the pp inelastic cross-section at 100 TeV before and after the LHC era.
Jun 4, 2024
Combining results with asymmetric errors
When sample sizes are small and models very non-linear, the likelihood around the minimum is not well approximated by a parabola. In that case, the errors computed by the…
May 18, 2024
Template fits with distortion correction
I recently gave a review talk on template fits in CERN’s PHYSTAT seminar, where I presented results from our recent paper on this subject. In the summary, I discussed one of…
Apr 19, 2024
Fast deep sets with FLAX
In Zaheer et al., NIPS 2017 Deep Sets are introduced, a new network architecture with interesting applications to particle physics. One of the first applications in particle…
Jan 11, 2024
All decays into two Kaons
In a recent paper review, the question came up how many resonances there are which decay into two kaons. It is not easy to look this up somewhere. My solution was to extract…
Oct 15, 2023
Look-elsewhere effect
When searching for an excess (peak) in some phase-space which has an unknown location, one has to correct the local significance for the fact that one would have reported…
Oct 15, 2023
Non-prompt fraction of KS and Lambda particles from hyperon decays
A particle is prompt, if it does not have particles in its decay history with life-times larger than 30 ps. According to this definition, some Lambda particles are not…
Aug 31, 2023
Regression example with a neural network
It turns out that the simple MLPRegressor in Scikit-Learn works very well on small datasets.
Jul 19, 2023
Statistical issues in naive calibration analyses
The statistical problem of calibrating a sample with observable pairs \(\hat a\) and \(\hat b\) is challenging if random stochastic fluctuations affect both numbers of the…
Jun 16, 2023
Transform LaTeX to Unicode
It is possible to transform a subset of LaTeX to Unicode, as demonstrated by the unicodeit website. Unfortunately, unicodeit only works on short LaTeX strings.
Apr 10, 2023
Render LaTeX to SVG with matplotlib backend
We use matplotlib’s mathtext system and SVG backend to generate an SVG rendering of LaTeX code.
Apr 4, 2023
Visual cross-section for particle production
When you look for a somewhat rare decay, you want to know how much luminosity is required to see N decays in your detector. For this one needs to compute the visual…
Mar 31, 2023
How to make a plot with legend entries that are hyperlinks
This is a demo on what the title says, also see this matplotlib issue. The trick works with SVG and PDF. Try to click on “BBC” in the legend.
Mar 29, 2023
Tracking efficiency for two-body decay
\[ \epsilon'_A = \int \text{d}x_B\! \int \text{d}x_C\, \epsilon'_B(x_B) \, \epsilon'_C(x_C) \; f(x_B, x_C) \]
Nov 15, 2022
Particle correlations in proton-proton collisions
Here, I run simulations of proton-proton collisions with Chromo to investigate two-particle correlations in pseudorapidity space. With Chromo, it is very easy to study the…
Oct 7, 2022
How to use the logaddexp special function in fits
The logaddexp special function was made to perform this calculation more accurately:
\[ y = \ln(\exp(a) + \exp(b)). \]
This calculation occurs when we fit a mixed PDF with…
Sep 10, 2022
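The identity behind logaddexp can be sketched in a few lines. This is a minimal illustration using only the standard library, not the code from the post; the function names `naive_logaddexp` and `stable_logaddexp` are hypothetical. It relies on the rewrite \(\ln(e^a + e^b) = \max(a, b) + \ln(1 + e^{-|a - b|})\), which keeps the argument of the exponential at or below zero.

```python
import math


def naive_logaddexp(a, b):
    # Direct evaluation of ln(exp(a) + exp(b)).
    # math.exp raises OverflowError for large arguments such as 1000.0.
    return math.log(math.exp(a) + math.exp(b))


def stable_logaddexp(a, b):
    # Factor out the larger argument, so exp only sees values <= 0.
    # log1p(x) evaluates ln(1 + x) accurately for small x.
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


# Both versions agree for moderate arguments.
small = stable_logaddexp(1.0, 2.0)

# Only the stable version works for large arguments:
# ln(exp(1000) + exp(1000)) = 1000 + ln(2).
large = stable_logaddexp(1000.0, 1000.0)
```

In array code, `numpy.logaddexp` provides the same calculation element-wise.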
Fitting mixtures with COWs
This is a demo of how to use custom orthogonal weight functions (COWs) to generate weights which extract one component of a mixture. Find out more in our paper.
Sep 10, 2022
Which data structure is faster: array of structs or struct of arrays?
In the high-performance computing community, it is well known that certain algorithms run faster if the data are structured in such a way that the CPU can read them…
Jul 9, 2022
Unbiasedness of EML fit for a mixture model with fixed component pdfs
The log-likelihood for one observed Poisson-distributed count is (without constants)
\[ \ell_i(\lambda) := \ln \mathcal{L}_i(\lambda) = -\lambda + k_i \ln \lambda. \]
Apr 25, 2022
Numerically stable calculation of invariant mass
The art of writing numerically stable formulas is to find a mathematically equivalent computation for a formula that produces accurate results even though computation with…
Jan 20, 2022
Chance of seeing a one-sigma deviation in random splits
We randomly split a sample into two parts and check whether their respective arithmetic means deviate by more than one standard deviation from each other. The chance for…
Dec 5, 2021
Simple parallelization in Jupyter Notebooks
The usual modules concurrent.futures and multiprocessing do not work correctly in notebooks on all platforms (notably on OSX there are issues). What does work is joblib…
Aug 13, 2021
Uncertainty of efficiency from fitted decay yields
We derive an approximate formula to compute the uncertainty of the efficiency computed from fitted yields of some decays. The formula is suitable to draw error bars in plots.
Aug 8, 2021
Ratio bias
When you compute a ratio of two random numbers, the ratio is biased in general, unless the relative error on the denominator is very small. This is a general consequence of…
May 11, 2021
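The ratio bias can be seen in a toy case that needs no random numbers. The sketch below is not taken from the post; the helper names and the chosen values are assumptions made for illustration. The numerator is fixed and the denominator takes two equally likely values, so the mean of the ratio can be compared exactly with the ratio of the means.

```python
from fractions import Fraction


def mean_of_ratio(a, b_values):
    # Average of a / b over equally likely denominator values.
    return sum(Fraction(a, b) for b in b_values) / len(b_values)


def ratio_of_means(a, b_values):
    # a divided by the average denominator.
    return Fraction(a * len(b_values), sum(b_values))


# Large relative spread of the denominator (1 or 3, mean 2).
wide_ratio = mean_of_ratio(2, [1, 3])        # 4/3
wide_reference = ratio_of_means(2, [1, 3])   # 1

# Small relative spread of the denominator (99 or 101, mean 100).
narrow_ratio = mean_of_ratio(100, [99, 101])        # 10000/9999
narrow_reference = ratio_of_means(100, [99, 101])   # 1
```

With the wide denominator, the mean ratio overshoots the ratio of means by a third. With the narrow one, the excess shrinks to about one part in ten thousand.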
The Wilson score interval and weighted histograms
The Wilson Score interval (WSI) has won the LHCb internal challenge for the most suitable interval for efficiencies. The question of what to do when events are weighted was…
May 7, 2021
p-value computation and conversion: one-sided, two-sided?
This is an attempt to clear up the confusion around p-value computation, based on the authoritative source on that matter:
May 4, 2021
Benchmark of building an array with numba
Here I try out different ways of making arrays of tuples with Numba where the number of tuples that need to be produced is not known in advance. The fastest way is to make a…
Apr 2, 2021
MCMC demo
This is a tiny demo of the MCMC algorithm from the emcee library, which computes the posterior for some parameters based on the likelihood function and priors on the…
Mar 29, 2021
Coverage of HESSE and MINOS intervals
The MINUIT package provides two ways to compute uncertainty intervals for a fitted parameter, the HESSE method and the MINOS method. Ideally, these intervals should have 68…
Mar 29, 2021
From a fixed-target to center-of-mass frame and back
I compute with Sympy how to transform a 4-vector from a fixed-target frame (aka laboratory frame) to the center-of-mass frame, where the colliding particles have equal but…
Feb 27, 2021
RooFit with Chebyshev and Bernstein polynomials
This is a little demonstration of how to fit a peak plus smooth background with either an empirical Chebyshev polynomial for the background or a Bernstein polynomial. The…
Jan 25, 2021
Comparison of asymptotically chisquare-distributed test statistics
For GoF tests, we often use a test statistic that is asymptotically \(\chi^2\) distributed.
Jan 9, 2021
Approximate PDF for the invariant mass distribution of combinatorial background
We derive an approximate pdf for the invariant mass distribution from combinatorial background. Combinatorial background refers to random pairs of unrelated particles that…
Jan 7, 2021
Solving the Duffing oscillator
Here I solve the differential equations for the Duffing oscillator with scipy and animate the solution with matplotlib. The oscillator exhibits chaotic behavior.
Jan 5, 2021
Error propagation for a ratio
Error propagation also works for systematic uncertainties. I discuss this using a simple example.
Dec 12, 2020
Error propagation with Sympy
Sympy is a Python module for symbolic computation (like Mathematica and Matlab) with an elegant Python design.
Sep 17, 2020
New Fit Result Display
The next release of iminuit will feature a new Fit Result display.
Jul 29, 2020
Numba-accelerated Jackknife
My resample library implements the jackknife and bootstrap resampling techniques. The implementations use pure Python. Since resampling techniques are computationally…
May 21, 2020
Applying the SPD approximation to negative weights
In Bohm and Zech, NIMA 748 (2014) 1-6, the scaled Poisson distribution is discussed, which is an approximate distribution for sums of weights. This distribution is used in…
Apr 22, 2020
Power consumption in sleep mode
Some people make a big fuss about machines consuming power in standby. While it is very important to conserve energy, it makes no sense to turn off all devices but then run…
Apr 16, 2020
Interactive plotting in Jupyter with matplotlib
This is a little demo on how to make interactive plots with matplotlib and ipympl.
Apr 16, 2020
Fitting weighted histograms with SPD method
We test different fit methods for weighted binned data. The tested weight distributions are normal, exponential, and uniform. The toy case is a common HEP fit of a Gaussian…
Apr 15, 2020
Leave-one-out cross-validation
Leave-one-out cross-validation is a simple generic tool for selecting the best empirical model. When we model data empirically, for example, with a polynomial, we want to…
Apr 7, 2020
C++ exceptions in high-performance code
Here is a collection of advice on using exceptions in high-performance libraries. There has been a lot of discussion in the Boost community about exceptions lately, since som…
Apr 5, 2020