Should an alignment method require one iteration only to converge to the minimum?

I almost always use AlignmentMillepede to align my telescope (3 iterations for the module).
After one alignment, if I look at the track chi2ndof, I can clearly see the improvement from a rather flat distribution. I would expect that after a first iteration, I would converge to the minimum.
However, if I pass this new geometry to Millepede again (same config) to align, I get a further improved chi2ndof distribution (see below). At some point there are diminishing returns. This happens regardless of whether I align the telescope only or telescope+DUTs.
Is this intended? Is my sample too small (300k tracks) to find the true minimum on the first try?

I would expect the algorithm to converge on the first try, so I am probably missing something.
How is everybody else aligning :slight_smile:? Thank you!!
[Figure: chi2]
Note: in the figure, the alignment “iteration” should not be understood as the parameter that is provided to AlignmentMillepede. This is always =3. Instead, each curve represents the chi2ndof after running the alignment config once (green: before any alignment, with just the prealignment; blue: the geometry of the first alignment is fed; red: the geometry file of the second alignment iteration is fed; further iterations do not show any improvement over the red curve and are not shown here).

Hi @mblidaru

if your initial alignment was off (as it looks like, and of course is expected), re-running the tracking after a first iteration will give you different (and better, since already better aligned) tracks. So it is expected that you still get a change from a second iteration since you re-do tracking.
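A toy illustration of this effect, in case it helps. This is not the Millepede algorithm: it is a one-parameter sketch in which hits are only accepted if their residual with respect to the current geometry passes a cut, and all numbers (40 um offset, 10 um resolution, 50 um cut) are made up for the example.

```python
import random
import statistics


def align_step(true_offset, estimate, sigma, cut, n_tracks, rng):
    """One toy alignment pass (lengths in mm).

    Hits are accepted only if their residual relative to the *current*
    geometry is inside the cut; the plane is then shifted by the mean
    accepted residual.
    """
    accepted = []
    for _ in range(n_tracks):
        residual = (true_offset - estimate) + rng.gauss(0.0, sigma)
        if abs(residual) < cut:
            accepted.append(residual)
    return estimate + statistics.fmean(accepted)


def iterate_alignment(true_offset=0.040, sigma=0.010, cut=0.050,
                      n_tracks=20000, passes=4, seed=1):
    """Return the remaining misalignment (mm) after each pass."""
    rng = random.Random(seed)
    estimate = 0.0
    remaining = []
    for _ in range(passes):
        estimate = align_step(true_offset, estimate, sigma, cut, n_tracks, rng)
        remaining.append(abs(true_offset - estimate))
    return remaining


if __name__ == "__main__":
    for i, rem in enumerate(iterate_alignment(), start=1):
        print(f"pass {i}: remaining offset = {rem * 1000:.2f} um")
```

In this toy the first pass sees a residual distribution that is clipped on one side by the cut, so it underestimates the offset by a few um. The second pass starts from a nearly centred distribution and recovers most of the rest. Later passes only move within the statistical noise.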

/Simon

Hi @simonspa !

Thanks!
So, the fact that after a first alignment the position and orientation of the sensor slightly change (in the geometry file) is not a problem?
I mean, of course it’s not, since I’ve been using this all along and results are all correct. But mathematically, I was under the impression that the algorithm should directly go to the minimum (if sufficient tracks are provided - which is also a tricky thing to say)…

Cheers,
Bogdan

Hi @mblidaru

no, this should not be a problem - the minimizer will likely just find a slightly different minimum and therefore still change the positions.

Cheers,
Simon

Thank you Simon! I did a bit of digging and I found some interesting things here, that might be useful for somebody at some point, because it was not immediately obvious why the algorithm would not directly converge to the true minimum. So I will just leave it here. Please correct me if I say something wrong.

I report below results from the alignment of a 7 plane ALPIDE telescope where the sensors are inserted in place between metallic drilled inserts (so they are rather well fixed and the position/rotation should be quite well determined, given that it’s CNC machined). See the following picture:

[Figure: alpide_Casing]

I mention here that in the end, for the efficiency analysis, we used the association module so as not to bias the efficiency calculation (by requiring hits in every layer). We just used the alignment for all sensors at once since it should work very well, given no other material is in between.

For this setup, a colleague did 6 alignment steps (+ 1 prealignment). By this I mean: get a new geometry file after each alignment, which is then used as input for yet another corry run where tracks are re-fitted and a new alignment is found, and so on. Our sensors were operated at their nominal parameters (highly efficient, no noise). They were all 50um thin, with no extra material in the telescope.
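For anyone wanting to reproduce such a chain, here is a minimal sketch that only builds the command lines, without running anything. It assumes that `corry` accepts `-c <config>` and `-o key=value` overrides and that the global keys `detectors_file` and `detectors_file_updated` select the input and output geometry, as I understand from the Corryvreckan manual; please check this against your version. The file names `align.conf`, `prealigned.conf` and `aligned_runN.conf` are hypothetical.

```python
def build_alignment_chain(config="align.conf",
                          first_geometry="prealigned.conf",
                          n_runs=3):
    """Build one corry command per alignment run.

    The updated geometry written by run N is fed as the input
    geometry of run N+1. All file names are placeholders.
    """
    commands = []
    geometry_in = first_geometry
    for run in range(1, n_runs + 1):
        geometry_out = f"aligned_run{run}.conf"
        commands.append([
            "corry", "-c", config,
            "-o", f"detectors_file={geometry_in}",
            "-o", f"detectors_file_updated={geometry_out}",
        ])
        geometry_in = geometry_out
    return commands


if __name__ == "__main__":
    for cmd in build_alignment_chain():
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` once the names match your setup. Keeping every intermediate geometry file makes it easy to compare the steps afterwards.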

In a first step, the prealignment was done, using a Gaussian fit to find the mean and shift it to 0. The module AlignmentMillepede was then used with the following parameters:

[AlignmentMillepede]
exclude_dut = false
iterations = 2
dofs = true, true, false, true, true, true
residual_cut = .05mm
residual_cut_init = 1mm
number_of_stddev = 0
sigmas = 50um, 50um, 50um, .005rad, .005rad, .005rad
convergence = 1e-4

The prealignment does indeed find values ranging up to ~1 mm, and this is understood from the placement of the sensors on the PCB and their insertion in the DAQ boards (below in the picture).
In the following picture I report the figure of merit, the track chi2/ndof for the runs:

One can see that after prealignment the distribution is rather flat, but quite a few tracks still have a chi2ndof < 10. After a first alignment step, the track chi2ndof is already very good, with a mean of 1.22. A second step with track refitting improves this by a tiny bit (1.22 → 1.14), which is visible in the plot. Further iterations do not bring any improvement and are indistinguishable. (By iterations I mean here the number of alignment procedures, not the iterations parameter of the AlignmentMillepede module, which was always kept at the value from above, 2.)

Below I report the values from the geometry files before/after such an iteration. There should be a total of 6 plots, one for each of the degrees of freedom (translation X, Y, Z and rotation X, Y, Z). As can be seen from the AlignmentMillepede parameters, the z translation is not aligned, because we provide it as input in the initial geometry file and it should be well fixed. Therefore, this plot is not shown.

Also, since the tracking algorithm with translation X,Y and rotation X,Y,Z was stuck in a false minimum in the beginning, the second iteration step has the rotations in X and Y removed, while the Z is kept. As you will see later, the rot X and Y are minimal (< milli degree) (since the sensors are fixed in the metallic slots) and only the rotation around the Z axis has a higher impact.

This can be seen in the following table showing at which step what dofs were considered.

Next, I will plot for each of these degrees of freedom the difference between one alignment step and the next. I chose this instead of plotting the values directly, as some sensors had an initial misalignment of e.g. 1 mm, while others had ~30 um. Plotting those on the same scale would make the differences between iterations hard to see.


Here are the translations in X and Y. On the x axis are the 7 sensors; on the y axis I plotted the difference between alignments (check the legend to see which difference; I only plotted the first 3 alignments, since as I explained above the differences afterwards are minimal). This difference between alignments is expressed in pixel pitches (for the ALPIDE, 29.24 um in x and 26.88 um in y).

What one can see is that the prealignment already does a good job in terms of translation. The changes between the prealignment and the first alignment are smaller than 20% of a pixel pitch (~6 um) for x and smaller than 40% (~11 um) for y. A second alignment adds a further correction, but after that there are diminishing returns. The position after 6 iterations, which I took as a sort of true minimum, is plotted as a cross, and you can see that a third alignment no longer improves with respect to this position.
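For reference, the conversion between fractions of a pixel pitch and micrometres used in these plots is simple arithmetic. A small sketch with the ALPIDE pitches quoted above (the function names are just for illustration):

```python
# ALPIDE pixel pitches in micrometres, as quoted in the post.
PITCH_UM = {"x": 29.24, "y": 26.88}


def pitch_fraction_to_um(fraction, axis):
    """Convert a shift given as a fraction of the pixel pitch to um."""
    return fraction * PITCH_UM[axis]


def um_to_pitch_fraction(shift_um, axis):
    """Convert a shift in um to a fraction of the pixel pitch."""
    return shift_um / PITCH_UM[axis]


if __name__ == "__main__":
    print(f"20% of the x pitch: {pitch_fraction_to_um(0.20, 'x'):.2f} um")
    print(f"40% of the y pitch: {pitch_fraction_to_um(0.40, 'y'):.2f} um")
    print(f"5 um in x: {um_to_pitch_fraction(5.0, 'x'):.2f} pitches")
```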

I did not plot the other alignments (4-6), but I observed something. Given the telescope resolution ~5um (and the inherent tracking error ~2-3um at the position of each sensor with GBL), the alignment algorithm will try to align again and again, always finding another minimum (close to the true one), but the changes are of the order of the resolution, as one would expect. So, one should be careful not to align too many times.

Now, for the rotations, here is the situation:



I will start by drawing your attention to the fact that for rotations around the x and y axes the units are milli degrees, while for the rotations around the z axis they are degrees. So, the rotations around the z axis have a much higher impact.
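To put rough numbers on this, here is a geometric sketch of how far a point at the sensor edge moves in the measured plane for a given rotation. It assumes a 1024 x 512 pixel matrix for the ALPIDE (not stated above), rotation axes through the sensor centre, and tracks perpendicular to the sensor. For a rotation around z the edge moves by about r·θ. For a rotation around x the sensor tilts out of the plane, and the projected position only changes by r·(1 − cos θ), which is second order in the angle.

```python
import math

# Assumed ALPIDE matrix size (not given in the post) and quoted pitches.
N_COLS, N_ROWS = 1024, 512
PITCH_X_UM, PITCH_Y_UM = 29.24, 26.88
HALF_X_UM = N_COLS * PITCH_X_UM / 2.0  # centre-to-edge distance along x
HALF_Y_UM = N_ROWS * PITCH_Y_UM / 2.0  # centre-to-edge distance along y


def edge_shift_rot_z_um(angle_deg):
    """In-plane shift at the x edge for a small rotation around z."""
    return HALF_X_UM * math.radians(angle_deg)


def edge_shift_rot_x_um(angle_deg):
    """Projected y shift at the y edge for a rotation around x,
    for tracks at normal incidence."""
    return HALF_Y_UM * (1.0 - math.cos(math.radians(angle_deg)))


if __name__ == "__main__":
    print(f"rot z, 0.1 deg : {edge_shift_rot_z_um(0.1):.2f} um")
    print(f"rot z, 1 mdeg  : {edge_shift_rot_z_um(0.001):.3f} um")
    print(f"rot x, 2 mdeg  : {edge_shift_rot_x_um(0.002):.2e} um")
```

Under these assumptions a tenth of a degree around z moves the edge by roughly one pixel pitch, while a couple of milli degrees around x changes the projected position by far less than the tracking resolution.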

The rotations around y are a bit special and currently not understood. Here, all values are 0 (the two crosses are due to rounding errors basically). In the config files, this dof is specified to be used, but the algorithm does not do it. We don’t understand why this is the case. For other testbeams, this works well.

For rotations around x, one can see the same behaviour as for translations, namely some improvement is possible (<2 milli degrees). If one then does more and more alignments on top of this already good one, the algorithm starts to have increased values (overfitting? migrating local minimum?)

A similar thing happens for rot z. Here, you see a large improvement from the prealignment (since in the prealignment only the translations are accounted for by the shift of the peak; nothing is done for the rotational dofs, so the first alignment is the first time these are accounted for). After this initial large improvement, the changes in the next two iterations are smaller, but again towards smaller values (showing that one does indeed improve the alignment). After this second alignment, as explained above, the values start increasing.

Hi @mblidaru,

sorry for hopping late into that discussion. I do have a remark and a question:

After rerunning the alignment, you might have slightly different tracks (a cluster that was not assigned is now assigned for example). So I would expect small changes.

I am typically using the AlignmentTrackChi2 module and do not observe the above effects when running with approximately 100k tracks. Would you mind comparing the two?

Cheers,
Lennart

Hi Lennart! Saw this message late :slight_smile:
I confirm there are basically no changes when running with AlignmentTrackChi2
The changes that I see are minimal.
The prealignment is rather good, so a first alignment improves the positions in x, y, z by at most 8 um. Rather big changes happen in the rotations, where changes of -2 to 4.3 deg in x, -0.2 to 0.5 deg in y and -0.15 to 0.13 deg in z occur. As expected…
A second alignment results in minimal changes: <1 um in positions, <1 deg in x, <0.1 deg in both y and z.
There are basically no changes in efficiency, position resolution, etc. The chi2ndof changes from 1.883 to 1.847 after the second alignment.

Maybe I can write about one curiosity we observed. Looking at the first results I showed and these ones from the AlignmentTrackChi2 I was expecting to indeed not see differences in resolution/efficiency, but see something at least for the in-pixel plots.

To our surprise this does not seem to be the case. I think our expectation of how much change can result from a chi2 difference of ~0.1 should be lowered by a lot, considering how little this actually means (taking into account the rather large number of degrees of freedom for the whole telescope).
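A back-of-the-envelope way to see the scale, under strong assumptions: straight-line tracks through 7 planes with two measured coordinates each (ndof = 2·7 − 4 = 10), a single-hit resolution of about 5 um, and the whole chi2 change attributed to a shift of one plane in one coordinate. It ignores that the refit absorbs part of any shift and ignores the scattering terms of GBL, so it gives only an order of magnitude.

```python
import math


def straight_line_ndof(n_planes):
    """ndof of a straight-line fit: two measured coordinates per plane
    minus four track parameters (two offsets, two slopes)."""
    return 2 * n_planes - 4


def single_plane_shift_um(delta_chi2ndof, n_planes, sigma_um):
    """Shift of one plane (in one coordinate) that would account for a
    given change in mean chi2/ndof, if nothing else changed."""
    delta_chi2 = delta_chi2ndof * straight_line_ndof(n_planes)
    return sigma_um * math.sqrt(delta_chi2)


if __name__ == "__main__":
    for d in (0.1, 0.08, 0.036):
        shift = single_plane_shift_um(d, n_planes=7, sigma_um=5.0)
        print(f"delta chi2/ndof = {d}: ~{shift:.1f} um on one plane")
```

With these numbers a chi2/ndof change of ~0.1 corresponds to a shift of the order of the single-hit resolution on one plane, which is only a fraction of the pixel pitch.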

Any ideas on why a change is not observed given changes of a fraction of the pixel pitch?

This is a whole new level of delay @mblidaru… Sorry.

I cannot really answer your question completely. There is also no initial slope defined afaik. In the end this can mean that the residuals change less than the shifts if there is a systematic slope in the track incidence :man_shrugging:

Hi @lhuth … well, after May 19th when I last wrote, I had some vacation before I saw your message on the 31st :slight_smile:
For the initial slope, I am afraid we also do not have an answer as we did not look this up. We assume there must be some systematic deviation there, but we did not quantify it.

Hi,

The delay was referring to my response time :innocent:

You can use a straight line fit to check for the initial slope. But maybe you’ve already stopped working on this. Let me know if we should continue this.
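A minimal sketch of such a check for one track, using a plain least-squares line through the hit positions in one projection. The plane positions and hit values below are made up, and the function is not part of Corryvreckan.

```python
def fit_slope(z_mm, x_mm):
    """Least-squares straight line x = intercept + slope * z.

    Returns (slope, intercept). The slope is dimensionless, so for
    small angles it is the track angle in rad.
    """
    n = len(z_mm)
    mean_z = sum(z_mm) / n
    mean_x = sum(x_mm) / n
    szz = sum((z - mean_z) ** 2 for z in z_mm)
    szx = sum((z - mean_z) * (x - mean_x) for z, x in zip(z_mm, x_mm))
    slope = szx / szz
    intercept = mean_x - slope * mean_z
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical 7-plane telescope with 25 mm spacing.
    z = [0.0, 25.0, 50.0, 75.0, 100.0, 125.0, 150.0]
    x = [0.100, 0.151, 0.199, 0.251, 0.300, 0.349, 0.401]
    slope, intercept = fit_slope(z, x)
    print(f"slope = {slope * 1e3:.3f} mrad, intercept = {intercept:.4f} mm")
```

Filling a histogram of the fitted slopes over many tracks, separately for x and y, then shows whether the mean is compatible with zero.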

Ah, my apologies :slight_smile:
We closed the issue on this with the conclusion for the other users that:

  • Usage of AlignmentMillepede is advised
  • No crucial benefit is expected from rerunning the alignment multiple times
  • If one studies highly position-sensitive properties, such as in-pixel efficiency, running the alignment multiple times might prove beneficial, but there is no recipe for how many iterations or which algorithm. That is, it should be addressed on a case-by-case basis

I am still curious to find more about this, so as soon as I finish with what I have here I will look into the initial slope.
My naive assumption is that this is represented by the track angle (x/y) histograms, right? (I worked with straight tracks.) If that is the case, they are Gaussian, centered on 0, with a std dev of about 1.5 mrad. If there was a systematic slope it would have shown up as a shift of the mean away from 0… if my understanding is correct…
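To judge how small a shift of the mean would still be visible, one can compare it to the standard error of the mean, σ/√N. A small sketch, taking the 1.5 mrad width from above and, as an assumption, the 300k tracks from my first post; the example mean of 0.01 mrad is made up.

```python
import math


def mean_angle_significance(mean_mrad, std_mrad, n_tracks):
    """Return (significance, standard_error) of the mean track angle.

    standard_error = std / sqrt(N); significance is the mean in units
    of that error, so |significance| well above ~3 hints at a
    systematic slope.
    """
    standard_error = std_mrad / math.sqrt(n_tracks)
    return mean_mrad / standard_error, standard_error


if __name__ == "__main__":
    signif, err = mean_angle_significance(0.01, 1.5, 300_000)
    print(f"standard error of the mean: {err:.4f} mrad")
    print(f"a mean of 0.01 mrad would be {signif:.1f} standard errors from 0")
```

So with this many tracks the mean is known to a few micro radians, and even a systematic slope much smaller than the width of the distribution should be measurable.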

Correct - let me know if you have any updates