Duplicate trees in root files written by FileWriter

Hello,

I am currently writing a module to merge together the data of two pixel telescopes only using the readout timestamp added in the computer while receiving the data from the detectors.

For this I am using the FileWriter and a custom FileReader implementation to merge two events with matching timestamps. But when I save events and pixels to disk using the FileWriter module, I see duplicate trees in the ROOT files produced; see an example file here.
The duplicates only arise for higher event numbers. When I stop the corry run partway through, only one tree per type exists.

Is this behavior already known and intended?

Thank you for your help,
Bent

Hi @bbuttwil

this is curious, would you have a sample config and the inputs for me to try to reproduce this?

Cheers,
Simon

Hi @simonspa

thank you for investigating my problem.

Here I uploaded a compressed tarball [link] with a sample data file as well as config and geometry file.
In our setup we took data with 6 Alpides, of which I only included 3 in the geo.

When I run this config file as it is, it produces the file “contains_duped_tree.root” with the duplicated trees that contain Pixel and Event data. I made some modifications to the corry code to add my module, but they shouldn’t have changed anything in how the FileWriter behaves. As a sanity check, I also ran the config with the latest version on the master branch [commit: 87c7980349e4ce1de972750e3ca833911cb60923] and still see the same behavior with duplicate trees.

Cheers,
Bent

Hi Bent,

just in case this is still relevant to you: could it be that what you are seeing are just the backup copies created by the ROOT TTree AutoSave?
If I open your file with root, and do “.ls”, this is the output I see:

 TFile*         first_half_tele_events.root
  KEY: TTree    Pixel;3 Tree of Pixel [current cycle]
  KEY: TTree    Pixel;2 Tree of Pixel [backup cycle]
  KEY: TTree    Event;2 Tree of Events [current cycle]
  KEY: TTree    Event;1 Tree of Events [backup cycle]
  KEY: TDirectoryFile   config;1        config

This is not Corryvreckan-specific. In general, when you see “duplicate trees” in TBrowser, it is usually a save-cycle thing; here’s my understanding of what could be causing this.
→ The keyword to search for in the ROOT documentation is “cycle” or “namecycle” (saved objects are addressed as name;cycle).

ROOT begins filling the tree (e.g. “mytree”), and when the AutoSave threshold is reached (as far as I know, TTree::SetAutoSave takes it as a number of entries if positive and as a number of bytes if negative), the tree metadata is flushed to disk. mytree;1 is a snapshot of the metadata at that point. The next time a write operation is performed, the cycle number is increased, and you get mytree;2. The different cycles are kind of a crash-recovery mechanism: if your code crashes and there was an autosave in between, you can usually recover the entries up to that point from the file.

TBrowser shows all versions of an object (a histogram, a tree, …). That’s what you are seeing.
If you just do …->Get(“mytree”) in a macro or in cling, it automatically retrieves the one with the highest cycle number, i.e. the one that was most recently written, so in that case you don’t even notice.
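
To make the name;cycle lookup a bit more concrete, here is a small toy sketch in plain Python. It does not use ROOT at all: it only parses the “.ls” listing from above and mimics the rule that a name without a cycle resolves to the highest cycle. The helpers parse_keys and resolve are hypothetical names I made up for illustration, and the real lookup in ROOT has more cases (for example objects already in memory), so treat this as a model of the idea only.

```python
import re
from collections import defaultdict

# Matches lines such as "KEY: TTree    Pixel;3 Tree of Pixel [current cycle]".
KEY_LINE = re.compile(r"KEY:\s+(\S+)\s+([^;\s]+);(\d+)")


def parse_keys(listing):
    """Return {object name: sorted cycle numbers} from `.ls`-style text.

    Hypothetical helper for illustration, not part of ROOT.
    """
    cycles = defaultdict(list)
    for line in listing.splitlines():
        match = KEY_LINE.search(line)
        if match:
            _class_name, name, cycle = match.groups()
            cycles[name].append(int(cycle))
    return {name: sorted(values) for name, values in cycles.items()}


def resolve(cycles, namecycle):
    """Toy model of a name;cycle lookup.

    "Pixel" (no cycle) resolves to the highest stored cycle,
    "Pixel;2" resolves to exactly that cycle if it exists.
    """
    name, _, cycle = namecycle.partition(";")
    available = cycles[name]
    if not cycle:
        return name, available[-1]
    wanted = int(cycle)
    if wanted not in available:
        raise KeyError(namecycle)
    return name, wanted


# The key listing quoted earlier in this thread.
LISTING = """\
  KEY: TTree    Pixel;3 Tree of Pixel [current cycle]
  KEY: TTree    Pixel;2 Tree of Pixel [backup cycle]
  KEY: TTree    Event;2 Tree of Events [current cycle]
  KEY: TTree    Event;1 Tree of Events [backup cycle]
  KEY: TDirectoryFile   config;1        config
"""

keys = parse_keys(LISTING)
print(keys)
print(resolve(keys, "Pixel"))    # no cycle given: highest cycle is picked
print(resolve(keys, "Pixel;2"))  # explicit request for the backup cycle
```

So for the file above, asking for “Pixel” gives cycle 3 in this model, and the backup is only returned if you explicitly ask for “Pixel;2”.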

If it really bothers you to see multiple versions in TBrowser, I think you could prevent that by passing the option TObject::kWriteDelete when writing the tree (from the TObject documentation: “write object, then delete previous key with same name”).
Although I wouldn’t really consider it a problem: mytree;2 and mytree;1 are not two completely different trees, just different namecycles of the same tree, and only the TTree metadata is duplicated between namecycles. The real data is only present once in the file, so it doesn’t use a lot of extra disk space.