ConditionsDbSchema

From Mu2eWiki

Introduction

A database schema is the design of the tables, functions, and so on, together with their interactions. This page describes the schema of the conditions database in some detail.

We are using postgres databases. In postgres, the word "schema" is also used to refer to what is essentially a folder: a way to group database tables and assign permissions to them as a group. We have several schemas of this type, one for each detector subsystem (trk, cal, crv), plus one for the interval of validity structure (val) and one for test tables (tst). A table can be referred to as schema.name. For example, we have an example table in the tst schema, tst.calib1.

In this page we will use the word "calibration" to refer to one logical group of data, entered into one table at the same time. For example, if we had a set of gains for all crystals and committed those, with one value (equals one row) for each channel, that would comprise a "calibration". If the next week I committed another set of gains, that would be another "calibration".

Calibration Tables

Starting with tst.calib1, the conceptual content is a detector with 3 channels and a calibration which contains an integer flag and a float, "dtoe", for each channel.

TstCalib1 concept
channel flag DtoE
0 12 1.11
1 13 2.11
2 11 3.11

This set of 3 rows is a logical group and was committed together, so it is an example of a "calibration".

In the database, this table needs to hold many calibrations and we need a way to separate them, so we add a column called cid, which stands for calibration ID. Each cid is unique across the whole database, so a given cid labels exactly one calibration (three rows here) in exactly one table. The actual database table looks like the following, with three calibrations entered, with cids of 1, 2 and 3.

TstCalib1
cid channel flag DtoE
1 0 12 1.11
1 1 13 2.11
1 2 11 3.11
2 0 22 1.21
2 1 23 2.21
2 2 21 3.21
3 0 32 1.3177
3 1 33 2.3166
3 2 31 3.3134
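
As a minimal sketch of how the cid column separates calibrations, the following loads the rows above into a table and selects one calibration by its cid. It assumes an in-memory SQLite database as a stand-in for postgres; the table name calib1 (without the tst schema) and the helper fetch_calibration are hypothetical names used only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for postgres; the real table is tst.calib1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calib1 (cid INTEGER, channel INTEGER, flag INTEGER, dtoe REAL)"
)

# The nine rows shown in the TstCalib1 table: three calibrations of three channels.
rows = [
    (1, 0, 12, 1.11), (1, 1, 13, 2.11), (1, 2, 11, 3.11),
    (2, 0, 22, 1.21), (2, 1, 23, 2.21), (2, 2, 21, 3.21),
    (3, 0, 32, 1.3177), (3, 1, 33, 2.3166), (3, 2, 31, 3.3134),
]
conn.executemany("INSERT INTO calib1 VALUES (?, ?, ?, ?)", rows)


def fetch_calibration(conn, cid):
    """Return the (channel, flag, dtoe) rows of one calibration, ordered by channel."""
    cur = conn.execute(
        "SELECT channel, flag, dtoe FROM calib1 WHERE cid = ? ORDER BY channel",
        (cid,),
    )
    return cur.fetchall()


calibration_2 = fetch_calibration(conn, 2)
print(calibration_2)
```

Selecting on the cid alone is enough to recover one complete calibration, which is why no other grouping column is needed in the calibration table.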

Calibration Support

There are two tables required to support the calibration entries. The first is a table which defines what calibration tables exist and gives them a number.


ValTables
tid (seq)  name       dbname      create_date          create_user
1          TstCalib1  tst.calib1  2018-10-12 08:58:26  rlc
2          TstCalib2  tst.calib2  2018-10-12 08:58:26  rlc

Note that this table is in the "val" schema, as its name indicates. The name is the C++ class name and the dbname is the postgres table name. The number assigned to identify the calibration table is the Table ID, or TID, which is unique across the whole database. The TID is assigned from a sequence. In postgres, if a column is defined as a sequence, you do not supply a value for that column when you insert a new row; the value for the new row is assigned automatically as the next higher integer in the sequence. The sequence is stored in another auto-generated auxiliary table. It is easy to increment this sequence inadvertently without actually creating a new row, for example in the dbTool dry-run methods, so the sequence numbers are not going to be densely packed. There will be gaps: the tables might be numbered 1, 2, 5, 6, 8, etc.
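
The origin of the gaps can be illustrated with a toy model. This is only a simulation in plain Python, not postgres and not dbTool code: the counter plays the role of the sequence, and register_table, its dry_run flag, and the table TstCalib3 are hypothetical names invented for the illustration.

```python
import itertools

# Stand-in for a postgres sequence: every call to next() consumes a value,
# whether or not a row is ultimately stored.
tid_sequence = itertools.count(start=1)
val_tables = {}  # tid -> (name, dbname); plays the role of ValTables


def register_table(name, dbname, dry_run=False):
    """Draw the next TID; keep the row only when this is not a dry run."""
    tid = next(tid_sequence)
    if not dry_run:
        val_tables[tid] = (name, dbname)
    return tid


register_table("TstCalib1", "tst.calib1")
register_table("TstCalib2", "tst.calib2")
# Hypothetical third table: the dry run consumes a TID without storing a row.
register_table("TstCalib3", "tst.calib3", dry_run=True)
register_table("TstCalib3", "tst.calib3")

stored_tids = sorted(val_tables)
print(stored_tids)
```

The stored TIDs skip the value drawn by the dry run, so code that reads these tables should treat a TID as an opaque label and never assume the values are consecutive.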

This table, as many tables do, has a record of who created each row and when.

The second support table, called ValCalibrations, lists the calibration entries and assigns the CID.

ValCalibrations
cid (seq)  tid  create_date          create_user
1          1    2018-10-12 08:58:26  rlc
2          1    2018-10-12 08:58:26  rlc
3          1    2018-10-12 08:58:26  rlc

The calibration ID, or CID, is assigned automatically from a sequence. Each row assigns a CID and records which table contains that calibration.

When a calibrator inserts a new calibration, the dbTool code does the following:

  1. the table name is identified from the input file
  2. the new calibration rows are read from the input file
  3. the TID is looked up in ValTables based on the table name
  4. the TID is inserted in the ValCalibrations table and a new CID is returned
  5. the rows are added to the calibration table (TstCalib1 in this example), labeled with the new CID
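
The steps above can be sketched as follows. This is not the dbTool implementation; it is an illustration under several assumptions: an in-memory SQLite database stands in for postgres, the table names val_tables, val_calibrations and tst_calib1 are hypothetical flat names (SQLite has no schemas here), the create_date and create_user columns are omitted, and steps 1 and 2 (parsing the input file) are taken as already done. Wrapping steps 4 and 5 in a single transaction is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE val_tables (
        tid INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, dbname TEXT);
    CREATE TABLE val_calibrations (
        cid INTEGER PRIMARY KEY AUTOINCREMENT, tid INTEGER);
    CREATE TABLE tst_calib1 (cid INTEGER, channel INTEGER, flag INTEGER, dtoe REAL);
    INSERT INTO val_tables (name, dbname) VALUES ('TstCalib1', 'tst_calib1');
""")


def commit_calibration(conn, table_name, rows):
    """Store one calibration (a list of (channel, flag, dtoe)) and return its CID."""
    # Step 3: look up the TID and the database table name.
    found = conn.execute(
        "SELECT tid, dbname FROM val_tables WHERE name = ?", (table_name,)
    ).fetchone()
    if found is None:
        raise ValueError(f"unknown calibration table: {table_name}")
    tid, dbname = found

    with conn:  # one transaction: either everything is stored or nothing is
        # Step 4: register the calibration; the database assigns the new CID.
        cid = conn.execute(
            "INSERT INTO val_calibrations (tid) VALUES (?)", (tid,)
        ).lastrowid
        # Step 5: add the rows, each labeled with the new CID.
        # dbname comes from val_tables, not from user input.
        conn.executemany(
            f"INSERT INTO {dbname} VALUES (?, ?, ?, ?)",
            [(cid, *row) for row in rows],
        )
    return cid


new_cid = commit_calibration(
    conn, "TstCalib1", [(0, 12, 1.11), (1, 13, 2.11), (2, 11, 3.11)]
)
print(new_cid)
```

The caller never chooses the CID. It is returned by the insert into the calibrations table and then copied onto every row, which is what ties the rows to the calibration.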

Intervals of Validity and Groups

Logically, an interval of validity (IOV) says that a specific cid (one calibration in a specific table) is appropriate for a specific interval of subruns.

Many of these IOVs can point to the same calibration data. For example, if a calibrator uploads a calibration for run 1 and makes an IOV tying those together, then the next day confirms that the same calibration data is also good for run 2, the procedure would be to simply add a new IOV that connects the data to run 2.

iid (seq)  cid  start_run  start_subrun  end_run  end_subrun  create_time          create_user
1          1    1001       1             1001     999999      2018-10-12 08:58:26  rlc
2          2    1002       1             1004     1           2018-10-12 08:58:26  rlc
3          3    1004       2             999999   999999      2018-10-12 08:58:26  rlc

These entries are identified by a sequence number, the IOV ID or IID.

The run and subrun ranges are closed and fixed, which ensures that once IOVs are collected in a calibration set, the user's result will never change. If intervals were open-ended, results for later runs might change. If you want an IOV to apply to all runs permanently, use 0 for the start_run and 999999 for the end_run.
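
A lookup against closed intervals can be sketched as below, using the three IOV rows from the table above. The function name find_cids is hypothetical, and the sketch assumes that (run, subrun) pairs are ordered first by run and then by subrun, with both endpoints included in the interval.

```python
# (iid, cid, start_run, start_subrun, end_run, end_subrun), from the IOV table above
IOVS = [
    (1, 1, 1001, 1, 1001, 999999),
    (2, 2, 1002, 1, 1004, 1),
    (3, 3, 1004, 2, 999999, 999999),
]


def find_cids(run, subrun, iovs=IOVS):
    """Return the cids whose closed [start, end] interval contains (run, subrun)."""
    point = (run, subrun)
    # Python compares tuples element by element, so (run, subrun) pairs
    # are ordered by run first and by subrun second.
    return [
        cid
        for (_iid, cid, r0, s0, r1, s1) in iovs
        if (r0, s0) <= point <= (r1, s1)
    ]


print(find_cids(1004, 1))  # last subrun covered by iid 2
print(find_cids(1004, 2))  # first subrun covered by iid 3
print(find_cids(1000, 5))  # before any interval
```

Because both endpoints are inclusive, subrun 1004:1 belongs to the second IOV and 1004:2 to the third, with no overlap and no gap between them.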

There is another conceptual level which is not addressed yet. This level connects an IOV to a particular set of calibrations. For example, an IOV might belong in the first pass of production, but not in the second pass, which has a whole different set of calibrations. This logical connection is made in other tables explained below. In the IOV table, we only attach a subrun range to a calibration. The IOV can then be re-used in many calibration sets.

Upper Level