BESIII Data Management System

The BESIII experiment has collected about 1 PB of raw data and 1 PB of DST data since 2009, so managing these data and the associated condition data well is very important. The BESIII data management system has run successfully for ten years and offers a full-featured, time-tested approach for BESIII offline and physics users. The system combines a carefully designed structure with sound backup and maintenance strategies.


Introduction
The Beijing Electron Positron Collider II (BEPCII) is located in Beijing, China [1]. The third-generation Beijing Spectrometer (BESIII) is a multi-purpose detector operated at BEPCII for physics in the tau-charm energy region. The BESIII detector consists of four sub-detectors: the Drift Chamber (DC), the Time-Of-Flight counters (TOF), the Electromagnetic Calorimeter (EMC), and the Muon Counters (MuC). About 1 PB of raw data and 1 PB of DST data have been accumulated since 2009, so managing these data and the associated condition data well is very important.

Fig. 1. View of the BESIII detector
In the BESIII data management system, the original raw data from the DAQ, the Monte Carlo (MC) simulation data and the reconstruction data are stored in disk and tape file systems; all simulation and reconstruction data are produced by the BESIII offline software. Three types of data are stored in the MySQL database: data from the DAQ and Slow Controls databases, calibration constants, and user-customized information. The raw data from the DAQ are first written to files on a hard disk of the DAQ server. These files are copied to tape as soon as possible, and the data on tape are then written to the disk file system of the offline servers, where they become available to offline and physics users.

Data in MySQL Database
This section focuses on the data stored in the MySQL database.

Data from DAQ and Slow Controls databases
To date, the data from the DAQ and Slow Controls databases amount to about one million records, which include the data-taking status, information about the raw data files, and so on. The figure shows the structure of the BESIII central database system. For safety, the Slow Controls database server and the online database server sit on an internal network in the control room and cannot be accessed directly from outside. Both servers grant read privilege to the central database server, which holds replicas of them, so users can access the data from the DAQ and Slow Controls databases through a browser, a client, or an API. A shell API extracts the data that users are interested in from the DAQ and Slow Controls replicas into the central database. Users can reach the data in the central database in two ways: by accessing the central database at IHEP directly, or by accessing a local replica.

Calibration constants
The calibration constants currently amount to about ten thousand records. Every BESIII sub-detector has its own calibration constants. Calibrators produce the constants by running the calibration algorithms and store them on hard disk. Once the calibrators have verified the constants, they insert them into the central database through the BEMP application. The new constants are then replicated to all slaves in real time by the MySQL replication thread, so physics, reconstruction and simulation users can access them. For security reasons, database records are never deleted. A new record can carry a new calibration version, and if a record is never used, its status is set to 'ABORT'.
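
The "never delete, version and flag instead" policy can be illustrated with a minimal sketch. It is not the BESIII schema: the table name calib_const, its columns, and the helper functions are hypothetical, and SQLite from the Python standard library stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
SCHEMA = """
CREATE TABLE calib_const (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    detector  TEXT    NOT NULL,
    version   INTEGER NOT NULL,
    payload   TEXT    NOT NULL,
    status    TEXT    NOT NULL DEFAULT 'OK',
    UNIQUE (detector, version)
)
"""


def insert_constants(conn, detector, payload):
    """Append a new record with the next version number for this detector."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM calib_const WHERE detector = ?",
        (detector,),
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO calib_const (detector, version, payload) VALUES (?, ?, ?)",
        (detector, version, payload),
    )
    conn.commit()
    return version


def abort_constants(conn, detector, version):
    """Retire a record by flagging it; the row itself is never deleted."""
    conn.execute(
        "UPDATE calib_const SET status = 'ABORT' "
        "WHERE detector = ? AND version = ?",
        (detector, version),
    )
    conn.commit()


def latest_valid(conn, detector):
    """Return (version, payload) of the newest non-aborted record, or None."""
    return conn.execute(
        "SELECT version, payload FROM calib_const "
        "WHERE detector = ? AND status != 'ABORT' "
        "ORDER BY version DESC LIMIT 1",
        (detector,),
    ).fetchone()
```

Because records are only appended or flagged, any earlier version stays available for reprocessing, and readers simply skip rows whose status is 'ABORT'.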

User customized data
User-customized data are also stored in the MySQL database and currently amount to about two hundred thousand records. Most of these data are written to the database by applications.

BEMP
The BESIII experiment management platform (BEMP) keeps track of data handling steps and provides facilities to access metadata for physics analysis [2]. BEMP currently holds about four million records. It has three typical use cases. The first is physicist users, who can submit jobs with datasets as input, retrieve information about a dataset, or define their own data production. The second is the data production manager (collaboration-wide data reprocessing), who can also create datasets in the database, transfer data by dataset, and retrieve datasets as input for reconstruction. The third is calibrators, who use BEMP to manage the calibration constants. The figure shows the components of BEMP.

Backups
Backup is the most important measure for data security. We back up all the data in the MySQL database in two ways. The first is MySQL replication, which enables data from one MySQL database server (the master) to be copied to one or more MySQL database servers (the slaves) [3]. Because the data are replicated to the slave and the slave can pause the replication process, backup services can run on the slave without corrupting the corresponding master data. Replication also distributes data quickly and safely, supports load balancing and high availability, and is simple to use, which makes it well suited as a database backup design. The second way is to copy all the database files to tape every day. So far, both backup methods have run successfully.
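
The idea of pausing replication on a slave while a backup runs can be sketched as follows. This is an illustration under stated assumptions, not the BESIII backup script: run_sql and run_dump are hypothetical callables supplied by the caller (for example, thin wrappers around a MySQL client connection and the mysqldump tool). The statements STOP SLAVE SQL_THREAD and START SLAVE SQL_THREAD are standard MySQL; recent MySQL releases spell them STOP REPLICA and START REPLICA.

```python
def backup_from_slave(run_sql, run_dump):
    """Pause the slave's SQL thread, take a dump, then resume replication.

    run_sql(statement) executes one SQL statement on the slave.
    run_dump() performs the backup and returns its result.
    Both are hypothetical hooks provided by the caller.
    """
    # The slave stops applying replicated events, so its data stop changing.
    run_sql("STOP SLAVE SQL_THREAD")
    try:
        return run_dump()
    finally:
        # Always resume replication, even if the dump fails.
        run_sql("START SLAVE SQL_THREAD")
```

The master is never touched during the backup, and the slave catches up on the events it missed once its SQL thread restarts.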

Maintenance of MySQL Database Data
To keep the data in the MySQL database consistent and intact, it is necessary to monitor the differences between the data stored on the master and on the slaves. To ensure normal operation, a monitoring application watches the database servers continuously. When an error occurs, for example when a server goes down, the error is immediately emailed to the administrator and sent as an SMS message to the administrator's phone. The slave replication status is monitored by the application as well, so the database administrator can deal with problems as soon as possible. For data security, the following measures are used: restricting access to the database servers to specific IP addresses, granting different privileges to different users, enabling logging, taking backups, and so on. Recovering the MySQL database is simple for the BESIII offline management system because we use MySQL replication and back up the data every day.
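
A check of the slave replication status might look like the sketch below. It is not the monitoring application described above: the function name, the dict-based input, and the 60-second lag threshold are assumptions made for illustration. The keys are column names returned by the MySQL statement SHOW SLAVE STATUS; delivering the alerts by email or SMS is left out.

```python
def replication_alerts(status, max_lag_seconds=60):
    """Return a list of alert messages for one row of SHOW SLAVE STATUS.

    `status` is a dict keyed by SHOW SLAVE STATUS column names.
    The default lag threshold is an illustrative assumption.
    """
    alerts = []
    if status.get("Slave_IO_Running") != "Yes":
        alerts.append("I/O thread is not running")
    if status.get("Slave_SQL_Running") != "Yes":
        alerts.append("SQL thread is not running")

    # MySQL reports NULL (None here) when the lag cannot be determined.
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        alerts.append("replication lag is unknown")
    elif lag > max_lag_seconds:
        alerts.append(f"slave is {lag} s behind the master")

    if status.get("Last_Error"):
        alerts.append("last error: " + status["Last_Error"])
    return alerts
```

An empty list means the slave looks healthy; any other result would be handed to whatever notification channel the administrator uses.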

Conclusions and Outlook
The BESIII data management system has run successfully for ten years and offers a full-featured, time-tested approach for BESIII offline and physics users. In the future, we will adopt new technologies so that the BESIII data management system continues to evolve with experience and innovation.