Shared nomenclature and identifiers for telescopes and instruments

In the context of sharing public data, science results are expected to be reproducible, which requires full traceability of the origin of the data. On the documentalist side, there is a need to relate instrumental origins to the published data. We propose to define a shared nomenclature to index each publication with unique designations for facilities, telescopes and instruments, which could benefit from the Virtual Observatory work on semantics. This would help documentalists to check the consistency of the instrument description in publications or to make it more explicit. Observation period, data quality and spectral coverage, for instance, may be checked by referencing a global instrumentation service which gathers the nominal observation parameters for the telescope/facility/instrument involved. With this indexing mechanism, bibliographic metrics for telescope/instrument usage would be easy to compute, and tracking services like the ESO telescope bibliography database (TelBib) would be easier to feed. This paper reviews the existing initiatives and gives the example of a facility description framework that reuses Virtual Observatory metadata and could be fed by the community.


Introduction
Sharing public data is now well adopted in the sciences, especially in astronomy, and is supported by public policies, e.g. the G8 Science Ministers Statement in 2013 [1]. Reusability and reproducibility of the data that researchers have to share are key features. Particularly in astronomy, where science questions are driven by the need for multi-wavelength, multi-messenger observations, it is crucial to attach metadata to observations in order to describe facilities, instruments, wavelengths and other particular information related to the data. By doing so, those observations become discoverable and understandable, and so can be reused by the astronomical community. Another interesting point related to data sharing is the interoperability provided by metadata standards, which allow querying multiple archives at the same time, e.g. as provided in the Virtual Observatory framework. To this day, there are individual usages of the descriptions of facilities but no common effort from the community to coordinate them. The aim of this paper is to review the existing initiatives in terms of facility designations and to initiate a discussion on a possible common strategy.

e-mail: emmanuelle.perret@astro.unistra.fr, ORCID: 0000-0002-4068-8175

Definition points
The terms facility, observatory or even instrument can have different meanings depending on usage: e.g. Voyager 2 and HST are both observatories when querying the SPASE registry [2], whereas HST is a telescope in the WISeREP database [3]. In this paper, we propose to define the following terms:
- Facility: a global term to designate any installation containing a telescope, a satellite or an observatory.
- Observatory: the ground-based buildings equipped with a telescope.
- Telescope: any platform, whether ground-based or a spacecraft.
- Instrument: a physical device that collects data.

Existing initiatives
We conducted a survey of various repositories describing facilities on the Web. All these sites deal with astronomical data and belong to the following categories:
• Agency webpages which describe telescopes, space missions and instruments (how data are produced)
• Publishing services which distribute scientific papers (how data are analyzed)
• Astronomical services providing data (how data are distributed: by domain of interest or data collection)

Agencies
Agencies describe their missions and facilities online with varying levels of detail. Although most agencies provide the names and descriptions of their facilities, they do so in an uncoordinated manner.
Besides, agencies have more specific services which use facility designations. For instance, the NASA Space Science Data Coordinated Archive (NSSDCA) provides a master catalog to search for spacecraft. It also provides a search through the SPASE Registry [2] for NASA observatories and instruments. Furthermore, the Jet Propulsion Laboratory (JPL) provides the Navigation and Ancillary Information Facility (NAIF) integer ID codes [4], which are part of an observation geometry information system that helps scientists interpret observations from space-based instruments aboard robotic spacecraft.
In both cases, alternative names of instruments are managed through a unique identifier that designates all the synonyms. However, a search for Voyager 2 on SPASE returns the code "1977-076A", whereas its NAIF identifier is "-32". So no uniform index currently exists.
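The mismatch between identifier schemes can be illustrated with a small cross-reference table. The following Python sketch is only an illustration: the table layout, the scheme labels "spase" and "naif", and the function names are assumptions, and only the two Voyager 2 identifiers come from the text above.

```python
# Hypothetical cross-reference table: one row per facility, one key per
# identifier scheme. Only the Voyager 2 values are taken from the text.
CROSSREF = [
    {"label": "Voyager 2", "spase": "1977-076A", "naif": "-32"},
]


def build_index(rows):
    """Map every (scheme, identifier) pair to the row that carries it."""
    index = {}
    for row in rows:
        for scheme, ident in row.items():
            if scheme == "label":
                continue
            key = (scheme, ident)
            if key in index:
                raise ValueError(f"duplicate identifier {key}")
            index[key] = row
    return index


def translate(index, scheme, ident, target):
    """Return the identifier in the `target` scheme, or None if unknown."""
    row = index.get((scheme, ident))
    return None if row is None else row.get(target)


INDEX = build_index(CROSSREF)
print(translate(INDEX, "naif", "-32", "spase"))
```

A shared nomenclature would play the role of the row itself: a single stable key to which each scheme-specific identifier is attached.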

Publishing services
The American Astronomical Society (AAS), which publishes three of the main journals in astronomy, introduced a list of controlled facility keywords 12 years ago [5] in order to encourage authors to tag their publications with these facility labels. However, adoption by authors has remained limited, and facility names are still searched in free text.
In order to index articles with names of the facilities which have provided observational data, a common reference list among journals and the astronomical community at large would be very helpful.

Astronomical services
An interesting example of an online repository fed by the supernovae community worldwide is the Weizmann Interactive Supernova data REPository (WISeREP) [6], which gathers observational data on supernovae. This data repository provides a list of telescopes and instruments and gives the wavelength domains. Its purpose is to select appropriate datasets. It also provides filter names and operational status information, for example updates and decommissioning dates. Besides, comparing the spectral coverage of various filters helps to tag associated data correctly, select datasets and combine measurements. For instance, this is done when a spectral energy distribution profile is built from catalog data. The SVO filter profile service [7] is a good example of an initiative to gather all the information concerning filters coming from diverse facilities. A constrained but rich list of instruments and facilities is proposed for the search of the filters. Furthermore, the service provides the transmission plots and reference parameters for the filters, such as the effective wavelengths and widths.
In order to summarize the behaviors observed across the various initiatives in our survey, we register the following features as columns in Table 1:
• "Goal" describes the main objective of the site
• "Facility type" indicates the subclass of facility, if provided: instruments as "Inst", telescopes (ground or space) as "Tel" or observatories as "Obs"
• "Facility description" indicates the presence of a free text description
• "Instrument type" indicates "F" for full if a precise list of instruments is given, or "P" for partial if only an indication is given (e.g. spectroscopic or photometric in WISeREP [3])
• "Time range" indicates whether the period when the facility is operational is given; usually it is not, but some sites indirectly provide some time information (TelBib [8] via program ID and VESPA [9] via time intervals)
• "Spectral coverage" shows "F" for full, "P" for partial or "N" for no, depending on whether the site gives the range of wavelength/frequency/energy covered by the instrument
• "Alternative names" indicates "Y" (yes) if alternative names can be resolved through the web site API
• "Reference features" mentions whether additional parameters of the instrument are given, such as the spectral or spatial resolution
• "Spatial location" can be "Y" for yes if the location coordinates are given, or "P" for partial if only the observatory site or the country is mentioned
• "Number of entries" gives the number of facility rows for each site

Classifying the requirements
As we can see in Table 1, facility descriptions are provided at various levels of detail in order to serve diverse needs.

Simple list
For some usages, a list of all the unique, common and long-lasting identifiers of existing facilities would be enough. For example, in the case of the naming convention proposed by the SAO/NASA Astrophysics Data System (ADS) [10] to build dataset identifiers from facility names, the users only have the list of the unique identifiers, based on human-readable acronyms, of 258 facilities. However, even for a simple list, the problem of alternative names needs to be solved. Among the examples given in Table 1, we can cite the NAIF codes [4], which associate one integer code with the diverse synonyms of the same spacecraft. SPASE [2] also has unique identifiers, as we have already seen in section 2.1, but here the alternative names are given in a description sheet and can be queried as well as the identifiers.
Finally, in the case of TelBib [8] or the SVO filter profile service [7], the lists of queryable instruments and facilities are constrained, so alternative names are not searchable. In those cases, the alternative names have to be managed internally, e.g. the librarian at ESO keeps an updated list of the ESO facilities with alternate designations.

Detailed facility profile
Going a little further, different usages require different degrees of detail about the facility. Indeed, in Table 1, all examples other than ADS give more information than a simple list of identifiers.
For some efforts, information about the instrument, where it is installed (telescope, observatory name and location) or to which spacecraft it belongs, is provided (SPASE [2]). For others, the dependency is not tracked, or it is described in a non-standard way (AAS [5]). The type is also important information, which is provided at diverse levels: from the simple distinction between photometric and spectroscopic (e.g. WISeREP [3]) to a detailed list of instrument types with the ability to search among them (e.g. Europlanet [13], SPASE).
Another aspect, which is particularly important in planetary science, concerns the observational periods of the instruments. For this parameter, we can have the global commissioning dates, the global observational dates corresponding to a specific version of the instrument, or the specific dates corresponding to a given observation (e.g. indicated in the program identifier of an observation). For instance, we already noted in section 2.3 that WISeREP provides comments on instrument upgrades or decommissioning dates. This leads to the notion of versioning, which is indeed something to keep in mind when we want to describe observational data.
Finally, in the case of planetary science, the location of observatories is really important. For example, the Minor Planet Center (MPC [11]) provides unique codes along with the precise coordinates for more than 2,000 observatories. This kind of information is also given in the case of VESPA [9]. To a lesser extent, the country of observatories hosting telescopes is also often provided (e.g. AAS, WISeREP, Europlanet).
From those specific usages of facility designations, we can define three levels of facility profiles:
• precise: these are the exact parameters for a specific observation
• global: generic values for the facility, e.g. when no other information is provided for the spectral range of a data set, the minimum and maximum spectral range of the instrument can be used if there is a link toward the agency's instrument web-page description.
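The fallback from a precise profile to a global one for the spectral range can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name, the tuple representation of a range and the numerical values (in nanometres) are hypothetical and do not describe any real instrument.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]  # (minimum, maximum) in a single agreed unit


def spectral_range(
    precise: Optional[Range], global_profile: Optional[Range]
) -> Tuple[Optional[Range], str]:
    """Pick the best available range and report which level supplied it."""
    if precise is not None:
        return precise, "precise"
    if global_profile is not None:
        # No per-observation value: fall back to the nominal instrument range.
        return global_profile, "global"
    return None, "unknown"


# Hypothetical nominal range of an instrument, for illustration only.
INSTRUMENT_NOMINAL = (300.0, 1000.0)

print(spectral_range(None, INSTRUMENT_NOMINAL))
print(spectral_range((450.0, 550.0), INSTRUMENT_NOMINAL))
```

Returning the level alongside the value lets a data service record whether a range was measured for the observation or inherited from the facility description.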

Specific usages
As we have seen in section 2.2, a list of unique identifiers would allow tagging of the scientific publications in a homogeneous way.
New technological tools also make it possible to tag publications after the fact, by automatically extracting facility names with text-mining and machine-learning tools. This could help in two ways:
• Adding synonymous identifiers to an existing list of standardized facility identifiers
• Assisting authors in correctly tagging a publication with a standardized facility
It is worth noting that this kind of tag would be useful for librarians and would make it easier to track papers according to the facilities used to obtain the scientific results. For instance, a consistent tag in publications could be used by ADS to allow a search among papers using specific facilities. For the VizieR associated data service [20], which archives images and spectra associated with publications, a simple list of facilities would be a great help: once integrated in the indexing tool, this list could help the documentalists and the authors to attach the correct telescope/instrument to the corresponding observations.
Another level of usage is reached with the ESO TelBib service [8], which allows the retrieval of all the scientific papers in which the ESO facilities have been used for observations. This service is really beneficial for the agency, because all the metrics of this archive can help the decision makers. For instance, when an instrument is less cited in papers, the agency can consider improvements to this instrument [21]. However, to provide such a service, the librarians need to parse the publications in order to find the citations of the ESO facilities and program identifiers, which is a very time-consuming effort. Even if curation will always be necessary, it would be easier to find the relevant papers if the publications had homogeneous facility tags.
Another use-case example is the VizieR Photometry Viewer tool [22], which plots the fluxes for one source from the photometry gathered among all the VizieR catalogs (currently 16,200). In order to do so, precise data on filters must be collected and attached to the VizieR datasets. Those metadata are carefully enriched by the whole team of astronomers, engineers and documentalists. Having unique identifiers for all facilities, along with the links to instrument and filter descriptions provided by the agencies, could be a good basis to enrich the metadata.
Furthermore, tagging publications would enable other usages. For instance, in SIMBAD, each data measurement is linked to its bibcode, which displays the title of the corresponding scientific paper. With facility-tagged publications, each parameter could eventually show the facilities used to make the observations for that paper.
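As a very simple stand-in for the text-mining step mentioned in this section, facility tags can be suggested by matching known names against the text of a publication. The Python sketch below uses plain regular expressions rather than machine learning; the alias table, the identifiers and the function names are assumptions made for illustration, and a real list would come from the shared nomenclature.

```python
import re

# Hypothetical alias table: standardized identifier -> known spellings.
ALIASES = {
    "HST": ["HST", "Hubble Space Telescope"],
    "CFHT": ["CFHT", "Canada-France-Hawaii Telescope"],
}


def compile_patterns(aliases):
    """Build one case-sensitive pattern per identifier from its spellings."""
    patterns = {}
    for ident, names in aliases.items():
        # Longest spellings first, so full names are tried before acronyms.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        # The lookarounds reject matches embedded in a longer word.
        patterns[ident] = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")
    return patterns


def suggest_tags(text, patterns):
    """Return the sorted identifiers whose spellings occur in the text."""
    return sorted(ident for ident, pat in patterns.items() if pat.search(text))


PATTERNS = compile_patterns(ALIASES)
sample = "Images were obtained with the Hubble Space Telescope and with CFHT."
print(suggest_tags(sample, PATTERNS))
```

Such suggestions would still need curation, since a name appearing in the text does not prove that the facility provided the data.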

Feedback
From all the existing initiatives and use cases, we can conclude that there is a need for a common nomenclature, with unique and long-lasting identifiers for facilities. If this telescope/instrument list were as complete as possible, it would strengthen interoperability and would be very beneficial for the community. This conclusion was reinforced by the LISA VIII meeting, at the end of which a working group started to discuss all these aspects, towards a definition of a nomenclature. This interest working group currently includes more than twenty people from more than ten institutes (including NED, ADS, ESO, Keck, NOAO, CfA, etc.). The interest comes from librarians and data center service developers who need to tag the facilities in their descriptions or publications. Curation of facilities is especially hard because, on the one hand, a facility can have many alternative names and, on the other hand, distinct facilities can share the same name. This is true for organizations, e.g. SAO (Cambridge) versus SAO (Moscow), but also for instruments, e.g. MegaCam can be the instrument of diverse telescopes (CFHT or MMT and Magellan). The difficulties in defining all the terms we are talking about (see section 1.1) and the levels of facility profiles (see section 3.2) also illustrate the complexity of sharing a common description. The diversity of usages suggests that we need a common base to which more layers can be linked from diverse origins.
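The name collisions described above (SAO, MegaCam) suggest that a bare name is not enough to identify a facility. A minimal Python sketch of a registry that refuses to resolve an ambiguous name without a qualifier is given below; the "qualifier/name" identifier format, the class and the method names are assumptions for illustration, not a proposed standard.

```python
from collections import defaultdict


class AmbiguousName(Exception):
    """Raised when a bare name matches several distinct facilities."""


class Registry:
    def __init__(self):
        # Lower-cased bare name -> set of qualified identifiers.
        self._by_name = defaultdict(set)

    def add(self, name, qualifier):
        """Register a facility under a hypothetical 'qualifier/name' id."""
        self._by_name[name.lower()].add(f"{qualifier}/{name}")

    def resolve(self, name, qualifier=None):
        """Return the unique identifier, or raise if none or several match."""
        candidates = set(self._by_name.get(name.lower(), set()))
        if qualifier is not None:
            prefix = qualifier + "/"
            candidates = {c for c in candidates if c.startswith(prefix)}
        if not candidates:
            raise KeyError(name)
        if len(candidates) > 1:
            raise AmbiguousName(sorted(candidates))
        return next(iter(candidates))


registry = Registry()
for host in ("CFHT", "MMT", "Magellan"):
    registry.add("MegaCam", host)
registry.add("SAO", "Cambridge")
registry.add("SAO", "Moscow")

print(registry.resolve("MegaCam", qualifier="CFHT"))
```

Calling `registry.resolve("MegaCam")` without a qualifier raises `AmbiguousName` and lists the candidates, which is the point where a curator or an author would have to choose.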

On the way to having a general facility representation
In the framework of the IVOA Semantics group, we initiated an effort to homogenize the notation and description of facility properties and to link them to existing web pages maintained by the agencies in charge of building the facilities. This representation is built on six basic concepts: Facility, Telescope, Observatory, Space Mission, Spacecraft, and Instrument.
The Facility class is derived from a VOResource [23] metadata structure which gathers basic curation parameters. It handles a list of alternate names for the Facility, and a start time and stop time interval related to the operational period of the Facility. The other classes are derived from the Facility class. Two subclasses are defined from the Observatory class: a "ground-based" Observatory has location coordinates (longitude, latitude, altitude), and a Spacecraft is an Observatory subclass related to a Space Mission instance. All facilities usually have an Agency name attribute and are described by a webpage or document maintained by that agency. A more detailed description is available in [24]. Among the other projects we examined, notably in space science, the PDS4 [25] data model offers a richer description of these items, with an efficient binding of the various instrumental classes to the data collections. Our group is currently pursuing this perspective to produce a clear relational diagram of the various concepts and to derive an Instrument class with precise instrument types. This is an ongoing effort.
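The class relationships described above can be sketched with Python dataclasses. This is only an illustration of the hierarchy as stated in the text, not the data model of [24]: all attribute names, the `host` link on Instrument and the example values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Facility:
    """Base class: curation parameters shared by every concept."""
    name: str
    alternate_names: List[str] = field(default_factory=list)
    start_time: Optional[str] = None  # ISO 8601 date, None if unknown
    stop_time: Optional[str] = None   # None if still operating or unknown
    agency: Optional[str] = None
    reference_url: Optional[str] = None

    def known_as(self, name: str) -> bool:
        """Case-insensitive check against the name and its alternates."""
        wanted = name.strip().lower()
        return wanted in {n.lower() for n in [self.name, *self.alternate_names]}


@dataclass
class SpaceMission(Facility):
    """A space mission, to which spacecraft are attached."""


@dataclass
class Telescope(Facility):
    """A telescope, ground-based or spaceborne."""


@dataclass
class Observatory(Facility):
    """Parent of the ground-based and spacecraft subclasses."""


@dataclass
class GroundObservatory(Observatory):
    longitude_deg: Optional[float] = None
    latitude_deg: Optional[float] = None
    altitude_m: Optional[float] = None


@dataclass
class Spacecraft(Observatory):
    mission: Optional[SpaceMission] = None


@dataclass
class Instrument(Facility):
    host: Optional[Facility] = None  # assumed link to the hosting facility


# Placeholder values for illustration only.
voyager2 = Spacecraft(
    name="Voyager 2",
    alternate_names=["Voyager II"],
    mission=SpaceMission(name="Voyager"),
)
print(voyager2.known_as("voyager ii"), isinstance(voyager2, Observatory))
```

Giving every attribute beyond `name` a default keeps the subclasses easy to extend and reflects the fact that many surveyed sites provide only part of this information.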

Conclusion
We have presented a survey of various repositories describing facilities available on the Web. We have categorized those examples by the origin of data (how data are produced, analyzed and distributed) and by the type of information requested according to diverse needs. From the reviewed simple list of facility identifiers to detailed facility profiles, we can conclude that there is a need for a common nomenclature, with unique and standardized identifiers for facilities. A common and basic description would help to maintain and build upon diverse astronomical services. More detailed facility descriptions, as proposed in the PDS4 data model, suggest a way forward for developing a standardized facility description repository with shared contributions from the whole astronomical community.
Table 1: Survey of facility description initiatives. Entries include TelBib (ESO) [8], VESPA (Paris Observatory) [9], ADS facility identifiers (SAO/NASA) and the Observing Systems Capability Analysis and Review Tool (OSCAR) from the World Meteorological Organization.