The Scikit-HEP Project

The Scikit-HEP project is a community-driven and community-oriented effort with the aim of providing Particle Physics at large with a Python scientific toolset containing core and common tools. The project builds on five pillars that embrace the major topics involved in a physicist's analysis work: datasets, data aggregations, modelling, simulation and visualisation. The vision is to build a user and developer community engaging collaboration across experiments, to emulate scikit-learn's unified interface with Astropy's embrace of third-party packages, and to improve discoverability of relevant tools.


Introduction
Python is an extremely popular programming language across a broad range of communities. Outside High Energy Physics (HEP), the Python scientific ecosystem is built atop the "building blocks" of the SciPy ecosystem of open-source software for mathematics, science, and engineering [1]. A visualisation of the ecosystem is given in figure 1. The ecosystem grows to incorporate data manipulation and visualisation tools, packages for statistics and machine learning, etc. At the top of the "pyramid" lie domain-specific projects, for example astropy [2], which build on and exploit the building blocks. Traditionally, HEP has evolved in a rather disjoint ecosystem based on the C++ ROOT data analysis framework [4]. Like the Python scientific ecosystem, ROOT provides tools for data manipulation and modelling, for fitting, and for statistics and machine learning applications. But it is a toolkit rather than a toolset, with bindings to Python.
Various initiatives exist, or have existed, that try to link the HEP and non-HEP worlds, but they mainly tackle(d) specific topics. We believe there is a need for a more general effort oriented towards the domain as a whole.

Scikit-HEP project overview
The Scikit-HEP project [5] is a community-driven and community-oriented effort with the aim of providing Particle Physics at large with a Python scientific toolset containing core and common tools. The project builds on five pillars that embrace the major topics involved in a physicist's analysis work: datasets, data aggregations, modelling, simulation and visualisation.
The project should neither be seen as a replacement for ROOT nor a replacement for the Python ecosystem based on the SciPy suite. It is rather the following:
• An initiative to improve the interoperability between HEP tools and the Python ecosystem, expanding the typical set of tools for HEP physicists with common APIs and definitions to ease "cross-talk".
• An initiative to build a community of developers and users.
• An effort to improve discoverability of relevant tools.
The Scikit-HEP toolset is depicted in figure 2. For completeness, it should be mentioned that the well-known packages root_numpy [6] and root_pandas [7], which pre-date the project, are not described in this report. They remain part of the project but have been somewhat superseded by the newer and more versatile package uproot [8]; see below.
The remainder of this report briefly presents each package with simple examples of its main functionality.
Figure 2. Overview of the packages making up the Scikit-HEP toolset. All GitHub repositories can be found at https://github.com/scikit-hep.

Outlook
The Scikit-HEP project is gaining interest and momentum as a Python toolset for HEP analysis. Some of the packages are in fact being used by other communities, in particular the astroparticle physics community. Much can already be done with the packages described here. The packages presented are being further developed and improved as users stress-test them and provide feedback. Other packages are expected to join the project to complement the toolset on offer.
Anyone is welcome to get in touch, as a user or developer, via the public scikit-hep-forum mailing list. All project administrators and package maintainers can be reached through the scikit-hep-admins mailing list.