WLCG Token Usage and Discovery FERMILAB-CONF-21-078-SCD

Since 2017, the Worldwide LHC Computing Grid (WLCG) has been working towards enabling token based authentication and authorisation throughout its entire middleware stack. Following the publication of the WLCG v1.0 Token Schema [1] in 2019, middleware developers have been able to enhance their services to consume and validate OAuth2.0 [2] tokens and process the authorization information they convey. Complex scenarios, involving multiple delegation steps and command line flows, are a key challenge to be addressed in order for the system to be fully operational. This paper expands on the anticipated token based workflows, with a particular focus on local storage of tokens and their discovery by services. The authors include a walk-through of this token flow in the RUCIO managed data-transfer scenario, including delegation to FTS and authorised access to storage elements. Next steps are presented, including the current target of submitting production jobs authorised by tokens within 2021.


Introduction
Over the past few years there has been significant progress made towards making token based authentication and authorisation a realistic goal for WLCG. OAuth2.0 workflows for physics analysis have been prototyped thanks to technical developments, made by both industry and the wider academic community, and to time dedicated by many members of WLCG to address WLCG-specific challenges. The current objective is to be able to submit production jobs within 2021.

Contributing Groups
The WLCG Authorisation Working Group was formed in 2017, at a time when multiple activities were independently beginning to seriously consider token based authorisation. Experts from multiple domains and projects, including SciTokens [3], the INDIGO DataCloud project [4] and EGI [5], came together to chart a path towards token based authorisation for WLCG [6]. Work to enhance software was supported by several European Commission Projects: EOSC-Hub [7], EOSC Pilot [8] and AARC2 [9]. The group focuses on the technical and policy challenges affecting WLCG's transition to OAuth2.0 authorisation.
The Data Organization, Management and Access (DOMA) Working Group [10] has been instrumental in getting WLCG token support tested in data handling workflows that will be vital for LHC Run 3 and beyond.

WLCG Token Schema
The WLCG Token Schema v1.0 was published in September 2019 [1] and defined the semantics for token use within the WLCG. It was largely inspired by the SciTokens schema but, importantly, addressed some WLCG-specific requirements such as the need to include group information in tokens. The schema document defines recommended lifetimes for the different tokens in the ecosystem and a mechanism for requesting tokens that conform to a given version of the specification. Many of the areas that required extended discussion have since been addressed by an Internet Draft RFC that defines the content of Access Tokens [11]. A future edition of the WLCG specification will take advantage of this new document and focus on defining only those aspects which fall outside the scope of the RFC.
The publication of the WLCG Schema allowed middleware developers to implement support for OAuth2.0 Tokens, which they could test using a WLCG Token Issuer deployed at INFN. This Token Issuer is a deployment of INDIGO IAM [12], the software chosen by the WLCG for this purpose, following a thorough analysis of several very viable options.

Command Line Tools
Whilst other scientific domains are seeing their researchers moving to web based analysis, a large proportion of physicists' work is still performed on the command line. Tools to provision OAuth2.0 tokens into a user's local environment must be both convenient and secure, minimising any requirements for the user e.g. to perform operations on a web portal. Tokens must be discoverable by command line clients, which led to the definition of the Bearer Token Discoverability specification described in Section 3. Further details on Command Line tools for WLCG Token Based workflows can be found in the paper by Dr Dave Dykstra for vCHEP 2021.

Token authorization flows
To make token based authentication and authorization a reality in WLCG we need to define how tokens are obtained from the Virtual Organisation (VO) token issuer (e.g., IAM) and sent across services to drive authentication and authorization in the infrastructure. We will rely only on standard OAuth/OpenID Connect authorization flows for this purpose.
It is expected that most services will act as OAuth resource servers (which do not need registration at the token issuer), while services acting as entry points to the infrastructure (e.g., experiment frameworks, UIs, etc.) or that need to exchange tokens will have to be registered in IAM as clients.
Most registered services will obtain tokens using the OpenID Connect authorization code flow [13], the recommended flow for server-side applications. The OAuth token exchange flow [14] will be used for delegation across services, both to implement audience and scope restrictions and to delegate offline access privileges across the chain of services. The device code flow [15] will be mainly used in support of CLI applications (e.g., UIs), to implement authentication flows on the terminal that can support federated identity providers like the CERN SSO [16]. When services need to act on their own identity, i.e. not on behalf of a user, the OAuth client credentials flow [2] will be used.
User authentication is requested by including the openid scope in the authorization requests. Audience restrictions are requested using the audience parameter, as standardized in the OAuth token exchange standard. IAM honours what is requested by a client, and includes the requested audience in issued access tokens. When no audience is explicitly requested, the generic audience string is included in access tokens, as required by the WLCG profile.
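The audience rule above has a counterpart on the consuming side: a resource server should only accept a token addressed to it or carrying the generic audience. The following is a minimal sketch of that check, not taken from any WLCG middleware. The function names are hypothetical, and the value of WLCG_ANY_AUDIENCE is an assumption based on our reading of the WLCG profile and should be checked against the published schema [1]. The payload is decoded without signature verification, which a real service must never skip.

```python
import base64
import json

# Assumption: the generic audience string defined by the WLCG profile.
WLCG_ANY_AUDIENCE = "https://wlcg.cern.ch/jwt/v1/any"


def decode_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (illustration only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected header.payload.signature)")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_accepts(claims: dict, my_audience: str) -> bool:
    """Return True if the token is addressed to this service or to any service."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):  # the aud claim may be a single string or a list
        aud = [aud]
    return my_audience in aud or WLCG_ANY_AUDIENCE in aud
```

A service would call audience_accepts(decode_claims(token), own_identifier) only after the signature, issuer and expiry of the token have been validated. A token with no aud claim is rejected by this sketch.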

The RUCIO-FTS-SEs flow
To support DOMA activities, the first scenario we focused on is the RUCIO managed data transfer, which is common to many LHC experiments (as depicted in Figure 1). In this scenario, RUCIO [17] delegates its identity to FTS [18] to manage a third-party transfer between two storage elements (SEs). RUCIO requests a token from IAM using the client credentials flow, as it is acting on its own service identity. In this request (step 1 in Figure 1), RUCIO asks for the token audience to be restricted to the FTS service. RUCIO then submits a transfer job to FTS, including the token in the request. FTS creates the transfer job, but cannot use the received token to manage the transfer, as the token audience is specific to FTS and the token may not provide the privileges needed for reading and writing data at storage elements. FTS therefore starts a token exchange flow with IAM (step 3) to exchange the received token for a pair of tokens, an access token and a refresh token, that will be used to manage the transfer. In this flow, FTS requests the scopes needed to access the data and restricts the audience of the issued tokens to the target storage elements. The access token obtained in the flow is then used to submit the third-party transfer to one of the SEs (step 4) and for the actual data transfer between the SEs (step 5). The refresh token can be used by FTS to get a fresh access token from IAM (using the standard OAuth refresh token flow) if needed.
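The two token requests in this walk-through (step 1 and step 3) can be sketched as form-encoded request bodies for the token endpoint. This is an illustrative sketch rather than the RUCIO or FTS implementation. The function names, hostnames and the "fts" scope are hypothetical, and the storage scopes are our assumption of what a transfer would need. The grant type and token type URNs are those defined by the OAuth token exchange standard [14]. Passing several audiences as repeated parameters follows that standard, but whether IAM expects this form is an assumption. Client authentication (e.g. HTTP Basic) and the HTTP call itself are omitted.

```python
from urllib.parse import urlencode

# URNs defined by the OAuth token exchange standard.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def client_credentials_body(audience: str, scope: str) -> str:
    """Step 1: RUCIO, acting on its own identity, asks for a token restricted to FTS."""
    return urlencode({
        "grant_type": "client_credentials",
        "scope": scope,
        "audience": audience,
    })


def token_exchange_body(subject_token: str, audiences: list, scope: str) -> str:
    """Step 3: FTS exchanges the received token for tokens aimed at the SEs."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "scope": scope,
        "audience": audiences,  # emitted as one audience parameter per SE
    }
    return urlencode(params, doseq=True)


# Hypothetical usage: all hostnames and scope values are placeholders.
rucio_request = client_credentials_body("https://fts.example.org", "fts")
fts_request = token_exchange_body(
    "token-received-from-rucio",
    ["https://se1.example.org", "https://se2.example.org"],
    "storage.read:/ storage.modify:/ offline_access",
)
```

Including offline_access in the exchange is what asks the issuer for a refresh token alongside the access token, matching the pair of tokens described above.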

WLCG IAM Operational Readiness
In order to transition to production use of the WLCG IAM Token Issuers, the new Token Based infrastructure must be operated with a level of service at least equivalent to that of the current infrastructure. This implies running highly available WLCG IAM instances, offering user support, and ensuring that service incidents can be addressed within an acceptable timeframe. In addition, the Token Issuers must demonstrate their trustworthiness by conforming with the EUGridPMA's Guidelines for Attribute Authority Service Provider Operations [19].
WLCG IAM instances for CMS and ATLAS have been running successfully on CERN's OpenShift infrastructure for several months, which allows them to be highly scalable and to leverage central CERN IT services wherever possible. IAM instances for the remaining VOs will be set up similarly. Each instance will be integrated behind CERN's Single-Sign-On [16], improving user experience for the researchers and facilitating the inclusion of such services in the investigation of security incidents. Production level support is expected for the second half of 2021.

Token Discovery
Client tools that rely on a bearer token for authenticating themselves need a mechanism for receiving the tokens from their environment. While the browser is a monolithic user agent (and can internally manage tokens), the terminal environment involves a number of independently developed tools; the environment needs a way to communicate the token to be used to Unix processes. As we did not find any existing standard for token discovery, we have defined one to be used by the tools built in our community. The rest of this section is a description of that standard.
If a tool needs to authenticate with a token and does not have out-of-band knowledge on which token to use, the following steps to discover a token MUST be taken in sequence, where $ID below denotes the process's effective user ID:
1. If the BEARER_TOKEN environment variable is set, then its value is taken to be the token contents.
2. If the BEARER_TOKEN_FILE environment variable is set, then its value is interpreted as a filename. The contents of the specified file are taken to be the token contents.
3. If the XDG_RUNTIME_DIR environment variable is set¹, then take the token from the contents of $XDG_RUNTIME_DIR/bt_u$ID².
If a potential token is found at a step, then the discovery implementation MUST strip all whitespace on the left and right sides of the string. We define whitespace the same way as the C99 isspace() function: space, form-feed (\f), newline (\n), carriage return (\r), horizontal tab (\t), and vertical tab (\v). Upon finding a valid token according to section 2.1 of RFC6750 [20], the discovery procedure MUST terminate and return this token. Upon finding an empty token, the discovery implementation should continue with the next step. Upon finding an invalid token, the implementation SHOULD stop and return an error.
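The procedure described so far can be sketched as follows. This is a minimal illustration, not a reference implementation: the function and exception names are hypothetical, only the three steps listed above are covered, and treating a missing or unreadable file as an empty token is our assumption, since the text does not say how that case should be handled. The validity check uses the b64token syntax from section 2.1 of RFC6750.

```python
import os
import re

# b64token syntax from RFC6750 section 2.1.
_B64TOKEN = re.compile(r"[A-Za-z0-9\-._~+/]+=*")
# Whitespace as defined by C99 isspace().
_WHITESPACE = " \f\n\r\t\v"


class InvalidTokenError(Exception):
    """Raised when a discovered candidate is not a syntactically valid token."""


def _check(candidate):
    """Strip whitespace; return None if empty, raise if invalid, else the token."""
    token = candidate.strip(_WHITESPACE)
    if not token:
        return None
    if not _B64TOKEN.fullmatch(token):
        raise InvalidTokenError("candidate is not a valid bearer token")
    return token


def _read(path):
    """Read a token file. Assumption: a missing or unreadable file counts as empty."""
    try:
        # Non-ASCII bytes become U+FFFD, which later fails the syntax check.
        with open(path, "r", encoding="ascii", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


def discover_token(environ=None, euid=None):
    """Return the first valid token found, or None if every step came up empty."""
    environ = os.environ if environ is None else environ
    euid = os.geteuid() if euid is None else euid

    # Step 1: the token itself is in the environment.
    if "BEARER_TOKEN" in environ:
        token = _check(environ["BEARER_TOKEN"])
        if token:
            return token

    # Step 2: the environment names a file holding the token.
    if "BEARER_TOKEN_FILE" in environ:
        token = _check(_read(environ["BEARER_TOKEN_FILE"]))
        if token:
            return token

    # Step 3: a per-user file in the XDG runtime directory.
    if "XDG_RUNTIME_DIR" in environ:
        path = os.path.join(environ["XDG_RUNTIME_DIR"], f"bt_u{euid}")
        token = _check(_read(path))
        if token:
            return token

    return None
```

Calling discover_token() with no arguments uses the real process environment and effective user ID. The optional arguments exist only to make the sketch easy to exercise in isolation.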
Upon discovery of a valid token, referred to as $TOKEN, if the tool is to use it to authenticate an HTTP request the tool MUST use it in accordance with RFC6750. For example, in the Authorization header as follows: