Data Access Control Setup
=========================

This short tutorial explains how to set up the ESGF Access Control infrastructure for publishing and downloading a specific dataset. As an example, we will consider the case of NASA “obs4MIPs” data.

Step 1: Configure the ESGF Postgres database
--------------------------------------------

Use the command line client to interact with the Postgres database:

.. code:: console

   psql -U dbsuper -d esgcet

(type in your Postgres super-user password).

Create a new group that will control all operations on the dataset (each SQL command below must be typed on a single line):

.. code:: console

   esgcet=# insert into esgf_security.group (id, name, description, visible, automatic_approval) values (2, 'NASA OBS', 'NASA observations', true, true);

Note that:

- visible=true will cause the group to be exposed to the public for registration
- automatic_approval=true will enroll users into the group (with READ-ONLY privileges) upon request, without the need for the group administrators to approve the request

Then, assign read/write privileges to one or more users who will be publishing the data. In this case, we look up the “rootAdmin” user and grant that user privileges on the group just created. The statements below use user id 1 (the id returned by the query in this example) and group id 2 (the id of the group inserted above):

.. code:: console

   esgcet=# select id from esgf_security.user where openid like '%rootAdmin%';
   esgcet=# insert into esgf_security.permission (user_id, group_id, role_id, approved) values (1, 2, 4, true); -- 'publisher' role, aka 'write' privileges
   esgcet=# insert into esgf_security.permission (user_id, group_id, role_id, approved) values (1, 2, 6, true); -- 'user' role, aka 'read' privileges

Step 2: Edit the ESGF XML configuration files
---------------------------------------------

As root, edit the file /esg/config/esgf_policies_local.xml to specify one or more policies for reading/writing the files of your dataset. For example:

.. code:: console

   vi /esg/config/esgf_policies_local.xml

You must also edit the file /esg/config/esgf_ats_static.xml to specify the URLs of the Attribute and Registration services that manage membership in the access control group. For example:

.. code:: console

   vi /esg/config/esgf_ats_static.xml

When done, you may restart the node, although this should not be necessary because the above files should be reloaded automatically.

Step 3 (optional): Publish the Group Registration URL
-----------------------------------------------------

CoG provides functionality for streamlining user registration into the ESGF access control groups. Whenever CoG is connected to an ESGF “security” database back-end, it will automatically create an appropriate registration page for each of the ESGF access control groups read from the local database. These pages all have URLs of the form:

.. code:: console

   https://<hostname>/ac/subscribe/<group_name>/

For example: https://esgf-node.llnl.gov/ac/subscribe/NASA%20OBS/. As a node administrator, you can embed this URL anywhere on your node where content is allowed, such as the node home page or the home page for the specific “NASA OBS” project. Users can visit the registration page directly to request READ permission, without having to go through the old ESGF data download workflow.

Additionally, a license can be embedded in the registration page for users to read before they request membership. To do so, place a file called <group_name>.html (in HTML format) or <group_name>.txt (in plain text format) under your local templates directory, at one of the following paths:

.. code:: console

   /usr/local/cog/cog_config/mytemplates/cog/access_control/licenses/<group_name>.html
   /usr/local/cog/cog_config/mytemplates/cog/access_control/licenses/<group_name>.txt

The figure below shows an example registration page with an embedded HTML license.

.. figure:: /images/ESGF-CoG_group_registration_page.png
   :scale: 45%
   :alt: Figure 1. Example ESGF-CoG registration page with optional license agreement display.

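
Because group names such as “NASA OBS” contain spaces, the group name must be percent-encoded when it is placed in the registration URL from Step 3. The following Python sketch shows one way to build that URL. The function name ``registration_url`` and its arguments are hypothetical and are not part of CoG; only the ``/ac/subscribe/`` URL pattern comes from this tutorial.

.. code:: python

   from urllib.parse import quote


   def registration_url(hostname: str, group_name: str) -> str:
       """Return the CoG registration page URL for an access control group."""
       # quote() percent-encodes spaces and other reserved characters;
       # safe="" also encodes "/" so a group name cannot add path segments.
       return f"https://{hostname}/ac/subscribe/{quote(group_name, safe='')}/"


   # Reproduces the example URL given in Step 3.
   print(registration_url("esgf-node.llnl.gov", "NASA OBS"))

Generating the link this way avoids hand-encoding mistakes when a node hosts several groups.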

Special Case: Unrestricted Data
-------------------------------

In some cases, the data might be available for download without any restrictions at all, i.e. even to guest users. In this case, the node administrator only needs to insert a policy statement in the file /esg/config/esgf_policies_local.xml that matches the data URLs and uses the special attribute_type="ANY". Note that you will still want to have a restricted access control group to enable publishing of the data. For example:

.. code:: console

   vi /esg/config/esgf_policies_local.xml

Special Case: Authentication Only Data
--------------------------------------

In other cases, the data providers might want to require users to authenticate before downloading the data, so that they can capture the users' openids for metrics reporting, without requiring users to enroll in any group. In this case, they can use a policy statement with the special attribute_type="AUTH_ONLY". For example:

.. code:: console

   vi /esg/config/esgf_policies_local.xml
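
To summarize how the two special cases differ from group-based access, the following Python sketch models the read decision for a single policy that already matches the data URL. It is purely illustrative and is not the ESGF authorization code: the function ``may_read`` and its arguments are hypothetical, and it assumes that any attribute_type other than "ANY" or "AUTH_ONLY" names an access control group such as "NASA OBS".

.. code:: python

   from typing import Optional, Set


   def may_read(attribute_type: str, openid: Optional[str], groups: Set[str]) -> bool:
       """Hypothetical read-access check for one policy that matches the data URL."""
       if attribute_type == "ANY":
           # Unrestricted data: guest (unauthenticated) users may download.
           return True
       if openid is None:
           # Every other policy requires the user to authenticate first.
           return False
       if attribute_type == "AUTH_ONLY":
           # Authentication is enough; no group enrollment is needed.
           return True
       # Assumption: otherwise attribute_type names the access control group.
       return attribute_type in groups

For instance, under these assumptions ``may_read("AUTH_ONLY", None, set())`` is false for a guest, while the same call with any openid is true even when the user belongs to no group.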