From 13ca989448a7532bc25f88a883be67845b795ddc Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Tue, 1 Oct 2024 15:37:57 +0200 Subject: [PATCH 01/20] update Part I / Introduction --- guide/sections/introduction.adoc | 2 +- guide/sections/part1/introduction.adoc | 64 +++++++++++++------------- 2 files changed, 33 insertions(+), 33 deletions(-) diff --git a/guide/sections/introduction.adoc b/guide/sections/introduction.adoc index caeb742..df9f107 100644 --- a/guide/sections/introduction.adoc +++ b/guide/sections/introduction.adoc @@ -1,4 +1,4 @@ == Introduction === Purpose -In conjunction with the https://library.wmo.int/idurl/4/68731[_Manual on the WMO Information System_] (WMO-No. 1060), Volume II – WMO Information System 2.0 (_Manual on WIS_, Volume II), the present Guide to the WMO Information System, Volume II – WMO Information System 2.0 _(Guide to WIS_, Volume II) is designed to ensure adequate uniformity and standardization in the data, information and communication practices, procedures and specifications employed by Members of the World Meteorological Organization (WMO) in the operation of the WMO Information System WIS 2.0 as it supports the mission of the Organization. The Manual on WIS, Volume II contains standard and recommended practices, procedures and specifications. The Guide to WIS contains additional information concerning practices, procedures and specifications that Members are invited to follow or implement in establishing and conducting their arrangements in compliance with the WMO technical regulations and in developing meteorological and hydrological services. \ No newline at end of file +In conjunction with the https://library.wmo.int/idurl/4/68731[_Manual on the WMO Information System_] (WMO-No. 1060), Volume II – WMO Information System 2.0 (_Manual on WIS_, Volume II), the present _Guide to the WMO Information System_ (WMO-No. 
1061), Volume II – WMO Information System 2.0 (_Guide to WIS_, Volume II) is designed to ensure adequate uniformity and standardization in the data, information and communication practices, procedures and specifications employed by WMO Members in the operation of the WMO Information System WIS 2.0 as it supports the mission of the Organization. The _Manual on WIS_, Volume II contains standard and recommended practices, procedures and specifications. The _Guide to WIS_, Volume II contains additional information concerning practices, procedures and specifications that Members are invited to follow or implement in establishing and conducting their arrangements in compliance with the WMO technical regulations and in developing meteorological and hydrological services. \ No newline at end of file diff --git a/guide/sections/part1/introduction.adoc b/guide/sections/part1/introduction.adoc index d35e878..30a2794 100644 --- a/guide/sections/part1/introduction.adoc +++ b/guide/sections/part1/introduction.adoc @@ -2,7 +2,7 @@ Since the Global Telecommunication System (GTS) entered operational life in 1971, it has been a reliable real-time exchange mechanism of essential data for WMO Members. -In 2007, the WMO Information System (WIS) entered operations to complement the GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval. The success of WIS was limited as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to stand to its expectations of delivering an increasing number and volume of real-time data to WMO centres in a reliable and cost -effective way. +In 2007, the WMO Information System (WIS) entered operations to complement the GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval of data. 
The success of WIS was limited as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to deliver an increasing number and volume of real-time data to WMO centres in a reliable and cost-effective way. WIS2 has been designed to meet the shortfalls of the current WIS and GTS, support Resolution 1 (Cg-Ext(2021)) – WMO Unified Policy for the International Exchange of Earth System Data (https://library.wmo.int/idurl/4/57850[_World Meteorological Congress: Abridged Final Report of the Extraordinary Session_] (WMO-No. 1281)), support the Global Basic Observing Network (GBON) and meet the demand for high data volume, variety, velocity and veracity. @@ -10,23 +10,23 @@ WIS2 technical framework is based around three foundational pillars: leveraging ==== 1.1.1 Leveraging open standards -WIS2 leverages open standards to take advantage of the ecosystem of technologies available on the market and avoid building bespoke solutions that can force National Meteorological and Hydrological Services (NMHS) to procure costly systems and equipment. In today’s standards development ecosystem, standards bodies work closely together to minimize overlap and build on one another’s areas of expertise. For example, the World Wide Web Consortium provides the framework of web standards, which the Open Geospatial Consortium and other standards bodies leverage. WIS2 leverages open standards with industry adoption and wider, stable and robust implementations, thus extending the reach of WMO data sharing and lowering the barrier to access by Members. +WIS2 leverages open standards to take advantage of the ecosystem of technologies available on the market, thereby avoiding the need to build bespoke solutions that can force National Meteorological and Hydrological Services (NMHSs) to procure costly systems and equipment. 
In today’s standards development ecosystem, standards bodies work together closely to minimize overlap and build on one another’s areas of expertise. For example, the World Wide Web Consortium provides the framework of web standards, which the Open Geospatial Consortium (OGC) and other standards bodies leverage. WIS2 leverages open standards with industry adoption and wider, stable and robust implementations, thus extending the reach of WMO data sharing and lowering the barrier to access by Members. ==== 1.1.2 Simpler data exchange -WIS2 prioritizes public telecommunication networks, unlike private networks for GTS links. As a result, using the Internet will enable the best choice for a local connection, using commonly available and well-understood technology. +WIS2 prioritizes public telecommunication networks, rather than private networks for GTS links. As a result, using the Internet will enable the best choice for a local connection, using commonly available and well-understood technology. -WIS2 aims to improve the discovery, access and utilization of weather, climate and water data by adopting web technologies proven to provide a truly collaborative platform for a more participatory approach. Data exchange using the Web also facilitates easy access mechanisms. Browsers and search engines allow web users to discover data without specialized software. The Web also enables additional data access platforms, such as desktop Geographical Information Systems (GIS), mobile applications and forecaster workstations. The Web provides access control and security mechanisms that can be utilized to freely share core data as defined by Resolution 1 (Cg-Ext(2021)) and protect the data with more restrictive licensing constraints. 
Web technologies also allow for authentication and authorization for the provider to retain control of who can access published resources and to request users to accept a license specifying the terms and conditions for using the data as a condition for providing access to them. +WIS2 aims to improve the discovery, access and utilization of weather, climate and water data by adopting web technologies proven to provide a truly collaborative platform for a more participatory approach. Data exchange using the Web also facilitates easy access mechanisms. Browsers and search engines allow web users to discover data without the need for specialized software. The Web also enables additional data access platforms, such as desktop geographical information systems (GIS), mobile applications and forecaster workstations. The Web provides access control and security mechanisms that can be utilized to freely share core data as defined by Resolution 1 (Cg-Ext(2021)) and to protect the data with more restrictive licensing constraints. Web technologies also allow for authentication and authorization to enable the provider to retain control of who can access published resources and to request users to accept a license specifying the terms and conditions for using the data as a condition of being granted access. -WIS2 uses a "publish-subscribe" pattern where users subscribe to a topic to receive new data in real -time. The mechanism is like WhatsApp and other messaging applications. It is a reliable and straightforward way to allow the user to choose her data of interest and to receive them reliably. +WIS2 uses a "publish-subscribe" pattern by which users subscribe to a topic to receive new data in real time. The mechanism is similar to WhatsApp and other messaging applications. It is a reliable and straightforward way to allow the users to choose their data of interest and to receive them reliably. 
==== 1.1.3 Cloud-ready solutions -The cloud provides reliable platforms for data sharing and processing. It reduces the need for expensive local IT infrastructure, which constitutes a barrier to developing effective and reliable data processing workflows for some WMO Members. WIS2 encourages WMO centres to adopt cloud technologies where appropriate to meet their users' needs. While WMO technical regulations will not mandate cloud services, WIS2 will promote a gradual adoption of cloud technologies that provide the most effective solution. +The cloud provides reliable platforms for data sharing and processing. It reduces the need for expensive local IT infrastructure, which constitutes a barrier to developing effective and reliable data processing workflows for some WMO Members. WIS2 encourages WMO centres to adopt cloud technologies where appropriate to meet users' needs. While WMO technical regulations will not mandate cloud services, WIS2 will promote the gradual adoption of cloud technologies that provide the most effective solution. -The cloud-based infrastructure allows easy portability of technical solutions, ensuring that a system implemented by a specific country can be packaged and deployed easily in other countries with similar needs. In addition, using cloud technologies allows WIS2 to deploy infrastructure and systems efficiently with minimum effort for the NMHSs by shipping ready-made services and implementing consistent data processing and exchange techniques. +The cloud-based infrastructure allows for the easy portability of technical solutions, ensuring that a system implemented by a specific country or territory can be packaged and deployed easily in other countries/territories with similar needs. In addition, using cloud technologies allows WIS2 to deploy infrastructure and systems efficiently, while requiring minimal effort from the NMHSs by shipping ready-made services and implementing consistent data processing and exchange techniques. 
-It should be clear that hosting data and services on the cloud does not affect data ownership. Even in a cloud environment, organizations retain ownership of their data, software, configuration and change management as if they were hosting their infrastructure. As a result, data authority and provenance stay with the organization, and the cloud is simply a technical means to publish the data. +It is important to note that hosting data and services on the cloud does not affect data ownership. Even in a cloud environment, organizations retain ownership of their data, software, configuration and change management as if they were hosting their infrastructure. As a result, data authority and provenance stay with the organization, and the cloud is simply a technical means to publish the data. ==== 1.1.4 Why are datasets so important? @@ -34,49 +34,49 @@ WMO enables the international exchange of observations and model data for all Ea Resolution 1 (Cg-Ext(2021)) describes the Earth system data that are necessary for efforts to monitor, understand and predict the weather and climate – including the hydrological cycle, the atmospheric environment and space weather. -WIS is the mechanism by which this Earth system data is exchanged. +WIS is the mechanism by which these Earth system data are exchanged. -A common practice when working with data is to group them into "datasets". All the data in a dataset share some common characteristics. The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations" footnote:[Data Catalog Vocabulary (DCAT) - Version 2, W3C Recommendation 04 February 2020 https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset]. +A common practice when working with data is to group them into "datasets". All the data in a dataset share some common characteristics. 
The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations" footnote:[See Data Catalog Vocabulary (DCAT) - Version 2, W3C Recommendation 04 February 2020 https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset]. -Why is this important? The "single agent" (such as a single organization) responsible for managing the collection ensures consistency among the data. For example, in a dataset: +Why is this important? The "single agent" (such as an organization) responsible for managing the collection ensures consistency among the data. For example, in a dataset: * All the data should be of the same type (for example, observations from weather stations). * All the data should have the same license and/or usage conditions. -* All the data should be subject to the same quality management regime - which may mean that all the data is collected or created using the same processes. -* All the data should be encoded in the same way (such as, using the same data formats and vocabularies). +* All the data should be subject to the same quality management regime - which may mean that all the data are collected or created using the same processes. +* All the data should be encoded in the same way (for example, using the same data formats and vocabularies). * All the data should be accessible using the same protocols - ideally from a single location. -This consistency means that one can predict what data is in a dataset, at least as far as the common characteristics, making it easier to write applications to process the data. +This consistency means that it is possible to predict the contents of a dataset, at least regarding the common characteristics, making it easier to write applications to process the data. 
-A dataset might be published as an immutable resource (such as, data collected from a research programme), or it might be routinely updated (for example, every minute as new observations are collected from weather stations). +A dataset may be published as an immutable resource (such as, data collected from a research programme), or it may be routinely updated (for example, every minute, as new observations are collected from weather stations). -A dataset may be represented as a single, structured file or object (for example, a CSV file where each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands as possible of General Regularly-distributed Information in Binary form (GRIB) files). Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that we consider the dataset to be a single, identifiable resource irrespective of how it is represented. +A dataset may be represented as a single, structured file or object (for example, a CSV file in which each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands of General Regularly-distributed Information in Binary form (GRIB) files). Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that the dataset is considered to be a single, identifiable resource, irrespective of how it is represented. 
-Because we group data into a single, conceptual resource (that is, the dataset) we can: -* Give this resource an identifier and use this identifier to unambiguously refer to collections of data; +Because data are grouped into a single, conceptual resource (that is, the dataset), it is possible to: +* Assign this resource an identifier and use this identifier to unambiguously refer to collections of data; * Make statements about the dataset (that is, metadata) and infer that these statements apply to the entire collection. The dataset concept is central to WIS: -* We publish discovery metadata about datasets, as specified in the _Manual on WIS_, Volume II – Appendix F: WMO Core Metadata Profile; -* We can search for datasets that contain relevant data using the Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>); -* We can subscribe to notifications about updates about a dataset via a Global Broker (see <<_2_4_2_global_broker>>); -* We can access the data that comprises a dataset from a single location using a well -described mechanism. +* Discovery metadata about datasets are published, as specified in the _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2); +* Data consumers can search for datasets that contain relevant data using the Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>); +* Data consumers can subscribe to notifications about updates to a dataset via a Global Broker (see <<_2_4_2_global_broker>>); +* Data consumers can access the data that comprise a dataset from a single location using a well-described mechanism. -It is up to the data publisher to decide how their data is grouped into datasets – effectively, to decide what datasets they publish to WIS. That said, we recommend that, subject to the consistency rules above, data publishers should organize their data into as few datasets as possible. 
+It is up to data publishers to decide how their data are grouped into datasets – effectively, to decide what datasets they publish to WIS. That said, it is recommended that, subject to the consistency rules above, data publishers should organize their data into as few datasets as possible. -For a data publisher, this means fewer discover metadata records to maintain. For a data consumer this means fewer topics to subscribe to and fewer places to access the data. +For a data publisher, this means fewer discovery metadata records to maintain. For a data consumer, this means fewer topics to subscribe to and fewer places to access the data. There are some things that are fixed requirements for datasets: -1. All data in the dataset must be accessible from a single location. +1. All data in the dataset must be accessible from a single location; 2. All data in the dataset must be subject to the same license or usage conditions. -Here are some examples of datasets: +Some examples of datasets include: -* The most recent 5-days of synoptic observations for an entire country or territory footnote:[Why 5-days in this example? Because the system used to publish the data in this example only retains data for 5-days]. -* Longterm record of observed water quality for a managed set of hydrological stations. -* Output from the most recent 24-hours of operational numerical weather prediction model runs. -* Output from 6-months of experimental model runs. It is important to note that output from the operational and experimental should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data. -* A multipetabyte global reanalysis spanning 1950 to present day. +* The most recent five days of synoptic observations for an entire country or territory footnote:[Why 5-days in this example? 
Because the system used to publish the data in this example only retains data for 5-days]; +* A long-term record of observed water quality for a managed set of hydrological stations; +* The output from the most recent 24 hours of operational numerical weather prediction model runs; +* The output from six months of experimental model runs. It is important to note that output from the operational and experimental model runs should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data; +* A multi-petabyte global reanalysis spanning 1950 to the present. -In summary, datasets are important because they are how data is managed in WIS. +In summary, datasets are important because they are how data are managed in WIS. From 2a7886bd2cf6351ac4b735d0c6a573321e249aab Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Fri, 4 Oct 2024 11:20:21 +0200 Subject: [PATCH 02/20] update part 1 for data consumers and data publishers --- guide/sections/part1/data-consumer.adoc | 66 +++++----- guide/sections/part1/data-publisher.adoc | 159 +++++++++++------------ 2 files changed, 112 insertions(+), 113 deletions(-) diff --git a/guide/sections/part1/data-consumer.adoc b/guide/sections/part1/data-consumer.adoc index 870ed9c..10dbd5d 100644 --- a/guide/sections/part1/data-consumer.adoc +++ b/guide/sections/part1/data-consumer.adoc @@ -1,90 +1,90 @@ === 1.2 Data consumer -As a data consumer wanting to use data published via WIS2 you should read the guidance presented here. In addition, a list of references to informative material in this Guide and elsewhere is provided at the end of this section. +Data consumers wanting to use data published via WIS2 should read the guidance presented here. In addition, a list of references to informative material in this Guide and elsewhere is provided at the end of this section. 
==== 1.2.1 How to search the Global Discovery Catalogue to find datasets -The first step to using data published via WIS2 is to determine which dataset or datasets contains the data that is needed. To do this, a data consumer may browse discovery metadata provided by the Global Discovery Catalogue. Discovery metadata follows a standard scheme (see _Manual on WIS_, Volume II – Appendix F: WMO Core Metadata Profile). A data consumer may discover a dataset using keywords, geographic area of interest, temporal information, or free text. Matching search results from the Global Discovery Catalogue provide high-level information (title, description, keywords, spatiotemporal extents, data policy, licensing, contact information), from which a data consumer can assess and evaluate their interest in accessing/downloading data associated with the dataset record. +The first step to using data published via WIS2 is to determine which dataset or datasets contain the data that are needed. To do this, a data consumer may browse discovery metadata provided by the Global Discovery Catalogue. Discovery metadata follow a standard scheme (see _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2)). A data consumer may discover a dataset using keywords, a geographic area of interest, temporal information, or free text. Matching search results from the Global Discovery Catalogue provide high-level information (title, description, keywords, spatiotemporal extents, data policy, licensing, contact information), from which data consumers can assess and evaluate their interest in accessing/downloading data associated with the dataset record. -A key component of dataset records in the Global Discovery Catalogue is that of "actionable" links. A dataset record provides one to many links that clearly identify the nature and purpose of the link (informational, direct download, API, subscription) so that the data consumer can interact with the data accordingly. 
For example, a dataset record may include a link to subscribe to notifications (see <<_1_2_2_how_to_subscribe_to_notifications_about_the_availability_of_new_data>>) about the data, or an API, or an offline archive retrieval service. +A key component of dataset records in the Global Discovery Catalogue is "actionable" links. A dataset record provides one or more links, each clearly identifying its nature and purpose (informational, direct download, application programming interface (API), subscription) so that the data consumer can interact with the data accordingly. For example, a dataset record may include a link to subscribe to notifications about the data (see <<_1_2_2_how_to_subscribe_to_notifications_about_the_availability_of_new_data>>), or an API, or an offline archive retrieval service. The Global Discovery Catalogue is accessible via an API and provides a low barrier mechanism (see <<_2_2_4_global_discovery_catalogue>>). Internet search engines are able to index the discovery metadata in the Global Discovery Catalogue, thereby providing data consumers with an alternative means to search for WIS2 data. ==== 1.2.2 How to subscribe to notifications about the availability of new data -WIS2 provides notifications about updates to datasets; for example, when a new observation record from an automatic weather station is added to a dataset of surface observations. Notifications are published on message brokers. Where data consumers need to use data rapidly once it has been published (such as input to a weather prediction model), they should subscribe to one or more Global Broker to get notifications messages using Message Queuing Telemetry Transport (MQTT) protocolfootnote:[Subscribing to notifications about newly available data means that you don't need to continually to poll the data server to check for updates.]. 
+WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using the Message Queuing Telemetry Transport (MQTT) protocolfootnote:[Subscribing to notifications about newly available data means that data consumers do not need to continually poll the data server to check for updates.]. In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. A data consumer should subscribe to more than one Global Broker, thereby ensuring that notifications continue to be received if a Global Broker instance fails. -A dataset in WIS2 is associated with a unique _topic_. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D: WIS2 Topic Hierarchy). +A dataset in WIS2 is associated with a unique _topic_. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). -A data consumer can find the appropriate topic to subscribe to either by searching the Global Discovery Catalogue, using an Internet search enginefootnote:[Internet search engines allow data consumers to discover WIS2 datasets by indexing the content in the Global Discovery Catalogues.], or by browsing the topic hierarchy on a Message Broker. 
+A data consumer can find the appropriate topic to subscribe to by searching the Global Discovery Catalogue, by using an Internet search enginefootnote:[Internet search engines allow data consumers to discover WIS2 datasets by indexing the content in the Global Discovery Catalogues.], or by browsing the topic hierarchy on a Message Broker. -WIS2 uses Global Caches to distribute core data, as defined in the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). Each Global Cache republishes core data on its own highly available data server and publishes a new notification message advertising the availability of that data from the Global Cache location. +WIS2 uses Global Caches to distribute core data, as defined in the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). Each Global Cache republishes core data on its own highly available data server and publishes a new notification message advertising the availability of those data from the Global Cache location. Notifications from WIS2 Nodes and Global Caches are published on different topics: The root topic used by WIS2 Nodes is ``origin``, while the root topic used by Global Caches is ``cache``. Other than the root, the topic hierarchy is identical. For example, for synoptic weather observations published by Environment Canada: -* Environment and Climate Change Canada, Meteorological Service of Canada's WIS2 Node publishes to: ``origin/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop`` -* Global Caches publish to: ``cache/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop`` +* Environment and Climate Change Canada, Meteorological Service of Canada's WIS2 Node publishes to: ``origin/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop``; +* Global Caches publish to: ``cache/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop``. 
-As per clause 3.2.13 of the _Manual on WIS_, Volume II, data consumers should access core data from the Global Caches. Consequently, they need to subscribe to the ``cache`` topic hierarchy to receive the notifications from Global Caches, each of which provides a link (that is, URL) to download from the respective Global Cache's data server. +As per clause 3.2.13 of the _Manual on WIS_, Volume II, data consumers should access core data from the Global Caches. In order to access these data, they need to subscribe to the ``cache`` topic hierarchy. They will then receive the relevant notifications from the Global Caches, each of which will contain a link (URL) enabling them to download the relevant data from the data server of the corresponding Global Cache. ==== 1.2.3 How to use a notification message to decide whether to download data -On receipt of a notification message, a data consumer needs to decide whether to download the newly available data. The content of the notification message provides the information needed to make this decision. For details of the specification, see the _Manual on WIS_, Volume II - Appendix E: WIS2 Notification Message. +On receipt of a notification message, a data consumer needs to decide whether to download the newly available data. The content of the notification message provides the information needed to make this decision (see the _Manual on WIS_, Volume II - Appendix E. WIS2 Notification Message). -In many cases, data consumers will use a software application to determine whether or not to download the data. This section provides insight about what happens. +In many cases, data consumers will use a software application to determine whether or not to download the data. The present section explains this process. -When subscribing to multiple Global Brokers the data consumer will receive multiple copies of a notification message. Each notification message has a unique identifier, defined using the ``id`` property. 
Duplicate messages should be discarded. +When subscribing to multiple Global Brokers, the data consumers will receive multiple copies of a notification message. Each notification message has a unique identifier, defined using the ``id`` property. Duplicate messages should be discarded. -The core data will be available from both a WIS2 Node and Global Caches, each of which publishes a different notification message advertising an alternative location from where the data can be downloaded. Because these are different messages, they will have different identifiers. However, each of these messages refers to the same data object, which is uniquely identified in the notification message using the data_id property. Notification messages from different sources can easily be compared to determine if they refer to the same data. By subscribing to the cache root topic, data consumers will only receive notifications about data available from the Global Caches. The origin root topic should be used when subscribing to notifications about recommended data. Data consumers should not subscribe to the origin root topic for notifications about core data because notification messages provided on these topics will refer to data published directly on the WIS2 Nodes (referred to as, the "origin"). +Core data are available from both a WIS2 Node and the Global Caches, each of which will publish a different notification message advertising an alternative location from which the data may be downloaded. Because these are different messages, they will have different identifiers. However, each of these messages refers to the same data object, which is uniquely identified in the notification message using the data_id property. Notification messages from different sources can easily be compared to determine whether they refer to the same data. By subscribing to the cache root topic, data consumers will only receive notifications about data available from the Global Caches. 
The origin root topic should be used when subscribing to notifications about recommended data. Data consumers should not subscribe to the origin root topic for notifications about core data because the notification messages provided on these topics will refer to data published directly on the WIS2 Nodes (referred to as the "origin"). -Data consumers need to consider their strategy for managing these duplicate messages. From a data perspective, it does not matter which Global Cache instance is used – they will all provide an identical copy of the data object published by the originating WIS2 Node. The simplest strategy is to accept the first notification message and download it from the Global Cache instance that the message refers to by using a URL for the data object at that Global Cache instance. Alternatively, a data consumer may have a preferred Global Cache instance, for example, that is located in their region. Whichever Global Cache instance is chosen, data consumers will need to implement logic to discard duplicate notification messages based on ``id`` and duplicate data objects based on ``data_id``. +Data consumers need to consider their strategy for managing these duplicate messages. From a data perspective, it does not matter which Global Cache instance is used – they will all provide an identical copy of the data object published by the originating WIS2 Node. The simplest strategy is to accept the first notification message and download the data from the Global Cache instance that the message refers to by using a URL for the data object at that Global Cache instance. Alternatively, data consumers may have a preferred Global Cache instance, for example, one that is located in their region. Whichever Global Cache instance is chosen, data consumers will need to implement logic to discard duplicate notification messages based on ``id`` and duplicate data objects based on ``data_id``. 
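The duplicate-handling logic described above could be sketched as follows. This is an illustration only, under stated assumptions: the class name `Deduplicator` is hypothetical, the message is assumed to be already parsed into a dictionary, and ``id`` is assumed to sit at the top level with ``data_id`` under ``properties`` (check Appendix E of the _Manual on WIS_, Volume II for the authoritative layout). The sets grow without bound, so a real application would need some form of expiry.

```python
class Deduplicator:
    """Tracks notification identifiers and data identifiers already seen."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.seen_data_ids: set[str] = set()

    def should_download(self, message: dict) -> bool:
        """Return True only for the first notification about a data object."""
        msg_id = message["id"]
        if msg_id in self.seen_ids:
            # Same notification received again, e.g. via another Global Broker.
            return False
        self.seen_ids.add(msg_id)

        # Assumed location of data_id; adjust if the message layout differs.
        data_id = message["properties"]["data_id"]
        if data_id in self.seen_data_ids:
            # Different notification (e.g. from another Global Cache),
            # but it refers to a data object that was already accepted.
            return False
        self.seen_data_ids.add(data_id)
        return True


dedup = Deduplicator()
first = {"id": "msg-1", "properties": {"data_id": "obs-001"}}
other_cache = {"id": "msg-2", "properties": {"data_id": "obs-001"}}
print(dedup.should_download(first))        # first time this data object is seen
print(dedup.should_download(first))        # repeated notification
print(dedup.should_download(other_cache))  # same data, different Global Cache
```

This implements the "accept the first notification" strategy; a consumer with a preferred Global Cache instance would add a check on the link URL before accepting a message.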
-A notification message also provides a small amount of metadata about the data object it references such as location and time. Data consumers can use this metadata to decide if the data object referenced in the message should be downloaded. This is known as client-side filtering. +A notification message also provides a small amount of metadata about the data object it references, such as location and time. Data consumers can use these metadata to decide whether the data object referenced in the message should be downloaded. This is known as client-side filtering. -The notification message should also include the metadata identifier for the dataset to which the data object belongs. A data consumer can use the metadata identifier to search the Global Discovery Catalogue and discover more about the data - in particular, whether there are any conditions on the use of this data. +The notification message should also include the metadata identifier for the dataset to which the data object belongs. A data consumer can use the metadata identifier to search the Global Discovery Catalogue and discover more about the data - in particular, whether there are any conditions on the use of those data. ==== 1.2.4 How to download data -Links to where data can be accessed are made available through dataset discovery metadata (via the Global Discovery Catalogue) and/or data notification messages (via Global Brokers). Links can be used to directly download the data (according to the network protocol and content description provided in the link) using a mechanism appropriate to the workflow of the data consumer. This could include web and/or desktop applications, custom tooling, or other approaches. +Links to where data can be accessed are made available through dataset discovery metadata (via the Global Discovery Catalogue) and/or data notification messages (via Global Brokers). 
Links can be used to directly download the data (according to the network protocol and content description provided in the link) using a mechanism appropriate to the workflow of the data consumer. Such mechanisms could include web and/or desktop applications, custom tools and so forth. -A discovery metadata record or notification message may provide more than one download link. The preferred link will be identified as "canonical" (link relation: "rel": "canonical" footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]). +A discovery metadata record or notification message may provide more than one download link. The preferred link will be identified as "canonical" (link relation: "rel": "canonical" footnote:[See Internet Assigned Numbers Authority (IANA) Link Relations: https://www.iana.org/assignments/link-relations/link-relations.xhtml]). -Where data is provided through an interactive web service, a canonical link that provides a URL where one can directly download a data object may be complemented with an additional link providing the URL for the root of the web service from where one can interact with or query the entire dataset. +Where data are provided through an interactive web service, a canonical link containing a URL from which data consumers can directly download a data object may be complemented with an additional link providing the URL for the root of the web service from which data consumers can interact with or query the entire dataset. -If a download link implements access control (for example, the data consumer needs to take some additional action(s) to download the data object), the download link will contain a security object that provides the pertinent information (such as the access control mechanism used and where/how a data consumer would need to register to request access). 
+If a download link implements access control (for example, the data consumer needs to take some additional action(s) to download the data object), it will contain a security object that provides the pertinent information (such as the access control mechanism used and where/how a data consumer would need to register to request access). ==== 1.2.5 How to use data -Data is shared on WIS2 in accordance with the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). This data policy describes two categories of data: core and recommended. +Data are shared on WIS2 in accordance with the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). This data policy describes two categories of data: core and recommended. -* Core data is considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. -* Recommended data is exchanged on WIS2 in support of Earth system monitoring and prediction efforts. Recommended data _may_ be provided with conditions on use and/or subject to a license. +* Core data are considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data are provided on a free and unrestricted basis, without charge and with no conditions on use. +* Recommended data are exchanged on WIS2 in support of Earth system monitoring and prediction efforts. Recommended data _may_ be provided with conditions on use and/or subject to a license. -Furthermore, the Unified Data Policy (Resolution 1 (Cg-Ext (2021))) encourages attribution of the source of the data in all cases. In this way, credit is given to those who have expended effort and resources in collecting, curating, generating, or processing the data. 
Attribution provides visibility of who is using data which, for many organizations, provides necessary evidence to justify continued provision of and updates to the data. +The Unified Data Policy (Resolution 1 (Cg-Ext (2021))) encourages attribution of the source of the data in all cases. This ensures that credit is given to those who have expended effort and resources in collecting, curating, generating, or processing the data. Attribution provides visibility into who is using the data, which, for many organizations, serves as crucial evidence to justify the continued provision and updating of the data. -Details of the applicable WMO data policy and any rights or licenses associated with data are provided in the discovery metadata that accompanies the data. Discovery metadata records are available from the Global Discovery Catalogue. +Details of the applicable WMO data policy and any rights or licenses associated with the data are provided in the discovery metadata accompanying the data. Discovery metadata records are available from the Global Discovery Catalogue. -The _Manual on WIS_, Volume II – Appendix F: WMO Core Metadata Profile, section 1.18 Properties / WMO data policy provides details on how data policy, rights and/or licenses are described in the discovery metadata. +The _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2), 1.18 Properties / WMO Data Policy provides details on how the WMO Data Policy, rights and/or licenses are described in the discovery metadata. When using data from WIS2, data consumers: -* Shall respect the conditions of use applicable to the data as expressed in the WMO data policy, rights statements, or licenses. +* Shall respect the conditions of use applicable to the data as expressed in the WMO Data Policy, rights statements, or licenses; * Should attribute the source of the data.
==== 1.2.6 Further reading for data consumers -As a data consumer wanting to use data published via WIS2, as a minimum you should read the following sections: +Data consumers wanting to use data published via WIS2 should, at a minimum, read the following sections: * <<_1_1_introduction_to_wis2>> * <<_2_1_wis2_architecture>> * <<_2_2_roles_in_wis2>> * <<_2_4_components_of_wis2>> -The following specifications in the _Manual on WIS_, Volume II are useful for further reading: +The following specifications in the _Manual on WIS_, Volume II also provide useful information: -* Appendix D: WIS2 Topic Hierarchy -* Appendix E: WIS2 Notification Message -* Appendix F: WMO Core Metadata Profile +* Appendix D. WIS2 Topic Hierarchy +* Appendix E. WIS2 Notification Message +* Appendix F. WMO Core Metadata Profile (Version 2) diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 94c5aa1..4d72441 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -1,177 +1,177 @@ === 1.3 Data publisher -As a data publisher with authoritative Earth system data that you want to share with the WMO community you should read the guidance presented here. In addition, a list of references to informative material in this Guide and elsewhere is provided at the end of this section. +Data publishers wanting to share authoritative Earth system data with the WMO community should read the guidance presented here. A list of references to informative material in this Guide and elsewhere is provided at the end of this section. ==== 1.3.1 How to get started -The first thing you need to do is consider your data, how it can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether it is core or recommended data, as per the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) .
+The first step is to consider the data, how they can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether they are core or recommended data, as per the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). -Next, you need to consider where it is published. If your data relates to your country or territory, you need to publish it through a National Centre (NC). If your data relates to a region, programme, or other specialized function within WMO, you need to publish it through a Data Collection or Production Centre (DCPC). The functional requirements for NC and DCPC are described in the _Manual on WIS_, Volume II - Part III Functions of WIS. +Next, it is important to consider where the data are published. If the data relate to a specific country or territory, they should be published through a National Centre (NC). If they relate to a region, programme, or other specialized function within WMO, they should be published through a Data Collection or Production Centre (DCPC). The functional requirements for NCs and DCPCs are described in the _Manual on WIS_, Volume II - Part III Functions of WIS. -All NCs and DCPCs are affiliated with a Global Information System Centre (GISC) that has a responsibility to help establish efficient and effective data sharing on the WIS. Your GISC will be able to help you in getting your data onto WIS2. +All NCs and DCPCs are affiliated with a Global Information System Centre (GISC), which is responsible for helping to establish efficient and effective data sharing on WIS. The affiliated GISC can assist in getting the data onto WIS2. -You may be able to identify an existing NC or DCPC that can publish your data. Alternatively, you may need to establish a new NC or DCPC. The main difference is that an NC is designated by a Member, whereas a DCPC is designated by a WMO or related international programme and/or a regional association.
+It may be possible to identify an existing NC or DCPC that can publish the data. Alternatively, it may be necessary to establish a new NC or DCPC. The main distinction between these two centres is that an NC is designated by a Member, whereas a DCPC is designated by a WMO or related international programme and/or a regional association. -Both NC and DCPC require the operation of a WIS2 Node (see <<_2_4_2_wis2_node>>). The procedure for registering a new WIS2 Node is provided in <<_2_6_1_1_registration_and_decommissioning_of_a_wis2_node>>. +Both NCs and DCPCs require the operation of a WIS2 Node (see <<_2_4_2_wis2_node>>). The procedure for registering a new WIS2 Node is provided in <<_2_6_1_1_registration_and_decommissioning_of_a_wis2_node>>. -Once you have determined the scope of your datasets, the data policy that applies, and have a WIS2 Node ready for data publication, you are ready to progress to the next step: providing discovery metadata. +Once the scope of the datasets has been determined, the applicable data policy has been identified, and a WIS2 Node is ready for data publication, the process can proceed to the next step: providing discovery metadata. ==== 1.3.2 How to provide discovery metadata to WIS2 -Discovery metadata is the mechanism by which you tell potential consumers about your data, how it may be accessed, and any conditions you may place on the use of the data. +Discovery metadata is the mechanism by which data publishers tell potential consumers about their data, how they may be accessed, and any conditions they may place on the use of those data. -Each dataset you want to publish must have an associated discovery metadata record. This record is encoded as GeoJSON (RFC 7946footnote:[RFC 7946 - The GeoJSON Format: https://datatracker.ietf.org/doc/html/rfc7946]) must conform to the specification given in the _Manual on WIS_, Volume II - Appendix F: WMO Core Metadata Profile.
+Each dataset that is published must have an associated discovery metadata record. This record is encoded as GeoJSON (RFC 7946footnote:[See RFC 7946 - The GeoJSON Format: https://datatracker.ietf.org/doc/html/rfc7946.]) and must conform to the specification given in the _Manual on WIS_, Volume II - Appendix F. WMO Core Metadata Profile (Version 2). -Copies of all discovery metadata records from WIS2 are held at the Global Discovery Catalogues, where data consumers can search and browse to find data that is of interest to them. +Copies of all discovery metadata records from WIS2 are held in the Global Discovery Catalogues, where data consumers can search and browse to find data that are of interest to them. -Depending on local arrangements, your GISC may be able to help you transfer your discovery metadata record(s) to the Global Discovery Catalogues. If this is not the case, you will need to publish the discovery metadata record(s) yourselffootnote:[In the future, WIS2 may provide metadata publication services (such as, through a WIS2 metadata management portal) to assist with this task. However, such a service is not currently available.] using one of two ways: +Depending on local arrangements, the affiliated GISC may be able to assist in transferring discovery metadata record(s) to the Global Discovery Catalogues. If this is not the case, data publishers will need to publish the discovery metadata record(s) themselvesfootnote:[In the future, WIS2 may provide metadata publication services (for example, through a WIS2 metadata management portal) to assist with this task. However, such services are not currently available.] using one of two methods: -* The simplest method is to encode the discovery metadata record as a file and publish it to an HTTP server where it can be accessed with a URL. +* The simplest method is to encode the discovery metadata record as a file and publish it to an HTTP server, where it can be accessed with a URL.
-* Alternatively, you may operate a local metadata catalogue through which discovery metadata records can be shared using an API (for example, Open Geospatial Consortium (OGC) API – Recordsfootnote:[OGC API - Records - Part 1: Core https://docs.ogc.org/DRAFTS/20-004.html]). Each discovery metadata record can be accessed with a unique URL via the API (for instance, an item that is part of the discovery metadata catalogue). +* Alternatively, a data publisher may operate a local metadata catalogue through which discovery metadata records can be shared using an API (for example, OGC API – Recordsfootnote:[See OGC API - Records - Part 1: Core https://docs.ogc.org/DRAFTS/20-004.html.]). Each discovery metadata record (for instance, an item that is part of the discovery metadata catalogue) can be accessed with a unique URL via the API. -In both cases, a notification message needs to be published on a Message Broker that tells WIS2 there is a new discovery metadata to upload and that it is accessed at the specified URLfootnote:[Both data and metadata publication use the same notification message mechanism to advertise the availability of a new resource.]. The notification messages shall conform to the specification given in the _Manual on WIS_, Volume II - Appendix E: WIS2 Notification Message. Furthermore, the notification message must be published on a topic that conforms to the specification given in _Manual on WIS_, Volume II - Appendix D: WIS2 Topic Hierarchy. For example, metadata published by Deutscher Wetterdienst would use the following topic: ``origin/a/wis2/de-dwd/metadata/core``.
+In both cases, a notification message needs to be published on a Message Broker that tells WIS2 that there is a new discovery metadata record to upload and that it can be accessed at the specified URL.footnote:[Both data and metadata are published using the same notification message mechanism to announce the availability of new resources.] Notification messages shall conform to the specification given in the _Manual on WIS_, Volume II - Appendix E. WIS2 Notification Message. They must also be published on a topic that conforms to the specification given in the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy. For example, metadata published by Deutscher Wetterdienst would use the following topic: ``origin/a/wis2/de-dwd/metadata/core``. -These discovery metadata records are then propagated through the Global Service components into to the Global Discovery Catalogue where data consumers can search and browse for datasets of interest. +These discovery metadata records are then propagated through the Global Service components into the Global Discovery Catalogue, where data consumers can search and browse for datasets of interest. -Upon receipt of a new discovery metadata record, a Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>) will validate, assess, ingest, and publish the record. Validation ensures that your discovery metadata record complies with the specification. The assessment examines your discovery metadata record against good practice. The Global Discovery Catalogue will notify you if your discovery metadata record fails validation and provide recommendations for improvements for you to consider. +Upon receipt of a new discovery metadata record, a Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>) will validate, assess, ingest, and publish the record. Validation ensures compliance with the specification, while the assessment evaluates the discovery record against good practices. 
The Global Discovery Catalogue will notify the data publisher if the discovery metadata record fails validation and provide recommendations for improvements. -Discovery metadata must be published in the Global Discovery Catalogues before you begin publishing data. +Discovery metadata must be published in the Global Discovery Catalogues before the data are published. ==== 1.3.3 How to provide data to WIS2 -WIS2 is based on the web architecturefootnote:[Architecture of the World Wide Web https://www.w3.org/TR/webarch/]. As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986]), and the unique identifier can be used to resolve the resource identified and interact with it (for example, to download a representation of the resource over an open standard protocol such as HTTP). +WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such, it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986.]), which can be used to resolve the identified resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP).
-Simply, you provide data (and metadata) to WIS2 by assigning it a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network "location"). RFC 3986], and make it available via a data server - most typically a web server using the HTTP protocolfootnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS) wherein the communication protocol is encrypted using Transport Layer Security (TLS)]. It is up to the data server to decide what to provide when resolving the identifier: the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables a user to request just the parts of the dataset they need by specifying query parameters. +In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network "location"). RFC 3986], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS) wherein the communication protocol is encrypted using Transport Layer Security (TLS)] It is up to the data server to decide what to provide when resolving the identifier. 
For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters. The following sections cover specific considerations relating to publishing data to WIS2. ===== 1.3.3.1 Data formats and encodings -Whether providing data as files or through interactive APIs you need to decide which encodings (in other words, _data formats_) to use. WMO technical regulations may require that data be encoded in specific formats. For example, synoptic observations must be encoded in Binary universal form for the representation of meteorological data (BUFR). The https://library.wmo.int/idurl/4/35625[_Manual on Codes_] (WMO-No. 306) provides details of data formats formally approved for use in WMO. However, technical regulations do not cover all data sharing requirements. In such cases, you should select data formats that are open, non-proprietary, widely adopted and understood in their target user community. In this context, “open” means that anyone can use the format without needing a license to do so – either to encode data in that format or write software that understands the format. +Whether providing data as files or through interactive APIs, data publishers need to decide which encodings (_data formats_) to use. WMO technical regulations may require that data be encoded in specific formats. For example, synoptic observations must be encoded in Binary Universal Form for the Representation of meteorological data (BUFR). The https://library.wmo.int/idurl/4/35625[_Manual on Codes_] (WMO-No. 306) provides details of data formats formally approved for use in WMO. 
However, the technical regulations do not cover all data sharing requirements. In such cases, data publishers should select data formats that are open, non-proprietary, widely adopted, and understood in the target user community. In this context, “open” means that anyone can use the format without needing a license – either to encode data in that format or to write software that understands it. ===== 1.3.3.2 Providing data as files -The simplest way to publish data through WIS2 is to persist your data as files and publish those files on a web server. All these files need to be organized somehow – perhaps in a flat structure or grouped into collections that resemble folders or directory structures. +The simplest way to publish data through WIS2 is to persist the data as files and publish those files on a web server. All these files need to be organized in some manner, for example, in a flat structure or grouped into collections that resemble folders or directory structures. -To make your data usable, your users need to be able to find the specific file (or files) they need. +To ensure that the data are usable, users need to be able to find the specific file (or files) they need. -Naming conventions for files and/or directories are useful – but only when the scheme is understood. If users do not understand the naming convention, it will be a barrier to widespread reuse; many users will simply treat the filename as an opaque string. Where communities commonly use file-naming conventions (such as names with embedded metadata), these should only be used when adequate documentation is provided to users. +Naming conventions for files and/or directories are useful – but only if they are understood. If users do not understand the naming convention, it will be a barrier to widespread reuse, as many users will simply treat the filename as an opaque string.
Where file naming conventions (such as names with embedded metadata) are commonly used by communities, they should only be used when adequate documentation is provided to users. WIS2 does not require the use of specific naming conventions. -Another mechanism to consider is complementing the collections (such as, directories or folders in which files are grouped) with information that describes their content. Then users,both humans and software agents, can browse the structure and find what they need. Examples of this approach include: +Another approach to enhance the usability of the data is to complement the collections (such as directories or folders in which files are grouped) with information that describes their content. Then users, both humans and software agents, can browse the structure and find what they need. Examples of this approach include: -* Web Accessible Folders (WAF) and "README" files: a web-based folder structure listing the data object files by name, where each folder contains a formatted "README" file describing the folder contents. -* SpatioTemporal Asset Catalog (STAC)footnote:[Spatio Temporal Asset Catalogue (STAC) https://stacspec.org/en]: a community standard based on GeoJSON to describe geospatial data files that can be easily indexed, browsed and accessed. Free and open source tools present STAC records (one for each data object file) through a web-based, browseable user interface. +* Web Accessible Folders (WAF) and README files: A web-based folder structure listing the data object files by name, where each folder contains a formatted README file describing the folder contents; +* SpatioTemporal Asset Catalog (STAC):footnote:[See STAC: SpatioTemporal Asset Catalogs: https://stacspec.org/en.] A community standard based on GeoJSON to describe geospatial data files that can be easily indexed, browsed and accessed. Free and open source tools present STAC records (one for each data object file) through a web-based, browsable user interface.
-When publishing collections of data it is tempting to package content into zip or submission information package (SIP)footnote:[See https://www.iasa-web.org/tc04/submission-information-package-sip or end of https://www.eumetsat.int/formats] resources - perhaps even packaging the entire collection, complete with folders, into a single resource. Similarly, WMO formats such as GRIB and BUFR allow multiple data objects (such as, fields or observations) to be packed into a single file. Only having to download a single resource is convenient for many users, but the downside is that the user must download the entire resource and then unpack/decompress it. The convenience of downloading fewer resources needs to be balanced against the cost of forcing users to download data they may not need. Whatever your choice, you should be guided by common practice in your domain - for example, only zip, SIP, or pack if your users expect it. +When publishing collections of data, it is tempting to package content into zip or submission information package (SIP)footnote:[See https://www.iasa-web.org/tc04/submission-information-package-sip or https://user.eumetsat.int/resources/user-guides/formats.] resources - perhaps even to package the entire collection, including folders, into a single resource. Similarly, WMO formats such as GRIB and BUFR allow multiple data objects (such as fields or observations) to be packed into a single file. Downloading a single resource is convenient for many users, but the downside is that the user must download the entire resource and then unpack/decompress it. The convenience of downloading fewer resources must be balanced against the cost of forcing users to download data they may not need. The decision should be guided by common practice in the specific domain - for example, only using zip files, SIP resources, or packing files if this is what the users expect. 
===== 1.3.3.3 Providing interactive access to data with APIs -Interactive data access aims to support efficient data workflows by enabling client applications to request only the data that they need. The advantage of interactive data access is that it provides more flexibility. Data publishers can offer an API structured around how users want to work with the data rather than forcing them to work with the structure that is convenient for you as a data publisher. +Interactive data access aims to support efficient data workflows by enabling client applications to request only the data they need. The advantage of interactive data access is that it provides greater flexibility. Data publishers can offer an API structured around how users want to work with the data rather than forcing them to work with the structure that is convenient for the data publisher. -But it is more complex to implement. You need a server running software that can: +However, interactive data access is complex to implement. It requires a server running software that can: 1. Interpret a user's request; -2. Extract the data from wherever it is stored; -3. Package that data up and send it back to the user. +2. Extract the data from wherever they are stored; +3. Package those data and send them back to the user. -Importantly, when considering the use of interactive APIs to serve your data you need to plan for costs: every request to an interactive API requires computational resources to process. +Importantly, when considering the use of interactive APIs to serve data, it is necessary to plan for costs: every request to an interactive API requires computational resources to process. 
-Based on the experience of data publishers who have been using web APIs to serve their communities, this Guide makes the following recommendations about interactive APIs: +Based on the experience of data publishers that have been using web APIs to serve their communities, this Guide makes the following recommendations regarding interactive APIs: -* First, interactive APIs should be self-describing. A data consumer should not need to know, a priori, how to make requests from an API. They should be able to discover this information from the API endpoint itself – even if this is just a link to a documentation page they need to read. -* Second, APIs should comply with OpenAPIfootnote:[OpenAPI Specification https://spec.openapis.org/oas/v3.1.0] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) is widely available that can read this metadata and automatically generate client applications to query the API. -* Third, the OGC has developed a suite of APIsfootnote:[Open Geospatial Consortium OGC API https://ogcapi.ogc.org/] (called "OGC APIs") that are designed specifically to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR)footnote:[OGC API - Environmental Data Retrieval (EDR) https://ogcapi.ogc.org/edr], OGC API – Featuresfootnote:[OGC API - Features https://ogcapi.ogc.org/features], and OGC API - Coveragesfootnote:[OGC API - Coverages https://ogcapi.ogc.org/coverages] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. We recommend that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. +* First, interactive APIs should be self-describing. 
Data consumers should not need to know, a priori, how to make requests from an API. They should be able to discover this information from the API endpoint itself – even if this simply entails a link to a documentation page they need to read. +* Second, interactive APIs should comply with OpenAPIfootnote:[See OpenAPI Specification v3.1.0: https://spec.openapis.org/oas/v3.1.0.] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) that can read this metadata and automatically generate client applications to query the API is widely available. +* Third, the OGC has developed a suite of APIsfootnote:[Open Geospatial Consortium OGC API https://ogcapi.ogc.org/] (called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR),footnote:[OGC API - Environmental Data Retrieval (EDR) https://ogcapi.ogc.org/edr] OGC API – Features,footnote:[OGC API - Features https://ogcapi.ogc.org/features] and OGC API - Coveragesfootnote:[OGC API - Coverages https://ogcapi.ogc.org/coverages] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. -Finally, you should consider versioning your API to avoid breaking changes when adding new features. A common approach is adding a _version number_ prefix into the API path; for example, ``/v1/service/{rest-of-path}`` or ``/service/v1/{rest-of-path}``. +Finally, it is advisable to consider versioning the API to avoid breaking changes when adding new features. 
A common approach is to add a _version number_ prefix into the API path, for example, ``/v1/service/{rest-of-path}`` or ``/service/v1/{rest-of-path}``. More guidance on the use of interactive APIs in WIS2 is anticipated in future versions of this Guide. -===== 1.3.3.4 Providing data in (near) real-time +===== 1.3.3.4 Providing data in (near) real time WIS2 is designed to support the data sharing needs of all WMO disciplines and domains. Among these, the World Weather Watch footnote:[WMO World Weather Watch https://wmo.int/world-weather-watch] drives specific needs for the rapid exchange of data to support weather forecasting. -To enable real-time data sharingfootnote:[In the context of WIS2, real-time implies anything from a few seconds to a few minutes - not the milliseconds required by some applications.], WIS2 uses notification messages to advertise the availability of a new resource, either data or discovery metadata, and how to access that resource. Notification messages are published to a queue on a Message Broker in your WIS2 Nodefootnote:[WIS2 ensures rapid global distribution of notification messages using a network of Global Brokers which subscribe to message brokers of WIS2 Nodes and republish notification messages (see <<_2_4_2_Global_Broker>>).] using the MQTT protocol and immediately delivered to everyone subscribing to that queue. A queue is associated with a specific _topic_, such as a dataset. +To enable real-time data sharing,footnote:[In the context of WIS2, real time implies anything from a few seconds to a few minutes - not the milliseconds required by some applications.] WIS2 uses notification messages to inform users of the availability of a new resource, either data or discovery metadata, and how they can access that resource. 
Notification messages are published to a queue on a Message Broker in a data publisher's WIS2 Nodefootnote:[WIS2 ensures the rapid global distribution of notification messages using a network of Global Brokers which subscribe to the Message Brokers of WIS2 Nodes and republish notification messages (see <<_2_4_2_Global_Broker>>).] using the MQTT protocol and immediately delivered to all users subscribing to that queue. A queue is associated with a specific _topic_, such as a dataset. -For example, when a new temperature profile from a radio sonde deployment is added to a dataset of upper-air data measurements, a notification message will be published that includes the URL used to access the new temperature profile data. Everyone subscribing to notification messages about the upper-air measurement dataset would receive the notification message, identify the URL and download the new temperature profile data. +For example, when a new temperature profile from a radiosonde deployment is added to a dataset of upper-air data measurements, a notification message will be published that includes the URL used to access the new temperature profile data. All subscribers to notification messages about the upper-air measurement dataset will receive the notification message and be able to identify the URL and download the new temperature profile data. -Optionally, data may be embedded in a notification message using a ``content`` object _in addition_ to publishing via the data server. Inline data must be encoded as ``UTF-8``, ``Base64``, or ``gzip``, and must not exceed 4096 bytes in length once encoded. +Optionally, data may be embedded in a notification message using a content object in addition to being published via the data server. Inline data must be encoded as ``UTF-8``, ``Base64``, or ``gzip``, and must not exceed 4096 bytes in length once encoded. 
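The size limit on inline data can be checked before a notification message is assembled. The following Python sketch is illustrative only: the function name `build_inline_content` and the keys of the returned dictionary are assumptions made for this example and are not taken from the WIS2 Notification Message specification, which remains the authoritative source for the structure of the ``content`` object. Only the ``UTF-8`` and ``Base64`` encodings are handled here; ``gzip`` is omitted for brevity.

```python
import base64

# Maximum length of inline data once encoded, as stated in the text above.
MAX_INLINE_BYTES = 4096


def build_inline_content(data: bytes, encoding: str = "base64"):
    """Return a dictionary describing inline data, or None if the data are too large.

    The dictionary keys ("encoding", "value") are hypothetical names used for
    illustration; consult the WIS2 Notification Message specification for the
    normative property names.
    """
    if encoding == "utf-8":
        value = data.decode("utf-8")
    elif encoding == "base64":
        value = base64.b64encode(data).decode("ascii")
    else:
        raise ValueError(f"unsupported encoding in this sketch: {encoding}")

    # The limit applies to the encoded form, so measure after encoding.
    if len(value.encode("utf-8")) > MAX_INLINE_BYTES:
        return None  # too large: publish via the data server only

    return {"encoding": encoding, "value": value}
```

A return value of `None` signals that the data object should be offered only through the download link. Note that Base64 encoding increases the size by roughly one third, so a binary payload of about 3 KB is already close to the limit.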
-Notification messages are encoded as GeoJSON (RFC 7946) and must conform to the _Manual on WIS_, Volume II, Appendix E: WIS2 Notification Message. +Notification messages are encoded as GeoJSON (RFC 7946) and must conform to the _Manual on WIS_, Volume II, Appendix E. WIS2 Notification Message. -The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile) rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. If you are providing your data through an interactive API, you might provide a "canonical" link (designated with link relation: ``"rel": "canonical"``footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]), and an additional link providing the URL for the root of the web service from where one can interact with or query the entire dataset. +The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile), rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. When providing data through an interactive API, it may be useful to provide a "canonical" link (designated with link relation: ``"rel": "canonical"``footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]) and an additional link with the URL for the root of the web service from which the entire dataset can be accessed or queried. -You should include the dataset identifier in the notification message (``metadata_id`` property). This allows data consumers receiving the notification to cross reference with information provided in the discovery metadata for the dataset, such as the conditions of use specified in the data policy, rights, or license. +The dataset identifier should be included in the notification message (``metadata_id`` property). 
This allows data consumers receiving the notification to cross reference it with information provided in the discovery metadata for the dataset, for example the conditions of use specified in the data policy, rights, or license. -Furthermore, if you have implemented controlled access to your data (such as, the use of an API key), you should include a security object in the download link that provides the pertinent information (for example, the access control mechanism used, and where or how a data consumer would need to register to request access). +If controlled access to the data has been implemented (for example, the use of an API key), the download link should include a security object which provides the pertinent information (the access control mechanism used, where or how a data consumer needs to register to request access, and so forth). -To ensure that data consumers can easily find the topics they want to subscribe to, data publishers must publish to an authorized topic, as specified in the _Manual on WIS_, Volume II, Appendix D: WIS2 Topic Hierarchy. +To ensure that data consumers can easily find the topics they want to subscribe to, data publishers must publish to an authorized topic, as specified in the _Manual on WIS_, Volume II, Appendix D. WIS2 Topic Hierarchy. -If your data seems to relate to more than one topic, select the most appropriate one. The topic hierarchy is not a knowledge organization system - it is only used to ensure the uniqueness of topics for publishing notification messages. Discovery metadata is used to describe a dataset and its relevance to additional disciplines; each dataset is mapped to one, and only one, topic. +If the data seem to relate to more than one topic, the most appropriate one should be selected. The topic hierarchy is not a knowledge organization system – it is used solely to ensure the uniqueness of topics for publishing notification messages. 
Discovery metadata is used to describe a dataset and its relevance to additional disciplines; each dataset is mapped to one, and only one, topic. -If the WIS2 Topic Hierarchy does not include a topic appropriate for your data, you should publish on an _experimental_ topic. This allows for data exchange to be established while the formalities are consideredfootnote:[The "experimental" topic is necessary for the WIS2 pre-operational phase and future pre-operational data exchange in test mode.]. Experimental topics are provided for each Earth-system discipline at level eight in the topic hierarchy (for example, ``origin/a/wis2/{centre-id}/data/{earth-system-discipline}/experimental/``). Data publishers can extend the experimental branch with subtopics as they deem appropriate. Experimental topics are subject to change and will be removed once they are no longer needed. For more information, see _Manual on WIS_, Volume II, Appendix D: WIS2 Topic Hierarchy, section 1.2 Publishing guidelines. +If the WIS2 Topic Hierarchy does not include a topic appropriate for the data, the data should be published on an experimental topic. This will allow data exchange to be established while the formalities are being considered.footnote:[The "experimental" topic is necessary for the WIS2 pre-operational phase and future pre-operational data exchange in test mode.] Experimental topics are provided for each Earth system discipline at level eight in the topic hierarchy (for example, ``origin/a/wis2/{centre-id}/data/{earth-system-discipline}/experimental/``). Data publishers can extend the experimental branch with subtopics they deem appropriate. Experimental topics are subject to change and will be removed once they are no longer needed. For more information, see _Manual on WIS_, Volume II, Appendix D. WIS2 Topic Hierarchy, section 1.2 Publishing guidelines. 
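The experimental topic pattern given above can be assembled programmatically. The following Python sketch assumes a hypothetical helper named `experimental_topic`; the centre identifier and subtopic names in the usage note are placeholders, not registered values. The helper returns the topic without a trailing slash and rejects the MQTT wildcard characters, which are not valid in a topic used for publishing.

```python
def experimental_topic(centre_id: str, discipline: str, *subtopics: str) -> str:
    """Build a topic on the experimental branch of the WIS2 Topic Hierarchy.

    Follows the pattern quoted in the text:
    origin/a/wis2/{centre-id}/data/{earth-system-discipline}/experimental/
    Any subtopics are appended after "experimental".
    """
    parts = ["origin", "a", "wis2", centre_id, "data", discipline,
             "experimental", *subtopics]
    for part in parts:
        # Each level must be non-empty and free of separators and MQTT wildcards.
        if not part or any(ch in part for ch in "/+#"):
            raise ValueError(f"invalid topic level: {part!r}")
    return "/".join(parts)
```

For instance, `experimental_topic("xx-example-centre", "weather", "my-trial-product")` yields `origin/a/wis2/xx-example-centre/data/weather/experimental/my-trial-product`, where all three arguments are illustrative values.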
-Whatever topic is used, the discovery metadata provided to the Global Discovery Catalogue must include subscription links using that topicfootnote:[The Global Discovery Catalogue will reject discovery metadata records containing links to topics outside the official topic-hierarchy.]. The Global Broker will only republish notification messages on topics specified in the discovery metadata records. +Whatever topic is used, the discovery metadata provided to the Global Discovery Catalogue must include subscription links using that topic.footnote:[The Global Discovery Catalogue will reject discovery metadata records containing links to topics outside the official topic-hierarchy.] The Global Broker will only republish notification messages on topics specified in the discovery metadata records. ===== 1.3.3.5 Considerations when providing core data in WIS2 -Core data, as specified in the Unified Data Policy (Resolution 1 (Cg-Ext 2021)) is considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. +Core data, as specified in the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. -WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24-hoursfootnote:[A Global Cache provides short-term hosting of data. 
Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]). Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. +WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours).footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.] Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. -The URL included in a notification message that is used to access core data from a WIS2 Node, or the "canonical" URL if multiple URLs are provided, must: +The URL included in a notification message that is used to access core data from a WIS2 Node, or the "canonical" URL, if multiple URLs are provided, must: 1. Refer to an individual data object; and 2. Be directly resolvable, such that the data object can be downloaded simply by resolving the given URL without further action. A Global Cache will download and cache the data object accessed via this URL. 
-The Global Caches are designed to support Members efficiently share real-time and near real-time data; they take on the task of making sure that core data is available to all on a free and unrestricted basis, as required by the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))). +The Global Caches are designed to help Members efficiently share real-time and near-real-time data. They ensure that core data are available to all on a free and unrestricted basis, as required by the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))). -Unfortunately, Global Caches cannot republish _all_ core data: there is a limit to how much data they can afford to serve. Currently, a Global Cache is expected to cache about 100 GB of core data each day. +Unfortunately, Global Caches cannot republish _all_ core data; there is a limit to how much data they can afford to serve. Currently, a Global Cache is expected to cache about 100 GB of core data each day. -If frequent updates to your dataset are very large (for example, weather prediction models or remote sensing observations) you will need to share the burden of distributing your data with the Global Cache operators. You should work with your GISC to determine the highest priority elements of your core datasets that will be republished by the Global Caches. +If frequent updates to a dataset are very large (for example, in the case of weather prediction models or remote sensing observations) data publishers will need to share the burden of distributing their data with Global Cache operators. They should work with their GISC to determine the highest priority elements of their datasets that will be republished by the Global Caches. -For core data that is not to be cached, you must set the ``cache`` property in the notification message to ``false``footnote:[Default value for the ``cache`` property is ``true``; omission of the property will result in the data object being cached.]. 
+Core data that are not to be cached must have the ``cache`` property in the notification message set to ``false``.footnote:[Default value for the ``cache`` property is ``true``; omission of the property will result in the data object being cached.] -You must ensure that core data that is not cached is publicly accessible from your WIS2 Node, that is, with no access control mechanisms in place. +Data publishers must ensure that core data that are not cached are publicly accessible from their WIS2 Node, that is, with no access control mechanisms in place. -A Global Cache operator may choose to disregard your cache preference - for example, if they feel that the content you are providing is large enough to impede the provision of caching services for other Membersfootnote:[Excessive data volume is not the only reason they may refuse to cache content. Other reasons include: too many small files, unreliable download from a WIS2 Node, etc.]. In such cases, the Global Cache operator will log this behaviour. In collaboration with the Global Cache operators, your GISC will work with you to resolve concerns. +Global Cache operators may choose to disregard a cache preference, for example, if they feel that the content being provided is large enough to impede the provision of caching services for other Members.footnote:[Excessive data volume is not the only reason they may refuse to cache content. Other reasons include: too many small files, unreliable download from a WIS2 Node, etc.] In such cases, the Global Cache operator will log this behaviour. In collaboration with the Global Cache operators, the relevant GISC will work with the data publisher to resolve concerns. -Finally, please note that Global Caches are under no obligation to cache data published on _experimental_ topics. For such data, the ``cache`` property should be set to ``false``. +Finally, note that Global Caches are under no obligation to cache data published on _experimental_ topics. 
For such data, the ``cache`` property should be set to ``false``. ===== 1.3.3.6 Implementing access control -Recommended data, as defined in the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021)), is exchanged on WIS2 in support of Earth system monitoring and prediction efforts and _may_ be provided with conditions on use. This means that you may control access to recommended data. +Recommended data, as defined in the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))), are exchanged on WIS2 in support of Earth system monitoring and prediction efforts and may be provided with conditions on use. This means that the data publisher may control access to recommended data. -Access control should use only the "security schemes" for authentication and authorization specified in OpenAPIfootnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object]. +Access control should only use the "security schemes" for authentication and authorization specified in OpenAPI.footnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object] -Where access control is implemented, you should include a ``security`` object in download links provided in discovery metadata and notification messages that provide the user with pertinent information about the access control mechanism used and where/how they might register to request access. +Where access control is implemented, a ``security`` object should be included in the download links in discovery metadata and notification messages to provide the user with pertinent information about the access control mechanism used and where/how they might register to request access. -Recommended data is never cached by the Global Caches. +Recommended data are never cached by the Global Caches. -Use of core data must always be free and unrestricted. 
However, you may need to leverage existing systems with built-in access control when implementing the download service for your WIS2 Node. +The use of core data must always be free and unrestricted. However, it may be necessary to leverage existing systems with built-in access control when implementing the download service for the WIS2 Node. -Example 1: API key. Your data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for data publication via WIS2 so that usage can be tracked.]footnote:[Given that users are encouraged to download Core data from the Global Cache, there will likely be only a few accesses using the WIS2 account's API key. If the usage quota for the WIS2 account is exceeded (for instance, further data access is blocked) then this should encourage users to download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] +Example 1: API key. The data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for data publication via WIS2 so that usage can be tracked.], footnote:[Given that users are encouraged to download Core data from the Global Cache, there will likely be only a few accesses using the WIS2 account's API key. If the usage quota for the WIS2 account is exceeded (for instance, further data access is blocked) then this should encourage users to download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] -Example 2: Presigned URLs. Your data server uses a cloud-based object store that requires credentials to be provided when downloading data. 
The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24-hours).footnote:[Working with presigned URLs on Amazon S3 https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html] +Example 2: Presigned URLs. The data server uses a cloud-based object store that requires credentials to be provided when downloading data. The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24 hours).footnote:[Working with presigned URLs on Amazon S3 https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html] -In both cases, the URL provided in a notification message can be directly resolved without a user, or Global Cache, needing to take additional action such as providing credentials or authenticating. +In both cases, the URL provided in a notification message can be directly resolved without requiring a user or a Global Cache to take additional action, such as providing credentials or authenticating. -Finally, note that if you are only publishing core data, you may be able to entirely rely on the Global Caches to distribute your data. In such cases, your WIS2 Node may use Internet Protocol (IP)filtering to allow access only from Global Services. For more details, see section 2.6 Implementation and operation of a WIS2 Node. +Finally, note that if only core data are being published, it may be possible to rely entirely on the Global Caches to distribute the data. In such cases, the WIS2 Node may use Internet Protocol (IP) filtering to allow access only from Global Services. For more details, see 2.6 Implementation and operation of a WIS2 Node. 
===== 1.3.3.7 Providing access to data archives -There is no requirement for a WIS2 Node to publish notification messages about newly available data, however, the mechanism is available if needed (for instance, for real-time data exchange). Data archives published through WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records into a climate observation archive). +There is no requirement for a WIS2 Node to publish notification messages about newly available data; however, the mechanism is available if needed (for instance, for real-time data exchange). Data archives published through WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records into a climate observation archive). -However, notification messages must still be used to share discovery metadata with WIS2. Given that the provision of metadata and subsequent updates is likely to be infrequent, it may be sufficient to "handcraft" a notification message and publish it locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free (no cost) starter plan sufficient for infrequent publications of notifications about metadata. These provide a viable alternative to implementing an MQTT broker instance yourself.] or with help from a GISC. See above for more details on publishing discovery metadata to WIS2. +However, notification messages must still be used to share discovery metadata with WIS2. 
Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to "handcraft" notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free (no cost) starter plan sufficient for infrequent publications of notifications about metadata. These provide a viable alternative to implementing an MQTT broker instance yourself.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. -Note that some data archives are categorized as core data; for example, Essential Climate Variables. Core data may be distributed via the Global Caches. However, given that these provide only short-term hosting of data (for instance, 24-hours), Global Caches are not an appropriate mechanism to provide access to archives of core data. The archive must be accessed directly via the WIS2 Node. +Note that some data archives, for example, Essential Climate Variables, are categorized as core data. Core data may be distributed via the Global Caches; however, given that they provide only short-term data hosting (for instance, for 24 hours), Global Caches are not an appropriate mechanism to provide access to core data archives. These archives must be accessed directly via the WIS2 Node. 
==== 1.3.4 Further reading for data publishers -As a data publisher planning to operate a WIS2 Node, as a minimum you should read the following sections: +Data publishers planning to operate WIS2 Nodes should, at a minimum, read the following sections: * <<_1_1_introduction_to_wis2>> * <<_2_1_wis2_architecture>> @@ -179,21 +179,20 @@ As a data publisher planning to operate a WIS2 Node, as a minimum you should rea * <<_2_4_components_of_wis2>> * <<_2_6_implementation_and_operation_of_a_wis2_node>> -The following sections are useful for further reading: +The following sections are recommended for further reading: * <<_3_1_information_management>> * <<_4_1_security>> * <<_5_1_competencies>> -Note that sections _4.1. Security_ and _5.1. Competencies_ reference content originally published for WIS1. These remain largely applicable and will be updated in subsequent releases of this Guide. +Note that _4.1. Security_ and _5.1. Competencies_ reference content originally published for WIS1. These sections remain largely applicable and will be updated in subsequent releases of this Guide. -If you are publishing aviation weather data via WIS2 for onward transmission through the International Civil Aviation Organization (ICAO) System Wide Information Management (SWIM), you should also read: -* <<_2_8_8_1_publishing_aviation_weather_data_through_wis2_into_icao_swim>>. +Data publishers who publish aviation weather data via WIS2 for onward transmission through the International Civil Aviation Organization (ICAO) System Wide Information Management (SWIM) should also read <<_2_8_8_1_publishing_aviation_weather_data_through_wis2_into_icao_swim>>. -Finally, you should also review the specifications in the _Manual on WIS_, Volume II: +Finally, data publishers should also review the specifications in the _Manual on WIS_, Volume II: -* Appendix D: WIS2 Topic Hierarchy -* Appendix E: WIS2 Notification Message -* Appendix F: WMO Core Metadata Profile 2 +* Appendix D.
WIS2 Topic Hierarchy +* Appendix E. WIS2 Notification Message +* Appendix F. WMO Core Metadata Profile (Version 2) // include::sections/wis2node.adoc[] From 42a126f0208fae2a06f5c4737ddfa1e439e40ae4 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Fri, 4 Oct 2024 15:21:53 +0200 Subject: [PATCH 03/20] update part 2 architecture --- guide/sections/part2/wis2-architecture.adoc | 104 ++++++++++---------- 1 file changed, 52 insertions(+), 52 deletions(-) diff --git a/guide/sections/part2/wis2-architecture.adoc b/guide/sections/part2/wis2-architecture.adoc index 124153f..078c230 100644 --- a/guide/sections/part2/wis2-architecture.adoc +++ b/guide/sections/part2/wis2-architecture.adoc @@ -1,12 +1,12 @@ === 2.1 WIS2 architecture -WIS2 is a federated system of systems based on web-architecture and open standards, comprising of many WIS2 Nodes for publishing data and Global Services that enable fault tolerant, highly available, low latency data distribution. +WIS2 is a federated system of systems based on web architecture and open standards, comprising many WIS2 Nodes for publishing data and Global Services that enable fault-tolerant, highly available, low-latency data distribution. -National Centres (NC), Data Collection or Production Centres (DCPC), and Global Information System Centres (GISC) are all types of WIS centre. +NCs, DCPCs, and GISCs are all types of WIS centres. NCs and DCPCs operate WIS2 Nodes. -GISCs coordinate the operation of WIS within their Area of Responsibility (AoR) and ensure the smooth operation of the WIS2 system. +GISCs coordinate the operation of WIS within their area of responsibility (AoR) and ensure the smooth operation of the WIS2 system. A WIS centre may also operate one or more Global Services. @@ -16,51 +16,51 @@ WIS centres shall comply with the technical regulations defined in the _Manual o When describing the functions of WIS2 there are four roles to consider: -. Data publisher -. Global coordinator -. Global Service operator -.
Data consumer +. Data publisher; +. Global coordinator; +. Global Service operator; +. Data consumer. These roles are outlined below. ==== 2.2.1 Data publisher -* This role is fulfilled by NC and DCPC. +* This role is fulfilled by NCs and DCPCs. * Data publishers operate a WIS2 Node to share their data within the WIS2 ecosystem. * Data publishers manage, curate, and provide access to one or more datasets. * For each dataset, a data publisher provides: - i) "Discovery metadata" to describe the dataset, provide details on how it can be accessed, and under what conditions. - ii) An API or web-service to access (or interact with) the dataset. + i) Discovery metadata to describe the dataset and provide details on how it can be accessed and under what conditions; + ii) An API or web service to access or interact with the dataset; iii) Notification messages advertising the availability of new data and metadata. ==== 2.2.2 Global coordinator * This role is exclusive to GISCs. -* All GISCs supporting WMO Members in their AoR fulfill their data sharing obligations via WIS2. +* All GISCs supporting WMO Members in their AoR fulfil their data sharing obligations via WIS2. ==== 2.2.3 Global Service operator * To ensure highly available global data exchange, a WIS centre may operate one or more Global Services: - i) Global Discovery Catalogue: enables users to search all datasets provided by data publishers and discover where and how to interact with those datasets (for example, subscribe to updates, access/download/visualize data, or access more detailed information about the dataset). - ii) Global Broker: provides highly available messaging services where users may subscribe to notifications about all datasets provided by data publishers. - iii) Global Cache: provides highly available download service for cached copies of core data downloaded from data publishers’ web-services. 
- iv) Global Monitor: gathers and displays system performance, data availability, and other metrics from all WIS2 Nodes and Global Services. + i) The Global Discovery Catalogue enables users to search all datasets provided by data publishers and discover where and how to interact with those datasets (for example, to subscribe to updates, to access/download/visualize data, to access more detailed information about the dataset); + ii) A Global Broker provides highly available messaging services through which users may subscribe to notifications about all datasets provided by data publishers; + iii) A Global Cache provides a highly available download service for cached copies of core data downloaded from data publishers’ web services; + iv) A Global Monitor gathers and displays system performance, data availability, and other metrics from all WIS2 Nodes and Global Services. ==== 2.2.4 Data consumer -* This role represents anyone wanting to find, access, and use data from WIS2 – examples include (but are not limited to): NMHS, Government agencies, research institutions, private sector organizations, and so on. -* Searches or browses the Global Discovery Catalogue (or other search engine) to discover the dataset(s) that meet their needs (namely, "datasets of interest"). -* Subscribes via the Global Broker to receive notification messages about the availability of data or metadata associated with datasets of interest. -* Determines whether the data or metadata referenced in notification messages is required. -* Downloads data from Global Cache or WIS2 Node. +* This role represents anyone wanting to find, access, and use data from WIS2. Examples include NMHSs, government agencies, research institutions, private sector organizations, and so forth. +* Data consumers search or browse a Global Discovery Catalogue (or another search engine) to discover the datasets that meet their needs ("datasets of interest"). 
+* Data consumers subscribe via a Global Broker to receive notification messages about the availability of data or metadata associated with their datasets of interest. +* Data consumers determine whether the data or metadata referenced in the notification messages are required. +* Data consumers download data from a Global Cache or WIS2 Node. === 2.3 Specifications of WIS2 Leveraging existing open standards, WIS2 defines the following specifications in support of publication, subscription, notification and discovery: -.WIS2 Specifications +.WIS2 specifications |=== |Specification|Granularity|Primary WIS2 Component(s) |WMO Core Metadata Profile 2 (WCMP2) |Datasets -|Global Discovery Catalogue (GDC) +|Global Discovery Catalogue |WIS2 Topic Hierarchy (WTH) |Dataset granules @@ -72,44 +72,44 @@ Leveraging existing open standards, WIS2 defines the following specifications in |=== -Please refer to the _Manual on WIS_, Volume II for details. +Refer to the _Manual on WIS_, Volume II for details. === 2.4 Components of WIS2 // TODO: add refs to other parts of the Guide describing these components ==== 2.4.1 WIS2 Node -* WIS2 Nodes are central to WIS2. These are operated by National Centres (NC) and Data Collection or Production Centres (DCPC) to publish their core and recommended data. -* WIS2 adopts web technologies and open standards enabling WIS2 Nodes to be implemented using freely-available software components and common industry practices. -* WIS2 Nodes publish data as files of a web server or using an interactive web service. -* WIS2 Nodes describe the data they publish using discovery metadata. See the _Manual on WIS_, Volume II, Appendix F: WMO Core Metadata Profile. -* WIS2 Nodes generate notification messages advertising the availability of new data. These notification messages are published to a Message Broker. The WIS2 Topic Hierarchy is used to ensure that all WIS2 Nodes publish to consistent topics. 
The information in the notification message tells the data consumer where to download data from. Notification messages are also used to advertise the availability of discovery metadata. See the _Manual on WIS_, Volume II - Appendix D: WIS2 Topic Hierarchy and Appendix E: WIS2 Notification Message. -* WIS2 Nodes may implement controlled access for the data they publish. Global Services will operate with fixed IP addresses, enabling WIS2 Nodes to easily distinguish their requests. +* WIS2 Nodes are central to WIS2. They are operated by NCs and DCPCs to publish their core and recommended data. +* WIS2 adopts web technologies and open standards, enabling WIS2 Nodes to be implemented using freely available software components and common industry practices. +* WIS2 Nodes publish data as files on a web server or using an interactive web service. +* WIS2 Nodes describe the data they publish using discovery metadata. See the _Manual on WIS_, Volume II, Appendix F. WMO Core Metadata Profile (Version 2). +* WIS2 Nodes generate notification messages advertising the availability of new data. These notification messages are published to a Message Broker. The WIS2 Topic Hierarchy is used to ensure that all WIS2 Nodes publish to consistent topics. The information in the notification message tells the data consumer the location from which to download the data. Notification messages are also used to inform data consumers of the availability of discovery metadata. See the Manual on WIS, Volume II, Appendix D. WIS2 Topic Hierarchy and Appendix E. WIS2 Notification Message. +* WIS2 Nodes may control access to the data they publish. Global Services operate with fixed IP addresses, enabling WIS2 Nodes to easily distinguish their requests. ==== 2.4.2 Global Broker -* WIS2 incorporates several Global Brokers, ensuring highly resilient distribution of notification messages across the globe. 
+* WIS2 incorporates several Global Brokers, ensuring the highly resilient distribution of notification messages across the globe. * A Global Broker subscribes to the Message Broker operated by each WIS2 Node and republishes notification messages. -* A Global Broker subscribes to notifications from other Global Brokers to ensure it receives a copy of all notification messages. +* A Global Broker subscribes to notifications from other Global Brokers to ensure that it receives a copy of all notification messages. * A Global Broker republishes notification messages from every WIS2 Node and Global Service. * A Global Broker operates a highly available, high-performance Message Broker. -* A Global Broker uses the WIS2 Topic Hierarchy enabling a data consumer to easily find topics relevant to their needs. -* Data consumers should subscribe to notifications from a Global Broker not directly to the Message Brokers operated by WIS2 Nodes. +* A Global Broker uses the WIS2 Topic Hierarchy, enabling data consumers to easily find topics relevant to their needs. +* Data consumers should subscribe to notifications from a Global Broker, not directly from the Message Brokers operated by WIS2 Nodes. ==== 2.4.3 Global Cache -* WIS2 incorporates several Global Caches, ensuring highly resilient distribution of data across the globe. -* A Global Cache provides a highly available data server from which a data consumer can download core data, as specified in Resolution 1 (Cg-Ext(2021)). +* WIS2 incorporates several Global Caches, ensuring the highly resilient distribution of data across the globe. +* A Global Cache provides a highly available data server, from which a data consumer can download core data, as specified in Resolution 1 (Cg-Ext(2021)). * A Global Cache subscribes to notification messages via a Global Broker. 
-* On receipt of a notification message, the Global Cache downloads from the WIS2 Node a copy of data referenced in the notification message, makes this copy available on its data server, and publishes a new notification message advertising the availability of this data at the Global Cache. -* A Global Cache will subscribe to notification messages from other Global Caches enabling it to download and republish data it has not acquired directly from WIS2 Nodes. This ensures that each Global Cache holds data from every WIS2 Node. -* A Global Cache shall retain a copy of core data for a duration compatible with the real-time or near real-time schedule of the data and not less than 24-hours. +* Upon receiving a notification message, the Global Cache downloads a copy of the data referenced in the message from the WIS2 Node, makes these data available on its server, and publishes a new notification message informing data consumers that they can now access these data on its server. +* A Global Cache will subscribe to notification messages from other Global Caches, enabling it to download and republish data that it has not acquired directly from WIS2 Nodes. This ensures that each Global Cache holds data from every WIS2 Node. +* A Global Cache shall retain a copy of the core data for a duration compatible with the real-time or near-real-time schedule of the data and not less than 24 hours. * A Global Cache will delete data from the cache once the retention period has expired. -* Data consumers should download data from a Global Cache when available. +* Data consumers should download data from a Global Cache when those data are available. ==== 2.4.4 Global Discovery Catalogue * WIS2 includes several Global Discovery Catalogues. -* A Global Discovery Catalogue enables a data consumer to search and browse descriptions of data published by each WIS2 Node. 
The data description (that is, discovery metadata) provides sufficient information to determine the usefulness of data and how one may access it. -* A Global Discovery Catalogue subscribes to notification messages via a Global Broker about the availability of new (or updated) discovery metadata. It downloads a copy of the discovery metadata and updates the catalogue. -* A Global Discovery Catalogue will amend discovery metadata records to add details of where one can subscribe to updates about the dataset at a Global Broker. +* A Global Discovery Catalogue enables a data consumer to search and browse descriptions of data published by each WIS2 Node. The data description (discovery metadata) provides sufficient information to determine the usefulness of the data and how it may be accessed. +* A Global Discovery Catalogue subscribes to notification messages about the availability of new (or updated) discovery metadata via a Global Broker. It downloads a copy of the discovery metadata and updates the catalogue. +* A Global Discovery Catalogue amends discovery metadata records to add details of where one can subscribe to updates about the dataset at a Global Broker. * A Global Discovery Catalogue makes its content available for indexing by search engines. ==== 2.4.5 Global Monitor @@ -117,21 +117,21 @@ Please refer to the _Manual on WIS_, Volume II for details. * The Global Monitor collects metrics from WIS2 components. * The Global Monitor provides a dashboard that supports the operational management of the WIS2 system. * The Global Monitor tracks: - i) What data is published by WIS2 Nodes. - ii) Whether data can be effectively accessed by data consumers. + i) What data is published by WIS2 Nodes; + ii) Whether the data can be effectively accessed by data consumers; iii) The performance of components in the WIS2 system. 
-=== 2.5 Protocols configuration +=== 2.5 Protocol configuration ==== 2.5.1 Publish-subscribe protocol (MQTT) * The MQTT protocolfootnote:[MQTT Specifications: https://mqtt.org/mqtt-specification/] is to be used for all WIS2 publish-subscribe workflows (publication and subscription). -* MQTT v3.1.1 and v5.0 are the chosen protocols for the WIS2 Notification Messages publication and subscription. -** To connect to Global Brokers, MQTT v5.0 is preferred as it provides additional features such as the ability to use shared subscriptions. -* The following parameters are to be used for all MQTT client/server connectivity and subscription: -** Message retention: false -** Quality of Service (QoS) of 1 -** A maximum of 2000 messages to be held in a queue per client +* MQTT v3.1.1 and v5.0 are the chosen protocols for the publication of and subscription to WIS2 notification messages. +** MQTT v5.0 is preferred for connecting to Global Brokers as it provides additional features such as the ability to use shared subscriptions. +* The following parameters are to be used for all MQTT client/server connections and subscriptions: +** Message retention: false; +** Quality of Service (QoS) of 1; +** A maximum of 2000 messages to be held in a queue per client. * In order to permit authentication and authorization for users, WIS2 Node, Global Cache, Global Discovery Catalogue and Global Brokers shall use a user and password based mechanism. * To improve the overall level of security of WIS2, the secure version of the MQTT protocol is preferred. If used, the certificate must be valid. * The standard Transmission Control Protocol (TCP) ports to be used are 8883 for Secure MQTT (MQTTS) and 443 for Secure Web Socket (WSS). @@ -140,4 +140,4 @@ Please refer to the _Manual on WIS_, Volume II for details. * The HTTP protocol (RFC 7231footnote:[RFC 7231 - Hypertext Transfer Protocol (HTTP/1.1): https://datatracker.ietf.org/doc/html/rfc7231]) is to be used for all WIS2 download workflows. 
* To improve the overall level of security of WIS2, the secure version of the HTTP protocol is preferred. If used, the certificate must be valid. -* The standard Transmission Control Protocol (TCP) port to be used is 443 for Secure HTTP (HTTPS). +* The standard TCP port to be used is 443 for Secure HTTP (HTTPS). From 03ecd74fe0c8fb3ba96bab60783964627040db75 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Mon, 7 Oct 2024 14:08:00 +0200 Subject: [PATCH 04/20] update wis2Node with LSP edits --- guide/sections/part2/wis2node.adoc | 120 ++++++++++++++--------------- 1 file changed, 59 insertions(+), 61 deletions(-) diff --git a/guide/sections/part2/wis2node.adoc b/guide/sections/part2/wis2node.adoc index 1749b47..c6c3466 100644 --- a/guide/sections/part2/wis2node.adoc +++ b/guide/sections/part2/wis2node.adoc @@ -4,117 +4,115 @@ ===== 2.6.1.1 Registration and decommissioning of a WIS2 Node -Registration and decommissioning of WIS2 Nodes must be approved by the Permanent Representative (PR) with WMO for the country or territory in which the WIS centre resides. The WIS National Focal Point can register a WIS2 Node on behalf of the PR for an official NC or DCPC listed in the _Manual on WIS_, Volume I. Where the WIS2 Node is part of a DCPC, the sponsoring WMO Programme or Regional Association shall be consulted. +The registration and decommissioning of WIS2 Nodes must be approved by the Permanent Representative (PR) with WMO of the country or territory where the WIS centre is located. The WIS National Focal Point (NFP) can register a WIS2 Node on behalf of the PR for an official NC or DCPC listed in the Manual on WIS, Volume I. Where the WIS2 Node is part of a DCPC, the sponsoring WMO programme or regional association shall be consulted. -A WIS2 Node can be registered to exchange data concerning a WMO project or campaign, for a limited time. The National Focal Point can register such a project-related WIS2 Node in coordination with the Secretariat. 
+A WIS2 Node can be registered to exchange data concerning a WMO project or campaign for a limited time. The WIS NFP can register such a project-related WIS2 Node in coordination with the WMO Secretariat. A WIS2 Node can act as a publication facility on behalf of other centres. This is a Data Collection or Production Centre (DCPC) role, as defined in the Manual on WIS. Data or metadata publication by a DCPC will use the centre identifiers of the data producers. -WMO Secretariat will operate a WIS2 register as an authoritative list of WIS2 Nodes and Global Services. +The WMO Secretariat will maintain a WIS2 register with an authoritative list of WIS2 Nodes and Global Services. The registration of a WIS2 Node involves the following steps: -* Request hosting a WIS2 Node: A request for hosting a WIS2 Node shall be put forward by the WIS National Focal Point (NFP) of the country of the WIS2 Node host centre, or, in the case of international organizations, by either the Permanent Representative (PR) of the country or territory where the WIS2 Node host centre is located or the president of the relevant organization in case of WMO partner or programme designated as DCPC. +* Request to host a WIS2 Node: A request to host a WIS2 Node shall be put forward by the WIS NFP of the country of the WIS2 Node host centre, or, in the case of international organizations, by either the PR of the country or territory where the WIS2 Node host centre is located or the president of the relevant organization, if the WMO partner or programme is designated as a DCPC. -* Assign a centre-id: The centre identifier (centre-id) is an acronym as proposed by the Member and endorsed by the WMO Secretariat. It is a single identifier comprised of a top -level domain (TLD) and centre -name, and represents the data publisher, distributor or issuing centre of a given dataset or data product/granule (see the Manual on WIS, Volume II – Appendix D: WIS2 Topic Hierarchy). 
See below for guidance on assigning a centre identifier +* Assign a centre identifier (“centre-id”): The centre-id is an acronym proposed by the Member and endorsed by the WMO Secretariat. It is a single identifier consisting of a top-level domain (TLD) and a centre name and represents the data publisher, distributor or issuing centre of a given dataset or data product/granule (see the Manual on WIS, Volume II – Appendix D. WIS2 Topic Hierarchy). See guidance on assigning a centre identifier (<<_2_6_1_2_guidance_on_assigning_a_centre_identifier_for_a_wis2_node>>). -* Complete the WIS2 register: The WIS NFP shall complete the WIS2 register operated by the WMO Secretariat. -* Provide Global Service details: The WMO Secretariat provides connection details for the Global Services (such as, IP addresses) so that the WIS2 Node can be configured to provide the access. -* WIS2 Node assessment: The principal GISC verifies that the WIS2 Node is compliant with WIS2 requirements. The assessment includes: - - Verification of the compliance of the topics used by the centre with the WIS2 Topic Hierarchy (WTH) specification. - - Verification of compliance of notification messages with the WIS2 Notification Message (WNM) specification. - - Verification that the data server is correctly configured and properly functioning. +* Complete the WIS2 register: The WIS NFP shall complete the WIS2 register maintained by the WMO Secretariat. +* Provide details of the Global Service: The WMO Secretariat provides connection details (such as IP addresses) for the Global Services so that the WIS2 Node can be configured to provide access. +* WIS2 Node assessment: The principal GISC verifies that the WIS2 Node is compliant with WIS2 requirements. 
This assessment includes: + - Verification of compliance of the topics used by the centre with the WTH specification; + - Verification of compliance of notification messages with the WIS2 Notification Message (WNM) specification; + - Verification that the data server is correctly configured and properly functioning; - Verification that the Message Broker is correctly configured and properly functioning. -* Add new centre to WIS2: Upon completion of this verification, and confirmation that it satisfies all conditions for operating a WIS2 Node, GISC notifies WMO Secretariat and confirms that this WIS2 Node can be added to WIS2. -* Communicate details to the Global Services: WMO Secretariat provides the WIS2 Node details to the Global Brokers to subscribe to the WIS2 Node. +* Add a new centre to WIS2: Upon completion of the verification and confirmation that the centre satisfies all the conditions for operating a WIS2 Node, the GISC notifies the WMO Secretariat and confirms that the WIS2 Node can be added to WIS2. +* Communicate the details to the Global Services: The WMO Secretariat provides the details of the WIS2 Node to the Global Brokers so that they can subscribe to the WIS2 Node. -A diagram of the process of registering a WIS2 Node is presented below. +A diagram of the process for registering a WIS2 Node is presented below (see Figure 1). image::images/add-wis2node.png[Adding a WIS2 Node,link=images/add-wis2node.png] -Once a WIS2 Node has been registered and connected with Global Services, it can proceed to register the datasets it will publish via WIS2. To register a dataset, the authorized WIS2 Node publishes discovery metadata about the new dataset. Validation of the discovery metadata is completed by the Global Discovery Catalogues and Global Brokers automatically subscribes to the topics provided in the discovery metadata record. For more information, see <<_1_3_2_how_to_provide_discovery_metadata_to_wis2>>. 
+Once a WIS2 Node has been registered and connected to the Global Services, it can proceed to register the datasets it will publish via WIS2. To register a dataset, the authorized WIS2 Node publishes discovery metadata about the new dataset. Validation of the discovery metadata is completed by the Global Discovery Catalogues, and the Global Brokers automatically subscribe to the topics provided in the discovery metadata record. For more information, see <<_1_3_2_how_to_provide_discovery_metadata_to_wis2>>. -Once the dataset has successfully been registered, the WIS2 Node can proceed to exchange data - see +Once the dataset has been successfully registered, the WIS2 Node can proceed to exchange data - see <<_1_3_3_how_to_provide_data_in_wis2>>. -When decommissioning a WIS2 Node operators must ensure that obligations relating to data sharing within WIS continue to be met after the WIS2 Node is decommissioned, for example, by migrating these data sharing obligations to another WIS2 Node. In the case of a DCPC, this may mean the responsibilities are transferred to another Member. +When decommissioning a WIS2 Node, operators must ensure that obligations relating to data sharing within WIS continue to be met after the WIS2 Node is decommissioned, for example, by migrating the data sharing obligations to another WIS2 Node. In the case of a DCPC, this may mean transferring the responsibilities to another Member. ===== 2.6.1.2 Guidance on assigning a centre identifier for a WIS2 Node -The centre identifier (``centre-id``) is used in WIS2 to uniquely identify a participating WIS2 Node. The ``centre-id`` must conform to the specification given in the _Manual on WIS_, Volume II - Appendix D: WIS2 Topic Hierarchy, section 7.1.6 Centre identification. +The centre identifier (``centre-id``) is used in WIS2 to uniquely identify a participating WIS2 Node. The ``centre-id`` must conform to the specifications given in the _Manual on WIS_, Volume II - Appendix D. 
WIS2 Topic Hierarchy, section 7.1.6 Centre identification. The ``centre-id`` comprises two dash-separated tokens. -*Token 1* is a _Top Level Domain_ (TLD) defined by the Internet Assigned Numbers Authority (IANA)footnote:[IANA Top Level Domains https://data.iana.org/TLD]. +*Token 1* is a _Top Level Domain_ (TLD) defined by the Internet Assigned Numbers Authority (IANA).footnote:[IANA Top Level Domains https://data.iana.org/TLD] -This is usually a simple choice for a Member. However, overseas territories require some thought. The recommended approach depends on the governance of the overseas territory. Take some French examples. Réunion is a French Department – it is considered part of France, it uses the Euro. Here, we would use the “fr” TLD. New Caledonia is a French overseas territory with top-level-domain of “nc”. It has separate, devolved governance. The recommendation is to use “nc”. All that said, it is a national decision which TLD to use. +It is usually fairly easy for a Member to choose a TLD. However, for Members’ overseas territories, this may require some thought. The recommended approach depends on the governance structure of the overseas territory. For example, Réunion is a French Department; it is considered part of France, and it uses the Euro. Réunion would use the “fr” TLD. New Caledonia is a French overseas territory with a TLD of “nc” because it has a separate, devolved governance structure. The recommendation is to use “nc”. However, the decision of which TLD to use is made at the national level. -*Token 2* is a descriptive name for the centre and this may contain dashes (but not other special characters). +*Token 2* is a descriptive name for the centre. It may contain dashes, but it may not contain other special characters. -The descriptive name should be something recognizable - not only by our community, but by other users too. 
Basing things on the web domain name is likely to ensure that centre identifiers remain unique within a particular country/territory. A UK example this time: the UK's National Meteorological Service is the Met Office (http://www.metoffice.gov.uk), so “metoffice” is better than “ukmo”footnote:[The “.gov” part of the domain name is superfluous for the purposes of WIS2. There is nothing preventing its use, but it does not add any value.]. Using the 4-letter GTS centre identifiers (CCCC) is not recommended because people unfamiliar with the GTS do not understand them. +The descriptive name should be something recognizable – not only by the WIS2 community, but also by other users. Basing the name on the web domain name is likely to ensure that centre identifiers remain unique within a particular country or territory. For example, the National Meteorological Service of the United Kingdom of Great Britain and Northern Ireland is the Met Office,footnote:[See http://www.metoffice.gov.uk] so “metoffice” is better than “ukmo”.footnote:[The “.gov” part of the domain name is superfluous for the purposes of WIS2. There is nothing preventing its use, but it does not add any value.] Using a four-letter GTS centre identifier (for example, CCCC) is not recommended because those who are unfamiliar with GTS will not understand these identifiers. -The centre identifier specification says that larger organizations operating multiple centres may wish to register separate centre-ids for each centre. This is good practice. Keeping with the UK example, Met Office operates a National Meteorological Centre (NMC), 9 DCPCs (for example, a Volcanic Ash Advisory Centre) and a WIS2 Global Service, so it is important to split them out. For example: +The centre identifier specification says that larger organizations operating multiple centres may wish to register separate centre-ids for each centre. This is a good practice.
Keeping with the UK example, the Met Office operates a National Meteorological Centre (NMC), 9 DCPCs (for example, a Volcanic Ash Advisory Centre) and a WIS2 Global Service, so it is important to separate them. For example: -* ``uk-metoffice-nmc`` -* ``uk-metoffice-vaac`` -* ``uk-metoffice-global-cache`` +* ``uk-metoffice-nmc``; +* ``uk-metoffice-vaac``; +* ``uk-metoffice-global-cache``. -Using a system name in the centre-id is not a good idea because these may change over time. Functional designations are longterm durable. Appending ``-test`` may be used to designate test WIS2 Nodes. +It is not advisable to use a system name in the centre-id because system names may change over time. Functional designations are durable over the long term. Test WIS2 Nodes may be designated by adding “-test” to the descriptive name. ===== 2.6.1.3 Authentication, authorization, and access control for a WIS2 Node -When configuring your WIS2 Node you need to consider how it will be accessed by Global Services and data consumers. +When configuring a WIS2 Node, it is necessary to consider how it will be accessed by Global Services and data consumers. -Global Brokers must authenticate when they connect to the MQTT Message Broker in your WIS2 Node. Username and password credentials are usedfootnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone``. WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).]. When registering your WIS2 Node with the WMO Secretariat, you will need to provide these credentials. The WMO Secretariat will share these credentials with the Global Service operators and store them in the WIS register. You should not consider these credentials as confidential or secret. +Global Brokers must authenticate when they connect to the MQTT Message Broker in the WIS2 Node. 
Username and password credentials are used.footnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone``. WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).] When registering the WIS2 Node with the WMO Secretariat, these credentials must be provided. The WMO Secretariat will share the credentials with the Global Service operators and store them in the WIS register. These credentials should not be considered confidential or secret. -Given that Global Brokers will republish notification messages provided by your WIS2 Node, you may decide to restrict access to the MQTT Message Broker. Global Brokers operate using a fixed IP address which allows you to permit them access using IP filteringfootnote:[In WIS2 the IP addresses are used to determine the origin of connections and therefore confer trust to remote systems. It is well documented that IP addresses can be hi-jacked and that there are alternative, more sophisticated, mechanisms available for reliably determining the origin of connections requests, such as Public Key Infrastructure (PKI). However, the complexities of such implementation would introduce a barrier to Member's participation in WIS2. IP addresses are considered to provide an adequate level of trust for the purposes of WIS2: distributing publicly accessible data and messages.]. You must ensure that your MQTT Message Broker is accessible for more than one Global Broker to provide resilient transmission of notification messages to WIS2. +Given that Global Brokers republish notification messages provided by the WIS2 Node, access to the MQTT Message Broker may be restricted. Global Brokers operate using fixed IP addresses, which allows them to be granted access through IP filtering.footnote:[In WIS2 the IP addresses are used to determine the origin of connections and therefore confer trust to remote systems.
It is well documented that IP addresses can be hijacked and that there are alternative, more sophisticated, mechanisms available for reliably determining the origin of connection requests, such as Public Key Infrastructure (PKI). However, the complexities of such an implementation would introduce a barrier to Members' participation in WIS2. IP addresses are considered to provide an adequate level of trust for the purposes of WIS2: distributing publicly accessible data and messages.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. -If your WIS2 Node is only publishing core datafootnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.], you may also restrict access to your data server - instead, relying on the Global Caches to distribute your data. Similarly, Global Caches also operate on fixed IP addresses allowing connections from them to be easily identified. Again, you must ensure that access is given to more than one Global Broker to ensure resilience. +If a WIS2 Node only publishes core data,footnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.] access to the data server may also be restricted, with the distribution of data handled by Global Caches. Global Caches also operate on fixed IP addresses, allowing their connections to be easily identified. Again, access must be granted to more than one Global Broker to ensure resilience. -During registration, the WMO Secretariat will provide host names and IP addresses of the Global Services to enable configuration of access control.
+During registration, the WMO Secretariat will provide host names and IP addresses of the Global Services to enable access controls to be configured. -Access controls may be implemented for recommended data. You should use only the "security schemes" for authentication and authorization specified in OpenAPIfootnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object]. +Access controls may be implemented for recommended data. Only the security schemes for authentication and authorization specified in OpenAPI should be used.footnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object] ==== 2.6.2 Performance management ===== 2.6.2.1 Service levels and performance indicators -A WIS2 Node must be able to: - -- Publish datasets and compliant metadata and discovery metadata - * Publish metadata to the Global Data Catalogue - * Publish core data to the Global Cache - * Publish data for consumer access - * Publish data embedded in a message (such as, Common Alerting Protocol (CAP) warnings) - * Receive metadata publication errors from the Global Data Catalogue - * Provide metadata with topics to Global Brokers +A WIS2 Node must be able to publish datasets, compliant metadata and discovery metadata. This entails: + * Publishing metadata to the Global Data Catalogue; + * Publishing core data to the Global Cache; + * Publishing data for consumer access; + * Publishing data embedded in a message (for example, Common Alerting Protocol (CAP) warnings); + * Receiving metadata publication errors from the Global Data Catalogue; + * Providing metadata with topics to Global Brokers. ===== 2.6.2.2 System performance metrics -If contacted by the Global Monitor via GISC for a performance issue, the WIS2 Node should provide metrics to the GISC and Global Monitor when service is restored to indicate the resolution of the issue. 
+If contacted by a Global Monitor for a performance issue via a GISC, the WIS2 Node should provide metrics to the GISC and the Global Monitor when service is restored to inform them of the resolution of the issue. ==== 2.6.3 WIS2 Node reference implementation: WIS2 in a box -To provide a WIS2 Node, Members may use whichever software components they consider most appropriate to comply with WIS2 technical regulations. +When providing a WIS2 Node, Members may use whichever software components they consider most appropriate to comply with the WIS2 technical regulations. -To assist Members participating in WIS2, a free and open-source reference implementation is available for use. WIS2 in a box (wis2box) implements the requirements of a WIS2 Node in as well as additional enhancements. The wis2box builds on mature and robust free and open-source software components that are widely adopted for operational use. +To assist Members, a free and open-source reference implementation called “WIS2 in a box” (wis2box) is available. wis2box implements the requirements for a WIS2 Node and contains additional enhancements. wis2box is built on mature and robust free open-source software components that are widely adopted for operational use. -The wis2box provides functionality required for both data publisher and data consumer roles. 
It provides the following technical functions: +wis2box provides the functionality required for both data publisher and data consumer roles, as well as the following technical functions: * Configuration, generation and publication of data (real-time or archive) and metadata to WIS2, compliant to WIS2 Node requirements -* MQTT Message Broker and notification message publication (subscribe) -* HTTP object storage and raw data access (download) -* Station metadata curation/editing tools (user interface) -* Discovery metadata curation/editing tools (user interface) -* Data entry tools (user interface) -* OGC API server, providing dynamic APIs for discovery, access, visualization and processing functionality (APIs) -* Extensible data "pipelines", allowing for transformation, processing and publishing of additional data types -* Provision of system performance and data availability metrics -* Access control for recommended data publication, as required -* Subscription to notifications and and download of WIS data from Global Services -* Modular design, allowing for extending to meet additional requirements or integration with existing data management systems - -Project documentation can be found at https://docs.wis2box.wis.wmo.int. - -The wis2box is managed as a free and open source project. Source code, issue tracking and discussions are hosted openly on GitHub: https://docs.wis2box.wis.wmo.int. 
+* MQTT Message Broker and notification message publication (subscribe); +* HTTP object storage and raw data access (download); +* Station metadata curation/editing tools (user interface); +* Discovery metadata curation/editing tools (user interface); +* Data entry tools (user interface); +* OGC API server, providing dynamic APIs for discovery, access, visualization and processing functionality (APIs); +* Extensible data "pipelines", allowing for the transformation, processing and publishing of additional data types; +* Provision of system performance and data availability metrics; +* Access control for publication of recommended data, as required; +* Subscription to notifications and download of WIS data from Global Services; +* Modular design, allowing for extending to meet additional requirements or integration with existing data management systems. + +The project documentation can be found at https://docs.wis2box.wis.wmo.int. + +The wis2box is managed as a free and open source project. The source code, issue tracking and discussions are hosted openly on GitHub: https://docs.wis2box.wis.wmo.int. 
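The centre-id rules described in section 2.6.1.2 of the patch above (a TLD token, a dash, then a descriptive name that may contain dashes but no other special characters) can be illustrated with a minimal Python sketch. The names `CENTRE_ID_PATTERN`, `split_centre_id` and `is_test_node`, the regular expression itself, and the restriction to lowercase ASCII letters and digits are assumptions made for illustration only. The authoritative rules are those of the WIS2 Topic Hierarchy, and the sketch does not check the first token against the IANA TLD list.

```python
import re

# Assumed pattern (not taken from the specification): a TLD token of lowercase
# letters, a dash, then a descriptive name made of dash-separated groups of
# lowercase letters and digits.
CENTRE_ID_PATTERN = re.compile(r"([a-z]{2,})-([a-z0-9]+(?:-[a-z0-9]+)*)")


def split_centre_id(centre_id):
    """Return (tld, descriptive_name), or None if the identifier does not match."""
    match = CENTRE_ID_PATTERN.fullmatch(centre_id)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_test_node(centre_id):
    """True when the descriptive name ends with the "-test" suffix."""
    parts = split_centre_id(centre_id)
    return parts is not None and parts[1].endswith("-test")
```

With these assumptions, `split_centre_id("uk-metoffice-nmc")` separates the identifier at the first dash only, so the dashes inside the descriptive name are preserved.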
From f911e0b8739ca1f0902c3636689c1faf1010a8ae Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Tue, 8 Oct 2024 10:38:25 +0200 Subject: [PATCH 05/20] update global-services --- guide/sections/part2/global-services.adoc | 245 +++++++++++----------- 1 file changed, 120 insertions(+), 125 deletions(-) diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index 9590db8..5c5ad50 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -1,24 +1,24 @@ === 2.7 Implementation and operation of a Global Service ==== 2.7.1 Procedure for registration of a new Global Service -Successful operations of WIS will depend on having a set of Global Services running well-managed IT environments with a very high level of reliability so that all WIS Users and WIS2 Nodes will be able to access and provide the data they need for their duties. +The successful operation of WIS2 depends on a set of Global Services running well-managed IT environments with a very high level of reliability so that all WIS2 users and WIS2 Nodes are able to access and provide the data they need for their duties. 
-Depending on the nature of the Global Service, the following is the minimum capability of Global Service operation, so that collectively, the level of service is 100% (or very close): +Depending on the nature of the Global Service, the following are the minimum capabilities to ensure that, collectively, the level of service is 100% (or very close): -* Three (3) Global Brokers: Each Global Broker is connected to at least two (2) other Global Brokers -* Three (3) Global Caches: Each Global Cache is connected to at least two (2) Global Brokers and should be able to download the data from all WIS2 Nodes providing core data -* Two (2) Global Discovery Catalogues: Each Global Discovery Catalogue is connected to at least one (1) Global Broker -* Two (2) Global Monitors: Each Global Monitor should scrape the metrics from all other Global Services +* Three Global Brokers, with each Global Broker connected to at least two other Global Brokers; +* Three Global Caches, with each Global Cache connected to at least two Global Brokers and capable of downloading data from all WIS2 Nodes providing core data; +* Two Global Discovery Catalogues, with each Global Discovery Catalogue connected to at least one Global Broker; +* Two Global Monitors, with each Global Monitor scraping the metrics from all other Global Services. -In addition to the above, WIS architecture can accommodate adding (or removing) Global Services. Candidate WIS centres should inform their WIS National Focal Point and contact the WMO Secretariat to discuss their offer to provide a Global Service. +In addition to the above, WIS architecture can accommodate adding (or removing) Global Services. Candidate WIS centres should inform their WIS NFP and contact the WMO Secretariat to discuss their offer to provide a Global Service. -Running a Global Service is a significant commitment for a WIS centre. To maintain a very high level of service of WIS, each Global Service will have a key role to play.
+Running a Global Service is a significant commitment for a WIS centre. To maintain a very high level of service, each Global Service has a key role to play. -On receipt of an offer from a Member to operate a Global Service, the WMO Secretariat will suggest which Global Service the Member may provide to improve WIS2. This suggestion will be based on the current situation of WIS2 (such as, the number of existing Global Brokers, whether an additional Global Cache is needed, and so forth). +On receipt of an offer from a Member to operate a Global Service, the WMO Secretariat will suggest which Global Service the Member may provide to improve WIS2. This suggestion will be based on the current situation of WIS2 (such as the number of existing Global Brokers, whether an additional Global Cache is needed, and so forth). -The _Manual on WIS_, Volume II, this Guide and other material available will help WIS centres in deciding the best way forward. +The _Manual on WIS_, Volume II, the present Guide, and other available materials will help WIS centres decide how to proceed. -When decided, the WIS National Focal Point will inform the WMO Secretariat of its preference. Depending on the type of Global Service, the WMO Secretariat will provide a checklist to the WIS centre so that the future Global Service can be included in WIS operations. +When a decision on how to proceed has been made, the WIS NFP will inform the WMO Secretariat of its preference. Depending on the type of Global Service, the WMO Secretariat will provide a checklist to the WIS centre so that the future Global Service can be included in WIS operations. A WIS centre must commit to running the Global Service for a minimum of four years.
@@ -28,129 +28,125 @@ The WMO Secretariat and other Global Services will make the required changes to ===== 2.7.2.1 Monitoring and metrics for WIS2 operations -The availability of data and performance of system components within WIS2 are actively monitored by GISCs and the Global Monitor service to ensure proactive response to incidents and effective capacity planning for future operations. +The availability of data and the performance of system components within WIS2 are actively monitored by GISCs and the Global Monitor service to ensure proactive responses to incidents and effective capacity planning for future operations. -WIS2 requires that metrics are provided using OpenMetrics – the de-facto standard footnote:[OpenMetrics is proposed as a draft standard within IETF.] for transmitting cloud-native metrics at scale. Widely adopted, many commercial and open-source software components already come preconfigured to provide performance metrics using the OpenMetrics standard. Tools such as Prometheus and Grafana provide aggregation and visualization of metrics provided in this form, making it simple to generate performance insights. The OpenMetrics standard can be found at openmetrics.io footnote:[cncf-openmetrics[https://openmetrics.io]]. +WIS2 requires that metrics are provided using OpenMetricsfootnote:[See OpenMetrics: https://openmetrics.io.] – the de-facto standardfootnote:[OpenMetrics is proposed as a draft standard within the Internet Engineering Task Force (IETF).] for transmitting cloud-native metrics at scale. Widely adopted, many commercial and open-source software components already come preconfigured to provide performance metrics using the OpenMetrics standard. Tools such as Prometheus and Grafana aggregate and visualize metrics provided in this format, making it simple to generate performance insights. 
-The WIS2 Global Services, namely the Global Broker, Global Cache, and Global Discovery Catalogue expose monitoring metrics on their respective service to the Global Monitor. +WIS2 Global Services (Global Brokers, Global Caches, and Global Discovery Catalogues) provide monitoring metrics about their respective service to Global Monitors. -There is no requirement on WIS2 Nodes to provide monitoring metrics. However their WIS2 interfaces may be queried remotely by Global Services, which in turn can provide metrics on the availability of WIS2 Nodes. +There is no requirement for WIS2 Nodes to provide monitoring metrics. However, their WIS2 interfaces may be queried remotely by Global Services, which can then provide metrics on the availability of WIS2 Nodes. -Metrics for the WIS2 monitoring should follow the naming convention: +Metrics for WIS2 monitoring should follow the naming convention wmo_<program>_<name>, where <program> is the name of the responsible WMO programme and <name> is the name of the metric. Examples of WIS2 metrics include: - wmo_<program>_<name> + wmo_wis2_gc_downloaded_total, and -Where program is the name of the responsible WMO Programme and name is the name of the metric. Examples for WIS2 metrics can look like: - - wmo_wis2_gb_messages_invalid_total + wmo_wis2_gb_messages_invalid_total. The full set of the WIS2 monitoring metrics is given in WMO: WIS2 Metric Hierarchy footnote:wmo-wmh[https://github.com/wmo-im/wis2-metric-hierarchy] -===== 2.7.2.2 Service levels, performance indicators, and fair-usage policies -* Each WIS centre operating a WIS2 Node will be responsible for achieving the highest possible level of service based on their resources and capabilities. -* All Global Services, in particular Global Brokers and Global Caches, are collectively responsible for making the WIS a reliable and efficient means to exchange data required for the operations of all WIS centres.
The agreed architecture provides a redundant solution where the failure of one component will not impact the overall level of service of WIS. -* Each Global Service should aim at achieving at least 99.5% availability of the service they propose. This is not a contractual target. It should be considered by the entity providing the Global Service as a guideline when designing and operating the Global Service. +===== 2.7.2.2 Service levels, performance indicators, and fair usage policies +* Each WIS centre operating a WIS2 Node is responsible for achieving the highest possible level of service based on its resources and capabilities. +* All Global Services, in particular Global Brokers and Global Caches, are collectively responsible for making WIS a reliable and efficient means of exchanging the data required for the operation of all WIS centres. The architecture provides a redundant solution where the failure of one component will not impact the overall level of service of WIS. +* Each Global Service should aim to achieve at least 99.5% availability of the service it provides. This is not a contractual target. It should be considered by the entity providing the Global Service as a guideline when designing and operating that service. * A Global Broker: -** Should support a minimum of 200 WIS2 Nodes or Global Services -** Should support a minimum of 1000 subscribers. -** Should support processing of a minimum of 10000 messages per second +** Should support a minimum of 200 WIS2 Nodes or Global Services; +** Should support a minimum of 1 000 subscribers; +** Should support the processing of a minimum of 10 000 messages per second. 
* A Global Cache: -** Should support a minimum of 100 GB of data in the cache -** Should support a minimum of 1000 simultaneous downloads -** Could limit the number of simultaneous connections from a user (known by its originating source IP) to 5 -** Could limit the bandwidth usage of the service to 1 Gb/s +** Should support a minimum of 100 GB of data in the cache; +** Should support a minimum of 1 000 simultaneous downloads; +** Could limit the number of simultaneous connections from a user (known by its originating source IP) to five; +** Could limit the bandwidth usage of the service to 1 Gb/s. * A Global Monitor: -** Should support a minimum of 50 metrics providers -** Should support 200 simultaneous access to the dashboard -** Could limit the bandwidth usage of the service to 100 Mb/s +** Should support a minimum of 50 metrics providers; +** Should support 200 simultaneous accesses to the dashboard; +** Could limit the bandwidth usage of the service to 100 Mb/s. * A Global Discovery Catalogue: -** Should support a minimum of 20000 metadata records -** Should support a minimum of 50 requests per second to the API endpoint +** Should support a minimum of 20 000 metadata records; +** Should support a minimum of 50 requests per second to the API endpoint. ===== 2.7.2.3 Metrics for Global Services -In the following sections and for each Global Service, a set of metrics is defined. Each Global Service will provide those metrics. They will then be ingested by the Global Monitor. +In the following sections, and for each Global Service, a set of metrics is defined. Each Global Service will provide those metrics. They will then be ingested by the Global Monitor. ==== 2.7.3 Global Broker ===== 2.7.3.1 Technical considerations -* As detailed above, there will be at least three instances of Global Broker to ensure highly available, low latency global provision of messages within WIS.
-* A Global Broker instance subscribes to messages from WIS2 Nodes and other Global Services. The Global Broker should aim at subscribing to all WIS centres. If this is not possible, for whatever reason, the Global Broker should inform the WMO Secretariat so that the situation is documented. +* As detailed above, there will be at least three Global Brokers to ensure that messages within WIS2 are highly available and delivered globally with low latency. +* A Global Broker subscribes to messages from WIS2 Nodes and other Global Services. The Global Broker should aim to subscribe to all WIS centres. If this is not possible, the Global Broker should inform the WMO Secretariat so that the situation can be documented. * Every WIS2 Node or Global Service must have subscriptions from at least two Global Brokers. -* For full global coverage, a Global Broker instance will subscribe to messages from at least two other Global Brokers. +* For full global coverage, a Global Broker will subscribe to messages from at least two other Global Brokers. * When subscribing to messages from WIS2 Nodes and other Global Services, a Global Broker must authenticate using the valid credentials managed by the WIS centre and available at WMO Secretariat. * A Global Broker is built around two software components: -** An off the shelf broker implementing both MQTT 3.1.1 and MQTT 5.0 in a highly-available setup, typically in a cluster mode. Tools such as EMQX, HiveMQ, VerneMQ, RabbitMQ (in its latest versions) are compliant with these requirements. It must be noted that the open source version of Mosquitto cannot be clustered and therefore should not be used as part of a Global Broker. -** Additional features including anti-loop detection, notification message format compliance, validation of the published topic, and provision of metrics are required. +** An off the shelf broker implementing both MQTT 3.1.1 and MQTT 5.0 in a highly available setup, typically in a cluster mode. 
Tools such as EMQX, HiveMQ, VerneMQ, RabbitMQ (in its latest versions) are compliant with these requirements. The open source version of Mosquitto cannot be clustered and therefore should not be used as part of a Global Broker. +** Additional required features, including anti-loop detection, notification message format compliance, validation of the published topic, and metrics provision. -* When receiving a message from a WIS centre or Global Service broker, The metric ``wmo_wis2_gb_messages_received_total`` will be increased by 1. +* When receiving a message from a WIS centre or a Global Service broker, the metric ``wmo_wis2_gb_messages_received_total`` will be increased by 1. * A Global Broker will check if a discovery metadata record exists corresponding to the topic on which a message has been published. If there is no corresponding discovery metadata record, the Global Broker will discard non-compliant messages and will raise an alert. The metric ``wmo_wis2_gb_messages_no_metadata_total`` will be increased by 1. Global Broker should not request information from the Global Discovery Catalogue for each notification message but should keep a cache of all valid topics for every ``centre-id``. * A Global Broker will check if the topic on which the message is received is valid. If the topic is invalid, the Global Broker will discard non-compliant messages and will raise an alert. The metric ``wmo_wis2_gb_invalid_topic_total`` will be increased by 1. -* During the pre-operational phase (2024), Global Broker will not discard the message but will send a message on the `monitor` topic hierarchy to inform the originating centre and its GISC. -* A Global Broker will validate notification messages against the standard format (see _Manual on WIS_, Volume II – Appendix E: WIS2 Notification Message), discarding non-compliant messages and raising an alert. The metric ``wmo_wis2_gb_invalid_format_total`` will be increased by 1. 
-* A Global Broker instance will republish a message only once. Using the message id as defined in the WIS2 Notification Message, the Global Broker will record the id of messages already published and will discard subsequent identical (with the same message id) messages. This is the anti-loop feature of the Global Broker. +* During the pre-operational phase (2024), a Global Broker will not discard the message but instead will send a message on the `monitor` topic hierarchy to inform the originating centre and its GISC. +* A Global Broker will validate notification messages against the standard format (see _Manual on WIS_, Volume II – Appendix E. WIS2 Notification Message), discarding non-compliant messages and raising an alert. The metric ``wmo_wis2_gb_invalid_format_total`` will be increased by 1. +* A Global Broker will republish a message only once. It will record the id (as defined in the WIS2 Notification Message) of each message already published and will discard subsequent identical messages (those with the same message id). This is the anti-loop feature of the Global Broker. * When publishing a message to the local broker, the metric ``wmo_wis2_gb_messages_published_total`` will be increased by 1. -* All above defined metrics will be made available on HTTPS endpoints that the Global Monitor will ingest from regularly. -* As a convention Global Broker centre-id will be ``tld-{centre-name}-global-broker``. -* A Global Broker should operate with a fixed IP address so that WIS2 Nodes can permit access to download resources based on IP address filtering. A Global Broker should also operate with a public resolvable DNS name pointing to that IP address. The WMO Secretariat must be informed of the IP address and/or hostname, and any subsequent changes. +* All of the metrics defined above will be made available on HTTPS endpoints from which the Global Monitor will regularly ingest them.
+* As a convention, the Global Broker centre-id will be ``tld-{centre-name}-global-broker``. +* A Global Broker should operate with a fixed IP address so that WIS2 Nodes can permit access to download resources based on IP address filtering. A Global Broker should also operate with a publicly resolvable Domain Name System (DNS) name pointing to that IP address. The WMO Secretariat must be informed of the IP address and/or hostname and any subsequent changes. ==== 2.7.4 Global Cache -In WIS2 Global Caches provide access to WMO core data for data consumers. This allows for data providers to restrict access to their systems to Global Services and it reduces the need for them to provide high bandwidth and low latency access to their data. Global Caches work transparent for end users in that they resend notification messages from data providers which are updated to point to the Global Cache data store for data, they copied from the original source. Additionally, Global Caches also resend notification messages from data providers for core data, that is not stored on the Global Cache, for instance if the originator indicates that a certain dataset should not be cached in the notification message. In the latter case, the notification messages that a Global Cache resends are unchanged and point to the original source. Data consumers should subscribe to the notification messages from Global Caches instead of the notification messages from the data providers for WMO core data. When data consumers receive a notification message they should follow the URLs from that message which either point to a Global Cache holding a copy of the data, or - in case of uncached content - point to the original source. +In WIS2, Global Caches provide access to WMO core data for data consumers. This allows data providers to restrict access to their systems to Global Services, and it reduces the need for them to provide high bandwidth and low latency access to their data. 
Global Caches work transparently for end users: they resend notification messages from data providers, updated to point to the Global Cache data store holding the data copied from the original source. Global Caches also resend notification messages from data providers for core data that are not stored in the Global Cache, such as when the originator specifies in the notification message that a certain dataset should not be cached. In these cases, the notification messages remain unchanged and point to the original source. Data consumers should subscribe to the notification messages from Global Caches instead of the notification messages from data providers for WMO core data. When data consumers receive a notification message, they should follow the URLs from that message, which either point to a Global Cache that has a copy of the data, or – in case of uncached content – point to the original source. ===== 2.7.4.1 Technical considerations * A Global Cache is built around three software components: -** A highly available data server allowing data consumers to download cache resources with high bandwidth and low latency. -** A Message Broker implementing both MQTTv3.1.1 and MQTTv5 for publishing notification messages about resources that are available from the Global Cache -** A cache management implementing the features needed to connect with the WIS ecosystem, receive data from WIS2 Nodes and other Global Caches, store the data to the data server and manage the content of the cache (such as, expiration of data, deduplication, and so forth). -* The Global Cache will aim at containing copies of real-time and near real-time data designated as "core" within the Unified Data Policy (Resolution 1 (Cg-Ext(2021))). -* A Global Cache instance will host data objects copied from NC/DCPCs. -* A Global Cache instance will publish notification messages advertising availability of the data objects it holds.
The notification messages will follow the standard structure (see _Manual on WIS_, Volume II -Appendix E: WIS2 Notification Message). -* A Global Cache instance will use the standard topic structure in their local message brokers (see _Manual on WIS_, Volume II -Appendix D: WIS2 Topic Hierarchy). -* A Global Cache instance will publish on topic ``cache/a/wis2/...``. -* There will be multiple Global Cache instances to ensure highly available, low latency global provision of real-time and near real-time "core" data within WIS2. -* There will be multiple Global Cache instances that may attempt to download cacheable data objects from all originating centres with "cacheable" content. A Global Cache instance will also download data objects from other Global Cache instances. This ensures the instance has full global coverage, mitigating where direct download from an originating centre is not possible. -* A Global Cache instance will operate independently of other Global Cache instances. Each Global Cache instance will hold a full copy of the cache – albeit that there may be small differences between Global Cache instances as "data availability" notification messages propagate through WIS to each Global Cache in turn. There is no formal ‘synchronization’ between Global Cache instances. +** A highly available data server allowing data consumers to download cache resources with high bandwidth and low latency; +** A Message Broker implementing both MQTTv3.1.1 and MQTTv5 to publish notification messages about resources that are available from the Global Cache; +** A cache management system implementing the features needed to connect to the WIS ecosystem, receive data from WIS2 Nodes and other Global Caches, store the data on the data server and manage the content of the cache (expiration of data, deduplication, and so forth). 
+* A Global Cache will aim to contain copies of real-time and near-real-time data designated as "core" within the Unified Data Policy (Resolution 1 (Cg-Ext(2021))). +* A Global Cache will host data objects copied from NCs/DCPCs. +* A Global Cache will publish notification messages advertising the availability of the data objects it holds. The notification messages will follow the standard structure (see _Manual on WIS_, Volume II - Appendix E. WIS2 Notification Message). +* A Global Cache will use the standard topic structure in its local Message Brokers (see _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). +* A Global Cache will publish on the topic ``cache/a/wis2/...``. +* There will be multiple Global Caches to ensure the highly available, low-latency global provision of real-time and near-real-time core data within WIS2. +* There will be multiple Global Caches that may attempt to download cacheable data objects from all originating centres with cacheable content. A Global Cache will also download data objects from other Global Caches. This will ensure that each Global Cache has full global coverage, even when direct download from an originating centre is not possible. +* Global Caches will operate independently of one another. Each Global Cache will hold a full copy of the cache – although there may be small differences between the various Global Caches as data availability notification messages propagate through WIS to each one. There is no formal synchronization between Global Caches. * A Global Cache will temporarily cache all resources published on the ``metadata`` topic. A Global Discovery Catalogue will subscribe to notifications about the publication of new or updated metadata, download the metadata record from the Global Cache and insert it into the catalogue.
A Global Discovery Catalogue will also publish a metadata record archive each day containing the complete content of the catalogue and advertise its availability with a notification message. This resource will also be cached by a Global Cache. -* A Global Cache is designed to support real-time distribution of content. Data consumers access data objects from a Global Cache instance by resolving the URL in a "data availability" notification message and downloading the file to which the URL points. Apart from the URL it is transparent to the data consumers from which Global Cache they download the data. There is no need to download the same data object from multiple Global Caches. The data id contained within the notification messages is used by data consumers and Global Services to detect such duplicates. -* There is no requirement for a Global Cache to provide a "browseable" interface to the files in its repository allowing data consumers to discover what content is available. However, a Global Cache may choose to provide such a capability (for example, implemented as a WAF) along with adequate documentation for data consumers to understand how the capability works. +* A Global Cache is designed to support real-time content distribution. Data consumers access data objects from a Global Cache by resolving the URL in a data availability notification message and downloading the file to which the URL points. Apart from the URL, it makes no difference to data consumers which Global Cache they download the data from. There is no need to download the same data object from multiple Global Caches. The data id contained within notification messages is used by data consumers and Global Services to detect such duplicates. +* There is no requirement for a Global Cache to provide a browsable interface to the files in its repository in order to allow data consumers to discover what content is available.
However, a Global Cache may choose to provide such a capability (for example, implemented as a WAF), along with documentation to inform data consumers of how the capability works. * The default behaviour for a Global Cache is to cache all data published under the ``origin/a/wis2/data/+/core`` topic. A data publisher may indicate that data should not be cached by adding the ``"cache": false`` assertion in the WIS2 Notification Message. -* A Global Cache may decide not to cache data. For example, if the data is considered too large, or a WIS2 Node publishes an excessive number of small files. Where a Global Cache decides not to cache data it should behave as though the ``cache`` property is set to false and send a message on the `monitor` topic hierarchy to inform the originating centre and its GISC. The Global Cache operator should work with the originating WIS2 Node and their GISC to remedy the issue. -* If core data is not cached on a Global Cache (that is, if the data is flagged as ``"cache": false`` or if the Global Cache decides not to cache this data), the Global Cache shall nevertheless republish the WIS2 Notification Message to the ``cache/a/wis2/...`` topic. In this case the message id will be changed and the rest of the message will not be modified. -* A Global Cache should operate with a fixed IP address so that WIS2 Nodes can permit access to download resources based on IP address filtering. A Global Cache should also operate with a public resolvable DNS name pointing to that IP address. The WMO Secretariat must be informed of the IP address and/or hostname, and any subsequent changes. -* A Global Cache should validate the integrity of the resources it caches and only accept data that matches the integrity value from the WIS2 Notification Message. If the WIS2 Notification Message does not contain an integrity value, a Global Cache should accept the data as valid. In this case a Global Cache may add an integrity value to the message it republishes. 
-* As a convention Global Cache centre-id will be ``tld-{centre-name}-global-cache``. +* A Global Cache may decide not to cache data, for example, if the data are considered too large, or if a WIS2 Node publishes an excessive number of small files. If a Global Cache decides not to cache data, it should behave as though the cache property is set to false and send a message on the monitor topic hierarchy to inform the originating centre and its GISC. The Global Cache operator should work with the originating WIS2 Node and its GISC to remedy this issue. +* If core data are not cached on a Global Cache (that is, if the data are flagged as ``"cache": false`` or if the Global Cache decides not to cache these data), the Global Cache shall nevertheless republish the WIS2 Notification Message to the ``cache/a/wis2/...`` topic. In this case, the message id will be changed, and the rest of the message will not be modified. +* A Global Cache should operate with a fixed IP address so that WIS2 Nodes can permit access to download resources based on IP address filtering. A Global Cache should also operate with a publicly resolvable DNS name pointing to that IP address. The WMO Secretariat must be informed of the IP address and/or hostname, and any subsequent changes. +* A Global Cache should validate the integrity of the resources it caches and only accept data that match the integrity value from the WIS2 Notification Message. If the WIS2 Notification Message does not contain an integrity value, the Global Cache should accept the data as valid. In this case, the Global Cache may add an integrity value to the message it republishes. +* As a convention, the Global Cache centre-id will be ``tld-{centre-name}-global-cache``. ===== 2.7.4.2 Practices and procedures -* A Global Cache shall subscribe to the topics `+origin/a/wis2/#+`, `+cache/a/wis2/#+`. 
-* A Global Cache shall ignore all messages received on the topics ``++origin/a/wis2/+/data/recommended/#++`` and ``++cache/a/wis2/+/data/recommended/#++`` footnote:[It is also technically possible to filter recommended data by using a wildcard subscription such as ``++origin/a/wis2/+/data/core/#++``. However, avoiding wildcard subscription is generally considered good practice as it limits the burden of the broker operated by Global Brokers.] -* A Global Cache shall retain the data and metadata they receive for a minimum period of 24 hours. Requirements relating to varying retention times for different types of data may be added later. -* For messages received on topic ``++origin/a/+/data/core/#++`` or ``++cache/a/+/data/core/#++``, a Global Cache shall: -** If the message contains the property ``"properties.cache": false`` +* A Global Cache shall subscribe to the topics `+origin/a/wis2/#+` and `+cache/a/wis2/#+`. +* A Global Cache shall ignore all messages received on the topics ``++origin/a/wis2/+/data/recommended/#++`` and ``++cache/a/wis2/+/data/recommended/#++``footnote:[It is also technically possible to filter recommended data by using a wildcard subscription such as ``++origin/a/wis2/+/data/core/#++``. However, avoiding wildcard subscription is generally considered good practice as it limits the burden of the broker operated by Global Brokers.] +* A Global Cache shall retain the data and metadata it receives for a minimum of 24 hours. Requirements relating to varying retention times for different types of data may be added later. +* For messages received on the topic ``++origin/a/+/data/core/#++`` or ``++cache/a/+/data/core/#++``, a Global Cache shall: +** If the message contains the property ``"properties.cache": false``, *** Republish the message at topic ``cache/a/wis2/...`` matching ``+/a/wis2/...`` where the original message has been received after having updated the id of the message. ** Else -*** Maintain a list of data_ids already downloaded. 
-*** Verify if the message points to new or updated data by comparing the pubtime value of the notification message with the list of data_ids. -*** If the message is new or updated -**** Download only new or updated data from the ``href`` or extract the data from the message content. -**** If the message contains an integrity value for the data, verify the integrity of the data. -**** If data is downloaded successfully, move the data to the HTTP endpoint of the Global Cache. -**** Wait until the data becomes available at the endpoint. -**** Modify the message identifier and the canonical link's ``href`` of the received message. Leave all other fields untouched. -**** Republish the modified message to topic ``cache/a/wis2/...`` matching the ``+/a/wis2/...`` where the original message has been received. -**** The metric ``wmo_wis2_gc_downloaded_total`` will be increased by 1. The metric ``wmo_wis2_gc_dataserver_last_download_timestamp_seconds`` will be updated with the timestamp (in seconds) of the last successful download from the WIS2 Node or Global Cache. 
+*** Maintain a list of data_ids that have already been downloaded; +*** Verify whether the message points to new or updated data by comparing the pubtime value of the notification message with the list of data_ids; +*** If the message is new or updated: +**** Download only new or updated data from the ``href`` or extract the data from the message content; +**** If the message contains an integrity value for the data, verify the integrity of the data; +**** If the data are downloaded successfully, move the data to the HTTP endpoint of the Global Cache; +**** Wait until the data become available at the endpoint; +**** Modify the message identifier and the canonical link's ``href`` of the received message and leave all other fields untouched; +**** Republish the modified message to topic ``cache/a/wis2/...``, matching the ``+/a/wis2/...`` where the original message has been received; +**** The metric ``wmo_wis2_gc_downloaded_total`` will be increased by 1; the metric ``wmo_wis2_gc_dataserver_last_download_timestamp_seconds`` will be updated with the timestamp (in seconds) of the last successful download from the WIS2 Node or Global Cache; *** Else -**** Drop the messages for data already present on the Global Cache. +**** Drop the messages for data already present in the Global Cache. -* If the Global Cache is not able to download the data the metric ``wmo_wis2_gc_downloaded_error_total`` will be increased by 1. -* A Global Cache shall provide the metric defined in this Guide at an HTTP endpoint -* A Global Cache should make sure that data is downloaded in parallel and downloads are not blocking each other +* If the Global Cache is not able to download the data, the metric ``wmo_wis2_gc_downloaded_error_total`` will be increased by 1. +* A Global Cache shall provide the metric defined in this Guide at an HTTP endpoint. +* A Global Cache should make sure that data are downloaded in parallel and that downloads do not block each other.
* The metric ``wmo_wis2_gc_dataserver_status_flag`` will reflect the status of the connection to the download endpoint of the centre. Its value will be 1 when the endpoint is up and 0 otherwise. * The metric ``wmo_wis2_gc_last_metadata`` will reflect the datetime (in RFC3339 format) of the last metadata resource processed by a given centre. @@ -160,60 +156,59 @@ In WIS2 Global Caches provide access to WMO core data for data consumers. This a ===== 2.7.5.1 Technical considerations * The Global Discovery Catalogue provides data consumers with a mechanism to discover and search for datasets of interest, as well as how to interact with and find out more information about those datasets. -* The Global Discovery Catalogue implements the OGC API – Records – Part 1: Core standardfootnote:[OGC-API Records - Part 1 https://docs.ogc.org/DRAFTS/20-004.html], adhering to the following conformance classes and their dependencies: -** Searchable Catalog (Deployment) -** Searchable Catalog - Sorting (Deployment) -** Searchable Catalog - Filtering (Deployment) -** JSON (Building Block) -** HTML (Building Block) -* The Global Discovery Catalogue will make discovery metadata available via the collection identifier of `wis2-discovery-metadata`. +* The Global Discovery Catalogue implements the OGC API – Records – Part 1: Core standard,footnote:[OGC-API Records - Part 1 https://docs.ogc.org/DRAFTS/20-004.html] adhering to the following conformance classes and their dependencies: +** Searchable Catalog (Deployment); +** Searchable Catalog - Sorting (Deployment); +** Searchable Catalog - Filtering (Deployment); +** JSON (Building Block); +** HTML (Building Block). +* The Global Discovery Catalogue will make discovery metadata available via the collection identifier `wis2-discovery-metadata`. * The Global Discovery Catalogue advertises the availability of datasets and how to access them or subscribe to updates. 
* The Global Discovery Catalogue does not advertise or list the availability of individual data objects that comprise a dataset (that is, data files). -* A single Global Discovery Catalogue instance is sufficient for WIS2. -* Multiple Global Discovery Catalogue instances may be deployed for resilience. -* Global Discovery Catalogue instances operate independently of each other; each Global Discovery Catalogue instance will hold all discovery metadata records. Global Discovery Catalogues do not need to synchronize between themselves. -* A Global Discovery Catalogue is populated with discovery metadata records from a Global Cache instance, receiving messages about the availability of discovery metadata records via a Global Broker. +* A single Global Discovery Catalogue is sufficient for WIS2. +* Multiple Global Discovery Catalogues may be deployed for resilience. +* Global Discovery Catalogues operate independently of each other; each Global Discovery Catalogue holds all discovery metadata records. Global Discovery Catalogues do not need to synchronize with each other. +* A Global Discovery Catalogue is populated with discovery metadata records from a Global Cache and receives messages about the availability of discovery metadata records via a Global Broker. ** The subscription topic shall be ``++cache/a/wis2/+/metadata/#++``. -* A Global Discovery Catalogue should connect and subscribe to more than one Global Broker instance to ensure that no messages are lost in the event of a Global Broker failure. A Global Discovery Catalogue instance will discard duplicate messages as needed. 
-* A Global Discovery Catalogue will validate that a discovery metadata record identifier's `centre-id` token (see _Manual on WIS_, Volume II -Appendix F: WMO Core Metadata Profile) matches against the `centre-id` level of the topic from which it was published (see _Manual on WIS_, Volume II -Appendix D: WIS2 Topic Hierarchy), to ensure that discovery metadata is published by the authoritative organization. -* A Global Discovery Catalogue will validate discovery metadata records against the WMO Core Metadata Profile (WCMP2). Valid WCMP2 records will be ingested into the catalogue. Invalid or malformed records will be discarded and reported to the Global Monitor against the centre identifier associated with the discovery metadata record. -* A Global Discovery Catalogue will only update discovery metadata records to replace links for dataset subscription and notification (origin) with their equivalent links for subscription at Global Broker instances (cache). +* A Global Discovery Catalogue should connect to and subscribe to more than one Global Broker to ensure that no messages are lost in the event of a Global Broker failure. A Global Discovery Catalogue will discard duplicate messages as needed. +* A Global Discovery Catalogue will verify that a discovery metadata record identifier’s `centre-id` token (see _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2)) matches the `centre-id` level of the topic from which it was published (see _Manual on WIS_, Volume II – Appendix D. WIS2 Topic Hierarchy) to ensure that discovery metadata are published by the authoritative organization. +* A Global Discovery Catalogue will validate discovery metadata records against WCMP2. Valid WCMP2 records will be ingested into the catalogue. Invalid or malformed records will be discarded and reported to the Global Monitor against the centre-id associated with the discovery metadata record.
+* A Global Discovery Catalogue will only update discovery metadata records to replace links for dataset subscription and notification (origin) with their equivalent links for subscription at Global Brokers (cache). * A Global Discovery Catalogue will periodically assess discovery metadata provided by NCs and DCPCs against a set of key performance indicators (KPIs) in support of continuous improvement. Suggestions for improvement will be reported to the Global Monitor against the centre identifier associated with the discovery metadata record. -* A Global Discovery Catalogue will remove discovery metadata that is marked for deletion as specified in the data notification message. +* A Global Discovery Catalogue will remove discovery metadata that are marked for deletion as specified in the data notification message. * A Global Discovery Catalogue should apply faceting capability as specified in the cataloguing considerations of the WCMP2 specification, as defined in OGC API - Records. -* A Global Discovery Catalogue will provide human-readable web pages with embedded markup using the schema.org vocabulary, thereby enabling search engines to crawl and index the content of the Global Discovery Catalogue. Consequently, data consumers should also be able to discover WIS content via third party search engines. -* A Global Discovery Catalogue will generate and store a zip file of all WCMP2 records once a day, that will be made be accessible via HTTP. +* A Global Discovery Catalogue will provide human-readable web pages with embedded markup using the schema.org vocabulary, thereby enabling search engines to crawl and index its content. Consequently, data consumers should be able to discover WIS content via third-party search engines. +* A Global Discovery Catalogue will generate and store a zip file of all WCMP2 records once a day; this file will be made accessible via HTTP.
* A Global Discovery Catalogue will publish a WIS2 Notification Message of its zip file of all WCMP2 records on its centre-id's +metadata+ topic (for example, `origin/a/wis2/centre-id/metadata`, where `centre-id` is the centre identifier of the Global Discovery Catalogue). * A Global Discovery Catalogue may initialize itself (cold start) from a zip file of all WCMP2 records published. -* As a convention Global Discovery Catalogue centre-id will be ``tld-{centre-name}-global-discovery-catalogue``. +* As a convention, a Global Discovery Catalogue's centre-id will be ``tld-{centre-name}-global-discovery-catalogue``. ===== 2.7.5.2 Global Discovery Catalogue reference implementation: wis2-gdc -To provide a Global Discovery Catalogue, Members may use whichever software components they consider most appropriate to comply with WIS2 technical regulations. +To provide a Global Discovery Catalogue, Members may use whichever software components they consider most appropriate to comply with the WIS2 technical regulations. -To assist Members' participation in WIS2, a free and open-source Global Discovery Catalogue reference implementation is made available for download and use. wis2-gdc builds on mature and robust free and open-source software components that are widely adopted for operational use. +To assist Members in participating in WIS2, a free and open-source Global Discovery Catalogue reference implementation, wis2-gdc, is available for download and use. wis2-gdc builds on mature and robust free and open-source software components that are widely adopted for operational use. 
wis2-gdc provides the functionality required for the Global Discovery Catalogue, providing the following technical functions: -* Discovery metadata subscription and publication from the Global Broker -* Discovery metadata download from the Global Cache -* Discovery metadata validation, ingest and publication -* WCMP2 compliance -* Quality assessment (KPIs) -* OGC API - Records - Part 1: Core compliance -* Metrics reporting -* Implementation of metrics +* Discovery metadata subscription and publication from the Global Broker; +* Discovery metadata download from the Global Cache; +* Discovery metadata validation, ingest and publication; +* WCMP2 compliance; +* Quality assessment (KPIs); +* OGC API - Records - Part 1: Core compliance; +* Metrics reporting; +* Implementation of metrics. -wis2-gdc is managed as a free and open source project. Source code, issue tracking and discussions are hosted in the open on GitHub.footnote:[https://github.com/wmo-im/wis2-gdc] +wis2-gdc is managed as a free and open-source project. Source code, issue tracking and discussions are hosted in the open on GitHub.footnote:[See https://github.com/wmo-im/wis2-gdc.] ==== 2.7.6 Global Monitor ===== 2.7.6.1 Technical considerations * WIS standardizes how system performance and data availability metrics are published from WIS2 Nodes and Global Services. -* For each type of Global Service, a set of standard metrics has been defined. Global Services will implement those metrics and provide an endpoint for those metrics to be scraped by the Global Monitor +* For each type of Global Service, a set of standard metrics has been defined. Global Services will implement those metrics and provide an endpoint from which the Global Monitor can scrape them. * The Global Monitor will collect metrics as defined in the OpenMetrics standard. -* The Global Monitor will monitor the 'health' (namely, performance) of components at NC/DCPC as well as Global Service instances.
-* The Global Monitor will provide a web-based ‘dashboard’ that displays the WIS2 system performance and data availability. -* As a convention Global Monitor centre-id will be ``tld-{centre-name}-global-monitor``. - - The main task of the Global Monitor is to regularly query the provided metrics from the relevant WIS2 entities, aggregate and process the data and then provide the results to the end user in a suitable presentation. +* The Global Monitor will monitor the "health" (that is, the performance) of components at NCs/DCPCs, as well as Global Services. +* The Global Monitor will provide a web-based dashboard that displays the WIS2 system performance and data availability. +* As a convention, the Global Monitor centre-id will be ``tld-{centre-name}-global-monitor``. +* The main task of the Global Monitor will be to regularly query the metrics provided by the relevant WIS2 entities, aggregate and process the data and then provide the results to the end user in an appropriate format. From d1fc59937a9b87fffce8f3a41df5b23cc27ef992 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Tue, 8 Oct 2024 17:41:35 +0200 Subject: [PATCH 06/20] update part 2 operations --- guide/sections/part2/operations.adoc | 162 +++++++++++++-------------- 1 file changed, 79 insertions(+), 83 deletions(-) diff --git a/guide/sections/part2/operations.adoc b/guide/sections/part2/operations.adoc index 2f4ad67..91a7d00 100644 --- a/guide/sections/part2/operations.adoc +++ b/guide/sections/part2/operations.adoc @@ -15,31 +15,31 @@ Meteorological data is an essential input for public weather services and aviation services alike. WIS2 provides the mechanism for data exchange in WMO, while SWIM is the ICAO initiative to harmonize the provision of aeronautical, meteorological and flight information to support air traffic management (ATM). -Both WIS2 and SWIM support similar outcomes relating to data -exchange. 
However, there are differences in both approach and +Both WIS2 and SWIM support similar outcomes regarding data +exchange. However, there are differences with respect to both approach and implementation. -Specifications for WIS2 are defined in the _Manual on WIS_, Volume II, and further elaborated in this Guide. Specifications for SWIM will be defined in the Procedures for Air Navigation Services –Information Management (PANS-IM) (ICAO Doc. 10199)footnote:[The PANS-IM is expected to available on ICAO NET by July 2024 and become applicable in November 2024. Information provided in herein is based on best understanding of draft proposals from ICAO.]. +Specifications for WIS2 are defined in the _Manual on WIS_, Volume II, and further elaborated in this Guide. Specifications for SWIM will be defined in the Procedures for Air Navigation Services – Information Management (PANS-IM) (ICAO Doc. 10199).footnote:[PANS-IM is expected to be available on ICAO NET by July 2024 and to become applicable in November 2024. The information provided herein is based on draft proposals from ICAO.] During the WIS2 transition phase (2025-2033), meteorological data published -via WIS2 will automatically be published to the GTS via the WIS2-to-GTS gateways. +via WIS2 will automatically be published to GTS via the WIS2-to-GTS gateways. |=== |*WIS2* |*SWIM* -|Earth-system scope: weather, climate, hydrology, atmospheric -composition, cryosphere, ocean and space weather data |ATM scope: aeronautical, meteorological and flight information +|Earth system scope: Weather, climate, hydrology, atmospheric +composition, cryosphere, ocean and space weather data |ATM scope: Aeronautical, meteorological and flight information -|Data centric - a consumer discovers data and then determines the -services through which that data may be accessed |Service centric - a +|Data centric: A consumer discovers data and then determines the +services through which those data may be accessed.
|Service centric: A consumer discovers a service (or service provider) and determines what -resources (that is, information) is available therein +resources (that is, information) are available therein. |Technical protocols: MQTT, HTTP |Technical protocols: -AMQPfootnote:[AMQP 1.0 is one of the protocols proposed in the draft PANS-IM] +AMQPfootnote:[AMQP 1.0 is one of the protocols proposed in the draft PANS-IM.] |=== An organization (for example, the National Meteorological Service) that is -responsible for providing meteorological data to WIS2 may be designated by the ICAO Contracting State as a responsible entity to provide aeronautical meteorological information into SWIM. +responsible for providing meteorological data to WIS2 may be designated by the ICAO Contracting State as a responsible entity for providing aeronautical meteorological information into SWIM. Where requirements dictate, the organization may provide regional capability on behalf of a group of countries or territories. @@ -48,29 +48,29 @@ interoperability approach between WIS2 and SWIM where meteorological data published via WIS2 can be automatically propagated to SWIM. This Guide covers only how data from WIS2 can be published into SWIM. -Consumption of information from SWIM services is not in scope. +It does not address the consumption of information from SWIM services. -This Guide also does not cover implementation details of the SWIM -service - including, but not limited to: +It also does not cover details regarding the implementation of the SWIM +services - including, but not limited to: -* Mechanisms used by SWIM to discover service providers and services. -* Specification of the SWIM data message. -* AMQP message broker configuration. -* Operation, logging and monitoring. 
+* Mechanisms used by SWIM to discover service providers and services; +* Specifications of SWIM data messages; +* AMQP Message Broker configuration; +* Operation, logging and monitoring; * Cybersecurity considerations for the provision of SWIM services. This Guide will be updated as more information is made available from -ICAO and/or recommended practices are updated. +ICAO and/or as recommended practices are updated. -Finally, it should be noted that the provision of aeronautical meteorological information and its exchange via the ICAO -Aeronautical Fixed Service (AFS) is also out of scope as they are solely defined under the ICAO regulatory framework. +Finally, it should also be noted that the provision of aeronautical meteorological information and its exchange via the ICAO +Aeronautical Fixed Service (AFS) are defined solely under the ICAO regulatory framework and are therefore beyond the scope of this Guide. ====== 2.8.1.1.1 WIS2 to SWIM gateway -The WIS2 to SWIM interoperability approach employs a gateway component (as per the figure below): +The WIS2 to SWIM interoperability approach employs a gateway component (as per Figure 2): .Schematic of an interoperability approach -image::images/wis2-to-swim-temp.png[Schematic of interoperability approach] +image::images/wis2-to-swim-temp.png[Figure 2. Schematic of interoperability approach] The gateway can operate as an "adapter" between WIS2 and SWIM by pulling the requisite meteorological data from WIS2 and re-publishing it @@ -79,34 +79,34 @@ to SWIM. ====== 2.8.1.1.2 Data types and format Specifications for aeronautical meteorological information are provided in ICAO -Annex 3 and other relevant guidance materials. The ICAO Meteorological Information Exchange Model (IWXXM) format (FM 205)footnote:[IWXXM (FM205) is defined in the _Manual on Codes_ (WMO-No. 306), Volume I.3 – International Codes] is to be used for encoding aeronautical meteorological information in SWIM. 
+Annex 3 and other relevant guidance materials. The ICAO Meteorological Information Exchange Model (IWXXM) format (FM 205)footnote:[IWXXM (FM 205) is defined in the _Manual on Codes_ (WMO-No. 306), Volume I.3.] is to be used for encoding aeronautical meteorological information in SWIM. ====== 2.8.1.1.3 Publishing meteorological data via WIS2 For meteorological data to be published from WIS2 to SWIM, the organization -responsible for this provision will need to operate a WIS2 Node and -comply with the pertinent technical regulations as specified in the +responsible for providing the data needs to operate a WIS2 Node and +comply with the pertinent technical regulations specified in the _Manual on WIS_, Volume II. Onward distribution of the data by the Message Broker over SWIM can be handled by the respective Information Service Provider in accordance with ICAO Standards and Recommended Practices (SARPs). -The responsible organization should consider whether this +The responsible organization should consider whether these data should be published via an existing WIS2 Node, or whether a separate WIS2 Node should be established. For example, the data may be provided by a separate operational unit, or there may be a requirement to easily -distinguish between data for SWIM and any other meteorological data. +distinguish between data for SWIM and other meteorological data. -Where a new WIS2 Node is needed, the responsible organization must -establish a new WIS2 Node and register it with the WMO Secretariat. For more information, see <<_2_6_implementation_and_operation_of_a_wis2_node>>. +If a new WIS2 Node is needed, the responsible organization must +establish one and register it with the WMO Secretariat. For more information, see <<_2_6_implementation_and_operation_of_a_wis2_node>>. Datasets are a central concept in WIS2. Where meteorological data is published via WIS2, it will be packaged into -“datasets”. 
The data should be grouped at the country/territory -level; (for instance, datasets should be published for a given country/territory), one for each datatype (for example, -aerodrome observation, aerodrome forecast and quantitative volcanic ash -concentration information). +datasets. The data should be grouped at the country/territory +level (for instance, datasets should be published for a given country/territory), one for each datatype (for example, +aerodrome observation, aerodrome forecast, quantitative volcanic ash +concentration, and so forth). -For the purposes of publishing through WIS2, datasets containing aeronautical meteorological information should be considered as "recommended data", as +For the purposes of publishing through WIS2, datasets containing aeronautical meteorological information should be considered as recommended data, as described in Resolution 1 (Cg-Ext(2021)). The recommended data category of the policy is intended to cover data that should be exchanged by Members to support Earth system monitoring @@ -114,24 +114,24 @@ and prediction efforts. Recommended data: -* May be subject to conditions on use and reuse. +* May be subject to conditions on use and reuse; * May have access controlsfootnote:[WIS2 follows the recommendations from OpenAPI regarding choice of security schemes for authenticated access - a choice of HTTP authentication, API keys, OAuth2 or OpenID Connect Discovery. For more information see -OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object]footnote:[WIS2 does not provide any guidance on use of Public Key Infrastructure (PKI).] applied at the WIS2 Node. -* Are not cached within WIS2 by the Global Cachesfootnote:[Global +OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object], footnote:[WIS2 does not provide any guidance on use of Public Key Infrastructure (PKI).]
applied at the WIS2 Node; +* Are not cached within WIS2 by the Global Caches.footnote:[Global Caches enable highly available, low-latency distribution of core data. Given that core data is provided on a free and unrestricted basis, -Global Caches do not implement any data access control.]. +Global Caches do not implement any data access control.] Resolution 1 (Cg-Ext(2021)) requires transparency on the conditions of use for recommended data. Conditions regarding the use of aeronautical meteorological information are specified in ICAO Annex 3 and, optionally, by the ICAO Contracting State. Such conditions of use should be explicitly stated in the discovery metadata for each dataset as described below. * The attribute ``wmo:dataPolicy`` should be set to ``recommended``. -* Information about conditions of use should be specified using a ``rights`` property (see example below) and/or a ``link`` object with a relation ``license``. +* Information about conditions of use should be specified using the ``rights`` property (see the example below) and/or a ``link`` object with a relation ``license``. * Information about access control should be specified using a ``security`` object in the ``link`` object describing the data access details. -.Example expression of conditions relating to the use of aeronautical meteorological information: +.The following is an example expression of conditions relating to the use of aeronautical meteorological information: [source,json] ---- "properties": { @@ -141,52 +141,51 @@ Resolution 1 (Cg-Ext(2021)) requires transparency on the conditions of use for r } ---- -For more information on the WMO Core Metadata Profile version 2, see the -_Manual on WIS_, Volume II, Appendix F. +For more information on the WMO Core Metadata Profile, see the +_Manual on WIS_, Volume II, Appendix F. WMO Core Metadata Profile (Version 2). On receipt of new data, the WIS2 Node will: -1. Publish the data as a resource via a Web server (or Web service). -2. 
Publish a WIS2 Notification Message to a local message broker that -advertises the availability of the data resource. +1. Publish the data as a resource via a web server (or web service); +2. Publish a WIS2 Notification Message advertising the availability of the data resource to a local Message Broker. -Note that, in contrast to the GTS, WIS2 publishes data resources +Note that, in contrast to GTS, WIS2 publishes data resources individually, each with an associated notification message. WIS2 does not group data resources into bulletins. The data resource is identified using a URL. The notification message -refers to the data resource using this URLfootnote:[Where the data +refers to the data resource using this URL.footnote:[Where the data resource does not exceed 4 Kb, it may additionally be embedded in the -notification message.]. +notification message.] -For more details on the WIS2 Notification Message, see the _Manual on WIS_, Volume II, Appendix E: WIS2 Notification Message. +For more details on the WIS2 Notification Message, see the _Manual on WIS_, Volume II, Appendix E. WIS2 Notification Message. The notification message must be published to the proper topic on the Message Broker. WIS2 defines a standard topic hierarchy to ensure -that data is published consistently by all WIS2 Nodes. Notification +that data are published consistently by all WIS2 Nodes. Notification messages for aviation data should be published on a specific topic allowing a data consumer, such as the gateway, to subscribe only to aviation-specific notifications. 
See the example below: -.Example topic used to publish notifications about Quantitative Volcanic Ash Concentration Information +.Example topic used to publish notifications about quantitative volcanic ash concentration information: [source,text] ---- origin/a/wis2/{centre-id}/data/recommended/weather/aviation/qvaci ---- -For more details of the WIS Topic Hierarchy, see the _Manual on WIS_, Volume II, Appendix D: WIS2 Topic Hierarchy. +For more details on the WIS Topic Hierarchy, see the _Manual on WIS_, Volume II, Appendix D. WIS2 Topic Hierarchy. -WIS Global Brokers subscribe to the local message brokers of WIS2 Nodes +WIS Global Brokers subscribe to the local Message Brokers of WIS2 Nodes and republish notification messages for global distribution. -As a minimum, the WIS2 Node should retain the aviation data for a +At a minimum, the WIS2 Node should retain the aviation data for a duration that meets the needs of the gateway. A retention period of at least 24 hours is recommended. ====== 2.8.1.1.4 Gateway implementation The potential interactions between the gateway component, WIS2 and SWIM are -illustrated in the figure belowfootnote:[Note that the figure simplifies +illustrated in Figure 3.footnote:[Note that the figure simplifies the transmission of discovery metadata from WIS2 Node to the Global Discovery Catalogue. The WIS2 Node publishes notification messages advertising the availability of new discovery metadata resource @@ -196,13 +195,13 @@ the discovery metadata from the WIS2 Node using the URL supplied in the message.] .Interactions between the gateway and components of WIS2 and SWIM -image::images/wis2-to-swim-interaction-temp.png[Interactions between the gateway and components of WIS2 and SWIM] +image::images/wis2-to-swim-interaction-temp.png[Figure 3. 
Interactions between the gateway and components of WIS2 and SWIM] **Configuration** Dataset discovery metadata will provide useful information that can be used to configure the gateway, for example, the -topic(s) to subscribe to plus various other information that may be +topic(s) to subscribe to plus additional information that may be needed for the SWIM service. Discovery metadata can be downloaded from the Global Discovery Catalogue. @@ -212,20 +211,19 @@ Discovery metadata can be downloaded from the Global Discovery Catalogue. The gateway component implements the following functions: * Subscribe to the pertinent topic(s) for notifications about new -aeronautical meteorological informationfootnote:[WIS2 recommends that one subscribes to +aeronautical meteorological information;footnote:[WIS2 recommends that one subscribes to notifications from a Global Broker. However, where both gateway and WIS2 Node are operated by the same organization, it may be advantageous to subscribe directly to the local message broker of WIS2 Node, for example, to -reduce latency.]. +reduce latency.] * On receipt of notification messages about newly available data: ** Parse the notification message, discarding duplicate messages already -processed previously; +processed; ** Download the data resource from the WIS2 -Nodefootnote:[The WIS2 Node may control access to data - the gateway will -need to be implemented accordingly.] using the URL in the message - the +Nodefootnote:[The WIS2 Node may control access to data. 
If this is the case, the gateway component will need to be implemented accordingly.] using the URL in the message - the resource should be in IWXXM format; -** Create a new "data message" as per the SWIM specifications, including -the unique identifier extracted from the data resourcefootnote:[In case +** Create a new data message as per the SWIM specifications, including +the unique identifier extracted from the data resource,footnote:[In case a unique identifier is required for proper passing of an aviation weather message to the gateway, one can use the GTS abbreviated heading (TTAAii CCCC YYGGgg) in the COLLECT envelop (available in IWXXM messages @@ -233,20 +231,19 @@ having a corresponding TAC message), or content in attribute ``gml:identifier`` (available in newer IWXXM messages like WAFS SIGWX Forecast and QVACI), for such purpose. There is currently no agreed definition for unique identifier of IWXXM METAR and TAF reports of -individual aerodrome.], and embedding the aviation weather data resource +individual aerodrome.] and embed the aviation weather data resource within the data message; ** Publish the data message to the appropriate topic on the SWIM Message Broker component of the SWIM service. The choice of protocol for publishing to the SWIM Message Broker should -be based on bilateral agreement between operators of the gateway and -SWIM service. +be based on a bilateral agreement between operators of the gateway and +the SWIM service. The gateway should implement logging and error handling as necessary to enable reliable operations. WIS2 uses the OpenMetrics standardfootnote:[OpenMetrics: -https://openmetrics.io] for -publishing metrics and other operating information. Use of OpenMetrics +https://openmetrics.io] to publish metrics and other operating information. The use of OpenMetrics by the gateway would enable monitoring and performance reporting to be easily integrated into the WIS2 system.
@@ -257,9 +254,9 @@ the organizational governance in place. ====== 2.8.1.1.5 SWIM service -The SWIM aviation weather information service may comprise of a Message Broker -component which implements the AMQP 1.0 messaging standardfootnote:[AMQP -1.0: https://www.amqp.org/resources/specifications]. +The SWIM aviation weather information service may include a Message Broker +component which implements the Advanced Message Queuing Protocol (AMQP) 1.0 messaging standard.footnote:[See AMQP +1.0: https://www.amqp.org/resources/specifications.] The Message Broker publishes the data messages provided by the gateway. @@ -270,7 +267,7 @@ authorized sources such as a gateway and should validate incoming messages as ae The Ocean Data and Information System (ODIS) is a federation of independent data systems coordinated by the International Oceanographic -Data and Information Exchange (IODE) of IOC-UNESCO. This federation +Data and Information Exchange (IODE) of the Intergovernmental Oceanographic Commission of the United Nations Educational, Scientific and Cultural Organization (IOC-UNESCO). This federation includes continental-scale data systems as well as those of small organizations. ODIS partners use web architectural approaches to share metadata describing their holdings, services, and other capacities. In @@ -283,7 +280,7 @@ data. IODE harvests all metadata shared by ODIS partners, combines it as a knowledge graph, and processes this to export derivative products (for example, diagnostic reports and cloud-optimized data products). The Ocean InfoHub (OIH) system is IODE's reference implementation of a -discovery system leveraging ODIS. The ODIS architecture and tools are +discovery system leveraging ODIS. ODIS architecture and tools are free and open-source software (FOSS), with regular releases published for the community. @@ -292,27 +289,26 @@ federations, such as WIS2, to define sustainable data and metadata exchanges and - where needed - translators or converters.
The resources needed to convert between such systems are developed in the open and in close collaboration with staff from those systems. These exchanges include -extract transform load (ETL) functions, to ensure that the bilateral exchange is mutually beneficial. +the extract, transform and load (ETL) functions to ensure that the bilateral exchange is mutually beneficial. ====== 2.8.1.2.1 Cross system interoperability Given the strong support for standards and interoperability by both WIS2 -and ODIS, data and metadata exchange is realized using web architectural -principles and approaches. The ability to discover ODIS data on WIS2 (as well -as the inverse) is a goal in extending the reach of both systems and data +and ODIS, data and metadata exchange is carried out using web architecture +principles and approaches. The ability to discover ODIS data on WIS2 (and the reverse) is a goal in extending the reach of both systems and data beyond their primary communities of interest. -The WIS2 Global Discovery Catalogue will provide discovery metadata records +WIS2 Global Discovery Catalogues will provide discovery metadata records using the OGC API - Records standard. This will include schema.org and JSON-LD annotations on WCMP2 discovery metadata in the GDC, to enable cross-pollination and federation. ODIS dataset records will be made available using the WCMP2 standard and provided -as objects available via HTTP for ingest, validation and publication to the GDC as a +as objects available via HTTP for ingest, validation and publication to the Global Discovery Catalogues as a federated catalogue. ODIS data will be published as recommended data as per the Unified Data Policy (Resolution 1 (Cg-Ext(2021))). - +(See Figure 4) .WIS2 and ODIS metadata and catalogue interoperability -image::images/wis2-odis-metadata-discovery-interop.png[WIS2 and ODIS metadata and catalogue interoperability] +image::images/wis2-odis-metadata-discovery-interop.png[Figure 4.
WIS2 and ODIS metadata and catalogue interoperability] As a result, federated discovery will be realized between both systems, allowing for use and reuse of data in an authoritative manner, closest to the source of the data. From f2527aedbabed5675178d56da701016e848f0ee9 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Wed, 9 Oct 2024 13:06:18 +0200 Subject: [PATCH 07/20] update intros --- guide/sections/introduction.adoc | 2 +- guide/sections/part1/introduction.adoc | 46 +++++++++++++------------- 2 files changed, 24 insertions(+), 24 deletions(-) diff --git a/guide/sections/introduction.adoc b/guide/sections/introduction.adoc index df9f107..a72fc38 100644 --- a/guide/sections/introduction.adoc +++ b/guide/sections/introduction.adoc @@ -1,4 +1,4 @@ == Introduction === Purpose -In conjunction with the https://library.wmo.int/idurl/4/68731[_Manual on the WMO Information System_] (WMO-No. 1060), Volume II – WMO Information System 2.0 (_Manual on WIS_, Volume II), the present _Guide to the WMO Information System_ (WMO-No. 1061), Volume II – WMO Information System 2.0 _(Guide to WIS_, Volume II) is designed to ensure adequate uniformity and standardization in the data, information and communication practices, procedures and specifications employed by WMO Members in the operation of the WMO Information System WIS 2.0 as it supports the mission of the Organization. The _Manual on WIS_, Volume II contains standard and recommended practices, procedures and specifications. The _Guide to WIS_, Volume II contains additional information concerning practices, procedures and specifications that Members are invited to follow or implement in establishing and conducting their arrangements in compliance with the WMO technical regulations and in developing meteorological and hydrological services. \ No newline at end of file +In conjunction with the https://library.wmo.int/idurl/4/68731[_Manual on the WMO Information System_] (WMO-No. 
1060), Volume II – WMO Information System 2.0 (_Manual on WIS_, Volume II), the present _Guide to the WMO Information System_ (WMO-No. 1061), Volume II – WMO Information System 2.0 _(Guide to WIS_, Volume II) is designed to ensure adequate uniformity and standardization in the data, information and communication practices, procedures and specifications employed by WMO Members in the operation of the WMO Information System 2.0 (WIS2) as it supports the mission of the Organization. The _Manual on WIS_, Volume II contains standard and recommended practices, procedures and specifications. The _Guide to WIS_, Volume II contains additional information concerning practices, procedures and specifications that Members are invited to follow or implement in establishing and conducting their arrangements in compliance with the WMO technical regulations and in developing meteorological and hydrological services. \ No newline at end of file diff --git a/guide/sections/part1/introduction.adoc b/guide/sections/part1/introduction.adoc index 30a2794..bdd32fe 100644 --- a/guide/sections/part1/introduction.adoc +++ b/guide/sections/part1/introduction.adoc @@ -2,31 +2,31 @@ Since the Global Telecommunication System (GTS) entered operational life in 1971, it has been a reliable real-time exchange mechanism of essential data for WMO Members. -In 2007, the WMO Information System (WIS) entered operations to complement the GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval of data. The success of WIS was limited as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to deliver an increasing number and volume of real-time data to WMO centres in a reliable and cost -effective way. 
+In 2007, the WMO Information System (WIS) entered operation to complement the GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval of data. The success of WIS was limited, as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to deliver an increasing number and volume of real-time data to WMO centres in a reliable and cost-effective way. WIS2 has been designed to meet the shortfalls of the current WIS and GTS, support Resolution 1 (Cg-Ext(2021)) – WMO Unified Policy for the International Exchange of Earth System Data (https://library.wmo.int/idurl/4/57850[_World Meteorological Congress: Abridged Final Report of the Extraordinary Session_] (WMO-No. 1281)), support the Global Basic Observing Network (GBON) and meet the demand for high data volume, variety, velocity and veracity. -WIS2 technical framework is based around three foundational pillars: leveraging open standards, simpler data exchange and cloud-ready solutions. +The WIS2 technical framework is based around three foundational pillars: leveraging open standards, simpler data exchange and cloud-ready solutions. ==== 1.1.1 Leveraging open standards -WIS2 leverages open standards to take advantage of the ecosystem of technologies available on the market,thereby avoiding the need to build bespoke solutions that can force National Meteorological and Hydrological Services (NMHSs) to procure costly systems and equipment. In today’s standards development ecosystem, standards bodies work together closely to minimize overlap and build on one another’s areas of expertise. For example, the World Wide Web Consortium provides the framework of web standards, which the Open Geospatial Consortium (OGC) and other standards bodies leverage.
WIS2 leverages open standards with industry adoption and wider, stable and robust implementations, thus extending the reach of WMO data sharing and lowering the barrier to access by Members. +WIS2 leverages open standards to take advantage of the ecosystem of technologies available on the market, thereby avoiding the need to build bespoke solutions that can force National Meteorological and Hydrological Services (NMHSs) to procure costly systems and equipment. In today’s standards development ecosystem, standards bodies work together closely to minimize overlap and build on one another’s areas of expertise. For example, the World Wide Web Consortium provides the framework of web standards, which the Open Geospatial Consortium (OGC) and other standards bodies leverage. WIS2 leverages open standards with industry adoption and wider, stable and robust implementations, thus extending the reach of WMO data sharing and lowering the barrier to access by Members. ==== 1.1.2 Simpler data exchange -WIS2 prioritizes public telecommunication networks, rather than private networks for GTS links. As a result, using the Internet will enable the best choice for a local connection, using commonly available and well-understood technology. +WIS2 prioritizes public telecommunication networks over the private networks used for GTS links. The Internet is therefore the best choice for a local connection, as it uses commonly available and well-understood technology. WIS2 aims to improve the discovery, access and utilization of weather, climate and water data by adopting web technologies proven to provide a truly collaborative platform for a more participatory approach. Data exchange using the Web also facilitates easy access mechanisms. Browsers and search engines allow web users to discover data without the need for specialized software.
The Web also enables additional data access platforms, such as desktop geographical information systems (GIS), mobile applications and forecaster workstations. The Web provides access control and security mechanisms that can be utilized to freely share core data as defined by Resolution 1 (Cg-Ext(2021)) and to protect the data with more restrictive licensing constraints. Web technologies also allow for authentication and authorization to enable the provider to retain control of who can access published resources and to request users to accept a license specifying the terms and conditions for using the data as a condition of being granted access. -WIS2 uses a "publish-subscribe" pattern by which users subscribe to a topic to receive new data in real time. The mechanism is similar to WhatsApp and other messaging applications. It is a reliable and straightforward way to allow the users to choose their data of interest and to receive them reliably. +WIS2 uses a "publish-subscribe" pattern by which users subscribe to a topic to receive new data in real time. The mechanism is similar to WhatsApp and other messaging applications. It is a reliable and straightforward way to allow users to choose their data of interest and to receive them reliably. ==== 1.1.3 Cloud-ready solutions -The cloud provides reliable platforms for data sharing and processing. It reduces the need for expensive local IT infrastructure, which constitutes a barrier to developing effective and reliable data processing workflows for some WMO Members. WIS2 encourages WMO centres to adopt cloud technologies where appropriate to meet users' needs. While WMO technical regulations will not mandate cloud services, WIS2 will promote the gradual adoption of cloud technologies that provide the most effective solution. +The cloud provides reliable platforms for data sharing and processing. 
It reduces the need for expensive local IT infrastructure, which constitutes a barrier to developing effective and reliable data processing workflows for some WMO Members. WIS2 encourages WMO centres to adopt cloud technologies, where appropriate, to meet users' needs. While WMO technical regulations will not mandate cloud services, WIS2 will promote the gradual adoption of cloud technologies that provide the most effective solution. -The cloud-based infrastructure allows for the easy portability of technical solutions, ensuring that a system implemented by a specific country or territory can be packaged and deployed easily in other countries/territories with similar needs. In addition, using cloud technologies allows WIS2 to deploy infrastructure and systems efficiently, while requiring minimal effort from the NMHSs by shipping ready-made services and implementing consistent data processing and exchange techniques. +The cloud-based infrastructure allows for the easy portability of technical solutions, ensuring that a system implemented by a specific country or territory can be packaged and deployed easily in other countries/territories with similar needs. In addition, using cloud technologies allows WIS2 to deploy infrastructure and systems efficiently, while requiring minimal effort from the NMHSs, by shipping ready-made services and implementing consistent data processing and exchange techniques. -It is importantn to note that hosting data and services on the cloud does not affect data ownership. Even in a cloud environment, organizations retain ownership of their data, software, configuration and change management as if they were hosting their infrastructure. As a result, data authority and provenance stay with the organization, and the cloud is simply a technical means to publish the data. +It is important to note that hosting data and services on the cloud does not affect data ownership. 
Even in a cloud environment, organizations retain ownership of their data, software, configuration and change management as if they were hosting their infrastructure. As a result, data authority and provenance stay with the organization, and the cloud is simply a technical means to publish the data. ==== 1.1.4 Why are datasets so important? @@ -34,35 +34,35 @@ WMO enables the international exchange of observations and model data for all Ea Resolution 1 (Cg-Ext(2021)) describes the Earth system data that are necessary for efforts to monitor, understand and predict the weather and climate – including the hydrological cycle, the atmospheric environment and space weather. -WIS is the mechanism by which these Earth system data are exchanged. +WIS2 is the mechanism by which these Earth system data are exchanged. -A common practice when working with data is to group them into "datasets". All the data in a dataset share some common characteristics. The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations" footnote:[See Data Catalog Vocabulary (DCAT) - Version 2, W3C Recommendation 04 February 2020 https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset]. +A common practice when working with data is to group them into datasets. All the data in a dataset share some common characteristics. The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations".footnote:[See _Data Catalog Vocabulary (DCAT) - Version 2, W3C Recommendation 04 February 2020_ https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset] Why is this important? The "single agent" (such as an organization) responsible for managing the collection ensures consistency among the data. 
For example, in a dataset: -* All the data should be of the same type (for example, observations from weather stations). -* All the data should have the same license and/or usage conditions. -* All the data should be subject to the same quality management regime - which may mean that all the data are collected or created using the same processes. -* All the data should be encoded in the same way (for example, using the same data formats and vocabularies). +* All the data should be of the same type (for example, observations from weather stations); +* All the data should have the same license and/or usage conditions; +* All the data should be subject to the same quality management regime - which may mean that all the data are collected or created using the same processes; +* All the data should be encoded in the same way (for example, using the same data formats and vocabularies); * All the data should be accessible using the same protocols - ideally from a single location. This consistency means that it is possible to predict the contents of a dataset, at least regarding the common characteristics, making it easier to write applications to process the data. -A dataset may be published as an immutable resource (such as, data collected from a research programme), or it may be routinely updated (for example, every minute, as new observations are collected from weather stations). +A dataset may be published as an immutable resource (such as data collected from a research programme), or it may be routinely updated (for example, every minute, as new observations are collected from weather stations). -A dataset may be represented as a single, structured file or object (for example, a CSV file in which each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands of General Regularly-distributed Information in Binary form (GRIB) files). 
Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that the dataset is considered to be a single, identifiable resource, irrespective of how it is represented. +A dataset may be represented as a single, structured file or object (for example, a CSV file in which each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands of General Regularly-distributed Information in Binary form (GRIB) files). Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that the dataset is considered a single, identifiable resource, irrespective of how it is represented. Because data are grouped into a single, conceptual resource (that is, the dataset) it is possible to: * Assign this resource an identifier and use this identifier to unambiguously refer to collections of data; * Make statements about the dataset (that is, metadata) and infer that these statements apply to the entire collection. -The dataset concept is central to WIS: +The dataset concept is central to WIS2: * Discovery metadata about datasets are published, as specified in the _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2); * Data consumers can search for datasets that contain relevant data using the Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>); -* Data consumers can subscribe to notifications about updates about a dataset via a Global Broker (see <<_2_4_2_global_broker>>); -* Data consumers can access the data that comprise a dataset from a single location using a well -described mechanism. 
+* Data consumers can subscribe to notifications about updates to a dataset via a Global Broker (see <<_2_4_2_global_broker>>); +* Data consumers can access the data that comprise a dataset from a single location using a well-described mechanism. -It is up to data publishers to decide how their data are grouped into datasets – effectively, to decide what datasets they publish to WIS. That said, it is recommended that, subject to the consistency rules above, data publishers should organize their data into as few datasets as possible. +It is up to data publishers to decide how their data are grouped into datasets – effectively, to decide what datasets they publish to WIS2. That said, it is recommended that, subject to the consistency rules above, data publishers should organize their data into as few datasets as possible. For a data publisher, this means fewer discover metadata records to maintain. For a data consumer, this means fewer topics to subscribe to and fewer places to access the data. @@ -73,10 +73,10 @@ There are some things that are fixed requirements for datasets: Some examples of datasets include: -* The most recent five days of synoptic observations for an entire country or territory footnote:[Why 5-days in this example? Because the system used to publish the data in this example only retains data for 5-days]; +* The most recent five days of synoptic observations for an entire country or territory; footnote:[Why 5-days in this example? Because the system used to publish the data in this example only retains data for 5-days] * A long-term record of observed water quality for a managed set of hydrological stations; * The output from the most recent 24 hours of operational numerical weather prediction model runs; -* The output from six months of experimental model runs. 
It is important to note that output from the operational and experimental model runs should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data; +* The output from six months of experimental model runs. It is important to note that the output from operational and experimental model runs should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data; * A multi-petabyte global reanalysis spanning 1950 to the present. -In summary, datasets are important because they are how data are managed in WIS. +In summary, datasets are important because they are how data are managed in WIS2. From a155e2d20f9a2a3970fad130f7cc4a46a2bddcac Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Wed, 9 Oct 2024 13:58:56 +0200 Subject: [PATCH 08/20] update data consumer section --- guide/sections/part1/data-consumer.adoc | 30 +++++++++++++------------ 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/guide/sections/part1/data-consumer.adoc b/guide/sections/part1/data-consumer.adoc index 10dbd5d..8cd50f6 100644 --- a/guide/sections/part1/data-consumer.adoc +++ b/guide/sections/part1/data-consumer.adoc @@ -8,26 +8,28 @@ The first step to using data published via WIS2 is to determine which dataset or A key component of dataset records in the Global Discovery Catalogue is "actionable" links. A dataset record provides one or more links, each clearly identifying its nature and purpose (informational, direct download, application programming interface (API), subscription) so that the data consumer can interact with the data accordingly. For example, a dataset record may include a link to subscribe to notifications about the data(see <<_1_2_2_how_to_subscribe_to_notifications_about_the_availability_of_new_data>>), or an API, or an offline archive retrieval service. 
-The Global Discovery Catalogue is accessible via an API and provides a low barrier mechanism (see <<_2_2_4_global_discovery_catalogue>>). Internet search engines are able to index the discovery metadata in the Global Discovery Catalogue, thereby providing data consumers with an alternative means to search for WIS2 data. +The Global Discovery Catalogue is accessible via an API and provides a low-barrier mechanism (see <<_2_2_4_global_discovery_catalogue>>). Internet search engines are able to index the discovery metadata in the Global Discovery Catalogue, thereby providing data consumers with an alternative means to search for WIS2 data. ==== 1.2.2 How to subscribe to notifications about the availability of new data -WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocolfootnote:[Subscribing to notifications about newly available data means that you don't need to continually to poll the data server to check for updates.]. +WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocol. 
+ +Subscribing to notifications about newly available data ensures that the data consumer does not have to continually poll the data server to check for updates. In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. A data consumer should subscribe to more than one Global Broker, thereby ensuring that notifications continue to be received if a Global Broker instance fails. -A dataset in WIS2 is associated with a unique _topic_. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). +A dataset in WIS2 is associated with a unique topic. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). -A data consumer can find the appropriate topic to subscribe to either by searching the Global Discovery Catalogue, by using an Internet search engine,footnote:[Internet search engines allow data consumers to discover WIS2 datasets by indexing the content in the Global Discovery Catalogues.], or by browsing the topic hierarchy on a Message Broker. +A data consumer can find the appropriate topic to subscribe to either by searching the Global Discovery Catalogue, by using an Internet search engine,footnote:[Internet search engines allow data consumers to discover WIS2 datasets by indexing the content in Global Discovery Catalogues.], or by browsing the topic hierarchy on a Message Broker. WIS2 uses Global Caches to distribute core data, as defined in the Unified Data Policy (Resolution 1 (Cg-Ext (2021))).
Each Global Cache republishes core data on its own highly available data server and publishes a new notification message advertising the availability of those data from the Global Cache location. Notifications from WIS2 Nodes and Global Caches are published on different topics: The root topic used by WIS2 Nodes is ``origin``, while the root topic used by Global Caches is ``cache``. Other than the root, the topic hierarchy is identical. For example, for synoptic weather observations published by Environment Canada: -* Environment and Climate Change Canada, Meteorological Service of Canada's WIS2 Node publishes to: ``origin/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop``; +* Environment and Climate Change Canada, Meteorological Service of Canada's WIS2 Node, publishes to: ``origin/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop``; * Global Caches publish to: ``cache/a/wis2/ca-eccc-msc/data/core/weather/surface-based-observations/synop``. -As per clause 3.2.13 of the _Manual on WIS_, Volume II, data consumers should access core data from the Global Caches. In order to access these data, they need to subscribe to the ``cache`` topic hierarchy. They will then receive the relevant notifications from the Global Caches, each of which will contain a link (URL) enabling them to download the relevant data from the data server of the corresponding Global Cache. +As per clause 3.2.13 of the _Manual on WIS_, Volume II, data consumers should access core data from the Global Caches. In order to access these data, they must subscribe to the ``cache`` topic hierarchy. They will then receive the relevant notifications from the Global Caches, each of which will contain a link (URL) enabling them to download the relevant data from the data server of the corresponding Global Cache. 
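The relationship between the ``origin`` and ``cache`` root topics can be sketched in a few lines of Python. This is only an illustration, assuming topics are plain slash-separated strings as in the Environment Canada example above; the helper name ``to_cache_topic`` is hypothetical and is not part of any WIS2 specification.

```python
def to_cache_topic(topic: str) -> str:
    """Return the Global Cache topic that mirrors a WIS2 Node topic.

    Assumption: only the root level differs ("origin" vs "cache");
    the rest of the topic hierarchy is kept unchanged.
    """
    root, sep, rest = topic.partition("/")
    if root == "cache":
        return topic  # already a Global Cache topic
    if root != "origin" or not sep:
        raise ValueError(f"not an origin/cache topic: {topic!r}")
    return f"cache/{rest}"


# Topic taken from the synoptic observations example in the text.
origin_topic = (
    "origin/a/wis2/ca-eccc-msc/data/core/weather/"
    "surface-based-observations/synop"
)
print(to_cache_topic(origin_topic))
```

A mapping like this is only relevant for core data; notifications about recommended data are not republished by Global Caches, so a consumer would subscribe to the ``origin`` topic for those.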
==== 1.2.3 How to use a notification message to decide whether to download data @@ -35,13 +37,13 @@ On receipt of a notification message, a data consumer needs to decide whether to In many cases, data consumers will use a software application to determine whether or not to download the data. The present section explains this process. -When subscribing to multiple Global Brokers, the data consumers will receive multiple copies of a notification message. Each notification message has a unique identifier, defined using the ``id`` property. Duplicate messages should be discarded. +When subscribing to multiple Global Brokers, data consumers will receive multiple copies of a notification message. Each notification message has a unique identifier, defined using the ``id`` property. Duplicate messages should be discarded. Core data are available from both a WIS2 Node and the Global Caches, each of which will publish a different notification message advertising an alternative location from which the data may be downloaded. Because these are different messages, they will have different identifiers. However, each of these messages refers to the same data object, which is uniquely identified in the notification message using the data_id property. Notification messages from different sources can easily be compared to determine whether they refer to the same data. By subscribing to the cache root topic, data consumers will only receive notifications about data available from the Global Caches. The origin root topic should be used when subscribing to notifications about recommended data. Data consumers should not subscribe to the origin root topic for notifications about core data because the notification messages provided on these topics will refer to data published directly on the WIS2 Nodes (referred to as the "origin"). Data consumers need to consider their strategy for managing these duplicate messages. 
From a data perspective, it does not matter which Global Cache instance is used – they will all provide an identical copy of the data object published by the originating WIS2 Node. The simplest strategy is to accept the first notification message and download the data from the Global Cache instance that the message refers to by using a URL for the data object at that Global Cache instance. Alternatively, data consumers may have a preferred Global Cache instance, for example, one that is located in their region. Whichever Global Cache instance is chosen, data consumers will need to implement logic to discard duplicate notification messages based on ``id`` and duplicate data objects based on ``data_id``. -A notification message also provides a small amount of metadata about the data object it references, such as location and time. Data consumers can use these metadata to decide whether the data object referenced in the message should be downloaded. This is known as client-side filtering. +A notification message also provides a small amount of metadata about the data object it references, such as location or time. Data consumers can use these metadata to decide whether the data object referenced in the message should be downloaded. This is known as client-side filtering. The notification message should also include the metadata identifier for the dataset to which the data object belongs. A data consumer can use the metadata identifier to search the Global Discovery Catalogue and discover more about the data - in particular, whether there are any conditions on the use of those data. @@ -60,7 +62,7 @@ If a download link implements access control (for example, the data consumer nee Data are shared on WIS2 in accordance with the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). This data policy describes two categories of data: core and recommended. 
-* Core data are considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data are provided on a free and unrestricted basis, without charge and with no conditions on use. +* Core data are considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data are provided on a free and unrestricted basis. * Recommended data are exchanged on WIS2 in support of Earth system monitoring and prediction efforts. Recommended data _may_ be provided with conditions on use and/or subject to a license. The Unified Data Policy (Resolution 1 (Cg-Ext (2021))) encourages attribution of the source of the data in all cases. This ensures that, credit is given to those who have expended effort and resources in collecting, curating, generating, or processing the data. Attribution provides visibility into who is using the data, which, for many organizations, serves as crucial evidence to justify the continued provision and updating of the data. @@ -81,10 +83,10 @@ Data consumers wanting to use data published via WIS2 should, at a minimum, read * <<_1_1_introduction_to_wis2>> * <<_2_1_wis2_architecture>> * <<_2_2_roles_in_wis2>> -* <<_2_4_components_of_wis2>> +* <<_2_4_wis2_components>> -The following specifications in the _Manual on WIS_, Volume II also provide useful information: +The following sections in the _Manual on WIS_, Volume II also provide useful information: -* Appendix D. WIS2 Topic Hierarchy -* Appendix E. WIS2 Notification Message -* Appendix F. WMO Core Metadata Profile (Version 2) +* Appendix D. WIS2 Topic Hierarchy; +* Appendix E. WIS2 Notification Message; +* Appendix F. WMO Core Metadata Profile (Version 2). 
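The duplicate handling described in section 1.2.3 can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes each notification has already been parsed into a Python dict, that ``id`` is a top-level member and that ``data_id`` sits under ``properties``. The class name ``NotificationFilter`` and the identifier values are hypothetical placeholders.

```python
class NotificationFilter:
    """Track which notification messages and data objects were already seen."""

    def __init__(self) -> None:
        self._seen_ids: set[str] = set()
        self._seen_data_ids: set[str] = set()

    def should_download(self, message: dict) -> bool:
        """Return True only for the first message about a new data object."""
        message_id = message["id"]
        data_id = message["properties"]["data_id"]  # assumed location

        if message_id in self._seen_ids:
            return False  # same message relayed by another Global Broker
        self._seen_ids.add(message_id)

        if data_id in self._seen_data_ids:
            return False  # same data object advertised from another location
        self._seen_data_ids.add(data_id)
        return True


# Placeholder messages: only "id" and "data_id" matter for this sketch.
notifications = [
    {"id": "msg-1", "properties": {"data_id": "obs/0001"}},  # first copy
    {"id": "msg-1", "properties": {"data_id": "obs/0001"}},  # broker duplicate
    {"id": "msg-2", "properties": {"data_id": "obs/0001"}},  # other Global Cache
    {"id": "msg-3", "properties": {"data_id": "obs/0002"}},  # new data object
]
flt = NotificationFilter()
decisions = [flt.should_download(m) for m in notifications]
print(decisions)
```

This implements the "accept the first notification" strategy. A long-running client would also need to expire old entries from the two sets, which the sketch leaves out, and a consumer with a preferred Global Cache would need extra logic to wait for that instance's message.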
From 4f4f113e845a354207c7f93885591e7a35e8302c Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Wed, 9 Oct 2024 14:37:22 +0200 Subject: [PATCH 09/20] update data publisher section --- guide/sections/part1/data-publisher.adoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 4d72441..73e42b5 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -51,7 +51,7 @@ Whether providing data as files or through interactive APIs, data publishers nee ===== 1.3.3.2 Providing data as files -The simplest way to publish data through WIS2 is to persist your data as files and publish those files on a web server. All these files need to be organized in some manner, for example, in a flat structure or grouped into collections that resemble folders or directory structures. +The simplest way to publish data through WIS2 is to persist the data as files and publish those files on a web server. All these files need to be organized in some manner, for example, in a flat structure or grouped into collections that resemble folders or directory structures. To ensure that the data are usable, users need to be able to find the specific file (or files) they need. @@ -100,7 +100,7 @@ Optionally, data may be embedded in a notification message using a content objec Notification messages are encoded as GeoJSON (RFC 7946) and must conform to the _Manual on WIS_, Volume II, Appendix E. WIS2 Notification Message. -The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile), rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. 
When providing data through an interactive API, it may be useful to provide a "canonical" link (designated with link relation: ``"rel": "canonical"``footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]) and an additional link with the URL for the root of the web service from which the entire dataset can be accessed or queried. +The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile), rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. When providing data through an interactive API, it may be useful to provide a "canonical" link (designated by link relation: ``"rel": "canonical"``footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]) and an additional link with the URL for the root of the web service from which the entire dataset can be accessed or queried. The dataset identifier should be included in the notification message (``metadata_id`` property). This allows data consumers receiving the notification to cross reference it with information provided in the discovery metadata for the dataset, for example the conditions of use specified in the data policy, rights, or license. @@ -118,7 +118,7 @@ Whatever topic is used, the discovery metadata provided to the Global Discovery Core data, as specified in the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. -WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). 
Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24-hours.footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. +WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published from WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours.footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. 
 The URL included in a notification message that is used to access core data from a WIS2 Node, or the "canonical" URL, if multiple URLs are provided, must: @@ -137,7 +137,7 @@ Core data that are not to be cached must have the cache property in the notifica Data publishers must ensure that core data that are not cached are publicly accessible from their WIS2 Node, that is, with no access control mechanisms in place. -Global Cache operators may choose to disregard a cache preference, for example, if they feel that the content being providing is large enough to impede the provision of caching services for other Members.footnote:[Excessive data volume is not the only reason they may refuse to cache content. Other reasons include: too many small files, unreliable download from a WIS2 Node, etc.] In such cases, the Global Cache operator will log this behaviour. In collaboration with the Global Cache operators, your GISC will work with you to resolve concerns. +Global Cache operators may choose to disregard a cache preference, for example, if they feel that the content being provided is large enough to impede the provision of caching services for other Members.footnote:[Excessive data volume is not the only reason they may refuse to cache content. Other reasons include: too many small files, unreliable download from a WIS2 Node, etc.] In such cases, the Global Cache operator will log this behaviour. Global Cache operators will collaborate with data publishers and their GISCs to resolve any concerns. Finally, note that Global Caches are under no obligation to cache data published on _experimental_ topics. For such data, the ``cache`` property should be set to ``false``.
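Choosing which URL to use when a notification message carries several links can be sketched as below. It is an illustration under stated assumptions: the message is a parsed dict whose ``links`` array holds objects with ``href`` and ``rel`` members, and the ``cache`` property is assumed to sit under ``properties``. The helper name ``pick_download_url`` and the example URLs are hypothetical.

```python
from typing import Optional


def pick_download_url(message: dict) -> Optional[str]:
    """Return the URL of the data object referenced by a notification.

    Prefers the link marked "rel": "canonical"; falls back to the only
    link when a single one is given. Returns None when no link is
    present or the choice is ambiguous.
    """
    links = message.get("links", [])
    for link in links:
        if link.get("rel") == "canonical":
            return link["href"]
    if len(links) == 1:
        return links[0]["href"]
    return None


# Placeholder message: a canonical link to the new data object plus a
# second link to the root of an API serving the whole dataset.
message = {
    "properties": {"cache": False},  # assumed location of the cache property
    "links": [
        {"rel": "canonical", "href": "https://example.org/data/profile-42"},
        {"rel": "related", "href": "https://example.org/api"},
    ],
}
print(pick_download_url(message))
```

Returning ``None`` for an ambiguous message, rather than guessing the first link, keeps the decision with the caller; whether that is the right policy depends on the consuming application.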
@@ -163,7 +163,7 @@ Finally, note that if only core data are being published, it may be possible to ===== 1.3.3.7 Providing access to data archives -There is no requirement for a WIS2 Node to publish notification messages about newly available data; however, the mechanism is available if needed (for instance, for real-time data exchange). Data archives published through WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records into a climate observation archive). +There is no requirement for a WIS2 Node to publish notification messages about newly available data; however, the mechanism is available if needed (for instance, for real-time data exchange). Data archives published via WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records to a climate observation archive). However, notification messages must still be used to share discovery metadata with WIS2. Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to "handcraft" notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free (no cost) starter plan sufficient for infrequent publications of notifications about metadata. These provide a viable alternative to implementing an MQTT broker instance yourself.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. 
From 8294a6c9809e8d7d34c5abf828280aa3c473590c Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Wed, 9 Oct 2024 15:57:55 +0200 Subject: [PATCH 10/20] update architecture --- guide/sections/part2/wis2-architecture.adoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/guide/sections/part2/wis2-architecture.adoc b/guide/sections/part2/wis2-architecture.adoc index 078c230..33c3928 100644 --- a/guide/sections/part2/wis2-architecture.adoc +++ b/guide/sections/part2/wis2-architecture.adoc @@ -14,7 +14,7 @@ WIS centres shall comply with the technical regulations defined in the _Manual o === 2.2 Roles in WIS2 -When describing the functions of WIS2 there are four roles to consider: +When describing the functions of WIS2, there are four roles to consider: . Data publisher; . Global coordinator; @@ -50,7 +50,7 @@ These roles are outlined below. * Data consumers determine whether the data or metadata referenced in the notification messages are required. * Data consumers download data from a Global Cache or WIS2 Node. -=== 2.3 Specifications of WIS2 +=== 2.3 WIS2 Specifications Leveraging existing open standards, WIS2 defines the following specifications in support of publication, subscription, notification and discovery: @@ -132,7 +132,7 @@ Refer to the _Manual on WIS_, Volume II for details. ** Message retention: false; ** Quality of Service (QoS) of 1; ** A maximum of 2000 messages to be held in a queue per client. -* In order to permit authentication and authorization for users, WIS2 Node, Global Cache, Global Discovery Catalogue and Global Brokers shall use a user and password based mechanism. +* To enable user authentication and authorization, WIS2 Nodes, Global Caches, Global Discovery Catalogues and Global Brokers shall use a user and password based mechanism. * To improve the overall level of security of WIS2, the secure version of the MQTT protocol is preferred. If used, the certificate must be valid. 
* The standard Transmission Control Protocol (TCP) ports to be used are 8883 for Secure MQTT (MQTTS) and 443 for Secure Web Socket (WSS). From 880481a4773aed9e5512d5f70ebe6a5e1df04ed4 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Mon, 14 Oct 2024 18:13:18 +0200 Subject: [PATCH 11/20] LSP editing continued --- guide/sections/part1/data-publisher.adoc | 30 +++++----- guide/sections/part2/global-services.adoc | 20 +++---- guide/sections/part2/operations.adoc | 58 +++++++++---------- guide/sections/part2/wis2-architecture.adoc | 4 +- guide/sections/part2/wis2node.adoc | 14 ++--- .../part3/information-management.adoc | 2 +- guide/sections/part4/security.adoc | 6 +- guide/sections/part5/competencies.adoc | 2 +- 8 files changed, 68 insertions(+), 68 deletions(-) diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 73e42b5..518bbd8 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -20,14 +20,14 @@ Once the scope of the datasets has been determined, the applicable data policy h Discovery metadata is the mechanism by which data publishers tell potential consumers about their data, how it may be accessed, and any conditions they may place on the use of those data. -Each dataset that is published must have an associated discovery metadata record. This record is encoded as GeoJSON (RFC 7946footnote:[See RFC 7946 - The GeoJSON Format: https://datatracker.ietf.org/doc/html/rfc7946.]) and must conform to the specification given in the _Manual on WIS_, Volume II - Appendix F. WMO Core Metadata Profile (Version 2). +Each dataset that is published must have an associated discovery metadata record. This record is encoded as GeoJSON (See RFC 7946footnote:[See RFC 7946 - The GeoJSON Format: https://datatracker.ietf.org/doc/html/rfc7946.]) and must conform to the specification given in the _Manual on WIS_, Volume II - Appendix F. WMO Core Metadata Profile (Version 2). 
Copies of all discovery metadata records from WIS2 are held in the Global Discovery Catalogues, where data consumers can search and browse to find data that is of interest to them. Depending on local arrangements, your GISC may be able to assist in transferring discovery metadata record(s) to the Global Discovery Catalogues. If this is not the case, data publishers will need to publish the discovery metadata record(s) themselvesfootnote:[In the future, WIS2 may provide metadata publication services (for example, through a WIS2 metadata management portal) to assist with this task. However, such services are not currently available.] using one of two methods: * The simplest method is to encode the discovery metadata record as a file and publish it to an HTTP server, where it can be accessed with a URL. -* Alternatively, a data publisher may operate a local metadata catalogue through which discovery metadata records can be shared using an API (for example, OGC API – Recordsfootnote:[See OGC API - Records - Part 1: Core https://docs.ogc.org/DRAFTS/20-004.html.]). Each discovery metadata record (for instance, an item that is part of the discovery metadata catalogue) can be accessed with a unique URL via the API . +* Alternatively, a data publisher may operate a local metadata catalogue through which discovery metadata records can be shared using an API (for example, OGC API – Recordsfootnote:[See OGC API - Records - Part 1: Core: https://docs.ogc.org/DRAFTS/20-004.html.]). Each discovery metadata record (for instance, an item that is part of the discovery metadata catalogue) can be accessed with a unique URL via the API . In both cases, a notification message needs to be published on a Message Broker that tells WIS2 that there is a new discovery metadata record to upload and that it can be accessed at the specified URL.footnote:[Both data and metadata are published using the same notification message mechanism to announce the availability of new resources.] 
Notification messages shall conform to the specification given in the _Manual on WIS_, Volume II - Appendix E. WIS2 Notification Message. They must also be published on a topic that conforms to the specification given in the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy. For example, metadata published by Deutscher Wetterdienst would use the following topic: ``origin/a/wis2/de-dwd/metadata/core``. @@ -39,9 +39,9 @@ Discovery metadata must be published in the Global Discovery Catalogues before t ==== 1.3.3 How to provide data to WIS2 -WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986]), which can be used to resolve the identifieed resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP). +WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986.]), which can be used to resolve the identified resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP).
-In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network "location"). RFC 3986], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS) wherein the communication protocol is encrypted using Transport Layer Security (TLS)] It is up to the data server to decide what to provide when resolving the identifier. For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters. +In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network location). See RFC 3986: https://datatracker.ietf.org/doc/html/rfc3986.], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS), wherein the communication protocol is encrypted using Transport Layer Security (TLS)] It is up to the data server to decide what to provide when resolving the identifier. 
For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters. The following sections cover specific considerations relating to publishing data to WIS2. @@ -82,7 +82,7 @@ Based on the experience of data publishers that have been using web APIs to serv * First, interactive APIs should be self-describing. Data consumers should not need to know, a priori, how to make requests from an API. They should be able to discover this information from the API endpoint itself – even if this simply entails a link to a documentation page they need to read. * Second, interactive APIs should comply with OpenAPIfootnote:[See OpenAPI Specification v3.1.0: https://spec.openapis.org/oas/v3.1.0.] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) that can read this metadata and automatically generate client applications to query the API is widely available. -* Third, the OGC has developed a suite of APIsfootnote:[Open Geospatial Consortium OGC API https://ogcapi.ogc.org/] (called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR),footnote:[OGC API - Environmental Data Retrieval (EDR) https://ogcapi.ogc.org/edr] OGC API – Features,footnote:[OGC API - Features https://ogcapi.ogc.org/features] and OGC API - Coveragesfootnote:[OGC API - Coverages https://ogcapi.ogc.org/coverages] are considered particularly useful. 
Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. +* Third, the OGC has developed a suite of APIsfootnote:[See OGC API: https://ogcapi.ogc.org/.] (called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR),footnote:[See OGC API - Environmental Data Retrieval (EDR): https://ogcapi.ogc.org/edr.] OGC API – Features,footnote:[See OGC API - Features: https://ogcapi.ogc.org/features.] and OGC API - Coveragesfootnote:[See OGC API - Coverages: https://ogcapi.ogc.org/coverages.] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. Finally, it is advisable to consider versioning the API to avoid breaking changes when adding new features. A common approach is to add a _version number_ prefix into the API path, for example, ``/v1/service/{rest-of-path}`` or ``/service/v1/{rest-of-path}``. @@ -90,7 +90,7 @@ More guidance on the use of interactive APIs in WIS2 is anticipated in future ve ===== 1.3.3.4 Providing data in (near) real time -WIS2 is designed to support the data sharing needs of all WMO disciplines and domains. Among these, the World Weather Watch footnote:[WMO World Weather Watch https://wmo.int/world-weather-watch] drives specific needs for the rapid exchange of data to support weather forecasting. 
+WIS2 is designed to support the data sharing needs of all WMO disciplines and domains. Among these, the World Weather Watch footnote:[See World Weather Watch: https://wmo.int/world-weather-watch.] drives specific needs for the rapid exchange of data to support weather forecasting. To enable real-time data sharing,footnote:[In the context of WIS2, real time implies anything from a few seconds to a few minutes - not the milliseconds required by some applications.] WIS2 uses notification messages to inform users of the availability of a new resource, either data or discovery metadata, and how they can access that resource. Notification messages are published to a queue on a Message Broker in a data publisher's WIS2 Nodefootnote:[WIS2 ensures the rapid global distribution of notification messages using a network of Global Brokers which subscribe to the Message Brokers of WIS2 Nodes and republish notification messages (see <<_2_4_2_Global_Broker>>).] using the MQTT protocol and immediately delivered to all users subscribing to that queue. A queue is associated with a specific _topic_, such as a dataset. @@ -100,7 +100,7 @@ Optionally, data may be embedded in a notification message using a content objec Notification messages are encoded as GeoJSON (RFC 7946) and must conform to the _Manual on WIS_, Volume II, Appendix E. WIS2 Notification Message. -The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile), rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. When providing data through an interactive API, it may be useful to provide a "canonical" link (designated by link relation: ``"rel": "canonical"``footnote:[IANA Link Relations https://www.iana.org/assignments/link-relations/link-relations.xhtml]) and an additional link with the URL for the root of the web service from which the entire dataset can be accessed or queried. 
+The URL used in the notification message should refer only to the newly added data object (for example, the new temperature profile), rather than the entire dataset. However, the WIS2 Notification Message specification allows for multiple URLs to be provided. When providing data through an interactive API, it may be useful to provide a "canonical" link (designated by link relation: ``"rel": "canonical"``footnote:[See Internet Assigned Numbers Authority (IANA) Link Relations: https://www.iana.org/assignments/link-relations/link-relations.xhtml.]) and an additional link with the URL for the root of the web service from which the entire dataset can be accessed or queried. The dataset identifier should be included in the notification message (``metadata_id`` property). This allows data consumers receiving the notification to cross reference it with information provided in the discovery metadata for the dataset, for example the conditions of use specified in the data policy, rights, or license. @@ -110,9 +110,9 @@ To ensure that data consumers can easily find the topics they want to subscribe If the data seem to relate to more than one topic, the most appropriate one should be selected. The topic hierarchy is not a knowledge organization system – it is used solely to ensure the uniqueness of topics for publishing notification messages. Discovery metadata is used to describe a dataset and its relevance to additional disciplines; each dataset is mapped to one, and only one, topic. -If the WIS2 Topic Hierarchy does not include a topic appropriate for the data, the data should be published on an experimental topic. This will allow data exchange to be established while the formalities are being considered.footnote:[The "experimental" topic is necessary for the WIS2 pre-operational phase and future pre-operational data exchange in test mode.] 
Experimental topics are provided for each Earth system discipline at level eight in the topic hierarchy (for example, ``origin/a/wis2/{centre-id}/data/{earth-system-discipline}/experimental/``). Data publishers can extend the experimental branch with subtopics they deem appropriate. Experimental topics are subject to change and will be removed once they are no longer needed. For more information, see _Manual on WIS_, Volume II, Appendix D. WIS2 Topic Hierarchy, section 1.2 Publishing guidelines. +If the WIS2 Topic Hierarchy does not include a topic appropriate for the data, the data should be published on an experimental topic. This will allow data exchange to be established while the formalities are being considered.footnote:[Experimental topics are necessary for the WIS2 pre-operational phase and future pre-operational data exchange in test mode.] Experimental topics are provided for each Earth system discipline at level eight in the topic hierarchy (for example, ``origin/a/wis2/{centre-id}/data/{earth-system-discipline}/experimental/``). Data publishers can extend the experimental branch with subtopics they deem appropriate. Experimental topics are subject to change and will be removed once they are no longer needed. For more information, see _Manual on WIS_, Volume II, Appendix D. WIS2 Topic Hierarchy, section 1.2 Publishing guidelines. -Whatever topic is used, the discovery metadata provided to the Global Discovery Catalogue must include subscription links using that topic.footnote:[The Global Discovery Catalogue will reject discovery metadata records containing links to topics outside the official topic-hierarchy.] The Global Broker will only republish notification messages on topics specified in the discovery metadata records. 
+Whatever topic is used, the discovery metadata provided to the Global Discovery Catalogue must include subscription links using that topic.footnote:[The Global Discovery Catalogue will reject discovery metadata records containing links to topics outside the official topic hierarchy.] The Global Broker will only republish notification messages on topics specified in the discovery metadata records. ===== 1.3.3.5 Considerations when providing core data in WIS2 @@ -133,11 +133,11 @@ Unfortunately, Global Caches cannot republish _all_ core data; there is a limit If frequent updates to a dataset are very large (for example, in the case of weather prediction models or remote sensing observations) data publishers will need to share the burden of distributing their data with Global Cache operators. They should work with their GISC to determine the highest priority elements of their datasets that will be republished by the Global Caches. -Core data that are not to be cached must have the cache property in the notification message set to false.footnote:[Default value for the ``cache`` property is ``true``; omission of the property will result in the data object being cached.] +Core data that are not to be cached must have the cache property in the notification message set to false.footnote:[The default value for the ``cache`` property is ``true``. Omitting the property will result in the data object being cached.] Data publishers must ensure that core data that are not cached are publicly accessible from their WIS2 Node, that is, with no access control mechanisms in place. -Global Cache operators may choose to disregard a cache preference, for example, if they feel that the content being providing is large enough to impede the provision of caching services for other Members.footnote:[Excessive data volume is not the only reason they may refuse to cache content. Other reasons include: too many small files, unreliable download from a WIS2 Node, etc.] 
In such cases, the Global Cache operator will log this behaviour. Global Cache operators will collaborate with data publishers and their GISCs to resolve any concerns. +Global Cache operators may choose to disregard a cache preference, for example, if they feel that the content being provided is large enough to impede the provision of caching services for other Members.footnote:[Excessive data volume is not the only reason a Global Cache operator may refuse to cache content. Other reasons include too many small files, unreliable download from a WIS2 Node, and so forth.] In such cases, the Global Cache operator will log this behaviour. Global Cache operators will collaborate with data publishers and their GISCs to resolve any concerns. Finally, note that Global Caches are under no obligation to cache data published on _experimental_ topics. For such data, the ``cache`` property should be set to ``false``. @@ -145,7 +145,7 @@ Finally, note that Global Caches are under no obligation to cache data published Recommended data, as defined in the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))), are exchanged on WIS2 in support of Earth system monitoring and prediction efforts and may be provided with conditions on use. This means that the data publisher may control access to recommended data. -Access control should only use the "security schemes" for authentication and authorization specified in OpenAPI.footnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object] +Access control should only use the "security schemes" for authentication and authorization specified in OpenAPI.footnote:[See OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object.]
Where access control is implemented, a ``security`` object should be included in the download links in discovery metadata and notification messages to provide the user with pertinent information about the access control mechanism used and where/how they might register to request access. @@ -153,9 +153,9 @@ Recommended data are never cached by the Global Caches. The use of core data must always be free and unrestricted. However, it may be necessary to leverage existing systems with built-in access control when implementing the download service for the WIS2 Node. -Example 1: API key. The data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for data publication via WIS2 so that usage can be tracked.], footnote:[Given that users are encouraged to download Core data from the Global Cache, there will likely be only a few accesses using the WIS2 account's API key. If the usage quota for the WIS2 account is exceeded (for instance, further data access is blocked) then this should encourage users to download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] +Example 1: API key. The data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for the publication of data via WIS2 so that data usage can be tracked.], footnote:[Given that users are encouraged to download core data from the Global Cache, there will likely be limited access using the API key of the WIS2 account. If the usage quota for the WIS2 account is exceeded (for instance, if further data access is blocked), then this should encourage users to download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] -Example 2: Presigned URLs. 
The data server uses a cloud-based object store that requires credentials to be provided when downloading data. The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24 hours).footnote:[Working with presigned URLs on Amazon S3 https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html] +Example 2: Presigned URLs. The data server uses a cloud-based object store that requires credentials to be provided when downloading data. The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24 hours).footnote:[See working with presigned URLs on Amazon S3: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html.] In both cases, the URL provided in a notification message can be directly resolved without requiring a user or a Global Cache to take additional action, such as providing credentials or authenticating. @@ -165,7 +165,7 @@ Finally, note that if only core data are being published, it may be possible to There is no requirement for a WIS2 Node to publish notification messages about newly available data; however, the mechanism is available if needed (for instance, for real-time data exchange). Data archives published via WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records to a climate observation archive). -However, notification messages must still be used to share discovery metadata with WIS2. 
Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to "handcraft" notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free (no cost) starter plan sufficient for infrequent publications of notifications about metadata. These provide a viable alternative to implementing an MQTT broker instance yourself.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. +However, notification messages must still be used to share discovery metadata with WIS2. Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to "handcraft" notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free starter plan sufficient for the occasional publication of notifications about metadata. These services provide a viable alternative to implementing an MQTT broker instance.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. Note that some data archives, for example, Essential Climate Variables, are categorized as core data. Core data may be distributed via the Global Caches; however, given that they provide only short-term data hosting (for instance, for 24 hours), Global Caches are not an appropriate mechanism to provide access to core data archives. These archives must be accessed directly via the WIS2 Node.
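The notification message properties discussed in this file (``metadata_id``, the ``cache`` property, and a link with ``"rel": "canonical"``) can be illustrated with a minimal Python sketch. It is not a conformant WIS2 Notification Message: it deliberately omits other fields required by the _Manual on WIS_, Volume II, Appendix E, and the function name, the placement of ``cache`` and ``metadata_id`` under ``properties``, and the example identifier and URL are all assumptions made for illustration.

```python
import json


def build_notification(metadata_id, data_url, cache=True):
    """Return a simplified, GeoJSON-style notification as a dict.

    Hypothetical helper: field placement is an assumption, and fields
    required by Appendix E but not discussed here are omitted.
    """
    properties = {"metadata_id": metadata_id}
    # The default for ``cache`` is true, so the property is only written
    # when the publisher asks for the data object NOT to be cached.
    if not cache:
        properties["cache"] = False
    return {
        "type": "Feature",      # GeoJSON (RFC 7946) object type
        "geometry": None,       # no geometry in this sketch
        "properties": properties,
        # The canonical link points at the newly added data object only,
        # not at the entire dataset.
        "links": [{"rel": "canonical", "href": data_url}],
    }


if __name__ == "__main__":
    message = build_notification(
        "urn:wmo:md:example-centre:example-dataset",  # placeholder identifier
        "https://example.org/data/profile-001.bufr",  # placeholder URL
        cache=False,
    )
    print(json.dumps(message, indent=2))
```

Leaving ``cache`` out when caching is wanted relies on the documented default; a publisher who prefers to be explicit could always write the property instead.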
diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index 5c5ad50..5ba0973 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -1,6 +1,6 @@ === 2.7 Implementation and operation of a Global Service -==== 2.7.1 Procedure for registration of a new Global Service +==== 2.7.1 Procedure for registering a new Global Service The successful operation of WIS2 depends on a set of Global Services running well-managed IT environments with a very high level of reliability so that all WIS2 users and WIS2 Nodes are able to access and provide the data they need for their duties. Depending on the nature of the Global Service, the following are the minimum capabilities to ensure that, collectively, the level of service is 100% (or very close): @@ -42,7 +42,7 @@ Metrics for WIS2 monitoring should follow the naming convention wmo__>). @@ -26,14 +26,14 @@ The registration of a WIS2 Node involves the following steps: - Verification of compliance of notification messages with the WIS2 Notification Message (WNM) specification; - Verification that the data server is correctly configured and properly functioning; - Verification that the Message Broker is correctly configured and properly functioning. -* Add a new centre to WIS2: Upon completion of the verification and confirmation that the centre satisfies all the conditions for operating a WIS2 Node, the GISC notifies the WMO Secretariat and confirms that the WIS2 Node can be added to WIS2. +* Add a new centre to WIS2: Upon completion of the verification and confirmation that the centre satisfies all the conditions for operating a WIS2 Node, the GISC notifies the WMO Secretariat and confirms that the centre can be added to WIS2 as a WIS2 Node. * Communicate the details to the Global Services: The WMO Secretariat provides the details of the WIS2 Node to the Global Brokers so that they can subscribe to the WIS2 Node. 
A diagram of the process for registering a WIS2 Node is presented below (see Figure 1). image::images/add-wis2node.png[Adding a WIS2 Node,link=images/add-wis2node.png] -Once a WIS2 Node has been registered and connected to the Global Services, it can proceed to register the datasets it will publish via WIS2. To register a dataset, the authorized WIS2 Node publishes discovery metadata about the new dataset. Validation of the discovery metadata is completed by the Global Discovery Catalogues, and the Global Brokers automatically subscribe to the topics provided in the discovery metadata record. For more information, see <<_1_3_2_how_to_provide_discovery_metadata_to_wis2>>. +Once a WIS2 Node has been registered and connected to the Global Services, it can proceed to register the datasets it will publish via WIS2. To register a dataset, the WIS2 Node publishes discovery metadata about the new dataset. Validation of the discovery metadata is completed by the Global Discovery Catalogues, and the Global Brokers automatically subscribe to the topics provided in the discovery metadata record. For more information, see <<_1_3_2_how_to_provide_discovery_metadata_to_wis2>>. Once the dataset has been successfully registered, the WIS2 Node can proceed to exchange data - see <<_1_3_3_how_to_provide_data_in_wis2>>. @@ -46,13 +46,13 @@ The centre identifier (``centre-id``) is used in WIS2 to uniquely identify a par The ``centre-id`` comprises two dash-separated tokens. -*Token 1* is a _Top Level Domain_ (TLD) defined by the Internet Assigned Numbers Authority (IANA).footnote:[IANA Top Level Domains https://data.iana.org/TLD] +*Token 1* is a Top Level Domain (TLD) defined by the Internet Assigned Numbers Authority (IANA).footnote:[See IANA TLDs: https://data.iana.org/TLD.] It is usually fairly easy for a Member to choose a TLD. However, for Members’ overseas territories, this may require some thought. 
The recommended approach depends on the governance structure of the overseas territory. For example, Réunion is a French Department; it is considered part of France, and it uses the Euro. Réunion would use the “fr” TLD. New Caledonia is a French overseas territory with a TLD of “nc” because it has a separate, devolved governance structure. The recommendation is to use “nc”. However, the decision of which TLD to use is made at the national level. *Token 2* is a descriptive name for the centre. It may contain dashes, but it may not contain other special characters. -The descriptive name should be something recognizable – not only by the WIS2 community, but also by other users. Basing the name on the web domain name is likely to ensure that centre identifiers remain unique within a particular country or territory. For example, the National Meteorological Service of the United Kingdom of Great Britian and Northern Ireland is the Met Office,footnote:[see http://www.metoffice.gov.uk] so “metoffice” is better than “ukmo”.footnote:[The “.gov” part of the domain name is superfluous for the purposes of WIS2 There is nothing preventing its use, but it does not add any value.]. Using a four-letter GTS centre identifier (for example, CCCC) is not recommended because those who are unfamiliar with GTS will not understand these identifiers. +The descriptive name should be something recognizable – not only by the WIS2 community, but also by other users. Basing the name on the web domain name is likely to ensure that centre identifiers remain unique within a particular country or territory. For example, the National Meteorological Service of the United Kingdom of Great Britain and Northern Ireland is the Met Office,footnote:[See http://www.metoffice.gov.uk.] so “metoffice” is better than “ukmo”.footnote:[The “.gov” part of the domain name is superfluous for the purposes of WIS2. There is nothing preventing its use, but it does not add any value.]
Using a four-letter GTS centre identifier (for example, CCCC) is not recommended because those who are unfamiliar with GTS will not understand these identifiers. The centre identifier specification says that larger organizations operating multiple centres may wish to register separate centre-ids for each centre. This is a good practice. Keeping with the UK example, the Met Office operates a National Meteorological Centre (NMC), 9 DCPCs (for example, a Volcanic Ash Advisory Centre) and a WIS2 Global Service, so it is important to separate them. For example: @@ -68,13 +68,13 @@ When configuring a WIS2 Node, it is necessary to consider how it will be accesse Global Brokers must authenticate when they connect to the MQTT Message Broker in the WIS2 Node. Username and password credentials are used.footnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone`` WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).]. When registering the WIS2 Node with the WMO Secretariat, these credentials must be provided. The WMO Secretariat will share the credentials with the Global Service operators and store them in the WIS register. These credentials should not be considered confidential or secret. -Given that Global Brokers republish notification messages provided by the WIS2 Node, you may decide to restrict access to the MQTT Message Broker. Global Brokers operate using a fixed IP address which allows you to permit them access using IP filtering.footnote:[In WIS2 the IP addresses are used to determine the origin of connections and therefore confer trust to remote systems. It is well documented that IP addresses can be hi-jacked and that there are alternative, more sophisticated, mechanisms available for reliably determining the origin of connections requests, such as Public Key Infrastructure (PKI). 
However, the complexities of such implementation would introduce a barrier to Member's participation in WIS2. IP addresses are considered to provide an adequate level of trust for the purposes of WIS2: distributing publicly accessible data and messages.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. +Given that Global Brokers republish notification messages provided by the WIS2 Node, you may decide to restrict access to the MQTT Message Broker. Global Brokers operate using a fixed IP address which allows you to permit them access using IP filtering.footnote:[In WIS2, IP addresses are used to determine the origin of connections and confer trust to remote systems. It is well documented that IP addresses can be hijacked and that more sophisticated mechanisms, such as Public Key Infrastructure (PKI), are available for reliably determining the origin of connection requests. However, the complexities of implementing such mechanisms create barriers to Member participation in WIS2. For the purposes of WIS2, which involves distributing publicly accessible data and messages, IP addresses are considered to provide an adequate level of trust.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. If your WIS2 Node only publishes core data,footnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.] access to the data server may also be restricted, with the distribution of data handled by Global Caches. Global Caches also operate on fixed IP addresses, allowing their connections to be easily identified. Again, access must be granted to more than one Global Broker to ensure resilience. 
During registration, the WMO Secretariat will provide host names and IP addresses of the Global Services to enable access controls to be configured. -Access controls may be implemented for recommended data. Only the security schemes for authentication and authorization specified in OpenAPI should be used.footnote:[OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object] +Access controls may be implemented for recommended data. Only the security schemes for authentication and authorization specified in OpenAPI should be used.footnote:[See OpenAPI Specification - Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object.] ==== 2.6.2 Performance management diff --git a/guide/sections/part3/information-management.adoc b/guide/sections/part3/information-management.adoc index 9c2be96..7dffc99 100644 --- a/guide/sections/part3/information-management.adoc +++ b/guide/sections/part3/information-management.adoc @@ -197,7 +197,7 @@ When an information resource is marked for disposal the reasons for disposal, in ===== 3.1.4.1 Technology and technology migration -Information managers must be aware of the need to ensure that the technologies, hardware and software used do not become obsolete and must be aware of emerging data issues. This topic is discussed further in the https://library.wmo.int/idurl/4/56904[_WMO Guidelines on Emerging Data Issues_] (WMO-No. 1239). +Information managers must be aware of the need to ensure that the technologies, hardware and software used do not become obsolete, and they must be aware of emerging data issues. This topic is discussed further in the https://library.wmo.int/idurl/4/56904[_WMO Guidelines on Emerging Data Issues_] (WMO-No. 1239). 
===== 3.1.4.2 Information security diff --git a/guide/sections/part4/security.adoc b/guide/sections/part4/security.adoc index 65b220c..740b61a 100644 --- a/guide/sections/part4/security.adoc +++ b/guide/sections/part4/security.adoc @@ -1,6 +1,6 @@ === 4.1 Security -For this initial version of the Guide to WIS, Volume II, existing guidance on information technology security (also known as cybersecurity) remains largely applicable. Please refer to: +For this initial version of the Guide to WIS, Volume II, existing guidance on information technology security (also known as "cybersecurity") remains largely applicable. Please refer to: -* _Guide to Information Technology Security_ (WMO-No. 1115) -* https://library.wmo.int/idurl/4/28988[_Guide to the WMO Information System_] (WMO-No. 1061), Vol I, Appendix E - Annex To Paragraph 7.8 (ICT Service Incident Management), and Appendix F - WIS IT Security Incident Response Process \ No newline at end of file +* _Guide to Information Technology Security_ (WMO-No. 1115); +* https://library.wmo.int/idurl/4/28988[_Guide to the WMO Information System_] (WMO-No. 1061), Volume I, Appendix E. Annex To Paragraph 7.8, 1. ICT Service Incident Management; and Appendix F.WIS IT Security Incident Response Process. \ No newline at end of file diff --git a/guide/sections/part5/competencies.adoc b/guide/sections/part5/competencies.adoc index 9e85dd9..138c485 100644 --- a/guide/sections/part5/competencies.adoc +++ b/guide/sections/part5/competencies.adoc @@ -1,3 +1,3 @@ === 5.1 Competencies -For this initial version of the _Guide to WIS_, Volume II, existing guidance on competencies remains largely applicable. Please refer to _Guide to the WMO Information System_ (WMO-No. 1061), Volume I, Appendix A - WIS training and learning guide. \ No newline at end of file +For this initial version of the _Guide to WIS_, Volume II, existing guidance on competencies remains largely applicable. Please refer to _Guide to the WMO Information System_ (WMO-No. 
1061), Volume I, Appendix A. WMO Information System Training and Learning Guide. \ No newline at end of file From 976036af569bd11f15f7e2182a0aaeb08706239e Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Fri, 18 Oct 2024 11:00:50 +0200 Subject: [PATCH 12/20] LSP edits --- guide/sections/part1/data-consumer.adoc | 6 ++---- guide/sections/part1/data-publisher.adoc | 8 ++++---- guide/sections/part1/introduction.adoc | 2 +- guide/sections/part2/global-services.adoc | 14 +++++++------- guide/sections/part2/operations.adoc | 15 +++++++-------- guide/sections/part2/wis2node.adoc | 10 +++++----- 6 files changed, 26 insertions(+), 29 deletions(-) diff --git a/guide/sections/part1/data-consumer.adoc b/guide/sections/part1/data-consumer.adoc index 8cd50f6..0e79850 100644 --- a/guide/sections/part1/data-consumer.adoc +++ b/guide/sections/part1/data-consumer.adoc @@ -12,11 +12,9 @@ The Global Discovery Catalogue is accessible via an API and provides a low-barri ==== 1.2.2 How to subscribe to notifications about the availability of new data -WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocol. +WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. 
If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocol.footnote[Subscribing to notifications about newly available data ensures that the data consumer does not need to continually to poll the data server to check for updates.] -Subscribing to notifications about newly available data ensures that the data consumer does not have to continually to poll the data server to check for updates. - -In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. A data consumer should subscribe to more than one Global Broker, thereby ensuring that notifications continue to be received if a Global Broker instance fails. +In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. Data consumers should subscribe to more than one Global Broker, thereby ensuring that notifications continue to be received if a Global Broker instance fails. A dataset in WIS2 is associated with a unique topic. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). 
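A data consumer subscribed to more than one Global Broker can expect to receive the same notification more than once. The sketch below shows one way to discard such duplicates before downloading. It is illustrative only: the class name `NotificationDeduplicator` is hypothetical, and it assumes that each notification is a JSON document carrying `properties.data_id` and `properties.pubtime`, with `pubtime` always written in the same UTC timestamp format so that plain string comparison orders the values correctly. Connecting to the brokers with an MQTT client is deliberately left out.

```python
import json


class NotificationDeduplicator:
    """Track which data_ids have been handled, whichever broker delivered them.

    Hypothetical helper: the field names used here (properties.data_id,
    properties.pubtime) are assumptions about the notification payload.
    """

    def __init__(self):
        # data_id -> pubtime of the newest notification seen so far
        self._latest_pubtime = {}

    def is_new(self, payload: str) -> bool:
        """Return True if the notification is new or updated, False if it is a duplicate."""
        properties = json.loads(payload)["properties"]
        data_id = properties["data_id"]
        pubtime = properties["pubtime"]

        previous = self._latest_pubtime.get(data_id)
        # Assumes identically formatted UTC timestamps, which sort as strings.
        if previous is not None and pubtime <= previous:
            return False

        self._latest_pubtime[data_id] = pubtime
        return True
```

A single instance would be shared by the message callbacks of every broker connection, so that a notification already seen from one Global Broker is ignored when it arrives from another.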
diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 518bbd8..3ecfa3d 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -35,7 +35,7 @@ These discovery metadata records are then propagated through the Global Service Upon receipt of a new discovery metadata record, a Global Discovery Catalogue (see <<_2_4_4_global_discovery_catalogue>>) will validate, assess, ingest, and publish the record. Validation ensures compliance with the specification, while the assessment evaluates the discovery record against good practices. The Global Discovery Catalogue will notify the data publisher if the discovery metadata record fails validation and provide recommendations for improvements. -Discovery metadata must be published in the Global Discovery Catalogues before the data is published. +Discovery metadata must be published in the Global Discovery Catalogues before the data are published. ==== 1.3.3 How to provide data to WIS2 @@ -118,7 +118,7 @@ Whatever topic is used, the discovery metadata provided to the Global Discovery Core data, as specified in the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. -WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published from WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours.footnote:[A Global Cache provides short-term hosting of data. 
Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. +WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours.footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. The URL included in a notification message that is used to access core data from a WIS2 Node, or the "canonical" URL, if multiple URLs are provided, must: @@ -153,7 +153,7 @@ Recommended data are never cached by the Global Caches. The use of core data must always be free and unrestricted. However, it may be necessary to leverage existing systems with built-in access control when implementing the download service for the WIS2 Node. -Example 1: API key. The data server requires a valid API key to be included in download requests. 
The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for the publication of data via WIS2 so that data usage can be tracked.], footnote:[Given that users are encouraged to download core data from the Global Cache, there will likely be limited access using the API key of the WIS2 account. If the usage quota for the WIS2 account is exceeded (for instance, if further data access is blocked), then this should encourage users to download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] +Example 1: API key. The data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for the publication of data via WIS2 so that data usage can be tracked.], footnote:[Given that users are encouraged to download core data from the Global Cache, there will likely be limited access using the API key of the WIS2 account. If the usage quota for the WIS2 account is exceeded (for instance, if further data access is blocked), users should download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] Example 2: Presigned URLs. The data server uses a cloud-based object store that requires credentials to be provided when downloading data. The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24 hours).footnote:[See working with presigned URLs on Amazon S3: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html.] @@ -165,7 +165,7 @@ Finally, note that if only core data are being published, it may be possible to There is no requirement for a WIS2 Node to publish notification messages about newly available data; however, the mechanism is available if needed (for instance, for real-time data exchange). 
Data archives published via WIS2 do not need to provide notification messages for data unless the user community has expressed a need to be rapidly notified about changes (for example, the addition of new records to a climate observation archive). -However, notification messages must still be used to share discovery metadata with WIS2. Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to "handcraft" notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free starter plan sufficient for the occassional publication of notifications about metadata. These services provide a viable alternative to implementing an MQTT broker instance.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. +However, notification messages must still be used to share discovery metadata with WIS2. Given that the provision of metadata and subsequent updates are likely to be infrequent, it may be sufficient to manually author notification messages as needed and publish them locally on an MQTT brokerfootnote:[MQTT broker managed services are available online, often with a free starter plan sufficient for the occasional publication of notifications about metadata. These services provide a viable alternative to implementing an MQTT broker instance.] or with the help of a GISC. See above for more details on publishing discovery metadata to WIS2. Note that some data archives, for example, Essential Climate Variables, are categorized as core data. Core data may be distributed via the Global Caches; however, given that they provide only short-term data hosting (for instance, for 24 hours), Global Caches are not an appropriate mechanism to provide access to core data archives. These archives must be accessed directly via the WIS2 Node.
diff --git a/guide/sections/part1/introduction.adoc b/guide/sections/part1/introduction.adoc index bdd32fe..f0f2198 100644 --- a/guide/sections/part1/introduction.adoc +++ b/guide/sections/part1/introduction.adoc @@ -2,7 +2,7 @@ Since the Global Telecommunication System (GTS) entered operational life in 1971, it has been a reliable real-time exchange mechanism of essential data for WMO Members. -In 2007, the WMO Information System (WIS) entered operation to complement the GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval of data. The success of WIS was limited, as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to deliver an increasing number and volume of real-time data to WMO centres in a reliable and cost -effective way. +In 2007, the WMO Information System (WIS) entered operation to complement GTS, providing a searchable catalogue and a Global Cache to enable additional discovery, access and retrieval of data. The success of WIS was limited, as the system only partially met the requirement of providing simple access to WMO data. Today’s technology developed for the Internet of Things (IoT) opens the possibility of creating a WIS2 that is able to deliver an increasing number and volume of real-time data to WMO centres in a reliable and cost -effective way. WIS2 has been designed to meet the shortfalls of the current WIS and GTS, support Resolution 1 (Cg-Ext(2021)) – WMO Unified Policy for the International Exchange of Earth System Data (https://library.wmo.int/idurl/4/57850[_World Meteorological Congress: Abridged Final Report of the Extraordinary Session_] (WMO-No. 1281)), support the Global Basic Observing Network (GBON) and meet the demand for high data volume, variety, velocity and veracity. 
diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index 5ba0973..fce453d 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -3,7 +3,7 @@ ==== 2.7.1 Procedure for registering a new Global Service The successful operation of WIS2 depends on a set of Global Services running well-managed IT environments with a very high level of reliability so that all WIS2 users and WIS2 Nodes are able to access and provide the data they need for their duties. -Depending on the nature of the Global Service, the following are the minimum capabilities to ensure that, collectively, the level of service is 100% (or very close): +Depending on the nature of the Global Service, the following are the minimum capabilities needed to ensure that the level of service as a whole reaches 100% (or very close): * Three Global Brokers, with each Global Broker connected to at least two other Global Brokers; * Three Global Caches, with each Global Cache connected to at least two Global Brokers and capable of downloading data from all WIS2 Nodes providing core data; @@ -30,7 +30,7 @@ The WMO Secretariat and other Global Services will make the required changes to The availability of data and the performance of system components within WIS2 are actively monitored by GISCs and the Global Monitor service to ensure proactive responses to incidents and effective capacity planning for future operations. -WIS2 requires that metrics are provided using OpenMetricsfootnote:[See OpenMetrics: https://openmetrics.io.] – the de-facto standardfootnote:[OpenMetrics is proposed as a draft standard within the Internet Engineering Task Force (IETF).] for transmitting cloud-native metrics at scale. Widely adopted, many commercial and open-source software components already come preconfigured to provide performance metrics using the OpenMetrics standard. 
Tools such as Prometheus and Grafana aggregate and visualize metrics provided in this format, making it simple to generate performance insights. +WIS2 requires that metrics are provided using OpenMetricsfootnote:[See OpenMetrics: https://openmetrics.io.] – the widely adopted, de-facto standardfootnote:[OpenMetrics is proposed as a draft standard within the Internet Engineering Task Force (IETF).] for transmitting cloud-native metrics at scale. Many commercial and open-source software components already come preconfigured to provide performance metrics using the OpenMetrics standard. Tools such as Prometheus and Grafana aggregate and visualize metrics provided in this format, making it simple to generate performance insights. WIS2 Global Services (Global Brokers, Global Caches, and Global Discovery Catalogues) provide monitoring metrics about their respective service to Global Monitors. @@ -59,7 +59,7 @@ The full set of the WIS2 monitoring metrics is given in WMO: WIS2 Metric Hierarc ** Could limit the bandwidth usage of the service to 1 Gb/s. * A Global Monitor: ** Should support a minimum of 50 metrics providers; -** Should support 200 simultaneous access to the dashboard; +** Should support 200 simultaneous "access" to the dashboard; ** Could limit the bandwidth usage of the service to 100 Mb/s. * A Global Discovery Catalogue: ** Should support a minimum of 20 000 metadata records; @@ -108,12 +108,12 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This * A Global Cache will host data objects copied from NCs/DCPCs. * A Global Cache will publish notification messages advertising the availability of the data objects it holds. The notification messages will follow the standard structure (see _Manual on WIS_, Volume II -Appendix E. WIS2 Notification Message). * A Global Cache will use the standard topic structure in its local Message Brokers (see _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). 
-* A Global Cache will publish on the topic ``cache/a/wis2/...``. +* A Global Cache will publish to the topic ``cache/a/wis2/...``. * There will be multiple Global Cache to ensure the highly available, low-latency global provision of real-time and near-real-time core data within WIS2. * There will be multiple Global Caches that may attempt to download cacheable data objects from all originating centres with cacheable content. A Global Cache will also download data objects from other Global Caches. This will ensure that each Global Cache has full global coverage, even when direct download from an originating centre is not possible * Global Caches will operate independently of one another. Each Global Cache will hold a full copy of the cache – although there may be small differences between the various Global Caches as data availability notification messages propagate through WIS to each one. There is no formal synchronization between Global Caches. * A Global Cache will temporarily cache all resources published on the ``metadata`` topic. A Global Discovery Catalogue will subscribe to notifications about the publication of new or updated metadata, download the metadata record from the Global Cache and insert it into the catalogue. A Global Discovery Catalogue will also publish a metadata record archive each day containing the complete content of the catalogue and advertise its availability with a notification message. This resource will also be cached by a Global Cache. -* A Global Cache is designed to support real-time content distribution. Data consumers access data objects from a Global Cache instance by resolving the URL in a data availability notification message and downloading the file to which the URL points. Apart from the URL it is transparent to the data consumers from which Global Cache they download the data. There is no need to download the same data object from multiple Global Caches. 
The data id contained within notification messages is used by data consumers and Global Services to detect such duplicates. +* A Global Cache is designed to support real-time content distribution. Data consumers access data objects from a Global Cache instance by resolving the URL in a data availability notification message and downloading the file to which the URL points. Only by checking the URL, is it transparent to the data consumers from which Global Cache they are downloading the data. There is no need to download the same data object from multiple Global Caches. The data id contained within notification messages is used by data consumers and Global Services to detect such duplicates. * There is no requirement for a Global Cache to provide a browsable interface to the files in its repository in order to allow data consumers to discover what content is available. However, a Global Cache may choose to provide such a capability (for example, implemented as a WAF), along with documentation to inform data consumers of how the capability works. * The default behaviour for a Global Cache is to cache all data published under the ``origin/a/wis2/data/+/core`` topic. A data publisher may indicate that data should not be cached by adding the ``"cache": false`` assertion in the WIS2 Notification Message. * A Global Cache may decide not to cache data, for example, if the data are considered too large, or if a WIS2 Node publishes an excessive number of small files. If a Global Cache decides not to cache data, it should behave as though the cache property is set to false and send a message on the monitor topic hierarchy to inform the originating centre and its GISC. The Global Cache operator should work with the originating WIS2 Node and its GISC to remedy this issue. @@ -129,7 +129,7 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This * A Global Cache shall retain the data and metadata it receives for a minimum of 24 hours. 
Requirements relating to varying retention times for different types of data may be added later. * For messages received on the topic ``++origin/a/+/data/core/#++`` or ``++cache/a/+/data/core/#++``, a Global Cache shall: ** If the message contains the property ``"properties.cache": false``, -*** Republish the message at topic ``cache/a/wis2/...`` matching ``+/a/wis2/...`` where the original message has been received after having updated the id of the message. +*** Republish the message at topic ``cache/a/wis2/...``, matching ``+/a/wis2/...`` where the original message has been received, after having updated the id of the message. ** Else *** Maintain a list of data_ids that have already been downloaded; *** Verify whether the message points to new or updated data by comparing the pubtime value of the notification message with the list of data_ids; @@ -200,7 +200,7 @@ wis2-gdc provides the functionality required for the Global Discovery Catalogue, * Metrics reporting; * Implementation of metrics. -wis2-gdc is managed as a free and open source project. Source code, issue tracking and discussions are hosted in the open on GitHub.footnote:[See https://github.com/wmo-im/wis2-gdc.] +wis2-gdc is managed as a free and open source project. Source code, issue tracking and discussions are hosted openly on GitHub: https://github.com/wmo-im/wis2-gdc. ==== 2.7.6 Global Monitor diff --git a/guide/sections/part2/operations.adoc b/guide/sections/part2/operations.adoc index 05588d8..40afd70 100644 --- a/guide/sections/part2/operations.adoc +++ b/guide/sections/part2/operations.adoc @@ -165,7 +165,7 @@ Message Broker. WIS2 defines a standard topic hierarchy to ensure that data are published consistently by all WIS2 Nodes. Notification messages for aviation data should be published on a specific topic allowing a data consumer, such as the gateway, to subscribe only to -aviation-specific notifications. See the example below: +aviation-specific notifications. See the example below. 
.Example topic used to publish notifications about quantitative volcanic ash concentration information: [source,text] @@ -194,7 +194,7 @@ Global Discovery Catalogue subscribes to a Global Broker and downloads the discovery metadata from the WIS2 Node using the URL supplied in the message.] -.Interactions between the gateway and components of WIS2 and SWIM +.Interactions between the gateway component and WIS2 and SWIM components image::images/wis2-to-swim-interaction-temp.png[Figure 3. Interactions between the gateway and components of WIS2 and SWIM] **Configuration** @@ -240,7 +240,7 @@ The choice of protocol for publishing to the SWIM Message Broker should be based on a bilateral agreement between operators of the gateway and the SWIM service. -The gateway should implement logging and error handling as necessary to +The gateway component should implement logging and error handling as necessary to enable reliable operations. WIS2 uses the OpenMetrics standardfootnote:[See OpenMetrics: https://openmetrics.io.] to publish metrics and other operating information. The use of OpenMetrics @@ -249,7 +249,7 @@ easily integrated into the WIS2 system. **Operation** -The gateway may be operated at the national or regional level depending on +The gateway component may be operated at the national or regional level depending on the organizational governance in place. ====== 2.8.1.1.5 SWIM service @@ -299,8 +299,8 @@ principles and approaches. The ability to discover ODIS data on WIS2 (and the re beyond their primary communities of interest. WIS2 Global Discovery Catalogues will provide discovery metadata records -using the OGC API - Records standard. This will include schema.org and JSON-LD -annotations on WCMP2 discovery metadata in the GDC, to enable cross-pollination +using the OGC API - Records standard. The Global Discovery Catalogues will include schema.org and JSON-LD +annotations on WCMP2 discovery metadata to enable cross-pollination and federation. 
ODIS dataset records will be made available using the WCMP2 standard and provided @@ -310,5 +310,4 @@ federated catalogue. ODIS data will be published as recommended data as per the .WIS2 and ODIS metadata and catalogue interoperability image::images/wis2-odis-metadata-discovery-interop.png[Figure 4. WIS2 and ODIS metadata and catalogue interoperability] -As a result, federated discovery will be realized between both systems, allowing for -use and reuse of data in an authoritative manner, closest to the source of the data. +As a result, federated discovery will be realized between both systems, users will be able to access the data from as close as possible to their source, and the data will be able to be used and reused in an authoritative manner. \ No newline at end of file diff --git a/guide/sections/part2/wis2node.adoc b/guide/sections/part2/wis2node.adoc index 21feb64..4b26a27 100644 --- a/guide/sections/part2/wis2node.adoc +++ b/guide/sections/part2/wis2node.adoc @@ -14,7 +14,7 @@ The WMO Secretariat will maintain a WIS2 register with an authoritative list of The registration of a WIS2 Node involves the following steps: -* Request to host a WIS2 Node: A request to host a WIS2 Node shall be put forward by the WIS NFP of the country of the WIS2 Node host centre, or, in the case of international organizations, by either the PR of the country or territory where the centre is located or the president of the relevant organization, if the WMO partner or programme is designated as a DCPC. +* Request to host a WIS2 Node: A request to host a WIS2 Node shall be put forward by the WIS NFP of the country of the WIS2 Node host centre, or, in the case of international organizations, by either the PR with WMO of the country or territory where the centre is located or the president of the relevant organization, if the WMO partner or programme is designated as a DCPC. 
* Assign a centre identifier (“centre-id”): The centre-id is an acronym proposed by the Member and endorsed by the WMO Secretariat. It is a single identifier consisting of a top-level domain (TLD) and a centre name and represents the data publisher, distributor or issuing centre of a given dataset or data product/granule (see the Manual on WIS, Volume II – Appendix D. WIS2 Topic Hierarchy). See guidance on assigning a centre identifier (<<_2_6_1_2_guidance_on_assigning_a_centre_identifier_for_a_wis2_node>>). @@ -68,7 +68,7 @@ When configuring a WIS2 Node, it is necessary to consider how it will be accesse Global Brokers must authenticate when they connect to the MQTT Message Broker in the WIS2 Node. Username and password credentials are used.footnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone`` WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).]. When registering the WIS2 Node with the WMO Secretariat, these credentials must be provided. The WMO Secretariat will share the credentials with the Global Service operators and store them in the WIS register. These credentials should not be considered confidential or secret. -Given that Global Brokers republish notification messages provided by the WIS2 Node, you may decide to restrict access to the MQTT Message Broker. Global Brokers operate using a fixed IP address which allows you to permit them access using IP filtering.footnote:[In WIS2, IP addresses are used to determine the origin of connections and confer trust to remote systems. It is well documented that IP addresses can be hijacked and that more sophisticated mechanisms, such as Public Key Infrastructure (PKI), are available for reliably determining the origin of connection requests. However, the complexities of implementing such mechanisms create barriers to Member participation in WIS2. 
For the purposes of WIS2, which involves distributing publicly accessible data and messages, IP addresses are considered to provide an adequate level of trust.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. +Given that Global Brokers republish notification messages provided by the WIS2 Node, access to the MQTT Message Broker may be restricted. Global Brokers operate using a fixed IP address, which allows access to be granted using IP filtering.footnote:[In WIS2, IP addresses are used to determine the origin of connections and confer trust to remote systems. It is well documented that IP addresses can be hijacked and that more sophisticated mechanisms, such as Public Key Infrastructure (PKI), are available for reliably determining the origin of connection requests. However, the complexities of implementing such mechanisms create barriers to Member participation in WIS2. For the purposes of WIS2, which involves distributing publicly accessible data and messages, IP addresses are considered to provide an adequate level of trust.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. If your WIS2 Node only publishes core data,footnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.] access to the data server may also be restricted, with the distribution of data handled by Global Caches. Global Caches also operate on fixed IP addresses, allowing their connections to be easily identified. Again, access must be granted to more than one Global Broker to ensure resilience. 
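The IP filtering described above amounts to checking the remote address of each incoming connection against an allow-list. The following minimal sketch illustrates that check, assuming a hypothetical allow-list; the addresses are taken from ranges reserved for documentation, not from any real Global Service. In practice this filtering would normally be configured in a firewall or in the Message Broker itself rather than in application code.

```python
import ipaddress

# Hypothetical allow-list. These are documentation-only address ranges, used
# here as stand-ins for the fixed addresses of the Global Services that a
# WIS2 Node operator has chosen to admit.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.10/32"),    # a single fixed address
    ipaddress.ip_network("198.51.100.0/28"),  # a small block of addresses
]


def is_allowed(remote_address: str) -> bool:
    """Return True if the connecting address falls inside an allowed network."""
    address = ipaddress.ip_address(remote_address)
    return any(address in network for network in ALLOWED_NETWORKS)
```

Listing the addresses of more than one Global Service in the allow-list keeps the resilience requirement intact while still restricting access.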
@@ -96,7 +96,7 @@ If contacted by a Global Monitor for a performance issue via a GISC, the WIS2 No When providing a WIS2 Node, Members may use whichever software components they consider most appropriate to comply with the WIS2 technical regulations. -To assist Members, a free and open-source reference implementation called “WIS2 in a box” (wis2box) is available. wis2box implements the requirements for a WIS2 Node and contains additional enhancements. wis2box is built on mature and robust free open-source software components that are widely adopted for operational use. +To assist Members, a free and open-source reference implementation called “WIS2 in a box” (wis2box) is available. wis2box implements the requirements for a WIS2 Node and contains additional enhancements. wis2box is built on free and open-source software components that are mature, robust and widely adopted for operational use. wis2box provides the functionality required for both data publisher and data consumer roles, as well as the following technical functions: @@ -111,8 +111,8 @@ wis2box provides the functionality required for both data publisher and data con * Provision of system performance and data availability metrics; * Access control for publication of recommended data, as required; * Subscription to notifications and download of WIS data from Global Services; -* Modular design, allowing for extending to meet additional requirements or integration with existing data management systems. +* Modular design, which allows for extensibility to meet additional requirements or integration with existing data management systems. The project documentation can be found at https://docs.wis2box.wis.wmo.int. -The wis2box is managed as a free and open source project. The source code, issue tracking and discussions are hosted openly on GitHub: https://docs.wis2box.wis.wmo.int. +wis2box is managed as a free and open source project.
The source code, issue tracking and discussions are hosted openly on GitHub: https://github.com/wmo-im/wis2box. From 9929f133830731f6eb177fe901bd2361d6aef482 Mon Sep 17 00:00:00 2001 From: Anna Milan Date: Wed, 23 Oct 2024 12:26:56 +0200 Subject: [PATCH 13/20] update based on review from LSP in preparation for WMO publication --- guide/sections/part1/data-consumer.adoc | 16 ++-- guide/sections/part1/data-publisher.adoc | 4 +- guide/sections/part1/introduction.adoc | 4 +- guide/sections/part2/global-services.adoc | 8 +- guide/sections/part2/operations.adoc | 23 +++--- guide/sections/part2/wis2node.adoc | 6 +- .../part3/information-management.adoc | 76 +++++++++---------- 7 files changed, 68 insertions(+), 69 deletions(-) diff --git a/guide/sections/part1/data-consumer.adoc b/guide/sections/part1/data-consumer.adoc index 0e79850..1e5d4dd 100644 --- a/guide/sections/part1/data-consumer.adoc +++ b/guide/sections/part1/data-consumer.adoc @@ -12,15 +12,15 @@ The Global Discovery Catalogue is accessible via an API and provides a low-barri ==== 1.2.2 How to subscribe to notifications about the availability of new data -WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. If data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocol.footnote[Subscribing to notifications about newly available data ensures that the data consumer does not need to continually to poll the data server to check for updates.] 
+WIS2 provides notifications about updates to datasets; for example, a notification may indicate that a new observation record from an automatic weather station has been added to a dataset of surface observations. These notifications are published on Message Brokers. Where data consumers need to use data rapidly once they have been published (for example, as inputs to a weather prediction model), they should subscribe to one or more Global Brokers to get notification messages using Message Queuing Telemetry Transport (MQTT) protocol.footnote:[Subscribing to notifications about newly available data ensures that the data consumers do not need to continually poll the data server to check for updates.] -In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. Data consumers should subscribe to more than one Global Broker, thereby ensuring that notifications continue to be received if a Global Broker instance fails. +In WIS2, notifications are republished by Global Brokers to ensure resilient distribution. Consequently, there will be multiple places where one can subscribe. Data consumers requiring real-time notifications must subscribe to Global Brokers. Data consumers should subscribe to more than one Global Broker to ensure that notifications continue to be received if a Global Broker instance fails. A dataset in WIS2 is associated with a unique topic. Notifications about updates to a dataset are published to the associated topic. Topics are organized according to a standard scheme (see the _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy).
A data consumer can find the appropriate topic to subscribe to either by searching the Global Discovery Catalogue, by using an Internet search engine,footnote:[Internet search engines allow data consumers to discover WIS2 datasets by indexing the content in Global Discovery Catalogues.], or by browsing the topic hierarchy on a Message Broker. -WIS2 uses Global Caches to distribute core data, as defined in the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). Each Global Cache republishes core data on its own highly available data server and publishes a new notification message advertising the availability of those data from the Global Cache location. +WIS2 uses Global Caches to distribute core data, as defined in the WMO Unified Data Policy (Resolution 1 (Cg-Ext (2021))). Each Global Cache republishes core data on its own highly available data server and publishes a new notification message advertising the availability of those data from the Global Cache location. Notifications from WIS2 Nodes and Global Caches are published on different topics: The root topic used by WIS2 Nodes is ``origin``, while the root topic used by Global Caches is ``cache``. Other than the root, the topic hierarchy is identical. For example, for synoptic weather observations published by Environment Canada: @@ -58,12 +58,12 @@ If a download link implements access control (for example, the data consumer nee ==== 1.2.5 How to use data -Data are shared on WIS2 in accordance with the Unified Data Policy (Resolution 1 (Cg-Ext (2021))). This data policy describes two categories of data: core and recommended. +Data are shared on WIS2 in accordance with the WMO Unified Data Policy (Resolution 1 (Cg-Ext (2021))). This data policy describes two categories of data: core and recommended. -* Core data are considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data are provided on a free and unrestricted basis. 
-* Recommended data are exchanged on WIS2 in support of Earth system monitoring and prediction efforts. Recommended data _may_ be provided with conditions on use and/or subject to a license. +* Core data are considered essential for the provision of services for the protection of life and property and the well-being of all nations. Core data are provided on a free and unrestricted basis, without charge and with no conditions on use. +* Recommended data are exchanged on WIS2 in support of Earth system monitoring and prediction efforts. Recommended data may be provided with conditions on use and/or subject to a license. -The Unified Data Policy (Resolution 1 (Cg-Ext (2021))) encourages attribution of the source of the data in all cases. This ensures that, credit is given to those who have expended effort and resources in collecting, curating, generating, or processing the data. Attribution provides visibility into who is using the data, which, for many organizations, serves as crucial evidence to justify the continued provision and updating of the data. +The WMO Unified Data Policy (Resolution 1 (Cg-Ext (2021))) encourages attribution of the source of the data in all cases. This ensures that credit is given to those who have expended effort and resources in collecting, curating, generating, or processing the data. Attribution provides visibility into who is using the data, which, for many organizations, serves as crucial evidence to justify the continued provision and updating of the data. Details of the applicable WMO data policy and any rights or licenses associated with the data are provided in the discovery metadata accompanying the data. Discovery metadata records are available from the Global Discovery Catalogue.
@@ -83,7 +83,7 @@ Data consumers wanting to use data published via WIS2 should, at a minimum, read * <<_2_2_roles_in_wis2>> * <<_2_4_wis2_components>> -The following sections in the _Manual on WIS_, Volume II also provide useful information: +The following specifications in the _Manual on WIS_, Volume II also provide useful information: * Appendix D. WIS2 Topic Hierarchy; * Appendix E. WIS2 Notification Message; diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 3ecfa3d..1caf722 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -4,7 +4,7 @@ Data publishers wanting to share authoritative Earth system data that with the W ==== 1.3.1 How to get started -The first thing step is to consider the data, how they can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether they are core or recommended data, as per the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) . +The first step is to consider the data, how they can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether they are core or recommended data, as per the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))). Next, it is important to consider where the data are published. If the data relate to a specific country or territory, they should be published through a National Centre (NC). If they relate to a region, programme, or other specialized function within WMO, they should be published through a Data Collection or Production Centre (DCPC). The functional requirements for NCs and DCPCs are described in the _Manual on WIS_, Volume II - Part III Functions of WIS.
@@ -116,7 +116,7 @@ Whatever topic is used, the discovery metadata provided to the Global Discovery ===== 1.3.3.5 Considerations when providing core data in WIS2 -Core data, as specified in the Unified Data Policy (Resolution 1 (Cg-Ext(2021))) are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. +Core data, as specified in the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))), are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data are provided on a free and unrestricted basis, without charge and with no conditions on use. WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours.footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message.
diff --git a/guide/sections/part1/introduction.adoc b/guide/sections/part1/introduction.adoc index f0f2198..1fb7295 100644 --- a/guide/sections/part1/introduction.adoc +++ b/guide/sections/part1/introduction.adoc @@ -36,7 +36,7 @@ Resolution 1 (Cg-Ext(2021)) describes the Earth system data that are necessary f WIS2 is the mechanism by which these Earth system data are exchanged. -A common practice when working with data is to group them into datasets. All the data in a dataset share some common characteristics. The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations".footnote:[See _Data Catalog Vocabulary (DCAT) - Version 2, W3C Recommendation 04 February 2020_ https://www.w3.org/TR/vocab-dcat-2/#Class:Dataset] +A common practice when working with data is to group them into datasets. All the data in a dataset share some common characteristics. The Data Catalog Vocabulary (DCAT) defines a dataset as a "collection of data, published or curated by a single agent, and available for access or download in one or more representations".footnote:[See _Data Catalog Vocabulary (DCAT) – Version 3, W3C Recommendation 22 August 2024_ https://www.w3.org/TR/vocab-dcat-3/#Class:Dataset] Why is this important? The "single agent" (such as an organization) responsible for managing the collection ensures consistency among the data. For example, in a dataset: @@ -73,7 +73,7 @@ There are some things that are fixed requirements for datasets: Some examples of datasets include: -* The most recent five days of synoptic observations for an entire country or territory; footnote:[Why 5-days in this example? 
Because the system used to publish the data in this example only retains data for 5-days] +* The most recent five days of synoptic observations for an entire country or territory; footnote:[In this example, the system used to publish the data only retains the data for five days. Other systems may retain the data for a longer or shorter period of time.] * A long-term record of observed water quality for a managed set of hydrological stations; * The output from the most recent 24 hours of operational numerical weather prediction model runs; * The output from six months of experimental model runs. It is important to note that the output from operational and experimental model runs should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data; diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index fce453d..c656d88 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -96,7 +96,7 @@ In the following sections, and for each Global Service, a set of metrics is defi ==== 2.7.4 Global Cache -In WIS2, Global Caches provide access to WMO core data for data consumers. This allows data providers to restrict access to their systems to Global Services, and it reduces the need for them to provide high bandwidth and low latency access to their data. Global Caches operate in a way that is transparent to end users in that they resend notification messages from data providers. These messages are updated to point to the Global Cache data store copies of the original data. Global Caches also resend notification messages from data providers for core data that are not stored in the Global Cache, such as when the originator specifies in the notification message that a certain dataset should not be cached. 
In these cases, the notification messages remain unchanged and point to the original source. Data consumers should subscribe to the notification messages from Global Caches instead of the notification messages from data providers for WMO core data. When data consumers receive a notification message, they should follow the URLs from that message, which either point to a Global Cache which has a copy of the data, or – in case of uncached content – point to the original source. +In WIS2, Global Caches provide access to WMO core data for data consumers. This allows data providers to restrict access to their systems to Global Services, and it reduces the need for them to provide high bandwidth and low latency access to their data. Global Caches operate in a way that is transparent to end users in that they resend notification messages from data providers. These messages are updated to point to copies of the original data held in the Global Cache data store. Global Caches also resend notification messages from data providers for core data that are not stored in the Global Cache, such as when the originator specifies in the notification message that a certain dataset should not be cached. In these cases, the notification messages remain unchanged and point to the original source. Data consumers should subscribe to the notification messages from Global Caches instead of the notification messages from data providers for WMO core data. When data consumers receive a notification message, they should follow the URLs from that message, which either point to a Global Cache which has a copy of the data, or – in case of uncached content – point to the original source. ===== 2.7.4.1 Technical considerations @@ -104,7 +104,7 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. 
This ** A highly available data server allowing data consumers to download cache resources with high bandwidth and low latency; ** A Message Broker implementing both MQTTv3.1.1 and MQTTv5 to publish notification messages about resources that are available from the Global Cache; ** A cache management system implementing the features needed to connect to the WIS ecosystem, receive data from WIS2 Nodes and other Global Caches, store the data on the data server and manage the content of the cache (expiration of data, deduplication, and so forth). -* A Global Cache will aim to contain copies of real-time and near real-time data designated as "core" within the Unified Data Policy (Resolution 1 (Cg-Ext(2021))). +* A Global Cache will aim to contain copies of real-time and near real-time data designated as "core" within the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))). * A Global Cache will host data objects copied from NCs/DCPCs. * A Global Cache will publish notification messages advertising the availability of the data objects it holds. The notification messages will follow the standard structure (see _Manual on WIS_, Volume II -Appendix E. WIS2 Notification Message). * A Global Cache will use the standard topic structure in its local Message Brokers (see _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). @@ -131,8 +131,8 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This ** If the message contains the property ``"properties.cache": false``, *** Republish the message at topic ``cache/a/wis2/...``, matching ``+/a/wis2/...`` where the original message has been received, after having updated the id of the message. 
** Else -*** Maintain a list of data_ids that have already been downloaded; -*** Verify whether the message points to new or updated data by comparing the pubtime value of the notification message with the list of data_ids; +*** Maintain a list of ``data_id`` values that have already been downloaded; +*** Verify whether the message points to new or updated data by comparing the pubtime value of the notification message with the list of ``data_id`` values; *** If the message is new or updated: **** Download only new or updated data from the ``href`` or extract the data from the message content; **** If the message contains an integrity value for the data, verify the integrity of the data; diff --git a/guide/sections/part2/operations.adoc b/guide/sections/part2/operations.adoc index 40afd70..bceb3c0 100644 --- a/guide/sections/part2/operations.adoc +++ b/guide/sections/part2/operations.adoc @@ -2,9 +2,7 @@ ==== 2.8.1 Interoperability with external systems -The WIS2 principles enable lowering the barrier to weather/climate/water data for WMO Members. Lowering the barrier is driven by international standards -for data discovery, access, and visualization. In addition to Member benefits, a by-product of utilizing standards is being able to provide -the same data and access mechanisms to external systems at no extra cost of implementation. +Driven by international standards for data discovery, access, and visualization, the WIS2 principles help lower the barrier for WMO Members to weather, climate, and water data. An additional benefit of adopting these standards is that Members are able to provide the same data and access mechanisms to external systems at no extra cost for implementation. WIS2 standards are based on industry standards (OGC, W3C, IETF) and allow for broad interoperability. This means that non-traditional users can also use data from WIS2 data in the same manner, without the requirement for specialized software, tools, or applications. 
@@ -72,7 +70,7 @@ The WIS2 to SWIM interoperability approach employs a gateway component (as per F .Schematic of an interoperability approach image::images/wis2-to-swim-temp.png[Figure 2. Schematic of interoperability approach] -The gateway can operate as an "adapter" between WIS2 and SWIM by pulling +The gateway component can operate as an "adapter" between WIS2 and SWIM by pulling the requisite meteorological data from WIS2 and re-publishing it to SWIM. @@ -195,12 +193,12 @@ the discovery metadata from the WIS2 Node using the URL supplied in the message.] .Interactions between the gateway component and WIS2 and SWIM components -image::images/wis2-to-swim-interaction-temp.png[Figure 3. Interactions between the gateway and components of WIS2 and SWIM] +image::images/wis2-to-swim-interaction-temp.png[Interactions between the gateway component and the WIS2 and SWIM components] **Configuration** Dataset discovery metadata will provide -useful information that can be used to configure the gateway, for example, the +useful information that can be used to configure the gateway component, for example, the topic(s) to subscribe to plus additional information that may be needed for the SWIM service. @@ -225,7 +223,7 @@ resource should be in IWXXM format; ** Create a new data message as per the SWIM specifications, including the unique identifier extracted from the data resource,footnote:[In case a unique identifier is required for proper passing of an aviation -weather message to the gateway, the GTS abbreviated heading +weather message to the gateway component, the GTS abbreviated heading (TTAAii CCCC YYGGgg) in the COLLECT envelop can be used (available in IWXXM messages that have a corresponding TAC message). Alternatively, content in the attribute ``gml:identifier`` (available in newer IWXXM messages such as WAFS SIGWX @@ -237,14 +235,14 @@ within the data message; Broker component of the SWIM service. 
The choice of protocol for publishing to the SWIM Message Broker should -be based on a bilateral agreement between operators of the gateway and +be based on a bilateral agreement between operators of the gateway component and the SWIM service. The gateway component should implement logging and error handling as necessary to enable reliable operations. WIS2 uses the OpenMetrics standardfootnote:[See OpenMetrics: https://openmetrics.io.] to publish metrics and other operating information. The use of OpenMetrics -by the gateway would enable monitoring and performance reporting to be +by the gateway component would enable monitoring and performance reporting to be easily integrated into the WIS2 system. **Operation** @@ -258,12 +256,12 @@ The SWIM aviation weather information service may include of a Message Broker component which implements the Advanced Message Queuing Protocol (AMQP) 1.0 messaging standard.footnote:[See AMQP 1.0: https://www.amqp.org/resources/specifications.] -The Message Broker publishes the data messages provided by the gateway. +The Message Broker publishes the data messages provided by the gateway component. The Message Broker must ensure that data messages are provided only by authorized sources, such as a gateway, and should validate incoming messages as aeronautical meteorological information. -===== 2.8.1.2 The Ocean Data and Information System +===== 2.8.1.2 Ocean Data and Information System The Ocean Data and Information System (ODIS) is a federation of independent data systems, coordinated by the International Oceanographic @@ -305,8 +303,9 @@ and federation. ODIS dataset records will be made available using the WCMP2 standard and provided as objects available via HTTP for ingest, validation and publication to the Global Discovery Catalogues as a -federated catalogue. ODIS data will be published as recommended data as per the Unified Data Policy (Resolution 1 (Cg-Ext(2021))). +federated catalogue. 
ODIS data will be published as recommended data as per the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))). (See Figure 4) + .WIS2 and ODIS metadata and catalogue interoperability image::images/wis2-odis-metadata-discovery-interop.png[Figure 4. WIS2 and ODIS metadata and catalogue interoperability] diff --git a/guide/sections/part2/wis2node.adoc b/guide/sections/part2/wis2node.adoc index 4b26a27..b737b9b 100644 --- a/guide/sections/part2/wis2node.adoc +++ b/guide/sections/part2/wis2node.adoc @@ -16,8 +16,7 @@ The registration of a WIS2 Node involves the following steps: * Request to host a WIS2 Node: A request to host a WIS2 Node shall be put forward by the WIS NFP of the country of the WIS2 Node host centre, or, in the case of international organizations, by either the PR with WMO of the country or territory where the centre is located or the president of the relevant organization, if the WMO partner or programme is designated as a DCPC. -* Assign a centre identifier (“centre-id”): The centre-id is an acronym proposed by the Member and endorsed by the WMO Secretariat. It is a single identifier consisting of a top-level domain (TLD) and a centre name and represents the data publisher, distributor or issuing centre of a given dataset or data product/granule (see the Manual on WIS, Volume II – Appendix D. WIS2 Topic Hierarchy). See guidance on assigning a centre identifier -(<<_2_6_1_2_guidance_on_assigning_a_centre_identifier_for_a_wis2_node>>). +* Assign a centre identifier (“centre-id”): The centre-id is an acronym proposed by the Member and endorsed by the WMO Secretariat. It is a single identifier consisting of a top-level domain (TLD) and a centre name and represents the data publisher, distributor or issuing centre of a given dataset or data product/granule (see the Manual on WIS, Volume II – Appendix D. WIS2 Topic Hierarchy). See <<_2_6_1_2_guidance_on_assigning_a_centre_identifier_for_a_wis2_node>>.
* Complete the WIS2 register: The WIS NFP shall complete the WIS2 register maintained by the WMO Secretariat. * Provide details of the Global Service: The WMO Secretariat provides connection details (such as IP addresses) for the Global Services so that the WIS2 Node can be configured to provide access. @@ -31,6 +30,7 @@ The registration of a WIS2 Node involves the following steps: A diagram of the process for registering a WIS2 Node is presented below (see Figure 1). +.Diagram of the process for registering a WIS2 Node image::images/add-wis2node.png[Adding a WIS2 Node,link=images/add-wis2node.png] Once a WIS2 Node has been registered and connected to the Global Services, it can proceed to register the datasets it will publish via WIS2. To register a dataset, the WIS2 Node publishes discovery metadata about the new dataset. Validation of the discovery metadata is completed by the Global Discovery Catalogues, and the Global Brokers automatically subscribe to the topics provided in the discovery metadata record. For more information, see <<_1_3_2_how_to_provide_discovery_metadata_to_wis2>>. @@ -80,7 +80,7 @@ Access controls may be implemented for recommended data. Only the security schem ===== 2.6.2.1 Service levels and performance indicators -A WIS2 Node must be able to publish datasets, compliant metadata and discovery metadata. This entails: +A WIS2 Node must be able to publish datasets and compliant discovery metadata. 
This entails: * Publishing metadata to the Global Data Catalogue; * Publishing core data to the Global Cache; * Publishing data for consumer access; diff --git a/guide/sections/part3/information-management.adoc b/guide/sections/part3/information-management.adoc index 7dffc99..9460820 100644 --- a/guide/sections/part3/information-management.adoc +++ b/guide/sections/part3/information-management.adoc @@ -12,11 +12,11 @@ _Note: The term "information" is used in a general sense and includes data and p High-level guidance on information management practices that apply in the context of information related to the Earth system is provided in this part of the Guide. Detailed technical information, such as the specification of data formats or quality control and assurance methods, is provided in other parts of the Guide and in other WMO publications. These are referenced where applicable. -The principles of information management are described below. Five focus areas are described in <<_3_1_3_the_information_management_life_cycle>>. These are: +The principles of information management are described below. Section <<_3_1_3_the_information_management_life_cycle>> describes five focus areas. 1. Planning, information creation and acquisition. The creation of information using internal and external data sources and the acquisition of information from various sources. -2. Representation and metadata. The use of standards to represent metadata, data and information; This is of primary importance to enable the interoperability and longterm usability of the information. -3. Publication and exchange of information. The creation and publication of discovery metadata in a standardized format enabling users to discover, access and retrieve the information. +2. Representation and metadata. The use of standards to represent metadata, data and information; this is of primary importance to enable the interoperability and long-term usability of the information. +3. 
Publication and exchange of information. The creation and publication of discovery metadata in a standardized format, enabling users to discover, access and retrieve the information. 4. Usage and communication. The publication of guidance material on the use of published information, including on the limitations and suitability of the information and any licensing terms. 5. Storage, archival and disposal. The policies and procedures for business continuity and disaster recovery, as well as retention and disposal. @@ -27,14 +27,14 @@ This guidance is primarily aimed at personnel within WMO centres, who are respon Specifically, the guidance has five main target audiences across the information life cycle: 1. Information producers or creators (those who produce or acquire the information) - they need to ensure the scientific quality of the underpinning information; -2. Information managers (those who manage information); -3. Information providers or publishers (those who publish the information) - they are responsible for the provision of the information, and for ensuring that appropriate access is enabled, licensing agreements are in place, and so forth; +2. Information managers (those who manage the information); +3. Information providers or publishers (those who publish the information) - they are responsible for the provision of the information and for ensuring that appropriate access is enabled, licensing agreements are in place, and so forth; 4. Service providers (those who disseminate the information) - they are responsible for ensuring information availability and maintaining capability for easy and secure access to the information; 5. Information consumers (those who utilize the information) - they need to understand the restrictions, rights, responsibilities and limitations associated with the information together with the suitability for the intended usage or purpose. 
==== 3.1.2 Principles of information management -The effective management of information is essential for WMO centres to deliver operational services and information that is authoritative, seamless, secure and timely. The principles below underpin and provide a framework for information management across the full information life cycle. These principles are independent of the information type and are largely independent of technology; they are therefore expected to remain stable over time life cycle. +The effective management of information is essential for WMO centres to deliver operational services and information that is authoritative, seamless, secure and timely. The principles below underpin and provide a framework for information management across the full information life cycle. These principles are independent of the information type and are largely independent of technology; they are therefore expected to remain stable over time. ===== 3.1.2.1 Principle 1: Information is a valued asset * An information asset is information that has value. This value may be related to the cost of generating and collecting the information, it may be associated with the immediate use of the information, or it may be associated with the longer -term preservation and subsequent reuse of the information. @@ -43,18 +43,18 @@ The effective management of information is essential for WMO centres to deliver ===== 3.1.2.2 Principle 2: Information must be managed * An information asset must be managed throughout its life cycle, from creation to use to eventual disposal, in a way that makes it valuable, maximizes its benefits and reflects its value in time and its different uses. -* Information managers must consider the entire information life cycle, from identifying needs and business cases to creation, quality assurance, maintenance, reuse, archival and disposal. 
Careful consideration must be given to disposal, ensuring that information is destroyed only when it has ceased to be useful for all categories of users. +* Information managers must consider the entire information life cycle, from identifying needs and business cases to creation, quality assurance, maintenance, reuse, archival, and disposal. Careful consideration must be given to disposal, ensuring that information is destroyed only when it has ceased to be useful for all categories of users. * Professionally qualified and adequately skilled staff with clear roles and responsibilities should apply a sound custodianship framework concerning security, confidentiality and other statutory requirements of different types of information. ===== 3.1.2.3 Principle 3: Information must be fit for purpose * Information should be developed and managed in accordance with its function and use for internal and external users. * WMO centres should regularly assess information to ensure that it is fit for its purpose and that the related processes, procedures and documentation are adequate. -* Processes should be consistent with the general provisions and principles of quality management as described in the _WMO Technical Regulations_ (WMO-No. 49). +* Processes should be consistent with the general provisions and principles of quality management as described in the WMO https://library.wmo.int/idurl/4/35722[_Technical Regulations_] (WMO-No. 49). ===== 3.1.2.4 Principle 4: Information must be standardized and interoperable -* Information must be stored and exchanged in standardized formats to ensure wide usability in the short and longterm. It is essential for longterm archiving that information be stored in a form that can be understood and used after several decades. +* Information must be stored and exchanged in standardized formats to ensure wide usability in the short and long term. 
It is essential for long-term archiving that information be stored in a form that can be understood and used after several decades.
* Standardization is essential for structured information such as dataset definitions and metadata to support interoperability.
* Interoperability is essential for users to be able to utilize information through different systems and software. Open standards help ensure interoperability with their openness and wide adoption across various communities.
* Which standards to use depends on the user community and organizational policies. Interoperability requirements should be considered when selecting the standard for internal use and broader dissemination.
@@ -62,29 +62,29 @@ The effective management of information is essential for WMO centres to deliver
===== 3.1.2.5 Principle 5: Information must be well documented
-* WMO centres should comprehensively document information processes, policies, and procedures to facilitate broad and longterm use.
+* WMO centres should comprehensively document information processes, policies, and procedures to facilitate broad and long-term use.
* WMO centres should keep documentation up to date to ensure full traceability of processes along the information life cycle, particularly for its creation.
* Previous versions of the documentation should be retained, versioned, archived and made readily available for future use. In addition, versions should be assigned a unique and persistent identifier for future unambiguous identification.
===== 3.1.2.6 Principle 6: Information must be discoverable, accessible and retrievable
-* Information should be easy to find through the web, and for this purpose, the publisher should share discovery metadata with a catalogue service. The catalogue service should include a web application programming interface (API) to be used by other applications in order to offer user-tailored search portals.
+* Information should be easy to find through the Web, and for this purpose, the publisher should share discovery metadata with a catalogue service. The catalogue service should include a web API to be used by other applications in order to offer user-tailored search portals.
* For information to be easily retrievable once discovered, it should be accessible using standard data exchange protocols.
===== 3.1.2.7 Principle 7: Information should be reusable
-* In order to maximize the economic benefits of an information asset it should be made as widely available and as accessible as possible.
+* In order to maximize the economic benefits of an information asset, it should be made as widely available and as accessible as possible.
* Resolution 1 (Cg-Ext(2021)) encourages the reuse of data and information through the open and unrestricted exchange of core WMO data. WMO encourages the free and unrestricted exchange of information in all circumstances.
* The publisher should provide an explicit and well-defined licence for each information item or dataset as part of the associated metadata.
* The Findable, Accessible, Interoperable and Reusable (FAIR) data principles promote open data with the ultimate goal of optimizing the reuse of data. These principles should be followed where possible.
_Note: Information on the FAIR data principles can be found at: FAIR Principles - GO FAIR_footnote:[https://go-fair.org]
-===== 3.1.2.8 Principle 8: Information management is subject to accountability and governance.
+===== 3.1.2.8 Principle 8: Information management is subject to accountability and governance
* Information management processes must be governed as the information moves through its life cycle. All information must have a designated owner, steward, curator and custodian. These roles may be invested in the same person but should be clearly defined at the time of creation.
A WMO centre with responsibility of managing information must ensure:
** The implementation of general information management practices, procedures and protocols, including well-defined roles, responsibilities and restrictions on managing the information;
-** The definition and enforcement of appropriate retention policy, taking into account stakeholder needs and variations in value over the information life cycle;
+** The definition and enforcement of an appropriate retention policy, taking into account stakeholder needs and variations in value over the information life cycle;
** The establishment of licensing and the definition and enforcement of any access restrictions.
** The designated owner should have budget and decision-making authority with respect to preservation and data usage, including the authority to pass ownership to another entity.
@@ -121,77 +121,77 @@ Governance covers the rules that apply to managing information in a secure and t
====== 3.1.3.3.1 Planning, information creation and acquisition
-Before the creation or acquisition of new information a business case and an information management plan should be developed, covering both the input information sources and any derived information. The plans should include:
+Before the creation or acquisition of new information a business case plan and an information management plan should be developed, covering both the input information sources and any derived information. The plans should include:
* Why the information is required;
* How it will be collected or created;
* How it will be stored;
* Whether it will be exchanged with other users and under what policy;
-* Where it should be submitted for longterm archival;
+* Where it should be submitted for long-term archival;
* Key roles and responsibilities associated with the management of the information.
-For externally sourced data the plans should include where the information has come from and what the licensing terms are.
+For externally sourced data, the plans should include where the information has come from and what the licensing terms are.
-Once information has been acquired it should be checked to ensure that the contents and format are as expected. This may be done using a compliance checker or validation service. Once these checks have been performed the information content should also undergo quality control checks using well documented procedures to identify any issues. A record of the checks should be kept and any issues detected should be documented and sent back to the originators. It is also important to subscribe to updates from originators so any issues identified externally can be taken into account.
+Once information has been acquired, it should be checked to ensure that the contents and format are as expected. This may be done using a compliance checker or a validation service. Once these checks have been performed, the information content should also undergo quality control checks using well-documented procedures to identify any issues. A record of the checks should be kept, and any issues detected should be documented and sent back to the originators. It is also important to subscribe to updates from originators so any issues identified externally can be taken into account.
-Information created rather than acquired should undergo the same processes as the acquired information. Information created should undergo quality control and the resulting files should be checked against the specified format requirements. The results of the processes and checks should be documented.
+Information created rather than acquired should undergo the same processes as acquired information. Information created should undergo quality control, and the resulting files should be checked against the specified format requirements. The results of the processes and checks should be documented.
-To ensure traceability and reproducibility the information and documents at this, and subsequent stages, should be version controlled and clearly labelled with version information. Similarly, software, or computer code, used to generate or process information should be version controlled with the version information recorded in the documentation and metadata. Where possible, software should be maintained within a code repository.
+To ensure traceability and reproducibility, the information and documents at this and subsequent stages, should be version controlled and clearly labelled with version information. Similarly, software or computer code used to generate or process information should be version controlled with the version information recorded in the documentation and metadata. Where possible, software should be maintained within a code repository.
====== 3.1.3.3.2 Representation and metadata
-The formats used to store and exchange information should be standardized to ensure its usability in both the short and the longterm. It is essential that the information be accessible many years after archival if required. To ensure this usability, the format and version of the information should be recorded in the information metadata record and included within the information itself where the format allows.
+The formats used to store and exchange information should be standardized to ensure its usability in both the short and the long term. It is essential that the information be accessible many years after archival if required. To ensure this usability, the format and version of the information should be recorded in the information metadata record and included within the information itself where the format allows.
-Information exchanged on the WIS and between WMO centres is standardized through the use of the formats specified in the _Manual on Codes_ (WMO-No. 306), Volume I.2 and the _Manual on WIS_, Volume II.
These include the GRIB and BUFR formats for numerical weather prediction products and observational data and the WMO Core Metadata Profile for discovery, access and retrieval metadata. The format for the exchange of station and instrumental metadata, WIGOS Metadata Data Representation, is defined in the https://library.wmo.int/records/item/35769-manual-on-codes-volume-i-3-international-codes?offset=2[_Manual on Codes_] (WMO-No. 306), Volume I.3.
+Information exchanged on WIS and between WMO centres is standardized through the use of the formats specified in the _Manual on Codes_ (WMO-No. 306), Volume I.2 and the _Manual on WIS_, Volume II. These include the GRIB and BUFR formats for numerical weather prediction products and observational data and the WMO Core Metadata Profile for discovery, access and retrieval metadata. The format for the exchange of station and instrumental metadata, WMO Integrated Global Observing System (WIGOS) Metadata Data Representation, is defined in the https://library.wmo.int/idurl/4/35769[_Manual on Codes_] (WMO-No. 306), Volume I.3.
-These formats have been developed within the WMO community to enable the efficient exchange of information between WMO centres and to enable the information to be interoperable between centres and systems. The formats, including detailed technical information, have also been published openly through the WMO manuals, permitting other communities to use the formats and the information and promoting the reuse of the information.
+These formats have been developed within the WMO community to enable the efficient exchange of information between WMO centres and to enable the information to be interoperable between centres and systems. The formats, including detailed technical information, have also been published in WMO manuals, permitting other communities to use the formats and the information and promoting the reuse of the information.
The WMO formats specified in the manuals are subject to strong governance processes, and changes to the formats can be traced through the versions of the manuals. The code tables and controlled vocabularies are also maintained in a code repository. To enable future reuse, the technical information, including detailed format specifications, should be archived alongside information for future access. This includes any controlled vocabulary, such as BUFR tables or WIGOS metadata code lists, associated with the format.
====== 3.1.3.3.3 Publication and exchange of information
-To maximize the benefits and return on investment in the acquisition and generation of information there needs to be a clear method as to how the information will be published, exchanged and accessed by users.
+To maximize the benefits and return on investment in the acquisition and generation of information, there needs to be a clear method as to how the information will be published, exchanged and accessed by users.
-Information is published on WIS through the creation of discovery metadata records. These records are publicly searchable and retrievable via WMO cataloguing services, providing access to the records via the web and via a web application programming interface (API). The metadata records should include information on how to access the described datasets and services (see _Manual on WIS_, Volume II – Appendix F: WMO Core Metadata Profile) and how to subscribe to receive updates and new data.
+Information is published on WIS through the creation of discovery metadata records. These records are publicly searchable and retrievable via WMO cataloguing services, providing access to the records via the Web and via a web API. The metadata records should include information on how to access the described datasets and services (see _Manual on WIS_, Volume II – Appendix F. WMO Core Metadata Profile (Version 2)) and how to subscribe to receive updates and new data.
-Technical regulations are provided in the _Manual on WIS_, Volume II. Before exchange and publication, the metadata should be assessed using the WMO Core Metadata Profile KPIs to ensure usable and high-quality metadata in addition to metadata that conforms to the technical standard.
+Technical regulations are provided in the _Manual on WIS_, Volume II. Before exchange and publication, the metadata should be assessed using the WMO Core Metadata Profile KPIs to ensure usable and high-quality metadata in addition to metadata that conform to the technical standard.
The web standards and protocols used should be adequately documented to enable users to find and retrieve the information. This should be possible both manually and automatically via machine-to-machine interfaces and should be standardized between centres.
-Updates to the information exchanged on WIS, including the publication of new information or the cessation of previously exchanged information, is published in the WMO Operational Newsletter.
+Updates to the information exchanged on WIS, including the publication of new information or the cessation of previously exchanged information, are published in the WMO Operational Newsletter.
-_Note: The newsletter is available from: https://community.wmo.int/news/operational-newsletter_
+_Note: The newsletter is available from: https://community.wmo.int/news/operational-newsletter_.
====== 3.1.3.3.4 Usage and communication
For information to have value, it must inform users, aid knowledge discovery and have an impact through informed decision-making. Ensuring that the user can make effective use of the information is an important step in the information management life cycle. This is accomplished in two ways:
-1. By providing suitable information within the discovery metadata, enabling users to discover and access the information - including licensing information - and to assess whether it meets their requirements;
+1.
By providing suitable information within the discovery metadata, enabling users to discover and access the information, including licensing information, and to assess whether it meets their requirements;
2. By providing user guides and documentation on the suitability of the information for different uses, including any technical caveats or restrictions on the use of the information.
-For common types of information, the guides may be generic or link to standard documentation. Information on the observations available from the WIGOS is provided within the https://library.wmo.int/idurl/4/55063[_Manual on the WMO Integrated Global Observing System_] (WMO-No. 1160) and the https://library.wmo.int/idurl/4/55696[_Guide to the WMO Integrated Global Observing System_] (WMO-No. 1165). This includes information on the expected uses and quality of the data. Similarly, information on the data and products available through the WMO Integrated Processing and Prediction System is provided in the https://library.wmo.int/idurl/4/35703[_Manual on the WMO Integrated Processing and Prediction System_] (formerly Manual on the Global Data Processing and Forecasting System) (WMO-No. 485).
+For common types of information, the guides may be generic or link to standard documentation. Information on the observations available from WIGOS is provided in the https://library.wmo.int/idurl/4/55063[_Manual on the WMO Integrated Global Observing System_] (WMO-No. 1160) and the https://library.wmo.int/idurl/4/55696[_Guide to the WMO Integrated Global Observing System_] (WMO-No. 1165). This includes information on the expected uses and quality of the data. Similarly, information on the data and products available through the WMO Integrated Processing and Prediction System is provided in the https://library.wmo.int/idurl/4/35703[_Manual on the WMO Integrated Processing and Prediction System_] (formerly the Manual on the Global Data Processing and Forecasting System) (WMO-No. 485).
-For non-standard and specialist products targeted user guides may be more appropriate. These should be accessible and retrievable via a link within the discovery metadata and should include a plain text summary for the non-technical user. Any user guide should be in addition to the technical documentation described under <<_3_1_3_3_1_planning_information_creation_and_acquisition>>.
+For non-standard and specialist products, targeted user guides may be more appropriate. These should be accessible and retrievable via a link within the discovery metadata and should include a plain text summary for the non-technical user. Any user guide should be in addition to the technical documentation described in <<_3_1_3_3_1_planning_information_creation_and_acquisition>>.
-Updates and the availability of new information should be announced and published via the WMO Operational Newsletter (see <<_3_1_3_3_3_publication_and_exchange_of_information>>). Other communication methods may also be used but these should not be in place of the operational newsletter. It is also recommended that users be allowed to subscribe to the newsletter to receive updates directly.
+Updates and the availability of new information should be announced and published via the WMO Operational Newsletter (see <<_3_1_3_3_3_publication_and_exchange_of_information>>). Other communication methods may also be used, but these should not be in place of the operational newsletter. It is also recommended that users be allowed to subscribe to the newsletter to receive updates directly.
The discovery metadata should include a valid point of contact, enabling users to provide feedback and ask questions about the information provided.
====== 3.1.3.3.5 Storage, archival and disposal
-The type of storage used should be appropriate to the type of information stored. Core information exchanged operationally should be stored and made available via high-availability and low latency media and services.
For some operation critical information, such as hazard warnings, there is a requirement for the end-to-end global distribution of the information to be completed in two minutes. For other operational data there is a requirement for the global exchange to be completed in 15 minutes.
+The type of storage used should be appropriate to the type of information stored. Core information exchanged operationally should be stored and made available via high-availability and low-latency media and services. For some operation-critical information, such as hazard warnings, there is a requirement for the end-to-end global distribution of the information to be completed in two minutes. For other operational data, there is a requirement for the global exchange to be completed in 15 minutes.
-The storage requirements for non-operational services and information may be different but the guidance provided in this section applies equally. Further information on the performance requirements is provided within the WIS2 technical specifications listed in the _Manual on WIS_, Volume II.
+The storage requirements for non-operational services and information may be different, but the guidance provided in this section applies equally. Further information on the performance requirements is provided within the WIS2 technical specifications listed in the _Manual on WIS_, Volume II.
Backup policies and data recovery plans should be documented as part of the information management plan. They should be implemented either before or when the information is created or acquired and should include both the information and the associated metadata. The backup and recovery process should be routinely tested.
-Business rules governing access to, and modification of the information should be clearly documented in the information management plan. These must include the clear specification of the roles and responsibilities of those managing the information.
Information on who can authorize the archival and disposal of the information and the processes for doing so should be included. The roles associated with an information resource are standardized as part of the WMO Core Metadata Profile.
+Business rules governing access to and modification of the information should be clearly documented in the information management plan. These must include the clear specification of the roles and responsibilities of those managing the information. Information on who can authorize the archival and disposal of the information and the processes for doing so should be included. The roles associated with an information resource are standardized as part of the WMO Core Metadata Profile.
-The archival and longterm preservation of an information resource should be identified and included in the information management plan. This may be at a national data centre and/or a WMO centre. WMO centres are recommended for globally exchanged core data and include those centres contributing to the Global Atmosphere Watch, the Global Climate Observing System and Marine Climate Data System (see https://library.wmo.int/idurl/4/41592[_Manual on Marine Meteorological Services_] (WMO-No. 558), Volume II, as well as the WMO World Data Centres and those defined in the _Manual on WIS_, Volume II and those defined in the _Manual on the WMO Integrated Processing and Prediction System_ (formerly _Manual on the Global Data Processing and Forecasting System_) (WMO-No. 485).
+The archival and long-term preservation of an information resource should be identified and included in the information management plan. It may take place at a national data centre and/or a WMO centre. WMO centres are recommended for globally exchanged core data and include those centres contributing to the Global Atmosphere Watch, the Global Climate Observing System and Marine Climate Data System (see https://library.wmo.int/idurl/4/41585[_Manual on Marine Meteorological Services_] (WMO-No.
558), Volume I, as well as the WMO World Data Centres and in the _Manual on WIS_, Volume II and those defined in the _Manual on the WMO Integrated Processing and Prediction System_ (formerly the _Manual on the Global Data Processing and Forecasting System_) (WMO-No. 485).
-Earth system information, especially observational data, is often irreplaceable. Other information, while technically replaceable, is often costly to produce and therefore not easily replaceable. This includes outputs from numerical models and simulations. Before an information resource is marked for disposal careful consideration must be given to whether longterm archival or disposal is more appropriate. This consideration must follow a clearly defined process documented in the information management plan.
+Earth system information, especially observational data, is often irreplaceable. Other information, while technically replaceable, is often costly to produce and therefore not easily replaceable. This includes outputs from numerical models and simulations. Before an information resource is marked for disposal, careful consideration must be given to whether long-term archival or disposal is more appropriate. This consideration must follow a clearly defined process documented in the information management plan.
-When an information resource is marked for disposal the reasons for disposal, including the outcome of the consultation with stakeholders and users, must clearly be documented. The disposal must be authorized by the identified owner and custodian of the information. Information relating to the disposal must be included in the metadata associated with the information resource. The metadata must be retained for future reference.
+When an information resource is marked for disposal, the reasons for disposal, including the outcome of the consultation with stakeholders and users, must clearly be documented.
The disposal must be authorized by the identified owner and custodian of the information. Information relating to the disposal must be included in the metadata associated with the information resource. The metadata must be retained for future reference.
==== 3.1.4 Other considerations
From dd2a6861aa5b6ea84fcc8c2f3d4e1c8b03a85dd3 Mon Sep 17 00:00:00 2001
From: Anna Milan
Date: Wed, 23 Oct 2024 15:04:29 +0200
Subject: [PATCH 14/20] minor LSP revisions
---
 guide/sections/part1/data-publisher.adoc | 10 +++++-----
 guide/sections/part2/global-services.adoc | 14 +++++++-------
 guide/sections/part2/operations.adoc | 6 +++---
 guide/sections/part2/wis2-architecture.adoc | 2 +-
 4 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc
index 1caf722..9f800fa 100644
--- a/guide/sections/part1/data-publisher.adoc
+++ b/guide/sections/part1/data-publisher.adoc
@@ -1,6 +1,6 @@
=== 1.3 Data publisher
-Data publishers wanting to share authoritative Earth system data that with the WMO community should read the guidance presented here. A list of references to informative material in this Guide and elsewhere is provided at the end of this section.
+Data publishers wanting to share authoritative Earth system data with the WMO community should read the guidance presented here. A list of references to informative material in this Guide and elsewhere is provided at the end of this section.
==== 1.3.1 How to get started
@@ -39,7 +39,7 @@ Discovery metadata must be published in the Global Discovery Catalogues before t
==== 1.3.3 How to provide data to WIS2
-WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources.
In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986].), which can be used to resolve the identifieed resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP).
+WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986].), which can be used to resolve the identified resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP).
In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network location). See RFC 3986: https://datatracker.ietf.org/doc/html/rfc3986.], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS), wherein the communication protocol is encrypted using Transport Layer Security (TLS)] It is up to the data server to decide what to provide when resolving the identifier.
For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters.
@@ -47,7 +47,7 @@ The following sections cover specific considerations relating to publishing data
===== 1.3.3.1 Data formats and encodings
-Whether providing data as files or through interactive APIs, data publishers need to decide which encodings (_data formats_) to use. WMO technical regulations may require that data be encoded in specific formats. For example, synoptic observations must be encoded in Binary Universal Form for the Representation of meteorological data (BUFR). The https://library.wmo.int/idurl/4/35625[_Manual on Codes_] (WMO-No. 306) provides details of data formats formally approved for use in WMO.
However, the technical regulations do not cover all data sharing requirements. In such cases, data publishers should select data formats that are open, non-proprietary, widely adopted, and understood in the target user community. In this context, “open” means that anyone can use the format without needing a license – either to encode data in that format or to write software that understands it.
+Whether providing data as files or through interactive APIs, data publishers need to decide which encodings (data formats) to use. WMO technical regulations may require that data be encoded in specific formats. For example, synoptic observations must be encoded in Binary Universal Form for the Representation of meteorological data (BUFR). The https://library.wmo.int/idurl/4/35625[_Manual on Codes_] (WMO-No. 306) provides details of data formats formally approved for use in WMO. However, the technical regulations do not cover all data sharing requirements. In such cases, data publishers should select data formats that are open, non-proprietary, widely adopted, and understood in the target user community. In this context, “open” means that anyone can use the format without needing a license – either to encode data in that format or to write software that understands it.
===== 1.3.3.2 Providing data as files
@@ -81,7 +81,7 @@ Importantly, when considering the use of interactive APIs to serve data, it is n
Based on the experience of data publishers that have been using web APIs to serve their communities, this Guide makes the following recommendations regarding interactive APIs:
* First, interactive APIs should be self-describing. Data consumers should not need to know, a priori, how to make requests from an API. They should be able to discover this information from the API endpoint itself – even if this simply entails a link to a documentation page they need to read.
-* Second, interactive APIs should comply with OpenAPIfootnote:[See OpenAPI Specification v3.1.0: https://spec.openapis.org/oas/v3.1.0.] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) that can read this metadata and automatically generate client applications to query the API is widely available.
+* Second, APIs should comply with OpenAPIfootnote:[See OpenAPI Specification v3.1.0: https://spec.openapis.org/oas/v3.1.0.] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) that can read this metadata and automatically generate client applications to query the API is widely available.
* Third, the OGC has developed a suite of APIsfootnote:[See OGC API: https://ogcapi.ogc.org/.]
(called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR),footnote:[See OGC API - Environmental Data Retrieval (EDR): https://ogcapi.ogc.org/edr.] OGC API – Features,footnote:[See OGC API - Features: https://ogcapi.ogc.org/features.] and OGC API - Coveragesfootnote:[See OGC API - Coverages: https://ogcapi.ogc.org/coverages.] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. Finally, it is advisable to consider versioning the API to avoid breaking changes when adding new features. A common approach is to add a _version number_ prefix into the API path, for example, ``/v1/service/{rest-of-path}`` or ``/service/v1/{rest-of-path}``. @@ -96,7 +96,7 @@ To enable real-time data sharing,footnote:[In the context of WIS2, real time imp For example, when a new temperature profile from a radiosonde deployment is added to a dataset of upper-air data measurements, a notification message will be published that includes the URL used to access the new temperature profile data. All subscribers to notification messages about the upper-air measurement dataset will receive the notification message and be able to identify the URL and download the new temperature profile data. -Optionally, data may be embedded in a notification message using a content object in addition to being published via the data server. Inline data must be encoded as ``UTF-8``, ``Base64``, or ``gzip``, and must not exceed 4096 bytes in length once encoded. 
+Optionally, data may be embedded in a notification message using a ``content`` object in addition to being published via the data server. Inline data must be encoded as UTF-8, Base64, or gzip, and must not exceed 4096 bytes in length once encoded. Notification messages are encoded as GeoJSON (RFC 7946) and must conform to the _Manual on WIS_, Volume II, Appendix E. WIS2 Notification Message. diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index 0f4e8c1..0e1ed58 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -36,11 +36,11 @@ WIS2 Global Services (Global Brokers, Global Caches, and Global Discovery Catalo There is no requirement for WIS2 Nodes to provide monitoring metrics. However their WIS2 interfaces may be queried remotely by Global Services, which can then provide metrics on the availability of WIS2 Nodes. -Metrics for WIS2 monitoring should follow the naming convention wmo_<program>_<name>, where <program> is the name of the responsible WMO programme and <name> is the name of the metric. Examples of WIS2 metrics include: +Metrics for WIS2 monitoring should follow the naming convention ``wmo_<program>_<name>``, where ``<program>`` is the name of the responsible WMO programme and ``<name>`` is the name of the metric. Examples of WIS2 metrics include: - wmo_wis2_gc_downloaded_total, and + ``wmo_wis2_gc_downloaded_total``, and - wmo_wis2_gb_messages_invalid_total. + ``wmo_wis2_gb_messages_invalid_total``. The full set of the WIS2 monitoring metrics is given in WMO: WIS2 Metric Hierarchy footnote:[See https://github.com/wmo-im/wis2-metric-hierarchy.] @@ -81,14 +81,14 @@ In the following sections, and for each Global Service, a set of metrics is defi * A Global Broker is built around two software components: ** An off the shelf broker implementing both MQTT 3.1.1 and MQTT 5.0 in a highly available setup, typically in a cluster mode. 
Tools such as EMQX, HiveMQ, VerneMQ, RabbitMQ (in its latest versions) are compliant with these requirements. The open source version of Mosquitto cannot be clustered and therefore should not be used as part of a Global Broker. -** Additional required features, including anti-loop detection, notification message format compliance, validation of the published topic, and metrics provision. +** Additional features, including anti-loop detection, notification message format compliance, validation of the published topic, and metrics provision. * When receiving a message from a WIS centre or a Global Service broker, the metric ``wmo_wis2_gb_messages_received_total`` will be increased by 1. * A Global Broker will check if a discovery metadata record exists corresponding to the topic on which a message has been published. If there is no corresponding discovery metadata record, the Global Broker will discard non-compliant messages and will raise an alert. The metric ``wmo_wis2_gb_messages_no_metadata_total`` will be increased by 1. The Global Broker should not request information from a Global Discovery Catalogue for each notification message but should keep a cache of all valid topics for every ``centre-id``. * A Global Broker will check that the topic on which the message is received is valid. If the topic is invalid, the Global Broker will discard non-compliant messages and will raise an alert. The metric ``wmo_wis2_gb_invalid_topic_total`` will be increased by 1. * During the pre-operational phase (2024), a Global Broker will not discard the message but instead will send a message on the `monitor` topic hierarchy to inform the originating centre and its GISC. * A Global Broker will validate notification messages against the standard format (see _Manual on WIS_, Volume II – Appendix E. WIS2 Notification Message), discarding non-compliant messages and raising an alert. The metric ``wmo_wis2_gb_invalid_format_total`` will be increased by 1. 
-* A Global Broker will republish a message only once. It will record the message identifier (``id``) (as defined in the WIS2 Notification Message), of messages already published and will discard subsequent identical messages (those with the same message id). This is the anti-loop feature of the Global Broker. +* A Global Broker will republish a message only once. It will record the message identifier (``id``) (as defined in the WIS2 Notification Message), of messages already published and will discard subsequent identical messages (those with the same message ``id``). This is the anti-loop feature of the Global Broker. * When publishing a message to the local broker, the metric ``wmo_wis2_gb_messages_published_total`` will be increased by 1. * All above-defined metrics will be made available on HTTPS endpoints that the Global Monitor will ingest from regularly. * As a convention, the Global Broker centre-id will be ``tld-{centre-name}-global-broker``. @@ -110,7 +110,7 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This * A Global Cache will use the standard topic structure in its local Message Brokers (see _Manual on WIS_, Volume II - Appendix D. WIS2 Topic Hierarchy). * A Global Cache will publish to the topic ``cache/a/wis2/...``. * There will be multiple Global Cache to ensure the highly available, low-latency global provision of real-time and near-real-time core data within WIS2. -* There will be multiple Global Caches that may attempt to download cacheable data objects from all originating centres with cacheable content. A Global Cache will also download data objects from other Global Caches. This will ensure that each Global Cache has full global coverage, even when direct download from an originating centre is not possible +* There will be multiple Global Caches that may attempt to download cacheable data objects from all originating centres with cacheable content. 
A Global Cache will also download data objects from other Global Caches. This will ensure that each Global Cache has full global coverage, even when direct download from an originating centre is not possible. * Global Caches will operate independently of one another. Each Global Cache will hold a full copy of the cache – although there may be small differences between the various Global Caches as data availability notification messages propagate through WIS to each one. There is no formal synchronization between Global Caches. * A Global Cache will temporarily cache all resources published on the ``metadata`` topic. A Global Discovery Catalogue will subscribe to notifications about the publication of new or updated metadata, download the metadata record from the Global Cache and insert it into the catalogue. A Global Discovery Catalogue will also publish a metadata record archive each day containing the complete content of the catalogue and advertise its availability with a notification message. This resource will also be cached by a Global Cache. * A Global Cache is designed to support real-time content distribution. Data consumers access data objects from a Global Cache instance by resolving the URL in a data availability notification message and downloading the file to which the URL points. Only by checking the URL, is it transparent to the data consumers from which Global Cache they are downloading the data. There is no need to download the same data object from multiple Global Caches. The data id contained within notification messages is used by data consumers and Global Services to detect such duplicates. @@ -211,4 +211,4 @@ wis2-gdc is managed as a free and open source project. Source code, issue track * The Global Monitor will monitor the "health" (that is, the performance) of components at NCs/DCPCs, as well as Global Services. * The Global Monitor will provide a web-based dashboard that displays the WIS2 system performance and data availability. 
* As a convention, the Global Monitor centre-id will be ``tld-{centre-name}-global-monitor``. -* The main task of the Global Monitor will be to regularly query the metrics provided by the relevant WIS2 entities, aggregate and process the data and then provide the results to the end user in an appropriate format. +* The main task of the Global Monitor will be to regularly query the metrics provided by the relevant WIS2 entities, aggregate and process the data and then provide the results to the end user in a suitable presentation. diff --git a/guide/sections/part2/operations.adoc b/guide/sections/part2/operations.adoc index bceb3c0..0ba81d4 100644 --- a/guide/sections/part2/operations.adoc +++ b/guide/sections/part2/operations.adoc @@ -68,7 +68,7 @@ Aeronautical Fixed Service (AFS) are defined solely under the ICAO regulatory fr The WIS2 to SWIM interoperability approach employs a gateway component (as per Figure 2): .Schematic of an interoperability approach -image::images/wis2-to-swim-temp.png[Figure 2. Schematic of interoperability approach] +image::images/wis2-to-swim-temp.png[Schematic of interoperability approach] The gateway component can operate as an "adapter" between WIS2 and SWIM by pulling the requisite meteorological data from WIS2 and re-publishing it @@ -101,8 +101,8 @@ Datasets are a central concept in WIS2. Where meteorological data is published via WIS2, it will be packaged into datasets. The data should be grouped at the country/territory level (for instance, datasets should be published for a given country/territory), one for each datatype (for example, -aerodrome observation, aerodrome forecast,quantitative volcanic ash -concentration, and so forth). +aerodrome observation, aerodrome forecast, quantitative volcanic ash +concentration information, and so forth). 
For the purposes of publishing through WIS2, datasets containing aeronautical meteorological information should be considered as recommended data, as described in Resolution 1 (Cg-Ext(2021)). diff --git a/guide/sections/part2/wis2-architecture.adoc b/guide/sections/part2/wis2-architecture.adoc index 3baaf44..c8cfe13 100644 --- a/guide/sections/part2/wis2-architecture.adoc +++ b/guide/sections/part2/wis2-architecture.adoc @@ -50,7 +50,7 @@ These roles are outlined below. * Data consumers determine whether the data or metadata referenced in the notification messages are required. * Data consumers download data from a Global Cache or WIS2 Node. -=== 2.3 WIS2 Specifications +=== 2.3 WIS2 specifications Leveraging existing open standards, WIS2 defines the following specifications in support of publication, subscription, notification and discovery: From fd16d05667d7bc4cd98fd4dcac2f59e8268983da Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 15:12:55 +0100 Subject: [PATCH 15/20] minor changes from review --- guide/sections/part1/data-publisher.adoc | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/guide/sections/part1/data-publisher.adoc b/guide/sections/part1/data-publisher.adoc index 9f800fa..9a8f65f 100644 --- a/guide/sections/part1/data-publisher.adoc +++ b/guide/sections/part1/data-publisher.adoc @@ -4,7 +4,7 @@ Data publishers wanting to share authoritative Earth system data with the WMO co ==== 1.3.1 How to get started -The first thing step is to consider the data, how they can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether they are core or recommended data, as per the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))) . 
+The first step is to consider the data, how they can be conceptually grouped into one or more datasets (see <<_1_1_4_why_are_datasets_so_important?>>), and whether they are core or recommended data, as per the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))) . Next, it is important to consider where the data are published. If the data relate to a specific country or territory, they should be published through a National Centre (NC). If they relate to a region, programme, or other specialized function within WMO, they should be published through a Data Collection or Production Centre (DCPC). The functional requirements for NCs and DCPCs are described in the _Manual on WIS_, Volume II - Part III Functions of WIS. @@ -39,9 +39,9 @@ Discovery metadata must be published in the Global Discovery Catalogues before t ==== 1.3.3 How to provide data to WIS2 -WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986].), which can be used to resolve the identified resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP). +WIS2 is based on the web architecture.footnote:[See Architecture of the World Wide Web, Volume One: https://www.w3.org/TR/webarch/.] As such it is _resource oriented_. Datasets are resources; the "granules" of data grouped in a dataset are resources; and the discovery metadata records that describe datasets are resources. 
In web architecture, every resource has a unique identifier (such as a URIfootnote:[See RFC 3986 - Uniform Resource Identifier (URI) - Generic Syntax: https://datatracker.ietf.org/doc/html/rfc3986.]), which can be used to resolve the identified resource and interact with it (for example, to download a representation of the resource over an open-standard protocol such as HTTP). -In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network location). See RFC 3986: https://datatracker.ietf.org/doc/html/rfc3986.], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS), wherein the communication protocol is encrypted using Transport Layer Security (TLS)] It is up to the data server to decide what to provide when resolving the identifier. For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters. +In simple terms, data (and metadata) are provided to WIS2 by assigning them a unique identifier, in this case a URLfootnote:[The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (such as its network location). 
See RFC 3986: https://datatracker.ietf.org/doc/html/rfc3986.], and making them available via a data server - most typically a web server using HTTP protocol.footnote:[WIS2 strongly prefers secure versions of protocols (such as HTTPS), wherein the communication protocol is encrypted using Transport Layer Security (TLS).] It is up to the data server to decide what to provide when resolving the identifier. For example, the URL of a data granule may resolve as a representation encoded in a given data format, whereas the URL of a dataset may resolve as a description of the dataset (that is, metadata) that includes links to access the data from which the dataset is comprised - either individual files (that is, the data granules) or an interactive API that enables users to request only the parts of the dataset they need by specifying query parameters. The following sections cover specific considerations relating to publishing data to WIS2. @@ -62,7 +62,7 @@ WIS2 does not require the use of specific naming conventions. Another approach to enhance the usability of the data is to complement the collections (such as directories or folders in which files are grouped) with information that describes their content. Then users, both humans and software agents, can browse the structure and find what they need. Examples of this approach include: * Web Accessible Folders (WAF) and README files: A web-based folder structure listing the data object files by name, where each folder contains a formatted README file describing the folder contents; -* SpatioTemporal Asset Catalog (STAC):footnote[See STAC: SpatioTemporal Asset Catalogs: https://stacspec.org/en.] A community standard based on GeoJSON to describe geospatial data files that can be easily indexed, browsed and accessed. Free and open source tools present STAC records (one for each data object file) through a web-based, browsable user interface. 
+* SpatioTemporal Asset Catalog (STAC)footnote:[See STAC: SpatioTemporal Asset Catalogs: https://stacspec.org/en.]: A community standard based on GeoJSON to describe geospatial data files that can be easily indexed, browsed and accessed. Free and open source tools present STAC records (one for each data object file) through a web-based, browsable user interface. When publishing collections of data, it is tempting to package content into zip or submission information package (SIP)footnote:[See https://www.iasa-web.org/tc04/submission-information-package-sip or https://user.eumetsat.int/resources/user-guides/formats.] resources - perhaps even to package the entire collection, including folders, into a single resource. Similarly, WMO formats such as GRIB and BUFR allow multiple data objects (such as fields or observations) to be packed into a single file. Downloading a single resource is convenient for many users, but the downside is that the user must download the entire resource and then unpack/decompress it. The convenience of downloading fewer resources must be balanced against the cost of forcing users to download data they may not need. The decision should be guided by common practice in the specific domain - for example, only using zip files, SIP resources, or packing files if this is what the users expect. @@ -82,7 +82,7 @@ Based on the experience of data publishers that have been using web APIs to serv * First, interactive APIs should be self-describing. Data consumers should not need to know, a priori, how to make requests from an API. They should be able to discover this information from the API endpoint itself – even if this simply entails a link to a documentation page they need to read. * Second, APIs should comply with OpenAPIfootnote:[See OpenAPI Specification v3.1.0: https://spec.openapis.org/oas/v3.1.0.] version 3 or later. OpenAPI provides a standardized mechanism to describe the API. Tooling (free, commercial, etc.) 
that can read this metadata and automatically generate client applications to query the API is widely available. -* Third, the OGC has developed a suite of APIsfootnote:[See OGC API: https://ogcapi.ogc.org/.] (called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR),footnote:[See OGC API - Environmental Data Retrieval (EDR): https://ogcapi.ogc.org/edr.] OGC API – Features,footnote:[See OGC API - Features: https://ogcapi.ogc.org/features.] and OGC API - Coveragesfootnote:[See OGC API - Coverages: https://ogcapi.ogc.org/coverages.] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. +* Third, the OGC has developed a suite of APIsfootnote:[See OGC API: https://ogcapi.ogc.org/.] (called "OGC APIs") that are specifically designed to provide APIs for geospatial data workflows (discovery, visualization, access, processing/exploitation) – all of which build on OpenAPI. Among these, OGC API – Environmental Data Retrieval (EDR)footnote:[See OGC API - Environmental Data Retrieval (EDR): https://ogcapi.ogc.org/edr.], OGC API – Featuresfootnote:[See OGC API - Features: https://ogcapi.ogc.org/features.], and OGC API - Coveragesfootnote:[See OGC API - Coverages: https://ogcapi.ogc.org/coverages.] are considered particularly useful. Because these are open standards, there is an ever-growing suite of software implementations (both free and proprietary) that support them. 
It is recommended that data publishers assess these open-standard API specifications to determine their suitability for publishing their datasets using APIs. Finally, it is advisable to consider versioning the API to avoid breaking changes when adding new features. A common approach is to add a _version number_ prefix into the API path, for example, ``/v1/service/{rest-of-path}`` or ``/service/v1/{rest-of-path}``. @@ -92,7 +92,7 @@ More guidance on the use of interactive APIs in WIS2 is anticipated in future ve WIS2 is designed to support the data sharing needs of all WMO disciplines and domains. Among these, the World Weather Watch footnote:[See World Weather Watch: https://wmo.int/world-weather-watch.] drives specific needs for the rapid exchange of data to support weather forecasting. -To enable real-time data sharing,footnote:[In the context of WIS2, real time implies anything from a few seconds to a few minutes - not the milliseconds required by some applications.] WIS2 uses notification messages to inform users of the availability of a new resource, either data or discovery metadata, and how they can access that resource. Notification messages are published to a queue on a Message Broker in a data publisher's WIS2 Nodefootnote:[WIS2 ensures the rapid global distribution of notification messages using a network of Global Brokers which subscribe to the Message Brokers of WIS2 Nodes and republish notification messages (see <<_2_4_2_Global_Broker>>).] using the MQTT protocol and immediately delivered to all users subscribing to that queue. A queue is associated with a specific _topic_, such as a dataset. +To enable real-time data sharingfootnote:[In the context of WIS2, real time implies anything from a few seconds to a few minutes - not the milliseconds required by some applications.] WIS2 uses notification messages to inform users of the availability of a new resource, either data or discovery metadata, and how they can access that resource. 
Notification messages are published to a queue on a Message Broker in a data publisher's WIS2 Nodefootnote:[WIS2 ensures the rapid global distribution of notification messages using a network of Global Brokers which subscribe to the Message Brokers of WIS2 Nodes and republish notification messages (see <<_2_4_2_Global_Broker>>).] using the MQTT protocol and immediately delivered to all users subscribing to that queue. A queue is associated with a specific _topic_, such as a dataset. For example, when a new temperature profile from a radiosonde deployment is added to a dataset of upper-air data measurements, a notification message will be published that includes the URL used to access the new temperature profile data. All subscribers to notification messages about the upper-air measurement dataset will receive the notification message and be able to identify the URL and download the new temperature profile data. @@ -118,7 +118,7 @@ Whatever topic is used, the discovery metadata provided to the Global Discovery Core data, as specified in the WMO Unified Data Policy (Resolution 1 (Cg-Ext(2021))) are considered essential for the provision of services for the protection of life and property and for the well-being of all nations. Core data is provided on a free and unrestricted basis, without charge and with no conditions on use. -WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hours.footnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. 
Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]) Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. +WIS2 ensures highly available, rapid access to _most_ core data via a collection of Global Caches (see <<_2_4_3_global_cache>>). Global Caches subscribe to notification messages about the availability of new core data published at WIS2 Nodes, download a copy of that data and republish it on a high-performance data server and then discard it after the retention period expires (normally after 24 hoursfootnote:[A Global Cache provides short-term hosting of data. Consequently, it is not an appropriate mechanism to provide access to archives of core data, such as Essential Climate Variables. Providers of such archive data must be prepared to serve such data directly from their WIS2 Node.]). Global Caches do not provide sophisticated APIs. They publish notification messages advertising the availability of data on their caches and allow users to download data via HTTPS using the URL in the notification message. The URL included in a notification message that is used to access core data from a WIS2 Node, or the "canonical" URL, if multiple URLs are provided, must: @@ -153,7 +153,7 @@ Recommended data are never cached by the Global Caches. The use of core data must always be free and unrestricted. However, it may be necessary to leverage existing systems with built-in access control when implementing the download service for the WIS2 Node. -Example 1: API key. The data server requires a valid API key to be included in download requests. 
The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for the publication of data via WIS2 so that data usage can be tracked.], footnote:[Given that users are encouraged to download core data from the Global Cache, there will likely be limited access using the API key of the WIS2 account. If the usage quota for the WIS2 account is exceeded (for instance, if further data access is blocked), users should download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] +Example 1: API key. The data server requires a valid API key to be included in download requests. The URLs used in notification messages should include a valid API key.footnote:[A specific API key should be used for the publication of data via WIS2 so that data usage can be tracked.]footnote:[Given that users are encouraged to download core data from the Global Cache, there will likely be limited access using the API key of the WIS2 account. If the usage quota for the WIS2 account is exceeded (for instance, if further data access is blocked), users should download via the Global Cache as mandated in the _Manual on WIS_, Volume II.] Example 2: Presigned URLs. The data server uses a cloud-based object store that requires credentials to be provided when downloading data. The URLs used in notification messages should be _presigned_ with the data publisher's credentials and valid for the cache retention period (for example, 24 hours).footnote:[See working with presigned URLs on Amazon S3: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html.] 
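The idea behind Example 2 can be sketched in a few lines of Python. This is a minimal illustration only, and it is not the signing scheme of Amazon S3 or of any other object store: the secret key, the `expires` and `signature` query parameters, and the function names `presign` and `is_valid` are all hypothetical. In practice a data publisher would use the presigning facility of their storage provider. The sketch only shows why a URL signed for the cache retention period remains downloadable for 24 hours and is rejected afterwards.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a secret known only to the data server (hypothetical value).
SECRET_KEY = b"hypothetical-publisher-secret"
# Cache retention period used as the URL lifetime (24 hours, as in the text).
RETENTION_SECONDS = 24 * 60 * 60


def _sign(base_url: str, expires: int) -> str:
    """Return a hex HMAC-SHA256 over the URL and its expiry time."""
    payload = f"{base_url}|{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def presign(url: str, now: int, ttl: int = RETENTION_SECONDS) -> str:
    """Append an expiry time and a signature to a query-free URL."""
    expires = now + ttl
    query = urlencode({"expires": expires, "signature": _sign(url, expires)})
    return f"{url}?{query}"


def is_valid(signed_url: str, now: int) -> bool:
    """Check the signature and that the URL has not yet expired."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False  # unsigned or malformed URL
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = _sign(base_url, expires)
    # Compare as bytes in constant time to avoid leaking signature prefixes.
    matches = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return matches and now <= expires
```

A publisher following this pattern would place the output of `presign(...)` in the notification message, so that a Global Cache or data consumer can download the object without holding the publisher's credentials.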
From 12935b87a82bcf81f9a423f8b6dec22c637c56a9 Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 15:21:35 +0100 Subject: [PATCH 16/20] Minor edits following review editorial only - no content changes --- guide/sections/part1/introduction.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guide/sections/part1/introduction.adoc b/guide/sections/part1/introduction.adoc index 1fb7295..e2a09aa 100644 --- a/guide/sections/part1/introduction.adoc +++ b/guide/sections/part1/introduction.adoc @@ -50,7 +50,7 @@ This consistency means that it is possible to predict the contents of a dataset, A dataset may be published as an immutable resource (such as data collected from a research programme), or it may be routinely updated (for example, every minute, as new observations are collected from weather stations). -A dataset may be represented as a single, structured file or object (for example, a CSV file in which each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands of General Regularly-distributed Information in Binary form (GRIB) files). Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that the dataset is considered a single, identifiable resource, irrespective of how it is represented. +A dataset may be represented as a single, structured file or object (for example, a CSV file in which each row represents a data record) or as thousands of consistent files (for example, output from a reanalysis model encoded as many thousands of General Regularly-distributed Information in Binary form (GRIB) files). Determining the best way to represent a dataset is beyond the scope of this Guide – there are many factors to consider. The key point here is that the dataset is considered a single, identifiable resource irrespective of how it is represented. 
Because data are grouped into a single, conceptual resource (that is, the dataset) it is possible to: * Assign this resource an identifier and use this identifier to unambiguously refer to collections of data; @@ -73,7 +73,7 @@ There are some things that are fixed requirements for datasets: Some examples of datasets include: -* The most recent five days of synoptic observations for an entire country or territory; footnote:[In this example, the system used to publish the data only retains the data for five days. Other systems may retain the data for a longer or shorter period of time.] +* The most recent five days of synoptic observations for an entire country or territory; footnote:[In this example, the system used to publish the data only retains the data for five days. Other systems may retain the data for a longer or shorter period of time.] * A long-term record of observed water quality for a managed set of hydrological stations; * The output from the most recent 24 hours of operational numerical weather prediction model runs; * The output from six months of experimental model runs. 
It is important to note that the output from operational and experimental model runs should not be merged into the same dataset because they use different algorithms - it is very useful to be able to distinguish the provenance (or lineage) of data; From 2e16ef4d8e31a1ca86cb3c7fb3ee43b27454851d Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 15:34:52 +0100 Subject: [PATCH 17/20] minor changes from review editorial only - no content changes --- guide/sections/part2/global-services.adoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/guide/sections/part2/global-services.adoc b/guide/sections/part2/global-services.adoc index 0e1ed58..6241c88 100644 --- a/guide/sections/part2/global-services.adoc +++ b/guide/sections/part2/global-services.adoc @@ -18,7 +18,7 @@ On receipt of an offer from a Member to operate a Global Service, the WMO Secret The _Manual on WIS_, Volume II, the present Guide, and other available materials will help WIS centres decide how to proceed. -When a decision on how to proceed has been made , the WIS NFP will inform the WMO Secretariat of its preference. Depending on the type of Global Service, the WMO Secretariat will provide a checklist to the WIS centre so that the future Global Service can be included in WIS operations. +When a decision on how to proceed has been made, the WIS NFP will inform the WMO Secretariat of its preference. Depending on the type of Global Service, the WMO Secretariat will provide a checklist to the WIS centre so that the future Global Service can be included in WIS operations. A WIS centre must commit to running the Global Service for a minimum of four years. @@ -88,7 +88,7 @@ In the following sections, and for each Global Service, a set of metrics is defi * A Global Broker will check that the topic on which the message is received is valid. If the topic is invalid, the Global Broker will discard non-compliant messages and will raise an alert. 
The metric ``wmo_wis2_gb_invalid_topic_total`` will be increased by 1. * During the pre-operational phase (2024), a Global Broker will not discard the message but instead will send a message on the `monitor` topic hierarchy to inform the originating centre and its GISC. * A Global Broker will validate notification messages against the standard format (see _Manual on WIS_, Volume II – Appendix E. WIS2 Notification Message), discarding non-compliant messages and raising an alert. The metric ``wmo_wis2_gb_invalid_format_total`` will be increased by 1. -* A Global Broker will republish a message only once. It will record the message identifier (``id``) (as defined in the WIS2 Notification Message), of messages already published and will discard subsequent identical messages (those with the same message ``id``). This is the anti-loop feature of the Global Broker. +* A Global Broker will republish a message only once. It will record the message identifier (``id``) (as defined in the WIS2 Notification Message) of messages already published and will discard subsequent identical messages (those with the same message ``id``). This is the anti-loop feature of the Global Broker. * When publishing a message to the local broker, the metric ``wmo_wis2_gb_messages_published_total`` will be increased by 1. * All above-defined metrics will be made available on HTTPS endpoints that the Global Monitor will ingest from regularly. * As a convention, the Global Broker centre-id will be ``tld-{centre-name}-global-broker``. @@ -156,7 +156,7 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This ===== 2.7.5.1 Technical considerations * The Global Discovery Catalogue provides data consumers with a mechanism for discovering and searching for datasets of interest as well as learning how to interact with and find out more information about those datasets. 
-* The Global Discovery Catalogue implements the OGC API – Records – Part 1: Core standard,footnote:[See OGC-API Records - Part 1 https://docs.ogc.org/DRAFTS/20-004.html.] adhering to the following conformance classes and their dependencies: +* The Global Discovery Catalogue implements the OGC API – Records – Part 1: Core standardfootnote:[See OGC-API Records - Part 1 https://docs.ogc.org/DRAFTS/20-004.html.], adhering to the following conformance classes and their dependencies: ** Searchable Catalog (Deployment); ** Searchable Catalog - Sorting (Deployment); ** Searchable Catalog - Filtering (Deployment); @@ -172,7 +172,7 @@ In WIS2, Global Caches provide access to WMO core data for data consumers. This ** The subscription topic shall be ``++cache/a/wis2/+/metadata/#++``. * A Global Discovery Catalogue should connect to and subscribe to more than one Global Broker to ensure that no messages are lost in the event of a Global Broker failure. A Global Discovery Catalogue will discard duplicate messages as needed. * A Global Discovery Catalogue will verify that a discovery metadata record identifier’s centre-id token (see Manual on WIS, Volume II – Appendix F. WMO Core Metadata Profile (Version 2)) matches the centre-id level of the topic from which it was published (see Manual on WIS, Volume II – Appendix D. WIS2 Topic Hierarchy) to ensure that discovery metadata are published by the authoritative organization. -* • A Global Discovery Catalogue will validate discovery metadata records against the WCMP2. Valid WCMP2 records will be ingested into the catalogue. Invalid or malformed records will be discarded and reported to the Global Monitor against the centre-id associated with the discovery metadata record. +* A Global Discovery Catalogue will validate discovery metadata records against the WCMP2. Valid WCMP2 records will be ingested into the catalogue. 
Invalid or malformed records will be discarded and reported to the Global Monitor against the centre-id associated with the discovery metadata record. * A Global Discovery Catalogue will only update discovery metadata records to replace links for dataset subscription and notification (origin), with their equivalent links for subscription at Global Brokers (cache). * A Global Discovery Catalogue will periodically assess discovery metadata provided by NCs and DCPCs against a set of key performance indicators (KPIs) in support of continuous improvement. Suggestions for improvement will be reported to the Global Monitor against the centre identifier associated with the discovery metadata record. * A Global Discovery Catalogue will remove discovery metadata that are marked for deletion as specified in the data notification message. @@ -200,7 +200,7 @@ wis2-gdc provides the functionality required for the Global Discovery Catalogue, * Metrics reporting; * Implementation of metrics. -wis2-gdc is managed as a free and open source project. Source code, issue tracking and discussions are hosted openly on GitHub: https://github.com/wmo-im/wis2-gdc. +wis2-gdc is managed as a free and open source project. Source code, issue tracking and discussions are hosted openly on GitHub: https://github.com/wmo-im/wis2-gdc. ==== 2.7.6 Global Monitor From 8aae483b4e89f74a6f626df6fefc33238b653453 Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 15:49:30 +0100 Subject: [PATCH 18/20] Minor edits following review editorial only - no content changes --- guide/sections/part2/operations.adoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/guide/sections/part2/operations.adoc b/guide/sections/part2/operations.adoc index 0ba81d4..bbadcfb 100644 --- a/guide/sections/part2/operations.adoc +++ b/guide/sections/part2/operations.adoc @@ -25,7 +25,7 @@ via WIS2 will automatically be published to GTS via the WIS2-to-GTS gateways. 
|=== |*WIS2* |*SWIM* |Earth system scope: Weather, climate, hydrology, atmospheric -composition, cryosphere, ocean and space weather data |ATM scope: Aeronautical, meteorological and flight information +composition, cryosphere, ocean and space weather data. |ATM scope: Aeronautical, meteorological and flight information. |Data centric: A consumer discovers data and then determines the services through which those data may be accessed. |Service centric: A @@ -117,7 +117,7 @@ Recommended data: regarding the security schemes for authenticated access - either HTTP authentication, API keys, OAuth2 or OpenID Connect Discovery. For more information, see -OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object.], footnote:[WIS2 does not provide any guidance on use of Public Key Infrastructure (PKI).] applied at the WIS2 Node; +OpenAPI Security Scheme Object: https://spec.openapis.org/oas/v3.1.0#security-scheme-object.]footnote:[WIS2 does not provide any guidance on use of Public Key Infrastructure (PKI).] applied at the WIS2 Node; * Are not cached within WIS2 by the Global Caches.footnote:[Global Caches enable the highly available, low-latency distribution of core data. Given that core data is provided on a free and unrestricted basis, @@ -221,7 +221,7 @@ processed; Nodefootnote:[The WIS2 Node may control access to data. 
If this is the case, the gateway component will need to be implemented accordingly] using the URL in the message - the resource should be in IWXXM format; ** Create a new data message as per the SWIM specifications, including -the unique identifier extracted from the data resource,footnote:[In case +the unique identifier extracted from the data resourcefootnote:[In case a unique identifier is required for proper passing of an aviation weather message to the gateway component, the GTS abbreviated heading (TTAAii CCCC YYGGgg) in the COLLECT envelop can be used (available in IWXXM messages @@ -229,7 +229,7 @@ that have a corresponding TAC message). Alternatively, content in the attribute ``gml:identifier`` (available in newer IWXXM messages such as WAFS SIGWX Forecast and QVACI), may also serve this purpose. There is currently no agreed definition for a unique identifier for IWXXM METAR and TAF reports for -individual aerodromes.] and embed the aviation weather data resource +individual aerodromes.], and embed the aviation weather data resource within the data message; ** Publish the data message to the appropriate topic on the SWIM Message Broker component of the SWIM service. @@ -309,4 +309,4 @@ federated catalogue. ODIS data will be published as recommended data as per the .WIS2 and ODIS metadata and catalogue interoperability image::images/wis2-odis-metadata-discovery-interop.png[Figure 4. WIS2 and ODIS metadata and catalogue interoperability] -As a result, federated discovery will be realized between both systems, users will be able to access the data from as close as possible to their source, and the data will be able to be used and reused in an authoritative manner. \ No newline at end of file +As a result, federated discovery will be realized between both systems, users will be able to access the data from as close as possible to their source, and the data will be able to be used and reused in an authoritative manner. 
From 521e93f68eeec9e83a773e1a9de6000033e44fb1 Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 16:00:38 +0100 Subject: [PATCH 19/20] Minor edits following review editorial only - no content changes --- guide/sections/part2/wis2node.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guide/sections/part2/wis2node.adoc b/guide/sections/part2/wis2node.adoc index b737b9b..9ed7d45 100644 --- a/guide/sections/part2/wis2node.adoc +++ b/guide/sections/part2/wis2node.adoc @@ -66,11 +66,11 @@ It is not advisable to use a system name in the centre-id because system names m When configuring a WIS2 Node, it is necessary to consider how it will be accessed by Global Services and data consumers. -Global Brokers must authenticate when they connect to the MQTT Message Broker in the WIS2 Node. Username and password credentials are used.footnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone`` WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).]. When registering the WIS2 Node with the WMO Secretariat, these credentials must be provided. The WMO Secretariat will share the credentials with the Global Service operators and store them in the WIS register. These credentials should not be considered confidential or secret. +Global Brokers must authenticate when they connect to the MQTT Message Broker in the WIS2 Node. Username and password credentials are used.footnote:[The default connection credentials for a WIS2 Node Message Broker are username ``everyone`` and password ``everyone`` WIS2 Node operators should choose credentials that meet their local policies (for example, password complexity).] When registering the WIS2 Node with the WMO Secretariat, these credentials must be provided. The WMO Secretariat will share the credentials with the Global Service operators and store them in the WIS register. 
These credentials should not be considered confidential or secret. Given that Global Brokers republish notification messages provided by the WIS2 Node, access to the MQTT Message Broker may be restricted. Global Brokers operate using a fixed IP address, which allows access to be granted using IP filtering.footnote:[In WIS2, IP addresses are used to determine the origin of connections and confer trust to remote systems. It is well documented that IP addresses can be hijacked and that more sophisticated mechanisms, such as Public Key Infrastructure (PKI), are available for reliably determining the origin of connection requests. However, the complexities of implementing such mechanisms create barriers to Member participation in WIS2. For the purposes of WIS2, which involves distributing publicly accessible data and messages, IP addresses are considered to provide an adequate level of trust.] MQTT Message Brokers must be accessible by more than one Global Broker to ensure resilient transmission of notification messages to WIS2. -If your WIS2 Node only publishes core data,footnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.] access to the data server may also be restricted, with the distribution of data handled by Global Caches. Global Caches also operate on fixed IP addresses, allowing their connections to be easily identified. Again, access must be granted to more than one Global Broker to ensure resilience. +If your WIS2 Node only publishes core datafootnote:[In some cases, WIS2 Nodes will need to serve core data directly (see <<_1_3_3_5_considerations_when_providing_core_data_in_wis2>>). In these situations, the WIS2 Node data server must remain publicly accessible.], access to the data server may also be restricted, with the distribution of data handled by Global Caches. 
Global Caches also operate on fixed IP addresses, allowing their connections to be easily identified. Again, access must be granted to more than one Global Broker to ensure resilience. During registration, the WMO Secretariat will provide host names and IP addresses of the Global Services to enable access controls to be configured. From 7ec159e47772dbb764a31f69d6441801a03b8a8e Mon Sep 17 00:00:00 2001 From: Jeremy Tandy Date: Thu, 24 Oct 2024 16:04:31 +0100 Subject: [PATCH 20/20] Minor edits following review grammar only - no content changes --- guide/sections/part3/information-management.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/guide/sections/part3/information-management.adoc b/guide/sections/part3/information-management.adoc index 9460820..35f2e77 100644 --- a/guide/sections/part3/information-management.adoc +++ b/guide/sections/part3/information-management.adoc @@ -54,7 +54,7 @@ The effective management of information is essential for WMO centres to deliver ===== 3.1.2.4 Principle 4: Information must be standardized and interoperable -* Information must be stored and exchanged in standardized formats to ensure wide usability in the short and long term. It is essential for long-term archiving that information be stored in a form that can be understood and used after several decades. +* Information must be stored and exchanged in standardized formats to ensure wide usability in the short and long-term. It is essential for long-term archiving that information be stored in a form that can be understood and used after several decades. * Standardization is essential for structured information such as dataset definitions and metadata to support interoperability. * Interoperability is essential for users to be able to utilize information through different systems and software. Open standards help ensure interoperability with their openness and wide adoption across various communities. 
* Which standards to use depends on the user community and organizational policies. Interoperability requirements should be considered when selecting the standard for internal use and broader dissemination. @@ -140,7 +140,7 @@ To ensure traceability and reproducibility, the information and documents at thi ====== 3.1.3.3.2 Representation and metadata -The formats used to store and exchange information should be standardized to ensure its usability in both the short and the long term. It is essential that the information be accessible many years after archival if required. To ensure this usability, the format and version of the information should be recorded in the information metadata record and included within the information itself where the format allows. +The formats used to store and exchange information should be standardized to ensure its usability in both the short and the long-term. It is essential that the information be accessible many years after archival if required. To ensure this usability, the format and version of the information should be recorded in the information metadata record and included within the information itself where the format allows. Information exchanged on WIS and between WMO centres is standardized through the use of the formats specified in the _Manual on Codes_ (WMO-No. 306), Volume I.2 and the _Manual on WIS_, Volume II. These include the GRIB and BUFR formats for numerical weather prediction products and observational data and the WMO Core Metadata Profile for discovery, access and retrieval metadata. The format for the exchange of station and instrumental metadata, WMO Integrated Global Observing System (WIGOS) Metadata Data Representation, is defined in the https://library.wmo.int/idurl/4/35769[_Manual on Codes_] (WMO-No. 306), Volume I.3.