openEO platform: a truly federated and open environment for the EO community

Cloud-based digital platforms are becoming central for users who want to exploit the enormous amounts of Earth Observation (EO) data freely available today. A new initiative is now entering the EO platform landscape: the ESA-funded openEO platform, whose initial service offering was released to the public during the 2021 Phi Week.

Earth Observation archives today provide access to huge amounts of satellite-based observations of the planet. Following open data policies spearheaded by the Landsat program, the ESA/EC Copernicus Sentinel program is now the largest provider of EO data. The Sentinel satellites alone acquire more than 25 terabytes of new observations every day.

 

EO platforms to date: advantages and limitations

For EO scientists and users, handling and effectively utilising these petabyte-scale EO data archives is becoming increasingly challenging. Traditional approaches that download large data volumes and process them in local or in-house computing environments are becoming less effective. Following this approach, scientists and users find themselves spending most of their time and effort on data handling and management, rather than investigating the scientific questions that drive their research.

 

openEO platform currently provides access to over 50 different data collections

 

Many of the existing EO platforms do not resolve this “data management burden” at a satisfying level. For example, many platforms require the user to specify the configuration of a Virtual Machine before starting the work, and then require the user to search for files, pass file paths into processors and access the data at the file level. Another restrictive characteristic that the EO user community encounters in existing platforms is the required literacy with engineering standards (e.g. OGC web-interface standards such as WMS or WPS), which are rather complex to implement and commonly perceived as non-intuitive, especially by EO and data scientists. Other typical limitations of existing EO platforms are the lack of open source code and the related uncertainty about the Intellectual Property Rights (IPR) of user-created content. Commonly encountered vendor lock-in ties users to a particular cloud provider, limiting interoperability and further reducing the reproducibility of EO methods and science.

 

openEO platform mission: abstracting complexities, enabling transparency and open source

openEO platform is being developed to address these challenges and the prevailing capability gaps that users experience when working with EO platforms. The consortium consists of EODC, EGI, EURAC, GEO, Sinergise, VITO and WWU Muenster. openEO platform builds on European excellence by developing an operational service based on the openEO Application Programming Interface (API). The openEO API was developed under H2020 funding, in a project that addressed identified shortcomings related to a lack of interoperability among existing EO and geospatial platforms. The API was designed to act as a translating interface connecting users and their clients (programming libraries, graphical or command-line-based clients) with different EO cloud environments. Now under ESA funding, openEO platform develops the API and its capabilities into an operational service.

A key aspect guiding the development of openEO platform is the abstraction of underlying complexity, which is addressed during various stages of a typical EO workflow.

For example, like other popular EO platforms, openEO platform adopts the concept of virtual data cubes as an alternative to file-centric approaches. By definition, a data cube presents users with a perfectly aligned cube of EO data (typically with two spatial, one temporal and various spectral/thematic dimensions) that can be used directly in, for example, time-series analyses. In openEO platform, data cubes are, in the first instance, realised by specifying (1) an EO data collection (Sentinel-1 GRD in the example below), (2) a spatial extent, (3) a temporal period and (4) the bands of interest.
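As a minimal sketch of these four ingredients, the snippet below writes them out as a single `load_collection` node of an openEO process graph, the JSON structure that a client sends to a backend. With the Python client, the corresponding call would be along the lines of `connection.load_collection(...)` with the same four arguments. The collection id `SENTINEL1_GRD`, the bounding box, the dates, the band names and the helper function `load_collection_node` are illustrative assumptions, not values taken from the platform.

```python
import json


def load_collection_node(collection_id, spatial_extent, temporal_extent, bands):
    """Build one process-graph node that defines a virtual data cube.

    Hypothetical helper for illustration; it only assembles a dictionary
    and does not contact any backend.
    """
    return {
        "process_id": "load_collection",
        "arguments": {
            "id": collection_id,                 # (1) EO data collection
            "spatial_extent": spatial_extent,    # (2) spatial extent
            "temporal_extent": temporal_extent,  # (3) temporal period
            "bands": bands,                      # (4) bands of interest
        },
        "result": True,  # this node is the final output of the graph
    }


# All values below are assumed example values.
process_graph = {
    "load1": load_collection_node(
        collection_id="SENTINEL1_GRD",
        spatial_extent={"west": 16.1, "south": 48.1, "east": 16.6, "north": 48.4},
        temporal_extent=["2021-01-01", "2021-04-01"],
        bands=["VV", "VH"],
    )
}

# Serialise to the JSON form that would be submitted to a backend.
print(json.dumps(process_graph, indent=2))
```

No data is read or copied at this point: the node only describes a view on the collection, which is why the cube is called virtual.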

 

 

The example above shows how such a data cube is defined using the Python client library for openEO platform. It is referred to as a virtual data cube because no replication of data takes place; rather, a view on the underlying data collection is defined. From this point on, users work on this virtual data cube without having to worry about files, file paths or the map projections of individual datasets. This is typically done by operator chaining, as shown in the example below. Here the previously defined data cube is aggregated (or reduced) to the minimum value observed by Sentinel-1 over the time period of the data cube. The result is a single-band raster that captures these minimum values for each pixel in the data cube.
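A hedged sketch of such a chain, again written as a process graph rather than as client code: a `reduce_dimension` node takes the loaded cube as input (via `from_node`) and applies the `min` process along the temporal dimension. In the Python client this would correspond to something like `cube.reduce_dimension(dimension="t", reducer="min")`. The dimension name `t`, the collection id and all extents are assumptions for illustration, and `upstream_nodes` is a hypothetical helper that only inspects the dictionary.

```python
import json

# Assumed example values throughout; nothing here contacts a backend.
process_graph = {
    "load1": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL1_GRD",
            "spatial_extent": {"west": 16.1, "south": 48.1, "east": 16.6, "north": 48.4},
            "temporal_extent": ["2021-01-01", "2021-04-01"],
            "bands": ["VV"],
        },
    },
    "reduce1": {
        "process_id": "reduce_dimension",
        "arguments": {
            "data": {"from_node": "load1"},  # chaining: input is the loaded cube
            "dimension": "t",                # assumed name of the temporal dimension
            "reducer": {
                # Child graph applied to each pixel's time series.
                "process_graph": {
                    "min1": {
                        "process_id": "min",
                        "arguments": {"data": {"from_parameter": "data"}},
                        "result": True,
                    }
                }
            },
        },
        "result": True,  # the reduced raster is the final output
    },
}


def upstream_nodes(graph, node_id):
    """Return the ids of the nodes that `node_id` takes directly as input."""
    arguments = graph[node_id]["arguments"]
    return [
        value["from_node"]
        for value in arguments.values()
        if isinstance(value, dict) and "from_node" in value
    ]


print(upstream_nodes(process_graph, "reduce1"))
print(json.dumps(process_graph["reduce1"], indent=2))
```

Because the temporal dimension is removed by the reduction, what remains for a single selected band is one value per pixel.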

 

 

Transparency and open source are additional guiding principles for the development of openEO platform. This includes the open source code for the openEO API and the various client libraries for the Python, R and JavaScript programming languages. The different openEO platform cloud “backends” build on open source technology stacks (including e.g. xarray, dask, ODC, GeoPySpark/GeoTrellis). The platform implementation follows an agile and open source development approach. User authentication within the platform builds on OpenID Connect, minimising the need to collect or track any personal data of the platform users.

 

Federated cloud architecture and multiple development environments

The initial platform architecture includes deployments in three different European public cloud environments: TerraScope (operated by VITO), EODC and CreoDIAS. In addition, a data federation allows serving collections and optimised layers from the Euro Data Cube. The federation is extensible, and additional cloud backends will be added in the near future, growing the federated architecture, extending the available data collections and enlarging the user community.

 

Jupyter notebooks are the primary development environment in openEO platform, while desktop-based development environments (e.g. PyCharm) can also be used. The openEO platform Web-Editor additionally provides a graphical workflow builder, where graphical workflows (called process graphs) can automatically be translated into different programming languages. The Web-Editor also provides batch job monitoring functionality to the user.

 

 

The openEO platform Web-Editor provides collection viewing and metadata listing, the definition of graphical workflows and the monitoring of user batch jobs. The animation demonstrates how an existing graphical workflow (a process graph, defined in JSON) can be automatically translated into the three client library languages: Python, R and JavaScript. Initial versions of various linkages to GIS environments such as QGIS have been developed and will be further expanded in the near future.

 

 

The primary development solution for openEO platform is the JupyterLab environment. Users can now take advantage of the “signed URL” feature for interactive viewing of intermediate results in GIS environments, as shown in this animation.

 

Access to openEO platform and opportunities for Early Adopters

openEO platform is currently inviting the community to explore the functionality, provide feedback on the platform’s utility and report any issues encountered. To support this, licences for Early Adopters are sponsored through the ESA Network of Resources (NOR), initially for a period of three months. This will be extended as long as Early Adopters provide some initial feedback on their investigations. All Early Adopters are invited to contribute to the first openEO platform user consultation that will happen at the ESA Living Planet Symposium. The range of commercial licence packages will evolve over the next year, also adding flexible pay-per-use licences matching the requirements of the community. Sponsoring for scientific projects can be requested from the ESA NOR.

In addition, and just in time for Christmas and the new year, openEO platform can now be explored by everyone using the brand-new free-tier 30-day trial licence.

The platform portal and all documentation can be accessed through https://openeo.cloud/.
