Use geospatial data in Azure with Planetary Computer Pro

Among his many achievements, pioneering computer scientist and Microsoft Technical Fellow Jim Gray came up with what he called the “fourth paradigm” of science: using large amounts of data and machine learning to discover new things about the world around us. That idea led to Microsoft’s work with scientists across the planet through its AI for Earth and similar projects.

Part of that work led to the development of a common source of geospatial data for use in research projects. Dubbed the Planetary Computer, the project built on a set of open standards and open source tools to deliver a wide variety of geographic and environmental data sets that can be built into scientific computing applications. There are more than 50 petabytes of geospatial data across 120 data sets.

Adding geospatial data to scientific computing

Open the Planetary Computer data catalog and you will find all kinds of useful data: from decades’ worth of satellite imagery to biomass maps, from the US Census to fire data. Altogether, there are 17 different classes of data available, often with several different sources, all ready for research applications, on their own or to provide valuable context to your own data. A related GitHub repository provides the necessary code to implement much of it yourself.

Along with the data, the research platform includes a tool to quickly render data sets onto a map, giving you a quick way to start exploring data. It also provides the necessary Python code to include the data in your own applications. For example, you could mix demographic information with terrain data to show how population is affected by physical geography.

Data like this can help quickly prove or disprove hypotheses. This makes it a valuable tool for science, but what if you want to use it alongside the information stored in your own business systems?

From the lab to the enterprise

At Build 2025, Microsoft announced a preview of a new enterprise-focused version of this service, called Planetary Computer Pro. It’s wrapped as an Azure service and suitable for use across your applications or as part of machine learning services. The service can be managed using familiar Azure tools, including the portal and CLI, as well as with its own SDK and APIs.

Like the academic service, Planetary Computer Pro builds on the open SpatioTemporal Asset Catalog (STAC) specification, which allows you to use a standard set of queries across different data sets to simplify query development. It provides tools for visualizing data and supports access controls using Entra ID and Azure RBAC. It’s for your own data; it doesn’t provide third-party data sets for now, though direct links to the Planetary Computer service are on the road map, and the examples in Microsoft Learn show how to migrate data between the services.
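That portability is easiest to see in a STAC item search. The sketch below builds the JSON body the STAC API specification defines for a POST to a /search endpoint; the collection id, bounding box, and cloud-cover filter are illustrative values, not Planetary Computer Pro specifics.

```python
import json

# A STAC API item-search body. The same shape works against any
# STAC-compliant endpoint, which is what makes queries portable
# across different data sets and services.
search_body = {
    "collections": ["sentinel-2-l2a"],        # example collection id
    "bbox": [-122.6, 47.4, -122.2, 47.8],     # lon/lat bounding box
    "datetime": "2024-01-01T00:00:00Z/2024-06-30T23:59:59Z",
    "query": {"eo:cloud_cover": {"lt": 20}},  # filter on item properties
    "limit": 10,
}

print(json.dumps(search_body, indent=2))
```

Because the query shape is standardized, the same body can be reused when you move between catalogs, changing only the collection ids and filters.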

Building a geospatial computing environment on top of Azure makes a lot of sense. Underneath, Azure services like Cosmos DB and Fabric provide the necessary at-scale data platform, while support for Python and Azure AI Foundry lets you build analytical applications and machine learning tools that mix geospatial and other data, especially time-series data from Internet of Things hardware and other sensors.

Planetary Computer Pro provides the necessary storage, management, and visualization tools to support your own geospatial data in 2D and 3D, with support for the necessary query tools. A built-in Explorer can help show your data on a map, letting you layer different data sets and find insights that may have been missed by conventional queries.

Microsoft envisions three types of user for the service. First are solution developers who want a tool for building and running geospatial data-processing pipelines and hosting applications using that data. The second group is data managers who need catalog tools to control and share data and provide access to developers across the enterprise. Finally, data scientists can use its visualization tools to explore geospatial data for insights.

Building your own geospatial catalog

Getting started with Planetary Computer Pro requires deploying a GeoCatalog resource into your Azure tenant. This is where you’ll store and manage geospatial data, using STAC to access data. You can use the Azure Portal or a REST API to create the catalog. While the service is in preview, you’re limited to a small number of regions and can only create publicly accessible catalogs. Deployment won’t take long, and once it’s in place you can add collections of data.
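Like any Azure resource, a GeoCatalog created over REST is a PUT to an Azure Resource Manager URI. The sketch below only assembles the request URL and body; the provider path, API version, and names are placeholders to show the shape of an ARM call, not the service’s documented values, so check the Planetary Computer Pro documentation before using them.

```python
import json

SUBSCRIPTION = "<subscription-id>"   # placeholders: fill in your own
RESOURCE_GROUP = "geo-rg"
CATALOG_NAME = "my-geocatalog"

# ARM resources are created with a PUT to a resource URI. The provider
# path and api-version below are illustrative, not documented values.
url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
    f"/resourceGroups/{RESOURCE_GROUP}"
    f"/providers/Microsoft.Orbital/geoCatalogs/{CATALOG_NAME}"
    "?api-version=2025-02-11-preview"
)

body = {
    "location": "westeurope",  # preview is limited to a few regions
    "identity": {"type": "SystemAssigned"},
    "properties": {},
}

print(url)
print(json.dumps(body, indent=2))
```

In practice you would send this with an authenticated client (or simply use the Azure Portal), but the URL-plus-body structure is the same either way.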

With a GeoCatalog in place you can now start to use the Planetary Computer Pro web UI. This is where you create and manage collections, defining data sources and how they’re loaded. It includes an Explorer view based on the one in Planetary Computer, which takes STAC data and displays it on a map.

You need to understand STAC to use Planetary Computer Pro, as much of the data used to define new collections needs to be written in STAC JSON. This can be uploaded from a development PC or authored inside the web UI. You can use Planetary Computer Pro’s own templates to simplify the process.
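A minimal STAC Collection definition looks like the sketch below: the JSON you would author, or adapt from a template, before uploading it to a GeoCatalog. The id, description, extents, and license are illustrative values, not a real collection.

```python
import json

# A minimal STAC Collection: the catalog-level record that groups
# related items and declares their combined spatial and temporal extent.
collection = {
    "type": "Collection",
    "stac_version": "1.0.0",
    "id": "vineyard-imagery",
    "description": "Drone imagery of vineyard plots (example).",
    "license": "proprietary",
    "extent": {
        "spatial": {"bbox": [[-123.0, 44.0, -122.0, 45.0]]},
        "temporal": {"interval": [["2023-01-01T00:00:00Z", None]]},
    },
    "links": [],
}

print(json.dumps(collection, indent=2))
```

The open-ended temporal interval (a null end date) is the standard way to mark a collection that is still being added to.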

Data can now be brought into your collection. It needs to be stored in Azure Blob Storage containers; once it’s in place, all you need to do is provide the storage URI to Planetary Computer Pro. Azure’s identity management tools let you ensure that only authenticated accounts can import data.

Importing data into a catalog

Imports use Python to copy data into a collection, using the STAC libraries from pip. This allows you to add metadata to images, along with data about the source, the time the data was collected, and the associated coordinates, such as a bounding box around an image or precise coordinates for a specific sensor. There’s a lot of specialized information associated with a STAC record, and this can be added as necessary, allowing you to provide critical information about the data you’re cataloging. It’s important to understand that you’re building a catalog, akin to one in a library, not a database.
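A single catalog record is a STAC Item. The sketch below shows one as plain JSON, with the bounding box, capture time, and asset link described above; in practice the pip STAC libraries such as pystac build these for you, and the ids, coordinates, and storage URL here are illustrative.

```python
import json
from datetime import datetime, timezone

# A STAC Item: one catalog record describing a single image, pairing
# its spatial footprint and capture time with a link to the asset.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "plot-7-2024-06-01",
    "bbox": [-122.51, 44.90, -122.49, 44.92],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-122.51, 44.90], [-122.49, 44.90],
            [-122.49, 44.92], [-122.51, 44.92],
            [-122.51, 44.90],
        ]],
    },
    "properties": {
        "datetime": datetime(2024, 6, 1, 10, 30,
                             tzinfo=timezone.utc).isoformat(),
    },
    "assets": {
        "image": {
            "href": "https://<storage-account>.blob.core.windows.net"
                    "/imagery/plot-7.tif",
            "type": "image/tiff; application=geotiff",
        }
    },
    "links": [],
}

print(json.dumps(item, indent=2))
```

Note that the item holds only metadata and a pointer to the asset in Blob Storage, which is what makes it a library-style catalog entry rather than a database row.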

The STAC data you create can be used to find the data you need and then align it appropriately on a map, ready for use. That data can then be used alongside traditional queries for, say, time-series data to show how weather and geographic relief affect a vineyard’s yield.

If you have a lot of data to add to a catalog, you can use the service’s bulk ingestion API to add it to a collection. All the data needs to be in the same Blob Container, with code to provide the STAC data for each catalog entry based on your source’s metadata. The built-in Explorer helps you validate data on maps.
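For bulk ingestion, that per-entry STAC data is usually generated by code. The sketch below derives item metadata from blob names; the container URL, naming convention, and make_item helper are all hypothetical, and a real pipeline would more likely read metadata from the files themselves (for example, GeoTIFF tags).

```python
# Sketch: generate one STAC Item per blob in a container, deriving
# metadata from a hypothetical "<plot>_<date>.tif" naming convention.
CONTAINER = "https://<storage-account>.blob.core.windows.net/imagery"

def make_item(blob_name: str) -> dict:
    """Build a minimal STAC Item dict from a blob name."""
    stem = blob_name.rsplit(".", 1)[0]
    plot, date = stem.split("_")
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": stem,
        "properties": {"datetime": f"{date}T00:00:00Z", "plot": plot},
        "assets": {"image": {"href": f"{CONTAINER}/{blob_name}"}},
    }

# Stand-in for a real container listing.
blobs = ["plot-7_2024-06-01.tif", "plot-8_2024-06-01.tif"]
items = [make_item(b) for b in blobs]
print(len(items), "items ready for bulk ingestion")
```

Once generated, the items would be handed to the bulk ingestion API alongside the shared container URI.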

With data stored in a catalog, you can now use the Planetary Computer Pro APIs to start using it in your own applications, with tools for searching STAC data as well as tiling images onto maps ready for display. You can even link it to geographic information systems like Esri’s ArcGIS, Microsoft’s own Fabric data lakes, or Azure AI Foundry as part of machine learning applications.

Building a geographic Internet of Things

How can you use Planetary Computer Pro? One use case is providing added context for machine learning systems or for teams using Fabric’s new digital twin feature. Large-scale energy production systems, for example, are part of much larger environmental systems, and Planetary Computer Pro can supply additional inputs as well as visualization tools. Having geospatial data alongside the sensor data opens up new prediction options.

Planetary Computer Pro also fits in well with Microsoft’s work in precision agriculture, adding tools for better weather prediction and a deeper understanding of how a local ecosystem has developed over time, using historical satellite imagery to track biodiversity and growth patterns. You can even use your own drones and aerial photography to add geospatial data to applications, comparing current growth and yield with historical regional data.

Too often we think of the systems we’re building and modeling as isolated from the planet they’re part of. We are learning otherwise, and geospatial tools can help us make our systems part of the wider world, reducing harmful impacts and improving operations at the same time. Having geospatial data and the tools to work with it lets Microsoft deliver on the vision of a truly Planetary Computer, one that can add intelligence and analytics to the interfaces between our systems and the broader environment.
