Introduction

Within the last decade, a huge amount of geospatial data, such as satellite data (e.g. land surface temperatures, vegetation) or the output of large scale, even global models (e.g. wind speed, groundwater recharge), have become freely available from multiple national agencies and universities (e.g. NASA, USGS, NOAA, and ESA). These geospactial data are used every day by scientists and engineers of all fields, to predict weather, prevent disasters, secure water supply or study the consequences of climate change. When using these geospatial data, things fast becomes tricky and the following questions arise:
– What data is available and where can we find it ?
– How can we access these data?
– How can we manipulate these petabytes of data?

Well, since a couple of years you can answer all three questions with “Google Earth Engine” – A platform that allows online cloud-computation on a large varierty of global datasets. After singin up (here) with your google-account (might take a couple of hours to unlock), you can either type https://code.earthengine.google.com/ in your browser to play with their seemingly endless amount of geospatial datasets typing Javascripts. Or…, you can continue to read this post to learn how to manipulate these remote sensing data through a Python API.

Google Earth Engine is a cloud-based platform for planetary-scale geospatial analysis that brings Google’s massive computational capabilities to bear on a variety of high-impact societal issuesal capacity needed to utilize traditional supercomputers or large-scale commodity cloud computing resources. (Golerick et al. 2017)

In this article, an introduction to the GEE Python API is presented. After some setups and some exploration of the Earth Engine Data Catalog, we’ll see how to handle geospatial datasets with pandas and make some plots with matplotlib. Particularly, we’ll see how to get the timeseries of a variable on a region of interest. An application of this procedure will be done to extract Land Surface Temperature in an urban and a rural area near the city of Lyon (France) to illustrate the heat island effect.
Then, we will focus on static mapping and a procedure to export the result in a geotiff file will be detailed.
Finally, the folium library will be introduced to make interactive maps. In this last part, we’ll see how to include some GEE datasets as tiles layers of a folium map.

 

code-earth-engine
Screenshot of the Javascript google earth engine platform

Exploration of the Earth Engine Data Catalog

Have you ever thought that getting a meteorological dataset could be as easy as finding the nearest pizzeria? To convince you, go here and play with the search bar and select the dataset you want to explore.

Let’s say that we need to know the elevation of a region, some soil properties (e.g. clay/sand/silt content) and some meteorological data (e.g. temperature, precipitation, evapotranspiration). Well, inside the Earth Engine Catalog we find:

  • global elevation with a resolution of 30m is available here,
  • OpenLandMap datasets with some soil porperties with a resolution of 250m (e.g. clay, sand, and silt content)
  • temperature, precipitation and evapotranspiration datasets with different resolution. For example you can explore the GRIDMET model data from the University of Idaho or for streight up satelite derived values look into the MODIS Collections, etc.

Of course the resolution, frequency, spatial and temporal extent as well as data source (e.g. satelite image, interpolated station data, or model output) vary from one dataset to another. Therefore, read the description carefully and make sure you know what kind of dataset you are selecting!

To give you a global picture of what can be found in the Earth Engine Data Catalog, you can have a look at the word cloud below summarizing tags associated to available datasets.

earth-engine-data-catalog-word-cloud
Word cloud based on Earth Engine Data Catalog tags

Installation of the earth-engine-api and start

Some setups

First of all, the client earthengine api should be installed. It can be done properly using the documentation here, or directly using the following command from your notebook.

!pip install earthengine-api --upgrade

Then, we can import the earthengine library and authenticate with a google account. After running the following cell, a new windows should open to give you a token associated to your google account (If no windows opens, just click the resulting link). Just copy and past the token into the empty cell poping up after the run.

import ee
ee.Authenticate()

Now we can initialize the API:

ee.Initialize()

Congratulations ! Now we can play with global datasets =)

Start to play with collections

In the Earth Engine Catalog, datasets can have different shapes. Basically, we work with:

  • features which are geometric objects with a list of properties. For example, a watershed with some properties such as name, area, is a feature.
  • image which are like features, but may include several bands. For example, The ground elevation given by the USGS here is an image.
  • collections which are group of features or images. For example, the Global Administrative Unit Layers giving administrative boundaries is a featureCollection and the MODIS Land Surface Temperature dataset is an imageCollection.

If you want to know more about different data models, you may want to visit the earth engine user guide

In the following, we work with the MODIS Land Cover (LC), the MODIS Land Surface Temperature (LST) and with the USGS Ground Elevation (ELV) which are imageCollections. The descriptions provide us with all the information we need to import and manipulate these datasets: the availability, the provider, the Earth Engine Snippet, and the available bands associated to the collection.

Now, to import the LC, LST and ELV collections, we can copy and paste the Earth Engine Snippets:

# import a land cover Collection
LC = ee.ImageCollection('MODIS/006/MCD12Q1')

# import a land surface temperature Collection
LST = ee.ImageCollection('MODIS/006/MOD11A1')

# import a ground elevation Image
ELV = ee.Image('USGS/SRTMGL1_003')

All of these images come in a different resolution, frequency, and possibly projection, ranging from daily images in a 1 km resolution for LST (hence an ee.ImageCollection – a collection of several ee.Images) to a sinlge image giving data for the year 2000 in a 30 m resolution for the ELV. While we need to have an eye on the frequency, GEE takes care of resolution and projection by resampling and reprojecting all data we are going to work with to a common projection. We can define the resolution (called scale in GEE) whenever necessary and of course have the option to force no reprojection.

As you can see in the description of the datasets, they include several sets of information stored in several bands. For example, these bands are associated to the LST collection:

  • LST_Day_1km: Daytime Land Surface Temperature
  • Day_view_time: Local time of day observation,
  • LST_Night_1km: Nighttime Land Surface Temperature,

The description page of the collection tells us that the name of the band associted to the daytime LST is ‘LST_Day_1km’ which is in Kelvin. In addition, values are ranging from 7500 to 65535 with a corrective scale of 0.02.

Then, we have to select the bands we want to work with. Therefore, I decide to focus on daytime LST so we select the daytime band with the .select() command. We also need to filter the collection on the period of time we want. We can do that using the filterDate() method.

# initial date of interest - inclusive
i_date = '2013-01-01'

# final date of interest - exclusive
f_date = '2017-01-01' 

# selection of appropriate bands and dates for LST
LST = LST.select('LST_Day_1km').filterDate(i_date, f_date)

Now, we can either upload existing shape files or define some points with longitude and latitude coordinates where we want know more about Land cover, Land Surface Temperature and Elevation. For example, let’s take two:

  • The first one in the urban area of Lyon (France)
  • The second one, 30 kilometers away the city center, in a rural area
# Definition of the urban location of interest with a point
u_lon = 4.8148 #longitude of the location of interest
u_lat = 45.7758 #latitude of the location of interest
u_poi = ee.Geometry.Point(u_lon, u_lat)

# Definition of the rural location of interest with a point
r_lon = 5.175964 #longitude of the location of interest
r_lat = 45.574064 #latitude of the location of interest
r_poi = ee.Geometry.Point(r_lon, r_lat)

We can easily get information about our region/point of interest using following methods (to get more information about available methods and required arguments, please visit the API documentation here):

  • sample(): samples the image (does NOT work for an image collection – we’ll talk about sampeling an image collection later) according to a given geometry and a scale (in meters) of the projection to sample in. It restuns a featureCollection.
  • first(): returns the first entry of the collection,
  • get(): to select the appropriate band of your Image/Collection,
  • getInfo(): returns the value, not just the object.

Then we can evaluation the ground elevation and LST around our point of interest using the following commands. Please be careful when evaluating LST. According to the dataset description, the value should be corrected by a factor of 0.02 to get Kelvins (do not forget the conversion). To get the mean mulit-annual daytime LST, we use the mean() method on the LST ImageCollection. (The following run might take some time: about 15-20 seconds for me).

scale = 1000 #scale in meters

# Print the elevation near Lyon (France)
elv = ELV.sample(u_poi, scale).first().get('elevation').getInfo()
print('Ground elevation around the point:', elv, 'm')

# Print the daytime LST near Lyon (France)
# To take the mean value of the LST on the collection, we use the .mean() method:
lst = LST.mean().sample(u_poi, scale).first().get('LST_Day_1km').getInfo()
print('Average daytime LST around the point:', round(lst*0.02 -273.15,2), '°C')

# Print the land cover type around the point:
lct = LC.first().sample(u_poi, scale).first().get('LC_Type1').getInfo()
print('The land cover value around the point is:', lct)
Ground elevation around the point: 196 m
Average daytime LST around the point: 21.92 °C
The land cover value around the point is: 13

Going back to the band description of the LC dataset, we see that a LC value of “13” conrresponds to an urban land. You can run the above cells with the rural point coordinates if you want to notice a difference.

Get some time series

Now that you see we can get geospatial information about a place of interest pretty easily, you may want to get some timeseries, probably make some charts and draw get statistics about a place. Hence, we import the data at the given locations using the getRegion() method.

#here is the buffer zone we consider around each point : 1000m
buffer_points = 1000 

# We get the data for the point in urban area
LST_u_poi = LST.getRegion(u_poi, buffer_points).getInfo()
# We get the data for the point in rural area
LST_r_poi = LST.getRegion(r_poi, buffer_points).getInfo() 

LST_u_poi[:5]
[['id', 'longitude', 'latitude', 'time', 'LST_Day_1km'],
['2013_01_01', 4.810478346460038, 45.77365530231022, 1356998400000, None],
['2013_01_02', 4.810478346460038, 45.77365530231022, 1357084800000, None],
['2013_01_03', 4.810478346460038, 45.77365530231022, 1357171200000, None],
['2013_01_04', 4.810478346460038, 45.77365530231022, 1357257600000, None]]

Printing the first 5 lines of the result shows that we now have arrays full of data. We now define a function to transform this array into a pandas Dataframe which is much more convenient to manipulate.

import pandas as pd

def ee_array_to_df(arr, band):
    """
    We create a function with an array as input
    We return a pandas df
    """
    df = pd.DataFrame(arr)
    
    # we rearrange the header
    headers = df.iloc[0]
    df = pd.DataFrame(df.values[1:], columns = headers)

    # we remove raws without data inside:
    df = df[['longitude', 'latitude', 'time', band]].dropna()
    
    # We converr the data to numeric values
    df[band] = pd.to_numeric(df[band], errors='coerce')
    
    # We also convert the Time filed into a datetime
    df['datetime'] = pd.to_datetime(df['time'], unit='ms')
    
    # We keep the columns we want
    df = df[['time','datetime', band]]
    
    return df

We apply this function to get the two timeseries we want (and we print one):

LSTdf_urban = ee_array_to_df(LST_u_poi,'LST_Day_1km')

#Do not forget that the LST is corrected with a scale of 0.02.
#So we convert the appropriate field of the dataframe to get temperature in celcius:
LSTdf_urban['LST_Day_1km'] = 0.02*LSTdf_urban['LST_Day_1km'] - 273.15

#We do the same for the rural point:
LSTdf_rural = ee_array_to_df(LST_r_poi,'LST_Day_1km')
LSTdf_rural['LST_Day_1km'] = 0.02*LSTdf_rural['LST_Day_1km'] - 273.15 

LSTdf_urban.head()
time datetime LST_Day_1km
15 1358294400000 2013-01-16 -1.49
21 1358812800000 2013-01-22 6.47
25 1359158400000 2013-01-26 0.33
29 1359504000000 2013-01-30 11.93
30 1359590400000 2013-01-31 5.13

Now that we have our data in a good shape, we can easily make plots and compare the trends. As Land Surface Temperature has a seasonality influence, we expect that data looking something like LST(t) = LST_{0} (1 + sin(\omega t + \phi))

Consequently, on the top of the data scatter plot, we plot the fitting curve using the scipy library:

import matplotlib.pyplot as plt
import numpy as np
from scipy import optimize
%matplotlib inline

# Fitting Curves:
## first, we extract x values (times) from our dfs
x_data_u = np.asanyarray(LSTdf_urban['time'].apply(float)) #urban
x_data_r = np.asanyarray(LSTdf_rural['time'].apply(float)) #rural

## Secondly, we extract y values (LST) from our dfs
y_data_u = np.asanyarray(LSTdf_urban['LST_Day_1km'].apply(float)) #urban
y_data_r = np.asanyarray(LSTdf_rural['LST_Day_1km'].apply(float)) #rural

## Then, we define the fitting function with parameters a, b, c, d:
def fit_func(x, a, b, c):
    return a*(np.sin(b*x + c) + 1)

## We optimize the parameters using a good start p0
params_u, params_covariance_u = optimize.curve_fit(fit_func, x_data_u, y_data_u, p0 = [20., 0.002*np.pi/(365.*24.*3600.), 350])
params_r, params_covariance_r = optimize.curve_fit(fit_func, x_data_r, y_data_r, p0 = [20., 0.002*np.pi/(365.*24.*3600.), 350])

# Subplots
fig, ax = plt.subplots(figsize=(21, 9))

# We add scatter plots
ax.scatter(LSTdf_urban['datetime'], LSTdf_urban['LST_Day_1km'], 
           c = 'black', alpha = 0.2, label = 'Urban (data)')
ax.scatter(LSTdf_rural['datetime'], LSTdf_rural['LST_Day_1km'], 
           c = 'green', alpha = 0.35, label = 'Rural (data)')

# We add fitting curves
ax.plot(LSTdf_urban['datetime'], fit_func(x_data_u, params_u[0], params_u[1], params_u[2]), 
        label='Urban (fitted)', color = 'black', lw = 2.5)
ax.plot(LSTdf_rural['datetime'], fit_func(x_data_r, params_r[0], params_r[1], params_r[2]), 
        label='Rural (fitted)', color = 'green', lw = 2.5)

# We add some parameters
ax.set_title('Daytime Land Surface Tempearture near Lyon', fontsize = 16)
ax.set_xlabel('Date', fontsize = 14)
ax.set_ylabel('Temperature [K]', fontsize = 14)
ax.set_ylim(-5, 40)
ax.grid(lw = 0.2)
ax.legend(fontsize = 14, loc = 'lower right')

plt.show()
Python-Geosciences-temperature

Static mapping of Land Surface Temperature and Ground Elevation

Get a static map

Now, we want to get static maps of Land surface Temperature and Ground Elevation around a region of interest. We define this region of interest with a rectangle around France using lower left and toper right corners coordinates (longitude and latitude).

# Definition of a region of interest with a rectangle
l_left = [-6.1, 40.62] # lower left corner of the rectangle
t_right = [10.07, 52.73] # toper right corner of the rectangle

roi = ee.Geometry.Rectangle([l_left, t_right]) # region of interest

Also, we have to convert the LST ImageCollection into an Image, for example by taking the mean value of each pixel over the period of interest. And we convert the value of pixels into Celsius:

# ImageCollection to Image using the mean() method:
LST_im = LST.mean()

# Operation to take into account the scale factor:
LST_im = LST_im.select('LST_Day_1km').multiply(0.02)

# Kelvin -> Celsius
LST_im = LST_im.select('LST_Day_1km').add(-273.15)

Then, we use the getThumbUrl() method to get an url and we can use the IPython librairy to plot the mean daytime LST map on the region of interest. Blue represents the coldest areas (< 10°C) and red represents the warmest areas (>30°C). (For me, the following run is quite long and the image can pop some times after the task appears to be finished)

from IPython.display import Image

# Create the url associated to the Image you want
url = LST_im.getThumbUrl({'min': 10, 'max': 30, 'dimensions': 512,
                          'region' : roi,
                          'palette': ['blue', 'yellow', 'orange', 'red']})
print(url)

# Display a thumbnail Land Surface Temperature in France
Image(url = url)
daytime-land-surface-temperature-france

We do the same for ground elevation:

# We create a filter and apply a mask to get hide oceans and seas
url2 = ELV.updateMask(ELV.gt(0)).getThumbUrl({'min': 0, 'max': 2500, 'region' : roi,
                                              'dimensions': 512,
                                              'palette': ['006633', 'E5FFCC', '662A00', 'D8D8D8', 'F5F5F5']})
Image(url = url2)
ground-elevation-france-earth-engine

Of course you may want to have a closer look around a specific part of the map. So let’s define an other region, adjust the min/max scale and display:

# We create a buffer zone of 10km around our point of interest
lyon = u_poi.buffer(10000)

url3 = ELV.getThumbUrl({'min': 150, 'max': 350, 'region' : lyon,
                        'dimensions': 512,
                        'palette': ['006633', 'E5FFCC', '662A00', 'D8D8D8', 'F5F5F5']})
Image(url = url3)
ground-elevation-lyon-france

Clip an image on a region of interest

In case you want to display an image over a given reagion (and nbot outside), we can clip our dataset using the region as an argument of the clip() method. Let’s say that we want to display the ground elevation in France. We can get the geometry of the administrative boundary of France with the FAO featureCollection and do as same as before:

# We get the feature collection of Administrative boundaries (level0)
countries = ee.FeatureCollection('FAO/GAUL/2015/level0').select('ADM0_NAME')

# We filter the featureCollection to get the feature we want
france = countries.filter(ee.Filter.eq('ADM0_NAME', 'France'))

# We clip the image on France
ELV_fr = ELV.updateMask(ELV.gt(0)).clip(france)

# Create the url associated to the Image you want
url4 = ELV_fr.getThumbUrl({'min': 0, 'max': 2500, 'region' : roi,
                                              'dimensions': 512,
                                              'palette': ['006633', 'E5FFCC', '662A00', 'D8D8D8', 'F5F5F5']})
# Display a thumbnail of elevation in France.
Image(url = url4)
ground-elevation-france

Export a geotiff file

After manipulating Earth Engine datasets, you may need to export a resulting Image to a geotiff. For example, to use it as an input of a numerical model, or to overlap it with personnal georeferecend files in your favorite GIS. There are multiple ways to do that. Here we explore two:

  • In the first option, we save the Image you want in the Google Drive,
  • In the second option, we directly download the Image.

To export the image to our Google Drive, we have to define a task and start it. We have to specify the size of pixels (here 30 m), the projection (here EPSG:4326), the file format (here GeoTIFF) and the region of interest (here the area of lyon defined before).

task = ee.batch.Export.image.toDrive(image = ELV, description = 'Elevation_near_Lyon_France',
                                     scale = 30, region = lyon.getInfo()['coordinates'],
                                     fileNamePrefix = 'my_export_Lyon', crs = 'EPSG:4326',
                                     fileFormat = 'GeoTIFF')
task.start()

Then we can check the status of our task:

task.status()

Now you can check your google drive to find your file.

Similarly, we can use the getDownloadUrl() method and click on the provided link:

link = ELV.getDownloadUrl({
    'scale': 30,
    'crs': 'EPSG:4326',
    'fileFormat' : 'GeoTIFF',
    'region': lyon.getInfo()['coordinates']})
print(link)

Interactive mapping using folium

To display these GEE datasets on an interactive map, let me introduce you to folium. Folium is a python library based on leaflet.js (Open-source JavaScript library for mobile-friendly interactive maps) that you can use to make interactive maps. Folium supports WMS, GeoJSON layers, vector layers and tile layers which make it very convenient and straightforward to visulatise the data we manipulate with python. To install this folium, we use the python package installer:

!pip install folium

Then we create our first interactive map with one line of code, specifying the location where we want to center the map, the zoom strat, and the main dimensions of the map:

import folium

my_map = folium.Map(location=[45.77, 4.855], zoom_start=10, width = 600)
my_map

On top of this map, we now want to add the GEE layers we studied before: Land Cover (LC), Land Surface Temperature (LST) and Ground Elevation Model (GEM). For each GEE dataset, the process consists in adding a new tile layer to our map specifying some visualisation parameters. Particularly, we want to respect the common LC classes defined in the table of the previous section (hexadecimal codes are given for each class: water bodies are blue, urban area are grey, forests are green, etc.). Then we define visualisations parameters associated to LC as follow:

# Set visualization parameters for land cover.
LCvis_params = {'min': 1,'max': 17,
                'palette': ['05450a','086a10', '54a708', '78d203', '009900',
                            'c6b044', 'dcd159', 'dade48', 'fbff13', 'b6ff05',
                            '27ff87', 'c24f44', 'a5a5a5', 'ff6d4c', '69fff8',
                            'f9ffa4', '1c0dff']}

Then, we use the getMapId method on our LC dataset to get the URL associated to the LC tiles, and we use it as the tiles argument of the TileLayer method applied to our map:

LC = LC.select('LC_Type1').filterDate(i_date) #selection of appropriate bands and dates for LC

# We create a map
my_map = folium.Map(location=[45.77, 4.855], zoom_start=8)

# We get the url associted to the tiles we want to add on our map
LC_ee_url_tiles = LC.getMapId(LCvis_params)['tile_fetcher'].url_format

# We add a new tile layers to our map
folium.TileLayer(
tiles = LC_ee_url_tiles,
attr = 'Google Earth Engine',
name = 'Land Cover',
overlay = True,
control = True
).add_to(my_map)

# We add a layer control button on our map, and we do not want it collapsed:
folium.LayerControl(collapsed = False).add_to(my_map)

# We display the result:
my_map

Of course we can add other datasets similarly, by defining some visualization parameters and by addind the approriate tiles:

# Set visualization parameters for the ground elevation
ELVvis_params = {
  'min': 0,
  'max': 4000,
  'palette': ['006633', 'E5FFCC', '662A00', 'D8D8D8', 'F5F5F5']}
# We get the url associted to the GEM tiles
ELV_ee_url_tiles = ELV.updateMask(ELV.gt(0)).getMapId(ELVvis_params)['tile_fetcher'].url_format

# Set visualization parameters for land surface Temperature.
LSTvis_params = {
  'min': 0,
  'max': 40,
  'palette': ['white','blue','green','yellow','orange','red']}

# Here we consider that we want to disply the mean LST over the year 2016 (previously defined).
# ImageCollection to Image using the mean() method:
LST_im = LST.mean()
# Operation to take into account the scale factor and conversion into Celsius:
LST_im = LST_im.select('LST_Day_1km').multiply(0.02).add(-273.15)
# We get the url associted to the LST tiles
LST_ee_url_tiles = LST_im.getMapId(LSTvis_params)['tile_fetcher'].url_format

# We arrange our tiles urls inside a list:
ee_tiles = [ELV_ee_url_tiles, LST_ee_url_tiles, LC_ee_url_tiles]
# We arrange our tiles names inside a list:
ee_tiles_names = ['Elevation',  'Land Surface Temperature',  'Land Cover']

# We create a new map
my_map = folium.Map(location=[45.77, 4.855], zoom_start=5)

# We make a loop over our layers
for ee_tile, name in zip(ee_tiles, ee_tiles_names):
    folium.TileLayer(
    tiles = ee_tile,
    attr = 'Google Earth Engine',
    name = name,
    overlay = True,
    control = True
    ).add_to(my_map)

folium.LayerControl(collapsed = False).add_to(my_map)

my_map

Finally, the map can be saved in html using the command below. Then, you can open it with your favorite navigator.

my_map.save('my_LC_LST_ELV_interactive_map.html')

Play with the script…

This post is available as a notebook and can be run on Goolge Colab.

Documentation

  • The full documentation of the Google Earth Engine Pyhton API is available here
  • The Google Earth engine User Guide is available here
  • Some tutorials are available here
  • An example based on the Google Earth Engine Javascript console dedicated to Land Surface Temperature estimation is provided in the open access supplementary material of Benz et al., (2017). You can access the code here.