❗️ This notebook refers to WaPOR 3.4 and is outdated ❗️

Enhancing data

When pre_et_look runs, it applies extra functions to some variables. In this notebook we’ll take a closer look at those functions and how to modify them.

First we install pywapor, in case it’s not installed yet.

[1]:
!pip install pywapor --quiet

And define the usual parameters.

[1]:
import pywapor

project_folder = r"/Users/hmcoerver/enhancers"
latlim = [28.9, 29.7]
lonlim = [30.2, 31.2]
timelim = ["2021-07-01", "2021-07-11"]

level = "level_1"
sources = pywapor.general.levels.pre_et_look_levels(level)


There are two moments at which these extra functions are applied.

Firstly, just after a variable is downloaded, “enhancer” functions are applied. These functions can be configured per data source and product. Let’s say you would like to use t_air data from different sources, but they come in different units. In that case you could apply a function to one of them to convert the data. Another example: a certain dataset contains gaps, and you’d want to apply a gap filler to that specific product (think Landsat 7).

Secondly, after the composites for all variables have been created, some functions can be applied to further improve the data. For example, the temperature variables are enhanced by using the elevation variable to apply a lapse rate correction.
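To make the idea of a lapse rate correction concrete, here is a minimal arithmetic sketch for a single pixel. It is not pywapor’s implementation: the function name, the formula and the lapse rate of 0.0065 °C per metre (a common standard-atmosphere value) are assumptions chosen for illustration. The idea is that a coarse temperature value is shifted according to how far the pixel’s elevation lies above or below the mean elevation of its surroundings.

```python
# Hypothetical lapse rate in degrees Celsius per metre (standard-atmosphere
# value; not necessarily the value pywapor uses).
LAPSE_RATE = 0.0065


def lapse_rate_correct(t_coarse, z_pixel, z_local_mean, lapse_rate=LAPSE_RATE):
    """Shift a coarse temperature to the elevation of a single pixel."""
    # Pixels above the local mean elevation get colder, pixels below get warmer.
    return t_coarse - lapse_rate * (z_pixel - z_local_mean)


# A pixel 200 m above its surroundings ends up about 1.3 degrees colder.
t_high = lapse_rate_correct(30.0, z_pixel=300.0, z_local_mean=100.0)

# A pixel at the local mean elevation is left unchanged.
t_same = lapse_rate_correct(30.0, z_pixel=100.0, z_local_mean=100.0)

print(round(t_high, 2), round(t_same, 2))
```

A correction like this is why the high-resolution elevation variable z is needed alongside the coarse t_air data.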

Let’s focus on three variables in this notebook (ndvi, t_air and z). We’ll remove the other ones from sources like this.

[2]:
sources = {k: v for k, v in sources.items() if k in ["ndvi", "t_air", "z"]}

Next we run pre_et_look for the standard case.

[3]:
ds = pywapor.pre_et_look.main(project_folder, latlim, lonlim, timelim,
                                sources = sources)
> PRE_ET_LOOK
    --> Collecting `ndvi` from `MODIS.MOD13Q1.061`.
        --> Applying 'mask_qa' to `ndvi`.
        --> Saving merged data.
            > peak-memory-usage: 2.3MB, execution-time: 0:00:02.526141.
            > chunksize|dimsize: [time: 1|1, y: 403|403, x: 506|506], crs: EPSG:4326
            > timesize: 1 [2021-07-20T00:00, ..., 2021-07-20T00:00]
    --> Collecting `ndvi` from `MODIS.MYD13Q1.061`.
        --> Applying 'mask_qa' to `ndvi`.
        --> Saving merged data.
            > peak-memory-usage: 2.3MB, execution-time: 0:00:02.058573.
            > chunksize|dimsize: [time: 1|1, y: 403|403, x: 506|506], crs: EPSG:4326
            > timesize: 1 [2021-07-12T00:00, ..., 2021-07-12T00:00]
    --> Collecting `z` from `SRTM.30M`.
        --> Saving merged data.
            > peak-memory-usage: 118.8MB, execution-time: 0:00:02.686063.
            > chunksize|dimsize: [time: 1|1, y: 2883|2883, x: 3603|3603], crs: EPSG:4326
            > timesize: 1 [2000-02-11T00:00, ..., 2000-02-11T00:00]
    --> Collecting `t_air` from `GEOS5.inst3_2d_asm_Nx`.
        --> Downloading data.
            > peak-memory-usage: 169.8KB, execution-time: 0:00:22.110129.
            > chunksize|dimsize: [time: 137|137, y: 5|5, x: 5|5], crs: EPSG:4326
        --> Applying 'kelvin_to_celsius' to `t_air`.
        --> Saving netCDF.
            > peak-memory-usage: 209.2KB, execution-time: 0:00:02.072168.
            > chunksize|dimsize: [time: 137|137, y: 5|5, x: 5|5], crs: EPSG:4326
            > timesize: 137 [2021-06-28T01:30, ..., 2021-07-15T01:30]
    --> Compositing 3 variables.
        --> (1/3) Compositing `ndvi` (mean).
            --> Using `MYD13Q1.061.nc` as reprojecting example.
                > shape: (403, 506), res: 0.0020° x 0.0020°.
            --> Saving `ndvi` composites.
                > peak-memory-usage: 76.7MB, execution-time: 0:00:08.132130.
                > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
        --> (2/3) Compositing `z` (None).
            --> Saving `z` composites.
                > peak-memory-usage: 317.0MB, execution-time: 0:00:02.220688.
                > chunksize|dimsize: [y: 2883|2883, x: 3603|3603], crs: EPSG:4326
        --> (3/3) Compositing `t_air` (mean).
            --> Saving `t_air` composites.
                > peak-memory-usage: 98.1KB, execution-time: 0:00:02.064301.
                > chunksize|dimsize: [time_bins: 11|11, y: 5|5, x: 5|5], crs: EPSG:4326
    --> Using `MOD13Q1.061.nc` as reprojecting example.
        > shape: (403, 506), res: 0.0020° x 0.0020°.
    --> Selected `reproject_chunk` for reprojection of z_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 5.7KB, execution-time: 0:00:00.222935.
        --> Saving reprojected data from z_bin.nc:z (bilinear).
            > peak-memory-usage: 6.2MB, execution-time: 0:00:02.049847.
            > chunksize|dimsize: [y: 403|403, x: 506|506], crs: EPSG:4326
    --> Selected `reproject_chunk` for reprojection of t_air_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 6.0KB, execution-time: 0:00:00.154334.
        --> Saving reprojected data from t_air_bin.nc:t_air (bilinear).
            > peak-memory-usage: 85.6MB, execution-time: 0:00:02.165458.
            > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
    --> Calculating local means (r = 0.25°) of `z`.
        --> Filling 806 missing pixels in 'z'.
    --> Applying 'lapse_rate'.
    --> Applying 'rename_vars'.
    --> Applying 'fill_attrs'.
    --> Applying 'calc_doys'.
    --> Applying 'remove_empty_statics'.
    --> Applying 'add_constants_new'.
    --> Creating merged file `et_look_in.nc`.
        > peak-memory-usage: 94.2MB, execution-time: 0:00:02.396625.
        > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
< PRE_ET_LOOK (0:01:28.737536)

As you can see in the log, a function called kelvin_to_celsius has been applied to t_air from GEOS5.inst3_2d_asm_Nx. We can confirm this by plotting the temperature, which is now in degrees Celsius.

[4]:
ds.t_air_24.isel(time_bins = 0).plot()
[4]:
<matplotlib.collections.QuadMesh at 0x17f736150>
[Figure: plot of t_air_24 for the first time bin, in degrees Celsius]

The default enhancers for a specific product can be accessed like this.

[5]:
pywapor.collect.product.GEOS5.default_post_processors("inst3_2d_asm_Nx", "t_air")
[5]:
{'t_air': [<function pywapor.enhancers.temperature.kelvin_to_celsius(ds, var, in_var=None, out_var=None)>]}

Changing them happens through the enhancers key of the specific variable and product in sources. As you can see, it’s currently set to 'default'.

[6]:
sources["t_air"]["products"][0]
[6]:
{'source': 'GEOS5', 'product_name': 'inst3_2d_asm_Nx', 'enhancers': 'default'}

Instead of the string 'default', we can also supply a list of functions. Giving an empty list therefore disables the kelvin_to_celsius function.

[7]:
sources["t_air"]["products"][0]["enhancers"] = []
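Indexing with products[0] relies on the GEOS5 product being the first entry in the list. As a minimal sketch, a small helper could look the product up by name instead. Note that set_enhancers is a hypothetical function, not part of pywapor, and the dictionary below is a stand-in that only mirrors the keys printed above so the example runs on its own.

```python
import copy

# Stand-in mirroring the structure shown above; the real `sources` comes from pywapor.
sources = {
    "t_air": {
        "products": [
            {"source": "GEOS5", "product_name": "inst3_2d_asm_Nx", "enhancers": "default"},
        ],
    },
}


def set_enhancers(sources, var, source, product_name, enhancers):
    """Return a copy of `sources` with `enhancers` set for one product (hypothetical helper)."""
    new_sources = copy.deepcopy(sources)  # leave the original configuration untouched
    matched = False
    for product in new_sources[var]["products"]:
        if product["source"] == source and product["product_name"] == product_name:
            product["enhancers"] = enhancers
            matched = True
    if not matched:
        raise KeyError(f"No product {source}.{product_name} configured for `{var}`.")
    return new_sources


# An empty list switches off all enhancers for this product.
no_conversion = set_enhancers(sources, "t_air", "GEOS5", "inst3_2d_asm_Nx", [])
print(no_conversion["t_air"]["products"][0])
```

Working on a copy keeps the default configuration available, which is convenient when comparing runs with and without a given enhancer.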

Before rerunning pre_et_look, we’ll delete the GEOS5 data to force the script to download and process it again.

[8]:
import os
os.remove(os.path.join(project_folder, "GEOS5", "inst3_2d_asm_Nx.nc"))

ds = pywapor.pre_et_look.main(project_folder, latlim, lonlim, timelim,
                                sources = sources)
> PRE_ET_LOOK
    --> Collecting `ndvi` from `MODIS.MOD13Q1.061`.
            > timesize: 1 [2021-07-20T00:00, ..., 2021-07-20T00:00]
    --> Collecting `ndvi` from `MODIS.MYD13Q1.061`.
            > timesize: 1 [2021-07-12T00:00, ..., 2021-07-12T00:00]
    --> Collecting `z` from `SRTM.30M`.
            > timesize: 1 [2000-02-11T00:00, ..., 2000-02-11T00:00]
    --> Collecting `t_air` from `GEOS5.inst3_2d_asm_Nx`.
        --> Downloading data.
            > peak-memory-usage: 178.8KB, execution-time: 0:00:06.092417.
            > chunksize|dimsize: [time: 137|137, y: 5|5, x: 5|5], crs: EPSG:4326
        --> Saving netCDF.
            > peak-memory-usage: 220.3KB, execution-time: 0:00:02.079685.
            > chunksize|dimsize: [time: 137|137, y: 5|5, x: 5|5], crs: EPSG:4326
            > timesize: 137 [2021-06-28T01:30, ..., 2021-07-15T01:30]
    --> Compositing 3 variables.
        --> (1/3) Compositing `ndvi` (mean).
            --> Using `MYD13Q1.061.nc` as reprojecting example.
                > shape: (403, 506), res: 0.0020° x 0.0020°.
            --> Saving `ndvi` composites.
                > peak-memory-usage: 85.6MB, execution-time: 0:00:08.097447.
                > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
        --> (2/3) Compositing `z` (None).
            --> Saving `z` composites.
                > peak-memory-usage: 317.0MB, execution-time: 0:00:02.214960.
                > chunksize|dimsize: [y: 2883|2883, x: 3603|3603], crs: EPSG:4326
        --> (3/3) Compositing `t_air` (mean).
            --> Saving `t_air` composites.
                > peak-memory-usage: 96.2KB, execution-time: 0:00:02.060108.
                > chunksize|dimsize: [time_bins: 11|11, y: 5|5, x: 5|5], crs: EPSG:4326
    --> Using `MOD13Q1.061.nc` as reprojecting example.
        > shape: (403, 506), res: 0.0020° x 0.0020°.
    --> Selected `reproject_chunk` for reprojection of z_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 5.5KB, execution-time: 0:00:00.223344.
        --> Saving reprojected data from z_bin.nc:z (bilinear).
            > peak-memory-usage: 6.2MB, execution-time: 0:00:02.063322.
            > chunksize|dimsize: [y: 403|403, x: 506|506], crs: EPSG:4326
    --> Selected `reproject_chunk` for reprojection of t_air_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 5.8KB, execution-time: 0:00:00.147901.
        --> Saving reprojected data from t_air_bin.nc:t_air (bilinear).
            > peak-memory-usage: 85.6MB, execution-time: 0:00:02.157935.
            > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
    --> Calculating local means (r = 0.25°) of `z`.
        --> Filling 806 missing pixels in 'z'.
    --> Applying 'lapse_rate'.
    --> Applying 'rename_vars'.
    --> Applying 'fill_attrs'.
    --> Applying 'calc_doys'.
    --> Applying 'remove_empty_statics'.
    --> Applying 'add_constants_new'.
    --> Creating merged file `et_look_in_.nc`.
        > peak-memory-usage: 85.6MB, execution-time: 0:00:02.391275.
        > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
< PRE_ET_LOOK (0:00:29.250795)

The temperature is now stored in Kelvin.

[9]:
ds.t_air_24.isel(time_bins = 0).plot()
[9]:
<matplotlib.collections.QuadMesh at 0x28704ead0>
[Figure: plot of t_air_24 for the first time bin, in Kelvin]

Another type of enhancement is applied near the end of pre_et_look. These functions can be adjusted through the keyword argument enhancers in the pywapor.pre_et_look.main function.

For example, running with an empty list disables the lapse_rate correction.

[10]:
ds = pywapor.pre_et_look.main(project_folder, latlim, lonlim, timelim,
                                sources = sources, enhancers = [])
> PRE_ET_LOOK
    --> Collecting `ndvi` from `MODIS.MOD13Q1.061`.
            > timesize: 1 [2021-07-20T00:00, ..., 2021-07-20T00:00]
    --> Collecting `ndvi` from `MODIS.MYD13Q1.061`.
            > timesize: 1 [2021-07-12T00:00, ..., 2021-07-12T00:00]
    --> Collecting `z` from `SRTM.30M`.
            > timesize: 1 [2000-02-11T00:00, ..., 2000-02-11T00:00]
    --> Collecting `t_air` from `GEOS5.inst3_2d_asm_Nx`.
            > timesize: 137 [2021-06-28T01:30, ..., 2021-07-15T01:30]
    --> Compositing 3 variables.
        --> (1/3) Compositing `ndvi` (mean).
            --> Using `MYD13Q1.061.nc` as reprojecting example.
                > shape: (403, 506), res: 0.0020° x 0.0020°.
            --> Saving `ndvi` composites.
                > peak-memory-usage: 79.4MB, execution-time: 0:00:08.113078.
                > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
        --> (2/3) Compositing `z` (None).
            --> Saving `z` composites.
                > peak-memory-usage: 317.0MB, execution-time: 0:00:02.210364.
                > chunksize|dimsize: [y: 2883|2883, x: 3603|3603], crs: EPSG:4326
        --> (3/3) Compositing `t_air` (mean).
            --> Saving `t_air` composites.
                > peak-memory-usage: 105.7KB, execution-time: 0:00:02.071231.
                > chunksize|dimsize: [time_bins: 11|11, y: 5|5, x: 5|5], crs: EPSG:4326
    --> Using `MOD13Q1.061.nc` as reprojecting example.
        > shape: (403, 506), res: 0.0020° x 0.0020°.
    --> Selected `reproject_chunk` for reprojection of z_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 5.5KB, execution-time: 0:00:00.226198.
        --> Saving reprojected data from z_bin.nc:z (bilinear).
            > peak-memory-usage: 6.2MB, execution-time: 0:00:02.076547.
            > chunksize|dimsize: [y: 403|403, x: 506|506], crs: EPSG:4326
    --> Selected `reproject_chunk` for reprojection of t_air_bin.nc.
        --> Warping VRT to netCDF.
            > peak-memory-usage: 5.8KB, execution-time: 0:00:00.146248.
        --> Saving reprojected data from t_air_bin.nc:t_air (bilinear).
            > peak-memory-usage: 85.6MB, execution-time: 0:00:02.157827.
            > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
    --> Applying 'rename_vars'.
    --> Applying 'fill_attrs'.
    --> Applying 'calc_doys'.
    --> Applying 'remove_empty_statics'.
    --> Applying 'add_constants_new'.
    --> Creating merged file `et_look_in__.nc`.
        > peak-memory-usage: 68.6MB, execution-time: 0:00:02.346385.
        > chunksize|dimsize: [time_bins: 11|11, y: 403|403, x: 506|506], crs: EPSG:4326
< PRE_ET_LOOK (0:00:19.561168)
[11]:
ds.t_air_24.isel(time_bins = 0).plot()
[11]:
<matplotlib.collections.QuadMesh at 0x17f7202d0>
[Figure: plot of t_air_24 for the first time bin, without the lapse_rate correction]

Besides turning off these enhancers, it is also possible to apply other enhancers to variables, e.g. your own gap-filling algorithm or your custom image sharpener.
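A custom enhancer could be sketched as follows, assuming it follows the calling convention visible in the kelvin_to_celsius signature shown earlier: it receives the dataset ds and the variable name var. That it should return the modified dataset is an assumption. The names make_offset_enhancer and celsius_to_kelvin are hypothetical, and plain dictionaries stand in for the dataset and the product entry so the example runs without pywapor. A unit conversion is used because it is shorter than a gap filler, but a gap filler would presumably plug in the same way.

```python
def make_offset_enhancer(offset):
    """Build an enhancer that adds a constant `offset` to one variable."""

    def add_offset(ds, var):
        # Only item access and addition are used, so the same expression
        # also applies to an xarray.Dataset.
        ds[var] = ds[var] + offset
        return ds  # assumption: enhancers hand back the modified dataset

    return add_offset


# Hypothetical enhancer converting degrees Celsius to Kelvin.
celsius_to_kelvin = make_offset_enhancer(273.15)

# Plain dict standing in for a dataset with a single temperature value.
demo = {"t_air": 25.0}
demo = celsius_to_kelvin(demo, "t_air")

# Stand-in for one product entry in `sources`; the function goes in a list
# under the `enhancers` key, replacing the string 'default'.
product = {"source": "GEOS5", "product_name": "inst3_2d_asm_Nx", "enhancers": "default"}
product["enhancers"] = [celsius_to_kelvin]

print(demo, [f.__name__ for f in product["enhancers"]])
```

Building the enhancer with a factory keeps the two-argument form (ds, var) while still allowing a parameter such as the offset to be chosen per product.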