How To Use

Introduction

The API for wrf-python can be summarized as a variable computation/extraction routine, several interpolation routines, and a few plotting helper utilities. The API is kept as simple as possible to help minimize the learning curve for new programmers, students, and scientists. In the future, we plan to extend xarray for programmers desiring a more object oriented API, but this remains a work in progress.

The five most commonly used routines can be summarized as:

  • wrf.getvar() - Extracts WRF-ARW NetCDF variables and computes diagnostic variables that WRF does not compute (e.g. storm relative helicity). This is the routine that you will use most often.
  • wrf.interplevel() - Interpolates a three-dimensional field to a horizontal plane at a specified level using simple (fast) linear interpolation (e.g. 850 hPa temperature).
  • wrf.vertcross() - Interpolates a three-dimensional field to a vertical plane through a user-specified horizontal line (i.e. a cross section).
  • wrf.interpline() - Interpolates a two-dimensional field to a user-specified line.
  • wrf.vinterp() - Interpolates a three-dimensional field to user-specified ‘surface’ levels (e.g. theta-e levels). This is a smarter, albeit slower, version of wrf.interplevel().

Basic Usage

Computing Diagnostic Variables

The primary use for the wrf.getvar() function is to return diagnostic variables that require a calculation, since WRF does not produce these variables natively. These diagnostics include CAPE, storm relative helicity, omega, sea level pressure, etc. A table of all available diagnostics can be found here: Table of Available Diagnostics.

In the example below, sea level pressure is calculated and printed.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the Sea Level Pressure
slp = getvar(ncfile, "slp")

print(slp)

Result:

<xarray.DataArray u'slp' (south_north: 1059, west_east: 1799)>
array([[ 1012.220337,  1012.298157,  1012.247864, ...,  1010.132019,
         1009.932312,  1010.067078],
       [ 1012.432861,  1012.444763,  1012.33667 , ...,  1010.1073  ,
         1010.108459,  1010.047607],
       [ 1012.395447,  1012.380859,  1012.417053, ...,  1010.22937 ,
         1010.055969,  1010.026794],
       ...,
       [ 1009.042358,  1009.069214,  1008.987793, ...,  1019.19281 ,
         1019.144348,  1019.110596],
       [ 1009.224854,  1009.075134,  1008.986389, ...,  1019.071899,
         1019.042664,  1019.061279],
       [ 1009.188965,  1009.107117,  1008.979797, ...,  1018.917786,
         1018.956848,  1019.047485]], dtype=float32)
Coordinates:
    XLONG    (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT     (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
    Time     datetime64[ns] 2016-10-07
Dimensions without coordinates: south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XY
    description: sea level pressure
    units: hPa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Extracting WRF NetCDF Variables

In addition to computing diagnostic variables (see Computing Diagnostic Variables), the wrf.getvar() function can be used to extract regular WRF-ARW output NetCDF variables.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

p = getvar(ncfile, "P")

print(p)

Result:

<xarray.DataArray u'P' (bottom_top: 50, south_north: 1059, west_east: 1799)>
array([[[  1.217539e+03,   1.225320e+03, ...,   9.876406e+02,   1.001117e+03],
        [  1.238773e+03,   1.240047e+03, ...,   1.005297e+03,   9.991719e+02],
        ...,
        [  9.208594e+02,   9.059141e+02, ...,   1.902922e+03,   1.904805e+03],
        [  9.172734e+02,   9.091094e+02, ...,   1.894375e+03,   1.903422e+03]],

       [[  1.219562e+03,   1.210273e+03, ...,   9.973984e+02,   9.907891e+02],
        [  1.224578e+03,   1.223508e+03, ...,   9.985547e+02,   9.921172e+02],
        ...,
        [  9.012734e+02,   9.052031e+02, ...,   1.897766e+03,   1.894500e+03],
        [  9.137500e+02,   9.071719e+02, ...,   1.893273e+03,   1.893664e+03]],

       ...,
       [[  7.233154e+00,   7.224121e+00, ...,   3.627930e+00,   3.613770e+00],
        [  7.226318e+00,   7.358154e+00, ...,   3.725098e+00,   3.634033e+00],
        ...,
        [  5.354248e+00,   5.406006e+00, ...,   1.282715e+01,   1.264844e+01],
        [  5.295410e+00,   5.177490e+00, ...,   1.256274e+01,   1.257642e+01]],

       [[  2.362061e+00,   2.376221e+00, ...,   1.151367e+00,   1.156982e+00],
        [  2.342529e+00,   2.403809e+00, ...,   1.198486e+00,   1.155273e+00],
        ...,
        [  1.732910e+00,   1.768799e+00, ...,   4.247070e+00,   4.135498e+00],
        [  1.715332e+00,   1.657227e+00, ...,   4.036377e+00,   4.047852e+00]]], dtype=float32)
Coordinates:
    XLONG    (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT     (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
    Time     datetime64[ns] 2016-10-07
Dimensions without coordinates: bottom_top, south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: perturbation pressure
    units: Pa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Disabling xarray and metadata

Sometimes you just want a regular numpy array and don’t care about metadata. This is often the case when you are working with compiled extensions. Metadata can be disabled in one of two ways.

  1. disable xarray completely
  2. set the meta function parameter to False.

The example below illustrates both.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, disable_xarray

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Disable xarray completely
disable_xarray()
p_no_meta = getvar(ncfile, "P")
print (type(p_no_meta))
enable_xarray()

# Disable by using the meta parameter
p_no_meta = getvar(ncfile, "P", meta=False)
print (type(p_no_meta))

Result:

<type 'numpy.ndarray'>
<type 'numpy.ndarray'>

Extracting a Numpy Array from a DataArray

If you need to convert an xarray.DataArray to a numpy.ndarray, wrf-python provides the wrf.to_np() function for this purpose. Although an xarray.DataArary object already contains the xarray.DataArray.values attribute to extract the Numpy array, there is a problem when working with compiled extensions. The behavior for xarray (and pandas) is to convert missing/fill values to NaN, which may cause crashes when working with compiled extensions. Also, some existing code may be designed to work with numpy.ma.MaskedArray, and numpy arrays with NaN may not work with it.

The wrf.to_np() function does the following:

  1. If no missing/fill values are used, wrf.to_np() simply returns the xarray.DataArray.values attribute.
  2. If missing/fill values are used, then wrf.to_np() replaces the NaN values with the _FillValue found in the xarray.DataArray.attrs attribute (required) and a numpy.ma.MaskedArray is returned.
from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the 3D CAPE, which contains missing values
cape_3d = getvar(ncfile, "cape_3d")

# Since there are missing values, this should return a MaskedArray
cape_3d_ndarray = to_np(cape_3d)

print(type(cape_3d_ndarray))

Result:

<class 'numpy.ma.core.MaskedArray'>

Sequences of Files

Combining Multiple Files Using the ‘cat’ Method

The ‘cat’ (concatenate) method aggregates all files in the sequence along the ‘Time’ dimension, which will be the leftmost dimension for the output array. To include all of the times, in all of the files, in the output array, set the timeidx parameter to wrf.ALL_TIMES (an alias for None). If a single value is specified for timeidx, then the time index is assumed to be taken from the concatenation of all times for all files.

It is import to note that no sorting is performed in the wrf.getvar() routine, so all files in the sequence must be sorted prior to calling this function.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

# Creating a simple test list with three timesteps
wrflist = [Dataset("wrfout_d01_2016-10-07_00_00_00"),
           Dataset("wrfout_d01_2016-10-07_01_00_00"),
           Dataset("wrfout_d01_2016-10-07_02_00_00")]

# Extract the 'P' variable for all times
p_cat = getvar(wrflist, "P", timeidx=ALL_TIMES, method="cat")

print(p_cat)

Result:

<xarray.DataArray u'P' (Time: 3, bottom_top: 50, south_north: 1059, west_east: 1799)>
array([[[[  1.21753906e+03,   1.22532031e+03,   1.22030469e+03, ...,
        1.00760156e+03,   9.87640625e+02,   1.00111719e+03],
     [  1.23877344e+03,   1.24004688e+03,   1.22926562e+03, ...,
        1.00519531e+03,   1.00529688e+03,   9.99171875e+02],
     [  1.23503906e+03,   1.23367188e+03,   1.23731250e+03, ...,
        1.01739844e+03,   1.00005469e+03,   9.97093750e+02],
     ...,
     [  1.77978516e+00,   1.77050781e+00,   1.79003906e+00, ...,
        4.22949219e+00,   4.25659180e+00,   4.13647461e+00],
     [  1.73291016e+00,   1.76879883e+00,   1.77978516e+00, ...,
        4.24047852e+00,   4.24707031e+00,   4.13549805e+00],
     [  1.71533203e+00,   1.65722656e+00,   1.67480469e+00, ...,
        4.06884766e+00,   4.03637695e+00,   4.04785156e+00]]]], dtype=float32)
Coordinates:
    XLONG        (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT         (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
  * Time         (Time) datetime64[ns] 2016-10-07 2016-10-07 2016-10-07
    datetime     (Time) datetime64[ns] 2016-10-07T00:00:00 ...
Dimensions without coordinates: bottom_top, south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: perturbation pressure
    units: Pa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Combining Multiple Files Using the ‘join’ Method

The ‘join’ method combines a sequence of files by adding a new leftmost dimension for the file/sequence index. In situations where there are multiple files with multiple times, and the last file contains less times than the previous files, the remaining arrays will be arrays filled with missing values. There are checks in place within the wrf-python algorithms to look for these missing arrays, but be careful when calling compiled routines outside of wrf-python.

In most cases, timeidx parameter should be set to wrf.ALL_TIMES. If a timeidx value is specified, then this time index is used when extracting the variable from each file. In cases where there are multiple files with multiple time steps, this is probably nonsensical, since the nth time index for each file represents a different time.

In general, join is rarely used, so the concatenate method should be used for most cases.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES


# Creating a simple test list with three timesteps
wrflist = [Dataset("wrfout_d01_2016-10-07_00_00_00"),
           Dataset("wrfout_d01_2016-10-07_01_00_00"),
           Dataset("wrfout_d01_2016-10-07_02_00_00")]

# Extract the 'P' variable for all times
p_join = getvar(wrflist, "P", timeidx=ALL_TIMES, method="join")

print(p_join)

Result:

<xarray.DataArray u'P' (file: 3, bottom_top: 50, south_north: 1059, west_east: 1799)>
array([[[[  1.217539e+03, ...,   1.001117e+03],
         ...,
         [  9.172734e+02, ...,   1.903422e+03]],
        ...,
        [[  2.362061e+00, ...,   1.156982e+00],
         ...,
         [  1.715332e+00, ...,   4.047852e+00]]]], dtype=float32)
Coordinates:
    XLONG     (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT      (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
  * file      (file) int64 0 1 2
    datetime  (file) datetime64[ns] 2016-10-07 ...
Dimensions without coordinates: bottom_top, south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: perturbation pressure
    units: Pa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Note how the ‘Time’ dimension was replaced with the ‘file’ dimension, due to numpy’s automatic squeezing of the single element ‘Time’ dimension. To maintain the ‘Time’ dimension, set the squeeze parameter to False.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES


# Creating a simple test list with three timesteps
wrflist = [Dataset("wrfout_d01_2016-10-07_00_00_00"),
           Dataset("wrfout_d01_2016-10-07_01_00_00"),
           Dataset("wrfout_d01_2016-10-07_02_00_00")]

# Extract the 'P' variable for all times
p_join = getvar(wrflist, "P", timeidx=ALL_TIMES, method="join", squeeze=False)

print(p_join)

Result

<xarray.DataArray u'P' (file: 3, Time: 1, bottom_top: 50, south_north: 1059, west_east: 1799)>
array([[[[[  1.217539e+03, ...,   1.001117e+03],
      ...,
      [  9.172734e+02, ...,   1.903422e+03]],

     ...,
     [[  2.362061e+00, ...,   1.156982e+00],
      ...,
      [  1.715332e+00, ...,   4.047852e+00]]]]], dtype=float32)
Coordinates:
    XLONG     (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT      (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
  * file      (file) int64 0 1 2
    datetime  (file, Time) datetime64[ns] 2016-10-07 2016-10-07 2016-10-07
Dimensions without coordinates: Time, bottom_top, south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: perturbation pressure
    units: Pa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Dictionaries of WRF File Sequences

Dictionaries can also be used as input to the wrf.getvar() functions. This can be useful when working with ensembles. However, all WRF files in the dictionary must have the same dimensions. The result is an array where the leftmost dimension is the keys from the dictionary. Nested dictionaries are allowed.

The method argument is used to describe how each sequence in the dictionary will be combined.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

wrf_dict = {"ens1" : [Dataset("ens1/wrfout_d01_2016-10-07_00_00_00"),
                      Dataset("ens1/wrfout_d01_2016-10-07_01_00_00"),
                      Dataset("ens1/wrfout_d01_2016-10-07_02_00_00")],
            "ens2" : [Dataset("ens2/wrfout_d01_2016-10-07_00_00_00"),
                      Dataset("ens2/wrfout_d01_2016-10-07_01_00_00"),
                      Dataset("ens2/wrfout_d01_2016-10-07_02_00_00")]
            }

p = getvar(wrf_dict, "P", timeidx=ALL_TIMES)

print(p)

Result:

<xarray.DataArray 'P' (key_0: 2, Time: 2, bottom_top: 50, south_north: 1059, west_east: 1799)>
array([[[[[  1.217539e+03, ...,   1.001117e+03],
          ...,
          [  9.172734e+02, ...,   1.903422e+03]],

         ...,
         [[  2.362061e+00, ...,   1.156982e+00],
          ...,
          [  1.715332e+00, ...,   4.047852e+00]]]]], dtype=float32)
Coordinates:
    XLONG     (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT      (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
  * Time      (Time) datetime64[ns] 2016-10-07 ...
    datetime  (Time) datetime64[ns] 2016-10-07 ...
  * key_0     (key_0) <U6 u'label1' u'label2'
Dimensions without coordinates: bottom_top, south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: perturbation pressure
    units: Pa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)

Interpolation Routines

Interpolating to a Horizontal Level

The wrf.interplevel() function is used to interpolate a 3D field to a specific horizontal level, usually pressure or height.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, interplevel

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Extract the Geopotential Height and Pressure (hPa) fields
z = getvar(ncfile, "z")
p = getvar(ncfile, "pressure")

# Compute the 500 MB Geopotential Height
ht_500mb = interplevel(z, p, 500.)

print(ht_500mb)

Result:

<xarray.DataArray u'height_500_hPa' (south_north: 1059, west_east: 1799)>
array([[ 5882.16992188,  5881.87939453,  5881.81005859, ...,
     5890.14501953,  5890.23583984,  5890.33349609],
   [ 5882.71777344,  5882.17529297,  5882.1171875 , ...,
     5890.37695312,  5890.38525391,  5890.27978516],
   [ 5883.32177734,  5882.47119141,  5882.34130859, ...,
     5890.48339844,  5890.42871094,  5890.17724609],
   ...,
   [ 5581.45800781,  5580.46826172,  5579.32617188, ...,
     5788.93554688,  5788.70507812,  5788.64453125],
   [ 5580.32714844,  5579.51611328,  5578.34863281, ...,
     5788.15869141,  5787.87304688,  5787.65527344],
   [ 5579.64404297,  5578.30957031,  5576.98632812, ...,
     5787.19384766,  5787.10888672,  5787.06933594]], dtype=float32)
Coordinates:
    XLONG        (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT         (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
    Time         datetime64[ns] 2016-10-07
Dimensions without coordinates: south_north, west_east
Attributes:
    FieldType: 104
    units: m
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    level: 500 hPa
    missing_value: 9.96920996839e+36
    _FillValue: 9.96920996839e+36

Vertical Cross Sections

The wrf.vertcross() function is used to create vertical cross sections. To define a cross section, a start point and an end point needs to be specified. Alternatively, a pivot point and an angle may be used. The start point, end point, and pivot point are specified using a wrf.CoordPair object, and coordinates can either be in grid (x,y) coordinates or (latitude,longitude) coordinates. When using (latitude,longitude) coordinates, a NetCDF file object or a wrf.WrfProj object must be provided.

The vertical levels can also be specified using the levels parameter. If not specified, then approximately 100 levels will be chosen in 1% increments.

Example Using Start Point and End Point

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the geopotential height (m) and pressure (hPa).
z = getvar(ncfile, "z")
p = getvar(ncfile, "pressure")

# Define a start point and end point in grid coordinates
start_point = CoordPair(x=0, y=(z.shape[-2]-1)//2)
end_point = CoordPair(x=-1, y=(z.shape[-2]-1)//2)

# Calculate the vertical cross section.  By setting latlon to True, this
# also calculates the latitude and longitude coordinates along the cross
# section line and adds them to the 'xy_loc' metadata to help with plotting.
p_vert = vertcross(p, z, start_point=start_point, end_point=end_point, latlon=True)

print(p_vert)

Result:

<xarray.DataArray u'pressure_cross' (vertical: 100, idx: 1798)>
array([[          nan,           nan,           nan, ...,           nan,
              nan,           nan],
   [ 989.66168213,  989.66802979,  989.66351318, ...,  988.05737305,
     987.99151611,  987.96917725],
   [ 959.49450684,  959.50109863,  959.50030518, ...,  958.96948242,
     958.92980957,  958.89294434],
   ...,
   [  24.28092003,   24.27359581,   24.27034378, ...,   24.24800491,
      24.2486496 ,   24.24947357],
   [  23.2868309 ,   23.27933884,   23.27607918, ...,   23.25231361,
      23.2530098 ,   23.25384521],
   [          nan,           nan,           nan, ...,           nan,
              nan,           nan]], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (idx) object CoordPair(x=0.0, y=529.0, lat=34.5279502869, lon=-127.398925781) ...
  * vertical  (vertical) float32 0.0 261.828 523.656 785.484 1047.31 1309.14 ...
Dimensions without coordinates: idx
Attributes:
    FieldType: 104
    description: pressure
    units: hPa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (0.0, 529.0) to (1797.0, 529.0)
    missing_value: 9.96920996839e+36
    _FillValue: 9.96920996839e+36

Example Using Pivot Point and Angle

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the geopotential height (m) and pressure (hPa).
z = getvar(ncfile, "z")
p = getvar(ncfile, "pressure")

# Define a pivot point and angle in grid coordinates, with the
# pivot point being the center of the grid.
pivot_point = CoordPair(x=(z.shape[-1]-1)//2, y=(z.shape[-2]-1)//2)
angle = 90.0

# Calculate the vertical cross section.  By setting latlon to True, this
# also calculates the latitude and longitude coordinates along the line
# and adds them to the metadata to help with plotting labels.
p_vert = vertcross(p, z, pivot_point=pivot_point, angle=angle, latlon=True)

print (p_vert)

Result:

<xarray.DataArray u'pressure_cross' (vertical: 100, idx: 1798)>
array([[          nan,           nan,           nan, ...,           nan,
              nan,           nan],
   [ 989.66168213,  989.66802979,  989.66351318, ...,  988.05737305,
     987.99151611,  987.96917725],
   [ 959.49450684,  959.50109863,  959.50030518, ...,  958.96948242,
     958.92980957,  958.89294434],
   ...,
   [  24.28092003,   24.27359581,   24.27034378, ...,   24.24800491,
      24.2486496 ,   24.24947357],
   [  23.2868309 ,   23.27933884,   23.27607918, ...,   23.25231361,
      23.2530098 ,   23.25384521],
   [          nan,           nan,           nan, ...,           nan,
              nan,           nan]], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (idx) object CoordPair(x=0.0, y=529.0, lat=34.5279502869, lon=-127.398925781) ...
  * vertical  (vertical) float32 0.0 261.828 523.656 785.484 1047.31 1309.14 ...
Dimensions without coordinates: idx
Attributes:
    FieldType: 104
    description: pressure
    units: hPa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (0.0, 529.0) to (1797.0, 529.0) ; center=CoordPair(x=899.0, y=529.0) ; angle=90.0
    missing_value: 9.96920996839e+36
    _FillValue: 9.96920996839e+36

Example Using Lat/Lon Coordinates

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the geopotential height (m) and pressure (hPa).
z = getvar(ncfile, "z")
p = getvar(ncfile, "pressure")
lats = getvar(ncfile, "lat")
lons = getvar(ncfile, "lon")

# Making the same horizontal line, but with lats/lons
start_lat = lats[(lats.shape[-2]-1)//2, 0]
end_lat = lats[(lats.shape[-2]-1)//2, -1]
start_lon = lons[(lats.shape[-2]-1)//2, 0]
end_lon = lons[(lats.shape[-2]-1)//2, -1]

# Cross section line using start_point and end_point.
start_point = CoordPair(lat=start_lat, lon=start_lon)
end_point = CoordPair(lat=end_lat, lon=end_lon)

# When using lat/lon coordinates, you must supply a WRF netcdf file object,
# or a projection object with the lower left latitude and lower left
# longitude points.
p_vert = vertcross(p, z, wrfin=ncfile, start_point=start_point, end_point=end_point, latlon=True)
print(p_vert)

Result:

<xarray.DataArray u'pressure_cross' (vertical: 100, idx: 1798)>
array([[          nan,           nan,           nan, ...,           nan,
              nan,           nan],
   [ 989.66168213,  989.66802979,  989.66351318, ...,  988.05737305,
     987.99151611,  987.96917725],
   [ 959.49450684,  959.50109863,  959.50030518, ...,  958.96948242,
     958.92980957,  958.89294434],
   ...,
   [  24.28092003,   24.27359581,   24.27034378, ...,   24.24800491,
      24.2486496 ,   24.24947357],
   [  23.2868309 ,   23.27933884,   23.27607918, ...,   23.25231361,
      23.2530098 ,   23.25384521],
   [          nan,           nan,           nan, ...,           nan,
              nan,           nan]], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (idx) object CoordPair(x=0.0, y=529.0, lat=34.5279502869, lon=-127.398925781) ...
  * vertical  (vertical) float32 0.0 261.828 523.656 785.484 1047.31 1309.14 ...
Dimensions without coordinates: idx
Attributes:
    FieldType: 104
    description: pressure
    units: hPa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (0.0, 529.0) to (1797.0, 529.0)
    missing_value: 9.96920996839e+36
    _FillValue: 9.96920996839e+36

Example Using Specified Vertical Levels

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, vertcross, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the geopotential height (m) and pressure (hPa).
z = getvar(ncfile, "z")
p = getvar(ncfile, "pressure")
lats = getvar(ncfile, "lat")
lons = getvar(ncfile, "lon")

# Making the same horizontal line, but with lats/lons
start_lat = lats[(lats.shape[-2]-1)//2, 0]
end_lat = lats[(lats.shape[-2]-1)//2, -1]
start_lon = lons[(lats.shape[-2]-1)//2, 0]
end_lon = lons[(lats.shape[-2]-1)//2, -1]

# Pressure using start_point and end_point.  These were obtained using
start_point = CoordPair(lat=start_lat, lon=start_lon)
end_point = CoordPair(lat=end_lat, lon=end_lon)

# Specify vertical levels
levels = [1000., 2000., 3000.]

# Calculate the cross section
p_vert = vertcross(p, z, wrfin=ncfile, levels=levels, start_point=start_point, end_point=end_point, latlon=True)

print(p_vert)

Result:

<xarray.DataArray u'pressure_cross' (vertical: 3, idx: 1798)>
array([[ 906.375     ,  906.38043213,  906.39367676, ...,  907.6661377 ,
         907.63006592,  907.59191895],
       [ 804.24737549,  804.26885986,  804.28076172, ...,  806.98632812,
         806.95556641,  806.92608643],
       [ 713.24578857,  713.2722168 ,  713.27886963, ...,  716.09594727,
         716.06610107,  716.03503418]], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (idx) object CoordPair(x=0.0, y=529.0, lat=34.5279502869, lon=-127.398925781) ...
  * vertical  (vertical) float32 1000.0 2000.0 3000.0
Dimensions without coordinates: idx
Attributes:
    FieldType: 104
    description: pressure
    units: hPa
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (0.0, 529.0) to (1797.0, 529.0)
    missing_value: 9.96920996839e+36
    _FillValue: 9.96920996839e+36

Interpolating Two-Dimensional Fields to a Line

Two-dimensional fields can be interpolated along a line, in a manner similar to the vertical cross section (see Vertical Cross Sections), using the wrf.interpline() function. To define the line to interpolate along, a start point and an end point needs to be specified. Alternatively, a pivot point and an angle may be used. The start point, end point, and pivot point are specified using a wrf.CoordPair object, and coordinates can either be in grid (x,y) coordinates or (latitude,longitude) coordinates. When using (latitude,longitude) coordinates, a NetCDF file object or a wrf.WrfProj object must also be provided.

Example Using Start Point and End Point

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, interpline, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the 2m temperature
t2 = getvar(ncfile, "T2")

# Create a south-north line in the center of the domain using
# start point and end point
start_point = CoordPair(x=(t2.shape[-1]-1)//2, y=0)
end_point = CoordPair(x=(t2.shape[-1]-1)//2, y=-1)

# Calculate the vertical cross section.  By setting latlon to True, this
# also calculates the latitude and longitude coordinates along the line
# and adds them to the metadata to help with plotting labels.
t2_line = interpline(t2, start_point=start_point, end_point=end_point, latlon=True)

print(t2_line, "\n")

Result:

<xarray.DataArray u'T2_line' (line_idx: 1058)>
array([ 302.07214355,  302.08505249,  302.08688354, ...,  279.18557739,
        279.1998291 ,  279.23132324], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (line_idx) object CoordPair(x=899.0, y=0.0, lat=24.3645858765, lon=-97.5) ...
Dimensions without coordinates: line_idx
Attributes:
    FieldType: 104
    description: TEMP at 2 M
    units: K
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (899.0, 0.0) to (899.0, 1057.0)

Example Using Pivot Point and Angle

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, interpline, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the 2m temperature
t2 = getvar(ncfile, "T2")

# Create a south-north line using pivot point and angle
pivot_point = CoordPair((t2.shape[-1]-1)//2, (t2.shape[-2]-1)//2)
angle = 0.0

# Calculate the vertical cross section.  By setting latlon to True, this
# also calculates the latitude and longitude coordinates along the line
# and adds them to the metadata to help with plotting labels.
t2_line = interpline(t2, pivot_point=pivot_point, angle=angle, latlon=True)

print(t2_line, "\n")

Result:

<xarray.DataArray u'T2_line' (line_idx: 1058)>
array([ 302.07214355,  302.08505249,  302.08688354, ...,  279.18557739,
        279.1998291 ,  279.23132324], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (line_idx) object CoordPair(x=899.0, y=0.0, lat=24.3645858765, lon=-97.5) ...
Dimensions without coordinates: line_idx
Attributes:
    FieldType: 104
    description: TEMP at 2 M
    units: K
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (899.0, 0.0) to (899.0, 1057.0) ; center=CoordPair(x=899, y=529) ; angle=0.0

Example Using Lat/Lon Coordinates

from __future__ import print_function, division

from netCDF4 import Dataset
from wrf import getvar, interpline, CoordPair

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

t2 = getvar(ncfile, "T2")
lats = getvar(ncfile, "lat")
lons = getvar(ncfile, "lon")

# Select the latitude,longitude points for a vertical line through
# the center of the domain.
start_lat = lats[0, (lats.shape[-1]-1)//2]
end_lat = lats[-1, (lats.shape[-1]-1)//2]
start_lon = lons[0, (lons.shape[-1]-1)//2]
end_lon = lons[-1, (lons.shape[-1]-1)//2]

# Create the CoordPairs
start_point = CoordPair(lat=start_lat, lon=start_lon)
end_point = CoordPair(lat=end_lat, lon=end_lon)

# Calculate the interpolated line.  To use latitude and longitude points,
# you must supply a WRF NetCDF file object, or a projection object along
# with the lower left latitude and lower left longitude points.
# Also, by setting latlon to True, this routine will
# also calculate the latitude and longitude coordinates along the line
# and adds them to the metadata to help with plotting labels.
t2_line = interpline(t2, wrfin=ncfile, start_point=start_point, end_point=end_point, latlon=True)

print (t2_line)

Result:

<xarray.DataArray u'T2_line' (line_idx: 1058)>
array([ 302.07214355,  302.08505249,  302.08688354, ...,  279.18557739,
        279.1998291 ,  279.23132324], dtype=float32)
Coordinates:
    Time      datetime64[ns] 2016-10-07
    xy_loc    (line_idx) object CoordPair(x=899.0, y=0.0, lat=24.3645858765, lon=-97.5) ...
Dimensions without coordinates: line_idx
Attributes:
    FieldType: 104
    description: TEMP at 2 M
    units: K
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    orientation: (899.0, 0.0) to (899.0, 1057.0)

Interpolating a 3D Field to a Surface Type

The wrf.vinterp() is used to interpolate a field to a type of surface. The available surfaces are pressure, geopotential height, theta, and theta-e. The surface levels to interpolate also need to be specified.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, vinterp

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

tk = getvar(ncfile, "tk")
# Interpolate tk to theta-e levels
interp_levels = [200, 300, 500, 1000]

interp_field = vinterp(ncfile,
               field=tk,
               vert_coord="eth",
               interp_levels=interp_levels,
               extrapolate=True,
               field_type="tk",
               log_p=True)

print(interp_field)

Result:

<xarray.DataArray u'temp' (interp_level: 4, south_north: 1059, west_east: 1799)>
array([[[ 296.12872314,  296.1166687 ,  296.08905029, ...,  301.71026611,
          301.67956543,  301.67791748],
        [ 296.11352539,  295.95581055,  295.91555786, ...,  301.63052368,
          301.62905884,  301.65887451],
        [ 296.07556152,  295.91577148,  295.88214111, ...,  301.61499023,
          301.60287476,  301.63961792],
        ...,
        [ 219.11134338,  219.08581543,  219.08602905, ...,  218.29879761,
          218.30923462,  218.3787384 ],
        [ 219.09260559,  219.07765198,  219.08340454, ...,  218.2855072 ,
          218.30444336,  218.37931824],
        [ 219.07936096,  219.08181763,  219.10089111, ...,  218.31173706,
          218.34288025,  218.3687439 ]]], dtype=float32)
Coordinates:
    XLONG         (south_north, west_east) float32 -122.72 -122.693 -122.666 ...
    XLAT          (south_north, west_east) float32 21.1381 21.1451 21.1521 ...
    Time          datetime64[ns] 2016-10-07
  * interp_level  (interp_level) int64 200 300 500 1000
Dimensions without coordinates: south_north, west_east
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: temperature
    units: K
    stagger:
    coordinates: XLONG XLAT
    projection: LambertConformal(stand_lon=-97.5, moad_cen_lat=38.5000038147,
                                 truelat1=38.5, truelat2=38.5, pole_lat=90.0,
                                 pole_lon=0.0)
    vert_interp_type: eth

Lat/Lon <-> XY Routines

wrf-python includes a set of routines for converting back and forth between latitude,longitude space and x,y space. The methods are wrf.xy_to_ll(), wrf.xy_to_ll_proj(), wrf.ll_to_xy(), wrf.ll_to_xy_proj(). The latitude, longitude, x, and y parameters to these methods can contain sequences if multiple points are desired to be converted.

Example With Single Coordinates

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, interpline, CoordPair, xy_to_ll, ll_to_xy

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

lat_lon = xy_to_ll(ncfile, 400, 200)

print(lat_lon)

x_y = ll_to_xy(ncfile, lat_lon[0], lat_lon[1])

print (x_y)

Result:

<xarray.DataArray u'latlon' (lat_lon: 2)>
array([  28.55816408, -112.67827617])
Coordinates:
  * lat_lon   (lat_lon) <U3 u'lat' u'lon'
    xy_coord  object CoordPair(x=400, y=200)
Dimensions without coordinates: idx


<xarray.DataArray u'xy' (x_y: 2)>
array([400, 200])
Coordinates:
    latlon_coord  object CoordPair(lat=28.5581640822, lon=-112.678276173)
  * x_y           (x_y) <U1 u'x' u'y'
Dimensions without coordinates: idx

Example With Multiple Coordinates

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, interpline, CoordPair, xy_to_ll, ll_to_xy

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

lat_lon = xy_to_ll(ncfile, [400,105], [200,205])

print(lat_lon)

x_y = ll_to_xy(ncfile, lat_lon[0,:], lat_lon[1,:])

print (x_y)

Result:

<xarray.DataArray u'latlon' (lat_lon: 2, idx: 2)>
array([[  28.55816408,   27.03835783],
       [-112.67827617, -121.36392174]])
Coordinates:
  * lat_lon   (lat_lon) <U3 u'lat' u'lon'
    xy_coord  (idx) object CoordPair(x=400, y=200) CoordPair(x=105, y=205)
Dimensions without coordinates: idx


<xarray.DataArray u'xy' (x_y: 2, idx: 2)>
array([[400, 105],
       [200, 205]])
Coordinates:
    latlon_coord  (idx) object CoordPair(lat=28.5581640822, lon=-112.678276173) ...
  * x_y           (x_y) <U1 u'x' u'y'
Dimensions without coordinates: idx

Mapping Helper Routines

wrf-python includes several routines to assist with plotting, primarily for obtaining the mapping object used for cartopy, basemap, and PyNGL. For all three plotting systems, the mapping object can be determined directly from a variable when using xarray, or can be obtained from the WRF output file(s) if xarray is turned off.

Also included are utilities for extracting the geographic boundaries directly from xarray variables. This can be useful in situations where you only want to work with subsets (slices) of a large domain, but don’t want to define the map projection over the subset region.

Cartopy Example Using a Variable

In this example, we’re going to extract the cartopy mapping object from a diagnostic variable (slp), the lat,lon coordinates, and the geographic boundaries. Next, we’re going to take a subset of the diagnostic variable and extract the geographic boundaries. Some of the variables will be printed for demonstration.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, get_cartopy, latlon_coords, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Use SLP for the example variable
slp = getvar(ncfile, "slp")

# Get the cartopy mapping object
cart_proj = get_cartopy(slp)

print (cart_proj)

# Get the latitude and longitude coordinate.  This is usually needed for plotting.
lats, lons = latlon_coords(slp)

# Get the geobounds for the SLP variable
bounds = geo_bounds(slp)

print (bounds)

# Get the geographic boundaries for a subset of the domain
slp_subset = slp[150:250, 150:250]
slp_subset_bounds = geo_bounds(slp_subset)

print (slp_subset_bounds)

Result:

<cartopy.crs.LambertConformal object at 0x115374290>
GeoBounds(CoordPair(lat=25.9246292114, lon=-119.675048828), CoordPair(lat=29.0761833191, lon=-117.46484375))
GeoBounds(CoordPair(lat=25.9246292114, lon=-119.675048828), CoordPair(lat=29.0761833191, lon=-117.46484375))

Cartopy Example Using WRF Output Files

In this example, the cartopy mapping object and geographic boundaries will be extracted directly from the netcdf variable.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import get_cartopy, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the cartopy mapping object from the netcdf file
cart_proj = get_cartopy(wrfin=ncfile)

print (cart_proj)

# Get the geobounds from the netcdf file (by default, uses XLAT, XLONG)
# You can supply a variable name to get the staggered boundaries
bounds = geo_bounds(wrfin=ncfile)

print (bounds)

Result:

<cartopy.crs.LambertConformal object at 0x11d3be650>
GeoBounds(CoordPair(lat=21.1381225586, lon=-122.719528198), CoordPair(lat=47.8436355591, lon=-60.9013671875))

Basemap Example Using a Variable

In this example, we’re going to extract the basemap mapping object from a diagnostic variable (slp), the lat,lon coordinates, and the geographic boundaries. Next, we’re going to take a subset of the diagnostic variable and extract the geographic boundaries. Some of the variables will be printed for demonstration.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, get_basemap, latlon_coords, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

slp = getvar(ncfile, "slp")

# Get the basemap mapping object
bm = get_basemap(slp)

print (bm)

# Get the latitude and longitude coordinate.  This is usually needed for plotting.
lats, lons = latlon_coords(slp)

# Get the geobounds for the SLP variable
bounds = geo_bounds(slp)

print(bounds)

# Get the geographic boundaries for a subset of the domain
slp_subset = slp[150:250, 150:250]
slp_subset_bounds = geo_bounds(slp_subset)

print (slp_subset_bounds)

Result:

<mpl_toolkits.basemap.Basemap object at 0x114d65650>
GeoBounds(CoordPair(lat=21.1381225586, lon=-122.719528198), CoordPair(lat=47.8436355591, lon=-60.9013671875))
GeoBounds(CoordPair(lat=25.9246292114, lon=-119.675048828), CoordPair(lat=29.0761833191, lon=-117.46484375)

Basemap Example Using WRF Output Files

In this example, the basemap mapping object and geographic boundaries will be extracted directly from the netcdf variable.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import get_basemap, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the basemap object from the netcdf file
bm = get_basemap(wrfin=ncfile)

print (bm)

# Get the geographic boundaries from the netcdf file
bounds = geo_bounds(wrfin=ncfile)

print (bounds)

Result:

<mpl_toolkits.basemap.Basemap object at 0x125bb4750>
GeoBounds(CoordPair(lat=21.1381225586, lon=-122.719528198), CoordPair(lat=47.8436355591, lon=-60.9013671875))

PyNGL Example Using a Variable

In this example, we’re going to extract the basemap mapping object from a diagnostic variable (slp), the lat,lon coordinates, and the geographic boundaries. Next, we’re going to take a subset of the diagnostic variable and extract the geographic boundaries. Some of the variables will be printed for demonstration.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import getvar, get_pyngl, latlon_coords, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Use SLP as the example variable
slp = getvar(ncfile, "slp")

# Get the pyngl resources from the variable
pyngl_resources = get_pyngl(slp)

print (pyngl_resources)

# Get the latitude and longitude coordinate.  This is needed for plotting.
lats, lons = latlon_coords(slp)

# Get the geobounds from the SLP variable
bounds = geo_bounds(slp)

print(bounds)

# Get the geographic boundaries for a subset of the domain
slp_subset = slp[150:250, 150:250]
slp_subset_bounds = geo_bounds(slp_subset)

print (slp_subset_bounds)

Result:

<Ngl.Resources instance at 0x114cabbd8>
GeoBounds(CoordPair(lat=21.1381225586, lon=-122.719528198), CoordPair(lat=47.8436355591, lon=-60.9013671875))
GeoBounds(CoordPair(lat=25.9246292114, lon=-119.675048828), CoordPair(lat=29.0761833191, lon=-117.46484375))

PyNGL Example Using WRF Output Files

In this example, the basemap mapping object and geographic boundaries will be extracted directly from the netcdf variable.

from __future__ import print_function

from netCDF4 import Dataset
from wrf import get_pyngl, geo_bounds

ncfile = Dataset("wrfout_d01_2016-10-07_00_00_00")

# Get the pyngl resources from the netcdf file
pyngl_resources = get_pyngl(wrfin=ncfile)

print (pyngl_resources)

# Get the geographic boundaries from the netcdf file
bounds = geo_bounds(wrfin=ncfile)

print (bounds)

Result:

<Ngl.Resources instance at 0x115391f80>
GeoBounds(CoordPair(lat=21.1381225586, lon=-122.719528198), CoordPair(lat=47.8436355591, lon=-60.9013671875))

Moving Nests

When a domain nest is moving, the domain boundaries become a function of time when combining the files using the ‘cat’ method. When using ‘join’, the domain boundaries become a function of both file and time. As a result, the methods that depend on geographic boundaries (wrf.geo_bounds(), wrf.get_basemap(), etc) will return arrays of objects rather than a single object when multiple times and/or files are detected in the underlying coordinate variables.

An exception is wrf.get_cartopy(), which contains no geographic boundary information in the mapping object. Instead, the wrf.cartopy_xlim() and wrf.cartopy_ylim() methods can be used to get the array of matplotlib axes boundaries (returned in the axes projection coordinates).

Geographic Boundaries with Moving Nest Example

In this example, the geographic boundaries are extracted from a sequence of files that use a moving nest. The result will be an array of wrf.GeoBounds objects.

from __future__ import print_function

from glob import glob
from netCDF4 import Dataset as nc

from wrf import getvar, ALL_TIMES, geo_bounds

# Get all the domain 02 files
wrf_filenames = glob("wrf_files/wrf_vortex_multi/wrfout_d02_*")
ncfiles = [nc(x) for x in wrf_filenames]

# SLP is the example variable and includes all times
slp = getvar(ncfiles, "slp", timeidx=ALL_TIMES)

# Get the geographic boundaries
bounds = geo_bounds(slp)
print (bounds)

Result:

[ GeoBounds(CoordPair(lat=21.3020038605, lon=-90.5740585327), CoordPair(lat=29.0274410248, lon=-82.0291671753))
 GeoBounds(CoordPair(lat=21.3020038605, lon=-90.3042221069), CoordPair(lat=29.0274410248, lon=-81.7593231201))
 GeoBounds(CoordPair(lat=21.3020038605, lon=-90.8438949585), CoordPair(lat=29.0274410248, lon=-82.2990036011))
 GeoBounds(CoordPair(lat=21.3020038605, lon=-91.1137390137), CoordPair(lat=29.0274410248, lon=-82.5688400269))
 GeoBounds(CoordPair(lat=21.8039493561, lon=-91.6534042358), CoordPair(lat=29.4982528687, lon=-83.1085205078))
 GeoBounds(CoordPair(lat=22.0542640686, lon=-92.193107605), CoordPair(lat=29.7328338623, lon=-83.6481933594))
 GeoBounds(CoordPair(lat=22.5535621643, lon=-92.7327728271), CoordPair(lat=30.2003688812, lon=-84.1878738403))
 GeoBounds(CoordPair(lat=22.8025398254, lon=-93.0026092529), CoordPair(lat=30.4333114624, lon=-84.4577102661))
 GeoBounds(CoordPair(lat=23.0510597229, lon=-93.2724456787), CoordPair(lat=30.665681839, lon=-84.7275543213))]

Cartopy Mapping with Moving Nest Example

In this example, a cartopy mapping object is extracted from a variable that uses a moving nest. Since cartopy objects do not include geographic boundary information, only a single cartopy object is returned. However, if the axes xlimits and ylimits are desired, the wrf.cartopy_xlim() and wrf.cartopy_ylim() functions can be used to obtain the array of moving boundaries in the axes projected coordinate space.

from __future__ import print_function

from glob import glob
from netCDF4 import Dataset as nc

from wrf import getvar, ALL_TIMES, get_cartopy, cartopy_xlim, cartopy_ylim

# Get all of the domain 02 WRF output files
wrf_filenames = glob("wrf_files/wrf_vortex_multi/wrfout_d02_*")
ncfiles = [nc(x) for x in wrf_filenames]

# Use SLP as the example variable and include all times
slp = getvar(ncfiles, "slp", timeidx=ALL_TIMES)

# Get the cartopy mapping object
cart_proj = get_cartopy(slp)
print (cart_proj)
print ("\n")

# Get the array of axes x-limits
xlims = cartopy_xlim(slp)
print (xlims)
print ("\n")

# Get the array of axes y-limits
ylims = cartopy_ylim(slp)
print (ylims)

Result:

<wrf.projection.MercatorWithLatTS object at 0x13893c9b0>

[[-174999.8505754546, 774999.5806103835]
 [-145000.11853874932, 805000.1608638937]
 [-204999.58261215844, 744999.8485736783]
 [-235000.16286567, 715000.1165369744]
 [-294998.77872227144, 654999.804246759]
 [-355001.6356629085, 595000.34017335]
 [-415000.25151950994, 535000.0278831345]
 [-444999.98355621524, 505000.29584642925]
 [-474999.7155929191, 474999.7155929177]]

[[2424828.507236154, 3374828.14098255]
 [2424828.507236154, 3374828.14098255]
 [2424828.507236154, 3374828.14098255]
 [2424828.507236154, 3374828.14098255]
 [2484829.1182174017, 3434828.972518358]
 [2514829.1041871575, 3464828.196283651]
 [2574829.0041584675, 3524828.8880928173]
 [2604829.1786526926, 3554829.5610342724]
 [2634828.9016262344, 3584828.016406863]]

Basemap Mapping with Moving Nest Example

In this example, basemap objects are extracted from a variable that uses a moving nest. An array of basemap objects is returned because the basemap object includes geographic boundary information.

from __future__ import print_function

from glob import glob
from netCDF4 import Dataset as nc

from wrf import getvar, ALL_TIMES, get_basemap

# Get all of the domain 02 WRF output files
wrf_filenames = glob("wrf_files/wrf_vortex_multi/wrfout_d02_*")
ncfiles = [nc(x) for x in wrf_filenames]

# Use SLP as the reference variable and include all times
slp = getvar(ncfiles, "slp", timeidx=ALL_TIMES)

# Get the array of basemap objects
bm = get_basemap(slp)
print (bm)
print ("\n")

Result:

[<mpl_toolkits.basemap.Basemap object at 0x1327bc510>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a790>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a750>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a7d0>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a850>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a8d0>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a950>
 <mpl_toolkits.basemap.Basemap object at 0x115a9a9d0>
 <mpl_toolkits.basemap.Basemap object at 0x115a9aa50>]

PyNGL Mapping with Moving Nest Example

In this example, pyngl resource objects are extracted from a variable that uses a moving nest. An array of pyngl resource objects is returned because the pyngl object includes geographic boundary information.

from __future__ import print_function

from glob import glob
from netCDF4 import Dataset as nc

from wrf import getvar, ALL_TIMES, get_pyngl

# Get the domain 02 WRF output files
wrf_filenames = glob("wrf_files/wrf_vortex_multi/wrfout_d02_*")
ncfiles = [nc(x) for x in wrf_filenames]

# Use SLP as the example variable and include all times
slp = getvar(ncfiles, "slp", timeidx=ALL_TIMES)

# Get the array of pyngl resource objects
bm = get_pyngl(slp)
print (bm)
print ("\n")

Result:

[<Ngl.Resources instance at 0x140cd30e0>
 <Ngl.Resources instance at 0x11d3187a0>
 <Ngl.Resources instance at 0x11d3185a8>
 <Ngl.Resources instance at 0x11d3188c0>
 <Ngl.Resources instance at 0x11d318878>
 <Ngl.Resources instance at 0x11d3183f8>
 <Ngl.Resources instance at 0x11d318950>
 <Ngl.Resources instance at 0x11d318a70>
 <Ngl.Resources instance at 0x11d318710>]

Using OpenMP

Beginning in version 1.1, the Fortran computational routines in wrf-python make use of OpenMP directives. OpenMP enables the calculations to use multiple CPU cores, which can improve performance. In order to use OpenMP features, wrf-python has to be compiled with OpenMP enabled (most pre-built binary installations will have this enabled).

The Fortran computational routines have all been built using runtime scheduling, instead of compile time scheduling, so that the user can choose the scheduler type within their Python application. By default, the scheduling type is set to wrf.OMP_SCHED_STATIC using only 1 CPU core, so wrf-python will behave similarly to the non-OpenMP built versions. For the most part, the difference between the scheduling types is minimal, with the exception being the wrf.OMP_SCHED_DYNAMIC scheduler that is much slower due to the additional overhead associated with it. For new users, using the default scheduler should be sufficient.

Verifying that OpenMP is Enabled

To take advantage of the performance improvements offered by OpenMP, wrf-python needs to have been compiled with OpenMP features enabled. The example below shows how you can determine if OpenMP is enabled in your build of wrf-python.

from __future__ import print_function

from wrf import omp_enabled

print(omp_enabled())

Result:

True

Determining the Number of Available Processors

The example below shows how you can get the maximum number of processors that are available on your system.

from __future__ import print_function

from wrf import omp_get_num_procs

print(omp_get_num_procs())

Result:

8

Specifying the Number of Threads

To enable multicore support via OpenMP, specifying the maximum number of OpenMP threads (i.e. CPU cores) is the only step that you need to take.

In the example below, wrf.omp_set_num_threads() is used to set the maximum number of threads to use, and wrf.omp_get_max_threads() is used to retrieve (and print) the maximum number of threads used.

Note

Although there is an OpenMP routine named wrf.omp_get_num_threads(), this routine will always return 1 when called from the sequential part of the program. Use wrf.omp_get_max_threads() to return the value set by wrf.omp_set_num_threads().

from __future__ import print_function

from wrf import omp_set_num_threads, omp_get_max_threads

omp_set_num_threads(4)

print (omp_get_max_threads())

Result:

4

Setting a Different Scheduler Type

When an OpenMP directive is encountered in the Fortran code, a scheduler is used to determine how the work is divided among the threads. All of the Fortran routines are compiled to use a ‘runtime’ scheduler, which indicates that the scheduler type (from the four listed below) is to be chosen at runtime (i.e. inside a Python script)

By default, the scheduler chosen is the wrf.OMP_SCHED_STATIC scheduler, which should be sufficient for most users. However, OpenMP and wrf-python include the following options for the scheduler type:

  • wrf.OMP_SCHED_STATIC
  • wrf.OMP_SCHED_DYNAMIC
  • wrf.OMP_SCHED_GUIDED
  • wrf.OMP_SCHED_AUTO

Refer to the OpenMP Specification. for more information about these scheduler types. In local testing, wrf.OMP_SCHED_GUIDED produced the best results, but differences between wrf.OMP_SCHED_STATIC, wrf.OMP_SCHED_GUIDED, and wrf.OMP_SCHED_AUTO were minor. However, wrf.OMP_SCHED_DYNAMIC produced noticeably slower results due to the overhead of using a dynamic scheduler.

When setting a scheduler type, the wrf.omp_set_schedule() takes two arguments. The first is the scheduler type (one from the list above), and the second optional argument is a modifier, which is usually referred as the chunk size. If the modifier/chunk_size is set to 0, then the OpenMP default implementation is used. For wrf.OMP_SCHED_AUTO, the modifier is ignored.

If you are new to OpenMP and all this sounds confusing, don’t worry about setting a scheduler type. The default static scheduler will be good enough.

In the example below, the scheduler type is set to wrf.OMP_SCHED_GUIDED and uses the default chunk size of 0. The scheduler type is then read back using wrf.omp_get_schedule() and printed.

from __future__ import print_function

from wrf import omp_set_schedule, omp_get_schedule, OMP_SCHED_GUIDED

omp_set_schedule(OMP_SCHED_GUIDED, 0)

sched, modifier = omp_get_schedule()

print(sched, modifier)

Result:

3 1

Notice that the printed scheduler type (sched variable) is set to a value of 3, which is the actual integer constant value for the wrf.OMP_SCHED_GUIDED scheduler type. The modifier is returned as a value of 1, which is different than the 0 that was supplied to the wrf.omp_set_schedule() routine. This is because the 0 tells OpenMP to use its own default value for the scheduler, which is 1 for this type of scheduler.

Performance Tips

Memory Issues with wrf.ALL_TIMES

The use of wrf.ALL_TIMES for the timeidx parameter to wrf.getvar() is convenient for computing diagnostic variables across multiple files/times, but there is something that users should be aware of. When wrf.ALL_TIMES is set as the timeidx argument, all arrays used in the computation are extracted for all times before the computation is started. This can cause serious memory issues on smaller hardware systems like laptops.

In this example, the user wants to use a data set that is 289 x 39 x 300 x 300 and compute z for the entire data set. The user is using a laptop with 8 GB of memory.

from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

file_list = [Dataset("/path/to/file1"), Dataset("/path/to/file2"),...]
z = getvar(file_list, "z", ALL_TIMES)

Five hours later, the computation finished. What happened?

In wrf-python, all of the computational routines use 8-byte REAL variables so that both the 4-byte and 8-byte version of WRF output can be used. The calculation for z extracts three variables (P, PHB, and HGT) and returns a fourth array (RESULT). The RESULT will get cut in half to 4-byte REALs after the computation, but needs an 8-byte REAL when the result is computed.

Let’s look at the approximate amount memory needed:

P: 289 x 39 x 300 x 300 x 8 = 8,115,120,000 bytes (~8 GB!)

PHB: 289 x 39 x 300 x 300 x 8 = 8,115,120,000 bytes (~8 GB!)

HGT: 289 x 300 x 300 x 8 = 208,080,000 (~208 MB)

RESULT: 289 x 39 x 300 x 300 x 8 = 8,115,120,000 bytes (~8 GB!)

Yikes! So, in order to do this calculation using wrf.ALL_TIMES as the timeidx, over 24.2 GB are needed for this one calculation. When the laptop runs out of memory, it begins using the hard drive for swap memory, which runs hundreds of times slower than real memory.

To fix this situation, it is better to allocate the output array yourself and run the calculation for each time step in a loop (“loop-and-fill”). The required memory requirements change to:

(Note: only need to store the result in a 4-byte REAL)

FINAL_RESULT: 289 x 39 x 300 x 300 x 4 = 4,057560,000 bytes (~4 GB)

(Note: the numbers below are for each loop iteration)

P: 39 x 300 x 300 x 8 = 28,080,000 bytes (~28 MB)

PHB: 39 x 300 x 300 x 8 = 28,080,000 bytes (~28 MB)

HGT: 300 x 300 x 8 = 720,000 bytes (720 KB)

RESULT: 39 x 300 x 300 x 8 = 28,080,000 bytes (~28 MB)

Since the memory for the computation is deleted after each loop iteration, the total memory usage drops to approximately 4.1 GB.

The moral of the story is that you need to make sure that your system has enough memory to extract everything it needs up front if you want to use wrf.ALL_TIMES, otherwise it is better to “loop-and-fill” yourself.

Here is an example of the “loop-and-fill” technique:

from __future__ import print_function, division

import numpy as np
from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES

filename_list = ["/path/to/file1", "/path/to/file2",...]

# Result shape (hard coded for this example)
result_shape = (289, 39, 300, 300)

# Only need 4-byte floats
z_final = np.empty(result_shape, np.float32)

# Modify this number if using more than 1 time per file
times_per_file = 1

for timeidx in range(result_shape[0]):
    # Compute the file index and the time index inside the file
    fileidx = timeidx // times_per_file
    file_timeidx = timeidx % times_per_file

    f = Dataset(filename_list[fileidx])
    z = getvar(f, "z", file_timeidx)

    z_final[timeidx,:] = z[:]
    f.close()

The cache Argument for wrf.getvar()

If you have read through the documentation, you may have noticed that the wrf.getvar() routine contains a cache argument. What is this for?

Internally, if metadata is turned on, a variable is extracted from the NetCDF file and its metadata is copied to form the result’s metadata. Often this variable is one of the computation’s function arguments, so rather than spend time extracting the variable again for the computation, it is placed in a cache (dictionary) and passed on to the computational function.

What isn’t widely known is that this cache argument can also be supplied by end users wishing to speed up their application. This can be useful in situations where numerous calculations are being performed on the same data set. For many algorithms, the time needed to extract the arrays from the NetCDF file is on par with the time needed to perform the calculation. If you are computing numerous diagnostics, extracting the variables up front allows you to only pay this extraction penalty once, rather than inside of each call to wrf.getvar().

The cache is nothing more than a dictionary where each key is the variable name (e.g. “P”) and the value is the xarray.DataArray or numpy.ndarray variable. Creating the cache dictionary is easy, since the wrf.extract_vars() routine returns a dictionary for a sequence of variables.

Note

The timeidx parameter supplied to extract_vars() must be the same timeidx parameter that you plan to use for wrf.getvar(). Otherwise, it will crash with dimension mismatch errors.

Some common variables that you can use to create an effective cache are: P, PB, PH, PHB, T, QVAPOR, HGT, PSFC, U, V, W.

Below is an example showing the same computation done with and without the cache. The execution time is printed. The hardware used is a 2.8 GHz Intel Core i7, which contains 4 CPU cores with 2 hyper threads (8 total threads). This will be interpreted as 8 CPUs for OpenMP.

from __future__ import print_function

import time
from netCDF4 import Dataset
from wrf import getvar, ALL_TIMES, extract_vars

# The first two files contain four times, the last file contains only one.
wrf_filenames = ["/path/to/wrfout_d02_2005-08-28_00:00:00",
                 "/path/to/wrfout_d02_2005-08-28_12:00:00",
                 "/path/to/wrfout_d02_2005-08-29_00:00:00"]

wrfin = [Dataset(x) for x in wrf_filenames]

start = time.time()
my_cache = extract_vars(wrfin, ALL_TIMES, ("P", "PSFC", "PB", "PH", "PHB",
                                           "T", "QVAPOR", "HGT", "U", "V",
                                           "W"))
end = time.time()
print ("Time taken to build cache: ", (end-start), "s")

vars = ("avo", "eth", "cape_2d", "cape_3d", "ctt", "dbz", "mdbz",
        "geopt", "helicity", "lat", "lon", "omg", "p", "pressure",
        "pvo", "pw", "rh2", "rh", "slp", "ter", "td2", "td", "tc",
        "theta", "tk", "tv", "twb", "updraft_helicity", "ua", "va",
        "wa", "uvmet10", "uvmet", "z", "cfrac", "zstag", "geopt_stag")

# No cache
start = time.time()
for var in vars:
    v = getvar(wrfin, var, ALL_TIMES)
end = time.time()
no_cache_time = (end-start)

print ("Time taken without variable cache: ", no_cache_time, "s")

# With a cache
start = time.time()
for var in vars:
    v = getvar(wrfin, var, ALL_TIMES, cache=my_cache)
end = time.time()
cache_time = (end-start)

print ("Time taken with variable cache: ", cache_time, "s")

improvement = ((no_cache_time-cache_time)/no_cache_time) * 100
print ("The cache decreased computation time by: ", improvement, "%")

Result:

Time taken to build cache:  0.28154706955 s
Time taken without variable cache:  11.0905270576 s
Time taken with variable cache:  8.25931215286 s
The cache decreased computation time by:  25.5282268378 %

By removing the repeated extraction of common variables in the wrf.getvar() routine, for the single threaded case, the computation time has been reduced by 25.5% in this particular example.

Things get more interesting when OpenMP is turned on and set to use the maximum number of processors (in this case 8 threads are used).

from __future__ import print_function

import time
from netCDF4 import Dataset
from wrf import (getvar, ALL_TIMES, extract_vars,
                 omp_set_num_threads, omp_get_num_procs)

# The first two files contain four times, the last file contains only one.
wrf_filenames = ["/path/to/wrfout_d02_2005-08-28_00:00:00",
                 "/path/to/wrfout_d02_2005-08-28_12:00:00",
                 "/path/to/wrfout_d02_2005-08-29_00:00:00"]

wrfin = [Dataset(x) for x in wrf_filenames]

start = time.time()
my_cache = extract_vars(wrfin, ALL_TIMES, ("P", "PSFC", "PB", "PH", "PHB",
                                           "T", "QVAPOR", "HGT", "U", "V",
                                           "W"))
end = time.time()
print ("Time taken to build cache: ", (end-start), "s")

omp_set_num_threads(omp_get_num_procs())

vars = ("avo", "eth", "cape_2d", "cape_3d", "ctt", "dbz", "mdbz",
        "geopt", "helicity", "lat", "lon", "omg", "p", "pressure",
        "pvo", "pw", "rh2", "rh", "slp", "ter", "td2", "td", "tc",
        "theta", "tk", "tv", "twb", "updraft_helicity", "ua", "va",
        "wa", "uvmet10", "uvmet", "z", "cfrac", "zstag", "geopt_stag")

# No cache
start = time.time()
for var in vars:
    v = getvar(wrfin, var, ALL_TIMES)
end = time.time()
no_cache_time = (end-start)

print ("Time taken without variable cache: ", no_cache_time, "s")

# With a cache
start = time.time()
for var in vars:
    v = getvar(wrfin, var, ALL_TIMES, cache=my_cache)
end = time.time()
cache_time = (end-start)

print ("Time taken with variable cache: ", cache_time, "s")

improvement = ((no_cache_time-cache_time)/no_cache_time) * 100
print ("The cache decreased computation time by: ", improvement, "%")

Result:

Time taken to build cache:  0.2700548172 s
Time taken without variable cache:  6.02652812004 s
Time taken with variable cache:  3.27777099609 s
The cache decreased computation time by:  45.6109565772 %

In this example, 4 CPU cores (8 total threads) are used. When the cache is used, the computation time drops by 45%, so almost half the time was spent simply extracting variables from the NetCDF file. When compared to the 11.09 s needed to compute the single threaded case with no variable cache, the computation time drops by roughly 70% (compared to 45% with 8 threads but no cache).

In summary, if you are computing a lot of diagnostic variables, consider using the cache argument to improve performance, particularly if you want to maximize your multithreaded performance with OpenMP.