Weather and climate data

xarray can leverage metadata that follows the Climate and Forecast (CF) conventions when it is present. Examples include automatic labelling of plots with descriptive names and units (see Plotting) and support for non-standard calendars used in climate science through the cftime module (see Non-standard calendars and dates outside the Timestamp-valid range). A number of geosciences-focused projects also build on xarray (see Xarray related projects).

CF-compliant coordinate variables

MetPy adds a metpy accessor that allows accessing coordinates with appropriate CF metadata using the generic names x, y, vertical and time. It also provides a cartopy_crs attribute that exposes projection information, parsed from the appropriate CF metadata, as a Cartopy projection object. See the MetPy documentation for more information.

Non-standard calendars and dates outside the Timestamp-valid range

Through the standalone cftime library and a custom subclass of pandas.Index, xarray supports a subset of the indexing functionality of the standard pandas.DatetimeIndex for two kinds of dates: dates from non-standard calendars commonly used in climate science, and dates that use a standard calendar but fall outside the Timestamp-valid range (approximately between years 1678 and 2262).
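
The approximate bounds of the Timestamp-valid range can be reproduced with a little arithmetic. The sketch below assumes that a nanosecond-precision timestamp is stored as a signed 64-bit count of nanoseconds since 1970-01-01; the names EPOCH and MAX_NS are illustrative only and are not part of xarray or pandas.

```python
from datetime import datetime, timedelta

# Assumption: nanosecond timestamps are a signed 64-bit offset from this epoch.
EPOCH = datetime(1970, 1, 1)
MAX_NS = 2**63 - 1  # largest value of a signed 64-bit integer

# datetime only resolves microseconds, so convert nanoseconds with floor division.
upper = EPOCH + timedelta(microseconds=MAX_NS // 1000)
lower = EPOCH + timedelta(microseconds=-MAX_NS // 1000)

print(lower)  # a date in late 1677
print(upper)  # a date in 2262
```

Roughly 292 years fit on either side of 1970, which is where the 1678 to 2262 window comes from.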

Note

As of xarray version 0.11, by default, cftime.datetime objects will be used to represent times (either in indexes, as a CFTimeIndex, or in data arrays with dtype object) if any of the following are true:

  • The dates are from a non-standard calendar.
  • Any dates are outside the Timestamp-valid range.

Otherwise pandas-compatible dates from a standard calendar will be represented with the np.datetime64[ns] data type, enabling the use of a pandas.DatetimeIndex or arrays with dtype np.datetime64[ns] and their full set of associated features.
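
The rule in this note can be restated as a small decision function. This is only a sketch of the logic described above, not xarray's implementation: uses_cftime, STANDARD_CALENDARS and the year bounds are hypothetical, and the set of calendar names treated as standard is an assumption based on common CF naming.

```python
# Hypothetical restatement of the rule above; none of these names come from xarray.
# Assumption: these CF calendar names count as "standard".
STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}

# Assumption: whole years that fit safely inside the approximate valid range.
VALID_YEARS = range(1678, 2262)


def uses_cftime(calendar, years):
    """Return True if cftime.datetime objects would be expected for these dates."""
    non_standard = calendar.lower() not in STANDARD_CALENDARS
    out_of_range = any(year not in VALID_YEARS for year in years)
    return non_standard or out_of_range


print(uses_cftime("noleap", [2000, 2001]))    # non-standard calendar
print(uses_cftime("standard", [2000, 2001]))  # standard calendar, in range
print(uses_cftime("standard", [1, 2]))        # standard calendar, out of range
```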

For example, you can create a DataArray indexed by a time coordinate with dates from a no-leap calendar and a CFTimeIndex will automatically be used:

In [1]: from itertools import product

In [2]: from cftime import DatetimeNoLeap

In [3]: dates = [DatetimeNoLeap(year, month, 1) for year, month in
   ...:          product(range(1, 3), range(1, 13))]
   ...: 

In [4]: da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
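
To make concrete what a no-leap calendar implies, the following pure-Python sketch computes the day of year for the first day of each month when every year has 365 days. The helper noleap_dayofyear is hypothetical and is used here for illustration only; it is not part of cftime or xarray.

```python
# Month lengths in a no-leap (365-day) calendar: February always has 28 days.
NOLEAP_MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def noleap_dayofyear(month, day):
    """Return the 1-based day of year for a date in a no-leap calendar."""
    return sum(NOLEAP_MONTH_LENGTHS[: month - 1]) + day


# Day of year for the first of each month, identical in every year.
first_of_month = [noleap_dayofyear(month, 1) for month in range(1, 13)]
print(first_of_month)
```

Because no year ever gains a 29 February, 1 March is always day 60 in this calendar.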

xarray also includes a cftime_range() function, which enables creating a CFTimeIndex with regularly-spaced dates. For instance, we can create the same dates and DataArray we created above using:

In [5]: dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')

In [6]: da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')

With strftime() we can also easily generate formatted strings from the datetime values of a CFTimeIndex, either directly or through the dt accessor of a DataArray, using the same format codes as the standard datetime.strftime convention.

In [7]: dates.strftime('%c')

In [8]: da['time'].dt.strftime('%Y%m%d')
Out[8]: 
<xarray.DataArray 'strftime' (time: 24)>
array(['   10101', '   10201', '   10301', ..., '   21001', '   21101', '   21201'], dtype=object)
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
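
For reference, the format codes are the ones used by the standard library. The sketch below uses a plain datetime.datetime rather than a cftime object, purely to illustrate the convention; the variable names are illustrative.

```python
from datetime import datetime

stamp = datetime(2000, 12, 1)

# %Y = four-digit year, %m = zero-padded month, %d = zero-padded day.
compact = stamp.strftime("%Y%m%d")
# %H and %M add the zero-padded hour and minute.
readable = stamp.strftime("%Y-%m-%d %H:%M")

print(compact)
print(readable)
```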

For data indexed by a CFTimeIndex, xarray currently supports:

  • Partial datetime string indexing using ISO 8601-format partial datetime strings:
In [9]: da.sel(time='0001')

In [10]: da.sel(time=slice('0001-05', '0002-02'))
  • Access of basic datetime components via the dt accessor (in this case just “year”, “month”, “day”, “hour”, “minute”, “second”, “microsecond”, “season”, “dayofyear”, and “dayofweek”):
In [11]: da.time.dt.year
Out[11]: 
<xarray.DataArray 'year' (time: 24)>
array([1, 1, 1, ..., 2, 2, 2])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00

In [12]: da.time.dt.month
Out[12]: 
<xarray.DataArray 'month' (time: 24)>
array([ 1,  2,  3, ..., 10, 11, 12])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00

In [13]: da.time.dt.season
Out[13]: 
<xarray.DataArray 'season' (time: 24)>
array(['DJF', 'DJF', 'MAM', ..., 'SON', 'SON', 'DJF'], dtype='<U3')
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00

In [14]: da.time.dt.dayofyear
Out[14]: 
<xarray.DataArray 'dayofyear' (time: 24)>
array([  1,  32,  60, ..., 274, 305, 335])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00

In [15]: da.time.dt.dayofweek
Out[15]: 
<xarray.DataArray 'dayofweek' (time: 24)>
array([1, 4, 4, ..., 2, 5, 0])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
  • Group-by operations based on datetime accessor attributes (e.g. by month of the year):
In [16]: da.groupby('time.month').sum()
Out[16]: 
<xarray.DataArray 'foo' (month: 12)>
array([12, 14, 16, ..., 30, 32, 34])
Coordinates:
  * month    (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
  • Interpolation using cftime.datetime objects:
In [17]: da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])
Out[17]: 
<xarray.DataArray 'foo' (time: 2)>
array([0.452, 1.5  ])
Coordinates:
  * time     (time) object 0001-01-15 00:00:00 0001-02-15 00:00:00
  • Interpolation using datetime strings:
In [18]: da.interp(time=['0001-01-15', '0001-02-15'])
  • Differentiation:
In [19]: da.differentiate('time')
Out[19]: 
<xarray.DataArray 'foo' (time: 24)>
array([3.734e-07, 3.944e-07, 3.944e-07, ..., 3.797e-07, 3.797e-07, 3.858e-07])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
  • Serialization:
In [20]: da.to_netcdf('example-no-leap.nc')

In [21]: xr.open_dataset('example-no-leap.nc')
Out[21]: 
<xarray.Dataset>
Dimensions:  (time: 24)
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
Data variables:
    foo      (time) int64 ...
  • And resampling along the time dimension for data indexed by a CFTimeIndex:
In [22]: da.resample(time='81T', closed='right', label='right', base=3).mean()
Out[22]: 
<xarray.DataArray 'foo' (time: 12428)>
array([ 0., nan, nan, ..., nan, nan, 23.])
Coordinates:
  * time     (time) object 0001-01-01 00:03:00 ... 0002-12-01 00:30:00
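
The interpolation and differentiation values printed above can be checked by hand from the no-leap month lengths. This sketch assumes linear interpolation, a one-sided difference at the first point, and a derivative expressed per second; those assumptions are inferred from the printed values, not taken from xarray's source.

```python
SECONDS_PER_DAY = 86400
JAN_DAYS, FEB_DAYS = 31, 28  # month lengths in a no-leap calendar

# Linear interpolation on 15 January, between value 0 (1 Jan) and value 1 (1 Feb).
jan_15 = 0 + (15 - 1) / JAN_DAYS
# Linear interpolation on 15 February, between value 1 (1 Feb) and value 2 (1 Mar).
feb_15 = 1 + (15 - 1) / FEB_DAYS

# One-sided difference at the first point: the data rises by 1 over January,
# expressed per second.
first_slope = (1 - 0) / (JAN_DAYS * SECONDS_PER_DAY)

print(round(jan_15, 3), feb_15, f"{first_slope:.3e}")
```

In a standard calendar with a leap year, the February interval would be 29 days long and the second interpolated value would differ, which is why the calendar type matters for these operations.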

Note

For some use-cases it may still be useful to convert from a CFTimeIndex to a pandas.DatetimeIndex, despite the difference in calendar types. The recommended way of doing this is to use the built-in to_datetimeindex() method:

In [23]: modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')

In [24]: da = xr.DataArray(range(24), [('time', modern_times)])

In [25]: da

In [26]: datetimeindex = da.indexes['time'].to_datetimeindex()

In [27]: da['time'] = datetimeindex

However, in this case take care to perform only operations that do not depend on differences between dates. Operations that do depend on them, such as differentiation, interpolation, or upsampling with resample, could introduce subtle and silent errors because the calendar type of the dates encoded in your data differs from that of the dates stored in memory.
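
A small worked example shows how such an error can arise. The sketch compares the gap between 1 February and 1 March 2000 in the standard calendar, computed with the standard library, against the same gap in a no-leap calendar; the constant NOLEAP_FEB_DAYS is illustrative only.

```python
from datetime import date

# Standard calendar: 2000 is a leap year, so February has 29 days.
standard_gap = (date(2000, 3, 1) - date(2000, 2, 1)).days

# No-leap calendar: February always has 28 days, so the gap is one day shorter.
NOLEAP_FEB_DAYS = 28
noleap_gap = NOLEAP_FEB_DAYS

print(standard_gap, noleap_gap)
```

After conversion to a pandas.DatetimeIndex, any operation that uses this time difference sees 29 days where the original data implied 28.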