Weather and climate data
xarray can leverage metadata that follows the Climate and Forecast (CF) conventions when it is present. Examples include automatic labelling of plots with descriptive names and units (see Plotting) and support for non-standard calendars used in climate science through the cftime module (see Non-standard calendars and dates outside the Timestamp-valid range). A number of geoscience-focused projects also build on xarray (see Xarray related projects).
CF-compliant coordinate variables
MetPy adds a metpy accessor that allows accessing coordinates with appropriate CF metadata using the generic names x, y, vertical and time. There is also a cartopy_crs attribute that provides projection information, parsed from the appropriate CF metadata, as a Cartopy projection object. See the MetPy documentation for more information.
Non-standard calendars and dates outside the Timestamp-valid range
Through the standalone cftime library and a custom subclass of pandas.Index, xarray supports a subset of the indexing functionality enabled through the standard pandas.DatetimeIndex for dates from non-standard calendars commonly used in climate science, or for dates using a standard calendar but outside the Timestamp-valid range (approximately between years 1678 and 2262).
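The approximate bounds quoted above follow from how nanosecond-precision timestamps are stored: as a signed 64-bit count of nanoseconds from 1970-01-01. The following back-of-the-envelope sketch uses only the standard library; the mean Gregorian year length is an approximation chosen for illustration, so the result is a rough estimate rather than the exact limits pandas reports.

```python
# Rough estimate of the range covered by a signed 64-bit nanosecond counter
# centred on the Unix epoch (1970-01-01).
NS_PER_SECOND = 10**9
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.2425  # mean Gregorian year (approximation)

max_ns = 2**63 - 1  # largest value of a signed 64-bit integer
span_years = max_ns / (NS_PER_SECOND * SECONDS_PER_DAY * DAYS_PER_YEAR)

earliest = 1970 - span_years
latest = 1970 + span_years

print(f"span: about {span_years:.1f} years either side of 1970")
print(f"approximate range: {earliest:.1f} to {latest:.1f}")
```

The span comes out at a little over 292 years, which places the limits late in 1677 and early in 2262, consistent with the years given above.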
Note
As of xarray version 0.11, by default, cftime.datetime objects will be used to represent times (either in indexes, as a CFTimeIndex, or in data arrays with dtype object) if any of the following are true:
- The dates are from a non-standard calendar.
- Any dates are outside the Timestamp-valid range.
Otherwise pandas-compatible dates from a standard calendar will be represented with the np.datetime64[ns] data type, enabling the use of a pandas.DatetimeIndex or arrays with dtype np.datetime64[ns] and their full set of associated features.
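The rule in this note can be paraphrased as a small predicate. This is only an illustrative sketch, not xarray's implementation: would_use_cftime is a hypothetical helper, the set of calendar names treated as standard is an assumption, and the year bounds are the approximate ones quoted earlier.

```python
# Hypothetical paraphrase of the rule above; not xarray's actual code.
STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}  # assumed names
MIN_YEAR, MAX_YEAR = 1678, 2262  # approximate Timestamp-valid range from the text


def would_use_cftime(calendar, years):
    """Return True if either condition listed in the note holds."""
    non_standard = calendar.lower() not in STANDARD_CALENDARS
    out_of_range = any(year < MIN_YEAR or year > MAX_YEAR for year in years)
    return non_standard or out_of_range


print(would_use_cftime("noleap", [2000, 2001]))    # non-standard calendar
print(would_use_cftime("standard", [1, 2]))        # dates outside the valid range
print(would_use_cftime("standard", [2000, 2001]))  # neither condition holds
```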
For example, you can create a DataArray indexed by a time coordinate with dates from a no-leap calendar and a CFTimeIndex will automatically be used:
In [1]: from itertools import product
In [2]: from cftime import DatetimeNoLeap
In [3]: dates = [DatetimeNoLeap(year, month, 1) for year, month in
...: product(range(1, 3), range(1, 13))]
...:
In [4]: da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
xarray also includes a cftime_range() function, which enables creating a CFTimeIndex with regularly-spaced dates. For instance, we can create the same dates and DataArray we created above using:
In [5]: dates = xr.cftime_range(start='0001', periods=24, freq='MS', calendar='noleap')
In [6]: da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo')
With strftime() we can also easily generate formatted strings from the datetime values of a CFTimeIndex directly or through the dt accessor for a DataArray, using the same formatting as the standard datetime.strftime convention.
In [7]: dates.strftime('%c')
In [8]: da['time'].dt.strftime('%Y%m%d')
Out[8]:
<xarray.DataArray 'strftime' (time: 24)>
array([' 10101', ' 10201', ' 10301', ..., ' 21001', ' 21101', ' 21201'], dtype=object)
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
For data indexed by a CFTimeIndex, xarray currently supports:
- Partial datetime string indexing using strictly ISO 8601-format partial datetime strings:
In [9]: da.sel(time='0001')
In [10]: da.sel(time=slice('0001-05', '0002-02'))
- Access of basic datetime components via the dt accessor (in this case just “year”, “month”, “day”, “hour”, “minute”, “second”, “microsecond”, “season”, “dayofyear”, and “dayofweek”):
In [11]: da.time.dt.year
Out[11]:
<xarray.DataArray 'year' (time: 24)>
array([1, 1, 1, ..., 2, 2, 2])
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
In [12]: da.time.dt.month
Out[12]:
<xarray.DataArray 'month' (time: 24)>
array([ 1, 2, 3, ..., 10, 11, 12])
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
In [13]: da.time.dt.season
Out[13]:
<xarray.DataArray 'season' (time: 24)>
array(['DJF', 'DJF', 'MAM', ..., 'SON', 'SON', 'DJF'], dtype='<U3')
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
In [14]: da.time.dt.dayofyear
Out[14]:
<xarray.DataArray 'dayofyear' (time: 24)>
array([ 1, 32, 60, ..., 274, 305, 335])
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
In [15]: da.time.dt.dayofweek
Out[15]:
<xarray.DataArray 'dayofweek' (time: 24)>
array([1, 4, 4, ..., 2, 5, 0])
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
- Group-by operations based on datetime accessor attributes (e.g. by month of the year):
In [16]: da.groupby('time.month').sum()
Out[16]:
<xarray.DataArray 'foo' (month: 12)>
array([12, 14, 16, ..., 30, 32, 34])
Coordinates:
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
- Interpolation using cftime.datetime objects:
In [17]: da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)])
Out[17]:
<xarray.DataArray 'foo' (time: 2)>
array([0.452, 1.5 ])
Coordinates:
* time (time) object 0001-01-15 00:00:00 0001-02-15 00:00:00
- Interpolation using datetime strings:
In [18]: da.interp(time=['0001-01-15', '0001-02-15'])
- Differentiation:
In [19]: da.differentiate('time')
Out[19]:
<xarray.DataArray 'foo' (time: 24)>
array([3.734e-07, 3.944e-07, 3.944e-07, ..., 3.797e-07, 3.797e-07, 3.858e-07])
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
- Serialization:
In [20]: da.to_netcdf('example-no-leap.nc')
In [21]: xr.open_dataset('example-no-leap.nc')
Out[21]:
<xarray.Dataset>
Dimensions: (time: 24)
Coordinates:
* time (time) object 0001-01-01 00:00:00 ... 0002-12-01 00:00:00
Data variables:
foo (time) int64 ...
- And resampling along the time dimension for data indexed by a CFTimeIndex:
In [22]: da.resample(time='81T', closed='right', label='right', base=3).mean()
Out[22]:
<xarray.DataArray 'foo' (time: 12428)>
array([ 0., nan, nan, ..., nan, nan, 23.])
Coordinates:
* time (time) object 0001-01-01 00:03:00 ... 0002-12-01 00:30:00
Note
For some use-cases it may still be useful to convert from a CFTimeIndex to a pandas.DatetimeIndex, despite the difference in calendar types. The recommended way of doing this is to use the built-in to_datetimeindex() method:
In [23]: modern_times = xr.cftime_range('2000', periods=24, freq='MS', calendar='noleap')
In [24]: da = xr.DataArray(range(24), [('time', modern_times)])
In [25]: da
In [26]: datetimeindex = da.indexes['time'].to_datetimeindex()
In [27]: da['time'] = datetimeindex
However, in this case take care to perform only operations that do not depend on differences between dates. Operations that do depend on them (e.g. differentiation, interpolation, or upsampling with resample) could introduce subtle and silent errors due to the difference in calendar types between the dates encoded in your data and the dates stored in memory.