functions reference API

dimarray functions are listed below by topic, along with examples. DimArray methods are documented on a separate page, DimArray API.

Join
Align
Interpolate
Stats
Read netCDF data
dimarray options

Join

dimarray.stack(arrays, axis=None, keys=None, align=False, **kwargs)

Stack arrays along a new dimension (raises an error if the dimension already exists).

Parameters:

arrays : sequence or dict of arrays
axis : str, optional: new dimension along which to stack the arrays
keys : array-like, optional: stack axis values, useful if arrays is a sequence or a non-ordered dictionary
align : bool, optional: if True, align axes prior to stacking (default False)
**kwargs : optional keyword arguments passed to align, if align is True

Returns:

DimArray : joint array

See also

concatenate: join arrays along an existing dimension
swapaxes: to modify the position of the newly inserted axis

Examples

>>> from dimarray import DimArray, stack
>>> a = DimArray([1,2,3])
>>> b = DimArray([11,22,33])
>>> stack([a, b], axis='stackdim', keys=['a','b'])
dimarray: 6 non-null elements (0 null)
0 / stackdim (2): 'a' to 'b'
1 / x0 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33]])

dimarray.concatenate(arrays, axis=0, _no_check=False, align=False, **kwargs)

Concatenate several DimArrays along an existing dimension.

Parameters:

arrays : list of DimArrays: arrays to concatenate
axis : int or str: axis along which to concatenate (must exist)
align : bool, optional: align secondary axes before joining on the primary axis (default False)
**kwargs : optional keyword arguments passed to align, if align is True

Returns:

concatenated DimArray

See also

stack: join arrays along a new dimension
align: align arrays

Examples

1-D

>>> from dimarray import DimArray, concatenate
>>> a = DimArray([1,2,3], axes=[['a','b','c']])
>>> b = DimArray([4,5,6], axes=[['d','e','f']])
>>> concatenate((a, b))
dimarray: 6 non-null elements (0 null)
0 / x0 (6): 'a' to 'f'
array([1, 2, 3, 4, 5, 6])

2-D

>>> a = DimArray([[1,2,3],[11,22,33]])
>>> b = DimArray([[4,5,6],[44,55,66]])
>>> concatenate((a, b), axis=0)
dimarray: 12 non-null elements (0 null)
0 / x0 (4): 0 to 1
1 / x1 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33],
       [ 4,  5,  6],
       [44, 55, 66]])
>>> concatenate((a, b), axis='x1')
dimarray: 12 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (6): 0 to 2
array([[ 1,  2,  3,  4,  5,  6],
       [11, 22, 33, 44, 55, 66]])
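
The distinction between stack (new dimension) and concatenate (existing dimension) follows the numpy functions of the same names. The sketch below uses plain numpy arrays rather than DimArrays, so axis labels are not shown; it is only meant to illustrate how the resulting shapes differ.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([11, 22, 33])

# stack: a new leading dimension of length 2 is created
stacked = np.stack([a, b], axis=0)

# concatenate: the arrays are joined end to end along the existing axis 0
joined = np.concatenate([a, b], axis=0)

print(stacked.shape)  # (2, 3)
print(joined.shape)   # (6,)
```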

dimarray.stack_ds(datasets, axis, keys=None, align=False, **kwargs)

Stack datasets along a new dimension.

Parameters:

datasets : sequence or dict of datasets
axis : str: new dimension along which to stack the datasets
keys : optional: stack axis values, useful if datasets is a sequence or a non-ordered dictionary
align : bool, optional: if True, align axes (via reindexing) prior to stacking
**kwargs : optional keyword arguments passed to align, if align is True

Returns:

stacked dataset

See also

concatenate_ds, stack, sort_axis

Examples

>>> from dimarray import DimArray, Dataset, stack_ds
>>> a = DimArray([1,2,3], dims=('dima',))
>>> b = DimArray([11,22], dims=('dimb',))
>>> ds = Dataset(a=a,b=b) # dataset of 2 variables from an experiment
>>> ds2 = Dataset(a=a*2,b=b*2) # dataset of 2 variables from a second experiment
>>> stack_ds([ds, ds2], axis='stackdim', keys=['exp1','exp2'])
Dataset of 2 variables
0 / stackdim (2): 'exp1' to 'exp2'
1 / dima (3): 0 to 2
2 / dimb (2): 0 to 1
a: ('stackdim', 'dima')
b: ('stackdim', 'dimb')
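
With align=True, axes are reindexed onto common values before joining. The following is a rough sketch of that idea for two labelled 1-D series, written with plain numpy. It assumes an "outer" join (the union of labels) with missing entries filled by NaN, and it sorts the labels for simplicity. The helper `reindex` and the variable names are hypothetical and are not part of dimarray.

```python
import numpy as np

def reindex(labels, values, new_labels):
    """Place values on new_labels; labels absent from the input give NaN."""
    lookup = dict(zip(labels, values))
    return np.array([lookup.get(lab, np.nan) for lab in new_labels], dtype=float)

# two time series covering different years
years_a, vals_a = [2000, 2001, 2002], [1.0, 2.0, 3.0]
years_b, vals_b = [2001, 2002, 2003], [20.0, 30.0, 40.0]

# outer join: sorted union of both label sets
union = sorted(set(years_a) | set(years_b))

# once aligned, the series have the same length and can be stacked
stacked = np.stack([reindex(years_a, vals_a, union),
                    reindex(years_b, vals_b, union)])

print(union)          # [2000, 2001, 2002, 2003]
print(stacked.shape)  # (2, 4)
```

Without the alignment step the two series would have different labels on their shared axis, which is why stacking series of varying extent generally needs align=True.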

dimarray.concatenate_ds(datasets, axis=0, align=False, **kwargs)

Concatenate datasets along an existing dimension.

Parameters:

datasets : sequence of datasets
axis : axis along which to concatenate
align : bool, optional: if True, align secondary axes (via reindexing) prior to concatenating
**kwargs : optional keyword arguments passed to align, if align is True

Returns:

joint Dataset along axis

Note: an error is raised if any variable does not contain the required dimension.

See also

stack_ds, concatenate, sort_axis

Examples

>>> import dimarray as da
>>> from dimarray import Dataset, concatenate_ds
>>> a = da.zeros(axes=[list('abc')], dims=('x0',))  # 1-D DimArray
>>> b = da.zeros(axes=[list('abc'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds = Dataset(a=a,b=b) # dataset of 2 variables from an experiment
>>> a2 = da.ones(axes=[list('def')], dims=('x0',)) 
>>> b2 = da.ones(axes=[list('def'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds2 = Dataset(a=a2,b=b2) # dataset of 2 variables from a second experiment
>>> concatenate_ds([ds, ds2])
Dataset of 2 variables
0 / x0 (6): 'a' to 'f'
1 / x1 (2): 1 to 2
a: ('x0',)
b: ('x0', 'x1')

Align

dimarray.align_axes(*args, **kwargs): Deprecated; renamed to align.

dimarray.align_dims(*arrays)

Align the dimensions of a list of arrays so that they are ready for broadcasting.

Method: singleton axes are inserted at the right place and arrays are transposed where needed. Note: not part of the public API, but used in other dimarray modules.

Examples

>>> import dimarray as da
>>> import numpy as np
>>> x = da.DimArray(np.arange(2), dims=('x0',))
>>> y = da.DimArray(np.arange(3), dims=('x1',))
>>> da.align_dims(x, y)
[dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (1): None to None
array([[0],
       [1]]), dimarray: 3 non-null elements (0 null)
0 / x0 (1): None to None
1 / x1 (3): 0 to 2
array([[0, 1, 2]])]
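
The same singleton-axis technique can be sketched with plain numpy. Here the dimension names exist only in the comments, whereas dimarray tracks them on the arrays; this is an illustration of the idea, not the library's implementation.

```python
import numpy as np

x = np.arange(2)  # dimension 'x0'
y = np.arange(3)  # dimension 'x1'

# give both arrays the dimensions ('x0', 'x1'),
# using a length-1 axis wherever a dimension is missing
x2 = x[:, np.newaxis]  # shape (2, 1)
y2 = y[np.newaxis, :]  # shape (1, 3)

# the aligned arrays now broadcast against each other
total = x2 + y2
print(total.shape)  # (2, 3)
```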

dimarray.broadcast_arrays(*arrays)

Analogous to numpy.broadcast_arrays, but with looser requirements on input shapes, and it returns copies instead of views.

Parameters:	arrays : variable list of DimArrays
Returns:	list of DimArrays

Examples

Just like numpy's broadcast_arrays:

>>> import dimarray as da
>>> x = da.DimArray([[1,2,3]])
>>> y = da.DimArray([[1],[2],[3]])
>>> da.broadcast_arrays(x, y)
[dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]]), dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])]
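
The difference between views and copies can be seen with numpy alone. In this sketch, numpy.broadcast_arrays returns arrays that share memory with their inputs, whereas an explicit copy (the behaviour described above for dimarray) is independent of them.

```python
import numpy as np

x = np.array([[1, 2, 3]])
y = np.array([[1], [2], [3]])

# numpy returns broadcast views that share memory with the inputs
bx, by = np.broadcast_arrays(x, y)
print(np.shares_memory(bx, x))  # True

# a copy is independent: modifying it leaves the input untouched
cx = bx.copy()
cx[0, 0] = 99
print(x[0, 0])  # 1
```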

Interpolate

dimarray.interp2d(dim_array, newaxes, dims=(-2, -1), **kwargs)

Two-dimensional interpolation

Parameters:

dim_array : DimArray instance
newaxes : sequence of two array-like, or dict: axes on which to interpolate
dims : sequence of two axis names or integer ranks, optional: dimensions that match newaxes. By default (-2, -1) (last two dimensions).
**kwargs : passed to scipy.interpolate.RegularGridInterpolator:
method : 'nearest' or 'linear' (default)
bounds_error : True by default
fill_value : np.nan by default, but set to None to extrapolate outside bounds.

Returns:

dim_array_int : DimArray instance: interpolated array

Examples

>>> import numpy as np
>>> from dimarray import DimArray, interp2d
>>> x = np.array([0, 1, 2])
>>> y = np.array([0, 10])
>>> a = DimArray([[0,0,1],[1,0.,0.]], [('y',y),('x',x)])
>>> a
dimarray: 6 non-null elements (0 null)
0 / y (2): 0 to 10
1 / x (3): 0 to 2
array([[0., 0., 1.],
       [1., 0., 0.]])
>>> newx = [0.5, 1.5]
>>> newy = np.linspace(0,10,5)
>>> ai = interp2d(a, [newy, newx])
>>> ai
dimarray: 10 non-null elements (0 null)
0 / y (5): 0.0 to 10.0
1 / x (2): 0.5 to 1.5
array([[0.   , 0.5  ],
       [0.125, 0.375],
       [0.25 , 0.25 ],
       [0.375, 0.125],
       [0.5  , 0.   ]])

Use the dims keyword argument if the order of the new axes does not match the array dimensions:

>>> (ai == interp2d(a, [newx, newy], dims=('x','y'))).all()
True

Out-of-bounds values are filled with NaN:

>>> newx = [-1, 1]
>>> newy = [-5, 0, 10]
>>> interp2d(a, [newy, newx], bounds_error=False)
dimarray: 2 non-null elements (4 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[nan, nan],
       [nan,  0.],
       [nan,  0.]])

Nearest neighbor interpolation and out-of-bounds extrapolation:

>>> interp2d(a, [newy, newx], method='nearest', bounds_error=False, fill_value=None)
dimarray: 6 non-null elements (0 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[0., 0.],
       [0., 0.],
       [1., 0.]])
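
For intuition, the linear case on a regular grid can be reproduced with two passes of one-dimensional interpolation. The sketch below uses numpy.interp rather than scipy.interpolate.RegularGridInterpolator, so it is not how interp2d is implemented; note also that numpy.interp clamps to the edge values outside the grid instead of returning NaN. The function name `bilinear` is hypothetical.

```python
import numpy as np

def bilinear(values, y, x, newy, newx):
    """Linear interpolation on a regular grid; values has shape (len(y), len(x))."""
    # first interpolate every row along x
    along_x = np.array([np.interp(newx, x, row) for row in values])
    # then interpolate every resulting column along y
    return np.array([np.interp(newy, y, col) for col in along_x.T]).T

x = np.array([0, 1, 2])
y = np.array([0, 10])
values = np.array([[0., 0., 1.],
                   [1., 0., 0.]])

result = bilinear(values, y, x, np.linspace(0, 10, 5), [0.5, 1.5])
print(result)
```

Within the grid bounds, the result agrees with the linear interp2d output shown in the first example of this section.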

Stats

dimarray.percentile(a, pct, axis=0, newaxis=None, out=None, overwrite_input=False)

Calculate percentiles along an axis.

Parameters:

pct : float or sequence of floats: percentile or sequence of percentiles (between 0 and 100)
axis : optional, default 0: axis along which to compute percentiles
newaxis : optional: name of the new percentile axis, if more than one pct. By default, "_percentile" is appended to the name of the axis on which the transformation is applied.
out, overwrite_input : passed to numpy's percentile function (see its documentation)

Returns:

pctiles : DimArray or scalar whose required axis has been reduced or replaced by percentiles

Examples

>>> import numpy as np
>>> from dimarray import DimArray, percentile
>>> np.random.seed(0) # for reproducibility of results
>>> a = DimArray(np.random.randn(1000), dims=['sample'])
>>> percentile(a, 50)
-0.058028034799627745

>>> percentile(a, [50, 95])
dimarray: 2 non-null elements (0 null)
0 / sample_percentile (2): 50 to 95
array([-0.05802803,  1.66012041])
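
The "reduced or replaced" behaviour of the axis can be illustrated with numpy.percentile on a small deterministic array. This is a plain numpy sketch: the percentile axis it produces carries no name or labels, unlike the sample_percentile axis above.

```python
import numpy as np

# 5 samples (axis 0) of 3 variables (axis 1)
a = np.arange(1, 16).reshape(5, 3)

# a single percentile reduces the sample axis
median = np.percentile(a, 50, axis=0)

# several percentiles replace the sample axis by a percentile axis of length 2
pcts = np.percentile(a, [0, 100], axis=0)

print(median)      # [7. 8. 9.]
print(pcts.shape)  # (2, 3)
```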

Read netCDF data

dimarray.read_nc(f, names=None, *args, **kwargs)

Wrapper around DatasetOnDisk.read

Read one or several variables from one or several netCDF files.

Parameters:

f : str or netCDF handle: netCDF file to read from, or regular expression
names : None or list or str, optional: variable name(s) to read (default None)
indices : int, list or slice (single-dimensional indices), a tuple of those (multi-dimensional), or a dict of {axis name : axis indices}: Indices refer to Dataset axes. Any item that does not possess one of the dimensions will not be indexed along that dimension. For example, scalar items will be left unchanged whatever indices are provided.
indexing : {'label', 'position'}, optional: Indexing mode. "label": indexing on axis labels (default). "position": numpy-like position index. The default value can be changed in dimarray.rcParams['indexing.by'].
tol : float, optional: tolerance when looking for numerical values, e.g. to use nearest neighbor search (default None)
keepdims : bool, optional: keep singleton dimensions (default False)
axis : str, optional: When reading multiple files, axis along which to join the dimarrays or datasets. If the axis already exists, the resulting arrays will be concatenated; otherwise they will be stacked along a new axis (in the sense of the numpy functions concatenate and stack).
keys : sequence, optional: When reading multiple files, keys for the join axis. If the axis already exists in the dataset, the concatenated dataset/dimarray will be re-indexed along the provided keys; otherwise the keys will be used to create a new axis for stacking. In the latter case, the length of keys must exactly match the number of input files, and if keys is not provided, file names will be used instead. Note that you may manually rename the axes later, or use the set_axis method.
align : bool, optional: When reading multiple files, passed to stack (new axis) or concatenate (existing axis) to reindex all arrays onto common axes. In concatenate mode, only the secondary axes are re-indexed, not the concatenation axis. Default False.
**kwargs : optional keyword arguments passed to align, if align is True. This includes: sort (False by default) and join ('outer' by default)

Returns:

obj : DimArray or Dataset: depending on whether a (single) variable name is passed as argument (names) or not

See also

DatasetOnDisk.read, stack, concatenate, stack_ds, concatenate_ds, align, DimArray.write_nc, Dataset.write_nc

Examples

>>> import os
>>> import dimarray as da
>>> from dimarray import read_nc, get_datadir

Single netCDF file

>>> ncfile = os.path.join(get_datadir(), 'cmip5.CSIRO-Mk3-6-0.nc')

>>> data = read_nc(ncfile)  # load full file
>>> data
Dataset of 2 variables
0 / time (451): 1850 to 2300
1 / scenario (5): u'historical' to u'rcp85'
tsl: (u'time', u'scenario')
temp: (u'time', u'scenario')
>>> data = read_nc(ncfile,'temp') # only one variable
>>> data = read_nc(ncfile,'temp', indices={"time":slice(2000,2100), "scenario":"rcp45"})  # load only a chunk of the data
>>> data = read_nc(ncfile,'temp', indices={"time":1950.3}, tol=0.5)  #  approximate matching, adjust tolerance
>>> data = read_nc(ncfile,'temp', indices={"time":-1}, indexing='position')  #  integer position indexing
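
The three indexing variants used above (exact label, label with tolerance, integer position) can be sketched on a bare numpy axis. The helper `locate` is hypothetical and only approximates the behaviour described for the indexing and tol parameters; it is not dimarray's lookup code.

```python
import numpy as np

def locate(labels, value, tol=None):
    """Return the position of value in labels (hypothetical helper)."""
    labels = np.asarray(labels)
    if tol is None:
        # exact label match
        matches = np.nonzero(labels == value)[0]
        if matches.size == 0:
            raise KeyError(value)
        return int(matches[0])
    # nearest neighbour, accepted only if it lies within the tolerance
    pos = int(np.argmin(np.abs(labels - value)))
    if abs(labels[pos] - value) > tol:
        raise KeyError(value)
    return pos

time = np.arange(1850, 2301)            # axis labels, as in the example file
print(locate(time, 2000))               # 150, label indexing
print(locate(time, 1950.3, tol=0.5))    # 100, approximate matching
print(time[-1])                         # 2300, position indexing
```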

Multiple files

Read variable 'temp' across multiple files (representing various climate models). In this case the variable is a time series whose length may vary across experiments, so align=True is passed to reindex axes before stacking.

>>> direc = get_datadir()
>>> temp = da.read_nc(direc+'/cmip5.*.nc', 'temp', align=True, axis='model')

A new ‘model’ axis is created labeled with file names. It is then possible to rename it more appropriately, e.g. keeping only the part directly relevant to identify the experiment:

>>> getmodel = lambda x: os.path.basename(x).split('.')[1] # extract model name from path
>>> temp.set_axis(getmodel, axis='model') # would return a copy if inplace is not specified
>>> temp
dimarray: 9114 non-null elements (6671 null)
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
array(...)

This works on datasets as well:

>>> ds = da.read_nc(direc+'/cmip5.*.nc', align=True, axis='model')
>>> ds.set_axis(getmodel, axis='model')
>>> ds
Dataset of 2 variables
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
tsl: ('model', u'time', u'scenario')
temp: ('model', u'time', u'scenario')
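
The renaming step relies only on the file naming scheme cmip5.<model>.nc. As a standalone sketch, the same function can be applied to a list of paths to build labels in advance; the paths below are hypothetical, and passing such a list through the keys parameter would additionally assume that it is ordered like the files that are read.

```python
import os

def getmodel(path):
    """Extract the model name from a path such as '.../cmip5.<model>.nc'."""
    return os.path.basename(path).split('.')[1]

# hypothetical file paths, for illustration only
files = ['/data/cmip5.CSIRO-Mk3-6-0.nc', '/data/cmip5.MPI-ESM-MR.nc']
keys = [getmodel(f) for f in files]
print(keys)  # ['CSIRO-Mk3-6-0', 'MPI-ESM-MR']
```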

dimarray.summary_nc(fname, name=None, metadata=False): Print summary information about the content of a netCDF file. Deprecated, see dimarray.open_nc.

dimarray.get_datadir(): Return directory name for the datasets

dimarray.get_ncfile(fname='cmip5.CSIRO-Mk3-6-0.nc'): Return one netCDF file

dimarray options

dimarray.print_options()

dimarray.get_option(name)

dimarray.set_option(name, value): set global options
