functions reference API

dimarray functions are listed below by topic, along with examples. DimArray methods are documented on a separate page, DimArray API.

Join

dimarray.stack(arrays, axis=None, keys=None, align=False, **kwargs)

stack arrays along a new dimension (raise error if already existing)

Parameters:

arrays : sequence or dict of arrays

axis : str, optional

new dimension along which to stack the array

keys : array-like, optional

stack axis values, useful if array is a sequence, or a non-ordered dictionary

align : bool, optional

if True, align axes prior to stacking (default: False)

**kwargs : optional key-word arguments passed to align, if align is True

Returns:

DimArray : joint array

See also

concatenate
join arrays along an existing dimension
swapaxes
to modify the position of the newly inserted axis

Examples

>>> from dimarray import DimArray, stack
>>> a = DimArray([1,2,3])
>>> b = DimArray([11,22,33])
>>> stack([a, b], axis='stackdim', keys=['a','b'])
dimarray: 6 non-null elements (0 null)
0 / stackdim (2): 'a' to 'b'
1 / x0 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33]])
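
To make the semantics concrete, here is a minimal pure-Python sketch of what stacking along a new labelled dimension amounts to. It does not use dimarray: the helper name `stack_labelled` and its dict-based return value are hypothetical and only mirror the example above.

```python
def stack_labelled(arrays, axis, keys):
    """Stack equally long 1-D lists along a new leading dimension.

    Hypothetical helper for illustration only; not part of dimarray.
    """
    if len(keys) != len(arrays):
        raise ValueError("need exactly one key per array")
    length = len(arrays[0])
    if any(len(a) != length for a in arrays):
        raise ValueError("arrays must share the same shape (or be aligned first)")
    return {
        "dims": (axis, "x0"),  # the new dimension comes first
        "labels": {axis: list(keys), "x0": list(range(length))},
        "values": [list(a) for a in arrays],
    }


stacked = stack_labelled([[1, 2, 3], [11, 22, 33]], axis="stackdim", keys=["a", "b"])
print(stacked["dims"])    # ('stackdim', 'x0')
print(stacked["values"])  # [[1, 2, 3], [11, 22, 33]]
```

The result has one more dimension than the inputs, and the keys become the labels of the new axis.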

dimarray.concatenate(arrays, axis=0, _no_check=False, align=False, **kwargs)

concatenate several DimArrays

Parameters:

arrays : list of DimArrays

arrays to concatenate

axis : int or str

axis along which to concatenate (must exist)

align : bool, optional

align secondary axes before joining on the primary axis (default: False)

**kwargs : optional key-word arguments passed to align, if align is True

Returns:

concatenated DimArray

See also

stack
join arrays along a new dimension
align
align arrays

Examples

1-D

>>> from dimarray import DimArray, concatenate
>>> a = DimArray([1,2,3], axes=[['a','b','c']])
>>> b = DimArray([4,5,6], axes=[['d','e','f']])
>>> concatenate((a, b))
dimarray: 6 non-null elements (0 null)
0 / x0 (6): 'a' to 'f'
array([1, 2, 3, 4, 5, 6])

2-D

>>> a = DimArray([[1,2,3],[11,22,33]])
>>> b = DimArray([[4,5,6],[44,55,66]])
>>> concatenate((a, b), axis=0)
dimarray: 12 non-null elements (0 null)
0 / x0 (4): 0 to 1
1 / x1 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33],
       [ 4,  5,  6],
       [44, 55, 66]])
>>> concatenate((a, b), axis='x1')
dimarray: 12 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (6): 0 to 2
array([[ 1,  2,  3,  4,  5,  6],
       [11, 22, 33, 44, 55, 66]])
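
In contrast to stack, concatenation keeps the number of dimensions and extends an existing axis, joining the axis labels as well as the values. The following pure-Python sketch illustrates this for the 1-D case; `concatenate_labelled` and the (labels, values) pair representation are hypothetical and not part of dimarray.

```python
def concatenate_labelled(arrays):
    """Concatenate 1-D (labels, values) pairs along their single axis.

    Hypothetical helper for illustration only; not part of dimarray.
    """
    labels, values = [], []
    for lab, val in arrays:
        if len(lab) != len(val):
            raise ValueError("each array needs one label per value")
        labels.extend(lab)   # axis labels are joined end to end
        values.extend(val)   # and so are the values
    return labels, values


a = (["a", "b", "c"], [1, 2, 3])
b = (["d", "e", "f"], [4, 5, 6])
labels, values = concatenate_labelled([a, b])
print(labels)  # ['a', 'b', 'c', 'd', 'e', 'f']
print(values)  # [1, 2, 3, 4, 5, 6]
```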

dimarray.stack_ds(datasets, axis, keys=None, align=False, **kwargs)

stack dataset along a new dimension

Parameters:

datasets: sequence or dict of datasets

axis: str, new dimension along which to stack the dataset

keys, optional: stack axis values, useful if dataset is a sequence, or a non-ordered dictionary

align, optional: if True, align axes (via reindexing) *prior* to stacking

**kwargs : optional key-word arguments passed to align, if align is True

Returns:

stacked dataset

See also

concatenate_ds, stack, sort_axis

Examples

>>> from dimarray import DimArray, Dataset, stack_ds
>>> a = DimArray([1,2,3], dims=('dima',))
>>> b = DimArray([11,22], dims=('dimb',))
>>> ds = Dataset({'a':a,'b':b}) # dataset of 2 variables from an experiment
>>> ds2 = Dataset({'a':a*2,'b':b*2}) # dataset of 2 variables from a second experiment
>>> stack_ds([ds, ds2], axis='stackdim', keys=['exp1','exp2'])
Dataset of 2 variables
0 / stackdim (2): 'exp1' to 'exp2'
1 / dima (3): 0 to 2
2 / dimb (2): 0 to 1
a: ('stackdim', 'dima')
b: ('stackdim', 'dimb')
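
The align option reindexes the inputs onto common axes before they are joined. As a rough illustration, the sketch below reindexes 1-D data onto the union of the labels. It assumes an outer join with NaN for missing entries and first-seen label order; `align_outer` and the dict representation are hypothetical, not the library's implementation.

```python
import math


def align_outer(arrays):
    """Reindex 1-D {label: value} mappings onto the union of their labels.

    Hypothetical helper; assumes an outer join with NaN as fill value.
    """
    # Build the union of labels, keeping first-seen order (no sorting).
    common = []
    for arr in arrays:
        for label in arr:
            if label not in common:
                common.append(label)
    # Labels missing from an array are filled with NaN.
    aligned = [[arr.get(label, math.nan) for label in common] for arr in arrays]
    return common, aligned


exp1 = {2000: 1.0, 2001: 2.0, 2002: 3.0}
exp2 = {2001: 20.0, 2002: 30.0, 2003: 40.0}
labels, (v1, v2) = align_outer([exp1, exp2])
print(labels)  # [2000, 2001, 2002, 2003]
print(v1)      # [1.0, 2.0, 3.0, nan]
print(v2)      # [nan, 20.0, 30.0, 40.0]
```

After this step both arrays have the same length, so they can be stacked without a shape mismatch.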

dimarray.concatenate_ds(datasets, axis=0, align=False, **kwargs)

concatenate two datasets along an existing dimension

Parameters:

datasets: sequence of datasets

axis: axis along which to concatenate

align, optional: if True, align secondary axes (via reindexing) prior to concatenating

**kwargs : optional key-word arguments passed to align, if align is True

Returns:

joint Dataset along axis

NOTE: an error is raised if any variable does not contain the required dimension.

See also

stack_ds, concatenate, sort_axis

Examples

>>> import dimarray as da
>>> from dimarray import Dataset, concatenate_ds
>>> a = da.zeros(axes=[list('abc')], dims=('x0',))  # 1-D DimArray
>>> b = da.zeros(axes=[list('abc'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds = Dataset({'a':a,'b':b}) # dataset of 2 variables from an experiment
>>> a2 = da.ones(axes=[list('def')], dims=('x0',)) 
>>> b2 = da.ones(axes=[list('def'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds2 = Dataset({'a':a2,'b':b2}) # dataset of 2 variables from a second experiment
>>> concatenate_ds([ds, ds2])
Dataset of 2 variables
0 / x0 (6): 'a' to 'f'
1 / x1 (2): 1 to 2
a: ('x0',)
b: ('x0', 'x1')

Align

dimarray.align_axes(*args, **kwargs)

Deprecated. Now renamed to align


dimarray.align_dims(*arrays)

Align dimensions of a list of arrays so that they are ready for broadcast.

Method: singleton axes are inserted at the right place and the arrays are transposed where needed. Note: this function is not part of the public API, but it is used in other dimarray modules.

Examples

>>> import dimarray as da
>>> import numpy as np
>>> x = da.DimArray(np.arange(2), dims=('x0',))
>>> y = da.DimArray(np.arange(3), dims=('x1',))
>>> da.align_dims(x, y)
[dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (1): None to None
array([[0],
       [1]]), dimarray: 3 non-null elements (0 null)
0 / x0 (1): None to None
1 / x1 (3): 0 to 2
array([[0, 1, 2]])]
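
The bookkeeping behind this can be sketched without any array library: collect the dimension names of all inputs, then give each input a size-1 entry for every dimension it lacks. The function `aligned_shapes` and the (dims, shape) pair representation are hypothetical, and the first-seen ordering of dimensions is an assumption chosen to match the example above.

```python
def aligned_shapes(*arrays):
    """Return the common dims and, per array, its shape after inserting singleton axes.

    Each input is a (dims, shape) pair. Hypothetical helper; not part of dimarray.
    """
    # Common dimensions in first-seen order (an assumption for this sketch).
    common = []
    for dims, _ in arrays:
        for d in dims:
            if d not in common:
                common.append(d)
    shapes = []
    for dims, shape in arrays:
        size = dict(zip(dims, shape))
        # Missing dimensions become singleton axes of length 1.
        shapes.append(tuple(size.get(d, 1) for d in common))
    return tuple(common), shapes


dims, shapes = aligned_shapes((("x0",), (2,)), (("x1",), (3,)))
print(dims)    # ('x0', 'x1')
print(shapes)  # [(2, 1), (1, 3)]
```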

dimarray.broadcast_arrays(*arrays)

Analogous to numpy.broadcast_arrays, but with looser requirements on input shape, and returns copies instead of views.

Parameters:

arrays : variable list of DimArrays

Returns:

list of DimArrays

Examples

Just as numpy’s broadcast_arrays

>>> import dimarray as da
>>> x = da.DimArray([[1,2,3]])
>>> y = da.DimArray([[1],[2],[3]])
>>> da.broadcast_arrays(x, y)
[dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]]), dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])]
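
For readers unfamiliar with broadcasting, the following pure-Python sketch reproduces the values of the example above on nested lists: any axis of length 1 is repeated to match the other input. The helper `broadcast_2d` is hypothetical, handles only the 2-D case, and returns new lists (copies), in line with the description above.

```python
def broadcast_2d(x, y):
    """Broadcast two 2-D nested lists against each other, returning copies.

    Hypothetical helper for illustration only; not part of dimarray.
    """
    rows = max(len(x), len(y))
    cols = max(len(x[0]), len(y[0]))

    def expand(m):
        n_rows, n_cols = len(m), len(m[0])
        if n_rows not in (1, rows) or n_cols not in (1, cols):
            raise ValueError("shapes are not broadcastable")
        # A length-1 axis always reads index 0, which repeats its single entry.
        return [
            [m[i if n_rows > 1 else 0][j if n_cols > 1 else 0] for j in range(cols)]
            for i in range(rows)
        ]

    return expand(x), expand(y)


bx, by = broadcast_2d([[1, 2, 3]], [[1], [2], [3]])
print(bx)  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(by)  # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
```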

Interpolate

dimarray.interp2d(dim_array, newaxes, dims=(-2, -1), **kwargs)

Two-dimensional interpolation

Parameters:

dim_array : DimArray instance

newaxes : sequence of two array-like, or dict.

axes on which to interpolate

dims : sequence of two axis names or integer rank, optional

Indicate dimensions which match newaxes. By default (-2, -1) (last two dimensions).

**kwargs : passed to scipy.interpolate.RegularGridInterpolator

method : 'nearest' or 'linear' (default)

bounds_error : True by default

fill_value : np.nan by default, but set to None to extrapolate outside bounds.

Returns:

dim_array_int : DimArray instance

interpolated array

Examples

>>> import numpy as np
>>> from dimarray import DimArray, interp2d
>>> x = np.array([0, 1, 2])
>>> y = np.array([0, 10])
>>> a = DimArray([[0,0,1],[1,0.,0.]], [('y',y),('x',x)])
>>> a
dimarray: 6 non-null elements (0 null)
0 / y (2): 0 to 10
1 / x (3): 0 to 2
array([[ 0.,  0.,  1.],
       [ 1.,  0.,  0.]])
>>> newx = [0.5, 1.5]
>>> newy = np.linspace(0,10,5)
>>> ai = interp2d(a, [newy, newx])
>>> ai
dimarray: 10 non-null elements (0 null)
0 / y (5): 0.0 to 10.0
1 / x (2): 0.5 to 1.5
array([[ 0.   ,  0.5  ],
       [ 0.125,  0.375],
       [ 0.25 ,  0.25 ],
       [ 0.375,  0.125],
       [ 0.5  ,  0.   ]])

Use dims keyword argument if new axes order does not match array dimensions

>>> (ai == interp2d(a, [newx, newy], dims=('x','y'))).all()
True

Out-of-bounds filled with NaN:

>>> newx = [-1, 1]
>>> newy = [-5, 0, 10]
>>> interp2d(a, [newy, newx], bounds_error=False)
dimarray: 2 non-null elements (4 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[ nan,  nan],
       [ nan,   0.],
       [ nan,   0.]])

Nearest neighbor interpolation and out-of-bounds extrapolation

>>> interp2d(a, [newy, newx], method='nearest', bounds_error=False, fill_value=None)
dimarray: 6 non-null elements (0 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 1.,  0.]])
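
The values of the first (linear) example can be checked by hand with plain bilinear interpolation on a regular grid. The sketch below is a stdlib-only illustration of that arithmetic, not the scipy-based code path the function delegates to; the helpers `_bracket` and `bilinear` are hypothetical and only handle points inside the grid.

```python
from bisect import bisect_right


def _bracket(grid, value):
    """Return (i, t) so that value = grid[i] + t * (grid[i + 1] - grid[i])."""
    if not grid[0] <= value <= grid[-1]:
        raise ValueError("value %r is outside the grid" % (value,))
    i = min(bisect_right(grid, value) - 1, len(grid) - 2)
    return i, (value - grid[i]) / (grid[i + 1] - grid[i])


def bilinear(values, y, x, newy, newx):
    """Interpolate values[iy][ix], given on a regular (y, x) grid, onto (newy, newx)."""
    out = []
    for yy in newy:
        iy, ty = _bracket(y, yy)
        row = []
        for xx in newx:
            ix, tx = _bracket(x, xx)
            # Interpolate along x on the two bracketing rows, then along y.
            lower = values[iy][ix] * (1 - tx) + values[iy][ix + 1] * tx
            upper = values[iy + 1][ix] * (1 - tx) + values[iy + 1][ix + 1] * tx
            row.append(lower * (1 - ty) + upper * ty)
        out.append(row)
    return out


a = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
ai = bilinear(a, y=[0, 10], x=[0, 1, 2], newy=[0, 2.5, 5, 7.5, 10], newx=[0.5, 1.5])
print(ai[1])  # [0.125, 0.375]
```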

Stats

dimarray.percentile(a, pct, axis=0, newaxis=None, out=None, overwrite_input=False)

calculate percentile along an axis

Parameters:

pct: float, percentile or sequence of percentiles (between 0 and 100)

axis, optional, default 0: axis along which to compute percentiles

newaxis, optional: name of the new percentile axis, if more than one pct.

By default, append “_percentile” to the axis name on which the transformation is applied.

out, overwrite_input: passed to numpy’s percentile method (see documentation)

Returns:

pctiles: DimArray or scalar whose required axis has been reduced or replaced by percentiles

Examples

>>> import numpy as np
>>> from dimarray import DimArray, percentile
>>> np.random.seed(0) # for reproducibility of results
>>> a = DimArray(np.random.randn(1000), dims=['sample'])
>>> percentile(a, 50)
-0.058028034799627745
>>> percentile(a, [50, 95])
dimarray: 2 non-null elements (0 null)
0 / sample_percentile (2): 50 to 95
array([-0.05802803,  1.66012041])
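
As a reminder of what is computed along the axis, here is a stdlib-only sketch of a percentile with linear interpolation between the closest ranks, which is numpy's default behaviour. The function name `percentile_linear` is hypothetical; the library itself passes the work to numpy.

```python
def percentile_linear(values, pct):
    """Percentile of a 1-D sequence, interpolating linearly between closest ranks.

    Hypothetical helper for illustration only; not part of dimarray.
    """
    if not 0 <= pct <= 100:
        raise ValueError("pct must lie between 0 and 100")
    data = sorted(values)
    pos = (len(data) - 1) * pct / 100.0  # fractional rank in the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    frac = pos - lo
    return data[lo] * (1 - frac) + data[hi] * frac


sample = [4, 1, 3, 2]
print(percentile_linear(sample, 50))                     # 2.5
print([percentile_linear(sample, p) for p in (0, 100)])  # [1.0, 4.0]
```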

Read netCDF data

dimarray.read_nc(f, names=None, *args, **kwargs)

Wrapper around DatasetOnDisk.read

Read one or several variables from one or several netCDF files

Parameters:

f : str or netCDF handle

netCDF file to read from or regular expression

names : None or list or str, optional

variable name(s) to read (default: None)

indices : int or list or slice (single-dimensional indices), or a tuple of those (multi-dimensional), or dict of { axis name : axis indices }

Indices refer to Dataset axes. Any item that does not possess one of the dimensions will not be indexed along that dimension. For example, scalar items will be left unchanged whatever indices are provided.

indexing : {‘label’, ‘position’}, optional

Indexing mode.
- "label": indexing on axis labels (default)
- "position": use numpy-like position index
Default value can be changed in dimarray.rcParams['indexing.by']

tol : float, optional

tolerance when looking for numerical values, e.g. to use nearest neighbor search, default None.

keepdims : bool, optional

keep singleton dimensions (default False)

axis : str, optional

When reading multiple files, axis along which to join the dimarrays or datasets. If the axis already exists, the resulting arrays will be concatenated, otherwise they will be stacked along a new axis (in the sense of the numpy functions concatenate and stack).

keys : sequence, optional

When reading multiple files, keys for the join axis. If the axis already exists in the dataset, the concatenated dataset/dimarray will be re-indexed along the provided key, otherwise the keys will be used to create a new axis for stacking. In the latter case, keys’ length needs to exactly match the number of input files, and if not provided, file names will be taken instead. Note you may manually rename the axes later, or use the set_axis method.

align : bool, optional

When reading multiple files, passed to stack (new axis) or concatenate (existing axis) to reindex all arrays onto common axes. In concatenate mode, only the secondary axes are re-indexed, not the concatenation axis. Default is False.

**kwargs : optional key-word arguments passed to align, if align is True

When reading multiple files, passed to stack (new axis) or concatenate (existing axis). This includes: sort (False by default) and join ('outer' by default).

Returns:

obj : DimArray or Dataset

depending on whether a (single) variable name is passed as argument (names) or not

See also

DatasetOnDisk.read, stack, concatenate, stack_ds, concatenate_ds, align, DimArray.write_nc, Dataset.write_nc

Examples

>>> import os
>>> import dimarray as da
>>> from dimarray import read_nc, get_datadir

Single netCDF file

>>> ncfile = os.path.join(get_datadir(), 'cmip5.CSIRO-Mk3-6-0.nc')
>>> data = read_nc(ncfile)  # load full file
>>> data
Dataset of 2 variables
0 / time (451): 1850 to 2300
1 / scenario (5): u'historical' to u'rcp85'
tsl: (u'time', u'scenario')
temp: (u'time', u'scenario')
>>> data = read_nc(ncfile,'temp') # only one variable
>>> data = read_nc(ncfile,'temp', indices={"time":slice(2000,2100), "scenario":"rcp45"})  # load only a chunk of the data
>>> data = read_nc(ncfile,'temp', indices={"time":1950.3}, tol=0.5)  #  approximate matching, adjust tolerance
>>> data = read_nc(ncfile,'temp', indices={"time":-1}, indexing='position')  #  integer position indexing
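
The difference between label indexing with a tolerance and position indexing can be illustrated with a small stdlib-only sketch. It assumes that tol selects the nearest label within the given distance; the function `locate` is hypothetical and not the library's lookup code.

```python
def locate(labels, value, tol=None):
    """Return the position of `value` in `labels`.

    With tol=None an exact match is required; otherwise the nearest label is
    accepted if it lies within `tol` of the requested value.
    Hypothetical helper for illustration only; not part of dimarray.
    """
    if tol is None:
        return labels.index(value)  # raises ValueError if the label is absent
    pos = min(range(len(labels)), key=lambda i: abs(labels[i] - value))
    if abs(labels[pos] - value) > tol:
        raise ValueError("no label within %r of %r" % (tol, value))
    return pos


time = list(range(1850, 2301))         # 451 labels, as on the time axis above
print(locate(time, 1950.3, tol=0.5))   # 100, the position of label 1950
print(time[-1])                        # 2300, what position index -1 selects
```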

Multiple files

Read variable 'temp' across multiple files (representing various climate models). In this case the variable is a time series, whose length may vary across experiments (thus align=True is passed to reindex axes before stacking).

>>> direc = get_datadir()
>>> temp = da.read_nc(direc+'/cmip5.*.nc', 'temp', align=True, axis='model')

A new ‘model’ axis is created labeled with file names. It is then possible to rename it more appropriately, e.g. keeping only the part directly relevant to identify the experiment:

>>> getmodel = lambda x: os.path.basename(x).split('.')[1] # extract model name from path
>>> temp.set_axis(getmodel, axis='model') # would return a copy if inplace is not specified
>>> temp
dimarray: 9114 non-null elements (6671 null)
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
array(...)

This works on datasets as well:

>>> ds = da.read_nc(direc+'/cmip5.*.nc', align=True, axis='model')
>>> ds.set_axis(getmodel, axis='model')
>>> ds
Dataset of 2 variables
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
tsl: ('model', u'time', u'scenario')
temp: ('model', u'time', u'scenario')
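
The renaming step can be tried out without any netCDF file. The sketch below assumes the file pattern is matched like a shell glob (an assumption; the parameter description calls it a regular expression) and derives one key per matching file with the same split as getmodel above. The function `model_keys` and the paths are hypothetical.

```python
import fnmatch
import os


def model_keys(paths, pattern="cmip5.*.nc"):
    """Select file names matching a glob pattern and derive one key per file.

    Hypothetical helper for illustration only; not part of dimarray.
    """
    selected = sorted(p for p in paths if fnmatch.fnmatch(os.path.basename(p), pattern))
    # Keep the second dot-separated field, e.g. 'cmip5.MODEL.nc' -> 'MODEL'.
    return [os.path.basename(p).split(".")[1] for p in selected]


# Made-up paths, only used to exercise the helper.
paths = ["/data/cmip5.MPI-ESM-MR.nc", "/data/readme.txt", "/data/cmip5.CSIRO-Mk3-6-0.nc"]
print(model_keys(paths))  # ['CSIRO-Mk3-6-0', 'MPI-ESM-MR']
```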

dimarray.summary_nc(fname, name=None, metadata=False)

Print summary information about the content of a netCDF file.

Deprecated, see dimarray.open_nc


dimarray.get_datadir()

Return directory name for the datasets


dimarray.get_ncfile(fname='cmip5.CSIRO-Mk3-6-0.nc')

Return one netCDF file

dimarray options

dimarray.print_options()

dimarray.get_option(name)

dimarray.set_option(name, value)

set global options