DimArray API

DimArray methods are listed below by topic, along with examples. Functions are documented on a separate page, the functions reference API.

Create a DimArray

DimArray.__init__(values=None, axes=None, dims=None, labels=None, copy=False, dtype=None, _indexing=None, _indexing_broadcast=None, **kwargs)

Initialize a DimArray instance

Parameters:

values : numpy-like array, or DimArray instance, or dict

If values is not provided, will initialize an empty array with dimensions inferred from axes (in that case axes= must be provided).

axes : list or tuple, optional

axis values as ndarrays, whose order matches the axis names (the dimensions) provided via the dims= parameter. Each axis can also be provided as a tuple (str, array-like) which contains both the axis name and the axis values, in which case dims= becomes superfluous. axes= can also be provided as a list of Axis objects. If axes= is omitted, a standard axis np.arange(shape[i]) is created for each axis i.

dims : list or tuple, optional

dimensions (or axis names). This parameter can be omitted if dimensions are already provided by other means, such as passing a list of tuples to axes=. If axes are passed as keyword arguments (via **kwargs), dims= is used to determine the order of dimensions. If dims is not provided by any of the means mentioned above, default dimension names x0, x1, ..., x(n-1) are given, where n is the number of dimensions.

dtype : numpy data type, optional

passed to np.array()

copy : bool, optional

passed to np.array()

**kwargs : keyword arguments

metadata

Notes

Metadata passed this way cannot use a name already taken by other parameters, such as “values”, “axes”, “dims”, “dtype” or “copy”.
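
The accepted forms of axes= and dims= described above can be pictured with a small pure-Python sketch. The helper normalize_axes below is hypothetical and written only for illustration: it is not part of dimarray and makes no claim about how the library resolves its arguments internally.

```python
def normalize_axes(shape, axes=None, dims=None):
    """Return one (name, labels) pair per dimension.

    Hypothetical helper for illustration; not dimarray's implementation.
    """
    ndim = len(shape)
    if axes is None:
        # axes= omitted: integer labels 0..size-1, like np.arange(shape[i])
        axes = [range(size) for size in shape]
    if len(axes) != ndim:
        raise ValueError("expected %d axes, got %d" % (ndim, len(axes)))

    pairs = []
    for i, axis in enumerate(axes):
        is_named = (isinstance(axis, tuple) and len(axis) == 2
                    and isinstance(axis[0], str)
                    and not isinstance(axis[1], str))
        if is_named:
            # (name, labels) tuple: dims= is not needed
            name, labels = axis
        else:
            # bare labels: name comes from dims=, else the default "x<i>"
            name = dims[i] if dims is not None else "x%d" % i
            labels = axis
        labels = list(labels)
        if len(labels) != shape[i]:
            raise ValueError("axis %r has %d labels, expected %d"
                             % (name, len(labels), shape[i]))
        pairs.append((name, labels))
    return pairs


# Two equivalent spellings, plus the all-defaults case
by_dims = normalize_axes((2, 3), [list("ab"), range(1950, 1953)],
                         dims=["items", "time"])
by_tuples = normalize_axes((2, 3), [("items", list("ab")),
                                    ("time", range(1950, 1953))])
defaults = normalize_axes((2, 3))
```

Under these assumptions, by_dims and by_tuples describe the same axes, while defaults falls back to the names x0 and x1 with integer labels.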

Examples

Basic:

>>> import numpy as np
>>> from dimarray import DimArray
>>> DimArray([[1,2,3],[4,5,6]]) # automatic labelling
dimarray: 6 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (3): 0 to 2
array([[1, 2, 3],
       [4, 5, 6]])
>>> DimArray([[1,2,3],[4,5,6]], dims=['items','time'])  # axis names only
dimarray: 6 non-null elements (0 null)
0 / items (2): 0 to 1
1 / time (3): 0 to 2
array([[1, 2, 3],
       [4, 5, 6]])
>>> DimArray([[1,2,3],[4,5,6]], axes=[list("ab"), np.arange(1950,1953)]) # axis values only
dimarray: 6 non-null elements (0 null)
0 / x0 (2): 'a' to 'b'
1 / x1 (3): 1950 to 1952
array([[1, 2, 3],
       [4, 5, 6]])

More general case:

>>> a = DimArray([[1,2,3],[4,5,6]], axes=[list("ab"), np.arange(1950,1953)], dims=['items','time']) 
>>> b = DimArray([[1,2,3],[4,5,6]], axes=[('items',list("ab")), ('time',np.arange(1950,1953))])
>>> c = DimArray([[1,2,3],[4,5,6]], {'items':list("ab"), 'time':np.arange(1950,1953)}) # here dims can be omitted because shape = (2, 3)
>>> np.all(a == b) and np.all(a == c)
True
>>> a
dimarray: 6 non-null elements (0 null)
0 / items (2): 'a' to 'b'
1 / time (3): 1950 to 1952
array([[1, 2, 3],
       [4, 5, 6]])

Empty data

>>> a = DimArray(axes=[('items',list("ab")), ('time',np.arange(1950,1953))])

Metadata

>>> a = DimArray([[1,2,3],[4,5,6]], name='test', units='none') 

Modify shape

DimArray.transpose(*dims)

Permute dimensions

Analogous to numpy, but also allows axis names

Parameters:

*dims : int or str

variable list of dimensions

Returns:

transposed_array : DimArray

See also

reshape, flatten, unflatten, newaxis

Examples

>>> import dimarray as da
>>> a = da.DimArray(np.zeros((2,3)), ['x0','x1'])
>>> a          
dimarray: 6 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (3): 0 to 2
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
>>> a.T       
dimarray: 6 non-null elements (0 null)
0 / x1 (3): 0 to 2
1 / x0 (2): 0 to 1
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> (a.T == a.transpose(1,0)).all() and (a.T == a.transpose('x1','x0')).all()
True

DimArray.swapaxes(axis1, axis2)

Swap two axes

Analogous to numpy’s swapaxes, but axes can also be given by name

Parameters:

axis1, axis2 : int or str

axes to swap (transpose)

Returns:

transposed_array : DimArray

Examples

>>> from dimarray import DimArray
>>> a = DimArray(np.arange(2*3*4).reshape(2,3,4))
>>> a.dims
('x0', 'x1', 'x2')
>>> b = a.swapaxes('x2',0) # put 'x2' at the first position
>>> b.dims
('x2', 'x1', 'x0')
>>> b.shape
(4, 3, 2)
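
Accepting an axis either by position or by name, as transpose, swapaxes and squeeze do, amounts to a lookup in the tuple of dimension names. The sketch below shows that idea in plain Python. The helpers resolve_axis and swapped_dims are hypothetical names introduced for illustration and are not part of dimarray.

```python
def resolve_axis(dims, axis):
    """Return the integer position of an axis given by position or by name.

    Hypothetical helper for illustration; not part of dimarray.
    """
    if isinstance(axis, str):
        return dims.index(axis)  # raises ValueError for an unknown name
    if not -len(dims) <= axis < len(dims):
        raise IndexError("axis %d out of range for %d dimensions"
                         % (axis, len(dims)))
    return axis % len(dims)  # normalise negative positions


def swapped_dims(dims, axis1, axis2):
    """Dimension names after swapping two axes."""
    i, j = resolve_axis(dims, axis1), resolve_axis(dims, axis2)
    names = list(dims)
    names[i], names[j] = names[j], names[i]
    return tuple(names)


dims = ("x0", "x1", "x2")
# Same arguments as a.swapaxes('x2', 0) in the example above
new_dims = swapped_dims(dims, "x2", 0)
```

With these inputs new_dims is ('x2', 'x1', 'x0'), the same order as b.dims in the example above.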

DimArray.reshape(*newdims, **kwargs)

Add/remove/flatten dimensions to conform array to new dimensions

Parameters:

newdims : tuple or list or variable list of dimension names {str}

Any dimension not present in the array is added as a singleton dimension. Any dimension name containing a comma is interpreted as a flattening command. All dimensions to flatten have to exist already.

transpose : bool

if True, transpose dimensions to match the new order (default True); otherwise, raise an error if a transpose is needed (closer to numpy’s original behaviour)

Returns:

reshaped_array : DimArray

with reshaped_array.dims == tuple(newdims)

See also

flatten, unflatten, transpose, newaxis

Examples

>>> from dimarray import DimArray
>>> a = DimArray([7,8])
>>> a
dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
array([7, 8])
>>> a.reshape(('x0','new'))
dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / new (1): None to None
array([[7],
       [8]])
>>> b = DimArray(np.arange(2*2*2).reshape(2,2,2))
>>> b
dimarray: 8 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (2): 0 to 1
2 / x2 (2): 0 to 1
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
>>> c = b.reshape('x0','x1,x2')
>>> c
dimarray: 8 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1,x2 (4): (0, 0) to (1, 1)
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])
>>> c.reshape('x0,x1','x2')
dimarray: 8 non-null elements (0 null)
0 / x0,x1 (4): (0, 0) to (1, 1)
1 / x2 (2): 0 to 1
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])

DimArray.flatten(*dims, **kwargs)

Flatten all or a subset of dimensions

Parameters:

dims : list or tuple of axis names, optional

by default, all dimensions

reverse : bool, optional

if True, reverse behaviour: dims are interpreted as the dimensions to keep, and all the other dimensions are flattened. Default is False.

insert : int, optional

position where to insert the flattened axis (by default, any flattened dimension is inserted at the position of the first axis involved in flattening)

Returns:

flattened_array : DimArray

appropriately reshaped, with collapsed dimensions as first axis (tuples)

This is useful to do a regional mean with missing values

See also

reshape, transpose

Notes

A tuple of axis names can be passed via the “axis” parameter of the transformation to trigger flattening prior to reducing an axis.

Examples

Flatten all dimensions

>>> from dimarray import DimArray
>>> a = DimArray([[1,2,3],[4,5,6]])
>>> a
dimarray: 6 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (3): 0 to 2
array([[1, 2, 3],
       [4, 5, 6]])
>>> b = a.flatten()
>>> b
dimarray: 6 non-null elements (0 null)
0 / x0,x1 (6): (0, 0) to (1, 2)
array([1, 2, 3, 4, 5, 6])
>>> b.labels
(array([(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)], dtype=object),)

Flatten a subset of dimensions only

>>> from dimarray import DimArray
>>> np.random.seed(0)
>>> values = np.arange(2*3*4).reshape(2,3,4)
>>> v = DimArray(values, axes=[('time', [1950,1955]), ('lat', np.linspace(-90,90,3)), ('lon', np.linspace(-180,180,4))])
>>> v
dimarray: 24 non-null elements (0 null)
0 / time (2): 1950 to 1955
1 / lat (3): -90.0 to 90.0
2 / lon (4): -180.0 to 180.0
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> w = v.flatten(('lat','lon'), insert=1)
>>> w 
dimarray: 24 non-null elements (0 null)
0 / time (2): 1950 to 1955
1 / lat,lon (12): (-90.0, -180.0) to (90.0, 180.0)
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
>>> np.all( w.unflatten() == v )
True

But be careful, the order matters!

>>> v.flatten(('lon','lat'), insert=1)
dimarray: 24 non-null elements (0 null)
0 / time (2): 1950 to 1955
1 / lon,lat (12): (-180.0, -90.0) to (180.0, 90.0)
array([[ 0,  4,  8,  1,  5,  9,  2,  6, 10,  3,  7, 11],
       [12, 16, 20, 13, 17, 21, 14, 18, 22, 15, 19, 23]])

Useful to average over a group of dimensions:

>>> v.flatten(('lon','lat'), insert=0).mean(axis=0)
dimarray: 2 non-null elements (0 null)
0 / time (2): 1950 to 1955
array([  5.5,  17.5])

is equivalent to:

>>> v.mean(axis=('lon','lat')) 
dimarray: 2 non-null elements (0 null)
0 / time (2): 1950 to 1955
array([  5.5,  17.5])
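
Why the order of the flattened dimensions matters can be reproduced without dimarray. The sketch below uses plain Python lists for the first time slice of v. The variable names are illustrative only, and the lon values are those produced by np.linspace(-180, 180, 4).

```python
from itertools import product

lat = [-90.0, 0.0, 90.0]
lon = [-180.0, -60.0, 60.0, 180.0]
# First time slice of `v` above, indexed as block[i_lat][i_lon]
block = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]

# ('lat', 'lon'): lat varies slowest, which is plain row-major order
lat_lon_values = [block[i][j] for i in range(len(lat)) for j in range(len(lon))]
lat_lon_labels = list(product(lat, lon))

# ('lon', 'lat'): lon varies slowest, so the values are read column by column
lon_lat_values = [block[i][j] for j in range(len(lon)) for i in range(len(lat))]
lon_lat_labels = list(product(lon, lat))

# Averaging over the flattened axis does not depend on the order
mean_over_group = sum(lon_lat_values) / len(lon_lat_values)
```

Here lon_lat_values starts 0, 4, 8, 1, ... like the first row of the ('lon','lat') example, and mean_over_group is 5.5, the value reported for 1950.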

DimArray.unflatten(axis=None)

undo flatten (inflate array)

Parameters:

axis : int or str or None, optional

axis to unflatten; defaults to None, which unflattens all flattened axes

Returns:

DimArray


DimArray.squeeze(axis=None)

Squeeze singleton axes

Analogous to numpy, but also allows axis name

Parameters:

axis : int or str or None

axis to squeeze; default is None, which removes all singleton axes

Returns:

squeezed_array : DimArray

Examples

>>> import dimarray as da
>>> a = da.DimArray([[[1,2,3]]])
>>> a
dimarray: 3 non-null elements (0 null)
0 / x0 (1): 0 to 0
1 / x1 (1): 0 to 0
2 / x2 (3): 0 to 2
array([[[1, 2, 3]]])
>>> a.squeeze()
dimarray: 3 non-null elements (0 null)
0 / x2 (3): 0 to 2
array([1, 2, 3])
>>> a.squeeze(axis='x1')
dimarray: 3 non-null elements (0 null)
0 / x0 (1): 0 to 0
1 / x2 (3): 0 to 2
array([[1, 2, 3]])

DimArray.repeat(values, axis=None)

expand the array along an existing axis

Parameters:

values : int or ndarray or Axis instance

int: size of the new axis; ndarray: values of the new axis

axis : int or str

refer to the dimension along which to repeat

**kwaxes : key-word arguments

alternatively, axes may be passed as keyword arguments

Returns:

DimArray

See also

newaxis

Examples

>>> import dimarray as da
>>> a = da.DimArray(np.arange(3), labels = [[1950., 1951., 1952.]], dims=('time',))
>>> a2d = a.newaxis('lon', pos=1) # lon is now singleton dimension
>>> a2d.repeat(2, axis="lon")  
dimarray: 6 non-null elements (0 null)
0 / time (3): 1950.0 to 1952.0
1 / lon (2): 0 to 1
array([[0, 0],
       [1, 1],
       [2, 2]])
>>> a2d.repeat([30., 50.], axis="lon")  
dimarray: 6 non-null elements (0 null)
0 / time (3): 1950.0 to 1952.0
1 / lon (2): 30.0 to 50.0
array([[0, 0],
       [1, 1],
       [2, 2]])

DimArray.broadcast(other)

repeat array to match target dimensions

Parameters:

other : DimArray or Axes objects or ordered dictionary of axis values

Returns:

DimArray

Examples

Create some dummy data:

>>> import dimarray as da
>>> lon = np.linspace(10, 30, 2)
>>> lat = np.linspace(10, 50, 3)
>>> time = np.arange(1950,1955)
>>> ts = da.DimArray(np.arange(5), axes=[time], dims=['time'])
>>> cube = da.DimArray(np.zeros((3,2,5)), axes=[('lat',lat), ('lon',lon), ('time',time)])  # lat x lon x time
>>> cube.axes  
0 / lat (3): 10.0 to 50.0
1 / lon (2): 10.0 to 30.0
2 / time (5): 1950 to 1954

Broadcast the time series to 3D data:

>>> ts3D = ts.broadcast(cube) #  lat x lon x time
>>> ts3D
dimarray: 30 non-null elements (0 null)
0 / lat (3): 10.0 to 50.0
1 / lon (2): 10.0 to 30.0
2 / time (5): 1950 to 1954
array([[[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]],

       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]],

       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
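
One way to picture broadcasting by dimension name is in two steps: align the source to the target's dimension order, inserting singleton dimensions where needed, then repeat along those singletons. The sketch below follows that reading in plain Python. The helper aligned_shape is hypothetical, and nothing here is taken from dimarray's source.

```python
def aligned_shape(src_dims, src_shape, target_dims):
    """Shape of the source once aligned to the target's dimension order.

    Dimensions missing from the source become singletons (size 1), ready
    to be repeated. Hypothetical helper; not part of dimarray.
    """
    unknown = [d for d in src_dims if d not in target_dims]
    if unknown:
        raise ValueError("dimensions %r are absent from the target" % unknown)
    sizes = dict(zip(src_dims, src_shape))
    return tuple(sizes.get(d, 1) for d in target_dims)


ts = [0, 1, 2, 3, 4]  # values along 'time'
shape = aligned_shape(("time",), (5,), ("lat", "lon", "time"))

# Repeat along the singleton 'lat' (3 labels) and 'lon' (2 labels) axes
ts3d = [[list(ts) for _ in range(2)] for _ in range(3)]
```

The intermediate shape is (1, 1, 5), and every ts3d[i][j] holds the original series, as in the ts3D output above.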

Reduce, accumulate

DimArray.max(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s max

max(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.max or numpy.ma.max for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True
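
The effect of skipna on a reduction can be sketched in plain Python. The function nan_aware_max below is a hypothetical illustration and not dimarray's code. It mirrors the usual numpy convention that a NaN propagates through numpy.max, whereas numpy.nanmax ignores it.

```python
import math


def nan_aware_max(values, skipna=False):
    """Maximum of a flat sequence of floats.

    skipna=False: any NaN makes the result NaN.
    skipna=True: NaNs are dropped before taking the maximum
    (raises ValueError in this sketch if nothing is left).
    """
    if skipna:
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        return float("nan")
    return max(values)


data = [1.0, float("nan"), 3.0]
strict = nan_aware_max(data)                  # NaN propagates
lenient = nan_aware_max(data, skipna=True)    # NaN treated as missing
```

The same distinction applies to the other reductions in this section (min, sum, prod and so on).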

DimArray.min(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s min

min(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.min or numpy.ma.min for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.ptp(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s ptp

ptp(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.ptp or numpy.ma.ptp for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.median(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s median

median(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.median or numpy.ma.median for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.all(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s all

all(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.all or numpy.ma.all for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.any(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s any

any(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.any or numpy.ma.any for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.prod(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s prod

prod(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.prod or numpy.ma.prod for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.sum(axis=None, skipna=False, args=(), **kwargs)

Analogous to numpy’s sum

sum(..., axis=None, skipna=False, ...)

Accepts the same parameters as the equivalent numpy function, with modified behaviour of the axis parameter and an additional skipna parameter to handle NaNs (by default considered missing values)

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)

“...” stands for any other parameters required by the function, and depends on the particular function being called

Returns:

DimArray, or numpy array or scalar (e.g. in some cases if axis is None)

See help on numpy.sum or numpy.ma.sum for other parameters and more information.

See also

apply_along_axis : is called by this method
to_MaskedArray : is used if skipna is True

DimArray.mean(axis=None, skipna=False, weights=None)

mean over an axis or sequence of axes, possibly weighted

This transformation can be weighted if a non-None weights parameter is provided, or if one of the axes has a non-None weights attribute. Otherwise, a standard, unweighted transformation is performed.

Parameters:

axis : int or str or tuple, optional

axis or sequence of axes to apply the transform on

skipna : bool, optional

ignore missing values (NaNs) prior to transformation. Default is False.

weights : array-like or callable or dict, optional

if provided, is used instead of individual axes’ weights attributes

A weights array can be built from individual axes’ weights, either as a parameter to this function, or as a permanent axis attribute defined for the relevant axes. Weights can be of the form:

  • 1-D array-like : for 1-D arrays or if the transformation is to be applied to one axis only (via axis parameter)

  • callable : like above, will be applied on the axis specified by the axis parameter, if provided

If passed as a parameter to the weighted transformation, weights can also be provided as a dictionary. The keys must be axis names or integer ranks, and values one of the accepted types.

Returns:

DimArray instance or scalar, consistently with ndarray behaviour

See also

DimArray.var, DimArray.std, DimArray._get_weights

Notes

The weights actually used in the transformation can be checked via the DimArray._get_weights() method (experimental)

Examples

>>> from dimarray import DimArray
>>> np.random.seed(0) # to make results reproducible
>>> v = DimArray(np.random.rand(3,2), axes=[[-80, 0, 80], [-180, 180]], dims=['lat','lon'])

Classical, unweighted mean:

>>> v.mean() 
0.58019972362897432

Weighted mean

>>> w = np.cos(np.radians(v.lat))
>>> v.mean(weights={'lat':w})  # only lat axis is weighted
0.57628879031663871

Make the change permanent

>>> v.axes['lat'].weights = w
>>> v.mean()
0.57628879031663871

Check the weights being used (experimental)

>>> v._get_weights()
dimarray: 6 non-null elements (0 null)
0 / lat (3): -80 to 80
1 / lon (2): -180 to 180
array([[ 0.17364818,  0.17364818],
       [ 1.        ,  1.        ],
       [ 0.17364818,  0.17364818]])
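
The weighted mean above can be checked by hand with the standard library alone. The sketch below assumes the usual definition, sum(w * v) / sum(w), with the lat weights repeated along lon as shown by _get_weights(). The values are those drawn by np.random.rand(3, 2) after np.random.seed(0), rounded to 8 decimals, which is an assumption consistent with the unweighted mean printed above.

```python
import math

lat = [-80, 0, 80]
# Assumed values: np.random.seed(0); np.random.rand(3, 2), rounded to 8 decimals
values = [[0.5488135, 0.71518937],
          [0.60276338, 0.54488318],
          [0.4236548, 0.64589411]]

# One weight per latitude, as in w = np.cos(np.radians(v.lat))
w = [math.cos(math.radians(x)) for x in lat]

# Expand to one weight per element: every value in a row shares its lat weight
weights = [[w[i]] * len(row) for i, row in enumerate(values)]

flat_values = [x for row in values for x in row]
flat_weights = [x for row in weights for x in row]

unweighted_mean = sum(flat_values) / len(flat_values)
weighted_mean = (sum(x * wt for x, wt in zip(flat_values, flat_weights))
                 / sum(flat_weights))
```

Under these assumptions the two results agree with the values printed above (about 0.5802 and 0.5763) to the precision of the rounded inputs.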

DimArray.std(*args, **kwargs)

standard deviation over an axis or sequence of axes, possibly weighted

Parameters:

axis : int or str or tuple, optional

axis or sequence of axes to apply the transform on

skipna : bool, optional

ignore missing values (NaNs) prior to transformation. Default is False.

weights : array-like or callable or dict, optional

if provided, is used instead of individual axes’ weights attributes

A weights array can be built from individual axes’ weights, either as a parameter to this function, or as a permanent axis attribute defined for the relevant axes. Weights can be of the form:

  • 1-D array-like : for 1-D arrays or if the transformation is to be applied to one axis only (via axis parameter)

  • callable : like above, will be applied on the axis specified by the axis parameter, if provided

If passed as a parameter to the weighted transformation, weights can also be provided as a dictionary. The keys must be axis names or integer ranks, and values one of the accepted types.

ddof : int, optional

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. Default is 0. Note that ddof is ignored when weights are used.

Returns:

DimArray instance or scalar, consistently with ndarray behaviour

See also

DimArray.mean, DimArray.var, DimArray._get_weights

Notes

The weights actually used in the transformation can be checked via the DimArray._get_weights() method (experimental)


DimArray.var(axis=None, skipna=False, weights=None, ddof=0)

variance over an axis or sequence of axes, possibly weighted

Parameters:

axis : int or str or tuple, optional

axis or sequence of axes to apply the transform on

skipna : bool, optional

ignore missing values (NaNs) prior to transformation. Default is False.

weights : array-like or callable or dict, optional

if provided, is used instead of individual axes’ weights attributes

A weights array can be built from individual axes’ weights, either as a parameter to this function, or as a permanent axis attribute defined for the relevant axes. Weights can be of the form:

  • 1-D array-like : for 1-D arrays or if the transformation is to be applied to one axis only (via axis parameter)

  • callable : like above, will be applied on the axis specified by the axis parameter, if provided

If passed as a parameter to the weighted transformation, weights can also be provided as a dictionary. The keys must be axis names or integer ranks, and values one of the accepted types.

ddof : int, optional

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. Default is 0. Note that ddof is ignored when weights are used.

Returns:

DimArray instance or scalar, consistently with ndarray behaviour

See also

DimArray.mean, DimArray.std, DimArray._get_weights

Notes

The weights actually used in the transformation can be checked via the DimArray._get_weights() method (experimental)
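
The role of ddof in the unweighted case follows directly from the divisor N - ddof. The function below is a minimal standard-library sketch of that formula, not dimarray's implementation.

```python
def variance(values, ddof=0):
    """Unweighted variance with divisor N - ddof."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("ddof must be smaller than the number of elements")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
population_var = variance(data)        # divisor N = 4
sample_var = variance(data, ddof=1)    # divisor N - 1 = 3
population_std = population_var ** 0.5
```

The squared deviations sum to 5.0, so the result is 1.25 with ddof=0 and 5/3 with ddof=1. The standard deviation is the square root of the variance.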


DimArray.argmax(axis=None, skipna=False)

Similar to numpy’s argmax, but returns axis values instead of integer positions

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)


DimArray.argmin(axis=None, skipna=False)

Similar to numpy’s argmin, but returns axis values instead of integer positions

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is None.

skipna : bool

If True, treat NaN as missing values (either using MaskedArray or, when available, a specific numpy function)
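
Returning axis values rather than integer positions means looking up the position of the extremum in the axis labels. The sketch below shows this for a 1-D case. The helpers label_argmax and label_argmin are hypothetical names used only for illustration, and the data are made up.

```python
def label_argmax(values, labels):
    """Axis label at which `values` is largest (first occurrence on ties)."""
    pos = max(range(len(values)), key=values.__getitem__)
    return labels[pos]


def label_argmin(values, labels):
    """Axis label at which `values` is smallest (first occurrence on ties)."""
    pos = min(range(len(values)), key=values.__getitem__)
    return labels[pos]


time = [1950, 1951, 1952]
series = [3.0, 7.5, 1.2]   # made-up data for illustration

year_of_max = label_argmax(series, time)   # position 1 on the time axis
year_of_min = label_argmin(series, time)   # position 2 on the time axis
```

A positional argmax would give 1 here, whereas the label-based variant gives the year 1951.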


DimArray.cumsum(a, axis=-1, skipna=False)

DimArray.cumprod(a, axis=-1, skipna=False)

DimArray.diff(axis=-1, scheme='backward', keepaxis=False, n=1)

Analogous to numpy’s diff

Calculate the n-th order discrete difference along given axis.

The first order difference is given by out[n] = a[n+1] - a[n] along the given axis, higher order differences are calculated by using diff recursively.

Parameters:

axis : int or str or tuple

axis along which to apply the transform. Can be given as an axis position (int), as an axis name (str), or as a list or tuple of axes (positions or names) to collapse into one axis before applying the transform. If axis is None, the transform is applied to the flattened array, consistently with numpy (in this case a scalar is returned). Default is -1.

scheme : str, optional

determines the values of the resulting axis:

  • “forward” : diff[i] = x[i+1] - x[i]

  • “backward” : diff[i] = x[i] - x[i-1]

  • “centered” : diff[i] = x[i+1/2] - x[i-1/2]

Default is “backward”.

keepaxis : bool, optional

if True, keep the initial axis by padding with NaNs. Only compatible with “forward” or “backward” differences. Default is False.

n : int, optional

The number of times values are differenced. Default is one

Returns:

diff : DimArray

The n-th order differences. The shape of the output is the same as that of the input array, except along axis, where the dimension is smaller by n.

Examples

Create some example data

>>> import dimarray as da
>>> v = da.DimArray([1,2,3,4], ('time', np.arange(1950,1954)), dtype=float)
>>> s = v.cumsum()
>>> s 
dimarray: 4 non-null elements (0 null)
0 / time (4): 1950 to 1953
array([  1.,   3.,   6.,  10.])

diff reduces axis size by one, by default

>>> s.diff()
dimarray: 3 non-null elements (0 null)
0 / time (3): 1951 to 1953
array([ 2.,  3.,  4.])

The keepaxis= parameter fills array with nan where necessary to keep the axis unchanged. Default is backward differencing: diff[i] = v[i] - v[i-1].

>>> s.diff(keepaxis=True)
dimarray: 3 non-null elements (1 null)
0 / time (4): 1950 to 1953
array([ nan,   2.,   3.,   4.])

But other schemes are available to control how the new axis is defined: backward (default), forward and even centered

>>> s.diff(keepaxis=True, scheme="forward") # diff[i] = v[i+1] - v[i]
dimarray: 3 non-null elements (1 null)
0 / time (4): 1950 to 1953
array([  2.,   3.,   4.,  nan])

The keepaxis=True option is invalid with the centered scheme, since every axis value is modified by definition:

>>> s.diff(axis='time', scheme='centered')
dimarray: 3 non-null elements (0 null)
0 / time (3): 1950.5 to 1952.5
array([ 2.,  3.,  4.])
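
The three schemes produce the same differences and differ only in the axis labels attached to them. The plain-Python sketch below makes that explicit for a first-order difference without keepaxis. The function diff_1d is a hypothetical illustration and not dimarray's implementation.

```python
def diff_1d(values, labels, scheme="backward"):
    """First-order difference of a 1-D series, with the resulting labels."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if scheme == "backward":
        # diff[i] = x[i] - x[i-1]: labelled by the later point
        new_labels = labels[1:]
    elif scheme == "forward":
        # diff[i] = x[i+1] - x[i]: labelled by the earlier point
        new_labels = labels[:-1]
    elif scheme == "centered":
        # labelled by the midpoint between the two points
        new_labels = [(a + b) / 2 for a, b in zip(labels, labels[1:])]
    else:
        raise ValueError("unknown scheme: %r" % scheme)
    return diffs, new_labels


s = [1.0, 3.0, 6.0, 10.0]
time = [1950, 1951, 1952, 1953]

backward = diff_1d(s, time)
forward = diff_1d(s, time, scheme="forward")
centered = diff_1d(s, time, scheme="centered")
```

The backward labels run from 1951 to 1953 and the centered ones from 1950.5 to 1952.5, matching the outputs above.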

Indexing

DimArray.__getitem__(indices=None, axis=0, indexing=None, tol=None, broadcast=None, keepdims=False, broadcast_arrays=None)

DimArray.ix()

DimArray.box()

property to allow indexing without array broadcasting (matlab-like)


DimArray.take()

Retrieve values from a DimArray

Parameters:

indices : int or list or slice (single-dimensional indices), or a tuple of those (multi-dimensional), or dict of { axis name : axis values }

axis : None or int or str, optional

if specified and indices is a slice, scalar or an array, assumes indexing is along this axis.

indexing : {‘label’, ‘position’}, optional

Indexing mode:

  • “label” : indexing on axis labels (default)

  • “position” : use numpy-like position index

The default value can be changed in dimarray.rcParams[‘indexing.by’].

tol : None or float or tuple or dict, optional

tolerance when looking for numerical values, e.g. to use nearest neighbor search, default None.

keepdims : bool, optional

keep singleton dimensions (default False)

broadcast : bool, optional

if True, use numpy-like fancy indexing and broadcast any indexing array to a common shape, useful for example to sample points along a path. Defaults to False.

Returns:

indexed_array : DimArray instance or scalar

See also

DimArray.put, DimArrayOnDisk.read, DimArray.take_axis

Examples

>>> from dimarray import DimArray
>>> v = DimArray([[1,2,3],[4,5,6]], axes=[["a","b"], [10.,20.,30.]], dims=['d0','d1'], dtype=float) 
>>> v
dimarray: 6 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
1 / d1 (3): 10.0 to 30.0
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

Indexing via axis values (default)

>>> a = v[:,10]   # python slicing method
>>> a
dimarray: 2 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
array([ 1.,  4.])
>>> b = v.take(10, axis=1)  # take, by axis position
>>> c = v.take(10, axis='d1')  # take, by axis name
>>> d = v.take({'d1':10})  # take, by dict {axis name : axis values}
>>> (a==b).all() and (a==c).all() and (a==d).all()
True

Indexing via integer index (indexing="position" or ix property)

>>> np.all(v.ix[:,0] == v[:,10])
True
>>> np.all(v.take(0, axis="d1", indexing="position") == v.take(10, axis="d1"))
True

Multi-dimensional indexing

>>> v["a", 10]  # also works with a string axis
1.0
>>> v.take(('a',10))  # multi-dimensional, tuple
1.0
>>> v.take({'d0':'a', 'd1':10})  # dict-like arguments
1.0
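
The equivalent call forms above (scalar plus axis=, tuple, dict) all reduce to one indexer per dimension. A minimal plain-Python sketch of that normalization follows. `normalize_indices` is a hypothetical helper written for illustration, not dimarray's internal code.

```python
def normalize_indices(indices, dims, axis=0):
    """Expand `indices` into one indexer per dimension (hypothetical helper).

    `indices` may be a dict {axis name: value}, a tuple (one item per
    leading dimension), or a single indexer applied along `axis`.
    """
    full = [slice(None)] * len(dims)
    if isinstance(indices, dict):
        for name, ix in indices.items():
            full[dims.index(name)] = ix
    elif isinstance(indices, tuple):
        for i, ix in enumerate(indices):
            full[i] = ix
    else:
        # `axis` is given either by name (str) or by position (int)
        pos = dims.index(axis) if isinstance(axis, str) else axis
        full[pos] = indices
    return tuple(full)


dims = ['d0', 'd1']
by_position = normalize_indices(10, dims, axis=1)       # like v.take(10, axis=1)
by_name = normalize_indices(10, dims, axis='d1')        # like v.take(10, axis='d1')
by_dict = normalize_indices({'d1': 10}, dims)           # like v.take({'d1': 10})
by_tuple = normalize_indices(('a', 10), dims)           # like v.take(('a', 10))
print(by_position == by_name == by_dict)
```

The first three forms give the same indexer, a full slice along d0 and the label 10 along d1, which is why a, b, c and d compare equal in the example above.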

Take a list of indices

>>> a = v[:,[10,20]] # also works with a list of indices
>>> a
dimarray: 4 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
1 / d1 (2): 10.0 to 20.0
array([[ 1.,  2.],
       [ 4.,  5.]])
>>> b = v.take([10,20], axis='d1')
>>> np.all(a == b)
True

Take a slice:

>>> c = v[:,10:20] # axis values: slice includes last element
>>> c
dimarray: 4 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
1 / d1 (2): 10.0 to 20.0
array([[ 1.,  2.],
       [ 4.,  5.]])
>>> d = v.take(slice(10,20), axis='d1') # `take` accepts `slice` objects
>>> np.all(c == d)
True
>>> v.ix[:,0:1] # integer position: does *not* include last element
dimarray: 2 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
1 / d1 (1): 10.0 to 10.0
array([[ 1.],
       [ 4.]])
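
The two slicing conventions above can be mimicked on a plain Python list of axis labels. This is only a sketch of the semantics, not dimarray's implementation: `label_slice_to_positions` is a hypothetical helper, and it assumes both slice bounds are present on the axis.

```python
def label_slice_to_positions(labels, start, stop):
    """Convert an inclusive label slice into a standard position slice.

    Hypothetical helper: assumes `start` and `stop` are both in `labels`.
    """
    first = labels.index(start)
    last = labels.index(stop)
    # the label-based slice includes `stop`, hence the +1
    return slice(first, last + 1)


d1 = [10.0, 20.0, 30.0]   # axis labels, as in the example above
row = [1.0, 2.0, 3.0]     # values along that axis

by_label = row[label_slice_to_positions(d1, 10, 20)]   # like v[:, 10:20]
by_position = row[0:1]                                 # like v.ix[:, 0:1]
print(by_label, by_position)   # [1.0, 2.0] [1.0]
```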

Keep dimensions

>>> a = v[["a"]]
>>> b = v.take("a",keepdims=True)
>>> np.all(a == b)
True

tolerance parameter to achieve “nearest neighbour” search

>>> v.take(12, axis="d1", tol=5)
dimarray: 2 non-null elements (0 null)
0 / d0 (2): 'a' to 'b'
array([ 1.,  4.])
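
A minimal sketch of what a tolerance-based lookup does, on a plain list of labels. `locate` is a hypothetical helper, and accepting a label at a distance exactly equal to tol is an assumption of this sketch, not something the text above specifies.

```python
def locate(labels, value, tol=None):
    """Return the position of `value` in `labels` (hypothetical helper).

    With tol=None an exact match is required. Otherwise the nearest label
    is accepted if it lies within `tol` of `value`.
    """
    if tol is None:
        return labels.index(value)   # raises ValueError if absent
    # nearest neighbour search
    pos = min(range(len(labels)), key=lambda i: abs(labels[i] - value))
    if abs(labels[pos] - value) > tol:
        raise ValueError("no label within %s of %s" % (tol, value))
    return pos


d1 = [10.0, 20.0, 30.0]
pos = locate(d1, 12, tol=5)   # 12 is within 5 of the label 10.0
print(pos)                    # 0
```

Without tol, the same lookup of 12 fails because 12 is not an axis label.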

Matlab-like multi-indexing

>>> v = DimArray(np.arange(2*3*4).reshape(2,3,4))
>>> v[[0,1],:,[0,0,0]].shape
(2, 3, 3)
>>> v[[0,1],:,[0,0]].shape # here broadcast = False
(2, 3, 2)
>>> v.take(([0,1],slice(None),[0,0]), broadcast=True).shape # that is traditional numpy, with broadcasting on same shape
(2, 3)
>>> v.values[[0,1],:,[0,0]].shape # a proof of it
(2, 3)
>>> a = DimArray(np.arange(2*3).reshape(2,3))
>>> a[a > 3] # FULL ARRAY: boolean mask over the whole array, returns a 1-D DimArray along the grouped axis x0,x1
dimarray: 2 non-null elements (0 null)
0 / x0,x1 (2): (1, 1) to (1, 2)
array([4, 5])
>>> a[a.x0 > 0] # SINGLE AXIS: only first axis
dimarray: 3 non-null elements (0 null)
0 / x0 (1): 1 to 1
1 / x1 (3): 0 to 2
array([[3, 4, 5]])
>>> a[:, a.x1 > 0] # only second axis 
dimarray: 4 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (2): 1 to 2
array([[1, 2],
       [4, 5]])
>>> a[a.x0 > 0, a.x1 > 0]
dimarray: 2 non-null elements (0 null)
0 / x0 (1): 1 to 1
1 / x1 (2): 1 to 2
array([[4, 5]])

Sample points along a path, a la numpy, with broadcast=True

>>> a.take(([0,0,1],[1,2,2]), broadcast=True)
dimarray: 3 non-null elements (0 null)
0 / x0,x1 (3): (0, 1) to (1, 2)
array([1, 2, 5])
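
The difference between the default (Matlab-like) indexing and broadcast=True can be reproduced on plain numpy arrays. The sketch below uses only numpy: np.ix_ stands in for the default behaviour here, which is an illustration of the semantics and not a statement about how dimarray implements it.

```python
import numpy as np

values = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Orthogonal ("Matlab-like") indexing: each index list selects along its own
# axis. np.ix_ builds an open mesh so the lists are not broadcast together.
orthogonal = values[np.ix_([0, 1], [0, 1, 2], [0, 0])]

# Plain numpy fancy indexing: the index lists are broadcast against each
# other and walked through together, i.e. elements (0, :, 0) and (1, :, 0).
broadcast = values[[0, 1], :, [0, 0]]

print(orthogonal.shape)   # (2, 3, 2)
print(broadcast.shape)    # (2, 3)

# Sampling points along a path is the broadcast case in two dimensions
a = np.arange(2 * 3).reshape(2, 3)
path = a[[0, 0, 1], [1, 2, 2]]
print(path)               # [1 2 5]
```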

Ellipsis (only one supported)

>>> a = DimArray(np.arange(2*3*4*5).reshape(2,3,4,5))
>>> a[0,...,0].shape
(3, 4)
>>> a[...,0,0].shape
(2, 3)
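
An Ellipsis stands for as many full slices as are needed to reach the number of dimensions. A plain-Python sketch of that expansion follows. `expand_ellipsis` is a hypothetical helper, and the error raised for a second Ellipsis is an assumption chosen to mirror the "only one supported" restriction.

```python
def expand_ellipsis(index, ndim):
    """Replace a single Ellipsis in `index` by full slices (hypothetical helper)."""
    index = tuple(index)
    positions = [i for i, ix in enumerate(index) if ix is Ellipsis]
    if len(positions) > 1:
        raise IndexError("only one Ellipsis is supported")
    if not positions:
        # no Ellipsis: pad trailing dimensions with full slices
        return index + (slice(None),) * (ndim - len(index))
    pos = positions[0]
    n_missing = ndim - (len(index) - 1)
    return index[:pos] + (slice(None),) * n_missing + index[pos + 1:]


full = expand_ellipsis((0, Ellipsis, 0), ndim=4)
print(full)   # (0, slice(None, None, None), slice(None, None, None), 0)
```

With four dimensions, a[0,...,0] is therefore the same as a[0,:,:,0], which leaves the two middle dimensions of sizes 3 and 4.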

DimArray.put()

Modify values of a DimArray

Parameters:

indices : int or list or slice (single-dimensional indices)

or a tuple of those (multi-dimensional) or dict of { axis name : axis values }

axis : None or int or str, optional

if specified and indices is a slice, scalar or an array, assumes indexing is along this axis.

indexing : {‘label’, ‘position’}, optional

Indexing mode.
- “label”: indexing on axis labels (default)
- “position”: use numpy-like position index
Default value can be changed in dimarray.rcParams[‘indexing.by’]

tol : None or float or tuple or dict, optional

tolerance when looking for numerical values, e.g. to use nearest neighbor search, default None.

broadcast : bool, optional

if True, use numpy-like fancy indexing and broadcast any indexing array to a common shape, useful for example to sample points along a path. Defaults to False.

Returns:

None (inplace=True) or DimArray instance or scalar (inplace=False)

See also

DimArray.take, DimArrayOnDisk.write

Re-indexing

DimArray.reset_axis(values=None, axis=0, **kwargs)[source]

DimArray.reindex_axis(values, axis=0, fill_value=nan, raise_error=False, method=None)

reindex an array along an axis

Parameters:

values : array-like or Axis

new axis values

axis : int or str, optional

axis number or name

fill_value : scalar, optional

Value used to fill entries whose axis value is missing, if raise_error is False (default: nan).

raise_error : bool, optional

if True, raise an error when an axis value is not present; otherwise just replace with fill_value. Default is False.

method : {None, ‘left’, ‘right’}

method to fill the gaps (default None). If ‘left’ or ‘right’, it is passed along to numpy.searchsorted.

Returns:

dimarray: DimArray instance

Examples

Basic reindexing: fill missing values with NaN

>>> import dimarray as da
>>> a = da.DimArray([1,2,3],axes=[('x0', [1,2,3])])
>>> b = da.DimArray([3,4],axes=[('x0',[1,3])])
>>> b.reindex_axis([1,2,3])
dimarray: 2 non-null elements (1 null)
0 / x0 (3): 1 to 3
array([  3.,  nan,   4.])

Or replace with anything else, like -9999

>>> b.reindex_axis([1,2,3], fill_value=-9999)
dimarray: 3 non-null elements (0 null)
0 / x0 (3): 1 to 3
array([    3, -9999,     4])
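
The roles of fill_value and raise_error can be sketched on plain lists. `reindex` below is a hypothetical 1-D helper illustrating the semantics described above. It is not dimarray's implementation and it ignores the method= parameter.

```python
def reindex(values, labels, new_labels, fill_value=float("nan"), raise_error=False):
    """Return values re-arranged to match `new_labels` (hypothetical helper)."""
    lookup = dict(zip(labels, values))
    out = []
    for label in new_labels:
        if label in lookup:
            out.append(lookup[label])
        elif raise_error:
            raise KeyError("axis value not found: %r" % (label,))
        else:
            # axis value missing from the original axis
            out.append(fill_value)
    return out


filled = reindex([3, 4], labels=[1, 3], new_labels=[1, 2, 3], fill_value=-9999)
print(filled)   # [3, -9999, 4]
```

Unlike this sketch, the examples above show that filling with NaN turns the result into a float array.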

DimArray.reindex_like(other, **kwargs)

reindex_like : re-index like another dimarray / axes instance

Applies reindex_axis on each axis to match another DimArray

Parameters:

other : DimArray or Axes instance

**kwargs :

Returns:

DimArray

Notes

only reindex axes which are present in other

Examples

>>> import dimarray as da
>>> b = da.DimArray([3,4],('x0',[1,3]))
>>> c = da.DimArray([[1,2,3], [1,2,3]],[('x1',["a","b"]),('x0',[1, 2, 3])])
>>> b.reindex_like(c)
dimarray: 2 non-null elements (1 null)
0 / x0 (3): 1 to 3
array([  3.,  nan,   4.])

DimArray.sort_axis(a, axis=0, key=None, kind='quicksort')

sort an axis

Parameters:

a : DimArray (this argument is pre-assigned when using as bound method)

axis : int or str, optional

axis by position (int) or name (str) (default: 0)

key : callable or dict-like, optional

function that is called on each axis label and whose return value is used for sorting instead of axis label. Any other object with __getitem__ attribute may also be used as key, such as a dictionary. If None (the default), axis label is used for sorting.

kind : str, optional

sort algorithm (see numpy.sort for more info)

Returns:

sorted : new DimArray with sorted axis

Examples

Basic

>>> from dimarray import DimArray
>>> a = DimArray([10,20,30], labels=[2, 0, 1])
>>> a
dimarray: 3 non-null elements (0 null)
0 / x0 (3): 2 to 1
array([10, 20, 30])
>>> a.sort_axis()
dimarray: 3 non-null elements (0 null)
0 / x0 (3): 0 to 2
array([20, 30, 10])
>>> a.sort_axis(key=lambda x: -x)
dimarray: 3 non-null elements (0 null)
0 / x0 (3): 2 to 0
array([10, 30, 20])

Multi-dimensional

>>> a = DimArray([[10,20,30],[40,50,60]], labels=[[0, 1], ['a','c','b']])
>>> a.sort_axis(axis=1)
dimarray: 6 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (3): 'a' to 'c'
array([[10, 30, 20],
       [40, 60, 50]])
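
How key= affects the ordering can be sketched on plain lists. `sort_axis` below is a hypothetical 1-D helper, not the library function. It accepts either a callable or a dict-like key, as described in the parameters above.

```python
def sort_axis(values, labels, key=None):
    """Sort `values` by their axis `labels` (hypothetical 1-D helper).

    `key` may be a callable or any object supporting key[label], e.g. a dict.
    """
    if key is None:
        sort_key = lambda label: label
    elif callable(key):
        sort_key = key
    else:
        sort_key = lambda label: key[label]
    order = sorted(range(len(labels)), key=lambda i: sort_key(labels[i]))
    return [values[i] for i in order], [labels[i] for i in order]


values, labels = [10, 20, 30], [2, 0, 1]
plain = sort_axis(values, labels)
reverse = sort_axis(values, labels, key=lambda x: -x)
custom = sort_axis(values, labels, key={2: 'b', 0: 'c', 1: 'a'})
print(plain)     # ([20, 30, 10], [0, 1, 2])
print(reverse)   # ([10, 30, 20], [2, 1, 0])
print(custom)    # ([30, 10, 20], [1, 2, 0])
```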

Missing values

DimArray.dropna(axis=0, minvalid=None, na=nan)

drop nans along an axis

Parameters:

axis : axis position or name or list of names

minvalid : int, optional

minimum number of valid points required in each slice along the axis; by default, all the points must be valid

Returns:

DimArray

Examples

1-Dimension

>>> import numpy as np
>>> from dimarray import DimArray
>>> a = DimArray([1.,2,3],('time',[1950, 1955, 1960]))
>>> a.ix[1] = np.nan
>>> a
dimarray: 2 non-null elements (1 null)
0 / time (3): 1950 to 1960
array([  1.,  nan,   3.])
>>> a.dropna()
dimarray: 2 non-null elements (0 null)
0 / time (2): 1950 to 1960
array([ 1.,  3.])

Multi-dimensional

>>> a = DimArray([[ np.nan, 2., 3.],[ np.nan, 5., np.nan]])
>>> a
dimarray: 3 non-null elements (3 null)
0 / x0 (2): 0 to 1
1 / x1 (3): 0 to 2
array([[ nan,   2.,   3.],
       [ nan,   5.,  nan]])
>>> a.dropna(axis=1)
dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (1): 1 to 1
array([[ 2.],
       [ 5.]])
>>> a.dropna(axis=1, minvalid=1)  # minimum number of valid values, equivalent to `how="all"` in pandas
dimarray: 3 non-null elements (1 null)
0 / x0 (2): 0 to 1
1 / x1 (2): 1 to 2
array([[  2.,   3.],
       [  5.,  nan]])
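
The effect of minvalid can be sketched on a nested list. `dropna_columns` is a hypothetical helper mirroring dropna(axis=1) for the 2-D example above. It illustrates the rule only and is not dimarray's code.

```python
import math


def dropna_columns(rows, minvalid=None):
    """Drop columns of a 2-D list holding too few valid (non-NaN) values.

    Hypothetical helper. With minvalid=None a column is kept only if all
    of its values are valid.
    """
    if minvalid is None:
        minvalid = len(rows)
    keep = []
    for j in range(len(rows[0])):
        nvalid = sum(1 for row in rows if not math.isnan(row[j]))
        if nvalid >= minvalid:
            keep.append(j)
    return [[row[j] for j in keep] for row in rows], keep


nan = float("nan")
a = [[nan, 2.0, 3.0],
     [nan, 5.0, nan]]

strict, kept_strict = dropna_columns(a)             # all values must be valid
loose, kept_loose = dropna_columns(a, minvalid=1)   # at least one valid value
print(kept_strict, kept_loose)                      # [1] [1, 2]
```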

DimArray.fillna(value, inplace=False, na=nan)

Fill NaN with a replacement value

Examples

>>> import numpy as np
>>> from dimarray import DimArray
>>> a = DimArray([1,2,np.nan])
>>> a.fillna(-99)
dimarray: 3 non-null elements (0 null)
0 / x0 (3): 0 to 2
array([  1.,   2., -99.])

DimArray.setna(value, na=nan, inplace=False)

set a value as missing

Parameters:

value : the values to set to na

na : the replacement value (default np.nan)

Examples

>>> from dimarray import DimArray
>>> a = DimArray([1,2,-99])
>>> a.setna(-99)
dimarray: 2 non-null elements (1 null)
0 / x0 (3): 0 to 2
array([  1.,   2.,  nan])
>>> a.setna([-99, 2]) # sequence
dimarray: 1 non-null elements (2 null)
0 / x0 (3): 0 to 2
array([  1.,  nan,  nan])
>>> a.setna(a > 1) # boolean
dimarray: 2 non-null elements (1 null)
0 / x0 (3): 0 to 2
array([  1.,  nan, -99.])
>>> a = DimArray([[1,2,-99]])  # multi-dim
>>> a.setna([-99, a>1])  # boolean
dimarray: 1 non-null elements (2 null)
0 / x0 (1): 0 to 0
1 / x1 (3): 0 to 2
array([[  1.,  nan,  nan]])
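
setna and fillna are inverse operations for a given sentinel value. A plain-list sketch follows. Both functions are hypothetical stand-ins for the methods above, and they cover only the scalar and sequence forms of `value`, not boolean masks.

```python
import math


def setna(values, value, na=float("nan")):
    """Replace entries equal to `value` (or to any item of a list) with `na`."""
    targets = value if isinstance(value, (list, tuple)) else [value]
    return [na if v in targets else v for v in values]


def fillna(values, value):
    """Replace NaN entries with `value`."""
    return [value if isinstance(v, float) and math.isnan(v) else v for v in values]


masked = setna([1, 2, -99], -99)   # the sentinel -99 becomes NaN
restored = fillna(masked, -99)     # and NaN goes back to -99
print(restored)                    # [1, 2, -99]
```

Unlike this sketch, the examples above show DimArray converting integer data to float when NaN is inserted.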

To / From other objects

classmethod DimArray.from_pandas(data, dims=None)[source]

Initialize a DimArray from pandas

Parameters:

data : pandas object (Series, DataFrame, Panel, Panel4D)

dims : dimension (axis) names, optional. If omitted, taken from ax.name for ax in data.axes

Returns:

a : DimArray instance

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from dimarray import DimArray
>>> s = pd.Series([3,5,6], index=['a','b','c'])
>>> s.index.name = 'dim0'
>>> DimArray.from_pandas(s)
dimarray: 3 non-null elements (0 null)
0 / dim0 (3): 'a' to 'c'
array([3, 5, 6])

Also works with a MultiIndex

>>> panel = pd.Panel(np.arange(2*3*4).reshape(2,3,4))
>>> b = panel.to_frame() # pandas' method to convert Panel to DataFrame via MultiIndex
>>> DimArray.from_pandas(b)    
dimarray: 24 non-null elements (0 null)
0 / major,minor (12): (0, 0) to (2, 3)
1 / x1 (2): 0 to 1
...  

DimArray.to_pandas()[source]

return the equivalent pandas object


DimArray.to_larry()[source]

return the equivalent larry object


DimArray.to_dataset(axis=0)[source]

split a DimArray into a Dataset object (collection of DimArrays)

I/O

DimArray.write_nc(f, name=None, mode='w', clobber=None, format=None, *args, **kwargs)[source]

Write to netCDF

Parameters:

f : file name

name : variable name, optional

must be provided if no attribute “name” is defined

mode, clobber, format : see netCDF4.Dataset

**kwargs : passed to netCDF4.Dataset.createVariable (e.g. compression)

See also

DatasetOnDisk


Plotting

DimArray.plot(*args, **kwargs)

Plot 1-D or 2-D data.

Wraps matplotlib’s plot()

Parameters:

*args, **kwargs : passed to matplotlib.pyplot.plot

legend : True (default) or False

Display legend for 2-D data.

ax : matplotlib Axes instance, optional

Axes on which to draw the plot.

Returns:

lines : list of matplotlib Line2D instances

Examples

>>> import numpy as np
>>> from dimarray import DimArray
>>> data = DimArray(np.random.rand(4,3), axes=[np.arange(4), ['a','b','c']], dims=['distance', 'label'])
>>> data.axes[0].units = 'meters'
>>> h = data.plot(linewidth=2)
>>> h = data.T.plot(linestyle='-.')
>>> h = data.plot(linestyle='-.', legend=False)

DimArray.pcolor(*args, **kwargs)

Plot a quadrilateral mesh.

Wraps matplotlib pcolormesh(). See pcolormesh documentation in matplotlib for accepted keyword arguments.

Examples

>>> import numpy as np
>>> from dimarray import DimArray
>>> x = DimArray(np.zeros([100,40]))
>>> x.pcolor() 
>>> x.T.pcolor() # to flip horizontal/vertical axes  

DimArray.contourf(*args, **kwargs)

Plot filled 2-D contours.

Wraps matplotlib contourf(). See contourf documentation in matplotlib for accepted keyword arguments.

Examples

>>> import numpy as np
>>> from dimarray import DimArray
>>> x = DimArray(np.zeros([100,40])) 
>>> x[:50,:20] = 1.
>>> x.contourf() 
>>> x.T.contourf() # to flip horizontal/vertical axes  

DimArray.contour(*args, **kwargs)

Plot 2-D contours.

Wraps matplotlib contour(). See contour documentation in matplotlib for accepted keyword arguments.

Examples

>>> import numpy as np
>>> from dimarray import DimArray
>>> x = DimArray(np.zeros([100,40])) 
>>> x[:50,:20] = 1.
>>> x.contour() 
>>> x.T.contour() # to flip horizontal/vertical axes