Taking median (np.ma.median) of a list of masked arrays.

When I take the median of a list of masked arrays with `np.ma.median`, the masks are ignored and the result is the median of the underlying (unmasked) data. An example is below:

```python
In [1]: import numpy as np
In [2]: data1 = np.array([[1,2,3,4],[5,6,7,8]])
In [3]: masked1 = np.ma.masked_where(data1 == 4, data1)
In [4]: data2 = np.array([[8,7,6,5],[4,3,2,1]])
In [5]: masked2 = np.ma.masked_where(data2==4,data2)
In [8]: masked_list = [masked1,masked2]
In [9]: masked_list
Out[9]: 
[masked_array(data =
  [[1 2 3 --]
  [5 6 7 8]],
              mask =
  [[False False False  True]
  [False False False False]],
        fill_value = 999999), masked_array(data =
  [[8 7 6 5]
  [-- 3 2 1]],
              mask =
  [[False False False False]
  [ True False False False]],
        fill_value = 999999)]
In [10]: np.ma.median(masked_list,axis=0).data
Out[10]: 
array([[ 4.5,  4.5,  4.5,  4.5],
       [ 4.5,  4.5,  4.5,  4.5]])
```
However, the output of `np.ma.median(masked_list,axis=0).data` should actually be:
```
array([[ 4.5,  4.5,  4.5,  5. ],
       [ 5. ,  4.5,  4.5,  4.5]])
```
because `np.ma.median` should ignore the data values that are masked.

I have since found a workaround that works for me: create the mask *after* combining the data into a single array, e.g.:
```python
In [16]: dataarray = np.array([data1,data2])
In [19]: maskedarray = np.ma.masked_where(dataarray==4,dataarray)
In [20]: maskedarray
Out[20]: 
masked_array(data =
 [[[1 2 3 --]
  [5 6 7 8]]

 [[8 7 6 5]
  [-- 3 2 1]]],
             mask =
 [[[False False False  True]
  [False False False False]]

 [[False False False False]
  [ True False False False]]],
       fill_value = 999999)
In [21]: np.ma.median(maskedarray,axis=0).data
Out[21]: 
array([[ 4.5,  4.5,  4.5,  5. ],
       [ 5. ,  4.5,  4.5,  4.5]])
```
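When the masks already exist on the individual arrays and cannot easily be rebuilt from a condition, the same idea can be applied by stacking the data and the masks separately. The following is only a sketch: `stack_masked` is a hypothetical helper name (not a NumPy function), and it assumes all arrays in the list have the same shape.

```python
import numpy as np

def stack_masked(arrays, axis=0):
    """Stack masked arrays into one masked array, keeping every mask.

    Hypothetical helper, not part of NumPy. Assumes all inputs share a shape.
    """
    # Stack the raw data and the boolean masks separately...
    data = np.stack([np.ma.getdata(a) for a in arrays], axis=axis)
    mask = np.stack([np.ma.getmaskarray(a) for a in arrays], axis=axis)
    # ...then recombine them into a single masked array.
    return np.ma.masked_array(data, mask=mask)

data1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
data2 = np.array([[8, 7, 6, 5], [4, 3, 2, 1]])
masked1 = np.ma.masked_where(data1 == 4, data1)
masked2 = np.ma.masked_where(data2 == 4, data2)

stacked = stack_masked([masked1, masked2])
result = np.ma.median(stacked, axis=0)
print(np.ma.getdata(result))
```

Because the median is taken over a single masked array here, the masked 4s are skipped and the values should match `Out[21]` above.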

This was a bug in my code for years (ah!). Because no error was raised, I never knew that the masks were being ignored when I took the medians of the data.

Is there a way to change the `np.ma.median` code so that taking the median of a list of masked arrays *at least raises an error*, so that future coders know that it is not working the way they assume it will?
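To illustrate the kind of check I have in mind, here is a rough sketch written as a wrapper around `np.ma.median`. The name `strict_ma_median` and the choice of `TypeError` are my own assumptions for illustration; this is not how NumPy behaves or is implemented.

```python
import numpy as np

def strict_ma_median(a, axis=None):
    """Hypothetical wrapper: refuse plain lists/tuples that hold masked arrays."""
    if isinstance(a, (list, tuple)) and any(
        isinstance(x, np.ma.MaskedArray) for x in a
    ):
        raise TypeError(
            "got a list/tuple of masked arrays; combine them into a single "
            "masked array first so that the masks are not silently dropped"
        )
    return np.ma.median(a, axis=axis)

values = np.array([1, 4, 7])
masked = np.ma.masked_where(values == 4, values)

# A single masked array passes through to np.ma.median unchanged.
print(strict_ma_median(masked))

# A list of masked arrays is rejected instead of silently ignoring the masks.
try:
    strict_ma_median([masked, masked], axis=0)
except TypeError as exc:
    print("refused:", exc)
```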

Thanks all!

Versions:
numpy: 1.11.2/1.11.3
python: 2.7.12/2.7.13
on MacOSX
