What to do about `str` columns in `SolutionArray` #896

**Problem description**



The current implementation of `SolutionArray` allows `extra` columns containing `str` (see discussion in #838), which may or may not be an intended use case: built-in attributes are almost exclusively numeric, with the notable exception of `state_of_matter`. Other parts of the implementation, e.g. CSV import and HDF export/import, did not anticipate non-numeric data, so `str` support needs to be either deprecated/disabled or implemented consistently.

One interpretation is that `str` input was never intended but is also not explicitly checked for, i.e. it just happens to be supported by `numpy` (similar to sequences not being checked, see #895). In this case, the most appropriate resolution may be to catch and deprecate `str` columns, while warning that they are not fully supported for CSV/HDF.
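
As a rough illustration of the "catch and deprecate" option, the check could look something like the sketch below. The helper `check_extra_column` is hypothetical and not part of Cantera. It only assumes that extra columns end up as `numpy` arrays, and it flags unicode (`'U'`) and byte-string (`'S'`) dtypes. Object arrays are not inspected here.

```python
import warnings

import numpy as np


def check_extra_column(name, values):
    """Hypothetical helper: convert an extra column to an array and
    warn if it holds string data."""
    arr = np.asarray(values)
    # dtype.kind is 'U' for unicode strings and 'S' for byte strings
    if arr.dtype.kind in ("U", "S"):
        warnings.warn(
            "Extra column '{}' contains str data, which is not fully "
            "supported for CSV/HDF and may be removed.".format(name),
            DeprecationWarning,
            stacklevel=2,
        )
    return arr


# Usage: record warnings so that they can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    spam = check_extra_column("spam", ["eggs", "eggs", "eggs"])
    temperature = check_extra_column("T", [300.0, 301.0])

print(len(caught))
```

Only the `spam` column triggers a warning, while numeric columns pass through unchanged.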

**Steps to reproduce**


```
In [1]: import cantera as ct
    ...: gas = ct.Solution('h2o2.yaml')
    ...: arr = ct.SolutionArray(gas, 3, extra={'spam': 'eggs'})
    ...: 

In [2]: arr._extra
Out[2]: OrderedDict([('spam', ['eggs', 'eggs', 'eggs'])])

In [3]: arr.write_csv('test.csv')

In [3]: !cat test.csv
spam,T,density,Y_H2,Y_H,Y_O,Y_O2,Y_OH,Y_H2O,Y_HO2,Y_H2O2,Y_AR
eggs,300.0,0.08189392763801234,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
eggs,300.0,0.08189392763801234,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
eggs,300.0,0.08189392763801234,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0

In [4]: arr2 = ct.SolutionArray(gas, extra={'spam'})

In [4]: arr2.read_csv('test.csv')

In [5]: arr2._extra
Out[5]: {'spam': array([ nan,  nan,  nan])}

In [6]: arr.write_hdf('test.h5')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-d708cd336db5> in <module>()
----> 1 arr.write_hdf('test.h5')

/usr/local/lib/python3.6/dist-packages/cantera/composite.py in write_hdf(self, filename, cols, group, subgroup, attrs, mode, append, compression, compression_opts, *args, **kwargs)
   1117                 dgroup.attrs[key] = val
   1118             for header, col in data.items():
-> 1119                 dgroup.create_dataset(header, data=col, **hdf_kwargs)
   1120 
   1121         return group

/home/docker/.local/lib/python3.6/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    134 
    135         with phil:
--> 136             dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
    137             dset = dataset.Dataset(dsid)
    138             if name is not None:

/home/docker/.local/lib/python3.6/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl)
    116         else:
    117             dtype = numpy.dtype(dtype)
--> 118         tid = h5t.py_create(dtype, logical=1)
    119 
    120     # Legacy

h5py/h5t.pyx in h5py.h5t.py_create()

h5py/h5t.pyx in h5py.h5t.py_create()

h5py/h5t.pyx in h5py.h5t.py_create()

TypeError: No conversion path for dtype: dtype('<U4')

In [7]: arr.spam
Out[7]: 
array(['eggs', 'eggs', 'eggs'],
      dtype='<U4')
```

**Behavior**


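The pre-existing `write_csv` supports `str`, the newly introduced `read_csv` fails to import it (the column is read back as `nan`), and `write_hdf` raises a `TypeError` because `h5py` has no conversion path for the `<U4` dtype.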
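
If `str` columns were to be supported consistently instead, the CSV import would need a per-column fallback. The following is a minimal sketch using only the standard library. It is not how `read_csv` is implemented, and `read_columns` is a hypothetical name. Each column is converted to `float` where possible and kept as `str` otherwise.

```python
import csv
import io


def read_columns(text):
    """Hypothetical helper: parse CSV text into a dict of columns.

    Columns whose entries all parse as float become lists of floats;
    any other column is kept as a list of str.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    columns = {}
    for i, name in enumerate(header):
        raw = [row[i] for row in rows]
        try:
            columns[name] = [float(v) for v in raw]
        except ValueError:
            # at least one entry is non-numeric: keep the strings
            columns[name] = raw
    return columns


sample = "spam,T,density\neggs,300.0,0.0819\neggs,300.0,0.0819\n"
cols = read_columns(sample)
print(cols["spam"])
print(cols["T"])
```

With this approach the `spam` column would survive a round trip as strings rather than being turned into `nan`.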

I am filing this as a bug report as this presumably needs to be fixed prior to the release of 2.5.

**System information**

- Cantera version: 2.5.a4
- OS: Ubuntu 18.04
- Python 3.6

**Additional context**


#838, #895
