API: read_csv should default to ignore_index=True when no index_col

xref https://github.com/pydata/pandas/issues/12618

```
In [13]: import dask

In [14]: dask.__version__
Out[14]: '0.8.0'

In [15]: pd.__version__
Out[15]: '0.18.0+7.g17a5982'
```

```
In [4]: import glob

In [5]: DataFrame(np.arange(10).reshape(-1,2)).to_csv('file1.csv')

In [6]: DataFrame(np.arange(10).reshape(-1,2)).to_csv('file2.csv')

In [7]: pd.concat([ pd.read_csv(f, index_col=0) for f in glob.glob('file*.csv') ], ignore_index=True)
Out[7]: 
   0  1
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
5  0  1
6  2  3
7  4  5
8  6  7
9  8  9

In [11]: import dask.dataframe as dd

In [12]: dd.read_csv('file*.csv', header=0, usecols=[1,2]).compute()
Out[12]: 
   0  1
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9
```
- the concat of the per-file frames should use `ignore_index=True`; otherwise you are implicitly setting the index, and each file's labels (0-4 here) repeat in the result
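
For reference, the difference can be reproduced with pandas alone. This is only a sketch of the behaviour being asked for, not of how dask builds its partitions; the in-memory `csv_text` buffer is an assumption that stands in for `file1.csv` and `file2.csv`.

```python
import io

import numpy as np
import pandas as pd

# Serialise the same 5x2 frame used in the session above to CSV text.
# (An in-memory buffer stands in for file1.csv / file2.csv.)
csv_text = pd.DataFrame(np.arange(10).reshape(-1, 2)).to_csv()

# Read it twice, once per "file".
frames = [pd.read_csv(io.StringIO(csv_text), index_col=0) for _ in range(2)]

kept = pd.concat(frames)                      # each frame keeps its own 0..4 labels
fresh = pd.concat(frames, ignore_index=True)  # labels are rebuilt as 0..9

print(kept.index.tolist())
print(fresh.index.tolist())
```

`kept` matches what `dd.read_csv(...).compute()` returns in `Out[12]`, and `fresh` matches the pandas result in `Out[7]`.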

Side issue (maybe an example in the docs would help): how to ignore an index column that is stored in the file.
- I'd really like to do:
  
  `dd.read_csv('file*.csv', header=0, index_col=0).compute()` 

(see the issue linked above). Without `index_col` support, though, I have to call `.set_index(...)` afterwards on a dummy column.

I know I have an index, but I currently have to read it in as an ordinary named column:

```
In [17]: dd.read_csv('file*.csv', header=0, names=['col',0,1]).compute()
Out[17]: 
   col  0  1
0    0  0  1
1    1  2  3
2    2  4  5
3    3  6  7
4    4  8  9
0    0  0  1
1    1  2  3
2    2  4  5
3    3  6  7
4    4  8  9
```

Then drop that column, or call `.reset_index`.
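
A rough pandas-only sketch of the two follow-up options, assuming the same column names as in `In [17]`. The `read_part` helper and the in-memory buffer are hypothetical stand-ins for reading one file; the dask calls themselves are not exercised here.

```python
import io

import numpy as np
import pandas as pd

csv_text = pd.DataFrame(np.arange(10).reshape(-1, 2)).to_csv()


def read_part():
    # Hypothetical helper: reads one "file", naming the stored index column 'col'.
    return pd.read_csv(io.StringIO(csv_text), header=0, names=["col", 0, 1])


# Two parts concatenated without ignore_index, as in Out[17].
combined = pd.concat([read_part(), read_part()])

# Option 1: promote the stored index column back to the index.
indexed = combined.set_index("col")

# Option 2: discard the stored index column and renumber the rows.
renumbered = combined.drop(columns="col").reset_index(drop=True)

print(indexed.index.tolist())
print(renumbered.index.tolist())
```

One caveat: as far as I understand the dask docs, `reset_index` on a dask dataframe restarts at 0 within each partition, so option 2 would not by itself give a 0..9 index there.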

