[Feat] Update db.in.ogr manual to improve import CSV files #4674

@cmbarton

The problem

CSV is one of the most widely used formats for exchanging tabular data across platforms. However, it is difficult to import CSV files into GRASS. Currently, db.in.ogr imports CSV files but converts all columns to text. In the current manual, the suggested workaround is to create an accompanying *.csvt file that specifies the data type of each column. This is very cumbersome, especially for files with many columns. However, OGR will automatically detect column data types if the following open option is set:
-oo AUTODETECT_TYPE=YES
This makes importing CSV files much easier. The option can be entered manually through the gdal_doo option of db.in.ogr, but this is not mentioned in the current manual.
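
One way to preview what the open option does before importing is to run ogrinfo on the file. The sketch below assumes GDAL's ogrinfo is on the PATH. The file sample_types.csv and its columns are made up for illustration and are not part of any GRASS sample data.

```shell
# Create a small, hypothetical CSV file (names and values are illustrative only).
cat > sample_types.csv <<'EOF'
gridcode,label,area_km2
11,Af,1250.5
12,Am,830.25
EOF

# Print the layer summary with type autodetection enabled, if ogrinfo is available.
# -al lists all layers, -so limits the output to the summary (schema, feature count).
if command -v ogrinfo >/dev/null 2>&1; then
    ogrinfo -so -al -oo AUTODETECT_TYPE=YES sample_types.csv
else
    echo "ogrinfo not found; skipping the preview"
fi
```

With the option set, the numeric columns should be reported with numeric types rather than String. Running the same command without -oo shows the all-text behavior described above.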

Proposed solution

While -oo AUTODETECT_TYPE=YES should be the default for importing a CSV file using db.in.ogr (see #4593), until that change can be made it would be very helpful for the manual to describe how this option can be used as a workaround. The information in the current manual about using a *.csvt file should also be retained for finer manual control. I propose the manual be updated as follows:

Current:
Import CSV file

Limited type recognition can be done for Integer, Real, String, Date, Time and DateTime columns through a descriptive file with same name as the CSV file, but .csvt extension (see details here).

NOTE: create koeppen_gridcode.csvt first for automated type recognition

db.in.ogr input=koeppen_gridcode.csv output=koeppen_gridcode
db.select table=koeppen_gridcode

New:
Import CSV file

db.in.ogr will attempt to detect column data types automatically if the gdal_doo option is set to AUTODETECT_TYPE=YES.

db.in.ogr input=koeppen_gridcode.csv output=koeppen_gridcode gdal_doo=AUTODETECT_TYPE=YES
db.select table=koeppen_gridcode

Users can also specify data types for CSV columns using a type definition file with the same name as the CSV file but a .csvt extension (see details here). Columns can be defined as Integer, Real, String, Date, Time and DateTime in this way.

NOTE: create koeppen_gridcode.csvt first for automated type recognition

db.in.ogr input=koeppen_gridcode.csv output=koeppen_gridcode
db.select table=koeppen_gridcode
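
To illustrate the *.csvt approach, here is a minimal sketch of creating the type definition file from the shell. The column layout is an assumption: the actual columns of koeppen_gridcode.csv are not listed in this issue, so a hypothetical two-column file (an integer grid code followed by a text label) is used.

```shell
# Hypothetical layout: column 1 is an integer code, column 2 is a text label.
# A .csvt file is a single line with one quoted type per CSV column, in column order.
printf '"Integer","String"\n' > koeppen_gridcode.csvt

# Show the result so it can be checked against the CSV header.
cat koeppen_gridcode.csvt
```

The number of entries should match the number of columns in the CSV file, which is why this approach becomes tedious for wide tables.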
