Skip to content

Conversation

@MLopez-Ibanez
Copy link
Contributor

@MLopez-Ibanez MLopez-Ibanez commented May 4, 2023

See #5634

man/fread.Rd Outdated
\title{ Fast and friendly file finagler }
\description{
Similar to \code{read.table} but faster and more convenient. All controls such as \code{sep}, \code{colClasses} and \code{nrows} are automatically detected.
Similar to \code{\link[utils:read.delim]{read.delim()}} but faster and more convenient. All controls such as \code{sep}, \code{colClasses} and \code{nrows} are automatically detected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I actually don't think this is right.

read.csv() and read.delim() are thin wrappers around read.table(). The former sets sep=",", the latter sep="\t".

However, fread() uses sep="auto" by default, meaning it's more generic than either read.csv() or read.delim(). Thus I think read.table() is the best fit (even though fread()'s auto-detection is not available in read.table()).

Partially, this is splitting hairs -- ?read.table, ?read.csv, and ?read.delim are all the same page.

Suggested change
Similar to \code{\link[utils:read.delim]{read.delim()}} but faster and more convenient. All controls such as \code{sep}, \code{colClasses} and \code{nrows} are automatically detected.
Similar to \code{\link[utils:read.table]{read.table()}} but faster and more convenient. All controls such as \code{sep}, \code{colClasses} and \code{nrows} are automatically detected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've just read the referenced discussion (FFR: use #<issue> in the PR body to make it easier on reviewers -- numbers in the PR title aren't auto-linked).

Maybe the compromise here is "Similar to [read.csv()] or [read.delim()]", that way we convey that it's for delimited files but not implying the full generality of read.table(), WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds OK. Thanks.

@MichaelChirico MichaelChirico changed the title [DOC] fread is similar to read.delim (#5634) fread is similar to read.delim (#5634) Jan 6, 2024
@MichaelChirico MichaelChirico changed the title fread is similar to read.delim (#5634) fread is similar to read.delim Jan 6, 2024
@MichaelChirico MichaelChirico changed the title fread is similar to read.delim Link to ?read.delim in ?fread to give a closer analogue of expected behavior Jan 6, 2024
@MichaelChirico MichaelChirico changed the base branch from master to 1-15-99 January 13, 2024 14:04
@MichaelChirico MichaelChirico merged commit e3f9766 into Rdatatable:1-15-99 Jan 13, 2024
@MichaelChirico
Copy link
Member

Thanks!

MichaelChirico added a commit that referenced this pull request Jan 14, 2024
…ehavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
MichaelChirico added a commit that referenced this pull request Feb 17, 2024
…ehavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
MichaelChirico added a commit that referenced this pull request Feb 18, 2024
…ehavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
MichaelChirico added a commit that referenced this pull request Apr 20, 2024
* 1.15.0 on CRAN. Bump to 1.15.99

* Fix transform slowness (#5493)

* Fix 5492 by limiting the costly deparse to `nlines=1`

* Implementing PR feedbacks

* Added  inside

* Fix typo in name

* Idiomatic use of  inside

* Separating the deparse line limit to a different PR

---------

Co-authored-by: Michael Chirico <[email protected]>

* Improvements to the introductory vignette (#5836)

* Added my improvements to the intro vignette

* Removed two lines I added extra as a mistake earlier

* Requested changes

* Vignette typo patch (#5402)

* fix typos and grammatical mistakes

* fix typos and punctuation

* remove double spaces where it wasn't necessary

* fix typos and adhere to British English spelling

* fix typos

* fix typos

* add missing closing bracket

* fix typos

* review fixes

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review benchmarking

Co-authored-by: Michael Chirico <[email protected]>

* remove unnecessary [ ] from datatable-keys-fast-subset.Rmd

* Update vignettes/datatable-programming.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-reshape.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* One last batch of fine-tuning

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Improved handling of list columns with NULL entries (#4250)

* Updated documentation for rbindlist(fill=TRUE)

* Print NULL entries of list as NULL

* Added news item

* edit NEWS, use '[NULL]' not 'NULL'

* fix test

* split NEWS item

* add example

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* clarify that list input->unnamed list output (#5383)

* clarify that list input->unnamed list output

* Add example where make.names is used

* mention role of make.names

* fix subsetting issue in split.data.table (#5368)

* fix subsetting issue in split.data.table

* add a test

* drop=FALSE on inner [

* switch to 3.2.0 R dep (#5905)

* Allow early exit from check for eval/evalq in cedta (#5660)

* Allow early exit from check for eval/evalq in cedta

Done in the browser+untested, please take a second look :)

* Use %chin%

* nocov new code

* frollmax1: frollmax, frollmax adaptive, left adaptive support (#5889)

* frollmax exact, buggy fast, no fast adaptive

* frollmax fast fixing bugs

* frollmax man to fix CRAN check

* frollmax fast adaptive non NA, dev

* froll docs, adaptive left

* no frollmax fast adaptive

* frollmax adaptive exact NAs handling

* PR summary in news

* copy-edit changes from reviews

Co-authored-by: Benjamin Schwendinger <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* comment requested by Michael

* update NEWS file

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* add comment requested by Michael

* add comment about int iterator for loop over k-1 obs

* extra comments

* Revert "extra comments"

This reverts commit 03af0e3.

* add comments to frollmax and frollsum

* typo fix

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* Friendlier error in assignment with trailing comma (#5467)

* friendlier error in assignment with trailing comma

e.g. `DT[, `:=`(a = 1, b = 2,)`.

WIP. Need to add tests and such, but editing from browser before I forget.

* Another pass

* include unnamed indices on RHS too

* tests

* NEWS

* test numbering

* explicit example in NEWS

* Link to ?read.delim in ?fread to give a closer analogue of expected behavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Run GHA jobs on 1-15-99 dev branch (#5909)

* Make declarations static for covr (#5910)

* class= argument for condition calls

* Unify logic with helper

* Add tests

* Use call.=FALSE where possible

* correct caught class

* strip call=/call.= handling

* botched merge

---------

Co-authored-by: Ofek <[email protected]>
Co-authored-by: Ani <[email protected]>
Co-authored-by: David Budzynski <[email protected]>
Co-authored-by: Scott Ritchie <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: Jan Gorecki <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: Manuel López-Ibáñez <[email protected]>
MichaelChirico added a commit that referenced this pull request Apr 23, 2024
* 1.15.0 on CRAN. Bump to 1.15.99

* Fix transform slowness (#5493)

* Fix 5492 by limiting the costly deparse to `nlines=1`

* Implementing PR feedbacks

* Added  inside

* Fix typo in name

* Idiomatic use of  inside

* Separating the deparse line limit to a different PR

---------

Co-authored-by: Michael Chirico <[email protected]>

* Improvements to the introductory vignette (#5836)

* Added my improvements to the intro vignette

* Removed two lines I added extra as a mistake earlier

* Requested changes

* Vignette typo patch (#5402)

* fix typos and grammatical mistakes

* fix typos and punctuation

* remove double spaces where it wasn't necessary

* fix typos and adhere to British English spelling

* fix typos

* fix typos

* add missing closing bracket

* fix typos

* review fixes

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review benchmarking

Co-authored-by: Michael Chirico <[email protected]>

* remove unnecessary [ ] from datatable-keys-fast-subset.Rmd

* Update vignettes/datatable-programming.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-reshape.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* One last batch of fine-tuning

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Improved handling of list columns with NULL entries (#4250)

* Updated documentation for rbindlist(fill=TRUE)

* Print NULL entries of list as NULL

* Added news item

* edit NEWS, use '[NULL]' not 'NULL'

* fix test

* split NEWS item

* add example

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* clarify that list input->unnamed list output (#5383)

* clarify that list input->unnamed list output

* Add example where make.names is used

* mention role of make.names

* fix subsetting issue in split.data.table (#5368)

* fix subsetting issue in split.data.table

* add a test

* drop=FALSE on inner [

* switch to 3.2.0 R dep (#5905)

* Allow early exit from check for eval/evalq in cedta (#5660)

* Allow early exit from check for eval/evalq in cedta

Done in the browser+untested, please take a second look :)

* Use %chin%

* nocov new code

* frollmax1: frollmax, frollmax adaptive, left adaptive support (#5889)

* frollmax exact, buggy fast, no fast adaptive

* frollmax fast fixing bugs

* frollmax man to fix CRAN check

* frollmax fast adaptive non NA, dev

* froll docs, adaptive left

* no frollmax fast adaptive

* frollmax adaptive exact NAs handling

* PR summary in news

* copy-edit changes from reviews

Co-authored-by: Benjamin Schwendinger <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* comment requested by Michael

* update NEWS file

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* add comment requested by Michael

* add comment about int iterator for loop over k-1 obs

* extra comments

* Revert "extra comments"

This reverts commit 03af0e3.

* add comments to frollmax and frollsum

* typo fix

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* Friendlier error in assignment with trailing comma (#5467)

* friendlier error in assignment with trailing comma

e.g. `DT[, `:=`(a = 1, b = 2,)`.

WIP. Need to add tests and such, but editing from browser before I forget.

* Another pass

* include unnamed indices on RHS too

* tests

* NEWS

* test numbering

* explicit example in NEWS

* Link to ?read.delim in ?fread to give a closer analogue of expected behavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Run GHA jobs on 1-15-99 dev branch (#5909)

* overhauled linter

* revert code changes

* Initial commit of {lintr} approach

* first pass at personalization

* first custom linter

* delint vignettes

* delint tests

* delint R sources

* rm empty

* re-merge

* Move config to .ci directory

* Use endsWithAny

* Make declarations static for covr (#5910)

* restore lint on branch

* extension needed after all?

* set option in R

* debug printing

* Exact file name in option

* really hacky approach

* skip more linters

* One more round of deactivation

* FIx whitespace issues (again??)

* botched merge

* obsolete branch ref

* restore simple CI script thanks to upstream fix

* more delint

* just disable unused_import_linter() everywhere for now

* rm whitespace from atime tests

* comment about comment

---------

Co-authored-by: Ofek <[email protected]>
Co-authored-by: Ani <[email protected]>
Co-authored-by: David Budzynski <[email protected]>
Co-authored-by: Scott Ritchie <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: Jan Gorecki <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: Manuel López-Ibáñez <[email protected]>
MichaelChirico added a commit that referenced this pull request Jul 14, 2025
* add macro version

* write explicit parallel version

* copy attributes

* add tests

* add to NAMESPACE

* add to tests

* copy names

* add man page

* update man

* fix typos

* update tests

* add coverage

* add benchmark example

* coverage

* NEWS

* trim NEWS

* update NEWS

* add bit64

* update naming in NEWS

* 1.15.0 on CRAN. Bump to 1.15.99

* Fix transform slowness (#5493)

* Fix 5492 by limiting the costly deparse to `nlines=1`

* Implementing PR feedbacks

* Added  inside

* Fix typo in name

* Idiomatic use of  inside

* Separating the deparse line limit to a different PR

---------

Co-authored-by: Michael Chirico <[email protected]>

* Improvements to the introductory vignette (#5836)

* Added my improvements to the intro vignette

* Removed two lines I added extra as a mistake earlier

* Requested changes

* Vignette typo patch (#5402)

* fix typos and grammatical mistakes

* fix typos and punctuation

* remove double spaces where it wasn't necessary

* fix typos and adhere to British English spelling

* fix typos

* fix typos

* add missing closing bracket

* fix typos

* review fixes

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-benchmarking.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review benchmarking

Co-authored-by: Michael Chirico <[email protected]>

* remove unnecessary [ ] from datatable-keys-fast-subset.Rmd

* Update vignettes/datatable-programming.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* Update vignettes/datatable-reshape.Rmd

Co-authored-by: Michael Chirico <[email protected]>

* One last batch of fine-tuning

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Improved handling of list columns with NULL entries (#4250)

* Updated documentation for rbindlist(fill=TRUE)

* Print NULL entries of list as NULL

* Added news item

* edit NEWS, use '[NULL]' not 'NULL'

* fix test

* split NEWS item

* add example

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* clarify that list input->unnamed list output (#5383)

* clarify that list input->unnamed list output

* Add example where make.names is used

* mention role of make.names

* fix subsetting issue in split.data.table (#5368)

* fix subsetting issue in split.data.table

* add a test

* drop=FALSE on inner [

* switch to 3.2.0 R dep (#5905)

* Allow early exit from check for eval/evalq in cedta (#5660)

* Allow early exit from check for eval/evalq in cedta

Done in the browser+untested, please take a second look :)

* Use %chin%

* nocov new code

* frollmax1: frollmax, frollmax adaptive, left adaptive support (#5889)

* frollmax exact, buggy fast, no fast adaptive

* frollmax fast fixing bugs

* frollmax man to fix CRAN check

* frollmax fast adaptive non NA, dev

* froll docs, adaptive left

* no frollmax fast adaptive

* frollmax adaptive exact NAs handling

* PR summary in news

* copy-edit changes from reviews

Co-authored-by: Benjamin Schwendinger <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* comment requested by Michael

* update NEWS file

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* Apply suggestions from code review

Co-authored-by: Michael Chirico <[email protected]>

* add comment requested by Michael

* add comment about int iterator for loop over k-1 obs

* extra comments

* Revert "extra comments"

This reverts commit 03af0e3.

* add comments to frollmax and frollsum

* typo fix

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>

* Friendlier error in assignment with trailing comma (#5467)

* friendlier error in assignment with trailing comma

e.g. `DT[, `:=`(a = 1, b = 2,)`.

WIP. Need to add tests and such, but editing from browser before I forget.

* Another pass

* include unnamed indices on RHS too

* tests

* NEWS

* test numbering

* explicit example in NEWS

* Link to ?read.delim in ?fread to give a closer analogue of expected behavior (#5635)

* fread is similar to read.delim (#5634)

* Use ?read.csv / ?read.delim

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>

* Run GHA jobs on 1-15-99 dev branch (#5909)

* prohibit matrix

* readd deleted line

* Make declarations static for covr (#5910)

* reorder code

* return invisible if inplace

* cut to 1 line

* use isTRUE for copy=NA

* speedup strings and lists

* add Hughs comments

* add coverage

* dedup INTSXP LGLSXP

* make tests lighter

* rm altrep include

* change testnum

* remove altrep

* remove duplicated tests

* mostly fix botched merge

* migrate NEWS item

* revert bad search+replace

* update NEWS wording

* add small body

* add additional test cases

* rerun benchmarks single threaded

* update doc

* remove unnecessary assignment

* change to frev/setrev

* add symbol for setrev

* update docs

* update NEWS

* add details about attributes

* drop attributes except names, class and levels

* update docs

* vestigial copy= reference

* use parity tests

* change tests to capture behavior after side-effects

* change man

* allow matrix frev

* rotate idiom

* use frev and setrev internally

* item numbering

* block future rev() usage

* setrev->setfrev

* dont use setfrev on caller-owned object

* update wording that was contrasting frev+setfrev; tweak examples

* mention that levels are retained

* add names test

* remove temp variables

* fix test ordering

* move temporary variables

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Ofek <[email protected]>
Co-authored-by: Ani <[email protected]>
Co-authored-by: David Budzynski <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Scott Ritchie <[email protected]>
Co-authored-by: Jan Gorecki <[email protected]>
Co-authored-by: Manuel López-Ibáñez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants