Make Data.String.CodePoints the default #95

@hdgarrood

@michaelficarra originally suggested this and I agree; I think Data.String.CodePoints should really be the default. Unless you're certain you won't be working with anything outside the Basic Multilingual Plane, and you've identified string manipulations as a performance bottleneck, you should really be using the functions in Data.String.CodePoints.
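To make the difference concrete, here is a minimal sketch in TypeScript rather than PureScript, assuming the JavaScript backend, where strings are sequences of UTF-16 code units. The names `lengthCodeUnits` and `lengthCodePoints` are hypothetical and only stand in for the two `length` functions; this is not the library's implementation.

```typescript
// Hypothetical stand-in for a code-unit length: counts UTF-16 code units,
// which is what JavaScript's String.prototype.length reports.
function lengthCodeUnits(s: string): number {
  return s.length;
}

// Hypothetical stand-in for a code-point length: counts a well-formed
// surrogate pair as one code point, and everything else as one each.
function lengthCodePoints(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    const isHighSurrogate = unit >= 0xd800 && unit <= 0xdbff;
    if (isHighSurrogate && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      const isLowSurrogate = next >= 0xdc00 && next <= 0xdfff;
      if (isLowSurrogate) {
        i++; // consume the low surrogate as part of the same code point
      }
    }
    count++;
  }
  return count;
}

// Every character here is inside the Basic Multilingual Plane.
const insideBmp = "abc";

// U+1D306 lies outside the BMP, so it is written as a surrogate pair.
const outsideBmp = "a\uD834\uDF06b";

console.log(lengthCodeUnits(insideBmp), lengthCodePoints(insideBmp));   // 3 3
console.log(lengthCodeUnits(outsideBmp), lengthCodePoints(outsideBmp)); // 4 3
```

The two functions agree for text inside the Basic Multilingual Plane and only diverge outside it, which is why the difference is easy to miss in testing.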

For the functions whose type signatures are the same across both modules, like length :: String -> Int, this has the potential to be quite problematic: switching the default would change their behaviour without causing any compile error. I think we need to be quite careful about it, so I'd suggest the following:

  • In the next breaking release:
    • we create a module Data.String.CodeUnits, with the exact same exports as the current Data.String,
    • we add a notice at the very top of Data.String, detailing that the functions within currently operate on code units, not code points; that this will change in the next breaking release; and that you should very probably be using Data.String.CodePoints instead (unless you are sure you want to operate on code units, in which case you can use Data.String.CodeUnits)
  • In the breaking release after that one:
    • we change Data.String so that it re-exports everything from Data.String.CodePoints,
    • we remove the notices,
    • we consider deprecating the Data.String.CodePoints module, for removal in a subsequent breaking release?
