Skip to content

CSV import to SQL Server lacks Unicode support #2250

@nummacway

Description

@nummacway

Description

I assume Heidi uses varchar literals to insert data into MSSQL. You cannot do this.
You cannot do that because varchar literals are lossy when they contain data outside of the current codepage. For example, running

SELECT 'HeidiSQL♥'

will always cause data loss prior to SQL Server 2019 and from 2019 requires the default non-Unicode collation to be UTF-8 rather than e.g. Latin1. However, using UTF-8 as your non-national collation requires explicit client support that is usually not available, so software will still use national types.
If a target column isn't national, SQL Server would do the implicit conversion anyway (with exactly the same result), so there is no downside of simply using national varchar literals all the time when communicating with SQL Server. It is also the correct syntax according to the SQL standard. MySQL/MariaDB don't seem to mind and will run this, although they don't implement national varchars:

SELECT N'HeidiSQL♥'

HeidiSQL version

12.11.0.7093

Database server version

SQL Server 2019

Reproduction recipe

  1. In case you use SQL Server 2019 or newer, make sure not to use UTF-8 as your collation (this is usually the case)
  2. Create table with at least one nvarchar-type column
  3. Import a CSV file where a character is not in the current locale, e.g. a file just containing HeidiSQL♥ will cause the bug
    Data is lost. Letters lose accents if available. Everything else just becomes '?'.

Error/Backtrace

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugDefective behaviour in HeidiSQLnettype-mssqlSQL Server specific issue

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions