[Bug] [Seatunnel-formats] When reading CSV, a field that contains a line break is parsed as a new row, corrupting the data #6748

@WuJiY

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

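When the file format is excel or csv and a column value contains a line break (entered with ALT+ENTER), the text after the break is parsed as a new row of data, which scrambles the data.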
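The symptom can be reproduced outside SeaTunnel with a small sketch. This is not SeaTunnel's code. It assumes the cell is written in quotes (as Excel does when exporting a cell with ALT+ENTER) and that the reader splits the input on line breaks before parsing fields. The sample data and the function names `split_by_line` and `parse_quoted` are made up for illustration.

```python
import csv
import io

# Hypothetical sample: a header row plus one data row whose second field
# contains a line break inside quotes (what ALT+ENTER produces in Excel).
SAMPLE = 'id,note\r\n1,"first line\nsecond line"\r\n'


def split_by_line(text):
    """Naive approach: every line break is treated as a record boundary."""
    return [line.split(",") for line in text.splitlines()]


def parse_quoted(text):
    """Quote-aware approach: a line break inside quotes stays in the field."""
    # newline="" keeps the original line endings so csv.reader can decide
    # which ones are record boundaries and which are part of a field.
    return list(csv.reader(io.StringIO(text, newline="")))


naive_rows = split_by_line(SAMPLE)
quoted_rows = parse_quoted(SAMPLE)

print(len(naive_rows))   # header + a data row broken in two
print(len(quoted_rows))  # header + one intact data row
```

The naive split yields an extra row for the text after the embedded line break, which matches the behaviour reported here. The quote-aware parser keeps the value in a single field.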

SeaTunnel Version

v2.3.5

SeaTunnel Config

env {
  parallelism = 1
  job.mode = "BATCH"
  checkpoint.interval = 5000
}

source {
  HdfsFile {
    skip_header_row_number = 1
    fs.defaultFS = "hdfs://127.0.0.1:8020"
    path = "/originFile/test1/2024/0424"
    file_format_type = "csv"
    encoding = "gbk"
    result_table_name = "csv_sales"
    schema {
      columns = [
        {
          "name" = "a"
          "type" = string
          "nullable" = true
        }
      ]
    }
  }
}

Running Command

./bin/seatunnel.sh -c ./config/conf.template

Error Exception

no error

Zeta or Flink or Spark Version

zeta

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
