[Experimental] Integrate Comet native reader with remote HDFS

### What is the problem the feature request solves?

Currently Apache DataFusion Comet reads the data from underlying sources using builtin Comet reader which lacks support for nested types processing. 

There is an experimental feature 
```
  val COMET_NATIVE_SCAN_IMPL: ConfigEntry[String] = conf("spark.comet.scan.impl")
    .doc(
      s"The implementation of Comet Native Scan to use. Available modes are '$SCAN_NATIVE_COMET'," +
        s"'$SCAN_NATIVE_DATAFUSION', and '$SCAN_NATIVE_ICEBERG_COMPAT'. " +
        s"'$SCAN_NATIVE_COMET' is for the original Comet native scan which uses a jvm based " +
        "parquet file reader and native column decoding. Supports simple types only " +
        s"'$SCAN_NATIVE_DATAFUSION' is a fully native implementation of scan based on DataFusion" +
        s"'$SCAN_NATIVE_ICEBERG_COMPAT' is a native implementation that exposes apis to read " +
        "parquet columns natively.")
    .internal()
    .stringConf
    .transform(_.toLowerCase(Locale.ROOT))
    .checkValues(Set(SCAN_NATIVE_COMET, SCAN_NATIVE_DATAFUSION, SCAN_NATIVE_ICEBERG_COMPAT))
    .createWithDefault(sys.env
      .getOrElse("COMET_PARQUET_SCAN_IMPL", SCAN_NATIVE_COMET)
      .toLowerCase(Locale.ROOT))
```

to scan the data using DataFusion native reader which supports Arrow nested types, however the reader has to be able to read data from remote HDFS filesystem. 

There are some object store implementations available to work with HDFS which are 
- https://github.com/datafusion-contrib/datafusion-objectstore-hdfs(native object store on top of libhdfs and JVM. More mem usage but has richer client setting support, like retry, network options, etc
- https://github.com/datafusion-contrib/hdfs-native-object-store having less client options but no JVM dependency


Subtasks
- [x] #1337
- [x] #1360
- [x] #1367
- [x] Documentation
- [x] #1368
- [x] #1515 


### Describe the potential solution

_No response_

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Experimental] Integrate Comet native reader with remote HDFS #1336

What is the problem the feature request solves?

Describe the potential solution

Additional context

Sub-issues

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Experimental] Integrate Comet native reader with remote HDFS #1336

Description

What is the problem the feature request solves?

Describe the potential solution

Additional context

Sub-issues

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions