Parquet Support Tasks & TODO #6694

### Is your feature request related to a problem?

Tracking issue for Parquet-related features and optimizations.

  #### Decode Performance                                                                                                                                                                                                                
  - [x] Optimize Parquet-to-native decoding for fixed-size columns (int, long, double, symbol, etc.) (https://github.com/questdb/questdb/pull/6632)
  - [x] Cache decoded Parquet data in ASOF JOIN to avoid redundant decoding                                                                                                                                                             
  - [x] Late materialization for filtered aggregate queries (#6675)
  - [x] Profile and optimize Array column decoding performance (see https://github.com/questdb/questdb/issues/6065)
  - [x] Use an alternative varchar/string aux format internally to avoid unnecessary copies
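
For readers unfamiliar with the late materialization item above, the idea can be sketched roughly as follows. This is a minimal illustration only, not QuestDB's implementation: `LazyColumn`, `decode_all`, `decode_at`, and `filtered_sum` are hypothetical names, and a plain Python list stands in for an encoded Parquet column chunk.

```python
from typing import Callable, List, Sequence


class LazyColumn:
    """Hypothetical stand-in for an encoded Parquet column chunk."""

    def __init__(self, values: Sequence[float]) -> None:
        self._values = list(values)
        self.decoded = 0  # how many values have been "decoded" so far

    def decode_all(self) -> List[float]:
        # Decode every value in the chunk.
        self.decoded += len(self._values)
        return list(self._values)

    def decode_at(self, rows: Sequence[int]) -> List[float]:
        # Decode only the requested row positions.
        self.decoded += len(rows)
        return [self._values[i] for i in rows]


def filtered_sum(
    filter_col: LazyColumn,
    agg_col: LazyColumn,
    predicate: Callable[[float], bool],
) -> float:
    """Sum agg_col over the rows where predicate(filter_col value) holds."""
    # Step 1: decode only the filter column and collect matching row positions.
    keys = filter_col.decode_all()
    rows = [i for i, key in enumerate(keys) if predicate(key)]
    # Step 2: materialize the aggregated column late, for matching rows only.
    return sum(agg_col.decode_at(rows))
```

With a selective filter, the aggregated column is decoded for only a small fraction of rows, which is where the saving would come from.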
                                                                                                                                                                                                                                        
  #### Write Support                                                                                                                                                                                                                     
  - [x] Decimal types (decimal8/16/32/64/128/256) write support                                                                                                                                                                   
  - [x] Fix Parquet writer to produce spec-compliant files (#6692) 
  - [ ] See also #4738 for the comprehensive Parquet implementation roadmap including:
    - DDL operations (detach/attach, add/change/drop column)
    - UPDATE/deduplication support (https://github.com/questdb/questdb/issues/6335)
    - Index
  - [ ] https://github.com/questdb/questdb/issues/6427
  - [ ] Cache metadata from S3 Parquet files on local disk
                                                                                                                                                                                                                                        
  #### `read_parquet()` Function                                                                                                                                                                                                         
  - [ ] Support timestamp-based row group filtering for QuestDB-written files (`TimestampFinder`) (https://github.com/questdb/questdb/issues/6081)
  - [ ] Decode dictionary-encoded columns as Symbol(Dynamic) instead of Varchar                                                                                                                                                                  
                                                                                                                                                                                                                                        
  #### Statistics-based Optimization                                                                                                                                                                                                     
  - [x] Use Parquet min/max statistics to skip row groups / pages during filtered reads                                                                                                                                                         
  - [ ] Use Parquet statistics for aggregate pushdown (COUNT, MIN, MAX) 
  - [x] Use Parquet bloom filter to skip row groups / pages during filtered reads                                                                                                                                                                     
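
As a rough illustration of the statistics-based items above, the sketch below shows how per-row-group min/max values could be used to skip row groups for a range filter, and how COUNT, MIN, and MAX could be answered from statistics alone. It is a simplified sketch under stated assumptions, not QuestDB's code: `RowGroupStats`, `row_groups_to_scan`, and `aggregate_from_stats` are hypothetical names, and the column is assumed to have no nulls and exact statistics for every row group.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical per-row-group statistics for a single column."""

    num_rows: int
    min_value: int
    max_value: int


def row_groups_to_scan(stats: List[RowGroupStats], lo: int, hi: int) -> List[int]:
    """Indexes of row groups whose [min, max] range overlaps the filter [lo, hi]."""
    # A row group can be skipped when its value range lies entirely outside [lo, hi].
    return [
        i for i, s in enumerate(stats)
        if s.max_value >= lo and s.min_value <= hi
    ]


def aggregate_from_stats(
    stats: List[RowGroupStats],
) -> Tuple[int, Optional[int], Optional[int]]:
    """(COUNT, MIN, MAX) for an unfiltered query, computed without decoding data.

    Assumes no nulls and exact statistics in every row group.
    """
    if not stats:
        return 0, None, None
    count = sum(s.num_rows for s in stats)
    return count, min(s.min_value for s in stats), max(s.max_value for s in stats)
```

Skipping is conservative: a row group that overlaps the filter range still has to be decoded and filtered row by row, since overlap does not guarantee a match.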
                                                                                                                                                                                                                                        
  #### Testing & Benchmarks                                                                                                                                                                                                              
  - [ ] Add more Parquet sqllogictest cases (ported from DuckDB)
  - [ ] Run ClickBench on Parquet and compare with DuckDB
  - [x] Integration testing (Pandas, Polars, DuckDB, Spark) (https://github.com/questdb/questdb/pull/6708)

  #### Ease of Use
  - [ ] https://github.com/questdb/questdb/issues/6494                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                        
### Full Name:

Victor

### Affiliation:

QuestDB

### Additional context

_No response_
