Running Queries

Sal edited this page Sep 1, 2025 · 38 revisions

The utility lets you run simple SQL-like queries to filter the Parquet data that has been loaded.

Query Syntax

To see the query syntax from within the application, click the Filter Query (?) label.

The syntax closely follows SQL, except that date and datetime values are written between # characters. Example query formats are listed below:

| Type of Data | Example(s) |
|---|---|
| NULL Check | WHERE field_name IS NULL |
| | WHERE field_name IS NOT NULL |
| Datetime | WHERE field_name >= #2000-12-31# |
| | WHERE field_name = #2000-01-13 01:00:00# |
| Numeric | WHERE field_name <= 123.4 |
| | WHERE field_name <> 10 |
| String (wildcards: % = any sequence of characters, _ = any single character) | WHERE field_name LIKE '%value%' |
| | WHERE field_name NOT LIKE '%value%' |
| | WHERE field_name = 'equals value' |
| | WHERE field_name <> 'not equals' |
| IN Check | WHERE field_name IN ('value1', 'value2') |
| | WHERE field_name NOT IN (1, 2) |
| Using Multiple Conditions | WHERE (field_1 = 0 AND field_2 <> 'value') OR field_3 IS NULL |
| Arithmetic (+, -, *, /) | WHERE field_1 * (field_2 / field_3) <= 100 |
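
If you generate queries from a script and paste them into the utility, the date syntax is the part most likely to trip you up. The following is a minimal Python sketch, not part of the utility, that formats a date or datetime the way the Datetime examples above are written. The helper name `date_literal` is hypothetical.

```python
from datetime import date, datetime


def date_literal(value):
    """Format a date or datetime as a #...# literal, matching the examples above.

    Hypothetical helper for building query text outside the utility.
    """
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        return "#" + value.strftime("%Y-%m-%d %H:%M:%S") + "#"
    return "#" + value.strftime("%Y-%m-%d") + "#"


# Build the two Datetime examples from the table.
query_by_day = "WHERE field_name >= " + date_literal(date(2000, 12, 31))
query_by_time = "WHERE field_name = " + date_literal(datetime(2000, 1, 13, 1, 0, 0))

print(query_by_day)
print(query_by_time)
```

The resulting text can then be pasted into the Query Box as-is.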

Notes:

  • List, Map, and Struct fields are automatically cast to String type as JSON for querying.
  • Dates are accepted in the yyyy/MM/dd format and in the North American MM/dd/yyyy format.
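
Because List, Map, and Struct fields are queried as JSON text, LIKE with wildcards is the natural way to search inside them. The sketch below is an illustration in Python, not the utility's implementation: it emulates the two wildcards with a regular expression and applies them to a list serialized by `json.dumps`. The exact JSON spacing the utility produces and its case sensitivity are assumptions here, so treat the matching behaviour as approximate.

```python
import json
import re


def like(value, pattern):
    """Emulate LIKE: % matches any sequence of characters, _ matches one character.

    Illustrative only; this sketch is case-sensitive and the utility may differ.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # fullmatch: the whole value must match the pattern, as with SQL LIKE.
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


# A List field as it might look once cast to a JSON string (spacing assumed).
tags_as_text = json.dumps(["red", "green"])

print(like(tags_as_text, "%green%"))  # pattern corresponds to LIKE '%green%'
print(like(tags_as_text, "%blue%"))
print(like("cat", "c_t"))
```

In the utility itself, the equivalent filter would be written as WHERE field_name LIKE '%green%'.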

Escaping field names

Field names containing spaces or punctuation must be escaped with square brackets:

WHERE [field with spaces and punctuation!] <> 'not equals'
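
When queries are assembled by a script, the bracket rule can be applied automatically. This is a small Python sketch under stated assumptions: the helper name `escape_field` is hypothetical, and field names that themselves contain square brackets are rejected because this page does not describe how to escape them.

```python
import re

# Names made only of letters, digits, and underscores (like field_name) are
# used unescaped in the examples on this page.
_SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def escape_field(name):
    """Wrap a field name in square brackets when it has spaces or punctuation."""
    if "[" in name or "]" in name:
        # Not covered by this page, so the sketch refuses rather than guesses.
        raise ValueError("field names containing brackets are not handled here")
    if _SIMPLE_NAME.fullmatch(name):
        return name
    return "[" + name + "]"


query = "WHERE " + escape_field("field with spaces and punctuation!") + " <> 'not equals'"
print(query)
```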

Running the query

Enter the query in the Query Box located at the top of the UI.

To execute the query, press Enter or click the Execute button. The grid below is updated with the results. To verify that the filter was applied, look at the bottom-left side of the status bar, which shows how many records have been filtered by the query:

[Screenshots: Before Query, After Query]

The Clear button only removes the filter from the results in the grid below; it does not clear the query text that you have entered. While editing the query, you can also press the Esc key to quickly clear any existing query filter.

[Screenshots: Before Clear, After Clear]

Note that queries do not run against the entire Parquet file, only against the records that have been loaded into memory. For more information, see Query Scope.

Query Scope

Queries apply only to the records that have been loaded into the application (the first 1000 records by default). To run a query against more records, increase the Record Count so that more data from the Apache Parquet file is loaded into the application.
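
The practical effect is that a query can miss matching rows simply because they were never loaded. The following Python sketch models that behaviour with plain lists; it is an illustration of the scoping rule described above, not the utility's code, and `run_query` and its parameters are hypothetical names.

```python
def run_query(records, predicate, record_count=1000):
    """Filter only the first record_count records, mimicking the query scope."""
    loaded = records[:record_count]  # only this slice is "in memory"
    return [row for row in loaded if predicate(row)]


# A stand-in for a file with 5000 rows.
all_rows = [{"id": i} for i in range(5000)]


def id_at_least_900(row):
    return row["id"] >= 900


# With the default of 1000 loaded records, only ids 900..999 can match.
default_matches = run_query(all_rows, id_at_least_900)

# After raising the record count, ids 900..4999 are all visible to the query.
full_matches = run_query(all_rows, id_at_least_900, record_count=5000)

print(len(default_matches), len(full_matches))
```

If a filter returns fewer rows than expected, checking the Record Count is a reasonable first step.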

Loading more data into the application requires more system memory, which can become a problem for very large files. See Tips For Large Files for hints on how to deal with that.
