Running Queries
The utility allows users to run some simple SQL-like queries on the Parquet data.
The syntax for queries can be found by clicking on the Filter Query (?) label:

The syntax is very similar to SQL except for how Dates are handled. Example query formats can be found below:
| Type of Data | Example(s) |
|---|---|
| NULL Check | WHERE field_name IS NULL<br>WHERE field_name IS NOT NULL |
| Datetime | WHERE field_name >= #2000-12-31#<br>WHERE field_name = #2000-01-13 01:00:00# |
| Numeric | WHERE field_name <= 123.4<br>WHERE field_name <> 10 |
| String<br>Wildcards:<br>% = any sequence of characters<br>_ = any single character | WHERE field_name LIKE '%value%'<br>WHERE field_name NOT LIKE '%value%'<br>WHERE field_name = 'equals value'<br>WHERE field_name <> 'not equals' |
| IN Check | WHERE field_name IN ('value1', 'value2')<br>WHERE field_name NOT IN (1, 2) |
| Using Multiple Conditions | WHERE (field_1 = 0 AND field_2 <> 'value') OR field_3 IS NULL |
| Arithmetic (+, -, *, /) | WHERE field_1 * (field_2 / field_3) <= 100 |
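To make the wildcard rules in the String row concrete, here is a minimal Python sketch that translates a LIKE pattern into a regular expression. It is an illustration only, not the utility's implementation. The helper names `like_to_regex` and `like` are hypothetical. Two things here are assumptions borrowed from standard SQL rather than confirmed for this utility: matching is treated as case-sensitive, and `%` is treated as also matching an empty sequence.

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern into a compiled regular expression (hypothetical helper)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % = any sequence of characters
        elif ch == "_":
            parts.append(".")            # _ = any single character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts), re.DOTALL)

def like(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None
```

With this sketch, `like("some value here", "%value%")` is true, while `like("cart", "c_t")` is false because `_` stands for exactly one character.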
Notes:
- List, Map, and Struct fields are automatically cast to String type as JSON for querying.
- The following date formats are accepted: yyyy/MM/dd and the North American MM/dd/yyyy
Field names containing spaces or punctuation must be escaped with square brackets:
WHERE [field with spaces and punctuation!] <> 'not equals'
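If you generate query text from a script before pasting it into the utility, the bracket and date conventions above can be wrapped in small helpers. The following Python sketch is an illustration under stated assumptions. `escape_field` and `date_literal` are hypothetical names, not part of the utility. The date layout copies the `#2000-01-13 01:00:00#` example from the table. Field names that themselves contain `]` are not handled.

```python
import re
from datetime import datetime

def escape_field(name: str) -> str:
    """Wrap a field name in square brackets unless it is a plain identifier.

    Assumption: names made only of letters, digits, and underscores need no escaping.
    """
    if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        return name
    return f"[{name}]"

def date_literal(value: datetime) -> str:
    """Format a datetime in the #...# style shown in the examples table."""
    return value.strftime("#%Y-%m-%d %H:%M:%S#")

# Build a filter on a field whose name contains a space.
query = f"WHERE {escape_field('order date')} >= {date_literal(datetime(2000, 12, 31))}"
```

Here `query` holds `WHERE [order date] >= #2000-12-31 00:00:00#`, which can then be entered in the Query Box.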
The query can be entered in the Query Box located at the top of the UI:

To execute the query, press Enter or click the Execute button. The grid below will be updated with your results. You can verify this in the bottom-left corner of the status bar, which shows how many records have been filtered by the query:
| Before Query | After Query |
|---|---|
| ![]() | ![]() |
The Clear button only removes the filter from the results in the grid below; it does not clear the query text that you have entered. While editing the query, you can press the Esc key to quickly clear any existing query filters.
| Before Clear | After Clear |
|---|---|
| ![]() | ![]() |
Note that queries do not run against the entire Parquet file, only against the records that have been loaded into memory. For more information, please see Query Scope.
Queries apply only to the records that have been loaded into the application (the first 1000 records by default). To run your queries against more records, you must increase the Record Count so that more data from the Apache Parquet file is loaded into the application.
Loading more data into the application requires more system memory, which can become a problem for very large files. See Tips For Large Files for some hints on how to deal with that.
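The effect of query scope can be shown with a small Python sketch. It is a conceptual model only, not how the utility is implemented. `run_query` and its parameters are hypothetical names, and the list slice stands in for "records loaded into the application".

```python
def run_query(all_records, predicate, record_count=1000):
    """Filter only the records that would be loaded (hypothetical model of query scope)."""
    loaded = all_records[:record_count]          # only the first record_count rows are in memory
    return [row for row in loaded if predicate(row)]

# A stand-in "file" of 5000 rows, each row being just its row number.
file_rows = list(range(5000))

default_scope = run_query(file_rows, lambda row: row >= 900)                    # first 1000 rows only
wider_scope = run_query(file_rows, lambda row: row >= 900, record_count=5000)   # all 5000 rows
```

With the default scope the filter finds 100 matching rows. With the Record Count raised to 5000 it finds 4100. The query is the same in both cases; only the amount of loaded data changes.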

