added some builtin types that can appear in Spark #643

Merged
andialbrecht merged 1 commit into andialbrecht:master from mrmasterplan:spark-types
Aug 8, 2022

Contributor

mrmasterplan commented Sep 22, 2021

Spark table schemas have some compound builtin types that need keywords so that they can be tokenized correctly.
You can see them here: https://spark.apache.org/docs/latest/sql-ref-datatypes.html
I also found a reference to Hive SQL data types here: http://hortonworks.com/wp-content/uploads/2016/05/Hortonworks.CheatSheet.SQLtoHive.pdf

Here is an example of the kind of statement that I need to parse.

CREATE TABLE IF NOT EXISTS my_db1.tbl1(
a int,
b int,
c string,
cplx struct<
    someId:string,
    QrCode:string,
    details:struct<id:string>,
    blabla : array< int >
    >,
d timestamp,
m map<int,string>
)
USING DELTA
COMMENT "Dummy Database 1 table 1"
LOCATION "/tmp/foo/bar/my_db1/tbl1/"

Without the changes in this PR, tokenizing will fail.
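
To illustrate why a keyword entry matters here, below is a toy sketch of keyword-table lookup in a tokenizer. It is not sqlparse's lexer and does not reproduce this PR's diff: the `tokenize` helper, the `BASE_BUILTINS` and `SPARK_BUILTINS` sets, and the token kind names are all hypothetical and exist only for this example. The point is just that a word missing from the table falls through to a generic name, while adding it to the table lets it be tagged as a builtin type.

```python
import re

# Hypothetical keyword tables for illustration only; a real lexer
# would keep a much larger mapping.
BASE_BUILTINS = {"INT", "STRING", "TIMESTAMP"}
SPARK_BUILTINS = {"STRUCT", "ARRAY", "MAP"}

# Whitespace, identifiers, integers, or any other single character.
TOKEN_RE = re.compile(r"\s+|[A-Za-z_][A-Za-z0-9_]*|\d+|.")


def tokenize(sql, builtins):
    """Yield (kind, text) pairs, tagging words found in `builtins` as 'builtin'."""
    for match in TOKEN_RE.finditer(sql):
        text = match.group()
        if text.isspace():
            continue  # whitespace is dropped in this toy version
        if text[0].isalpha() or text[0] == "_":
            # Case-insensitive lookup in the keyword table.
            kind = "builtin" if text.upper() in builtins else "name"
        elif text.isdigit():
            kind = "number"
        else:
            kind = "punct"
        yield kind, text


column = "m map<int,string>"

# Without the Spark entries, `map` is just another name.
before = list(tokenize(column, BASE_BUILTINS))

# With them, `map` is recognised as a builtin type.
after = list(tokenize(column, BASE_BUILTINS | SPARK_BUILTINS))

for kind, text in after:
    print(f"{kind:8} {text}")
```

In this sketch the column name `m` stays a plain name in both runs; only the type word `map` changes classification once the extra entries are present.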

@mrmasterplan mrmasterplan marked this pull request as ready for review September 22, 2021 13:53
@andialbrecht andialbrecht merged commit 4073b56 into andialbrecht:master Aug 8, 2022
@andialbrecht
Owner

Thanks!
