Add a function htmlOrXmlCoarseParse to extract content from html or xml format string. #19600
abyss7 merged 20 commits into ClickHouse:master
@zlx19950903 because |
If i exclude |
    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
    {
        if(!isString(arguments[0]))
Style error: there should be a space between `if` and `(`.
…o guojiantao/htmlCoarseParse Conflicts: tests/queries/0_stateless/arcadia_skip_list.txt
It's expected to not work. There is an array of skipped tests inside |
I believe this function can run faster. This is on an 80 vCPU machine: For |


I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
- `htmlOrXmlCoarseParse`;
- `<script></script>` parse;
- `<style></style>` parse;
- `<![CDATA[]]>` parse;
- `<content>` format parse;

Usage:
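To make the listed cases concrete, here is a minimal standalone sketch of what a "coarse parse" could look like. It is not the code from this PR and does not use ClickHouse's function interfaces. The names `coarseExtractText`, `startsWithAt` and `skipPast` are hypothetical, and the rules are assumptions for illustration only: `<script>` and `<style>` elements are dropped with their bodies, the payload of `<![CDATA[...]]>` is copied verbatim, and every other tag is removed while the text between tags is kept.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: case-insensitive check that `s` contains `lowerPrefix`
// at offset `pos`. `lowerPrefix` must already be lowercase.
static bool startsWithAt(std::string_view s, std::size_t pos, std::string_view lowerPrefix)
{
    if (pos > s.size() || s.size() - pos < lowerPrefix.size())
        return false;
    for (std::size_t i = 0; i < lowerPrefix.size(); ++i)
        if (std::tolower(static_cast<unsigned char>(s[pos + i])) != lowerPrefix[i])
            return false;
    return true;
}

// Hypothetical helper: returns the offset just past the first occurrence of
// `lowerCloser` at or after `pos`, or s.size() if it never occurs.
static std::size_t skipPast(std::string_view s, std::size_t pos, std::string_view lowerCloser)
{
    for (; pos < s.size(); ++pos)
        if (startsWithAt(s, pos, lowerCloser))
            return pos + lowerCloser.size();
    return s.size();
}

// Illustrative coarse extractor (an assumption, not the PR's implementation).
// It does no validation and no entity decoding, and the "<script"/"<style"
// prefix match is deliberately crude.
std::string coarseExtractText(std::string_view s)
{
    constexpr std::string_view cdataOpen = "<![cdata[";
    std::string out;
    std::size_t i = 0;
    while (i < s.size())
    {
        if (s[i] != '<')
        {
            out += s[i++];                          // plain text is kept
        }
        else if (startsWithAt(s, i, cdataOpen))
        {
            // The CDATA payload is copied verbatim, including any '<' inside it.
            const std::size_t begin = i + cdataOpen.size();
            const std::size_t end = s.find("]]>", begin);
            if (end == std::string_view::npos)
            {
                out.append(s.data() + begin, s.size() - begin);
                i = s.size();
            }
            else
            {
                out.append(s.data() + begin, end - begin);
                i = end + 3;
            }
        }
        else if (startsWithAt(s, i, "<script"))
            i = skipPast(s, i, "</script>");        // drop the element and its body
        else if (startsWithAt(s, i, "<style"))
            i = skipPast(s, i, "</style>");         // drop the element and its body
        else
            i = skipPast(s, i, ">");                // drop any other tag
    }
    return out;
}
```

Under these assumptions, calling `coarseExtractText("<p>Hello</p><script>var x = 1;</script>")` yields `Hello`. A single left-to-right pass with no tree building is what keeps such a parser "coarse": it tolerates malformed markup at the cost of not understanding nesting or entities.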