Commit 39914d8

Docs updated, __msan_unpoison usage fixed
1 parent 5a024bd commit 39914d8

9 files changed

+248
-28
lines changed


docs/en/sql-reference/functions/string-functions.md

Lines changed: 154 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -1136,54 +1136,198 @@ SELECT tryBase58Decode('3dc8KtHrwM') as res, tryBase58Decode('invalid') as res_i
11361136

11371137
## base64Encode
11381138

1139-
Encodes a String or FixedString as base64.
1139+
Encodes a String or FixedString as base64, according to [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-4).
11401140

11411141
Alias: `TO_BASE64`.
11421142

1143+
**Syntax**
1144+
1145+
```sql
1146+
base64Encode(plaintext)
1147+
```
1148+
1149+
**Arguments**
1150+
1151+
- `plaintext`: [String](../data-types/string.md) column or constant.
1152+
1153+
**Returned value**
1154+
1155+
- A string containing the encoded value of the argument.
1156+
1157+
**Example**
1158+
1159+
``` sql
1160+
SELECT base64Encode('clickhouse');
1161+
```
1162+
1163+
Result:
1164+
1165+
```result
1166+
┌─base64Encode('clickhouse')─┐
1167+
│ Y2xpY2tob3VzZQ== │
1168+
└────────────────────────────┘
1169+
```
1170+
11431171
## base64UrlEncode
11441172

1145-
Encodes an URL (String or FixedString) as base64 according to [RFC 4648](https://tools.ietf.org/html/rfc4648).
1173+
Encodes a URL (String or FixedString) as base64 with URL-specific modifications, according to [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-5).
1174+
1175+
**Syntax**
1176+
1177+
```sql
1178+
base64UrlEncode(url)
1179+
```
1180+
1181+
**Arguments**
1182+
1183+
- `url`: [String](../data-types/string.md) column or constant.
1184+
1185+
**Returned value**
1186+
1187+
- A string containing the encoded value of the argument.
1188+
1189+
**Example**
1190+
1191+
``` sql
1192+
SELECT base64UrlEncode('https://clickhouse.com');
1193+
```
1194+
1195+
Result:
1196+
1197+
```result
1198+
┌─base64UrlEncode('https://clickhouse.com')─┐
1199+
│ aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ            │
1200+
└───────────────────────────────────────────┘
1201+
```
11461202

11471203
## base64Decode
11481204

1149-
Decodes a base64-encoded String or FixedString. Throws an exception in case of error.
1205+
Accepts a String and decodes it from base64, according to [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-4). Throws an exception in case of an error.
11501206

11511207
Alias: `FROM_BASE64`.
11521208

1209+
**Syntax**
1210+
1211+
```sql
1212+
base64Decode(encoded)
1213+
```
1214+
1215+
**Arguments**
1216+
1217+
- `encoded`: [String](../data-types/string.md) column or constant. If the string is not a valid Base64-encoded value, an exception is thrown.
1218+
1219+
**Returned value**
1220+
1221+
- A string containing the decoded value of the argument.
1222+
1223+
**Example**
1224+
1225+
``` sql
1226+
SELECT base64Decode('Y2xpY2tob3VzZQ==');
1227+
```
1228+
1229+
Result:
1230+
1231+
```result
1232+
┌─base64Decode('Y2xpY2tob3VzZQ==')─┐
1233+
│ clickhouse │
1234+
└──────────────────────────────────┘
1235+
```
1236+
11531237
## base64UrlDecode
11541238

1155-
Decodes a base64-encoded URL (String or FixedString) according to [RFC 4648](https://tools.ietf.org/html/rfc4648). Throws an exception in case of error.
1239+
Accepts a base64-encoded URL and decodes it from base64 with URL-specific modifications, according to [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-5). Throws an exception in case of an error.
1240+
1241+
**Syntax**
1242+
1243+
```sql
1244+
base64UrlDecode(encodedUrl)
1245+
```
1246+
1247+
**Arguments**
1248+
1249+
- `encodedUrl`: [String](../data-types/string.md) column or constant. If the string is not a valid Base64-encoded value with URL-specific modifications, an exception is thrown.
1250+
1251+
**Returned value**
1252+
1253+
- A string containing the decoded value of the argument.
1254+
1255+
**Example**
1256+
1257+
``` sql
1258+
SELECT base64UrlDecode('aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ');
1259+
```
1260+
1261+
Result:
1262+
1263+
```result
1264+
┌─base64UrlDecode('aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ')─┐
1265+
│ https://clickhouse.com                            │
1266+
└───────────────────────────────────────────────────┘
1267+
```
11561268

11571269
## tryBase64Decode
11581270

11591271
Like `base64Decode` but returns an empty string in case of error.
11601272

1273+
**Syntax**
1274+
1275+
```sql
1276+
tryBase64Decode(encoded)
1277+
```
1278+
1279+
**Arguments**
1280+
1281+
- `encoded`: [String](../data-types/string.md) column or constant. If the string is not a valid Base64-encoded value, returns an empty string.
1282+
1283+
**Returned value**
1284+
1285+
- A string containing the decoded value of the argument.
1286+
1287+
**Examples**
1288+
1289+
Query:
1290+
1291+
```sql
1292+
SELECT tryBase64Decode('Y2xpY2tob3VzZQ==') as res, tryBase64Decode('invalid') as res_invalid;
1293+
```
1294+
1295+
```response
1296+
┌─res────────┬─res_invalid─┐
1297+
│ clickhouse │ │
1298+
└────────────┴─────────────┘
1299+
```
1300+
11611301
## tryBase64UrlDecode
11621302

11631303
Like `base64UrlDecode` but returns an empty string in case of error.
11641304

11651305
**Syntax**
11661306

11671307
```sql
1168-
tryBase64Decode(encoded)
1308+
tryBase64UrlDecode(encodedUrl)
11691309
```
11701310

11711311
**Parameters**
11721312

1173-
- `encoded`: [String](../data-types/string.md) column or constant. If the string is not a valid Base58-encoded value, returns an empty string in case of error.
1313+
- `encodedUrl`: [String](../data-types/string.md) column or constant. If the string is not a valid Base64-encoded value with URL-specific modifications, returns an empty string.
1314+
1315+
**Returned value**
1316+
1317+
- A string containing the decoded value of the argument.
11741318

11751319
**Examples**
11761320

11771321
Query:
11781322

11791323
```sql
1180-
SELECT tryBase64Decode('RW5jb2RlZA==') as res, tryBase64Decode('invalid') as res_invalid;
1324+
SELECT tryBase64UrlDecode('aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ') as res, tryBase64UrlDecode('aHR0cHM6Ly9jbGlja') as res_invalid;
11811325
```
11821326

11831327
```response
1184-
┌─res─────┬─res_invalid─┐
1185-
Encoded │ │
1186-
└─────────┴─────────────┘
1328+
┌─res────────────────────┬─res_invalid─┐
1329+
│ https://clickhouse.com │             │
1330+
└────────────────────────┴─────────────┘
11871331
```
11881332

11891333
## endsWith {#endswith}
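
The example results documented above can be cross-checked with a small, self-contained encoder. The following is only a sketch of RFC 4648 encoding, not the library ClickHouse links against; the name `sketchBase64Encode` and its `url_safe` flag are hypothetical, and it assumes the URL variant omits the trailing `=` padding, as the unpadded examples in this commit's C++ documentation suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper, not ClickHouse code: encodes `src` per RFC 4648.
// With url_safe = true it uses the section 5 alphabet and omits '=' padding
// (an assumption based on the unpadded examples in this commit).
std::string sketchBase64Encode(std::string_view src, bool url_safe = false)
{
    static constexpr std::string_view standard_alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    static constexpr std::string_view url_alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    const std::string_view alphabet = url_safe ? url_alphabet : standard_alphabet;
    assert(alphabet.size() == 64);

    std::string out;
    out.reserve((src.size() + 2) / 3 * 4);

    for (size_t i = 0; i < src.size(); i += 3)
    {
        const size_t remaining = src.size() - i; // at least 1 byte is left here

        // Pack up to three bytes into a 24-bit group, zero-filling missing bytes.
        uint32_t group = static_cast<uint32_t>(static_cast<uint8_t>(src[i])) << 16;
        if (remaining > 1)
            group |= static_cast<uint32_t>(static_cast<uint8_t>(src[i + 1])) << 8;
        if (remaining > 2)
            group |= static_cast<uint32_t>(static_cast<uint8_t>(src[i + 2]));

        // Emit four 6-bit symbols, or padding where no input bytes back them.
        out.push_back(alphabet[(group >> 18) & 0x3F]);
        out.push_back(alphabet[(group >> 12) & 0x3F]);
        if (remaining > 1)
            out.push_back(alphabet[(group >> 6) & 0x3F]);
        else if (!url_safe)
            out.push_back('=');
        if (remaining > 2)
            out.push_back(alphabet[group & 0x3F]);
        else if (!url_safe)
            out.push_back('=');
    }
    return out;
}
```

Under these assumptions, `sketchBase64Encode("clickhouse")` yields `Y2xpY2tob3VzZQ==`, and `sketchBase64Encode("https://clickhouse.com", true)` yields `aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ`.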

src/Functions/FunctionBase64Conversion.h

Lines changed: 7 additions & 11 deletions
@@ -33,7 +33,7 @@ inline std::string preprocessBase64Url(std::string_view src)
3333
std::string padded_src;
3434
padded_src.reserve(src.size() + 3);
3535

36-
// Do symbol substitution as described in https://datatracker.ietf.org/doc/html/rfc4648#page-7
36+
// Do symbol substitution as described in https://datatracker.ietf.org/doc/html/rfc4648#section-5
3737
for (auto s : src)
3838
{
3939
switch (s)
@@ -57,7 +57,7 @@ inline std::string preprocessBase64Url(std::string_view src)
5757
case 0:
5858
break; // no padding needed
5959
case 1:
60-
padded_src.append("==="); // this case is impossible to occur, however, we'll insert padding anyway
60+
padded_src.append("==="); // this case is impossible to occur with valid base64-URL encoded input, however, we'll insert padding anyway
6161
break;
6262
case 2:
6363
padded_src.append("=="); // two bytes padding
@@ -72,7 +72,7 @@ inline std::string preprocessBase64Url(std::string_view src)
7272

7373
inline size_t postprocessBase64Url(UInt8 * dst, size_t out_len)
7474
{
75-
// Do symbol substitution as described in https://datatracker.ietf.org/doc/html/rfc4648#page-7
75+
// Do symbol substitution as described in https://datatracker.ietf.org/doc/html/rfc4648#section-5
7676
for (size_t i = 0; i < out_len; ++i)
7777
{
7878
switch (dst[i])
@@ -107,6 +107,10 @@ struct Base64Encode
107107
size_t outlen = 0;
108108
base64_encode(src.data(), src.size(), reinterpret_cast<char *>(dst), &outlen, 0);
109109

110+
/// Base64 library is using AVX-512 with some shuffle operations.
111+
/// Memory sanitizer doesn't understand if there was uninitialized memory in SIMD register but it was not used in the result of shuffle.
112+
__msan_unpoison(dst, outlen);
113+
110114
if constexpr (variant == Base64Variant::Url)
111115
outlen = postprocessBase64Url(dst, outlen);
112116

@@ -242,10 +246,6 @@ class FunctionBase64Conversion : public IFunction
242246
const size_t src_length = src_offsets[row] - src_offset_prev - 1;
243247
const size_t outlen = Func::perform({src, src_length}, dst_pos);
244248

245-
/// Base64 library is using AVX-512 with some shuffle operations.
246-
/// Memory sanitizer don't understand if there was uninitialized memory in SIMD register but it was not used in the result of shuffle.
247-
__msan_unpoison(dst_pos, outlen);
248-
249249
src += src_length + 1;
250250
dst_pos += outlen;
251251
*dst_pos = '\0';
@@ -280,10 +280,6 @@ class FunctionBase64Conversion : public IFunction
280280
{
281281
const auto outlen = Func::perform({src, src_n}, dst_pos);
282282

283-
/// Base64 library is using AVX-512 with some shuffle operations.
284-
/// Memory sanitizer don't understand if there was uninitialized memory in SIMD register but it was not used in the result of shuffle.
285-
__msan_unpoison(dst_pos, outlen);
286-
287283
src += src_n;
288284
dst_pos += outlen;
289285
*dst_pos = '\0';
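
The hunks above touch `preprocessBase64Url` and `postprocessBase64Url` only in their comments, so the surrounding logic is not fully visible in this diff. As a rough illustration of the RFC 4648 section 5 substitution they refer to, here is a standalone sketch. The function names `sketchUrlToStandard` and `sketchStandardToUrl` are hypothetical, and the behaviour (swap `-`/`_` with `+`/`/`, restore or strip `=` padding) is an assumption inferred from the visible `case` labels, not a copy of the ClickHouse implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the preprocessing step: turn base64url text into
// standard base64 so that a regular decoder can consume it.
std::string sketchUrlToStandard(std::string_view src)
{
    std::string padded;
    padded.reserve(src.size() + 3);

    for (char c : src)
    {
        switch (c)
        {
            case '-': padded.push_back('+'); break;
            case '_': padded.push_back('/'); break;
            default:  padded.push_back(c);
        }
    }

    // Restore the '=' padding that base64url omits. A remainder of 1 cannot
    // come from valid input; it is still padded ("===") so that the decoder,
    // not this helper, is the one to reject it.
    const size_t remainder = padded.size() % 4;
    if (remainder != 0)
        padded.append(4 - remainder, '=');

    assert(padded.size() % 4 == 0);
    return padded;
}

// Hypothetical stand-in for the postprocessing step: turn standard base64
// output into base64url by swapping symbols and dropping trailing padding.
std::string sketchStandardToUrl(std::string encoded)
{
    for (char & c : encoded)
    {
        if (c == '+')
            c = '-';
        else if (c == '/')
            c = '_';
    }
    while (!encoded.empty() && encoded.back() == '=')
        encoded.pop_back();
    return encoded;
}
```

For instance, `sketchUrlToStandard("-_8")` gives `+/8=`, and `sketchStandardToUrl("+/8=")` gives back `-_8`.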

src/Functions/base64Decode.cpp

Lines changed: 8 additions & 2 deletions
@@ -7,8 +7,14 @@ namespace DB
77
{
88
REGISTER_FUNCTION(Base64Decode)
99
{
10-
factory.registerFunction<FunctionBase64Conversion<Base64Decode<Base64Variant::Normal>>>();
11-
factory.registerFunction<FunctionBase64Conversion<Base64Decode<Base64Variant::Url>>>();
10+
FunctionDocumentation::Description description = R"(Accepts a String and decodes it from base64, according to RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#section-4). Throws an exception in case of an error. Alias: FROM_BASE64.)";
11+
FunctionDocumentation::Syntax syntax = "base64Decode(encoded)";
12+
FunctionDocumentation::Arguments arguments = {{"encoded", "String column or constant. If the string is not a valid Base64-encoded value, an exception is thrown."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the decoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"Example", "SELECT base64Decode('Y2xpY2tob3VzZQ==')", "clickhouse"}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<Base64Decode<Base64Variant::Normal>>>({description, syntax, arguments, returned_value, examples, categories});
1218

1319
/// MySQL compatibility alias.
1420
factory.registerAlias("FROM_BASE64", "base64Decode", FunctionFactory::CaseInsensitive);

src/Functions/base64Encode.cpp

Lines changed: 8 additions & 2 deletions
@@ -7,8 +7,14 @@ namespace DB
77
{
88
REGISTER_FUNCTION(Base64Encode)
99
{
10-
factory.registerFunction<FunctionBase64Conversion<Base64Encode<Base64Variant::Normal>>>();
11-
factory.registerFunction<FunctionBase64Conversion<Base64Encode<Base64Variant::Url>>>();
10+
FunctionDocumentation::Description description = R"(Encodes a String as base64, according to RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#section-4). Alias: TO_BASE64.)";
11+
FunctionDocumentation::Syntax syntax = "base64Encode(plaintext)";
12+
FunctionDocumentation::Arguments arguments = {{"plaintext", "String column or constant."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the encoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"Example", "SELECT base64Encode('clickhouse')", "Y2xpY2tob3VzZQ=="}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<Base64Encode<Base64Variant::Normal>>>({description, syntax, arguments, returned_value, examples, categories});
1218

1319
/// MySQL compatibility alias.
1420
factory.registerAlias("TO_BASE64", "base64Encode", FunctionFactory::CaseInsensitive);

src/Functions/base64UrlDecode.cpp

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
#include <Functions/FunctionBase64Conversion.h>
2+
3+
#if USE_BASE64
4+
#include <Functions/FunctionFactory.h>
5+
6+
namespace DB
7+
{
8+
REGISTER_FUNCTION(Base64UrlDecode)
9+
{
10+
FunctionDocumentation::Description description = R"(Accepts a base64-encoded URL and decodes it from base64 with URL-specific modifications, according to RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#section-5). Throws an exception in case of an error.)";
11+
FunctionDocumentation::Syntax syntax = "base64UrlDecode(encodedUrl)";
12+
FunctionDocumentation::Arguments arguments = {{"encodedUrl", "String column or constant. If the string is not a valid Base64-encoded value with URL-specific modifications, an exception is thrown."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the decoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"Example", "SELECT base64UrlDecode('aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ')", "https://clickhouse.com"}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<Base64Decode<Base64Variant::Url>>>({description, syntax, arguments, returned_value, examples, categories});
18+
}
19+
}
20+
21+
#endif

src/Functions/base64UrlEncode.cpp

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
#include <Functions/FunctionBase64Conversion.h>
2+
3+
#if USE_BASE64
4+
#include <Functions/FunctionFactory.h>
5+
6+
namespace DB
7+
{
8+
REGISTER_FUNCTION(Base64UrlEncode)
9+
{
10+
FunctionDocumentation::Description description = R"(Encodes a URL (String or FixedString) as base64 with URL-specific modifications, according to RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#section-5).)";
11+
FunctionDocumentation::Syntax syntax = "base64UrlEncode(url)";
12+
FunctionDocumentation::Arguments arguments = {{"url", "String column or constant."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the encoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"Example", "SELECT base64UrlEncode('https://clickhouse.com')", "aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ"}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<Base64Encode<Base64Variant::Url>>>({description, syntax, arguments, returned_value, examples, categories});
18+
}
19+
}
20+
21+
#endif

src/Functions/tryBase64Decode.cpp

Lines changed: 8 additions & 2 deletions
@@ -7,8 +7,14 @@ namespace DB
77
{
88
REGISTER_FUNCTION(TryBase64Decode)
99
{
10-
factory.registerFunction<FunctionBase64Conversion<TryBase64Decode<Base64Variant::Normal>>>();
11-
factory.registerFunction<FunctionBase64Conversion<TryBase64Decode<Base64Variant::Url>>>();
10+
FunctionDocumentation::Description description = R"(Decodes a String or FixedString from base64, like base64Decode but returns an empty string in case of an error.)";
11+
FunctionDocumentation::Syntax syntax = "tryBase64Decode(encoded)";
12+
FunctionDocumentation::Arguments arguments = {{"encoded", "String column or constant. If the string is not a valid Base64-encoded value, returns an empty string."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the decoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"valid", "SELECT tryBase64Decode('Y2xpY2tob3VzZQ==')", "clickhouse"}, {"invalid", "SELECT tryBase64Decode('invalid')", ""}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<TryBase64Decode<Base64Variant::Normal>>>({description, syntax, arguments, returned_value, examples, categories});
1218
}
1319
}
1420

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
#include <Functions/FunctionBase64Conversion.h>
2+
3+
#if USE_BASE64
4+
#include <Functions/FunctionFactory.h>
5+
6+
namespace DB
7+
{
8+
REGISTER_FUNCTION(TryBase64UrlDecode)
9+
{
10+
FunctionDocumentation::Description description = R"(Decodes a URL from base64, like base64UrlDecode but returns an empty string in case of an error.)";
11+
FunctionDocumentation::Syntax syntax = "tryBase64UrlDecode(encodedUrl)";
12+
FunctionDocumentation::Arguments arguments = {{"encodedUrl", "String column or constant. If the string is not a valid Base64-encoded value with URL-specific modifications, returns an empty string."}};
13+
FunctionDocumentation::ReturnedValue returned_value = "A string containing the decoded value of the argument.";
14+
FunctionDocumentation::Examples examples = {{"valid", "SELECT tryBase64UrlDecode('aHR0cHM6Ly9jbGlja2hvdXNlLmNvbQ')", "https://clickhouse.com"}, {"invalid", "SELECT tryBase64UrlDecode('aHR0cHM6Ly9jbGlja')", ""}};
15+
FunctionDocumentation::Categories categories = {"String encoding"};
16+
17+
factory.registerFunction<FunctionBase64Conversion<TryBase64Decode<Base64Variant::Url>>>({description, syntax, arguments, returned_value, examples, categories});
18+
}
19+
}
20+
21+
#endif
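
The difference between the throwing functions and their `try` counterparts registered above can be illustrated with a standalone sketch. Everything here is hypothetical: `sketchDecode`, `sketchDecodeOrThrow`, and `sketchTryDecode` are illustrative names, the decoder handles only padded, standard-alphabet input, and it does not reproduce ClickHouse's `FunctionBase64Conversion` or the base64 library it uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical strict decoder for padded, standard-alphabet base64.
// Returns std::nullopt instead of throwing when the input is malformed.
std::optional<std::string> sketchDecode(std::string_view src)
{
    if (src.size() % 4 != 0)
        return std::nullopt;

    auto value_of = [](char c) -> int
    {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1; // includes '=' found anywhere other than trailing padding
    };

    std::string out;
    for (size_t i = 0; i < src.size(); i += 4)
    {
        const bool last = (i + 4 == src.size());
        // Padding may only appear at the end of the final group: "xx==" or "xxx=".
        const int pad = (last && src[i + 3] == '=') ? ((src[i + 2] == '=') ? 2 : 1) : 0;

        uint32_t group = 0;
        for (int j = 0; j < 4; ++j)
        {
            const int v = (j >= 4 - pad) ? 0 : value_of(src[i + j]);
            if (v < 0)
                return std::nullopt;
            group = (group << 6) | static_cast<uint32_t>(v);
        }

        out.push_back(static_cast<char>((group >> 16) & 0xFF));
        if (pad < 2)
            out.push_back(static_cast<char>((group >> 8) & 0xFF));
        if (pad < 1)
            out.push_back(static_cast<char>(group & 0xFF));
    }
    return out;
}

// Throwing flavour, analogous in spirit to base64Decode.
std::string sketchDecodeOrThrow(std::string_view src)
{
    if (auto decoded = sketchDecode(src))
        return *decoded;
    throw std::invalid_argument("not a valid base64 value");
}

// "try" flavour, analogous in spirit to tryBase64Decode: empty string on error.
std::string sketchTryDecode(std::string_view src)
{
    return sketchDecode(src).value_or(std::string{});
}
```

One consequence of the empty-string convention is that the `try` flavour cannot distinguish malformed input from a valid encoding of the empty string, so callers that need that distinction would likely prefer the throwing flavour.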

tests/queries/0_stateless/02415_all_new_functions_must_be_documented.sql

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@
33
SELECT name FROM system.functions WHERE NOT is_aggregate AND origin = 'System' AND alias_to = '' AND length(description) < 10
44
AND name NOT IN (
55
'aes_decrypt_mysql', 'aes_encrypt_mysql', 'decrypt', 'encrypt',
6-
'base64Decode', 'base64Encode', 'tryBase64Decode', 'base64UrlDecode', 'base64UrlEncode', 'tryBase64UrlDecode',
76
'convertCharset',
87
'detectLanguage', 'detectLanguageMixed',
98
'geoToH3',
