-
Notifications
You must be signed in to change notification settings - Fork 2.2k
[Feature][Redis] Add redis key into the result record #9574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@Hisoka-X Since the previous Redis connector tests were already close to the 2-hour timeout, |
No problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new read_key_enabled option to the Redis source connector that includes the Redis key alongside the value in output records. It also introduces a single_field_name option to specify the field name for single Redis values when the key is enabled.
- Introduces
read_key_enabledconfiguration to include Redis keys in output records - Adds
single_field_nameoption for mapping single values when using keys - Refactors Redis source reader architecture with new KeyedRecordReader and UnKeyedRecordReader classes
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| RedisSourceOptions.java | Adds new configuration options for key reading functionality |
| RedisParameters.java | Updates parameter handling for new key-enabled features |
| RedisSourceReader.java | Refactors to use strategy pattern with keyed/unkeyed readers |
| KeyedRecordReader.java | New implementation for reading Redis data with keys included |
| UnKeyedRecordReader.java | New implementation preserving original behavior without keys |
| RedisRecordReader.java | Abstract base class for the reader strategy pattern |
| Test configuration files | E2E test configurations for various Redis data types with key reading |
| RedisTestCaseTemplateIT.java | New test methods validating key reading functionality |
| Redis.md | Documentation updates for new configuration options |
| backend.yml | Increases CI timeout to accommodate additional tests |
| } else | ||
| redisRecordReader = | ||
| new UnKeyedRecordReader(redisParameters, deserializationSchema, redisClient); |
Copilot
AI
Jul 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Missing braces around single-statement else clause. For consistency and maintainability, add braces around the else block.
| } else | |
| redisRecordReader = | |
| new UnKeyedRecordReader(redisParameters, deserializationSchema, redisClient); | |
| } else { | |
| redisRecordReader = | |
| new UnKeyedRecordReader(redisParameters, deserializationSchema, redisClient); | |
| } |
| try { | ||
| node = JsonUtils.parseObject(node.textValue()); | ||
| } catch (Exception e) { | ||
| // do nothing |
Copilot
AI
Jul 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The catch block with '// do nothing' comment silently ignores parsing errors. Consider logging the exception at debug level to aid troubleshooting while maintaining the fallback behavior.
| // do nothing | |
| log.debug("Failed to parse JSON object from text value: {}", node.textValue(), e); |
|
cc @liunaijie |
| RedisBaseOptions.KEY) | ||
| RedisBaseOptions.KEY, | ||
| RedisSourceOptions.READ_KEY_ENABLED, | ||
| RedisSourceOptions.SINGLE_FIELD_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest mark RedisSourceOptions.SINGLE_FIELD_NAME is required when data_type = key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at the code and noticed that the behavior of the KEY and STRING types seems completely identical.
Is there a specific reason why the KEY type is separated out?
Was KEY perhaps introduced to semantically distinguish it from other types that can hold multiple values?
Also, just to confirm: When key_read_enabled=true and data_type=key, should single_field_name be mandatory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my bad, when data_type = key. SINGLE_FIELD_NAME should not be required.
| objectNode = JsonUtils.createObjectNode(); | ||
| setValueInNode(objectNode, node); | ||
| } | ||
| objectNode.put("key", key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest this field use parameter, user can change this name. the default value is key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| try { | ||
| node = JsonUtils.parseObject(text); | ||
| } catch (Exception e) { | ||
| // do nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some log. or we can't find this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| "Redis source requires a deserialization schema to parse the JSON record with key: " | ||
| + key); | ||
| } else { | ||
| deserializationSchema.deserialize(json.getBytes(), output); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now redis source only support JSON format. This logical is look good to me. But we need think about how to deal with other format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback!
Currently, Redis source only supports JSON format as you mentioned.
To prepare for supporting other formats in the future, I have refactored the parsing logic by defining an interface and implementing format-specific parsers.
This design allows easier extension to support additional formats later on.
Please let me know if you have any further suggestions!
…upport of multiple formats
| RedisBaseOptions.KEY, | ||
| RedisSourceOptions.READ_KEY_ENABLED, | ||
| RedisSourceOptions.SINGLE_FIELD_NAME, | ||
| RedisSourceOptions.KEY_FIELD_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest change to:
.conditional(
RedisSourceOptions.READ_KEY_ENABLED,
true,
RedisSourceOptions.SINGLE_FIELD_NAME,
RedisSourceOptions.KEY_FIELD_NAME
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liunaijie
Thank you for the suggestion!
However, I’d like to clarify that single_field_name is only applicable when the value in Redis is a single value .
In other cases, setting single_field_name has no effect.
If we want to make a required option, I believe enforcing key_field_name when read_key_enabled=true would be more appropriate.
What kind of constraint do you think would be appropriate to add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.conditional(
RedisSourceOptions.READ_KEY_ENABLED,
true,
RedisSourceOptions.SINGLE_FIELD_NAME,
RedisSourceOptions.KEY_FIELD_NAME
)
this means when read_key_enabled = true, then the single_field_name and key_field_name is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wanted to clarify that I think it would be sufficient to enforce only key_field_name as required when read_key_enabled is true.
I believe making single_field_name mandatory might be unnecessary.
Please let me know if you see any reason otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the optionRule method can used in:
- runtime validation
only requiressingle_field_namewhenread_key_enabled = true, askey_field_nameis optional due to its default value.
In this case, we don't need setkey_field_name - configuration creation
e.g., in WebUI needs stricter rules:
when enabling read_key_enabled:
show key_field_name input with pre-filled default (key)
show single_field_name as required empty field (must be explicitly set)
block config submission until single_field_name is provided
In this case, we need setkey_field_nameandsingle_field_nameare required whenread_key_enabled=true
Given the second scenario, it's advisable to explicitly set both parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that makes sense. I understand that stricter validation is necessary, especially at the WebUI level, to ensure configuration consistency. Thank you for the clarification!
|
Overall LGTM. |
|
Thanks again for your help and review! 🙏 |
Hisoka-X
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dybyte !
Fixes #8838
Purpose of this pull request
This PR adds a new
read_key_enabledoption to the Redis source connector, which when enabled, includes the Redis key in each output record along with the value.Currently, this feature requires the schema to have a
keyfield named exactly as such and expects JSON-formatted data.Future improvements will focus on supporting flexible key field naming.
Additionally, to better support non-JSON (single-field) Redis values when
read_key_enabledis enabled, this PR introduces a newsingle_field_nameoption.This option allows users to specify the field name under which the single Redis value will be stored in the output record, making it possible to use key-value pairs even when the Redis value is not a JSON object.
Question
As I understand, without an explicit schema, Seatunnel maps fields by order, making it hard to insert the Redis key reliably.
Currently,
read_key_enabledrequires a schema with akeyfield and JSON format.Is there a recommended way to support this feature without schema?
Thanks for your advice!
Does this PR introduce any user-facing change?
Yes, this PR introduces two new configuration options to the Redis source connector:
read_key_enabled- When set to true, the Redis key will be included in the output record.single_field_name– Whenread_key_enabledis true and the Redis value is a single non-JSON value, this option allows specifying the field name to use for the value.How was this patch tested?
e2e tests were conducted for multiple Redis data types including string, list, set, and zset.
Additionally, tests were performed on complex data structures with custom keys, verifying the feature works correctly with various schema definitions and supports custom sink keys.
Check list
New License Guide