Skip to content

[BEAM-3446] Fixes RedisIO non-prefix read operations #4656

Closed
vvarma wants to merge 1 commit intoapache:masterfrom
vvarma:i3446
Closed

[BEAM-3446] Fixes RedisIO non-prefix read operations #4656
vvarma wants to merge 1 commit intoapache:masterfrom
vvarma:i3446

Conversation

@vvarma
Copy link
Copy Markdown
Contributor

@vvarma vvarma commented Feb 11, 2018

  • BaseReadFn to abstract general jedis operations.
  • Moved key fetch given prefix to ReadKeywsWithPattern DoFn.
  • ReadFn is pure fetch from redis given key.

URL: https://issues.apache.org/jira/browse/BEAM-3446

BaseReadFn to abstract general jedis operations. Separated key fetch using prefix and get by key into serparate DoFn.
@stale
Copy link
Copy Markdown

stale Bot commented Jun 7, 2018

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@stale stale Bot added the wontfix label Jun 7, 2018
@vvarma
Copy link
Copy Markdown
Contributor Author

vvarma commented Jun 8, 2018

the fix is in place @jbonofre please help review

@stale stale Bot removed the wontfix label Jun 8, 2018
@kennknowles
Copy link
Copy Markdown
Member

We have turned on autoformatting of the codebase, which causes small conflicts across the board. You can probably safely rebase and just keep your changes. Like this:

$ git rebase
... see some conflicts
$ git diff
... confirmed that the conflicts are just autoformatting
... so we can just keep our changes are do our own autoformat
$ git checkout --theirs --
$ git add -u
$ git rebase --continue
$ ./gradlew spotlessJavaApply

Please ping me if you run into any difficulty.

Copy link
Copy Markdown
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, sorry for taking so long @vvarma. Can you please rebase this one and take a look at my comments. (@jbonofre sorry for jumping in but was taking a look at broken things because of the recent spotless enable and thought about helping with this one).


return input
.apply(Create.of(keyPattern()))
.apply(ParDo.of(new ReadKeywsWithPattern(connectionConfiguration())))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ReadKeywsWithPattern/ReadKeysWithPattern


private final RedisConnectionConfiguration connectionConfiguration;
private abstract static class BaseReadFn<T> extends DoFn<String, T> {
protected final RedisConnectionConfiguration connectionConfiguration;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be package private

protected final RedisConnectionConfiguration connectionConfiguration;

private transient Jedis jedis;
protected transient Jedis jedis;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be package private

protected transient Jedis jedis;

public ReadFn(RedisConnectionConfiguration connectionConfiguration) {
public BaseReadFn(RedisConnectionConfiguration connectionConfiguration) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove public, in general it is a common Beam practice to restrict access as much as possible. You can use IntelliJ's analyze code to do this.

ScanResult<String> scanResult = jedis.scan(cursor, scanParams);
List<String> keys = scanResult.getResult();

Pipeline pipeline = jedis.pipelined();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: What is the reason to remove pipelining in general, seems like if the approach of this PR is more composable, it would perform worse, won't it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of keys that need to be looked up in a given window or batch can vary. Ideally, we should have a configurable batch size and use MGET https://redis.io/commands/mget if wanted to optimize further.
Pipelining an entire window or batch can cause memory spikes in Redis depending on the number of keys being looked up, for the time being, to simplify things I removed pipeline.

@vvarma
Copy link
Copy Markdown
Contributor Author

vvarma commented Jun 30, 2018

Closing this pr, opened a new one with fixes and rebased. #5841

@vvarma vvarma closed this Jun 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants