-
Notifications
You must be signed in to change notification settings - Fork 433
Implement Table::ReadRows() #167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
coryan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sending this early. I think your iterator class does not conform to the requirements for a C++ iterator, and would need some changes to meet them.
Please take a look and let me know if my comments do not make sense.
bigtable/client/table.cc
Outdated
| } | ||
|
|
||
| RowReader Table::ReadRows(RowSet row_set, int rows_limit, Filter filter) { | ||
| return RowReader(client_, table_name(), row_set, rows_limit, filter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to std::move() the "expensive" parameteres (row_set, filter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bigtable/client/table.cc
Outdated
| } | ||
| } | ||
|
|
||
| RowReader Table::ReadRows(RowSet row_set, int rows_limit, Filter filter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You return a RowReader but the class is defined in bigtable/client/internal and presumably in the internal namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| @@ -0,0 +1,158 @@ | |||
| // Copyright 2017 Google Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think RowReader is part of the interface exposed to the user. They may need to define variables of this type:
bigtable::RowReader reader = table.ReadRows(...);it cannot be hidden in the internal directory or namespace. We want to be able to say "do not use anything in internal/ or bigtable::internal directly".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the timely review! I still owe the tests, but replying to the comments:
done - moved
| namespace bigtable { | ||
| inline namespace BIGTABLE_CLIENT_NS { | ||
| /** | ||
| * Object returned by Table::ReadRows(), enumerates rows in the response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Iterate over the results of ReadRows() using the STL idioms"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
|
|
||
| public: | ||
| /// Signifies that there is no limit on the number of rows to read. | ||
| static int const NO_ROWS_LIMIT = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constexpr maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| RowReaderIterator(RowReader* owner, bool is_end) | ||
| : owner_(owner), is_end_(is_end) {} | ||
|
|
||
| RowReaderIterator& operator++(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are required to implement the postfix ++ operator too, though it might be a bad idea to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I followed your advice and refactored to hold the row in the iterator.
I kept to the absl::optional to be able to use emplace.
Updated the * and -> operators as well by basically snatching the ones from absl::optional.
| } | ||
|
|
||
| private: | ||
| RowReader* const owner_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the constant pointer to RowReader disable copying for the iterator class? And I thought iterators were required to be CopyConstructible?
Please consider adding a test where you verify the compile time assertions about iterators:
// RowReader::iterator must satisfy the requirements of an InputIterator.
static void check_iterator_requirements() {
using base_iterator = std::iterator<std::input_iterator_tag, CellReader>;
static_assert(std::is_base_of<base_iterator, iterator>::value,
"iterator must be an InputIterator");
static_assert(std::is_copy_constructible<iterator>::value,
"iterator must be CopyConstructible");
static_assert(std::is_move_constructible<iterator>::value,
"iterator must be MoveConstructible");
static_assert(std::is_copy_assignable<iterator>::value,
"iterator must be CopyAssignable");
static_assert(std::is_move_assignable<iterator>::value,
"iterator must be MoveAssignable");
static_assert(std::is_destructible<iterator>::value,
"iterator must be Destructible");
static_assert(std::is_swappable<iterator>::value,
"iterator must be Swappable");
// ugly hacks to check that if o is of type iterator then ++o and *o
// are valid expressions with the usual return types ...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these are static asserts, I've put them at the top of row_reader.cc in an anonymous namespace. They did fail until I removed the const field.
| } | ||
|
|
||
| private: | ||
| RowReader* const owner_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you hold the already parse and read Row object in the iterator you can implement postfix, make it copy constructible, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks
| }; | ||
|
|
||
| std::shared_ptr<DataClient> client_; | ||
| absl::string_view table_name_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ties the lifecycle of the RowReader to the lifecycle of the Table used to create it. Consider making a copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| /// Number of rows read so far, used to set row_limit in retries. | ||
| int rows_count_; | ||
|
|
||
| /// Holds the last read row, non-end() iterators all point to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, you can have the end() iterators have owner_ == nullptr, and move the Row to the actual iterators. That would make them behave like other iterators do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
coryan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making the changes so quickly. As you know, the tests need a lot of work, but the classes are pretty close.
bigtable/client/row_reader.cc
Outdated
| namespace { | ||
| using base_iterator = std::iterator<std::input_iterator_tag, Row>; | ||
|
|
||
| static_assert(std::is_base_of<base_iterator, RowReader::iterator>::value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing these checks. Can you review the requirements for an InputIterator in this doc, or any documentation you like really, and make sure we are not missing anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are exercising the equality and non-equality in tests through assertions. I've also added an explicit use of the postincrement in ReadOneRowIteratorPostincrement.
bigtable/client/row_reader.cc
Outdated
| RowReader::RowReader(std::shared_ptr<DataClient> client, | ||
| absl::string_view table_name, RowSet row_set, | ||
| int rows_limit, Filter filter, | ||
| std::unique_ptr<RPCRetryPolicy> retry_policy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do not copy or use retry_policy or backoff_policy. That is because this is WIP and you want to make the interface complete?
If we are going to merge this then we either need a (a) TODO saying why these are unused, or (b) remove them and put the TODO() from wherever this is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see below that you are using backoff_policy_, you probably want to transfer those policies then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, fixed
bigtable/client/row_reader.cc
Outdated
| do { | ||
| if (NextChunk()) { | ||
| parser_->HandleChunk( | ||
| std::move(*(response_.mutable_chunks(processed_chunks_)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard to figure out what is going on here. Seems like you move the data into response_ and then immediately move it out. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only move out one chunk at a time. I chose to simplify the parser (compared to what it used to do) and have it take single chunks. Therefore RowReader needs to handle multi-chunk responses, which is what happens here and in NextChunk(). If you agree with the approach, I'll add some comments explaining it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, processed_chunks_ is the count of processed chunks not the vector or list of processed chunks. Do you mind naming that member variable processed_chunks_count_?
With that I think no additional comments would be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bigtable/client/row_reader.cc
Outdated
| if (NextChunk()) { | ||
| parser_->HandleChunk( | ||
| std::move(*(response_.mutable_chunks(processed_chunks_)))); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer to minimize if/else depth:
if (blah) {
do thing;
continue;
}
do other things...over:
if (blah) {
do thing;
} else {
do other things;
}
// end of loop.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I've rearranged the whole thing a bit
bigtable/client/row_reader.cc
Outdated
| std::move(*(response_.mutable_chunks(processed_chunks_)))); | ||
| } else { | ||
| grpc::Status status = stream_->Finish(); | ||
| if (status.ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same preference for shallower if/else here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| * Reads data from a set of rows. | ||
| * | ||
| * @param row_set the rows to read from. | ||
| * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No blanks between the @param sections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bigtable/client/table.h
Outdated
| class Table { | ||
| public: | ||
| /// Signifies that `Table::ReadRows()` should not limit the number of rows. | ||
| static int const NO_ROWS_LIMIT = RowReader::NO_ROWS_LIMIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use two overloads instead of a magic value... Not sure what is best, I just dislike magic values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zero gets special treatment in the protocol, but I agree we can protect the users from accidentally passing it. I've added an overload and enforced that rows_limit, if given, is strictly greater than 0.
bigtable/client/row_reader.cc
Outdated
| static_assert(std::is_destructible<RowReader::iterator>::value, | ||
| "RowReader::iterator must be Destructible"); | ||
|
|
||
| // TODO(C++17): std::is_swappable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO was meant as a "when C++17 is available we can check that iterator works with std::swap() easily". You can probably remove it, assuming the iterator does work with std::swap?
bigtable/client/row_reader.cc
Outdated
|
|
||
| // TODO(C++17): std::is_swappable | ||
|
|
||
| // TODO: check that if o is of type iterator then ++o and *o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know the standard tricks to check such things? For example, we could do:
static_assert(
std::is_convertible<decltype(*std::declval<RowReader::iterator>()), RowReader::value_type>::value,
"*it when it is of RowReader::iterator type must be convertible to RowReader::value_type>);You can do something similar for the prefix ++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! done, I think
bigtable/client/row_reader.h
Outdated
|
|
||
| private: | ||
| RowReader* owner_; | ||
| absl::optional<Row> row_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been thinking for std::istream_iterator for inspiration on this. Your implementation seems like a modern version of the libstdc++ one:
I think that is pretty cool.
# Conflicts: # bigtable/CMakeLists.txt # bigtable/client/row_set.h # bigtable/client/testing/table_test_fixture.h
|
PTAL |
coryan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the namespace things before merging. Otherwise LGTM.
| #ifndef GOOGLE_CLOUD_CPP_BIGTABLE_CLIENT_INTERNAL_ROWREADERITERATOR_H_ | ||
| #define GOOGLE_CLOUD_CPP_BIGTABLE_CLIENT_INTERNAL_ROWREADERITERATOR_H_ | ||
|
|
||
| #include "bigtable/client/version.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bigtable/client/row_reader.cc
Outdated
| #include <thread> | ||
|
|
||
| namespace bigtable { | ||
| inline namespace BIGTABLE_CLIENT_NS {} // namespace BIGTABLE_CLIENT_NS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are opening and closing the namespace in the same line, are you sure that is what you wanted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, that was not supposed to close there. fixed
bigtable/client/row_reader.cc
Outdated
| inline namespace BIGTABLE_CLIENT_NS {} // namespace BIGTABLE_CLIENT_NS | ||
|
|
||
| // RowReader::iterator must satisfy the requirements of an InputIterator. | ||
| namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This defines an anonymous namespace inside the bigtable namespace, are you sure that is what you wanted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bigtable/client/row_reader.h
Outdated
| #ifndef GOOGLE_CLOUD_CPP_BIGTABLE_CLIENT_ROW_READER_H_ | ||
| #define GOOGLE_CLOUD_CPP_BIGTABLE_CLIENT_ROW_READER_H_ | ||
|
|
||
| #include "bigtable/client/version.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
coryan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last minute question, we can resolve in a future PR if needed.
bigtable/client/row_reader.cc
Outdated
|
|
||
| if (not last_read_row_key_.empty()) { | ||
| // There is a previous read row, so this is a restarted call | ||
| row_set_ = row_set_.Intersect(RowRange::Open(last_read_row_key_, "")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if this returns an empty row set? That means "everything" but should mean "we are done".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an excellent question, and I've just committed a test that fails in such a scenario. Will follow up with the fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of keeping other users from doing the same mistake, I've fixed this by handling the special case in RowSet. PTAL
coryan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes to the comments, otherwise LGTM.
bigtable/client/row_set.cc
Outdated
| namespace btproto = ::google::bigtable::v2; | ||
|
|
||
| RowSet RowSet::Intersect(bigtable::RowRange const& range) const { | ||
| RowSet result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move result until after the empty check.
bigtable/client/row_set.cc
Outdated
| if (row_set_.row_keys().empty() and row_set_.row_ranges().empty()) { | ||
| return RowSet(range); | ||
| } | ||
| // Normal case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider // Normal case: find the intersection with row keys and row ranges in the RowSet.
bigtable/client/row_set.cc
Outdated
| *result.row_set_.add_row_ranges() = std::get<1>(i).as_proto_move(); | ||
| } | ||
| } | ||
| // Another special case: intersection is empty, return empty range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider: // Another special case: a RowSet() with no entries means "all rows", but we want "no rows"
Implement the Table::ReadRows() call, fixes #32
The call returns a RowReader wrapper around the gRPC stream, that can also restart the call with updated arguments.