
Problem: Validation code not optimized#2490

Merged
vrde merged 16 commits into bigchaindb:master from kansi:feat/memoize
Sep 4, 2018

Conversation

@kansi
Contributor

@kansi kansi commented Aug 29, 2018

Solution: Use memoization for functions with static validation
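As an illustration of the idea (not the PR's exact code): a pure validation step whose result depends only on hashable inputs can be cached with `functools.lru_cache`, so re-validating the same transaction skips the expensive work. The function name and cache size below are hypothetical.

```python
import functools

@functools.lru_cache(maxsize=16384)
def validate_schema(tx_id, tx_body):
    # stand-in for the real schema/signature checks, which are
    # deterministic for a given transaction
    return len(tx_id) == 64

validate_schema('a' * 64, b'payload')  # computed on the first call
validate_schema('a' * 64, b'payload')  # served from the cache
```

`validate_schema.cache_info()` reports the hit/miss counts, which is handy when tuning `maxsize`.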

Contributor

@vrde vrde left a comment


Tests are missing 👮‍♂️


# to query the transactions for a transaction id, this field is unique
conn.conn[dbname]['transactions'].create_index('id',
                                               unique=True)
Contributor


Good point.

Note for other PRs: feel free to make a PR for this specific issue, so we don't mix concerns 👏

  def create_blocks_secondary_index(conn, dbname):
      conn.conn[dbname]['blocks']\
-         .create_index([('height', DESCENDING)], name='height')
+         .create_index([('height', DESCENDING)], name='height', unique=True)
Contributor


Same as the previous comment 🙂


class HDict(dict):
    def __hash__(self):
        return int.from_bytes(codecs.decode(self['id'], 'hex'), 'big')
Contributor


I had a similar problem recently. While your code converts the hex string representing the transaction.id to a number, a simpler approach is to just use int(self['id'], 16) (I was actually quite surprised by its simplicity when I found it out).

In [1]: int('437752a2c5c3cf2ab8ff6254ca8c0fb417a0951ab651c42481474dd9347971a7', 16) == int.from_bytes(codecs.decode('437752a2c5c3cf2ab8ff6254ca8c0fb417a0951ab651c42481474dd9347971a7', 'hex'), 'big')
Out[1]: True

I was curious about your approach, so I compared the performance of the two:

In [2]: %timeit int('437752a2c5c3cf2ab8ff6254ca8c0fb417a0951ab651c42481474dd9347971a7', 16)
382 ns ± 3.24 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [3]: %timeit int.from_bytes(codecs.decode('437752a2c5c3cf2ab8ff6254ca8c0fb417a0951ab651c42481474dd9347971a7', 'hex'), 'big')
1.72 µs ± 8.16 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Contributor Author


Switched to hash(); the stats follow:

In [1]: %timeit hash('437752a2c5c3cf2ab8ff6254ca8c0fb417a0951ab651c42481474dd9347971a7')
70.4 ns ± 0.625 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
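The hashable-dict trick the thread converged on can be sketched like this: a dict subclass whose hash is derived from its 'id' field, so transaction dicts become valid `lru_cache` keys. This is a sketch, assuming 'id' uniquely identifies the transaction body; the `validate` function is a hypothetical stand-in.

```python
import functools

class HDict(dict):
    """A dict made hashable via its 'id' field so it can serve as an
    lru_cache key. Assumes 'id' uniquely identifies the contents."""
    def __hash__(self):
        # hashing the hex string directly is far cheaper than decoding
        # it to bytes first, per the timings quoted above
        return hash(self['id'])

@functools.lru_cache(maxsize=None)
def validate(tx):
    # stand-in for the real validation logic
    return tx['id']

tx = HDict({'id': '437752a2', 'operation': 'CREATE'})
validate(tx)
validate(tx)  # second call is a cache hit: same hash, equal dict
```

Note that `lru_cache` uses both `__hash__` and `__eq__` on its keys; `HDict` inherits dict equality, so two equal transactions map to the same cache entry.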

@functools.wraps(func)
def memoized_func(*args, **kwargs):
    print(args)
    new_args = (args[0], HDict(args[1]), args[2])
Contributor


That's difficult to understand; can you please add some comments around this code (and the equivalent for memoize_to_dict)?
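For readers following along, the wrapper being discussed can be annotated roughly as below. This is a commented sketch, not the PR's final code: the argument layout `(cls, tx_dict, extra)` and the `from_dict` example are assumptions for illustration.

```python
import functools

def memoize(func):
    """Memoize func(cls, tx_dict, extra): the middle argument is an
    unhashable dict, so wrap it in HDict before caching (sketch)."""

    class HDict(dict):
        # hashable dict keyed on the transaction id
        def __hash__(self):
            return hash(self['id'])

    @functools.lru_cache(maxsize=None)
    def cached(*args, **kwargs):
        return func(*args, **kwargs)

    @functools.wraps(func)
    def memoized_func(*args, **kwargs):
        # args[1] is the transaction dict; replacing it with an HDict
        # makes the whole argument tuple a valid cache key
        new_args = (args[0], HDict(args[1]), args[2])
        return cached(*new_args, **kwargs)

    return memoized_func

@memoize
def from_dict(cls, tx, skip_schema_validation=True):
    # stand-in for the real deserialization/validation
    return tx['id']
```

The inner `cached` function does the actual caching; `memoized_func` only normalizes the arguments so that `lru_cache` can key on them.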

@codecov-io

codecov-io commented Sep 3, 2018

Codecov Report

Merging #2490 into master will increase coverage by 0.14%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2490      +/-   ##
==========================================
+ Coverage   91.73%   91.87%   +0.14%     
==========================================
  Files          41       42       +1     
  Lines        2467     2511      +44     
==========================================
+ Hits         2263     2307      +44     
  Misses        204      204

Solution: enable memoization and fix failing tests
Solution: Add tests for `to_dict` and `from_dict` memoization
@kansi kansi requested review from ldmberman and vrde on September 4, 2018 at 10:04
@vrde
Contributor

vrde commented Sep 4, 2018

I ran prof.py to check the improvement. This is the transaction validation speed on master:

 /tmp ➜ python prof.py
Create 1000 transactions
Start serial validation
  Total time: 4.285186
  Time per transaction: 0.004285

Speed with this patch:

 /tmp ➜ python prof.py
Create 1000 transactions
Start serial validation
  Total time: 3.363722
  Time per transaction: 0.003364

We are now 27% faster in transaction validation!
🏇 🏇 🏇
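prof.py itself is not included in the thread; a minimal harness in the same spirit (entirely hypothetical, matching the output format quoted above) might look like:

```python
import time

def benchmark(validate, txs):
    """Time serial validation of txs and print stats in the same
    format as the prof.py output quoted above (hypothetical)."""
    start = time.perf_counter()
    for tx in txs:
        validate(tx)
    total = time.perf_counter() - start
    print('Create %d transactions' % len(txs))
    print('Start serial validation')
    print('  Total time: %f' % total)
    print('  Time per transaction: %f' % (total / len(txs)))
    return total

# toy stand-in: 1000 fake transactions and a trivial validator
txs = [{'id': format(i, '064x')} for i in range(1000)]
benchmark(lambda tx: len(tx['id']) == 64, txs)
```

Running it once against master and once against the patched branch, as in the comment above, gives the per-transaction comparison.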

@vrde vrde merged commit cb22557 into bigchaindb:master Sep 4, 2018