Conversation

@xinyu-intel
Member

When a Gluon model is hybridized with static_shape=True and static_alloc=True, cached_op runs in static mode. In this case, we should cache the operator state for better performance. This PR enables that feature, together with MXNet #14785, to speed up Gluon inference, especially for small batch sizes.
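
To illustrate the idea of caching operator state in static mode, here is a minimal pure-Python sketch. It is not MXNet's implementation: `ToyCachedOp`, its `static` flag, and the `state_creations` counter are hypothetical names invented for this example. It only shows why reusing state across calls saves work when shapes and allocations do not change between calls.

```python
class ToyCachedOp:
    """Toy stand-in for a cached op (hypothetical, not MXNet internals)."""

    def __init__(self, ops, static=False):
        self.ops = ops                # list of (name, function) pairs
        self.static = static          # True mimics static mode
        self._states = None           # cached per-operator state
        self.state_creations = 0      # counts how often state is built

    def _create_states(self):
        # Stand-in for the (potentially expensive) per-operator setup.
        self.state_creations += len(self.ops)
        return [{"op": name} for name, _ in self.ops]

    def __call__(self, x):
        if self.static:
            # Static mode: build operator state once, then reuse it.
            if self._states is None:
                self._states = self._create_states()
            states = self._states
        else:
            # Dynamic mode: rebuild operator state on every call.
            states = self._create_states()
        for (_, fn), _state in zip(self.ops, states):
            x = fn(x)
        return x


ops = [("double", lambda v: v * 2), ("inc", lambda v: v + 1)]
dynamic_op = ToyCachedOp(ops, static=False)
static_op = ToyCachedOp(ops, static=True)

for i in range(3):
    dynamic_op(i)
    static_op(i)

print("dynamic state creations:", dynamic_op.state_creations)
print("static state creations:", static_op.state_creations)
```

In this toy, the per-call saving is a fixed cost, so it matters most when each call does little work, which is consistent with the small-batch case mentioned above. In Gluon itself, static mode is requested with `net.hybridize(static_alloc=True, static_shape=True)`.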

@xinyu-intel requested a review from zhreshold on April 24, 2019 07:47
@mli
Member

mli commented Apr 24, 2019

@zhreshold
Member

I'll wait for apache/mxnet#14785

@zhreshold
Member

Merged together with apache/mxnet#14785.

@pengzhao-intel

@xinyu-intel Do we need to update the performance numbers in the tutorial?

@xinyu-intel
Member Author

@pengzhao-intel Throughput improves slightly. I plan to update the numbers along with those for some other models, and I am waiting for 2nd gen Xeon to come online.


