Updated Microsoft.WindowsAzure.Storage and Microsoft.WindowsAzure.ConfigurationManager to the latest Nuget packages and code converted accordingly #100
Thank you, @veikkoeeva. The general direction of keeping all changes inside AzureTableDataManager and keeping all the other code unmodified is right. That was exactly the intent.
@gabikliot Hmm... I'm not sure how to change only a small number of files, as when I upgrade the storage library the solution doesn't compile anymore without touching quite many files. I believe one nice outcome is that there are some pieces of refactored code, if nothing else. Even that took some time to think about, at least for me. :) With some luck, I get under the hood, so to speak, next weekend. Though I don't mind if someone wants to just copy the piece I changed over to another branch with perhaps more moderate changes. I started by upgrading the storage library only. As written in #79, this is more in the class of a good stab and a serious try, but not carried out with utmost rigor (e.g. I didn't capture data from a running system, write tests and refactor after that) and hence more action is needed. There are problems, and likewise opportunities for optimization, as alluded to already, but I need to write about them tomorrow, in about 20 hours. Oh, the most interesting case is <edit: …>
I mostly meant not doing the renaming of the word "table" to "tableReference" in the comments in a lot of files, which makes reviewing very hard. If you undo all of these, the number of other changed files is manageable.
Yeah, I think I got that. Mm, I need to put the kids to sleep now, but I'll see what I manage to do this evening. As you point out, that renaming doesn't make sense.
No rush, this can wait. Thank you for your help!
@gabikliot ... And some updates later. Yes, after every commit I thought I had it fixed, and after pushing I thought that okay, let's not squash it anymore. Shouldn't work at midnight, ugh. Oh, there are some comments and TODOs to grab attention. They ought to be cleaned up and possibly turned into issues. Hmm, and a few spurious whitespace changes, luckily only a few. Should check my VS settings... Anyhow, I'm off to bed now. Do read, people. :)
Thank you. This is much more compact now.
@gabikliot and others, I updated the initial comment. I'll try to be more of a laborer here, as I don't currently have that much time to think about this. So, any ideas are welcome and I'll try to provide assistance wherever I can. If this feels like a good path forward, I can fix issues as they emerge, and naturally I welcome others to do the same.
@gabikliot, @jthelin-microsoft I was wondering if I should add storage tests to the project. Say, something like … I was also thinking a bit ahead today that perhaps in the future there's a need for SQL Engine storage too. It's not that bad at logging or queuing, for instance, and can become cheaper too in Azure. Doing it with SSDT is handy. I set up one such system at work; it builds the database, deploys, creates dacpacs, runs integration tests (through two IIS instances to the database, which gets reset appropriately) etc., all that relying on a basic VS installation with LocalDb.
I am porting a bunch of Azure Table storage tests that we have internally and will submit a PR soon. So you'd better wait until you see what is already there. If you find a need to add more on top of it, you are welcome. We already support SQL storage for all system tables: https://orleans.codeplex.com/wikipage?title=What%27s%20new%20in%20Orleans%3f&referringTitle=Orleans%20Documentation and we have a bunch of tests for that as well.
PR #114 contains the Azure tests that @gabikliot mentioned, and those tests are now merged into master.
Thanks. I updated to the latest storage and configuration packages, fixed the new code and pushed. I did git pull --rebase upstream master, but somehow this looks a bit weird in GitHub now; if this ends up in master it can be fixed, and if only some pieces get applied somehow, it doesn't matter. O_o In any event, the following tests will fail now (I didn't investigate further):
Rebased on the latest master. I'm in a bit of a bind at least this weekend as to investigating the tests, fixing them and fixing the ported code, but here are the results of the three failed tests:

Test Name: AzureMembership_ReadAll_1
Result Message: Assert.AreNotEqual failed. Expected any value except:<W/"datetime'2015-02-14T15%3A08%3A32.103Z'">. Actual:<W/"datetime'2015-02-14T15%3A08%3A32.103Z'">. New VersionEtag differetnfrom last

Test Name: AzureMembership_ReadRow_1
Result Message: Assert.AreNotEqual failed. Expected any value except:<W/"datetime'2015-02-14T15%3A08%3A31.88Z'">. Actual:<W/"datetime'2015-02-14T15%3A08%3A31.88Z'">. New VersionEtag differetnfrom last

Test Name: SiloAddress_ToFrom_RowKey
Rebased on the latest master. All tests pass now. @gabikliot Could someone take a look at whether there's something that still needs to be done? I haven't myself installed the SDK generated from this code and it may take a while, like everything lately. :) Of course, the issues regarding refactoring the code to better utilize the new SDK still need to be addressed.
Thank you @veikkoeeva! We will take a look. So the code passed all the unit tests, as is, with no changes? Nice! Did you run against the storage emulator or real Azure storage? What "issues regarding code refactoring to better utilize the new SDK" are you talking about? Inside AzureTableDataManager and AzureQueueDataManager we can do whatever we want. I thought that is what you did now.
@gabikliot The tests are unmodified; I ran them on the Azure Emulator. I have a perhaps unjustified feeling the tests won't cover all the cases. As for refactoring, these new libraries should bring a more performant storage layer in general, but there are things one could do, taking cues from Windows Azure Storage Client Library 2.0 Tables Deep Dive and especially from Jeffrey Richter's (of Wintellect fame) Guide to Working with Azure Storage Tables:
Then possible bugs and errors:
…nager 2.0.3 Nuget packages updated with associated code changes. Fixes #79.
Thank you @veikkoeeva. We will take a look at those.
@gabikliot Just noticed that UpsertTableEntryAsync doesn't do exponential backoff of retries as in the previous version. I think it needs to be fixed. Can you spot anything else?
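For reference, in the 4.x client the backoff can likely be restored declaratively instead of via a hand-rolled retry loop; a minimal sketch using the library's ExponentialRetry policy (the 4-second delta and 5 attempts are placeholder values, not the settings the old Orleans code used):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.RetryPolicies;
using Microsoft.WindowsAzure.Storage.Table;

public static class UpsertSketch
{
    // Sketch only: ask the storage client itself to retry with exponential
    // backoff rather than looping manually around the call.
    public static Task UpsertWithBackoffAsync(CloudTable table, ITableEntity entity)
    {
        var options = new TableRequestOptions
        {
            RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(4), 5)
        };
        return table.ExecuteAsync(TableOperation.InsertOrReplace(entity), options, null);
    }
}
```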
@veikkoeeva Can you please update the pull request to the latest master version? I just pulled it now and had a bunch of conflicts. Thank you.
@veikkoeeva I have a question. How would we like to handle the process of accepting this PR? I will most certainly have a bunch of comments and will need to change/fix a bunch of small things. One way is for me to pull this PR, work on it, modify all I need, test it, and then submit a new PR. But that would mean the commits would carry my name. You did most of the work, so it's fair it comes under your name.
So I merged all the conflicts and have it building now.
Hi @veikkoeeva. I am reviewing the code now. I spotted one key place: UpdateTableEntryConditionallyAsync does not do anything with the provided eTag. It just does Replace or InsertOrReplace. The key point about this function was that the update should be conditional, based on the eTag. It is also not clear how to do that in the new APIs. In the old API we did: |
|
@gabikliot You are right, difficult these "refactor during the night" sessions. :) Hmm, it should be assigned to As you might noticed, I refactored out already some code that was used internally to just retrieve the entity One of the troubles I had was that I have only used the newer library and in general think this in terms of the "HTTP semantics" and I wasn't sure how the old library functions and what exactly the current/old code should be doing. By the way, one additional, low-hanging fruit type of optimization could be PayloadFormat = TablePayloadFormat.JsonNoMetadata. I'm on GMT +2 and I suppose you've gone to asleep now. I'm not sure if you can update my fork of Orleans directly or should I flip some switches to give rights, but I don't mind you or someone else doing additional fixing. If there's something I can do, I'll be checking this a few times during the day, but real work starts probably, hmm, in about 14 hours. |
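A sketch of the conditional update discussed here (method and variable names are made up for illustration): in the new API the eTag travels on the entity itself, so TableOperation.Replace sends an If-Match header with that value and fails with 412 if the row changed in the meantime, while "*" makes the replace unconditional.

```csharp
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

public static class ConditionalUpdateSketch
{
    // Sketch only: assign the caller-provided eTag to the entity; Replace
    // then becomes conditional on that version. entity.ETag = "*" would
    // mean "replace unconditionally".
    public static Task ReplaceConditionallyAsync(CloudTable table, ITableEntity entity, string eTag)
    {
        entity.ETag = eTag;
        return table.ExecuteAsync(TableOperation.Replace(entity));
    }
}
```

The JsonNoMetadata payload format mentioned above is a one-liner on the client: `tableClient.DefaultRequestOptions.PayloadFormat = TablePayloadFormat.JsonNoMetadata;`.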
👍 @veikkoeeva, very nice work! Is there anything your PR is missing, or is it good to go?
@ReubenBond, are you asking about the Azure storage move to the 4.2 PR? I am currently integrating this PR. There are a couple of small fixes that need to be done (around eTags), so the PR as it is now won't work correctly.
I recommend waiting a day or two. I am on it right now.
Thank you, @gabikliot! I'll work around it for now.
@ReubenBond Yep, @gabikliot is on the rest of the stuff currently. This started as a hack that got serious, so I left @gabikliot some work to fix. :) @gabikliot Don't forget my comment on exponential backoff. If there's anything I can do, push the newest code somewhere, give a notice and I'll get on it as soon as possible.
Thank you both :) your efforts are very much appreciated!
@veikkoeeva Very nice work! Actually I also worked on this, but I had no time to finish yet. It would be more readable if you used the CloudTable async methods instead of this: … I think we should consider catching a StorageException in each try-catch block, not just the generic Exception. @gabikliot Are you planning to change the interfaces around the ETag? I'm just wondering because the Tuple<T, string> and the separate tableVersionEtag aren't needed any more. I think ReadSingleTableEntryAsync and DeleteTableEntryAsync should use the storage client directly and not the batch version, for performance reasons (haven't tried). Instead of … we should use this below, also for performance reasons: … That's it for now. I will go through it again, but I hope this was helpful. Thanks again @gabikliot and @veikkoeeva
I am working on this PR right now.
My goal is to get a correct, fully functional implementation first and accept the PR at that point. One thing: … Anyway, this is not really important at this point. First I want to get a fully correct, functionally equivalent implementation. Only after that will I look at improvements.
I think @pherbel meant that the entities themselves have an … Otherwise, I think the exception handling was a dark spot. I could see that the namespaces got updated, but it would be difficult and laborious to verify without extensive tests whether they are all as they were before or, if not, how they should work. That's where @gabikliot steps in, lucky for us. 👍 As a side note, I thought about updating the test project so that it starts the emulator automatically. It would be a nice feature, especially on a CI machine. Perhaps also something that downloads and installs the correct version if needed.
Makes sense. About the eTag, @veikkoeeva wrote down exactly what I meant.
I see. Another improvement would be a custom retry policy implementation.
Hi @veikkoeeva @ReubenBond @pherbel So far it passes all our tests, but we are still running a few more tests on it, to make sure.
Thank you, @gabikliot! Trying the bits, I'm getting an "Azure storage connection string cannot be blank" error which I wasn't getting before. It looks like it's pulling the connection string from the Azure configuration instead of the silo config file. I'm not running in Azure, but I'm using an Azure system store.
Reuben, nothing changed around that. What are you running? The Azure web sample? I think it is supposed to pull it from the Azure config in the Azure web sample and in any real Azure deployment; I think the Azure host expects it there. Anyway, I think this is unrelated. If you set the connection string correctly, does it work OK in all your scenarios?
I think I know what's up: the NuGet packages were copying a default config which was overwriting my own config. I've remedied that and am testing again.
It's about six o'clock in the morning here; I'll try again in about fifteen hours if you haven't merged this by then. I'm excited!
It looks like my current problems are caused by differing versions of the Azure storage libs. I'm seeing an exception on silo init: … I also see this in the log: … Is there some way that we can also support newer versions of the libs?
@ReubenBond Hmm, shouldn't there be the 4.3.0 version of the libraries in the Orleans solution? As a tangent, I wonder if the next Orleans release should include a …
Maybe we can update all the way to 4.3.0?
@ReubenBond We will think about it. We have other big customers who are still on 4.2, so we need to find a solution that will satisfy everyone.
The general thought was to require the latest version -1, so that people aren't forced to upgrade to the latest right away but can do so easily if they are ready. Do you see any problem with this approach?
I wonder why binding redirection doesn't seem to be working - it should be happy with the newer dlls. @gabikliot I've got 4.3 because I was using Azure storage for other things.
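For what it's worth, the redirect that should make a 4.2-compiled Orleans accept the 4.3 assembly would look roughly like this in the host's app.config (a sketch; the publicKeyToken is the one Microsoft signs the storage assembly with):

```xml
<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="Microsoft.WindowsAzure.Storage"
                        publicKeyToken="31bf3856ad364e35" culture="neutral" />
      <bindingRedirect oldVersion="0.0.0.0-4.3.0.0" newVersion="4.3.0.0" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>
```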
@sergeybykov I don't see a problem here, as I can update the packages afterwards (the .nuspec seems to define >= 4.2.0.0, which is good). Tangentially, this touches upon how to maintain versions. Should the new versions be branched, released to NuGet and then maintained in their respective branches, and when a new version follows (perhaps having only updated libraries), branched again, released to NuGet and maintained likewise? That way one can pick and choose the versions. I don't see this as an issue as long as bigger updates don't take too long. Then again, perhaps it is less work to have the releases tagged and maintain a master branch that has all the fixes, updates etc. When can we expect this to turn into a NuGet release? To clarify, my Orleans usage is quite limited currently and I strive to update as soon as there are updates and to fix problems as they emerge.
It would look reasonable to close this PR (at some point) and merge the branch from @gabikliot. What do you guys think?
@veikkoeeva Do you think we need a branch or just a release tag for such updates? We are in the process of testing @gabikliot's branch while he is off skiing. If there are no issues, I hope to merge it today. Then we'll need to codesign the binaries, rebuild the NuGets and publish them. I agree - we should close this PR, as your commits are included in @gabikliot's branch.
@sergeybykov I suppose versioning comes down to how the Orleans team would like to support the libraries and how to communicate that. For instance, if there is a major, backwards-incompatible update to the public API (or dependent components), would the team still want to support previous versions with bugfixes or backported features? If so, it would look feasible to branch the release from master and tag it. Some industries or enterprise policies may dictate certain procedures when there are major updates. One central factor in what to use in such projects is support for fix releases, as it may be "politically impossible" to update a component when there's a major change (many times someone just looks at the version number). This reminds me that the current update is for 4.2.0 whereas the … Anyhow, I'll close this PR and we'll wait for the update from @gabikliot to land in master and NuGet. The code is at https://github.com/gabikliot/orleans/tree/AzureStorageUpdate-CLEAN, no PR as of yet. Big thanks to all for your patience!
@veikkoeeva I agree with you - if there is a major or breaking change in dependencies, it makes sense to branch. For non-breaking minor changes, hopefully, a NuGet update will just work.
Fixed via #163.
Followed up with #168, which is now merged.
This resolves #79 when the converted code works as intended.
With this I sought to see approximately how much effort it would take to take a stab at the storage system and what issues there would be. There aren't many storage-related tests and I thought that creating them would take too much effort for what I aimed at. I am not sure if this would be the best approach forward. One could reason that testing (and fixing) these new binaries could be done by observing from experience whether they behave well enough. That is, there are other means besides a test harness to be assured of functionality, e.g. logs, alerts, and the experience of running a sufficiently busy system. If nothing else, this refactoring attempt presents some way to transform the code to quite close to the needed functionality.
The resulting code is not good, nor is it even correct in all parts. It uses the new binaries, but there needs to be cleanup, especially in the AzureTableDataManager storage methods. Even after that, it feels like somewhat of a throw-away phase. Although that might be the case, I feel it would be an important step when assessing the conformance of this new version with the previous one. In particular, this version would allow using newer libraries with the Orleans NuGet packages. In the meantime there could be incremental performance updates and even a new storage system implementation. There's a blog post, Windows Azure Storage Client Library 2.0 Tables Deep Dive, which explains the newer constructs quite well.
Some spots that look like potential bugs and cases for improvement:

- … doesn't check the list length. Batches are limited to 100 elements by the Azure runtime. Even better would be to take an arbitrary-length list and chunk it into appropriately sized batches.
- … does have a check, but it could also do chunking.
- … look like they no longer need complicated custom error handling.
- CloudTableClient is passed as a parameter to a lot of functions. Perhaps it would be better to pass CloudTable, as usually the immediate call is var tableReference = tableClient.GetTableReference(TableName); to obtain a reference and then use it.
- In ReminderTableEntry the ETag is shadowed.

This refactoring isn't ready. There are TODOs, which should be turned into issues, commented-out code that ought to be removed and, as mentioned, issues with how to update, insert, upsert and merge entities. These could be minor issues if someone points out the desired behaviour in terms of the new library (or straight HTTP verbs).

How to proceed
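One concrete step would be the batch chunking mentioned above; a rough sketch (method and variable names are made up for illustration; note that all entities in one batch must share a partition key):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

public static class BatchSketch
{
    // The Azure Table service rejects batches with more than 100 operations.
    private const int MaxBatchSize = 100;

    // Sketch only: accept an arbitrary-length list and submit it in
    // appropriately sized batches instead of assuming it fits in one.
    public static async Task InsertInBatchesAsync(CloudTable table, IList<ITableEntity> entities)
    {
        for (int i = 0; i < entities.Count; i += MaxBatchSize)
        {
            var batch = new TableBatchOperation();
            foreach (var entity in entities.Skip(i).Take(MaxBatchSize))
            {
                batch.Insert(entity);
            }
            await table.ExecuteBatchAsync(batch);
        }
    }
}
```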
A bold aim
I think the bolder aim could be to provide a stream to storage, or two streams. One stream could save straight to storage; the other could batch messages and save (adaptively on load?) based either on some time interval or on the number of elements in the queue. This stream could perhaps have a configurable failover so that when, say, Azure Table Storage (ATS) goes offline, a secondary system would be attempted (a SQL Server, or another ATS account (via connection string)). This system could perhaps provide library components for common pub-sub functionality. This kind of batching doesn't look too complicated from an Rx perspective. As an example: http://blog.petegoo.com/2013/01/20/reactive-extensions-in-message-processing/ (though a delay-sensitive stream would be more complicated).
These streams could, naturally, provide a place to intercept the messages.
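The time-and-count batching alluded to above is indeed compact in Rx; a sketch under the assumption of a Subject as the ingress and a caller-supplied batch writer (all names here are hypothetical, not an existing Orleans API):

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

public class BatchingStreamSketch
{
    private readonly Subject<ITableEntity> writes = new Subject<ITableEntity>();

    // Sketch only: flush whatever has accumulated every second, or as soon
    // as 100 items are queued, whichever comes first. The batch writer is
    // supplied by the caller, e.g. a chunked batch insert.
    public IDisposable Start(Func<IList<ITableEntity>, Task> writeBatchAsync)
    {
        return writes
            .Buffer(TimeSpan.FromSeconds(1), 100)
            .Where(batch => batch.Count > 0)
            .Subscribe(async batch => await writeBatchAsync(batch));
    }

    public void Enqueue(ITableEntity entity)
    {
        writes.OnNext(entity);
    }
}
```

Note that the async lambda in Subscribe is fire-and-forget; a production version would need error handling and backpressure around the writer.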