Use command line checksumAlgorithm in generator driver#81934
Use command line checksumAlgorithm in generator driver#81934RikkiGibson merged 8 commits intodotnet:mainfrom
Conversation
| hintName, | ||
| ChecksumAlgorithm == SourceHashAlgorithm.None || ChecksumAlgorithm == sourceText.ChecksumAlgorithm | ||
| ? sourceText | ||
| : SourceText.From(sourceText.ToString(), encoding: sourceText.Encoding, checksumAlgorithm: ChecksumAlgorithm)); |
There was a problem hiding this comment.
I'm unsure what to do in this case. The generator has a method it can call, passing a SourceText it already created. We want to override that and essentially ignore the ChecksumAlgorithm they specified, when we have our own that we want to use.
@tmat do you think this is a case we need to handle? What if we made it so our checksumAlgorithm is considered, when the AddSource(string, string) overload is used?
I don't think this method of making a derived text is fine. Maybe I can make a new SourceText subtype, which delegates to the underlying text, except it uses a different ChecksumAlgorithm. I'll see if that works out. #Resolved
There was a problem hiding this comment.
It's not too difficult to write a class SourceTextWithAlgorithm, wrapping the original SourceText, but, it's not obvious to me which virtual methods to override and delegate to the underlying text. We wouldn't want to inadvertently create a derived text which drops the new checksum algorithm, but, we also wouldn't want to create some perf issue when the text is used.
There was a problem hiding this comment.
I don't see the point with a user providing a checksum algorithm. I think we just support SourceText to have a simple text abstraction that isn't a contiguous sequence (like System.String). We should just wrap/copy, passing along the original data. But overriding with our checksum algorithm imo.
There was a problem hiding this comment.
I've now made it so that when the ChecksumAlgorithm of the text they gave us, differs from what we want to use, then we just wrap theirs.
| /// <param name="hintName">An identifier that can be used to reference this source text, must be unique within this generator</param> | ||
| /// <param name="source">The source code to add to the compilation</param> | ||
| public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8)); | ||
| public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8, checksumAlgorithm: _checksumAlgorithm == SourceHashAlgorithm.None ? SourceHashAlgorithm.Sha1 : _checksumAlgorithm)); |
There was a problem hiding this comment.
Note that in the None case, this is just using the default value for this parameter, which we were using implicitly before.
There was a problem hiding this comment.
Is there a reason for using Sha1 here other than it's what we did before? If so I would suggest that we add SourceHashAlgorithm.Latest and use it here. Generally the rule for deprecated crypto is that we can't default to it unless it was required due to compatibility, file format specification requirements, etc ...
Consider that using SHA1 here as a default means you will run into problems on say Redhat. Or anywhere where openssl has a locked down implementation that throws on SHA1 usage.
There was a problem hiding this comment.
That seems reasonable. Were you thinking of SourceHashAlgorithms.Default? Or is there a reason to introduce a distinct ‘Latest’ value in addition to that?
It feels like perhaps the SourceText.From method which had SHA1 as a default value, should also change to SourceHashAlgorithms.Default, but, that change may have more of a ripple effect.
There was a problem hiding this comment.
Changed to use SourceHashAlgorithms.Default when the host doesn't provide an algorithm. Note that the command line compiler will always provide an algorithm, because it has a default value for CommandLineArguments.ChecksumAlgorithm, which is the same as SourceHashAlgorithms.Default.
src/Compilers/Core/Portable/SourceGeneration/GeneratorContexts.cs
Outdated
Show resolved
Hide resolved
jcouv
left a comment
There was a problem hiding this comment.
LGTM Thanks (commit 2) with a suggestion to consider
|
It looks like we have no tests on VB side. #Closed |
|
Done with review pass (commit 4) #Closed |
| /// <param name="hintName">An identifier that can be used to reference this source text, must be unique within this generator</param> | ||
| /// <param name="source">The source code to add to the compilation</param> | ||
| public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8)); | ||
| public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8, checksumAlgorithm: _checksumAlgorithm == SourceHashAlgorithm.None ? SourceHashAlgorithm.Sha1 : _checksumAlgorithm)); |
There was a problem hiding this comment.
Is there a reason for using Sha1 here other than it's what we did before? If so I would suggest that we add SourceHashAlgorithm.Latest and use it here. Generally the rule for deprecated crypto is that we can't default to it unless it was required due to compatibility, file format specification requirements, etc ...
Consider that using SHA1 here as a default means you will run into problems on say Redhat. Or anywhere where openssl has a locked down implementation that throws on SHA1 usage.
| /// <summary> | ||
| /// When specified, overrides the <see cref="SourceText.ChecksumAlgorithm"/> given in <see cref="SourceProductionContext.AddSource(string, SourceText)"/>. | ||
| /// </summary> | ||
| internal SourceHashAlgorithm ChecksumAlgorithm { get; init; } |
There was a problem hiding this comment.
Is the use case for this testing or is there a product reason to override this value from the compiler option?
There was a problem hiding this comment.
There's a product reason for this. The reason is not to override the value from the compiler option. Rather, it is to override the ChecksumAlgorithm from the SourceText that the generator created, and use the ChecksumAlgorithm given from the compiler option instead.
|
Would it make sense to simply throw on an input that doesn't use proper algorithm? #Closed |
I was going to suggest that but it seems that value is coming from an Or maybe we just add an |
I am not sure about this. The problem is that if we ship the change, and the user starts getting that exception, they would have to either change the Generator authors will also need a clearly documented way to get the checksum to use with |
Still working on addressing this, I am going to just port the tests over to |
|
/backport to release/dev18.3 |
|
Started backporting to |
Not quite following this. The compiler already errors when you pass |
|
Consider a command line build scenario like the following.
With the solution in this PR, the compiler will see that the text produced by the generator doesn't have the desired checksum algorithm. So it will call If we just threw when the source text coming from the generator used the wrong algorithm, then, generators which worked yesterday could start throwing today in people's builds due to checksum algorithm mismatch. |
AlekseyTs
left a comment
There was a problem hiding this comment.
LGTM (commit 7), assuming that SourceTextWithAlgorithm.cs will be removed before merging
jcouv
left a comment
There was a problem hiding this comment.
LGTM Thanks (commit 7) modulo unnecessary file removal
|
/backport to release/dev18.3 |
|
Started backporting to |
Note: in terms of validation, I'm specifically referring to |
|
|
||
| internal SourceText WithChecksumAlgorithm(SourceHashAlgorithm checksumAlgorithm) | ||
| { | ||
| if (checksumAlgorithm == SourceHashAlgorithm.None || checksumAlgorithm == ChecksumAlgorithm) |
There was a problem hiding this comment.
Can you help me understand this? If this has SHA256 and the caller says WithChecksumAlgorithm(SourceHashAlgorithm.None) the return is SHA256? The caller specified a value and it wasn't used. That doesn't seem correct.
There was a problem hiding this comment.
This part is more dubious. It's convenient for this method's semantics to be: use this checksum algorithm instead, if it actually is one. That is exercised in the case that the host didn't set GeneratorDriverOptions.ChecksumAlgorithm.
We can push the == None check out to the caller instead if that seems more obviously correct. Then calling WithChecksumAlgorithm(None) would throw an exception.
There was a problem hiding this comment.
Note also that this is an internal API. So we can adjust its behavior in a follow-up.
There was a problem hiding this comment.
Is there any other With style API where we ignore the passed in value?
There was a problem hiding this comment.
Probably not, and I think that renaming the API to something like WithChecksumAlgorithmIfAny, in a follow up, may potentially be an acceptable resolution.
@jaredpar If that property had an invalid value, then we would throw in the SourceText constructor later on. Doing an earlier validation on the property value would be reasonable, but, it feels like it could be done in a follow up, if that's OK, so that we can get the important part of this change into 10.0.200. |
Fixes #63039
18.3 backport: #81954
I think the
SourceProductionContext.ChecksumAlgorithm,GeneratorDriverOptions.ChecksumAlgorithm, etc. we might want to make public in the future. Though, it seems like it should only really matter for scenarios where we are going to emit a PDB.