Skip to content

Use command line checksumAlgorithm in generator driver#81934

Merged
RikkiGibson merged 8 commits intodotnet:mainfrom
RikkiGibson:Issue63039
Jan 12, 2026
Merged

Use command line checksumAlgorithm in generator driver#81934
RikkiGibson merged 8 commits intodotnet:mainfrom
RikkiGibson:Issue63039

Conversation

@RikkiGibson
Copy link
Member

@RikkiGibson RikkiGibson commented Jan 8, 2026

Fixes #63039
18.3 backport: #81954

I think the SourceProductionContext.ChecksumAlgorithm, GeneratorDriverOptions.ChecksumAlgorithm, etc. we might want to make public in the future. Though, it seems like it should only really matter for scenarios where we are going to emit a PDB.

@RikkiGibson RikkiGibson requested a review from a team as a code owner January 8, 2026 23:37
hintName,
ChecksumAlgorithm == SourceHashAlgorithm.None || ChecksumAlgorithm == sourceText.ChecksumAlgorithm
? sourceText
: SourceText.From(sourceText.ToString(), encoding: sourceText.Encoding, checksumAlgorithm: ChecksumAlgorithm));
Copy link
Member Author

@RikkiGibson RikkiGibson Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure what to do in this case. The generator has a method it can call, passing a SourceText it already created. We want to override that and essentially ignore the ChecksumAlgorithm they specified, when we have our own that we want to use.

@tmat do you think this is a case we need to handle? What if we made it so our checksumAlgorithm is considered, when the AddSource(string, string) overload is used?

I don't think this method of making a derived text is fine. Maybe I can make a new SourceText subtype, which delegates to the underlying text, except it uses a different ChecksumAlgorithm. I'll see if that works out. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not too difficult to write a class SourceTextWithAlgorithm, wrapping the original SourceText, but, it's not obvious to me which virtual methods to override and delegate to the underlying text. We wouldn't want to inadvertently create a derived text which drops the new checksum algorithm, but, we also wouldn't want to create some perf issue when the text is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point with a user providing a checksum algorithm. I think we just support SourceText to have a simple text abstraction that isn't a contiguous sequence (like System.String). We should just wrap/copy, passing along the original data. But overriding with our checksum algorithm imo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now made it so that when the ChecksumAlgorithm of the text they gave us, differs from what we want to use, then we just wrap theirs.

/// <param name="hintName">An identifier that can be used to reference this source text, must be unique within this generator</param>
/// <param name="source">The source code to add to the compilation</param>
public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8));
public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8, checksumAlgorithm: _checksumAlgorithm == SourceHashAlgorithm.None ? SourceHashAlgorithm.Sha1 : _checksumAlgorithm));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in the None case, this is just using the default value for this parameter, which we were using implicitly before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for using Sha1 here other than it's what we did before? If so I would suggest that we add SourceHashAlgorithm.Latest and use it here. Generally the rule for deprecated crypto is that we can't default to it unless it was required due to compatibility, file format specification requirements, etc ...

Consider that using SHA1 here as a default means you will run into problems on say Redhat. Or anywhere where openssl has a locked down implementation that throws on SHA1 usage.

Copy link
Member Author

@RikkiGibson RikkiGibson Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable. Were you thinking of SourceHashAlgorithms.Default? Or is there a reason to introduce a distinct ‘Latest’ value in addition to that?

It feels like perhaps the SourceText.From method which had SHA1 as a default value, should also change to SourceHashAlgorithms.Default, but, that change may have more of a ripple effect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use SourceHashAlgorithms.Default when the host doesn't provide an algorithm. Note that the command line compiler will always provide an algorithm, because it has a default value for CommandLineArguments.ChecksumAlgorithm, which is the same as SourceHashAlgorithms.Default.

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (commit 2) with a suggestion to consider

@jcouv jcouv self-assigned this Jan 9, 2026
@RikkiGibson RikkiGibson requested a review from tmat January 9, 2026 00:27
Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (commit 3)

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Jan 9, 2026

It looks like we have no tests on VB side. #Closed

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Jan 9, 2026

Done with review pass (commit 4) #Closed

/// <param name="hintName">An identifier that can be used to reference this source text, must be unique within this generator</param>
/// <param name="source">The source code to add to the compilation</param>
public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8));
public void AddSource(string hintName, string source) => AddSource(hintName, SourceText.From(source, Encoding.UTF8, checksumAlgorithm: _checksumAlgorithm == SourceHashAlgorithm.None ? SourceHashAlgorithm.Sha1 : _checksumAlgorithm));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for using Sha1 here other than it's what we did before? If so I would suggest that we add SourceHashAlgorithm.Latest and use it here. Generally the rule for deprecated crypto is that we can't default to it unless it was required due to compatibility, file format specification requirements, etc ...

Consider that using SHA1 here as a default means you will run into problems on say Redhat. Or anywhere where openssl has a locked down implementation that throws on SHA1 usage.

/// <summary>
/// When specified, overrides the <see cref="SourceText.ChecksumAlgorithm"/> given in <see cref="SourceProductionContext.AddSource(string, SourceText)"/>.
/// </summary>
internal SourceHashAlgorithm ChecksumAlgorithm { get; init; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the use case for this testing or is there a product reason to override this value from the compiler option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a product reason for this. The reason is not to override the value from the compiler option. Rather, it is to override the ChecksumAlgorithm from the SourceText that the generator created, and use the ChecksumAlgorithm given from the compiler option instead.

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Jan 9, 2026

Would it make sense to simply throw on an input that doesn't use proper algorithm? #Closed

@jaredpar
Copy link
Member

jaredpar commented Jan 9, 2026

Would it make sense to simply throw on an input that doesn't use proper algorithm?

I was going to suggest that but it seems that value is coming from an init property hence no good place to put validation. If this is a test only property then I'd agree we sohuld throw / assert on bad algorithm vs. doing a fallback atthe point it's used.

Or maybe we just add an internal ctor and do validation it the constructor.

@RikkiGibson
Copy link
Member Author

Would it make sense to simply throw on an input that doesn't use proper algorithm?

I am not sure about this. The problem is that if we ship the change, and the user starts getting that exception, they would have to either change the /checksumAlgorithm they pass to compiler, or go to the authors of generators which are not passing along the right value, and ask them to make a change to start doing so. Both of those would be problematic for users, possibly blocking adoption of the new compiler.

Generator authors will also need a clearly documented way to get the checksum to use with AddSource(string, SourceText). Today they can probably discover the right algorithm by digging through the SourceTexts of the non-generated files, but, perhaps a simpler public API which just exposes the algorithm which should be used when they create SourceTexts, would be justified.

@RikkiGibson
Copy link
Member Author

It looks like we have no tests on VB side.

Still working on addressing this, I am going to just port the tests over to CommandLineTests.vb

@RikkiGibson RikkiGibson requested review from a team, jcouv and tmat January 10, 2026 01:58
@RikkiGibson
Copy link
Member Author

/backport to release/dev18.3

@github-actions
Copy link
Contributor

Started backporting to release/dev18.3 (link to workflow run)

@jaredpar
Copy link
Member

The problem is that if we ship the change, and the user starts getting that exception, they would have to either change the /checksumAlgorithm they pass to compiler, or go to the authors of generators which are not passing along the right value, and ask them to make a change to start doing so

Not quite following this. The compiler already errors when you pass /checksumAlgorithm:none. Are you talking about it coming from a diff place than the command line?

@RikkiGibson
Copy link
Member Author

RikkiGibson commented Jan 10, 2026

Consider a command line build scenario like the following.

  • We pass /checksumAlgorithm:SHA256 to the compiler.
  • We pass /analyzer:MySourceGenerator.dll to the compiler, which contains some third-party source generator. The generator calls SourceProductionContext.AddSource("myFile", SourceText.From("class C {}")), and the default value SourceHashAlgorithms.SHA1 is passed to SourceText.From.

With the solution in this PR, the compiler will see that the text produced by the generator doesn't have the desired checksum algorithm. So it will call sourceText.WithChecksumAlgorithm(SourceHashAlgorithms.SHA256) (using the value from command line arguments) and produce a derived text which uses the desired algorithm, and include that in the compilation instead.

If we just threw when the source text coming from the generator used the wrong algorithm, then, generators which worked yesterday could start throwing today in people's builds due to checksum algorithm mismatch.

Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (commit 7), assuming that SourceTextWithAlgorithm.cs will be removed before merging

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (commit 7) modulo unnecessary file removal

@RikkiGibson RikkiGibson enabled auto-merge (squash) January 12, 2026 19:22
@RikkiGibson RikkiGibson disabled auto-merge January 12, 2026 19:50
@RikkiGibson
Copy link
Member Author

/backport to release/dev18.3

@github-actions
Copy link
Contributor

Started backporting to release/dev18.3 (link to workflow run)

@jaredpar
Copy link
Member

If we just threw when the source text coming from the generator used the wrong algorithm, then, generators which worked yesterday could start throwing today in people's builds due to checksum algorithm mismatch.

Note: in terms of validation, I'm specifically referring to GeneratorContext.ChecksumAlgorithm. Essentially what do we do when that has an invalid value.


internal SourceText WithChecksumAlgorithm(SourceHashAlgorithm checksumAlgorithm)
{
if (checksumAlgorithm == SourceHashAlgorithm.None || checksumAlgorithm == ChecksumAlgorithm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand this? If this has SHA256 and the caller says WithChecksumAlgorithm(SourceHashAlgorithm.None) the return is SHA256? The caller specified a value and it wasn't used. That doesn't seem correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is more dubious. It's convenient for this method's semantics to be: use this checksum algorithm instead, if it actually is one. That is exercised in the case that the host didn't set GeneratorDriverOptions.ChecksumAlgorithm.

We can push the == None check out to the caller instead if that seems more obviously correct. Then calling WithChecksumAlgorithm(None) would throw an exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that this is an internal API. So we can adjust its behavior in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any other With style API where we ignore the passed in value?

Copy link
Member Author

@RikkiGibson RikkiGibson Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, and I think that renaming the API to something like WithChecksumAlgorithmIfAny, in a follow up, may potentially be an acceptable resolution.

@RikkiGibson
Copy link
Member Author

Note: in terms of validation, I'm specifically referring to GeneratorContext.ChecksumAlgorithm. Essentially what do we do when that has an invalid value.

@jaredpar If that property had an invalid value, then we would throw in the SourceText constructor later on.

Doing an earlier validation on the property value would be reasonable, but, it feels like it could be done in a follow up, if that's OK, so that we can get the important part of this change into 10.0.200.

@RikkiGibson RikkiGibson merged commit 11129e9 into dotnet:main Jan 12, 2026
25 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Jan 12, 2026
@RikkiGibson RikkiGibson deleted the Issue63039 branch January 13, 2026 01:27
@davidwengier davidwengier modified the milestones: Next, 18.4 Jan 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Checksum algorithm set by GeneratorContexts.AddSource should be overridden by /checksumalgorithm switch

7 participants