-
Notifications
You must be signed in to change notification settings - Fork 386
Add new 'symbolicate' commands to dotnet-stack #2436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new 'symbolicate' commands to dotnet-stack #2436
Conversation
|
Sample test: |
|
Rather than output the symbolicated lines one at a time out of context like that, could we output the entire input text with the symbols inserted in context? e.g., like this: sh-3.2# cat stack.trace
System.NullReferenceException: Object reference not set to an instance of an object.
at LineNumberExtract.Program.ExceptionMethod2(String[] args) in LineNumberExtract.Tizen.dll: token 0x6000001+0x5
at LineNumberExtract.Program.ExceptionMethod1(IntPtr data) in LineNumberExtract.Tizen.dll: token 0x6000002+0x1
at LineNumberExtract.Program.OnCreate() in LineNumberExtract.Tizen.dll: token 0x6000004+0x15
at LineNumberExtract.App.OnStart(String[] args) in LineNumberExtract.dll: token 0x6000002+0x2
sh-3.2# dotnet stack symbolicate stack.trace
System.NullReferenceException: Object reference not set to an instance of an object.
at LineNumberExtract.Program.ExceptionMethod2(String[] args) in U:\LineNumberExtract\LineNumberExtract.Tizen\LineNumberExtract.Tizen.cs:line 14
at LineNumberExtract.Program.ExceptionMethod1(IntPtr data) in U:\LineNumberExtract\LineNumberExtract.Tizen\LineNumberExtract.Tizen.cs:line 19
at LineNumberExtract.Program.OnCreate() in U:\LineNumberExtract\LineNumberExtract.Tizen\LineNumberExtract.Tizen.cs:line 58
at LineNumberExtract.App.OnStart(String[] args) in U:\LineNumberExtract\LineNumberExtract\LineNumberExtract.cs:line 33Long inputs will result in long outputs, but that's what |
| string input = string.Empty; | ||
| if (inputPath.Length > 1) | ||
| { | ||
| for (int i = 0; i < inputPath.Length; i++) | ||
| { | ||
| Console.WriteLine($" {i + 1}. {inputPath[i]}"); | ||
| } | ||
| Console.Write($"Select one of several stacktrace files [1-{inputPath.Length}]: "); | ||
| input = inputPath[Int32.Parse(Console.ReadLine()) - 1].FullName; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user specifies multiple input files, should the tool symbolicate all of them rather than forcing user interaction to pick one?
It occurred to me that we'd need to do an in-place update of the existing file or drop and updated version of each file next to it. It also occurred to me that we don't have a specific file extension we can look for. Specifying multiple files with a wildcard would be problematic due to that.
@noahfalk / @shirhatti what are your thoughts on handling multiple files? Should we have users script around the tool to hit multiple files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to receive only one specified extension(xxxxx.stack or xxxxx.stacktrace).
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with specifying an extension is that there isn't an accepted, standard extension for this type of output. This means we either need to have our users manually change the name of each file they want touched or specifically list all the files they want touched. In either case we are requiring the user to manually list out their files.
I'm inclined to not allow wildcards and only accept a list of specific files since it is simpler for our tool to handle. If a user wants to wildcard, they can rely on the shell's ability to do so, e.g., $ find ./*_exception_log | xargs dotnet-stack symbolicate.
We will have different default behaviors depending on the number of input files as a result:
- 1 input file --> print modified file to stdout
- >1 input file --> print modified file to new file next to old one (e.g., input.txt -> intput.sym.txt)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user specifies multiple input files, should the tool symbolicate all of them rather than forcing user interaction to pick one?
Should we allow multiple stacktrace files to be input? Isn't it better to input only one file at a time?
Or how about guiding the format of the input file contents?
------- Input file format -------
at {typeName}.{methodName}({parameters}) in {moduleName}: token {methodToken}+{ilOffset}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@noahfalk / @shirhatti what are your thoughts on handling multiple files? Should we have users script around the tool to hit multiple files?
I would keep it simple (at least initially) and support a single file input and write the output to some path adjacent to the output. Then as optional improvements we could add arguments that change the output path or write the output to stdout. File output as the default appealed to me because:
a) It requires less shell scripting knowledge
b) Probably if the input started in a file the user wants the output in a file
c) If we later add support for multiple files the experience doesn't need to abruptly change behavior
Or how about guiding the format of the input file contents?
I assume we'll have to have some definition of the format the tool can parse. I'm not sure if I understood the question though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur with @noahfalk. The tool should take one file in, and write to a new file next to the old one, e.g.,
$ ls
input.txt
$ dotnet stack symbolicate input.txt
$ ls
input.txt input.symbolicated.txtor whatever addition to the file extension we want to use to differentiate.
If you're asking whether we should require the input file to have a specific format, I'm not sure that's something we can require of users. There isn't a standardized file format that we can expect. I think it is sufficient to look for the at ... in ... token ... pattern and fixup when we find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a standardized file format that we can expect.
I think the format of the input file is already set in the function:
https://github.com/dotnet/runtime/blob/c93bb62e33934c3b8b6b1d293612d44360483bd8/src/libraries/System.Private.CoreLib/src/System/Diagnostics/StackTrace.cs#L206
So, the user using this tool thought that it was easy to input without changing the output stack.
Because it is already output as below.
" at {typeName}.{methodName}({parameters}) in {moduleName}: token {methodToken}+{ilOffset}"
I think it is sufficient to look for the
at ... in ... token ...pattern and fixup when we find it.
Also, I agree with you. But wouldn't most users use it without change? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might be saying the same thing.
I think we should take in any file and treat it as a text file because there is no agreed upon file extension for stack traces like this. We should just scan the file for the pattern and attempt to fix it up in place, then save the fixed-up file next to the original.
| foreach (var pdbFile in pdbFiles) | ||
| { | ||
| foreach (var peFile in peFiles) | ||
| { | ||
| if (Path.GetFileNameWithoutExtension(peFile) == Path.GetFileNameWithoutExtension(pdbFile)) | ||
| { | ||
| string xmlPath = Path.Combine(tempDirectory, Path.GetFileName(Path.ChangeExtension(peFile, "xml"))); | ||
| GenXmlFromPdb(peFile, pdbFile, xmlPath); | ||
| xmlList.Add(xmlPath); | ||
| break; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect we would only have one PDB per assembly. If you sorted both arrays before iterating and then used a for(;;;) loop to not lose your spot in the array, you would do less work. As it is now, this could potentially iterate all the assembly files for every PDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect we would only have one PDB per assembly.
Right. But #2145 (comment)
In generally, CE products(mobile, wearable, TV, refrigerator, etc) do not have enough memory.
So, It is often deployed an application without a PDB file to reduce the deployment size or obfuscate.
I think PDB file per assembly may not exist. So, I generated XML file only when there is assembly and pdb of the same name. Do you have a better idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was only suggesting a way to match up the assembly list and the pdb list with less looping.
for example (in pseudo code):
function match_assembly_pdb:
pairList <- []
pdbList.sort()
asmList.sort()
j <- 0
for i in pdbList.length: # assuming pdbList.length < asmList.length
while j < asmList.length:
if pdbList[i] == asmList[j]:
pairList.add((pdbList[i], asmList[j]))
break
j++
Then the code only ever touches each element of each array once after the sort.
All that being said, I chatted with my team a bit and I think there is a simpler way of getting the symbols without converting to XML. Rather than use DiaSymReader, you can use the System.Reflection.Metadata APIs directly since those already know how to manipulate portable PDBs.
Here is an example from SOS on how you can use these APIs:
diagnostics/src/SOS/SOS.Hosting/SymbolServiceWrapper.cs
Lines 502 to 547 in 419db65
| private bool GetSourceLineByILOffset( | |
| OpenedReader openedReader, | |
| int methodToken, | |
| long ilOffset, | |
| out int lineNumber, | |
| out string fileName) | |
| { | |
| lineNumber = 0; | |
| fileName = null; | |
| MetadataReader reader = openedReader.Reader; | |
| try | |
| { | |
| Handle handle = MetadataTokens.Handle(methodToken); | |
| if (handle.Kind != HandleKind.MethodDefinition) | |
| return false; | |
| MethodDebugInformationHandle methodDebugHandle = ((MethodDefinitionHandle)handle).ToDebugInformationHandle(); | |
| if (methodDebugHandle.IsNil) | |
| return false; | |
| MethodDebugInformation methodDebugInfo = reader.GetMethodDebugInformation(methodDebugHandle); | |
| SequencePointCollection sequencePoints = methodDebugInfo.GetSequencePoints(); | |
| SequencePoint? nearestPoint = null; | |
| foreach (SequencePoint point in sequencePoints) | |
| { | |
| if (point.Offset > ilOffset) | |
| break; | |
| if (point.StartLine != 0 && !point.IsHidden) | |
| nearestPoint = point; | |
| } | |
| if (nearestPoint.HasValue) | |
| { | |
| lineNumber = nearestPoint.Value.StartLine; | |
| fileName = reader.GetString(reader.GetDocument(nearestPoint.Value.Document).Name); | |
| return true; | |
| } | |
| } | |
| catch (Exception ex) | |
| { | |
| Trace.TraceError($"GetSourceLineByILOffset: {ex.Message}"); | |
| } | |
| return false; | |
| } |
Here is an example from the runtime on how to use those APIs (this one is a little simpler and handles caching parsed PDBs):
https://github.com/dotnet/runtime/blob/b1452c8fcab789dd9f2921c1479425b291111c5d/src/libraries/System.Diagnostics.StackTrace/src/System/Diagnostics/StackTraceSymbols.cs#L13-L94
Since non-portable PDBs are opt-in only and only work on Windows, I'm inclined to not add the extra headache of using DiaSym which ultimately wraps the System.Reflection.Metadata for portable PDBs anyway. This will help with caching parsed symbols, reduce dependencies, and decrease complexity by not requiring the conversion to XML phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to emphasize the use of System.Reflection.Metadata here since that should simplify the logic and dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to emphasize the use of
System.Reflection.Metadatahere since that should simplify the logic and dependencies.
Ok. I'll check the System.Reflection.Metadata
| XmlDocument xmlDoc = new XmlDocument(); | ||
| xmlDoc.Load(xmlPath); | ||
| XmlElement xRoot = xmlDoc.DocumentElement; | ||
| XmlNodeList xnList = xRoot.ChildNodes; | ||
| int xnCount = xnList.Count; | ||
| if (xnCount > 0) | ||
| { | ||
| for (int i = xnCount - 1; i >= 0; i--) | ||
| { | ||
| XmlNode node = xnList[i]; | ||
| if (node.Name == "files") | ||
| { | ||
| ParseFile(node.ChildNodes, stInfo); | ||
| } | ||
| else if (node.Name == "methods") | ||
| { | ||
| ParseMethod(node.ChildNodes, stInfo); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we cache this load and the symbol lookup? This looks like it's going to load the file, parse the XML, and walk every method in the parsed XML for every frame of a stack trace. If a stack has many frames from a single assembly with a large PDB, this could be expensive.
| <PackageReference Include="Microsoft.DiaSymReader" Version="1.4.0-beta2-20217-02" /> | ||
| <PackageReference Include="Microsoft.DiaSymReader.Converter.Xml" Version="1.1.0-beta2-20216-01" /> | ||
| <PackageReference Include="Microsoft.DiaSymReader.PortablePdb" Version="1.6.0-beta2-21126-02"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hoyosjs are these packages managed under DARC or should we leave the manual versioning here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd centralize them. Also, we already use Dia in SOS / dotnet dump.
|
I'll check this #2436 (comment) in the next commit. |
| public static Option<string> OutputOption() => | ||
| new Option<string>(new[] { "-o", "--output" }, "Output directly to a file") | ||
| { | ||
| Argument = new Argument<string>(name: "output-path") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, in the latest versions of System.CommandLine, this property is not accessible.
1a593f3 to
43f39b2
Compare
| fswo?.Close(); | ||
| fswi?.Close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put these in a finally block, or refactor so you can use them in a using block.
| StreamWriter fswi = null; | ||
| if (!inputPath.EndsWith(".symbolicated")) | ||
| { | ||
| fswi = new StreamWriter(new FileStream(inputPath + ".symbolicated", FileMode.Create, FileAccess.Write)); | ||
| } | ||
|
|
||
| string output = string.Empty; | ||
| StreamWriter fswo = null; | ||
| if (outputPath != null) | ||
| { | ||
| fswo = new StreamWriter(new FileStream(outputPath, FileMode.Create, FileAccess.Write)); | ||
| output = $"\nOutput: {outputPath}\n"; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it always writes the fixed up text to a .symbolicated file next to the input file. and then optionally also writes to an output file.
The tool should do one or the other, but not both since that would mean a user generates two files when they specify an output path.
| if (!line.Contains("at ") || !line.Contains(" in ") || !line.Contains("token")) | ||
| { | ||
| fswo?.WriteLine(line); | ||
| console.Out.WriteLine($"{line}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe in a previous comment chain, we had decided to only print to a file. We can have it print to the console, but we should put it behind a --stdout flag or something similar.
| } | ||
| } | ||
|
|
||
| private static string GetRegex(IConsole console, string line, List<string> xmlList, StreamWriter fsw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function seems to have a few side-effects unrelated to getting the regex. I'd rename the function to better reflect that. Something like TrySymbolicateLine or something similar, since it looks like it can fail.
In a similar vein, it looks like this function has the logic for writing to one file, while GetLineFromStack has the logic for writing to other outputs. Having the print logic spread out makes it difficult to reason about what's being written where and when. In this case, the function name doesn't imply that it does any writing.
Preferably, GetLineFromStack might look something like this:
outputWriter = outputPath.IsNullOrEmpty() ?
/* open file next to input */ :
/* open at output path */;
foreach (line in inputFile)
outputWriter.Write(TrySymbolicateLine(line));so all the writing happens in one place.
| public string EndLine; | ||
| } | ||
|
|
||
| private static void GetLineFromStack(IConsole console, List<string> xmlList, string inputPath, string outputPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the name of this function reflects what it is doing. Perhaps something like SymbolicateFile?
Get implies this returns something, but it returns void.
| private static readonly string tempDirectory = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()); | ||
| private static readonly Regex regex = new Regex(@" at (?<type>[\w+\.?]+)\.(?<method>\w+)\((?<params>.*)\) in (?<filename>[\w+\.?]+)(\.dll|\.ni\.dll): token (?<token>0x\d+)\+(?<offset>0x\d+)", RegexOptions.Compiled); | ||
| private static readonly Regex verifyRegex = new Regex(@"at (?<typeMethod>.*)\((?<params>.*?)\) in (?<filename>.*)token(?<token>.*)\+(?<offset>.*)", RegexOptions.Compiled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: static readonly variables typically either get s_<name> or PascalCased names, e.g., s_tempDirectoryPath or TempDirectoryPath.
| if (searchDir.Length != 0) | ||
| { | ||
| foreach (var path in searchDir) | ||
| { | ||
| search_paths.Add(path.FullName); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably skip this stringification if you use DirectoryInfo[] and FileInfo[] everywhere.
| } | ||
| string ret = GetRegex(console, line, xmlList, fswi); | ||
| fswo?.WriteLine(ret); | ||
| console.Out.WriteLine($"{ret}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting for a single string is unnecessary work. This can just be Console.Out.WriteLine(ret).
| str.Append("+"); | ||
| str.Append(match.Groups["offset"].Value.TrimStart().Split(" ")[0]); | ||
|
|
||
| if (regex.Match(str.ToString()).Success) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check should come before doing all the StringBuilder work above since you throw it out if it is false.
| { | ||
| string line = fsri.ReadLine(); | ||
| // The stacktrace must have "at ... in ... token ..." | ||
| if (!line.Contains("at ") || !line.Contains(" in ") || !line.Contains("token")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does a full read of the line 3 times in a row. You could simply check against the compiled regex once that would either match an return all the necessary info or not match, all in one step.
|
Exception log: Case 1: Case 2: Case 3: Case 4: |
josalem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things are definitely shaping up in the right direction! Another round of comments. Thank you for the contributions!
| List<string> searchPaths = new List<string>(); | ||
| foreach (var path in searchDir) | ||
| { | ||
| searchPaths.Add(path.FullName); | ||
| } | ||
| searchPaths.Add(Directory.GetCurrentDirectory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier to pass the {File|Directory}Info[] around, so we don't have to stringify everything here.
If the order of entries in these arrays is used to locate files, the current directory should probably be at the front of the array.
| private static void Symbolicator(IConsole console, List<string> searchPaths, string inputPath, string outputPath, bool isStdout) | ||
| { | ||
| CreateSymbolicateFile(console, GetSearchPathList(searchPaths), inputPath, outputPath, isStdout); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can remove this function and just call CreateSymbolicateFile directly in Symbolicate.
| foreach (var peFile in Directory.GetFiles(assemDir, searchPattern, SearchOption.AllDirectories)) | ||
| { | ||
| files.Add(peFile); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| foreach (var peFile in Directory.GetFiles(assemDir, searchPattern, SearchOption.AllDirectories)) | |
| { | |
| files.Add(peFile); | |
| } | |
| files.AddRange(Directory.GetFiles(assemDir, searchPattern, SearchOption.AllDirectories)); | |
| private static MetadataReader GetMetadataReader(string filePath) | ||
| { | ||
| try | ||
| { | ||
| Func<string, Stream> streamProvider = sp => new FileStream(sp, FileMode.Open, FileAccess.Read); | ||
| using Stream stream = new FileStream(filePath, FileMode.Open, FileAccess.Read); | ||
| if (stream != null) | ||
| { | ||
| MetadataReaderProvider provider = null; | ||
| if (filePath.Contains(".dll")) | ||
| { | ||
| using PEReader peReader = new PEReader(stream); | ||
| if (!peReader.TryOpenAssociatedPortablePdb(filePath, streamProvider, out provider, out string pdbPath)) | ||
| { | ||
| return null; | ||
| } | ||
| } | ||
| /*else if (filePath.Contains(".pdb")) | ||
| { | ||
| provider = MetadataReaderProvider.FromPortablePdbStream(stream); | ||
| }*/ | ||
| return provider.GetMetadataReader(); | ||
| } | ||
| return null; | ||
| } | ||
| catch | ||
| { | ||
| return null; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should attempt to cache the MetadataReaders so we don't repeat work if there are multiple frames from the same assembly.
I recommend stashing them in a Dictionary<string,MetadataReader> and being sure to clean them up at the end of the process.
| foreach (var path in searchPathList) | ||
| { | ||
| if (path.Contains(stInfo.Assembly)) | ||
| { | ||
| return GetLineFromMetadata(GetMetadataReader(path), ret, stInfo); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you add a caching mechanism for the MetadataReaders, you can eliminate this loop by making the key of the cache the assembly name. Then this becomes a constant time lookup rather than a linear lookup. You would need some function like bool TryGetMetadataReader(string assemblyName, out MetadataReader reader) that uses what you have written in GetMetadataReader that also attempts to cache things.
| private static readonly Regex s_regex = new Regex(@" at (?<type>[\w+\.?]+)\.(?<method>\w+)\((?<params>.*)\) in (?<filename>[\w+\.?]+)(\.dll|\.ni\.dll): token (?<token>0x\d+)\+(?<offset>0x\d+)", RegexOptions.Compiled); | ||
| private static readonly Regex s_verifyRegex = new Regex(@"at (?<typeMethod>.*)\((?<params>.*?)\) in (?<filename>.*)token(?<token>.*)\+(?<offset>.*)", RegexOptions.Compiled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does s_verifyRegex accomplish that s_regex doesn't already?
| private static string GetVerifiedStackTrace(string line) | ||
| { | ||
| Match match = s_verifyRegex.Match(line); | ||
| StringBuilder str = new StringBuilder(); | ||
| str.Append(" at "); | ||
| str.Append(match.Groups["typeMethod"].Value.TrimEnd()); | ||
| str.Append("("); | ||
| str.Append(match.Groups["params"].Value); | ||
| str.Append(") in "); | ||
| str.Append(match.Groups["filename"].Value.Replace(":", "").Trim()); | ||
| str.Append(": token "); | ||
| str.Append(match.Groups["token"].Value.Trim()); | ||
| str.Append("+"); | ||
| str.Append(match.Groups["offset"].Value.TrimStart().Split(" ")[0]); | ||
|
|
||
| if (s_regex.Match(str.ToString()).Success) | ||
| { | ||
| return str.ToString(); | ||
| } | ||
| return line; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this attempting to force all lines that roughly match the pattern into a specific format?
I'm not sure that is necessary as a separate step. s_regex looks like it should handle this already.
| public string Offset; | ||
| } | ||
|
|
||
| private static string GetRegex(string line, List<string> searchPathList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| private static string GetRegex(string line, List<string> searchPathList) | |
| private static string TrySymbolicateLine(string line) |
If you add a cache for the MetadataReaders you don't need to pass the list of paths around and can access it from some global state.
josalem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round of feedback 😃. I think this is almost ready for check-in.
| { | ||
| try | ||
| { | ||
| SetMetadataReader(searchDir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pre-populating the MetadataReader cache. This can be pretty expensive if we have an application that has a large number of PDBs. We also may not be using all of the PDB/MetadataReaders. If possible, these should be lazily constructed when they're first needed.
| if (!s_regex.Match(line).Success) | ||
| { | ||
| fileStreamWriter?.WriteLine(ret); | ||
| if (isStdout) console.Out.WriteLine(ret); | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is duplicating work that happens in TrySymbolicateLine. ret from TrySymbolicateLine will already be the original input if it doesn't match the regex. This call to RegularExpression.Match can be removed.
| private static MetadataReader TryGetMetadataReader(string assemblyName) | ||
| { | ||
| if (s_metadataReaderDictionary.ContainsKey(assemblyName)) | ||
| { | ||
| return s_metadataReaderDictionary[assemblyName]; | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have logic like this:
if (dict.TryGetValue(assemblyName, out MetadataReader reader))
return reader;
else
dict[assemblyName] = /* instantiate new MetadataReader */;This will only instantiate a MetadataReader when we ask for it.
| { | ||
| Arity = ArgumentArity.ZeroOrOne | ||
| } | ||
| }.ExistingOnly() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with this method. Is this saying that the output file needs to exist in order for the option to be valid? That seems opposite of the intention of the option, i.e., create the file and fill it with the contents we generate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will generate an error if the file doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. It's my mistake. I'll delete it.
| { | ||
| string sourceFile = reader.GetString(reader.GetDocument(bestPointSoFar.Value.Document).Name); | ||
| int sourceLine = bestPointSoFar.Value.StartLine; | ||
| return $" at {stInfo.Type}.{stInfo.Method}({stInfo.Param}) in {sourceFile}:line {sourceLine}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically, the input line could have any number or type of whitespace (or other characters) at the beginning of the line. For example, the input text may be from a log that has timestamps on every line. We could potentially be throwing out data by rewriting the line like this.
If possible, we should try to rewrite the line in situ. You could reuse the regex you already have with the Regex.Replace(String, String, MatchEvaluator, RegexOptions) API. You would write a MatchEvaluator method that receives each match in the regex and replaces it with the appropriate part of the StackTraceInfo. You may also need to add a capture group for the : token part so you can replace it with : line.
josalem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're to a point where I'm ready to merge this feature 🎊 . I'm assuming you have been doing some testing along the way? I'd like some sample input/output from local testing a few scenarios before we merge.
- input file that is just the stack without symbols
- input file where the stack is embedded in a log-ish format, e.g., timestamps or other text around the stack lines
- output file where only some of the PDBs were available
It doesn't need to be part of this PR, but I would like some automated tests of this verb at some point. When this merges, I'll create an issue to track adding those.
|
|
Merging 🎊 Thanks @JongHeonChoi for this contribution! This will be an awesome feature 😄 |
Add new 'symbolicate' commands to dotnet-stack.
Related PR : #2145, dotnet/runtime#44013