Skip to content

Conversation

@dantraMSFT
Copy link
Contributor

@dantraMSFT dantraMSFT commented Oct 17, 2017

NOTE: Tests are pending...

This PR is divided into the following areas:
1: Add/Rename the PowerShell/Windows ETW manifest to the repo and change both the provider id (guid) and name. See #4939.
The manifest is at tools/resxgen/PowerShell-Core-Instrumentation.man

2: Generate a resx file containing the string resources needed for logging ETW events to syslog.
This is accomplished by tools\resxgen\resxgen.psm1 and resxgen.ps1. The tool generates two files

  • A resx file containing string resources for each message string from the manifest. This is generated into the System.Management.Automation\gen directory.
  • A C# class (EventResource) that provides the mapping between the integer event id and the associated string resource name. The file is generated into the System.Management.Automation\CoreCLR directory with a compile-time condition of UNIX
    NOTE: The EventResource.cs class generated by resgen is explicitly ignored in the csproj file; it is not used.

3: Native shims for the syslog apis to enable pInvoke.
The shim implementations are libpsl-native\src\nativesyslog.cpp. Native_SysLog, Native_OpenLog, and Native_CloseLog are the target pInvoke.
NOTE: See #5149.

4: SMA\utils\tracing\PSSysLogProvider.cs
Implements the abstract LogProvider class and is the syslog equivalent of PSEtwLogProvider. The class contains a number of logical methods for logging lifecycle, health, and normal events.

5: SMA\utils\tracing\SysLogProvider.cs
This is the Syslog equivalent of ETW's log provider class and implements a Log method versus ETW's EventWrite. It is also responsible for resolving event ids to resource names and peforming the Syslog call. There is a large comment block in the class XML doc that describes the types of log output it produces.

6: SMA\utils\tracing\PSEtwLog.cs
PowerShell's current implementation is tightly coupled to this class; with code calling it directly for all events. To simplify integration of syslog, I updated the class to create an instance of PSSysLogProvider on Linux and removed the Linux-specific stub file.

7: SMA\engine\PropertyAccessor.cs
This class provides a wrapper around PowerShellProperties.json and has been extended to have Unix-specific assessors for configuring logging. Note that the file is expected to be in the $PSHOME directory to ensure SxS.

Currently, there are four configuration properties:

LogIdentity - the string identifier for the source application. This defaults to 'powershell' and can be configured to enable distinguishing between side-by-side installations.

LogLevel - configures the tracing level (log level). Informational is the defauilt.

LogChannels - used to enable operational and analytic logging. Operational is the default.

LogKeywords - used to configure enabling tracing by keyword. All keywords other than UseAlwaysAnalytic are enabled by default.
NOTE: This will likely change. PowerShell sometimes confuses the analytic channel with this keyword and sends logging to the wrong channel. Once this is cleared up, UseAlwaysAnalytic and UseAlwaysOperational keywords will likely be removed.

Additional Notes:

1: The current implementation writes directly to syslog and writing to a separate log file is still pending.
2: The generated class and resx are not part of the build; instead, it is expected that Resxgen should be run when events are added to the manifest. To fully automate the process, resxgen will need to be updated to generate the other dependent enums such as PSChannel, PSEventId, PSTask, PSOpcode, etc. You will see parsing logic in resxgen.psm1 to prepare for that but it is not enabled at this point.
4: Documentation is pending that documents the format of the syslog output as well as associated configuration.
5: As mentioned at the start, tests are pending

@iSazonov iSazonov added the Blocked blocked on something external to this repo label Oct 18, 2017
@iSazonov
Copy link
Collaborator

Add "Blocked" to wait #5149

Copy link
Contributor

@PaulHigin PaulHigin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not through with my review, but here are my comments so far.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetSysLogIdentity() or GetSysLogId()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this same path creation is done repeatedly in other methods so it can be done just once in the constructor and assigned to a readonly field. Also psHomeConfigDirectory should be readonly since it is set just once in the constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire class uses this pattern and; while i was tempted to change it; I think it's beyond the scope of this PR. I will mark the two fields as read-only, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created issue #5179 to track this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment. You can also do:
identity.Equals(LogDefaultValueName, System.StringComparison.OrdinalIgnoreCase) which returns a Boolean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference - I prefer having the two strings adjacent when reading the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer consistency over personal preference for others.

Personally, I pause on Compare and not on Equals, so it is a distraction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that's two so I'll change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to make this field be the base class, LogProvider, and remove these ifdefs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that would work but I think I'll defer that to #5180

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defered

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also defined in PropertyAccessor.cs. We should only define it once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - now only defined in PropertyAccessor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating a new StringBuilder instance for each method call, I think we could just have a class property _payload instance that is cleared each time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that will introduce race conditions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if the instance is thread local.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is more speculation than anything else. Consider that the string builder has to be cleared and realloc'd each time plus the TLS overhead, it's likely premature to optimize without measurement. Of course, if you have reasonable evidence that object allocation will be a significant part of the overhead; I'll move it to thread local.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Garbage collection is not free - but you pay for it at random times, so it's hard to quantify.

The CLR thinks StringBuilder reuse is worthwhile, see here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a most of these LogProvider methods have a lot of commonality with the Windows implementation (PSEtwLogProvider) and they can be implemented in the base class, with WriteEvent being specific to the derived class. I know each class is complied separately for different platforms but it would make maintaining the code easier having most of the source in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire ETW logging stack needs to be refactored and after discussing it with Steve, I decided to defer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Can you please create an Issue to track the work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created #5180 to track this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move moved the common string building functions down into the base LogProvider class to reduce duplication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: teh should be the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this field also be readonly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@daxian-dbw
Copy link
Member

Something is wrong. A lot merged changes are shown up in this PR. Can you please rebase?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment block for this constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to have a Marshal.FreeHGlobal() to free up the unmanaged allocation. But you can use the [MarshalAs(UnmanagedType.LPStr)] attribute on the NativeMethods.OpenLog() parameter, to get automatic marshaling.

Copy link
Contributor Author

@dantraMSFT dantraMSFT Oct 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this string needs to remain for the lifetime of the process. I added a comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary. You can initialize _resourceManager field where it is defined or in the static constructor. Also the field can be readonly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the pattern used everywhere else. i believe the notion is to defer loading resources until first use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Native_OpenLog takes a const char* it shouldn't need special marshaling. But I assume you do this above because it keeps a reference to the ident parameter string? If so can you update the comment to make this clear and as to why FreeHGlobal is never called (presumably because the allocation needs to exist for the life of the process).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is copied directly from Windows PowerShell manifest. I see that it has "PSWorkflow" keywords. Since we don't support workflows they can be removed at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'll need to scrub all entries that are no longer referenced.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do the scrub before this is merged - no sense in taking something that never should have been there in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do the scrub as part of the Windows manifest work; due next.

@dantraMSFT dantraMSFT removed the Blocked blocked on something external to this repo label Oct 20, 2017
@dantraMSFT
Copy link
Contributor Author

@PaulHigin I believe I've address all of your feedback.
@daxian-dbw I've fixed the commit list.

Copy link
Contributor

@PaulHigin PaulHigin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as this is generated, maybe generate a switch statement instead of a dictionary.

Switch dispatch can be very fast (jump table) or a quick binary search, and the other benefit is we won't allocate a bunch of objects that might never be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoted from "https://stackoverflow.com/questions/44905/c-sharp-switch-statement-limitations-why/48259#48259"

At the end of the day, a C# switch against an integer expression on a modern system is a sub-microsecond operation, and not normally worth worrying about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I'll regenerate the code to use an out parameter for the parameter count and avoid the object allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting differs significantly, please follow the style of the rest of the code base. Specifically, '(', '||', and ')' should not be on their own line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could go in System.Management.Automation.Utils.Separators - it helps reuse to have a common place arrays like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any inherit benefit in sharing this constant. It simply introduces unnecessary/unintended coupling. ON the other hand, if there's an existing pattern that this should follow; I'm happy to look at it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more of a meta benefit - folks create arrays like this all over the place, sometimes in a loop.

People copy patterns, so if you follow the pattern, others will too. If you don't, they might follow your pattern - and in the new use case, it might be a bad choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left it encapsulated for now; I consider it a private implementation detail until we have a use case for sharing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$[EventMessage] looks like a bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling Contains isn't really necessary - Replace won't create a new string if there's nothing to replace, and Replace has to search the string again, so it just does more work.

Copy link
Contributor

@lzybkr lzybkr Oct 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think this whole loop would be easier to read/understand if it was something like this (basically inline the replacements since they aren't used elsewhere.)

      $message = $message -replace '%t', "`t" `
                          -replace '%n', "`n" `
                          -replace '%r', "`r" `
                          -replace '%space', ' ' `
                          -replace '%.', '.' `
                          -replace '%%', "%" 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've moved to UTF8 with no BOM in this repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean in practice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably fine to just use ASCII when generating this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the code or resx file? ASCII would preclude localization of the resx file; wouldn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's just a shortcut assuming that we don't have characters requiring UTF8 encoding, which we don't.

This would only preclude localization if we generated the resx file from a localized manifest. As we're not even localizing error messages, I think we're a long way off from localizing the event log, and when that happens, it might not depend on generating new resx files - so I wouldn't worry about that for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I'll switch to ascii.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

lzybkr
lzybkr previously requested changes Oct 21, 2017
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do the scrub before this is merged - no sense in taking something that never should have been there in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description does not describe the guid in the example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - storing the hash code isn't necessary, it's highly optimized - stored in the string so it's only computed once. It's probably slower to store it yourself (not verified though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you seeing this implementation? When I review GetHashCode in the coreclr code base, I don't see any caching of the hashcode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exposed via managed code - it's a low level implementation detail because string needs to be highly optimized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it also optimize away the function call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading about String.GetHashCode, I'm going to leave the implementation as is. There's a good argument for generating a hash using an algorithm that is repeatable and less likely to collide. As it stands, it's not hard to get two hash codes for the same commit id.

I'll add a comment to that effect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the hash code for git commit id? Is it used as some sort of an identification for a logging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash code is used in-lieu of logging the long commit id with every log message. Instead, I log an initial entry containing the full commit id and the associated hash. Subsequent entries contain on the hash. SysLogProvider has a comment block at the beginning that illustrates its usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you can ignore my comment about caching - it's not clear to me that caching is really happening.

If you feel like digging into details, you could start here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why keep another reference here? It's not changing in PSVersionInfo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I'll remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We avoid unnecessary lines in this code base, this should be 1 or 2 lines instead of 6.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting - ( etc. should not be on a line by itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could reuse a single instance of StringBuilder to avoid extra allocations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if the instance is thread local.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With so many appends, it feels like this StringBuilder should preallocate space to avoid unnecessary reallocations - just guess or take a typical result and just use that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll do an estimate and use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@dantraMSFT
Copy link
Contributor Author

NOTE: This PR is dependent on #5310 as well as republishing the nugget package for libpsl-native.

@daxian-dbw
Copy link
Member

#5310 has been merged. Need to update the libpsl nuget package to include the binary with the new changes.

@daxian-dbw
Copy link
Member

The new libpsl nuget package has been published to powershell-core feed.
I also restarted the Travis CI builds.

dantraMSFT and others added 21 commits November 3, 2017 15:45
…onditional compilation.

Update powershell provider guid references
Update to enable future support  for customizing the syslog ident and facility.
Add flags to PSChannel and PSKeyword for reliable parsing
…nfig value.

Experiment with Local0 facility.
* ResxGen - change code generator to use a switch statement in GetMessage instead of a dictionary of objects
* Use thread local shared StringBuilder in SysLogProvider and PSSysLogProvider.
* Moved common string building methods down into LogProvider from the derived PSEtwLogProvider and PSSyslogProvider classes.
@dantraMSFT
Copy link
Contributor Author

@daxian-dbw I've verified the package on Ubuntu 14.04, Rasberry PI, and Mac 10.12. My 16.04 system is currently hosed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants