Skip to content

Comments

Changes to log file per discussion thread #1969#2010

Closed
Rob-NY wants to merge 13 commits intojamulussoftware:masterfrom
Rob-NY:log_file_changes
Closed

Changes to log file per discussion thread #1969#2010
Rob-NY wants to merge 13 commits intojamulussoftware:masterfrom
Rob-NY:log_file_changes

Conversation

@Rob-NY
Copy link
Contributor

@Rob-NY Rob-NY commented Sep 10, 2021

FOR DISCUSSION. In contemplation of resolution to #1969, this PR does the following:

  1. Refactors serverlogging as a singleton to permit more intuitive access from within the codebase.
  2. Changed CSV format to Tab-separated since channel names are UTF-8.
  3. The current method that wrote the log file entry upon a new connection takes place at a point in time when the channel name is not yet known. As such, an additional log entry line was created.
  4. As a result of the above, log file lines were coded with entry types: IDLE, CONNECT, CHANNEL to reflect the three states that are being logged.

Context: Fixes an issue?

This fixes nothing except to perhaps clean up the logging class a bit so it can be used elsewhere in the code base and to address the issues raised in #1969.

The context of this PR is to demonstrate a response to the asks embedded in #1969.

Does this change need documentation? What needs to be documented and how?

Yes, some documentation changes will be required. I'll sign up to write them if this PR moves forward.

Status of this Pull Request

BOTH POC and Working as submitted.

What is missing until this pull request can be merged?

Merged with upstream. TESTED ONLY On LINUX - Mac/Windows testing is required; I do not have that facility.

Checklist

  • I've verified that this Pull Request follows the general code principles
  • I tested my code and it does what I want LINUX ONLY
  • My code follows the style guide
  • I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
  • I've filled all the content above

@pljones
Copy link
Collaborator

pljones commented Sep 11, 2021

  1. Refactors serverlogging as a singleton to permit more intuitive access from within the codebase.

It seems not to really make the code more readable but more clumsy:

  • Old:
    Logging.AddNewConnection ( RecHostAddr.InetAddr, GetNumberOfConnectedClients() );
  • New:
    CServerLogging::getInstance().AddNewConnection ( RecHostAddr.InetAddr, GetNumberOfConnectedClients() );

So I'm not clear there's much benefit here to the codebase directly. A simple global static called Logging declared in global.h would allow the removal of the declarations from each class.

However... If code refactoring is needed, one area that should be addressed is that writing a file on the real-time thread should be avoided. So the log writer, however it's provided, should be moved off this thread.

  1. Changed CSV format to Tab-separated since channel names are UTF-8.

I'm not a fan of tab-separated file formats - they tend to make files less easy to read (two adjacent tabs aren't immediately discernable). Coupled with that, any change to an existing - established - file format is likely to break things that rely on that file format. Given that there's no way to find out what would break, this would need a strong justification.

Alternatively, a command line method of selecting the new format could be provided (e.g. adding an optional format selector as a prefix to the log file name, with the existing format the default).

Again, however, if the format is going to be changed, wouldn't it be better to go for something more parseable (and readable) than TSV, like JSON? Coupled with moving logging off the real-time thread, of course.

  1. The current method that wrote the log file entry upon a new connection takes place at a point in time when the channel name is not yet known. As such, an additional log entry line was created.

This is intentional. The channel name can be sent multiple times during a connection and there is no need to polute the log with it. The IP address is adequate. I'd object to this being logged.

  1. As a result of the above, log file lines were coded with entry types: IDLE, CONNECT, CHANNEL to reflect the three states that are being logged.

This sounds like you've changed existing lines? I'd definitely oppose doing that. The same argument as moving from CSV to TSV applies: things you cannot know about will break. You need to prevent that.

@dtinth
Copy link
Contributor

dtinth commented Sep 11, 2021

For simple logging, then I think TSV is a fine format and can be easily ingested without any help from libraries. For example, parsing a TSV in Ruby:

File.readlines('log.tsv').map { |line| line.split("\t") }

On the other hand,

  • passing JSON requires require 'json'.
  • passing CSV requires handling quoted strings, etc.

It is worth noting that Unix cut utility uses the Tab character as a delimiter by default, and TSV can be easily opened in Excel, and parsed by fluentd.

One potential issue I see with using tabular format is that we can’t easily change the header columns.

This is mitigated by using structured logging, i.e. JSON logging as @pljones suggested. With structured logging, each log line is a JSON object. The log in JSON format can then be sent to log analytics service like Google Cloud Logging, Azure Log Analytics, Datadog, Loggly, or Elastic to create dashboards and monitoring alerts without any extra parsing step.

For example, here I have a dashboard that monitors the number of connected clients along with server CPU usage:

image

One downside of JSON logging is that it is hard to read by eyes without using special tools. For Jamulus I think it is not a complex application enough to benefit from structured logging, so I would be inclined towards TSV as well. As a log consumer I would be happy with TSV or JSON.

@Rob-NY
Copy link
Contributor Author

Rob-NY commented Sep 11, 2021

Well, I honestly didn’t expect this PR to make it even this far. All of your comments are correct. In fact, log entries are cut for every keystroke made to user PROFILES while connected to the server - both for channel name as well as skill, country, city, etc. I don’t see a simple, supportable way to add the channel name into the log without creating a much more significant state-based monitoring mechanism within the code-base. And that was beyond the scope of what I was willing to try for this particular ask.

@Rob-NY
Copy link
Contributor Author

Rob-NY commented Sep 11, 2021

As for the log format, I chose Tab Sep because: 1. I wanted to mostly remove the escaping issue which does not exist in the log today due to the limited data captured. 2) I was under the impression that most of the world used Tab instead of CSV due to the comma use as a decimal point. 3) I wanted it to still be human readable and scannable via a console which json falls a bit short for me. It is worth noting that the channel names are filtered for offending chars prior to being written to the log.

Re: singleton vs. static — that was a choice; so many folks are religious about using statics that I thought I was safer using a singleton. Guess I picked wrong.

Frankly, I don’t want to spend a lot more time on this because I think there are several objections surrounding the appropriateness of this entire request with respect to writing IP/NAME to disk. I’d like that to be resolved first before moving forward (which is why I noted ‘for discussion’ in the PR submission)

@ann0see
Copy link
Member

ann0see commented Jan 28, 2022

I don’t really like the idea of writing ip or names to disk. Since the UDP/TCP API approach also being discussed, what do we use from this PR?

@pljones
Copy link
Collaborator

pljones commented Jan 29, 2022

I require the ability to see who's been accessing my server. That means IP and port number if I need traceability. Any server owner deserves that.

@ann0see ann0see marked this pull request as draft February 2, 2022 20:37
@ann0see
Copy link
Member

ann0see commented Mar 4, 2022

@Rob-NY you should "rebase" this Pull request instead of merging upstream into your branch in order to keep the development history clean. Also Styling doesn't seem to be ok here.

@pljones @Rob-NY could you please explicitly state what should go into upstream from this PR?

@ann0see ann0see added the stale label Mar 4, 2022
@Rob-NY
Copy link
Contributor Author

Rob-NY commented Mar 4, 2022 via email

@ann0see
Copy link
Member

ann0see commented Mar 4, 2022

Ok. If you still have changes, maybe open another issue and then we could include them? I'm personally also a bit unsure what to do with this PR.

Closing it anyways.

@ann0see ann0see closed this Mar 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants