Wolverine

An appetite for logs.

Summary

Wolverine is a library for processing event streams. Processing starts with a source of events, such as a log file. Filters wrap a source and themselves look like a source; the DSL supplies shortcut methods to build a pipeline of filters this way. Filters run incrementally, consuming input only as events are required, so you can quickly get the first few events from a very long stream.
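
Conceptually, the wrapping works like the plain-Ruby sketch below. This is an illustration of the pattern only, not Wolverine's actual classes; ArraySource and GrepFilter are invented names.

  # A source yields events; a filter wraps a source and exposes the
  # same interface, so filters stack to arbitrary depth.
  class ArraySource
    include Enumerable
    def initialize(events)
      @events = events
    end
    def each(&block)
      @events.each(&block)
    end
  end

  class GrepFilter
    include Enumerable
    def initialize(source, pattern)
      @source, @pattern = source, pattern
    end
    def each
      # Pull from the wrapped source one event at a time; enumeration
      # stops as soon as the consumer stops asking, which is why the
      # first few events of a long stream come back quickly.
      @source.each { |e| yield e if e =~ @pattern }
    end
  end

  GrepFilter.new(ArraySource.new(["a", "b1", "c2"]), /\d/).first(1)
  # => ["b1"]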

Events may be stored in a database for easy indexing. See the ActiveRecordSource, which provides an optimized where filter.
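
Usage might look something like the following sketch; the LogEvent model and the constructor argument are assumptions for illustration, since only the class name and its optimized where come from this document.

  # Hypothetical: events backed by an ActiveRecord model, so where
  # can be pushed down into an indexed SQL query instead of a scan.
  events = ActiveRecordSource.new(LogEvent)
  events.where(:controller => "AccountsController").count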

For a list of convenience methods for building filter pipelines, see FilterDSL.

When processing a Rails log with multiple lines per request and multiple web processes writing to the log concurrently, Wolverine can filter the requests that match particular criteria in a particular context. For example, it can tell you the number of exceptions thrown on POST requests, even though those two pieces of information appear in different log messages.

Installing

$ gem install winsome_wolverine

(The gem name “wolverine” was already taken.)

Examples


  require "wolverine"
  include Wolverine

  # Read events from a file, one per line
  requests = FileSource.new("log/production.log").

    # In this log, subsequent lines of a message may appear indented,
    # so coalesce them into one event.
    append_indented.

    # Parse text out into fields using regex captures.  Imagine a log
    # line that looks like this:
    #   Nov 19 19:08:27.931605 dhcp-172-25-211-77 45471: Hello World
    fields(/\A(\w+ \d+ \d+:\d+:\d+\.\d{6}) ([\w.-]+) (\d+):/m,
           :timestamp, :host, :pid).

    # Group messages from a single request together.
    # Since the logs are from multiple web server processes, messages
    # might be interleaved with those from other processes.  Use the
    # :pid and :host fields to separate them.  Whenever a process
    # logs "Processing," that's a new request.
    group(:following => [:pid, :host], :from => /: Processing/).

    # Pull some per-request fields out of the requests.
    fields(/Processing (\w+)#(\w+)/, :controller, :action).
    fields(/Session ID: (\w+)/, :session_id)

  # Filter by a field.  Note this returns a new filter immediately
  # but does not consume any input yet.
  accounts_requests = requests.where(:controller => "AccountsController")

  # Consume the entire stream and count the number of events
  accounts_requests.count

  # View the first five matches with "less"
  accounts_requests.head(5).less

  # Group requests into session traces
  sessions = requests.group(:following => :session_id)

  # Turn the timestamp into a real ruby object
  ts_requests = requests.parse_time(:timestamp)

  # Shuffle two streams into one, with events in order of timestamp
  # (other_host_requests stands for a second pipeline built the same
  # way from another host's log).
  ts_requests.interleave(other_host_requests)
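
Building on the pipeline above, the POST-exception count mentioned in the Summary might look like the sketch below. The regexes assume a particular log shape (a "[POST]" tag on the Processing line and a "SomeError (message)" line for exceptions); adjust them to your format.

  # Tag each request with its HTTP method and any exception class.
  tagged = requests.
    fields(/Processing \w+#\w+.*\[(\w+)\]/, :method).
    fields(/^(\w+(?:Error|Exception)) \(/, :exception)

  # Count POST requests that raised a RuntimeError.
  tagged.where(:method => "POST").
         where(:exception => "RuntimeError").count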

Wolverine works well interactively:

$ irb -r wolverine

You may also wish to keep helper functions specific to your log format or the task at hand in a file that you pre-load into your session:

$ irb -r my-app
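
For example, a my-app.rb on your load path might collect the boilerplate for your log format (the file name and helper below are invented for illustration):

  # my-app.rb -- helpers pre-loaded into interactive sessions
  require "wolverine"
  include Wolverine

  # Build the standard pipeline for this app's production log.
  def production_requests
    FileSource.new("log/production.log").
      append_indented.
      fields(/Processing (\w+)#(\w+)/, :controller, :action)
  end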

Road Map

  • Sources to read files over SSH
  • Sources that can tail a file
  • Histograms and tables
  • Faster field extraction (regex matching is the bottleneck)
  • Processing multiple pipelines in one pass over the underlying data
  • Sliding window filters that emit when a 5-minute average exceeds a threshold
