The goal is to write a parser in Java that parses a web server access log file,
loads the log into MySQL, and checks whether a given IP makes more than a certain
number of requests within a given duration.
Java
----
(1) Create a Java tool that can parse the given log file and load it into MySQL.
The delimiter of the log file is the pipe character (|).
(2) The tool takes "startDate", "duration" and "threshold" as command line
arguments. "startDate" is of "[Link]:mm:ss" format, "duration" accepts only
"hourly" or "daily", and "threshold" must be an integer.
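The argument handling described in (2) could be sketched as follows. This is only a minimal illustration, not a required design: the class name CliArgs and its method names are hypothetical, and "startDate" is left as a raw string here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: collects "--name=value" arguments into a map.
class CliArgs {
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            // Reject anything that is not of the form --name=value.
            if (!arg.startsWith("--") || eq < 3) {
                throw new IllegalArgumentException("Expected --name=value but got: " + arg);
            }
            options.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return options;
    }

    // "duration" may only be "hourly" or "daily".
    static String requireDuration(Map<String, String> options) {
        String duration = options.get("duration");
        if (!"hourly".equals(duration) && !"daily".equals(duration)) {
            throw new IllegalArgumentException("duration must be hourly or daily");
        }
        return duration;
    }

    // "threshold" must be an integer; NumberFormatException is thrown otherwise.
    static int requireThreshold(Map<String, String> options) {
        return Integer.parseInt(options.get("threshold"));
    }
}
```

A main method would call CliArgs.parse(args) once and then read the three options from the returned map.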
(3) This is how the tool works:
java -cp "[Link]" [Link] --startDate=2017-01-01.[Link]
--duration=hourly --threshold=100
The tool will find any IPs that made more than 100 requests starting from
2017-01-01.[Link] to 2017-01-01.[Link] (one hour), print them to the console, AND
also load them into another MySQL table with a comment on why each IP is blocked.
java -cp "[Link]" [Link] --startDate=2017-01-01.[Link]
--duration=daily --threshold=250
The tool will find any IPs that made more than 250 requests starting from
2017-01-01.[Link] to 2017-01-02.[Link] (24 hours), print them to the console, AND
also load them into another MySQL table with a comment on why each IP is blocked.
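The two examples above come down to computing a time window from "startDate" and "duration", then counting requests per IP inside it. One possible in-memory sketch is shown below. The names ThresholdCheck, Request, windowEnd and findOffenders are hypothetical. The sketch also assumes the window includes its start and excludes its end, which the text above does not specify. The comparison is strict ("more than").

```java
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ThresholdCheck {
    // Hypothetical value type: one parsed log entry reduced to time and IP.
    static final class Request {
        final LocalDateTime time;
        final String ip;

        Request(LocalDateTime time, String ip) {
            this.time = time;
            this.ip = ip;
        }
    }

    // "hourly" covers one hour, "daily" covers 24 hours from the start.
    static LocalDateTime windowEnd(LocalDateTime start, String duration) {
        switch (duration) {
            case "hourly":
                return start.plusHours(1);
            case "daily":
                return start.plusDays(1);
            default:
                throw new IllegalArgumentException("Unknown duration: " + duration);
        }
    }

    // Returns IP -> request count for every IP with more than `threshold`
    // requests in [start, end). The half-open window is an assumption.
    static Map<String, Integer> findOffenders(List<Request> requests,
                                              LocalDateTime start,
                                              String duration,
                                              int threshold) {
        LocalDateTime end = windowEnd(start, duration);
        Map<String, Integer> counts = new TreeMap<>();
        for (Request r : requests) {
            if (!r.time.isBefore(start) && r.time.isBefore(end)) {
                counts.merge(r.ip, 1, Integer::sum);
            }
        }
        // Keep only IPs strictly above the threshold.
        counts.values().removeIf(count -> count <= threshold);
        return counts;
    }
}
```

The same counting can instead be done in MySQL once the log is loaded. The in-memory version is only meant to make the window and threshold rules concrete.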
SQL
---
(1) Write a MySQL query to find IPs that made more than a certain number of
requests for a given time period.
Ex: Write SQL to find IPs that made more than 100 requests starting from
2017-01-01.[Link] to 2017-01-01.[Link].
(2) Write a MySQL query to find requests made by a given IP.
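As a rough illustration of the shape such queries might take, the sketch below holds two parameterized statements as Java constants and binds the first with JDBC. The table name access_log and the column names request_time, ip, request, status and user_agent are assumptions, since the schema is left to the implementer. The half-open time range is likewise an assumption.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;

class LogQueries {
    // (1) IPs with more than N requests in [start, end).
    // Table and column names are hypothetical.
    static final String IPS_OVER_THRESHOLD =
        "SELECT ip, COUNT(*) AS request_count "
        + "FROM access_log "
        + "WHERE request_time >= ? AND request_time < ? "
        + "GROUP BY ip "
        + "HAVING COUNT(*) > ?";

    // (2) All requests made by one IP, oldest first.
    static final String REQUESTS_BY_IP =
        "SELECT request_time, ip, request, status, user_agent "
        + "FROM access_log "
        + "WHERE ip = ? "
        + "ORDER BY request_time";

    // Binds the window and threshold to query (1); the caller executes and closes it.
    static PreparedStatement prepareOverThreshold(Connection connection,
                                                  LocalDateTime start,
                                                  LocalDateTime end,
                                                  int threshold) throws SQLException {
        PreparedStatement statement = connection.prepareStatement(IPS_OVER_THRESHOLD);
        statement.setTimestamp(1, Timestamp.valueOf(start));
        statement.setTimestamp(2, Timestamp.valueOf(end));
        statement.setInt(3, threshold);
        return statement;
    }
}
```

Using placeholders rather than string concatenation keeps the IP and date values out of the SQL text itself.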
LOG Format
----------
Date, IP, Request, Status, User Agent (pipe-delimited; open the example file in
a text editor)
Date Format: "yyyy-MM-dd HH:mm:[Link]"
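Splitting one line of this format into its five fields could look roughly like the sketch below. The class name LogLineSplitter is hypothetical, and all fields are kept as raw strings, so no date parsing is attempted.

```java
import java.util.Arrays;
import java.util.List;

class LogLineSplitter {
    // Splits one pipe-delimited line into: date, IP, request, status, user agent.
    static List<String> splitLine(String line) {
        // The pipe is a regex metacharacter, so it has to be escaped.
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\\|", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException(
                "Expected 5 fields but found " + fields.length);
        }
        return Arrays.asList(fields);
    }
}
```

This assumes no field contains a literal pipe character. If the attached log file shows otherwise, a stricter parser would be needed.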
Also, please find attached a log file for your reference.
The log file assumes 200 as the hourly limit and 500 as the daily limit, meaning:
(1)
When you run your parser against this file with the following parameters:
java -cp "[Link]" [Link] --startDate=2017-01-01.[Link]
--duration=hourly --threshold=200
The output will contain [Link]. If you open the log file, [Link] has
200 or more requests between 2017-01-01.[Link] and 2017-01-01.[Link].
(2)
When you run your parser against this file with the following parameters:
java -cp "[Link]" [Link] --startDate=2017-01-01.[Link]
--duration=daily --threshold=500
The output will contain [Link]. If you open the log file, [Link]
has 500 or more requests between 2017-01-01.[Link] and 2017-01-01.[Link].
Deliverables
------------
(1) Java program that can be run from the command line
java -cp "[Link]" [Link] --accesslog=/path/to/file
--startDate=2017-01-01.[Link] --duration=hourly --threshold=100
(2) Source Code for the Java program
(3) MySQL schema used for the log data
(4) SQL queries for SQL test