This repository contains a fully automated Elastic Stack (Elasticsearch, Kibana, Logstash) used to ingest normalized rsync data and visualize transfer statistics via a preconfigured Kibana dashboard. Built with docker-compose.
To find out how to produce the rsync data for visualization, refer to the scripts' separate documentation: scripts/README.md
The stack is designed for:
- Reproducible local deployment
- Easy access to elastic-stack tooling without worrying about platform
- Clear separation of ingestion, storage, and visualization
- Easy maintenance and setup
This stack is intended for:
- Visualizing rsync output generated via rsync_runner and rsync_normalize shellscripts
- Analysis and experimentation with the elastic-stack
- Extension beyond rsync: while rsync is the focus of this project, the containers provide a fully usable elastic-stack that can also be used for log analysis of e.g. security-related logs, and can therefore be extended without any issue
- Local development
It is not hardened nor intended for direct public internet exposure in its current form. In addition, it is a one-node cluster and not built with high availability in mind.
- Quickstart Guide
- Repository Layout
- Architecture Overview
- Requirements
- Initial Setup
- Data Ingestion
- Kibana Content
- Configuration Files
- Security
- Maintenance
- Troubleshooting
- Elastic-Stack specific Design Decisions
- Docker Compose and rsync installed
For the elastic-stack to work, it expects normalized output of rsync runs in the data directory. Using the provided scripts, that data can be generated; it will then automatically be indexed and visualized by the elastic-stack. For the stack itself to come up correctly, some environment variables need to be available in your shell environment when running the compose up command. After everything is started, log in to Kibana and take a look at the visualization of the content inside the data directory.
This guide assumes you need to also get the normalized rsync output first. For this we provide scripts documented in their own README. For now it is fine to simply follow the steps here in the quickstart guide and use our defaults.
If you already have normalized data, skip the "Getting the data" steps and move on to "Starting the elastic-stack and using kibana". Don't forget to copy your files to the data directory so Logstash can ingest them.
- Set the needed configuration in the rsync.conf file found in the scripts directory:
  RSYNC_SOURCE="./data/source"
  RSYNC_DEST="./data/destination"
  RSYNC_MODE="normalize"
- Run rsync via our script and wait for it to complete:
  ./scripts/rsync_runner.sh
- A file matching the "*.jsonl" naming should now have been created in the data directory.
We recommend running this script via cron or systemd timers, replacing direct manual invocations of rsync, so that data is continuously produced for our stack to visualize. For the purpose of this quickstart guide, this step can be skipped.
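A cron-based setup could look like the following crontab entry (a sketch; the repository path is an example and must be adjusted to your checkout):

```
# run the rsync runner every hour from the repository root, appending output to the log directory
0 * * * * cd /opt/rsync-elastic-stack && ./scripts/rsync_runner.sh >> rsync_runner-logs/cron.log 2>&1
```

A systemd timer paired with a oneshot service achieves the same on systems where cron is unavailable.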
Should you experience any issues during this stage, consult the README inside the scripts directory to learn more about our scripts.
- Follow the three steps outlined here: Initial Setup
- After logging in to Kibana as user elastic with your defined password, head over to the Dashboards tab and open the rsync dashboard.
- Adjust the timeframe in the top right corner of the page. If you don't see the data you expect, the timeframe is most likely wrong. We recommend setting it to "Last week".
- The visualizations inside the dashboard should now show the rsync run generated by the scripts beforehand. Once a new file matching the "*.jsonl" naming appears in the data directory, it will instantly be indexed and available in the visualizations as well.
- Feel free to head to the "Discover" section of kibana, and take a look at the fields available to you. This is a fully working elastic-stack after all, so you can leverage the kibana query language or build your own visualizations as you see fit.
With your stack now working, keep in mind that all configuration made inside the Kibana web interface is only preserved inside the Docker volumes; once these are wiped, so are the manual changes. Therefore, use the declarative approach and persist any Kibana changes as saved objects inside the NDJSON file that gets imported. Consult this README for more information.
.
├── docker-compose.yml
├── README.md
├── diagrams/ # diagrams used in this README
├── data/ # logstash looks for normalized data here
├── rsync_runner-logs/ # Non-normalized rsync run output lives here
├── scripts/ # contains all scripts to get data
├── es-setup/ # elastic-stack specific configs
│ ├── logs-rsync-index-template.json
│ ├── generate_certs.sh
│ └── kibana/
│ └── rsync-kibana-objects.ndjson
└── logstash/ # logstash config
└── pipeline/
└── logstash.conf
- Elasticsearch
- Stores all rsync event data
- Security enabled
- Uses HTTP internally (no TLS) to reduce bootstrap complexity
- Kibana
- Visualization and dashboard UI
- Exposed on https://localhost:5601
- Uses HTTPS with a self-signed certificate
- Connects to Elasticsearch using the kibana_system user
- Logstash
- Reads rsync JSON files from disk
- Transforms and enriches fields
- Writes documents into daily Elasticsearch indices
- elastic-certs
- Generates a self-signed CA and certificates
- Used only for Kibana HTTPS
- Certificates are stored in a Docker volume
- es-init
- Applies the Elasticsearch index template
- Performs cluster initialization tasks
- kibana-init
- Imports Kibana saved objects (data view, visualizations, dashboard)
- Uses overwrite mode to ensure a known-good state
- Docker Engine
- Docker Compose v2
- Open ports on the host:
  - 9200 (Elasticsearch)
  - 5601 (Kibana HTTPS)
We tested the stack and guarantee it works with the following container images:
docker.elastic.co/elasticsearch/elasticsearch:8.15.3
docker.elastic.co/kibana/kibana:8.15.3
docker.elastic.co/logstash/logstash:8.15.3
curlimages/curl:8.10.1
Generally, other minor versions of the elastic images work as well; however, we cannot guarantee that the visualizations work across different major versions. All elastic images need to have the same version to function together.
Create a .env file, source it, and ensure the variables are set as desired. The project does not look for this file anywhere; it simply assumes these variables are set in your shell environment. We recommend keeping them in an env file for easy sourcing.
export ELASTIC_PASSWORD=change-me-elastic
export KIBANA_SYSTEM_PASSWORD=change-me-kibana-system
export KIBANA_ENCRYPTION_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
Notes:
- KIBANA_ENCRYPTION_KEY must be at least 32 characters
- Keep the encryption key stable across restarts or Kibana will fail to start, unless you destroy the Docker volumes for a fresh initial setup
- The ELASTIC_PASSWORD is used to log in to Kibana
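A sufficiently long random encryption key can be generated with openssl (a sketch; any stable secret of at least 32 characters works):

```shell
# 32 random bytes, hex-encoded, yield a 64-character key (well above the 32-character minimum)
export KIBANA_ENCRYPTION_KEY="$(openssl rand -hex 32)"
echo "${#KIBANA_ENCRYPTION_KEY}"   # prints the key length
```

Remember to persist the generated value in your .env file so it stays stable across restarts.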
In the project root where the docker-compose.yml file is located, run:
docker compose up -d
and wait for all services to spin up correctly.
Keep in mind that if the env file is sourced by a non-root user but docker is run as root via sudo, root's environment will not contain the needed variables. Ensure the command is run by a user whose environment has all variables correctly set.
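If you do need sudo, one option is to preserve the calling user's environment for the root process (a sketch; your sudoers policy may restrict which variables survive):

```shell
# -E keeps the invoking user's environment (ELASTIC_PASSWORD etc.) visible to the root process
sudo -E docker compose up -d
```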
If you run into the above issue, run the following before retrying:
docker compose down --volumes
This removes all containers and destroys the volumes as well, providing a clean slate.
- URL: https://localhost:5601
- Browser warning about the self-signed certificate is expected
- Login credentials:
  - Username: elastic
  - Password: value of ELASTIC_PASSWORD
Logstash reads rsync JSON files from:
./data/*.jsonl
Behavior:
- Files are read once (mode => read)
- Processing state is persisted via sincedb
sincedb - Restarting Logstash does not reprocess existing files, because the sincedb is stored on a docker-volume
- The data directory is bind-mounted inside the logstash container, meaning you will not lose the normalized rsync data when the docker-volumes are wiped.
- However, this means you are responsible for ensuring the data directory of this repository is populated with the normalized rsync runs, and for keeping a backup in case you want to index those files again.
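The behavior above corresponds to a Logstash file input roughly along these lines (a sketch only; paths and the codec are assumptions here, and the authoritative version lives in logstash/pipeline/logstash.conf):

```
input {
  file {
    path => "/data/*.jsonl"       # assumed bind-mount target of ./data inside the container
    mode => "read"                # read each file once instead of tailing it
    sincedb_path => "/usr/share/logstash/data/sincedb"   # assumed location, persisted on a docker-volume
    codec => "json_lines"         # one JSON document per line
  }
}
```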
Documents are written to daily indices:
rsync-YYYY.MM.dd
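You can confirm that documents are arriving by listing the daily indices directly against Elasticsearch (assumes ELASTIC_PASSWORD is set in your shell and the stack is running):

```shell
# list all rsync-* indices with health, document counts, and sizes
curl -u "elastic:${ELASTIC_PASSWORD}" "https://localhost:9200/_cat/indices/rsync-*?v"
```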
All Kibana content is stored as code in:
es-setup/kibana/rsync-kibana-objects.ndjson
This includes:
- Data view (rsync-*)
- Visualizations
- Dashboard layout
The objects are imported automatically, with overwrite enabled, by the kibana-init container, making it easy to re-apply the configuration.
- Modify dashboards or visualizations in the Kibana UI
- Export saved objects (Stack Management → Saved Objects → Export)
- Replace the NDJSON file
- Re-run the kibana-init container:
docker compose up -d kibana-init
Naturally, the NDJSON file can also be edited directly if desired; to apply the changes, use the command above.
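The manual export steps can also be scripted against Kibana's saved-objects export API (a sketch; -k accepts the self-signed certificate, and the object types to export are an assumption):

```shell
# export all dashboards plus every object they reference into the tracked NDJSON file
curl -k -u "elastic:${ELASTIC_PASSWORD}" \
  -X POST "https://localhost:5601/api/saved_objects/_export" \
  -H "kbn-xsrf: true" -H "Content-Type: application/json" \
  -d '{"type": ["dashboard"], "includeReferencesDeep": true}' \
  > es-setup/kibana/rsync-kibana-objects.ndjson
```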
| File | Description |
|---|---|
| `docker-compose.yml` | Full stack definition |
| `.env` | Local secrets and credentials (optional as long as the ENV vars are set in your environment when running the `docker compose up` command) |
| `logstash/pipeline/logstash.conf` | Logstash pipeline configuration |
| `es-setup/logs-rsync-index-template.json` | Elasticsearch index template |
| `es-setup/kibana/rsync-kibana-objects.ndjson` | Kibana saved objects, e.g. visualizations |
| `es-setup/generate_certs.sh` | Self-signed certificate generation helper script |
Since this project mainly concerns itself with visualizing rsync data, and the stack runs locally anyway, we improved maintainability and easy automation at the cost of HTTPS everywhere and proper secrets management for our env vars. This was deemed better than having no auth or HTTPS whatsoever. So we provide the following:
- Security enabled
- HTTP only (no TLS internally)
- Authentication required
- HTTPS enabled using self-signed certificates
- Uses kibana_system internally to handle Elasticsearch operations
- Human users authenticate via basic auth to Kibana's web interface as user "elastic"
- Encrypts browser traffic
- Browser warnings are expected
- Suitable for local or internal deployments
Basically, since we have neither a public IP nor DNS, none of the "Let's Encrypt" challenges work for us. Therefore we opted to simply generate a self-signed certificate for now.
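Generating such a certificate boils down to a single openssl invocation (illustrative only; the repository's es-setup/generate_certs.sh is authoritative, and the filenames here are examples):

```shell
# create a self-signed certificate and key for Kibana, valid for one year, bound to localhost
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout kibana.key -out kibana.crt \
  -subj "/CN=localhost" \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"
```

The subjectAltName entry matters: modern browsers validate it rather than the CN, so without it the certificate warning cannot even be accepted permanently in some browsers.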
# for more possibilities take a look at the docker compose logs command help
docker compose logs elasticsearch
docker compose logs kibana
docker compose logs logstash
docker compose restart
Reinstall Elasticsearch index template:
docker compose up -d es-init
Reimport Kibana saved objects:
docker compose up -d kibana-init
Removes all Elasticsearch data and Kibana state:
# shut down the containers and remove their volumes, then create all containers and their volumes again
docker compose down --volumes
docker compose up -d
You should not run into issues as long as the required environment variables are set when running any docker commands that interact with the stack. So always verify first that they are set properly, e.g. by running env in your shell.
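A quick check for the three variables named in Initial Setup:

```shell
# print only the variables the stack expects; missing lines mean missing variables
env | grep -E '^(ELASTIC_PASSWORD|KIBANA_SYSTEM_PASSWORD|KIBANA_ENCRYPTION_KEY)='
```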
- Verify KIBANA_SYSTEM_PASSWORD in .env
- Check initialization logs:
docker compose logs es-init
- Ensure kibana-init authenticates using elastic in your request
- Ensure Logstash output (found in logstash's config file) uses:
  user => "elastic"
  password => ${ELASTIC_PASSWORD}
- If the logs show permission-denied issues, ensure that the logstash container has sufficient permissions to read the mounted data directory
- Elasticsearch runs on HTTP to reduce bootstrap complexity
- Kibana uses HTTPS to protect browser traffic
- Built-in users are used instead of service account tokens to reduce bootstrap complexity
- We consider the user's shell environment safe for secrets for use with this tooling
- Obviously, in a production-grade deployment of the elastic-stack, secrets management would be handled differently; however, for the scope and complexity of this project, this is a good tradeoff versus having no authentication whatsoever
- Kibana content is treated as declarative state and can be reapplied by restarting the kibana-init container
- No manual post-install steps are required to see rsync visualizations
- Logstash takes as input specifically formatted rsync output, generated by our provided scripts
- This is to guarantee an automated and working solution without having to deal with any edge cases