java | Jeroen van Wilgenburg's blog

Why HTTP/2 with TLS is not supported properly in Java 8 – And what you can do about it

10 January 2017 Jeroen van Wilgenburg 17 comments

Recently I got a bit frustrated that Undertow is still the only HTTP/2 server in Java that properly supports HTTP/2 with TLS. Today I found out I was being unreasonable. After watching a talk it turned out there’s a good reason for it: TLS in Java cancels out a large part of the latency improvement of HTTP/2.

Categories: English, java, work Tags: haproxy, http2, ssl, wireshark

Getting started with Akka Stream Kafka – Using Kafka the reactive streams way

19 September 2016 Jeroen van Wilgenburg 7 comments

A few days ago I spotted a new release of Akka Stream Kafka. Since I’m currently doing a lot with Kafka and really wanted to get my hands dirty with Akka, this sounded very good. It’s also a good opportunity to see whether an upgrade to Kafka 0.10.0.1 (from 0.8.2.2) is worthwhile (since older versions of Kafka are not supported in Akka Stream Kafka 0.11).

Categories: English, java, work Tags: akka, kafka, scala

Finding the link between heart rate and running pace with Spark ML – Fitting a linear regression model

2 September 2016 Jeroen van Wilgenburg 1 comment

Besides crafting software I’m an avid runner and cyclist. Firstly for my health and secondly because of all the cool gadgets that are available. Recently I started a Coursera course on Machine Learning and with that knowledge I combined the output of my running watch with Spark ML. In this article I discuss how to load GPS and heart rate data into a linear regression model and ultimately get a formula with heart rate as input and running pace as output.

Categories: English, hardlopen, java, triathlon, work Tags: machine-learning, scala, spark
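To make the idea of "a formula with heart rate as input and running pace as output" concrete, here is a minimal sketch in plain Java. It is not the Spark ML code from the article: it fits a one-variable least-squares line by hand, and the class name `PaceRegression`, its methods and the sample values are all made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper, not part of Spark ML: fits pace = intercept + slope * heartRate.
final class PaceRegression {

    /** Ordinary least squares for y = intercept + slope * x; returns {intercept, slope}. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    /** Applies the fitted line to a single input value. */
    static double predict(double[] model, double x) {
        return model[0] + model[1] * x;
    }

    public static void main(String[] args) {
        // Invented samples: heart rate in beats per minute, pace in minutes per kilometre.
        double[] heartRate = {130, 140, 150, 160, 170};
        double[] paceMinPerKm = {6.5, 6.0, 5.5, 5.0, 4.5};

        double[] model = fit(heartRate, paceMinPerKm);
        System.out.printf(Locale.ROOT, "pace = %.3f + %.4f * heartRate%n", model[0], model[1]);
        System.out.printf(Locale.ROOT, "predicted pace at 155 bpm: %.2f min/km%n", predict(model, 155));
    }
}
```

Real watch data is noisy and the relation between heart rate and pace is unlikely to be perfectly linear, so a library such as Spark ML adds the data loading and evaluation that this sketch leaves out.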

At-least-once delivery with Kafka and Spark – Improve the reliability of your job

28 June 2016 Jeroen van Wilgenburg Leave a comment

The default behaviour of a Spark job is to lose data when it explodes. This isn’t a bad choice per se, but on my current project we need higher reliability. In this article I’ll talk about at-least-once delivery with the Spark write ahead log. This article is focussed on Kafka, but the approach can also be applied to other Receivers.

Categories: English, java, work Tags: kafka, scala, spark, yarn

Spring Boot with HTTP/2 – Start a server and make REST calls as a client

1 April 2016 Jeroen van Wilgenburg 5 comments

Since HTTP/2 is gaining momentum I thought it would be a nice experiment to see if it’s possible to convert some applications to HTTP/2. We have a bunch of Spring Boot microservices and those services communicate with each other via REST calls. All communication happens via JSON (Jackson 2). Running Spring Boot with HTTP/2 should be easy and hopefully Spring RestTemplate supports HTTP/2 for the inter-service communication. Let’s see…

Categories: English, java, work Tags: http2, ssl

HTTP/2 Server Push with JBoss undertow – And how to monitor things

28 December 2015 Jeroen van Wilgenburg 3 comments

With HTTP/2 it’s possible to deliver data to a client before the client even asks for it. This will significantly improve latency and perceived download speed. At the last hack day at JPoint we spent some time with HTTP/2 server push. I will show you how to get it running on your machine and show some tools to monitor/prove all this goodness.

Categories: English, java, work Tags: classloading, http2

Spark Streaming Backpressure – finding the optimal rate is now done automatically

6 October 2015 Jeroen van Wilgenburg 10 comments

One of my complaints about Spark was that it wasn’t possible to set a dynamic maximum rate. This is a problem in many jobs since the maximum throughput isn’t always linear with the output rate. Another issue is with local testing. You have to set the rate to extremely low values and experiment a lot to make a Spark job usable on a local machine.
But all these problems are in the past with the introduction of backpressure (I believe it’s spelled as back pressure, but I’ll stick to the Spark notation).

Categories: English, java, work Tags: kafka, scala, spark

How to run a Spark cluster on Mesos on your Mac

10 May 2015 Jeroen van Wilgenburg 3 comments

On my current project we are running Spark on top of Yarn. Since Hadoop causes dependency problems and feels a bit ancient I was looking for an alternative. At JPoint we have a few days a year to try things out, and this was one of them. I paired with Eelco to get things up and running.

This article will show you how to run Spark on top of Mesos on your Mac (or Linux and probably a combination of these two).

Categories: English, java, work Tags: mac, mesos, spark

Understanding Spark parameters – A step by step guide to tune your Spark job

15 February 2015 Jeroen van Wilgenburg 1 comment

After using Spark for a few months we thought we had a pretty good grip on how to use it. The Spark documentation seemed pretty decent and we had acceptable performance on most jobs. On one job we kept hitting limits which were much lower than with that job’s predecessor (Storm). When we did some research we found out we didn’t understand Spark as well as we thought.
My colleague Jethro pointed me to an article by Gerard Maas and I found another great article by Michael Noll. Combined with the Spark docs and some Googlin’ I wrote this article to help you tune your Spark job. We improved our throughput by 600% (and then the Elasticsearch cluster became the new bottleneck).

Categories: English, java, work Tags: elasticsearch, hadoop, kafka, scala, spark, yarn

How to fix the classpath in Spark – Rebuilding Spark to get rid of old jars

26 October 2014 Jeroen van Wilgenburg 3 comments

After fixing an earlier classpath problem a new and more difficult problem showed up. Again the evil-doer is an ancient version of Jackson.
Unfortunately, with Spark/Hadoop/Yarn all classes are loaded from one big container, which causes a lot of problems. The spark.files.userClassPathFirst option is still ‘experimental’ which, in our case, meant it just didn’t work. But again, we found a solution. Our system engineers also wanted a reproducible solution, so in the end it’s just a small recipe.

Categories: English, java, work Tags: classloading, hadoop, maven, spark, yarn
