Showing posts with label actors. Show all posts
Showing posts with label actors. Show all posts

Saturday, December 26, 2015

Divided We Win: an event sourcing / CQRS prospective on write and read models separation. Queries.

Quite a while ago we have started to explore command query responsibility segregation (CQRS) architecture as an alternative way to develop distributed systems. Last time we have covered only commands and events but not queries. The goal of this blog post is to fill the gap and discuss the ways to handle queries following CQRS architecture.

We will start where we left off last time, with the sample application which was able to handle commands and persist events in the journal. In order to support read path, or queries, we are going to introduce data store. For the sake of keeping things simple, let it be in-memory H2 database. The data access layer is going to be handled by awesome Slick library.

To begin with, we have to come up with a simple data model for the User class, managed by UserAggregate persistent actor. In this regards, the Users class is a typical mapping of the relational table:

class Users(tag: Tag) extends Table[User](tag, "USERS") {
  def id = column[String]("ID", O.PrimaryKey)
  def email = column[String]("EMAIL", O.Length(512))
  def uniqueEmail = index("USERS_EMAIL_IDX", email, true)
  def * = (id, email) <> (User.tupled, User.unapply)
}
It is important to notice at this point that we enforce uniqueness constraint on User's email. We will come back to this subtle detail later, during the integration with UserAggregate persistent actor. Next thing we need is a service to manage data store access, namely persisting and querying Users. As we are in the Akka universe, obviously it is going to be an actor as well. Here it is:
case class CreateSchema()
case class FindUserByEmail(email: String)
case class UpdateUser(id: String, email: String)
case class FindAllUsers()

trait Persistence {
  val users = TableQuery[Users]  
  val db = Database.forConfig("db")
}

class PersistenceService extends Actor with ActorLogging with Persistence {
  import scala.concurrent.ExecutionContext.Implicits.global
   
  def receive = {
    case CreateSchema => db.run(DBIO.seq(users.schema.create))
      
    case UpdateUser(id, email) => {
      val query = for { user <- users if user.id === id } yield user.email
      db.run(users.insertOrUpdate(User(id, email)))
    }
    
    case FindUserByEmail(email) => {
      val replyTo = sender
      db.run(users.filter( _.email === email.toLowerCase).result.headOption) 
        .onComplete { replyTo ! _ }
    }
    
    case FindAllUsers => {
      val replyTo = sender
      db.run(users.result) onComplete { replyTo ! _ }
    }
  }
}
Please notice that PersistenceService is regular untyped Akka actor, not a persistent one. To keep things focused, we are going to support only four kind of messages:
  • CreateSchema to initialize database schema
  • UpdateUser to update user's email address
  • FindUserByEmail to query user by its email address
  • FindAllUsers to query all users in the data store

Good, the data store services are ready but nothing really fills them with data. Moving on to the next step, we will refactor UserAggregate, more precisely the way it handles UserEmailUpdate command. At its current implementation, user's email update happens unconditionally. But remember, we imposed uniqueness constraints on emails so we are going to change command logic to account for that: before actually performing the update we will run the query against read model (data store) to make sure no user which such email is already registered.

val receiveCommand: Receive = {
  case UserEmailUpdate(email) => 
    try {
      val future = (persistence ? FindUserByEmail(email)).mapTo[Try[Option[User]]]
      val result = Await.result(future, timeout.duration) match {
        case Failure(ex) => Error(id, ex.getMessage)
        case Success(Some(user)) if user.id != id => Error(id, s"Email '$email' already registered")
        case _ => persist(UserEmailUpdated(id, email)) { event =>
          updateState(event)
          persistence ! UpdateUser(id, email)
        }
        Acknowledged(id)
      }
        
      sender ! result
  } catch {
    case ex: Exception if NonFatal(ex) => sender ! Error(id, ex.getMessage) 
  }
}
Pretty simple, but not really idiomatic: the Await.result does not look as it belongs to this code. My first attempt was to use future / map / recover / pipeTo pipeline to keep the flow completely asynchronous. However, the side effect I have observed is that in this case the persist(UserEmailUpdated(id, email)) { event => ... } block is supposed to be executed in another thread most of the time (if result is not ready) but it didn't, very likely because of thread context switch. So the Await.result was here to the rescue.

Now every time user's email update happens, along with persisting the event we are going to record this fact in the data store as well. Nice, we are getting one step closer.

The last thing we have to consider is how to populate the data store from the event journal? Another experimental module from Akka Persistence portfolio is of great help here, Akka Persistence Query. Among other features, it provides the ability to query the event journal by persistence identifiers and this is what we are going to do in order to populate data store from the journal. Not a surprise, there would be UserJournal actor responsible for that.

case class InitSchema()

class UserJournal(persistence: ActorRef) extends Actor with ActorLogging {
  def receive = {
    case InitSchema => {
      val journal = PersistenceQuery(context.system)
        .readJournalFor[LeveldbReadJournal](LeveldbReadJournal.Identifier)
      val source = journal.currentPersistenceIds()
      
      implicit val materializer = ActorMaterializer()
      source
        .runForeach { persistenceId => 
          journal.currentEventsByPersistenceId(persistenceId, 0, Long.MaxValue)
            .runForeach { event => 
              event.event match {
                case UserEmailUpdated(id, email) => persistence ! UpdateUser(id, email)
              }
            }
        }
    }
  }
}
Basically, the code is simple enough but let us reiterate a bit on what it does. First thing, we ask the journal about all persistence identifiers it has using currentPersistenceIds method. Secondly, for every persistence identifier we query all the events from the journal. Because there is only one event in our case, UserEmailUpdated, we just directly transform it into data store services UpdateUser message.

Awesome, we are mostly done! The simplest thing at the end is to add another endpoint to UserRoute which returns the list of the existing users by querying the read model.

pathEnd {
  get {
    complete {
      (persistence ? FindAllUsers).mapTo[Try[Vector[User]]] map { 
        case Success(users) => 
          HttpResponse(status = OK, entity = users.toJson.compactPrint)
        case Failure(ex) => 
          HttpResponse(status = InternalServerError, entity = ex.getMessage)
      }
    }
  }
}
We are ready to give our revamped CQRS sample application a test drive! Once it is up and running, let ensure that our journal and data store are empty.
$ curl -X GET http://localhost:38080/api/v1/users
[]
Make sense, as we haven't created any users yet. Let us do that by updating two users with different email addresses and querying the read model again.
$ curl -X PUT http://localhost:38080/api/v1/users/123 -d [email protected]
Email updated: [email protected]

$ curl -X PUT http://localhost:38080/api/v1/users/124 -d [email protected]
Email updated: [email protected]

$ curl -X GET http://localhost:38080/api/v1/users
[
  {"id":"123","email":"[email protected]"},
  {"id":"124","email":"[email protected]"}
]
As expected, this time the result is different and we can see two users are being returned. The moment of truth, let us try to update the email of user with 124 with the one of user with 123.
$ curl -X PUT http://localhost:38080/api/v1/users/123 -d [email protected]
Email '[email protected]' already registered
Teriffic, this is what we wanted! The read (or query) model is very helpful and works just fine. Please notice when we restart the application, the read data store should be repopulated from the journal and return the previously created users:
$ curl -X GET http://localhost:38080/api/v1/users
[
  {"id":"123","email":"[email protected]"},
  {"id":"124","email":"[email protected]"}
]

With this post coming to the end, we are wrapping up the introduction into CQRS architecture. Although one may say many things are left out of scope, I hope the examples presented along our journey were useful to illustrate the idea and CQRS could be an interesting option to consider for your next projects.

As always, the complete source code is available on GitHub. Many thanks to Regis Leray and Esfandiar Amirrahimi, two brilliant developers, for helping out a lot with this series of blog posts.

Saturday, August 31, 2013

Lightweight real-time charts with Play Framework and Scala using server-side events

Continuing a great journey with awesome Play Framework and Scala language, I would like to share yet another interesting implementation of real-time charting: this time by using lightweight server-side events instead of full-duplex WebSockets technology described previously in this post. Indeed, if you don't need a bidirectional communication but only server push, server-side events look as a very natural fit. And if you are using Play Framework, it's really easy to do as well.

Let's try to cover the same use case so it will be fair to compare both implementations: we have couple of hosts and we would like to watch CPU usage on each one in real-time (on a chart). Let's start by creating a simple Play Framework application (choosing Scala as a primary language):

play new play-sse-example

Now, when the layout of our application is ready, our next step is to create some starting web page (using Play Framework's type safe template engine) and name it as views/dashboard.scala.html. Here is how it looks like:

@(title: String, hosts: List[Host])

<!DOCTYPE html>
<html>
  <head>
    <title>@title</title>
    <link rel="stylesheet" media="screen" href="@routes.Assets.at("stylesheets/main.css")">
    <link rel="shortcut icon" type="image/png" href="@routes.Assets.at("images/favicon.png")">
    <script src="@routes.Assets.at("javascripts/jquery-1.9.0.min.js")" type="text/javascript"></script>
    <script src="@routes.Assets.at("javascripts/highcharts.js")" type="text/javascript"></script>
  </head>
    
  <body>
    <div id="hosts">
      <ul class="hosts">
        @hosts.map { host =>
          <li>               
            <a href="#" onclick="javascript:show( '@host.id' )">@host.name</a>
          </li>
        }
      </ul>
    </div>    
    <div id="content">
    </div>
  </body>
</html>

<script type="text/javascript">
function show( hostid ) {
  $('#content').trigger('unload');
 
  $("#content").load( "/host/" + hostid,
    function( response, status, xhr ) {
      if (status == "error") {   
        $("#content").html( "Sorry but there was an error:" + xhr.status + " " + xhr.statusText);
      }
    }
  )
}
</script>

The template looks exactly the same as in WebSockets example, except one single line, the purpose of this one will be explained just a bit later.

$('#content').trigger('unload');

The result of this web page is a simple list of hosts. Whenever user clicks on a host link, the host-specific view will be fetched from the server (using AJAX) and displayed. Next template is the most interesting one, views/host.scala.html, and contains a lot of important details:

@(host: Host)( implicit request: RequestHeader )

<div id="content">
  <div id="chart"></div>
 
  <script type="text/javascript">
    var charts = []   
      charts[ '@host.id' ] = new Highcharts.Chart({                 
        chart: {
          renderTo: 'chart',
          defaultSeriesType: 'spline'            
        },           
        xAxis: {
          type: 'datetime'
        },   
        series: [{
          name: "CPU",
          data: []
        }
      ]
    }); 
  </script>     
</div>

<script type="text/javascript">
  if( !!window.EventSource ) {
    var event = new EventSource("@routes.Application.stats( host.id )");
 
    event.addEventListener('message', function( event ) { 
      var datapoint = jQuery.parseJSON( event.data );
      var chart = charts[ '@host.id' ];
       
      chart.series[ 0 ].addPoint({
        x: datapoint.cpu.timestamp,
        y: datapoint.cpu.load
      }, true, chart.series[ 0 ].data.length >= 50 );
    } );

    $('#content').bind('unload',function() {
      event.close();
    });                         
  }  
</script>

The core UI component is a simple chart, built using Highcharts library. The script block at the bottom tries to create an EventSource object which is an implementation of server-side events on browser side. If browser supports server-side events, the respective connection to server-side endpoint will be created and chart will be updated on every message received from the server ('message' listener). It's a good time to explain the purpose of this construct (and it's counterpart $('#content').trigger('unload') mentioned above):

$('#content').bind('unload',function() {
  event.close();
});

Whenever user clicks on different hosts, the previous event stream should be closed and new one should be created. Not doing so leads to more and more event streams to be created, flooding browser with more and more event listeners. To overcome this, we bind an unload method to a div element with id content and call it all the time when user clicks on a host. By doing that, we close event stream all the time before opening a new one. Enough UI, let's move on to back-end.

The routing table and mostly all the code stay the same, except only two small method changes, Statistics.attach and Application.stats. Let's take a look how server push of host's CPU statistics using server-side events is implemented on controller side (and mapped to /stats/:id URL):

def stats( id: String ) = Action { request =>
  Hosts.hosts.find( _.id == id ) match {
    case Some( host ) =>
      Async { 
        Statistics.attach( host ).map { enumerator =>      
          Ok.stream( enumerator &> EventSource() ).as( "text/event-stream")
        }
      }
    case None => NoContent  
  }
}

Very short piece of code which does a lot of things. After finding the respective host by its id, we "attaching" to it by receiving the Enumerator instance: the continuous flow of CPU statistics data. The Ok.stream( enumerator &> EventSource() ).as( "text/event-stream") will transform this continuous flow of statistics data to stream of events which client is able to consume using server-side events.

To finish with server-side changes, let's take a look how "attaching" to host's statistics flow looks like:

def attach( host: Host ): Future[ Enumerator[ JsValue ] ] = {
  ( actor( host.id ) ? Connect( host ) ).map {      
    case Connected( enumerator ) => enumerator
  }
}

It's as simple as returning the Enumerator, and because we are using Akka actors, it becomes a bit more tricky with Future and asynchronous invocations. And, that's it!

In action our simple application looks like this (using Mozilla Firefox), having only Host 1 and Host 2 as an example:

Very nice and simple, and yet again, thanks a lot to Play Framework guys and the community. Complete source code is available on GitHub.

Monday, May 27, 2013

Real-time charts with Play Framework and Scala: extreme productivity on JVM for web

Being a hardcore back-end developer, whenever I am thinking about building web application with some UI on JVM platform, I feel scared. And there are reasons for that: having experience with JSF, Liferay, Grails, ... I don't want to go this road anymore. But if a need comes, is there a choice, really? I found one which I think is awesome: Play Framework.

Built on top of JVM, Play Framework allows to create web applications using Java or Scala with literally no efforts. The valuable and distinguishing differences it provides: static compilation (even for page templates), easy to start with, and concise (more about it here).

To demonstrate how amazing Play Framework is, I would like to share my experience with developing simple web application. Let's assume we have couple of hosts and we would like to watch CPU usage on each one in real-time (on a chart). When one hears "real-time", it may mean different things but in context of our application it means: using WebSockets to push data from server to client. Though Play Framework supports pure Java API, I will use some Scala instead as it makes code very compact and clear.

Let's get started! After downloading Play Framework (the latest version on the moment of writing was 2.1.1), let's create our app by typing

play new play-websockets-example
and selecting Scala as a primary language. No wonders here: it's a pretty standard way nowadays, right?

Having our application ready, next step would be to create some starting web page. Play Framework uses own type safe template engine based on Scala, it has a couple of extremely simple rules and is very easy to get started with. Here is an example of views/dashboard.scala.html:

@(title: String, hosts: List[Host])

<!DOCTYPE html>
<html>
  <head>
    <title>@title</title>
    <link rel="stylesheet" media="screen" href="@routes.Assets.at("stylesheets/main.css")">
    <link rel="shortcut icon" type="image/png" href="@routes.Assets.at("images/favicon.png")">
    <script src="@routes.Assets.at("javascripts/jquery-1.9.0.min.js")" type="text/javascript">
    <script src="@routes.Assets.at("javascripts/highcharts.js")" type="text/javascript">
  </head>
    
  <body>
    <div id="hosts">
      <ul class="hosts">
        @hosts.map { host =>
        <li>               
          <a href="#" onclick="javascript:show( '@host.id' )"><b>@host.name</b></a>
        </li>
        }
      </ul>
    </div>
  
    <div id="content">
    </div>
  </body>
</html>

<script type="text/javascript">
function show( hostid ) {
  $("#content").load( "/host/" + hostid,
    function( response, status, xhr ) {
      if (status == "error") {
        $("#content").html( "Sorry but there was an error:" + xhr.status + " " + xhr.statusText);
      }
    }
  )
}
</script>

Aside from coupe of interesting constructs (which are very well described here), it looks pretty like regular HTML with a bit of JavaScript. The result of this web page is a simple list of hosts in the browser. Whenever user clicks on a particular host, another view will be fetched from the server (using old buddy AJAX) and displayed on right side from the host. Here is the second (and the last) template, views/host.scala.html:

@(host: Host)( implicit request: RequestHeader )

<div id="content">
  <div id="chart">
  <script type="text/javascript">
    var charts = []   
    charts[ '@host.id' ] = new Highcharts.Chart({                 
      chart: {
        renderTo: 'chart',
        defaultSeriesType: 'spline'            
      },           
      xAxis: {
        type: 'datetime'
      },   
      series: [{
        name: "CPU",
        data: []
      }]
    }); 
  </script>
</div>

<script type="text/javascript">
var socket = new WebSocket("@routes.Application.stats( host.id ).webSocketURL()")
socket.onmessage = function( event ) { 
  var datapoint = jQuery.parseJSON( event.data );
  var chart = charts[ '@host.id' ]
  
  chart.series[ 0 ].addPoint({
    x: datapoint.cpu.timestamp,
    y: datapoint.cpu.load
  }, true, chart.series[ 0 ].data.length >= 50 );
}
</script>

It's looks rather as a fragment, not a complete HTML page, which has only a chart and opens the WebSockets connection with a listener. With an enormous help of Highcharts and jQuery, JavaScript programming hasn't ever been so easy for back-end developers as it's now. At this moment, the UI part is completely done. Let's move on to back-end side.

Firstly, let's define the routing table which includes only three URLs and by default is located at conf/routes:

GET     /                           controllers.Application.index
GET     /host/:id                   controllers.Application.host( id: String )
GET     /stats/:id                  controllers.Application.stats( id: String )

Having views and routes defined, it's time to fill up the last and most interesting part, the controllers which glue all parts together (actually, only one controller, controllers/Application.scala). Here is a snippet which maps index action to the view templated by views/dashboard.scala.html, it's as easy as that:

def index = Action {
  Ok( views.html.dashboard( "Dashboard", Hosts.hosts() ) )
}

The interpretation of this action may sound like that: return successful response code and render template views/dashboard.scala.html with two parameters, title and hosts, as response body. The action to handle /host/:id looks much the same:

def host( id: String ) = Action { implicit request =>
  Hosts.hosts.find( _.id == id ) match {
    case Some( host ) => Ok( views.html.host( host ) )
    case None => NoContent
  }    
}

And here is a Hosts object defined in models/Hosts.scala. For simplicity, the list of hosts is hard-coded:

package models

case class Host( id: String, name: String )

object Hosts {  
  def hosts(): List[ Host ] = {
    return List( new Host( "h1", "Host 1" ), new Host( "h2", "Host 2" ) )
  } 
}

The boring part is over, let's move on to the last but not least implementation: server push of host's CPU statistics using WebSockets. As you can see, the /stats/:id URL is already mapped to controller action so let's take a look on its implementation:

def stats( id: String ) = WebSocket.async[JsValue] { request =>
  Hosts.hosts.find( _.id == id ) match {
    case Some( host ) => Statistics.attach( host )
    case None => {
      val enumerator = Enumerator
        .generateM[JsValue]( Promise.timeout( None, 1.second ) )
        .andThen( Enumerator.eof )
      Promise.pure( ( Iteratee.ignore[JsValue], enumerator ) )
    }
  }
}

Not too much code here but in case you are curious about WebSockets in Play Framework please follow this link. This couple of lines may look a bit weird at first but once you read the documentation and understand basic design principles behind Play Framework, it will look much more familiar and friendly. The Statistics object is the one who does the real job, let's take a look on the code:

package models

import scala.concurrent.Future
import scala.concurrent.duration.DurationInt

import akka.actor.ActorRef
import akka.actor.Props
import akka.pattern.ask
import akka.util.Timeout
import play.api.Play.current
import play.api.libs.concurrent.Akka
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import play.api.libs.iteratee.Enumerator
import play.api.libs.iteratee.Iteratee
import play.api.libs.json.JsValue

case class Refresh()
case class Connect( host: Host )
case class Connected( enumerator: Enumerator[ JsValue ] )

object Statistics {
  implicit val timeout = Timeout( 5 second )
  var actors: Map[ String, ActorRef ] = Map()
  
  def actor( id: String ) = actors.synchronized {
    actors.find( _._1 == id ).map( _._2 ) match {
      case Some( actor ) => actor      
      case None => {
        val actor = Akka.system.actorOf( Props( new StatisticsActor(id) ), name = s"host-$id" )   
        Akka.system.scheduler.schedule( 0.seconds, 3.second, actor, Refresh )
        actors += ( id -> actor )
        actor
      }
    }
  }
 
  def attach( host: Host ): Future[ ( Iteratee[ JsValue, _ ], Enumerator[ JsValue ] ) ] = {
    ( actor( host.id ) ? Connect( host ) ).map {      
      case Connected( enumerator ) => ( Iteratee.ignore[JsValue], enumerator )
    }
  }
}

As always, thanks to Scala conciseness, not too much code but a lot of things are going on. As we may have hundreds of hosts, it would be reasonable to dedicate to each host own worker (not a thread) or, more precisely, own actor. For that, we will use another amazing library called Akka. The code snippet above just creates an actor for the host or uses existing one from the registry of the already created actors. Please note that the implementation is quite simplified and leaves off important details. The thoughts in right direction would be using supervisors and other advanced concepts instead of synchronized block. Also worth mentioning that we would like to make our actor a scheduled task: we ask actor system to send the actor a message Refresh every 3 seconds. That means that the charts will be updated with new values every three seconds as well.

So, when actor for a host is created, we send him a message Connect notifying that a new connection is being established. When response message Connected is received, we return from the method and at this point connection over WebSockets is about to be established. Please note that we intentionally ignore any input from the client by using Iteratee.ignore[JsValue].

And here is the StatisticsActor implementation:

package models

import java.util.Date

import scala.util.Random

import akka.actor.Actor
import play.api.libs.iteratee.Concurrent
import play.api.libs.json.JsNumber
import play.api.libs.json.JsObject
import play.api.libs.json.JsString
import play.api.libs.json.JsValue

class StatisticsActor( hostid: String ) extends Actor {
  val ( enumerator, channel ) = Concurrent.broadcast[JsValue]
  
  def receive = {
    case Connect( host ) => sender ! Connected( enumerator )       
    case Refresh => broadcast( new Date().getTime(), hostid )
  }
  
  def broadcast( timestamp: Long, id: String ) {
    val msg = JsObject( 
      Seq( 
        "id" -> JsString( id ),
        "cpu" -> JsObject( 
          Seq( 
            ( "timestamp" -> JsNumber( timestamp ) ), 
            ( "load" -> JsNumber( Random.nextInt( 100 ) ) ) 
          ) 
        )
      )
    )
     
    channel.push( msg )
  }
}

The CPU statistics is randomly generated and the actor just broadcasts it every 3 seconds as simple JSON object. On the client side, the JavaScript code parses this JSON and updates the chart. Here is how it looks like for two hosts, Host 1 and Host 2 in Mozilla Firefox:

To finish up, I am personally very excited with what I've done so far with Play Framework. It took just couple of hours to get started and another couple of hours to make things work as expected. The errors reporting and feedback cycle from running application are absolutely terrific, thanks a lot to Play Framework guys and the community around it. There are still a lot of things to learn for me but it worth doing it.

Please find the complete source code on GitHub.