<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[Responsibly Sourced]]></title>
<link href="http://DrewEaster.github.io/atom.xml" rel="self"/>
<link href="http://DrewEaster.github.io/"/>
<updated>2016-06-12T17:41:10+01:00</updated>
<id>http://DrewEaster.github.io/</id>
<author>
<name><![CDATA[Andrew Easter]]></name>
</author>
<generator uri="http://octopress.org/">Octopress</generator>
<entry>
<title type="html"><![CDATA[Practical implementation traits of service choreography based microservices integration]]></title>
<link href="http://DrewEaster.github.io/blog/2016/06/09/practical-implementation-traits-of-service-choreography-based-microservices-integration/"/>
<updated>2016-06-09T17:54:00+01:00</updated>
<id>http://DrewEaster.github.io/blog/2016/06/09/practical-implementation-traits-of-service-choreography-based-microservices-integration</id>
<content type="html"><![CDATA[<p>This is the second post in a three part series looking at the topic of microservice integration. In the <a href="http://www.dreweaster.com/blog/2016/05/08/the-art-of-microservices-integration-using-service-choreography/">first instalment</a>, I focused mainly on the theory side of event-driven service choreography. In this second part, I’ll dig into the practical traits that we’ll require of our technical implementations that enable us to satisfy the theory discussed in the first post. In the final instalment of the series, I’ll look at different specific implementation techniques/technologies and how they map to the traits discussed in this post.</p>
<h2>Implementation traits</h2>
<p>I’d like to provide some coverage on what I believe to be the key traits we look for in service choreography based microservices integration implementation techniques/technologies. My goal is to set the scene to make it easier to test specific technologies against those traits (subject of the third and final post in this series).</p>
<p>I prefer to break down the traits into two categories: must-haves and nice-to-haves. The must-haves category contains traits that I believe are absolutely necessary in order to successfully apply the theory of service choreography. The nice-to-haves category contains traits that you can essentially live without, but that can definitely buy you additional benefits. Like with most things, though, the decision to adopt nice-to-haves will be driven by context – we’re always making trade-offs, and you simply have to make judgement calls on a case-by-case basis.</p>
<p>Let’s move on to the first category of traits, the must-haves!</p>
<h3>Must-have traits</h3>
<h4>Decoupled in time</h4>
<p>This one is pretty straightforward. In the <a href="http://www.dreweaster.com/blog/2016/05/08/the-art-of-microservices-integration-using-service-choreography/">first instalment</a> of this series, I discussed the asynchronous nature of service choreography based integration. Whatever implementation direction we go in, it needs to support us decoupling services in <em>time</em>. This means, for example, service A does not require service B to be online at a specific point in time (now) – we just need to ensure we have some mechanism in place for events from service B to eventually reach service A at some point in the future.</p>
<h4>Guaranteed at-least-once-delivery</h4>
<p>An absolute pre-requisite for ensuring eventual consistency is that we guarantee events <em>eventually</em> reach their interested consumers. But, why don’t we aim for <em>exactly-once-delivery</em> instead? I’m not going to repeat what many others have said before me, so suffice to say it’s simply not possible to achieve it in a distributed system. Google is your friend if you want to explore why :–)</p>
<p>So, we’re happy to settle for <em>at-least-once-delivery</em> because sending duplicates of a specific event is better than sending no event at all (that’s what you might see with <em>at-most-once-delivery</em>). The ability to guarantee at-least-once-delivery also implies the need for <em>durability</em>.</p>
<p>The biggest gotcha I see when it comes to at-least-once-delivery is what I’ve generally seen referred to as the <em>dual-write problem</em>. Whether you’re using a traditional CRUD approach, or you’re using eventsourcing, you are going to end up pretty unhappy if you have a unit of code that both writes to a datastore and delivers events to, say, a message queue. Let’s examine two ways I’ve seen this done:</p>
<h5>Write to MQ after committing a database transaction</h5>
<pre><code>doInDatabaseTransaction { statement =>
  statement.insert("INSERT into ....")
} // the transaction commits here
// a crash, restart or MQ outage at this exact point loses the event forever
messageQueue.publish(new SomeEvent(...))
</code></pre>
<p>Okay, so we make changes to the book of record (the database), the database transaction gets committed, and only once that happens do we publish an event to the message queue where our interested consumers will be listening. This would work perfectly well in a world where nothing bad ever happens. But there are all sorts of things that can go wrong here, including the most obvious:</p>
<ol>
<li>Our application crashes immediately after the database transaction commits, but before the event is published</li>
<li>Our application is restarted immediately after the database transaction commits, but before the event is published</li>
<li>The message queue infrastructure is down for a few minutes, meaning we can’t send the event right now</li>
</ol>
<p>In any of these cases, our book of record would be updated, but our downstream consumers would never receive the event. In one fell swoop, we’ve guaranteed that we’ll end up in an inconsistent state. You may say these circumstances are rare, and you’d be right, but Murphy’s Law – and our own experiences as software engineers – teaches us that if it can go wrong, it will go wrong. Guaranteed.</p>
<p>Let’s try another approach…</p>
<h5>Write to MQ within scope of a database transaction</h5>
<pre><code>doInDatabaseTransaction { statement =>
  statement.insert("INSERT into ....")
  // the MQ sits outside the database transaction boundary, so this
  // publish stands even if the transaction subsequently rolls back
  messageQueue.publish(new SomeEvent(...))
}
</code></pre>
<p>Hang on, that transaction boundary is bound to the database only; it’s got nothing to do with the message queue technology. In this example, if our database transaction were to rollback for any reason, our event would still have been published to the message queue. Oh dear, the event we sent out to interested consumers is not consistent with our book of record. Our consumers will proceed to behave as if the event has taken place, but our local context (the source of the event) will have no knowledge of it ever having taken place. That’s bad. Very bad. Arguably even worse than the first example.</p>
<p>Let’s get something straight here and now – unless we start dabbling in distributed transaction managers (e.g. XA standard) and two-phase commits, we can’t atomically update a database and write to a message queue. Two-phase commit is a disease we want to quarantine ourselves from forever, so we need another way to escape from the dual-write problem. You’ll have to wait until the third part of this mini-series for a solution ;–)</p>
<h4>Guaranteed message ordering</h4>
<p>The need for a stream of events to be consumable in order really depends on the use cases of its consumers. Very broadly speaking, there are two categories of consumer use case:</p>
<ol>
<li><p>Consumers that consume events from another service where consumption results in state transitions in the local context (e.g. projecting state locally from an external bounded context). Such consumers are conceptually <em>stateful</em> in that they care about tracking state across a series of multiple related events over time. In such cases, it’s usually necessary to process the events in the order they were originally produced for the local state to remain consistent with the source. It’s important to emphasise that it is only related events for which the order is necessary (e.g. events emanating from a specific aggregate instance).</p></li>
<li><p>Consumers that are conceptually <em>stateless</em> in that they can treat every event they encounter as if it’s completely unrelated to any other event they’ve encountered in the past. Such consumers will typically trigger some kind of one off action, such as sending an email, sending a push notification, or triggering an external API call. An example of this might be where the reaction to an event requires charging a credit card via a third-party payment gateway.</p></li>
</ol>
<p>Given that service choreography will inherently lead to many instances of use case 1) in your services, you’ll inevitably need to make implementation choices that allow events to be consumed in the order they were produced. With this in mind, it makes sense to choose implementation techniques/technologies that provide this guarantee, even if some of your consumers don’t rely on ordering.</p>
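<p>To make this concrete, here’s a minimal sketch of the partitioning idea (using Kafka’s producer API from Scala; the topic, key and payloads are invented, and the broader technology assessment is the subject of part three). Keying each event by its aggregate ID routes all events for that aggregate to the same partition, which is what preserves their relative order:</p>
<pre><code>import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)

// Both events share the key "order-42", so they land on the same partition
// and will be seen by consumers in the order they were produced.
producer.send(new ProducerRecord("order-events", "order-42", """{"type":"OrderPlaced"}"""))
producer.send(new ProducerRecord("order-events", "order-42", """{"type":"OrderShipped"}"""))
producer.close()
</code></pre>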
<h4>Guaranteed at-least-once-processing</h4>
<p>Well, I guess what we really want is <em>exactly-once-processing</em>! However, I thought it would be helpful to write a separate subsection on idempotency (see below). I find it useful to separate the general action of processing from the outcome of the processing – even if we handle a message/event idempotently (e.g. through some method of deduplication), I still like to consider that the message/event has been processed, despite the absence of any side effects. I find it simpler to think of processing as meaning a consumer has handled a message/event and is now ready to handle the next one in the stream.</p>
<p>It’s really important to emphasise the word ‘eventual’ in eventual consistency. Whilst it seems obvious, I have seen people neglect the fact that eventual does mean that something will <em>definitely happen</em> in the end. Yes, we acknowledge that consistency may be delayed, but we still rely on consistency being achieved in the end. When we’re going down the microservices path – and following the service choreography approach – we need, in many cases, cast-iron guarantees that we’ll eventually process every event we’re interested in. For example, if we are projecting state locally (achieving autonomy and encapsulated persistence) based on events produced by another service (bounded context), and our local business logic relies on that state, we can have zero trust in the entire system if we can’t guarantee that we’ll successfully process every event we’re interested in.</p>
<p>A murky subtext here is how to deal with processing errors. Whatever the reason for an error during handling of an event, you are forced to consider the fact that, if you continue to process further events without processing the event raising the error, you could leave your system in a permanently inconsistent state. Where it’s absolutely necessary for a consumer to handle events in order, you really are forced to block all subsequent processing until you’ve found a way to successfully process the event that’s raising an error. There’s an obvious danger here that your target SLA on eventual consistency could be quickly blown out of the water if, for example, the solution to the failed processing involved code changes. As discussed above, ordering is rarely a requirement across every event in a stream. With this in mind, the ability to achieve some form of parallelism in event handling may well be necessary to avoid complete gridlock in a specific consumer. I’ll discuss this in the nice-to-haves section.</p>
<p>Where the requirement to process events in order can be relaxed, dealing with processing errors can be a little more straightforward. An option might be to log the event raising an error (after exhausting retries), and move on to subsequent events in the stream. You could put in place some mechanism to replay from the error log once necessary work has been carried out to ensure the event can be successfully processed.</p>
<p>In some circumstances, it may even be ok to never process an event. For example, consider an email notification use case. Given that processing failure rates are likely to be pretty low in normal operation, you may deem it acceptable for the odd system email to never reach an intended customer.</p>
<h4>Idempotency</h4>
<p>Given the inability to achieve exactly-once-delivery, and instead falling back to at-least-once-delivery, we can’t just ignore the fact that consumers will, on occasion, encounter the same event more than once. Idempotency is a property of an event handler that allows the same event to be applied multiple times without any new side effects beyond the initial application. In some cases, it might be ok to live with repeated side effects, and in some cases it won’t be ok. For example, we might not mind if we send a duplicate email, but a customer won’t be too happy if we charge their credit card twice for the same order.</p>
<p>Some actions are naturally idempotent, in which case you don’t need to explicitly worry about duplicate application, but there are many cases where it’s going to matter, and so you need to introduce mechanisms to avoid duplicate application. I’m going to resist exploring patterns for idempotent event handling in this series of posts, as it warrants dedicated coverage of its own. Mechanisms for implementing idempotency are typically application level concerns, rather than, for example, being something you can rely on some middleware layer to handle for you. Whatever implementation mechanisms you choose to integrate services via asynchronous events, you’ll need to deal with ensuring idempotency in the way you handle the events.</p>
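<p>To give a feel for one common approach, here’s a sketch in the pseudocode style of the earlier examples: the consumer records processed event IDs in the same transaction as its state changes, so the deduplication check and the side effects succeed or fail together:</p>
<pre><code>handleEvent { event =>
  doInDatabaseTransaction { statement =>
    // the dedup check and the side effects commit (or roll back) atomically
    if (!statement.exists("SELECT 1 FROM processed_events WHERE event_id = ?", event.id)) {
      statement.insert("INSERT INTO orders_projection ...")
      statement.insert("INSERT INTO processed_events (event_id) VALUES (?)", event.id)
    }
    // a redelivered event finds its ID already recorded and is skipped,
    // so duplicate application produces no new side effects
  }
}
</code></pre>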
<p>On a side note, it’s worth mentioning that some third-party, external services you integrate with may give you some help in this area. For example, <a href="http://www.stripe.com">Stripe’s</a> API supports passing an ‘idempotency key’ with a request, and it guarantees that, in a 24 hour window, it won’t reprocess two API calls that share the same key.</p>
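<p>For illustration, such a request might look something like this (the key and parameter values here are invented):</p>
<pre><code>curl https://api.stripe.com/v1/charges \
  -u sk_test_yourkey: \
  -H "Idempotency-Key: order-42-initial-charge" \
  -d amount=2000 \
  -d currency=gbp \
  -d source=tok_visa
</code></pre>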
<h3>Nice-to-have traits</h3>
<h4>Consumer-side failure recovery</h4>
<p>I was very close to including this trait within the must-haves group, but decided to be lenient. Now that we understand autonomy to be a key attribute for reactive microservices, it follows, in my opinion, that consumers must be responsible for recovering from their own failures without burdening upstream sources of events. I’ve worked with message oriented systems where a producer of events is relied upon to re-dispatch messages in the event a downstream consumer has got itself in a mess. It strikes me that such an approach is not compliant with the autonomy objective – if a consumer is dependent on a producer going beyond its operational responsibilities to help it recover from failure, the autonomy of that consumer is called into question.</p>
<p>This trait drives an alternative way of thinking from more traditional forms of middleware and/or integration patterns. In the third part of this series of posts, I’ll look at how distributed commit log technologies (such as Apache Kafka and Amazon Kinesis) have a considerable advantage over traditional MQ and pub/sub technologies in regard to this nice-to-have integration trait. It boils down to inversion of control, whereby the responsibility for tracking a consumer’s progress through a stream of events becomes the responsibility of the consumer rather than a central messaging broker.</p>
<h4>Decoupled in space</h4>
<p>In the must-haves section, I covered the trait of integration being decoupled in time. Going a stage further, you can aim for services to be decoupled in space as well. Anyone who has worked with a service-oriented architecture, especially where synchronous integration between services is the norm, will be familiar with the challenge of service addressability. Dealing with the overhead of managing configuration for many service endpoints can be quite a burden.</p>
<p>If we’re able to remove this overhead in some way, thus achieving significant location transparency, it can further simplify our service integration challenges. Using middleware technology is a great way of achieving this. Decoupling in space is also possible without middleware – contemporary service discovery/locator patterns do facilitate this to some extent – and I’ll weigh up the two approaches in the third and final post of this series.</p>
<h4>Parallelism</h4>
<p>In an ideal world, we’d want the ability to parallelise the processing capabilities of a specific consumer by starting multiple instances. A common pattern when using messaging middleware is to have a single queue with multiple consumers each being sent messages in a round-robin fashion, with no consumer receiving the same message. This approach works fine in scenarios where processing messages in order is not important. However, as discussed in the must-haves section, we’ll often encounter the need for a consumer to process a stream of events strictly in order, especially when applying service choreography based integration. As also discussed earlier, it’s rarely the case that a consumer cares about receiving every event in a stream in order; more likely, it’s important that events that are related in some way to each other are processed in the order they were generated (e.g. events emanating from a specific aggregate instance). With this in mind, it’s a nice-to-have to find a way to parallelise consumers whilst still ensuring events related to each other are processed in order (see the sketch after this list). By doing this we get these primary benefits:</p>
<ol>
<li>We can improve the performance of our system through horizontal scaling, reducing the latency of eventual consistency.</li>
<li>It’s easier to implement high availability of consumers rather than have single points of failure.</li>
<li>We can avoid a consumer use case being completely blocked when encountering a repeated error in processing a single event. If we’re able to parallelise in some way, we can at least have that consumer use case continue processing some events (as long as they aren’t related to the stubborn one) rather than stopping processing altogether.</li>
</ol>
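<p>As a taste of what part three will dig into, here’s a sketch of this style of parallelism (again Kafka from Scala, with invented names): every instance of the consumer joins the same consumer group, and each partition is assigned to exactly one instance, so adding instances scales throughput while events sharing a key are still processed strictly in order:</p>
<pre><code>import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("group.id", "order-projection") // shared by every instance
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(java.util.Collections.singletonList("order-events"))

def handle(key: String, value: String): Unit = println(s"$key: $value")

while (true) {
  // records from any one partition arrive in order; the partitions
  // themselves are spread across however many instances are running
  consumer.poll(1000L).asScala.foreach(r => handle(r.key, r.value))
}
</code></pre>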
<p>In the third part of this series of posts, I’ll look at the technology options available to us that enable both guaranteed in-order processing <em>and</em> parallel consumers.</p>
<h2>Wrapping up</h2>
<p>Phew, that’s the end of a long post! I’ve covered both the must-have traits and the nice-to-have traits of microservices integration implementations that are supportive of service choreography. In the third and final post of this series, I’ll at last get round to looking at specific technologies and techniques that enable us to satisfy these traits. Stay tuned!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[The art of microservices integration using service choreography]]></title>
<link href="http://DrewEaster.github.io/blog/2016/05/08/the-art-of-microservices-integration-using-service-choreography/"/>
<updated>2016-05-08T22:10:00+01:00</updated>
<id>http://DrewEaster.github.io/blog/2016/05/08/the-art-of-microservices-integration-using-service-choreography</id>
<content type="html"><![CDATA[<p>This is the first post in a three part series looking at the topic of microservice integration. In this first instalment, I’ll be focusing mainly on the theory side of event-driven service choreography. In the second post, I’ll cover the implementation traits required to satisfy the theory discussed in this first post, and, in final post, I’ll be assessing the support available for those traits in well known implementation technologies. So, let’s get on with part one!</p>
<h2>Looking back</h2>
<p>One of the biggest shortcomings of traditional SOA is/was the tendency to break up a highly-coupled monolith into a series of smaller services with the same level of coupling that was previously internal to the monolith. The likely result is a distributed monolith with all the same problems you had before, but now with an additional operational burden – you end up in a worse position than if you’d just stuck with the monolith!</p>
<p>It’s this learning that I believe was the main catalyst for the development of the microservices architecture pattern (SOA 2.0?). In hindsight it seems pretty obvious; if you can’t run a service in isolation, with significant levels of autonomy, it’s pretty hard to justify why a piece of functionality is better in a separate service than simply internal to a monolith. It’s not like there aren’t some potentially good reasons why people would choose to do old style SOA; it’s just that those good reasons are outweighed by the negatives most of the time.</p>
<h2>Looking forward</h2>
<p>So, enter the world of microservices architecture, and the promise of isolation, autonomy, single-responsibility, encapsulated persistence etc. But, what exactly allows one to achieve such isolation and autonomy?</p>
<p>In this post, I’m going to focus on how getting service integration right is a fundamental part of what enables services to be isolated, and act autonomously. I believe having a thorough understanding of the principles of good service integration can train your mind into grasping the importance of good service separation. As a little bonus, I’ll close this post by looking at some considerations for good service separation.</p>
<h2>Service orchestration vs service choreography</h2>
<p>In a classic distributed monolith scenario as described above, the prevalent integration technique is likely to involve service <em>orchestration</em>. This is where backend services typically have a high-level of synchronous coupling – i.e. a service is reliant on other services to be operational and working, within a single request/response cycle, in order for it to carry out its own responsibilities. Such real-time dependencies prevent a service from acting autonomously – any failures in its dependencies will inevitably cascade, preventing the service from fulfilling its own responsibility. Here’s a visualisation of service orchestration in action:</p>
<p><a href="http://DrewEaster.github.io/images/service_orchestration.png"><img class="center" src="http://DrewEaster.github.io/images/service_orchestration.png"></a></p>
<p>In this example, service A is dependent on both service B and service C when handling its own inbound HTTP calls. Service A fetches state X (from service B) and state Y (from service C), aggregating them with some of its own state, Z, to finally yield an HTTP response to its callers. Failures in either B or C will prevent A from fully fulfilling its responsibilities – it may be able to degrade gracefully but the ability to do so really depends on context.</p>
<p>Contrast this scenario with service <em>choreography</em> instead. In a system designed to embrace choreography, services will typically avoid synchronous coupling – i.e. any integration between services does not apply during the usual request/response cycle. In such cases, a service can fulfill its responsibilities within the request/response cycle without the need to make further calls to other services (with the exception of persistence backends owned solely by the service).</p>
<p>The classic way to achieve this is by embracing an event-driven (message passing) approach to integration. That is, any state that a service requires from services external to it is projected internally from event streams published by those external services. Such internal projections will be managed and updated completely asynchronously (outside of the request/response cycle), and will be eventually consistent. In the true spirit of microservices, the service entirely encapsulates all the persistent state it requires in order to fulfill its responsibilities, achieving true isolation and autonomy.</p>
<p>Let’s refactor the previous diagram to reflect choreography instead of orchestration:</p>
<p><a href="http://DrewEaster.github.io/images/service_choreography.png"><img class="center" src="http://DrewEaster.github.io/images/service_choreography.png"></a></p>
<p>In this updated approach, service A receives events from streams published by service B and service C. Service A processes events in an eventually consistent manner, and persists locally only what it needs of state X and Y, alongside its own state, Z. When handling an incoming request, service A does not need to communicate with service B or service C.</p>
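<p>A minimal sketch of such a projection (all names invented, and an in-memory map standing in for service A’s own datastore purely for brevity) might look like this:</p>
<pre><code>import scala.collection.mutable

// stand-ins for events published by service B
sealed trait CustomerEvent
case class CustomerRegistered(id: String, email: String) extends CustomerEvent
case class EmailChanged(id: String, email: String) extends CustomerEvent

// service A's local projection: just the slice of state X it actually needs
val customerEmails = mutable.Map.empty[String, String]

def project(event: CustomerEvent): Unit = event match {
  case CustomerRegistered(id, email) => customerEmails += (id -> email)
  case EmailChanged(id, email)       => customerEmails += (id -> email)
}
</code></pre>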
<h2>Clarifying asynchronous integration in service choreography</h2>
<p>I feel there’s some confusion when people refer to asynchronous communication, especially in the field of microservices integration. It’s worth taking some time to clarify what’s meant in the context of service choreography.</p>
<p>What it’s not:</p>
<ul>
<li><p><strong>Non-blocking I/O</strong> – I’m an advocate of asynchronous, non-blocking I/O as part of building more efficient, resilient and scalable services that interact in some way with external I/O. However, in the context of service choreography, this is certainly not what we mean by asynchronous integration. Non-blocking I/O could still be used within the request/response cycle for orchestration use cases, and, whilst it has its advantages in one sense, certainly doesn’t, on its own, buy any architectural benefits of isolation and autonomy.</p></li>
<li><p><strong>Classic MQ Request/Reply</strong> – It’s possible using classic MQ technology (e.g. JMS, AMQP) to achieve asynchronous request/reply behaviour. You could pop a message on a queue, and wait for a response on some temporary reply queue. There’s certainly some added decoupling in that the caller needn’t know exactly who will reply, but, like with non-blocking I/O, if this is being done as part of a service handling an incoming request, then, despite the communication with the MQ itself being asynchronous in nature, the service is still not acting autonomously. If a consumer responsible for replying is down, and the call must then time out, it’s ultimately no different to an HTTP endpoint being unavailable or failing.</p></li>
</ul>
<p>So, to clarify things then. When we’re talking about services being integrated asynchronously as part of service choreography, we’re referring to a service being <em>free from the need to rely on its dependencies during the request/response lifecycle</em>.</p>
<h2>End-to-end autonomy</h2>
<p>Where I’ve covered isolation and autonomy in this post, I’ve been referring to <em>runtime</em> autonomy. It’s worth noting that a strong motivation for isolating services is the autonomy they additionally afford throughout the entire engineering process. The event-driven integration nature of service choreography sits very naturally with the desire to assign clear ownership to specific teams, enabling them to develop, build, test and release independently. Whilst there are techniques to support these things alongside service orchestration, I find it much easier to reason about isolation and independence when service dependencies are largely confined to the background.</p>
<p>When considering service/API level tests, embracing eventually consistent, event-driven integration allows you to focus on data fixtures as opposed to service virtualisation. There’s something inherently simpler – in my mind – about placing a system into a desired state via a simulated event stream, rather than having to worry about mocking/stubbing/virtualising some external service endpoint(s).</p>
<h2>Added complexity?</h2>
<p>Service choreography, without doubt, introduces a level of technical complexity beyond orchestration. Rather than the apparent simplicity of making calls to dependencies in real-time, you need to both produce events yourself and consume events from others; it requires a fundamental shift in the way you build software. It exposes you to such challenges as guaranteeing at-least-once-delivery/processing of events, handling out of order events, ensuring idempotency in event consumers, factoring in eventual consistency in the UI etc.</p>
<p>Like with anything in software engineering, it’s all about tradeoffs. There’s no silver bullet, and you have to make a call based on your own unique circumstances. There will always be times where the simplicity of orchestration will trump its limitations on autonomy. Making such calls is why experience matters and why there will always be room for judgement in engineering.</p>
<p>For example, a team may decide that it’s ok for services acting as companions to some primary service – and so invisible outside the context of the service’s public contract – to use orchestration, reserving choreography for integrating with services owned by other teams. In this scenario, the team would still benefit from the greater independence they’ll have from other teams, whilst losing some runtime autonomy internally.</p>
<p>Additionally, there are times when integrating with third-party services (outside your organisation) will necessitate a degree of orchestration given limitations in the third-party API contracts.</p>
<p>As a general system wide architectural constraint, it’s wise to be rigid about enforcing service choreography between services owned by different teams. This has the added benefit of really helping to drive home the importance of good service boundaries – if your design relies on orchestration between services owned by different teams, there’s a good chance you’ve not found sensible business-oriented boundaries between your services. A word of caution, though: where services are clearly aligned with business boundaries, choreography should be the preferred approach, even if the services are owned by the same team.</p>
<h2>Touching on boundaries</h2>
<p>Whilst service choreography enables autonomy, the elephant in the room is that a dependency is still a dependency, whether it be event-driven or not. If a service is required to consume from a large number of event streams, it’s still adding overhead in terms of managing the dependencies over time.</p>
<p>One of the fundamental mistakes people make with microservices is to draw technical boundaries rather than boundaries of business capability/purpose. The use of the word “micro” is surely partly to blame, as it may appear to encourage highly granular service responsibilities. By breaking up a monolith into services of a highly technical nature, with little correlation to the business domain, you’re inevitably going to introduce more dependencies, and accordingly more overhead. Too many dependencies of any variety are almost certainly a design smell, a sign of high coupling and low cohesion, when it’s the exact opposite we’re looking for. If you’re stuck with lots of dependencies, it’s a sure-fire sign that you’ve drawn your boundaries wrong. If you’ve managed to identify a genuine business capability, you’ll be surprised at its fairly natural properties of isolation and independence.</p>
<p>Domain-Driven Design equips us with the toolset – and mindset – to identify business-oriented boundaries, and there’s certainly reason to see a close relationship between microservice boundaries and Bounded Contexts as described in DDD. Whilst I don’t consider there to be a direct 1:1 mapping in every circumstance (a good subject for another post), there are definitely some parallels to draw.</p>
<p>One way to go about reasoning about the need to introduce a dependency on an external event stream is to pedantically question the purpose of having that dependency in the first place. You may begin to find that you’re tending to introduce such dependencies for the purpose of presenting data in a UI rather than for fulfilling specific business logic within your service boundary. In such cases, it can be preferable to rely on data aggregation/composition in the UI layer (where it’s generally a little more acceptable/inevitable to rely on orchestration). When applying Domain-Driven Design to model business complexity, it’s advisable to be ruthless about business purpose within any Bounded Context, and that means avoiding projecting state from external services if your service/domain doesn’t <em>really</em> need it.</p>
<h2>Coming next</h2>
<p>That wraps up part one of this three part series. I’ll be following up soon with part two within which I’ll cover the implementation traits required to satisfy the theory discussed in this first post. Stay tuned!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Twelve-factor config and Docker]]></title>
<link href="http://DrewEaster.github.io/blog/2015/02/08/twelve-factor-config-and-docker/"/>
<updated>2015-02-08T08:03:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2015/02/08/twelve-factor-config-and-docker</id>
<content type="html"><![CDATA[<h2>Config confusion</h2>
<p>I recently wrote about what I see to be confusion over the way the <a href="http://www.12factor.net">Twelve-factor app</a> guidelines are interpreted with regard to app config. You can read that post <a href="http://www.dreweaster.com/blog/2015/01/21/a-confusing-side-to-Twelve-factor-app-configuration/">here</a>.</p>
<p>To summarise my argument, I think people tend to focus solely on the explicit guidelines in the <a href="http://12factor.net/config">Config</a> section, and overlook the additional advice – specific to config – given in the <a href="http://12factor.net/build-release-run">Build, release, run</a> section. The former simply speaks about reading app config from the environment; the latter clearly states that a software ‘Release’ is a <em>uniquely versioned</em> combination of a ‘Build’ <em>and</em> ‘Config’.</p>
<p>Suffice to say, I’m quite surprised at the extent to which people manage to overlook the <a href="http://12factor.net/build-release-run">Build, release, run</a> section when discussing Twelve-factor config. It offers extremely specific advice with regard to managing <em>immutable</em> release packages, and I don’t believe it’s correct to claim you’re doing Twelve-factor style config if you’re not also following the <a href="http://12factor.net/build-release-run">Build, release, run</a> guidelines.</p>
<p>In this post, I want to address the implications of this view with regard to shipping applications in Docker containers. Once again, I see some conflict in what Twelve-factor has to say about config and perceived best practices for Docker.</p>
<h2>The Docker way</h2>
<p>I’ve digested a whole bunch of various opinions and best practices with regard to Docker, and a fairly consistent view is that containers should remain environment agnostic – i.e. the same container you generate at ‘Build’ time should be deployable to <em>any</em> environment.</p>
<p>I get this, and I’m in total agreement. There’s certainly agreement here with Twelve-factor, at least in terms of what constitutes a ‘Build’. So, how would we supposedly do Twelve-factor config with this model? It appears to be quite simple, as Docker lets us pass in environment variables when running a container, e.g.</p>
<p><code>docker run -e FOO=bar coolcompany/coolapp:1.0.0</code></p>
<p>This is saying, run <code>coolapp</code> with tag <code>1.0.0</code> passing in <code>bar</code> as the value for environment variable <code>FOO</code>. The Docker tag, in this fictitious example, is meant to represent the ‘Build’ version of the app, and would have been generated during the build phase in the delivery pipeline.</p>
<p>This approach is absolutely consistent with the Twelve-factor <a href="http://12factor.net/config">Config</a> section – our application (encapsulated in the container) will read its configuration from the environment variable(s) provided. And, of course, we haven’t tied the container image to a specific environment – this container looks very much like what Twelve-factor refers to as a ‘Build’.</p>
<p>Hold on, though. Whilst we’ve satisfied the <a href="http://12factor.net/config">Config</a> section, we’ve only partly satisfied the <a href="http://12factor.net/build-release-run">Build, release, run</a> section. In fact, I’d go as far as saying that this is violating the <a href="http://12factor.net/build-release-run">Build, release, run</a> guidelines.</p>
<p>Let’s take some quotes directly from the Twelve-factor guidelines:</p>
<blockquote><p>The release stage takes the build produced by the build stage and combines it with the deploy’s current config. The resulting release contains both the build and the config and is ready for immediate execution in the execution environment.</p></blockquote>
<p>and:</p>
<blockquote><p>Every release should always have a unique release ID, such as a timestamp of the release (such as 2011-04-06-20:32:17) or an incrementing number (such as v100). Releases are an append-only ledger and a release cannot be mutated once it is created. Any change must create a new release.</p></blockquote>
<p>In our example above, I think it’s fair to say that this advice has been circumvented. We’ve taken our ‘Build’ and jumped straight to ‘Run’, altogether ignoring what Twelve-factor refers to as a ‘Release’. We’ve <em>not</em> created a uniquely versioned, immutable release package and we’ve burdened the ‘Run’ phase with the additional responsibility of having to pass environment variables to the container. The ‘Run’ phase has become more complicated than it should be.</p>
<p>This approach has maintained a distinct separation between code and config, whereas Twelve-factor very explicitly specifies that a ‘Release’ <em>is</em> a combination of code and config. The Twelve-factor approach allows the ‘Run’ phase to be dumb – it just launches whatever package you give it, needing no knowledge of application specific configuration. And, it naturally follows that rollbacks are a simple case of running the previously versioned release, with no need to worry about what the configuration for that version should be.</p>
<h2>An alternative approach</h2>
<p>This is where this post is bound to get murky and upset a few people. I’m going to be heretical and suggest a model whereby we do create environment specific Docker containers. I can hear the cries of “How very dare he?!”</p>
<p>I propose the idea of taking our base ‘Build’ image and creating a uniquely versioned ‘Release’ image as a thin layer on top of it – oh the joys of image layering. This new image <em>does</em> embed the environment variables – specific to a chosen environment – within itself, rather than requiring they be passed to <code>docker run</code> on launch.</p>
<p>Let’s look at an example Dockerfile to achieve this:</p>
<pre><code>FROM coolcompany/coolapp:1.0.0
ENV FOO bar
</code></pre>
<p>We can then use this Dockerfile to build the ‘Release’ image, giving it a unique version at the same time, e.g. <code>coolcompany/coolapp:1.0.0-staging-v11</code>.</p>
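<p>By way of illustration, the release and ‘Run’ steps might then look like this, with rollback being nothing more than running the previous release tag:</p>
<pre><code>docker build -t coolcompany/coolapp:1.0.0-staging-v11 .
docker run coolcompany/coolapp:1.0.0-staging-v11

# rolling back to the previous release
docker run coolcompany/coolapp:1.0.0-staging-v10
</code></pre>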
<p>I’ve made up a convention here <code>{build_version}-{environment_name}-{release_number}</code> for tagging releases. Including the environment name in the tag might be a nice way of ensuring it’s clear which environment the container is tied to.</p>
<p>So, our delivery pipeline continues to produce an environment agnostic container ‘Build’ image, but, just at the point of deployment to our chosen environment, we create a new environment specific image and use this as our ‘Release’. Then, the ‘Run’ phase need only be given the ‘Release’ image version in order to execute the application.</p>
<p>This model sees ‘Release’ packages created on demand – i.e. a ‘Release’ package (Docker image) is created <em>just in time</em> at the point of deployment to a specified environment. Where the environment variables are actually sourced from, and how they make their way into the Dockerfile, is beyond the scope of this post.</p>
<h2>The right way?</h2>
<p>I’ve read enough so called best practices to expect this approach to anger some Docker/containerization purists. However, I genuinely see this as being a reasonable way to implement the Twelve-factor guidelines using Docker.</p>
<p>If not this approach, then what? For me, one reasonable way to challenge this model would be to challenge the whole Twelve-factor concept of <a href="http://12factor.net/build-release-run">Build, release, run</a>. If you disagree with the Twelve-factor concept of a ‘Release’, then by all means disagree with the content of this post!</p>
<p>Just like with my <a href="http://www.dreweaster.com/blog/2015/01/21/a-confusing-side-to-Twelve-factor-app-configuration/">related post</a> – and despite being sympathetic to the <a href="http://12factor.net/build-release-run">Build, release, run</a> advice – I’m not necessarily arguing right or wrong here. It’s just a case of pointing out what would constitute a pure implementation of the Twelve-factor guidelines on top of Docker.</p>
<p>Remember, the Twelve-factor guidelines were essentially invented by the Heroku gurus, and there are other PaaS technologies that also follow the same principles. It’s just a specific way of tackling release management, and, whilst it may not be the <em>right</em> way of using Docker, I don’t think it would be fair to say it’s <em>wrong</em> either.</p>
<p>What do you think?</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Customer experience makes the difference]]></title>
<link href="http://DrewEaster.github.io/blog/2015/02/03/customer-experience-makes-the-difference/"/>
<updated>2015-02-03T09:56:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2015/02/03/customer-experience-makes-the-difference</id>
<content type="html"><![CDATA[<p>In this post, I’m going to dip my toes into the world of customer experience (CX). This isn’t a subject I’ve written about before, but a recent sub-optimal hotel experience during a trip to Belgium has prompted this analysis.</p>
<p>Whilst this story does not relate specifically to technology, great customer experience is something that all businesses should pursue, whether technical or not in nature.</p>
<h2>Our story</h2>
<p>Back in early December, my wife and I booked a two night city break to Bruges (Belgium), to return for the third time to what will remain an unnamed hotel, a hotel that continued to rank amongst our top five hotels in the world. To beat those New Year blues, we’d been looking to go away sooner, but delayed until the beginning of February due to the hotel having no availability for the entire month of January. With no specific explanation – within the online booking engine – for the complete lack of availability in January, it was natural to assume they were just fully booked.</p>
<p>A few days prior to our trip, my wife noticed on their website an explanation for the January blackout – turns out they were shutting the hotel for refurbishment. But, no worries, as the work would be over ready for our arrival.</p>
<p>On arrival at the hotel, it was immediately obvious that the work had not been completed – there were decorators inside and out, hammers banging, paint pots piled up in the hallways. During check-in, the desk staff made no reference to the refurbishment work that was going on around us, and we were eventually shown to our room.</p>
<p>To a backing track of relentless hammering outside the window, my wife asked our room escort about the Wellness Centre – the pool, jacuzzi, sauna etc. being one of the considerable advantages of this particular hotel. It was at this point that a member of staff finally acknowledged what was going on around us, which included breaking the news that the Wellness Centre was out of use. It turned out the refurbishment work had overrun (no s**t, Sherlock) and was expected to carry on for a further week.</p>
<p>Following a fairly short discussion, my wife and I concluded that we should move to a different hotel. Why?</p>
<h3>The hotel failed at every possible opportunity to acknowledge the problem</h3>
<p>The hotel had at least three clear opportunities to proactively acknowledge the problem:</p>
<ul>
<li><strong><em>At the point of booking</em></strong> – they could have at least warned of the potential risk of overrun</li>
<li><strong><em>In the days leading up to our arrival</em></strong> – a simple email would have been courteous</li>
<li><strong><em>On arrival</em></strong> – they could have immediately apologised at check-in and offered solutions</li>
</ul>
<p>The most damning thing is that we ultimately had to prise an apology out of the staff. If I’d been managing the hotel, I’d have personally welcomed every arriving guest, explained the issue and discussed ways to minimise the disappointment for them. It’s a customer experience disaster that we were left to realise what was going on before the staff acknowledged it.</p>
<h3>The hotel had devised no proactive mitigation plan</h3>
<p>What’s pretty clear is that the hotel staff, as a team, had failed to prepare for the fallout. Let’s be honest, it’s not difficult to predict that refurbishment works will overrun – not only should the hotel manager have prepared a mitigation plan, but that plan should have been communicated clearly to <em>every</em> member of staff.</p>
<p>They should have been ready for every single pre-booked guest arrival and had devised an individual, tailored mitigation solution for each one. This plan could have been communicated to the guest(s), at best, prior to arrival, and, at worst, immediately on arrival at the reception desk.</p>
<p>A great manager would have got every member of staff into a room, rallied the troops and ensured each team member was fully prepared to execute the mitigation plan. Unfortunately, the staff at our hotel seemed as bemused as we did, and this was extremely disappointing.</p>
<h2>What can be learned?</h2>
<p>So, given the clumsy nature of their handling of the situation, we checked ourselves out of the hotel and found ourselves a room elsewhere (the first time we’ve ever done that). The hotel that we’d previously held so dear to our hearts had undone years of customer loyalty development in a single day. As it turns out, we shouldn’t have ignored the fact they’d tumbled down the TripAdvisor rankings in the past few years – the warning signs were there.</p>
<p>Still, there’s a good chance we’d have forgiven them if they’d demonstrated a desire to minimise our disappointment – we’d probably have stayed where we were. Instead, not only are we unlikely to ever go back, but we’re now unlikely to recommend anyone else to stay there either (and they’ve had guests in the past directly off the back of our recommendations).</p>
<p>The lessons that can be learned from this experience are hardly original:</p>
<h3>A good product is worth little without great customer service</h3>
<p>The tangible product – ignoring the fact that some of it wasn’t available at the time – was seemingly unchanged. This is still a fabulous looking boutique hotel, in a fabulous location. The rooms are beautiful and the breakfasts fantastic. But, that just isn’t enough without delightful customer service to go along with it.</p>
<p>Never ever ever think your tangible product will make up for lapses in customer service. There’s no difference between product and service – your product <em>includes</em> the service.</p>
<h3>Don’t rest on your laurels</h3>
<p>You can’t rely on your customers’ loyalty through past experience. Just because you’ve won their loyalty in the past, it doesn’t mean you can screw up in the present. Loyalty is a pretty fickle concept and, if you mess with your customers, they’ll look elsewhere for a better all round experience.</p>
<h3>Customer service is the most important differentiator</h3>
<p>As a market becomes more and more competitive, the opportunities to differentiate on your tangible product are reduced – it’s not always easy to create a truly original product. So, whilst your competitors continue to iterate blindly on their tangible product, spread your resources to accommodate improving all round customer experience. Service your customers in delightful ways, and they’ll keep on coming back to you for more. Do not underestimate your ability to differentiate yourself through great service.</p>
<h2>Make the difference</h2>
<p>As stated at the beginning of this post, customer experience is important for <em>every</em> business – little or large, technical or non-technical.</p>
<p>Great customer experience makes the difference, so make it your difference.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[A confusing side to twelve factor app configuration]]></title>
<link href="http://DrewEaster.github.io/blog/2015/01/21/a-confusing-side-to-twelve-factor-app-configuration/"/>
<updated>2015-01-21T18:39:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2015/01/21/a-confusing-side-to-twelve-factor-app-configuration</id>
<content type="html"><![CDATA[<p>I’m sure I’m not the only one who breathed a sign of relief when the clever brains behind Heroku published the <a href="http://12factor.net">Twelve Factor App</a> guidelines. Here was a reliable set of principles – born out of real life experience – that immediately hit home, providing sensible advice for overcoming the many pitfalls developers and operations teams have fought relentless battles with time and time again.</p>
<p>One of the principles that I was especially able to connect with is the advice regarding from where applications should read their <a href="http://12factor.net/config">configuration</a>. Having regularly encountered config file hell over the years, this simple, platform agnostic approach to supplying config to an application makes a whole lot of sense. There have, though, been some misunderstandings with regard to this advice, <a href="http://blog.doismellburning.co.uk/2014/10/06/twelve-factor-config-misunderstandings-and-advice/">and this blog post</a> by Kristian Glass does a good job of highlighting one such misunderstanding – Twelve Factor does not dictate from where the environment should be populated, only that an app should read from it.</p>
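<p>The mechanics of the <a href="http://12factor.net/config">Config</a> guideline itself amount to nothing more than this kind of one-liner (variable name illustrative):</p>
<pre><code>val databaseUrl = sys.env.getOrElse("DATABASE_URL", sys.error("DATABASE_URL is not set"))
</code></pre>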
<p>So, with a major misunderstanding out the way, we are left with a whole bunch of options as to how we populate the environment. Outside of the extreme abstraction of a PaaS environment like Heroku, something like Consul – the distributed key/value store – is one such solution. And in this <a href="https://hashicorp.com/blog/twelve-factor-consul.html">blog post</a> by <a href="https://hashicorp.com">Hashicorp</a>, that very solution is covered quite nicely.</p>
<p>But, hang on, has something been overlooked?</p>
<p>I’d like to challenge Hashicorp’s claim that this approach is compatible with Twelve Factor principles. My personal opinion – on which I’m very happy to be challenged – is that there is a further critical piece of advice, regarding configuration, contained within the Twelve Factor guidelines – you just have to turn to a different page.</p>
<p>This additional piece of advice is contained within the section titled <a href="http://12factor.net/build-release-run">Build-Release-Run</a>. As far as I’m concerned, the advice here is pretty crystal clear – a ‘Release’ is a combination of a ‘Build’ <em>and</em> ‘Config’. This <em>immutable</em> artifact is uniquely versioned and is deployed and rolled back as an atomic unit. This is how Heroku does it, and the open source <a href="http://www.docker.com">Docker</a> based PaaS <a href="http://www.deis.io">Deis</a> has the <a href="http://docs.deis.io/en/latest/reference/terms/release/#release">same approach</a>. I’m leaving out other PaaS tech that also follows this model.</p>
<p>It would be hard to deny that this approach makes deployments very easy to reason about. By bundling together code alongside environment specific config, you simplify release management and tracking. For example, if you need to rollback, you don’t have to manage that as two separate actions – it’s an atomic action and will return you to the last known uniquely versioned, operationally sound combination of code and config.</p>
<p>So, whilst it’s true that Twelve Factor does not dictate from <em>where</em> the environment should be populated, it’s pretty clear that it does say <em>when</em> it should be. The source of environment configuration should be read <strong><em>when preparing a release</em></strong>, and that any change to config, just like with a change to code, <strong><em>results in a new release</em></strong>.</p>
<p>Therefore, any approach to managing configuration, in a similar way to Hashicorp’s advice, would not appear to be compatible with the Twelve Factor guidelines. They are promoting a model where the ‘Build’ and ‘Config’ are not atomically bundled as a ‘Release’ and, as such, this model is violating a fundamental Twelve Factor principle – if your code and config are managed in separate lifecycles, it ain’t Twelve Factor compatible.</p>
<p>One could quite easily challenge the ‘Build + Config = Release’ advice:</p>
<ol>
<li>It doesn’t appear to leave any room for runtime config changes</li>
<li>It’s not completely clear how it would work with dynamic service discovery</li>
</ol>
<p>Sometimes, however, the advantages of predictable, easy-to-reason-about deployments outweigh the benefits of such niceties.</p>
<p>I’m not debating here what’s the right or wrong way; I’m just pointing out that the Twelve Factor advice is very clear about the meaning of a ‘Release’ and, therefore, any method that circumvents this cannot claim to be compatible with the guidelines.</p>
<p>Just saying.</p>
<h3>UPDATE 2015-01-21 20:10</h3>
<p>In this <a href="http://spring.io/blog/2015/01/13/configuring-it-all-out-or-12-factor-app-style-configuration-with-spring?utm_content=bufferfa5a5&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer">blog post</a>, Spring are also claiming that the <a href="http://projects.spring.io/spring-cloud/spring-cloud.html">Spring Cloud</a> approach to configuration is Twelve Factor compatible. I think this makes the same mistake as Hashicorp. If configuration is able to change independently of a release – as is described in the post – then it’s not in the spirit of the Twelve Factor App.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Lessons from my first startup failure]]></title>
<link href="http://DrewEaster.github.io/blog/2014/10/02/lessons-from-my-first-startup-failure/"/>
<updated>2014-10-02T15:13:00+01:00</updated>
<id>http://DrewEaster.github.io/blog/2014/10/02/lessons-from-my-first-startup-failure</id>
<content type="html"><![CDATA[<p>It’s been a while since I last posted here, and with good reason – I’ve been trying to run a startup! As this is a technically focused blog – and given my startup experiences were mainly commercially focused – I didn’t have all that much to say here. But, now that I’ve experienced my first startup failure, I thought I’d write about it (even though it’s not really technical content).</p>
<p>It has become somewhat customary for entrepreneurs to write about the good times and the bad. In either case – and often more so in the case of the bad times – it seems quite useful to share the lessons learnt along the way. So, this is me sharing some of the lessons from my first startup failure. Let’s get on with it.</p>
<h2>Be wary of complex dependencies</h2>
<p>If your business model is reliant on others (partners) investing in changes to their own technology systems, your chances of success are considerably lowered.</p>
<p>All startups have an implicit dependency on customers buying their product/service. Any additional dependencies just hugely complicate things. Try to keep as much within your own control as possible.</p>
<h2>Take good feedback with a pinch of salt</h2>
<p>Most people will tell you how great your idea is – it’s human nature. The only way to be sure you’ve got a good product is to get someone to actually pay you. Until someone hands over money for your product/service, you haven’t validated the problem you are solving (this is skewed somewhat towards B2B software).</p>
<h2>Identifying a problem is not enough</h2>
<p>Whilst clearly important, identifying a problem is not enough on its own. Yes, your business may well solve a real pain point in theory, but that still doesn’t mean you can execute on it successfully. Commercial complexities (e.g. a deeply tangled industry) can make even a good idea impossible to execute, especially for a startup.</p>
<h2>Be patient before diving in</h2>
<p>Even if you can afford to, it might not be a good idea to quit your job until you’ve found something you are really passionate about building/solving. Identify a problem you’ve had in your own life/job; don’t try to force ideas out of nothing – it’s contrived. You have to be patient and let the idea come to you – if that means working for others until you do, then so be it.</p>
<p>Make the most of working for larger companies where you can network with those in business areas you are not familiar with. The gaps will be there; you just need to put yourself in a position to see them.</p>
<h2>It’s much harder than you could possibly imagine</h2>
<p>It’s going to be infinitely more difficult to succeed than you think. Success is a combination of a good idea, impeccable execution, luck, who you know, and more.</p>
<p>There is a myth doing the rounds that it’s easier than ever to start a startup. In reality, it’s just a cultural mind shift – the actual chances of succeeding are as low as they have ever been. The ‘anyone can build an app’ delusion is really unhelpful.</p>
<p>And it will take five times longer to succeed than you probably think it will. Most successful startups have been operating a lot longer than you think they have – you’re facing many years of blood, sweat and tears (and you still might fail).</p>
<p>Be willing to acknowledge, therefore, that the idea of running your own business might look much better on paper than it does in reality.</p>
<h2>A tech startup is not just about tech</h2>
<p>Marketing and sales will most likely be 80% of the effort in a startup’s success story. Don’t be under any illusions – a good product won’t sell itself. A warning for you techies out there – don’t be fooled into thinking that you only need tech skills to get a business off the ground. It’s not true, seriously!</p>
<p>Accordingly, non-techie entrepreneurs shouldn’t feel disadvantaged that they don’t have tech skills. This is not to say that tech isn’t hugely important (I obviously think it is), but it’s not a one-way ticket to success.</p>
<h2>Recognise when the game is up</h2>
<p>Be prepared to recognise when it’s time to call it a day. Cut through the false positives to make an objective assessment of your business. There’s a fine line between sensible persistence and blind optimism. It’s not always advisable to keep the faith when every indicator of business health is against you. Maybe it’s just not a good idea, or maybe the market is not yet ready, for any number of commercial reasons, to accommodate your product.</p>
<h2>Don’t do it for money</h2>
<p>Don’t do it if money is, in any way, the motivation. Be totally honest with yourself from the outset. Unless your only reason for starting a business is to build a great product/service, don’t bother. For most people, if they are truly honest, the idea of becoming rich is the real motivator deep down.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[What a storming idea]]></title>
<link href="http://DrewEaster.github.io/blog/2013/12/07/what-a-storming-idea/"/>
<updated>2013-12-07T09:46:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2013/12/07/what-a-storming-idea</id>
<content type="html"><![CDATA[<p>Firstly, an important disclosure – having only just stumbled across the practice of <a href="http://ziobrando.blogspot.be/2013/11/introducing-event-storming.html">Event Storming</a>, I’ve not yet had the opportunity to experiment with it myself. However, sometimes a concept/practice makes such immediate sense in one’s mind that one feels compelled to talk about it!</p>
<p>This post feels like quite a natural successor to my previous post on <a href="http://www.dreweaster.com/blog/2013/12/02/event-driven-architecture-ftw/">Event-driven Architecture</a>. In that post, I discussed some of the tangible benefits of EDA, and this follow up introduces the practice of <a href="http://ziobrando.blogspot.be/2013/11/introducing-event-storming.html">Event Storming</a>, a fledgling Domain Driven Design influenced practice that goes hand in hand with EDA to assist product teams in exploring the complexity of their business domain.</p>
<p>It’s not my intention to write a long article on Event Storming – I encourage you to read Alberto Brandolini’s <a href="http://ziobrando.blogspot.be/2013/11/introducing-event-storming.html">introduction</a> instead – but I want to share my biggest takeaways from this new learning.</p>
<h2>The Domain-Relational Impedance Mismatch</h2>
<p>In my experience of domain modelling, the biggest mistake I’ve seen made, time and time again, is failing to involve the most important people ‘in the room’. The reason for this is that far too many projects take a <em>persistence oriented</em> approach to modelling, as opposed to a <em>business oriented</em> approach. Put simply, by tending to address model genesis using only technical team members, teams get sucked into allowing technical persistence concerns to shape their modelling approach. It’s not unusual for domain experts to be involved only at the user interface level, leaving software engineers and DBAs to make decisions that they are probably the least qualified to be making. The ultimate success of a software project rests on how well domain complexity has been tackled, so it seems crazy for domain experts to be absent from the modelling process.</p>
<p>Let’s be absolutely clear – a domain model exists in spite of persistence technology, not because of it. A model is not a technical concept; it is a reflection of a business domain that exists in real life, not inside a machine.</p>
<blockquote><p>“A domain model exists in spite of persistence technology, not because of it.”</p></blockquote>
<p>Whilst NoSQL technology is becoming more popular, it’s still fairly normal to see domain modelling tackled using entity relationship (ER) diagrams – something quite familiar to engineers and DBAs. That wily old fox, the relational model, is still recognised by many as the de facto way to practice domain modelling. However, Domain Driven Design (DDD) teaches us a much better way, and does not make room for persistence concerns in our conversations – models spawned using DDD practices typically look very different from what they would had ER modelling been applied instead.</p>
<p>You’re probably familiar with the <a href="http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch">object-relational impedance mismatch</a> concept, but I think our problems extend much further than that; I believe DDD reveals a domain-relational impedance mismatch. That is, the relational model is not a natural fit for addressing domain model complexity, and thus should not be trusted to do so.</p>
<h2>The light at the end of the tunnel</h2>
<p>So, we know we’re doing it wrong, but how do we ensure we get both the right people in the room (domain experts) and a method to support effective communication of domain complexity? This is where I believe we should look to Event Storming to help us out.</p>
<p>From my initial learnings on Event Storming, I can wholeheartedly say that this technique appears to offer a very attractive way to ensure focus remains on the business rather than technical implementation. It forces the right people – domain experts – to be in the room, thus ensuring core business flows are identified, bounded contexts are defined and consistency boundaries are clarified.</p>
<p>Event Storming does imply an event-driven architecture (EDA), but I hope my <a href="http://www.dreweaster.com/blog/2013/12/02/event-driven-architecture-ftw/">previous post</a> serves to explain why you should be doing that anyway. It finally gives us an accessible technique that allows domain experts and technical specialists to work together to tackle domain complexity effectively. It’s a really exciting prospect and I look forward to applying it to both existing and future projects.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[The Silicon Valley Mindset]]></title>
<link href="http://DrewEaster.github.io/blog/2013/12/04/the-silicon-valley-mindset/"/>
<updated>2013-12-04T22:01:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2013/12/04/the-silicon-valley-mindset</id>
<content type="html"><![CDATA[<p>As I’m currently on a temporary work assignment in San Jose, I recently wrote an article for <a href="http://thecroydoncitizen.com/">The Croydon Citizen</a> about my experiences in Silicon Valley. It’s great to have become part of the <a href="http://croydontechcity.com/">Croydon Tech City</a> movement, and so it was nice to contribute an article in support of the great work being done to promote Croydon as a new hub for tech startups in the UK.</p>
<p>Check out my article here:</p>
<p><a href="http://thecroydoncitizen.com/croydon-tech-city/silicon-valley-mindset/">http://thecroydoncitizen.com/croydon-tech-city/silicon-valley-mindset/</a></p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Event-driven Architecture FTW]]></title>
<link href="http://DrewEaster.github.io/blog/2013/12/02/event-driven-architecture-ftw/"/>
<updated>2013-12-02T21:41:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2013/12/02/event-driven-architecture-ftw</id>
<content type="html"><![CDATA[<h2>A quick primer on EDA</h2>
<p>First of all, let’s delegate to the all-knowing source, <a href="http://en.wikipedia.org/wiki/Event-driven_architecture">Wikipedia</a>, to give us a concise definition of Event-driven architecture (EDA):</p>
<blockquote><p>“Event-driven architecture (EDA) is a software architecture pattern promoting the production, detection, consumption of, and reaction to events.”</p></blockquote>
<p>It’s pretty unusual to encounter a software engineer who hasn’t dabbled with publish-subscribe semantics – or anything resembling the <a href="http://en.wikipedia.org/wiki/Observer_pattern">Observer Pattern</a> – at some point in their engineering adventures. It’s hard to ignore, when faced with a challenge to simplify some programming problem, the draw of decoupling a producer of something from one or many consumers of that something. Not only does it divide responsibility (making code easier to test for starters), it also clears the way for an asynchronous programming model in use cases where producers needn’t be blocked by the actions of an interested consumer.</p>
<p>If you can appreciate the power of publish-subscribe, you’re unlikely to have much trouble figuring out how Event-driven architecture (EDA) could help you. It only requires a small leap of faith to realise that the pattern you used to solve that micro problem, in some isolated component within your system, can become a macro pattern underpinning a system-wide architectural style.</p>
<p>So, let us return to that Wikipedia definition. I think we could reasonably interpret that definition to understand EDA as the art of designing a system around the principle of using events – concise descriptions of state changes or significant occurrences in the system – to drive behaviour both within and between the applications that make up an entire platform.</p>
<h3>Production</h3>
<p>Production of an event involves some application component/object generating a representation of something that has happened – maybe a state change to an entity, or maybe the occurrence of some activity (e.g. a user viewed page X). Rather than notifying specific consumers, the producer will simply pass the event to some infrastructural component that will deal with ensuring the event can be consumed by anything that’s interested in it.</p>
<h3>Detection</h3>
<p>It’s my belief that the Wikipedia definition is essentially referring to the mechanism that sits between a producer and consumer (the infrastructure piece) – the logic that ensures events are passed to interested consumers.</p>
<h3>Consumption</h3>
<p>Consumption is the act of an interested consumer receiving an event it has registered an interest in. This is still most likely part of the infrastructure piece (e.g. a message bus). There might be a whole bunch of concerns here around reliable delivery, failure recovery etc.</p>
<h3>Reaction</h3>
<p>Reaction is the fun part, where a consumer actually performs some action in response to the event it has consumed. For example, imagine you register a user on your website and you want to send them a welcome email. Rather than bundling the responsibility for sending email into your domain model, just create a consumer that listens for a UserRegisteredEvent and send the email from there. This nicely decouples the email delivery phase and also allows it to happen asynchronously – you probably don’t need, or want, email delivery to be a synchronous action. Also, imagine you have further requirements relating to post-registration behaviour – your domain model would soon become unwieldy with all that extra responsibility. Not wanting to violate the Single Responsibility Principle (SRP), you sensibly use event-driven programming to split all those actions into separate consumers, allowing each behaviour to be tested in isolation, and retaining simplicity in your domain model.</p>
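<p>As a sketch of that registration example – my own code, using Akka’s event stream as the infrastructure piece, with hypothetical event and consumer names:</p>
<figure class='code'><pre><code class='scala'>import akka.actor.{Actor, ActorSystem, Props}

// A hypothetical domain event
case class UserRegisteredEvent(email: String)

// A consumer that reacts to the event, keeping email delivery
// out of the domain model and off the registration code path
class WelcomeEmailSender extends Actor {
  def receive = {
    case UserRegisteredEvent(email) =>
      println(s"Sending welcome email to $email") // substitute real email delivery
  }
}

object ReactionExample extends App {
  val system = ActorSystem("eda-example")
  val emailer = system.actorOf(Props[WelcomeEmailSender], "welcome-emailer")

  // The consumer subscribes to the event type it cares about...
  system.eventStream.subscribe(emailer, classOf[UserRegisteredEvent])

  // ...and the producer publishes without knowing who is listening
  system.eventStream.publish(UserRegisteredEvent("alice@example.com"))
}
</code></pre></figure>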
<h2>Events everywhere</h2>
<p>As previously alluded to, it’s far from unusual to see fragments of event-driven programming in many applications. However, it’s another step entirely to see event-driven programming adopted in such a way that it becomes an endemic architectural pattern – that is, where an entire platform uses events to underpin all its moving parts. To be successful with EDA, it needs to become a fundamental mindset that drives all design decisions, rather than just a pattern used in some isolated parts of a wider system.</p>
<p>I want to evaluate a series of application/platform features that, whilst they may sit outside of core business workflows, fit really nicely with EDA. This should help illustrate why EDA can be a very fruitful path to follow.</p>
<h3>WebHooks (the Evented Web)</h3>
<p>WebHooks is a fairly high level concept that encompasses the use of HTTP as a form of publish-subscribe mechanism. Whilst there are more recent attempts to create more standards around this – <a href="http://progrium.com/blog/2012/11/19/from-webhooks-to-the-evented-web/">calling it the Evented Web</a> – the fundamental idea is to allow a consumer to register a callback URL with some remote resource (using HTTP) that will be invoked by the hosting service whenever some event occurs relating to that resource. A really well known example is post-commit hooks on Github – any external tool (e.g. a CI server, a bug tracker) can register interest in commits to a repository and react in whatever way that makes sense in their context.</p>
<p>I’m pretty convinced that the Evented Web paradigm has got a lot of growth potential and will become a de facto expectation of any well designed service API. What should be clear is how easy it would be to add WebHooks functionality to your own application if you are already applying EDA across the board.</p>
<h3>Big Data</h3>
<p>I do kind of detest using the term ‘Big Data’ as it’s uncomfortably ambiguous and vague. However, for the purposes of this article, I’m going to stick with it (strike me down). If, for now, we think of Big Data as a way of capturing shed loads of data to enable business intelligence, we should be able to see quickly that events occurring within an application might be a great source of all that lovely data. If you’ve adopted EDA, your Big Data pipeline may just be a consumer of all your application events. You might dump all those events into HDFS for future batch processing, and, given you are essentially subscribing to a real-time event feed, you might also drive your real-time analytics use cases as well.</p>
<h3>Monitoring</h3>
<p>Unless you are someone who really couldn’t give a damn, you’re going to want some monitoring tools in place to give a thorough insight into the health of your system in production. Common monitoring solutions may include, amongst other things, a bunch of smart dashboards full of sexy graphs, and some threshold alerting tools to help spot potential problems as soon as an incident starts. Either way, both these tools are driven by time series data that represents ‘stuff’ happening within an application. What better way to capture this ‘stuff’ than sucking events in from your event streams? Again, if you’ve already followed EDA, you’re going to get some pretty quick monitoring wins.</p>
<h3>Log file analysis</h3>
<p>There is possibly some crossover here with monitoring, and even Big Data, but I think it deserves its own special mention. If you imagine logs as comprehensive streams of events, assuming you’ve followed an EDA style, you can pretty much get log analysis for free. Just suck your logs into some analysis tools (e.g. <a href="http://logstash.net/">Logstash</a> and <a href="http://www.elasticsearch.org/overview/kibana/">Kibana</a>), and you’re pretty much good to go. Just remember that it’s perfectly reasonable to use events to represent errors too (which could contain any relevant stack trace).</p>
<h3>Test-driven development (TDD)</h3>
<p>Okay, so TDD is not an application feature; it’s part of the engineering process. However, if our architecture decisions can help to improve our quality process, then that can’t be a bad thing. Event-driven programming encourages a code level design approach that follows the <a href="http://pragprog.com/articles/tell-dont-ask">Tell, Don’t Ask</a> pattern. You tell an object to do something, which leads to an event, or events, being published. So, what’s this got to do with TDD? In my experience, it’s much easier to reason about your code, and define more coherent specifications, if your testing style mimics ‘given this input (command), I expect this event, or events, to be produced’. A test first approach is very compatible with this style, and makes you think in a much more behavioural (think BDD) way.</p>
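<p>Here’s a sketch of what that testing style can look like – the domain types are hypothetical, and any test framework would do:</p>
<figure class='code'><pre><code class='scala'>import org.scalatest.FlatSpec

// Hypothetical command, event and aggregate, purely for illustration
case class RegisterUser(email: String)
case class UserRegisteredEvent(email: String)

class UserAggregate {
  // Tell, Don't Ask: a command goes in, events come out
  def handle(command: RegisterUser): Seq[UserRegisteredEvent] =
    Seq(UserRegisteredEvent(command.email))
}

// 'Given this command, I expect these events' reads like a behavioural spec
class RegistrationSpec extends FlatSpec {
  "Registering a user" should "produce a UserRegisteredEvent" in {
    val events = new UserAggregate().handle(RegisterUser("alice@example.com"))
    assert(events == Seq(UserRegisteredEvent("alice@example.com")))
  }
}
</code></pre></figure>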
<h2>For the win</h2>
<p>Right, we’ve covered a primer on EDA and seen how it can be used to drive core business flows, cross-cutting concerns, and even our quality process. I believe this knowledge makes a very compelling case for adopting EDA – why would you bake in custom solutions for capturing Big Data, doing health check monitoring and so on, when you can simply piggyback these features off of your core architecture? Hopefully, you wouldn’t! All sorts of wonderful acronyms pop into my head at this point – KISS, DRY, SRP etc. And don’t we all love acronyms?</p>
<p>But can we go even further?</p>
<h2>Going the whole hog with event sourcing</h2>
<p>This discussion leads so elegantly into the final part of this blog post – event sourcing. Event sourcing is an approach to persistence in which the state of an entity – more specifically, an Aggregate Root in DDD speak – is made up of all the events, representing state changes, it has emitted over time. So, rather than store current state in a database, you simply load the historical sequence of events (from an event store) and apply them in order to obtain current state. I will leave it up to the reader to pursue the full benefits of using event sourcing – there’s a small sketch of the replay idea after the list below – but here are some of the headline wins:</p>
<ul>
<li>Supports a simple and very scalable approach to persistence. An event store can be as simple as an append only log file.</li>
<li>Gives you a full history of every state change and that’s great for producing an audit log (something you might want anyway, even without event sourcing).</li>
<li>Can still utilise snapshots of current state for performance optimisation when replaying.</li>
<li>Very compatible with a test first, behavioural approach to testing.</li>
<li>Plays very nice with the <a href="http://martinfowler.com/bliki/CQRS.html">CQRS</a> architectural pattern, a very practical way to bake scalability into your applications by maintaining separate paths for reads and writes.</li>
</ul>
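<p>And here’s that replay idea in miniature – my own sketch, with hypothetical event and entity types:</p>
<figure class='code'><pre><code class='scala'>// A minimal event sourcing sketch: current state is just a left fold
// over the ordered history of events
case class AmountDeposited(amount: BigDecimal)

case class Account(balance: BigDecimal = 0) {
  def apply(event: AmountDeposited): Account = copy(balance + event.amount)
}

object ReplayExample extends App {
  // The journal: an append-only, ordered sequence of events
  val journal = Seq(AmountDeposited(100), AmountDeposited(50))

  // Replaying the journal in order yields current state
  val current = journal.foldLeft(Account()) { (account, event) => account(event) }
  println(current.balance) // 150
}
</code></pre></figure>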
<p>If you’re going to go down the EDA route, why limit just your applications to an event-driven style? If it’s possible to maintain state via the events you’re already publishing, why maintain a separate database at all? Storing events for all time might seem like a storage burden but, seriously, are we still worrying about the cost of storage in 2013? Storing current state in a database is a ‘lossy’ approach – once you’ve overwritten existing state, you can never get it back. Martin Thompson summed all this up so concisely in a recent tweet:</p>
<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p><a href="https://twitter.com/ashic">@ashic</a> <a href="https://twitter.com/hintjens">@hintjens</a> I like, "A database is a cache of the event log".</p>— Martin Thompson (@mjpt777) <a href="https://twitter.com/mjpt777/statuses/407534444803661824">December 2, 2013</a></blockquote>
<p>There are way too many compelling reasons for wanting to keep a history of everything and it’s impossible to avoid being courted by that proposition.</p>
<p>I think this is a really fascinating area of exploration – sometimes traditional CRUD might be a better choice, but the more I work with event sourcing, and the more comfortable I feel with EDA in general, the harder it becomes to find good reasons against following this path.</p>
<h2>Conclusion</h2>
<p>So that wraps up a fairly lengthy discussion on EDA and how an event-driven mindset can promote a coherent strategy for the way you build software. One of the toughest things we face as software engineers is maintaining a consistency in style, especially in large teams. My ideal vision is for code to speak the architecture patterns on which it is crafted, such that no engineer could ever doubt what is expected of them when refactoring or adding new features. For me, EDA is an enabler of this vision, and will help to bridge the gap between doing the right thing (building product features your users love), and doing the thing right (consistent and elegant technical solutions).</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[The fallacy of the omniscient domain model]]></title>
<link href="http://DrewEaster.github.io/blog/2013/11/17/the-fallacy-of-the-omniscient-domain-model/"/>
<updated>2013-11-17T17:20:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2013/11/17/the-fallacy-of-the-omniscient-domain-model</id>
<content type="html"><![CDATA[<p>Complexity. As software engineers, it’s pretty hard to make it beyond lunch without someone mentioning it. But what is it exactly? Most of us probably think we sussed this out a long time ago – hell, we’re probably preaching the KISS and DRY principles on a daily basis. However, complexity in software is a multi-faceted beast and, if you take the time to reflect on your own view of it, you may reveal a bunch of defective preconceptions.</p>
<p>One of these preconceptions, one which I’ve failed to question effectively myself in the past, is that a ‘simple’ domain model is one that can encompass the entire domain of an enterprise. The other preconception is that ‘simple’ architecture means few moving parts.</p>
<p>Let’s address each one individually.</p>
<p>Imagine a completely DRY, one-size-fits-all domain model that manages to perfectly model the domain of an entire enterprise. This model is infinitely malleable and able to accommodate future changes without adversely affecting existing code dependent on it. Are you struggling to imagine this? I hope so, because I don’t think it’s possible in anything other than the most basic of business domains. Regardless, this is a very common approach and, instead of being simple, it adds an alarming amount of overall complexity. Ultimately, the intertwining of vaguely related entities makes it impossible to make changes in one place without having to untangle deep dependencies elsewhere.</p>
<p>If, instead, you apply some of the core principles of Domain Driven Design (Domains, Subdomains, Bounded Contexts) to any enterprise, it’s natural to materialise multiple models that exist within the different subdomains and contexts that make up the wider domain. This approach, which actually reflects the structure of the real business, reduces overall complexity – dependencies are untangled by design, making changes more achievable in isolation.</p>
<p>One important thing to note here – you’re not necessarily violating the DRY principle if an entity appears in multiple contexts. Maybe you’ve just failed to tweak the Ubiquitous Language for each context, and so haven’t recognised that the entity means different things in those different contexts.</p>
<p>So, now onto our second preconception – does simple architecture mean few moving parts? My previous argument may seem overly convenient given it implicitly provides support to my next case – it’s most likely true that the introduction of multiple bounded contexts and/or subdomains will lead to more moving parts. But does that actually equate to complexity?</p>
<p>We’ve all seen over-engineered software that throws in a message queue here, another message queue over there, neither appearing to offer any discernible value. I’m certainly not advocating that! But these over imaginative solutions shouldn’t be confused with DDD influenced design decisions to separate contexts and integrate them effectively where necessary.</p>
<p>Applying the KISS principle to architecture doesn’t necessarily mean a system with fewer moving parts. A simple system is one that is highly adaptable and reactive to business changes – thus, the number of moving parts alone can’t be considered a good measure of complexity.</p>
<p>I hope the arguments I’ve made in this article help to address some common misconceptions. I do believe the concept of a simple, single, all-knowing domain model is a fallacy. Feel confident to apply DDD principles, be proud of your separate cohesive models, and don’t fear the additional moving parts you might adopt in the process.</p>
<p>Complexity is not always what it seems.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Akka, DDD, CQRS, Event Sourcing and Me]]></title>
<link href="http://DrewEaster.github.io/blog/2013/10/27/Akka-DDD-CQRS-Event-Sourcing-And-Me/"/>
<updated>2013-10-27T20:47:00+00:00</updated>
<id>http://DrewEaster.github.io/blog/2013/10/27/Akka-DDD-CQRS-Event-Sourcing-And-Me</id>
<content type="html"><![CDATA[<h2>Introduction</h2>
<p>I’ve been involved in some really interesting discussions on the Akka User mailing list of late and thought it would be good to translate something I wrote, in one particular thread, into a blog post.</p>
<p>I recently became rather obsessed with Domain Driven Design (DDD), CQRS and Event Sourcing. Given an existing passion for Akka (actor based programming), it dawned on me that the actor model might be an extremely good fit for the aforementioned “Holy Trio” of paradigms/patterns. Whilst trying to build a prototype, I brought up the topic on the Akka User mailing list and became aware that the Akka Team were actively working on Akka Persistence, which is well ahead of my dirty little prototype! For one reason or another, within a post on the mailing list, I ended up brain dumping my understanding of how Akka Persistence and the “Holy Trio” fit together, so here follows a blog friendly version!</p>
<h2>Akka and the Holy Trio</h2>
<p>So, it seems like the actor model fits very well with DDD, and I’m not the first person to think this – <a href="https://vaughnvernon.co/">Vaughn Vernon</a> is well ahead of the game!</p>
<p>It feels quite natural to me to consider an actor analogous to an aggregate root (in DDD speak). Rather than just seeing an actor as a mediator sitting in front of a database, you see the actor conceptually as an entity. By incorporating event sourcing, the actor works on the basis of processing a command (representing some action) and outputting one or many events that represent the result of processing that command. At all times the actor encapsulates its current state, which can simply be seen as the result of a series of ordered events applied to it.</p>
<p>Initially, it’s actually quite helpful to ignore persistence requirements altogether and just rely on the state within the actor – in this way you don’t allow persistence concerns to influence the way you design your domain – I often find that domain models end up messy because, despite attempts to avoid doing so, persistence concerns end up, in one form or another, leaking into the business logic. Any approach to building a domain layer, that can genuinely remove the need to consider persistence technologies, is an attractive proposition in my mind!</p>
<p>By following an event sourcing approach, persistence becomes no more complex than just persisting the events that the actor produces in an event store (journal). Given that events represent an immutable history of state changes, the journal can be append only, and that’s pretty exciting from a performance perspective. This is something that can quite clearly be bolted on once you’ve already put together a working in-memory only version of your domain – the actors themselves can be completely agnostic to the existence of the persistence mechanism. This is where Akka Persistence comes in – it works on the following basis:</p>
<ol>
<li>Actor receives command</li>
<li>Actor processes command (applies business logic) and produces events that represent results of processing</li>
<li>Events are persisted to journal</li>
<li>Events are applied to the actor so it can update its internal state</li>
</ol>
<p>Akka Persistence pretty much deals with everything other than the command handling step, which represents your custom business logic – Akka is not quite good enough (yet) to do that bit ;-)</p>
<p>Step 4 is quite important – it’s key that internal state is only updated once you’re sure events have been persisted to the journal. Also, by separating this step, it serves the additional purpose of allowing events in the journal to be replayed onto a “green” instance of the actor, bringing its internal state back up to date. Akka Persistence also has support for “snapshots”, which allow you to take periodic snapshots of current state as a performance optimisation when replaying events.</p>
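<p>Put together, a persistent actor ends up looking something like this – a minimal sketch of my own against the <code>PersistentActor</code> API that Akka Persistence later stabilised on (the early milestones used slightly different trait names), with hypothetical domain types:</p>
<figure class='code'><pre><code class='scala'>import akka.persistence.PersistentActor

// Hypothetical command, event and state
case class RegisterUser(email: String)
case class UserRegistered(email: String)
case class State(emails: List[String] = Nil) {
  def updated(event: UserRegistered): State = copy(event.email :: emails)
}

class UserAggregate extends PersistentActor {
  override def persistenceId = "user-aggregate-1"

  private var state = State()

  // Step 4: applying an event is a separate, side-effect-free step...
  private def updateState(event: UserRegistered): Unit =
    state = state.updated(event)

  // Steps 1-3: receive the command, apply business logic, persist the event
  override def receiveCommand: Receive = {
    case RegisterUser(email) =>
      persist(UserRegistered(email)) { event =>
        updateState(event) // only runs once the journal has confirmed the write
      }
  }

  // ...which is what allows the journal to replay history onto a fresh instance
  override def receiveRecover: Receive = {
    case event: UserRegistered => updateState(event)
  }
}
</code></pre></figure>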
<p>So, to sum up, the actor model just fits with the “Holy Trio”. It seems to deal with a majority of the pain points experienced when building a domain layer using more traditional CRUD techniques. It’s pretty exciting to me how natural this design feels without any need to consider using an ORM ;-)</p>
<h2>The Future</h2>
<p>And that’s the end of the blog friendly version of my brain dump! I truly believe the combination of the actor model with DDD, CQRS and Event Sourcing is going to become a very prevalent domain model design solution in the future.</p>
<p>I have to add that the Akka Team (including Patrik, Martin, Roland etc.) continue to be such an inspiration. Keep up the good work, guys!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Access logs in Play! using RequestHeader tags]]></title>
<link href="http://DrewEaster.github.io/blog/2013/07/14/access-logs-in-play-using-requestheader-tags/"/>
<updated>2013-07-14T13:30:00+01:00</updated>
<id>http://DrewEaster.github.io/blog/2013/07/14/access-logs-in-play-using-requestheader-tags</id>
<content type="html"><![CDATA[<h2>Log files as event streams</h2>
<p>Having long been a fan of treating <a href="http://12factor.net/logs">log files as streams of events</a> (JSON formatted), I thought it would be useful to share a little bit of code demonstrating how to achieve access style logging in a Play application. There’s certainly more than one solution to this problem (there is, for example, a nice little plugin here: <a href="https://github.com/briannesbitt/play-accesslog">https://github.com/briannesbitt/play-accesslog</a>), but I wanted to share a little snippet of code – one I’ve used plenty of times in my own applications – utilising the RequestHeader tags feature in Play.</p>
<p>Just after starting this post, I noticed another great <a href="http://matthiasnehlsen.com/blog/2013/07/09/transforming-logs-into-information/">article</a> from <a href="https://twitter.com/matthiasnehlsen">@matthiasnehlsen</a> about using <a href="http://kibana.org/">Kibana</a> to store log events generated in Play applications. Kibana is a great tool – backed by the awesome Elasticsearch – and I also highly recommend it. I won’t go into any detail here about sending logs to Kibana, as Matthias has done a great job of that :-) Suffice to say, the information in this article would certainly play nicely with Kibana – I simply want to show how you can extract some useful information from Play to enhance your log events, no matter where you might wish to send them (log files included!).</p>
<h2>RequestHeader tags</h2>
<p>So…back to this feature – <code>RequestHeader</code> tags! It turns out that Play associates some very useful ‘tags’ with the <code>RequestHeader</code> generated for each request. You can get hold of this <code>Map</code> of tags by calling <code>RequestHeader.tags</code>.</p>
<p>Given this learning, we can build a Play <code>EssentialFilter</code> that wraps every request and logs the JSON formatted request information to the Play <code>Logger</code>.</p>
<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="k">object</span> <span class="nc">AccessLogFilter</span> <span class="k">extends</span> <span class="nc">EssentialFilter</span> <span class="o">{</span>
</span><span class='line'>
</span><span class='line'> <span class="k">val</span> <span class="n">dateTimeFormat</span> <span class="k">=</span> <span class="nc">ISODateTimeFormat</span><span class="o">.</span><span class="n">ordinalDateTimeNoMillis</span><span class="o">()</span>
</span><span class='line'>
</span><span class='line'> <span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">next</span><span class="k">:</span> <span class="kt">EssentialAction</span><span class="o">)</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">EssentialAction</span> <span class="o">{</span>
</span><span class='line'> <span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">rh</span><span class="k">:</span> <span class="kt">RequestHeader</span><span class="o">)</span> <span class="k">=</span> <span class="o">{</span>
</span><span class='line'> <span class="k">val</span> <span class="n">startTime</span> <span class="k">=</span> <span class="nc">System</span><span class="o">.</span><span class="n">currentTimeMillis</span><span class="o">()</span>
</span><span class='line'>
</span><span class='line'> <span class="k">def</span> <span class="n">logTime</span><span class="o">(</span><span class="n">result</span><span class="k">:</span> <span class="kt">PlainResult</span><span class="o">)</span><span class="k">:</span> <span class="kt">Result</span> <span class="o">=</span> <span class="o">{</span>
</span><span class='line'> <span class="k">val</span> <span class="n">event</span> <span class="k">=</span> <span class="nc">Json</span><span class="o">.</span><span class="n">obj</span><span class="o">(</span>
</span><span class='line'> <span class="s">"uri"</span> <span class="o">-></span> <span class="n">rh</span><span class="o">.</span><span class="n">uri</span><span class="o">,</span>
</span><span class='line'> <span class="s">"timestamp"</span> <span class="o">-></span> <span class="n">dateTimeFormat</span><span class="o">.</span><span class="n">print</span><span class="o">(</span><span class="k">new</span> <span class="nc">DateTime</span><span class="o">),</span>
</span><span class='line'> <span class="s">"execution_time"</span> <span class="o">-></span> <span class="o">(</span><span class="nc">System</span><span class="o">.</span><span class="n">currentTimeMillis</span><span class="o">()</span> <span class="o">-</span> <span class="n">startTime</span><span class="o">),</span>
</span><span class='line'> <span class="s">"status"</span> <span class="o">-></span> <span class="n">result</span><span class="o">.</span><span class="n">header</span><span class="o">.</span><span class="n">status</span><span class="o">,</span>
</span><span class='line'> <span class="s">"tags"</span> <span class="o">-></span> <span class="nc">Json</span><span class="o">.</span><span class="n">toJson</span><span class="o">(</span><span class="n">rh</span><span class="o">.</span><span class="n">tags</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">entry</span> <span class="k">=></span> <span class="n">entry</span><span class="o">.</span><span class="n">_1</span><span class="o">.</span><span class="n">toLowerCase</span> <span class="o">-></span> <span class="n">entry</span><span class="o">.</span><span class="n">_2</span><span class="o">))</span>
</span><span class='line'> <span class="o">)</span>
</span><span class='line'> <span class="nc">Logger</span><span class="o">.</span><span class="n">info</span><span class="o">(</span><span class="nc">Json</span><span class="o">.</span><span class="n">stringify</span><span class="o">(</span><span class="n">event</span><span class="o">))</span>
</span><span class='line'> <span class="n">result</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'>
</span><span class='line'> <span class="n">next</span><span class="o">(</span><span class="n">rh</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span>
</span><span class='line'> <span class="k">case</span> <span class="n">plain</span><span class="k">:</span> <span class="kt">PlainResult</span> <span class="o">=></span> <span class="n">logTime</span><span class="o">(</span><span class="n">plain</span><span class="o">)</span>
</span><span class='line'> <span class="k">case</span> <span class="n">async</span><span class="k">:</span> <span class="kt">AsyncResult</span> <span class="o">=></span> <span class="n">async</span><span class="o">.</span><span class="n">transform</span><span class="o">(</span><span class="n">logTime</span><span class="o">)</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'><span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>
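<p>For completeness – they’re omitted from the snippet above – the filter needs imports roughly along these lines (assuming the Play 2.1-era APIs and Joda-Time used here):</p>
<figure class='code'><pre><code class='scala'>import org.joda.time.DateTime
import org.joda.time.format.ISODateTimeFormat
import play.api.Logger
import play.api.libs.json.Json
import play.api.mvc._
</code></pre></figure>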
<p>On top of the tags (and some other available request info), you’ll see the execution time is also captured in the JSON object. To enable this filter, we simply create a custom <code>Global.scala</code> in the default package:</p>
<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="k">object</span> <span class="nc">Global</span> <span class="k">extends</span> <span class="nc">WithFilters</span><span class="o">(</span><span class="nc">AccessLogFilter</span><span class="o">)</span> <span class="o">{</span>
</span><span class='line'><span class="o">}</span>
</span></code></pre></td></tr></table></div></figure>
<p>Here is an example log line generated by this filter (prettified for readability):</p>
<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='json'><span class='line'><span class="p">{</span>
</span><span class='line'> <span class="nt">"uri"</span><span class="p">:</span><span class="s2">"/search/*"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"timestamp"</span><span class="p">:</span><span class="s2">"2013-194T18:14:27+01:00"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"execution_time"</span><span class="p">:</span><span class="mi">20</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"status"</span><span class="p">:</span><span class="mi">200</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"tags"</span><span class="p">:{</span>
</span><span class='line'> <span class="nt">"route_verb"</span><span class="p">:</span><span class="s2">"GET"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"route_action_method"</span><span class="p">:</span><span class="s2">"search"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"route_pattern"</span><span class="p">:</span><span class="s2">"/search/$searchString"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"route_controller"</span><span class="p">:</span><span class="s2">"controllers.Application"</span><span class="p">,</span>
</span><span class='line'> <span class="nt">"route_comments"</span><span class="p">:</span><span class="s2">"Root action"</span>
</span><span class='line'> <span class="p">}</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>
<p>One of the most useful things about capturing the tags is that you get easy access to ‘normalized’ actions/endpoints. So, it’s immediately easy (e.g. in a tool like Kibana) to pick all log events related to a specific action. The URI alone is not good enough because parameterized path variables, or query string parameters, lead to non-unique URI representations for the same logical action bindings.</p>
<p>As the filter wraps every request, you will see log lines generated for static assets too – something that is most likely too verbose for most use cases. It’s not too hard to apply filtering to the URI extension to ignore assets, but I left it out of the example code for the sake of cleanliness.</p>
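<p>If you do want to skip assets, here’s a sketch of one simple approach (the extension list is illustrative):</p>
<figure class='code'><pre><code class='scala'>// A sketch: short-circuit the filter for typical static asset requests
val assetExtensions = Set(".js", ".css", ".png", ".gif", ".ico")

def isAsset(rh: RequestHeader): Boolean =
  assetExtensions.exists(rh.path.endsWith)

// In AccessLogFilter.apply, pass such requests straight through:
//   if (isAsset(rh)) next(rh) else { ...log as before... }
</code></pre></figure>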
<p>Keep on logging!</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Reactive, real-time log search with Play, Akka, AngularJS and Elasticsearch]]></title>
<link href="http://DrewEaster.github.io/blog/2013/07/08/reactive-real-time-log-search-with-play-akka-angularjs-and-elasticsearch/"/>
<updated>2013-07-08T12:25:00+01:00</updated>
<id>http://DrewEaster.github.io/blog/2013/07/08/reactive-real-time-log-search-with-play-akka-angularjs-and-elasticsearch</id>
<content type="html"><![CDATA[<h2>Introduction</h2>
<p>So, I’ve decided to contribute an Activator Template to TypeSafe (will submit soon, promise!). Having recently become more and more involved in Elasticsearch, I saw a great opportunity to put together a neat “reactive” application combining Play & Akka with the “bonsai cool” <a href="http://www.elasticsearch.org/guide/reference/api/percolate/">percolation</a> feature of Elasticsearch. Then, to put a cherry on top, use AngularJS on the client-side to create a dynamically updating UI.</p>
<p>What I came up with is slightly contrived – a very basic real-time log entry search tool – but I think it provides a really nice base for apps that want to integrate this bunch of technologies.</p>
<p><a href="http://DrewEaster.github.io/images/real-time-search.png"><img class="center" src="http://DrewEaster.github.io/images/real-time-search.png" width="600" height="264"></a></p>
<p>All the code for the application is available on Github <a href="https://github.com/drewzilla/realtime-search">here</a>. In this post, I’ve attempted to dissect the Activator Template tutorial I’ve written and regurgitate it in blog form.</p>
<h3>Play and Akka</h3>
<p>Play and Akka are used to implement the reactive server-side application. The application favours SSE (Server Sent Events) to push updates to the client. The template introduces a number of interesting topics, including Play Iteratees/Enumerators and Akka Actors.</p>
<h3>AngularJS</h3>
<p>AngularJS has been chosen on the client-side to demonstrate how simple it can be to build a dynamic, single page user experience with very little code.</p>
<h3>Elasticsearch</h3>
<p>The “bonsai cool” percolation feature of Elasticsearch achieves the real-time search aspect of the application. The application starts up an embedded Elasticsearch node, so there is no need to run your own external instance. Take a look at <a href="https://github.com/drewzilla/realtime-search/blob/master/app/utils/EmbeddedESServer.scala">EmbeddedESServer</a> for the embedded server code. There is a custom <a href="https://github.com/drewzilla/realtime-search/blob/master/app/Global.scala">Global</a> where the embedded server is started and shut down as part of the application lifecycle.</p>
<h2>The Actors</h2>
<p>The application has three actors:</p>
<ul>
<li><a href="https://github.com/drewzilla/realtime-search/blob/master/app/actors/MainSearchActor.scala">MainSearchActor</a> supervises and coordinates data flows</li>
<li><a href="https://github.com/drewzilla/realtime-search/blob/master/app/actors/ElasticSearchActor.scala">ElasticsearchActor</a> interacts with Elasticsearch percolation API</li>
<li><a href="https://github.com/drewzilla/realtime-search/blob/master/app/actors/LogEntryProducerActor.scala">LogEntryProducerActor</a> generates ‘random’ log entry data</li>
</ul>
<h3>MainSearchActor</h3>
<p>This actor’s job is to coordinate the reactive parts of the application and supervise the other actors. It is the main dependency of the application’s single Play <a href="https://github.com/drewzilla/realtime-search/blob/master/app/controllers/Application.scala">controller</a>.</p>
<p><strong>Starting/stopping a search</strong></p>
<p>The actor responds to a <code>StartSearch</code> message by ‘replying’ with an <code>Enumerator</code> to the sender. The <code>Enumerator</code> wraps a unicast channel to which log entries matching the query string (sent within the message) are pushed. Let’s take a look at some code:</p>
<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="k">private</span> <span class="k">def</span> <span class="n">startSearching</span><span class="o">(</span><span class="n">startSearch</span><span class="k">:</span> <span class="kt">StartSearch</span><span class="o">)</span> <span class="k">=</span>
</span><span class='line'> <span class="nc">Concurrent</span><span class="o">.</span><span class="n">unicast</span><span class="o">[</span><span class="kt">JsValue</span><span class="o">](</span>
</span><span class='line'> <span class="n">onStart</span> <span class="k">=</span> <span class="o">(</span><span class="n">c</span><span class="o">)</span> <span class="k">=></span> <span class="o">{</span>
</span><span class='line'> <span class="n">channels</span> <span class="o">+=</span> <span class="o">(</span><span class="n">startSearch</span><span class="o">.</span><span class="n">id</span> <span class="o">-></span> <span class="n">c</span><span class="o">)</span>
</span><span class='line'> <span class="n">elasticSearchActor</span> <span class="o">!</span> <span class="n">startSearch</span>
</span><span class='line'> <span class="o">},</span>
</span><span class='line'> <span class="n">onComplete</span> <span class="k">=</span> <span class="o">{</span>
</span><span class='line'> <span class="n">self</span> <span class="o">!</span> <span class="nc">StopSearch</span><span class="o">(</span><span class="n">startSearch</span><span class="o">.</span><span class="n">id</span><span class="o">)</span>
</span><span class='line'> <span class="o">},</span>
</span><span class='line'> <span class="n">onError</span> <span class="k">=</span> <span class="o">(</span><span class="n">str</span><span class="o">,</span> <span class="n">in</span><span class="o">)</span> <span class="k">=></span> <span class="o">{</span>
</span><span class='line'> <span class="n">self</span> <span class="o">!</span> <span class="nc">StopSearch</span><span class="o">(</span><span class="n">startSearch</span><span class="o">.</span><span class="n">id</span><span class="o">)</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'> <span class="o">).</span><span class="n">onDoneEnumerating</span><span class="o">(</span>
</span><span class='line'> <span class="n">callback</span> <span class="k">=</span> <span class="o">{</span>
</span><span class='line'> <span class="n">self</span> <span class="o">!</span> <span class="nc">StopSearch</span><span class="o">(</span><span class="n">startSearch</span><span class="o">.</span><span class="n">id</span><span class="o">)</span>
</span><span class='line'> <span class="o">}</span>
</span><span class='line'> <span class="o">)</span>
</span></code></pre></td></tr></table></div></figure>
<p>The Play Iteratees library has the very handy <code>Concurrent</code> utilities. In this case, <code>Concurrent.unicast</code> is called to create an <code>Enumerator</code> that encloses a <code>Concurrent.Channel</code>. When the channel starts (<code>onStart</code>), it is stored in a map local to the actor (using a UUID as key) and the <code>StartSearch</code> message is forwarded on to the <code>ElasticSearchActor</code>, where the query will be registered for percolation in Elasticsearch. It’s worth noting that this code is not production ready – it ought to be a transactional operation, i.e. we should only store the channel once we know Elasticsearch has successfully registered the query. You will notice that a <code>StopSearch</code> message is sent to <code>self</code> such that the channel is removed from the local map, and the percolated query is deleted, when the channel is no longer useful (i.e. is closed by the client, or an error occurs).</p>
<p><strong>Broadcasting matching results</strong></p>
<p>The actor will receive a <code>SearchMatch</code> message when a log entry has matched a percolated query.</p>
<figure class='code'><pre><code class='scala'>private def broadcastMatch(searchMatch: SearchMatch) {
  // For each matching percolator id, look up the corresponding channel
  // (the client may have disconnected) and push the log entry to it.
  searchMatch.matchingChannelIds.foreach { id =>
    channels.get(id).foreach { channel =>
      channel.push(searchMatch.logEntry.data)
    }
  }
}</code></pre></figure>
<p>On receipt of the message, each matching id is iterated over and the corresponding channel is retrieved from the local map. The log entry is then pushed to the channel, and thus onto the client.</p>
<p><strong>Scheduling log entry creation</strong></p>
<p>The actor uses the Akka scheduler to send itself a <code>Tick</code> every second, which it then forwards on to the <code>LogEntryProducerActor</code> – in the real world this would obviously be unnecessary, as genuine log entries would be fed into the application in some other way.</p>
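<p>The scheduling itself is essentially a one-liner. A sketch of the Akka 2.x API usage, placed in the actor’s <code>preStart</code> (the wiring shown here is illustrative):</p>
<figure class='code'><pre><code class='scala'>// Sketch of the scheduling described above (Akka 2.x scheduler API).
import scala.concurrent.duration._
import context.dispatcher // the scheduler needs an ExecutionContext

override def preStart() {
  // every second, send ourselves a Tick; the receive block then
  // forwards it on to the LogEntryProducerActor
  context.system.scheduler.schedule(0.seconds, 1.second, self, Tick)
}</code></pre></figure>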
<h3>ElasticsearchActor</h3>
<p>This actor has responsibility for both registering queries in Elasticsearch and percolating log entry documents against those queries. Rather than use the Elasticsearch Java client, the code crafts the Elasticsearch API calls manually, demonstrating the use of the asynchronous Play WS API to execute them. For simplicity, the calls are hard-coded to talk to Elasticsearch on localhost:9200 (where the embedded server will be listening).</p>
<p>The code within this actor is fairly self-explanatory. Do note that there is no error handling on the API calls, making this actor unsuitable for production use in its current form. It is recommended you read the Elasticsearch <a href="http://www.elasticsearch.org/guide/reference/api/percolate/">documentation on percolation</a> to learn more about this powerful feature.</p>
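<p>To give a flavour of what crafting the calls manually looks like, registering a query for percolation might resemble the sketch below. The index name, URL shape and JSON body are assumptions based on the 0.90-era percolate API, not the repository’s exact code:</p>
<figure class='code'><pre><code class='scala'>// Hedged sketch: register a query for percolation via the Play WS API
// (Elasticsearch 0.90-era _percolator endpoint). The index name and
// JSON shape are illustrative, and error handling is omitted here too.
import play.api.libs.json.Json
import play.api.libs.ws.WS

def registerQuery(id: String, searchString: String) =
  WS.url(s"http://localhost:9200/_percolator/logentries/$id")
    .put(Json.obj(
      "query" -> Json.obj(
        "query_string" -> Json.obj("query" -> searchString))))</code></pre></figure>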
<p>There’s one important gotcha this code has avoided – closing over the <code>sender</code> ref in an asynchronous callback block. The <code>sender</code> ref is part of the actor’s shared mutable state: if the actor replied to <code>sender</code> inside the percolate callback, and another message had arrived before the Elasticsearch call completed, <code>sender</code> could by then refer to a different actor entirely – a race condition. This is avoided by ‘freezing’ the <code>sender</code> ref, passing it into a private function:</p>
<figure class='code'><pre><code class='scala'>private def percolate(logJson: JsValue, requestor: ActorRef)</code></pre></figure>
<p>and close over the parameter instead.</p>
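<p>In practice this means capturing <code>sender</code> at message-receipt time. A minimal sketch, assuming a hypothetical <code>Percolate</code> message and the 0.90-era <code>_percolate</code> endpoint:</p>
<figure class='code'><pre><code class='scala'>// Sketch: freeze the sender ref before going asynchronous. `Percolate`
// and the endpoint details are illustrative assumptions.
import akka.actor.ActorRef
import play.api.libs.json.{ JsValue, Json }
import play.api.libs.ws.WS
import play.api.libs.concurrent.Execution.Implicits._

def receive = {
  case Percolate(logJson) => percolate(logJson, sender) // bind sender now
}

private def percolate(logJson: JsValue, requestor: ActorRef) {
  WS.url("http://localhost:9200/logentries/logentry/_percolate")
    .post(Json.obj("doc" -> logJson))
    .map { response =>
      requestor ! response.json // safe: `requestor` cannot change under us
    }
}</code></pre></figure>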
<h3>LogEntryProducerActor</h3>
<p>I won’t go into the detail of this actor. Suffice it to say, its job is to generate a random, JSON-formatted log event whenever it receives a <code>Tick</code> message. In reality, a genuine source of log events would replace this actor.</p>
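<p>For the curious, the shape of such an actor is roughly as follows; this is purely illustrative, as the real actor produces richer randomised entries:</p>
<figure class='code'><pre><code class='scala'>// Illustrative only: reply to each Tick with a random JSON log entry.
import akka.actor.Actor
import play.api.libs.json.Json
import scala.util.Random

case object Tick // defined once in the real application

class LogEntryProducerActor extends Actor {
  private val levels = Seq("DEBUG", "INFO", "WARN", "ERROR")

  def receive = {
    case Tick =>
      sender ! Json.obj(
        "timestamp" -> System.currentTimeMillis,
        "level"     -> levels(Random.nextInt(levels.size)),
        "message"   -> s"sample log line ${Random.nextInt(1000)}"
      )
  }
}</code></pre></figure>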
<h2>The Play Controller</h2>
<p>As most of the server-side logic lives in the actors, the single Play controller is very simple. The most interesting aspect of the controller is the action that opens an event stream to the client:</p>
<figure class='code'><pre><code class='scala'>def search(searchString: String) = Action {
  Async {
    (searchActor ? StartSearch(searchString = searchString)).map {
      case SearchFeed(out) => Ok.stream(out &> EventSource()).as("text/event-stream")
    }
  }
}</code></pre></figure>
<p>The most important thing to note is the use of the Akka ‘ask’ pattern of message exchange (notice the use of ‘?’ instead of ‘!’). This differs from the more typical fire-and-forget approach in that we’re able to asynchronously pick up a reply from the recipient actor. In this scenario, a <code>StartSearch</code> message is sent to the <code>MainSearchActor</code>, which replies with an <code>Enumerator</code> used to stream search results to the client. Given the use of the ‘ask’ pattern, we wrap the action logic in an <code>Async</code> block – so as not to hold up other requests – rather than blocking until the <code>Future</code> yields a result.</p>
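<p>One detail the snippet glosses over: <code>akka.pattern.ask</code> must be imported to enable ‘?’, and an implicit <code>Timeout</code> must be in scope. Roughly (the five-second value is an arbitrary illustration):</p>
<figure class='code'><pre><code class='scala'>// Prerequisites for the ask pattern (Play 2.1 / Akka 2.1 era).
import akka.pattern.ask                               // enables `?`
import akka.util.Timeout
import scala.concurrent.duration._
import play.api.libs.concurrent.Execution.Implicits._ // for `map`

implicit val timeout = Timeout(5.seconds) // how long to await the reply</code></pre></figure>
<p>The reply arrives as a <code>Future[Any]</code>, hence the pattern match on <code>SearchFeed</code> in the action; <code>mapTo[SearchFeed]</code> is an alternative that fails the <code>Future</code> cleanly on an unexpected reply type.</p>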
<h2>The User Interface</h2>
<p>The key parts of the application UI are:</p>
<ol>
<li>A Play <a href="https://github.com/drewzilla/realtime-search/blob/master/app/views/index.scala.html">template</a> with AngularJS specific markup</li>
<li>A single AngularJS <a href="https://github.com/drewzilla/realtime-search/blob/master/app/assets/javascripts/controllers.js">controller</a></li>
</ol>
<p>The application makes use of the <a href="http://www.webjars.org/">WebJars</a> project to simplify the introduction of its JS and CSS dependencies (e.g. AngularJS and Twitter Bootstrap).</p>
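<p>Declaring those dependencies amounts to a line each in the sbt build; a sketch in the Play 2.1 <code>Build.scala</code> style (artifact versions are illustrative, not necessarily the project’s):</p>
<figure class='code'><pre><code class='scala'>// Hypothetical WebJars dependencies; versions are illustrative.
val appDependencies = Seq(
  "org.webjars" % "webjars-play" % "2.1.0",
  "org.webjars" % "angularjs" % "1.0.5",
  "org.webjars" % "bootstrap" % "2.3.1"
)</code></pre></figure>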
<h3>UI Template</h3>
<p>Firstly, the opening <code><div></code> is linked to the controller <code>SearchCtrl</code>, which enables the automagical databinding power of AngularJS. A simple search form captures an Apache Lucene formatted query string. A search is started by clicking the ‘Search’ button, which invokes the <code>startSearching()</code> function defined in the controller. Finally, you can see the use of AngularJS two-way databinding to render matching search results held in the view model (only the latest 10 matches are displayed):</p>
<figure class='code'><pre><code class='html'><tr ng-repeat="searchResult in searchResults | limitTo:10"></code></pre></figure>
<h3>The AngularJS Controller</h3>
<p>The AngularJS controller is fairly straightforward. The key part is handling a new search:</p>
<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='scala'><span class='line'><span class="nc">$scope</span><span class="o">.</span><span class="n">startSearching</span> <span class="k">=</span> <span class="n">function</span> <span class="o">()</span> <span class="o">{</span>
</span><span class='line'> <span class="nc">$scope</span><span class="o">.</span><span class="n">stopSearching</span><span class="o">()</span>
</span><span class='line'> <span class="nc">$scope</span><span class="o">.</span><span class="n">searchResults</span> <span class="k">=</span> <span class="o">[];</span>