<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Will Eatherton</title>
<link>https://willeatherton.com/</link>
<atom:link href="https://willeatherton.com/index.xml" rel="self" type="application/rss+xml" />
<description>Will Eatherton</description>
<generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>Copyright Will Eatherton</copyright><lastBuildDate>Tue, 05 Sep 2017 14:00:29 -0800</lastBuildDate>
<image>
<url>https://willeatherton.com/media/icon_hu3ac9caf249b0f5408f238ccac1a4254f_26012_512x512_fill_lanczos_center_2.png</url>
<title>Will Eatherton</title>
<link>https://willeatherton.com/</link>
</image>
<item>
<title>Opaque Crypto : What happens when the Enterprise Network goes Dark ?</title>
<link>https://willeatherton.com/post/crypto1v4/</link>
<pubDate>Tue, 05 Sep 2017 14:00:29 -0800</pubDate>
<guid>https://willeatherton.com/post/crypto1v4/</guid>
<description><p><em><strong>The Rise of Opaque encryption Resets the Network Security Industry</strong></em></p>
<p><em>{Note this writeup was done in summer of 2017 while I was at Skyport Systems as Founder and VP of Engineering}</em></p>
<p>During my time at Cisco and Juniper (1999-2013), our marketing departments unveiled many campaigns around the &ldquo;Intelligent Network.&rdquo; To deliver on this message, we kept building products and features that would use stateful views of deep packet data to help reinforce the marketing story about the value of the network. During the same time, a multi-billion-dollar industry grew out of providing visibility into and control of enterprise traffic, including next-generation firewalls, features in routers, IPS/IDS, UTM, web proxies, data loss prevention, anti-virus, and secure web gateways.</p>
<p>We knew that most of the features we were adding would become irrelevant if the day came when all the data flowing through our devices was encrypted, but encryption was uncommon at the time and that seemed a very distant possibility.</p>
<p><strong>Fast forward to 2017.</strong> Not only is encryption here with a vengeance, it will soon cause most network traffic to go dark from the point of view of IT departments. This year the threshold was crossed, with <a href="https://www.wired.com/2017/01/half-web-now-encrypted-makes-everyone-safer/" target="_blank" rel="noopener">over 50% of web traffic now using HTTPS</a>. We are also seeing the rise of new encryption protocols that do not accommodate man-in-the-middle (MITM) monitoring. As a result, most traffic will speed by, opaque to inspection from the standpoint of IT and security administrators.</p>
<p>These MITM boxes (arguably) own one side of each leg in internal data communications, on the definition that the enterprise IT department owns all client/server endpoints in its network. For years MITM proxies have terminated encrypted sessions and re-originated them, which makes everything visible. With the new protocols, however, they will increasingly be able to gather only superficial intelligence from the traffic headers that still flow through them in clear text.</p>
<h2 id="mitm-today">MITM today</h2>
<p>Before launching into new developments with encryption, here are the main uses of traffic inspection today:</p>
<ul>
<li>Malware and botnet detection to contain the lateral spread of malware and reach-out from enterprise to command and control points</li>
<li>Authentication/validation of access to critical segments and systems</li>
<li>Detecting policy violations related to content and data access</li>
<li>Detecting data exfiltration (theft)</li>
<li>General visibility and analytics</li>
</ul>
<p>Over the past two decades, MITM boxes evolved as invisible inspectors (transparent bumps in the wire) in the enterprise network. This model was relatively easy to operationalize and vendors provided a range of product options.</p>
<p>Today, there are two main scenarios for deploying MITM boxes. The left side of the diagram below shows traditional inspection of traffic between enterprise servers inside the perimeter and external points like clients, SaaS and IaaS services. The right side of the diagram shows the trend over the past 5-7 years towards deploying MITM devices inside the enterprise to provide visibility and segment the enterprise internally. For these internal segmentation cases, the network security boxes can intercept traffic between servers, or across boundaries that connect servers to databases and storage.</p>
<figure id="figure-man-in-the-middle-locations-in-enterprises-today">
<div class="figure-img-wrap" >
<img alt="Man-in-The-Middle locations in Enterprises Today" srcset="
/post/crypto1v4/Crypto1_connectivity2_hud13f2bb1926a9cdc7426ea3f09f24d17_669937_f0f0caee738fd680a11bcec1d95adf0f.png 400w,
/post/crypto1v4/Crypto1_connectivity2_hud13f2bb1926a9cdc7426ea3f09f24d17_669937_757cf95801b92a97897480e1a1c7e029.png 760w,
/post/crypto1v4/Crypto1_connectivity2_hud13f2bb1926a9cdc7426ea3f09f24d17_669937_1200x1200_fit_lanczos_2.png 1200w"
src="https://willeatherton.com/post/crypto1v4/Crypto1_connectivity2_hud13f2bb1926a9cdc7426ea3f09f24d17_669937_f0f0caee738fd680a11bcec1d95adf0f.png"
width="760"
height="337"
loading="lazy" data-zoomable /></div><figcaption>
Man-in-The-Middle locations in Enterprises Today
</figcaption></figure>
<h2 id="what-is-opaque-crypto">What is Opaque Crypto</h2>
<p>Opaque cryptography covers a broad set of developments that is heavily focused on HTTPS but also includes other protocol stacks, as shown below.</p>
<table>
<thead>
<tr>
<th>Connection Between</th>
<th style="text-align:left">New Developments</th>
</tr>
</thead>
<tbody>
<tr>
<td>Application&lt;-&gt;Database</td>
<td style="text-align:left">Mongo, Postgres, etc. with TLS</td>
</tr>
<tr>
<td>Application&lt;-&gt;Storage</td>
<td style="text-align:left">SMBv3, NFS4.1 with Kerberized Crypto</td>
</tr>
<tr>
<td>Application&lt;-&gt;Application</td>
<td style="text-align:left">Increased ease of use of TLS Libraries</td>
</tr>
<tr>
<td>Client to Server</td>
<td style="text-align:left">HTTP/2, TLS1.3, QUIC</td>
</tr>
<tr>
<td>IaaS to On-Premises</td>
<td style="text-align:left">Encrypted VPNs with IPSEC</td>
</tr>
</tbody>
</table>
<p>New developments in network traffic encryption</p>
<p>Without doing a deep dive on each of these new developments, we can categorize the trends that make it hard to leverage the existing proxy technologies :</p>
<ul>
<li>A general trend towards use of <a href="http://www.informationsecuritybuzz.com/articles/the-pitfalls-of-perfect-forward-secrecy/" target="_blank" rel="noopener">Perfect Forward Secrecy (PFS) and acceptance of it as a valid goal for encryption standards</a>. PFS is a property of a key exchange that deliberately prevents after-the-fact decryption and defeats all passive MITM technologies.</li>
<li>Increased use of non-SSL/TLS encryption protocols with no accompanying MITM infrastructure. For example, Google&rsquo;s QUIC, a UDP-based alternative to the usual TCP plus TLS stack, is one where vendors are lagging behind (they are currently unable to support it).</li>
<li>Multi-path to improve performance. Multi-path makes decryption difficult because a single MITM box only sees a portion of the needed state.</li>
<li>Compressed header state and fewer clear text headers. This makes it difficult to determine what traffic to decrypt.
<ul>
<li>TLS 1.3 does provide a clear text extension (SNI) to communicate a target server&rsquo;s host name. While it seems useful for decryption determination, it can be easily spoofed.</li>
</ul>
</li>
<li>Increased use of certificate and public key pinning, which are hard blocks to traditional transparent MITM.
<ul>
<li>Traditionally if you read network security vendor documents the response to this issue was that SaaS vendors simply will not risk their own business by pinning.</li>
<li>However, the submissiveness of SaaS vendors is changing with a growing number of services that cross consumer and enterprise boundaries, as well as growing number of SaaS vendors that value the trust of the end users over friction with IT admins.</li>
</ul>
</li>
<li>TCP connection re-use (enabled through HTTP/2 and TLS1.3) complicates state machines for MITM devices.</li>
<li>Technologies that now cross the enterprise perimeter but were never designed to have network security inserted transparently in the middle, for example IPSEC based tunnels.</li>
</ul>
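<p>To make the first item in the list above concrete, here is a toy sketch of an ephemeral Diffie-Hellman exchange, the kind of key exchange that provides PFS. The prime, the generator, and the function names are illustrative assumptions, and the parameters are far too simple for real use. The point is only that each session key is derived from throwaway private values that are never sent and never stored, so a passive box holding the server&rsquo;s long-term private key has nothing with which to decrypt a recorded session.</p>

```python
import hashlib
import secrets

# Toy group parameters (an assumption for illustration, not a vetted group).
P = 2**127 - 1   # a Mersenne prime
G = 3


def new_ephemeral_keypair():
    """Return a fresh (private, public) pair that is used for one session only."""
    private = secrets.randbelow(P - 3) + 2   # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(my_private, peer_public):
    """Derive the session key from my private value and the peer's public value."""
    shared = pow(peer_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).hexdigest()


def run_session():
    """Simulate one handshake; only the two public values would cross the wire."""
    client_private, client_public = new_ephemeral_keypair()
    server_private, server_public = new_ephemeral_keypair()
    client_key = session_key(client_private, server_public)
    server_key = session_key(server_private, client_public)
    return client_key, server_key


client_key, server_key = run_session()
print(client_key == server_key)
```

<p>Both sides arrive at the same key, and once the private values are discarded at the end of the session there is nothing left that a middlebox could later be given to recover it.</p>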
<p>Reading through the above list, what is interesting is that while it seems reasonable that many of these developments could be countered to some degree by network security vendors, taken as a sum total the level of disruption in such a short time is staggering.</p>
<h2 id="forces-resulting-in-opaque-crypto">Forces resulting in Opaque Crypto</h2>
<p>It is frequently called out that the rise of machine learning in the past few years is due to the simultaneous occurrence of 3 factors : 1) significant increases in computation, 2) availability of massive data sets, and 3) improvement in fundamental algorithms.</p>
<p>There is a parallel set of forces that are causing a rise of encryption in a form that removes visibility (i.e. Opaque Crypto). These forces which are detailed in following sub-sections are :</p>
<ol>
<li>Consumer web companies now lead definition of Web protocols</li>
<li>The cost of encryption goes to zero</li>
<li>Cloud ripple effects</li>
</ol>
<h3 id="force-1--consumer-web-companies-now-lead-definition-of-web-protocols">Force #1 : Consumer web companies now lead definition of Web protocols</h3>
<p>At one time, vendors like Cisco, working with service providers and enterprise customers, drove the key web and internet standards. The key force driving the technology choices at the end of the day was what would sell more routers, switches, and networking security boxes.</p>
<p>While the fact that technology gets driven for economic ends has not changed, the special interests now being prioritized are those of web companies that want to minimize costs and maximize value to their consumer customers. In the pursuit of pro-consumer profits, web properties focus on characteristics like low latency, high throughput, good user experience, and trust.</p>
<p>On the topic of trust, consumer web companies don&rsquo;t respect the MITM model that financials so love, because in the consumer arena there is no legitimate reason to insert a MITM (only bad reasons like stealing credit card numbers). Additionally, in the post Edward Snowden era, the generally negative public reaction to web companies supporting snooping of any kind has put additional pressure on these companies to resist intercept models. The result is that web companies believe earning trust requires strong end-to-end security models, and in the pursuit of profit, that is what they are delivering.</p>
<p>A good example of consumer web companies' hostility towards MITM architecture is a <a href="https://jhalderm.com/pub/papers/interception-ndss17.pdf" target="_blank" rel="noopener">2017 paper that includes authors from Google and Cloudflare</a> that essentially condemns HTTPS interception.</p>
<figure id="figure-paper-condemning-https-intercept">
<div class="figure-img-wrap" >
<img alt="Paper condemning HTTPS Intercept" srcset="
/post/crypto1v4/Crypto1_InterceptPaper_hua2088b3faa95e2c4d0bd5f092b0b7fb6_473234_9284bfa40a752588e5392cc5ee1bd6e5.png 400w,
/post/crypto1v4/Crypto1_InterceptPaper_hua2088b3faa95e2c4d0bd5f092b0b7fb6_473234_82d9434bc6254ffe62f55b8484174fce.png 760w,
/post/crypto1v4/Crypto1_InterceptPaper_hua2088b3faa95e2c4d0bd5f092b0b7fb6_473234_1200x1200_fit_lanczos_2.png 1200w"
src="https://willeatherton.com/post/crypto1v4/Crypto1_InterceptPaper_hua2088b3faa95e2c4d0bd5f092b0b7fb6_473234_9284bfa40a752588e5392cc5ee1bd6e5.png"
width="760"
height="259"
loading="lazy" data-zoomable /></div><figcaption>
Paper condemning HTTPS Intercept
</figcaption></figure>
<p>This paper is the result of a massive investment. The authors tested a vast combination of security products and software versions from top vendors, and offer this finding in summary:</p>
<blockquote>
<p>Most concerningly, 62% of traffic that traverses a network middlebox has reduced security and 58% of middlebox connections have severe vulnerabilities. We investigated popular antivirus and corporate proxies, finding that nearly all reduce connection security and that many introduce vulnerabilities</p>
</blockquote>
<p>After publication of this paper, <a href="https://www.us-cert.gov/ncas/alerts/TA17-075A" target="_blank" rel="noopener">US-CERT issued Alert TA17-075A</a>, which cites it and gives a similar warning about the risks of HTTPS interception.</p>
<p>In recent discussions with a household name financial about these trends, the customer noted that they were talking to Google about the disruption that standards like Google&rsquo;s QUIC/TLS1.3 could cause across their industry due to these standards' implicit push against interception. While the frustration is understandable given that financials are heavily invested in the current model for monitoring/protecting their perimeter, the new direction looks clear barring major government interference.</p>
<h3 id="force-2--the-cost-of-encryption-goes-to-zero">Force #2 : The cost of encryption goes to zero</h3>
<p>Though not strictly a cause, a drop in the cost of encryption was necessary to ease its adoption. Improvements to key encryption standards in hardware, and more importantly in software, have made it easier to support very strong encryption algorithms, and at a level where performance is no longer an issue for either the client or server side of a connection.</p>
<p>Improvements like Intel&rsquo;s AES-NI (Advanced Encryption Standard New Instructions) speed up encryption, and there are ongoing efforts to improve all aspects of TLS performance, including <a href="https://netdevconf.org/1.2/papers/ktls.pdf" target="_blank" rel="noopener">making TLS a kernel service</a>.</p>
<h3 id="force-3--cloud-ripple-effects">Force #3 : Cloud ripple effects</h3>
<p>This force was harder to pin down, but outside of web traffic and the falling cost of encryption there was clearly at least one more force driving the rise of opaque encryption. Two example trends that are not explained by the two forces above are the recent moves towards <a href="https://whyistheinternetbroken.wordpress.com/tag/nfsv4-1/" target="_blank" rel="noopener">encrypted storage traffic with NFS4.1</a> using kerberized encryption as well as increased use of <a href="https://technet.microsoft.com/en-us/library/dn551363%28v=ws.11%29.aspx" target="_blank" rel="noopener">SMB 3.0 with encryption</a>.</p>
<p>So why is the above happening? Let&rsquo;s start with the assumption that data and workloads without notable business value will continue moving quickly to public cloud. If you are an enterprise vendor of ISV software or systems, you are flailing around, looking to counter this trend with new features and to push the notion that your customers should be keeping important stuff in enterprise owned data centers. This leads to features like encryption of data at rest, as well as encrypted data traffic for storage, databases and applications.</p>
<p>Beyond these vendor driven features, cloud in the form of SaaS has produced thousands of enterprise SaaS services. Beyond client connections, these services most often involve agents or VMs sitting in the enterprise data center that reach out to their cloud service origin (most commonly on AWS) to make some sort of data connection and do data integrations. These SaaS services are following in the footsteps of their consumer web brethren, and all will be increasingly opaque.</p>
<p>Finally, going to the core trend of IaaS adoption in the enterprise, services like AWS are resulting in many (poorly managed) enterprises sprinkling VPC VPN gateways all over their internal network. Each of these endpoints renders the outer perimeter of the enterprise meaningless, as each is a bi-directional gateway between an entire world encapsulated in the AWS VPC and the heart of the data center. And again, all of this is encrypted traffic that is not being proxied by those HTTPS interception MITM devices.</p>
<h2 id="enterprise-vendors-responses-to-opaque-crypto">Enterprise vendors' responses to opaque crypto</h2>
<p>The network security industry has largely been quiet about the trends discussed here. Some companies like <a href="https://www.theregister.co.uk/2017/02/27/blue_coat_chokes_on_chrome_encryption_update/" target="_blank" rel="noopener">Bluecoat have been visible in their struggles to keep up with protocols like TLS1.3</a>. But overall there has been little substantive industry-wide discussion of the issues or possible solutions.</p>
<p>One exception is Cisco, which has highlighted <a href="https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=90863&amp;tclass=popup" target="_blank" rel="noopener">several issues with opaque crypto</a> and advocated a potential solution. In mid-2017 Cisco introduced <a href="https://www.cisco.com/c/dam/en/us/solutions/collateral/enterprise-networks/enterprise-network-security/nb-09-encrytd-traf-anlytcs-wp-cte-en.pdf" target="_blank" rel="noopener">Cisco Encrypted Traffic Analytics</a>, and they deserve credit for being one of the only established security vendors to call out this issue. They also deserve credit for <a href="https://arxiv.org/pdf/1607.01639.pdf" target="_blank" rel="noopener">publishing academic papers</a> related to their solutions and <a href="https://github.com/cisco/joy" target="_blank" rel="noopener">making available software for feature extraction from packet traces, plus an analysis tool on github</a> that seem related to their for-profit product.</p>
<p>So what does Cisco propose? As their graphic below shows, first they look at clear text TLS information (also known as TLS fingerprinting). This can be as simple as the order of presentation of available bulk encryption algorithms, but of course is usually more complex. Also, they offer nice real time data collection of the TLS packet streams across a variety of extracted features, including old favorites like packet size and inter-packet gap.</p>
<figure id="figure-cisco-encrypted-analytics">
<div class="figure-img-wrap" >
<img alt="Cisco Encrypted Analytics" srcset="
/post/crypto1v4/Crypto1_ciscoencrypted_hu72fa74da76a42610dbe996e32ed2fdaa_161471_90020991c3f9ded8fdb6527302bb5145.png 400w,
/post/crypto1v4/Crypto1_ciscoencrypted_hu72fa74da76a42610dbe996e32ed2fdaa_161471_1ba66bfbd197c9547f4d588efa079bfd.png 760w,
/post/crypto1v4/Crypto1_ciscoencrypted_hu72fa74da76a42610dbe996e32ed2fdaa_161471_1200x1200_fit_lanczos_2.png 1200w"
src="https://willeatherton.com/post/crypto1v4/Crypto1_ciscoencrypted_hu72fa74da76a42610dbe996e32ed2fdaa_161471_90020991c3f9ded8fdb6527302bb5145.png"
width="760"
height="244"
loading="lazy" data-zoomable /></div><figcaption>
Cisco Encrypted Analytics
</figcaption></figure>
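<p>The fingerprinting idea described above, using the order in which a client presents its clear text handshake options as a signature, can be sketched in a few lines. This is a simplified illustration and not Cisco&rsquo;s algorithm: the function name, the choice of fields, and the sample values are all assumptions, and real fingerprints draw on more of the handshake.</p>

```python
import hashlib


def client_hello_fingerprint(tls_version, cipher_suites, extensions):
    """Hash clear-text ClientHello fields, preserving the order they were offered in."""
    fields = [
        str(tls_version),
        "-".join(str(c) for c in cipher_suites),
        "-".join(str(e) for e in extensions),
    ]
    return hashlib.sha256(",".join(fields).encode("ascii")).hexdigest()


# Two hypothetical clients offering the same cipher suites in a different order.
known_browser = client_hello_fingerprint(771, [4865, 4866, 49195], [0, 10, 11])
unknown_client = client_hello_fingerprint(771, [49195, 4865, 4866], [0, 10, 11])

print(known_browser == unknown_client)
```

<p>Because the ordering is part of the hash, two clients that support identical algorithms but list them differently produce different fingerprints, which is what lets an observer tell one TLS stack from another without decrypting anything.</p>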
<p>But while Cisco deserves credit, there seem to be fundamental issues with their offering.</p>
<ul>
<li>The proposal focuses on malware detection, which overlooks other goals of MITM proxies such as policy violations related to data access, detecting data exfiltration, and validating/authenticating access to management segments and systems.</li>
<li>As mentioned above, TLS 1.3 and other new protocols are becoming more opaque. Cisco&rsquo;s approach will likely become less relevant as TLS1.3 reduces the available clear text and shrinks the set of supported cipher suites, both of which make TLS fingerprinting more difficult.</li>
<li>Cisco&rsquo;s solution seems to focus on TLS and omit technologies like IPSEC VPNs, Kerberized file transfer protocols, and QUIC.</li>
<li>In the past, solutions like malware detonation were effective <a href="https://www.sans.org/reading-room/whitepapers/forensics/detecting-malware-sandbox-evasion-techniques-36667" target="_blank" rel="noopener">until bad actors learned how to do end runs around the technology</a>. By extension, it seems likely that once hackers see the solution, they will learn to avoid TLS fingerprinting, matching packet sizes and gaps, and other &ldquo;tells&rdquo; to appear similar to known applications.</li>
<li>Cisco&rsquo;s solution today depends on data collection using Cisco technology collector points.</li>
</ul>
<p>Despite these issues, Cisco&rsquo;s proposal adds value. It would help to know what companies like Palo Alto Networks, Bluecoat, F5, and Checkpoint have to say on this topic.</p>
<p>As a final note, back in 2012 <a href="https://datatracker.ietf.org/meeting/84/materials/slides-84-tls-3" target="_blank" rel="noopener">an IETF presentation by Cisco Fellow David McGrew</a> pointed out many of the issues with HTTPS interception discussed here. Since that time, the new trends have only pushed it further off the cliff.</p>
<h1 id="the-shape-of-the-future">The shape of the future</h1>
<p>What do the next few years hold, and what are practical solutions for enterprise-wide visibility and policy enforcement?</p>
<ul>
<li>The transition to opaque crypto will start in earnest in 2018, but will take years to complete. Mini-industries will spring up to keep legacy protocols alive for the finance industry for another 10 years. But for most enterprises, simple to deploy central policy visibility and enforcement will go away and companies will adapt.</li>
<li>There might be a rise of non-transparent proxies for client endpoints in the enterprise. But non-transparent proxies are known to be relatively difficult to maintain, so it is unclear if this will gain widespread adoption. Enterprises may increasingly consider client endpoints to be lost causes, and establish a goal to heavily segment them from business applications. To deal with rogue clients, use of frequent rebuild/replace of client software could be more common.</li>
<li>For applications running on server endpoints in enterprise data centers, there will be an ongoing trend towards applying policy to the application instead of the network. Policy (for example, what entities this application can talk to or receive communication from), will increasingly be a part of a developer&rsquo;s definition of an application&ndash;similar to software dependencies, data models, and API definitions. <sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></li>
</ul>
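<p>As a rough sketch of the last point, policy that travels with the application definition could look something like the following. The class, the field names, and the three sample applications are hypothetical, chosen only to show the shape of the idea: each application declares who it may call and who may call it, and a connection is allowed only when both declarations agree.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppPolicy:
    """Communication policy declared alongside the application itself."""
    name: str
    may_call: frozenset = frozenset()
    may_be_called_by: frozenset = frozenset()


def connection_allowed(source, destination):
    """Allow a connection only if both applications' declarations agree."""
    return (destination.name in source.may_call
            and source.name in destination.may_be_called_by)


# Hypothetical three-tier application.
web = AppPolicy("web", may_call=frozenset({"orders"}))
orders = AppPolicy("orders", may_call=frozenset({"orders-db"}),
                   may_be_called_by=frozenset({"web"}))
orders_db = AppPolicy("orders-db", may_be_called_by=frozenset({"orders"}))

print(connection_allowed(web, orders))
print(connection_allowed(web, orders_db))
```

<p>Here the web tier can reach the orders service but not the database directly, and nothing in that decision depends on a box in the network being able to read the traffic.</p>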
<p>The trend towards opaque encryption will produce a stronger model for secure communication. But while the networking industry resets, vendors and enterprises will struggle to find the new normal.</p>
<section class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1" role="doc-endnote">
<p>Skyport Systems was acquired by Cisco Systems in early 2018. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</section>
</description>
</item>
<item>
<title>ARM CPU's for Data Center Servers</title>
<link>https://willeatherton.com/post/armanalysis/</link>
<pubDate>Wed, 05 Sep 2012 14:00:29 -0800</pubDate>
<guid>https://willeatherton.com/post/armanalysis/</guid>
<description><p><em>{Note this writeup was done privately at the end of 2012 and so is pretty outdated, but I still think it was a decent summary of the state of ARM at that time, so I am publishing it for fun. I am considering doing an updated ARM analysis given the changes that have happened in the past year}</em></p>
<p><em>Will Eatherton - willeatherton@gmail.com</em></p>
<details class="toc-inpage d-print-none " open>
<summary class="font-weight-bold">Table of Contents</summary>
<nav id="TableOfContents">
<ul>
<li><a href="#exec-summary">Exec Summary</a></li>
<li><a href="#challenges-for-arm-as-server-cpu-isa">Challenges for ARM as Server CPU ISA</a>
<ul>
<li><a href="#os-support-optimizations">OS Support Optimizations</a></li>
<li><a href="#compilation-tool-chains">Compilation Tool Chains</a></li>
<li><a href="#hypervisor-support">Hypervisor Support</a></li>
<li><a href="#java-virtual-machines">Java Virtual Machines</a></li>
</ul>
</li>
<li><a href="#cpu-level-analysis-of-arm-vs-x86">CPU level Analysis of ARM vs x86</a></li>
<li><a href="#analysis-of-arm-cpu-for-a-mr-compute-node">Analysis of Arm CPU for a M&amp;R Compute Node</a></li>
</ul>
</nav>
</details>
<h2 id="exec-summary">Exec Summary</h2>
<p>The buzz around ARM based CPUs for the server ecosystem has been growing notably in mid 2012. Gartner predicts that ARM servers will own 15 percent of the server CPU market within four years.</p>
<p>Numerous ARM based CPUs are planned for the server market: ST Micro; <a href="http://www.eetimes.com/electronics-news/4230166/AMCC-demos-64-bit-ARM-server-chip" target="_blank" rel="noopener">Applied Micro</a>, which has an early 64-bit prototype and seems to be leading support for ARM v8; <a href="http://www.marvell.com/embedded-processors/armada-xp/" target="_blank" rel="noopener">Marvell</a>, which was early with a 32-bit server focused ARM processor; <a href="http://arstechnica.com/information-technology/2012/10/amd-announces-arm-based-opteron-cpus-due-to-launch-in-2014/" target="_blank" rel="noopener">AMD</a>, which has announced 64-bit ARM CPUs by 2014; <a href="http://www.eetimes.com/electronics-news/4398058/ARM--LSI-on-chip-link-connects-up-to-32-cores" target="_blank" rel="noopener">LSI</a>; and startup <a href="http://www.calxeda.com" target="_blank" rel="noopener">Calxeda</a>. Additionally there are indications within the industry that IBM/Samsung may be considering opportunities in the server space using ARM (maybe more as SOC designs with partners).</p>
<ul>
<li>
<p>Beyond silicon plays, early stage startups like <a href="http://www.netspeedsystems.com" target="_blank" rel="noopener">Netspeed</a> are looking at enabling the growing number of players interested in high end multi-core ARM SOCs by providing a coherent interconnect</p>
</li>
<li>
<p><a href="http://www.arm.com/about/newsroom/arm-announces-new-high-performance-system-ip-to-address-demand-for-energy-efficient-many-core.php" target="_blank" rel="noopener">ARM is also working on a coherent interconnect</a> for massively multi-core designs with support for 16 cores going to 32 in the future</p>
</li>
</ul>
<p>From a systems standpoint <a href="http://content.dell.com/us/en/enterprise/d/campaigns/project-copper" target="_blank" rel="noopener">Dell</a> has announced an ARM based server starting with a 32-bit version, and <a href="http://liliputing.com/2012/06/hp-project-moonshot-drops-arm-will-use-intel-atom-servers.html" target="_blank" rel="noopener">HP has indicated</a> that their proof of concept project (Moonshot) will support ATOM and ARM based CPUs.</p>
<ul>
<li>
<p>Beyond the system vendors, mega data center operators like <a href="http://www.eetimes.com/electronics-news/4400339/Linaro-ARM-server-efforts-targets-Linux-code" target="_blank" rel="noopener">Facebook have also shown interest</a> in helping to support ARM insertion into the data center</p>
</li>
<li>
<p>Note that for system vendors as well as DIY Hyperscale DC integrators like Facebook, support of ARM in DC is somewhat self serving for the basic reason of providing negotiating leverage with Intel on pricing</p>
</li>
</ul>
<p>Looking at the specific application of a Map and Reduce focused compute node, the question is what an ARM CPU based server could look like in 2014/2015 and whether it would add any notable differentiation benefits vs. Intel. The analysis below focuses on exploring implications of 64-bit ARM cores used in the data center for M&amp;R applications.</p>
<p><strong>Conclusion</strong> : The analysis below concludes that ARM based servers for the target application do not seem worth the risks at this point. If future projected ARM solutions could show 3-5x increase in final performance after all factors (including software) in same power profile then it would be worth revisiting.</p>
<h2 id="challenges-for-arm-as-server-cpu-isa">Challenges for ARM as Server CPU ISA</h2>
<p>Before getting into a low level performance/hardware analysis of ARM as a server CPU, it is important to look at what the implications of replacing x86 in data center with ARM ISA will have on the ecosystem for Software/System architecture.</p>
<h3 id="os-support-optimizations">OS Support Optimizations</h3>
<p>There are clearly gaps today in strong generalized Linux support for the upcoming ARM silicon for the DC. There is a recent initiative termed <a href="http://www.eetimes.com/electronics-news/4400339/Linaro-ARM-server-efforts-targets-Linux-code" target="_blank" rel="noopener">Linaro that focuses on fleshing out full Linux support for ARM</a>, starting with boot sequences and going from there. Beyond base driver level support, a major area of concern for performance oriented software developers is that the extensive tooling for benchmarking and tuning Linux on x86 will have to mature on ARM.</p>
<h3 id="compilation-tool-chains">Compilation Tool Chains</h3>
<p>It has been common wisdom for a number of years that Intel&rsquo;s optimizing compiler (ICC) for C/C++ is one of the better compilers in the industry, and it again represents the level of multi-year optimization around x86. In <a href="http://www.behardware.com/articles/847-1/the-impact-of-compilers-on-x86-x64-cpu-architectures.html" target="_blank" rel="noopener">recent studies in 2012</a>, ICC continues to show a notable benefit over alternatives like GCC and Microsoft&rsquo;s compiler. From the perspective of Hadoop (a Java application), the impact of not having the Intel C/C++ compiler may not be major from a performance/memory standpoint, but it is representative of the types of optimization issues that ARM will have entering widespread data center use.</p>
<h3 id="hypervisor-support">Hypervisor Support</h3>
<p>ARMv7/v8 provide some support for hardware and I/O virtualization, which is expected to be available in silicon by 2014. Paravirtualization, a technique that presents software interfaces up to virtual machines (by intercepting certain instructions from the guest OS and interpreting them differently for the actual hardware), is used to compensate for the remaining gaps.</p>
<ul>
<li>Note that maintaining paravirtualization with consistent performance and reliability can be difficult over time.</li>
</ul>
<p><a href="http://www.wired.com/wiredenterprise/2012/08/vmware-on-arm/">VMware has indicated pretty bluntly</a> that it is in no rush to support ARM. This is of course a major issue for adoption of ARM in the data center, as open source solutions will be the only real option for virtualization on ARM servers for the foreseeable future.</p>
<p><a href="http://blog.xen.org/index.php/2012/09/21/xensummit-sessions-new-pvh-virtualisation-mode-for-arm-cortex-a15arm-servers-and-x86/">KVM support of ARM</a> seems to be making progress, but it is still at an early phase and has many years of work ahead on topics related to performance tuning, benchmarking, and paravirtualization enhancements to better compensate for missing virtualization extensions and hardware in ARMv7/v8.</p>
<ul>
<li>The x86 vendors (Intel and AMD) started providing support for virtualization many years ago, and it will take ARM a long time to reach parity.</li>
</ul>
<p>From the perspective of a Hadoop cluster, weak hypervisor support on ARM is not necessarily a showstopper, as it is common to run the Hadoop stack on a non-virtualized OS on bare metal. It is still representative of the types of optimization issues that ARM will face entering widespread data center use.</p>
<h3 id="java-virtual-machines">Java Virtual Machines</h3>
<p>As with the prior topics, the immature state of JVM optimization for ARM compared to x86 will again be an impediment to rapid ARM adoption. While performance analysis data in this area is generally sketchy, <a href="http://fullshovel.wordpress.com/2012/07/11/java-vs-c-on-arm/">some example experiments</a> in 2012 have shown that, with OpenJDK, the performance ratio between C and Java on x86 platforms is in the ballpark of 1:1, but when the same benchmarks are run on ARM the ratios can be 3.6x to 8.9x worse. This indicates that the level of tuning around ARM JVM support is still very immature.</p>
<p>Oracle <a href="https://blogs.oracle.com/henrik/entry/oracle_releases_jdk_for_linux">has recently announced</a> that it will support its JRE on ARM. This is very important, as the Oracle JRE is commonly viewed as the clear industry leader in quality and performance compared to alternative commercial and open source options. There are some functional limitations in Oracle&rsquo;s planned support of ARM, but they do not appear to have a major impact on server applications. However, there is no benchmarking data available yet for Oracle&rsquo;s JRE on ARM, and a multi-year evolution is expected to be required. Additionally, the first port is focused on 32-bit ARMv7, so support for the new set of 64-bit cores will not arrive until well into 2013.</p>
<p>JVM support for ARM is key to Hadoop, which is Java based.</p>
<h2 id="cpu-level-analysis-of-arm-vs-x86">CPU level Analysis of ARM vs x86</h2>
<p>Based on a discussion with a processor architect, here are some interesting data points:</p>
<ul>
<li>
<p>He expects to see 32-core, 64-bit devices in prototype by 2H 2013.</p>
</li>
<li>
<p>For integer/text manipulation (common in Map and Reduce (M&amp;R) applications, the example application considered here), he expects that, to first order, ARMv8 cores should be on par with x86 E5 cores at the machine-code level at the same clock (ignoring the virtualization and tool chain differences explored above).</p>
</li>
<li>
<p>From a power standpoint, he expects a 32-core, 64-bit ARMv8 CPU to draw similar power to a 10-core x86 CPU in a similar time frame (100 W) for the same system-level functionality. This implies an upper bound of about 3x power benefit per core for ARM.</p>
<ul>
<li>Industry back-of-the-envelope figures of a 5-10x delta between the power of x86 server cores and ARM cores are generally comparing ARMv6 32-bit cores. Those do not represent the power per core once the ARM ISA moves to 64-bit and starts adding overheads such as full floating-point support, virtualization support (e.g. nested page tables), and coherent interconnect.</li>
</ul>
</li>
<li>
<p>He does not expect the coherent interconnect for 32 cores to be a bottleneck, based on his analysis of ARM&rsquo;s recent multi-core interconnect (CCN-504).</p>
</li>
</ul>
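<p>The per-core power bound quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes both packages draw the same 100 W and that power divides evenly across cores, and the function and variable names are illustrative rather than taken from any tool.</p>
<pre><code class="language-python"># Figures quoted in the bullets above; both packages are assumed to draw about 100 W.
PACKAGE_WATTS = 100.0
ARM_CORES = 32
X86_CORES = 10


def watts_per_core(package_watts, cores):
    """Average package power attributed to each core (assumes an even split)."""
    return package_watts / cores


arm_w = watts_per_core(PACKAGE_WATTS, ARM_CORES)
x86_w = watts_per_core(PACKAGE_WATTS, X86_CORES)
per_core_power_ratio = x86_w / arm_w

print(f"ARM: {arm_w:.3f} W/core, x86: {x86_w:.1f} W/core, ratio: {per_core_power_ratio:.1f}x")
</code></pre>
<p>Under these assumptions the ratio works out to 3.2x, which is where the rough 3x per-core bound comes from.</p>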
<p>Looking beyond a single CPU, there is not yet a concrete plan for multi-CPU mesh configurations, like those built on Intel&rsquo;s QPI connection, to form a larger shared-memory complex.</p>
<ul>
<li>
<p>The implication of not having this multi-CPU configuration is that each silicon instance is a standalone CPU that cannot leverage shared memory. This requires finer-grained segregation of tasks across CPUs, with separate distributed application instances on each CPU.</p>
</li>
<li>
<p>There has been a <a href="http://www.theregister.co.uk/2012/08/30/applied_micro_x_gene_server_chip">statement from the Applied Micro CEO</a> that up to 1024 cores across 64 CPUs are planned in the future, though there is not much additional detail on this yet.</p>
</li>
</ul>
<figure id="figure-example-intel-e5-multi-cpu-configuration">
<div class="figure-img-wrap" >
<img alt="Example Intel E5 Multi-CPU Configuration" srcset="
/post/armanalysis/IntelCPU_hu768a213ea724e953042322abe7c2f65c_191064_0c86857be22354c0e66939f56deaecb4.png 400w,
/post/armanalysis/IntelCPU_hu768a213ea724e953042322abe7c2f65c_191064_33e0480dae2766fb036ab4610a46f1b8.png 760w,
/post/armanalysis/IntelCPU_hu768a213ea724e953042322abe7c2f65c_191064_1200x1200_fit_lanczos_2.png 1200w"
src="https://willeatherton.com/post/armanalysis/IntelCPU_hu768a213ea724e953042322abe7c2f65c_191064_0c86857be22354c0e66939f56deaecb4.png"
width="521"
height="314"
loading="lazy" data-zoomable /></div><figcaption>
Example Intel E5 Multi-CPU Configuration
</figcaption></figure>
<h2 id="analysis-of-arm-cpu-for-a-mr-compute-node">Analysis of ARM CPU for an M&amp;R Compute Node</h2>
<p>At the system level, beyond the potential compute benefits (per watt) of ARM vs. x86 and the software complexities, there is the question of how relevant this trade-off is for a given application area.</p>
<p>First, let&rsquo;s consider a very rough estimate that normalizes performance to BIPS (billions of instructions per second) for integer operations on ARM and x86 CPUs within a 100 W budget, for silicon expected to be available by the end of 2013. Note that the relative JVM performance estimates may be optimistic in favor of ARM.</p>
<p>Relative performance comparison of x86 vs. ARM 64-bit server CPUs</p>
<table class="table table-striped">
<thead>
<tr>
<th>Performance Factor</th>
<th>ARM</th>
<th>x86</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of Cores</td>
<td>32</td>
<td>10</td>
</tr>
<tr>
<td>Clock (GHz)</td>
<td>3</td>
<td>2.4</td>
</tr>
<tr>
<td>ISA efficiency for Integer operations</td>
<td>0.5</td>
<td>1</td>
</tr>
<tr>
<td>Linux OS performance relative to x86</td>
<td>0.95</td>
<td>1</td>
</tr>
<tr>
<td>JVM Performance Implications</td>
<td>0.7</td>
<td>1</td>
</tr>
<tr>
<td>Final relative BIPS</td>
<td>32</td>
<td>24</td>
</tr>
</tbody>
</table>
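<p>The final row of the table appears to be the product of the rows above it. The sketch below reproduces that multiplication; treating the factors as simple multipliers is an assumed reading of the table, and the dictionary keys and function name are illustrative.</p>
<pre><code class="language-python"># Factors copied from the table above. Multiplying them together is an
# assumed reading of how the "Final relative BIPS" row was derived.
FACTORS = {
    "ARM": {"cores": 32, "clock_ghz": 3.0, "isa": 0.5, "os": 0.95, "jvm": 0.7},
    "x86": {"cores": 10, "clock_ghz": 2.4, "isa": 1.0, "os": 1.0, "jvm": 1.0},
}


def relative_bips(f):
    """Cores x clock, scaled by the ISA, OS, and JVM efficiency factors."""
    return f["cores"] * f["clock_ghz"] * f["isa"] * f["os"] * f["jvm"]


arm_bips = relative_bips(FACTORS["ARM"])
x86_bips = relative_bips(FACTORS["x86"])
ratio = arm_bips / x86_bips

print(f"ARM {arm_bips:.1f} vs x86 {x86_bips:.1f} relative BIPS, ratio {ratio:.2f}:1")
</code></pre>
<p>With the table&rsquo;s inputs this gives roughly 31.9 for ARM and 24 for x86, matching the rounded values in the last row. Changing any single factor, for example the JVM estimate, shows how sensitive the result is to the software assumptions.</p>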
<p><strong>In summary, while this is a crude estimate, the final ratio of relative performance within 100 W is close enough to 1:1 that it is not interesting.</strong></p>
<p>Going to the system level, assume that over time, with software and further silicon optimization, ARM-based CPUs improve to a solid 2x benefit over x86 in final performance (relative BIPS) per watt after all overheads. How much would this impact system optimization for the Map and Reduce application in a rack server?</p>
<ul>
<li>
<p>Taking into account memory, disk, I/O, and other overheads, a 2:1 CPU advantage would yield an estimated &lt;20% benefit in system density. This does not seem to be enough impact to warrant the significant risk and effort.</p>
</li>
<li>
<p>As an example, consider that in recent Hadoop cluster analysis the ratio of x86 cores to spindles may be as high as 1:5. This implies that a blade with, say, 32 cores would match up with 160 spinning disks. That is a significant amount of space and power compared to the CPU complex, making raw CPU performance less relevant.</p>
</li>
</ul>
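<p>To make the system-level point concrete, here is a rough sketch. The spindle count uses the 1:5 ratio from the bullets above. The power model is not from the analysis itself: it assumes throughput is held constant, that only CPU power shrinks when performance per watt improves, and it uses hypothetical CPU shares of total system power.</p>
<pre><code class="language-python">CORES_PER_BLADE = 32     # example blade size from the text
SPINDLES_PER_CORE = 5    # 1:5 core-to-spindle ratio from the text

spindles = CORES_PER_BLADE * SPINDLES_PER_CORE  # 160 disks for a 32-core blade


def system_power_saving(cpu_share, perf_per_watt_gain):
    """Fraction of total system power saved at equal throughput.

    cpu_share is a HYPOTHETICAL fraction of system power drawn by the CPUs.
    Disk, memory, and I/O power are assumed to stay unchanged.
    """
    return cpu_share * (1.0 - 1.0 / perf_per_watt_gain)


print(f"{CORES_PER_BLADE} cores pair with {spindles} spindles")
for cpu_share in (0.2, 0.3, 0.4):  # illustrative shares, not measured data
    saving = system_power_saving(cpu_share, 2.0)
    print(f"CPU share {cpu_share:.0%}: system power saving {saving:.0%}")
</code></pre>
<p>In this simplified model, a 2x CPU gain stays below a 20% system-level saving whenever the CPUs account for less than 40% of system power, which is plausible for a node dominated by spinning disks.</p>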
<p>If it were possible to achieve a ratio of, say, 3-5x for ARM over x86 in final relative BIPS per watt, while aiming for hundreds of ARM cores on a blade (or in a 2RU rack server), then merging this optimization with a major overhaul of the system design to match the massively multi-core architecture in the areas of disk, memory, and I/O could result in a significantly different optimization point than x86 servers today.</p>
</description>
</item>
<item>
<title></title>
<link>https://willeatherton.com/admin/config.yml</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>https://willeatherton.com/admin/config.yml</guid>
<description></description>
</item>
</channel>
</rss>