episode-6.json
{"name":"June 18, 2020 17:55","messages":[{"text":"alright everybody I think this is working we're going to sit here and make sure because you know it's been awhile","created_at":"2020-06-18T18:00:15.545Z"},{"text":"in the streams they do weird things","created_at":"2020-06-18T18:00:19.425Z"},{"text":"but I see myself over there to here we go","created_at":"2020-06-18T18:00:21.381Z"},{"text":"back out 20-25 today recovering service reporting with Amazon Athena if this is your first episode of your first time joining me here on the channel thank you if you're coming back thanks for joining it's been awhile I know things have been crazy today we're going to go over how to query the data that we've generated from our next Generation application and so we'll go back and we'll review the architecture pretty quickly","created_at":"2020-06-18T18:00:52.380Z"},{"text":"and then we'll go in and clear that they did with Amazon Athena","created_at":"2020-06-18T18:00:56.405Z"},{"text":"that's one of the first things we'll talk about it just what is Amazon Athena","created_at":"2020-06-18T18:00:59.437Z"},{"text":"how we can use it to query that data that we stored the other sequel characteristics it has like we can create views for applications and where you should create those views we're going to take a quick stab a visualization of our data with Amazon quicksight which is a business intelligence product that helps you visualize your data into meaningful dashboards charts and graphics and then we're going to cover the registry of open data on AWS so that you have some understanding of some of the available data that out there that you can either ingest into your application or ingest alongside your data with quicksight to enrich it and it will save some time for Q\u0026A at the end","created_at":"2020-06-18T18:01:43.412Z"},{"text":"get too deep into this I want to give you a cup","created_at":"2020-06-18T18:01:47.426Z"},{"text":"caveats right so one when I originally planned 
this episode it was back in May and they were occurring every week until I wrote a simulator to generate transactions put them on to the event Bridge event bus and process them and eventually write them out as data","created_at":"2020-06-18T18:02:09.488Z"},{"text":"I caught an error in it so I said well instead of doing them at a rate of 10 today I'll run them at a rate of 10 today","created_at":"2020-06-18T18:02:18.431Z"},{"text":"I'll run them at a rate of sorry 10 per second i x 10 said I'll run it at a hundred transactions per second to catch up","created_at":"2020-06-18T18:02:26.415Z"},{"text":"and it in the episode didn't air","created_at":"2020-06-18T18:02:30.471Z"},{"text":"are we going to air the next week so I was like well I'll just let it run and and then it didn't air and then y'all I left it running and that's like production levels of transactions per second so the good news is we have a lot of data to query the lesson here is definitely pay attention when you're simulating transactions because it's a lot of data tickets generated can you see the bill I'm pretty sure our boss is already seen the bill buddy","created_at":"2020-06-18T18:03:02.473Z"},{"text":"for close to a billion","created_at":"2020-06-18T18:03:05.379Z"},{"text":"Renters of the entire system everything for it which is like a month load of transactions into end came out to under $5,000 which is kind of not in your account this time","created_at":"2020-06-18T18:03:20.394Z"},{"text":"siddur that's you know all your high-availability the fact that all these transactions are being processed in milliseconds you know it's essentially like the workload of of an entire business $5,000 a month on a on a bill with no optimization and just like stabbing stuff in there intentionally sloppily to get it out the door","created_at":"2020-06-18T18:03:40.520Z"},{"text":"how many ec2 instances that is but considering there's no you know we didn't have to run olap datastore you have to run a data 
warehouse any other stuff we just junk stuff through our Step Functions workflows stored in S3 and here we are with a bunch of data so I'll be washing the dishes for a while to pay for that one but you know learn from my mistakes","created_at":"2020-06-18T18:04:06.525Z"},{"text":"today we're going to be there's not a whole lot of code so I'll push that code for the simulation of transactions feel free to look at it again be careful if you run it I won't push it up with a hundred transactions per second as I had it","created_at":"2020-06-18T18:04:21.526Z"},{"text":"for the most part we're going to be in the console today which maybe this will be your favorite episode maybe this will be your least favorite episode I don't know there's going to be lots of clicking and the thing with that is it's not really deterministic and I'm not very good with a console so I'll ask you to bear with me as we click around in here because even though I've gone through this several times my comfort zone is being in the CLI and doing everything as code so","created_at":"2020-06-18T18:04:46.521Z"},{"text":"let's talk about Amazon Athena first","created_at":"2020-06-18T18:04:50.404Z"},{"text":"Amazon Athena is a service","created_at":"2020-06-18T18:04:53.457Z"},{"text":"how to get your data from multiple locations in our case Amazon S3 buckets where you stored it it allows you to run SQL queries against it it's the Presto SQL engine","created_at":"2020-06-18T18:05:05.431Z"},{"text":"syntax-wise it's very similar to Postgres syntax if you're familiar with that so it does have its own","created_at":"2020-06-18T18:05:14.497Z"},{"text":"characteristics and in fact if you look in the","created_at":"2020-06-18T18:05:16.426Z"},{"text":"here which we'll share in the chat is a link to Athena SQL so it's its own flavor there for you to go check out whenever you're writing Athena SQL queries that's always a good place to start","created_at":"2020-06-18T18:05:27.420Z"},{"text":"but 
it's","created_at":"2020-06-18T18:05:30.471Z"},{"text":"billed by the amount of data that you pull in and out for your queries right which introduces a couple of concepts","created_at":"2020-06-18T18:05:36.598Z"},{"text":"one you want to store your data compressed and you want to store your data in an efficient format so you can store your data in there as CSVs but it's not efficient to read or extract or store so you're going to be paying more than you need if we go back and look at the code that we wrote in the Kinesis Data Firehose episode we actually stored our data in Parquet format Apache Parquet which is a columnar format and the advantage of that is when you're doing scans you're only reading the columns that you need so you're only reading the data that you need and you're not paying for the rest of the data","created_at":"2020-06-18T18:06:14.467Z"},{"text":"excuse me y'all I'm glad I got the mute button this time so usually I miss it but so really if you can you want to be storing your data in Parquet format especially as your data grows and as you have to read more and more of your data you also want to consider partitioning your data we haven't done that for this episode it's a pretty complex topic but by partitioning your data you can query only inside those partitions typically by date so you can query only in a certain time range that matters to you and you limit the scope of the data that you're looking at which (a) increases your performance but (b) also decreases your cost so again you'll find information about that in which are linked here in the links come in","created_at":"2020-06-18T18:07:03.515Z"},{"text":"if you're looking for the repo for this series right now it's temporarily down while we resolve some things and migrate to an official repo so the good news is it's going to be up under AWS Samples and will be living there living on as a permanent repo the bad news","created_at":"2020-06-18T18:07:22.382Z"},{"text":"so appreciate your patience on 
that","created_at":"2020-06-18T18:07:25.402Z"},{"text":"let's go back and let's look at our diagram","created_at":"2020-06-18T18:07:29.406Z"},{"text":"let's go to see where we are for a quick refresher how we got here","created_at":"2020-06-18T18:07:35.431Z"},{"text":"so if you remember the Amazon EventBridge event bus is the backbone of our application and as we move across the bus from left to right through time we put events on to the bus and rules listen to those events filter those events and kick off workflows in AWS Step Functions which represent our business processes","created_at":"2020-06-18T18:07:57.396Z"},{"text":"for long-running processes with Standard Workflows how to use them for quick running processes with Express Workflows how to do service integrations to write data into DynamoDB or in our case into Kinesis Data Firehose so that it gets into S3 buckets that's this split that you see right here a single event","created_at":"2020-06-18T18:08:21.446Z"},{"text":"can be captured by two rules","created_at":"2020-06-18T18:08:24.426Z"},{"text":"text was in our service one of which sends it to our online transaction processing Step Functions workflow","created_at":"2020-06-18T18:08:31.600Z"},{"text":"and the other which sends it into a raw S3 bucket where it can be converted to Parquet and stored in our processed bucket using AWS Glue but so all of this stuff that we've built we still haven't provisioned any servers we still haven't provisioned anything that runs when we don't have business events everything start to finish is still serverless for us in this architecture so that's how we got here I'm not going to go through this code but just to jog your memory in the billing component of our app that's where we used an AWS Glue table","created_at":"2020-06-18T18:09:11.465Z"},{"text":"a fire hose","created_at":"2020-06-18T18:09:13.567Z"},{"text":"to write that information into an S3 bucket","created_at":"2020-06-18T18:09:17.478Z"},{"text":"reviews the DC 
realizer","created_at":"2020-06-18T18:09:20.419Z"},{"text":"and we've written it out as parquet with some other configuration offer options okay so all of this was from the Kinesis data firehose episode which is the last one that we did on May 12th I believe so if you don't remember how we got to hear maybe is May 10th","created_at":"2020-06-18T18:09:36.426Z"},{"text":"I don't remember how we got to hear go back and take a look at that episode and see what we did with Kinesis data firehose to write these records into an S3 bucket","created_at":"2020-06-18T18:09:45.446Z"},{"text":"so before we get into Athena let's take a look at that S3 bucket","created_at":"2020-06-18T18:09:52.521Z"},{"text":"we know we have the raw data bucket","created_at":"2020-06-18T18:09:56.419Z"},{"text":"you can see","created_at":"2020-06-18T18:09:58.429Z"},{"text":"4","created_at":"2020-06-18T18:10:00.395Z"},{"text":"sometime here from May 20th","created_at":"2020-06-18T18:10:04.478Z"},{"text":"the June 6th","created_at":"2020-06-18T18:10:07.479Z"},{"text":"I wrote Sumrall records into this bucket right","created_at":"2020-06-18T18:10:12.391Z"},{"text":"these are just the Json objects that were sent out through the event and you see there's quite a few of them every hour","created_at":"2020-06-18T18:10:18.668Z"},{"text":"we go back we also had the process data bucket","created_at":"2020-06-18T18:10:24.662Z"},{"text":"and this is where we see the partitioning it's been done even though we're not using it","created_at":"2020-06-18T18:10:30.656Z"},{"text":"and in here all of these are parquet files and there's fewer of them because they're packed in the way that they're done they're packed in and compressed so we have all these items in RS3 bucket that we can go back and query against so that's where we're getting the state of from when we're in Athena it views that","created_at":"2020-06-18T18:10:55.389Z"},{"text":"base as the glue database that 
weed","created_at":"2020-06-18T18:10:58.475Z"},{"text":"so yes it's coming from S3 but Athena thinks that it is a glue database it's just that the glue database then knows that that table is located in S3 and that's relevant because you can put tables in different places so that you pull all of that data together and Athena query is it enough of Federated or unified fashion without needing to know where all of that data came from specifically just sees it as a glue database","created_at":"2020-06-18T18:11:27.569Z"},{"text":"so now when we go back over here this will be a fun when we got this transactions table that we created with the customer ID the the date times of the events Source accounts and destination accounts we can see where it's flowing and the total amount and we can begin to run SQL queries against it right","created_at":"2020-06-18T18:11:45.463Z"},{"text":"the most basic is just to figure out how many of these records we have over the entire life of me","created_at":"2020-06-18T18:11:52.470Z"},{"text":"firehosing Record Service","created_at":"2020-06-18T18:11:56.428Z"},{"text":"time 10 and 1/2 seconds didn't actually scan the data","created_at":"2020-06-18T18:12:00.428Z"},{"text":"what is 104000874 million records in here","created_at":"2020-06-18T18:12:06.423Z"},{"text":"and then at this point","created_at":"2020-06-18T18:12:09.424Z"},{"text":"all of presto SQL available to us","created_at":"2020-06-18T18:12:14.645Z"},{"text":"certain by amount we can look at them by date we can perform the analytics transactions on them like average and standard deviation","created_at":"2020-06-18T18:12:23.628Z"},{"text":"love you thank you thank you","created_at":"2020-06-18T18:12:25.552Z"},{"text":"you all to you only the best","created_at":"2020-06-18T18:12:28.452Z"},{"text":"so there's you get all that power from let's look at this link real quick","created_at":"2020-06-18T18:12:35.590Z"},{"text":"Athena SQL from the Presto SQL 
language","created_at":"2020-06-18T18:12:38.429Z"},{"text":"here inside those queries now","created_at":"2020-06-18T18:12:42.397Z"},{"text":"this isn't the only way that you can use Athena","created_at":"2020-06-18T18:12:47.391Z"},{"text":"you can do API calls to start executions of","created_at":"2020-06-18T18:12:51.408Z"},{"text":"you can save named queries better than access by other components of your application you can restrict access to those queries using IAM roles and permissions so it's it's not a query in the console service there's a lot that can be done here and then of course as we will see in a minute you can also use this as the backing for quick site for Amazon quicksight so that you can drive your your bi they're very thirsty cuz I just came off a stream with my man Eric Johnson here so I've already been talking a lot today","created_at":"2020-06-18T18:13:22.448Z"},{"text":"hello pod there little quick pot","created_at":"2020-06-18T18:13:27.470Z"},{"text":"September of the ones I have okay so here I have this for this view that I've created as an example","created_at":"2020-06-18T18:13:40.400Z"},{"text":"you can create views","created_at":"2020-06-18T18:13:43.444Z"},{"text":"Amazon Athena and you treat them just like views in any other relational database right there compiled they are efficient you can query against them by name as without knowing how they're necessarily constructed when you get into permissions you can also create views inside Amazon quicksight and we'll see that and it's kind of up to you to figure out where is more effective for you to put that view if it's something that across the business people will want to see like you're aggregating glue tables across S3 buckets that belong to different organizational units in your company then you may want to create that unified view here in Athena so the people across business units can query against an entire set of data if you're creating something for 
yourself","created_at":"2020-06-18T18:14:37.619Z"},{"text":"dashboard and you may want to do that inside Amazon quicksight","created_at":"2020-06-18T18:14:40.475Z"},{"text":"but you have the option to do both and we'll explore that in a minute to when we get all clicky","created_at":"2020-06-18T18:14:46.445Z"},{"text":"but again just as from that that links document their the Presto SQL this should be a pretty standard SQL for you write 50 years","created_at":"2020-06-18T18:14:55.456Z"},{"text":"a Creator replace view view name as select statement","created_at":"2020-06-18T18:14:59.444Z"},{"text":"and so in this case","created_at":"2020-06-18T18:15:03.496Z"},{"text":"the substring syntax from","created_at":"2020-06-18T18:15:06.567Z"},{"text":"so we're starting at the first character taking two characters and then what we've given ourselves here is this a preview here yeah","created_at":"2020-06-18T18:15:16.684Z"},{"text":"we can run this query","created_at":"2020-06-18T18:15:20.458Z"},{"text":"using","created_at":"2020-06-18T18:15:22.454Z"},{"text":"give you a command and it just grabs 10 of them","created_at":"2020-06-18T18:15:25.611Z"},{"text":"so we have","created_at":"2020-06-18T18:15:27.437Z"},{"text":"I'll be honest with you I don't know all of these country codes I apologize I feel like I should I'm imagining that's Macedonia Hungary Luxembourg but we can see sort of okay this is the shape of the data that I expected to get","created_at":"2020-06-18T18:15:40.401Z"},{"text":"out of this view right and I can see the source country the destination country in the transaction amount","created_at":"2020-06-18T18:15:47.410Z"},{"text":"and now here we get into like actual use cases right so if we have a whole list of financial transactions we can go and say show me all of the transactions from the Benelux region to Cypress","created_at":"2020-06-18T18:16:04.439Z"},{"text":"if you want to look for Trade Agreement or for renegotiation of transfer pricing or for 
anti-money laundering or for you know what he told Market segmentation pricing rightly differentiated pricing whatever your use cases these views can shape all of that data into what you need so that you can then drive your intelligence off of it and again don't forget that Athena can be queried via API so it's not just here in the console inside your applications even for like real-time decision-making for fraud detection for other sorts of you know so you have a minimum level of transaction that must be achieved lifetime before you start doing something and somebody applies for that you can do that query over the life of all of their transactions","created_at":"2020-06-18T18:16:55.396Z"},{"text":"be very fast query very effective query regardless of how many record","created_at":"2020-06-18T18:16:58.395Z"},{"text":"and you get that answer back","created_at":"2020-06-18T18:17:01.415Z"},{"text":"NA Step functions workflow say right","created_at":"2020-06-18T18:17:05.448Z"},{"text":"let's see what else we have in here create or replace view country for we've done that one already and that's the same as this","created_at":"2020-06-18T18:17:14.483Z"},{"text":"select an account so we can sit here and we can write sql query","created_at":"2020-06-18T18:17:18.509Z"},{"text":"but I think it's more interesting to take this into quicksight and see what we can actually visualize with it","created_at":"2020-06-18T18:17:25.556Z"},{"text":"let's go over and go to Quick sign","created_at":"2020-06-18T18:17:31.397Z"},{"text":"so if I deleted my account so I can show you how you get star","created_at":"2020-06-18T18:17:34.443Z"},{"text":"I have not","created_at":"2020-06-18T18:17:37.438Z"},{"text":"first go into quicksight it's something that you have to enable you have to select","created_at":"2020-06-18T18:17:41.410Z"},{"text":"that you want to be on and you have to create SSO","created_at":"2020-06-18T18:17:44.429Z"},{"text":"just be aware that there is a cost to the 
server","created_at":"2020-06-18T18:17:47.527Z"},{"text":"and again it's not the only way for you to visualize your data it's just a tool that's here for you and I want you to be able to see","created_at":"2020-06-18T18:17:57.571Z"},{"text":"search man it's some data where in North Virginia that's not where our data is","created_at":"2020-06-18T18:18:03.442Z"},{"text":"jump over here","created_at":"2020-06-18T18:18:05.403Z"},{"text":"I'll look out even in here so we can open this to see what I built and then we can look at another one","created_at":"2020-06-18T18:18:12.456Z"},{"text":"this one we've just built a little a heat map is one of the built-in visualisations that we have so you see we have heat Maps tables line charts donut charts Donuts all these different types of built-in visualisations as well as Insight which offers you","created_at":"2020-06-18T18:18:34.521Z"},{"text":"assisted insights into your data over time","created_at":"2020-06-18T18:18:38.429Z"},{"text":"our case what we have is","created_at":"2020-06-18T18:18:41.560Z"},{"text":"this is probably going to take a while because it's","created_at":"2020-06-18T18:18:47.434Z"},{"text":"in the state of for the first time but","created_at":"2020-06-18T18:18:50.394Z"},{"text":"let me see if I can make this larger for you","created_at":"2020-06-18T18:18:54.632Z"},{"text":"in the field well what we're going to have our","created_at":"2020-06-18T18:18:58.403Z"},{"text":"code by Rose and the destination","created_at":"2020-06-18T18:19:02.545Z"},{"text":"and then the destination country code by columns so we're looking at transactions from Benelux to Cyprus Great Britain and Ireland a typical tax","created_at":"2020-06-18T18:19:11.587Z"},{"text":"audit or investigation that you would see right and the values is going to be the total sum","created_at":"2020-06-18T18:19:18.414Z"},{"text":"look we have some applied filters here","created_at":"2020-06-18T18:19:23.456Z"},{"text":"give 
us","created_at":"2020-06-18T18:19:25.556Z"},{"text":"values right so here we have a drop-down custom filter list very easy to create for the source country we only want to see transactions coming from that view I'll be honest with y'all haven't run this report against 174 million transactions before so","created_at":"2020-06-18T18:19:41.421Z"},{"text":"who knows what's going to happen","created_at":"2020-06-18T18:19:44.405Z"},{"text":"select the countries that you want to see here","created_at":"2020-06-18T18:19:48.477Z"},{"text":"the same thing for the destination filter","created_at":"2020-06-18T18:19:52.422Z"},{"text":"I received that it's the from one country to another and then the titles are there as well","created_at":"2020-06-18T18:19:59.439Z"},{"text":"here in our visualization let me go check my data sources","created_at":"2020-06-18T18:20:06.501Z"},{"text":"to make sure that we're doing what we want to do manage data","created_at":"2020-06-18T18:20:12.566Z"},{"text":"the data sets transactions so it's from Athena","created_at":"2020-06-18T18:20:16.710Z"},{"text":"you can apply row level security","created_at":"2020-06-18T18:20:20.616Z"},{"text":"I don't think we want to duplicate that dataset","created_at":"2020-06-18T18:20:23.417Z"},{"text":"if we added it we see the transactions is coming in from","created_at":"2020-06-18T18:20:30.652Z"},{"text":"yeah","created_at":"2020-06-18T18:20:32.520Z"},{"text":"table name transactions","created_at":"2020-06-18T18:20:37.747Z"},{"text":"with these are calculated fields","created_at":"2020-06-18T18:20:43.498Z"},{"text":"added so you can add that as well so we're not actually using the view here","created_at":"2020-06-18T18:20:47.520Z"},{"text":"this is what I was talking about when you we could either add data","created_at":"2020-06-18T18:20:53.396Z"},{"text":"from the country info table","created_at":"2020-06-18T18:20:57.399Z"},{"text":"inquiry that country info table 
directly","created_at":"2020-06-18T18:21:00.464Z"},{"text":"or","created_at":"2020-06-18T18:21:03.577Z"},{"text":"move that we cannot calculated Fields ourselves","created_at":"2020-06-18T18:21:07.406Z"},{"text":"and so for these what we've done is we've applied that Presto that same Presto SQL of substring to the destination account","created_at":"2020-06-18T18:21:17.470Z"},{"text":"get a definition country code","created_at":"2020-06-18T18:21:19.450Z"},{"text":"there's a couple different ways that you can do this if your analytics users are your business intelligence users are comfortable SQL they can apply at themselves but you also have the option of giving it to them internally","created_at":"2020-06-18T18:21:31.483Z"},{"text":"so let's get back out of here managed data or data sets we can add a new dataset and there's permissions that you need to apply so we'll come back here again we'll come from Athena","created_at":"2020-06-18T18:21:46.466Z"},{"text":"country data","created_at":"2020-06-18T18:21:50.398Z"},{"text":"for my primary work group","created_at":"2020-06-18T18:21:54.397Z"},{"text":"make sure that connections Goods we got the permissions right","created_at":"2020-06-18T18:21:58.480Z"},{"text":"create data source","created_at":"2020-06-18T18:22:00.498Z"},{"text":"case that's that app 20-25 AWS glue database that we talked about right","created_at":"2020-06-18T18:22:07.406Z"},{"text":"take that country info view that we had we go back over here","created_at":"2020-06-18T18:22:11.439Z"},{"text":"we had create or replace view country info","created_at":"2020-06-18T18:22:16.418Z"},{"text":"right so this is what we're looking at this is less data so","created_at":"2020-06-18T18:22:21.419Z"},{"text":"it's fewer columns all of them irrelevant so as we go on this should","created_at":"2020-06-18T18:22:28.568Z"},{"text":"little faster hopefully for us to run against","created_at":"2020-06-18T18:22:32.460Z"},{"text":"soyuz country info we don't have any info 
to bring it into spice which will accelerate everything I'm not sure why I probably put something in there and then forgot to remove it so","created_at":"2020-06-18T18:22:45.401Z"},{"text":"for the rehearsal","created_at":"2020-06-18T18:22:49.561Z"},{"text":"so if we go to treemap that's fancy","created_at":"2020-06-18T18:22:54.515Z"},{"text":"if we give ourselves a heatmap","created_at":"2020-06-18T18:22:58.585Z"},{"text":"then again we can set the source country code to be the row","created_at":"2020-06-18T18:23:04.506Z"},{"text":"destination country code to be the column","created_at":"2020-06-18T18:23:07.605Z"},{"text":"and some of the transaction come out","created_at":"2020-06-18T18:23:12.459Z"},{"text":"Ron's we're going to get you know I think it Limitless it 50 country codes in each","created_at":"2020-06-18T18:23:17.473Z"},{"text":"so we want to go ahead and apply those filters he","created_at":"2020-06-18T18:23:19.463Z"},{"text":"as well","created_at":"2020-06-18T18:23:22.417Z"},{"text":"we don't want to include all filter list","created_at":"2020-06-18T18:23:26.407Z"},{"text":"this is","created_at":"2020-06-18T18:23:32.400Z"},{"text":"see there you go you got that nice","created_at":"2020-06-18T18:23:35.480Z"},{"text":"I think it's maybe it's 40 by 40 heat map","created_at":"2020-06-18T18:23:39.410Z"},{"text":"because we haven't applied to filters yet it will cut off and not show everything which is why you need to","created_at":"2020-06-18T18:23:48.419Z"},{"text":"filter it","created_at":"2020-06-18T18:23:50.486Z"},{"text":"the only ones you're interested in and we can see that even though we've got all these transactions were generated using","created_at":"2020-06-18T18:23:57.626Z"},{"text":"so they're all zero to a thousand","created_at":"2020-06-18T18:24:02.541Z"},{"text":"units monetary units","created_at":"2020-06-18T18:24:04.429Z"},{"text":"another randomly distributed you do see that you get randomly distributed doesn't mean evenly 
distribute","created_at":"2020-06-18T18:24:11.498Z"},{"text":"so we do have some heat map","created_at":"2020-06-18T18:24:14.432Z"},{"text":"she's here and we said we were going to check from Benelux Nations so we need Luxembourg","created_at":"2020-06-18T18:24:23.552Z"},{"text":"come on I can scroll faster here we go and the Netherlands","created_at":"2020-06-18T18:24:29.737Z"},{"text":"reply that one then we can close and now this will have only three rows and you know 50 columns or 40 columns whatever that case is","created_at":"2020-06-18T18:24:42.471Z"},{"text":"while that's going","created_at":"2020-06-18T18:24:45.470Z"},{"text":"add a note right so you see here we don't yet we don't have this access access but we can see for example most transfers from Luxembourg appear to have gone to any help me out of ZTE Estonia","created_at":"2020-06-18T18:25:00.626Z"},{"text":"sanity check me on that one","created_at":"2020-06-18T18:25:04.567Z"},{"text":"and of course again all the state is fake so","created_at":"2020-06-18T18:25:07.526Z"},{"text":"can I add a filter to our source country code","created_at":"2020-06-18T18:25:11.416Z"},{"text":"but that's not what we want we want to filter list so again we wait for it to come through","created_at":"2020-06-18T18:25:16.532Z"},{"text":"and if you've used products like bi-products before remember we're dealing with massive sets of data here like watch have 174 million rows of anything it's not like clearing against a thousand Rose where you get it back like especially the fact that this is all coming in from S3 so you know I've seen production queries to take 4 hours in some products","created_at":"2020-06-18T18:25:47.414Z"},{"text":"unrealistic from Benelux to Great Britain Ireland or Cyprus right Cypress Cy","created_at":"2020-06-18T18:25:57.537Z"},{"text":"chibi","created_at":"2020-06-18T18:25:59.422Z"},{"text":"and Ireland's eye","created_at":"2020-06-18T18:26:03.480Z"},{"text":"can we apply those and I will filter 
down","created_at":"2020-06-18T18:26:07.597Z"},{"text":"top three in destination country in bottom 50in Source country","created_at":"2020-06-18T18:26:11.587Z"},{"text":"as you see from this little highlight on here this is limited to the 50","created_at":"2020-06-18T18:26:16.409Z"},{"text":"the ones that you would use","created_at":"2020-06-18T18:26:20.494Z"},{"text":"beautiful little heat map for audit to say okay these are the countries that we're concerned with","created_at":"2020-06-18T18:26:28.575Z"},{"text":"by looking at this received the most transactions are from Ireland to Luxembourg","created_at":"2020-06-18T18:26:38.403Z"},{"text":"millionaire and then the least transactions","created_at":"2020-06-18T18:26:42.415Z"},{"text":"from Great Britain","created_at":"2020-06-18T18:26:45.432Z"},{"text":"to Luxembourg at 18 million and now now that you're looking at that you see why 18.5 million vs was is 19.1 million","created_at":"2020-06-18T18:26:57.464Z"},{"text":"not that big of a difference which is where you would of course","created_at":"2020-06-18T18:27:00.415Z"},{"text":"will you Skate Supply your standard deviations and how far outside of that are they are they truly outliers","created_at":"2020-06-18T18:27:06.485Z"},{"text":"I just use the song but I wanted you to have a simple example to visualize what we can do with 174 million rows of data and how we can begin to look for patterns or how we can display that to our users so again","created_at":"2020-06-18T18:27:25.407Z"},{"text":"big short session today I want to come back to","created_at":"2020-06-18T18:27:29.407Z"},{"text":"architecture diagram","created_at":"2020-06-18T18:27:31.551Z"},{"text":"point out that all of this is happening from that single event going onto the bus","created_at":"2020-06-18T18:27:37.703Z"},{"text":"and then the two rules are picking it up","created_at":"2020-06-18T18:27:40.458Z"},{"text":"one is taking it through and 
processing","created_at":"2020-06-18T18:27:43.446Z"},{"text":"Jackson for us but the other is taking it out and preparing it for analytics later and again the important thing to remember here is all of this has been done serverless Lee the only thing we've paid for is to download that data to run the query which we have to and nothing else right the storage is already covered by being an S3 the processing is covered at the query runtime so very powerful service a good way for us to take what's an abstract back end and then visualize it for a business customers and it just seeing this architecture that's not the first time we've focused on","created_at":"2020-06-18T18:28:24.408Z"},{"text":"beginning we went in and looked at our step functions workflows and one of the I'll go back to the console for that one of the advantages of Step functions that we've talks about is the ability to sit side-by-side with the analysts so we can look at our transaction process here to pick","created_at":"2020-06-18T18:28:45.418Z"},{"text":"Express work phone.","created_at":"2020-06-18T18:28:46.404Z"},{"text":"wow that's a simplified one","created_at":"2020-06-18T18:28:50.469Z"},{"text":"we can look at our expired subscription","created_at":"2020-06-18T18:28:54.437Z"},{"text":"with the analysts and say is this what you want and before we begin coding and then we taking that all the way through to the very end where we saved that data and then regenerated visualization for this is what our business is doing at this point in time this is where we need to focus our efforts is on transactions from Ireland to Luxembourg and give that very graphical representation so all of that exists inside this one repo all of that is done in a collaborative method with your with your business partner with your colleagues whomever you're working alongside and all of that is done serverless Lee so again just a very powerful into n method of building application know we're in here have you had to worry 
about redundancy or load balancing or patching or any of this stuff right all of the benefits of serverless applications","created_at":"2020-06-18T18:29:56.435Z"},{"text":"from ingestion to","created_at":"2020-06-18T18:30:00.419Z"},{"text":"collation processing and visualization","created_at":"2020-06-18T18:30:03.596Z"},{"text":"I'm going to wrap it up here again like I said this is a very console driven episode so not a lot of code here but I wanted you to see how quick Athena and quicksight can really be on massive datasets I'm sorry I said I was going to talk about one more thing here","created_at":"2020-06-18T18:30:22.440Z"},{"text":"that's it for architecture","created_at":"2020-06-18T18:30:24.521Z"},{"text":"the registry of open data on AWS","created_at":"2020-06-18T18:30:29.515Z"},{"text":"registry.opendata.aws 157 as of the recording of this episode","created_at":"2020-06-18T18:30:35.503Z"},{"text":"that are there for you to","created_at":"2020-06-18T18:30:38.416Z"},{"text":"against based on whatever your interest or your business case is and they're sorted if you want anomic data sets if you want","created_at":"2020-06-18T18:30:49.431Z"},{"text":"fraud what does it mean","created_at":"2020-06-18T18:30:51.444Z"},{"text":"spatial data sets so different data sets that you can use","created_at":"2020-06-18T18:30:56.527Z"},{"text":"in queries to enrich your applications","created_at":"2020-06-18T18:30:59.548Z"},{"text":"again you don't have to run a database server to get these data sets you just bring them into your own S3 bucket once they're in your S3 bucket they're there for you to query using Athena Glue and quicksight","created_at":"2020-06-18T18:31:13.435Z"},{"text":"so a very visual episode","created_at":"2020-06-18T18:31:15.407Z"},{"text":"today shorter one","created_at":"2020-06-18T18:31:19.488Z"},{"text":"but again when we go back and we look at what we've learned we talked about how all of that data that we were generating can be pulled in by Amazon Athena the 
fact that we want to store it as columnar data using Parquet and compressed for performance and cost reasons the fact that you can create views and query directly within quicksight you can save those views you can save those queries as named queries that you can then execute from applications using IAM permissions","created_at":"2020-06-18T18:31:48.425Z"},{"text":"all of that can be pulled into quicksight","created_at":"2020-06-18T18:31:51.417Z"},{"text":"location for your business users you can also create calculated fields that help do things similar to views but we showed how the view starts with less data","created_at":"2020-06-18T18:32:03.453Z"},{"text":"it's faster to execute inside quicksight than the calculated field if you don't need the rest of that data and then we went over the registry of open data on AWS that you can use to enrich your applications","created_at":"2020-06-18T18:32:19.485Z"},{"text":"5 is a series I'd like to thank you all for joining I may add some bits and pieces to this over time as we continue to release features you know now that we have Step Functions support in AWS SAM I'll go back and modify the templates to use that so some of that cleanup over time but I intend for this to be like a true living model that you can use to build your business applications so to that end I would always welcome your feedback","created_at":"2020-06-18T18:32:46.440Z"},{"text":"what you've seen in here that worked for you what might not have worked for you or what was missing","created_at":"2020-06-18T18:32:52.532Z"},{"text":"features that you want to see from us that you want us to build my Twitter handle is there DMs are always open and then of course you know where to find me on Twitch because you're here right now again we'll get the source code up for you as soon as we can until then thank you all for joining the series and I wish you the best of luck as you build bye everybody","created_at":"2020-06-18T18:33:16.413Z"}]}