Table of Contents
- Cosmos DB scenario-based labs - Retail hands-on lab step-by-step
- Abstract and learning objectives
- Overview
- Solution architecture (High-level)
- Requirements
- Before the hands-on lab
- Exercise 1: Configure Databricks and generate event data
- Exercise 2: Complete and deploy Web and Function App
- Exercise 3: Perform and deploy association rules calculation for offline algorithms
- Exercise 4: Complete and deploy Web App and Function App (Association Rules)
- Exercise 5: Perform and deploy collaborative filtering rules calculation
- Exercise 6: Reporting with Stream Analytics and Power BI
- Exercise 7: Email alerts using Logic Apps
- After the hands-on lab
In this hands-on lab, you will complete various tasks to finish the implementation of an e-commerce site that uses an AI-driven recommendation engine built on several Microsoft Azure PaaS services.
At the end of this lab you will understand how to design recommendation systems that store data in Cosmos DB using Databricks. You will also see how to implement an e-commerce store front with Cosmos DB as its data store. Additionally, you will see how to leverage the Cosmos DB change feed to execute functions for reporting and monitoring activities with Stream Analytics, Power BI and Logic Apps.
Contoso Movies, Ltd. has expressed their desire to move to a more modern and cloud-based approach to their online e-commerce presence. They have decided to utilize Cosmos DB and Azure Databricks to implement their next generation recommendation system using various popular AI-driven recommendation algorithms.
They would also like to have real-time reporting on user site actions such as viewing item details, adding items to carts and purchase events. Based on this data, they would like to know immediately if there are potential issues with the order processing pipeline.
Below is a diagram of the solution architecture you will build in this lab. Please study this carefully, so you understand the whole of the solution as you are working on the various components.
-
Data ingest, event processing, and storage:
The solution for the Retail scenario centers around Cosmos DB, which acts as the globally-available, highly scalable data storage for streaming event data for reporting and external integrations. User telemetry data flows in from the data generator where an Azure function processes the event data and inserts it into a container in Cosmos DB.
- Event processing with Azure Functions:
The Cosmos DB change feed triggers a single Azure function (although the functionality could be broken into many different functions). The single function provides three pieces of functionality.
-
Aggregate Calculations - This code updates the item aggregations for the buy events to keep track of the top items purchased. This will continually update and drive the top suggestions. You will see this when you execute the Data Generator tool.
-
Forward events to EventHub - This code will forward the changefeed item to the event hub where Stream Analytics will then process the data.
-
Call a Logic App - This code will forward the changefeed item to the logic app's HTTP endpoint, which will generate an email.
-
Stream processing, dashboards, and reports:
Stream Analytics queries the forwarded event data and sends the aggregates to Power BI to display a real-time dashboard of user activity.
-
Advanced analytics and ML model training:
Azure Databricks is used to generate a set of offline calculations based on user events to create implicit ratings and associations used to drive new and current user recommendations.
-
eCommerce web app:
A Web App allows Contoso customers to browse and purchase movies. Azure Key Vault is used to securely store centralized application secrets, such as connection strings and access keys, and is used by the Function App, Web App, and Azure Databricks. Finally, Application Insights provides real-time monitoring, metrics, and logging information for the Function App and Web App.
-
Microsoft Azure subscription must be pay-as-you-go or MSDN.
- Trial subscriptions will not work.
-
Azure CLI - version 2.0.68 or later
Refer to the Before the hands-on lab setup guide manual before continuing to the lab exercises.
Duration: 30 minutes
Synopsis: We have pre-generated a set of events that include buy and details events. Based on this data, a Top Items recommendation will be made to users that are new to the site (aka a cold start recommendation). You will implement this top items code in the web application and function applications, then deploy the applications to test the functionality.
The algorithms for creating the offline calculations are written in Python and are executed via Azure Databricks.
-
Open the Azure portal (https://portal.azure.com) and search for your assigned lab resource group. If you were not assigned a resource group, your generated resource group will be named after the following pattern: YOURINIT-s2-retail.
-
Select your resource group, and then select your Azure Databricks instance, it should be named s2_databricks....
-
Select Launch Workspace. If prompted, log in with the account you used to create your environment.
-
In the side navigation, select Clusters.
-
Select Create Cluster.
-
On the create cluster form, provide the following:
-
Cluster Name: small
-
Cluster Type: Standard
-
Databricks Runtime Version: Runtime: 5.5 (Scala 2.11, Spark 2.4.3) (Note: the runtime version may have LTS after the version. This is also a valid selection.)
-
Python Version: 3
-
Enable Autoscaling: Uncheck this option.
-
Auto Termination: Check the box and enter 120
-
Worker Type: Standard_DS3_v2
-
Driver Type: Same as worker
-
Workers: 1
-
-
Select Create Cluster.
-
Before continuing to the next step, verify that your new cluster is running. Wait for the state to change from Pending to Running.
-
Select the small cluster, then select Libraries.
-
Select Install New.
-
In the Install Library dialog, select Maven for the Library Source.
-
In the Coordinates field type:
com.microsoft.azure:azure-cosmosdb-spark_2.4.0_2.11:1.4.1
-
Select Install.
-
Wait until the library's status shows as Installed before continuing.
-
Within Azure Databricks, select Workspace on the menu, then Users, select your user, then select the down arrow on the top of your user workspace. Select Import.
-
Within the Import Notebooks dialog, select Import from: file, then drag-and-drop the file or browse to upload it ({un-zipped repo folder}/Retail/Notebooks/02 Retail.dbc).
-
Select Import
-
After importing, select the new 02 Retail folder, then navigate to the Includes folder
-
Select the Shared-Configuration notebook
-
Update the configuration settings and set the following using the values from your lab setup script output:
- Endpoint = Cosmos DB endpoint url
- Masterkey = Cosmos DB master key
- Database = Database id of the cosmos db ('movies')
If you do not have your setup script output values available for reference, you may find the Endpoint and Masterkey values by navigating to your Cosmos DB account in the Azure portal, then selecting Keys in the left-hand menu. Copy the URI value for Endpoint, and the Primary Key for the Masterkey value.
-
Attach your cluster to the notebook using the dropdown. You will need to do this for each notebook you open. In the drop down, select the small cluster.
-
Next, navigate back up to 02 Retail and select the 01 Event Generator notebook
This notebook will simulate the browsing and purchasing activity for six users with different personality-based preferences and save the result to the events container in Cosmos DB.
The movies have been pre-selected and sorted into the genres of comedy, drama and action. While the actual movie selection and activity taken is random, it is weighted to respect the user's preferences in each genre to hit a distribution that would mirror that user's taste.
For example, user 400001 has the preference of 20 for comedy, 30 for drama, 50 for action. This will result in the user logging more activity with action movies.
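The weighting described here can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the names `pick_genre` and `simulate` and the dictionary layout are hypothetical, and only the 20/30/50 preference values for user 400001 come from the text.

```python
import random
from collections import Counter

# Genre preference weights for user 400001, taken from the example in the text.
PREFERENCES = {"comedy": 20, "drama": 30, "action": 50}


def pick_genre(preferences, rng):
    """Pick one genre at random, weighted by the user's preferences (hypothetical helper)."""
    genres = list(preferences)
    weights = [preferences[g] for g in genres]
    return rng.choices(genres, weights=weights, k=1)[0]


def simulate(preferences, n_events, seed=0):
    """Count how often each genre is chosen over n_events weighted draws."""
    rng = random.Random(seed)
    return Counter(pick_genre(preferences, rng) for _ in range(n_events))


if __name__ == "__main__":
    # Over many draws, action should dominate, followed by drama, then comedy.
    print(simulate(PREFERENCES, 10_000))
```

Because each draw is random, two runs with different seeds produce different event sets while still following the same overall distribution.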
NOTE: Your results (aka the events generated) may be different from those of your fellow lab participants.
-
Attach your cluster to the notebook using the dropdown. In the drop down, select the small cluster.
-
Select Run All.
-
Return to the Azure portal.
-
In your resource group, navigate to your Cosmos DB instance, it should start with s2cosmosdb....
-
Select Data Explorer in the left-hand menu.
-
Select the events container, then select items.
-
Select one or more of the items and review them.
NOTE: These items are created from the Databricks solution and include a random set of generated events for each user personality type. You should see events generated for 'details', 'buy' and 'addToCart' as well as the item associated with the event (via the contentId field).
-
Browse to the {un-zipped repo folder}/Retail/Starter/Contoso Movies folder and open the Contoso.Apps.Movies.sln solution.
If Visual Studio prompts you to sign in when it first launches, use the account provided to you for this lab (if applicable), or an existing Microsoft account.
-
Within the Solution Explorer, expand the /Utilities/MovieDataImport project and open the Program.cs file. Take a few moments to browse the code. You will see that it:
- Aggregates all the event data generated from the Databricks notebook
- Creates the user personalities
- Creates the movie categories/genres
- Creates the movies
-
Right-click the project, select Set as startup project.
-
Press F5 to run the project.
You may see several of the following lines output to the console window after saving the genres and before adding the movies:
Input string was not in a correct format.
You can safely ignore these; some of the movies retrieved from the API are poorly formatted.
NOTE: You must wait for the Event Generator Databricks notebook to complete before running this step. This ensures that the results of later steps in the lab match.
Duration: 30 minutes
Synopsis: We have pre-generated a set of events that include buy events. Based on this information, a Top Items recommendation will be made to users that are new to the site. You will implement this code in the web application and function applications, then deploy the applications to test the functionality.
-
In the Contoso.Apps.Movies.Web project, open the /Controllers/HomeController.cs file.
-
In Visual Studio, select View, then select Task List. This will display the list of TODO items, helping you navigate to each one.
The Task List appears at the bottom of the window:
-
Find TODO #1 and complete it by adding the following line underneath:
vm.RecommendProductsTop = RecommendationHelper.GetViaFunction("top", 0, 0);
The RecommendationHelper class is responsible for making the call to the Azure Function to get the recommendations, based on submitted information that includes the algorithm, user, and content used to compute the recommendations.
-
In the Contoso.Apps.FunctionApp project, open the RecommendationHelper.cs file.
-
In the TopRecommendation method, find the TODO #2 and complete it with the following:
var container = client.GetContainer(databaseId, "object");
var query = container.GetItemLinqQueryable<Item>(true)
    .Where(c => c.EntityType == "ItemAggregate")
    .OrderByDescending(c => c.BuyCount)
    .Take(take);
items = query.ToList();
foreach(Item i in items)
{
    itemIds.Add(i.ItemId.ToString());
}
topItems = GetItemsByImdbIds(itemIds);
-
Review the code and notice the following:
- We are querying the "object" container for an entity type called ItemAggregate and sorting it by BuyCount. Essentially, these are the top purchased items.
- We are then querying the object container for all the top movie items to get their metadata for display on the web front-end.
This code is responsible for querying the Cosmos DB object container to find the item aggregation information, for example all the buy events for a movie.
It is important that you use aggregations to do this, as each operation in Cosmos DB consumes a certain number of RUs. For queries, the RU charge is based on the number of documents returned, the complexity of the query, and the number of partitions queried. To keep queries efficient as the user count and activity increase, we create an aggregated view.
Over time, the events container is expected to grow very large as your user count and activity increase. With respect to RUs, you can imagine that running this query directly against the raw events would become costly.
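To make the trade-off concrete, the following Python sketch contrasts scanning every raw event with maintaining one small aggregate record per item. It is an in-memory illustration only, not the lab's Cosmos DB code: the function names are hypothetical, and the `event` and `contentId` field names are assumed from the event descriptions in this lab.

```python
from collections import Counter


def top_items_by_scan(events, take):
    """Scan every raw event; the work grows with the size of the events container."""
    counts = Counter(e["contentId"] for e in events if e["event"] == "buy")
    return [item for item, _ in counts.most_common(take)]


def update_aggregate(aggregates, event):
    """Maintain one running buy count per item as each event arrives."""
    if event["event"] == "buy":
        item = event["contentId"]
        aggregates[item] = aggregates.get(item, 0) + 1


def top_items_from_aggregates(aggregates, take):
    """Read only the aggregate records; the work grows with the number of items."""
    ranked = sorted(aggregates.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:take]]


if __name__ == "__main__":
    events = [
        {"event": "buy", "contentId": "A"},
        {"event": "details", "contentId": "C"},
        {"event": "buy", "contentId": "B"},
        {"event": "buy", "contentId": "A"},
    ]
    aggregates = {}
    for e in events:
        update_aggregate(aggregates, e)
    print(top_items_from_aggregates(aggregates, 2))
```

Both approaches return the same ranking, but the aggregate read touches one record per item rather than one per event.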
-
In the Visual Studio menu, select Build, then Rebuild Solution. This ensures that all NuGet packages are restored and the solution is cleaned, then builds all the projects within the solution.
You should see output stating that the build successfully compiled all 9 projects:
========== Rebuild All: 9 succeeded, 0 failed, 0 skipped ==========
In the ARM templates, you will notice that we have intentionally set the throughput in RU/s for each container, based on our anticipated event processing and reporting workloads. In Azure Cosmos DB, provisioned throughput is represented as request units/second (RUs). RUs measure the cost of both read and write operations against your Cosmos DB container. Because Cosmos DB is designed with transparent horizontal scaling (e.g., scale out) and multi-master replication, you can very quickly and easily increase or decrease the number of RUs to handle thousands to hundreds of millions of requests per second around the globe with a single API call.
Cosmos DB allows you to increment/decrement the RUs in small increments of 100 at the database level, or at the container level. It is recommended that you configure throughput at the container granularity for guaranteed performance for the container all the time, backed by SLAs. Other guarantees that Cosmos DB delivers are 99.999% read and write availability all around the world, with those reads and writes being served in less than 10 milliseconds at the 99th percentile.
When you set a number of RUs for a container, Cosmos DB ensures that those RUs are available in all regions associated with your Cosmos DB account. When you scale out the number of regions by adding a new one, Cosmos will automatically provision the same quantity of RUs in the newly added region. You cannot selectively assign different RUs to a specific region. These RUs are provisioned for a container (or database) for all associated regions.
-
Right-click the Contoso.Apps.FunctionApp function app project, then select Publish.
-
Select Start, then ensure that Azure Functions Premium Plan is selected.
-
Select Select Existing, be sure to uncheck Run from package file, then select Create Profile.
-
Select your Azure Subscription, resource group and Function App to deploy to. The name should start with s2func...*.
-
Select OK, then click the Publish button to start the process.
-
Right-click the Contoso.Apps.Movies.Web web app project, then select Publish.
-
Select Start, then ensure that App Service is selected.
-
Select Select Existing, then select Create Profile.
-
Select your Azure Subscription, resource group and Web App to deploy to. The name should start with s2web...*.
-
Select OK, then click the Publish button to start the process. The application will publish and the site should be displayed. If the site does not automatically launch in a browser, you can copy the Site URL on the publish dialog and open the site in a new browser window.
- In the browser window that opened from your web application deployment above, check to see that you received recommendations as a non-logged in user. You should see something similar to the following:
NOTE: These are simply suggestions based on the top purchased items from the pre-generated events.
Duration: 30 minutes
Synopsis: Based on the pre-calculated events in the Cosmos DB for our pre-defined personality types (Comedy fan, Drama fan, etc.), you will implement and deploy an algorithm that will generate these associations and put them in Cosmos DB for offline processing by the web and function applications.
-
Switch back to your Databricks workspace and open the 02 Association Rules notebook.
-
Attach your cluster to the notebook using the dropdown. In the drop down, select the small cluster.
-
Run each cell of the 02 Association Rules notebook by selecting within the cell, then entering Ctrl+Enter on your keyboard. Pay close attention to the instructions within the notebook so you understand each step of the data preparation process.
The goal of this algorithm is to compute two metrics that indicate the strength of a relationship between a source item and a target item based on event history, and then save that matrix to the associations container in Cosmos DB.
The algorithm begins by grouping events with a buy action into a transaction, grouping by the sessionId. This provides the set of items bought together.
For example, a transaction with two items would look like:
'404973': ['5512872', '4172430']
where 404973 is the sessionId that is used as the transactionId, and the array contains the IDs of the items bought ('5512872' and '4172430').
The notebook examined the events data to find items that tend to be purchased together, and created a matrix that reflects the strength of the relationship. It then stored the matrix in the associations container. You will now look at the data in Cosmos DB, using the Data Explorer.
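The grouping and scoring steps can be sketched in Python as follows. The text does not name the two metrics, so this sketch assumes they are support and confidence, which are the usual association-rule measures. The function names and the `event`, `sessionId` and `contentId` field names are illustrative, and the notebook's real implementation may differ.

```python
from collections import defaultdict
from itertools import permutations


def build_transactions(events):
    """Group the items of 'buy' events by sessionId (used as the transaction id)."""
    transactions = defaultdict(list)
    for e in events:
        if e["event"] == "buy":
            transactions[e["sessionId"]].append(e["contentId"])
    return dict(transactions)


def association_rules(transactions):
    """Compute support and confidence for every ordered (source, target) pair."""
    n = len(transactions)
    item_count = defaultdict(int)   # transactions containing each item
    pair_count = defaultdict(int)   # transactions containing both items
    for items in transactions.values():
        unique = set(items)
        for item in unique:
            item_count[item] += 1
        for source, target in permutations(unique, 2):
            pair_count[(source, target)] += 1
    return [
        {
            "source": source,
            "target": target,
            "support": count / n,                      # share of all transactions
            "confidence": count / item_count[source],  # share of source's transactions
        }
        for (source, target), count in pair_count.items()
    ]


if __name__ == "__main__":
    transactions = {"404973": ["5512872", "4172430"]}
    for rule in association_rules(transactions):
        print(rule)
```

With the single example transaction above, each item appears in every transaction that contains the other, so both rules have a confidence of 1.0.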
-
Switch back to the Azure Portal.
-
In your resource group, navigate to your Cosmos DB instance.
-
Open the associations container, review the items in the container.
NOTE: These items are created from the Databricks solution and include the association confidence level as compared from one movie to another movie.
NOTE: You will only see about 8 items generated here.
Duration: 30 minutes
Synopsis: Now that we have data for our association calculations, we will add code to the web app and function app to support this new recommendation engine.
-
In the Contoso.Apps.FunctionApp project, open the RecommendationHelper.cs file.
-
In the AssociationRecommendationByUser method, find the TODO #3 and complete it with the following:
//get 20 log events for the user.
List<CollectorLog> logs = GetUserLogs(userId, 20);
if (logs.Count == 0)
    return items;
List<Rule> rules = GetSeededRules(logs);
//get the pre-seeded objects based on confidence
List<Recommendation> recs = new List<Recommendation>();
//for each rule returned, evaluate the confidence
foreach (Rule r in rules)
{
    Recommendation rec = new Recommendation();
    rec.id = int.Parse(r.target);
    rec.confidence = r.confidence;
    recs.Add(rec);
    itemIds.Add(rec.id.ToString());
}
items = GetItemsByImdbIds(itemIds);
The user logs are user clickstream events that are generated by a user's activity on the website, such as viewing an item, adding it to the shopping cart, etc. When a user adds an item to their shopping cart, for instance, the item's unique ID is captured as the ContentId in the CollectorLog document. The GetSeededRules method returns data from the associations Cosmos DB container where the source matches the ContentId values retrieved from the user logs (logs) that are passed to the method. The query applies this filter and also excludes records where the ContentId is listed as the target. We are just trying to find associations based on the source matching the items captured in the user logs. The results are then ordered by the confidence rating in descending order.
Ultimately, what is returned is a list of recommendations based on the relationship matrix between source movies and target movies with a high confidence level. This matrix was generated by the Association Rules Databricks notebook and saved to the associations container. Each time you run this notebook, it updates the matrices based on the latest user logs, or event history.
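The filtering that GetSeededRules is described as performing can be sketched in Python as follows. This is an in-memory approximation under assumptions: the real method queries the associations container, and the function name `seeded_rules` and the dictionary keys `source`, `target` and `confidence` are illustrative.

```python
def seeded_rules(rules, content_ids, take=None):
    """Keep rules whose source was seen in the user's logs, drop rules whose
    target the user has already interacted with, and sort by confidence."""
    seen = set(content_ids)
    matches = [
        r for r in rules
        if r["source"] in seen and r["target"] not in seen
    ]
    matches.sort(key=lambda r: r["confidence"], reverse=True)
    return matches if take is None else matches[:take]


if __name__ == "__main__":
    rules = [
        {"source": "1", "target": "2", "confidence": 0.4},
        {"source": "1", "target": "3", "confidence": 0.9},
        {"source": "4", "target": "5", "confidence": 1.0},
    ]
    # The user has interacted with items "1" and "2".
    print(seeded_rules(rules, ["1", "2"]))
```

Excluding targets the user has already seen keeps the recommendations from suggesting items that are already in the user's history.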
-
In the Contoso.Apps.Movies.Web project, open the HomeController.cs file.
-
Replace the Index method with the following:
var vm = new HomeModel();
Contoso.Apps.Movies.Data.Models.User user = (Contoso.Apps.Movies.Data.Models.User)Session["User"];
vm.RecommendProductsTop = RecommendationHelper.GetViaFunction("top", 0, 0);
if (user != null)
{
    vm.RecommendProductsBought = RecommendationHelper.GetViaFunction("assoc", user.UserId, 0);
    vm.RecommendProductsLiked = RecommendationHelper.GetViaFunction("collab", user.UserId, 0);
}
return View(vm);
-
Right-click the Contoso.Apps.FunctionApp function app project, select Publish.
-
Select Publish.
-
Right-click the Contoso.Apps.Movies.Web web app project, select Publish.
-
Select Publish, the site should load.
-
In the browser window that opened from your web application deployment above, check to see that you received recommendations as a non-logged in user. You should see the same results as you received previously.
-
In the top navigation, select LOGIN, then select the Comedy Fan ([email protected]) account.
-
Notice the main page now has different recommendations than what you received earlier, but we are still missing the similar 'liked' items.
Duration: 30 minutes
Synopsis: In this exercise you will execute the implicit ratings notebook in Azure Databricks to generate the implicit rating for each user that has event data. You will only execute this once during this lab; however, this notebook would need to be run on a set schedule to ensure that the user's rating data is up to date.
-
Switch back to your Databricks workspace, select 03 Ratings.
-
Attach your cluster to the notebook using the dropdown. In the drop down, select the small cluster.
-
Run each cell of the 03 Ratings notebook by selecting within the cell, then entering Ctrl+Enter on your keyboard. Pay close attention to the instructions within the notebook so you understand each step of the data preparation process.
This notebook will use the implicit events captured in the events container in Cosmos DB to calculate what a user would rate a given item, based on their actions. In other words, it converts a user's buy, addToCart and details actions into a numeric score for the item. The resulting user-to-item ratings matrix will be saved to the ratings container in Cosmos DB.
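One way to turn actions into a numeric score is to weight each event type and scale the totals per user, as in the Python sketch below. The weights, the 0 to 10 scale, the function name and the `userId` field are all assumptions made for illustration; the notebook defines its own scoring.

```python
from collections import defaultdict

# Hypothetical weights: actions that signal stronger intent score higher.
EVENT_WEIGHTS = {"details": 15, "addToCart": 50, "buy": 100}


def implicit_ratings(events, max_rating=10.0):
    """Return {(userId, contentId): rating}, scaled so each user's best item gets max_rating."""
    scores = defaultdict(float)
    for e in events:
        scores[(e["userId"], e["contentId"])] += EVENT_WEIGHTS.get(e["event"], 0)

    # Find each user's highest raw score so ratings are relative to that user.
    user_max = defaultdict(float)
    for (user, _), score in scores.items():
        user_max[user] = max(user_max[user], score)

    return {
        key: max_rating * score / user_max[key[0]]
        for key, score in scores.items()
        if user_max[key[0]] > 0
    }


if __name__ == "__main__":
    events = [
        {"userId": 400001, "contentId": "X", "event": "buy"},
        {"userId": 400001, "contentId": "Y", "event": "details"},
    ]
    print(implicit_ratings(events))
```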
-
Switch back to the Azure portal.
-
In your resource group, navigate to your Cosmos DB instance.
-
Open the ratings container, review the items in the container.
NOTE: These ratings are generated as part of this notebook as an 'offline' operation. If you collect a significant amount of user data, you would need to re-evaluate the events using this notebook and populate the ratings container again for the online calculations to utilize.
-
Switch back to your Databricks workspace, select 04 Similarity.
-
Attach your cluster to the notebook using the dropdown. In the drop down, select the small cluster.
-
Run each cell of the 04 Similarity notebook by selecting within the cell, then entering Ctrl+Enter on your keyboard. Pay close attention to the instructions within the notebook so you understand each step of the data preparation process.
The similarity calculation logic in the notebook uses the user-to-item ratings previously created to calculate a score indicating the similarity between a source item and a target item.
The process begins by loading the ratings matrix and for each user to item rating, calculating a new normalized rating (to adjust for the user's bias).
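A minimal Python sketch of this normalization step follows. It assumes that "normalized" means subtracting each user's mean rating from that user's ratings (mean-centering), which is a common way to adjust for user bias; the notebook may normalize differently, and the function name is illustrative.

```python
def normalize_ratings(ratings):
    """ratings: {user: {item: rating}}. Returns the same shape with each
    user's mean rating subtracted from every one of their ratings."""
    normalized = {}
    for user, items in ratings.items():
        if not items:
            continue  # a user with no ratings has no mean to subtract
        mean = sum(items.values()) / len(items)
        normalized[user] = {item: r - mean for item, r in items.items()}
    return normalized


if __name__ == "__main__":
    # A generous rater and a harsh rater end up on a comparable scale.
    print(normalize_ratings({"u1": {"a": 10, "b": 6}, "u2": {"a": 4, "b": 2}}))
```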
An overlap matrix is calculated that identifies, for any pair of items, how many users rated both items. First, the normalized ratings matrix is converted to a boolean matrix. That is, if an item for a user has a rating (regardless of the value of the rating), it has a value of 1; otherwise it is zero. Then the dot product of this boolean matrix against its transpose is calculated. This yields a simpler matrix where the value of each cell is the count of users who rated both items. Cells that don't have any overlap have a value of zero.
Separately, the cosine similarity of the normalized ratings matrix is computed. It's easiest to understand the cosine similarity calculation as being done between an item i and another item j. The cosine similarity is a ratio:
- The numerator is computed as the sum of the product of the normalized rating of item i multiplied with the rating of j, for all users who have provided ratings.
- The denominator is computed as the square root of the sum of the squares of the normalized rating of item i multiplied by the square root of the sum of the squares of the normalized rating of item j.
In Python, the logic uses the cosine_similarity method from scikit-learn to compute the similarity between items by providing it our normalized user-to-items ratings matrix.
The result is then filtered to remove entries with a similarity score lower than the configured minimum, and entries whose overlap in the overlap matrix is less than the configured minimum number of ratings for the pair of items.
Just before saving, any resulting similarities with scores less than the configured minimum similarity are removed, so that weaker similarities are not recommended.
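The overlap count and the cosine ratio can be written out in plain Python to show the arithmetic. This sketch deliberately avoids scikit-learn so that each term of the ratio is visible; the function names and threshold parameters are illustrative, and a missing rating is treated as zero.

```python
import math
from itertools import combinations


def overlap(ratings, i, j):
    """Number of users who rated both item i and item j."""
    return sum(1 for items in ratings.values() if i in items and j in items)


def item_cosine(ratings, i, j):
    """Cosine similarity between the rating columns of items i and j."""
    dot = sum(r.get(i, 0.0) * r.get(j, 0.0) for r in ratings.values())
    norm_i = math.sqrt(sum(r.get(i, 0.0) ** 2 for r in ratings.values()))
    norm_j = math.sqrt(sum(r.get(j, 0.0) ** 2 for r in ratings.values()))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


def similar_pairs(ratings, min_sim, min_overlap):
    """Keep item pairs that meet both the similarity and the overlap thresholds."""
    items = sorted({item for r in ratings.values() for item in r})
    result = []
    for i, j in combinations(items, 2):
        sim = item_cosine(ratings, i, j)
        if sim >= min_sim and overlap(ratings, i, j) >= min_overlap:
            result.append((i, j, sim))
    return result


if __name__ == "__main__":
    # Normalized ratings: {user: {item: normalized rating}}
    ratings = {
        "u1": {"a": 1.0, "b": 1.0},
        "u2": {"a": -1.0, "b": -1.0, "c": 2.0},
        "u3": {"a": 1.0, "c": 1.0},
    }
    print(similar_pairs(ratings, min_sim=0.5, min_overlap=2))
```

The overlap threshold guards against pairs that look highly similar only because a single user happened to rate both items.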
-
Switch back to the Azure portal.
-
In your resource group, navigate to your Cosmos DB instance, it should start with s2cosmosdb....
-
Select Data Explorer.
-
Select the similarity container, then select items.
-
Select one or more of the items and review them.
NOTE: These items are created from the Databricks solution and include the similarity of one movie, the source, to another, the target.
-
In the Contoso.Apps.FunctionApp project, open the RecommendationHelper.cs file.
-
In the CollaborativeBasedRecommendation method, find the TODO task #4 and complete it with the following:
int neighborhoodSize = 15;
double minSim = 0.0;
int maxCandidates = 100;
//inside this we do the implicit rating of events for the user...
Hashtable userRatedItems = GetRatedItems(userId, 100);
if (userRatedItems.Count == 0)
    return new List<string>();
//this is the mean rating a user gave
double ratingSum = 0;
foreach(double r in userRatedItems.Values)
{
    ratingSum += r;
}
double userMean = ratingSum / userRatedItems.Count;
//get similar items
List<SimilarItem> candidateItems = GetCandidateItems(userRatedItems.Keys, minSim);
//sort by similarity desc, take only max candidates
candidateItems = candidateItems.OrderByDescending(c=>c.similarity).Take(maxCandidates).ToList();
Hashtable recs = new Hashtable();
List<PredictionModel> precRecs = new List<PredictionModel>();
foreach(SimilarItem candidate in candidateItems)
{
    int target = candidate.Target;
    double pre = 0;
    double simSum = 0;
    List<SimilarItem> ratedItems = candidateItems.Where(c=>c.Target == target).Take(neighborhoodSize).ToList();
    if (ratedItems.Count > 1)
    {
        foreach (SimilarItem simItem in ratedItems)
        {
            try
            {
                string source = userRatedItems[simItem.sourceItemId].ToString();
                //rating of the movie - userMean;
                double r = double.Parse(source) - userMean;
                pre += simItem.similarity * r;
                simSum += simItem.similarity;
                if (simSum > 0)
                {
                    PredictionModel p = new PredictionModel();
                    p.Prediction = userMean + pre / simSum;
                    p.Items = ratedItems;
                    precRecs.Add(p);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }
}
//sort based on the prediction, only take x of them
List<PredictionModel> sortedItems = precRecs.OrderByDescending(c => c.Prediction).Take(take).ToList();
//get first model's items...
foreach(PredictionModel pm in sortedItems)
{
    foreach(SimilarItem ri in pm.Items)
    {
        if (ri.targetItemId != null)
        {
            itemIds.Add(ri.targetItemId.ToString());
            break;
        }
    }
}
To summarize, this code grabs 100 ratings for a specific user, then queries for any similar items that were generated in the Similarity notebook, within a preset neighborhood size (in this case 15). With a set of similar items, it filters out any items that fall outside the user's mean ratings. Then, it sorts the remaining items by the similarity and presents those as the recommendations.
The following is a more detailed walk-through of the code:
The implicit ratings that are stored in the userRatedItems hashtable for the currently logged in user come from the ratings Cosmos DB container, which is populated by the Ratings Databricks notebook. If no implicit ratings are found, the method returns an empty result. Otherwise, we add all the ratings and store them in the ratingSum variable, then calculate the mean rating (ratingSum / the count of ratings for the user), storing this value in the userMean variable.
Next, we retrieve the similar items from the similarity Cosmos DB container, which is populated by the Similarity Databricks notebook. We sort these items in descending order by the similarity between a source item and a target item, adjusted for the user's bias (do they prefer comedies, action movies, etc.). We proceed to loop through the collection of similar items, retrieving associated items within the same collection, taking the preset neighborhood size.
We continue by looping through the associated (or similar) items, matching up the user's implicit rating for each similar item, from the userRatedItems hashtable. Since the Similarity notebook calculated the cosine similarity between items, we need to perform a calculation to filter out any items that fall outside the user's mean ratings. As you may recall, the notebook performs the cosine similarity calculation with the following ratio:
- The numerator is computed as the sum of the product of the normalized rating of item i multiplied with the rating of j, for all users who have provided ratings.
- The denominator is computed as the square root of the sum of the squares of the normalized rating of item i multiplied by the square root of the sum of the squares of the normalized rating of item j.
To account for this, we take the movie's implicit rating, minus the user's mean rating (userMean), then multiply the similarity by this value in a loop. We also keep a running sum of the item similarity (simSum). If this value is greater than zero, we add a new PredictionModel to the collection. Before returning this collection, we sort it in descending order by the prediction, taking only the number of items requested in the method parameters (take).
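The core of the prediction is the expression userMean + pre / simSum. The Python sketch below isolates that arithmetic for a single target item. It is a simplified illustration, not a port of the C# method: the function name and the shape of the inputs are assumptions, and it computes one prediction per target rather than one per loop iteration.

```python
def predict_rating(user_ratings, neighbours):
    """Predict the user's rating for one target item.

    user_ratings: {item_id: implicit rating} for the current user.
    neighbours:   list of (source_item_id, similarity) pairs, where each
                  source item is similar to the target item.
    Returns None when no prediction can be made.
    """
    if not user_ratings:
        return None
    user_mean = sum(user_ratings.values()) / len(user_ratings)

    weighted = 0.0  # running sum of similarity * (rating - user mean)
    sim_sum = 0.0   # running sum of similarities
    for source, similarity in neighbours:
        if source not in user_ratings:
            continue  # the user never rated this neighbour
        weighted += similarity * (user_ratings[source] - user_mean)
        sim_sum += similarity

    if sim_sum <= 0:
        return None
    return user_mean + weighted / sim_sum


if __name__ == "__main__":
    ratings = {"a": 8.0, "b": 4.0}
    print(predict_rating(ratings, [("a", 0.5), ("b", 0.25)]))
```

Items the user rated above their own mean pull the prediction up in proportion to their similarity to the target, and items rated below the mean pull it down.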
-
Right-click the Contoso.Apps.FunctionApp function app project, select Publish.
-
Select Publish.
-
Right-click the Contoso.Apps.Movies.Web web app project, select Publish.
-
Select Publish, the site should load.
-
In the browser window that opened from your web application deployment above, check to see that you received recommendations as a non-logged in user. You should see the same results as you received previously.
-
If you are not already logged in, select LOGIN, then select the [email protected] account.
-
Notice the main page now has both the associative and collaborative results displayed:
Duration: 30 minutes
Synopsis: In this exercise you will set up Stream Analytics to process the change feed events fired from Cosmos DB into an Azure Function, which then forwards them to an event hub for real-time Power BI analytics.
-
Open the Azure Portal, navigate to your Stream Analytics job that was created for you in the setup script.
-
Select Inputs.
-
Select +Add stream input, then select Event Hub.
-
For the alias, type s2events.
-
Select your subscription.
-
Select the s2ns.. event hub.
-
For the event hub, select store.
-
For the policy name, select RootManageSharedAccessKey.
-
Select Save.
-
Select Outputs.
-
Select +Add, then select Power BI.
-
For the output alias, type eventOrdersLastHour.
-
For the dataset, type eventOrdersLastHour.
-
For the table name, type eventOrdersLastHour.
-
Select Authorize, then log in to your Power BI instance.
-
Select Save.
-
Repeat steps 11-16, but replace eventOrdersLastHour with:
- eventSummary
- failureCount
- eventData
-
Select Query.
-
Update the query to the following:
SELECT Count(*) as FailureCount
INTO failureCount
FROM s2events
WHERE Event = 'paymentFailure'
GROUP BY TumblingWindow(second,10)

SELECT Count(distinct UserId) as UserCount, System.TimeStamp AS Time, Count(*) as EventCount
INTO eventData
FROM s2events
GROUP BY TumblingWindow(second,10)

SELECT System.TimeStamp AS Time, Event, Count(*)
INTO eventSummary
FROM s2events
GROUP BY Event, TumblingWindow(second,10)

select DateAdd(second,-10,System.Timestamp()) AS WinStartTime, System.Timestamp() AS WinEndTime, 0 as Min, Count(*) as Count, 10 as Target
into eventOrdersLastHour
from s2events
where event = 'buy'
GROUP BY SlidingWindow(second,10)
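To build intuition for what GROUP BY Event, TumblingWindow(second,10) produces, the following Python sketch buckets events into fixed, non-overlapping 10-second windows and counts them per event type. It is a simplified model, not Stream Analytics itself: the function name and input shape are assumptions, results are keyed by the end of each window, and the handling of events that fall exactly on a window boundary is simplified.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=10):
    """events: iterable of (timestamp_in_seconds, event_name) pairs.
    Returns {(window_end, event_name): count} for fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, name in events:
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start + window_seconds, name)] += 1
    return dict(counts)


if __name__ == "__main__":
    events = [(1, "buy"), (3, "details"), (9.5, "buy"), (12, "buy")]
    for key, count in sorted(tumbling_window_counts(events).items()):
        print(key, count)
```

Each event is counted in exactly one window, which is what distinguishes a tumbling window from the sliding window used in the last query.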
-
The Query window should look similar to this:
-
Select Save query.
-
Select Overview, then select Start in the menu to start your Stream Analytics job.
-
In the dialog, ensure that Now is selected, then select Start.
NOTE: If your job fails for any reason, you can use the Activity Log to view the error(s).
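To make the windowing in the query more concrete, here is a minimal sketch of how a 10-second tumbling window count behaves, written in plain C# rather than Stream Analytics. The class and method names are hypothetical, the event shape is reduced to a timestamp and an event name, and the sketch assumes each window excludes its start time and includes its end time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TumblingWindowSketch
{
    // Counts events named eventName per fixed, non-overlapping window.
    // The result is keyed by window end time; windows with no events are absent.
    public static SortedDictionary<DateTime, int> CountPerWindow(
        IEnumerable<(DateTime Time, string Event)> events,
        TimeSpan size,
        string eventName)
    {
        var counts = new SortedDictionary<DateTime, int>();
        foreach (var e in events.Where(x => x.Event == eventName))
        {
            // Round the timestamp up to the next window boundary.
            // An event exactly on a boundary belongs to the window ending there.
            long remainder = e.Time.Ticks % size.Ticks;
            long endTicks = remainder == 0
                ? e.Time.Ticks
                : e.Time.Ticks - remainder + size.Ticks;
            var windowEnd = new DateTime(endTicks, e.Time.Kind);

            counts[windowEnd] = counts.TryGetValue(windowEnd, out var current)
                ? current + 1
                : 1;
        }
        return counts;
    }
}
```

Calling `TumblingWindowSketch.CountPerWindow(events, TimeSpan.FromSeconds(10), "paymentFailure")` roughly mirrors the first statement of the query: one count per 10-second window, reported at the window's end.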
-
In the Contoso.Apps.FunctionApp project, open the FuncChangeFeed.cs file.
-
Take a moment to review the function signature. Notice how it is triggered by changes in a Cosmos DB container.
-
Find the TODO task #5 and complete it with the following:
AddEventToEventHub(events);
-
Add the following method to the function class:
public void AddEventToEventHub(IReadOnlyList<Document> events)
{
    try
    {
        // Event hub connection
        EventHubClient eventHubClient;
        string EventHubConnectionString = config["eventHubConnection"];
        string EventHubName = "store";

        var connectionStringBuilder = new EventHubsConnectionStringBuilder(EventHubConnectionString)
        {
            EntityPath = EventHubName
        };

        eventHubClient = EventHubClient.CreateFromConnectionString(connectionStringBuilder.ToString());

        foreach (var e in events)
        {
            string data = JsonConvert.SerializeObject(e);
            var result = eventHubClient.SendAsync(new EventData(Encoding.UTF8.GetBytes(data)));
        }
    }
    catch (Exception ex)
    {
        log.LogError(ex.Message);
    }
}
NOTE: This method forwards the change feed events to the event hub, where Stream Analytics monitors them and forwards data to the Power BI dashboard.
-
Right-click the Contoso.Apps.FunctionApp function app project, then select Publish.
-
Select Publish.
-
Right-click the DataGenerator project, select Set as startup project.
-
Press F5 to run the project.
-
Notice events will be generated based on a set of users and their preferred movie type.
-
Buy events will be generated for the first 30 seconds with random payment failures also generated. After 30 seconds, you will notice the orders per hour will fall below the target of 10. This would signify that something is wrong with the front end web site or order processing.
-
After about 1 minute, close the DataGenerator console program.
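The drop below the target of 10 can be reasoned about with a small sketch of the sliding-window count that feeds the gauge. This is plain C#, not the Stream Analytics implementation. The names are hypothetical, and it assumes a window that excludes its start time and includes its end time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderRateSketch
{
    // Counts buy events whose timestamp falls in (windowEnd - size, windowEnd].
    public static int CountInWindow(
        IEnumerable<DateTime> buyTimes, DateTime windowEnd, TimeSpan size)
    {
        var windowStart = windowEnd - size;
        return buyTimes.Count(t => t > windowStart && t <= windowEnd);
    }

    // True when the windowed count has dropped below the gauge target.
    public static bool IsBelowTarget(int count, int target) => count < target;
}
```

With one buy event per second, a 10-second window holds 10 events and meets the target. Once the generator stops producing buy events, the window empties and `IsBelowTarget` returns true.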
-
Open a new browser window to Power BI.
-
Click Sign In, sign in using the same credentials you used to authorize your outputs for Stream Analytics above.
-
Select My workspace.
-
Select +Create, then select Dashboard.
-
For the name, type Contoso Movies, select Create.
-
Select the ... ellipses, then select +Add tile.
-
Select Custom Streaming Data, select Next.
-
Select the eventData data set, then select Next.
Important: If the eventData data set does not appear, allow for a lag of several minutes between when you first configure the Stream Analytics Power BI output and when data first appears in the streaming data set. Ensure the data generator is running and that you have started the Stream Analytics job. You may also try restarting the Function App.
-
For the visualization type, select Card.
-
For the Fields, select EventCount.
-
Select Next.
-
For the title, type Event Count, then select Apply.
-
Select +Add tile. You may need to select the ... ellipses first.
-
Select Custom Streaming Data, select Next. Use the following table to create the needed tiles:
| Dataset | Type | Fields | Title |
| --- | --- | --- | --- |
| eventData | Card | UserCount | User Count |
| failureCount | Card | FailureCount | Payment Failures |
| eventSummary | Line chart | Axis = Time, Legend = Event, Values = Count | Count By Event |
| eventOrdersLastHour | Gauge | Value = Count, Minimum = Min, Target = Target | Orders Per Hour |
-
Switch back to Visual Studio, press F5 to run the data generator project.
-
Switch to your Power BI dashboard, after a few minutes, you should see it update with the event data:
Duration: 30 minutes
In this exercise you will configure your Cosmos DB change feed function to call an HTTP Logic App endpoint that then sends an email when an order event occurs. The function uses Polly to handle retries in case the Logic App endpoint is not available.
-
Open the Azure Portal to your resource group and select the Logic App. It should be named s2logicapp....
-
Select Edit.
-
Select +New step.
-
Search for send an email, then select the Office 365 Outlook connector.
-
Select Sign in, then log in using your Azure AD credentials.
-
Set the To as your email address.
-
Set the Subject as Thank you for your order.
-
Set the Body as Your order is being processed.
-
Select Save.
-
Select the When a HTTP request is received action, copy the HTTP POST URL for the Logic App, and save it for the next task.
-
Open the Azure Portal to your resource group and select the Function App. It should be named s2func....
-
Select Configuration.
-
Update the LogicAppUrl configuration variable to the Logic App HTTP endpoint you recorded above.
-
Select Save.
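A mistyped or truncated URL in this setting only surfaces later, when the function tries to post. As a hedged illustration, the sketch below validates a setting value before use. The `SettingsSketch` class and `ParseLogicAppUrl` method are hypothetical and not part of the lab's project, and the requirement for HTTPS is an assumption made for this example.

```csharp
using System;

public static class SettingsSketch
{
    // Returns the setting as an absolute HTTPS Uri, or throws a descriptive error.
    public static Uri ParseLogicAppUrl(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException("LogicAppUrl is not set.");
        }

        if (!Uri.TryCreate(value, UriKind.Absolute, out var uri)
            || uri.Scheme != Uri.UriSchemeHttps)
        {
            throw new InvalidOperationException("LogicAppUrl must be an absolute HTTPS URL.");
        }

        return uri;
    }
}
```

Inside the function, this could be called as `SettingsSketch.ParseLogicAppUrl(Environment.GetEnvironmentVariable("LogicAppUrl"))`, since app settings are exposed to the function as environment variables.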
-
In the Contoso.Apps.FunctionApp.ChangeFeed project, open the FuncChangeFeed.cs file.
-
Take a moment to review the function signature. Notice how it is triggered based on a Cosmos DB container.
-
Find the TODO task #6 and complete it with the following:
CallLogicApp(events);
-
Add the following method to the function class:
public async void CallLogicApp(IReadOnlyList<Document> events)
{
    try
    {
        // Have the HttpClient factory create a new client instance.
        var httpClient = _httpClientFactory.CreateClient("LogicAppClient");

        // Create the payload to send to the Logic App.
        foreach (var e in events)
        {
            var payload = new LogicAppAlert
            {
                data = JsonConvert.SerializeObject(e),
                recipientEmail = Environment.GetEnvironmentVariable("RecipientEmail")
            };

            var postBody = JsonConvert.SerializeObject(payload);

            var httpResult = await httpClient.PostAsync(
                Environment.GetEnvironmentVariable("LogicAppUrl"),
                new StringContent(postBody, Encoding.UTF8, "application/json"));
        }
    }
    catch (Exception ex)
    {
        log.LogError(ex.Message);
    }
}
-
Right-click the Contoso.Apps.FunctionApp function app project, then select Publish.
-
Select Publish.
-
Switch to Visual Studio, right-click the DataGenerator project, select Set as startup project.
-
Press F5 to run the project.
-
For each buy event, you will receive an email.
NOTE: It can take up to 5 minutes to receive emails. When you do, you could receive quite a few emails.
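The retry behavior mentioned at the start of this exercise can be pictured with a hand-written stand-in. The sketch below is not Polly and not the lab's configuration. It is a minimal retry loop with exponential backoff, and the class name, method name, and parameters are all hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Runs the operation up to maxAttempts times, doubling the delay after
    // each failure. The exception from the final attempt is rethrown.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        var delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Wait before the next attempt, then double the wait.
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call to the Logic App could be wrapped as `RetrySketch.WithBackoffAsync(() => httpClient.PostAsync(url, content), 3, TimeSpan.FromSeconds(1))`. In practice a policy library such as Polly adds refinements like filtering by status code, which this sketch leaves out.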
Duration: 5 minutes
In this exercise, attendees will de-provision any Azure resources that were created in support of the lab.
-
Using the Azure portal, navigate to the Resource group you used throughout this hands-on lab by selecting Resource groups in the menu.
-
Search for the name of your resource group, and select it from the list.
-
Select Delete in the command bar, and confirm the deletion by re-typing the Resource group name and selecting Delete.
You should follow all steps provided after attending the Hands-on lab.