
Commit

[Optimization engine] Added troubleshooting page and split FAQ (microsoft#1008)

Co-authored-by: Helder Pinto <[email protected]>
Co-authored-by: Michael Flanakin <[email protected]>
3 people authored Sep 27, 2024
1 parent 8ff2f9f commit 12bfaa9
Showing 5 changed files with 51 additions and 43 deletions.
2 changes: 1 addition & 1 deletion docs/_optimize/optimization-engine/README.md
@@ -44,7 +44,7 @@ The Azure Optimization Engine (AOE) was initially developed to augment Virtual M
Besides collecting **all Azure Advisor recommendations**, AOE includes other custom recommendations that you can tailor to your needs, such as:

* 💰 Cost
* Augmented Advisor VM right-sizing cost recommendations, with fit score based on virtual machine guest OS metrics (collected by Log Analytics or Azure Monitor agents) and Azure properties
* Augmented Advisor VM right-sizing cost recommendations, with fit score based on virtual machine guest OS metrics (collected by Azure Monitor agents) and Azure properties
* Underutilized VM scale sets, premium SSD disks, App Service plans, and Azure SQL databases (DTU-based SKUs only)
* Orphaned disks and public IPs
* Standard load balancers or application gateways without backend pool
27 changes: 2 additions & 25 deletions docs/_optimize/optimization-engine/configuring-workspaces.md
@@ -50,32 +50,9 @@ Install-Module -Name Az.OperationalInsights
./Setup-DataCollectionRules.ps1 -DestinationWorkspaceResourceId "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.OperationalInsights/workspaces/myWorkspace" -IntervalSeconds 30 -ResourceTags @{"tagName"="tagValue";"otherTagName"="otherTagValue"}
```

### Log Analytics agent (legacy Microsoft Monitoring Agent)
### Log Analytics agent (legacy Microsoft Monitoring Agent, deprecated on August 31, 2024)

With the help of the `Setup-LogAnalyticsWorkspaces.ps1` script, you can validate and fix the configured Log Analytics performance counters on the workspaces of your choice. In its simplest form of usage, it looks at all the Log Analytics workspaces you have access to and, for each workspace with Azure VMs onboarded, it validates performance counters configuration and tells you which counters are missing. But you can target a specific workspace and, if required, automatically fix the missing counters. See usage details below.

#### Requirements

```powershell
Install-Module -Name Az.Accounts
Install-Module -Name Az.ResourceGraph
Install-Module -Name Az.OperationalInsights
```

#### Usage

```powershell
./Setup-LogAnalyticsWorkspaces.ps1 [-AzureEnvironment <AzureChinaCloud|AzureUSGovernment|AzureGermanCloud|AzureCloud>] [-WorkspaceIds <comma-separated list of Log Analytics workspace IDs to validate>] [-IntervalSeconds <performance counter collection frequency - default 60>] [-AutoFix]
# Example 1 - just check all the workspaces configuration
./Setup-LogAnalyticsWorkspaces.ps1
# Example 2 - fix all workspaces configuration (using default counter collection frequency)
./Setup-LogAnalyticsWorkspaces.ps1 -AutoFix
# Example 3 - fix specific workspaces configuration, using a custom counter collection frequency
./Setup-LogAnalyticsWorkspaces.ps1 -AutoFix -WorkspaceIds "d69e840a-2890-4451-b63c-bcfc5580b90f","961550b2-2c4a-481a-9559-ddf53de4b455" -IntervalSeconds 30
```
If you are still using the legacy Log Analytics agent, please migrate to the [Azure Monitor Agent](https://learn.microsoft.com/azure/azure-monitor/agents/azure-monitor-agent-migration).

<br>

17 changes: 0 additions & 17 deletions docs/_optimize/optimization-engine/faq.md
@@ -17,23 +17,6 @@ All the frequently asked questions about AOE in one place.

* **What type of Azure subscriptions/clouds are supported?** AOE has been deployed and tested against EA, MCA and MSDN subscriptions in the Azure commercial cloud (AzureCloud). Although not tested yet, it should also work in MOSA subscriptions. It was designed to also operate in the US Government cloud, though it was never tested there. Sponsorship (MS-AZR-0036P and MS-AZR-0143P), CSP (MS-AZR-0145P, MS-AZR-0146P, and MS-AZR-159P), DreamSpark (MS-AZR-0144P) and Internal subscriptions should also work, but due to lack of availability or disparities in their consumption (billing) export models, some of the Workbooks may not fully work.

* **Why are my Recommendations workbook and Power BI report still empty after deploying AOE?** AOE takes up to 3 hours after deployment to export and ingest the data required to generate recommendations into Log Analytics / SQL Database. If after this time you aren't still seeing any recommendations, check whether:
* Azure Advisor has been reporting recommendations for the subscriptions in the AOE scope;
* Azure Automation runbooks have been failing, especially critical ones such as `Ingest-` and `Recommend-`, and verify the Exception message that is logged, which will normally give you a hint for the failure cause;
* a daily cap has been set in the AOE Log Analytics Workspace that might be dropping the ingestion of AOE logs after the cap was reached.

* **Why some workbooks present this message: `Failed to resolve table or column expression named 'AzureOptimizationPricesheetV1_CL'`?** This is typically a symptom of not having granted the required permissions to the AOE Automation Account managed identity. See instructions [here](https://aka.ms/AzureOptimizationEngine/commitmentssetup).

* **Why is the Identity and Roles workbook empty and presenting error messages?** This is typically a symptom of not having granted the required permissions, at the Entra ID tenant level, to the AOE Automation Account managed identity. After having granted the `Global Reader` role to the AOE managed identity, the workbook should populate on the next day.

* **Why is my Power BI report empty?** Most of the Power BI report pages are configured to filter out recommendations older than 7 days. If it shows empty, just try to refresh the report data.

* **Why is my VM right-size recommendations overview page empty?** The AOE depends on Azure Advisor Cost recommendations for VM right-sizing. If no VMs are showing up, try increasing the CPU threshold in the Azure Advisor configuration... or maybe your infrastructure is not oversized after all!

* **Why are my VM right-size recommendations showing up with so many Unknowns for the metrics thresholds?** The AOE depends on your VMs being monitored by Log Analytics agents and configured to send a set of performance metrics that are then used to augment Advisor recommendations. See more details [here](https://aka.ms/AzureOptimizationEngine/rightsizeblogpt2).

* **Why am I getting values so small for costs and savings after setting up AOE?** The Azure consumption exports runbook has just begun its daily execution and only got one day of consumption data. After one month - or after manually kicking off the runbook for past dates -, you should see the correct consumption data.

* **What is the currency used for costs and savings?** The currency used is the one that is reported by default by the Azure Consumption APIs. It should match the one you usually see in Azure Cost Management.

* **What is the default time span for collecting Azure consumption data?** By default, the Azure consumption exports daily runbook collects 1-day data from 3 days ago. This offset works well for many types of subscriptions. If you're running AOE in PAYG or EA subscriptions, you can decrease the offset by adjusting the `AzureOptimization_ConsumptionOffsetDays` variable. However, using a value less than 2 days is not recommended.
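
As a rough illustration of the offset described in the last answer, the PowerShell sketch below computes which day of consumption data a daily run would target. The function name `Get-ConsumptionTargetDate` and its parameters are hypothetical and not part of AOE; only the 3-day default and the 2-day lower bound come from the text above, and the actual runbook logic may differ.

```powershell
# Hypothetical helper (not part of AOE): returns the single day of consumption
# data that a daily run would target, given an offset in days.
function Get-ConsumptionTargetDate {
    param(
        [datetime] $RunDate = (Get-Date),
        [ValidateRange(0, 365)]
        [int] $OffsetDays = 3   # documented default offset
    )
    if ($OffsetDays -lt 2) {
        # The FAQ advises against offsets below 2 days.
        Write-Warning 'Offsets below 2 days are not recommended.'
    }
    # Drop the time of day, then step back by the offset.
    return $RunDate.Date.AddDays(-$OffsetDays)
}

# A run on 2024-09-27 with the default offset targets 2024-09-24.
(Get-ConsumptionTargetDate -RunDate ([datetime]'2024-09-27')).ToString('yyyy-MM-dd')
```

To change the offset in a deployment, the `AzureOptimization_ConsumptionOffsetDays` variable can be updated in the portal or, assuming the Az.Automation module is installed, with `Set-AzAutomationVariable` against your own resource group and Automation Account names.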
39 changes: 39 additions & 0 deletions docs/_optimize/optimization-engine/troubleshooting.md
@@ -0,0 +1,39 @@
---
layout: default
parent: Optimization engine
title: Troubleshooting
nav_order: 70
description: 'Solutions to the most frequent issues with deployment and runtime.'
permalink: /optimization-engine/troubleshooting
---

<span class="fs-9 d-block mb-4">Troubleshooting</span>
Solutions to the most frequent issues with deployment and runtime.
{: .fs-6 .fw-300 }

---

* **When deploying AOE, I am getting a generic template deployment error.** In some situations, the AOE template deployment results in a "_The template deployment failed with multiple errors_" message or similar. To identify the cause of the failure, open the "_Deployments_" menu option in the Azure portal for both the resource group and the subscription you chose to deploy AOE in. You will find a `resourcesDeployment` deployment in the resource group and a deployment with the AOE name prefix in the subscription, where you can see the error details. Azure Policy deny policies are one of the typical causes of deployment errors.

* **Why are my Recommendations workbook and Power BI report still empty after deploying AOE?** AOE takes up to 3 hours after deployment to export and ingest the data required to generate recommendations into Log Analytics / SQL Database. If you still aren't seeing any recommendations after this time, check whether:
  * You have changed the Power BI data source to the SQL Database endpoint of your AOE deployment ([see instructions](https://aka.ms/AzureOptimizationEngine/reports)).
  * Azure Advisor has been reporting recommendations for the subscriptions in the AOE scope.
  * You refreshed the report data, as most of the Power BI report pages are configured to filter out recommendations older than 7 days.
  * Azure Automation runbooks have been failing, especially critical ones such as `Ingest-RecommendationsToLogAnalytics`, `Ingest-RecommendationsToSQLServer`, and all the runbooks with a `Recommend-` prefix. If they have, check the logged Exception message, which normally hints at the cause of the failure.
  * A daily cap has been set in the AOE Log Analytics Workspace that might be dropping the ingestion of AOE logs after the cap was reached.

* **Why do some workbooks present this message: `Failed to resolve table or column expression named 'AzureOptimizationPricesheetV1_CL'`?** This is typically a symptom of not having granted the required permissions to the AOE Automation Account managed identity, which authenticates with Azure Cost Management to download your Azure pricesheet. See setup instructions [here](https://aka.ms/AzureOptimizationEngine/commitmentssetup). NOTE: only Enterprise Agreement (EA) and Microsoft Customer Agreement (MCA) customers are supported by AOE for Azure pricesheet download.

* **Why do some workbooks present this message: `Failed to resolve table or column expression named 'AzureOptimizationReservationsUsageV1_CL' (or 'AzureOptimizationSavingsPlansUsageV1_CL')`?** This can be caused by a lack of permissions in the AOE managed identity (see question above) or simply because your organization did not buy any Reservations or Savings Plans.

* **Why is the Identity and Roles workbook empty and presenting error messages?** This is typically a symptom of not having granted the required permissions, at the Entra ID tenant level, to the AOE Automation Account managed identity. After you grant the `Global Reader` role to the AOE managed identity, the workbook should populate on the next day. If the workbook is still reporting errors after that, check whether the `Export-AADObjectsToBlobStorage` runbook is failing and review the logged Exception message, which normally hints at the cause of the failure. A typical cause is insufficient memory in the Azure Automation sandbox worker. For a Hybrid Worker workaround, see instructions [here](https://aka.ms/AzureOptimizationEngine/customize#-scale-aoe-runbooks-with-hybrid-worker). You can also filter the Entra ID users and groups by creating the `AzureOptimization_AADObjectsUserFilter` and `AzureOptimization_AADObjectsGroupFilter` automation variables with a [Microsoft Graph OData filter](https://learn.microsoft.com/graph/filter-query-parameter?tabs=http).

* **The `Export-ConsumptionToBlobStorage` runbook takes a long time to finish or the `Ingest-OptimizationCSVExportsToLogAnalytics` runbook has been failing consistently for the `consumptionexports` container.** This might be caused by AOE having to deal with a large number of subscriptions in your environment, exporting a large number of small blobs. To optimize Azure consumption ingestion, we recommend switching consumption exports from a subscription scope to a billing account or billing profile scope (NOTE: this is possible only for EA or MCA customers). To achieve this, create an `AzureOptimization_ConsumptionScope` variable in the AOE Automation Account, set to `BillingAccount` (EA) or `BillingProfile` (MCA). Ensure you have granted the needed permissions to the AOE managed identity at the EA/MCA billing account/profile level and that the `AzureOptimization_BillingAccountID` (EA/MCA) and `AzureOptimization_BillingProfileID` (MCA only) variables are correctly set ([see instructions](https://aka.ms/AzureOptimizationEngine/commitmentssetup)). After applying all these settings, the next run of the consumption exports should generate a single blob for the whole billing account/profile.

* **Why is my VM right-size recommendations overview page empty?** The AOE depends on Azure Advisor Cost recommendations for VM right-sizing. If no VMs are showing up, try increasing the CPU threshold in the Azure Advisor configuration (see steps [here](https://learn.microsoft.com/azure/advisor/advisor-cost-recommendations#configure-vmvmss-recommendations))... or maybe your virtual machine infrastructure is not oversized after all!

* **Why are my VM right-size recommendations showing up with so many Unknowns for the metrics thresholds?** The AOE depends on your VMs being monitored by Azure Monitor agents and configured to send a set of performance metrics that are then used to augment Advisor recommendations. See more details [here](https://aka.ms/AzureOptimizationEngine/workspaces).

* **Why am I getting such small values for costs and savings after setting up AOE?** The Azure consumption exports runbook has just begun its daily execution and has collected only one day of consumption data. After one month (or after manually running the runbook for past dates), you should see the correct consumption data.

* **Why am I seeing historical data in the AOE workbooks only for the last 30 days?** The default AOE Log Analytics retention is 30 days. If you need to keep historical data for a longer period, [increase the Log Analytics retention](https://learn.microsoft.com/troubleshoot/azure/azure-monitor/log-analytics/billing/configure-data-retention) accordingly.
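
Several of the answers above come down to finding out which Azure Automation runbook jobs have failed. The PowerShell sketch below shows one way to narrow a list of jobs to failed runs of the runbook families named on this page. The function name `Select-FailedCriticalJob` is hypothetical and not part of AOE, the `Unrelated-Runbook` sample entry is made up for the illustration, and the input objects are assumed to expose `RunbookName` and `Status` properties like the job objects returned by the Az.Automation module.

```powershell
# Hypothetical helper (not part of AOE): keeps only failed jobs whose runbook
# name matches one of the prefixes mentioned in the troubleshooting list.
function Select-FailedCriticalJob {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [psobject] $Job,
        [string[]] $NamePattern = @('Ingest-*', 'Recommend-*', 'Export-*')
    )
    process {
        # Skip anything that did not fail.
        if ($Job.Status -ne 'Failed') { return }
        foreach ($pattern in $NamePattern) {
            if ($Job.RunbookName -like $pattern) {
                $Job    # emit the job once, on the first matching pattern
                break
            }
        }
    }
}

# Sample input shaped like Az.Automation job objects (assumed properties only).
$sampleJobs = @(
    [pscustomobject]@{ RunbookName = 'Ingest-RecommendationsToSQLServer';    Status = 'Failed' }
    [pscustomobject]@{ RunbookName = 'Ingest-RecommendationsToLogAnalytics'; Status = 'Completed' }
    [pscustomobject]@{ RunbookName = 'Export-AADObjectsToBlobStorage';       Status = 'Failed' }
    [pscustomobject]@{ RunbookName = 'Unrelated-Runbook';                    Status = 'Failed' }
)

# Prints the two failed runbooks that match the default patterns:
#   Ingest-RecommendationsToSQLServer
#   Export-AADObjectsToBlobStorage
$sampleJobs | Select-FailedCriticalJob | Select-Object -ExpandProperty RunbookName
```

Against a real deployment, and assuming the Az.Automation module and your own resource group and Automation Account names, the output of `Get-AzAutomationJob` could be piped into the same function; the logged Exception message for each failed job can then be reviewed in the portal's job details.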
9 changes: 9 additions & 0 deletions docs/_resources/changelog.md
@@ -98,10 +98,19 @@ Legend:
🔍 Optimization engine
{: .fs-5 .fw-500 .mt-4 mb-0 }

> ➕ Added:
>
> 1. [Troubleshooting documentation page](../_optimize/optimization-engine/troubleshooting.md) with the most common deployment and runtime issues and respective solutions or troubleshooting steps.
>
> ✏️ Changed:
>
> 1. Replaced storage account key-based authentication with Entra ID authentication for improved security.
>
> 🚫 Deprecated:
>
> 1. With the deprecation of the legacy Log Analytics agent on August 31, 2024, the `Setup-LogAnalyticsWorkspaces` script is no longer being maintained and will be removed in a future update.
>    - The script was used to set up performance counter collection for machines connected to Log Analytics workspaces with the legacy agent.
>    - We recommend migrating to the [Azure Monitor Agent](https://learn.microsoft.com/azure/azure-monitor/agents/azure-monitor-agent-migration) and using the `Setup-DataCollectionRules` script to [set up performance counter collection with Data Collection Rules](https://aka.ms/AzureOptimizationEngine/workspaces).
🖥️ PowerShell
{: .fs-5 .fw-500 .mt-4 mb-0 }