This notebook performs a comprehensive Profit and Loss (P&L) analysis using financial data from a Xero Demo Company dataset. It leverages Google BigQuery for data retrieval and Google Vertex AI for generating insightful analysis using natural language processing.
This notebook automates the process of generating a detailed P&L report by:
- Fetching financial data from BigQuery
- Processing and analyzing the data
- Using Vertex AI to generate natural language insights
- Producing a comprehensive HTML report
The notebook performs a detailed analysis of the Profit and Loss (P&L) data through several steps before leveraging the LLM for insights.
- The P&L data is first summarized at the account category level (e.g., Revenue, Cost of Delivery, Overheads).
- This provides a high-level view of the company's financial performance.
- The notebook calculates changes between different time periods:
- Month-over-Month (MoM)
- Year-to-Date (YTD) vs. Previous Year-to-Date
- These calculations highlight overall trends and significant shifts in financial performance.
- Within each account category, the data is further broken down by account groups.
- This more granular view helps identify which specific areas are driving changes in the broader categories.
- The notebook analyzes trends within account groups over time.
- It identifies groups with significant growth or decline, which may require further investigation.
- For account groups showing significant period-on-period changes, the notebook identifies individual transactions that are primarily responsible for these changes.
- This step helps pinpoint the root causes of major financial shifts.
- All the above calculations and findings are structured into a format suitable for LLM processing.
- This includes:
- Overall financial summaries
- Detailed breakdowns of significant changes
- Lists of notable transactions and their impacts
- The structured data is passed to the Vertex AI model via a carefully constructed prompt.
- The LLM then generates insights, including:
- Summary of overall financial performance
- Commentary on significant changes and their potential causes
- Identification of trends and their possible implications
- Recommendations based on the financial data
- The LLM's output is combined with the structured financial data and any generated visualizations.
- This comprehensive information is formatted into an HTML report for easy reading and sharing.
This multi-step analysis process ensures that the LLM has access to thoroughly processed and contextualized financial data, enabling it to provide more accurate, relevant, and insightful commentary on the company's financial performance.
- Google Cloud Platform account with BigQuery and Vertex AI enabled
- Python 3.7+
- Required Python libraries:
pandas
,matplotlib
,google-cloud-bigquery
,google-cloud-aiplatform
,langchain
- Clone this repository
- Install required Python libraries:
pip install -r requirements.txt
- Set up Google Cloud credentials
- Update the
PROJECT_ID
,DATASET_ID
, andLOCATION
variables in the notebook
The data for this notebook comes from the Fivetran Xero dbt transformation package. This package processes raw Xero data loaded by Fivetran's connector and produces analytics-ready models.
- Produces modeled tables leveraging Xero data from Fivetran's connector
- Provides analytics-ready models, including profit and loss report, general ledger, and balance sheet report
- Generates a comprehensive data dictionary through the dbt docs site
- xero__general_ledger: Contains journal line items, used for creating balance sheets and profit and loss statements.
- xero__profit_and_loss_report: Represents profit and loss line items at the month and account level.
- xero__balance_sheet_report: Shows the state of the balance sheet for each account on a monthly basis.
- xero__invoice_line_items: Contains invoice line items enriched with account, contact, and invoice information.
Fivetran offers the ability to run this dbt package automatically when loading Xero data. This process reshapes the data for optimization in downstream processes like data visualization or machine learning.
Fivetran provides two transformation solutions:
- Pre-built data models: Analytics-ready tables created by Fivetran, available directly in the Fivetran dashboard or importable into a dbt project.
- Integrations: Use Fivetran-hosted dbt Core or third-party dbt Cloud orchestration to manage all transformations in your dbt project.
These transformations run in your destination through dbt Core or are triggered to run by third parties like dbt Cloud, following the ELT (Extract, Load, Transform) model.
get_pl_data
: Retrieves financial data from BigQueryprocess_data
: Processes and structures the retrieved datagenerate_analysis_prompt
: Constructs the prompt for Vertex AIgenerate_report_html
: Creates the HTML report
This notebook uses Google Vertex AI, integrated through the LangChain library, to generate natural language insights from the processed financial data.
The generate_analysis_prompt
function constructs a detailed prompt for Vertex AI. Key aspects of the prompt include:
-
Context Provision: The prompt includes processed financial data, account group details, and cost change analysis.
-
Specific Instructions: It provides clear guidelines on what to focus on in the analysis, such as revenue, cost of delivery, and significant transactions.
-
Output Structuring: The prompt specifies the desired HTML structure for the output, ensuring a consistent and well-organized report.
-
Analytical Guidance: It provides guidance on how to approach the analysis, including comparing current performance to previous periods and identifying significant changes.
-
Customized Formatting: Instructions for using specific styles and tags in the output are included.
def generate_analysis_prompt(month, financial_data, account_group_details, cost_change_analysis):
return PromptTemplate.from_template("""
Analyze the following Profit and Loss data for {month} and provide a management report:
{financial_data}
Account Group Details (including significant transactions):
{account_group_details}
Cost Change Analysis:
{cost_change_analysis}
... [Detailed instructions and formatting guidelines] ...
<h1>📊 Executive Summary</h1>
<ul>
<li>📈 [Key summary point about overall financial performance]</li>
<li>🔮 [Key summary point about future outlook or recommendations]</li>
</ul>
... [More structured output guidelines] ...
""")
The Vertex AI model is configured in the initialize_clients
function:
llm = VertexAI(
model_name='gemini-1.5-flash-001',
max_output_tokens=2048,
temperature=0,
top_p=0.9,
top_k=30,
project=PROJECT_ID,
location=LOCATION,
verbose=True,
)
This configuration aims to produce consistent, focused outputs while allowing for some natural language variation.
The notebook processes financial data through several steps:
- Retrieval from BigQuery
- Data cleaning and structuring
- Calculation of key financial metrics
- Preparation of data for Gen AI analysis
The final report is generated as an HTML document, combining:
- Gen AI-generated insights
- Structured financial data tables
- Visualizations (if applicable)
The notebook produces:
- An HTML report displayed in the notebook
- Data stored in BigQuery for future reference
To customize this notebook for your needs:
- Modify the SQL queries in
get_pl_data
to match your data structure - Adjust the
generate_analysis_prompt
function to focus on specific areas of interest - Update the HTML generation in
generate_report_html
to change the report layout
For more information or to report issues, please open an issue in this GitHub repository.