Exporting Data
I use MariaDB and have configured local network access (in Settings, Add-ons, Configuration) so that I can access state and statistics data from the Home Assistant database via MySQL Workbench. You need to allow database access on port 3306 so that MySQL Workbench can connect.
You may get a warning when connecting, but once you have configured the connection and connected, you can browse the tables in the Home Assistant database.
The tables of interest for long term statistics are:
- statistics_meta: metadata that gives the sensor name, what data is collected, and the metadata_id
- statistics: contains the statistics data, organised in hourly summaries of the sensor data
To link the tables, join statistics_meta.id to statistics.metadata_id.
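As an illustration of this join, here is a minimal Python sketch against a mock SQLite copy of the two tables. The table and column names match the Home Assistant schema described above, but the real tables have more columns and the sample values here are invented:

```python
import sqlite3

# Build a tiny mock of the two Home Assistant statistics tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE statistics_meta (id INTEGER PRIMARY KEY, statistic_id TEXT,
                              has_mean INTEGER, has_sum INTEGER);
CREATE TABLE statistics (metadata_id INTEGER, start_ts REAL, state REAL);
INSERT INTO statistics_meta VALUES (1, 'sensor.pv_energy_today', 0, 1);
INSERT INTO statistics VALUES (1, 1682121600, 2.5), (1, 1682125200, 3.1);
""")

# Link the tables: statistics_meta.id = statistics.metadata_id.
rows = con.execute("""
SELECT m.statistic_id, s.start_ts, s.state
FROM statistics s JOIN statistics_meta m ON m.id = s.metadata_id
""").fetchall()
for r in rows:
    print(r)
```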
HA creates a row in the statistics table for each sensor every hour, so 24 rows are added per sensor, per day. Unlike states, which are purged after 10 days by default, data is not purged from statistics, so this table holds a complete history from the day the sensor was set up.
Measurements are numerical values that have the state_class measurement. HA summarises each sensor value for each hour by calculating the mean, min and max values recorded during that hour. Measurement values have the metadata value has_mean=1. Tracking min, mean and max allows you to check the range of a sensor over a period.
Totals are numerical values that have the state_class total or total_increasing. HA records the value of state each hour. HA also calculates the sum as the difference between this value and the previous value. Total values have the metadata value has_sum=1. For most purposes, queries that return the state value are the most useful.
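As a toy illustration of how the sum accumulates hour-to-hour differences in state for a total_increasing sensor (the numbers here are invented, and this ignores meter resets, which HA also handles):

```python
# Hourly state values of a total_increasing sensor (e.g. an energy meter).
states = [10.0, 10.5, 11.2, 12.0]

# The sum column accumulates the hour-to-hour differences in state.
diffs = [b - a for a, b in zip(states, states[1:])]
running_sum = [round(sum(diffs[:i + 1]), 3) for i in range(len(diffs))]
print(running_sum)  # each entry is the total increase since the first hour
```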
This query gets solar generation and export energy by day for uploading to pvoutput.org:
```sql
SELECT DATE(from_unixtime(start_ts)) AS date
     , MAX(CASE WHEN statistic_id = 'sensor.pv_energy_today' THEN state ELSE NULL END) AS pv_energy_today
     , MAX(CASE WHEN statistic_id = 'sensor.feed_in_energy_today' THEN state ELSE NULL END) AS feed_in_energy_today
     , MAX(CASE WHEN statistic_id = 'sensor.load_energy_today' THEN state ELSE NULL END) AS load_energy_today
     , MAX(CASE WHEN statistic_id = 'sensor.grid_consumption_energy_today' THEN state ELSE NULL END) AS grid_consumption_energy_today
FROM statistics s JOIN statistics_meta m ON m.id = s.metadata_id
WHERE statistic_id IN ('sensor.pv_energy_today', 'sensor.feed_in_energy_today', 'sensor.load_energy_today', 'sensor.grid_consumption_energy_today')
  AND from_unixtime(start_ts) >= '2023-04-22'
GROUP BY DATE(from_unixtime(start_ts))
ORDER BY date;
```
This looks at the values of specific sensors (that read the inverter's daily utility meters) and gets the maximum value returned each day. The start date can be altered to limit how much data is returned. Depending on your integration, the sensor names may need to be altered (e.g. some integrations use sensors such as pv_energy_daily or feedin_daily).
The results can be exported from MySQL Workbench as a CSV file and loaded into Excel for further analysis, summary or charting. The CSV file can be bulk uploaded to pvoutput.org.
Here's an example of the CSV format data for uploading:
```
2023-04-22,19.6,10.6,14.2,9.13
2023-04-23,19.4,9.6,14.4,5.9
2023-04-24,10.3,2.3,14,7.3
2023-04-25,32.6,22.4,14.2,9.6
2023-04-26,11.6,3.6,13.8,5.6
2023-04-27,8.5,1.1,12.9,8.2
2023-04-28,15.8,5.8,13.6,7
2023-04-29,23.5,13.9,13,5.4
2023-04-30,13.6,1.4,16.1,8.9
```
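If you prefer to script the file creation rather than use Workbench's CSV exporter, this minimal Python sketch writes rows in the same column layout as the sample above (date, generation, export, load, import); check pvoutput.org's bulk loader documentation for the exact format it expects:

```python
import csv
import io

# Rows as returned by the daily query above (sample values only).
rows = [
    ("2023-04-22", 19.6, 10.6, 14.2, 9.13),
    ("2023-04-23", 19.4, 9.6, 14.4, 5.9),
]

# Write one comma-separated line per day; use open("export.csv", "w",
# newline="") instead of StringIO to produce a real file.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```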
And the CSV upload settings to use:
