
Fault Log

Nathan Hui edited this page Sep 27, 2023 · 1 revision

The fault log is a nonvolatile log of event codes stored in NVRAM. This allows for logging and debugging events during times when there is no console available to the user.

The API to the fault log (FLog) is in cli/flog.hpp.

All fault codes are defined as enumerators of FLOG_CODE_e. They are divided into several groups, identified by the most significant byte (MSB) of the code:

MSB Type
0x00 General
0x01 System
0x02 Temperature Calibration
0x03 INS
0x04 Ride/Deployment
0x05 Upload
0x06 GPS
0x07 Temperature Sensor
0x08 Filesystem
0xFF Debugging
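
As a rough illustration of the grouping above, the sketch below pulls the group byte out of a fault code and maps it to a group name. It assumes the codes are 16 bits wide with the group in the upper byte; flogGroupOf and flogGroupName are hypothetical helpers written for this page, not functions from flog.hpp.

```cpp
#include <cstdint>

// Hypothetical helper (not part of flog.hpp): returns the group byte,
// assumed to be the upper 8 bits of a 16-bit fault code.
inline std::uint8_t flogGroupOf(std::uint16_t code)
{
    return static_cast<std::uint8_t>(code >> 8);
}

// Hypothetical helper: maps a group byte to the group name from the table.
inline const char* flogGroupName(std::uint8_t group)
{
    switch (group)
    {
    case 0x00: return "General";
    case 0x01: return "System";
    case 0x02: return "Temperature Calibration";
    case 0x03: return "INS";
    case 0x04: return "Ride/Deployment";
    case 0x05: return "Upload";
    case 0x06: return "GPS";
    case 0x07: return "Temperature Sensor";
    case 0x08: return "Filesystem";
    case 0xFF: return "Debugging";
    default:   return "Unknown";
    }
}
```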

Plaintext names for all of the fault codes live in flog.cpp:FLOG_Message. These are displayed on the console when the fault log is dumped.

Usage

In general, use the FLOG_AddError(FLOG_CODE_e, uint32_t) function to record an event to the fault log. Once the fault log has been initialized, this should be a very fast call, so it can be placed in performance-critical functions.

Pass the appropriate fault code as the first argument, and any 32-bit unsigned value as the second. This value is user defined and is retained in the fault log alongside the code. It can be used to help identify unique events, capture additional state, etc.
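
A minimal sketch of a call site follows. So that it compiles on its own, it declares a stand-in FLOG_CODE_e and a stub FLOG_AddError; in firmware you would include cli/flog.hpp instead. The enumerator FLOG_EXAMPLE_UPLOAD_FAILED, its value, the void return type, and the tryUpload function are all assumptions made for illustration.

```cpp
#include <cstdint>

// Stand-in for the real enum in cli/flog.hpp. The enumerator name and
// value are hypothetical (0x05 is the Upload group).
enum FLOG_CODE_e : std::uint16_t
{
    FLOG_EXAMPLE_UPLOAD_FAILED = 0x0501,
};

// Records the most recent call so the sketch can be checked off-target.
struct RecordedCall
{
    FLOG_CODE_e code;
    std::uint32_t param;
};
static RecordedCall lastCall{};
static int callCount = 0;

// Stub with the documented argument list; the void return type is assumed.
void FLOG_AddError(FLOG_CODE_e code, std::uint32_t param)
{
    lastCall = RecordedCall{code, param};
    ++callCount;
}

// Hypothetical call site: logs the attempt number when an upload fails,
// so the dump shows which attempt hit the fault.
bool tryUpload(std::uint32_t attempt, bool linkUp)
{
    if (!linkUp)
    {
        FLOG_AddError(FLOG_EXAMPLE_UPLOAD_FAILED, attempt);
        return false;
    }
    return true;
}
```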

You can dump or clear the fault log from the console using the command line. The dump is a human-readable listing of the fault log, in order, going as far back as the log retains. The timestamps are in milliseconds since boot.

Technical details

Since this structure lives in NVRAM, we need a couple of fields that determine whether or not the data structure is properly initialized. This is handled by the FLOG_Data_t.numEntries and FLOG_Data_t.nNumEntries fields. The numEntries field contains the number of entries as an unsigned 32-bit value. We always keep nNumEntries as the bitwise NOT of the numEntries field. This is sufficient to ensure that a blank NVRAM does not appear to hold a valid FLOG table: a region that is uniformly 0x00 or uniformly 0xFF yields two equal fields, which can never be bitwise complements of each other. It is also a very rapid way to check whether the table is valid.
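
The check described above can be sketched as follows. FlogHeaderSketch models only the two fields named in the text (the real FLOG_Data_t also holds the entry table), and flogIsValid and flogReset are hypothetical helper names, not the actual functions in flog.cpp.

```cpp
#include <cstdint>

// Only the two validity fields described above; the real FLOG_Data_t
// contains more than this.
struct FlogHeaderSketch
{
    std::uint32_t numEntries;
    std::uint32_t nNumEntries;
};

// Hypothetical helper: the table is treated as valid only when
// nNumEntries is the bitwise NOT of numEntries.
inline bool flogIsValid(const FlogHeaderSketch& header)
{
    return header.numEntries == static_cast<std::uint32_t>(~header.nNumEntries);
}

// Hypothetical helper: puts the header into a valid, empty state.
inline void flogReset(FlogHeaderSketch& header)
{
    header.numEntries = 0;
    header.nNumEntries = static_cast<std::uint32_t>(~header.numEntries);
}
```

A header read from zeroed or erased (all 0xFF) memory fails flogIsValid, so the caller knows to reset the table before using it.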

Since NVRAM is limited, we must ensure that the table can wrap around. One requirement is that FLOG_NUM_ENTRIES is a power of 2. This allows us to take advantage of bitwise masking and rollover: if we mask numEntries with FLOG_NUM_ENTRIES - 1, the result is always the index of the next available entry. Because numEntries itself keeps counting past FLOG_NUM_ENTRIES, this also lets us quickly tell whether the FLOG has likely overflowed.
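
A small sketch of the masking trick, under stated assumptions: the capacity of 8 is chosen only for illustration (the real FLOG_NUM_ENTRIES is defined in the source), and flogNextIndex and flogHasOverwritten are hypothetical helpers. The overwrite test here is one plausible reading of "likely overflowed"; it stops being reliable if the 32-bit counter itself ever rolls over, although the masked index stays correct across that rollover because the capacity divides 2^32.

```cpp
#include <cstdint>

// Illustrative capacity only; the real FLOG_NUM_ENTRIES lives in the source.
constexpr std::uint32_t kFlogNumEntriesSketch = 8;

// The masking trick only works when the capacity is a power of two.
static_assert(kFlogNumEntriesSketch != 0 &&
                  (kFlogNumEntriesSketch & (kFlogNumEntriesSketch - 1)) == 0,
              "capacity must be a power of two");

// Hypothetical helper: index of the slot the next entry will be written to.
constexpr std::uint32_t flogNextIndex(std::uint32_t numEntries)
{
    return numEntries & (kFlogNumEntriesSketch - 1);
}

// Hypothetical helper: true once more entries have been logged than the
// table can hold, meaning at least one old entry was overwritten.
constexpr bool flogHasOverwritten(std::uint32_t numEntries)
{
    return numEntries > kFlogNumEntriesSketch;
}
```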

Each entry in the fault log table starts with a 32-bit timestamp (milliseconds since boot), then the 16-bit error code, and then the 32-bit parameter value. This means that entries are 80 bits (10 bytes) and so are not 32-bit word aligned. However, the ARM architecture and toolchain in the TSoM do not seem to care. This makes sense: the "EEPROM" is actually a file on the flash filesystem, so it is probably memory mapped, and the compiler probably emits a cross-boundary copy. If this doesn't work, it is always acceptable to pad the entries to an aligned size, which will cost space, or to switch to a 16-bit parameter value, which makes each entry exactly 64 bits.
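
To make the size trade-off concrete, the sketch below declares the three layouts discussed above and an alignment-safe way to read a packed entry. The struct and field names are hypothetical, the real entry type in flog.hpp may be declared differently, and #pragma pack is used here on the assumption of a GCC, Clang, or MSVC toolchain.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Layout as described in the text, with no padding: 4 + 2 + 4 = 10 bytes.
#pragma pack(push, 1)
struct FlogEntryPacked
{
    std::uint32_t timestampMs; // milliseconds since boot
    std::uint16_t code;        // fault code
    std::uint32_t param;       // user-defined value
};
#pragma pack(pop)

// Same fields with natural alignment; the compiler inserts padding
// (typically 2 bytes after code), which costs NVRAM space.
struct FlogEntryPadded
{
    std::uint32_t timestampMs;
    std::uint16_t code;
    std::uint32_t param;
};

// Alternative with a 16-bit parameter: exactly 64 bits, no padding needed.
struct FlogEntryCompact
{
    std::uint32_t timestampMs;
    std::uint16_t code;
    std::uint16_t param;
};

static_assert(sizeof(FlogEntryPacked) == 10, "packed entry is 80 bits");
static_assert(sizeof(FlogEntryCompact) == 8, "compact entry is 64 bits");

// Hypothetical helper: copies entry `index` out of a raw table using memcpy,
// which is safe regardless of the alignment of the source address.
inline FlogEntryPacked flogReadEntry(const unsigned char* table, std::size_t index)
{
    FlogEntryPacked entry;
    std::memcpy(&entry, table + index * sizeof(FlogEntryPacked), sizeof(entry));
    return entry;
}
```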
