Adjust timestamps to reflect actual detection time instead of audio file start time #235

Merged (1 commit) on Nov 20, 2024


tvoirand commented Nov 14, 2024

Summary

Fixes #227

Problem: Currently, the detection timestamp stored in the database corresponds to the start time of the original audio file. As a result, several detections generated from the same file all share the same timestamp, even though these detections actually occurred at different moments within that file.

Proposed change: Adjust the timestamp stored in the database to reflect the actual detection time, by adding the delay in seconds between the start of the audio file and the start of the detection.

Detailed description of the changes

  • Datetime-related attributes are added to the Detection class in helpers.py:
    • Attributes (types in parentheses):
      • datetime (datetime.datetime): The actual detection start time, calculated as the audio file's start datetime plus the delay (in seconds) between the audio file start and the detection start.
      • date (str): Date in format YYYY-mm-dd.
      • time (str): Time in format HH:MM:SS.
      • iso8601 (str): Detection datetime in ISO 8601 format.
      • week (int): Week number of the detection.
    • These attributes store information similar to the datetime-related properties of the ParseFileName class in helpers.py. However, plain attributes are used here instead of properties, since these values are computed once at construction and do not need recalculating on each access.
    • New argument: file_date is added to the Detection constructor to calculate these attributes based on the known audio file start datetime.
  • Removed properties: The date and time properties of the ParseFileName class in helpers.py are removed, as they are no longer used in the codebase.
  • Database and outputs update: In reporting.py, the detection timestamp is now based on the actual detection datetime instead of the audio file datetime. The updated timestamp is used in:
    • The SQL database,
    • The "Extracted" audio files,
    • The BirdDB.txt file,
    • Apprise notifications,
    • BirdWeather updates.
  • Detection instantiation update in server.py: The audio file start time is passed as an argument when instantiating a Detection, in line with the updated Detection constructor.
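
As a rough illustration of the attributes listed above, here is a minimal sketch of how they could be derived from the audio file start time and the detection offset. It is not the code merged in this PR: the class name `DetectionSketch`, the `start_offset_s` parameter, the naive (timezone-less) ISO 8601 string, and the use of the ISO week number for `week` are all assumptions made for the example.

```python
import datetime


class DetectionSketch:
    """Illustrative stand-in for the Detection class (not the merged code)."""

    def __init__(self, file_date: datetime.datetime, start_offset_s: float):
        # Actual detection start = audio file start + offset within the file.
        self.datetime = file_date + datetime.timedelta(seconds=float(start_offset_s))
        # Plain attributes, computed once at construction time.
        self.date = self.datetime.strftime("%Y-%m-%d")
        self.time = self.datetime.strftime("%H:%M:%S")
        self.iso8601 = self.datetime.isoformat()
        # Assumption: "week" means the ISO week number.
        self.week = self.datetime.isocalendar()[1]


# Made-up file start time and a detection starting 6 seconds into the file.
file_start = datetime.datetime(2024, 11, 14, 8, 30, 0)
detection = DetectionSketch(file_start, start_offset_s=6.0)
print(detection.date, detection.time, detection.iso8601, detection.week)
```

Because the values are stored as attributes, reading `detection.time` repeatedly does not re-run any formatting code, which matches the reasoning given for preferring attributes over properties.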


alexbelgium commented Nov 14, 2024

Very nicely written PR! The explanation of the changes and of how everything fits together is quite useful for understanding the underlying logic.


Emmo213 commented Nov 14, 2024

More accurate data is always preferred, but in the end, since the files are analyzed in 3-second chunks, we're only talking about a difference of 1 or 2 seconds, right?

@alexbelgium

The filename effectively serves as a unique ID, so it is currently a problem that several detections can have the same filename. For example, if one is deleted, the others lose their Extracted mp3 and generate an error upon deletion. The longer the extraction duration, the higher the probability of duplicate filenames. Here is an example in which the new code prevented several files from sharing the same start time, and therefore the same filename:
[Image: IMG_6444]

@tvoirand
Author

Thanks for your reactions!

Currently (before this PR), the timestamp reflects the start time of the "original" full-length audio file before it's processed by extract_detection in birdnet_analysis.py. The original file duration is 15 seconds by default. So if detections occur in each 3-second chunk within that file, up to 5 detections can end up with the same timestamp.
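
To make the arithmetic concrete, here is a small sketch using the 15-second file and 3-second chunks mentioned above. It assumes non-overlapping chunks with one detection in each, and the file start time is made up for the example.

```python
import datetime

FILE_DURATION_S = 15  # default recording length mentioned above
CHUNK_DURATION_S = 3  # analysis chunk length mentioned above

file_start = datetime.datetime(2024, 11, 14, 8, 30, 0)  # made-up start time

# One possible detection per chunk: offsets 0, 3, 6, 9 and 12 seconds.
offsets = list(range(0, FILE_DURATION_S, CHUNK_DURATION_S))

# Before: every detection is stamped with the file start time.
old_timestamps = [file_start for _ in offsets]
# After: each detection is stamped with file start + its own offset.
new_timestamps = [file_start + datetime.timedelta(seconds=s) for s in offsets]

print(len(set(old_timestamps)))  # 1 distinct timestamp for 5 detections
print(len(set(new_timestamps)))  # 5 distinct timestamps
```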

These few seconds may not seem like a big difference. But currently some detections can't be distinguished by timestamp alone when reading from the database, although the exact detection time is available at the time of the database entry. The change I propose retains this detailed timing, improving data accuracy. The initial motivation for this update was to support another contribution related to BirdWeather, which I mention in this issue (but the current PR is fully independent).

Regarding the extracted filename, it also includes the detection confidence score, which reduces the risk of duplicates. But the risk still exists, and more accurate data can only help improve the system overall ;).
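
The remaining collision risk can be sketched as follows. The filename pattern, the `extracted_name` helper, and the species and confidence values are hypothetical and only serve the illustration; the real naming scheme may differ.

```python
import datetime


def extracted_name(species: str, confidence: float, stamp: datetime.datetime) -> str:
    # Hypothetical naming pattern, for illustration only.
    return f"{species}-{round(confidence * 100)}-{stamp:%Y-%m-%d-%H-%M-%S}.mp3"


file_start = datetime.datetime(2024, 11, 14, 8, 30, 0)
# Two detections of the same species with the same confidence,
# 3 and 9 seconds into the same audio file (made-up values).
detections = [("Parus_major", 0.87, 3.0), ("Parus_major", 0.87, 9.0)]

# Names built from the file start time only.
old_names = {extracted_name(sp, conf, file_start) for sp, conf, _ in detections}
# Names built from the actual detection time.
new_names = {
    extracted_name(sp, conf, file_start + datetime.timedelta(seconds=offset))
    for sp, conf, offset in detections
}

print(len(old_names))  # 1: both detections map to the same filename
print(len(new_names))  # 2: the detection time disambiguates them
```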


Emmo213 commented Nov 15, 2024

Thank you for the further explanation.

@Nachtzuster
Owner

@tvoirand thanks for the nice PR!
The code looks fine at first glance 👍

Nachtzuster merged commit 3bc2c63 into Nachtzuster:main on Nov 20, 2024
1 check passed
tvoirand deleted the detection-time branch on November 21, 2024 07:22
Development

Successfully merging this pull request may close these issues.

Store actual detection time in DB instead of audio file start time