A survey of methods for time series change point detection @sneha3799 #169

Open
Tracked by #103
RobotPsychologist opened this issue Nov 20, 2024 · 2 comments
Labels
research Background research and research publication team.

Comments

RobotPsychologist (Owner)

No description provided.

RobotPsychologist (Owner) commented Nov 21, 2024

@sneharao37 could you please write your review on this ticket? Thanks!

RobotPsychologist changed the title from "- A survey of methods for time series change point detection @sneha3799" to "A survey of methods for time series change point detection @sneha3799" on Nov 22, 2024
sneha3799 (Contributor) commented Nov 25, 2024

Prioritized List for Paper 2: Survey of Change Point Detection Methods

Title: A survey of methods for time series change point detection

Author: Aminikhanghahi, Samaneh and Cook, Diane J.

Abstract

Change points are abrupt variations in time series data. Such abrupt changes may represent transitions that occur between states. Detection of change points is useful in modelling and prediction of time series and is found in application areas such as medical condition monitoring, climate change detection, speech and image analysis, and human activity analysis. This survey article enumerates, categorizes, and compares many of the methods that have been proposed to detect change points in time series. The methods examined include both supervised and unsupervised algorithms that have been introduced and evaluated. We introduce several criteria to compare the algorithms. Finally, we present some grand challenges for the community to consider.


Preliminaries

  • Change Point: A point in a time series marking a transition between distinct process states.
  • Change Point Detection (CPD): A hypothesis testing problem to identify whether a change occurs in a sequence $\{x_m, x_{m+1}, \ldots, x_n\}$. The null hypothesis ($H_0$) assumes no change, while the alternative hypothesis ($H_A$) indicates a change at a specific point $K^*$.
  • Key Metrics:
    • Accuracy: Proportion of correctly classified points.
    • Sensitivity (Recall/TP Rate): Measures the fraction of correctly identified positive cases.
    • G-Mean: Geometric mean of sensitivity and specificity; robust to class imbalance.
    • Precision: Ratio of true positives to all points classified as positive.
    • F-Measure: Weighted harmonic mean of precision and recall.
    • ROC Curve: Analyzes trade-offs between TP and FP rates. Performance is summarized by the AUC (Area Under Curve).
    • PR Curve: Precision vs. recall analysis, useful for imbalanced datasets.
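
The point-level metrics above can be computed directly from confusion counts. A minimal sketch (the counts in the example are invented for illustration):

```python
import math

def cpd_metrics(tp, fp, tn, fn):
    """Evaluation metrics for change point detection from confusion counts."""
    sensitivity = tp / (tp + fn)            # recall / true-positive rate
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "precision": precision,
        "g_mean": math.sqrt(sensitivity * specificity),
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Invented example: 8 change points found, 2 missed, 4 false alarms,
# 86 points correctly left unflagged.
m = cpd_metrics(tp=8, fp=4, tn=86, fn=2)
```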

Summary and Prioritization

Probabilistic Methods for Change Point Detection

Probabilistic methods estimate the likelihood of changes in a time series by analyzing probabilities and patterns. Below are two key approaches:

1. Bayesian Change Point Detection (BCPD)

Purpose: Real-time detection of change points using probabilities based on prior data.

Key Concepts:

  • Run Length: Measures time since the last change.
  • Hazard Function: Predicts the likelihood of a change based on the current state's duration.

How It Works:

  1. Run Length Update: Computes probabilities for:
    • A new change point (run length resets to 0).
    • Continuation of the current state (run length increases by 1).
  2. Probability Calculation: Uses Bayes' theorem to update the run length distribution.
  3. Change Point Detection: Flags a change point if the reset probability is highest.

Strengths:

  • Effective for real-time detection.
  • Adapts as new data arrives.

Weakness: Computationally complex unless approximated.
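
The run-length recursion above can be sketched for the simplest conjugate case. This is a minimal illustration, assuming Gaussian observations with known variance, a constant hazard rate, and invented default parameters; the `bocpd` helper and its defaults are not from the paper:

```python
import numpy as np

def bocpd(data, hazard=0.01, mu0=0.0, var0=1.0, obs_var=1.0):
    """Minimal Bayesian online CPD for Gaussian data with known variance.

    Tracks P(run length r | x_1..x_t) and returns the MAP run length
    after each observation."""
    T = len(data)
    R = np.zeros((T + 1, T + 1))          # R[t, r] = P(run length r at time t)
    R[0, 0] = 1.0
    mu, var = np.array([mu0]), np.array([var0])   # posterior per run length
    map_run = np.zeros(T, dtype=int)
    for t, x in enumerate(data):
        # Predictive probability of x under each run-length hypothesis.
        pv = var + obs_var
        pred = np.exp(-0.5 * (x - mu) ** 2 / pv) / np.sqrt(2 * np.pi * pv)
        R[t + 1, 1 : t + 2] = R[t, : t + 1] * pred * (1 - hazard)  # run grows
        R[t + 1, 0] = (R[t, : t + 1] * pred).sum() * hazard        # run resets
        R[t + 1] /= R[t + 1].sum()
        map_run[t] = int(np.argmax(R[t + 1]))
        # Conjugate Gaussian update, then prepend the prior for run length 0.
        new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        new_mu = new_var * (mu / var + x / obs_var)
        mu = np.concatenate(([mu0], new_mu))
        var = np.concatenate(([var0], new_var))
    return map_run

rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)])
run = bocpd(series)   # MAP run length collapses just after the shift at t = 50
```

Note that with a constant hazard the reset mass is the same fixed fraction at every step, so the change shows up as a collapse of the most probable run length rather than as a spike in the reset probability itself.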


2. Gaussian Process (GP) Method

Purpose: Predict future data points based on learned patterns and flag deviations as changes.

Key Concepts:

  • Gaussian Process: Models data as a combination of a smooth trend function $f(t)$ and random noise $\epsilon_t$.
  • Covariance Function: Defines relationships between data points to predict future values. Example: $K(t_1, t_2) = \sigma^2 \exp \left(-\frac{(t_1 - t_2)^2}{2l^2}\right)$, where $\sigma^2$ controls variance and $l$ controls smoothness.

How It Works:

  1. Model the Data: Represents time series as $x_t = f(t) + \epsilon_t$.
  2. Predict and Compare: Predicts the next data point and compares it to the actual value.
  3. Flag Changes: Flags a change if the deviation exceeds a threshold.

Strengths:

  • Highly accurate and considers the entire data history.

Weakness: Computationally intensive and complex to implement.


Kernel-Based Methods for Change Point Detection

Kernel-based methods analyze similarities between data chunks (windows) in a time series by transforming the data into a structured space for better comparison.

1. Key Idea: Transforming Data for Comparison

Why Transform? Real-world data can be messy or noisy. Kernel-based methods map data into a Reproducing Kernel Hilbert Space (RKHS), making patterns and differences easier to detect.

How It Works:

  • Kernel Function ($k(x, y)$): Measures similarity between two data points $x$ and $y$. Points closer in the data have higher similarity scores.
  • Transformation: Data points in a window are mapped into the RKHS using the kernel function, creating an organized, pattern-friendly representation.

2. Comparing Two Windows

Windows: Two sliding windows of data are compared: one from the recent past and one from the present.

Steps:

  1. Calculate the mean $\hat{\mu}$ and covariance $\hat{\Sigma}$ of the transformed data in each window.
  2. Compute the Kernel Fisher Discriminant Ratio (KFDR):
    • KFDR measures the difference between the two windows.
    • Larger KFDR values suggest greater differences.

3. Detecting Change Points

Thresholding: Compare the KFDR score to a predefined threshold:

  • KFDR > Threshold: A change point is flagged.
  • KFDR ≤ Threshold: No change is detected.

Optional Strategy: Use a running maximum partition strategy to flag the maximum KFDR score across overlapping windows.
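
The window-comparison idea can be sketched with the kernel mean embedding alone. The statistic below is the biased squared MMD between the two windows — a simpler stand-in for the KFDR, which additionally normalizes by the within-window covariance; bandwidth, window size, and threshold are invented:

```python
import numpy as np

def gaussian_kernel(a, b, bw=1.0):
    """k(x, y) = exp(-(x - y)^2 / (2 bw^2)) for 1-D samples."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / bw ** 2)

def window_discrepancy(past, present, bw=1.0):
    """Biased squared MMD between the two windows' mean embeddings in the RKHS."""
    return (gaussian_kernel(past, past, bw).mean()
            + gaussian_kernel(present, present, bw).mean()
            - 2.0 * gaussian_kernel(past, present, bw).mean())

def scan(x, w=20, thresh=0.5):
    """Slide a past/present window pair along the series; threshold the score."""
    scores = np.array([window_discrepancy(x[t - w : t], x[t : t + w])
                       for t in range(w, len(x) - w + 1)])
    flagged = [int(i) + w for i in np.flatnonzero(scores > thresh)]
    return scores, flagged

rng = np.random.default_rng(2)
series = np.concatenate([rng.normal(0, 1, 60), rng.normal(4, 1, 60)])
scores, flagged = scan(series)   # discrepancy peaks near the boundary t = 60
```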


4. Challenges

  • Kernel Choice: Performance depends on the kernel function (e.g., linear, Gaussian, polynomial) and its parameters.
  • High Dimensions: Complex or high-dimensional data makes selecting the right kernel and parameters more difficult.

Graph-Based Methods for Change Point Detection

Graph-based methods use graph theory to represent and analyze time series data, detecting change points based on the structure and relationships in the graph.

1. Key Idea: Representing Data as a Graph

What is a Graph?

  • Nodes: Represent individual observations in the time series.
  • Edges: Connect nodes based on a similarity or distance measure.

Graph Types:

  • Minimum Spanning Tree: Connects all nodes with the shortest total edge length.
  • Nearest Neighbor Graph: Connects each node to its closest neighbors.
  • Visibility Graph: Connects nodes based on their visibility in a time series plot.

2. How It Works

  1. Create a Graph: Construct a graph from the time series data.
  2. Split the Data: For each potential change point $\tau$, divide the series into:
    • Window 1: Observations before $\tau$.
    • Window 2: Observations after $\tau$.
  3. Count Cross-Edges:
    • Calculate $RG(\tau)$, the number of edges connecting nodes in Window 1 to nodes in Window 2.
    • Fewer cross-edges indicate greater differences between the two windows.

3. Standardizing the Edge Count

To account for variability in graph structure and sample size, standardize the edge count: $ZG(t) = -\frac{RG(t) - E[RG(t)]}{\sqrt{\text{VAR}[RG(t)]}}$ (the sign is flipped so that fewer cross-edges than expected yield a large $ZG(t)$).

  • $E[RG(t)]$: Expected value of the edge count.
  • $\text{VAR}[RG(t)]$: Variance of the edge count.

The maximum $ZG(t)$ across all potential change points indicates the most likely change point.


4. Detecting Change Points

  • A change point is flagged if the maximum $ZG(t)$ exceeds a predefined threshold.
  • The threshold determines how "different" the two windows must be to signal a change.
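
Steps 2–4 can be sketched on a k-nearest-neighbour graph. The survey's $E[RG(t)]$ and $\text{VAR}[RG(t)]$ have analytic forms; here they are estimated by permuting the window labels, which is simpler but slower, and the graph type, $k$, and permutation count are illustrative choices:

```python
import numpy as np

def knn_edges(x, k=3):
    """Undirected edge list of a k-nearest-neighbour graph on the observations."""
    d = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(d, np.inf)
    edges = set()
    for i in range(len(x)):
        for j in np.argsort(d[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    return sorted(edges)

def graph_cpd(x, k=3, n_perm=100, seed=0):
    """Standardized cross-edge statistic for every candidate split tau."""
    n, edges = len(x), knn_edges(x, k)
    rng = np.random.default_rng(seed)
    rg = lambda lab: sum(lab[i] != lab[j] for i, j in edges)  # cross-edge count
    z = np.full(n, -np.inf)
    for tau in range(5, n - 5):
        lab = np.arange(n) < tau                 # window 1 vs window 2
        perm = [rg(rng.permutation(lab)) for _ in range(n_perm)]
        # Fewer cross-edges than expected by chance => large z (likely change).
        z[tau] = (np.mean(perm) - rg(lab)) / (np.std(perm) + 1e-12)
    return z

rng = np.random.default_rng(3)
series = np.concatenate([rng.normal(0, 1, 30), rng.normal(5, 1, 30)])
z = graph_cpd(series)   # z peaks at the true split tau = 30
```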

5. Strengths and Weaknesses

Strengths:

  • Effective for High-Dimensional Data: Handles complex datasets where traditional methods struggle.
  • Nonparametric: Makes minimal assumptions about data distribution.

Weaknesses:

  • Graph Structure Dependency: Performance depends heavily on the chosen graph type.
  • Indirect Data Use: Focuses more on the graph representation than the raw time series.

Clustering Techniques for Change Point Detection

Clustering techniques analyze patterns in time series data by grouping similar observations and identifying deviations. Below are some key methods:

1. Sliding Window and Bottom-Up (SWAB)

What It Does: Breaks data into smaller windows and processes them incrementally.

How It Works:

  • Bottom-Up Approach: Groups individual data points into larger sequences based on similarity.
  • Finalizes the oldest sequence as new data enters the window for analysis.
  • Efficiently handles large datasets by processing data in chunks.

2. Minimum Description Length (MDL)

What It Does: Compresses time series data into clusters by minimizing the "cost" of representing the data in bits.

How It Works:

  • Measures the cost of representing the data.
  • Adjusts clusters (create, add, or merge) to find the most efficient grouping.
  • Focuses on uncovering meaningful patterns without needing a predefined number of clusters.

3. Shapelet-Based Clustering

What It Does: Identifies small recurring shapes (u-shapelets) in the data to cluster similar patterns.

How It Works:

  • Detects shapelets that represent key patterns in the data.
  • Uses these shapelets to separate and group data.
  • Applies a clustering method (e.g., k-means) to organize data based on the identified patterns.

4. Model Fitting

What It Does: Detects changes by fitting data points into predefined clusters with rules.

How It Works:

  • Assumes data fits into clusters defined by a center and a radius.
  • Flags changes when new points don’t fit well with existing clusters (e.g., they are too far from the cluster center).
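
A minimal sketch of the centre-and-radius rule, with an invented radius and a slow exponential update of the matched centre:

```python
import numpy as np

def radius_cluster_cpd(x, radius=2.0, lr=0.1):
    """Flag a change whenever a point lies outside every cluster's radius,
    then open a new cluster centred on that point."""
    centers, flags = [float(x[0])], []
    for t in range(1, len(x)):
        dists = [abs(x[t] - c) for c in centers]
        i = int(np.argmin(dists))
        if dists[i] <= radius:
            centers[i] += lr * (x[t] - centers[i])   # drift the centre slowly
        else:
            flags.append(t)                          # poor fit: change point
            centers.append(float(x[t]))
    return flags

rng = np.random.default_rng(4)
series = np.concatenate([rng.normal(0, 0.5, 40), rng.normal(6, 0.5, 40)])
flags = radius_cluster_cpd(series)   # first flag at the regime shift, t = 40
```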


Other Notes

  • Algorithms like RuLSIF and MDL, though promising, require further investigation into scalability and adaptability for real-time processing.
  • Supervised methods demand diverse training datasets, often limiting their applicability.

Packages Containing Relevant Algorithms

  1. ruptures: GitHub Repo
  2. hmmlearn: GitHub Repo
  3. darts detection module: GitHub Repo
