MowlanicaBilla/Time-Series-Classification

Time-Series-Classification

Time series classification has been a major research area over the past few years, mainly because of its large number of practical applications. It is used in many sectors, such as business, healthcare, hospitality and transportation. Practical examples include stock market anomaly detection in business, identifying heartbeat patterns of patients in hospitals, and detecting temperature levels in climate science. Accurate time series classification can increase business revenue and support better resource allocation, which is why many industries take a great interest in this area. A few related terms need to be defined first: time series data sets, time series analysis and, finally, time series classification.

A time series data set represents measurements of a quantity over a period of time. The behavior of the series depends heavily on the order of the points, and changing that order changes the meaning of the whole data set. Time series analysis is the development of statistical models that provide reasonable explanations of the sample data. These models can be developed using various machine learning techniques.

Time series classification deals with classifying time series based on their behavior over time. Some series behave abnormally compared with the others in a collection, and identifying such unusual or anomalous time series is becoming increasingly common in organizations. Organizations need to identify abnormal behavior in order to make sound business decisions and market predictions. For example, large companies such as Yahoo monitor their mail servers over time in order to detect anomalous and malicious time series. In such cases, feature extraction can be used as a methodology for time series classification.

Feature extraction means extracting information from a time series in order to represent it as a feature vector. These features can be derived using scientific time series analysis. Correlation structure, distribution, entropy, stationarity and scaling properties are examples of time series features, and they help fit time series to a range of time series models. Feature extraction is closely tied to statistics, as most of the features that describe a time series are statistical.
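As a rough illustration of turning a series into a feature vector, the sketch below computes a handful of simple statistical features using only the Python standard library. The function names, the choice of features and the histogram bin count are assumptions made for this example; they are not taken from this repository or from any particular feature library.

```python
import math
from statistics import mean, pvariance


def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one step."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # a constant series has no correlation structure
    num = sum((series[i] - mu) * (series[i + 1] - mu)
              for i in range(len(series) - 1))
    return num / denom


def crossing_points(series):
    """Number of times the series crosses its own mean."""
    mu = mean(series)
    above = [x > mu for x in series]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


def histogram_entropy(series, bins=5):
    """Shannon entropy (in bits) of a fixed-width histogram of the values."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in series:
        idx = min(int((x - lo) / width), bins - 1)  # clamp the maximum value
        counts[idx] += 1
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def extract_features(series):
    """Feature vector: [mean, variance, lag-1 autocorrelation, crossings, entropy]."""
    return [
        mean(series),
        pvariance(series),
        lag1_autocorrelation(series),
        crossing_points(series),
        histogram_entropy(series),
    ]
```

Calling `extract_features` on each series in a collection yields fixed-length vectors that an ordinary classifier can consume, regardless of how long each original series is.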

Huge amounts of time series data are collected every day from many heterogeneous sources across different application domains. Vast amounts of data are generated every fraction of a second, especially on social media platforms such as Facebook and Twitter. The highly dynamic and fluctuating nature of these domains, along with the need to collect and store such enormous amounts of data, poses new challenges for time series classification. Because of the size, velocity and complexity inherent in big data, traditional classification methods such as instance-based classification may fail to identify anomalous time series accurately. Noise and seasonality in the data increase this risk. Feature-based approaches are more interpretable and more resilient to missing and noisy data. Therefore, preprocessing these data efficiently and identifying hidden patterns with minimal resources is a current research interest.

A number of researchers have studied time series classification using different approaches. Rob Hyndman et al. propose an approach to time series classification that applies Principal Component Analysis (PCA) to features. Their research focuses mainly on detecting unusual or anomalous time series. They apply bivariate outlier detection methods to the first two principal components of the feature representation and use the result to identify the most unusual time series in a given set. This methodology was compared with K-Means clustering as a baseline and outperformed it as a result of using a well-researched feature space for classification.
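The idea of projecting feature vectors onto two principal components and looking for outliers there can be sketched as follows. This is a simplified stand-in, not the method from the cited work: the bivariate outlier detection step is replaced here by plain Euclidean distance from the median point in the two-component space, and the function names are hypothetical. It assumes NumPy is installed and that each row of `features` is the feature vector of one time series.

```python
import numpy as np


def first_two_components(features):
    """Project standardized feature vectors onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant features
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T


def most_unusual(features, k=1):
    """Indices of the k series farthest from the median in PC space.

    Assumption: distance from the median is used as a simple substitute
    for a proper bivariate outlier detection method.
    """
    scores = first_two_components(features)
    center = np.median(scores, axis=0)
    dist = np.linalg.norm(scores - center, axis=1)
    return np.argsort(dist)[::-1][:k].tolist()
```

The input needs at least two series and at least two features per series, otherwise there are fewer than two components to project onto.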

Ben Fulcher et al. introduce a time series classification technique based on a selected set of time series features. They developed a mechanism for automating the extraction of features from a time series. After generating a large number of features, they select the most suitable ones for representing a particular time series using a greedy approach. Each time series is represented as a feature vector, and a set of feature vectors is used with a classification model such as a decision tree. This methodology performs better than traditional classification methodologies such as instance-based classification. They also introduce a set of self-descriptive time series features, such as lumpiness, spikiness, level shift and crossing points, and use them for time series classification.
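Greedy feature selection of the kind described above can be sketched as a forward search: start with no features, repeatedly add the single feature that improves a classifier's score the most, and stop when nothing helps. The sketch below is an illustration under stated assumptions, not the authors' implementation. To stay dependency-free it scores feature subsets with a nearest-centroid classifier on the training data instead of a decision tree, and all names are hypothetical.

```python
def nearest_centroid_accuracy(rows, labels, cols):
    """Training accuracy of a nearest-centroid classifier using only the given columns."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append([row[c] for c in cols])
    # One centroid per class: the per-column average of its points.
    centroids = {
        label: [sum(values) / len(values) for values in zip(*points)]
        for label, points in groups.items()
    }
    correct = 0
    for row, label in zip(rows, labels):
        point = [row[c] for c in cols]
        predicted = min(
            centroids,
            key=lambda name: sum((p - q) ** 2 for p, q in zip(point, centroids[name])),
        )
        correct += predicted == label
    return correct / len(rows)


def greedy_select(rows, labels, max_features=2):
    """Forward selection: add the best remaining feature until the score stops improving."""
    selected, best = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        scored = [(nearest_centroid_accuracy(rows, labels, selected + [c]), c)
                  for c in remaining]
        # Highest score wins; ties go to the lowest column index.
        score, col = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best
```

In practice the score would be computed on held-out data rather than on the training set, since training accuracy tends to favor adding features even when they do not generalize.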

Feature-based time series classification has also been used for time series analysis and visualization. Nick Jones et al. propose a mechanism for representing time series by their properties, as measured by diverse scientific methods. It supports organizing time series data sets automatically based on those properties. The representation is a two-dimensional matrix in which rows represent time series and columns represent the operations applied to them. This makes time series analysis easier, as a large amount of information is captured through time series features.
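A matrix with one row per time series and one column per operation can be built in a few lines. The three operations below (mean, standard deviation, range) are placeholders chosen for this sketch; the cited work uses a much larger and more varied set, and the names here are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical operations: each maps a whole series to a single number.
OPERATIONS = {
    "mean": mean,
    "std": pstdev,
    "range": lambda s: max(s) - min(s),
}


def build_feature_matrix(series_by_name, operations=OPERATIONS):
    """Return (row_names, col_names, matrix) with matrix[i][j] = operation j applied to series i."""
    row_names = list(series_by_name)
    col_names = list(operations)
    matrix = [
        [operations[op](series_by_name[name]) for op in col_names]
        for name in row_names
    ]
    return row_names, col_names, matrix
```

For example, `build_feature_matrix({"flat": [2, 2, 2, 2], "ramp": [0, 1, 2, 3]})` produces a 2 x 3 matrix whose rows can be compared or clustered directly, even if the underlying series have different lengths.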

Time series classification can also support time series forecasting. Kasun Bandara et al. propose a mechanism for time series forecasting using Long Short-Term Memory (LSTM) networks. They train a separate LSTM network for each cluster of time series and perform forecasting for each cluster separately. Feature-based classification is used as a supporting mechanism for the clustering step, after each time series has been represented as a feature vector.

There are many other ways to perform time series classification. Answer found here

One application of time series classification, used in the code above, is the Indoor User Movement Prediction problem. In this challenge, multiple motion sensors are placed in different rooms, and the goal is to identify whether an individual has moved across rooms based on the frequency data captured by these sensors.

There are four motion sensors (A1, A2, A3, A4) placed across two rooms. Have a look at the image below, which illustrates where the sensors are positioned in each room. This two-room setup was replicated in three different pairs of rooms (group1, group2, group3).

A person can move along any of the six predefined paths shown in the image above. A person who walks along path 2, 3, 4 or 6 moves within a room. A person who follows path 1 or path 5, on the other hand, has moved between the rooms.
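The path rule above amounts to a binary label for each recording. A minimal sketch, assuming paths are identified by the integers 1 to 6 (the function and constant names are hypothetical, and the actual label encoding in the dataset may differ):

```python
# Path groups as described in the text above.
BETWEEN_ROOM_PATHS = {1, 5}
WITHIN_ROOM_PATHS = {2, 3, 4, 6}


def moved_between_rooms(path_id):
    """Return True if the path crosses rooms, False if it stays within one room."""
    if path_id in BETWEEN_ROOM_PATHS:
        return True
    if path_id in WITHIN_ROOM_PATHS:
        return False
    raise ValueError(f"unknown path id: {path_id!r}")
```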

The sensor readings can be used to identify the position of a person at a given point in time. As the person moves within a room or across rooms, the readings change. These changes can be used to identify the path the person took.
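One very simple way to summarize how the readings change over a recording is the net change per sensor, from the first time step to the last. The sketch below assumes a recording is a list of 4-tuples ordered as (A1, A2, A3, A4); that layout and the function name are assumptions for illustration, not a description of the dataset's file format.

```python
SENSORS = ("A1", "A2", "A3", "A4")


def net_change_per_sensor(readings):
    """Last reading minus first reading for each sensor.

    Assumes each element of `readings` is a 4-tuple ordered as SENSORS.
    """
    if not readings:
        raise ValueError("recording is empty")
    first, last = readings[0], readings[-1]
    return {name: last[i] - first[i] for i, name in enumerate(SENSORS)}
```

A summary like this discards the shape of the signal between the two endpoints, so it is better seen as one candidate feature among many than as a classifier on its own.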

Now that the problem statement is clear, it’s time to get down to coding! In the next section, we will look at the dataset for the problem which should help clear up any lingering questions you might have on this statement. You can download the dataset from this link: Indoor User Movement Prediction.
