LANL Earthquake Prediction

https://www.kaggle.com/c/LANL-Earthquake-Prediction

DataSet

Dataset details, such as number of features, instances, data distribution

Features:

Training Data:

acoustic_data - the seismic signal [int16]
time_to_failure - the time (in seconds) until the next laboratory earthquake [float64]

Training Data instances: 629 million points

Data Distribution:

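Summary statistics for signal (acoustic_data) over 1.0e+07 rows; the quaketime (time_to_failure) statistics are not listed.

statistic   signal
count       1.000000e+07
mean        4.502072e+00
std         1.780707e+01
min         -4.621000e+03
25%         2.000000e+00
50%         4.000000e+00
75%         7.000000e+00
max         3.252000e+03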
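
A minimal sketch of how a summary like the one above might be produced with pandas, assuming the training file is a CSV with the columns acoustic_data and time_to_failure. The function name `summarize_head`, the `nrows` default and the tiny in-memory sample are illustrative assumptions, not taken from the project notebooks.

```python
import io

import numpy as np
import pandas as pd


def summarize_head(csv_source, nrows=10_000_000):
    """Read the first `nrows` rows with compact dtypes and describe them."""
    df = pd.read_csv(
        csv_source,
        nrows=nrows,
        dtype={"acoustic_data": np.int16, "time_to_failure": np.float64},
    )
    # Short column names matching the header used in the summary above.
    df.columns = ["signal", "quaketime"]
    return df.describe()


# Tiny in-memory stand-in for the training CSV (values are made up).
sample_csv = io.StringIO(
    "acoustic_data,time_to_failure\n"
    "12,1.4691\n"
    "6,1.4690\n"
    "8,1.4689\n"
    "5,1.4688\n"
)
summary = summarize_head(sample_csv, nrows=4)
print(summary)
```

Reading with explicit int16/float64 dtypes keeps memory use down, which matters when the full file holds hundreds of millions of rows.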

Test Data:

seg_id - the test segment ids for which predictions should be made (one prediction per segment)
acoustic_data - the seismic signal [int16] for which the prediction is made.

Test Data instances: 2624 files, with 150,000 instances for each file => 393,600,000 instances

Techniques we plan to use:

SVM
Gradient Boosting
Random Forests
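
Since time_to_failure is a continuous value, the three techniques would be used as regressors. The sketch below fits scikit-learn's versions of each on synthetic data; the hyperparameters, the synthetic features and the use of mean absolute error as the score are assumptions for illustration only, not the project's final configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 segments x 5 features, target loosely tied to feature 0.
X = rng.normal(size=(200, 5))
y = 5.0 + 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# Hypothetical settings; real values would need tuning.
models = {
    "svm": SVR(kernel="rbf", C=1.0),
    "gradient_boosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit on the first 150 rows and score on the remaining 50 (row order preserved).
mae = {}
for name, model in models.items():
    model.fit(X[:150], y[:150])
    pred = model.predict(X[150:])
    mae[name] = float(np.mean(np.abs(pred - y[150:])))

print(mae)
```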

Experimental methodology:

Divide the training data into chunks of 150,000 data points as the test data consists of 150,000 points
We are not creating a validation dataset, because the input is contiguous data from a sensor. A validation set built by choosing data points at random would break that ordering and would not give reliable results
Scale the data
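
The chunking and scaling steps above can be sketched as follows. This is a minimal illustration with NumPy: the helper names `make_segments` and `standardize` are hypothetical, and using the time_to_failure at the last sample of each chunk as the target is an assumption rather than something stated above.

```python
import numpy as np

SEGMENT_LEN = 150_000  # matches the length of each test segment


def make_segments(signal, time_to_failure, segment_len=SEGMENT_LEN):
    """Split a contiguous recording into non-overlapping segments.

    Returns an array of shape (n_segments, segment_len) and, per segment,
    the time_to_failure at its last sample (assumed regression target).
    Trailing samples that do not fill a whole segment are dropped.
    """
    n_segments = len(signal) // segment_len
    usable = n_segments * segment_len
    segments = np.asarray(signal[:usable]).reshape(n_segments, segment_len)
    ttf = np.asarray(time_to_failure[:usable]).reshape(n_segments, segment_len)
    return segments, ttf[:, -1]


def standardize(features):
    """Scale each feature column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std


# Synthetic stand-in for the training signal (values are made up).
rng = np.random.default_rng(0)
signal = rng.integers(-50, 50, size=350_000).astype(np.int16)
ttf = np.linspace(10.0, 0.0, 350_000)

segments, targets = make_segments(signal, ttf)
features = np.column_stack([segments.mean(axis=1), segments.std(axis=1)])
scaled = standardize(features)
print(segments.shape, targets.shape, scaled.shape)
```

Because the chunks are cut in order, each one stays a contiguous stretch of the recording, just like a test segment.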

Feature Engineering:

Feature generation: Create several groups of features:

Usual aggregations: mean, std, min and max
Average difference between the consecutive values in absolute and percent values;
Absolute min and max values;
Aforementioned aggregations for the first and last 10,000 and 50,000 values - I think these should be useful;
Max value to min value and their difference, also count of values bigger than 500 (arbitrary threshold);
Quantile features
Trend features
Rolling features
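
A rough sketch of how the feature groups listed above could be computed for one segment with NumPy. The function name `segment_features`, the feature names, the chosen percentiles, the rolling window size and the reading of "max value to min value" as a ratio are all assumptions made for illustration; only the 10,000-value edge (50,000 would work the same way) and the threshold of 500 come from the list.

```python
import numpy as np


def segment_features(x, threshold=500, edge=10_000, window=1_000):
    """Compute illustrative summary features for one segment of the signal."""
    x = np.asarray(x, dtype=np.float64)
    diff = np.diff(x)
    feats = {}

    # Usual aggregations, for the whole segment and its first/last `edge` values.
    parts = {"all": x, f"first_{edge}": x[:edge], f"last_{edge}": x[-edge:]}
    for name, part in parts.items():
        feats[f"{name}_mean"] = part.mean()
        feats[f"{name}_std"] = part.std()
        feats[f"{name}_min"] = part.min()
        feats[f"{name}_max"] = part.max()

    # Average change between consecutive values: raw units and relative change.
    feats["mean_change_abs"] = diff.mean()
    nonzero = x[:-1] != 0
    feats["mean_change_rate"] = (
        (diff[nonzero] / x[:-1][nonzero]).mean() if nonzero.any() else np.nan
    )

    # Absolute min and max values.
    feats["abs_min"] = np.abs(x).min()
    feats["abs_max"] = np.abs(x).max()

    # Max relative to min (assumed to mean a ratio), their difference, large-value count.
    feats["max_to_min"] = x.max() / x.min() if x.min() != 0 else np.nan
    feats["max_minus_min"] = x.max() - x.min()
    feats["count_big"] = int((x > threshold).sum())

    # Quantile features (percentiles chosen arbitrarily here).
    for p in (1, 5, 95, 99):
        feats[f"p{p:02d}"] = np.percentile(x, p)

    # Trend feature: slope of a least-squares line through the segment.
    feats["trend_slope"] = np.polyfit(np.arange(len(x)), x, 1)[0]

    # Rolling features: statistics of a rolling mean computed via cumulative sums.
    csum = np.cumsum(np.insert(x, 0, 0.0))
    roll_mean = (csum[window:] - csum[:-window]) / window
    feats[f"roll_mean_{window}_std"] = roll_mean.std()
    feats[f"roll_mean_{window}_max"] = roll_mean.max()
    return feats


# Synthetic segment of the same length as a test file (values are made up).
rng = np.random.default_rng(0)
segment = rng.integers(-100, 100, size=150_000)
features = segment_features(segment)
print(len(features), "features")
```

Each segment is reduced to one row of features, so the models are trained on a table with one row per 150,000-point chunk rather than on the raw signal.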

Coding Language:

Python

Team-Members

Chandra Kiran Saladi ( cxs172130 )
Shreyash Mane ( ssm170730 )
Tanya Tukade ( txt171230 )
Supraja Ponnur ( sxp179130 )
