[New Experiment] Mars Insight Unsupervised Clustering. #4

Open
wants to merge 21 commits into
base: master
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
.DS_Store
.idea/
_old/
runs/
htmlcov/
models/
.data/

# Byte-compiled / optimized / DLL files
__pycache__/
@@ -136,4 +140,8 @@ dmypy.json

# Pyre type checker
.pyre/
/experiments/mars_insight_clustering/unsupervised_clustering/models/
/experiments/triggered_tremor/convnet/runs/
7 changes: 5 additions & 2 deletions README.md
@@ -7,14 +7,17 @@ a deep learning experimentation framework for seismic data.
## Active Experiments
* [Triggered Earthquake Detection](experiments/triggered_earthquake/README.md)
* [Triggered Tremor Detection](experiments/triggered_tremor/README.md)
* [Mars Insight Seismic Data Unsupervised Clustering](experiments/mars_insight_clustering/README.md)

## Supported Models
* [Deep Convolutional Network](seisml/networks/convnet.py)
* [Dilated Convolutional Network for Deep Clustering](seisml/networks/dilated_convolutional.py)
* [Convolutional Auto Encoder](seisml/networks/convolutional_autoencoder.py)

## Datasets
* [Triggered Earthquake](experiments/triggered_earthquake)
* [Triggered Tremor](experiments/triggered_tremor)
* [Mars Insight](experiments/mars_insight_clustering)

## Repo Structure
* `experiments/`
@@ -43,14 +46,14 @@ a deep learning experimentation framework for seismic data.
* `environment.yml`
- conda environment file for CI and use


## Installation
* clone the repository and `cd` to root
* create a new Anaconda environment
```shell script
conda env create -f environment.yml
```
* run an experiment following the `README.md` found in the specific experiment directory.


## Inspiration
The inspiration and starting codebase for this model is from the Seismological Research Letters paper
1 change: 0 additions & 1 deletion environment.yml
@@ -191,4 +191,3 @@ dependencies:
- sqlalchemy==1.3.16
- torch-summary==1.2.0
- tqdm==4.45.0

31 changes: 31 additions & 0 deletions experiments/mars_insight_clustering/README.md
@@ -0,0 +1,31 @@
# Mars Seismic Data
The [Mars InSight Lander](https://mars.nasa.gov/insight/) has been recording seismic data for most of 2019. The data is released on a 3-month cadence by [SEIS](https://www.seis-insight.eu/en/science/science-summary). There is also an [active catalog](https://www.seis-insight.eu/en/science/seis-products/mqs-catalogs) of events. [Iris](https://www.iris.edu/hq/sis/insight) also hosts the data but lags behind the official EU site, so most data is pulled from SEIS. The Nature Geoscience paper [__The Seismicity of Mars__](https://www.nature.com/articles/s41561-020-0539-8) outlines the findings from the first catalog release. More information on InSight's scientific findings can be found in the Nature issue, [Insight on Mars](https://www.nature.com/collections/iiiifgehfc).

## Experiment Overview
The current event categorization process at SEIS is carried out by professionals using common seismology techniques. This experiment proposes using unsupervised machine learning techniques to cluster small samples of the seismic data from Mars. The hypothesis is that many of the samples can be clustered to match the results found by the SEIS lab. The hope is that some clusters may reveal new events or help identify anomalies such as wind, drastic temperature changes, or other instrument interference. Ideally, a model that clusters in this way could assist in the categorization of data.
* [Deep Clustering Auto-Encoder](unsupervised_clustering/README.md)

## Data
Raw data is pulled directly from the SEIS API using the availability information provided [here](https://www.seis-insight.eu/en/science/seis-data/seis-data-availability). The `XB` Network is used for instruments on Mars and the station `ELYSE` is reserved for scientific data during the mission, post full instrumentation deployment. Channels `BHU`, `BHV` and `BHW` are used to

### Availability
The notebook `mars_event_windowsi.ipynb` (split into scripts with `download_all_data.py` & `prepare_data.py`) downloads data directly from [Insight's science portal](https://www.seis-insight.eu) (EU) and splits it into smaller samples to be fed into models. This is designed for unsupervised modeling. Model-ready (processed) datasets are hosted on S3 for the framework's convenience [here](../../seisml/utility/download_data.py).
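
As a rough illustration of the splitting step, the sketch below generates overlapping window boundaries from a `length` and a `stride`, the two parameters that appear in `config_prepare.gin`. It is not the code from `prepare_data.py`: the helper name `window_bounds` is hypothetical, and treating the values `12` and `3` as minutes is an assumption, since the config does not state a unit.

```python
from datetime import datetime, timedelta


def window_bounds(start, end, length, stride):
    """Yield (window_start, window_end) pairs inside [start, end].

    Each window is `length` long and successive windows begin `stride`
    apart, so windows overlap whenever stride < length. A trailing
    window that would run past `end` is dropped.
    """
    t = start
    while t + length <= end:
        yield t, t + length
        t += stride


# length=12 and stride=3 mirror config_prepare.gin; minutes are an assumption.
start = datetime(2019, 10, 1)
end = datetime(2019, 10, 1, 1)  # one hour only, to keep the example small
windows = list(
    window_bounds(start, end, timedelta(minutes=12), timedelta(minutes=3))
)
```

With ObsPy, each pair could be converted to `UTCDateTime` and passed to `Stream.slice` to cut the actual sample out of a longer recording.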

More notebooks related to retrieving manually labeled events can be found [here](../../playground/mars_insight_seismic).

### Dataset (`seisml/datasets/mars_insight.py`)
This dataset applies a transform to each sample read from a directory of `mseed` files. For an example of using this dataset, refer to the [dataset unit test](../../tests/datasets/mars_insight_test.py).
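
The pattern described here (index a directory of `mseed` files, apply a transform on read) can be sketched roughly as follows. This is a simplified stand-in, not the class in `seisml/datasets/mars_insight.py`: the name `MseedDirectoryDataset`, the injectable `loader` argument, and the `read_bytes` helper are all hypothetical, and the files in the demo contain placeholder bytes rather than real waveform data.

```python
import os
import tempfile
from glob import glob


def read_bytes(path):
    """Placeholder loader; a real pipeline would likely use obspy.read."""
    with open(path, 'rb') as f:
        return f.read()


class MseedDirectoryDataset:
    """Hypothetical sketch: one sample per .mseed file in a directory."""

    def __init__(self, data_dir, loader=read_bytes, transform=None):
        pattern = os.path.join(os.path.expanduser(data_dir), '*.mseed')
        # Sorted so that indexing is deterministic across runs.
        self.paths = sorted(glob(pattern))
        self.loader = loader
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        sample = self.loader(self.paths[idx])
        # The transform is applied lazily, once per sample read.
        return self.transform(sample) if self.transform else sample


# Demo on a temporary directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.mseed', 'a.mseed', 'notes.txt'):
        with open(os.path.join(tmp, name), 'wb') as f:
            f.write(name.encode())
    ds = MseedDirectoryDataset(tmp, transform=len)
    n_samples = len(ds)  # notes.txt is ignored
    first = ds[0]        # transform applied to the contents of a.mseed
```

Passing the loader in as an argument keeps the sketch free of heavy dependencies and makes it easy to swap in a waveform reader later.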

## Events
Events have been manually labeled over this dataset using traditional seismology techniques, outlined in [The Seismicity of Mars](https://www.nature.com/articles/s41561-020-0539-8). We filter these events for exploration and host the images at the URL `https://blainerothrock-public.s3.us-east-2.amazonaws.com/seisml/mars/event_images/`. To view a particular event, find the event name on [Iris](http://ds.iris.edu/ds/nodes/dmc/tools/mars-events/) and append it to the URL. Examples:

#### Event Name: `S0387a`
* `https://blainerothrock-public.s3.us-east-2.amazonaws.com/seisml/mars/event_images/S0387a.png`: Bandpass filtered from 1 to 9

![event filtered](https://blainerothrock-public.s3.us-east-2.amazonaws.com/seisml/mars/event_images/S0387a.png)

* `https://blainerothrock-public.s3.us-east-2.amazonaws.com/seisml/mars/event_images/S0387a_raw.png`: raw event data

![event raw](https://blainerothrock-public.s3.us-east-2.amazonaws.com/seisml/mars/event_images/S0387a_raw.png)
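
The URL scheme above can be captured in a small helper. This is only a convenience sketch: the function name `event_image_url` is hypothetical, and the `_raw` suffix rule is inferred from the two example URLs for `S0387a`.

```python
EVENT_IMAGE_BASE = (
    'https://blainerothrock-public.s3.us-east-2.amazonaws.com'
    '/seisml/mars/event_images/'
)


def event_image_url(event_name, raw=False):
    """Build the image URL for an event name such as 'S0387a'.

    raw=False points at the bandpass-filtered image, raw=True at the
    unfiltered one (suffix rule inferred from the examples above).
    """
    suffix = '_raw' if raw else ''
    return f'{EVENT_IMAGE_BASE}{event_name}{suffix}.png'


filtered_url = event_image_url('S0387a')
raw_url = event_image_url('S0387a', raw=True)
```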


Empty file.
10 changes: 10 additions & 0 deletions experiments/mars_insight_clustering/config_download.gin
@@ -0,0 +1,10 @@
# download
download_data_availability.data_path = '~/.seisml/mars/all/'

split_availability.path='~/.seisml/mars/all/'
split_availability.network='XB'
split_availability.channel='BHU'
split_availability.location='02'

download_mseed.channel='BH?'
download_mseed.data_path='~/.seisml/mars/all/raw'
7 changes: 7 additions & 0 deletions experiments/mars_insight_clustering/config_prepare.gin
@@ -0,0 +1,7 @@
# prepare
split_data.raw_dir='~/.seisml/mars/all/raw'
split_data.save_dir='~/.seisml/mars/all/prepared_12-3_oct01-010/sample'
split_data.stride=3
split_data.length=12
split_data.starttime='2019-10-01T00:00:00'
split_data.endtime='2019-10-15T00:00:00'
87 changes: 87 additions & 0 deletions experiments/mars_insight_clustering/download.py
@@ -0,0 +1,87 @@
import csv, json, os, sys
import requests
from time import sleep
import gin

import concurrent.futures

import obspy
from obspy import read

from obspy.core.event import read_events
from obspy import read_inventory
from datetime import datetime, timedelta
from obspy.core import UTCDateTime

@gin.configurable()
def download_data_availability(data_path):
    data_path = os.path.expanduser(data_path)
    try:
        os.makedirs(data_path)
    except FileExistsError:
        pass

    payload = {
        'option': 'com_ajax',
        'data': 'ELYSE',
        'format': 'json',
        'module': 'seis_data_available'
    }
    r = requests.post('https://www.seis-insight.eu/en/science/seis-data/seis-data-availability', payload)
    with open(os.path.join(data_path, 'data_availability.json'), 'wb') as f:
        f.write(r.content)

@gin.configurable()
def split_availability(path, network, location, channel):
    path = os.path.expanduser(path)
    with open(os.path.join(path, 'data_availability.json'), 'r') as f:
        raw_ava = json.load(f)

    ava = []
    for t in raw_ava['data']:
        if t['network'] == network and t['location'] == location and t['channel'].startswith(channel):
            ava.append(t)

    with open(os.path.join(path, 'catelog.json'), 'w') as f:
        json.dump(ava, f)

    return ava


@gin.configurable(blacklist=['event'])
def download_mseed(event, channel, data_path):

    data_path = os.path.expanduser(data_path)

    try:
        os.makedirs(data_path)
    except FileExistsError:
        pass

    payload = {
        'network': event['network'],
        'station': event['station'],
        'startTime': event['startTime'],
        'endTime': event['endTime'],
        'location': event['location'],
        'channel': channel
    }

    req = requests.get('http://ws.ipgp.fr/fdsnws/dataselect/1/query', params=payload)
    file_name = '-'.join(
        [event['network'], event['station'], event['location'], event['startTime'], event['endTime']]) + '.mseed'
    print(f'{os.getpid()}: downloading: {file_name}')
    path = os.path.join(data_path, file_name)
    with open(path, 'wb') as c:
        c.write(req.content)
    return path


if __name__ == '__main__':
    gin.parse_config_file('config_download.gin')
    download_data_availability()
    ava = split_availability()

    with concurrent.futures.ThreadPoolExecutor() as executor:
        for event in ava:
            executor.submit(download_mseed, event)