### Update 10/07/2018
## MAJOR CHANGES COMING SOON, including a PyTorch implementation and better structure
TensorFlow Speech Recognition Challenge
https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
Folders:
images: audio clips -> spectrogram images
im_train: images -> resized to 28x28
results: graphs of the results
papers: some useful papers
test_pics: spectrograms of the test audio clips (ignore)
Deprecated: old GCP files (ignore)
Files:
complete.py -> code with two CNN models and adversarial training
ReadMe -> this file
Some files were used for preprocessing older data
but may be useful for other projects.
Ignore these:
CNN_code_for_resized_data.py
dataset.py
downsizing.py <- recursively resizes all images in a folder
ds.py <- an attempt at an iterator
pp.py <- audio-to-image conversion: recursively converts all audio clips in a folder to
corresponding spectrograms
speech_recog.py <- ignore
GCP-SR.py <- for local usage on Google Cloud Platform
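
Both downsizing.py and pp.py are described as walking a folder recursively and writing one output per input. The sketch below shows one way such a walk could look. It is not taken from those files: `convert_tree`, its parameters, and the `.wav`/`.png` suffixes are assumptions made for illustration, and the actual conversion (spectrogram or resize) is passed in as a callable.

```python
from pathlib import Path
from typing import Callable, List


def convert_tree(
    src_root: Path,
    dst_root: Path,
    convert: Callable[[Path, Path], None],
    src_suffix: str = ".wav",   # assumed input extension
    dst_suffix: str = ".png",   # assumed output extension
) -> List[Path]:
    """Apply `convert` to every matching file under src_root,
    mirroring the folder layout under dst_root."""
    written = []
    # sorted() keeps the processing order deterministic across runs
    for src in sorted(src_root.rglob("*" + src_suffix)):
        if not src.is_file():
            continue
        # Same relative path under dst_root, with the new extension
        dst = (dst_root / src.relative_to(src_root)).with_suffix(dst_suffix)
        dst.parent.mkdir(parents=True, exist_ok=True)
        convert(src, dst)
        written.append(dst)
    return written
```

A call such as `convert_tree(Path("audio"), Path("images"), my_converter)` would keep one subfolder per label, provided the clips are already stored that way; `my_converter` is a placeholder for whatever function writes the spectrogram or resized image.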
Models:
Shallow CNN: CNN similar to AlexNet, with two fully connected (fc) layers at the end; dropout can be enabled or disabled.
Deeper CNN:
wide: added more layers to the CNN, removed dropout
wider: increased the number of filters
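
Since the training images are only 28x28, the number of layers that can be added is limited by how fast the feature maps shrink. The sketch below traces the spatial size through a stack of layers using the standard formula `out = (in + 2*padding - kernel) // stride + 1`. The layer stacks are hypothetical and are not the configurations used in complete.py; `conv_out` and `trace_shapes` are names invented for this example.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a conv or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(size, layers):
    """Return the spatial size after each (kernel, stride, padding) layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        if size < 1:
            raise ValueError("feature map collapsed: stack too deep for this input")
        sizes.append(size)
    return sizes


# Hypothetical building blocks (assumed, not read from the repo):
conv = (3, 1, 1)  # 3x3 convolution, stride 1, padding 1 (keeps the size)
pool = (2, 2, 0)  # 2x2 max-pool, stride 2 (roughly halves the size)

shallow = [conv, pool, conv, pool]
deeper = shallow + [conv, pool, conv, pool]

print(trace_shapes(28, shallow))
print(trace_shapes(28, deeper))
```

With these assumed blocks, a 28x28 input is down to 1x1 after four conv/pool pairs, so any further depth would have to come from extra convolutions without pooling.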
For Results and Talks:
ML_final.pdf
ML_talk.pdf