My machine learning code, written in Python.
- (1) Install Python 3.5 on Windows 10.
- (2) Install IPython 4.0.3.
- (3) Install Anaconda, the installer that bundles the machine learning packages.
- (4) Run IPython and open "http://127.0.0.1:8888" in a browser.
> ipython notebook
- Matrix operations: addition, subtraction, inverse
- linspace
- zeros
- meshgrid
- sin/cos/pi
- random
- np.where to find index
- batch replace array values
- np.split
- np.arange
- multi-index for 1d array
- polynomial fit
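A minimal sketch of a few of the NumPy items above (linspace, np.where, batch replacement, matrix inverse, polynomial fit). The sample values and variable names are illustrative assumptions, not taken from the notebooks.

```python
import numpy as np

# Evenly spaced sample points and a noiseless quadratic y = 2x^2 - 3x + 1
# (the data is an assumption made up for this sketch).
x = np.linspace(0.0, 4.0, 9)          # 0.0, 0.5, ..., 4.0
y = 2 * x**2 - 3 * x + 1

# np.where with a single condition returns a tuple of index arrays.
idx = np.where(y > 10)[0]             # indices 7 and 8

# Batch-replace values through a boolean mask (on a copy, so y is untouched).
clipped = y.copy()
clipped[clipped > 10] = 10

# Polynomial fit: coefficients are returned highest degree first.
coeffs = np.polyfit(x, y, deg=2)

# Matrix addition, subtraction and inverse.
a = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.eye(2)
total, diff = a + b, a - b
a_inv = np.linalg.inv(a)
```

For noisy data the fitted coefficients would only approximate the generating ones.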
- draw line
- draw image
- draw 2 lists to bar
- draw bar with x
- draw image without axes
- figure resize
- plot 2 lines in one figure
- plot point and line in one figure
- 3D plot with pca
- plot 2 lines with different line width
- save to png image
- plot x/y/title label with font_size
- sub-plot
- set x/y axes limits
- add legend
- plot multi-lines with labels for each line
- plot confusion matrix
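Several of the plotting items above (figure size, two lines with different widths, axis limits, labels with font size, legend, saving to PNG) can be combined in one short sketch. It assumes the non-interactive "Agg" backend and writes the PNG to an in-memory buffer instead of a file; both choices are assumptions made so the example runs anywhere.

```python
import io

import matplotlib
matplotlib.use("Agg")                  # assumption: no display available
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(6, 4))                  # figure resize
ax.plot(x, np.sin(x), linewidth=1.0, label="sin")       # thin line
ax.plot(x, np.cos(x), linewidth=3.0, label="cos")       # thick line
ax.set_xlim(0, 2 * np.pi)                               # axis limits
ax.set_ylim(-1.5, 1.5)
ax.set_xlabel("x", fontsize=12)                         # labels with font size
ax.set_ylabel("y", fontsize=12)
ax.set_title("sin and cos", fontsize=14)
ax.legend()

# Save as PNG; pass a file name instead of the buffer to write to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```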
- leastsq
- load matlab mat file
- sparse matrix basic
- gaussian filter
- get data distribution probability
- cdist euclidean
- downsample/down-sample data
- optimize with equality constraints
- load matlab mat file then fft
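As a rough illustration of two of the SciPy items above (cdist with the Euclidean metric, and leastsq), the sketch below uses made-up points and a hypothetical `residuals` helper for a straight-line fit.

```python
import numpy as np
from scipy.optimize import leastsq
from scipy.spatial.distance import cdist

# Pairwise Euclidean distances between two small point sets (assumed data).
a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 3.0], [4.0, 0.0]])
dist = cdist(a, b, metric="euclidean")   # shape (2, 2); dist[0] is [3, 4]

# Hypothetical residual function for fitting y = k*x + c.
def residuals(params, x, y):
    k, c = params
    return y - (k * x + c)

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# leastsq minimises the sum of squared residuals starting from x0.
params, ier = leastsq(residuals, x0=[1.0, 0.0], args=(x, y))
```

`ier` is an integer flag; values 1 to 4 indicate that a solution was found.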
- Series basic for pandas
- DataFrame basic for pandas
- create DataFrame from list of data
- read csv and iterate row by row
- read excel and iterate row by row
- replace/fill all nan
- parse csv with special separator
- read csv without header
- remove missing values row/column
- null/missing values
- drop columns that contain only zeros
- data merge
- data sort by count
- data sort by mean
- convert string to int for each column
- count the value distribution of a specific column
- write excel to multiple sheets
- merge two df: hstack
- merge two df: vstack
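A small sketch of some of the pandas items above: building a DataFrame from a list of rows, filling NaN, iterating row by row, and the hstack/vstack style merges via `pd.concat`. The column names and values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Create DataFrames from lists of rows (assumed sample data).
left = pd.DataFrame([[1, 2.0], [2, np.nan]], columns=["id", "score"])
right = pd.DataFrame([["a"], ["b"]], columns=["label"])

# Replace every NaN with a constant.
filled = left.fillna(0)

# Iterate row by row; itertuples keeps each column's dtype.
ids = [row.id for row in left.itertuples(index=False)]

# "hstack": place frames side by side, aligned on the index.
wide = pd.concat([left, right], axis=1)

# "vstack": stack frames on top of each other and renumber the index.
tall = pd.concat([left, left], axis=0, ignore_index=True)
```

`pd.concat` aligns on the index for `axis=1`, so frames with different indexes would produce NaN-padded rows rather than a plain side-by-side join.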
Samples should be opened in IPython.
Linear Model
Decision Tree
Random Forest
SVM
Neural Network
Gradient Boosting for classification
CNN (Deep Learning)
XGBoost
DBN
RNN
DCNN
Linear Regression
- sklearn LinearRegression and linear function parameters
- difference between np.linalg.lstsq and linear_model.LinearRegression
- regression by CNN
- HMM basic
- HMM application
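For the np.linalg.lstsq versus LinearRegression item above, one practical difference can be sketched as follows: lstsq solves exactly the system it is given, so an intercept needs an explicit column of ones, while LinearRegression fits an intercept by default. The data here is an assumption (an exact line y = 3x + 2).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed sample data lying exactly on y = 3x + 2.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 * x + 2.0

# np.linalg.lstsq: add a ones column so the second weight acts as intercept.
A = np.column_stack([x, np.ones_like(x)])
w, *_ = np.linalg.lstsq(A, y, rcond=None)     # w is [slope, intercept]

# LinearRegression: expects 2-D features and fits the intercept itself.
model = LinearRegression().fit(x.reshape(-1, 1), y)
slope, intercept = model.coef_[0], model.intercept_
```

Both approaches solve the same ordinary least squares problem, so the recovered slope and intercept should agree up to floating point error.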
k-Nearest Neighbor
DBSCAN
- KFold and StratifiedKFold
- accuracy score
- confusion matrix
- P, R, F1 value
- cross validation
- MSE
- Logloss
- classification report
- feature selection by feature importance
- homogeneity & completeness & v_measure for unsupervised learning cluster evaluation metric
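The accuracy, confusion matrix and P/R/F1 items above can be illustrated with scikit-learn's metrics module. The labels below are made-up binary predictions, used only as an assumption for the sketch.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Assumed ground truth and predictions for a binary problem.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

acc = accuracy_score(y_true, y_pred)          # 4 of 6 correct

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)         # [[2, 1], [1, 2]]

# Precision, recall and F1 for the positive class (label 1).
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
```

For multi-class output, `average` can be set to `"macro"`, `"micro"` or `"weighted"` instead of `"binary"`.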
- Install TensorFlow on Windows
- "hello world" for basic calculation
- basic linear model by tf-core low level api
- basic linear model training by tf-core low level api
- basic linear model by tf.estimator high level api
- basic softmax regression at mnist dataset
- DNNClassifier: set batch size and epochs
- Install Keras on Windows
- Switch Keras backend between Theano and TensorFlow
- Keras basic neural network
- int to one-hot & one-hot to int
- get log loss each epoch
- model dump, load
- plot loss,acc history
- early stop
- adam optimization
- validation
- param of model.summary
- keras fixed model random seed
- Keras 1D-CNN
- Keras 2D-CNN
- RNN by keras
- RESNET-50 by keras
- LSTM
- LSTM+Attention
- complex attention models
- BI-LSTM
- model persistence(dump & load)
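The "int to one-hot & one-hot to int" item above can be sketched with plain NumPy. This is a swapped-in technique for illustration, not necessarily what the notebook does; Keras also ships `keras.utils.to_categorical` for the forward direction. The labels and class count are assumptions.

```python
import numpy as np

# Assumed integer class labels and number of classes.
labels = np.array([0, 2, 1, 2])
num_classes = 3

# int -> one-hot: pick rows of the identity matrix by label.
one_hot = np.eye(num_classes, dtype=int)[labels]

# one-hot -> int: the position of the 1 in each row is the label.
recovered = one_hot.argmax(axis=1)
```

The same `argmax(axis=1)` step also turns a matrix of predicted class probabilities back into integer labels.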
- euclidean distance
- boston house-prices dataset for regression
- decision tree ensemble with adaboost
- kmeans cluster
- mini batch kmeans cluster
- random forest grid search
- logloss
- GMM cluster
- TSNE/T-SNE plot
- F1-P-R calculation for multi-class output by sklearn
- Isomap plot
- array create/reshape/index
- array concat
- array operation
- basic linear regression
- linear regression by gluon
- get dataset fashion mnist
- softmax regression by gluon
- plot activation relu/sigmoid with gradient
- mlp and model parameter access
- CNN LeNet
- dataset jaychou
- generate jay text by rnn
- compute gradient
- count null/missing_value item for each column
- drop the column if it contains one missing_value
- drop the column if the missing_value count > 0.5*full_count
- check the data with null data > 20%
- insert mean value for missing values
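The missing-value steps above could look roughly like this in pandas. The frame, its column names and the exact thresholds are assumptions mirroring the list items, not code from the notebooks.

```python
import numpy as np
import pandas as pd

# Assumed sample frame: "a" has 1 missing value, "b" has 3, "c" has none.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, np.nan, 1.0],
    "c": [1.0, 2.0, 3.0, 4.0],
})

# Count missing values per column.
null_counts = df.isnull().sum()

# Drop every column that contains at least one missing value.
no_missing_cols = df.dropna(axis=1, how="any")

# Drop columns whose missing count exceeds half of the row count.
kept = df.loc[:, null_counts <= 0.5 * len(df)]

# List columns with more than 20% missing values.
flagged = null_counts[null_counts / len(df) > 0.2].index.tolist()

# Fill the remaining gaps with each column's mean.
imputed = kept.fillna(kept.mean())
```

Mean imputation is computed on the non-missing values only, so it leaves each column's mean unchanged.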
Installation
- Install opencv-python on Win-64 in a conda py2 env
- Install opencv-python on Win-64 with conda (python3)
- Install opencv-python on Windows with py2
Basic
- Image Read/Cut/Display
- Image Read/Cut/Display by Jupyter
- save image/imwrite
- cvtColor
- rgb to hsv
- hsv image
Preprocess
- Image smooth by blurring, Gaussian, median and bilateral filtering
- modify pixel color
- modify pixel color fast
- flatten rgb
- split rgb
- find connected component
- convert white color to green
- blur detection
Projects
opencv 2.4.9 & windows-7
- SIFT
- Images matching with average distance output
- Images matching with most-similar-average distance output
- Different size images match by ORB feature and BFMatcher
- Different direction images match by ORB feature and BFMatcher
- Image smooth, shift, rotate, zoom by scipy.ndimage
- image enhancement for ocr
- Keep gray image pixel values within [0,255] when zooming/shifting/rotating by setting order=1. commit-220ac520a0d008e74165fe3aace42b93844aedde
- template match
- Run at Cluster
- Distributed programming demo for feature extraction
- Get running slave current directory and IP address
- Submit multiple .py files which have dependency
- nltk basic usage such as tokenize/tagging
- nltk tag meaning en & zh
- identify named entities basic
- nltk load and process built-in data/dataset/corpus
- nltk load and process external data/dataset/corpus
- normalizing text by stemmer
- normalizing text by lemmatization
- nltk.Text.similar
- more details about nltk.Text.similar
- sentiment analysis
- offline install of nltk_data
- install allennlp
- NER by biLSTM with CRF layer and ELMo embeddings trained on the CoNLL-2003 NER dataset 2018
- fact predict by trained decomposable attention model 2017
- question answer from passage by BiDAF 2017
- basic intro
- frequency and data extraction from wav file
- audio feature extraction
- extract same length features for different audio
- GPU environment setup
- GPU vs CPU basic
- check GPU hardware status
- check if keras/tensorflow run on GPU
- 矩池云 (Matpool) cloud GPU
- load local dataset by pandas
- load local dataset by surprise
- load built-in dataset by surprise
- ml-100k dataset movie name to id, or id to movie
- basic/official example
- basic algorithm and testing for local data
- predict rating for test dataset
- user based collaborative filtering and predict one item rating
- item based collaborative filtering and predict one item rating
- top-n recommendation for built-in data
- top-n recommendation for local data
- SVD collaborative filtering
- correlation analysis
- association analysis
- jaccard similarity
- 2 strings/list similarity (unequal length)
- peak detect
- panel data analysis: fixed/random effects model by linearmodels
- panel data analysis: mixed effects model by statsmodels
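For the Jaccard similarity and unequal-length string similarity items above, a standard-library sketch is shown below. The `jaccard_similarity` helper is hypothetical, and using `difflib.SequenceMatcher` for string similarity is an assumption; the notebooks may use a different measure.

```python
from difflib import SequenceMatcher


def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| over the distinct items of a and b.

    Two empty inputs are treated as identical (1.0); that convention is
    an assumption of this sketch.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0
    return len(set_a & set_b) / len(union)


# Lists of different lengths: 2 shared items out of 5 distinct ones.
score = jaccard_similarity([1, 2, 3], [2, 3, 4, 5])   # 0.4

# Similarity ratio in [0, 1] for two strings of unequal length.
ratio = SequenceMatcher(None, "kitten", "sitting").ratio()
```

Jaccard ignores order and duplicates, whereas the SequenceMatcher ratio is order-sensitive, so the two measures answer slightly different questions.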
- Human face image completion
- Human detection by opencv HOG & SVM
- gibberish text detection
- poker game AI
- learn deeper for pklearn
- Industrial Control System ICS Cyber Attack Dataset Classification
- red wine quality classification
- chemistry dataset tox21 test by chainer-chemistry
- spam email/sms classification
- jaychou text generation by LSTM
- SI/SIR model by scipy
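As a minimal sketch of the SIR item above, the classic model dS/dt = -βSI, dI/dt = βSI - γI, dR/dt = γI can be integrated with `scipy.integrate.odeint`. The rates, initial fractions and time grid are assumptions chosen for illustration, and `sir_rhs` is a hypothetical helper name.

```python
import numpy as np
from scipy.integrate import odeint


def sir_rhs(state, t, beta, gamma):
    """Right-hand side of the SIR equations on population fractions."""
    s, i, r = state
    new_infections = beta * s * i
    recoveries = gamma * i
    return [-new_infections, new_infections - recoveries, recoveries]


beta, gamma = 0.3, 0.1             # assumed infection and recovery rates
initial = [0.99, 0.01, 0.0]        # assumed S, I, R fractions at t = 0
t = np.linspace(0.0, 160.0, 161)   # one sample per time unit

trajectory = odeint(sir_rhs, initial, t, args=(beta, gamma))
s, i, r = trajectory.T             # one array per compartment
```

Setting `gamma` to 0 reduces this to the SI model, since nobody leaves the infected compartment.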