Monish997/Global-Hack-Week--Init-2022-Machine-Learning-Track

This repository contains the code I wrote while following the MLH Global Hack Week: Init 2022 Machine Learning Track.
Link to data used for training the models

Order in which I learnt

Rules-based Model -> Machine Learning Model (Logistic Regression) -> Artificial Neural Network Model (TensorFlow and Keras)
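
As a rough illustration of the first step in that progression, a rules-based model can be as small as a hand-written word list. The sketch below assumes the task is sentiment classification, which this README does not state explicitly; the word lists and the function name `rule_based_sentiment` are hypothetical and not taken from the course material.

```python
import re

# Hypothetical word lists chosen for illustration only.
POSITIVE_WORDS = {"good", "great", "favourite", "like"}
NEGATIVE_WORDS = {"bad", "terrible", "awful"}


def rule_based_sentiment(document: str) -> str:
    """Label a document by counting hand-picked positive and negative words."""
    # Lowercase the text and keep only runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", document.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(rule_based_sentiment("A. R. Rahman is a good film composer and songwriter."))
print(rule_based_sentiment("Pineapple on pizzas is a very bad idea."))
```

The rules here are fixed by hand. A machine learning model such as logistic regression would instead learn a weight for each word from labelled examples.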

What is NLP?

NLP stands for Natural Language Processing. NLP is the ability of a computer program to understand human language as it is spoken and written.

Basic Terms in NLP

  1. The smallest unit of NLP data is a character.
character = 'g'
  2. A sequence of characters that makes up a "word" is called a token.
token = "good"
  3. A sequence of tokens that conveys a meaning on its own is called a document.
document = "A. R. Rahman is a good film composer and songwriter."
  4. A collection of documents is called a corpus.
corpus = [
	"A. R. Rahman is a good film composer and songwriter.",
	"Pineapple on pizzas is a very bad idea.",
	"I like anime. Steins;Gate is my favourite",
	"My introvert friend is terrible at communicating.",
]
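
The four terms nest inside one another, which a short sketch can make concrete. It uses plain whitespace splitting as a stand-in for real tokenization (an assumption made for brevity), so punctuation stays attached to the neighbouring word.

```python
corpus = [
    "A. R. Rahman is a good film composer and songwriter.",
    "Pineapple on pizzas is a very bad idea.",
]

document = corpus[1]          # one document taken from the corpus
tokens = document.split()     # naive whitespace tokenization of that document
characters = list(tokens[0])  # the characters that make up the first token

print(tokens)
print(characters)
```

Note that the last token comes out as `idea.` with the full stop attached, which is one reason dedicated tokenizers are normally used instead of `str.split`.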

Basic Preprocessing involved in NLP

  1. Tokenization: breaking text down into smaller units (tokens), such as words or sentences.
  2. Part-of-speech tagging: marking up words as nouns, verbs, adjectives, adverbs, pronouns, etc.
  3. Stemming and lemmatization: standardizing words by reducing them to their root forms.
  4. Stop word removal: filtering out common words that add little or no unique information, for example, prepositions and articles (at, to, a, the).
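
Three of these steps can be sketched in plain Python. This is a toy pipeline under stated assumptions: the stop word list is a tiny hypothetical sample, and `naive_stem` only strips a few suffixes (it is not the Porter stemmer or a lemmatizer). Part-of-speech tagging is left out because it needs a trained tagger or a lexicon.

```python
import re

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "at", "to", "is", "on", "and", "my"}

# Suffixes for a toy stemmer; this is not the Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def naive_stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters of the stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(token) for token in remove_stop_words(tokenize(text))]


print(preprocess("My introvert friend is terrible at communicating."))
# ['introvert', 'friend', 'terrible', 'communicat']
```

The stem `communicat` is not a dictionary word, which is typical of stemming; lemmatization would instead map `communicating` to `communicate`.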

Twitch Stream Links

Machine Learning Track Part 2: Intro to NLP
Machine Learning Track Part 3: Logistic Regression and Neural Networks
Machine Learning Track Part 4: Tensorflow, Keras, and Overfitting
