Data science related questions
1. What are the different hyperparameter tuning techniques?
A. Grid search and random search.
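To make the two methods concrete, here is a minimal sketch in plain Python. The validation_loss function, the hyperparameter names (learning_rate, depth) and their ranges are all made up for illustration; in practice that function would train a model and return a validation score.

```python
import itertools
import random


def validation_loss(learning_rate, depth):
    """Hypothetical stand-in for training a model and returning its validation loss."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed values and keep the best one."""
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in itertools.product(*grid.values())]
    return min(candidates, key=lambda params: validation_loss(**params))


def random_search(n_trials, seed=0):
    """Sample each hyperparameter independently for a fixed budget of trials."""
    rng = random.Random(seed)
    candidates = [
        {"learning_rate": 10 ** rng.uniform(-3, 0), "depth": rng.randint(2, 8)}
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda params: validation_loss(**params))


grid = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "depth": [2, 4, 6, 8]}
best_grid = grid_search(grid)      # tries all 4 * 4 = 16 combinations
best_random = random_search(16)    # same budget, but values are not tied to a grid
```

Grid search is exhaustive over the listed values, so its cost multiplies with every extra hyperparameter. Random search keeps the budget fixed and can land on values that lie between grid points.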
2. Difference between tokenization and embedding
A. Tokenization splits the text into tokens and maps each token to a number (an integer ID).
These tokens are usually words, but they can also be phrases, punctuation marks, or even individual characters.
A straight mapping from tokens to numbers can be fed to a model directly, but it quickly gets too big as the vocabulary grows.
An embedding instead maps each token ID to a dense, fixed-length vector that is learned during training.
Tokenization is the first step in NLP and is essential for text preprocessing.
It prepares the text data for analysis by making it more structured and easier to work with.
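A small sketch of both steps, using only the standard library. The regex tokenizer, the sample sentence and the 4-dimensional random embedding table are illustrative assumptions; real pipelines typically use a trained tokenizer and learned embedding weights.

```python
import random
import re


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


text = "Tokenization comes first, embedding comes second."
tokens = tokenize(text)
vocab = build_vocab(tokens)
token_ids = [vocab[token] for token in tokens]

# Embedding: a lookup table with one dense vector per token ID.
# The values are random here; in a real model they are learned during training.
rng = random.Random(0)
embedding_dim = 4
embedding_table = [[rng.uniform(-1, 1) for _ in range(embedding_dim)] for _ in vocab]
embedded = [embedding_table[token_id] for token_id in token_ids]
```

Tokenization ends at token_ids (a list of integers), while embedding turns each of those integers into a vector. The repeated token "comes" gets the same ID and therefore the same vector both times.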
3. Types of tokenization and embedding techniques
4. How do we handle imbalanced datasets?
5. What are the different methods of imputing missing values?
6. What are the different embedding methods, and how do we visualize whether the embeddings have successfully separated the values per class?
8. Different methods of regularization
Methods: L1, L2, Elastic Net, dropout, early stopping, batch normalization, weight constraints
Elastic Net: This combination of L1 and L2 regularization adds both penalties to the loss,
which can be a useful middle ground.
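As a rough illustration of how the two penalties are mixed, the sketch below computes an Elastic Net penalty for a weight vector. The alpha / l1_ratio parametrization is one common convention (it matches the one used by scikit-learn's ElasticNet); the function name and the example weights are made up.

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Penalty = alpha * (l1_ratio * L1 + 0.5 * (1 - l1_ratio) * L2^2)."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)


weights = [0.5, -2.0, 0.0]
lasso_only = elastic_net_penalty(weights, l1_ratio=1.0)   # pure L1 penalty
ridge_only = elastic_net_penalty(weights, l1_ratio=0.0)   # pure L2 penalty
mixed = elastic_net_penalty(weights, l1_ratio=0.5)        # halfway between the two
```

Setting l1_ratio to 1 or 0 recovers the plain L1 or L2 penalty, so Elastic Net can be tuned anywhere between the two.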
7. Traditional methods like cross-validation and stepwise regression work well for feature selection and for handling overfitting with a small set of features,
but L1 and L2 regularization are a great alternative when you're dealing with a large set of features.
8. Difference between L1 and L2 regularization
L1 Regularization (Lasso): Encourages sparsity in the model parameters. Some coefficients can shrink to exactly zero, effectively performing feature selection.
L2 Regularization (Ridge): Shrinks all coefficients toward zero, with larger coefficients penalized more, but does not usually bring them to exactly zero. It helps with multicollinearity and model stability.
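The sparsity difference can be seen in the simplest case of a single coefficient (or orthonormal features), where both penalties have a closed-form solution. This is a sketch under that assumption only; the function names, the unpenalized values and lam = 0.5 are made up for illustration.

```python
def lasso_shrink(z, lam):
    """Soft-thresholding: minimizer of 0.5 * (w - z)**2 + lam * abs(w)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


def ridge_shrink(z, lam):
    """Proportional shrinkage: minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2."""
    return z / (1.0 + lam)


unpenalized = [3.0, -0.4, 0.2, -1.5]
lam = 0.5
l1_coefs = [lasso_shrink(z, lam) for z in unpenalized]
l2_coefs = [ridge_shrink(z, lam) for z in unpenalized]
```

With these inputs the L1 version zeroes out the two small coefficients, while the L2 version scales every coefficient by the same factor and leaves all of them nonzero.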
9. Use of GPUs in TensorFlow and how to limit GPU usage
https://www.tensorflow.org/guide/gpu
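A minimal sketch of the two options described in that guide: letting memory grow on demand, or putting a hard cap on each GPU. The wrapper function configure_gpus and its memory_limit_mb parameter are hypothetical names; the tf.config calls are the ones the guide uses. It has to run before TensorFlow initializes the GPUs, otherwise TensorFlow raises a RuntimeError.

```python
def configure_gpus(memory_limit_mb=None):
    """Restrict how much GPU memory TensorFlow claims; call before any GPU work.

    With memory_limit_mb=None, memory is allocated on demand (memory growth).
    Otherwise each GPU is capped at memory_limit_mb megabytes.
    """
    # Imported here so this sketch can be loaded without TensorFlow installed.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        if memory_limit_mb is None:
            # Start small and grow the allocation as the program needs it.
            tf.config.experimental.set_memory_growth(gpu, True)
        else:
            # Expose one logical device with a fixed memory ceiling.
            tf.config.set_logical_device_configuration(
                gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=memory_limit_mb)]
            )
    return gpus


# Example: configure_gpus(memory_limit_mb=1024) caps each GPU at roughly 1 GB.
```

The two settings are treated as alternatives here: a given GPU gets either memory growth or a fixed limit, not both.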