In this project I've implemented a regression model that recommends the "crew" size for potential ship buyers. The Python notebook assignment.ipynb contains all of the code. I used Jupyter Notebook to run the assignment.
As I chose track 1a (crew size prediction), my main objective is to build a regression model that recommends the "crew" size for potential ship buyers.
Major steps followed:
- Reading the dataset and displaying its columns.
  - 1.1 Summarising the data by calculating the count, mean, standard deviation, min and max.
  - 1.2 Plotting a heatmap of all the attributes to find the correlation factors.
  - 1.3 A few data visualisations.
- Analysing the data and drawing observations.
  - 2.1 Observations for part 2.
- Selecting the variables that are used for predicting the crew size.
  - 3.1 Plotting a heatmap of the covariance matrix to show the correlation between the variables, with its observations.
  - 3.2 Using the important variables (columns) selected from the heatmap above.
  - 3.3 Building a one-hot encoding for the categorical values, with its observations.
- Creating the train and test data.
- Training a multiple regression ML model to predict the 'crew' size.
- Evaluating the regression model by calculating the scores and drawing observations.
  - 6.1 Feature reduction, standardisation, cross-validation and hyper-parameter fine-tuning.
  - 6.2 Different methods of dimensionality reduction.
    - 6.2.1 Principal Component Analysis (PCA) and reasoning/observations from the plot.
    - 6.2.2 LASSO: regularised regression analysis and reasoning/observations from the plot.
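
The steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it uses a small synthetic table in place of the real dataset, and the column names `tonnage`, `cabins` and `cruise_line`, the 0.3 correlation threshold and the 25% test split are all assumptions made for the example. Only the `crew` target comes from the assignment. The heatmap plots are left out; the correlation matrix computed here is what such a heatmap would display.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the ship dataset.
# Every column except "crew" is a hypothetical name chosen for this sketch.
rng = np.random.default_rng(0)
n = 200
ships = pd.DataFrame({
    "tonnage": rng.uniform(10, 200, n),
    "cabins": rng.uniform(1, 30, n),
    "cruise_line": rng.choice(["A", "B", "C"], n),
})
ships["crew"] = (0.05 * ships["tonnage"] + 0.4 * ships["cabins"]
                 + rng.normal(0, 0.5, n))

# Step 1: summary statistics and the correlation matrix of the numeric columns.
summary = ships.describe()
corr = ships.select_dtypes("number").corr()

# Step 3: keep numeric columns that correlate with "crew" (assumed threshold),
# then one-hot encode the categorical column.
selected = [c for c in corr.columns
            if c != "crew" and abs(corr.loc[c, "crew"]) > 0.3]
X = pd.get_dummies(ships[selected + ["cruise_line"]],
                   columns=["cruise_line"], drop_first=True, dtype=float)
y = ships["crew"]

# Step 4: train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 5: multiple linear regression.
model = LinearRegression().fit(X_train, y_train)

# Step 6: evaluate with R^2 on both splits.
r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))

# Step 6.2.1: PCA on standardised features; cumulative explained variance.
pca = make_pipeline(StandardScaler(), PCA()).fit(X_train)
explained = pca.named_steps["pca"].explained_variance_ratio_.cumsum()

# Step 6.2.2: LASSO with the regularisation strength picked by 5-fold CV.
lasso = make_pipeline(StandardScaler(),
                      LassoCV(cv=5, random_state=0)).fit(X_train, y_train)
r2_lasso = lasso.score(X_test, y_test)

print(f"linear: train R^2 = {r2_train:.3f}, test R^2 = {r2_test:.3f}")
print(f"lasso:  test R^2 = {r2_lasso:.3f}")
print("cumulative explained variance:", explained.round(3))
```

Putting the scaler inside the pipeline means it is fitted on the training split only, so the test split does not leak into the standardisation or the cross-validated choice of the LASSO penalty.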
The repository also contains two other files, which hold my answers to the technical questions for Exercise 2 and Exercise 3.
Exercise 2: Here I've answered all the questions on databases (SQL and database management systems).
Exercise 3: Here I've answered all the questions on data management and machine translation.