Skip to content

In this assignment you'll explore the relationship between model complexity and generalization performance, by adjusting key parameters of various supervised learning models for both regression and classification.

Notifications You must be signed in to change notification settings

jigsawcoder/Applied-Machine-Learning-using-Python---Assignment-2

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Applied-Machine-Learning-using-Python---Assignment-2

Assignment 2

In this assignment you'll explore the relationship between model complexity and generalization performance, by adjusting key parameters of various supervised learning models. Part 1 of this assignment will look at regression and Part 2 will look at classification.

FOR CLASSIFICATION:

Here's an application of machine learning that could save your life! For this section of the assignment we will be working with the UCI Mushroom Data Set stored in readonly/mushrooms.csv. The data will be used to train a model to predict whether or not a mushroom is poisonous. The following attributes are provided:

Attribute Information:

cap-shape: bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s cap-surface: fibrous=f, grooves=g, scaly=y, smooth=s cap-color: brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y bruises?: bruises=t, no=f odor: almond=a, anise=l, creosote=c, fishy=y, foul=f, musty=m, none=n, pungent=p, spicy=s gill-attachment: attached=a, descending=d, free=f, notched=n gill-spacing: close=c, crowded=w, distant=d gill-size: broad=b, narrow=n gill-color: black=k, brown=n, buff=b, chocolate=h, gray=g, green=r, orange=o, pink=p, purple=u, red=e, white=w, yellow=y stalk-shape: enlarging=e, tapering=t stalk-root: bulbous=b, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r, missing=? stalk-surface-above-ring: fibrous=f, scaly=y, silky=k, smooth=s stalk-surface-below-ring: fibrous=f, scaly=y, silky=k, smooth=s stalk-color-above-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y stalk-color-below-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y veil-type: partial=p, universal=u veil-color: brown=n, orange=o, white=w, yellow=y ring-number: none=n, one=o, two=t ring-type: cobwebby=c, evanescent=e, flaring=f, large=l, none=n, pendant=p, sheathing=s, zone=z spore-print-color: black=k, brown=n, buff=b, chocolate=h, green=r, orange=o, purple=u, white=w, yellow=y population: abundant=a, clustered=c, numerous=n, scattered=s, several=v, solitary=y habitat: grasses=g, leaves=l, meadows=m, paths=p, urban=u, waste=w, woods=d

The data in the mushrooms dataset is currently encoded with strings. These values will need to be encoded to numeric to work with sklearn. We'll use pd.get_dummies to convert the categorical variables into indicator variables.

About

In this assignment you'll explore the relationship between model complexity and generalization performance, by adjusting key parameters of various supervised learning models for both regression and classification.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published