Assign kmeans() to a DataFrame (outputting a new column to the dataframe with cluster values)

hoofay/assign_kmeans

This function applies kmeans() to a dataframe.

The function returns the input dataframe with an additional column giving the assigned cluster. The dataframe can be filtered so that only a subset of its rows is used to determine the clusters; the remaining rows are then assigned a cluster based on those original calculations.
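The two-step idea (fit on a subset, then assign every row) can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the repository's code: the toy matrix `x`, the subset `fit_rows`, and the helper `nearest_centre()` are hypothetical names, and "assigned based on the original calculations" is interpreted here as "assigned to the nearest fitted centre by squared Euclidean distance".

```r
# Minimal sketch: fit kmeans() on a subset, then assign every row to the nearest centre.
set.seed(42)  # kmeans() starts from random centres; fix the seed for repeatability

# Hypothetical toy data: two well-separated groups in two dimensions (40 rows)
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# Rows used to determine the clusters (an arbitrary subset covering both groups)
fit_rows <- c(1:10, 21:30)
fit <- kmeans(x[fit_rows, ], centers = 2, nstart = 10)

# Assign each row to the centre with the smallest squared Euclidean distance
nearest_centre <- function(points, centres) {
  d <- sapply(seq_len(nrow(centres)), function(k) {
    rowSums(sweep(points, 2, centres[k, ])^2)
  })
  d <- matrix(d, nrow = nrow(points))  # keep a matrix even for a single row
  max.col(-d, ties.method = "first")   # column of the minimum distance
}

cluster <- nearest_centre(x, fit$centers)
table(cluster)
```

Rows that took part in the fit and rows that did not go through the same assignment step, so all labels refer to the same set of centres.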

df = dataframe (e.g. 'mydf')

filter_var = variable to filter on for the purposes of determining clusters (e.g. 'Customers')

filter_val = value for filter_var to be greater than (e.g. 10 for 'Customers'>10)

id_var = a unique id column within the dataframe (e.g. 'Customer_Id')

cluster_vars = the variables used for clustering (e.g. c('Customer_Value','Number_of_Transactions'))

num_clusters = the number of centres argument in kmeans()

useScale = TRUE/FALSE, whether or not to scale the matrix prior to deploying kmeans()
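To show how these arguments might fit together, here is a hedged sketch of a function with the same argument names. The name `assign_kmeans_sketch`, the output column name `Cluster`, and the example data are hypothetical. Several behaviours are assumptions rather than facts about the repository: `id_var` is only checked for uniqueness, scaling reuses the mean and standard deviation of the filtered subset for all rows, `nstart = 10` is an arbitrary choice, and the data are assumed to contain no missing values.

```r
# Hypothetical sketch of a function taking the arguments described above.
assign_kmeans_sketch <- function(df, filter_var, filter_val, id_var,
                                 cluster_vars, num_clusters, useScale = TRUE) {
  # Assumption: id_var only needs to identify rows uniquely
  stopifnot(!anyDuplicated(df[[id_var]]))

  x <- as.matrix(df[, cluster_vars, drop = FALSE])
  in_fit <- df[[filter_var]] > filter_val  # rows used to determine the clusters

  if (useScale) {
    # Assumption: scale every row with the mean/sd of the fitted subset
    mu <- colMeans(x[in_fit, , drop = FALSE])
    sdev <- apply(x[in_fit, , drop = FALSE], 2, sd)
    x <- scale(x, center = mu, scale = sdev)
  }

  fit <- kmeans(x[in_fit, , drop = FALSE], centers = num_clusters, nstart = 10)

  # Squared distance from every row to every centre; keep the nearest centre
  d <- sapply(seq_len(num_clusters), function(k) {
    rowSums(sweep(x, 2, fit$centers[k, ])^2)
  })
  d <- matrix(d, nrow = nrow(x))
  df$Cluster <- max.col(-d, ties.method = "first")
  df
}

# Example call on hypothetical toy data
set.seed(1)
mydf <- data.frame(
  Customer_Id = 1:60,
  Customers = rep(c(5, 20), times = 30),
  Customer_Value = c(rnorm(30, 100, 5), rnorm(30, 500, 5)),
  Number_of_Transactions = c(rnorm(30, 2, 0.2), rnorm(30, 20, 0.2))
)
out <- assign_kmeans_sketch(mydf, "Customers", 10, "Customer_Id",
                            c("Customer_Value", "Number_of_Transactions"),
                            num_clusters = 2, useScale = TRUE)
head(out)
```

In this example only the rows with Customers > 10 determine the centres, yet every row of `mydf` receives a value in the new column.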
