# Unsupervised Learning: K-Means Clustering

Oct 16, 2021

The goal is to choose a suitable value of K and partition the data into K clusters.

The steps below use the basic Euclidean distance metric.

Let X = {x1, x2, x3, …, xn} be the set of data points and V = {v1, v2, …, vc} be the set of cluster centers, where c is the number of clusters (the K in K-Means).

1. Select ‘c’ cluster centers randomly.
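2. Calculate the distance between each data point and each cluster center using the Euclidean distance metric:

   $$d(x_i, v_j) = \lVert x_i - v_j \rVert = \sqrt{\sum_{k=1}^{m} (x_{ik} - v_{jk})^2}$$

   where $m$ is the number of features.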

3. Assign each point to the center with the smallest calculated distance (points assigned to the same center form a cluster).

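4. Recalculate each cluster center as the mean of the points assigned to it:

   $$v_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i$$

   where $C_j$ is the set of points currently assigned to center $v_j$.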

5. These means become the new cluster centers. Reassign each data point to its nearest new center.

6. If no data point was reassigned, stop; otherwise repeat steps 3 to 5.

7. Calculate WCSS (Within-Cluster Sum of Squares) for each value of c. WCSS is the sum, over all points, of the squared distance between a point and the centroid of its cluster; lower values mean tighter clusters.
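The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the choice to initialise centers from randomly picked data points, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, c, max_iter=100, seed=0):
    """Plain K-means with Euclidean distance; returns (centers, labels, wcss)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick c distinct data points as the initial centers (an assumption)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step 2: Euclidean distance from every point to every center, shape (n, c)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        # Steps 3 and 5: assign each point to its nearest center
        new_labels = dists.argmin(axis=1)
        # Step 6: stop once no point changes cluster
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each center to the mean of its assigned points
        for j in range(c):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Step 7: WCSS = sum of squared distances to the assigned center
    wcss = float(((X - centers[labels]) ** 2).sum())
    return centers, labels, wcss

# Tiny made-up dataset with two obvious groups
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
              [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]])
centers, labels, wcss = kmeans(X, c=2)
print(labels, wcss)
```

Because the initial centers are random, the cluster numbering (which group is 0 and which is 1) can differ between seeds even when the grouping itself is the same.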

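Create the model:

```python
from sklearn.cluster import KMeans

model = KMeans(n_clusters=4)  # 4 clusters, chosen here as an example
```

Train on the data and read the WCSS:

```python
# raw_data: feature matrix of shape (n_samples, n_features)
model.fit(raw_data)
wcss = model.inertia_  # scikit-learn exposes WCSS as inertia_
```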

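Select the number of clusters based on WCSS using the elbow method: plot WCSS against the number of clusters and pick the point where the curve bends, after which adding more clusters reduces WCSS only slightly.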
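As a rough sketch of the elbow method, one can fit a model for each candidate number of clusters and record its `inertia_`. The synthetic three-blob dataset standing in for `raw_data`, the candidate range of 1 to 7, and the `n_init` and `random_state` settings are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for raw_data: three well-separated 2-D blobs (illustrative only)
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
raw_data = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in blob_centers])

# Fit one model per candidate number of clusters and record its WCSS
k_values = list(range(1, 8))
wcss_values = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(raw_data)
    wcss_values.append(model.inertia_)

for k, wcss in zip(k_values, wcss_values):
    print(f"k={k}: WCSS={wcss:.1f}")
```

On data like this, WCSS should drop sharply up to the true number of groups and flatten afterwards; plotting `wcss_values` against `k_values` (for example with matplotlib) makes the bend easier to see.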