As I'm getting a lot of assignments these days, I've decided to pause app development and work on them first. They are interesting too.
I'll be writing up every algorithm I've learnt and will be learning.
So, let's start with the k-means algorithm.
K-means Algorithm -
Suppose we have been given some data and wish to find a pattern in it, such as clusters where similar data points can be grouped together.
To achieve this objective we can use the k-means algorithm.
To use this algorithm we need to know in advance how many groups/clusters there will be. Let this number be k.
The algorithm -
1. Input - a set of n points (x1, x2, ..., xn) and the number of clusters k
2. Randomly generate k points (for example, pick k of the input points) and take them as the initial centroids.
3. Repeat until convergence:
    for each point xi:
        Find the nearest centroid Cj (Euclidean distance can be used for this)
        Assign the point xi to cluster j
    for each cluster j = 1...k:
        new centroid Cj = mean of all points xi assigned to cluster j in the previous step
4. Stop when none of the cluster assignments change
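As a rough illustration of the steps above, here is a minimal pure-Python sketch. It is not the implementation mentioned in this post: the function name kmeans, the max_iterations cap, the seed parameter and the sample points are all assumptions made for the example. It starts from k randomly chosen input points and uses Euclidean distance.

```python
import math
import random


def kmeans(points, k, max_iterations=100, seed=None):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Step 2: pick k of the input points as the starting centroids.
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iterations):
        # Step 3a: assign each point to its nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop when no assignment changed since the previous pass.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Step 3b: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return centroids, assignments


# Made-up sample points: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2, seed=0)
print(labels)
```

The max_iterations cap is only a safety net; on small inputs like this the loop normally ends through the "no assignment changed" check.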
Note -
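After each iteration the centroids move to a position at least as good as before (the total squared distance from the points to their assigned centroids never increases). The final result still depends on the starting centroids, so it may not be the best possible clustering.
We need to stop updating the centroids when there is no change in the cluster arrangement, i.e. if the clusters remain the same for 2 consecutive iterations.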
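To see how the centroids improve from one iteration to the next, one option is to track the total squared distance from each point to its assigned centroid. The sketch below is only an illustration: the helper names cost and one_iteration, the sample points and the deliberately poor starting centroids are all assumptions.

```python
import math


def cost(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[j]) ** 2 for p, j in zip(points, labels))


def one_iteration(points, centroids):
    """One assignment step followed by one centroid update."""
    k = len(centroids)
    labels = [
        min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
    ]
    updated = list(centroids)
    for j in range(k):
        members = [p for p, a in zip(points, labels) if a == j]
        if members:  # an empty cluster keeps its old centroid
            updated[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return updated, labels


# Made-up sample points; both starting centroids sit in the same group on purpose.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids = [(1.0, 1.0), (1.5, 2.0)]

history = []
for _ in range(5):
    centroids, labels = one_iteration(points, centroids)
    history.append(cost(points, centroids, labels))

print(history)  # the values should never go up from one iteration to the next
```

Once the printed cost stops changing, the cluster assignments have stopped changing too, which is the stopping condition described above.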
Time complexity of this algorithm - O(#iterations * #clusters * #instances * #dimensions)
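To get a feel for this formula, the small sketch below multiplies the four factors together. The function name estimated_operations and the iteration count of 10 are assumptions for illustration; the result counts coordinate-level distance operations only, up to a constant factor.

```python
def estimated_operations(iterations, clusters, instances, dimensions):
    """Rough operation count implied by the complexity formula."""
    return iterations * clusters * instances * dimensions


# Example sizes: 150 instances, 4 dimensions, 3 clusters, and an assumed 10 iterations.
per_iteration = estimated_operations(1, 3, 150, 4)
total = estimated_operations(10, 3, 150, 4)
print(per_iteration, total)
```

Every point is compared against every centroid across every dimension, so doubling any one of the four factors doubles the estimate.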
I've implemented this in Python - See
The problem was - given data for 150 flowers (petal length, petal width, sepal length, sepal width), we need to group them into 3 clusters.
I hope this will be helpful.
In addition to this, I've rolled out an update for the Medicator app. The app now covers over 125 diseases.
You can check it out here - Medicator