Notes on codes, projects and everything

Implementing Fuzzy c-means in Clojure

I was asked to evaluate fuzzy c-means to find out whether it is a good clustering algorithm for my MPhil project. So I spent the whole afternoon reading through some tutorial to get some basic understanding. Then I thought why not implement it in Clojure because it doesn’t look too complicated (I was so wrong…).

I based my implementation on the tutorial posted here. However I still don’t understand how the terminate condition is met so I am setting the function to loop only like 10 times for now (until I understand it better). Added the objective function, loop will stop when the next result is higher than the current result.

My current code for fuzzy c-means:

(ns sensei.clustering.fcm)

(use 'clojure.set
     '[incanter core stats])

(defn fcm[data cluster-count fuzziness-index]
    (let [random-points (take
                            #(matrix (take
                               (count (first data))
                               (repeatedly rand)))))
          degree-of-membership (fn [point centroids cluster-index]
                                   (let [power #(pow % (/ 2 (- fuzziness-index 1)))]
                                     (/ 1 (apply
                                              #(power (/ (euclidean-distance point (nth centroids cluster-index))
                                                         (euclidean-distance point %)))
          fuzzy-membership (fn [centroids]
                                 (fn [point]
                                       #(degree-of-membership point centroids %)
                                       (range cluster-count)))
          cluster-membership (fn [cluster-index membership]
                                 (map #(nth % cluster-index) membership))
          new-centroid (fn [members]
                           (div (apply
                               #(mult (pow (nth members %) fuzziness-index) (nth data %))
                               (range (count members))))
                               #(pow (nth members %) fuzziness-index)
                               (range (count members))))))
          new-centroids (fn [membership]
                              #(new-centroid (cluster-membership % membership))
                              (range cluster-count)))
          objective (fn [membership centroids]
                            (fn [point-index]
                                    #(* (pow (nth (nth membership point-index) %) fuzziness-index)
                                        (pow (euclidean-distance (nth data point-index) (nth centroids %)) 2))
                                    (range cluster-count))))
                            (range (count membership)))))
          cluster (fn re-cluster
                      ([membership centroids] (re-cluster membership centroids (objective membership centroids)))
                      ([membership centroids objective-value]
                       (let [next-membership (fuzzy-membership centroids)
                             next-centroids (new-centroids next-membership)
                             next-objective (objective next-membership next-centroids)]
                         (if (>= next-objective objective-value)
                           (cons centroids membership)
                           (recur next-membership next-centroids next-objective)))))]
      (cluster (fuzzy-membership random-points) random-points)))

My attempt to plot a graph showing the relationship between each point and clusters. Only lines that shows the most degree of membership are shown in this example

(ns sensei.core)

(use 'sensei.clustering.fcm
     '[incanter core charts stats])

(let [data (take
               #(matrix (take
                          (repeatedly rand)))))
      [centroids & membership] (fcm data 5 5)
      chart (scatter-plot
              (map #(nth % 0) data)
              (map #(nth % 1) data))]
  (view (reduce
    (fn [chart point-index]
        (let [point-membership (nth membership point-index)
              max-membership (apply max point-membership)]
            (fn [chart cluster-index]
                (if (= max-membership (nth point-membership cluster-index))
                    (vector (nth (nth centroids cluster-index) 0) (nth (nth data point-index) 0))
                    (vector (nth (nth centroids cluster-index) 1) (nth (nth data point-index) 1)))
            (range (count centroids)))))
    (range (count data)))))

Example output:

fuzzy c-means

Updated 2011-06-23: Fixed some problems and added objective function in, but solution still doesn’t look right. Updated with image.
Updated 2011-07-18: Found this while looking for a better implementation.

leave your comment

name is required

email is required

have a blog?

This blog uses scripts to assist and automate comment moderation, and the author of this blog post does not hold responsibility in the content of posted comments. Please note that activities such as flaming, ungrounded accusations as well as spamming will not be entertained.


Click to change color scheme