(note (code cslai))

Implementing Fuzzy c-means in Clojure

I was asked to evaluate fuzzy c-means to find out whether it is a good clustering algorithm for my MPhil project. So I spent the whole afternoon reading through some tutorial to get some basic understanding. Then I thought why not implement it in Clojure because it doesn’t look too complicated (I was so wrong…).

I based my implementation on the tutorial posted here. However I still don’t understand how the terminate condition is met so I am setting the function to loop only like 10 times for now (until I understand it better). Added the objective function, loop will stop when the next result is higher than the current result.

My current code for fuzzy c-means:

(ns sensei.clustering.fcm)

(use 'clojure.set
     '[incanter core stats])

(defn fcm[data cluster-count fuzziness-index]
    (let [random-points (take
                          cluster-count
                          (repeatedly
                            #(matrix (take
                               (count (first data))
                               (repeatedly rand)))))
          degree-of-membership (fn [point centroids cluster-index]
                                   (let [power #(pow % (/ 2 (- fuzziness-index 1)))]
                                     (/ 1 (apply
                                            +
                                            (map
                                              #(power (/ (euclidean-distance point (nth centroids cluster-index))
                                                         (euclidean-distance point %)))
                                              centroids)))))
          fuzzy-membership (fn [centroids]
                               (map
                                 (fn [point]
                                     (map
                                       #(degree-of-membership point centroids %)
                                       (range cluster-count)))
                                 data))
          cluster-membership (fn [cluster-index membership]
                                 (map #(nth % cluster-index) membership))
          new-centroid (fn [members]
                           (div (apply
                             plus
                             (map
                               #(mult (pow (nth members %) fuzziness-index) (nth data %))
                               (range (count members))))
                           (apply
                             +
                             (map
                               #(pow (nth members %) fuzziness-index)
                               (range (count members))))))
          new-centroids (fn [membership]
                            (map
                              #(new-centroid (cluster-membership % membership))
                              (range cluster-count)))
          objective (fn [membership centroids]
                        (apply
                          +
                          (map
                            (fn [point-index]
                                (apply
                                  +
                                  (map
                                    #(* (pow (nth (nth membership point-index) %) fuzziness-index)
                                        (pow (euclidean-distance (nth data point-index) (nth centroids %)) 2))
                                    (range cluster-count))))
                            (range (count membership)))))
          cluster (fn re-cluster
                      ([membership centroids] (re-cluster membership centroids (objective membership centroids)))
                      ([membership centroids objective-value]
                       (let [next-membership (fuzzy-membership centroids)
                             next-centroids (new-centroids next-membership)
                             next-objective (objective next-membership next-centroids)]
                         (if (>= next-objective objective-value)
                           (cons centroids membership)
                           (recur next-membership next-centroids next-objective)))))]
      (cluster (fuzzy-membership random-points) random-points)))

My attempt to plot a graph showing the relationship between each point and clusters. Only lines that shows the most degree of membership are shown in this example

(ns sensei.core)

(use 'sensei.clustering.fcm
     '[incanter core charts stats])

(let [data (take
             500
             (repeatedly
               #(matrix (take
                          2
                          (repeatedly rand)))))
      [centroids & membership] (fcm data 5 5)
      chart (scatter-plot
              (map #(nth % 0) data)
              (map #(nth % 1) data))]
  (view (reduce
    (fn [chart point-index]
        (let [point-membership (nth membership point-index)
              max-membership (apply max point-membership)]
          (reduce
            (fn [chart cluster-index]
                (if (= max-membership (nth point-membership cluster-index))
                  (add-lines
                    chart
                    (vector (nth (nth centroids cluster-index) 0) (nth (nth data point-index) 0))
                    (vector (nth (nth centroids cluster-index) 1) (nth (nth data point-index) 1)))
                  chart))
            chart
            (range (count centroids)))))
    chart
    (range (count data)))))

Example output:

Updated 2011-06-23: Fixed some problems and added objective function in, but solution still doesn’t look right. Updated with image.
Updated 2011-07-18: Found this while looking for a better implementation.
Exit mobile version