982 Data Mining Homework 4 (due 2010/6/11)
1. Given two objects represented by the tuples (22,1,42,10) and (20,0,36,8):
a. Compute the Euclidean distance between the two objects.
b. Compute the Manhattan distance between the two objects.
c. Compute the Minkowski distance between the two objects, using q=3.
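Note: all three distances in Problem 1 are instances of the Minkowski distance with q=1 (Manhattan), q=2 (Euclidean), and q=3. The following is a minimal sketch for checking hand calculations; the function name `minkowski` and the variable names are illustrative choices, not part of the assignment.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length tuples."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

x = (22, 1, 42, 10)
y = (20, 0, 36, 8)

manhattan = minkowski(x, y, 1)  # q = 1: sum of absolute differences
euclidean = minkowski(x, y, 2)  # q = 2: square root of summed squares
cubic = minkowski(x, y, 3)      # q = 3: cube root of summed cubed differences

print(manhattan, euclidean, cubic)
```

The coordinate differences are (2, 1, 6, 2), so the three sums under the root are 11, 45, and 233 respectively.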
2. Two records are given in the table below. Compute their similarity under each of the following definitions.
Object 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 |
Object 2 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
a. Simple matching coefficient (SMC)
b. Jaccard coefficient
c. Rao’s coefficient
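Note: the three coefficients in Problem 2 can all be written in terms of the four match counts f11, f00, f10, and f01. The sketch below assumes that "Rao's coefficient" means the Russell-Rao definition f11 / n; if the course uses a different definition, adjust the last formula. Variable names are illustrative.

```python
obj1 = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]
obj2 = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]

# Count the four possible attribute-pair outcomes.
pairs = list(zip(obj1, obj2))
f11 = pairs.count((1, 1))  # both objects have the attribute
f00 = pairs.count((0, 0))  # neither object has it
f10 = pairs.count((1, 0))  # only Object 1 has it
f01 = pairs.count((0, 1))  # only Object 2 has it
n = len(pairs)

smc = (f11 + f00) / n              # simple matching: all agreements count
jaccard = f11 / (f11 + f10 + f01)  # 0-0 matches are ignored
rao = f11 / n                      # assumed Russell-Rao: only 1-1 matches count

print(f11, f00, f10, f01, smc, jaccard, rao)
```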
3. X1=(4,1), X2=(1,1), X3=(5,2), X4=(0,2), X5=(3,3)
a. Construct the distance matrix for these points.
b. Draw the dendrogram for these five points using single-link distance.
c. Draw the dendrogram for these five points using complete-link distance.
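Note: the problem does not name a distance measure, so the sketch below assumes Euclidean distance. It builds the distance matrix and then runs a plain agglomerative procedure, recording which clusters merge and at what height; those heights are what a dendrogram plots. The helper names `dist` and `agglomerate` are illustrative.

```python
import math
from itertools import combinations

points = {"X1": (4, 1), "X2": (1, 1), "X3": (5, 2), "X4": (0, 2), "X5": (3, 3)}

def dist(a, b):
    """Euclidean distance between two named points (assumed metric)."""
    return math.hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])

# Distance matrix as a nested dict: matrix["X1"]["X3"] is d(X1, X3).
matrix = {a: {b: dist(a, b) for b in points} for a in points}

def agglomerate(linkage):
    """Merge the closest pair of clusters until one remains.

    linkage is min for single link, max for complete link.
    Returns a list of (merged members, merge height) tuples.
    """
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        def gap(pair):
            return linkage(dist(a, b) for a in pair[0] for b in pair[1])
        left, right = min(combinations(clusters, 2), key=gap)
        merges.append((sorted(left | right), round(gap((left, right)), 3)))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left | right)
    return merges

single = agglomerate(min)
complete = agglomerate(max)
for label, merges in (("single", single), ("complete", complete)):
    print(label, merges)
```

Under this assumption d(X1, X3) and d(X2, X4) tie for the smallest distance, so the order of the first two merges depends on tie-breaking; the resulting dendrogram is the same either way.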
4. Let K=2 and let the initial partition be {X1, X2, X3} and {X4, X5}. Cluster these five points using the K-means method. (Show the intermediate steps and compute the SSE.)
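Note: the sketch below is one way to check the intermediate steps of Problem 4. It assumes the batch form of K-means (recompute both centroids, then reassign every point at once) with squared Euclidean distance; a variant that updates centroids after each single reassignment may show different intermediate partitions. All names are illustrative.

```python
points = {"X1": (4, 1), "X2": (1, 1), "X3": (5, 2), "X4": (0, 2), "X5": (3, 3)}

def centroid(names):
    """Mean of the named points (assumes the cluster is non-empty)."""
    xs = [points[n][0] for n in names]
    ys = [points[n][1] for n in names]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def sq_dist(p, c):
    """Squared Euclidean distance between point p and centre c."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

clusters = [["X1", "X2", "X3"], ["X4", "X5"]]  # initial partition from the problem
history = []  # partition produced by each reassignment pass
while True:
    centers = [centroid(c) for c in clusters]
    new_clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        new_clusters[nearest].append(name)
    history.append(new_clusters)
    if new_clusters == clusters:  # no point changed cluster: converged
        break
    clusters = new_clusters

# SSE: sum of squared distances from each point to its own cluster centroid.
sse = sum(sq_dist(points[n], centers[k])
          for k, members in enumerate(clusters) for n in members)
print(history, centers, sse)
```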
5. Let δ=2 and perform incremental clustering, processing the points in the order X1, X2, …, X5. (Show the intermediate steps.)
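Note: incremental clustering has several variants, and the problem does not say which one is intended. The sketch below assumes a common one: each new point joins the cluster with the nearest centroid if that distance is less than δ, otherwise it starts a new cluster, and the centroid is updated after every assignment. Euclidean distance is also an assumption, and all names are illustrative.

```python
import math

points = [("X1", (4, 1)), ("X2", (1, 1)), ("X3", (5, 2)),
          ("X4", (0, 2)), ("X5", (3, 3))]
delta = 2  # distance threshold from the problem statement

def dist(p, c):
    """Euclidean distance between point p and centroid c (assumed metric)."""
    return math.hypot(p[0] - c[0], p[1] - c[1])

clusters = []  # each cluster: {"members": [names], "centroid": (x, y)}
for name, p in points:
    # Find the existing cluster whose centroid is closest to the new point.
    best = min(clusters, key=lambda c: dist(p, c["centroid"]), default=None)
    if best is not None and dist(p, best["centroid"]) < delta:
        best["members"].append(name)
        n = len(best["members"])
        cx, cy = best["centroid"]
        # Running-mean update of the centroid with the new member.
        best["centroid"] = (cx + (p[0] - cx) / n, cy + (p[1] - cy) / n)
    else:
        clusters.append({"members": [name], "centroid": p})
    print(name, [(c["members"], c["centroid"]) for c in clusters])
```

None of the distances in this run equals δ exactly, so using ≤ instead of < as the threshold test would not change the result here.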