Data Mining Assignment 1 Solution


Problem 1

(k-means, 40pts) Generate 2 sets of 2-D Gaussian random data, each containing 500 samples, using the parameters below.

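$$\mu_1 = [1, 0], \quad \mu_2 = [0, 1.5], \quad \Sigma_1 = \begin{bmatrix} 0.9 & 0.4 \\ 0.4 & 0.9 \end{bmatrix}, \quad \Sigma_2 = \begin{bmatrix} 0.9 & 0.4 \\ 0.4 & 0.9 \end{bmatrix} \tag{1}$$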
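A minimal sketch of the data generation step, in Python with only the standard library. It assumes the parameters read as means (1, 0) and (0, 1.5) with a shared covariance that has 0.9 on the diagonal and 0.4 off the diagonal. The helper names `cholesky_2x2` and `sample_gaussian_2d` and the fixed seed are illustrative choices, not part of the assignment. Each sample is obtained by multiplying a pair of standard normals by the Cholesky factor of the covariance and adding the mean.

```python
import math
import random


def cholesky_2x2(cov):
    """Return (l11, l21, l22), the lower-triangular Cholesky factor of a 2x2 SPD matrix."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def sample_gaussian_2d(mean, cov, n, rng):
    """Draw n samples from N(mean, cov) by transforming pairs of standard normals."""
    l11, l21, l22 = cholesky_2x2(cov)
    samples = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        samples.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return samples


rng = random.Random(0)  # fixed seed for repeatability (an assumption, not required)
cov = [[0.9, 0.4], [0.4, 0.9]]  # assumed reading of the covariance matrices
X1 = sample_gaussian_2d([1.0, 0.0], cov, 500, rng)
X2 = sample_gaussian_2d([0.0, 1.5], cov, 500, rng)
X = X1 + X2  # 1000 points in total, first 500 from set 1
```

The two sets overlap heavily because the means are close relative to the spread, which is worth keeping in mind when reading the clustering results.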

  1. (20pts) Write a function cluster = mykmeans(X, k, c) that clusters data $X \in \mathbb{R}^{n \times p}$ (n number of objects and p number of attributes) into k clusters. Here c holds the initial centers; this is usually not necessary, but we will need it to test your function. Terminate the iteration when the $\ell_2$-norm between a previous center and an updated center is $\le 0.001$ or the number of iterations reaches 10000.

  2. (10pts) Apply your code to the data generated above with k = 2 and initial centers $c_1 = (10, 10)$ and $c_2 = (-10, -10)$. In your report, report the centers found for each cluster. How many iterations did it take? Show a scatter plot of the data and the centers of clusters found.

  3. (10pts) Apply your code to the data generated above with k = 4 and initial centers $c_1 = (10, 10)$, $c_2 = (-10, -10)$, $c_3 = (10, -10)$ and $c_4 = (-10, 10)$. In your report, report the centers found for each cluster. How many iterations did it take? Show a scatter plot of the data and the centers of clusters found.
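The clustering function described in this problem can be sketched as Lloyd's algorithm. The version below is a Python sketch rather than a reference solution, and it makes several assumptions: the stopping test is read as "largest center shift at most 0.001", a cluster that loses all its points keeps its previous center, and the function returns the centers and iteration count in addition to `cluster` so they can be reported. The toy data at the end is made up for illustration.

```python
import math


def mykmeans(X, k, c, tol=1e-3, max_iter=10000):
    """Lloyd's algorithm on a list of points X with k initial centers c.

    Returns (cluster, centers, n_iter); cluster[i] is the index of the center
    assigned to X[i]. The extra return values are an addition for reporting.
    """
    centers = [tuple(ci) for ci in c]
    cluster = [0] * len(X)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center (Euclidean distance).
        cluster = [min(range(k), key=lambda j: math.dist(x, centers[j])) for x in X]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, label in zip(X, cluster) if label == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # assumption: empty cluster keeps its center
        # Stop once no center moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return cluster, centers, n_iter


# Toy example with two obvious groups (hypothetical data).
toy = [(-2.0, -2.0), (-2.0, -1.0), (3.0, 3.0), (3.0, 4.0)]
labels, found, iters = mykmeans(toy, 2, [(10.0, 10.0), (-10.0, -10.0)])
```

With initial centers far from the data, as in the k = 4 case, some centers may never attract a point, so how empty clusters are handled affects the reported result.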

Problem 2

(Non-parametric density estimation, 60pts)

  1. (30pts) Write a function [p, x] = mykde(X,h) that performs kernel density estimation on X with bandwidth h. It should return the estimated density p(x) and the domain x over which p(x) was estimated, for X in both 1-D and 2-D.

CSE4334/5334 Data Mining Assignment 1

  2. (10pts) Generate N = 1000 Gaussian random data with $\mu_1 = 5$ and $\sigma_1 = 1$. Test your function mykde on this data with $h = \{0.1, 1, 5, 10\}$. In your report, report the histogram of X along with the figures of estimated densities.

  3. (10pts) Generate N = 1000 Gaussian random data with $\mu_1 = 5$ and $\sigma_1 = 1$ and another Gaussian random data with $\mu_2 = 0$ and $\sigma_2 = 0.2$. Test your function mykde on this data with $h = \{0.1, 1, 5, 10\}$. In your report, report the histogram of X along with the figures of estimated densities.

  4. (10pts) Generate 2 sets of 2-D Gaussian random data with $N_1 = 500$ and $N_2 = 500$ using the following parameters:

$$\mu_1 = [1, 0], \quad \mu_2 = [0, 1.5], \quad \Sigma_1 = \begin{bmatrix} 0.9 & 0.4 \\ 0.4 & 0.9 \end{bmatrix}, \quad \Sigma_2 = \begin{bmatrix} 0.9 & 0.4 \\ 0.4 & 0.9 \end{bmatrix}. \tag{2}$$

Test your function mykde on this data with $h = \{0.1, 1, 5, 10\}$. In your report, report figures of estimated densities.
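One way to approach the density estimator in this problem is sketched below in Python. The assignment does not name a kernel, so the isotropic Gaussian kernel here is an assumption, as are the evaluation grid (the data range padded by three bandwidths) and the `grid_size` parameter. Points are passed as tuples, with 1-D data written as one-element tuples, so the same code covers both cases. The estimate is p(x) = 1/(n h^d) * sum_i K((x - x_i)/h).

```python
import math


def gaussian_kernel(u):
    """Isotropic standard normal density at the d-dimensional point u."""
    d = len(u)
    return math.exp(-0.5 * sum(v * v for v in u)) / (2.0 * math.pi) ** (d / 2.0)


def mykde(X, h, grid_size=50):
    """Gaussian kernel density estimate for 1-D or 2-D data.

    X is a list of tuples of length d (d = 1 or 2). Returns (p, x), where x is
    the list of grid points and p[i] is the estimated density at x[i].
    """
    n, d = len(X), len(X[0])
    # One axis per dimension, spanning the data padded by 3 bandwidths (assumption).
    axes = []
    for j in range(d):
        lo = min(pt[j] for pt in X) - 3.0 * h
        hi = max(pt[j] for pt in X) + 3.0 * h
        step = (hi - lo) / (grid_size - 1)
        axes.append([lo + i * step for i in range(grid_size)])
    if d == 1:
        x = [(a,) for a in axes[0]]
    else:
        x = [(a, b) for a in axes[0] for b in axes[1]]
    # Average the scaled kernels centered on every data point.
    norm = 1.0 / (n * h ** d)
    p = [
        norm * sum(gaussian_kernel([(g - xi) / h for g, xi in zip(gpt, pt)]) for pt in X)
        for gpt in x
    ]
    return p, x


# Tiny 1-D usage example on made-up points.
p1, x1 = mykde([(0.0,), (1.0,), (2.0,)], h=0.5)
```

A small h produces a spiky estimate that follows individual samples, while a large h smooths the estimate toward a single broad bump, which is the effect the bandwidth sweep is meant to show.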

Instructor: W. H. Kim (won.kim@uta.edu), TA: Xin Ma (xin.ma@mavs.uta.edu)