Assignment 0 Solution


Part 1)

You are first required to check out the Assignment0 starter code from your SVN repository.

  1. Your SVN repository location is of the following form: https://142.1.44.22/svn/csc207h/UTORID where UTORID is your UTORID.

  2. You are free to use any IDE that you wish.

  3. If you plan on using Eclipse, there is a short tutorial on Blackboard under the Assignments folder that may be useful in explaining how to check out the project using Eclipse and Subclipse (an SVN client that integrates with Eclipse as an Eclipse plugin). The link to the tutorial is reproduced here as well: http://goo.gl/tL4uYC

  4. The last revision present in your SVN repository at the due date will be used for grading.

  5. Once you have checked out your project, you will notice that there are two Java files.

    1. Cfiltering.java

    2. CfilteringDriver.java

You will use both of these files to write your code.

Your job is to find a similarity score between pairs of users that tells how similar they are, i.e. are user1 and user2 more similar based on their ratings of movies, or are user3 and user1 more similar?

You can safely assume:

  1. Number of users will be in the range: 3 ≤ # of users ≤ 9

  2. Number of movies will be in the range: 3 ≤ # of movies ≤ 9

  3. Rating of movies will be in the range: 1 ≤ rating ≤ 5

Euclidean Distance Score (How to calculate similarity score?):

The basic idea behind Euclidean distance score is to calculate distance between any two points. For example, the two points are as follows:

  1. Point1 (x,y)= (1,1)

  2. Point2 (x,y)=(3,1)

What is the distance between the above two points?

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = \sqrt{(3 - 1)^2 + (1 - 1)^2} = 2$$

***Before you read any further, make sure you understand how the above distance formula works in the 2 dimensions x and y***

Using the above idea, the movies that have been rated in common between users serve as axes or dimensions. The actual users are now data points on the movie graph. One example is provided here:

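[Figure: five users plotted as blue dots, with the Toy Story rating on the x-axis and the Happy Feet rating on the y-axis.]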

In the figure, there are 5 blue dots (each blue dot represents a single user) plotted against two movies, i.e. Happy Feet and Toy Story. The 5 blue dots or 5 users are as follows:

  1. (x,y) i.e. (ToyStory=2, HappyFeet=2) is user1

  2. (x,y) i.e. (ToyStory=3, HappyFeet=2) is user2

  3. (x,y) i.e. (ToyStory=4, HappyFeet=3) is user3

  4. (x,y) i.e. (ToyStory=4, HappyFeet=5) is user4

  5. (x,y) i.e. (ToyStory=5, HappyFeet=5) is user5

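To calculate the Euclidean distance between pairs of users in the above figure, we use the Euclidean distance formula, i.e.

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

where $(x_1, y_1)$ and $(x_2, y_2)$ are the two users' ratings of Toy Story and Happy Feet.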

This formula calculates the distance. For users that are very similar (very close on the chart) the distance will be smaller than for users that are very dissimilar (very far apart on the chart).

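In order to get higher values (similarity scores) for users who are similar, we take the inverse of the distance and add 1 to the denominator to avoid division by 0. Hence the final similarity scores between these pairs of users, (user5 and user1) and (user5 and user4), are:

$$\text{similarity}(\text{user5}, \text{user1}) = \frac{1}{1 + \sqrt{(5 - 2)^2 + (5 - 2)^2}} = \frac{1}{1 + \sqrt{18}} \approx 0.1907$$

$$\text{similarity}(\text{user5}, \text{user4}) = \frac{1}{1 + \sqrt{(5 - 4)^2 + (5 - 5)^2}} = \frac{1}{1 + 1} = 0.5$$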

The similarity score is a value between 0 and 1. A value close to 0 indicates that the pair of users are very different (dissimilar), and a value of 1 indicates that the pair of users have exactly identical taste. Based on the similarity scores calculated above, we can say that (user5 and user4) are more similar than (user5 and user1).
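The two-movie calculation above can be sketched in Java as follows. This is only an illustrative sketch: the class name `SimilarityScoreSketch` and the method names `distance` and `similarity` are hypothetical and are not taken from the starter code.

```java
import java.util.Locale;

final class SimilarityScoreSketch {

    // Euclidean distance between two users, each given as {ToyStory, HappyFeet}.
    static double distance(int[] a, int[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Similarity score: inverse of the distance, with 1 added to the
    // denominator so that identical ratings do not cause division by zero.
    static double similarity(int[] a, int[] b) {
        return 1.0 / (1.0 + distance(a, b));
    }

    public static void main(String[] args) {
        int[] user1 = {2, 2};
        int[] user4 = {4, 5};
        int[] user5 = {5, 5};
        // Locale.US keeps the decimal separator a dot on every machine.
        System.out.printf(Locale.US, "user5 and user1: %.4f%n", similarity(user5, user1));
        System.out.printf(Locale.US, "user5 and user4: %.4f%n", similarity(user5, user4));
    }
}
```

Running `main` prints 0.1907 for (user5 and user1) and 0.5000 for (user5 and user4), so the closer pair gets the higher score.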

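In the above figure there are two dimensions (Toy Story and Happy Feet). For multiple dimensions (more than two movies), the Euclidean distance formula is:

$$\text{distance}(\text{user1}, \text{user2}) = \sqrt{\sum_{i=1}^{n} (r_{1,i} - r_{2,i})^2}$$

where $n$ is the number of movies and $r_{u,i}$ is the rating that user $u$ gave to movie $i$.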
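The multi-dimensional formula extends naturally to a full matrix of scores. The sketch below assumes that every user has rated every movie and that ratings are held in an `int[][]` with one row per user. The class name `UserUserMatrixSketch`, the method names `distance` and `buildUserUserMatrix`, and the sample ratings are all hypothetical and made up for illustration; they do not come from the starter code or the provided input files.

```java
import java.util.Arrays;

final class UserUserMatrixSketch {

    // Euclidean distance over all movies; a[i] and b[i] are the two users'
    // ratings of movie i (assumes both users rated every movie).
    static double distance(int[] a, int[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Fills a symmetric users-by-users matrix with scores 1 / (1 + distance).
    static double[][] buildUserUserMatrix(int[][] ratings) {
        int n = ratings.length;
        double[][] matrix = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double score = 1.0 / (1.0 + distance(ratings[i], ratings[j]));
                matrix[i][j] = score;
                matrix[j][i] = score; // the score does not depend on pair order
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Made-up ratings for illustration: one row per user, one column per movie.
        int[][] ratings = {
            {1, 2, 3},
            {2, 2, 3},
            {3, 4, 4},
        };
        System.out.println(Arrays.deepToString(buildUserUserMatrix(ratings)));
    }
}
```

With these made-up ratings, the first and second users are at distance 1 (score 0.5) and the first and third users are at distance 3 (score 0.25). Every diagonal entry is 1 because each user is at distance 0 from themselves.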

  1. The userUserMatrix that your program calculates (where columns and rows are User1, User2, User3 and User4) is the following:

[Figure: the 4 × 4 userUserMatrix with rows and columns User1 to User4.]

Look at the Note section, on how to print the userUserMatrix.

  2. Most similar pair of users (both users in the pair MUST be different. If more than one pair of different users has the same score, you must list all of them, but do not repeat them; for example, the pair (User1 and User3) is identical to (User3 and User1)):

From the userUserMatrix, the most similar pair of users are:

User1 and User3.

  3. Most dissimilar pair of users (both users MUST be different. If more than one pair of different users has the same score, you must list all of them, but do not repeat them; for example, the pair (User3 and User4) is identical to (User4 and User3)):

From the userUserMatrix, the most dissimilar pair of users are:

User3 and User4.
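The tie and no-repeat rules above can be sketched as follows: scan only the pairs with the first user's index lower than the second's, so (User1 and User3) is never repeated as (User3 and User1), and keep every pair that matches the extreme score. The class name `ExtremePairsSketch`, the method `extremePairs`, and the tie tolerance `EPS` are assumptions made for illustration, not part of the starter code. The demo matrix is the all-ones matrix shown in the sample output for input2.txt.

```java
import java.util.ArrayList;
import java.util.List;

final class ExtremePairsSketch {

    // Assumed tolerance for treating two floating-point scores as tied.
    private static final double EPS = 1e-9;

    // Returns every pair of different users (i < j) whose score equals the
    // highest score (mostSimilar == true) or the lowest score (false).
    static List<String> extremePairs(double[][] m, boolean mostSimilar) {
        double best = mostSimilar ? Double.NEGATIVE_INFINITY : Double.POSITIVE_INFINITY;
        for (int i = 0; i < m.length; i++) {
            for (int j = i + 1; j < m.length; j++) {
                best = mostSimilar ? Math.max(best, m[i][j]) : Math.min(best, m[i][j]);
            }
        }
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = i + 1; j < m.length; j++) {
                if (Math.abs(m[i][j] - best) < EPS) {
                    pairs.add("User" + (i + 1) + " and User" + (j + 1));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[][] allOnes = {
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
        };
        System.out.println(String.join(",\n", extremePairs(allOnes, true)));
    }
}
```

For the all-ones matrix every pair ties, so both the most similar and the most dissimilar lists contain all six pairs, from User1 and User2 through User3 and User4.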

Note:

--Precision in the userUserMatrix must be rounded off to four decimal places.

--The output from your program MUST contain three entities (SEE THE SAMPLE OUTPUT PROVIDED ON THE LAST PAGE).

  1. userUserMatrix

  2. most similar pair(s) of users

  3. most dissimilar pair(s) of users.

--Print/output your userUserMatrix in a readable form, so it becomes easy for the TA to mark it.

--The template code has lots of comments for you to get started, but your program must have at least the following functions (you are free to define any signature of the function, i.e. take in any number of inputs of any type and return a value of any type, BUT DO NOT CHANGE THE NAME OF THE FUNCTION):

  1. calculateSimilarityScoreForAllPairsOfUser -> this function will calculate the similarity score for all pairs of users

  2. printUserUserMatrix -> this function will print the userUserMatrix to console

  3. findAndprintMostSimilarPairOfUsers -> this function will find and then print the most similar pair of users

  4. findAndprintMostDissimilarPairOfUsers -> this function will find and then print the most dissimilar pair of users

--You MUST create a test folder in your SVN repository, inside the location given below (i.e. the test folder must be created inside the 207Assignment0 folder). The name of the folder is case sensitive.

https://142.1.44.22/svn/csc207h/UTORID/207Assignment0/

The test folder will contain FOUR test input files that you used to test your code against and that are different from the two provided.

--At the time of marking your assignment, we will not change the input file in any form. All your test input files MUST obey the format of the provided input1.txt and input2.txt.

--Have sufficient comments in your code. You do not need to generate Javadoc for this assignment.

--Have sufficient revisions (at least 6 revisions towards assignment0) with comments in your repository. Do not have all 6 revisions on the day the assignment is due.

Here is the output that your program should display when executed with the provided input2.txt:

Enter the name of input file?

input2.txt

userUserMatrix is:

[1.0000, 1.0000, 1.0000, 1.0000]

[1.0000, 1.0000, 1.0000, 1.0000]

[1.0000, 1.0000, 1.0000, 1.0000]

[1.0000, 1.0000, 1.0000, 1.0000]

The most similar pairs of users from above userUserMatrix are:

User1 and User2,

User1 and User3,

User1 and User4,

User2 and User3,

User2 and User4,

User3 and User4

with similarity score of 1.0000

The most dissimilar pairs of users from above userUserMatrix are:

User1 and User2,

User1 and User3,

User1 and User4,

User2 and User3,

User2 and User4,

User3 and User4

with similarity score of 1.0000

Note: Two line breaks separate each section in the above output.
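The row layout and four-decimal precision in the sample output above could be produced along these lines. This is a sketch only: the class name `MatrixPrintSketch` and the helper `formatRow` are hypothetical, and the use of `Locale.US` is an assumption made so that the decimal separator is always a dot.

```java
import java.util.Locale;

final class MatrixPrintSketch {

    // Formats one matrix row as "[a, b, c]" with each value
    // rounded to four decimal places.
    static String formatRow(double[] row) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format(Locale.US, "%.4f", row[i]));
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        double[][] matrix = {
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
            {1.0, 1.0, 1.0, 1.0},
        };
        System.out.println("userUserMatrix is:");
        for (double[] row : matrix) {
            System.out.println(formatRow(row));
        }
    }
}
```

For the all-ones matrix from input2.txt, each row prints as `[1.0000, 1.0000, 1.0000, 1.0000]`.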
