ASSIGNMENT 1 SOLUTION

Q1. (40 marks)

Consider the following base cuboid Sales with four tuples and the aggregate function SUM:

| Location  | Time | Item     | Quantity |
|-----------|------|----------|----------|
| Sydney    | 2005 | PS2      | 1400     |
| Sydney    | 2006 | PS2      | 1500     |
| Sydney    | 2006 | Wii      | 500      |
| Melbourne | 2005 | XBox 360 | 1700     |

Location, Time, and Item are dimensions and Quantity is the measure. Suppose the system has built-in support for the value ALL.

  1. List the tuples in the complete data cube of R in a tabular form with 4 attributes, i.e., Location, Time, Item, SUM(Quantity).

  2. Write down an equivalent SQL statement that computes the same result (i.e., the cube). You can only use standard SQL constructs, i.e., no CUBE BY clause.

  3. Consider the following iceberg cube query:

     SELECT Location, Time, Item, SUM(Quantity)
     FROM Sales
     CUBE BY Location, Time, Item
     HAVING COUNT(*) > 1

     Draw the result of the query in a tabular form.

  4. Assume that we adopt a MOLAP architecture to store the full data cube of R, with the following mapping functions:

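\[
f_{\text{Location}}(x) =
\begin{cases}
1 & \text{if } x = \text{'Sydney'}; \\
2 & \text{if } x = \text{'Melbourne'}; \\
0 & \text{if } x = \text{ALL}.
\end{cases}
\]

\[
f_{\text{Time}}(x) =
\begin{cases}
1 & \text{if } x = 2005; \\
2 & \text{if } x = 2006; \\
0 & \text{if } x = \text{ALL}.
\end{cases}
\]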

DUE ON 23:59 14 APR, 2019 (SUN)

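\[
f_{\text{Item}}(x) =
\begin{cases}
1 & \text{if } x = \text{'PS2'}; \\
2 & \text{if } x = \text{'XBox 360'}; \\
3 & \text{if } x = \text{'Wii'}; \\
0 & \text{if } x = \text{ALL}.
\end{cases}
\]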

Draw the MOLAP cube (i.e., sparse multi-dimensional array) in a tabular form of (ArrayIndex, Value). You also need to write down the function you chose to map a multi-dimensional point to a one-dimensional point.
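The mechanics behind parts 1, 3, and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model answer: the names `full_cube` and `array_index` are hypothetical, and the row-major offset used in `array_index` is just one of many valid ways to map a multi-dimensional point to a one-dimensional index.

```python
from itertools import product

ALL = "ALL"

# Base cuboid from the question: (Location, Time, Item, Quantity).
SALES = [
    ("Sydney", 2005, "PS2", 1400),
    ("Sydney", 2006, "PS2", 1500),
    ("Sydney", 2006, "Wii", 500),
    ("Melbourne", 2005, "XBox 360", 1700),
]


def full_cube(rows):
    """Return {(location, time, item): (sum, count)} over all 2^3 group-bys."""
    cube = {}
    for *dims, qty in rows:
        # Each base tuple contributes to every cell obtained by replacing
        # any subset of its dimension values with ALL.
        for mask in product((False, True), repeat=len(dims)):
            cell = tuple(ALL if m else d for d, m in zip(dims, mask))
            total, count = cube.get(cell, (0, 0))
            cube[cell] = (total + qty, count + 1)
    return cube


# Mapping functions as given in the question.
F_LOCATION = {ALL: 0, "Sydney": 1, "Melbourne": 2}
F_TIME = {ALL: 0, 2005: 1, 2006: 2}
F_ITEM = {ALL: 0, "PS2": 1, "XBox 360": 2, "Wii": 3}


def array_index(cell):
    """Row-major offset into a 3 x 3 x 4 array (an assumed choice of mapping)."""
    loc, time, item = cell
    return (F_LOCATION[loc] * len(F_TIME) + F_TIME[time]) * len(F_ITEM) + F_ITEM[item]


cube = full_cube(SALES)

# Iceberg condition HAVING COUNT(*) > 1: keep cells backed by more than one tuple.
iceberg = {cell: total for cell, (total, count) in cube.items() if count > 1}

# Sparse MOLAP representation: only non-empty cells, as (ArrayIndex, Value).
molap = sorted((array_index(cell), total) for cell, (total, _) in cube.items())
```

Printing `cube`, `iceberg`, and `molap` gives tables that can be checked against a hand-derived answer. Any other injective mapping from the three function values to an array offset would serve equally well, provided it is written down.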

Q2. (30 marks)

Consider binary classification where the class attribute y takes two values: 0 or 1. Let the feature vector for a test instance be a d-dimensional column vector x. A linear classifier with the model parameter w (which is a d-dimensional column vector) is the following function:

\[
y =
\begin{cases}
1 & \text{if } w^\top x > 0, \\
0 & \text{otherwise.}
\end{cases}
\]

We make an additional simplifying assumption: x is a binary vector (i.e., each dimension of x takes only two values: 0 or 1).

Prove that if the feature vectors are d-dimensional, then a Naïve Bayes classifier is a linear classifier in a (d + 1)-dimensional space. You need to explicitly write out the vector w that the Naïve Bayes classifier learns.

It is obvious that the Logistic Regression classifier learned on the same training dataset as the Naïve Bayes classifier is also a linear classifier in the same (d + 1)-dimensional space. Let the parameters w learned by the two classifiers be w_LR and w_NB, respectively. Briefly explain why learning w_NB is much easier than learning w_LR.

Hint 1. \( \log \prod_i x_i = \sum_i \log x_i \)
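As a numerical sanity check on the claim (not a proof), the sketch below fits a Bernoulli Naïve Bayes model on a tiny made-up dataset and confirms that its log-odds equal a dot product with a (d + 1)-dimensional weight vector. The toy data, the add-one smoothing, and the function names `train_nb`, `nb_log_odds`, and `nb_as_linear` are all assumptions introduced for illustration.

```python
import math
from itertools import product


def train_nb(X, y, alpha=1.0):
    """Bernoulli Naive Bayes by counting; add-alpha smoothing is an assumption."""
    d = len(X[0])
    prior, theta = {}, {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior[c] = len(rows) / len(X)
        # theta[c][j] estimates P(x_j = 1 | y = c).
        theta[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(d)
        ]
    return prior, theta


def nb_log_odds(x, prior, theta):
    """log P(y=1, x) - log P(y=0, x), computed feature by feature."""
    score = math.log(prior[1]) - math.log(prior[0])
    for j, xj in enumerate(x):
        p1 = theta[1][j] if xj else 1 - theta[1][j]
        p0 = theta[0][j] if xj else 1 - theta[0][j]
        score += math.log(p1) - math.log(p0)
    return score


def nb_as_linear(prior, theta):
    """Weights w of length d+1 such that w . [1, x] equals the NB log-odds."""
    bias = math.log(prior[1] / prior[0])
    weights = []
    for t1, t0 in zip(theta[1], theta[0]):
        off = math.log((1 - t1) / (1 - t0))  # contribution when x_j = 0
        bias += off
        weights.append(math.log(t1 / t0) - off)
    return [bias] + weights


# Hypothetical toy training set with d = 3 binary features.
X = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0)]
y = [1, 1, 0, 0, 1, 0]

prior, theta = train_nb(X, y)
w = nb_as_linear(prior, theta)

# Largest gap between the linear form and the NB log-odds over all 2^3 inputs.
max_gap = max(
    abs(sum(wi * xi for wi, xi in zip(w, (1,) + x)) - nb_log_odds(x, prior, theta))
    for x in product((0, 1), repeat=3)
)
```

Note that `train_nb` needs only one pass of counting over the training data, whereas Logistic Regression has no closed-form solution and is typically fitted with an iterative optimizer.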

Q3. (30 marks)

We have a sample that is a mixture of two chemical compounds, S1 and S2. The (unknown) percentages of each chemical in the sample are denoted q1 and q2, respectively, where q1 + q2 = 1.

We have a device that can detect the percentages of m = 3 different components that are contained in both chemical compounds, albeit with different percentages. We denote the components as \(\{O_j\}_{j=1}^{m}\). The percentage of each component in pure \(S_i\) is listed in the following table:

| \(p_{i,j}\) | O1  | O2  | O3  |
|-------------|-----|-----|-----|
| S1          | 0.1 | 0.2 | 0.7 |
| S2          | 0.4 | 0.5 | 0.1 |

After measuring the three components, we obtain their percentages as \(\{u_j\}_{j=1}^{m}\).

COMP9318 (19T1) ASSIGNMENT 1

  1. Write out the log-likelihood function (as a function of \(q_i\), \(p_{i,j}\), and \(u_j\)).

  2. If u1 = 0.3, u2 = 0.2, u3 = 0.5, what are the MLEs of q1 and q2? What is the expected percentage of each component under a model with the MLE parameters?
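One way to check a hand-derived answer numerically is sketched below. It assumes a multinomial-style model in which the expected percentage of component O_j is q1 * p_{1,j} + q2 * p_{2,j} and the log-likelihood is, up to a constant factor, the sum over j of u_j times the log of that expectation. That form is an assumption made here for illustration and is not given in the question; the function names are hypothetical.

```python
import math

# Component percentages p[i][j] from the table (rows: S1, S2; columns: O1..O3).
P = [
    [0.1, 0.2, 0.7],
    [0.4, 0.5, 0.1],
]
U = [0.3, 0.2, 0.5]  # measured percentages u_1..u_3 from part 2


def expected(q1):
    """Expected percentage of each component for the mixture (q1, 1 - q1)."""
    q = (q1, 1.0 - q1)
    return [sum(q[i] * P[i][j] for i in range(2)) for j in range(3)]


def log_likelihood(q1):
    """Assumed form: sum_j u_j * log(sum_i q_i * p_{i,j})."""
    return sum(u * math.log(e) for u, e in zip(U, expected(q1)))


def mle_q1(lo=0.0, hi=1.0, iters=100):
    """Ternary search; valid because the assumed log-likelihood is concave in q1."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if log_likelihood(a) < log_likelihood(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


q1_hat = mle_q1()
q2_hat = 1.0 - q1_hat
fitted = expected(q1_hat)
```

A numerical search is used here only to keep the sketch short; setting the derivative of the log-likelihood with respect to q1 to zero and solving by hand should agree with `q1_hat`.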

Submission

Please write down your answers in a file named ass1.pdf. You must write down your name and student ID on the first page.

You can submit your file by

give cs9318 ass1 ass1.pdf

Late Penalty. -10% per day for the first two days, and -20% for each of the following days.