Midterm Solution


Introduction

With the data given and looking at Figure 1, we can clearly see that the Y values come from a mixture distribution where the mean of the smaller values appears to be around $\mu_1 = 110$ and the mean of the larger values is close to $\mu_2 = 150$. Therefore, when we come up with our prior we will use these values as the prior means. Also, looking at the kernel density line, we can clearly tell that the variance of the smaller distribution is greater than that of the larger distribution, so we will use values of $\sigma_1^2 = 100$ and $\sigma_2^2 = 64$. Judging by how similar the peaks of the two distributions are, I believe that an equal number of observations come from each distribution, and my priors will reflect this.

Figure 1: Histogram of Data Given
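Figure 1 could be reproduced along these lines; this is a minimal sketch assuming the observations are already loaded into a numeric vector `y` (the name is illustrative):

```r
# Histogram of the data with a kernel density overlay, cf. Figure 1;
# assumes the observations are in a numeric vector y.
hist(y, freq = FALSE, breaks = 30, main = "Histogram of Data Given", xlab = "y")
lines(density(y), lwd = 2)
```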

Priors

$$p \sim \text{beta}(a, b)$$

$$X_i \sim \text{Bernoulli}(p)$$

$$Y_i \sim N(\mu_1, \sigma_1^2) \quad \text{if } X_i = 1$$

$$Y_i \sim N(\mu_2, \sigma_2^2) \quad \text{if } X_i = 0$$

$$\mu_j \sim N(\mu_0, \tau_0^2), \quad j = 1, 2$$

$$1/\sigma_j^2 \sim \text{Gamma}(\nu_0/2,\; \nu_0\sigma_0^2/2)$$
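As a quick check of this specification, one could simulate a data set from the model; a sketch, with the function name `sim_mixture` illustrative and the hyperparameters passed in as arguments:

```r
# Simulate one data set of size n from the mixture model above.
sim_mixture <- function(n, a, b, mu0, tau0_2, nu0, sigma0_2) {
  p      <- rbeta(1, a, b)                          # mixing weight
  x      <- rbinom(n, 1, p)                         # component indicators
  mu     <- rnorm(2, mu0, sqrt(tau0_2))             # mu_j ~ N(mu0, tau0^2)
  sigma2 <- 1 / rgamma(2, nu0 / 2, nu0 * sigma0_2 / 2)  # 1/sigma_j^2 ~ Gamma
  ifelse(x == 1, rnorm(n, mu[1], sqrt(sigma2[1])),
                 rnorm(n, mu[2], sqrt(sigma2[2])))
}
```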

The joint distribution for all of the variables is

$$p(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p, \tilde{X} \mid \tilde{Y}) \propto p(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p, \tilde{X}, \tilde{Y}) = p(\tilde{Y} \mid \tilde{X}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2) \prod_{j=1}^{2} p(\mu_j)\, p(1/\sigma_j^2) \cdot p(p)\, p(\tilde{X} \mid p).$$

Sampling Distribution

The sampling distribution for the Y values can be given by

$$p(\tilde{Y} \mid \tilde{X}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p) = \prod_{i=1}^{n} p(Y_i \mid X_i, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2) = \prod_{i=1}^{n} \text{dnorm}(y_i; \mu_1, \sigma_1^2)^{x_i}\, \text{dnorm}(y_i; \mu_2, \sigma_2^2)^{1 - x_i},$$

where $\text{dnorm}(y_i; \mu_1, \sigma_1^2)$ signifies the pdf of a $N(\mu_1, \sigma_1^2)$ distribution evaluated at $y_i$.
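In R this per-observation density is a one-liner; a sketch (note that R's `dnorm` takes a standard deviation, hence `sqrt(sigma2)`; `mu` and `sigma2` are length-2 vectors of component parameters):

```r
# Sampling density of one observation given its indicator x_i,
# mirroring the formula above.
lik_i <- function(yi, xi, mu, sigma2) {
  dnorm(yi, mu[1], sqrt(sigma2[1]))^xi * dnorm(yi, mu[2], sqrt(sigma2[2]))^(1 - xi)
}
```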

Full Conditionals

The full conditional distribution of $X_i$ can be given by

$$P(X_i = 1 \mid \tilde{Y}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p) = \frac{P(X_i = 1 \mid p)\, p(Y_i \mid x_i = 1, \mu_1, \sigma_1^2)}{p(Y_i \mid \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p)}$$

$$= \frac{p \cdot p(Y_i \mid x_i = 1, \mu_1, \sigma_1^2)}{p \cdot p(Y_i \mid x_i = 1, \mu_1, \sigma_1^2) + (1 - p)\, p(Y_i \mid x_i = 0, \mu_2, \sigma_2^2)}$$

$$= \frac{p \cdot \text{dnorm}(y_i; \mu_1, \sigma_1^2)}{p \cdot \text{dnorm}(y_i; \mu_1, \sigma_1^2) + (1 - p) \cdot \text{dnorm}(y_i; \mu_2, \sigma_2^2)}.$$

Therefore, we can conclude that

$$X_i \mid y_i, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, p \sim \text{Bernoulli}\left(\frac{p\, \text{dnorm}(y_i; \mu_1, \sigma_1^2)}{p\, \text{dnorm}(y_i; \mu_1, \sigma_1^2) + (1 - p)\, \text{dnorm}(y_i; \mu_2, \sigma_2^2)}\right).$$
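This update could be drawn as below; a sketch, where `mu` and `sigma2` are length-2 vectors of current parameter values and `p` is the current mixing weight (names illustrative):

```r
# One sweep of the X update: membership probabilities, then Bernoulli draws.
draw_x <- function(y, mu, sigma2, p) {
  d1 <- p * dnorm(y, mu[1], sqrt(sigma2[1]))
  d2 <- (1 - p) * dnorm(y, mu[2], sqrt(sigma2[2]))
  rbinom(length(y), 1, d1 / (d1 + d2))
}
```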

Now, finding the full conditional distribution of $p$, we get

$$p(p \mid \tilde{Y}, \tilde{X}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2) \propto \text{Joint pdf of all from above} \propto p(p)\, p(\tilde{X} \mid p)$$

$$= \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\, p^{a-1}(1 - p)^{b-1} \prod_{i=1}^{n} p^{x_i}(1 - p)^{1 - x_i} \propto p^{\sum x_i + a - 1}(1 - p)^{n - \sum x_i + b - 1}.$$

Looking at this result, we can conclude that $p$ comes from a beta distribution, given by

$$p \mid \tilde{Y}, \tilde{X}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2 \sim \text{beta}\Big(\sum x_i + a,\; n - \sum x_i + b\Big).$$
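The corresponding draw is a one-liner; a sketch, with `a` and `b` the beta prior parameters:

```r
# Draw p from its beta full conditional given the current indicators x.
draw_p <- function(x, a, b) rbeta(1, sum(x) + a, length(x) - sum(x) + b)
```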

Now to find the full conditional for $\mu_1$:

$$p(\mu_1 \mid \tilde{Y}, \tilde{X}, p, \mu_2, \sigma_1^2, \sigma_2^2) \propto \text{Joint pdf of all variables} \propto p(\tilde{Y} \mid \tilde{X}, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2)\, p(\mu_1)$$

$$\propto \prod_{i=1}^{n} \text{dnorm}(y_i; \mu_1, \sigma_1^2)^{x_i}\, \text{dnorm}(y_i; \mu_2, \sigma_2^2)^{1 - x_i} \cdot \text{dnorm}(\mu_1; \mu_0, \tau_0^2)$$

$$\propto \prod_{i\,:\,x_i = 1} \text{dnorm}(y_i; \mu_1, \sigma_1^2) \cdot \text{dnorm}(\mu_1; \mu_0, \tau_0^2)$$

$$\propto \exp\left\{ -\frac{1}{2\tau_0^2}(\mu_1 - \mu_0)^2 - \frac{1}{2\sigma_1^2}\sum_{i\,:\,x_i = 1}(y_i - \mu_1)^2 \right\}.$$

Then, ignoring the $-1/2$ in front, inside the $\exp\{\cdot\}$ we have

$$\frac{1}{\tau_0^2}(\mu_1^2 - 2\mu_1\mu_0 + \mu_0^2) + \frac{1}{\sigma_1^2}\Big(\sum y_i^2 - 2\mu_1 \sum y_i + n_1\mu_1^2\Big) = a\mu_1^2 - 2b\mu_1 + c,$$

where the sums run over $\{i : x_i = 1\}$ and

$$a = \frac{1}{\tau_0^2} + \frac{n_1}{\sigma_1^2}, \qquad b = \frac{\mu_0}{\tau_0^2} + \frac{\sum y_i}{\sigma_1^2}, \qquad c = c(\mu_0, \tau_0^2, \sigma_1^2, \tilde{y}).$$

Now putting it all together, we get

$$p(\mu_1 \mid \tilde{Y}, \tilde{X}, p, \mu_2, \sigma_1^2, \sigma_2^2) \propto \exp\left\{-\tfrac{1}{2}(a\mu_1^2 - 2b\mu_1)\right\} = \exp\left\{-\tfrac{1}{2}a\big(\mu_1^2 - 2b\mu_1/a + b^2/a^2\big) + \tfrac{b^2}{2a}\right\}$$

$$\propto \exp\left\{-\tfrac{1}{2}a(\mu_1 - b/a)^2\right\} = \exp\left\{-\frac{1}{2}\left(\frac{\mu_1 - b/a}{1/\sqrt{a}}\right)^2\right\}.$$

This function follows the shape of a normal distribution where the mean is $b/a$ and the variance is $1/a$. Therefore, if $Y_1 = \{Y_i : X_i = 1\}$, $n_1$ is the size of $Y_1$, and $\bar{y}_1$ is the mean of the values in $Y_1$, we can conclude

$$\mu_1 \mid \tilde{Y}, \tilde{X}, p, \mu_2, \sigma_1^2, \sigma_2^2 \sim N(\mu_{n,1}, \tau_{n,1}^2), \quad \text{where}$$

$$\tau_{n,1}^2 = \frac{1}{a} = \left(\frac{1}{\tau_0^2} + \frac{n_1}{\sigma_1^2}\right)^{-1} \quad \text{and} \quad \mu_{n,1} = \frac{b}{a} = \left(\frac{\mu_0}{\tau_0^2} + \frac{n_1\bar{y}_1}{\sigma_1^2}\right)\tau_{n,1}^2.$$

Without loss of generality, we can follow a similar process to come up with the distribution for $\mu_2$, and will end up with

$$\mu_2 \mid \tilde{Y}, \tilde{X}, p, \mu_1, \sigma_1^2, \sigma_2^2 \sim N(\mu_{n,2}, \tau_{n,2}^2), \quad \text{where}$$

$$\tau_{n,2}^2 = \left(\frac{1}{\tau_0^2} + \frac{n_2}{\sigma_2^2}\right)^{-1} \quad \text{and} \quad \mu_{n,2} = \left(\frac{\mu_0}{\tau_0^2} + \frac{n_2\bar{y}_2}{\sigma_2^2}\right)\tau_{n,2}^2,$$

where $Y_2 = \{Y_i : X_i = 0\}$, $n_2$ is the size of $Y_2$, and $\bar{y}_2$ is the mean of the values in $Y_2$.
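A sketch of this draw ($\mu_2$ is analogous, using the $x_i = 0$ group); `yj` holds the y values currently assigned to component $j$:

```r
# Draw mu_j from its normal full conditional.
draw_mu <- function(yj, sigma2_j, mu0, tau0_2) {
  tau_n2 <- 1 / (1 / tau0_2 + length(yj) / sigma2_j)  # posterior variance 1/a
  mu_n   <- (mu0 / tau0_2 + sum(yj) / sigma2_j) * tau_n2  # posterior mean b/a
  rnorm(1, mu_n, sqrt(tau_n2))
}
```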

Finally, we must find the full conditional distributions of $\sigma_1^2$ and $\sigma_2^2$. Since we know the prior distribution of $1/\sigma_1^2$, we will find the full conditional based on the precision $\tilde{\sigma}_1^2 = 1/\sigma_1^2$. Therefore,

$$p(1/\sigma_1^2 \mid \tilde{Y}, \tilde{X}, p, \mu_1, \mu_2, \sigma_2^2) \propto \text{Joint pdf of all variables} \propto p(\tilde{Y} \mid \tilde{X}, p, \mu_1, \mu_2, \sigma_2^2)\, p(1/\sigma_1^2)$$

$$\propto (\tilde{\sigma}_1^2)^{n_1/2} \exp\left(-\tilde{\sigma}_1^2 \sum_{i=1}^{n_1}(y_i - \mu_1)^2/2\right) \cdot (\tilde{\sigma}_1^2)^{\nu_0/2 - 1} \exp\left(-\tilde{\sigma}_1^2\, \nu_0\sigma_0^2/2\right)$$

$$= (\tilde{\sigma}_1^2)^{(\nu_0 + n_1)/2 - 1} \exp\left\{-\tilde{\sigma}_1^2\Big[\nu_0\sigma_0^2 + \sum(y_i - \mu_1)^2\Big]/2\right\},$$

where all the $y_i$ values in the sums have corresponding $x_i$ values equal to 1. This function takes the shape of a gamma distribution; therefore,

$$1/\sigma_1^2 \mid \tilde{Y}, \tilde{X}, p, \mu_1, \mu_2, \sigma_2^2 \sim \text{Gamma}\left(\frac{\nu_{n,1}}{2},\; \frac{\nu_{n,1}\,\sigma_{n,1}^2(\mu_1)}{2}\right), \quad \text{where}$$

$$\nu_{n,1} = \nu_0 + n_1, \qquad \sigma_{n,1}^2(\mu_1) = \frac{1}{\nu_{n,1}}\Big[\nu_0\sigma_0^2 + n_1\, s_{n,1}^2(\mu_1)\Big], \qquad s_{n,1}^2(\mu_1) = \sum(y_i - \mu_1)^2 / n_1.$$

Again, without loss of generality, we can follow the same logic to come up with the full conditional distribution for $1/\sigma_2^2$, which will end up being

$$1/\sigma_2^2 \mid \tilde{Y}, \tilde{X}, p, \mu_1, \mu_2, \sigma_1^2 \sim \text{Gamma}\left(\frac{\nu_{n,2}}{2},\; \frac{\nu_{n,2}\,\sigma_{n,2}^2(\mu_2)}{2}\right), \quad \text{where}$$

$$\nu_{n,2} = \nu_0 + n_2, \qquad \sigma_{n,2}^2(\mu_2) = \frac{1}{\nu_{n,2}}\Big[\nu_0\sigma_0^2 + n_2\, s_{n,2}^2(\mu_2)\Big], \qquad s_{n,2}^2(\mu_2) = \sum(y_i - \mu_2)^2 / n_2.$$
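A sketch of this draw via the precision ($1/\sigma_2^2$ is analogous); note the shape/rate arguments match the Gamma$(\nu_n/2, \nu_n\sigma_n^2/2)$ form above:

```r
# Draw sigma_j^2 by sampling its precision from the gamma full conditional.
draw_sigma2 <- function(yj, mu_j, nu0, sigma0_2) {
  nu_n <- nu0 + length(yj)
  ss   <- nu0 * sigma0_2 + sum((yj - mu_j)^2)   # nu_n * sigma_n^2(mu_j)
  1 / rgamma(1, nu_n / 2, ss / 2)
}
```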

Gibb’s Sampler

I implemented a Gibbs sampler with $a = b = 1$, because I didn't know anything in advance about the proportion of people from the low-mean distribution versus the high-mean distribution; $\mu_0 = 140$, because it was the mean of the entire set of y values; $\tau_0^2 = 625$ and $\sigma_0^2 = 625$, because they seemed like reasonable values for the variances; and $\nu_0 = 5$. I then used the full conditional distributions from above and ran the Gibbs sampler for 100,000 iterations.
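Putting the single-step draws above together, a sketch of the full sampler (the function name `gibbs_mixture` and the starting values are illustrative):

```r
# Gibbs sampler for the two-component normal mixture, using the
# draw_x / draw_p / draw_mu / draw_sigma2 helpers defined above.
gibbs_mixture <- function(y, S = 100000, a = 1, b = 1, mu0 = 140,
                          tau0_2 = 625, sigma0_2 = 625, nu0 = 5) {
  # starting values, taken from the prior guesses in the Introduction
  p <- 0.5; mu <- c(110, 150); sigma2 <- c(100, 64)
  out <- matrix(NA, S, 5,
                dimnames = list(NULL, c("mu1", "mu2", "sigma1_2", "sigma2_2", "p")))
  for (s in 1:S) {
    x <- draw_x(y, mu, sigma2, p)       # component indicators
    p <- draw_p(x, a, b)                # mixing weight
    for (j in 1:2) {
      yj <- y[x == (2 - j)]             # x = 1 -> group 1, x = 0 -> group 2
      mu[j]     <- draw_mu(yj, sigma2[j], mu0, tau0_2)
      sigma2[j] <- draw_sigma2(yj, mu[j], nu0, sigma0_2)
    }
    out[s, ] <- c(mu, sigma2, p)
  }
  out
}
```

Running `out <- gibbs_mixture(y)` would then produce the draws summarized in the next section.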

Diagnostics

Looking at Figure 2 for the means and 95% credible intervals for all of the parameters of interest, our initial guesses for the $\mu_j$ values look to be pretty close, but our original guesses for the $\sigma_j^2$ weren't close at all. It also appears that about 46.7% of people come from the distribution with the smaller mean, and the rest are from the larger distribution.

Parameter       2.5%      50%     97.5%
$\mu_1$       108.87   110.79   112.70
$\mu_2$       143.10   146.43   149.12
$\sigma_1^2$  106.65   129.55   158.89
$\sigma_2^2$  187.83   236.89   306.44
$p$            0.393    0.467    0.529

Figure 2: Means and 95% Credible Intervals

From both Figure 2 and Figure 3, one can see that the posterior of $\mu_2$ has a wider spread than that of $\mu_1$. This is another consequence of there being a significant number of extreme values on the high end of the original data set, making $\mu_2$ vary more.

Figure 3: Posterior Density of $\mu_1$ and $\mu_2$

According to the Gibbs sampler, if we take a sample of $\tilde{Y}$ from the corresponding $\mu_j$ and $\sigma_j$ draws, we get a similar-looking kernel density, which can be seen in Figure 4. The shape of the distribution with the smaller mean from the original sample looks quite similar to the mixture model; however, the distribution with the larger mean doesn't quite match up. That's because when we run the Gibbs sampler, $\sigma_2 = 15.45$, which is quite high. I think this happens because there are a couple of extreme values on the high end of the original sample which might be influencing the variance of the higher-mean distribution.

Figure 4: Kernel Densities

I also found that an individual whose y value is 120 has a 78% chance of coming from the distribution with the smaller mean.
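That probability could be computed from the sampler output by averaging the Bernoulli full-conditional formula over the posterior draws; a sketch, assuming `out` is the matrix returned by `gibbs_mixture` above:

```r
# Posterior probability that an individual with y = 120 came from the
# smaller-mean component, averaged over the Gibbs draws in `out`.
d1 <- out[, "p"] * dnorm(120, out[, "mu1"], sqrt(out[, "sigma1_2"]))
d2 <- (1 - out[, "p"]) * dnorm(120, out[, "mu2"], sqrt(out[, "sigma2_2"]))
mean(d1 / (d1 + d2))
```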

Looking at the autocorrelation functions in Figure 5, we can see that the $\mu_j$ chains are quite correlated at short lags, but as the lag grows they eventually become uncorrelated. This is why I ran such a large number of iterations with the Gibbs sampler: so I could get large enough effective sample sizes for both of the $\mu_j$ chains. Those effective sample sizes were 5111.7 for $\mu_1$ and 3952.5 for $\mu_2$.
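These diagnostics could be computed with the coda package; a sketch, again assuming `out` from `gibbs_mixture` above:

```r
library(coda)
# Effective sample sizes for the mu_j chains
effectiveSize(mcmc(out[, c("mu1", "mu2")]))
# Autocorrelation plots corresponding to Figure 5
acf(out[, "mu1"]); acf(out[, "mu2"])
```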

Figure 5: Autocorrelation Functions
