CSC 591/791 (Fall 2019): Homework 02 – Solution


This assignment contains 3 questions. Please read and follow the instructions below.

  1. (30 points) [Farzaneh Khoshnevisan] [HMM]

Infection is a common condition among patients in ICU settings and can have various root causes, which makes it challenging to determine. Assume that we want to model infection using an HMM, where infection is the hidden state and the only available observation is blood pressure (0 for normal and 1 for abnormal). When patients enter the ICU, the probability of being infected is 0.75. At any given time, infected patients have a 40% chance of improving to become uninfected, and uninfected patients have a 20% chance of becoming infected. There is an 80% chance of observing abnormal blood pressure for infected patients, while there is only a 10% chance of observing abnormal blood pressure for uninfected patients.
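For part (b) below, these probabilities collect into the standard HMM parameter tables: an initial distribution, a transition matrix, and an emission matrix. Here is a minimal sketch in NumPy; the state ordering (infected first, uninfected second) and observation ordering (0 = normal, 1 = abnormal) are conventions chosen for illustration, not fixed by the problem.

```python
import numpy as np

# States: 0 = infected, 1 = uninfected (ordering is our convention)
# Observations: 0 = normal blood pressure, 1 = abnormal blood pressure

pi = np.array([0.75, 0.25])          # P(infected) = 0.75 on ICU entry

A = np.array([[0.60, 0.40],          # infected -> infected / uninfected
              [0.20, 0.80]])         # uninfected -> infected / uninfected

B = np.array([[0.20, 0.80],          # P(normal | infected), P(abnormal | infected)
              [0.90, 0.10]])         # P(normal | uninfected), P(abnormal | uninfected)
```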

  (a) (5 points) How many parameters are required to fully define this HMM? Justify your answer.

  (b) (3 points) Create the initial, transition, and emission probability tables based on the problem statement given above.

  (c) (6 points) Using the described HMM and the generated probability tables, apply the forward algorithm to compute the probability of observing the blood-pressure sequence {0, 1, 1}. Show your work (i.e., show each of your α's). (A code sketch for parts (c) through (f) follows this question.)

  (d) (6 points) Using the backward algorithm, compute the probability of observing the aforementioned sequence ({0, 1, 1}). Again, show your work (i.e., show each of your β's).


  (e) (4 points) Using the forward-backward algorithm, compute the most likely setting of each hidden state at each time step. Show your work.

  (f) (6 points) Use the Viterbi algorithm to compute the most likely sequence of hidden states. Show your work.
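For checking parts (c) through (f) by hand, the following sketch implements the forward, backward, forward-backward, and Viterbi computations on the pi, A, B arrays defined above. It is one straightforward reference implementation, not the required solution format.

```python
def forward(pi, A, B, obs):
    """alpha[t, i] = P(o_1..o_t, q_t = i)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha                      # P(obs) = alpha[-1].sum()

def backward(A, B, obs, N):
    """beta[t, i] = P(o_{t+1}..o_T | q_t = i)."""
    T = len(obs)
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

obs = [0, 1, 1]                       # normal, abnormal, abnormal
alpha = forward(pi, A, B, obs)
beta = backward(A, B, obs, len(pi))
print("P(obs) via forward :", alpha[-1].sum())
print("P(obs) via backward:", (pi * B[:, obs[0]] * beta[0]).sum())

# Forward-backward posteriors: gamma[t, i] = P(q_t = i | obs)
gamma = alpha * beta / alpha[-1].sum()
print("most likely state per step:", gamma.argmax(axis=1))

def viterbi(pi, A, B, obs):
    """Most likely joint hidden-state sequence."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A      # scores[i, j] = delta[i] * A[i, j]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [delta[-1].argmax()]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    return path[::-1]

print("Viterbi path (0 = infected):", viterbi(pi, A, B, obs))
```

Running it prints the sequence probability twice (the forward and backward totals must agree), the per-step posterior argmax for part (e), and the Viterbi path for part (f).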

  2. (20 points) [Farzaneh Khoshnevisan] [HMM]

We design a two-state HMM for a dice-toss game between two players, where the outcome of each toss is our only observation. S1 and S2 indicate the hidden states in which the die is tossed by player 1 and player 2, respectively. The transition probabilities between these two states are given in the diagram below. Both players are equally likely to start the game, and each plays with a biased die. The output distribution corresponding to each player is defined over {1, 2, 3, 4, 5, 6} and is given in the table below the diagram.

  (a) (5 points) Give an example of an output sequence of length 2 that cannot be generated by the given HMM. Justify your answer.

  (b) (5 points) We generated a long sequence of observations from this HMM and found that the last observation in the sequence was 1. What is the most likely hidden state corresponding to the last observation? Justify your answer.

  (c) (5 points) Consider the output sequence {1, 1}. What is the most likely sequence of hidden states corresponding to this observation sequence? Show your work.

  (d) (5 points) Now consider the output sequence {1, 1, 3}. What are the first two states of the most likely hidden-state sequence? Show your work. (A code sketch for parts (c) and (d) follows.)
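The same viterbi function from the sketch above can check parts (c) and (d). The transition diagram and emission table are not reproduced in this text, so every probability below except the uniform start distribution is a hypothetical placeholder standing in for the values in the diagram.

```python
# Placeholder parameters: the real values come from the diagram and table.
pi2 = np.array([0.5, 0.5])            # both players equally likely to start (given)
A2  = np.array([[0.7, 0.3],           # hypothetical S1/S2 transition probabilities
                [0.4, 0.6]])
B2  = np.array([[0.5, 0.1, 0.1, 0.1, 0.1, 0.1],   # hypothetical biased die, player 1
                [0.1, 0.1, 0.2, 0.2, 0.2, 0.2]])  # hypothetical biased die, player 2

# Observations are die faces 1..6; subtract 1 to index the emission columns.
print(viterbi(pi2, A2, B2, [o - 1 for o in (1, 1)]))
print(viterbi(pi2, A2, B2, [o - 1 for o in (1, 1, 3)]))
```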

  3. (50 points) [Song Ju] [Reinforcement Learning] Consider the following Markov Decision Process:


Our state space is S = {S1, S2, S3, S4, S5} and our action space is A = {"Left", "Right"}. For all parts of this problem, assume that the discount factor γ = 0.8.

For subquestions (a)-(c) below, we assume that all actions are deterministic:

  (a) (5 points) What is the optimal policy for this MDP?

  (b) (15 points) Calculate V(S5), the value of state S5 under the optimal policy. Show your work. (The acceptable answer is a single number, and you should show all the key steps.)

  (c) (10 points) Consider executing Q-learning on this MDP. Assume that 1) all of the initial Q values are 0, 2) the learning rate α = 0.5, and 3) the agent uses a greedy exploration policy, always choosing the action with the maximum Q value in any given state. The algorithm breaks ties by choosing "Left". What are the first 10 (state, action) pairs if our robot learns using Q-learning and starts in state S3? (A candidate answer can be expressed in the form (S3, Left), (S2, Right), (S3, Right), ...; a sketch of the update rule follows this question.)
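For reference on part (c): tabular Q-learning updates Q(s, a) <- Q(s, a) + α (r + γ max_a' Q(s', a') - Q(s, a)), acting greedily with ties broken toward "Left". A minimal sketch follows; the MDP's rewards and transitions live in the omitted figure, so the step function here is a hypothetical stand-in.

```python
GAMMA, ALPHA = 0.8, 0.5
STATES  = ["S1", "S2", "S3", "S4", "S5"]
ACTIONS = ["Left", "Right"]
Q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}

def greedy_action(s):
    # Greedy choice; max keeps the first of equal keys, so ties go to "Left".
    return max(ACTIONS, key=lambda a: Q[s][a])

def q_update(s, a, r, s_next):
    target = r + GAMMA * max(Q[s_next].values())
    Q[s][a] += ALPHA * (target - Q[s][a])

def step(s, a):
    # Hypothetical environment: the true rewards/transitions come from the figure.
    i = STATES.index(s)
    j = max(i - 1, 0) if a == "Left" else min(i + 1, len(STATES) - 1)
    r = 10.0 if STATES[j] == "S5" else 0.0      # placeholder reward
    return r, STATES[j]

s, visited = "S3", []
for _ in range(10):
    a = greedy_action(s)
    visited.append((s, a))
    r, s_next = step(s, a)
    q_update(s, a, r, s_next)
    s = s_next
print(visited)
```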

For subquestions (d)-(e), assume the actions are not deterministic, in that after taking an action it is possible to stay in the same state:

  (d) (10 points) Consider executing Value Iteration on this MDP. The transition matrix shown below indicates the probability of transitioning from state s to state s' by taking action a. In the matrix, the first column is the start state and the first row is the ending state. For example, when taking action Left in state S2, the probability of transitioning to S1 is 0.6, while the probability of staying in S2 is 0.4. An empty cell in the matrix means there is no transition.

For a given iteration t, the value functions of the states are V^t(S1) = 20, V^t(S2) = 30, V^t(S3) = 20, V^t(S4) = 30, and V^t(S5) = 10. Compute the new value function of all states in the next iteration, t + 1, using Value Iteration.
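For reference on part (d): one Value Iteration backup computes V^{t+1}(s) = max_a Σ_{s'} P(s' | s, a) (R(s, a, s') + γ V^t(s')). The sketch below encodes the given V^t values and the one transition row quoted in the question (Left from S2: 0.6 to S1, 0.4 to stay); the remaining transition rows and the reward function come from the omitted matrix and figure, so the zero-reward placeholder is hypothetical.

```python
GAMMA = 0.8
V_t = {"S1": 20.0, "S2": 30.0, "S3": 20.0, "S4": 30.0, "S5": 10.0}

# P[(s, a)] -> list of (probability, next_state). Only the (S2, Left) row is
# given in the text; the rest must be read off the omitted transition matrix.
P = {("S2", "Left"): [(0.6, "S1"), (0.4, "S2")]}

def R(s, a, s_next):
    return 0.0        # placeholder: the real rewards come from the MDP figure

def backup(s, actions=("Left", "Right")):
    """One Value Iteration backup: V_{t+1}(s) = max_a E[R + gamma * V_t(s')]."""
    vals = [sum(p * (R(s, a, sn) + GAMMA * V_t[sn]) for p, sn in P[(s, a)])
            for a in actions if (s, a) in P]
    return max(vals) if vals else None

print(backup("S2"))   # 0.6 * (0.8 * 20) + 0.4 * (0.8 * 30) = 19.2 with zero rewards
```

With zero placeholder rewards, the S2 backup is 0.6(0.8 * 20) + 0.4(0.8 * 30) = 19.2; with the real rewards the arithmetic changes but the recursion does not.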


  (e) (10 points) Based on the value function and transition matrix in the previous subquestion (d), answer the following questions: 1) What is the optimal policy at time t? 2) What is the optimal policy at t + 1, after running Value Iteration? 3) Are the two policies different?
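For part (e), the policy implied by a value function is the greedy one, π(s) = argmax_a Σ_{s'} P(s' | s, a) (R(s, a, s') + γ V(s')). Reusing the hypothetical P, R, GAMMA, V_t, and backup from the previous sketch, one way to extract and compare the two policies:

```python
def greedy_policy(V, actions=("Left", "Right")):
    """Greedy policy w.r.t. V, over the states whose transition rows are known."""
    policy = {}
    for s in V:
        known = [a for a in actions if (s, a) in P]
        if known:
            policy[s] = max(known, key=lambda a: sum(
                p * (R(s, a, sn) + GAMMA * V[sn]) for p, sn in P[(s, a)]))
    return policy

pi_t = greedy_policy(V_t)
V_t1 = {}
for s in V_t:
    b = backup(s)
    V_t1[s] = b if b is not None else V_t[s]   # keep old value where P is unknown
pi_t1 = greedy_policy(V_t1)
print(pi_t, pi_t1, "same policy?" , pi_t == pi_t1)
```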