Description
Problem 1 [25%]
Suppose that I collected data for a group of machine learning students from last year. For each student, I have the features $X_1$ = hours studied for the class every week and $X_2$ = overall GPA, and the response $Y$ = whether the student receives an A. We fit a logistic regression model and produce the estimated coefficients $\hat{\beta}_0 = -6$, $\hat{\beta}_1 = -0.1$, $\hat{\beta}_2 = 1.0$.
- Estimate the probability of getting an A for a student who studies for 40 hours and has an undergrad GPA of 2.0.
- By how much would the student from the first part need to improve their GPA or adjust their time studied to have a 90% chance of getting an A in the class? Is that likely?
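The two parts above both rest on the logistic model $p = 1 / (1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2)})$ and its inverse, the log-odds. The following is a minimal sketch of that computation, not a worked solution: the function names `predict_proba`, `logit`, and `required_x2` are hypothetical helpers, and the coefficients in the demo are made up for illustration rather than taken from the problem.

```python
import math


def predict_proba(b0, b1, b2, x1, x2):
    """Logistic-regression probability P(Y = 1 | x1, x2)."""
    z = b0 + b1 * x1 + b2 * x2          # linear predictor (log-odds)
    return 1.0 / (1.0 + math.exp(-z))   # logistic (sigmoid) function


def logit(p):
    """Log-odds corresponding to a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def required_x2(b0, b1, b2, x1, target_p):
    """Value of x2 that yields probability target_p when x1 is held fixed."""
    return (logit(target_p) - b0 - b1 * x1) / b2


# Illustrative coefficients only, NOT the ones from the problem statement.
b0, b1, b2 = -4.0, 0.25, 1.0
p = predict_proba(b0, b1, b2, x1=4.0, x2=3.0)    # log-odds = -4 + 1 + 3 = 0
print(p)                                          # 0.5
x2_needed = required_x2(b0, b1, b2, x1=4.0, target_p=0.9)
print(x2_needed)
```

Solving for the other feature works the same way: isolate $X_1$ in the log-odds equation and divide by $\beta_1$. Whether the resulting value is attainable (for example, a GPA above the scale maximum or a negative number of hours) is a separate question from the algebra.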
Problem 2 [25%]
Consider a classification problem with two classes T (true) and F (false). Then, suppose that you have the following four prediction models:
- T: The classifier always predicts T for each instance
- F: The classifier always predicts F for each instance
- C: The classifier always predicts the correct label (100% accuracy)
- W: The classifier always predicts the wrong label (0% accuracy)
You also have a test set in which 60% of the instances are labeled T and 40% are labeled F. Now, compute the following statistics for each of your classifiers:
Statistic             | Cls. T | Cls. F | Cls. C | Cls. W
recall                |        |        |        |
true positive rate    |        |        |        |
false positive rate   |        |        |        |
true negative rate    |        |        |        |
specificity           |        |        |        |
precision             |        |        |        |
Some of the rows above may be the same.
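As a reminder of how these statistics are defined, here is a minimal sketch that computes them from confusion-matrix counts. It assumes T is the positive class, and it reports a statistic as `None` when its denominator is zero (undefined). The function name `rates` and the counts in the demo are hypothetical and do not correspond to any of the four classifiers above.

```python
def rates(tp, fp, tn, fn):
    """Return common classification statistics from confusion-matrix counts.

    A statistic whose denominator is zero is undefined and reported as None.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "recall": ratio(tp, tp + fn),               # same as true positive rate
        "true positive rate": ratio(tp, tp + fn),
        "false positive rate": ratio(fp, fp + tn),
        "true negative rate": ratio(tn, tn + fp),   # same as specificity
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical classifier on 100 instances (60 labeled T, 40 labeled F);
# these counts are made up for illustration.
stats = rates(tp=30, fp=10, tn=30, fn=30)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The duplicated entries in the dictionary make it explicit which rows of the table are the same quantity under different names.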
Problem O2 [30%]
This problem can be substituted for Problem 2 above, for up to 5 points extra credit. The better score from problems 2 and O2 will be considered.
Solve Exercise 3.4 in [Bishop, C. M. (2006). Pattern Recognition and Machine Learning].
In this problem, you will derive the bias-variance decomposition of MSE as described in Eq. (2.7) in ISL. Let $f$ be the true model and $\hat{f}$ be the estimated model. Consider a fixed instance $x_0$ with the label $y_0 = f(x_0)$. For simplicity, assume that $\operatorname{Var}[\epsilon] = 0$, in which case the decomposition becomes:

$$
\underbrace{\mathbb{E}\left[(y_0 - \hat{f}(x_0))^2\right]}_{\text{test MSE}}
= \underbrace{\operatorname{Var}[\hat{f}(x_0)]}_{\text{variance}}
+ \underbrace{\left(\mathbb{E}[f(x_0) - \hat{f}(x_0)]\right)^2}_{\text{bias}^2}.
$$
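Before attempting the derivation, it can help to check the identity numerically. The sketch below is a sanity check under stated assumptions, not a proof: the true value `f_x0`, the bias of 0.3, and the noise level of 0.5 are all made up, and the list `estimates` stands in for the values of $\hat{f}(x_0)$ obtained from many hypothetical training sets. It uses the population variance so that the identity holds for the empirical distribution up to floating-point error.

```python
import random
import statistics

random.seed(0)

# Hypothetical true value f(x0); y0 = f(x0) because Var[eps] = 0 is assumed.
f_x0 = 2.0

# Hypothetical estimates fhat(x0) across training sets: biased by +0.3, noisy.
estimates = [f_x0 + 0.3 + random.gauss(0.0, 0.5) for _ in range(10_000)]

# Left-hand side: mean squared error of the estimates at x0.
squared_errors = [(f_x0 - e) ** 2 for e in estimates]
mse = statistics.fmean(squared_errors)

# Right-hand side: variance of the estimates plus squared bias.
variance = statistics.pvariance(estimates)            # population variance
bias_sq = (f_x0 - statistics.fmean(estimates)) ** 2

print(mse, variance + bias_sq)
```

The two printed numbers should agree up to rounding, which mirrors the algebraic step of adding and subtracting $\mathbb{E}[\hat{f}(x_0)]$ inside the square.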