HOMEWORK #9


*Please be sure to submit your assignment by 11:55ish pm (or before) to prevent any glitches in the upload from precluding your timely submission.

*Please work well in advance, getting help during office hours and labs, as there will be no extensions given for this assignment, outside of extreme, extenuating circumstances which must be communicated in advance to the primary instructor.

There is 1 problem with several parts (1a-1g) in this homework assignment. Please double-check that you have provided a response for each part of the problem before you submit.

BST 210 Problem set policies:

  • We encourage you to discuss homework with your fellow students (or with the instructor or the TAs), but you must write your own final answers, in your own words.

  • Please include the appropriate computer output in your solution if that helps you to answer a question, but be sure to interpret your findings in words – submitting only output is not sufficient for full credit.

  • Homework assignments will not be accepted late (other than for extreme emergency, but the primary instructor must be reached in advance).

  • To receive full credit, be complete in your responses but not verbose.

  • All homework must be submitted online via Canvas by 11:59pm on Tuesday.

Problem 1

Suppose you wish to design a prospective cohort study assessing whether obesity (BMI ≥ 30) is associated with presence or absence of coronary heart disease (CHD) or time to CHD. Because we don’t want to wait 24 years to collect our data, a consistent four-year follow-up is planned for each subject. Future subjects are expected to look similar to those in the Framingham study.

A pilot study of 250 Framingham-like subjects is run, excluding subjects who already had prevalent CHD. First, we look at presence or absence of obesity to predict a binary four-year CHD incidence. Subjects who died within four years without having a prior CHD are viewed as not developing CHD. Among the 250 subjects, 3 of 31 obese (BMI ≥ 30) subjects developed CHD within 4 years, and 11 of 219 non-obese subjects developed CHD within 4 years. Using logistic regression, the estimated odds ratio comparing obese vs. non-obese subjects is 2.025974 with 95% confidence interval (0.5325236, 7.707772).
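The odds ratio and confidence interval quoted above can be checked by hand from the pilot counts. The sketch below assumes the interval is the usual Wald interval on the log odds ratio scale, which is what a logistic regression with a single binary predictor yields. The function name `odds_ratio_wald_ci` is ours and is not part of the assignment.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(a, b, c, d, level=0.95):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: exposed cases and non-cases; c, d: unexposed cases and non-cases.
    """
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(or_hat) - z * se_log_or)
    upper = math.exp(math.log(or_hat) + z * se_log_or)
    return or_hat, lower, upper


# Pilot data: 3 of 31 obese and 11 of 219 non-obese subjects developed CHD.
or_hat, lower, upper = odds_ratio_wald_ci(3, 31 - 3, 11, 219 - 11)
print(f"OR = {or_hat:.4f}, 95% CI ({lower:.4f}, {upper:.4f})")
```

Run on the pilot counts, this should agree with the estimate of about 2.03 and the interval of roughly (0.53, 7.71) reported above.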

Next, we look at presence or absence of obesity to predict time to CHD, with right censoring occurring at four years for subjects who did not have CHD by that time. Subjects who died without having a prior CHD were viewed as being censored at their time of death. Among these same 250 subjects, the estimated hazard ratio from the proportional hazards model comparing obese vs. non-obese subjects is 1.990321 with 95% confidence interval (0.5552206, 7.134780). Not surprisingly, given only 250 subjects, we do not reach statistical significance with either analysis. View these data as informative historical data (and ignore the analysis of the full Framingham data farther below for now).

  (a) Determine the sample size needed for 90% power in a two-sided 0.05 level test to compare proportions of incident CHD over four years in obese vs. non-obese subjects if, under the alternative hypothesis, we had proportions with incident CHD as observed in the 250 subjects. Keep the proportions of obese and non-obese subjects the same as observed in the pilot study.

  (b) Determine the sample size needed for 90% power in a two-sided 0.05 level log-rank test to compare times to CHD in obese vs. non-obese subjects if, under the alternative hypothesis, we have a hazard ratio as observed in the 250 subjects. Keep the proportions of obese vs. non-obese subjects the same as observed, as well as the proportion of censored observations.

  (c) How do the sample sizes change if we design each of the studies above to have equal numbers of obese and non-obese subjects? Is the total sample size larger or smaller? Does that make intuitive sense? Briefly comment on the feasibility of such a design.

  (d) If you had to pick “one primary outcome” for your study, would you prefer to design this study to have a binary or a time-to-event outcome? Briefly justify your choice.

  (e) In practice, one would use more “rounded” values for ORs, HRs, or proportions of obese or censored observations than were used above, as you would not exactly believe the estimates from the sample size of 250. It would also be important to perform a range of calculations to show power or sample size under different scenarios. Using your “one primary outcome” selected above, develop an appropriate sample size based on your parameter of interest (either OR or HR) of 2.0 for 90% power for a two-sided 0.05 level test. Then, develop a table and/or graph for a range of 1.5 to 2.5 in increments of 0.1 for your parameter of interest, for a fixed sample size, to show how power changes under different scenarios. Also include a sentence or two summarizing your results that would be appropriate to include in a protocol or grant application. Feel free to add or adjust anything here that seems reasonable to you. Different reasonable researchers might use different assumptions and so would end up with different sample sizes, so be sure that your sentences summarizing your results include all necessary information about your assumptions.

  (f) Analyses from the “full” Framingham data set are below, restricting follow-up to four years to develop CHD for each subject and excluding subjects with prevalent CHD or missing BMI. How close were your assumptions in your sample size calculations to the “truth” from the full data set? Were the sample sizes that were actually achieved sufficient for high power? Briefly comment.

  (g) A colleague noted that the relative risk, the odds ratio, and the hazard ratio estimates in these analyses were similar, and that logistic regression, the log-rank test, and the Cox model gave similar P-values. Briefly discuss whether these observations make sense or not.
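The sample size questions above can be explored with standard closed-form approximations. The following is a minimal sketch, not the required method: it assumes a normal-approximation formula for comparing two proportions with unequal allocation, and the Schoenfeld approximation for the number of events in a log-rank test. Statistical packages may default to other methods (for example Freedman's formula or a continuity correction) and return somewhat different numbers. The function names, and the choice to convert events to subjects with an allocation-weighted event probability, are our assumptions.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()


def n_two_proportions(p1, p0, frac_exposed, alpha=0.05, power=0.90):
    """Total N for a two-sided z-test of two proportions (normal approximation).

    frac_exposed is the share of subjects in the exposed (obese) group.
    """
    z_a = _NORM.inv_cdf(1 - alpha / 2)
    z_b = _NORM.inv_cdf(power)
    f = frac_exposed
    p_bar = f * p1 + (1 - f) * p0  # pooled proportion under the null
    var_null = p_bar * (1 - p_bar) * (1 / f + 1 / (1 - f))
    var_alt = p1 * (1 - p1) / f + p0 * (1 - p0) / (1 - f)
    n = (z_a * math.sqrt(var_null) + z_b * math.sqrt(var_alt)) ** 2 / (p1 - p0) ** 2
    return math.ceil(n)


def schoenfeld_events(hr, frac_exposed, alpha=0.05, power=0.90):
    """Number of events for a two-sided log-rank test (Schoenfeld approximation)."""
    z_a = _NORM.inv_cdf(1 - alpha / 2)
    z_b = _NORM.inv_cdf(power)
    f = frac_exposed
    return math.ceil((z_a + z_b) ** 2 / (f * (1 - f) * math.log(hr) ** 2))


# Inputs taken from the pilot study described above.
p1, p0 = 3 / 31, 11 / 219   # four-year CHD risk: obese, non-obese
hr = 1.990321               # pilot hazard ratio
frac_obese = 31 / 250       # pilot share of obese subjects

for label, frac in [("pilot allocation", frac_obese), ("equal allocation", 0.5)]:
    n_binary = n_two_proportions(p1, p0, frac)
    events = schoenfeld_events(hr, frac)
    # Assumption: the overall event probability is the allocation-weighted
    # average of the two pilot risks (14/250 under the pilot allocation).
    prob_event = frac * p1 + (1 - frac) * p0
    n_survival = math.ceil(events / prob_event)
    print(f"{label}: binary N = {n_binary}, events = {events}, log-rank N = {n_survival}")
```

Both formulas contain the factor f(1 - f), with f the obese fraction, in a way that favors balanced groups, so the equal-allocation rows should come out smaller than the pilot-allocation rows.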

(Note that even more work would be needed if we also wanted to account for other covariates or confounders, such as adjusting for gender or age, in our comparisons of obese vs. non-obese subjects.)
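For the power table over a range of effect sizes requested in the problem, one possible approach with a time-to-event outcome is sketched below. Everything numeric here other than the 1.5 to 2.5 grid, the target hazard ratio of 2.0, the 90% power, and the 0.05 level is a hypothetical rounded input chosen for illustration: an obese fraction of 0.125 and a four-year CHD risk of 0.05 in non-obese subjects. The sketch also assumes proportional hazards with no censoring before four years, so that the obese-group risk is 1 - (1 - p0)^HR, and it uses the Schoenfeld approximation for log-rank power.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()

FRAC_OBESE = 0.125   # hypothetical rounded share of obese subjects
RISK_NONOBESE = 0.05  # hypothetical rounded four-year CHD risk, non-obese
ALPHA = 0.05


def event_probability(hr, frac=FRAC_OBESE, p0=RISK_NONOBESE):
    """Overall four-year event probability implied by a hazard ratio."""
    p1 = 1 - (1 - p0) ** hr  # proportional hazards, no early censoring
    return frac * p1 + (1 - frac) * p0


def logrank_power(hr, n_total, frac=FRAC_OBESE):
    """Approximate power of a two-sided log-rank test (Schoenfeld)."""
    expected_events = n_total * event_probability(hr, frac)
    z_a = _NORM.inv_cdf(1 - ALPHA / 2)
    signal = math.sqrt(expected_events * frac * (1 - frac)) * abs(math.log(hr))
    return _NORM.cdf(signal - z_a)


def n_for_power(hr, power=0.90, frac=FRAC_OBESE):
    """Smallest total N reaching the target power at the given hazard ratio."""
    z_a = _NORM.inv_cdf(1 - ALPHA / 2)
    z_b = _NORM.inv_cdf(power)
    events = (z_a + z_b) ** 2 / (frac * (1 - frac) * math.log(hr) ** 2)
    return math.ceil(events / event_probability(hr, frac))


n_fixed = n_for_power(2.0)  # design for HR = 2.0, then hold N fixed
hazard_ratios = [round(1.5 + 0.1 * i, 1) for i in range(11)]
power_table = [(hr, logrank_power(hr, n_fixed)) for hr in hazard_ratios]

print(f"Fixed total sample size: {n_fixed}")
for hr, pw in power_table:
    print(f"HR = {hr:.1f}   power = {pw:.3f}")
```

A summary sentence for a protocol would need to restate these assumptions (allocation, baseline risk, follow-up, test, and approximation used), since different reasonable inputs lead to different sample sizes.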

Below are analysis results from the complete Framingham dataset, with a binary outcome chd4 indicating whether or not CHD occurred within four years and a survival outcome timechd4 giving time to CHD, with observations appropriately censored at four years (or earlier, if death without CHD occurred earlier), by obese status (= 1 for obese, = 0 for non-obese). These analyses exclude subjects with prevalent CHD or missing BMI values.

. cs chd4 obese, or

                 | obese                  |
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        39         168  |        207
        Noncases |       500        3514  |       4014
-----------------+------------------------+------------
           Total |       539        3682  |       4221
                 |                        |
            Risk |  .0723562    .0456274  |   .0490405
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .0267288       |    .0038421    .0496156
      Risk ratio |         1.585807       |    1.132751    2.220067
      Odds ratio |           1.6315       |    1.139401    2.336332 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     7.20  Pr>chi2 = 0.0073
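Most of the quantities in the `cs` output above follow directly from the four cell counts. The sketch below recomputes them, assuming the chi-squared statistic is the uncorrected Pearson statistic and the risk ratio interval is the usual log-scale Wald interval. The Cornfield interval for the odds ratio is not reproduced. The function name `cohort_measures` is ours.

```python
import math
from statistics import NormalDist


def cohort_measures(a, n1, c, n0, level=0.95):
    """Risk measures when a of n1 exposed and c of n0 unexposed are cases."""
    b, d = n1 - a, n0 - c
    risk1, risk0 = a / n1, c / n0
    norm = NormalDist()
    z = norm.inv_cdf(1 - (1 - level) / 2)
    rr = risk1 / risk0
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    n = n1 + n0
    # Pearson chi-squared for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * n1 * n0)
    return {
        "risk_difference": risk1 - risk0,
        "risk_ratio": rr,
        "risk_ratio_ci": (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)),
        "odds_ratio": (a * d) / (b * c),
        "chi2": chi2,
        # For 1 degree of freedom, P(chi2 > x) = 2 * (1 - Phi(sqrt(x))).
        "p_value": 2 * (1 - norm.cdf(math.sqrt(chi2))),
    }


# Full data: 39 of 539 obese and 168 of 3682 non-obese subjects are cases.
full = cohort_measures(39, 539, 168, 3682)
for name, value in full.items():
    print(name, value)
```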

. logistic chd4 obese

Logistic regression                             Number of obs     =       4221
                                                LR chi2(1)        =       6.46
                                                Prob > chi2       =     0.0111
Log likelihood = -822.73894                     Pseudo R2         =     0.0039

------------------------------------------------------------------------------
        chd4 | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       obese |     1.6315   .3002934     2.66   0.008     1.137405    2.340232
       _cons |   .0478088   .0037757   -38.50   0.000     .0409529    .0558124
------------------------------------------------------------------------------
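The z statistic, P-value, and confidence interval in the `obese` row above can be recovered from the reported odds ratio and its standard error. This sketch assumes the standard error shown is on the ratio scale and was obtained by the delta method (ratio times the standard error of the log ratio), with inference carried out on the log scale. The helper name `ratio_inference` is ours.

```python
import math
from statistics import NormalDist


def ratio_inference(estimate, std_err, level=0.95):
    """z, two-sided p-value, and CI for a ratio with a delta-method SE."""
    norm = NormalDist()
    se_log = std_err / estimate          # back to the log scale
    z_stat = math.log(estimate) / se_log
    p_value = 2 * (1 - norm.cdf(abs(z_stat)))
    z_crit = norm.inv_cdf(1 - (1 - level) / 2)
    lower = estimate * math.exp(-z_crit * se_log)
    upper = estimate * math.exp(z_crit * se_log)
    return z_stat, p_value, lower, upper


# Odds ratio row from the logistic regression output.
z_stat, p_value, lower, upper = ratio_inference(1.6315, 0.3002934)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}, 95% CI ({lower:.4f}, {upper:.4f})")

# The same arithmetic applies to the hazard ratio row of the Cox output.
hr_z, hr_p, hr_lower, hr_upper = ratio_inference(1.617147, 0.2874419)
print(f"z = {hr_z:.2f}, p = {hr_p:.3f}, 95% CI ({hr_lower:.4f}, {hr_upper:.4f})")
```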

. sts graph, by(obese) risktable

         failure _d:  chd4
   analysis time _t:  timechd4

[Figure: Kaplan-Meier survival estimates by obese status (obese = 0, obese = 1); vertical axis from 0.00 to 1.00, horizontal axis analysis time from 0 to 4.]

Number at risk
analysis time      0       1       2       3       4
    obese = 0   3682    3640    3610    3564    3514
    obese = 1    539     525     519     508     500

. sts test obese

         failure _d:  chd4
   analysis time _t:  timechd4

Log-rank test for equality of survivor functions

      |   Events         Events
obese |  observed       expected
------+-------------------------
0     |       168         181.02
1     |        39          25.98
------+-------------------------
Total |       207         207.00

            chi2(1) =       7.45
            Pr>chi2 =     0.0063
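As a rough hand check on the log-rank output above, the observed and expected event counts can be combined as a sum of (O - E)^2 / E over the two groups. This is only an approximation to the log-rank statistic, which uses the hypergeometric variance rather than E, so the result need not match the reported chi-squared exactly. The function names below are ours.

```python
import math
from statistics import NormalDist


def approx_logrank_chi2(observed, expected):
    """Sum of (O - E)^2 / E over groups; a rough stand-in for the log-rank chi2."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_1df_pvalue(stat):
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(stat)))


# Observed and expected events for obese = 0 and obese = 1.
stat = approx_logrank_chi2([168, 39], [181.02, 25.98])
print(f"approximate chi2 = {stat:.2f}, p = {chi2_1df_pvalue(stat):.4f}")
print(f"p-value for the reported chi2 of 7.45: {chi2_1df_pvalue(7.45):.4f}")
```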

. stcox obese

         failure _d:  chd4
   analysis time _t:  timechd4

Cox regression -- Breslow method for ties

No. of subjects =         4221                  Number of obs    =        4221
No. of failures =          207
Time at risk    =  16483.55373
                                                LR chi2(1)       =        6.60
Log likelihood  =   -1719.5688                  Prob > chi2      =      0.0102

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       obese |   1.617147   .2874419     2.70   0.007     1.141436    2.291118
------------------------------------------------------------------------------
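One way to see why the risk ratio, hazard ratio, and odds ratio land close together here is to compute all three from the two four-year risks in the `cs` output. The sketch below assumes proportional hazards and complete four-year follow-up, so that the survival curves satisfy S1(t) = S0(t)^HR. The hazard ratio it returns is therefore an implied value, not the Cox estimate, which uses the actual event times and censoring. The function name `effect_measures` is ours.

```python
import math


def effect_measures(risk_exposed, risk_unexposed):
    """Risk ratio, implied hazard ratio, and odds ratio from two risks."""
    rr = risk_exposed / risk_unexposed
    # Implied HR: log survival in the exposed group over log survival in the
    # unexposed group (proportional hazards, no censoring before the end).
    hr = math.log(1 - risk_exposed) / math.log(1 - risk_unexposed)
    odds_ratio = (risk_exposed / (1 - risk_exposed)) / (
        risk_unexposed / (1 - risk_unexposed)
    )
    return rr, hr, odds_ratio


# Four-year risks for obese and non-obese subjects from the cs output.
rr, hr, odds_ratio = effect_measures(0.0723562, 0.0456274)
print(f"RR = {rr:.3f}, implied HR = {hr:.3f}, OR = {odds_ratio:.3f}")
```

With risks this small, 1 - p is close to 1 in both groups, so the three measures differ only slightly, and the implied hazard ratio falls between the risk ratio and the odds ratio.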