Intro to Probability and Statistics
Sample Final #2 Questions Only
Professor Brian Shydlo
Instructions:
1) Please write your name: _____________________________________
2) There are 7 questions totaling 100 points. Please be careful to answer all questions. Partial credit will be given.
Question 1) 12 Points (Correlation and Covariance)
Question 2) 8 Points (Expected Value and Standard Deviation of a Portfolio of Two Assets)
Question 3) 15 Points (Sample Means and Confidence Intervals)
Question 4) 15 Points (T Distribution)
Question 5) 29 Points (Linear Regression)
Question 6) 12 Points (Multiple Linear Regression)
Question 7) 9 Points (Miscellaneous)
Total 100 Points
Question 1) (12 Points in Total)
You have the following table of X and Y values. (For example, there is a 40% chance that X will be 8 and Y will be 11, and so on.)
X | Y  | Prob(y,x)
2 | 3  | 20%
3 | 5  | 20%
4 | 10 | 20%
8 | 11 | 40%
To help you out, I have calculated the Variance and Mean (or Expected Value) of each.
μx = 5
μy = 8
σx² = 6.4
σy² = 11.2
Question 1a) (6 Points)
What is Covariance(X,Y)?
Answer: ____________________________________
Question 1b) (4 Points)
What is the Correlation Coefficient of X,Y?
Answer: ____________________________________
Question 1c) (2 Points)
Suppose your answer to part 1b was a Correlation Coefficient of 1.2. What would you conclude about your answer?
Answer: __________________________________________________________
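The quantities in Question 1 can be computed directly from the joint probability table. The following is a minimal sketch, assuming the usual discrete definitions Cov(X,Y) = sum of p·(x − μx)·(y − μy) and Correlation = Cov(X,Y) / (σx·σy). The function and variable names are illustrative and are not part of the exam.

```python
import math

# Joint distribution from the Question 1 table: (x, y, probability).
table = [(2, 3, 0.20), (3, 5, 0.20), (4, 10, 0.20), (8, 11, 0.40)]

def joint_moments(rows):
    """Return (mean_x, mean_y, var_x, var_y, cov_xy) for a discrete joint table."""
    mean_x = sum(p * x for x, _, p in rows)
    mean_y = sum(p * y for _, y, p in rows)
    var_x = sum(p * (x - mean_x) ** 2 for x, _, p in rows)
    var_y = sum(p * (y - mean_y) ** 2 for _, y, p in rows)
    cov_xy = sum(p * (x - mean_x) * (y - mean_y) for x, y, p in rows)
    return mean_x, mean_y, var_x, var_y, cov_xy

mean_x, mean_y, var_x, var_y, cov_xy = joint_moments(table)

# Correlation is the covariance scaled by both standard deviations.
corr_xy = cov_xy / math.sqrt(var_x * var_y)

print(f"mean_x={mean_x:.2f}, mean_y={mean_y:.2f}")
print(f"var_x={var_x:.2f}, var_y={var_y:.2f}")
print(f"Cov(X,Y)={cov_xy:.3f}, correlation={corr_xy:.3f}")
```

The means and variances printed by the sketch should reproduce the values given in the question, which is a useful check on the table entry. A correlation coefficient always lies between -1 and 1, so any value outside that range points to an arithmetic error.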
Question 2) (8 Points in Total)
A certain stock, X, has an expected return of 40% per year and a standard deviation of 50%.
A certain bond, Y, has an expected return of 10% per year and a standard deviation of 10%.
They have a Correlation Coefficient of 0.6.
You could write this as:
μx = 40%, μy = 10%, σx = 50%, σy = 10%, and ρx,y = 0.6
Question 2a) (3 Points)
You decide to invest $40 in X and $60 in Y ($100 in total).
How much money do you expect to have in one year?
Answer: ____________________________________
Question 2b) (5 Points)
You decide to invest $40 in X and $60 in Y ($100 in total).
What is the Standard Deviation of your portfolio?
Answer: ____________________________________
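The two-asset portfolio in Question 2 can be worked in dollar terms. The following is a minimal sketch, assuming the standard two-asset variance formula Var = a²σx² + b²σy² + 2·a·b·ρ·σx·σy, where a and b are the dollar amounts invested. The function name and parameters are illustrative only.

```python
import math

def portfolio_stats(amount_x, amount_y, mu_x, mu_y, sd_x, sd_y, rho):
    """Expected end-of-year value and standard deviation, both in dollars."""
    # Each dollar invested is expected to grow to (1 + expected return).
    expected_value = amount_x * (1 + mu_x) + amount_y * (1 + mu_y)
    # Dollar standard deviation of each position on its own.
    dollar_sd_x = amount_x * sd_x
    dollar_sd_y = amount_y * sd_y
    # Two-asset variance: the cross term carries the correlation.
    variance = (dollar_sd_x ** 2 + dollar_sd_y ** 2
                + 2 * rho * dollar_sd_x * dollar_sd_y)
    return expected_value, math.sqrt(variance)

expected_value, portfolio_sd = portfolio_stats(
    amount_x=40, amount_y=60,
    mu_x=0.40, mu_y=0.10,
    sd_x=0.50, sd_y=0.10,
    rho=0.6,
)
print(f"Expected value: ${expected_value:.2f}")
print(f"Standard deviation: ${portfolio_sd:.2f}")
```

Dividing the dollar standard deviation by the $100 invested gives the same result expressed as a percentage return.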
Question 3) (15 Points in Total)
A random sampling of 100 American shrubs revealed the average height of a shrubbery to be 60 centimeters. The Standard Deviation of shrubberies is well known to be 15 centimeters (meaning that the Standard Deviation of the population of shrubberies is known with certainty to be 15 centimeters).
Question 3a) (5 Points)
What is the Standard Error (also called the Standard Deviation) of the Sample Mean?
Answer: ____________________________________
Question 3b) (5 Points)
What is a 95% (2 standard deviation) Confidence Interval for the mean height of American shrubs? Please assume that the appropriate Z-score to use is 1.96.
Answer: ____________________________________
Question 3c) (5 Points)
What would the sample size need to be to get a 95% Confidence Interval that is exactly 9.8 centimeters wide? (Please use a Z-score of exactly 1.96.)
Answer: ____________________________________
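The three parts of Question 3 all rest on the Standard Error of the Sample Mean, σ/√n. The following is a minimal sketch, assuming a known population Standard Deviation and a Z-based interval as the question states. The helper names are illustrative only.

```python
import math

def standard_error(sigma, n):
    """Standard Error of the Sample Mean for a known population sigma."""
    return sigma / math.sqrt(n)

def z_interval(sample_mean, sigma, n, z=1.96):
    """Z-based Confidence Interval: sample_mean +/- z * standard error."""
    half_width = z * standard_error(sigma, n)
    return sample_mean - half_width, sample_mean + half_width

def sample_size_for_width(sigma, width, z=1.96):
    """Solve width = 2 * z * sigma / sqrt(n) for n."""
    return (2 * z * sigma / width) ** 2

se = standard_error(sigma=15, n=100)
low, high = z_interval(sample_mean=60, sigma=15, n=100)
n_needed = sample_size_for_width(sigma=15, width=9.8)

print(f"Standard Error: {se:.2f} cm")
print(f"95% Confidence Interval: [{low:.2f}, {high:.2f}] cm")
print(f"Sample size for a 9.8 cm wide interval: {n_needed:.1f}")
```

Because the width shrinks with the square root of n, halving the width of an interval requires four times as many observations.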
Question 4) (15 Points in Total)
I took a sample and created two Confidence Intervals, each with the same Standard Deviation and Point Estimate of the mean of a distribution. Both were 95% Confidence Intervals.
The only thing that was different between the two Confidence Intervals was that for one of them I used the Z-distribution (Standard Normal) and for the other one I used the T-distribution:
Confidence Interval A: P[68.24 ≤ μ ≤ 91.76] = 95%
Confidence Interval B: P[60.91 ≤ μ ≤ 99.10] = 95%
Question 4a) (4 Points)
For which Confidence Interval did I use the T distribution (A or B)?
Answer: __________________
Question 4b) (4 Points)
For the Confidence Interval that used the Z distribution, I used a Z-score of exactly 1.96 (meaning Zα/2 is 1.96). What was the Standard Deviation I used?
Answer: __________________
Question 4c) (4 Points)
I didn't say how many degrees of freedom I used for calculating the T distribution, but given the following choices, which one do you think I used?
Choice A: 3 Degrees of Freedom
Choice B: 103 Degrees of Freedom
Answer: __________________
Question 4d) (3 Points)
If I had used 103 Degrees of Freedom, then what must n have been (that is, how many items were in the sample, or what was the sample size)?
Answer: __________________
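The two intervals in Question 4 can be unpacked by working backwards from their endpoints. The following is a minimal sketch, assuming that each interval is the Point Estimate plus or minus a critical value times the same Standard Deviation, and that the narrower interval is the Z-based one (T critical values are always larger than the Z value at the same confidence level). The names are illustrative only.

```python
def interval_summary(low, high):
    """Return (point_estimate, half_width) of a symmetric interval."""
    return (low + high) / 2, (high - low) / 2

center_a, half_a = interval_summary(68.24, 91.76)
center_b, half_b = interval_summary(60.91, 99.10)

# Assumption: the narrower interval used Z, so half_width = 1.96 * sd.
z = 1.96
sd_used = min(half_a, half_b) / z

# The wider interval used the same sd, so its critical value is implied.
implied_t = max(half_a, half_b) / sd_used

def sample_size_from_df(degrees_of_freedom):
    """For a one-sample T interval, degrees of freedom = n - 1."""
    return degrees_of_freedom + 1

print(f"Point estimates: {center_a:.2f} and {center_b:.2f}")
print(f"Standard Deviation used: {sd_used:.2f}")
print(f"Implied T critical value: {implied_t:.3f}")
print(f"n for 103 degrees of freedom: {sample_size_from_df(103)}")
```

Comparing the implied critical value with a T table settles the degrees of freedom: the 95% two-sided value is about 3.18 for 3 degrees of freedom and just under 2 for 100 or more.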
Question 5) (29 Points in Total)
I did a regression on the following data:
#  | X     | Y
1  | 2.060 | 2.044
2  | 0.064 | 0.059
3  | 1.202 | 1.124
4  | 0.800 | 0.500
5  | 4.099 | 5.080
6  | 3.516 | 3.500
7  | 2.157 | 2.362
8  | 1.378 | 1.410
9  | 3.481 | 4.084
10 | 0.301 | 0.319
and got the following information:
The regression equation is
Y = - 0.227 + 1.19 X
Predictor    Coef       StDev      T        P
Constant     -0.2268    0.1494     -1.52    0.167
X            1.19374    0.06413    18.62    0.000
S = 0.272    R-Sq = XXxXXX
Analysis of Variance
Source            DF    SS        MS        F         P
Regression        1     25.558    25.558    XXXxXX    0.000
Residual Error    XX    0.590     0.074
Total             XX    26.148
Unfortunately, the printer is an old model and it smudged the output. I would print another copy, but I realized that it was the last piece of paper.
Please help me reassemble the original output.
Question 5a) (3 Points)
What is the Degrees of Freedom Total?
Answer: __________________
Question 5b) (3 Points)
What is the R-Squared?
Answer: __________________
Question 5c) (3 Points)
What is the F-score?
Answer: __________________
Question 5d) (4 Points)
Please predict Y using the Regression Equation when X = 3.
Answer: __________________
Question 5e) (4 Points)
Please give me a 95% Confidence Interval for Y when X = 3. Please use a Z-score of 1.96 for your Confidence Interval.
Answer: ________________________
Question 5f) (4 Points)
How might you respond to someone who asked you to predict a Y with an X of 30?
Answer: ___________________________________________
Question 5g) (3 Points)
You do another, unrelated regression and get the following information:
R-Squared = 81%
The regression equation is
Y = 10.1 - 0.879 X
What is the value of R (the correlation coefficient)?
Answer: __________________
Question 5h) (5 Points)
I did another, unrelated regression and got the chart below. Is there anything about this chart, which shows the errors (also called the residuals), that would make you question the validity of this Regression?
Answer: ___________________________________________
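The smudged entries in the Question 5 output are tied to the legible ones by the usual Analysis of Variance identities. The following is a minimal sketch, assuming the standard simple-regression relationships (DF Total = n − 1, R-Squared = SS Regression / SS Total, F = MS Regression / MS Error) and a rough interval of the prediction plus or minus 1.96 times S, which ignores the uncertainty in the fitted coefficients. The names are illustrative only.

```python
import math

n = 10                   # observations in the data table
ss_regression = 25.558   # from the Analysis of Variance output
ss_error = 0.590
ss_total = 26.148
s = 0.272                # S from the regression output

df_total = n - 1
df_regression = 1        # one predictor, X
df_error = df_total - df_regression

r_squared = ss_regression / ss_total
ms_error = ss_error / df_error
f_stat = (ss_regression / df_regression) / ms_error

def predict(x, intercept=-0.2268, slope=1.19374):
    """Fitted value from the regression coefficients in the output."""
    return intercept + slope * x

y_hat = predict(3)
# Rough 95% interval: prediction +/- 1.96 * S (an approximation).
low, high = y_hat - 1.96 * s, y_hat + 1.96 * s

def correlation_from_r_squared(r_sq, slope):
    """R is the square root of R-Squared, carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_sq), slope)

r_5g = correlation_from_r_squared(0.81, slope=-0.879)

print(f"DF Total={df_total}, DF Residual Error={df_error}")
print(f"R-Sq={r_squared:.1%}, F={f_stat:.2f}")
print(f"Prediction at X=3: {y_hat:.3f}, interval [{low:.3f}, {high:.3f}]")
print(f"R for Question 5g: {r_5g:.2f}")
```

Using the rounded MS of 0.074 printed in the output instead of 0.590 / 8 gives a slightly different F-score, and using the rounded equation Y = -0.227 + 1.19 X gives a slightly different prediction. The data only cover X values between roughly 0 and 4, so the same function says nothing reliable about an X of 30.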
Question 6) (12 Points in Total)
The following is output from a Multiple Linear Regression:
The regression equation is
Y = - 13.4 + 4.78 X + 0.674 Z
Predictor    Coef       StDev     T        P
Constant     -13.376    3.867     -3.46    0.005
X            4.7839     0.2727    17.54    0.000
Z            0.6738     0.6479    XXXX     0.319
S = 4.451    R-Sq = 96.5%
Analysis of Variance
Source            DF    SS        MS        F         P
Regression        2     6598.0    3299.0    166.54    0.000
Residual Error    12    237.7     19.8
Total             14    6835.7
Question 6a) (4 Points)
Please predict Y when X = 5 and Z = 6.
Answer: __________________
Question 6b) (5 Points)
Please comment on the value of adding the Z variable to the Regression Model (versus leaving it out).
Answer: ________________________________________
Question 6c) (3 Points)
What is the T-score of the variable "Z"? (In the regression output, it is the value with the XXXX in it.)
Answer: __________________
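The prediction and the missing T-score in Question 6 both come straight from the coefficient table. The following is a minimal sketch, assuming the T-score is the coefficient divided by its StDev, which is consistent with the legible rows of the output. The function names are illustrative only.

```python
def predict(x, z, intercept=-13.376, coef_x=4.7839, coef_z=0.6738):
    """Fitted value from the Multiple Linear Regression coefficients."""
    return intercept + coef_x * x + coef_z * z

def t_score(coef, st_dev):
    """T-score of a coefficient: estimate divided by its StDev."""
    return coef / st_dev

y_hat = predict(x=5, z=6)
t_z = t_score(0.6738, 0.6479)

# Check the formula against rows whose T-score is legible in the output.
t_constant = t_score(-13.376, 3.867)
t_x = t_score(4.7839, 0.2727)

print(f"Prediction at X=5, Z=6: {y_hat:.3f}")
print(f"T-score for Z: {t_z:.2f}")
print(f"Check: Constant T={t_constant:.2f}, X T={t_x:.2f}")
```

The two check values should match the -3.46 and 17.54 printed in the output, which supports using the same ratio for Z. A small T-score together with the P value of 0.319 reported for Z is the kind of evidence Question 6b asks you to weigh.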
Question 7) (9 Points in Total)
Question 7a) (3 Points)
It is possible to have Covariance(X,Y) = 200 and Correlation Coefficient(X,Y) = -0.6.
Please indicate if this is True or False.
Answer: __________________
Question 7b) (3 Points)
You do 2 simple Linear Regressions:
Y = 0 + 3X
Y = 0 + 5Z
If you do a Multiple Linear Regression, you would expect to see this as your Regression Equation: Y = 0 + 3X + 5Z
Please indicate if this is True or False.
Answer: __________________
Question 7c) (3 Points)
You run a Multiple Linear Regression of three variables (e.g., A, B, C) and then add a fourth variable (e.g., D). It is possible for your new R-Squared to be higher, even if there is actually no "true" relationship between D and any other variables.
Please indicate if this is True or False.
Answer: __________________