Intro to Probability and Statistics

 

Sample Final #3 – Questions Only

Professor Brian Shydlo

brian@shydlo.com

 

 

 

 

Instructions:

1) Please write your name: _____________________________________

 

2) There are 7 questions totaling 100 points. Please be careful to answer all questions. Partial credit will be given.

 

Question 1) 16 Points

Question 2) 18 Points

Question 3)   6 Points

Question 4) 15 Points

Question 5) 18 Points

Question 6) 21 Points

Question 7)  6 Points

Total:               100 Points

 

 

 

 

 


 

Question 1) (16 points in total)

 

A certain stock, X,  has an expected return of 20% per year and a standard deviation of 25%.

 

A certain bond, Y,  has an expected return of 5% per year and a standard deviation of 9%.

 

They have a correlation of -0.2.

 

You could write this as:

mx = 20, my = 5, sx = 25, sy = 9, and rxy = -0.2.
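
As a reminder of how this notation combines, here is a minimal Python sketch of the mean and standard deviation of a two-asset portfolio. The function name, the weights, and every input value below are illustrative assumptions, deliberately different from the numbers in this question.

```python
import math

def portfolio_stats(w_x, m_x, s_x, m_y, s_y, r_xy):
    """Mean and standard deviation of a portfolio with weight w_x in X
    and weight (1 - w_x) in Y. Hypothetical helper for illustration."""
    w_y = 1.0 - w_x
    mean = w_x * m_x + w_y * m_y
    # Covariance follows from the correlation and the two standard deviations.
    cov_xy = r_xy * s_x * s_y
    variance = (w_x * s_x) ** 2 + (w_y * s_y) ** 2 + 2 * w_x * w_y * cov_xy
    return mean, math.sqrt(variance)

# Illustrative inputs only (not the exam's numbers).
mean, sd = portfolio_stats(w_x=0.4, m_x=10.0, s_x=20.0, m_y=4.0, s_y=5.0, r_xy=0.0)

# A "2 standard deviation" interval around the expected return.
low, high = mean - 2 * sd, mean + 2 * sd
print(mean, sd, (low, high))
```

The same function applies to any split between X and Y; only the weight and the five summary numbers change.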

 

 

Question 1a)  (3 Points)

You decide to invest $100 in either X or Y or some combination of both.  How do you allocate your $100 to maximize your expected return?

 

 

 

Answer:  __________________________________________

 

 

Question 1b)  (3 Points)

You remember hearing something about diversification in an investments class.  So you decide to split your money and invest $50 in X and $50 in Y.  How much money do you expect to have after one year (your initial investment of $100 + the expected return of your portfolio of X and Y)?

 

 

 

Answer:  __________________________________________

 

 

Question 1c)  (6 Points)

What are the standard deviation and variance of the portfolio from part b?

 

 

 

 

Answer:  __________________________________________


Question 1d)  (4 Points)

What is a 95% (2 standard deviation) confidence interval for your return?  That is, give me a confidence interval for the portfolio of 50% in X and 50% in Y.

 

 

 

Answer:  __________________________________________

 

Question 2) (18 Points in total)

A city decides to determine the mean expenditure per tourist per visit.  A random sample of 100 tourists finds that the average expenditure is $800.  The standard deviation of expenditures for all tourists is $120.

 

Question 2a) (6 Points)

What is the standard deviation of the mean, given that the standard deviation of the whole population is $120 and the number of people sampled is 100?

 

 

 

 

Answer:  __________________________________________

 

Question 2b) (6 Points)

What is a 95% (2 standard deviation) confidence interval for the mean expenditure per tourist?

Base this on the sample of 100 tourists.  I am not looking for the 2 standard deviation interval for a single tourist drawn from the population.

 

 

 

Answer:  __________________________________________

 

Question 2c) (6 Points)

If the city wants the error of estimation to be $20, how many people does it need to sample?   (If this helps, remember that the error of estimation is equal to half the width of the confidence interval.)

 

 

 

Answer:  __________________________________________
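
The hint in Question 2c (error of estimation = half the width of the confidence interval) can be sketched in Python as follows. This is a minimal illustration that assumes the "2 standard deviation" convention used on this exam; the function names and the input values are hypothetical and are not the numbers from Question 2.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for n independent draws."""
    return sigma / math.sqrt(n)

def sample_size_for_margin(sigma, margin, z=2.0):
    """Smallest n for which z * sigma / sqrt(n) is at most `margin`."""
    return math.ceil((z * sigma / margin) ** 2)

# Illustrative inputs only (not the exam's numbers).
se = standard_error(sigma=50.0, n=25)
n_needed = sample_size_for_margin(sigma=50.0, margin=5.0)
print(se, n_needed)
```

Rounding up with `math.ceil` is a design choice: a fractional sample size has to be rounded to the next whole person so that the margin is not exceeded.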


Question 3) (6 points in total)

 

On the following graph, one of the lines is the Z distribution and the other is the T Distribution with 2 degrees of freedom.

 

 

Question 3a) (6 Points)

Is Line 1 the Z distribution or the T distribution with 2 degrees of freedom?

 

 

 

Answer:  __________________________________________

 

Question 4) (15 Points)

The average stopping distance for a certain make of automobile is 65 feet.  A company designs a new brake thought to be more effective than the type they currently use.  To test this brake, they install it on 64 cars.  The new brakes give an average stopping distance of 63 feet with a standard deviation of 4 feet.

 

 

Question 4a) (5 Points)

Formulate a hypothesis for a lower-tail test (i.e., write the null hypothesis and the alternative hypothesis).

 

 

 

 

Answer:  __________________________________________


Question 4b) (5 Points)

Construct the Z statistic to test how many standard deviations the sample mean of 63 is away from the original mean of 65.

 

 

 

Answer:  __________________________________________

 

Question 4c) (5 Points)

Are the new brakes significantly better than the old brakes?  Significant in this case means at the 95% confidence level.

 

 

 

Answer:  __________________________________________
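
The three steps of Question 4 (state the hypotheses, build the Z statistic, compare against a significance level) can be sketched in Python as follows. This is a minimal illustration of a one-sample, lower-tail z-test; the function name and the input values are hypothetical and are not the numbers from Question 4. It uses `statistics.NormalDist` from the standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu_0, s, n, alpha=0.05):
    """One-sample lower-tail z-test.
    H0: mu >= mu_0   versus   Ha: mu < mu_0.
    Returns the z statistic, the lower-tail p-value, and whether H0 is rejected."""
    z = (sample_mean - mu_0) / (s / math.sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z under the standard normal
    return z, p_value, p_value < alpha

# Illustrative inputs only (not the exam's numbers).
z, p_value, reject = lower_tail_z_test(sample_mean=98.0, mu_0=100.0, s=10.0, n=25)
print(z, p_value, reject)
```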

 


Question 5) (18 points) Someone starts a new mutual fund that invests based on the winner of the Super Bowl.

 

Quoted from a newspaper from January 2001:

"One of the quirkiest stock market indicators is the so called Super Bowl predictor for the market. If the Super Bowl is won by a team from the original, pre-merger National Football League, the market will close higher by the end of the year. If a team from the old American Football League wins, then the market will fall."

 

Here is data from the market over the last 10 years:

#    Year   Stock Market Return   Old NFL Wins (Yes/No)   Old NFL Wins (1/0)   Prediction Correct
1    1991   30                    Yes                     1                    Yes
2    1992   -2                    Yes                     1                    No
3    1993   20                    Yes                     1                    Yes
4    1994   -2                    No                      0                    Yes
5    1995   50                    Yes                     1                    Yes
6    1996   40                    Yes                     1                    Yes
7    1997   -20                   Yes                     1                    No
8    1998   40                    Yes                     1                    Yes
9    1999   60                    Yes                     1                    Yes
10   2000   -5                    No                      0                    Yes

 

If the old NFL (National Football League) wins, the mutual fund puts its money into the stock market; that is, they bet it will go up.  Otherwise, they bet it will go down.  If you look at the table above, you'll see that the prediction was correct 8 out of 10 times, or 80% of the time.  It was only wrong in 1992 and 1997.

You decide to test out this theory by doing a simple linear regression.  For the X (predictor variable) you use Old NFL Wins (0 for no and 1 for yes). For the Y you use Stock Market Return.  You run this through Minitab and this is the output you get.

 

The regression equation is

Stock Return = - 3.5 + 30.7 Old NFL Wins?

 

Predictor        Coef       StDev          T        P

Constant        -3.50       17.80      -0.20    0.849

NFL Wins        30.75       19.90       1.55    0.161

 

S = 25.17       R-Sq = 23.0%     R-Sq(adj) = 13.4%

 

Analysis of Variance

Source            DF          SS          MS         F        P

Regression         1      1512.9      1512.9      2.39    0.161

Residual Error     8      5068.0       633.5

Total              9      6580.9
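
The coefficients and sums of squares in the output above can be cross-checked against the table of returns. The following is a minimal plain-Python sketch of the least-squares arithmetic; the variable names are my own, and it only reproduces the printed numbers rather than showing how Minitab computes them internally.

```python
# Data from the table: stock market return (Y) and Old NFL Wins coded 1/0 (X), 1991-2000.
returns = [30, -2, 20, -2, 50, 40, -20, 40, 60, -5]
nfl_wins = [1, 1, 1, 0, 1, 1, 1, 1, 1, 0]

n = len(returns)
x_bar = sum(nfl_wins) / n
y_bar = sum(returns) / n

# Sums of squared deviations and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in nfl_wins)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(nfl_wins, returns))

# Least-squares slope and intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Decomposition of the total variation: SS-Total = SS-Regression + SS-Residual.
ss_total = sum((y - y_bar) ** 2 for y in returns)
ss_regression = slope * s_xy
ss_residual = ss_total - ss_regression

print(round(intercept, 2), round(slope, 2))
print(round(ss_regression, 1), round(ss_residual, 1), round(ss_total, 1))
```

Up to rounding, the printed values should agree with the Coef column and the SS column of the Minitab output.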


Question 5a) (6 Points) Based on the Minitab output, comment on whether or not the regression model is valid in this case, and interpret the output.

 

 

 

 

 

Answer:  __________________________________________

 

Question 5b) (6 Points)  This year, the Baltimore Ravens won, and they are not part of the old NFL (they were part of the AFL), so the rule says that the stock market should go down.

 

What is your prediction of the return on the stock market based on the regression equation from the Minitab output above?

 

(Regardless of whether or not the regression is valid, I still want your prediction.)

 

 

 

 

 

Answer:  __________________________________________

 

Question 5c) (6 Points) What is a 95% (2 standard deviation) confidence interval for your estimate from part b?

 

 

 

 

 

Answer:  __________________________________________


Question 6) (21 points in total)

You are the head of marketing for BSEOC Consulting. You are summoned to the CEO's office on the other side of the building.  He asks to see the Minitab analysis of the regression of sales based on the amount of advertising.  He thinks he is extremely proficient in statistics, and is capable of forming his own conclusions based on the data, so he doesn't just want your interpretation of the Minitab output, he wants to see it for himself. 

 

You print out the Minitab output and rush over to the CEO's office.  It is a long walk, and just as you reach the office your clumsy coffee-drinking pal bumps into you, knocking you over and causing your sheet of paper with the output to fall to the floor AND your clumsy pal spills coffee on your Minitab output. It is too late to run back to your desk to print out another copy.  You decide to write in by hand the fields that are blurred by the coffee.

 

The regression equation is

Sales = 29.0 + 1.50 Advert

 

Predictor        Coef       StDev          T        P

Constant       28.959       6.637       XXXX1    0.002

Advert         1.4951      0.6355       XXXX2    0.046

 

S = XXXX3       R-Sq = XXXXX4     R-Sq(adj) = 33.5%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         1      XXXXXX6      220.40      XXXX9    0.046

Residual Error     X5     XXXXXX7       39.82

Total              9      XXXXXX8

 


Question 6a) (9 Points) 

What are the values of the following?

      1) The T-statistic for the constant?

 

 

                                Answer_______________________

      2) The T-statistic for advertising?

 

 

                                Answer_______________________

      3) S, the Standard Error of the Regression?

 

 

                                Answer_______________________

      4) R-Squared?

 

 

                                Answer_______________________

      5) The Degrees of Freedom for the Residual Error?

 

 

                                Answer_______________________

      6) SS-Regression (Sum of Squares of the Regression)?

 

 

                                Answer_______________________

      7) SS-Residual Error (Sum of Squares of the Residual Error)?

 

 

                                Answer_______________________

      8) SS-Total (Total Sum of Squares)?

 

 

                                Answer_______________________

      9) The F-statistic?

 

 

                                Answer_______________________


Question 6b) (6 Points)

You and the CEO both agree that the linear regression model is appropriate for the data originally used.

The CEO decides to predict sales based on the advertising money spent.

 

You have 10 years' worth of data as shown below.  This is the data that you used for the previous regression.

 

(All Numbers in Millions)

Year   Sales   Advert
1      32.2    3.2
2      33      5.4
3      41.8    12.1
4      38.2    9.6
5      45.8    12.8
6      44.5    12.6
7      46      13.3
8      49      9.6
9      52      11
10     56      10

 

The CEO says,

"Well, based on the regression equation of:

 

Sales = 29.0 + 1.50 Advertising

 

if our budget for Advertising is 100, then we should be able to pull in sales of 29 + 1.50 * 100  which equals 179.

179 is great for us."

 

What do you tell your CEO with regard to his prediction?

 

(What is the main warning you would give the CEO about why the prediction may not match the actual value?)

 

 

 

 

 

 

 

 

 

Answer:  __________________________________________

 


Question 6c) (6 Points) You decide to try to improve the relatively small R-squared (it is only 40.9%) by adding another predictor variable; in other words, you decide to perform a multiple linear regression.  You add in promotions and get the following output.

 

The regression equation is

Sales = 22.5 + 1.42 Advert + 2.06 Promotions

 

Predictor        Coef       StDev          T        P

Constant       22.489       7.602       2.96    0.021

Advert         1.4187      0.5960       2.38    0.049

Promotion       2.062       1.401       1.47    0.185

 

S = 5.896       R-Sq = 54.9%     R-Sq(adj) = 42.0%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         2      295.65      147.82      4.25    0.062

Residual Error     7      243.34       34.76

Total              9      538.98

 

Is it appropriate to add promotions to the regression model?  Answer "Yes" or "No" and give a reason.

 

 

 

 

 

Answer:  __________________________________________

 

 


Question 7) (6 Points) You run a simple linear regression on tree width versus age. Minitab produces the following graph of the residuals for you.

 

What does this graph tell you concerning the validity of the Linear Regression Model in this case?

 

 

 

 

 

Answer:  __________________________________________