Exercise 13 Questions to Be Graded only #2,3,5,8

Individual Excel Problem
You should make up your own data for this Excel project.
Create a workbook to contain your worksheets related to this project. Your workbook and worksheets should look professional in terms of formatting and titles, etc. Date the workbook. Name the workbook Excel Project and your first name. Name each worksheet according to the task you are performing (such as subtotals). Put your name on each worksheet.
Include the following in your worksheets:
Use a separate worksheet to show results of each task. Directly on the worksheet explain each numbered item and worksheet specifically so that I can follow your logic. For example, the worksheet showing functions – what five functions did you use and what is the purpose for each? Explain the data you are using.
1. Use a minimum of five functions in your first worksheet (such as SUM, MIN, etc.)
2. Create a Chart to help visualize your data.
3. Use the sort command on more than one column. Create conditional formatting along with this sort.
4. Use AutoFilter to display a group of records with particular meaning.
5. Use subtotals to highlight subtotals for particular categories.
6. Develop a Pivot Table and Pivot Chart to visualize data in a more meaningful way.
7. Use the IF function to return a particular value.
8. Use the Goal Seek command.
9. Submit your workbook and Word document on Blackboard so that I can evaluate the cells.
1) Let F be the function satisfying (AB|X) = F((A|X), (B|AX)) for all A, B, and X, as in Rule II(b). Letting x = (A|X), y = (B|AX), and z = (C|ABX), show that
F(x, F(y, z)) = F(F(x, y), z)

[Hint: both sides are equal to (ABC|X)]
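
One way to unpack the hint, sketched under the assumption that Rule II(b) may be applied with a conjunction in either slot and with AX as the background in place of X:

```latex
\begin{align*}
(ABC|X) &= F\big((AB|X),\,(C|ABX)\big) = F\big(F(x,y),\,z\big), \\
(ABC|X) &= F\big((A|X),\,(BC|AX)\big) = F\big(x,\,F((B|AX),(C|ABX))\big) = F\big(x,\,F(y,z)\big).
\end{align*}
```

The first line groups ABC as (AB)C, the second as A(BC); equating the two right-hand sides gives the associativity relation.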

2) Give an example of a background assumption X and two propositions A and B such that XA is logically equivalent to XB but A is not logically equivalent to B. Does it follow that P[A|X] = P[B|X]?

3) Give an argument using examples for why the plausibility (AB|X) should only depend on the plausibilities (A|X) and (B|AX).

Common Sense Reasoning: Rule II(b)

• Similarly, we assume a functional relationship: (AB|X) = F((A|X), (B|AX))
• Note: We are always conditioning on X!
• Why not just use (A|X) and (B|X)? We need an interaction effect. Example: Chance of brown left eye, chance of blue right eye; chance of both?
• Assume F is nondecreasing in each variable: (A|X) < (A'|X') and (B|AX) < (B'|A'X') implies (AB|X) < (A'B'|X')


Statistics

Review “Analyzing Student Data: RTI Model Student Scenario” to inform the assignment that follows. In a 500-750 word summary, complete the following:
• Using the 10 weeks of Tier 2 progress monitoring data, calculate Paul’s performance level and slope.
• Using the dual-discrepancy approach, determine whether Paul is responding adequately to Tier 2 instruction. Explain.
• Why do you think the support team members disagree about what tier of instruction would best meet Paul’s needs? Explain.
• What tier of instruction would you recommend for Paul: Tier 1 instruction only or another round of Tier 2 instruction? Explain.
Written permission has been acquired for the use of:
Brown, J., Skow, K., & the IRIS Center. (2009). RTI: Data-based decision making. Retrieved from http://iris.peabody.vanderbilt.edu/wpcontent/uploads/pdf_case_studies/ics_rtidm.pdf
Prepare this assignment according to the APA guidelines found in the APA Style Guide, located on the Student Success Center. An abstract is not required.
This assignment uses a rubric. Review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.
You are required to submit this assignment to Turnitin.
Analyzing Student Data: RTI Model
Student Scenario
Student: Paul
Age: 8
Grade: 3
Paul attends Lincoln Elementary School. He has received Tier 2 instruction for 10 weeks. Paul’s teacher has been monitoring his progress using the Vanderbilt University Passage Reading Fluency probe. Paul’s eighteen-week goal is 55 wpm and his expected rate of growth is 1. The school support team is meeting today to review Paul’s progress and to determine what tier of instruction would best meet his current educational needs. When they apply the dual-discrepancy approach, the support team members disagree about what tier of instruction would best meet Paul’s needs.

Analyzing Student Data: The RTI Model

Project Week 5
For these project assignments throughout the course you will need to reference the data in the ROI Excel spreadsheet (See Attachment)
Using the ROI data set:
For each of the 2 majors consider the ‘School Type’ column. Assuming the requirements are met, construct a 90% confidence interval for the proportion of the schools that are ‘Private’. Be sure to interpret your results.
For each of the 2 majors construct a 95% confidence interval for the mean of the column ‘Annual % ROI’. Be sure to interpret your results.
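
A minimal sketch of the two interval calculations, assuming the large-sample (normal) requirements are met. The counts and ROI values below are hypothetical placeholders, not values from the ROI spreadsheet, and the function names are illustrative. For a small sample, the mean interval would normally use a t critical value rather than the normal one used here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def proportion_ci(successes, n, confidence=0.90):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_ci_normal(values, confidence=0.95):
    """Normal-approximation interval for a mean (assumption: n is large enough)."""
    n = len(values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(values) / sqrt(n)
    center = mean(values)
    return center - margin, center + margin


# Hypothetical inputs for illustration only.
low, high = proportion_ci(successes=8, n=20, confidence=0.90)
print(f"90% CI for proportion of private schools: ({low:.3f}, {high:.3f})")

roi_low, roi_high = mean_ci_normal([7.1, 8.4, 6.9, 7.8, 8.0, 7.5], confidence=0.95)
print(f"95% CI for mean Annual % ROI: ({roi_low:.2f}, {roi_high:.2f})")
```

An interpretation would then read along the lines of "we are 90% confident that the true proportion of private schools for this major lies between the two endpoints."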

BUSINESS STATISTICS WEEK 5

Which of the following statements about time series analysis for demand forecasting are not true? (There may be one or more answers.)
① Time series analysis is based on the assumption that the past pattern of demand, which is a dependent variable, will continue in the future.
② The naive method is a technique to forecast demand for the next period with the latest demand, and can be used effectively when the demand is stable.
③ The moving average method is a useful technique when mostly random variation is present. The latest n periods of data can be arithmetically averaged or weighted averaged to forecast the demand for the next period.
④ Trend analysis can predict future demand by finding a straight trend line that minimizes the sum of the errors between the past demand and trend forecasts, if there is a noticeable increase or decrease in the past data.
⑤ The exponential smoothing method can analyze trends and seasonal variations, but even if the smoothing constant is reduced, the weight of recent demand data cannot be made smaller than the weight of past demand data.
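
The methods named in the statements can be sketched as follows. This is an illustrative sketch of the textbook forms of each method, not an answer key; the function names, the demand series, and the choice to initialize the smoothed forecast with the first observation are assumptions.

```python
def naive_forecast(demand):
    """Next-period forecast equals the most recent observed demand."""
    return demand[-1]


def moving_average_forecast(demand, n, weights=None):
    """Simple or weighted average of the latest n periods (weights listed oldest to newest)."""
    window = demand[-n:]
    if weights is None:
        return sum(window) / n
    return sum(w * d for w, d in zip(weights, window)) / sum(weights)


def exponential_smoothing_forecast(demand, alpha):
    """Simple exponential smoothing: F(t+1) = alpha * A(t) + (1 - alpha) * F(t)."""
    forecast = demand[0]  # assumption: start from the first observation
    for actual in demand:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


def smoothing_weights(alpha, periods):
    """Implied weight on each past observation, most recent first: alpha * (1 - alpha)^k."""
    return [alpha * (1 - alpha) ** k for k in range(periods)]


demand = [100, 110, 120, 130]  # hypothetical demand history
print(naive_forecast(demand))
print(moving_average_forecast(demand, n=2))
print(moving_average_forecast(demand, n=2, weights=[1, 3]))
print(exponential_smoothing_forecast(demand, alpha=0.5))
print(smoothing_weights(alpha=0.5, periods=3))
```

Printing `smoothing_weights` for several values of alpha is a quick way to inspect how the weighting of recent versus older demand changes with the smoothing constant.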

Stats Help

Using the feedback on your problem statement, draft a CaRS-style introduction, following Swales and Feak ch. 8 (esp. p. 331). Your introduction may still have some holes in it, especially around Move 3. But you should be getting to the point where you can articulate most of your research terrain in a focused, problem-driven way. Write in full paragraphs, with citations (APA style) wherever possible. If you know that you need a citation for something .
Remember that a good paragraph should include:
 A topic sentence that announces the claim you are going to make – the thing you are going to prove in that paragraph
 Reasons to support your claim – an explanation for why the claim is persuasive
 Evidence to support your reasons – statistics, data, quotations from experts, etc.
 Examples (optional) that illustrate the claim
Here are the moves you need to make (any given move might take more than one paragraph – think of them as sections):
Move 1: Establish a research territory
a. Argue for the centrality of your research area
i. Argue for the importance of your topic area
ii. Articulate a problem in your topic area
iii. Articulate the harms or effects of the problem
b. Argue that current (real-world) efforts to address the problem are insufficient
i. What has been done to address the problem?
ii. Why are those efforts inadequate to address the problem?
Move 2. Establish a (scholarly) niche.
a. What previous scholarly research has been conducted into your problem?
b. What are the limitations of that research? What is the gap in the existing research that you are going to fill? (NB "no one has done this before" is not a sufficient reason on its own)
Move 3. Occupy the niche.
a. Articulate the purpose of your research (how it fills the niche)
b. State your research questions and/or hypotheses
[c. State principal findings – OMIT FOR PROPOSAL]
[d. State value of research/scholarly contribution – OMIT FOR PROPOSAL]
e. State the structure of your proposal

write introduction

MMIS 671
Homework 1. Constrained Optimization Problems
A small business produces 3 types of cables: A, B, and C. The cost of in-house production is estimated to be $6, $12, and $10 per foot of A, B, and C respectively. The manufacturing process requires machining and finishing. The machining and finishing time needed to produce a foot of each type of cable are presented in the table below.
For the coming production period the firm is contractually obligated to produce 60,000 feet of A, 80,000 feet of B, and 90,000 feet of C. As only 800 hours of machine time and 400 hours of finishing time are available, these demands cannot be met by in-house production alone. The firm has the option of procuring these cables from an external supplier in order to meet the demand. The cost prices (from the external supplier) per foot of cable are as follows: $8 for A, $15 for B, and $12 for C.
The production manager has to decide how much of each type of cable to produce in-house and how much to purchase from the external supplier in order to meet the demands exactly and minimize total cost. The data is summarized below:
Cable Type                        A         B         C
Demand (ft)                       60,000    80,000    90,000
Production Cost/ft                $6        $12       $10
Procurement Cost/ft               $8        $15       $12
Machine time needed (mins/ft)     0.50      0.50      0.60
Finishing time needed (mins/ft)   0.60      0.20      0.40

1. Formulate the problem as a Linear Program:
Decision Variables:
Objective Function:
Constraints:
2. Solve the LP and report your optimal solutions:
Minimum cost attainable = $ ___________________
Decision variable values under optimal solutions:
Cable Type      A           B           C
Produce (ft)    ________    ________    ________
Procure (ft)    ________    ________    ________
Resources used:
Machine time (minutes)     ________
Finishing time (minutes)   ________
3. Sensitivity analysis:
Explain in each of the following cases whether you expect the costs to increase, decrease, or remain unchanged when the following parameters are changed (one at a time). Assume that all other parameters remain at their original values. Please be as brief and precise as possible in your explanations.
(i) The production cost for a foot of cable A increases from $6 to $7.
Cost will: increase, decrease, remain unchanged.
Reasoning:
(ii) The purchase cost for a foot of cable B increases from $15 to $16.
Cost will: increase, decrease, remain unchanged.
Reasoning:
4. Memorandum based on sensitivity analysis:
Given that the current production capacity is insufficient to meet the demand through inhouse production alone, the Chief Operations Officer (COO) wants to know whether the firm should consider increasing the availability of finishing time. Write a short (one paragraph) memorandum to the COO with your recommendation. The memorandum should also specify at most how much the firm should be willing to pay per hour to increase the availability of finishing time (beyond the current availability of 400 hours).
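
One possible way to set up and check a formulation is sketched below. It assumes the decision variables are the feet of each cable produced in-house (the remainder of each demand is procured) and converts the available hours to minutes. Instead of the simplex method or a spreadsheet solver, it uses a brute-force enumeration of the vertices of the feasible region, which is only practical because there are three variables. All names are illustrative.

```python
from itertools import combinations

# Data from the problem statement (hours converted to minutes).
demand = [60_000, 80_000, 90_000]
make_cost = [6, 12, 10]
buy_cost = [8, 15, 12]
machine = [0.50, 0.50, 0.60]
finish = [0.60, 0.20, 0.40]
machine_cap = 800 * 60
finish_cap = 400 * 60

# Every constraint is stored as (coefficients, rhs) meaning coefficients . p <= rhs,
# where p[i] is the number of feet of cable i produced in-house.
rows = [(machine, machine_cap), (finish, finish_cap)]
for i in range(3):
    unit = [1.0 if j == i else 0.0 for j in range(3)]
    rows.append((unit, demand[i]))            # p[i] <= demand[i]
    rows.append(([-u for u in unit], 0.0))    # p[i] >= 0


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(a)
    if abs(d) < 1e-12:
        return None
    return [det3([[b[r] if c == k else a[r][c] for c in range(3)] for r in range(3)]) / d
            for k in range(3)]


def total_cost(p):
    """Cost of producing p[i] feet in-house and buying the rest of each demand."""
    return sum(make_cost[i] * p[i] + buy_cost[i] * (demand[i] - p[i]) for i in range(3))


def solve_cable_lp():
    """Check every intersection of three constraints and keep the cheapest feasible one."""
    best = None
    for trio in combinations(rows, 3):
        p = solve3([list(coef) for coef, _ in trio], [rhs for _, rhs in trio])
        if p is None:
            continue
        feasible = all(sum(c * x for c, x in zip(coef, p)) <= rhs + 1e-6
                       for coef, rhs in rows)
        if feasible and (best is None or total_cost(p) < best[0]):
            best = (total_cost(p), p)
    return best


cost, produce = solve_cable_lp()
procure = [d - p for d, p in zip(demand, produce)]
print("Minimum cost:", round(cost, 2))
print("Produce (ft):", [round(p, 2) for p in produce])
print("Procure (ft):", [round(q, 2) for q in procure])
```

The enumeration works here because a bounded linear program attains its optimum at a vertex; it does not report shadow prices, so the sensitivity questions and the memorandum still call for a solver's sensitivity report or a separate argument.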

Linear Programming

Note: You MUST show all work. While math/stat software can be used to check your work, you MUST show how you obtained all answers, steps leading up to your final answer, assumptions needed, etc. The only exception to this is that you MAY use any math/stat software to find the critical value of the normal, t, F, or chi-square distribution. This test must be completed by you alone; help from any other human will be considered cheating.

1. [20 Points] Suppose that $n$ independent paired data points $(x_1, y_1), \ldots, (x_n, y_n)$ satisfy the linear regression model,

$Y = \beta_0 + \beta_1 (X - \bar{x}) + \varepsilon,$

where $X$ is a random regressor, $\varepsilon$ is uncorrelated with $X$ and has mean zero and variance $\sigma^2$, and $\bar{x}$ is the sample mean of the x's. Let $\hat{y}$ denote the fitted value of $Y$ from this regression model. Prove that the coefficient of determination $R^2$ is equal to the square of the sample correlation between $Y$ and $\hat{y}$.

2. [20 Points] In a linear regression model,

where $X$ is a nonrandom regressor, and $\varepsilon$ has mean zero and variance $\sigma^2$. Suppose that $n$ independent paired data points $(x_1, y_1), \ldots, (x_n, y_n)$ satisfy this model. After fitting this model to a set of $n = 25$ paired data points, we obtain $R^2 = 0.82$, $S_{yy} = 50$. Construct a 95% prediction interval for the value of $Y$ at $X = \bar{x}$, where $\bar{x}$ is the sample mean of the $n = 25$ x's.

3. [40 Points] You are given the following sample of (x, y) data points:

(8,10), (8, 10), (7,9), (6, 10), (3,6), (4,8), (4,8), (5,9), (3,7), (6,9).

A simple linear regression model is fitted to these data points.
(a) Estimate the y-intercept and slope.
(b) Estimate the variance of y.
(c) Test statistical significance for slope.
(d) Construct a 95% confidence interval of the slope.
(e) Construct an ANOVA table for testing slope.
(f) State any assumptions used in your analyses of (a)-(e) above. Make sure to match any assumption you state with (some combination of) (a)-(e) specifically. Discuss your answers in detail.
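
Since software may be used to check hand calculations, the sums of squares behind parts (a)-(e) can be verified with a short sketch like the one below. It assumes the ordinary least-squares fit of y on x and reads "variance of y" as the error-variance estimate MSE, which is one possible interpretation; the function and key names are illustrative.

```python
from math import sqrt

points = [(8, 10), (8, 10), (7, 9), (6, 10), (3, 6),
          (4, 8), (4, 8), (5, 9), (3, 7), (6, 9)]


def simple_linear_regression(points):
    """Least-squares fit of y = b0 + b1 * x with the usual ANOVA quantities."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_reg = slope * sxy            # regression sum of squares (1 df)
    ss_err = syy - ss_reg           # residual sum of squares (n - 2 df)
    mse = ss_err / (n - 2)          # estimate of the error variance
    se_slope = sqrt(mse / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "ss_reg": ss_reg,
        "ss_err": ss_err,
        "ss_total": syy,
        "mse": mse,
        "se_slope": se_slope,
        "t_stat": slope / se_slope,  # compare with a t critical value on n - 2 df
        "f_stat": ss_reg / mse,      # equals t_stat squared in simple regression
        "r_squared": ss_reg / syy,
    }


fit = simple_linear_regression(points)
for name, value in fit.items():
    print(f"{name:>10}: {value:.4f}")
```

The confidence interval in (d) is then the slope plus or minus the appropriate t critical value (n - 2 = 8 degrees of freedom) times `se_slope`; the critical value itself is left to a table or software, as the instructions allow.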

4. [20 Points] Let the least-squares residuals be $e_i = y_i - \hat{y}_i$ for $i = 1, \ldots, n$, obtained from a simple regression model,

where $\hat{y}_i$ is the predicted value corresponding to $y_i$ from this simple regression analysis. Derive the variance of $e_i$ and discuss how this variance compares with the variance of the random error $\varepsilon_i$.


Advanced Statistical Regression

Discuss in detail the process that a researcher goes through once they become aware of a problem. To do so, find a problem that you feel would be important for improving the health outcomes of the population of the Kingdom. Discuss the research assignment by describing the problem, the literature review, and the theories you find apply to the research, and then share your hypothesis. Identify the important variables that you want to study for this research.
Be sure to support your statements with logic and argument, citing any sources referenced.
plagiarism free (less than 15%)
in-text citation
2 pages
4 references
APA style

6.5 The Role of Probability

Watch the video How to: Write an Abstract of a Research Paper (link below). Discuss how you narrow the research topic and what information sources are acceptable in research. Identify the source qualities that are of the most importance and describe the skills and competencies that are required to interpret a research article.
How to: Write an Abstract of a Research Paper
Summary: This video explains writing an abstract for a research paper.
Be sure to support your statements with logic and argument, citing any sources referenced.
plagiarism free (less than 15%)
in-text citation
2 pages
4 references
APA style

6.6 Confidence Interval Estimates

Please provide responses separately according to each unit's question
NO PLAGIARIZED WORK PLEASE
Topic 7: Multiple Regression Unit 1 Describe a study you might design that could use multiple regression. What are the variables? What are the null and alternative hypotheses? Why would multiple regression be appropriate for this design? What would multiple regression indicate in relation to your research question? What would it not indicate? Be specific and use your sample study to illustrate your answers. This response does not need to be supported with external research.
Unit 2 Considering your personal research interest, which of the methods described in this course do you believe will be most useful to your research? Why? This response does not need to be supported with external research.
Topic 8: Multivariate Comparison of Means: The One-Way and Factorial MANOVA Unit 1 Consider your dissertation research interests. Identify one categorical/nominal scale IV with more than 2 categories, and three DVs that are measured on continuous scales. Think of DV measures that probably are moderately correlated with each other because they are measuring different components of the same or similar concepts (e.g., three different measures of academic performance). What information would a one-way MANOVA provide you? What more would you want to know if you get significant results in the MANOVA? Why would this be significant to your research? (Research support is not required for this question.)
Unit 2 Using the DC Network, locate information about the 10 Strategic Points, the Prospectus template, the Proposal template, and the Dissertation template. What is the purpose of each of these documents? How do you anticipate interacting with these documents? Explain.

Statistics: Multiple Regression

Suppose a professor splits their class into two groups: students whose last names begin with A-K and students whose last names begin with L-Z. If p1 and p2 represent the proportion of students who have an iPhone by last name, would you be surprised if p1 did not exactly equal p2?
If we conclude that the first initial of a student's last name is NOT related to whether the person owns an iPhone, what assumption are we making about the relationship between these two variables?

Prob and Stats


Write a report of at least 4 pages explaining the results from your analysis of the ROI of Business Majors and Engineering Majors. Use your results from each week's assignment to make this report.
This report should include:
• A detailed description of the data used in this analysis; specifically, explain what the data in each column represents.
• A detailed explanation of any information learned about: School Type for each major; the Cost for each major; the 30 Year ROI for each major; the Annual % ROI for each major. Explanations should be supported with the results obtained from your work in previous weeks.
Finally, answer the following questions and support your answers with your results:
• Does there appear to be a particular major that gives a better ROI? Why or why not?
• Given that we are using statistical inference to make our conclusions, is it guaranteed that the major you chose as giving a better ROI for this sample will always have a better ROI than the other major? Explain your reasoning.

Need on Time Delivery!!!

This assignment is about Saudi Arabia.
• Follow the instructions in the attached file carefully.
• Avoid plagiarism.
• 1 page, not including the cover page and the references.
• Follow APA style.
• Provide in-text citations not later than 2013.
Discuss your experience with RStudio setup.
Evaluate how research may be applied to epidemiology and public health to provide guidance to the Saudi Ministry of Health and healthcare administration.

Biostat

Find a research topic of interest in healthcare that is research you feel is needed in the Kingdom of Saudi Arabia. Conduct a literature review using at least four articles on that topic. After the literature review formulate a problem statement that would represent a research area that you would like to find a solution to. Once you have the problem that you want to solve write a hypothesis statement that you could test by gathering data and performing statistical tests (we will not be doing the actual analysis).
Your paper should meet the following structural requirements:
• The paper should be 4 pages in length, not including the cover page and reference page.
• Formatted according to APA writing standards.
• Be sure to discuss and reference concepts taken from the assigned textbook reading and relevant research.
• Provide support for your statements with in-text citations from a minimum of four scholarly, peer-reviewed articles.
plagiarism free (less than 15%)

6.4 Summarizing Data Collected in a Defined Population Sample


Discuss your experience with RStudio setup. Evaluate how research may be applied to epidemiology and public health to provide guidance to the Ministry of Health and healthcare administration.
Be sure to support your statements with logic and argument, citing any sources referenced.
plagiarism free (less than 15%)
in-text citation
2 pages
4 references
APA style

6.1 Introduction to Biostatistics

Research the ethical and legal implications of clinical research in the Kingdom of Saudi Arabia. What aspects of clinical research design help to protect the human subjects? What are the best practices for creating an informed consent? Finally, create an informed consent and include all the components that you have found are necessary to protect the patient.
Create a PowerPoint presentation of 9-10 slides, not including the title and reference slides. Keep bullet points to 3-5 words and include your discussion in the speaker notes, explaining your findings for each slide.
Your presentation should meet the following structural requirements:
• Be organized, using professional themes and transitions.
• Consist of 9-10 slides, plus the title and reference slides.
• Each slide must provide detailed speaker notes (a minimum of 100 words). Notes must draw from and cite relevant reference materials.
• Provide support for your statements with in-text citations from a minimum of four scholarly articles.
• Follow APA writing standards.
plagiarism free (less than 15%)
speaker notes (minimum 100 words)

6.2 Study Designs

Project 1 Instructions
Based on Brase & Brase: section 2.1
Use the Project 1 Data Set to create the graphs and tables in Questions 1–4 and to answer both parts of Question 5. If you cannot figure out how to make the graphs and tables in Excel, you are welcome to draw them by hand and then submit them as a scanned document or photo.
1. Open a blank Excel file and create a grouped frequency distribution of the maximum daily temperatures for the 50 states for a 30-day period. Use 8 classes. (8 points)
2. Add midpoint, relative frequency, and cumulative frequency columns to your frequency distribution. (8 points)
3. Create a frequency histogram using Excel. You will probably need to load the Data Analysis add-in within Excel. If you do not know how to create a histogram in Excel, view the video located at: http://www.youtube.com/watch?v=_gQUcRwDiik. A simple bar graph will also work.
If you cannot get the histogram or bar graph features to work, you may draw a histogram by hand and then scan or take a photo (your phone can probably do this) of your drawing and email it to your instructor. (8 points)
4. Create an ogive in Excel (or by hand). (8 points)
5. A. Do any of the temperatures appear to be unrealistic or in error? If yes, which ones and why? (4 points)
B. Explain how this affects your confidence in the validity of this data set. (4 points)
Project 1 is due by 11:59 p.m. (ET) on Monday of Module/Week 1.

Based on Brase & Brase: section 2.1

1. Pagano Chapter 2: Q13 (a), (b)
2. Pagano Chapter 3: Q6 (a) i. ii. iv. v. vi. (i.e., skip iii.), (b)
3. Pagano Chapter 3: Q7
4. Pagano Chapter 3: Q8
5. Pagano Chapter 6: Q7
6. Pagano Chapter 6: Q9 (a), (b)
7. Pagano Chapter 6: Q10 (a), (b)
(a) Using these data, construct a stacked bar chart showing the number of hospital discharges following hip fracture by age group. (Each bar should consist of four separate sections representing white men, black men, white women, and black women.)
(b) How does the overall number of hip fractures vary with age?
(c) Based on the graph, what can you conclude about the relationship between gender and hip fracture?
13. In an investigation of the risk factors for cardiovascular disease, levels of serum cotinine, a metabolic product of nicotine, were recorded for a group of smokers and a group of nonsmokers [18]. The relevant frequency distributions are displayed below.

Cotinine Level (ng/ml)    Smokers    Nonsmokers
0-13                      78         3300
14-49                     133        72
50-99                     142        23
100-149                   206        15
150-199                   197        7
200-249                   220        8
250-299                   151        9
300+                      412        11
Total                     1539       3445

(a) Is it fair to compare the distributions of cotinine levels for smokers and nonsmokers based on the absolute frequencies in each interval? Why or why not?
(b) Compute the relative frequencies of serum cotinine level readings for each of the two groups.
(c) Construct a pair of frequency polygons.
(d) Describe the shape of each polygon. What can you say about the distribution of recorded cotinine levels in each group?
(e) For all individuals in this study, smoking status is self-reported. Do you think any of the subjects might be misclassified? Why or why not?
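
The relative frequencies asked for in part (b) can be computed with a few lines. This sketch assumes the frequency table is read as eight cotinine intervals with the counts shown; the variable and function names are illustrative.

```python
intervals = ["0-13", "14-49", "50-99", "100-149",
             "150-199", "200-249", "250-299", "300+"]
smokers = [78, 133, 142, 206, 197, 220, 151, 412]
nonsmokers = [3300, 72, 23, 15, 7, 8, 9, 11]


def relative_frequencies(counts):
    """Percentage of the group's total that falls in each interval."""
    total = sum(counts)
    return [100 * c / total for c in counts]


smoker_pct = relative_frequencies(smokers)
nonsmoker_pct = relative_frequencies(nonsmokers)

print(f"{'Interval':<10}{'Smokers %':>12}{'Nonsmokers %':>15}")
for label, s, n in zip(intervals, smoker_pct, nonsmoker_pct):
    print(f"{label:<10}{s:>12.1f}{n:>15.1f}")
```

Plotting the two percentage columns against the interval midpoints gives the pair of frequency polygons requested in part (c).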
14. The relative frequencies of blood lead concentrations for two groups of workers in Canada, one examined in 1979 and the other in 1987, are displayed below [19].

6. A study was conducted investigating the long-term prognosis of children who have suffered an acute episode of bacterial meningitis, an inflammation of the membranes enclosing the brain and spinal cord. Listed below are the times to the onset of seizure for 13 children who took part in the study [10]. In months, the measurements are:

0.10 0.25 0.50 4 12 12 24 24 31 36 42 55 96


(a) Find the following numerical summary measures of the data.
i. mean
ii. median
iii. mode
iv. range
v. interquartile range
vi. standard deviation
(b) Show that $\sum_{i=1}^{13} (x_i - \bar{x})$ is equal to 0.
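
A short sketch for checking these summary measures on the 13 seizure-onset times, using Python's standard `statistics` module. Quartile conventions differ between textbooks and software; the default "exclusive" method of `statistics.quantiles` is used here as an assumption, so the interquartile range may not match a hand calculation that follows a different rule.

```python
import statistics

months_to_seizure = [0.10, 0.25, 0.50, 4, 12, 12, 24, 24, 31, 36, 42, 55, 96]


def summarize(data):
    """Numerical summary measures for a list of observations."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default method="exclusive"
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),   # every value tied for most frequent
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),       # sample standard deviation (n - 1)
    }


summary = summarize(months_to_seizure)
for name, value in summary.items():
    print(name, value)

# Part (b): deviations from the mean sum to zero, up to floating-point rounding.
deviation_sum = sum(x - summary["mean"] for x in months_to_seizure)
print("sum of deviations:", deviation_sum)
```

The numerical check of part (b) is only an illustration; the exercise asks for the algebraic argument, which follows from the sum of the observations being n times the mean.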
7. In Massachusetts, eight individuals experienced an unexplained episode of vitamin D intoxication that required hospitalization; it was thought that these unusual occurrences might be the result of excessive supplementation of dairy milk [11]. Blood levels of calcium and albumin, a type of protein, for each subject at the time of hospital admission are provided below.

Calcium (mmol/l)    Albumin (g/l)
2.92                43
3.84                42
2.37                42
2.99                40
2.67                42
3.17                38
3.74                34
3.44                42

(a) Find the mean, median, standard deviation, and range of the recorded calcium levels.
(b) Compute the mean, median, standard deviation, and range of the given albumin levels.
(c) For healthy individuals, the normal range of calcium values is 2.12 to 2.74 mmol/l, while the range of albumin levels is 32 to 55 g/l. Do you believe that patients suffering from vitamin D intoxication have normal blood levels of calcium and albumin?

8. A study was conducted comparing female adolescents who suffer from bulimia to healthy females with similar body compositions and levels of physical activity. Listed below are measures of daily caloric intake, recorded in kilocalories per kilogram, for samples of adolescents from each group [12].

Daily Caloric Intake (kcal/kg)

Bulimic                  Healthy
15.9   18.9   25.1       20.7   30.6
16.0   19.6   25.2       22.4   33.2
16.5   21.5   25.6       23.1   33.7
17.0   21.6   28.0       23.8   36.6
17.6   22.9   28.7       24.5   37.1
18.1   23.6   29.2       25.3   37.4
18.4   24.1   30.9       25.7   40.8
18.9   24.5              30.6

(a) Find the median daily caloric intake for both the bulimic adolescents and the healthy ones.
(b) Compute the interquartile range for each group.
(c) Is a typical value of daily caloric intake larger for the individuals suffering from bulimia or for the healthy adolescents? Which group has a greater amount of variability in the measurements?
9. Figures 3.6 and 3.7 display the infant mortality rates for 111 nations on three

1. What is the frequentist definition of probability?
2. What are the three basic operations that can be performed on events?
3. Explain the difference between mutually exclusive and independent events.
4. What is the value of Bayes' theorem? How is it applied in diagnostic testing?
5. What would happen if you tried to increase the sensitivity of a diagnostic test?
6. How can the probabilities of disease in two different groups be compared?
7. Let A represent the event that a particular individual is exposed to high levels of carbon monoxide and B the event that he or she is exposed to high levels of nitrogen dioxide.
(a) What is the event A ∩ B?
(b) What is the event A ∪ B?
(c) What is the complement of A?
(d) Are the events A and B mutually exclusive?
8. For Mexican American infants born in Arizona in 1986 and 1987, the probability that a child's gestational age is less than 37 weeks is 0.142 and the probability that

9. Consider the following natality statistics. According to these data, the probabilities that a randomly selected woman who gave birth in 1992 was in each of the following age groups are as follows:

Age      Probability
<15      0.003
15-19    0.124
20-24    0.263
25-29    0.290
30-34    0.220
35-39    0.085
40-44    0.014
45-49    0.001
Total    1.000

(a) What is the probability that a woman who gave birth in 1992 was 24 years of age or younger?
(b) What is the probability that she was 40 or older?
(c) Given that the mother of a particular child was under 30 years of age, what is the probability that she was not yet 20?
(d) Given that the mother was 35 years of age or older, what is the probability that she was under 40?
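
The calculations in this exercise all follow from adding probabilities of mutually exclusive age groups and from the definition P(A | B) = P(A and B) / P(B). A minimal sketch, assuming the age groups are read as the eight intervals in the table; the helper names are illustrative.

```python
age_probs = {
    "<15": 0.003, "15-19": 0.124, "20-24": 0.263, "25-29": 0.290,
    "30-34": 0.220, "35-39": 0.085, "40-44": 0.014, "45-49": 0.001,
}


def prob(groups):
    """Probability of a union of mutually exclusive age groups."""
    return sum(age_probs[g] for g in groups)


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    both = set(event) & set(given)
    return prob(both) / prob(given)


under_20 = ["<15", "15-19"]
under_25 = under_20 + ["20-24"]
under_30 = under_25 + ["25-29"]
under_40 = under_30 + ["30-34", "35-39"]
at_least_35 = ["35-39", "40-44", "45-49"]
at_least_40 = ["40-44", "45-49"]

print("(a)", round(prob(under_25), 3))
print("(b)", round(prob(at_least_40), 3))
print("(c)", round(conditional(under_20, under_30), 3))
print("(d)", round(conditional(under_40, at_least_35), 3))
```

The same two helpers apply to the payment-source exercise that follows, with the government programs taken as the conditioning event.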
10. The probabilities associated with the expected principal source of payment for hospital discharges in the United States in the year 1990 are listed below [20].

Principal Source of Payment    Probability
Private insurance              0.387
Medicare                       0.345
Medicaid                       0.116
Other govt. program            0.033
Self-payment                   0.058
Other/No charge                0.028
Not stated                     0.033


(a) What is the probability that the principal source of payment for a given hospital discharge is the patient's private insurance?
(b) What is the probability that the principal source of payment is Medicare, Medicaid, or some other government program?
(c) Given that the principal source of payment is a government program, what is the probability that it is Medicare?
11. Looking at the United States population in 1993, the probability that an adult be

PHEB 602699/700 Biostatistics I: Homework 1

STAT200 Introduction to Statistics
Dataset for Written Assignments
Description of Dataset:
The data is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). It contains information from 30 households, where a survey responder provided the requested information; it is all self-reported information. This dataset contains four socioeconomic variables (whose names start with SE) and four expenditure variables (whose names start with USD).
Description of Variables/Data Dictionary:
The following table is a data dictionary that describes the variables and their locations in this dataset (Note: Dataset is on second page of this document):
Variable Name            Location in Dataset    Variable Description                                           Coding
UniqueID#                First Column           Unique number used to identify each survey responder           Each responder has a unique number from 1-30
SEMaritalStatus          Second Column          Marital Status of Head of Household                            Not Married/Married
SEIncome                 Third Column           Annual Household Income                                        Amount in US Dollars
SEAgeHeadHousehold       Fourth Column          Age of the Head of Household                                   Age in Years
SEFamilySize             Fifth Column           Total Number of People in Family (Both Adults and Children)    Number of People in Family
USDAnnualExpenditures    Sixth Column           Total Amount of Annual Expenditures                            Amount in US Dollars
USDFood                  Seventh Column         Total Amount of Annual Expenditure on Food                     Amount in US Dollars
USDEntertainment         Eighth Column          Total Amount of Annual Expenditure on Entertainment            Amount in US Dollars
USDEducation             Ninth Column           Total Amount of Annual Expenditure on Education                Amount in US Dollars

How to read the data set: Each row contains information from one household. For instance, the first row of the dataset starting on the next page shows us that: the head of household is not married and is 40 years old, has an annual household income of $98,717, a family size of 3, annual expenditures of $56,393, and spends $7,036 on food, $106 on entertainment, and $213 on education.
| UniqueID# | SEMaritalStatus | SEIncome | SEAgeHeadHousehold | SEFamilySize | USDAnnualExpenditures | USDFood | USDEntertainment | USDEducation |
|---|---|---|---|---|---|---|---|---|
| 1 | Not Married | 98717 | 40 | 3 | 56393 | 7036 | 106 | 213 |
| 2 | Not Married | 96572 | 59 | 2 | 56515 | 7179 | 95 | 349 |
| 3 | Not Married | 96690 | 57 | 2 | 56097 | 6822 | 88 | 252 |
| 4 | Not Married | 96664 | 53 | 3 | 55558 | 7051 | 83 | 295 |
| 5 | Not Married | 96886 | 44 | 2 | 55321 | 6982 | 79 | 312 |
| 6 | Not Married | 96522 | 43 | 4 | 56152 | 6991 | 101 | 237 |
| 7 | Not Married | 97912 | 49 | 1 | 55704 | 6937 | 97 | 277 |
| 8 | Not Married | 96727 | 39 | 2 | 56440 | 7051 | 93 | 222 |
| 9 | Not Married | 96928 | 43 | 3 | 55932 | 6953 | 105 | 273 |
| 10 | Not Married | 95744 | 52 | 4 | 55963 | 7040 | 105 | 340 |
| 11 | Not Married | 97681 | 53 | 4 | 56124 | 7097 | 108 | 263 |
| 12 | Not Married | 95432 | 51 | 1 | 55120 | 7089 | 84 | 274 |
| 13 | Not Married | 94929 | 59 | 2 | 55247 | 6948 | 97 | 236 |
| 14 | Not Married | 96621 | 54 | 2 | 55746 | 7000 | 106 | 322 |
| 15 | Not Married | 95366 | 48 | 2 | 57082 | 7130 | 90 | 305 |
| 16 | Married | 101829 | 45 | 4 | 82385 | 10821 | 201 | 810 |
| 17 | Married | 98309 | 51 | 2 | 75776 | 9118 | 117 | 477 |
| 18 | Married | 112559 | 39 | 3 | 80934 | 11189 | 51 | 386 |
| 19 | Married | 106894 | 51 | 5 | 81585 | 11360 | 185 | 682 |
| 20 | Married | 98686 | 51 | 3 | 73563 | 9298 | 132 | 455 |
| 21 | Married | 103422 | 36 | 4 | 80005 | 10616 | 38 | 378 |
| 22 | Married | 100964 | 28 | 2 | 77744 | 9397 | 174 | 11 |
| 23 | Married | 95835 | 54 | 3 | 73092 | 9111 | 117 | 476 |
| 24 | Married | 102326 | 32 | 3 | 74202 | 8844 | 96 | 45 |
| 25 | Married | 106627 | 56 | 3 | 82676 | 10363 | 179 | 794 |
| 26 | Married | 95922 | 55 | 3 | 72228 | 9302 | 135 | 472 |
| 27 | Married | 95975 | 41 | 3 | 73114 | 8922 | 131 | 463 |
| 28 | Married | 99610 | 36 | 2 | 73550 | 9513 | 158 | 37 |
| 29 | Married | 114505 | 36 | 5 | 78325 | 11375 | 60 | 482 |
| 30 | Married | 106977 | 56 | 3 | 79358 | 10611 | 109 | 311 |
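
As a rough illustration of how one row of this dataset is read, the sketch below parses the first three households and prints the first one. The CSV layout, the `SAMPLE_CSV` string, and the helper name `load_households` are assumptions made for this example; the handout does not specify a file format.

```python
import csv
import io

# First three rows of the dataset, copied from the data above.
# The CSV layout itself is an assumption made for this sketch.
SAMPLE_CSV = """UniqueID#,SEMaritalStatus,SEIncome,SEAgeHeadHousehold,SEFamilySize,USDAnnualExpenditures,USDFood,USDEntertainment,USDEducation
1,Not Married,98717,40,3,56393,7036,106,213
2,Not Married,96572,59,2,56515,7179,95,349
3,Not Married,96690,57,2,56097,6822,88,252
"""


def load_households(text):
    """Parse CSV text into a list of dicts, converting numeric fields to int."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, value in raw.items():
            # SEMaritalStatus is the only qualitative (text) column.
            row[name] = value if name == "SEMaritalStatus" else int(value)
        rows.append(row)
    return rows


households = load_households(SAMPLE_CSV)
first = households[0]
print(
    f"Household {first['UniqueID#']}: {first['SEMaritalStatus']}, "
    f"age {first['SEAgeHeadHousehold']}, income ${first['SEIncome']:,}, "
    f"family size {first['SEFamilySize']}, "
    f"annual expenditures ${first['USDAnnualExpenditures']:,}"
)
```

Each dictionary corresponds to one household, so `households[0]["USDFood"]` is the food expenditure reported by the first survey responder.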

STAT200 Introduction to Statistics

Assignment #1: Descriptive Statistics Data Analysis Plan

Assignment #1: Prepare Descriptive Statistics Data Analysis Plan

Before conducting any statistical analyses, researchers develop a plan for how they will analyze their data to answer their research questions. The purpose of this assignment is to provide an experience developing a descriptive statistics analysis plan. Note: This first assignment is a plan only; no statistics will be calculated or graphs created. The second assignment will involve carrying out the plan, after receiving feedback from your instructor.

Assignment Steps:
Step #1: Review the STAT200 data set file. (Note: This data set will be used for all three of this term's written assignments).
The data is a subsample from the US Department of Labor's Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). Detailed information on the sample and variables is included with the data set file; please carefully review this information to familiarize yourself with the data (Note: This information will be used in Assignment #2 to describe the dataset).

Step #2: Develop descriptive statistics data analysis plan.

~ Task 1: Develop scenario. Imagine that you are head of a household and have to determine a household budget plan based on the data available from the dataset. For instance, you are a 35-year-old single parent with a high school diploma and one child.

~ Task 2: Select variables for analysis that match the scenario developed in Task 1. The data set provides information on household consumption; there are socioeconomic variables and expenditure variables. The socioeconomic variable names start with "SE" and the expenditure variable names start with "USD"; all expenditures are in US dollars. All students must use income as one variable. Select two additional socioeconomic variables (one qualitative and one quantitative) and two expenditures for your analysis that match the scenario you developed for Task 1. For instance, using the example scenario of a 35-year-old single parent with a high school diploma and one child, you could select "income," "education," and "number of children" as socioeconomic variables and then pick two household expenditure items to show the distribution of costs and compare that with your income. When selecting variables, think about the following three questions:
o Why am I choosing these variables?
o What interests me about these variables?

o What do I think will be the outcome?
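
The selection rules in Task 2 can be checked mechanically. The sketch below is only an illustration: the helper name `check_selection` is hypothetical, and the qualitative/quantitative labels are inferred from the Coding column of the data dictionary rather than stated in it.

```python
# Variable types inferred from the data dictionary (an assumption of this
# sketch). UniqueID# is left out because it is only an identifier.
VARIABLE_TYPES = {
    "SEMaritalStatus": "qualitative",
    "SEIncome": "quantitative",
    "SEAgeHeadHousehold": "quantitative",
    "SEFamilySize": "quantitative",
    "USDAnnualExpenditures": "quantitative",
    "USDFood": "quantitative",
    "USDEntertainment": "quantitative",
    "USDEducation": "quantitative",
}


def check_selection(selected):
    """Return a list of problems with a Task 2 selection (empty if none)."""
    problems = []
    unknown = [v for v in selected if v not in VARIABLE_TYPES]
    if unknown:
        problems.append(f"unknown variables: {unknown}")
    if "SEIncome" not in selected:
        problems.append("SEIncome is required")
    # Socioeconomic variables chosen in addition to income.
    extra_se = [
        v for v in selected
        if v in VARIABLE_TYPES and v.startswith("SE") and v != "SEIncome"
    ]
    kinds = sorted(VARIABLE_TYPES[v] for v in extra_se)
    if kinds != ["qualitative", "quantitative"]:
        problems.append(
            "need one qualitative and one quantitative SE variable besides income"
        )
    usd = [v for v in selected if v in VARIABLE_TYPES and v.startswith("USD")]
    if len(usd) != 2:
        problems.append("need exactly two USD expenditure variables")
    return problems


plan = ["SEIncome", "SEMaritalStatus", "SEFamilySize", "USDFood", "USDEducation"]
print(check_selection(plan) or "selection satisfies the Task 2 rules")
```

Under these inferred labels, SEMaritalStatus is the only qualitative variable in this dataset, so it would be the qualitative choice in any valid selection.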

~ Task 3: Determine appropriate measures of central tendency and dispersion for the selected variables. For each quantitative variable, select at least one measure of central tendency and at least one measure of dispersion (please see the table below for the list of measures). For the qualitative variable, select one measure of central tendency. When determining the measures of central tendency and dispersion, think about what is appropriate given the level of measurement and type of variable. We recommend referring to the text and information posted in our LEO classroom to help with this task (Note: you will use this information to provide a rationale for your choice of measures).


| Measures of Central Tendency | Measures of Dispersion |
|---|---|
| Mean | Range |
| Mode | Sample Standard Deviation |
| Median | Variance |
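
For reference, the measures listed above can be computed with Python's standard `statistics` module. This is a sketch for illustration only, since Assignment #1 itself requires no calculations. It uses USDFood for households 1 to 5 and SEMaritalStatus for all 30 households, both copied from the dataset; the choice of these variables is an assumption of the example.

```python
import statistics

# USDFood for households 1-5 and SEMaritalStatus for all 30 households,
# copied from the dataset.
food = [7036, 7179, 6822, 7051, 6982]
marital = ["Not Married"] * 15 + ["Married"] * 15

# Measures of central tendency
mean_food = statistics.mean(food)
median_food = statistics.median(food)
# multimode returns every most-frequent value, which matters when there is a tie.
modes_marital = statistics.multimode(marital)

# Measures of dispersion
range_food = max(food) - min(food)
variance_food = statistics.variance(food)  # sample variance (n - 1 denominator)
stdev_food = statistics.stdev(food)        # sample standard deviation

print("mean:", mean_food)          # 7014
print("median:", median_food)      # 7036
print("modes:", modes_marital)
print("range:", range_food)        # 357
print("variance:", variance_food)  # 16741.5
print("std dev:", round(stdev_food, 2))
```

The mode is the only listed measure that applies to a qualitative variable such as SEMaritalStatus. In this sample the two categories each occur 15 times, so both are reported as modes.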

~ Task 4: Determine appropriate graph and/or table for each of the selected variables. Select one graph or table for each variable (please see the table below for the list of graphs and tables). When determining the graphs and tables, think about what is appropriate given the level of measurement and type of variable. We recommend referring to the text and information posted in our LEO classroom to help with this task (Note: you will use this information to provide a rationale for your choice of graphs and/or tables).

| Types of Graphs | Types of Tables |
|---|---|
| Pie Chart | Frequency Table |
| Bar Chart | Relative Frequency Table |
| Histogram | Grouped Frequency Table |
| Box Plot (also known as Box-and-Whisker Plot) | |
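
To make the table types above concrete, the following sketch builds a frequency table and a relative frequency table for SEFamilySize across all 30 households, with a crude text bar chart alongside. It is illustrative only (no tables or graphs are required for Assignment #1), and the variable and layout are choices made for this example.

```python
from collections import Counter

# SEFamilySize for households 1-30, copied from the dataset.
family_size = [3, 2, 2, 3, 2, 4, 1, 2, 3, 4, 4, 1, 2, 2, 2,
               4, 2, 3, 5, 3, 4, 2, 3, 3, 3, 3, 3, 2, 5, 3]

counts = Counter(family_size)
total = len(family_size)

# Frequency and relative frequency tables, one row per distinct value.
frequency_table = {size: counts[size] for size in sorted(counts)}
relative_table = {size: n / total for size, n in frequency_table.items()}

print("Size  Freq  Rel.Freq  Bar")
for size, n in frequency_table.items():
    print(f"{size:>4}  {n:>4}  {relative_table[size]:>8.3f}  {'#' * n}")
```

A grouped frequency table follows the same pattern, except that each value is first mapped to a class interval (for example, income bands) before counting.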



Step #3: Complete the "Assignment #1: Descriptive Statistics Data Analysis Plan Template."

Remember, you will not be conducting any statistical analysis, drawing any graphs, or compiling any tables for the first assignment. Rather, you need to wait for feedback from your instructor on this assignment and use that feedback to complete Assignment #2.

Here are the main sections for this assignment (i.e., completing the plan template):
✓ Identifying Information. Fill in information on name, class, instructor, and date.
✓ Scenario. In this section, briefly (2-3 sentences) describe the scenario you developed in Step #2, Task 1.
✓ Complete Table 1: Variables Selected for the Analysis. Enter information on the variables selected for analysis in Step #2, Task 2. For each selected variable, be sure to include its name as listed in the data set, description, and variable type.
✓ Reason(s) for Selecting the Variables and Expected Outcome(s). In this section, for each selected variable, please answer the following questions:
   o Why did I choose this variable?
   o What interests me about this variable?
   o What do I think will be the outcome?
✓ Complete Table 2. Numerical Summaries of the Selected Variables. Enter information on selected measures of central tendency and dispersion for each selected variable. Be sure to briefly explain why you chose those measurements. Note: The information for the required variable, "Income," has already been completed and can be used as a guide for completing information on the remaining variables.
✓ Complete Table 3. Type of Graphs and/or Tables for Selected Variables. Enter information on selected graph and/or table for each selected variable. Be sure to briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, "Income," has already been completed and can be used as a guide for completing information on the remaining variables.

Assignment Submission: Name the file that contains your completed "Assignment #1: Descriptive Statistics Data Analysis Plan Template" using the following format: "Assignment1StudentLastName."

Then, submit the file via the Assignments area in the LEO classroom in the "Assignment #1: Descriptive Statistics Data Analysis Plan" folder and wait for your instructor's feedback.

