# Hypothesis Testing and Statistics Lab


[Figure 1. A comparison of the coefficient of thermal expansion (mean+SD) for three metals. Callouts: error (standard deviation) bar; mean value for the bar; labeled y-axis; labeled x-axis; an adequately detailed figure caption.]

[Figure 2. The relationship between air temperature and breathing rates for desert fox squirrels. Callouts: single data point; trend line; labeled y-axis (Rate of Breathing, breaths per minute); labeled x-axis (Temperature, °C); an adequately detailed figure caption.]

I used the data in the lab for male and female height and made the following graph. This is what your graph should look like. (I have labeled all the important things for you. On the next slide you can see what things looked like in Excel.)

[Figure 1. A comparison of the average height of male and female students. Callouts: bars showing the average height of males and females; standard deviation error bars; labeled y-axis (Average Height, cm); labeled x-axis (Males, Females); a figure caption. The Excel view marks the mean values and the standard deviation values.]

## BIOL 100 Lab 1: Hypothesis Testing and Statistics

### I. Objectives

The objectives of this laboratory session are:

1. To understand the importance of probability in the outcome of experiments
2. To understand the need to have many data points for an experiment
3. To become familiar with descriptive statistics
4. To make a hypothesis and test that hypothesis
5. To learn to construct and label a graph properly

### II. Materials

The materials needed for this lab include:

1. a calculator or, preferably, a computer with Microsoft Excel
2. a coin (1 per student), if this lab is done in a classroom setting

### III. Introductory Exercise: Probability

(Note: You should read this section and follow along with the questions. You are NOT required to fill in the spaces below for grading.)

Each student should take a coin. What percentage of the time should you get a heads when you flip the coin? _______. Flip the coin 5 times. After each flip, record whether you got a heads or a tails.
What percentage of the time did you get a heads? _______. Was your prediction the same as your observation? _______. Did something go wrong with your experiment? Should you have gotten exactly what you predicted you would? Give an explanation as to why your results were not exactly what you would predict. _______.

If you flipped the coin 120 times, do you think your observation would be closer to your prediction? _______. Instead of having each of you flip a coin 120 times, we will have everyone write their results on the board and take an average (that is, the mean). Calculate and record the mean for the class. _______. Was the class mean closer to your prediction? _______. Was it exactly the same? Record the two data points that were furthest from the class mean. _______. Were there any data that were exactly the class mean? _______. Is it better to have many data points instead of fewer? Explain. _______.

You have just carried out a scientific experiment. Although you did not state it, you developed a hypothesis by stating your prediction. Your hypothesis may have been along the lines of this: “when flipping a coin, getting heads is as likely as tails.” From that hypothesis, you formed a prediction. Perhaps your prediction was, “When flipping the coin, I predict that I will flip heads 50% of the time.” You also made observations and recorded those data to evaluate your prediction. In this lab we will discuss hypothesis testing in more detail.

From your coin experiment, you should notice the importance of probability to hypothesis testing. Probability gave you a basis for making your hypothesis and for making your prediction. At the same time, your outcomes were probabilistic, not deterministic.
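
The coin exercise can also be simulated. The sketch below is not part of the original lab: it assumes a fair coin (heads with probability 0.5) and uses Python's standard `random` module. The helper name `fraction_heads` and the seed are chosen only for illustration.

```python
import random

def fraction_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

rng = random.Random(2020)  # fixed seed (arbitrary) so reruns repeat the same flips

# Three "students" flipping 5 times each: only 0%, 20%, ..., 100% are possible,
# so an individual result often misses the 50% prediction.
for student in range(1, 4):
    print(f"student {student}, 5 flips: {fraction_heads(5, rng):.0%} heads")

# More flips: the observed fraction tends to settle closer to 50%.
for n in (5, 120, 10_000):
    print(f"{n:>6} flips: {fraction_heads(n, rng):.1%} heads")
```

Running it with different seeds shows the same pattern as the class exercise: small samples scatter widely around the prediction, while large samples land close to it without necessarily matching it exactly.
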
Even with the entire class’s data, you may not have gotten the exact result that you predicted. The reason is that the possible outcomes varied. You could get a head or a tail each time you flipped the coin and recorded your data. In other words, if you got a head the first time you flipped the coin, that did NOT determine the outcome of your next coin flip: the next time you flipped the coin you were uncertain as to whether you would get heads or tails.

As with coin flipping, any time we perform an experiment in biology, there will be uncertainty in our measurements. That uncertainty in biology arises for the same reason as the uncertainty we had with the coin flipping. Biological variability exists! Most biological traits show differences and those differences are important. Most of the biological questions you might be interested in pursuing involve biological variability.

### IV. Descriptive Statistics

When we make observations, we often record those observations as numerical facts that we call data. When we collect the data, the information is grouped using two types of variables: independent variables and dependent variables. Independent variables are the properties that we manipulate. Independent variables are also often called x variables because when they are graphed, they are put on the x-axis of the graph. The dependent variables are the data that change in response to the independent variables. Thus, these dependent variables are also called response variables. They are also called y variables because they are usually plotted on the y-axis of a graph.

We use descriptive statistics to discuss how a dependent variable changes with an independent variable. Descriptive statistics can be used to describe two aspects of a dependent variable: its central tendency (where the variable is usually found) and its variability (how likely it is to be found there). The best understood measure of central tendency for quantitative data is the sample mean.
The sample mean estimates the true population mean, and so describes one parameter of a population: the average value. The mean can be calculated as follows:

$$\text{Mean} = \frac{\sum x_i}{n}$$

This equation just states that you add all of the observations together ($\sum x_i$) and divide by the number of data points ($n$). This should not be intimidating to you; you have already done this for the coin data.

Measures of central tendency give only a partial description of a data set. The description is incomplete without a measure of variability, that is, a measure of how the data set is spread. The variance is a good measure of how the data are spread. For example, the distributions of data in this figure have the same means and sample sizes, but the data are spread much differently in each distribution. Or imagine two students. One has taken four exams and got a 75%, an 85%, and two 80%. A second student took the same four exams and got a 40%, a 90%, a 92%, and a 98%. Both students have the same mean (80%), but the second student’s performance was much more variable.

Any data that we collect will have variation. Consider the data from the coin flipping experiment. Some of the data points were relatively close to our mean and some were quite different from it, remember? Generating a number that describes variation around a mean is probably less familiar to you than calculating the mean. To calculate the variance (we will call the variance $s^2$), you take each data point and subtract the mean from that data point. That difference is squared. Squaring the differences makes all the numbers positive. (If we did not square the differences, the differences would add up to zero, and we cannot use zero to describe the spread of our data.) The next step is to add all of the squared differences together. That number is then divided by one less than the number of samples we have. The mathematical notation for the variance is:

$$s^2 = \frac{\sum (x_i - \text{Mean})^2}{n - 1}$$

This equation just states the steps for calculating the variance in a brief way.
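
As a rough illustration of the variance steps just described, the sketch below applies them to the two students' exam scores from the example above. It uses Python rather than the calculator or Excel the lab suggests, and the function name `sample_variance` and the variable names are made up for this illustration. The last column takes the square root of the variance, which puts the spread back in the same units as the scores.

```python
import math

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / (n - 1)

student_1 = [75, 85, 80, 80]   # exam scores (%) from the text
student_2 = [40, 90, 92, 98]

for name, scores in (("student 1", student_1), ("student 2", student_2)):
    mean = sum(scores) / len(scores)
    var = sample_variance(scores)
    print(f"{name}: mean = {mean:.1f}, variance = {var:.1f}, "
          f"square root of variance = {math.sqrt(var):.1f}")
```

Both students come out with a mean of 80.0, but the variances are about 16.7 and 722.7, which puts a number on how much more variable the second student's scores are.
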
As the variability in the data increases, the variance increases. Because the variance is a sum of squares, it can never be negative. The standard deviation, s, is the positive square root of the variance, and it represents the degree to which the data deviate from the mean. Because it is the square root of the variance, the standard deviation is in the same units as the mean, and so it is the measure of variability that is reported with the mean. The steps in calculating the standard deviation are:

1. Take each data point and subtract the mean from that data point.
2. Square each difference.
3. Add all of the squared differences together.
4. Divide the sum by one less than the number of samples we have.
5. Take the square root of the number.

#### Mean and Standard Deviation Example

Below is a simple example of calculating the mean and standard deviation from a set of numbers.

#### Representing the mean and standard deviation in graph form

In addition to reporting the mean and standard deviation numerically as above, you can display them visually in the form of a graph. In the graph below, the top of each of the bars (red and green) indicates a mean value, while the black lines (error bars) indicate one standard deviation unit above and one standard deviation unit below the mean value. For example, the mean value of the data bar on the far left appears to be 7700 steps and the standard deviation is depicted as being 200 steps above and below the mean.

Figure 1. Bar plots of the mean steps taken by female and male students in each of three classes. Error bars indicate the standard error of the means.

For future reference, please note how the graph is labeled and that it has a figure caption (e.g., Figure 1…) below the graph, rather than a title at the top.

### V. Hypothesis Testing

When interpreting data, you have to reach a decision about the data: do the data support your argument or not?
Hypothesis testing is an approach that helps you make decisions about your data. The process of statistical hypothesis testing involves the following steps:

Step 1. Formulating the hypotheses.

Null hypothesis. This is the hypothesis of no difference, and it is usually left unstated. The prediction for the null hypothesis is that the groups are the same (that is, there is no difference between the groups for the variable being measured), so any differences between the two treatments would be due only to chance.

Alternative hypothesis. This is the hypothesis of difference, and it is the one we usually have in mind when we talk about a hypothesis. The prediction for the alternative hypothesis is that there is a real difference between our groups. Provided that the difference between the groups is great enough, it will be detectable even with chance variation.

Step 2. Statistical testing of the null hypothesis. (You will NOT be doing Step 2 statistical testing in this course.)

Even if the null hypothesis is true, the means from our two groups will not be exactly the same. However, if the two means are different enough, then we may reject our null hypothesis and say that there is a difference between our groups. To do so, we have to compare our groups statistically: we ask how likely it is that a difference at least as large as the one we observed would arise by chance alone if the null hypothesis were true. Recall that science is based on probability; it is not deterministic. We quantify this likelihood with a probability statement called a probability value, or p-value. A p-value near 1.0 (~100%) means that a difference this large is entirely expected by chance, so the data give no reason to doubt that the groups are the same. A p-value of 0.10 (a probability of 10%) means that, if the two groups were really the same, chance alone would produce a difference at least this large about 10% of the time.
A p-value of 0.05 (a probability of 5%) means that, if the two groups were really the same, chance alone would produce a difference at least as large as the observed one only 5% of the time. For most of biology, and for this class, 0.05 is the cutoff: when we get a p-value less than 0.05, we reject the null hypothesis. By rejecting the null hypothesis, we accept the alternative hypothesis, the hypothesis that there is a difference between the groups. In other words, if the p-value is less than 0.05, a difference as large as ours would be unlikely enough under the null hypothesis that we say the two groups are significantly different from each other.

Hypothesis testing, then, requires that you formulate hypotheses and that you test those hypotheses using statistical analyses. To test your hypotheses with statistical analyses, you will need to have large enough samples that your sample mean is a good estimate of the true mean. The larger your sample, the better your estimate of the true mean will be. Recall how your class estimate for the coin data was a good estimate, but individual measures were not as good. To test your hypothesis, you will also need to perform a statistical analysis to generate your p-value (probability value). For this course, we will generate the statistical analyses together. Usually this will require using a t-test or a regression. Once the statistical results are generated, you will need to evaluate the results to decide whether to reject the null hypothesis or not.

### VI. Exercise: Testing a Hypothesis Regarding Human Height and Speed

#### A. Summarizing the Data

In this lab exercise, you will be making a bar graph to compare height (as a dependent variable) among males and females. Height (and speed) measurements of males and females in a hypothetical UE class have been provided for you in Table 1 below. Please note that the height data collected are in metric units (e.g., cm).
You will be required to generate a graph comparing the height (mean and standard deviation) of males and females, preferably using Microsoft Excel or some other spreadsheet and graphing tool that you are familiar with. You can calculate the average and the standard deviation for the male and female height and speed data using the formulae provided in Section IV. However, it would be much easier to use Microsoft Excel. For those not familiar with Excel, there are many tutorials available on the internet to show you how to enter your data into Excel and help you calculate means and standard deviations using this software, including the following video for Windows: https://www.youtube.com/watch?v=BGq8kuffR_Q

You may find this YouTube video to be preferable if you are using an Apple computer: https://www.youtube.com/watch?v=tfjgzCni_jQ

Here are a couple of videos from YouTube to assist you in making bar graphs with Excel:

1. Creating a bar graph and adding standard deviation (=error bars) to the graph using a WINDOWS PC: https://www.youtube.com/watch?v=uH4RuuVQKLI
2. Creating a bar graph and adding standard deviation (=error bars) to the graph using an APPLE: https://www.youtube.com/watch?v=G10_qGcuELA

When using Excel, you may notice that the instructions for making graphs and so on can vary not only between the Apple and Windows versions of the software, but also between different versions within the same operating system (e.g., different versions of Excel for Mac). With this in mind, you may find it helpful to click on the Help drop-down menu for Excel and search for what you need help with (e.g., adding a trendline, putting standard deviation error bars on a graph, etc.).

#### B. Making a Bar Graph

To create a bar graph that summarizes the data, you should plot the mean and standard deviation of each dependent variable for each independent variable.
(See the graph below as an example in which coefficient of thermal expansion is the dependent variable and type of metal is the independent variable. For what it’s worth, I have also posted a sample graph to Blackboard.) The independent variable should be placed on the x-axis. The dependent variable should be plotted on the y-axis. The units on the y-axis should be consistent. (Notice on the figure below that each hatch mark equals 0.00001 unit. Your units may be of a different increment, but they should be consistent.) Each axis should be labeled. Your figure should also have a descriptive figure caption written just below it. The figure caption should be informative. Notice that the figure caption below indicates that the blue bars are means and the error bars represent standard deviation (SD), written as mean+SD.

#### C. Plotting the Height versus Speed Data: Scatterplot with a Trendline

Using the speed data (e.g., number of seconds to run 100 meters) for the males and females in Table 1, I want you to create a graph showing the relationship between height and speed. (In this case, you will combine the data for males and females to indicate the height-speed relationship.) Again, you will need to use Excel or some other software tool that you are familiar with to create your graph.

Here are a couple of videos from YouTube to assist you in making a scatterplot graph with a trendline in Excel:

Using a Windows PC: https://www.youtube.com/watch?v=iP_Ziu7RO4I

Using an Apple: https://www.youtube.com/watch?v=OEc7niSuZnw

To create a scatterplot graph that summarizes the data, you should plot each individual as a point based on its paired scores (its values for the independent and dependent variables). See the graph below as an example of the relationship between air temperature and breathing rates for desert fox squirrels. Again, the independent variable should be placed on the x-axis. The dependent variable should be plotted on the y-axis.
The units on both the x- and y-axes should be consistent. The scatterplot should have a trendline (e.g., a line indicating the general course or tendency of the relationship). Each axis should be labeled. Your figure should also have a descriptive figure caption written just below it. The figure caption should be informative.

Table 1. Hypothetical height and speed data for male (M) and female (F) BIOL 100 students.

| GENDER (M/F) | HEIGHT (cm) | SPEED (secs) |
|---|---|---|
| M | 190 | 13 |
| M | 183 | 15 |
| M | 166 | 15 |
| M | 172 | 17 |
| M | 180 | 11 |
| M | 175 | 12 |
| M | 167 | 14 |
| M | 155 | 16 |
| M | 185 | 11 |
| M | 170 | 13 |
| M | 164 | 10 |
| M | 174 | 11 |
| M | 190 | 13 |
| F | 152 | 16 |
| F | 160 | 18 |
| F | 178 | 15 |
| F | 155 | 15 |
| F | 159 | 16 |
| F | 173 | 14 |
| F | 177 | 12 |
| F | 157 | 15 |
| F | 161 | 13 |
| F | 154 | 17 |
| F | 157 | 14 |
| F | 150 | 17 |
| F | 154 | 14 |

| | HEIGHT (cm) | SPEED (secs) |
|---|---|---|
| MEAN for Males | __________ | __________ |
| Standard Dev. | __________ | __________ |
| MEAN for Females | __________ | __________ |
| Standard Dev. | __________ | __________ |

Name: _______

### VII. Assignment (what you will turn in) for Lab 1. (Due 4:59 pm, 9/7 CST)

1. Explain the difference between an independent variable and a dependent variable. _______.
2. State your hypothesis for the relationship between height and speed. You will find it useful to frame your hypothesis in the form of an “if” and “then” statement. (Note: You will be graphing this relationship for Question 5 below.) _______.
3. Explain why a statistical analysis is necessary to suggest whether groups differ from each other. (You may have to do an internet search to properly answer this question.) _______.
4. Using Excel, create an appropriately labeled graph (with an appropriate caption) comparing height data (means and standard deviations) for male and female UE college students. See the bar graph in Section VI.B as a guide for making your graph.
5. Using Excel, create an appropriately labeled graph (with an appropriate caption) depicting the relationship between height and speed for all the hypothetical UE BIOL 100 college students. See the scatterplot graph in Section VI.C as a guide for making your graph.

### VIII. Final Comments

You should submit (=email) this lab assignment (and all other lab assignments) as a Word document. It is easy to copy a graph from Excel and paste it into a Word document. Your file must be named in the following manner: your last name and the number of the lab assignment. For example: Smith Lab 1.docx.
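
For readers working outside Excel, the quantities behind the Table 1 summary rows and the height versus speed trendline can be sketched in code. This is an illustration rather than part of the assignment, which asks for Excel. It assumes Python and assumes the pairing of heights with speeds as read from Table 1. The helper names `summarize` and `trendline` are made up. The trendline is an ordinary least-squares line, the usual meaning of a linear trendline; the lab itself does not name a fitting method.

```python
import statistics

# (height in cm, seconds to run 100 m), transcribed from Table 1.
# The pairing of each height with each speed is assumed from the row order.
males = [(190, 13), (183, 15), (166, 15), (172, 17), (180, 11), (175, 12),
         (167, 14), (155, 16), (185, 11), (170, 13), (164, 10), (174, 11),
         (190, 13)]
females = [(152, 16), (160, 18), (178, 15), (155, 15), (159, 16), (173, 14),
           (177, 12), (157, 15), (161, 13), (154, 17), (157, 14), (150, 17),
           (154, 14)]

def summarize(values):
    """Return (mean, sample standard deviation with n - 1) for a list."""
    return statistics.mean(values), statistics.stdev(values)

def trendline(points):
    """Least-squares line y = slope * x + intercept through (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

for label, group in (("Males", males), ("Females", females)):
    h_mean, h_sd = summarize([h for h, _ in group])
    s_mean, s_sd = summarize([s for _, s in group])
    print(f"{label}: height {h_mean:.1f} cm (SD {h_sd:.1f}), "
          f"speed {s_mean:.1f} s (SD {s_sd:.1f})")

# Males and females are combined for the height-speed relationship (Section VI.C).
slope, intercept = trendline(males + females)
print(f"trendline: speed = {slope:.3f} * height + {intercept:.1f}")
```

`statistics.stdev` divides by n - 1, matching the variance formula in Section IV. The bar graph and scatterplot themselves still need a plotting tool.
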
