(g) Inferential Statistics: Comparison of Sample Means
In Physical Geography it is often necessary to test whether two samples of the same natural event are statistically distinct under different conditions. For example, a research scientist may want to determine whether rainfall from convective storms differs in intensity between rural and adjacent urban landscapes. Information from this investigation can then be used to test theories concerning the influence of the urban landscape on the development of thunderstorms.
Further, this type of hypothesis testing moves our understanding in Physical Geography from simple description toward process-oriented explanation. A number of inferential statistical procedures have been developed to carry out this process. We will examine two of the most popular statistical techniques available to researchers.
Mann-Whitney U test
In research work it is often necessary to test whether two samples of the same phenomenon are statistically different. One test that is particularly useful in this situation is the Mann-Whitney U test. This technique is non-parametric, or 'distribution-free', in nature. Non-parametric methods are particularly suited to data that are not normally distributed.
Setting up the Null Hypothesis
Setting up the null hypothesis is the first stage of any statistical analysis: it states the hypothesis that is to be tested. This is the assumption that will be maintained unless the data provide significant evidence to discredit it. The null hypothesis is denoted symbolically as H0. For our example, the null hypothesis would be:
H0 : there is no difference in precipitation levels between urban and adjacent rural areas.
It is also necessary to state the alternative hypothesis (H1). In this case the alternative might be:
H1 : there is an increase in precipitation levels in urban areas relative to adjacent rural areas because of the heating differences of the two surface types (the urban area heats up more and has increased convective uplift).
Calculation
To calculate the U-statistic, the values from both samples are pooled and ranked together in ascending order. When ties occur, the mean of the ranks spanned by the tied values is assigned to each of those observations. The ranks belonging to each sample are then summed separately to determine the following values:
Σr1 and Σr2
These values are then entered in the formulae shown under Table 3g-1 for the calculation of U and U1.
Table 3g-1: Analysis of convective precipitation levels per storm event (mm of rain) between urban and rural areas using the Mann-Whitney U test.
| Urban (n1) | Rural (n2) | Rank (n1) | Rank (n2) |
|---|---|---|---|
| 28 | 14 | 26 | 5 |
| 27 | 20 | 25 | 13.5 |
| 33 | 16 | 28 | 8.5 |
| 23 | 13 | 20 | 2.5 |
| 24 | 18 | 23 | 11 |
| 17 | 21 | 10 | 16 |
| 25 | 23 | 24 | 20 |
| 23 | 20 | 20 | 13.5 |
| 31 | 14 | 27 | 5 |
| 23 | 20 | 20 | 13.5 |
| 23 | 20 | 20 | 13.5 |
| 22 | 14 | 17 | 5 |
| 15 | 11 | 7 | 1 |
| - | 16 | - | 8.5 |
| - | 13 | - | 2.5 |
| n1 = 13 | n2 = 15 | Σr1 = 267 | Σr2 = 139 |
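The ranking and summing steps described under Calculation can be checked with a short script. The following is a minimal Python sketch, not the author's own procedure: it pools the precipitation values from Table 3g-1, assigns mean ranks to ties, and sums the ranks for each sample. The formulae for U and U1 are not reproduced in this section, so the sketch assumes the standard textbook form of the Mann-Whitney statistic, U = n1·n2 + n1(n1 + 1)/2 − Σr1, with the matching expression in n2 and Σr2 for U1. Pairing U with the urban sample and U1 with the rural sample is likewise an assumption, and the helper name `mean_ranks` is purely illustrative.

```python
from collections import defaultdict

# Precipitation per storm event (mm), taken from Table 3g-1.
urban = [28, 27, 33, 23, 24, 17, 25, 23, 31, 23, 23, 22, 15]
rural = [14, 20, 16, 13, 18, 21, 23, 20, 14, 20, 20, 14, 11, 16, 13]


def mean_ranks(values):
    """Map each distinct value to its ascending rank.

    Tied values all receive the mean of the rank positions they occupy.
    (Illustrative helper, not part of the original text.)
    """
    positions = defaultdict(list)
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


# Rank both samples together, then sum the ranks of each sample separately.
rank_of = mean_ranks(urban + rural)
sum_r1 = sum(rank_of[v] for v in urban)
sum_r2 = sum(rank_of[v] for v in rural)

n1, n2 = len(urban), len(rural)

# Assumed standard form of the Mann-Whitney statistic (see lead-in).
u = n1 * n2 + n1 * (n1 + 1) / 2 - sum_r1
u1 = n1 * n2 + n2 * (n2 + 1) / 2 - sum_r2

print("rank sums:", sum_r1, sum_r2)
print("U, U1:", u, u1)
```

Run as written, the rank sums reproduce the values in the bottom row of Table 3g-1 (267 and 139), and under the assumed formulae the two statistics come out as 19 and 176. A useful sanity check is that the two statistics always add up to n1·n2, here 13 × 15 = 195.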
CITATION
Pidwirny, M. (2006). Fundamentals of Physical Geography, 2nd Edition. 29/12/2011. http://www.physicalgeography.net/fundamentals/1b.html