Inferential Statistics


      With inferential statistics, we try to reach conclusions that extend beyond the immediate data. For example, we use inferential statistics to try to infer from a sample of the population what the population might think.  Statistical inference combines methods of descriptive statistics with the theory of probability.  (The probability of an event is the proportion of times it will occur in a long string of independent opportunities.) The purpose is to learn what the sample data tell us about the population from which they were drawn.
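
      The long-run definition of probability given above can be illustrated with a small simulation. This is only a sketch: the function name, the event probability of 0.3 and the number of trials are assumptions chosen for illustration.

```python
import random

def long_run_proportion(p=0.3, trials=100_000, seed=0):
    """Proportion of independent trials in which an event of probability p occurs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.random() < p for _ in range(trials))
    return hits / trials

# With many independent opportunities the proportion settles near p.
print(long_run_proportion())
```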

      The numerical values that describe populations are called parameters; the corresponding values computed from a sample are called statistics. The mean, variance and correlation of x and y in a population are examples of parameters. It is conventional to represent parameters with Greek letters (such as mu, sigma, pi and rho for mean, standard deviation, proportion and correlation, respectively).

      We take samples from populations to learn something about the parameters.  Ideally, the sample is representative of the population, so that it reflects the population's characteristics. However, if the total population is unknown or unreachable, then how can we know whether a sample is representative of it?  Random probability sampling does not guarantee a representative sample, but it lets us calculate how likely it is that a sample statistic falls within a given distance of the population parameter.
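
      A rough sketch of this idea follows. It builds a synthetic population (so the parameter is known, which is never the case in practice), draws many simple random samples, and counts how often the sample mean lands close to the population mean. The population values, sample size, tolerance and function names are all assumptions made for illustration.

```python
import random
from statistics import mean

def build_population(size=10_000, seed=42):
    """Synthetic population of scores with mean near 70 and spread near 10."""
    rng = random.Random(seed)
    return [rng.gauss(70, 10) for _ in range(size)]

def share_of_samples_close(population, sample_size=100, draws=1000,
                           tolerance=2.0, seed=7):
    """Fraction of simple random samples whose mean is within
    `tolerance` of the population mean (the parameter)."""
    rng = random.Random(seed)
    mu = mean(population)
    close = 0
    for _ in range(draws):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        if abs(mean(sample) - mu) <= tolerance:
            close += 1
    return close / draws

population = build_population()
print(share_of_samples_close(population))
```

      Because every member of the population has a known chance of selection, the share printed here can also be worked out from probability theory rather than by simulation.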

      Inferential statistics come in two types: interval estimation and hypothesis testing.

      Interval Estimation. This form of statistical inference produces an interval or range of values (e.g., -.12 to +.35) by a process that has a known probability of including the true but unknown parameter value in the interval.  The interval is called a confidence interval.  Each confidence interval has a confidence coefficient, which is the probability that the procedure yields an interval capturing the population parameter.  The confidence coefficient is chosen by the data analyst and is usually close to 1.0, such as .90, .95 or .99.
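
      The phrase "a process that has a known probability of including the true parameter" can be checked by simulation. The sketch below assumes a normally distributed population with a known mean, uses the large-sample (normal approximation) interval for a mean, and counts how often the interval captures the true value. The function names and all numeric settings are illustrative assumptions, not part of the original text.

```python
import random
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence=0.95):
    """Approximate large-sample confidence interval for a population mean."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * stdev(sample) / n ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin

def coverage(true_mean=0.1, true_sd=1.0, n=100, trials=2000,
             confidence=0.95, seed=1):
    """Fraction of repeated samples whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = confidence_interval(sample, confidence)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials

# The observed coverage should be close to the chosen confidence coefficient.
print(coverage())
```

      For small samples the normal critical value is slightly too narrow; a critical value from the t distribution would normally be used instead.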

      One of the simplest inferential statistics, the t-test, compares the average performance of two groups on a single measure to see whether they differ. For example, you might want to know whether males and females differ in test scores for a particular class.
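
      As a minimal sketch, the t statistic for two independent groups can be computed as follows. This version assumes the classic pooled-variance form (equal variances in both groups); the function name and the scores are made up for illustration.

```python
from statistics import mean, variance

def two_sample_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, df

# Hypothetical test scores for two groups in one class
group_a = [72, 85, 90, 66, 78]
group_b = [80, 88, 95, 70, 84]
t, df = two_sample_t(group_a, group_b)
print(t, df)
```

      The t value is then compared with the t distribution on df degrees of freedom to judge whether the difference is larger than chance alone would explain. Statistical packages do both steps at once, for example `scipy.stats.ttest_ind` in SciPy.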

      Most of the commonly used inferential statistics come from the general family of statistical models known as the General Linear Model. This includes the t-test, Analysis of Variance (ANOVA), Analysis of Covariance (ANCOVA), regression analysis, and many multivariate methods like factor analysis, multidimensional scaling, cluster analysis, discriminant function analysis, et cetera.  Because the General Linear Model is so important, it is a good idea for the serious social researcher to become familiar with its workings.
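
      One way to see why the t-test belongs to this family is to fit a straight line to the scores using a 0/1 indicator for group membership. In the sketch below (the coding scheme, function name and scores are illustrative assumptions), the slope equals the difference between the two group means, and the slope's t ratio has the same magnitude as the pooled-variance t statistic for the two groups.

```python
from statistics import mean

def dummy_regression_t(group_a, group_b):
    """Regress scores on a 0/1 group indicator by least squares.

    Returns (slope, t ratio of the slope)."""
    x = [0.0] * len(group_a) + [1.0] * len(group_b)  # 0 = group A, 1 = group B
    y = list(group_a) + list(group_b)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    mse = residual_ss / (len(y) - 2)  # residual variance estimate
    std_error = (mse / s_xx) ** 0.5
    return slope, slope / std_error

# Hypothetical test scores for two groups
group_a = [72, 85, 90, 66, 78]
group_b = [80, 88, 95, 70, 84]
slope, t_ratio = dummy_regression_t(group_a, group_b)
print(slope, mean(group_b) - mean(group_a))  # the two values agree
print(t_ratio)
```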
