# Basic Statistics Overview

Use Minitab's basic statistics capabilities for calculating basic statistics
and for simple estimation and hypothesis testing with one or two samples.
The basic statistics capabilities include procedures for:

· Calculating or storing descriptive statistics

· Hypothesis tests and confidence intervals of the mean or difference in means

· Hypothesis tests and confidence intervals for a proportion or the difference
in proportions

· Hypothesis tests and confidence intervals of the occurrence rate, mean number
of occurrences, and the differences between them for Poisson processes

· Hypothesis tests and confidence intervals for one variance, and for the
difference between two variances

· Measuring association

· Testing for normality of a distribution

· Testing whether data follow a Poisson distribution

## Calculating and storing descriptive statistics

· Display Descriptive Statistics produces descriptive statistics for each
column or subset within a column. You can display the statistics in the
Session window and/or display them in a graph.

· Store Descriptive Statistics stores descriptive statistics for each column
or subset within a column.

· Graphical Summary produces four graphs and an output table in one graph
window.

For a list of descriptive statistics available for display or storage,
see Descriptive Statistics Available for Display or Storage. To calculate
descriptive statistics individually and store them as constants, see
Column Statistics.

## Confidence intervals and hypothesis tests of means

The four procedures for hypothesis tests and confidence intervals for
population means or the difference between means assume that the sample
mean follows a normal distribution. According to the Central Limit Theorem,
the normal distribution becomes an increasingly better approximation for
the distribution of the sample mean drawn from any distribution (with
finite variance) as the sample size increases.
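
The effect of sample size described above can be illustrated with a small simulation. This is only a sketch outside Minitab: the Exponential(1) population, the sample sizes, the seed, and the helper names `sample_means` and `skewness` are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev


def sample_means(n, reps, rng):
    """Means of `reps` samples of size `n` from a skewed Exponential(1) population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


def skewness(values):
    """Simple moment-based skewness; close to 0 for a symmetric, normal-like shape."""
    m, s = mean(values), stdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (2, 30):
    means = sample_means(n, 5000, rng)
    print(f"n={n}: sd of sample means={stdev(means):.3f}, skewness={skewness(means):.2f}")
```

With the larger sample size, the skewness of the sample means should move toward 0 and their standard deviation toward the population standard deviation divided by the square root of n.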

· 1-Sample Z computes a confidence interval or performs a hypothesis test
of the mean when the population standard deviation, σ, is known. This
procedure is based upon the normal distribution, so for small samples it
works best if your data were drawn from a normal distribution or one that
is close to normal. By the Central Limit Theorem, you may also use this
procedure if you have a large sample, substituting the sample standard
deviation for σ. A common rule of thumb is to consider samples of size 30
or higher to be large samples. Many analysts choose the t-procedure over
the Z-procedure whenever σ is unknown.

· 1-Sample t computes a confidence interval or performs a hypothesis test
of the mean when σ is unknown. This procedure is based upon the
t-distribution, which is derived from a normal distribution with unknown σ.
For small samples, this procedure works best if your data were drawn from
a distribution that is normal or close to normal. This procedure is more
conservative than the Z-procedure and should always be chosen over the
Z-procedure with small sample sizes and an unknown σ. Many analysts choose
the t-procedure over the Z-procedure whenever σ is unknown. According to
the Central Limit Theorem, you can have increasing confidence in the
results of this procedure as sample size increases, because the
distribution of the sample mean becomes more like a normal distribution.

· 2-Sample t computes a confidence interval and performs a hypothesis test
of the difference between two population means when the population
standard deviations, σ, are unknown and the samples are drawn independently
of each other. This procedure is based upon the t-distribution, and for
small samples it works best if the data were drawn from distributions that
are normal or close to normal. You can have increasing confidence in the
results as the sample sizes increase.

· Paired t computes a confidence interval and performs a hypothesis test
of the difference between two population means when observations are
paired (matched). When data are paired, as with before-and-after
measurements, the paired t-procedure results in a smaller variance and
greater power to detect differences than the 2-sample t-procedure above,
which assumes that the samples were drawn independently.
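
The calculations behind these procedures can be sketched outside Minitab as follows. This is a minimal illustration, not Minitab's implementation: the data, the known σ of 1.3, and the function names are hypothetical, and the 2-sample statistic shown here does not pool the variances, which is an assumption about the variant being compared.

```python
import math
from statistics import NormalDist, mean, stdev


def z_interval(sample, sigma, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / math.sqrt(len(sample))
    center = mean(sample)
    return center - half_width, center + half_width


def two_sample_t(x, y):
    """t statistic for independent samples, without pooling the variances."""
    standard_error = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / standard_error


def paired_t(x, y):
    """t statistic computed from the within-pair differences."""
    differences = [a - b for a, b in zip(x, y)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error


# Hypothetical before-and-after measurements on the same six units.
before = [12.1, 14.3, 11.8, 15.2, 13.4, 12.9]
after = [11.6, 13.9, 11.1, 14.8, 12.7, 12.5]

print(z_interval(before, sigma=1.3))  # sigma = 1.3 is an assumed known value
print(two_sample_t(before, after))    # ignores the pairing
print(paired_t(before, after))        # uses the pairing
```

On these data the paired statistic is much larger than the 2-sample statistic, because the unit-to-unit variation cancels out in the within-pair differences.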

## Confidence intervals and hypothesis tests of proportions

· 1 Proportion computes a confidence interval and performs a hypothesis
test of a population proportion.

· 2 Proportions computes a confidence interval and performs a hypothesis
test of the difference between two population proportions.
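
As a rough illustration of what such procedures compute, the sketch below uses the textbook normal approximation. Minitab's own procedures may use different methods (for example, exact ones), so treat the formulas, the function names, and the counts as assumptions for illustration only.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def one_proportion(events, trials, p0, confidence=0.95):
    """Normal-approximation interval and two-sided test of H0: p = p0."""
    p_hat = events / trials
    z_crit = STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)
    half_width = z_crit * math.sqrt(p_hat * (1 - p_hat) / trials)
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / trials)
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    return (p_hat - half_width, p_hat + half_width), z, p_value


def two_proportions(events1, trials1, events2, trials2):
    """Two-sided test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = events1 / trials1, events2 / trials2
    pooled = (events1 + events2) / (trials1 + trials2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / trials1 + 1 / trials2))
    z = (p1 - p2) / standard_error
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    return p1 - p2, z, p_value


# Hypothetical counts: 60 events in 100 trials versus 45 events in 100 trials.
print(one_proportion(60, 100, p0=0.5))
print(two_proportions(60, 100, 45, 100))
```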

## Confidence intervals and hypothesis tests of Poisson rates

· 1-Sample Poisson Rate computes a confidence interval and performs a
hypothesis test on the occurrence rate and mean number of occurrences in
a Poisson process.

· 2-Sample Poisson Rate computes a confidence interval and performs a
hypothesis test on the difference in occurrence rates and the difference
in the mean number of occurrences of two Poisson processes.
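
For a single Poisson process, the quantities involved can be sketched as below. This is an illustration under stated assumptions, not Minitab's method: the interval uses a normal approximation, the test is a one-sided exact tail probability, and the function names and counts are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_rate_interval(total_occurrences, n_obs, length=1.0, confidence=0.95):
    """Estimated rate per unit of length and a normal-approximation interval."""
    exposure = n_obs * length  # total amount of time, area, or items observed
    rate = total_occurrences / exposure
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(rate / exposure)
    return rate, (rate - half_width, rate + half_width)


def poisson_upper_tail_p(total_occurrences, n_obs, rate0, length=1.0):
    """Exact P(X >= observed total) when the true rate equals rate0."""
    mu = rate0 * n_obs * length
    below = sum(math.exp(-mu) * mu ** k / math.factorial(k)
                for k in range(total_occurrences))
    return 1.0 - below


# Hypothetical data: 40 occurrences in 20 observations of unit length.
rate, interval = poisson_rate_interval(40, 20)
print(rate, interval)                         # the mean number of occurrences is rate * length
print(poisson_upper_tail_p(40, 20, rate0=1.5))  # tests H0: rate = 1.5 against a higher rate
```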

## Confidence intervals and hypothesis tests of variance

· 1 Variance computes a confidence interval and performs a hypothesis
test on the variance of one sample.

· 2 Variances computes a confidence interval and performs a hypothesis
test for the equality, or homogeneity, of variance of two samples.
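
The classical test statistics for variances can be sketched as follows. These are the textbook chi-square and F statistics, which assume normally distributed data; whether Minitab uses these or other methods is not stated here, and the function names and data are hypothetical. Converting the statistics to p-values requires the chi-square and F distributions, for example from `scipy.stats`, which this sketch leaves out.

```python
from statistics import variance


def chi_square_statistic(sample, hypothesized_variance):
    """Statistic for H0: population variance equals hypothesized_variance."""
    return (len(sample) - 1) * variance(sample) / hypothesized_variance


def f_ratio(x, y):
    """Ratio of sample variances; values far from 1 suggest unequal variances."""
    return variance(x) / variance(y)


# Hypothetical measurements from two processes.
line_a = [10.2, 9.8, 10.5, 10.1, 9.9]
line_b = [10.0, 10.4, 9.6, 10.3, 9.7]

print(chi_square_statistic(line_a, hypothesized_variance=0.05))  # df = n - 1 = 4
print(f_ratio(line_a, line_b))                                   # df = (4, 4)
```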

## Measures of association

· Correlation calculates the Pearson product moment correlation coefficient
or the Spearman rank-order correlation coefficient for pairs of variables.
The Pearson coefficient measures the degree of linear relationship between
two variables, and the Spearman coefficient measures the degree of
monotonic relationship. You can obtain a p-value to test whether there is
sufficient evidence that the correlation coefficient is not zero.

By using a combination of Minitab commands, you can also
compute a partial correlation coefficient. A partial correlation coefficient
is the correlation coefficient between two variables while adjusting for
the effects of other variables.

· Covariance calculates the covariance for pairs of variables. The
covariance is a measure of the relationship between two variables, but
unlike the correlation coefficient it has not been standardized by
dividing by the product of the standard deviations of the two variables.
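
The relationship among these measures can be sketched in a few lines. This is an illustration only: the helper names and data are hypothetical, `ranks` assumes no tied values, and the partial correlation uses the standard first-order formula for a single adjusting variable rather than any particular sequence of Minitab commands.

```python
import math
from statistics import mean


def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson(x, y):
    """Covariance standardized by the product of the standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def ranks(values):
    """Ranks starting at 1; this simple version assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Pearson correlation applied to the ranks of the data."""
    return pearson(ranks(x), ranks(y))


def partial_correlation(x, y, z):
    """Correlation of x and y after adjusting for one other variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical data: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
z = [2, 1, 4, 3, 6]

print(covariance(x, y), pearson(x, y), spearman(x, y))
print(partial_correlation(x, y, z))
```

Because y is a monotonic but nonlinear function of x here, the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1.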

## Tests for normality and outliers

· Normality Test generates a normal probability plot and performs a
hypothesis test to examine whether or not the observations follow a normal
distribution. Some statistical procedures, such as a Z- or t-test, assume
that the samples were drawn from a normal distribution. Use this procedure
to test the normality assumption.

· Outlier Test identifies a single outlier in a sample.

## Goodness-of-fit test

Goodness-of-Fit Test for Poisson evaluates whether your data follow a
Poisson distribution. Some statistical procedures, such as a U chart,
assume that the data follow a Poisson distribution. Use this procedure to
test this assumption.
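
A chi-square goodness-of-fit calculation of this kind can be sketched as follows. It is a minimal illustration under assumptions: the function name, the choice of the last category, and the data are hypothetical, the Poisson mean is estimated from the data, and Minitab's own handling of categories may differ. The p-value, which would come from a chi-square distribution with the returned degrees of freedom, is left out.

```python
import math


def poisson_gof(data, last_category):
    """Chi-square statistic comparing observed counts with Poisson expectations.

    Categories are 0, 1, ..., last_category - 1, and "last_category or more".
    """
    n = len(data)
    estimated_mean = sum(data) / n
    probabilities = [
        math.exp(-estimated_mean) * estimated_mean ** k / math.factorial(k)
        for k in range(last_category)
    ]
    probabilities.append(1.0 - sum(probabilities))  # P(X >= last_category)

    observed = [sum(1 for value in data if value == k) for k in range(last_category)]
    observed.append(sum(1 for value in data if value >= last_category))
    expected = [n * p for p in probabilities]

    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 2  # categories - 1 - one estimated parameter
    return chi_square, degrees_of_freedom, observed, expected


# Hypothetical defect counts for 50 inspected units.
defects = [0] * 14 + [1] * 18 + [2] * 12 + [3] * 6

chi_square, df, observed, expected = poisson_gof(defects, last_category=3)
print(chi_square, df)
print(observed, [round(e, 2) for e in expected])
```

A small statistic relative to its degrees of freedom indicates that the observed counts are close to what a Poisson distribution with the estimated mean would produce.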