In grades 9-12, the mathematics curriculum should include the continued
study of data analysis and statistics so that all students can:
- construct and draw inferences from charts, tables, and graphs that summarize data from real-world situations;
- use curve fitting to predict from data;
- understand and apply measures of central tendency, variability, and correlation;
- understand sampling and recognize its role in statistical claims;
- design a statistical experiment to study a problem, conduct the experiment, and interpret and communicate the outcomes;
- analyze the effects of data transformations on measures of central tendency and variability;
and so that, in addition, college-intending students can:
- transform data to aid in data interpretation and prediction;
- test hypotheses using appropriate statistics.
Focus
Collecting, representing,
and processing data are activities of major importance to contemporary
society. In the natural and social sciences, data are also summarized,
analyzed, and transformed. These activities involve simulations
and/or sampling, fitting curves, testing hypotheses, and drawing
inferences. To enhance their social awareness and career opportunities,
students should learn to apply these techniques in solving problems
and in evaluating the myriad statistical claims they encounter in
their daily lives.
The study of statistics
in grades 9-12 should consolidate, deepen, and build on student
understandings of methods of exploratory data analysis as developed
in the elementary and middle grades. Students should be encouraged
to apply statistical tools to other academic subjects through the
exploration of such data as student-opinion polls for social studies,
word or letter counts for English, and plant-growth records for
biology. Out-of-school activities such as athletics provide further
opportunities for data analysis, the results of which can be seen
to be immediately useful.
It is essential that students
come to understand the difference between the right-or-wrong quality
characteristic of most mathematical thinking and the qualified nature
of outcomes in statistical analysis. It is equally important, however,
that students do not extrapolate beyond this fact to reject statistical
thinking because it allows counterexamples. Instead, they should
recognize that statistics plays an important intermediate role between
the exactness of other mathematical studies and the equivocal nature
of a world dependent largely on individual opinion.
Computing technology allows
students to represent data in graphs quickly (with curve fitting
done for them) and to calculate statistical measures with remarkable
precision using single keystrokes. What is missing, and
what their study of statistics should provide, is an understanding
of which measures are appropriate for a given problem and what such
measures as mean, variance, and correlation can tell them about
a problem. Furthermore, it is essential that students learn to interpret
results intelligently.
Discussion
This standard should not
be viewed as advocating, or even prescribing, a statistics course;
rather, it describes topics that should be integrated with other
mathematics topics and disciplines. For example, curve fitting is
a statistical topic that integrates easily into the study of linear
and higher-order equations. Students could investigate the possible
relationship between car age and mileage by collecting data from
the school parking lot and constructing a scatter plot (fig. 10.1).
Fig. 10.1. Car mileage by model year
Since the points seem to
lie in a reasonably narrow band, students can identify a "best"
line that fits their data. Techniques for constructing such a line
can range from the most basic, such as placing a piece of uncooked
spaghetti or a string on the graph so that approximately the same
number of data points fall above as below, to approaches involving
medians of grouped data or the technique of least squares. Many
calculators provide the capability to generate an equation for the
regression line (one of these "best" lines) as well as
the associated correlation coefficient. Students then can use either
their graph or their equation to predict, for example, the expected
mileage of a 1980 car. They should be encouraged to write a summary
paragraph about the information displayed in the table or graph
and include inferences they believe are supported by their analysis
of the data.
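The least-squares approach mentioned above can be sketched in a few lines of Python. This is only an illustration: the parking-lot data are invented, "mileage" is assumed to mean the odometer reading in thousands of miles, and the function name least_squares_line is hypothetical rather than taken from any calculator or library.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing squared vertical errors."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical parking-lot data: model year and odometer reading
# (thousands of miles). These values are invented for illustration.
model_years = [1982, 1983, 1984, 1985, 1986, 1987, 1988]
odometer = [92, 85, 71, 64, 50, 38, 21]

slope, intercept = least_squares_line(model_years, odometer)
predicted_1980 = slope * 1980 + intercept
print(f"mileage = {slope:.2f} * year + {intercept:.1f}")
print(f"predicted mileage of a 1980 car: {predicted_1980:.0f} thousand miles")
```

Because 1980 lies outside the range of the invented data, the prediction is an extrapolation, which is itself a useful point for students to discuss in their summary paragraph.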
Students should also come
to realize that curve fitting is not appropriate for all data sets.
In paperback books, for example, there is so little relationship
between the number of pages and the price that further analysis
would not yield useful information.
Communication plays a central
role in statistical problems. Quantitative results require careful
exposition and interpretation if they are to have meaning. In particular,
it is often true that different modes of data representation convey
quite different messages. A regression line (calculated directly
from data without reference to a scatter plot) might be strongly
influenced by a few aberrant points, for example, whereas the scatter
plot for the same data might suggest that these outliers represent
anomalies that may be due to mistakes in data handling. A further
investigation of these specific points might lead to their rejection,
a new curve fit, and an improved correlation.
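A small sketch can make the influence of a single aberrant point concrete. The data below are invented (five nearly collinear points plus one suspicious value), and pearson_r is a hypothetical helper written out by hand so that each step of the calculation is visible.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: the last point is the aberrant one.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 2.0]

r_all = pearson_r(xs, ys)
r_without_outlier = pearson_r(xs[:-1], ys[:-1])
print(f"r with all six points:      {r_all:.3f}")
print(f"r without the sixth point:  {r_without_outlier:.3f}")
```

The numbers alone do not justify discarding the sixth point. As the text notes, the point should be rejected only if further investigation (for example, finding a data-entry mistake) supports doing so.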
Students must acquire intuitive
notions of randomness, representativeness, and bias in sampling
to enhance their ability to evaluate statistical claims. These understandings
would give students the appropriate tools for rejecting such television
advertising claims as one that portrays a series of people choosing
the same commercial toothpaste. (Here the implication of representativeness
clearly is not fulfilled.) If they are to grasp the concepts of
sampling, the central limit theorem, and confidence intervals, students
should have experience constructing and analyzing sampling distributions
through simulations. These experiences provide students with the
tools and the perspective they need to interpret such claims in
the media as, "The polls indicate that 55% of the voters, with
an error of 3%, prefer candidate X (with 90% confidence)."
Statistics and probability threads are interwoven here. College-intending
students also should apply this understanding of sampling in designing
their own experiments to test hypotheses.
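One way students might build a sampling distribution by simulation is sketched below. The poll size of 750 is an assumption (the quoted claim does not state it), the true level of support is assumed to be exactly 55%, and the margin-of-error formula uses the usual normal approximation. The function name simulate_sample_proportions is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulate_sample_proportions(true_p, sample_size, num_samples, seed=1):
    """Draw repeated random samples; return each sample's proportion of 'yes' answers."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_p for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]

TRUE_P = 0.55       # assumed true support for candidate X
SAMPLE_SIZE = 750   # hypothetical poll size, not given in the text
proportions = simulate_sample_proportions(TRUE_P, SAMPLE_SIZE, num_samples=1000)

# Fraction of simulated polls that land within 3 percentage points of the truth.
coverage = sum(abs(p - TRUE_P) <= 0.03 for p in proportions) / len(proportions)

# Normal-approximation margin of error at 90% confidence (5% in each tail).
z = NormalDist().inv_cdf(0.95)
margin = z * sqrt(TRUE_P * (1 - TRUE_P) / SAMPLE_SIZE)

print(f"share of simulated polls within 3 points: {coverage:.2f}")
print(f"approximate 90% margin of error: {margin:.3f}")
```

Changing SAMPLE_SIZE and rerunning lets students see how the spread of the sampling distribution, and therefore the stated error, depends on the number of people polled.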
Students should be aware
that bias can arise in the interpretation of results as well as
in sampling: the interpreter's predisposition or expectation may
strongly affect the message derived from the statistical results.
This often occurs in the presentation and interpretation of data
gathered for political purposes.
College-intending students
should become familiar with such distributions as the normal, Student's
t, Poisson, and chi-square. Students should be able to determine
when it is appropriate to use these distributions in statistical
analysis (e.g., to obtain confidence intervals or to test hypotheses).
Instructional activities should focus on the logic behind the process
in addition to the "test" itself.
In recent years, nonparametric
(distribution-free) methods, like the chi-square test in
the cola example in the standard on probability
(Standard 11), have increasingly been used as alternatives to
statistical tests that assume a particular (often normal) distribution.
Nonparametric techniques (which also include such procedures as the
sign test, the Mann-Whitney U test, and Spearman's rank correlation
test) are extremely versatile and easy to use, often derive their power
directly from combinatorics and the binomial distribution (of the
statistics, not the sample), and are particularly well suited to
small samples. As these methods continue to gain in popularity,
it is expected that they will become an integral part of the evolving
statistics curriculum.
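The sign test shows how such a method rests directly on the binomial distribution. The sketch below is a minimal two-sided version under the usual assumption that, if there is no real effect, each nonzero paired difference is equally likely to be positive or negative. The function name and the data are hypothetical.

```python
from math import comb

def sign_test_p_value(differences):
    """Two-sided sign test p-value for paired differences (zeros are dropped)."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)
    k = min(positives, n - positives)
    # Probability of k or fewer of the rarer sign under Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired data: change in score for ten students after a review session.
changes = [4, 7, 2, -1, 5, 3, 6, 1, 8, 2]
p_value = sign_test_p_value(changes)
print(f"two-sided sign test p-value: {p_value:.4f}")
```

Only the signs of the differences enter the calculation, which is why no assumption about the shape of the underlying distribution is needed.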
All students should be
encouraged to discover generalizations about how adding a constant to
every value in a data set, or multiplying every value by a constant, affects
the mean, median, mode, and variance. For example, class test scores
could be transformed by increasing each score by 10 points (or by
multiplying each score by 1.1). Technology provides an easy means
by which students can compute the statistics for the transformed
data, which, on analysis, lead to generalizations that the mean,
median, and mode are increased by 10 (multiplied by 1.1) and the
variance is unchanged (multiplied by (1.1)^2). College-intending
students should be able to derive these results algebraically.
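The generalizations can be checked numerically before they are derived. The sketch below uses invented test scores and Python's standard statistics module; summarize is a hypothetical helper, and the variance shown is the population variance. The algebraic facts being checked are that replacing each x by ax + b changes the mean to a(mean) + b and the variance to a^2 times the original variance.

```python
from math import isclose
from statistics import mean, median, mode, pvariance

def summarize(data):
    """Collect the four statistics discussed in the text for one data set."""
    return {
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "variance": pvariance(data),
    }

scores = [62, 70, 70, 75, 81, 88, 94]   # hypothetical class test scores
shifted = [s + 10 for s in scores]      # add 10 points to each score
scaled = [s * 1.1 for s in scores]      # multiply each score by 1.1

original = summarize(scores)
after_shift = summarize(shifted)
after_scale = summarize(scaled)

for label, stats in [("original", original), ("plus 10", after_shift), ("times 1.1", after_scale)]:
    print(label, {name: round(value, 2) for name, value in stats.items()})

# Ratio of variances after scaling should be close to 1.1 squared.
variance_ratio = after_scale["variance"] / original["variance"]
print(f"variance ratio after scaling: {variance_ratio:.4f}")
```

Comparing the three printed rows lets students state the pattern in their own words before attempting the algebra.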
Statistical data, summaries,
and inferences appear more frequently in the work and everyday lives
of people than any other form of mathematical analysis. It is therefore
essential that all high school graduates acquire, at the appropriate
level, the capabilities identified in this standard. This expectation
will require that statistics be given a more prominent position
in the high school curriculum.