Data Analysis and Probability
The Data Analysis and Probability Standard recommends that students formulate questions that can be answered using data and addresses what is involved in gathering and using the data wisely. Students should learn how to collect data, organize their own or others' data, and display the data in graphs and charts that will be useful in answering their questions. This Standard also includes learning some methods for analyzing data and some ways of making inferences and conclusions from data. The basic concepts and applications of probability are also addressed, with an emphasis on the way that probability and statistics are related.
The amount of data available to help make decisions in business, politics, research, and everyday life is staggering: Consumer surveys guide the development and marketing of products. Polls help determine political-campaign strategies, and experiments are used to evaluate the safety and efficacy of new medical treatments. Statistics are often misused to sway public opinion on issues or to misrepresent the quality and effectiveness of commercial products. Students need to know about data analysis and related aspects of probability in order to reason statistically. These skills are necessary to becoming informed citizens and intelligent consumers.
The increased curricular emphasis on data analysis proposed in these Standards
is intended to span the grades rather than to be reserved for the middle
grades and secondary school, as is common in many countries. NCTM's 1989
Curriculum and Evaluation Standards for School Mathematics introduced
standards in statistics and probability at all grade bands; a number of
organizations have developed instructional materials and professional
development programs to promote the teaching and learning of these topics.
Building on this base, these Standards recommend a strong development
of the strand, with concepts and procedures becoming increasingly sophisticated
across the grades so that by the end of high school students have a sound
knowledge of elementary statistics. To understand the fundamentals of
statistical ideas, students must work directly with data. The emphasis
on working with data entails students' meeting new ideas and procedures
as they progress through the grades rather than revisiting the same activities
and topics. The data and statistics strand allows teachers and students
to make a number of important connections among ideas and procedures from
number, algebra, measurement, and geometry. Work in data analysis and
probability offers a natural way for students to connect mathematics with
other school subjects and with experiences in their daily lives.
In addition, the processes used in reasoning about data and statistics
will serve students well in work and in life. Some things children learn
in school seem to them predetermined and rule bound. In studying data
and statistics, they can also learn that solutions to some problems depend
on assumptions and have some degree of uncertainty. The kind of reasoning
used in probability and statistics is not always intuitive, and so students
will not necessarily develop it if it is not included in the curriculum.
Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them
Because young children are naturally curious about their world, they often raise questions such as, How many? How much? What kind? or Which of these? Such questions often offer opportunities for beginning the study of data analysis and probability. Young children like to design questions about things close to their experience: What kind of pets do classmates have? What are children's favorite kinds of pizza? As students move to higher grades, the questions they generate for investigation can be based on current issues and interests. Students in grades 6–8, for example, may be interested in recycling, conservation, or manufacturers' claims. They may pose questions such as, Is it better to use paper or plastic plates in the cafeteria? or Which brand of batteries lasts longer? By grades 9–12, students will be ready to pose and investigate problems that explore complex issues.
Young children can devise simple data-gathering plans to attempt to answer their questions. In the primary grades, the teacher might help frame the question or provide a tally sheet, class roster, or chart on which data can be recorded as they are collected. The "data" might be real objects, such as children's shoes arranged in a bar graph or the children themselves arranged by interest areas. As students move through the elementary grades, they should spend more time planning the data collection and evaluating how well their methods worked in getting information about their questions. In the middle grades, students should work more with data that have been gathered by others or generated by simulations. By grades 9–12, students should understand the various purposes of surveys, observational studies, and experiments.
A fundamental idea in prekindergarten through grade 2 is that data can be organized
or ordered and that this "picture" of the data provides information about
the phenomenon or question. In grades 3–5, students should develop
skill in representing their data, often using bar graphs, tables, or line
plots. They should learn what different numbers, symbols, and points mean.
Recognizing that some numbers represent the values of the data and others
represent the frequency with which those values occur is a big step. As
students begin to understand ways of representing data, they will be ready
to compare two or more data sets. Books, newspapers, the World Wide Web,
and other media are full of displays of data, and by the upper elementary
grades, students ought to learn to read and understand these displays.
Students in grades 6–8 should begin to compare the effectiveness
of various types of displays in organizing the data for further analysis
or in presenting the data clearly to an audience. As students deal with
larger or more-complex data sets, they can reorder data and represent
data in graphs quickly, using technology so that they can focus on analyzing
the data and understanding what they mean.
Select and use appropriate statistical methods to analyze data
Although young children are often most interested in their own piece of data on a graph ("I have five people in my family"), putting all the students' information in one place draws attention to the set of data. Later, students should begin to describe the set of data as a whole. Although this transition is difficult (Konold forthcoming), students may, for example, note that "more students come to school by bus than by all the other ways combined." By grades 3–5, students should be developing an understanding of aggregated data. As older students begin to see a set of data as a whole, they need tools to describe this set. Statistics such as measures of center or location (e.g., mean, median, mode), measures of spread or dispersion (range, standard deviation), and attributes of the shape of the data become useful to students as descriptors. In the elementary grades, students' understandings can be grounded in informal ideas, such as middle, concentration, or balance point (Mokros and Russell 1995). With increasing sophistication in secondary school, students should choose particular summary statistics according to the questions to be answered.
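The descriptors named above can be illustrated with a minimal sketch. The family-size values are hypothetical, invented only for illustration, and the population form of the standard deviation is an assumed choice; the sample form would serve equally well.

```python
import statistics

# Hypothetical class data: number of people in each student's family.
family_sizes = [3, 4, 4, 5, 5, 5, 6, 7]

# Measures of center or location.
mean_size = statistics.mean(family_sizes)
median_size = statistics.median(family_sizes)
mode_size = statistics.mode(family_sizes)

# Measures of spread or dispersion.
data_range = max(family_sizes) - min(family_sizes)
std_dev = statistics.pstdev(family_sizes)  # population standard deviation

print("mean:", mean_size)
print("median:", median_size)
print("mode:", mode_size)
print("range:", data_range)
print("standard deviation:", round(std_dev, 3))
```

Which of these summaries is most useful depends on the question being asked: the mode answers "which family size is most common?", whereas the mean acts as a balance point for the whole set.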
Throughout the school years, students should learn what it means to make valid
statistical comparisons. In the elementary grades, students might say
that one group has more or less of some attribute than another. By the
middle grades, students should be quantifying these differences by comparing
specific statistics. Beginning in grades 3–5 and continuing in the
middle grades, the emphasis should shift from analyzing and describing
one set of data to comparing two or more sets (Konold forthcoming). As
they move through the middle grades into high school, students will need
new tools, including histograms, stem-and-leaf plots, box plots, and scatterplots,
to identify similarities and differences among data sets. Students also
need tools to investigate association and trends in bivariate data, including
scatterplots and fitted lines in grades 6–8 and residuals and correlation
in grades 9–12.
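As one possible illustration of the bivariate tools just mentioned, the sketch below fits a least-squares line, computes residuals, and computes the correlation coefficient. The data values and the helper names `fit_line` and `correlation` are hypothetical, and least squares is only one of several ways a line might be fitted in the classroom.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def correlation(xs, ys):
    """Return the Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements (for example, hours of practice and score).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
# A residual is the observed value minus the value predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
r = correlation(xs, ys)

print("fitted line: y =", round(intercept, 2), "+", round(slope, 2), "x")
print("residuals:", [round(e, 2) for e in residuals])
print("correlation r:", round(r, 3))
```

Residuals that scatter without pattern around zero suggest that a line is a reasonable model; a visible curve in the residuals suggests that it is not.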
Develop and evaluate inferences and predictions that are based on data
Central elements of statistical analysis (defining an appropriate
sample, collecting data from that sample, describing the sample, and making
reasonable inferences relating the sample and the population) should
be understood as students move through the grades. In the early grades,
students are most often working with census data, such as a survey of
each child in the class about favorite kinds of ice cream. The notion
that the class can be viewed as a sample from a larger population is not
obvious at these grades. Upper elementary and early middle-grades students
can begin to develop notions about statistical inference, but developing
a deep understanding of the idea of sampling is difficult (Schwartz et
al. 1998). Research has shown that students in grades 5–8 expect
their own judgment to be more reliable than information obtained from
data (Hancock, Kaput, and Goldsmith 1992). In the later middle grades
and high school, students should address the ideas of sample selection
and statistical inference and begin to understand that there are ways
of quantifying how certain one can be about statistical results.
In addition, students in grades 9–12 should use simulations to
learn about sampling distributions and make informal inferences. In particular,
they should know that basic statistical techniques are used to monitor
quality in the workplace. Students should leave secondary school
with the ability to judge the validity of arguments that
are based on data, such as those that appear in the press.
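A simulation of the kind described above might look like the following sketch. The population proportion of 0.6, the sample sizes, the number of samples, and the helper name `sample_proportions` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_proportions(true_proportion, sample_size, num_samples, rng):
    """Simulate repeated samples and return the proportion observed in each."""
    proportions = []
    for _ in range(num_samples):
        successes = sum(rng.random() < true_proportion for _ in range(sample_size))
        proportions.append(successes / sample_size)
    return proportions

rng = random.Random(42)  # fixed seed so that the run can be repeated

# Assumed population: 60 percent answer "yes" to some survey question.
small = sample_proportions(0.6, sample_size=25, num_samples=1000, rng=rng)
large = sample_proportions(0.6, sample_size=100, num_samples=1000, rng=rng)

center_small = statistics.mean(small)
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print("center of sample proportions (n = 25):", round(center_small, 3))
print("spread of sample proportions (n = 25):", round(spread_small, 3))
print("spread of sample proportions (n = 100):", round(spread_large, 3))
```

The sample proportions cluster around the population value, and the larger samples vary less. That comparison is one informal way to quantify how certain one can be about a result drawn from a single sample.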
Understand and apply basic concepts of probability
A subject in its own right, probability is connected to other areas of mathematics, especially number and geometry. Ideas from probability serve as a foundation to the collection, description, and interpretation of data.
In prekindergarten through grade 2, the treatment of probability ideas should be informal. Teachers should build on children's developing vocabulary to introduce and highlight probability notions, for example, "We'll probably have recess this afternoon," or "It's unlikely to rain today." Young children can begin building an understanding of chance and randomness by doing experiments with concrete objects, such as choosing colored chips from a bag. In grades 3–5 students can consider ideas of chance through experiments (using coins, dice, or spinners) with known theoretical outcomes or through designating familiar events as impossible, unlikely, likely, or certain. Middle-grades students should learn and use appropriate terminology and should be able to compute probabilities for simple compound events, such as the number of expected occurrences of two heads when two coins are tossed 100 times. In high school, students should compute probabilities of compound events and understand conditional and independent events. Through the grades, students should be able to move from situations for which the probability of an event can readily be determined to situations in which sampling and simulations help them quantify the likelihood of an uncertain outcome.
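The two-coin example can be worked by listing the equally likely outcomes, as in this minimal sketch. It assumes fair coins and independent tosses; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes when two fair coins are tossed: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Probability that both coins show heads.
favorable = [outcome for outcome in outcomes if outcome == ("H", "H")]
p_two_heads = Fraction(len(favorable), len(outcomes))

# Expected number of two-head results in 100 tosses of the pair.
num_tosses = 100
expected_two_heads = p_two_heads * num_tosses

print("outcomes:", outcomes)
print("P(two heads) =", p_two_heads)
print("expected occurrences in", num_tosses, "tosses:", expected_two_heads)
```

One of the four outcomes is two heads, so the probability is 1/4 and the expected count in 100 tosses is 25. An actual experiment will usually give a count near 25 rather than exactly 25.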
Many of the phenomena that students encounter, especially in school, have predictable outcomes. When a fair coin is flipped, it is equally likely to come up heads or tails. Which outcome will result on a given flip is uncertain. Even if ten flips in a row have resulted in heads, for many people it is counterintuitive that the eleventh flip has only a 50 percent likelihood of being tails. If an event is random and if it is repeated many, many times, then the distribution of outcomes forms a pattern. The idea that individual events are not predictable in such a situation but that a pattern of outcomes can be predicted is an important concept that serves as a foundation for the study of inferential statistics.
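Both ideas in this paragraph can be explored with a simulation such as the sketch below. The function name `flip_experiment`, the number of flips, and the seed are assumptions made for illustration; the streak length of ten matches the example in the text.

```python
import random

def flip_experiment(num_flips=1_000_000, streak_length=10, seed=0):
    """Simulate fair coin flips.

    Returns the overall proportion of heads and the proportion of heads
    among flips that immediately follow a run of at least `streak_length`
    heads (None if no such run occurred).
    """
    rng = random.Random(seed)
    heads_total = 0
    current_streak = 0       # consecutive heads ending at the previous flip
    flips_after_streak = 0
    heads_after_streak = 0
    for _ in range(num_flips):
        is_heads = rng.random() < 0.5
        if current_streak >= streak_length:
            flips_after_streak += 1
            heads_after_streak += is_heads
        heads_total += is_heads
        current_streak = current_streak + 1 if is_heads else 0
    overall = heads_total / num_flips
    if flips_after_streak == 0:
        return overall, None
    return overall, heads_after_streak / flips_after_streak

overall_heads, heads_after_run = flip_experiment()
print("proportion of heads overall:", round(overall_heads, 4))
print("proportion of heads right after ten heads in a row:", round(heads_after_run, 4))
```

In a long run, both proportions should come out close to one-half. No single flip can be predicted, and a streak of heads does not make tails more likely, yet the long-run pattern is stable.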