1. Question: formulate questions that
explore whether or not a relationship exists in a real-world context. The question
needs to be clearly stated and based on collectable data. Not all questions can be
answered!
"Data as they are" questions can be answered:
If the U.S. Presidential elections were held today, what percent of Americans would
vote for Al Gore as president?
"What-if questions under replicable circumstances" can be answered:
Among all American school children age 6 to 12, would giving Vitamin C prevent colds?
(You can imagine testing this out on more and more children.)
Questions about nonreplicable events, in general, can NOT be answered!
"How many U.S. troops would be in Bosnia if the American Revolution had failed?"
- data collection schedule (fixed intervals between pre and posttest, at the
beginning and end of participation, after each session, ...)
3. Displaying Data: select, use, and defend appropriate
methods of displaying data
display data by hand or by computer in a variety of ways, including circle graphs
4. Conclusions: present analyses and conclusions based on
displayed data. Conclusions are clearly stated and answer the question based on
available data.
read and interpret graphs that are provided
determine and use the most appropriate measure of central tendency in a given context
(mode, median, mean)
describe the variability of given data using range or box-and-whisker plots (range,
extremes, gaps, clusters, and quartiles)
analyse sets of data by comparing different measures of central tendency (mode, median,
mean)
histogram - graphical representation of a frequency table.
Technically the bars should touch, but not all graphing programs do this.
The program I used placed the value for the bar in the middle of each bar so I
chose the midpoint age values 37, 42, 47, 52, 57, 62, 67, and 72.
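The midpoint idea can be sketched in Python. The ages below are made up for
illustration, and the 5-year class intervals (35-39, 40-44, ..., 70-74) are an
assumption chosen so that the midpoints come out to 37, 42, ..., 72.

```python
from collections import Counter

# Hypothetical ages, invented only to illustrate the idea.
ages = [36, 38, 41, 44, 44, 47, 52, 53, 58, 61, 66, 73]

WIDTH = 5   # assumed class width: 35-39, 40-44, ..., 70-74
START = 35  # assumed lower limit of the first class

def class_midpoint(age):
    """Return the midpoint of the 5-year class that contains age."""
    lower = START + ((age - START) // WIDTH) * WIDTH
    return lower + (WIDTH - 1) // 2   # 35-39 -> 37, 40-44 -> 42, ...

# Frequency of each class, keyed by its midpoint.
counts = Counter(class_midpoint(a) for a in ages)

# Text histogram: one '#' per observation, bars labelled by midpoint.
midpoints = list(range(37, 73, WIDTH))
for m in midpoints:
    print(f"{m:>3} | {'#' * counts[m]}")
```

Each bar is labelled with the middle value of its class, which is what a
graphing program does when it centers the label under the bar.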
statistics - the systematic collection and
arrangement of large numbers of numerical observations, together
with ways of drawing useful conclusions from such data
population - all the people eligible for a data collection
investigation
sample - part of a population selected so as to give
information about the population as a whole
Biased Samples
convenience sampling - a quick and easy way
to obtain data, but not everyone in the population has an equal chance of being selected
self-selected sampling - members of the population
provide information by volunteering their opinions
cluster sampling - a particular segment of
the population is sampled
Unbiased Samples
systematic sampling - every nth member of the
population is sampled
simple random sampling - the sample is chosen randomly
from the population
stratified random sampling - the population is divided
into groups (strata), and a random sample is taken from each group
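The three unbiased methods (simple random, systematic, and stratified random
sampling) can be sketched in Python as follows. The population of 60 students
in grades 6 to 8, the sample sizes, and the names used are all hypothetical.
Drawing a random sample from each stratum is the usual way stratified sampling
is completed.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 students, 20 in each of grades 6, 7 and 8.
population = [(f"student{i:02d}", 6 + i % 3) for i in range(60)]

# Simple random sampling: every member has an equal chance of selection.
simple = rng.sample(population, 12)

# Systematic sampling: every nth member, from a random starting point.
n = 5
start = rng.randrange(n)
systematic = population[start::n]

# Stratified random sampling: divide the population into strata
# (here, grades), then draw a random sample from each stratum.
strata = {}
for member in population:
    strata.setdefault(member[1], []).append(member)
stratified = [m for group in strata.values() for m in rng.sample(group, 4)]

print(len(simple), len(systematic), len(stratified))
```

All three samples contain 12 students, but only the stratified sample is
guaranteed to include the same number from every grade.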
frequency - the number of times an event occurs
frequency table - a table showing a set of values of a variable and
the number of times each value occurs
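A frequency table can be built by counting how often each value appears. This
is a minimal sketch, and the sibling counts are made-up illustration data.

```python
from collections import Counter

# Hypothetical data: number of siblings reported by 12 students.
siblings = [1, 0, 2, 1, 3, 1, 0, 2, 1, 4, 2, 1]

# Counter maps each value to the number of times it occurs.
frequency = Counter(siblings)

print("value | frequency")
for value in sorted(frequency):
    print(f"{value:>5} | {frequency[value]}")
```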
The independent variable is always assigned to the X-AXIS.
What is the independent variable? The independent
variable does not rely on any other variable. The values of the independent
variable can be chosen freely.
There are
three types of relationships between variables:
linear
non-linear (curved-line or other pattern)
no relationship at all
axis - a line drawn through the center of a figure
scale - a sequence of marks, usually along a line, used
in making measurements
proportional - one variable is proportional to another if
the ratio of corresponding values remains constant
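One way to test for proportionality is to compute the ratio for every pair of
corresponding values and check that it never changes. The function name and the
two data sets below are hypothetical, and the sketch assumes whole-number
values with no zero in the first list.

```python
from fractions import Fraction

def is_proportional(xs, ys):
    """True if y/x is the same for every pair of corresponding values.

    Assumes integer values and no x equal to zero.
    """
    ratios = {Fraction(y, x) for x, y in zip(xs, ys)}
    return len(ratios) == 1

hours = [1, 2, 3, 4]
pay = [12, 24, 36, 48]   # always 12 per hour: ratio stays constant
taxi = [5, 7, 9, 11]     # flat fee of 3 plus 2 per unit: ratio changes

print(is_proportional(hours, pay))
print(is_proportional(hours, taxi))
```

The taxi fares still follow a straight line, but because of the flat fee the
ratio is not constant, so the fare is linear without being proportional.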
interpolation - estimating a value by following a
pattern and staying within the values already known
extrapolation - estimating a value by following a
pattern and going beyond the values already known
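The difference between the two can be sketched with a straight-line pattern
through two known points. The plant heights are made up, and assuming the
growth stays linear is exactly what makes extrapolation the riskier estimate.

```python
def estimate(x, x0, y0, x1, y1):
    """Estimate y at x by following the straight-line pattern
    through the known points (x0, y0) and (x1, y1)."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)

# Hypothetical plant heights: 4 cm on day 2 and 10 cm on day 6.
day_a, height_a = 2, 4
day_b, height_b = 6, 10

# Day 4 lies between the known days: interpolation.
inside = estimate(4, day_a, height_a, day_b, height_b)

# Day 10 lies beyond the known days: extrapolation.
beyond = estimate(10, day_a, height_a, day_b, height_b)

print(inside, beyond)
```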
discrete variable - has measurements that are distinct,
periodic, and unconnected between data points (e.g. the distance an athlete throws a
discus)
continuous variable - measurements are uninterrupted and connected
between data points (e.g. growth of a plant)
scatter plot - a graph that relates data from two
different sets
line of best fit (trend line) - a line drawn on a scatter plot
close to the points to show more clearly the trend between two sets of data
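A line of best fit is often drawn by eye. One common way to calculate it
instead is the least-squares method, sketched below. The method is swapped in
here as an illustration, and the study-hours data and function name are
hypothetical.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test mark for five students.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

slope, intercept = best_fit_line(hours, marks)
print(f"mark = {slope:.1f} * hours + {intercept:.1f}")
```

For this data the slope is about 4.1 marks per hour, and the line passes
through the point made from the two means (3 hours, 60 marks).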
trend - relationship between two sets of data. The trend
will show a positive correlation, a negative correlation, or no correlation.
positive correlation - both sets of data increase together
negative correlation - one set of data decreases as the
other set of data increases
no correlation - the two data sets are not related.
weak correlation - when the data is not clustered along
an obvious line
strong correlation - when the data is clustered along an
obvious line (can be positive or negative)
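Direction and strength can also be put into a single number, the correlation
coefficient r, which runs from -1 to 1. The sketch below computes r and labels
it. The cut-off values 0.3 and 0.7 are assumed rules of thumb rather than fixed
definitions, and the data sets are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, weak=0.3, strong=0.7):
    """Label r using assumed rule-of-thumb cut-offs."""
    if abs(r) < weak:
        return "no clear correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # points lie exactly on a rising line
falling = [10, 8, 6, 4, 2]    # points lie exactly on a falling line
scattered = [3, 1, 4, 1, 5]   # points only loosely follow a rising line

for ys in (rising, falling, scattered):
    print(describe(correlation(xs, ys)))
```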
lower extreme - minimum data value
upper extreme - maximum data value
range - upper extreme minus lower extreme
cluster - a group of data values bunched close together
gaps - intervals in the data set that contain no data
values
outlier - a point separated from the main body of the data
central tendency - point within the range about
which the rest of the data is considered balanced.
The three common measures of central tendency
are mean, median and mode.
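The three measures can be compared on one data set using Python's standard
statistics module. The scores are made up, and the value 25 is included on
purpose as an outlier.

```python
from statistics import mean, median, multimode

# Hypothetical quiz scores; 25 is a deliberate outlier.
scores = [4, 7, 7, 8, 9, 10, 25]

average = mean(scores)      # sum of the values divided by how many there are
middle = median(scores)     # middle value once the data is sorted
modes = multimode(scores)   # most frequent value(s), returned as a list

print(average, middle, modes)
```

The outlier pulls the mean (10) above the median (8), which is one reason the
median can be the more appropriate measure when a data set has extreme values.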
lower quartile - separates the lowest 25% of the distribution from the
remaining 75%.
upper quartile - separates the lowest 75% of the distribution from the
remaining 25%.
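The values needed for a box-and-whisker plot can be sketched as below. This
uses the "median of each half" method for the quartiles, which is an
assumption: other conventions exist and can give slightly different answers.
The data and function name are hypothetical, and the sketch needs at least two
values.

```python
from statistics import median

def five_number_summary(data):
    """Extremes, quartiles and median, using the median-of-halves method."""
    values = sorted(data)
    half = len(values) // 2
    lower_half = values[:half]    # values below the median position
    upper_half = values[-half:]   # values above the median position
    return {
        "lower extreme": values[0],
        "lower quartile": median(lower_half),
        "median": median(values),
        "upper quartile": median(upper_half),
        "upper extreme": values[-1],
    }

# Hypothetical data set of nine values.
data = [3, 5, 7, 8, 9, 11, 13, 15, 20]

summary = five_number_summary(data)
data_range = summary["upper extreme"] - summary["lower extreme"]

for name, value in summary.items():
    print(f"{name}: {value}")
print(f"range: {data_range}")
```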