biostatistics exam 1 ch 1-3

research

systematic study of one or more problems, usually posed as research questions germane to a specific discipline.
population

larger group researcher wants to draw conclusions about.
parameter
- characteristic of a population.
- usually unknown, but estimated with statistics.
sample

group of the population that is actually studied.
statistic

characteristic of a sample.
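
The four cards above (population, parameter, sample, statistic) can be illustrated with a small sketch. The population values here are simulated and purely hypothetical; in real research the parameter is usually unknown and only the statistic is observed.

```python
import random
import statistics

# Hypothetical population: 1,000 simulated systolic blood pressure readings.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(120, 15) for _ in range(1000)]

# Parameter: a characteristic of the whole population.
population_mean = statistics.mean(population)

# Sample: the part of the population that is actually studied.
sample = rng.sample(population, 50)

# Statistic: a characteristic of the sample, used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```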
statistics

a branch of applied mathematics that deals with collecting, organizing, and interpreting data using well-defined procedures.
purpose of statistics (3 parts)
- describe and summarize info, reducing it to smaller, more meaningful data sets.
- make predictions or generalize about occurrences based on observations.
- identify associations, relationships or differences in observations.
types of statistics (2)

descriptive and inferential.
descriptive statistics

characterizes data by summarizing it into more understandable terms without losing or distorting much information.
inferential statistics

provides predictions about a population's characteristics based on information from a sample of that population.
data

raw materials of research gathered from a sample that has been selected from a population.
variable
- characteristic being measured that varies among persons, events, or objects being studied.
- a concept that a method of measurement has been determined for.
measurement

assignment of numerals to objects or events according to a set of rules.
types of measurement scales (4)

nominal, ordinal, interval & ratio.
nominal scale
- lowest form of data.
- organizes data into discrete units.
- allows researcher to assign numbers that classify characteristics of people, objects or events into categories.
- assignment of numerals is arbitrary.
qualitative nominal variable
categorical nominal variable
types of nominal variables

categorical and qualitative
ordinal scale
- places characteristics into categories and categories are ordered in some meaningful way.
- distance between categories is unknown.
interval scale
- distances between category values are equal due to some accepted physical unit of measurement.
- e.g. temperature in degrees Fahrenheit
types of interval variables

continuous and discrete
continuous interval variable

may take on any numerical value within a variable's range.
discrete interval variable

takes on only a finite number of values between two points.
ratio scale
- most precise level of measurement.
- meaningfully ordered characteristics with equal intervals between them and the presence of a zero point that is determined by nature.
- e.g. pulse, blood pressure, weight
meaningfulness

clinical or substantive meaning of the results of statistical analysis.
davidson's principles
- principles of statistical data handling to fill the gap between getting data into the computer and running statistical tests.
- summarize the key dilemmas that researchers face when entering data into the computer.
- he wrote a book in 1996 about them. our book only covers 18.
dp - appropriate data principle
- you cannot analyze what you do not measure.
- must anticipate variables needed to explain results.
dp - social consequences principle
- data about people are about people.
- can have social consequences.
- e.g. if a drug is proven not to work better, is it unethical to advise people to take it?
dp - data control principle
- take control of structure and flow of data.
- monitor procedure for layout of data record.
dp - data efficiency principle

be efficient in getting your data into a computer, but not at the cost of losing crucial information.
dp - change awareness principle

data entry is an interactive process. try to use the computer to do as much computing and debugging as possible.
dp - data manipulation principle
- let the computer do as much work as possible.
- let it manipulate the data for you by instructing it to do so.
dp - original data principle

always save a computer file of the original, unaltered data.
dp - default principle
- know your software's default settings and whether they meet your needs.
- (especially concerning missing values)
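
A minimal sketch of why defaults for missing values matter. It assumes a missing age was entered as the sentinel code 999, which is a hypothetical convention and not something the card specifies. Plain Python does not treat 999 as missing by default, so the naive mean is badly distorted.

```python
# Ages with one missing value stored as the sentinel 999 (hypothetical code).
MISSING = 999
ages = [34, 29, MISSING, 41]

# Default behavior: the sentinel is treated as a real age.
naive_mean = sum(ages) / len(ages)

# Explicitly excluding the missing code gives the intended result.
observed = [a for a in ages if a != MISSING]
clean_mean = sum(observed) / len(observed)

print(f"mean with sentinel included: {naive_mean}")
print(f"mean of observed values:     {clean_mean:.2f}")
```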
dp - complex data structure principle

if your software can accommodate complex data structures, then you might benefit from using that software feature.
dp - software's data relations principle

know if your software can perform the following four relations and if so, what commands are necessary for it to do so: subsetting, catenation, merging and relational database construction.
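
Three of the four relations can be sketched in plain Python. The record layouts, field names, and values below are hypothetical. Relational database construction is not shown.

```python
# Two hypothetical files from the same study, keyed by a participant id.
demographics = [
    {"id": 1, "sex": "F", "age": 34},
    {"id": 2, "sex": "M", "age": 29},
    {"id": 3, "sex": "F", "age": 41},
]
late_enrollees = [{"id": 4, "sex": "M", "age": 52}]
vitals = [
    {"id": 1, "pulse": 72},
    {"id": 2, "pulse": 80},
    {"id": 4, "pulse": 66},
]

# Subsetting: keep only the records that meet a condition.
women = [row for row in demographics if row["sex"] == "F"]

# Catenation: stack records with the same layout end to end.
all_people = demographics + late_enrollees

# Merging: combine records from two files that share a key.
pulse_by_id = {row["id"]: row["pulse"] for row in vitals}
merged = [
    {**person, "pulse": pulse_by_id[person["id"]]}
    for person in all_people
    if person["id"] in pulse_by_id  # participants without vitals are dropped
]
```

Participant 3 has no vitals record, so this particular merge drops that person. Whether unmatched records are kept or dropped is exactly the kind of software behavior the card says to know.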
dp - software's sorting principle

know how to perform a sort in your software and whether your software requires a sort before a by-group analysis or before merging.
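
Python's `itertools.groupby` is one example of a tool that needs sorted input for a by-group analysis: it only groups consecutive rows with the same key. The records below are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

records = [
    {"group": "treatment", "pulse": 70},
    {"group": "control", "pulse": 80},
    {"group": "treatment", "pulse": 74},
    {"group": "control", "pulse": 84},
]

def group_means(rows):
    """Mean pulse for each run of consecutive rows that share a group."""
    results = []
    for name, members in groupby(rows, key=itemgetter("group")):
        pulses = [row["pulse"] for row in members]
        results.append((name, sum(pulses) / len(pulses)))
    return results

# Without sorting, each group is split into several fragments.
unsorted_result = group_means(records)

# Sorting by the grouping key first gives one result per group.
sorted_result = group_means(sorted(records, key=itemgetter("group")))

print(unsorted_result)
print(sorted_result)
```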
dp - impossibility/ implausibility principle

use the computer to check for impossible and implausible data.
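
A minimal sketch of an automated check. The pulse limits are hypothetical placeholders; real limits would come from the study protocol or clinical judgment.

```python
# Hypothetical limits for resting pulse in beats per minute.
IMPOSSIBLE_BELOW, IMPOSSIBLE_ABOVE = 0, 300
IMPLAUSIBLE_BELOW, IMPLAUSIBLE_ABOVE = 40, 180

def check_pulse(value):
    """Classify a pulse value as 'impossible', 'implausible', or 'ok'."""
    if value <= IMPOSSIBLE_BELOW or value > IMPOSSIBLE_ABOVE:
        return "impossible"
    if value < IMPLAUSIBLE_BELOW or value > IMPLAUSIBLE_ABOVE:
        return "implausible"
    return "ok"

pulses = [72, -5, 720, 35, 88]
flags = [(p, check_pulse(p)) for p in pulses]

for value, status in flags:
    if status != "ok":
        print(f"check value {value}: {status}")
```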
dp - burnstein's data sensibility principle

run your data all the way through to the final computer analysis and ask yourself whether the results make sense.
dp - extant error principle

data bugs exist. even if you've corrected mistakes, it's possible you've missed something.
dp - manual check principle
- nothing can replace another pair of eyes to check over a data set.
- check it yourself or get someone else to do it.
dp - error typology principle
- debugging includes detection and correction of errors.
- try to classify each error as you uncover it.
dp - kludge principle
- sometimes the way to manipulate data is not elegant and seems to waste computer resources.
- patching together computer commands awkwardly to make data do what you want.
dp - atomicity principle
- you cannot measure below the data level that you observe.
- e.g. age recorded in categories (21-25, 26-29) is the lowest level of measurement.
- vs. asking for exact age (age? ___), which is more precise.
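
The example above can be sketched in code: exact ages can always be collapsed into brackets later, but brackets cannot be turned back into exact ages. The "other" label is an assumption added for ages outside the two brackets on the card.

```python
def age_bracket(age):
    """Collapse an exact age into one of the example brackets."""
    if 21 <= age <= 25:
        return "21-25"
    if 26 <= age <= 29:
        return "26-29"
    return "other"  # hypothetical catch-all, not from the card

exact_ages = [22, 24, 27, 29]
brackets = [age_bracket(a) for a in exact_ages]

# 22 and 24 both become "21-25", so the exact values are lost.
print(brackets)
```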
quantitative research

using specific methods to advance the science base of the discipline by studying phenomena relevant to the goals of that discipline.
individuals

objects being described by a set of data.
quantitative research methods

experiments, surveys, correlational studies, meta-analysis, and psychometric evaluations.
bar chart
- simplest form of a chart for nominal or ordinal data.
- category labels are listed horizontally in a systematic order, with vertical bars separated by spaces.
histogram
- appropriate for interval, ratio, and sometimes ordinal variables.
- similar to bar chart except bars are placed side by side.
polygon
- a chart for interval or ratio variables.
- it is equivalent to a histogram but appears smoother; made by connecting the midpoints of the tops of the bars.
pie chart

a circle that has been partitioned into the percentage distribution of a qualitative variable; total area is 100% = 360 degrees.
statistical table

data is organized into values or categories and then described with titles and captions.
working table
- a frequency distribution for interval or ratio variables.
- an ordered array of values.
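
A minimal sketch of a frequency distribution built as an ordered array of values. The scores are made up, and the cumulative frequency column is an addition for illustration.

```python
from collections import Counter

scores = [3, 1, 2, 3, 3, 2, 5, 1, 3]
counts = Counter(scores)

# Build rows of (value, frequency, cumulative frequency) in ascending order.
table = []
cumulative = 0
for value in sorted(counts):
    cumulative += counts[value]
    table.append((value, counts[value], cumulative))

for value, freq, cum in table:
    print(f"{value:>5} {freq:>5} {cum:>5}")
```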
mean and formula
- best known and widely used average.
- the center of a frequency distribution.
- x̄ = M = (Σx) / n
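
The formula can be checked directly in code. The data are made up.

```python
import statistics

x = [2, 4, 4, 5, 10]

# Mean = sum of x divided by n.
n = len(x)
mean_by_formula = sum(x) / n

# The standard library gives the same result.
mean_by_library = statistics.mean(x)

print(mean_by_formula)
```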
measures of central tendency
- mean
- median
- mode
- - center of trend or average
median
- middle value of a set of ordered numbers.
- point where 50% of distribution falls below and above.
- not affected by outliers.
- - place the numbers in order; if n is odd, the middle number is the median.
- - if n is even, average the middle two numbers.
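
A short sketch of the odd and even cases, plus the outlier point from the card. The data are made up.

```python
def median(values):
    """Middle value of the ordered numbers; average the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([7, 1, 5])       # ordered: 1, 5, 7
even_case = median([7, 1, 5, 3])   # ordered: 1, 3, 5, 7

# An extreme value moves the mean a lot but leaves the median unchanged.
with_outlier = [1, 3, 5, 7, 1000]
median_with_outlier = median(with_outlier)
mean_with_outlier = sum(with_outlier) / len(with_outlier)
```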
mode
- most frequent value or category in a distribution.
- not calculated, just observed.
- if all scores are different then there is no mode.
- - use when dealing with a frequency distribution for nominal data.
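
A minimal sketch of finding the mode by counting. Returning a list for ties is a design choice of this sketch, not something the card states. The blood type data are made up.

```python
from collections import Counter

def modes(values):
    """Most frequent value(s); empty list if every value occurs only once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []  # all scores are different, so there is no mode
    return [value for value, count in counts.items() if count == top]

blood_types = ["A", "B", "A", "O", "A", "B"]  # nominal data
print(modes(blood_types))
```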
homogeneous
- having low variability.
- numbers clustered.
heterogeneous
- having high variability.
- numbers are spread out.
variability
- measure of spread or dispersion.
- measure of degree to which scores in a distribution are spread out or clustered together.
types of variability
- standard deviation (SD)
- range
- interquartile range
interquartile range
- range of values extending from the 25th percentile to the 75th percentile.
- - divide by 2 for the semi-interquartile range.
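
A sketch using `statistics.quantiles` from the Python standard library (Python 3.8 or later). Statistical packages use several different conventions for computing percentiles, so other software may give slightly different quartiles for the same data. The data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

# n=4 returns the three cut points: 25th, 50th, and 75th percentiles.
p25, p50, p75 = statistics.quantiles(data, n=4)

iqr = p75 - p25
semi_iqr = iqr / 2

print(f"P25={p25}, P75={p75}, IQR={iqr}, semi-IQR={semi_iqr}")
```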
range
- max - min (highest number minus lowest number).
- simplest measure of variability.
- sensitive to extreme numbers.
- unstable since it's only based on two numbers.
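
A quick illustration of why the range is unstable: adding a single extreme value changes it drastically. The pulse values are made up.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

pulses = [60, 62, 65, 70]
range_without_outlier = value_range(pulses)

# One extreme reading changes the range from 10 to 80.
range_with_outlier = value_range(pulses + [140])
```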
standard deviation (SD)
- measure of dispersion of scores around the mean.
- most widely reported, indicates spread.
- low SD means close together and high means spread out.
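
A sketch comparing a clustered and a spread-out data set that share the same mean. The card does not say whether SD divides by n or n - 1; this sketch assumes the sample SD (n - 1), via `statistics.stdev`, and shows the population SD (`statistics.pstdev`) for comparison. The data are made up.

```python
import statistics

clustered = [4, 5, 5, 5, 6]   # scores close to the mean of 5
spread = [1, 3, 5, 7, 9]      # scores far from the mean of 5

sd_clustered = statistics.stdev(clustered)  # sample SD, divides by n - 1
sd_spread = statistics.stdev(spread)

# Population SD (divides by n) for a data set where it works out evenly.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
population_sd = statistics.pstdev(scores)

print(f"clustered SD: {sd_clustered:.2f}, spread SD: {sd_spread:.2f}")
```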
interpercentile measures
- interquartile range (IQR)
- - range of values from P25 to P75
- not sensitive to outliers.
- used on growth charts.
- good with skewed data.
- use with median.
skewness
- non symmetrical distribution.
- measure of symmetry.
kurtosis

measure of flatness.
measures of variability
- standard deviation
- range
- interpercentile measures
types of data transformation
- square root transformation
- log transformation
- inverse transformation
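
A minimal sketch applying the three transformations to made-up, positively skewed values. Using base-10 logs is an assumption; the card does not say which base. Log and inverse transformations as written here require values greater than zero.

```python
import math

values = [1, 4, 9, 100]  # one large value stretches the upper tail

square_root = [math.sqrt(v) for v in values]
log10 = [math.log10(v) for v in values]
inverse = [1 / v for v in values]

# Each transformation pulls the large value closer to the rest.
# Note that the inverse reverses the order of the scores.
print(square_root)
print([round(v, 3) for v in log10])
print([round(v, 3) for v in inverse])
```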
inverse transformation
log transformation
square root transformation
reflecting
normal distribution
symmetry
degrees of freedom
pearson's skewness coefficient
fisher's measure of skewness
SD formula
percentile
line chart
negatively skewed
box plot
positively skewed
outlier
winsorized mean
trimmed mean

Author

ar1529

23895

Description

for 1st exam in biostatistics class ch 1-3

Updated

2010-06-17T20:43:13Z