biostatistics exam 1 ch 1-3

  1. research
    systematic study of one or more problems, usually posed as research questions germane to a specific discipline.
  2. population
    larger group researcher wants to draw conclusions about.
  3. parameter
    • characteristic of a population.
    • usually unknown, but estimated with statistics.
  4. sample
    subset of the population that is actually studied.
  5. statistic
    characteristic of a sample.
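A minimal sketch of the parameter vs. statistic distinction. The blood pressure values and the choice of the first four people as the sample are made up for illustration.

```python
from statistics import mean

# Hypothetical population: systolic blood pressure of all 10 people of interest.
population = [118, 122, 130, 125, 140, 110, 135, 128, 120, 132]

# Hypothetical sample: the 4 people who were actually measured.
sample = population[:4]

parameter = mean(population)  # characteristic of the population (usually unknown)
statistic = mean(sample)      # characteristic of the sample, used as an estimate

print(parameter, statistic)
```

In practice only the statistic is available, and it stands in for the unknown parameter.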
  6. statistics
    a branch of applied mathematics that deals with collecting, organizing, and interpreting data using well-defined procedures.
  7. purpose of statistics (3 parts)
    • describe and summarize info, reducing it to smaller, more meaningful data sets.
    • make predictions or generalize about occurrences based on observations.
    • identify associations, relationships or differences in observations.
  8. types of statistics (2)
    descriptive and inferential.
  9. descriptive statistics
    characterizes data by summarizing it into more understandable terms without losing or distorting much information.
  10. inferential statistics
    provides predictions about a population's characteristics based on information from a sample of that population.
  11. data
    raw materials of research gathered from a sample that has been selected from a population.
  12. variable
    • characteristic being measured that varies among persons, events, or objects being studied.
    • a concept that a method of measurement has been determined for.
  13. measurement
    assignment of numerals to objects or events according to a set of rules.
  14. types of measurement scales (4)
    nominal, ordinal, interval & ratio.
  15. nominal scale
    • lowest form of data.
    • organizes data into discrete units.
    • allows researcher to assign numbers that classify characteristics of people, objects or events into categories.
    • assignment of numerals is arbitrary.
  16. qualitative nominal variable
  17. categorical nominal variable
  18. types of nominal variables
    categorical and qualitative
  19. ordinal scale
    • places characteristics into categories and categories are ordered in some meaningful way.
    • distance between categories is unknown.
  20. interval scale
    • distances between category values are equal due to some accepted physical unit of measurement.
    • e.g. temperature in degrees Fahrenheit
  21. types of interval variables
    continuous and discrete
  22. continuous interval variable
    may take on any numerical value within a variable's range.
  23. discrete interval variable
    takes on only a finite number of values between two points.
  24. ratio scale
    • most precise level of measurement.
    • meaningfully ordered characteristics with equal intervals between them and the presence of a zero point that is determined by nature.
    • e.g. pulse, bp, weight
  25. meaningfulness
    clinical or substantive meaning of the results of statistical analysis.
  26. davidson's principles
    • principles of statistical data handling to fill the gap between getting data into the computer and running statistical tests.
    • summarize the key dilemmas that researchers face when entering data into the computer.
    • he wrote a book in 1996 about them. our book only covers 18.
  27. dp - appropriate data principle
    • you cannot analyze what you do not measure.
    • must anticipate variables needed to explain results.
  28. dp - social consequences principle
    • data about people are about people.
    • can have social consequences.
    • e.g. if a drug is proven not to work better, is it unethical to advise people to take it?
  29. dp - data control principle
    • take control of structure and flow of data.
    • monitor procedure for layout of data record.
  30. dp - data efficiency principle
    be efficient in getting your data into a computer, but not at the cost of losing crucial information.
  31. dp - change awareness principle
    data entry is an interactive process. try to use the computer to do as much computing and debugging as possible.
  32. dp - data manipulation principle
    • let the computer do as much work as possible.
    • let it manipulate the data for you by instructing it to do so.
  33. dp - original data principle
    always save a computer file of the original, unaltered data.
  34. dp - default principle
    • know your software's default settings and whether they meet your needs.
    • (especially concerning missing values)
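A small sketch of why defaults around missing values matter. It assumes missing values are coded as `float("nan")`; real statistical packages each have their own conventions, so this only illustrates the general risk.

```python
import math

# Hypothetical scores; float("nan") marks a missing value.
scores = [4.0, float("nan"), 6.0, 8.0]

# A naive mean silently propagates the missing value.
naive_mean = sum(scores) / len(scores)

# Explicitly dropping missing values before averaging.
observed = [s for s in scores if not math.isnan(s)]
clean_mean = sum(observed) / len(observed)

print(math.isnan(naive_mean))  # the naive result is itself "missing"
print(clean_mean)
```

Whether a missing value is dropped, propagated, or treated as zero depends on the software, which is the point of checking the defaults.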
  35. dp - complex data structure principle
    if your software can accommodate complex data structures, then you might benefit from using that software feature.
  36. dp - software's data relations principle
    know if your software can perform the following four relations and if so, what commands are necessary for it to do so: subsetting, catenation, merging and relational database construction.
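A rough sketch of three of the four relations (subsetting, catenation, merging) using plain Python lists and dictionaries. The records and field names are hypothetical, and relational database construction is not shown.

```python
# Hypothetical records keyed by a subject id.
visit1 = [{"id": 1, "pulse": 72}, {"id": 2, "pulse": 88}]
visit2 = [{"id": 3, "pulse": 65}]
demographics = {1: {"age": 34}, 2: {"age": 51}, 3: {"age": 29}}

# Subsetting: keep only the records that meet a condition.
high_pulse = [r for r in visit1 if r["pulse"] > 80]

# Catenation: stack two sets of records that share the same variables.
all_visits = visit1 + visit2

# Merging: add variables from another source by matching on id.
merged = [{**r, **demographics[r["id"]]} for r in all_visits]

print(high_pulse)
print(len(all_visits))
print(merged[0])
```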
  37. dp - software's sorting principle
    know how to perform a sort in your software and whether your software requires a sort before a by group analysis or before merging.
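One concrete case where a sort is required before a by-group analysis is Python's `itertools.groupby`, which only groups adjacent records. The data and the helper name `group_totals` are made up for this sketch.

```python
from itertools import groupby

# Hypothetical (group, value) records in arrival order.
rows = [("A", 5), ("B", 7), ("A", 9), ("B", 1)]

def group_totals(records):
    """Sum values per group; groupby only merges adjacent equal keys."""
    return [
        (key, sum(value for _, value in grp))
        for key, grp in groupby(records, key=lambda r: r[0])
    ]

# Without sorting, each group is split into fragments.
unsorted_totals = group_totals(rows)

# Sorting by the group key first gives one total per group.
sorted_totals = group_totals(sorted(rows, key=lambda r: r[0]))

print(unsorted_totals)
print(sorted_totals)
```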
  38. dp - impossibility/ implausibility principle
    use the computer to check for impossible and implausible data.
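A minimal sketch of an automated range check. The cutoffs for what counts as an impossible or implausible pulse are illustrative assumptions, not clinical guidance.

```python
def classify_pulse(bpm):
    """Label a pulse value; the cutoffs are illustrative assumptions."""
    if bpm <= 0:
        return "impossible"   # a pulse cannot be zero or negative in a living subject
    if bpm < 30 or bpm > 220:
        return "implausible"  # possible in principle, but worth a second look
    return "ok"

# Hypothetical entered values, including two data entry errors.
pulses = [72, -5, 250, 64]

# Collect only the values that need review.
flags = [(p, classify_pulse(p)) for p in pulses if classify_pulse(p) != "ok"]

print(flags)
```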
  39. dp - burnstein's data sensibility principle
    run your data all the way through to the final computer analysis and ask yourself whether the results make sense.
  40. dp - extant error principle
    data bugs exist; even if you've corrected mistakes, it's possible you've missed something.
  41. dp - manual check principle
    • nothing can replace another pair of eyes to check over a data set.
    • check it yourself or get someone else to do it.
  42. dp - error typology principle
    • debugging includes detection and correction of errors.
    • try to classify each error as you uncover it.
  43. dp - kludge principle
    • sometimes the way to manipulate data is not elegant and seems to waste computer resources.
    • a kludge patches together computer commands awkwardly to make the data do what you want.
  44. dp - atomicity principle
    • you cannot measure below the data level that you observe.
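    • e.g. recording age in brackets (21-25, 26-29) yields only categories, the lowest level of measurement.
    • vs. asking "age? ___", which records the exact value and is more precise.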
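The age example can be sketched as a one-way conversion: exact ages can always be collapsed into brackets, but brackets cannot be turned back into exact ages. The function name `age_bracket` and the "other" label are assumptions for illustration.

```python
def age_bracket(age):
    """Collapse an exact age into the brackets used in the example."""
    if 21 <= age <= 25:
        return "21-25"
    if 26 <= age <= 29:
        return "26-29"
    return "other"  # hypothetical catch-all for ages outside the two brackets

# Exact ages (the more precise measurement).
exact_ages = [22, 25, 27]

# Going from exact values to brackets loses information.
brackets = [age_bracket(a) for a in exact_ages]

print(brackets)
```

Ages 22 and 25 end up in the same bracket, so the original values cannot be recovered from the bracketed data.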
  45. quantitative research
    using specific methods to advance the science base of the discipline by studying phenomena relevant to the goals of that discipline.
  46. individuals
    objects being described by a set of data.
  47. quantitative research methods
    experiments, surveys, correlational studies, meta-analysis, and psychometric evaluations.
  48. bar chart
    • simplest form of a chart for nominal or ordinal data.
    • category labels are listed horizontally in a systematic order, with vertical bars separated by spaces.
  49. histogram
    • appropriate for interval, ratio, and sometimes ordinal variables.
    • similar to bar chart except bars are placed side by side.
  50. polygon
    • a chart for interval or ratio variables.
    • equivalent to a histogram but appears smoother; made by connecting the midpoints of the tops of the bars.
  51. pie chart
    a circle partitioned into slices showing the percentage distribution of a variable; the total area represents 100% (360 degrees).
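A short sketch of how counts map to pie slices, using the 100% = 360 degrees relationship. The blood type counts are hypothetical.

```python
# Hypothetical category counts.
counts = {"A": 10, "B": 5, "O": 5}
total = sum(counts.values())

# Each slice as (percent of total, degrees of the circle).
slices = {
    label: (100 * count / total, 360 * count / total)
    for label, count in counts.items()
}

for label, (percent, degrees) in slices.items():
    print(label, percent, degrees)
```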
  52. statistical table
    data is organized into values or categories and then described with titles and captions.
  53. working table
    • a frequency distribution for interval or ratio variables.
    • an ordered array of values.
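One way to build such a table is to count each value and list the values in order. This sketch uses `collections.Counter` on made-up data.

```python
from collections import Counter

# Hypothetical raw values.
values = [3, 1, 2, 3, 2, 3]

# Count how often each value occurs, then order by value.
freq = Counter(values)
table = sorted(freq.items())

for value, count in table:
    print(value, count)
```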
  54. mean and formula
    • best known and widely used average.
    • the center of a frequency distribution.
    • x̄ = M = Σx / n
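The formula translates directly into code: add up the values and divide by how many there are. The data are made up.

```python
# Hypothetical values.
x = [2, 4, 6, 8]

n = len(x)          # number of values
x_bar = sum(x) / n  # sum of x divided by n

print(x_bar)
```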
  55. measures of central tendency
    • mean
    • median
    • mode
    • each describes the center of a distribution (the trend or average).
  56. median
    • middle value of a set of ordered numbers.
    • point below which 50% of the distribution falls and above which 50% falls.
    • not affected by outliers.
    • place the numbers in order; if n is odd, the middle number is the median.
    • if n is even, average the middle two numbers.
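The odd and even cases can be sketched as follows. The function is a hand-written illustration that assumes a non-empty list; Python's `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the ordered data; average of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))     # odd n
print(median([4, 1, 3, 2]))  # even n
print(median([1, 3, 500]))   # an extreme value does not move the median
```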
  57. mode
    • most frequent value or category in a distribution.
    • not calculated, just observed.
    • if all scores are different then there is no mode.
    • use when dealing with a frequency distribution for nominal data.
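A sketch of finding the mode by counting, including the "no mode" case when every value is different. Returning a list (so ties are all reported) is a design choice of this example, not something the card specifies.

```python
from collections import Counter

def modes(values):
    """Most frequent value(s); an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # all values are different, so there is no mode
    return [value for value, count in counts.items() if count == top]

print(modes(["A", "O", "A", "B"]))  # works for nominal categories
print(modes([1, 2, 3]))
```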
  58. homogeneous
    • having low variability.
    • numbers clustered.
  59. heterogeneous
    • having high variability.
    • numbers are spread out.
  60. variability
    • measure of spread or dispersion.
    • measure of degree to which scores in a distribution are spread out or clustered together.
  61. types of variability
    • standard deviation (SD)
    • range
    • interquartile range
  62. interquartile range
    • range of values extending from the 25th percentile to the 75th percentile.
    • divide by 2 for the semi-interquartile range.
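A minimal sketch using `statistics.quantiles` from the Python standard library (3.8+). Software packages use slightly different rules for locating percentiles, so the exact quartile values from another tool may differ; the data are made up.

```python
from statistics import quantiles

# Hypothetical ordered data.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4)

iqr = q3 - q1       # 25th to 75th percentile
semi_iqr = iqr / 2  # semi-interquartile range

print(q1, q3, iqr, semi_iqr)
```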
  63. range
    • maximum minus minimum (highest value minus lowest value).
    • simplest measure of variability.
    • sensitive to extreme values.
    • unstable since it's only based on two numbers.
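A quick sketch of how a single extreme value changes the range. The scores are hypothetical.

```python
def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

# Hypothetical exam scores.
scores = [70, 72, 75, 78, 80]

range_without_outlier = value_range(scores)
range_with_outlier = value_range(scores + [150])  # one extreme score added

print(range_without_outlier, range_with_outlier)
```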
  64. standard deviation (SD)
    • measure of dispersion of scores around the mean.
    • most widely reported, indicates spread.
    • low SD means close together and high means spread out.
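A small comparison of a clustered and a spread-out data set. This sketch uses `statistics.stdev`, which computes the sample standard deviation (n - 1 in the denominator); whether the course formula uses n or n - 1 is not stated in these notes, so that choice is an assumption.

```python
from statistics import mean, stdev

# Two hypothetical data sets with the same mean.
clustered = [49, 50, 50, 51]
spread = [20, 40, 60, 80]

sd_clustered = stdev(clustered)  # small: scores sit close to the mean
sd_spread = stdev(spread)        # large: scores are far from the mean

print(mean(clustered), sd_clustered)
print(mean(spread), sd_spread)
```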
  65. interpercentile measures
    • interquartile range (IQR)
    • range of values from P25 to P75.
    • not sensitive to outliers.
    • used on growth charts.
    • good with skewed data.
    • use with median.
  66. skewness
    • a skewed distribution is non-symmetrical.
    • measure of a distribution's symmetry.
  67. kurtosis
    measure of flatness.
  68. measures of variability
    • standard deviation
    • range
    • interpercentile measures
  69. types of data transformation
    • square root transformation
    • log transformation
    • inverse transformation
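A sketch of the three transformations applied to the same made-up values. Using base 10 for the log is an assumption, and all three as written require positive values.

```python
import math

# Hypothetical positive values with one large value on the right.
values = [1, 4, 9, 100]

sqrt_t = [math.sqrt(v) for v in values]   # square root transformation
log_t = [math.log10(v) for v in values]   # log transformation (base 10 assumed)
inv_t = [1 / v for v in values]           # inverse transformation

print(sqrt_t)
print(log_t)
print(inv_t)
```

Each transformation pulls the large value closer to the rest, with the inverse also reversing the order of the values.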
  70. inverse transformation
  71. log transformation
  72. square root transformation
  73. reflecting
  74. normal distribution
  75. symmetry
  76. degrees of freedom
  77. pearson's skewness coefficient
  78. fisher's measure of skewness
  79. SD formula
  80. percentile
  81. line chart
  82. negatively skewed
  83. box plot
  84. positively skewed
  85. outlier
  86. winsorized mean
  87. trimmed mean