4130 Chapter 14: Rigour in Research

  1. Chapter 14: Rigour in Research
    focus on: rigour, content validity, construct validity, & error
    *This chapter relates to measurement tools such as questionnaires, surveys, pain assessment tools, self efficacy tools, grief tools, the MMSE, the Braden Scale & many, many more.

    *The soundness of quantitative research revolves around what is being studied, the aim of inquiry & how data will be measured & analyzed.

    *This is a complex topic that requires statistical testing on specialized statistical programs. The tests relate to reliability & validity of measurement instruments.

    *To formulate a questionnaire one must understand the concept being studied & have a sound literature review that is linked back to specific items on the tool. The researcher must understand what they are looking for.
  2. Rigour in Quantitative Research
    *Rigour refers to the quality, believability or trustworthiness of research findings.

    *It is determined by measurement instruments that are reliable & valid in reflecting the concepts of the theory being tested.

    *Psychometric (the application of statistical and mathematical techniques to psychological testing) assessments are designed to obtain evidence of the quality of these instruments.

    *Invalid instruments produce inaccurate generalizations about the larger population being studied. 

  3. Reliability & Validity
    *Reliability & validity are not all-or-nothing phenomena. No measuring instrument is completely reliable or valid; each is measured in degrees. Validity has to do with truth, strength & value.

    *Validity is not a commodity like money, where having it means the researcher has won.

    *It is like integrity, character or quality, to be assessed relative to the research purpose & with each study.

    *Validity testing validates the use of an instrument for a particular research study, not the tool itself. A tool may be valid in 1 situation but not in another.
  4. Reliability
    based on: stability, homogeneity, & equivalence 
    reliability coefficient: tests stability, homogeneity, & equivalence and expresses error between the 3 
    Reliability = consistency, predictability, accuracy, precision, stability, equivalence & homogeneity

    *It is the extent to which the instrument yields the same results on repeated measures.

    *It is the proportion of accuracy to inaccuracy in measurement.

    *The 3 main attributes of a reliable scale are:

    1. Stability: the ability of an instrument to produce the same results repeatedly.

    2. Homogeneity: all items in the measurement tool measure the same concept or characteristic.

    3. Equivalence: if the tool produces the same results when equivalent or parallel instruments or procedures are used. 

    Reliability coefficient: there are other ways to test reliability

    *The reliability coefficient refers to testing of stability, homogeneity & equivalence. 

    *It expresses the amount of error between these 3 attributes.

    *All measurement has some error (which accounts for extraneous variables). Because of this, reliability is expressed in degrees, in the form of a correlation coefficient.

    *Expressed in values 0 to 1.

    *1 is perfect reliability.

    *0 is no reliability.

    *Lowest acceptable reliability is 0.8 for a well-developed instrument & 0.7 for a newly developed tool.

    *This also can be expressed as a measurement having 80% reliability with a 20% random error. 
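
    *One way to read the 80%/20% statement, assuming the classical test theory view that an observed score is a true score plus random error uncorrelated with it (the symbols are illustrative & not from the chapter): the reliability coefficient is the share of observed-score variance that is true-score variance.

```latex
% Assumed notation: O = observed score, T = true score, E = random error
\sigma^2_{O} = \sigma^2_{T} + \sigma^2_{E},
\qquad
r_{xx} = \frac{\sigma^2_{T}}{\sigma^2_{O}} = 1 - \frac{\sigma^2_{E}}{\sigma^2_{O}},
\qquad
r_{xx} = 0.8 \;\Rightarrow\; \frac{\sigma^2_{E}}{\sigma^2_{O}} = 0.2
```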

  5. Other ways to test out Reliability
    1st attribute 
    2nd attribute 
    3rd attribute: Cronbach's alpha

    1st attribute:

    *The same results are obtained on repeated administration of an instrument. This is assessed in 2 ways:

    A. Test–retest reliability: the administration of the same instrument to the same subjects under similar conditions on 2 or more occasions (repeated measures). Scores from the different times are compared & expressed as a Pearson r (remember this test? It is a statistic calculated to reflect the degree of relationship between 2 variables; -0.3 to 0 to +0.3 = no relationship). Used in experimental & quasi-experimental situations.

    B. Parallel or alternate form reliability: can be tested only if 2 comparable forms of the same instrument exist. Subjects take one form of a test then a second form of a test or use one form of a tool then a second form of the tool at a later date. 

    *The researcher must ensure that each tool is measuring the same concept. The 2 forms of the tool must be highly correlated before being used. Rarely used in nursing as nursing concepts change over time. For this to be reliable the concept would need to be constant. 

    *Randomly divide the questions on a tool into 2 sets. The 2 forms can be used independently of each other & are considered equivalent measures.
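
    *A minimal sketch of the test-retest comparison described in A above, using plain Python. The scores are made up for illustration, & `pearson_r` is a hypothetical helper name, not something from the chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same 6 subjects on 2 occasions.
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]

r = pearson_r(time_1, time_2)
print(f"test-retest r = {r:.2f}")
```

    *With these invented scores r comes out at about 0.96, so the tool would look stable over time. The same calculation applies to parallel forms, with form 1 & form 2 scores in place of time 1 & time 2.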

    2nd attribute:

    *Homogeneity is the extent to which the items of an instrument measure the same concept. If the items on a tool correlate with or complement each other, the tool is one-dimensional.

    *Homogeneity is assessed by 4 methods:

    A. Item-to-total correlation: measures the relationship between each of the items in a tool & the total scale. If an item is not highly correlated it is removed. A value > 0.3 is acceptable.

    B. Split-half reliability: items are split in half & the 2 halves are compared. This measures the consistency in content of the tool. If the scores of the 2 halves are similar the tool is reliable. Levels should be at least 0.75.

    C. Kuder–Richardson coefficient: this is used when the responses in a tool are dichotomous (yes/no or true/false). Minimum acceptable level is 0.7. 

    D. Cronbach's alpha is the most commonly used measure of the internal consistency of a measurement tool.

    Each item on the scale is compared with each other.

    A total score is then used in the analysis of the data.

    You are looking for a Cronbach's alpha of at least 0.7 to indicate a high degree of internal consistency or that the items in a tool are indeed correlated & measuring the concept you are trying to measure.

    It is not a statistical test but a measure of sameness or consistency amongst items on a tool.
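
    *A minimal sketch of two of the homogeneity checks above (item-to-total correlation & Cronbach's alpha), assuming the standard variance-based formula for alpha. The responses are invented, & the function names are hypothetical.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # total score per respondent
    item_variance_sum = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


def item_total_correlations(items):
    """Correlation of each item with the total scale score."""
    totals = [sum(row) for row in zip(*items)]
    return [pearson_r(col, totals) for col in items]


# Hypothetical responses: 4 items on a 5-point Likert scale, 6 respondents.
items = [
    [4, 5, 2, 3, 5, 1],
    [4, 4, 2, 3, 5, 2],
    [5, 5, 1, 3, 4, 2],
    [3, 5, 2, 2, 5, 1],
]

alpha = cronbach_alpha(items)
item_total = item_total_correlations(items)

print(f"Cronbach's alpha = {alpha:.2f}")
for number, r in enumerate(item_total, start=1):
    verdict = "keep" if r > 0.3 else "consider removing"
    print(f"item {number}: item-to-total r = {r:.2f} ({verdict})")
```

    *Here each item is correlated with a total that still includes that item; a common variant subtracts the item from the total first. When every item is scored 0/1, this alpha formula gives the same value as the Kuder–Richardson (KR-20) coefficient.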

    3rd attribute:

    *Is the consistency or agreement amongst observers using the same measurement tool or the consistency or agreement between alternate forms of a tool. 

    *Two methods to test this are:

    1. Parallel or alternate form. (see previous slides for this)

    2. Interrater reliability: used when data are collected by observation. This requires training data collectors for sameness. Agreement amongst observers can be expressed as a %, or a test called Cohen’s Kappa can be used (this is much more reliable). A Kappa score of 0.8 or > indicates good interrater reliability.
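
    *A minimal sketch of why Kappa is preferred over a simple % agreement, assuming the usual two-rater formula (observed agreement corrected for the agreement expected by chance). The ratings are invented & `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Return (percent agreement as a proportion, Cohen's kappa) for 2 raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's category proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical: 2 observers score 10 wounds as healing (H) or not healing (N).
observer_1 = ["H", "H", "H", "H", "H", "H", "N", "N", "N", "N"]
observer_2 = ["H", "H", "H", "H", "H", "N", "N", "N", "N", "H"]

agreement, kappa = cohens_kappa(observer_1, observer_2)
print(f"agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```

    *With these made-up ratings the observers agree on 8 of 10 wounds (80%), yet Kappa is only about 0.58 once chance agreement is removed, which is below the 0.8 level.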
  6. Validity
    *Validity refers to whether a measurement instrument measures what it is intended to measure.

    *To be valid a tool must first be reliable. Reliability is a necessary but not a sufficient condition for validity.

    *A tool can be reliable but not valid. A valid tool cannot measure an attribute such as anxiety in an erratic or inconsistent way; it must do so accurately.

    *Remember not to get these terms mixed up with internal and external validity of a study (Chapter 9 re: controls). 

    *3 kinds of validity that vary according to the researcher's purpose:

    1. content validity.

    2. criterion-related validity.

    3. construct validity.

    1. Content Validity

    *This examines the extent to which the method of measurement includes all the major elements relevant to the construct or concept being measured (example self efficacy).

    *This begins with the development of a tool. The researcher defines the concept to be measured & the dimensions that are the components of the concept.

    *A panel of judges who are considered experts in the concept review the tool & give an opinion on how well it measures the concept. The tool is piloted & revised as needed.

    2. Criterion-related validity

    *This refers to what degree the subject's performance on a measurement tool & the subject’s actual behavior are related.

    *There are 2 types of criterion related validity:

    1. Concurrent validity: refers to the degree of correlation of 2 measures of the same construct administered at the same time. A high correlation indicates agreement between the 2 measures.

    2. Predictive validity: refers to the degree of correlation between the measure of a concept & the future measure of the same concept. 

    3. Construct Validity

    *The extent to which a test measures a theoretical construct or trait & attempts to validate a body of theory underlying the measurement & testing of the hypothesized relationships. This is done by using empirical testing. 4 approaches to this:

    1. Hypothesis-testing: example: individuals with ↑health status would have ↑levels of self efficacy.

    2. Convergent & divergent: Convergent is when 2 tools are used to measure a concept & they are found to have a + correlation. A + correlation indicates the presence of convergence. Divergent validity uses measurement approaches that differentiate one construct from others that may be similar. This is looking for opposites. If a negative correlation exists the strength of a tool's validity is ↑.

    3. Contrasted-groups: the researcher identifies 2 groups of individuals expected to score extremely high or extremely low in the characteristic being measured by the instrument. The tool is administered to the high & low scoring groups & then the scores are examined. The differences between the groups are noted. If the differences are significant then construct validity is supported.

    4. Factor analysis: this method assesses the degree to which the individual items on a scale truly cluster around one another in dimensions. Items that are designed to measure the same dimensions should load on the same factors & those designed to measure different dimensions should load on different factors.

  7. Critique of reliability and validity
    *Was the appropriate method used to test the reliability of the tool?

    *Is the reliability of the tool adequate?

    *Was an appropriate method used to test the validity of the instrument?

    *Is the validity of the measurement tool adequate?

    *If the sample from the development stage of the tool is different from the current sample, were the reliability & validity recalculated to determine whether the tool is still adequate?

    *Have the strengths & weaknesses of the reliability & validity of each instrument been presented?

    *Have the strengths & weaknesses of the rigour of the research been appropriately addressed in the discussion section of the report?
  8. Systematic Error
    Chance or random error
    *Systematic error is a measurement error attributable to relatively stable characteristics of the study population that may bias their behavior or cause incorrect instrument calibration. These are the extraneous variables, which can affect the responses or behavior of study participants. Examples are education level, socioeconomic status, family support or social desirability (answering a questionnaire in a certain way to be polite or provide an acceptable response).

    Definition: an error attributable to the lasting characteristics of the subject that don't tend to fluctuate from one time to another; also called constant error

    *Chance or random error is an error that is difficult to control, such as a participant’s anxiety at the time of testing. Such errors are unsystematic & cannot be corrected.


    *Every score collected by a researcher has a degree of error. The observed (actual) score is the participant's true score plus any error in measurement. Error can be by chance or be systematic.

    Definition: an error attributable to fluctuations in subject characteristics that occur at a specific point in time and are often beyond the awareness and control of the examiner; also called chance error
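
    *A small simulation sketch of observed score = true score + error, contrasting the 2 kinds of error. All numbers are invented: the weights, the 1500 g low-reading scale & the size of the random noise are assumptions chosen only for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical true weights in grams for 10,000 measurements.
true_weights = [rng.uniform(2500, 4500) for _ in range(10_000)]

SYSTEMATIC_BIAS = -1500  # a miscalibrated scale that always reads 1500 g low
RANDOM_SD = 50           # assumed spread of chance error, in grams

# Observed score = true score + error, once for each kind of error.
observed_systematic = [w + SYSTEMATIC_BIAS for w in true_weights]
observed_random = [w + rng.gauss(0, RANDOM_SD) for w in true_weights]

mean_systematic_error = mean(o - t for o, t in zip(observed_systematic, true_weights))
mean_random_error = mean(o - t for o, t in zip(observed_random, true_weights))

print(f"mean systematic error: {mean_systematic_error:.1f} g")
print(f"mean random error:     {mean_random_error:.1f} g")
```

    *The systematic error pushes every reading the same way, so its average stays at the size of the bias, while the random errors fall on both sides of the true score & average out close to 0 across many measurements.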
  9. Review
  10. Random/chance error = an error attributable to fluctuations in subject characteristics that occur at a specific point in time and are often beyond the awareness and control of the examiner; also called chance error

    Systematic error/constant error = an error attributable to the lasting characteristics of the subject that don't tend to fluctuate from one time to another; also called constant error

    EXAMPLES of each 
    Examples of systematic:

    1. the scale used to measure daily wts was inaccurate, reading 1500 g less than the actual wt

    2. students chose the socially acceptable responses on an instrument for assessing attitudes toward people with acquired immune deficiency syndrome (AIDS)

    Chance/random error:

    1. the evaluators were confused about how to score wound healing

    2. the subjects were nervous about taking the psychological tests
  11. Construct validity examples
    CV was supported by factor analysis yielding the 3 subscales of communication, consistent use, and correct use self-efficacy. In addition, CV was supported in that subscales allowed investigators to differentiate between regular and irregular condom users. Cronbach's alpha coefficient ranged between .72 and .78 for the subscales, and was .85 for the total scale.

    Previous studies have demonstrated that the daily hassles for adolescents inventory has good CV, as evidenced by its significant relationship to adjustment measures.
  12. Content Validity example ??????
    In studies using [the descriptive phenomenological] method, the researcher develops a 'rationale for investigating an experience and for seeking certain types of data in particular settings and circumstances' (Porter). Data to be included were the mothers' perceptions, actions, and intentions relevant to helping the young adults with [traumatic brain injury], and the interview guide was developed accordingly. Favourable views of its content validity were provided by an eligible mother who did not participate and by the case manager who worked with eligible mothers.
  13. Identify 3 concepts that are related to reliability
    • 1. stability
    • 2. homogeneity
    • 3. equivalence 
  14. Give an example each of the 2 types of tests for stability
    • 1. test-retest reliability
    • 2. parallel 
  15. each item on the test using a 5 point likert scale had a moderate correlation with every other item on the test

    --what kind of homogeneity test is this?
    Cronbach alpha