-
Must-Have Components of a Comprehensive Exam
- Hearing Screening
- Case History
- Oral Mechanism Exam
- A Communication Sample
- Specific areas of Eval
-
Specific Areas of Eval
- speech sound production
- language
- fluency
- voice
-
Validity
- extent to which the test measures what it says it will measure
- closely tied to the purpose of the test
*tests may be stronger in one type of validity than another or have aspects of multiple types
-
Construct Validity
Does the test measure what it says it will measure?
-
Content Validity
- How well do the test materials measure what the test aims to measure?
- Is the test comprehensive enough?
-
Criterion Validity
extent to which a test score predicts future performance on the same or a similar task
ex) some schools use the GRE as a criterion validity measure of how well a student will do in graduate school
-
Three Types of Validity
- construct
- content
- criterion
-
Reliability
Can the test be repeated under similar conditions with similar results?
-
Three Types of Reliability
- Inter-rater
- Test-Retest
- Split-half
-
Inter-rater Reliability
agreement between two different administrators when assessing client responses
ex) if one person administers the test and the other observes and also marks client responses, the two people should have the same results (see the sketch below)
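*a rough Python sketch (made-up ratings; the helper name is illustrative) of point-by-point percent agreement between two raters:

def percent_agreement(rater_a, rater_b):
    # percentage of items on which the two raters marked the same response
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * agreements / len(rater_a)

rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # 1 = correct, 0 = incorrect
rater_b = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
print(percent_agreement(rater_a, rater_b))  # 90.0 -> raters agreed on 9 of 10 items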
-
Test-Retest Reliability
the consistency of test results over time
ex) if the same test is administered to the same client again a week later, the results should be similar
-
Split-Half Reliability
- split the test in half, score both halves separately; the results should be similar (see the sketch below)
- refers to the internal consistency of a test
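*a rough Python sketch (hypothetical item scores; an odd/even split is one common choice; statistics.correlation needs Python 3.10+) of split-half reliability with the Spearman-Brown correction:

from statistics import correlation

# each row = one examinee's item scores (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 0, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
]

odd_totals = [sum(r[0::2]) for r in responses]    # totals on odd-numbered items
even_totals = [sum(r[1::2]) for r in responses]   # totals on even-numbered items

r_half = correlation(odd_totals, even_totals)     # correlation between the two halves
r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown estimate for the whole test
print(round(r_full, 2))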
-
Normal Distribution of Scores
- theory of how scores are expected to be distributed
- scores fall predictable distances from the mean
- data are symmetrical around the mean
- bell curve; roughly 34%, 14%, and 2% of scores fall within the first, second, and third standard deviations on either side of the mean (see the sketch below)
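*a rough Python sketch (standard normal curve via statistics.NormalDist) showing where the 34/14/2% figures come from:

from statistics import NormalDist

nd = NormalDist(mu=0, sigma=1)            # z-score scale
for lo, hi in [(0, 1), (1, 2), (2, 3)]:
    pct = (nd.cdf(hi) - nd.cdf(lo)) * 100
    print(f"{lo} to {hi} SD above the mean: {pct:.0f}%")   # ~34%, ~14%, ~2%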
-
Raw Score
- the initial score of a test
- has no interpretive value on its own
- must be changed to more meaningful scores (converted/derived scores)
-
Grade/Age Equivalence
- median raw score of an age/grade
- usually not used
- least useful, most dangerous
- leads to misunderstandings of child performance
- skills develop faster at younger ages, so raw scores increase at a greater rate early on
-
Commonly Used Converted Scores
- Percentile Rank
- Standard Scores
- Z-scores
- T-scores
- Stanine
-
Percentile Rank
the percentage of subjects' scores that fall at or below a particular raw score
-
Standard Scores
- Mean = 100
- Standard Deviation = 15
-
Z-scores
- Mean = 0
- Standard Deviation = 1
-
T-scores
- Mean = 50
- Standard Deviation = 10
-
Stanine
standard score bands that divide the distribution into 9 parts (see the sketch below for how the converted scores relate)
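*a rough Python sketch (made-up normative mean/SD; stanine uses the common 2z + 5 approximation) converting one raw score into the derived scores above:

from statistics import NormalDist

norm_mean, norm_sd = 40, 8    # from the normative sample (hypothetical)
raw = 34                      # test taker's raw score

z = (raw - norm_mean) / norm_sd               # z-score: mean 0, SD 1
standard = 100 + 15 * z                       # standard score: mean 100, SD 15
t = 50 + 10 * z                               # T-score: mean 50, SD 10
percentile = NormalDist().cdf(z) * 100        # % of scores at or below this raw score
stanine = min(9, max(1, round(2 * z + 5)))    # one of 9 bands

print(z, standard, t, round(percentile), stanine)   # -0.75 88.75 42.5 23 4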
-
Standard Error of Measurement (SEM)
- statistic used to increase precision in determining how close an observed score is to the true score
- estimates the range in which the true score on a test falls
- no test is error free or perfectly reliable
-
Calculate SEM based on...
- 1. an estimate of test reliability
- 2. the mean and standard deviation of scores (from the normative sample)
- 3. the test taker's observed score (see the sketch below)
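*a rough Python sketch (hypothetical reliability and SD) of the usual SEM formula, SEM = SD x sqrt(1 - reliability):

import math

reliability = 0.91    # e.g., a test-retest or split-half coefficient from the manual
sd = 15               # standard deviation of the normative sample

sem = sd * math.sqrt(1 - reliability)
print(round(sem, 1))  # 4.5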
-
Confidence Interval
- all the values within the range defined by the confidence limits of a sample statistic
- the SEM is used to build the confidence interval around an observed score
ex) observed score = 50; the true score is estimated to fall between roughly 47 and 53 at a 95% confidence interval (see the sketch below)
*we are confident that the subject's true score would fall in that range in 95 out of 100 test administrations
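*a rough Python sketch (hypothetical SEM; 1.96 is the z value for 95% confidence) of building the interval around an observed score:

observed = 50
sem = 1.5

lower = observed - 1.96 * sem
upper = observed + 1.96 * sem
print(f"95% CI: {lower:.1f} to {upper:.1f}")   # 95% CI: 47.1 to 52.9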
-
Criteria for Evaluating Formal Tests
- test administration and scoring
- reliability
- normative sample
- validity
-
Common Errors in Use of Norm-Reference Tests
- 1. Measuring Treatment Progress
- 2. Analyzing individual test items for treatment target selection
- 3. Forgetting they distort what they measure
- 4. Ignoring the cultural makeup of the normative sample
-
Basal
- the point from which scoring is recorded
- the level at which an individual passes all items on a test; items below it are assumed correct
-
Ceiling
- highest item number where a specified number of items have been failed
- it is assumed that all items above this level would be incorrect (see the sketch below)
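*a rough Python sketch (basal/ceiling rules vary by test; here basal = 3 consecutive passes, ceiling = 3 consecutive failures) of applying the rules to a response record:

def find_run(responses, value, run_length=3):
    # index where a run of `run_length` identical responses ends, or None if no such run
    streak = 0
    for i, r in enumerate(responses):
        streak = streak + 1 if r == value else 0
        if streak == run_length:
            return i
    return None

responses = [1, 1, 1, 1, 0, 1, 0, 0, 0]   # 1 = pass, 0 = fail, in item order
basal = find_run(responses, 1)            # items at/below this point assumed correct
ceiling = find_run(responses, 0)          # items above this point assumed incorrect
print(basal, ceiling)                     # 2 8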