What is the goal of classroom testing and assessment?
to obtain valid, reliable, and useful information concerning student achievement.
What are the basic steps in classroom testing and assessment?
- 1. Determining the purpose of measurement.
- 2. Developing specifications
- 3. Selecting appropriate assessment tasks.
- 4. Preparing relevant assessment tasks.
- 5. Assembling the assessment.
- 6. Administering the assessment.
- 7. Appraising the assessment.
- 8. Using the results.
- GOAL: Improved learning and instruction.
define and delimit the achievement domain to be measured and describe the sample of test items and assessment tasks to be prepared.
Building a two-way chart (table of specifications) involves what?
- 1. obtaining the list of instructional objectives
- 2. outlining the course content
- 3. preparing the two-way chart that relates the instructional objectives to the course content and specifies the nature of the desired sample of items and tasks.
Objective tests present students with...
a highly structured task that limits their response to supplying a word, brief phrase, number, or symbol or to selecting the answer from among a given number of alternatives.
Performance assessments permit students to
respond by selecting, organizing, and presenting ideas or performing in a way they consider appropriate.
The preparation of a set of relevant test items and assessment tasks involves:
- 1. obtaining a representative sample of all intended outcomes.
- 2. eliminating irrelevant barriers to the answer
- 3. preventing unintended clues to the response
- 4. focusing on improving learning and instruction.
How is a Table of Specifications set up?
- 1. List general instructional objectives across the top of the table.
- 2. List the major content areas down the left side of the table.
- 3. Determine what proportion of the test items should be devoted to each objective and each content area.
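The three steps above can be sketched in code. This is only an illustration: the content areas, objectives, weights, and the 40-item total are hypothetical, and the function name `allocate_items` is made up for this sketch. It assumes each set of weights sums to 1.0; because cell counts are rounded, the grid total may differ slightly from the planned total for other weights.

```python
def allocate_items(total_items, content_weights, objective_weights):
    """Spread total_items over a content-by-objective grid.

    Each cell gets total_items * content proportion * objective proportion,
    rounded to a whole number of items.
    """
    table = {}
    for content, c_weight in content_weights.items():
        table[content] = {
            objective: round(total_items * c_weight * o_weight)
            for objective, o_weight in objective_weights.items()
        }
    return table


# Hypothetical 40-item test: content areas down the side, objectives across the top.
content_weights = {"Fractions": 0.5, "Decimals": 0.25, "Percents": 0.25}
objective_weights = {
    "Knows terms": 0.2,
    "Understands principles": 0.4,
    "Applies principles": 0.4,
}

table = allocate_items(40, content_weights, objective_weights)
for content, row in table.items():
    print(content, row, "row total:", sum(row.values()))
```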
The number of items and performance tasks depends on
- 1. purpose of measurement
- 2. types of test items and assessment tasks used.
- 3. Age of the students.
- 4. level of reliability needed for effective use of the results.
What are possible barriers in test items and assessment tasks (7 items)?
- 1. Ambiguous statements.
- 2. Excessive wordiness.
- 3. Difficult vocabulary.
- 4. Complex sentence structure.
- 5. Unclear instructions.
- 6. Unclear illustrative material.
- 7. Racial, ethnic, or gender bias
What are common clues in test items?
- 1. Grammatical inconsistencies.
- 2. Verbal associations.
- 3. Specific determiners (e.g., always)
- 4. Phrasing of correct responses.
- 5. Length of correct responses.
- 6. Location of correct responses.
What are the General Suggestions for Writing Test Items and Assessment Tasks?
- 1. Use your test and assessment specifications as a guide.
- 2. Write more items and tasks than needed.
- 3. Write the items and tasks well in advance of the testing date.
- 4. Write each test item and assessment task so that the task to be performed is clearly defined and calls forth the performance described in the intended learning outcome.
- 5. Write each item or task at an appropriate reading level.
- 6. Write each item or task so that it does not provide help in responding to other items or tasks.
- 7. Write each item so that the answer is one that would be agreed on by experts.
- 8. Whenever a test item or assessment task is revised, recheck its relevance.
What are the simpler forms of objective test items?
- 1. short-answer
- 2. true-false
- 3. alternative-response items
- 4. matching exercises.
short-answer and completion items: how do they differ and what types of answers do they have?
- 1. They can be answered by a word, phrase, number, or symbol.
- 2. They are essentially the same, differing only in the method of presenting the problem.
- 3. Short-answer uses a direct question.
- 4. Completion consists of an incomplete statement.
The short-answer test item is suitable for measuring what type of learning outcomes?
- 1. knowledge of terminology
- 2. knowledge of specific facts (dates, weights, etc.)
- 3. Knowledge of principles. (if...decreases..what will happen to..)
- 4. Knowledge of method or Procedure. (what device is used...)
- 5. Simple interpretations of data. (If the number..., what value does x represent?)
Short-answer items can also be used to measure the ability to interpret diagrams, charts, graphs, and pictorial data.
Suggestions for Constructing Short-Answer Items.
- 1. Word the item so that the required answer is both brief and specific.
- 2. Do not take statements directly from textbooks to use as a basis for short-answer items. (taking out of context)
- 3. A direct question is generally more desirable than an incomplete statement.
- 4. If the answer is to be expressed in numerical units, indicate the type of answer wanted.
- 5. Blanks for answers should be equal in length and in a column to the right of the question.
- 6. When completion items are used, do not include too many blanks.
What does the alternative-response test item consist of?
a declarative statement that the student is asked to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, agree or disagree. There are only two possible answers.
What type of test items are true-false questions appropriate for?
in measuring the ability to identify the correctness of statements of fact, definitions of terms, and statements of principles. Also, the student's ability to distinguish fact from opinion, ability to recognize cause-and-effect relationships (is the relationship true or false). Some aspect of logic (if P then Q concludes Q)
What are the suggestions for creating True/False questions?
- 1. Avoid broad general statements if they are to be judged true or false.
- 2. Avoid trivial statements.
- 3. Avoid the use of negative statements, especially double negatives. (Students tend to overlook negative words such as no or not.) When negatives are used, they should be BOLD and underlined.
- 4. Avoid long, complex sentences.
- 5. Avoid including two ideas in one statement unless cause-and-effect relationships are being measured.
- 6. If opinion is used, attribute it to some source, unless the ability to identify opinion is being specifically measured.
- 7. True statements and false statements should be approximately equal in length.
- 8. The number of true statements and false statements should be approximately equal.
Matching exercises consist of
two parallel columns, with each word, number, or symbol in one column being matched to a word, sentence, or phrase in the other column. Items in the column for which a match is sought are called premises; items in the other column are called responses.
Examples of relationships for matching exercises.
- Dates..........................Historical Events
- Authors......................Titles of Books
- Objects......................Names of Objects
Suggestions for constructing matching exercises
- 1. Use only homogeneous material in a single matching exercise.
- 2. Include an unequal number of responses and premises, and instruct the student that responses may be used once, more than once, or not at all.
- 3. Keep the list of items to be matched brief, and place the shorter responses on the right.
- 4. Arrange the list of responses in logical order: words alphabetically, numbers in sequence.
- 5. Indicate in the directions the basis for matching the responses and premises.
- 6. Place all the items for one matching exercise on the same page.
The incorrect alternatives in multiple-choice items are called
distracters.
Multiple-choice items can be stated as
- 1. a direct question
- 2. incomplete statement.
Multiple-choice questions can be used to measure
Knowledge and understanding levels.
Knowledge outcomes include vocabulary, facts, principles, methods and procedures.
Understanding outcomes include application and interpretation of facts, principles, and methods.
What are the limitations with multiple-choice questions?
it is inappropriate for measuring learning outcomes requiring the ability to recall, organize, synthesize, or evaluate ideas.
What knowledge outcomes can be measured with Multiple Choice
- 1. Knowledge of Terminology
- 2. Knowledge of specific facts (who, what, when, where)
- 3. Knowledge of Principles.
- 4. Knowledge of methods and Procedures.
What Understanding outcomes can be measured with Multiple Choice
- 1. Ability to identify application of facts and principles.
- 2. Ability to Interpret Cause-and-Effect Relationships.
- 3. Ability to Justify Methods and Procedures.
Suggestions for constructing Multiple choice items
- 1. The stem of the item should be meaningful by itself and should present a definite problem.
- 2. The item stem should include as much of the item as possible and should be free of irrelevant material.
- 3. Use a negatively stated stem only when significant learning outcomes require it. (no, not, least,...)
- 4. All the alternatives should be grammatically consistent with the stem of the item.
- 5. An item should contain only one correct or clearly best answer.
- 6. Items used to measure understanding should contain some novelty, but beware of too much.
- 7. All distracters should be plausible. The purpose of a distracter is to distract the uninformed from the correct answer.
- 8. Verbal associations between the stem and the correct answer should be avoided.
- 9. The relative length of the alternatives should not provide a clue to the answer.
- 10. The correct answer should appear in each of the alternative positions an approximately equal number of times but in random order.
- 11. Use sparingly special alternatives such as "none of the above" or "all of the above." These are used to force students to consider all alternatives and to increase difficulty of the items.
- 12. Do not use multiple-choice items when other item types are more appropriate.
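Suggestion 10 (each alternative position holds the correct answer about equally often, in random order) can be sketched as follows. The function name `balanced_key`, the four positions A to D, and the 20-item test are assumptions made for illustration only.

```python
import random
from collections import Counter


def balanced_key(num_items, positions="ABCD", seed=None):
    """Return a random answer key in which every position is used
    an approximately equal number of times."""
    rng = random.Random(seed)
    repeats, remainder = divmod(num_items, len(positions))
    # Use every position `repeats` times, then fill any leftover items
    # with distinct, randomly chosen positions.
    key = list(positions) * repeats + rng.sample(list(positions), remainder)
    rng.shuffle(key)  # random order, so the location gives no clue
    return key


key = balanced_key(20, seed=7)
print("".join(key))
print(Counter(key))
```

The items would then be written (or their alternatives reordered) so that the correct answer falls in the position the key assigns.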
A variety of learning outcomes included in complex achievement are the ability to:
- 1. apply a principle
- 2. interpret relationships
- 3. recognize and state inferences
- 4. recognize the relevance of information.
- 5. develop and recognize tenable hypotheses
- 6. formulate and recognize valid conclusions.
- 7. recognize assumptions underlying conclusions
- 8. recognize the limitations of data.
- 9. recognize and state significant problems
- 10. design experimental procedures
- 11. interpret charts, tables, and data
- 12. evaluate arguments.
- 13. recognize warranted and unwarranted generalizations.
Suggestions for constructing interpretive exercises.
- 1. Select introductory material that is relevant to the objectives of the course.
- 2. Select introductory material that is appropriate to the students' curricular experience and reading level.
- 3. Select introductory material that is new to students.
- 4. Select introductory material that is brief but meaningful.
- 5. Revise introductory material for clarity, conciseness, and greater interpretive value.
- 6. Construct test items that require analysis and interpretation of the introductory material.
- 7. Make the number of test items roughly proportional to the length of the introductory material.
- 8. In constructing test items for an interpretive exercise, observe all pertinent suggestions for constructing objective items.
- 9. In constructing key-type test items, make the categories homogeneous and mutually exclusive.
- 10. In constructing key-type test items, develop standard key categories where applicable.
Complex achievement refers to learning outcomes based on higher mental processes, such as:
- 1. classification under various general headings,
- 2. understanding
- 3. reasoning,
- 4. thinking
- 5. and problem solving.
Interpretive exercise consists of a series of objective questions based on:
- 1. written materials
- 2. tables
- 3. charts
- 4. graphs
- 5. maps
- 6. pictures
Complex achievement/interpretive exercises may ask students to recognize/make
- 1. assumptions
- 2. inferences
- 3. conclusions
- 4. relationships
- 5. applications
What are the performance based assessments?
- 1. essay
- 2. gathering information
- 3. oral presentations
- 4. conducting experiments
- 5. repairing or manipulating equipment
- 6. portfolios
What is a restricted-response essay?
an essay usually limited both in content and the response. Restricted by scope of the topic, form of response, and length.
Suggestions for constructing essay questions
- 1. Restrict the use of essay questions to those learning outcomes that cannot be measured satisfactorily by objective items.
- 2. Construct questions that will call forth the skills specified in the learning standards.
- 3. Phrase the question so that the student's task is clearly defined.
- 4. Indicate an approximate time limit for each question.
- 5. Avoid the use of optional questions.
How are restricted response essays scored?
- 1. rubrics
- 2. list of looked for items.
What do analytic scoring rubrics for extended-response essays consist of?
- 1. Ideas and Content
- 2. Organization
- 3. Voice
- 4. Word Choice
- 5. Sentence fluency
- 6. Conventions
- 7. Citing Sources
Suggestions for scoring essay questions
- 1. Prepare an outline of the expected answer in advance.
- 2. Use the scoring rubric that is most appropriate.
- 3. Decide how to handle factors that are irrelevant to the learning outcomes being measured.
- 4. Evaluate all responses to one question before going on to the next one.
- 5. When possible, evaluate the answers without looking at the student's name.
- 6. If especially important decisions are to be based on the results, obtain two or more independent ratings.
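Suggestions 5 and 6 can be combined in a small sketch: answers are keyed by anonymous IDs rather than names, and two or more independent ratings are averaged. The function name `combine_ratings`, the ID format, and the rule of flagging ratings that differ by more than one point are assumptions for illustration, not part of the suggestions themselves.

```python
from statistics import mean


def combine_ratings(ratings_by_rater, max_gap=1):
    """Average independent ratings per (anonymous) student ID and flag
    answers where the raters disagree by more than max_gap points."""
    combined = {}
    first_rater = next(iter(ratings_by_rater.values()))
    for student_id in first_rater:
        scores = [ratings[student_id] for ratings in ratings_by_rater.values()]
        combined[student_id] = {
            "mean": mean(scores),
            "needs_review": max(scores) - min(scores) > max_gap,
        }
    return combined


# Hypothetical ratings of one essay question on a 6-point scale.
ratings = {
    "rater_a": {"S01": 4, "S02": 2},
    "rater_b": {"S01": 5, "S02": 5},
}
combined = combine_ratings(ratings)
for student_id, result in combined.items():
    print(student_id, result)
```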
What does an essay measure that cannot be measured by more objective means?
- 1. the ability to supply rather than merely identify interpretations and applications of data. (restricted-response)
- 2. The ability to organize, integrate, and express ideas. (extended-response)
What are the limitations of essay measurements?
- 1. scoring tends to be unreliable
- 2. scoring is time consuming
- 3. only a limited sampling of achievement is obtained.
Scoring procedures can be improved for the essay by:
- 1. using a scoring rubric
- 2. adapting the scoring method to the type of question.
- 3. controlling the influence of irrelevant factors
- 4. evaluating all answers to each question at one time.
- 5. evaluating without looking at the students' names, and
- 6. obtaining two or more independent ratings when important decisions are to be made.
Performance based assessments provide a basis to evaluate the effectiveness of what two items?
- 1. Process or procedure used (approach to data collection or manipulation of instruments)
- 2. Product resulting from performance of a task (completed report of results)
Problem formulation, organization of ideas, integration of types of evidence, and originality are all aspects.
Restricted-response performance tasks usually
- are relatively narrow in definition.
- Limitations on the types of performance expected are indicated.
Extended Performance Tasks
freedom enables students to demonstrate their ability to select, organize, integrate, and evaluate information and ideas.
Types of restricted-response performance tasks
- Ability to:
- read aloud
- ask directions in a foreign language
- construct a graph
- use a scientific instrument
- type a letter
Types of Extended-response performance tasks
- Ability to:
- collect, analyze, and evaluate data
- organize ideas, create visuals, and make an integrated oral presentation
- create a painting or perform with a musical instrument
- repair an engine
- write a creative short story.
Suggestions for constructing performance tasks
- 1. Focus on learning outcomes that require complex cognitive skills and student performances.
- 2. Select or develop tasks that represent both the content and the skills that are central to important learning outcomes.
- 3. Minimize the dependence of task performance on skills that are irrelevant to the intended purpose of the assessment task.
- 4. Provide the necessary scaffolding for students to be able to understand the task and what is expected.
- 5. Construct task directions so that the student's task is clearly indicated.
- 6. Clearly communicate performance expectations in terms of the scoring rubrics by which the performances will be judged.
A holistic rubric provides
- 1. descriptions of different levels of overall performance.
- 2. Holistic rubrics are efficient and correspond more directly to the global judgments required in the assignment of grades.
- 3. They do not provide students with specific feedback about the strengths and weaknesses of their performance.
Analytic scoring rubric requires
identification of different dimensions or characteristics of performance that are rated separately.
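A minimal sketch of analytic scoring: each dimension is rated separately, and the separate ratings are kept so the student can see strengths and weaknesses. The dimensions, the 0 to 4 scale, and the function name `score_analytic` are hypothetical choices for this example.

```python
def score_analytic(ratings, scale_max=4):
    """Summarize separately rated dimensions of one performance."""
    for dimension, rating in ratings.items():
        if not 0 <= rating <= scale_max:
            raise ValueError(f"{dimension}: rating {rating} is off the scale")
    return {
        "total": sum(ratings.values()),
        "max_total": scale_max * len(ratings),
        # The lowest-rated dimension is a natural place to focus feedback.
        "weakest": min(ratings, key=ratings.get),
    }


# Hypothetical ratings of an oral presentation.
ratings = {"Organization": 3, "Content": 4, "Delivery": 2}
summary = score_analytic(ratings)
print(summary)
```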
Rubric rating scales are often limited to:
- 1. making quality judgments (excellent, good, fair, poor)
- 2. scaled frequency judgments (always, frequently, sometimes, or never)
Quality of Explanation
- 6 = Excellent explanation (complete, clear, unambiguous)
- 5 = Good explanation (reasonably clear and complete)
- 4 = Acceptable explanation (problem completed but may contain minor flaws in explanation)
- 3 = Needs improvement (on the right track but may contain serious flaws; demonstrates only partial understanding)
- 2 = Incorrect or inadequate explanation (shows lack of understanding of problem)
- 1 = Incorrect without attempt at explanation
Separate Ratings of Answer and Explanation
- Answer:
- 4 = Correct
- 3 = Almost correct or partially correct
- 2 = Incorrect but reasonable attempt
- 1 = Incorrect with no relationship to the problem
- 0 = No answer
- Explanation:
- 4 = Complete, clear, logical
- 3 = Essentially correct but incomplete or not entirely clear.
- 2 = Vague or unclear but with redeeming features.
- 1 = Irrelevant, incorrect, or no explanation
Types of Rating Scales
- 1. Numerical ( 1 through 4)
- 2. Graphic rating Scale (Never, seldom, occasionally, frequently, always)
- 3. Descriptive Graphic Rating Scale (never, between, participates, between, participates more than others)
Common Errors in Rating
- 1. personal bias (generosity error, severity error, central tendency error)
- 2. halo effect (an error when a rater's general impression of a person influences the rating.)
- 3. logical errors (when two characteristics are rated as more alike or less alike than they actually are.)
Principles of Effective Rating
- 1. Characteristics should be educationally significant.
- 2. Identify the learning outcomes that the task is intended to assess.
- 3. Characteristics should be directly observable.
- 4. Characteristics and points on the scale should be clearly defined.
- 5. Select the type of scoring rubric that is most appropriate for the task and the purpose of the assessment.
- 6. Between three and seven rating positions should be provided.
- 7. Rate performances of all students on one task before going on to the next one.
- 8. When possible, rate performance without knowledge of the student's name.
- 9. When results from a performance assessment are likely to have long-term consequences for students, ratings from several observers should be combined.
Checklists for assessing a procedure consist of a series of sequential steps. What are they?
- 1. Identify each of the specific actions desired in the performance.
- 2. Add to the list those actions that represent common errors (if they are useful in the assessment, are limited in number, and can be clearly stated).
- 3. Arrange the desired actions (and likely errors, if used) in the approximate order in which they are expected to occur.
- 4. Provide a simple procedure for checking each action as it occurs (or for numbering the actions in sequence, if appropriate).
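The four steps above can be sketched as a small data structure plus a summary function. The microscope actions, the error entry, and the function name `summarize_checklist` are hypothetical; the sketch assumes each action is observed at most once.

```python
def summarize_checklist(steps, observed):
    """steps: (action, is_error) pairs in the expected order.
    observed: actions in the order the rater checked them off."""
    desired = [action for action, is_error in steps if not is_error]
    errors = [action for action, is_error in steps if is_error]
    done = [action for action in desired if action in observed]
    observed_desired = [action for action in observed if action in desired]
    return {
        "done": done,
        "missed": [action for action in desired if action not in observed],
        "errors": [action for action in errors if action in observed],
        # True only if the desired actions occurred in the expected sequence.
        "in_order": observed_desired == done,
    }


# Hypothetical checklist for using a microscope.
steps = [
    ("Places slide on stage", False),
    ("Turns to low power", False),
    ("Focuses with coarse adjustment", False),
    ("Touches lens with fingers", True),  # common error
    ("Switches to high power", False),
]
observed = [
    "Places slide on stage",
    "Focuses with coarse adjustment",
    "Turns to low power",
    "Touches lens with fingers",
]
result = summarize_checklist(steps, observed)
print(result)
```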
Another type of checklist is the Plus/Minus
- A plus (+) sign is entered on each item that is satisfactory.
- A minus (-) sign is entered on each item that is unsatisfactory.
Guidelines for Portfolios
- 1. specify the contents
- 2. specify types
- 3. specify the minimum number of entries.
- 4. intended audiences and who has access to the portfolio
- 5. Requirements for self-reflection and self-evaluation of both the entries and the portfolio as a whole.
- 6. The evaluation criteria that will be used as a whole should be clarified.
Advantages of Portfolios
- easily integrated with classroom instruction
- have value in encouraging students to develop self-evaluation skills.
- effective in communicating with parents
Disadvantages of using Portfolios
- 1. labor intensive for the teacher, requiring considerable time in planning, monitoring, and providing feedback to students.
- 2. difficult to score reliably.
Key Steps in Defining, Implementing, and Using Portfolios
- 1. Specify the Purpose
- 2. Provide guidelines for selecting portfolio entries
- 3. Define the student role in selection and self-evaluation
- 4. Specify the evaluation criteria.
- 5. Use portfolios in instruction and communication.
Four types of Portfolios
- 1. instruction and assessment
- 2. current accomplishments and progress.
- 3. showcase and documentation
- 4. finished and working portfolios.
Instructional Purposes for Portfolios
The primary purpose is instruction; portfolios might be used as a means of helping students develop and refine self-evaluation skills, that is, learning to evaluate one's own work.
Four dimensions distinguishing the purposes of portfolios
- Instruction <---- ----> Assessment
- Current Accomplishments <--- ----> Progress
- Best Work Showcase <---- ----->Documentation
- Finished <---- ----->Working