



 abstraction
 The act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances.
 achievement test
 A measure of knowledge and skills in a content area.
 acquiescence set
 The tendency to agree with statements on a test or affective measure.
 affective
 Having to do with attitudes, beliefs, and values.
 affective domain
 The area of human action which emphasizes the internalized processes such as emotion, feeling, interest, attitude, value, character development, and motivation.
 affective taxonomy
 A system for classifying different levels of internalization of an attitude or value.
 algorithm
 A method of computation, usually set up so that the calculation can be carried out routinely from the computational scheme, without mathematical understanding.
 analyze
 To separate into constituent parts or elements; to examine critically, so as to bring out the essential elements or give the essence of.
 anecdotal record
 A written description of an observed event.
 aptitude
 A natural talent or ability.
 assessment
 Collecting data in the context of conducting measurement.
 association form
 A short-answer item format in which the student is given a set of words or phrases and must supply corresponding words or phrases according to a defined basis.
 attitude test
 A measure of one's feelings.
 balance
 The selection and provision of test items such that subject matter topics and behaviors are sampled in accordance with established relative weights.
 biserial correlation
 Shows the degree of relationship between a continuous and a normally distributed variable which has been dichotomized.
 blind guessing
 The selection of an alternative for a selected-response item without using any knowledge or rational approach to the choice. The probability of choosing the correct response is at chance level. If there are two choices, a blind guess should result in the correct selection about 50 percent of the time; for four choices, 25 percent of the time; and so on.
 bluffing
 A strategy for responding to essay questions; providing an answer that may not directly address the question.
 Buckley Amendment
 Legislation that gives students and their parents access to information about themselves, including test scores.
 centile point
 The point on a scoring scale below which a certain percentage of the cases fall.
 central tendency
 An average or middle value for a distribution of scores.
 checklist
 A measure of the presence or absence of listed attributes.
 chi square
 Shows the degree of divergence between observed and expected frequencies.
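 As a rough illustration of the chi square entry, the statistic is commonly computed as the sum of (observed - expected)^2 / expected over all categories. The function name and the sample frequencies below are invented for this sketch.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E across categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 50 students pick one of two options,
# and an even 25/25 split is expected.
print(chi_square([30, 20], [25, 25]))  # 2.0
```

 A value of zero means the observed frequencies match the expected ones exactly; larger values mean greater divergence.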
 coefficient of determination
 The square of the correlation coefficient; the percentage of the variance in one variable that is predictable from another variable.
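 A minimal sketch of the coefficient of determination, assuming the correlation in question is the product moment (Pearson) correlation. The helper names and the two short score lists are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coefficient_of_determination(x, y):
    """Square of the correlation: share of variance in y predictable from x."""
    return pearson_r(x, y) ** 2


# Hypothetical scores on two tests for four students.
test_a = [1, 2, 3, 4]
test_b = [1, 3, 2, 4]
print(round(pearson_r(test_a, test_b), 2))                     # 0.8
print(round(coefficient_of_determination(test_a, test_b), 2))  # 0.64
```

 Here a correlation of 0.8 corresponds to 64 percent of the variance in one set of scores being predictable from the other.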
 cognitive
 Having to do with knowing or understanding.
 cognitive domain
 The area of human action which pertains to mental processes such as intellectual activity, learning, and problem solving.
 cognitive taxonomy
 A system for classifying different levels of understanding.
 completion form
 A short-answer item format in which the student is to supply the missing word or words in a given item.
 comprehensive
 Covering all material taught to date in a course.
 computerized adaptive testing
 Computer-assisted testing in which the items that are presented are determined by the responses to previous items.
 concurrent validity
 A form of criterion validity based on the correlation of test scores with those on a criterion measure obtained at about the same time.
 construct
 An idea or concept invented to explain an aspect of human behavior or some other nonphysical characteristic. Example: hostility.
 construct validity
 The extent to which a test measures certain psychological traits.
 content bias
 Disproportionate representation of topics and terms within a test.
 content sampling
 The extent to which the items on a test represent the entire domain of possible items in a content area.
 content validity
 The extent to which a test or measure is representative of a defined body of knowledge.
 correction for guessing
 A mathematical adjustment that brings the score to zero for someone who guessed on each item.
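 One widely used form of this adjustment subtracts a fraction of the wrong answers from the right answers: corrected score = R - W / (k - 1), where k is the number of choices per item. The glossary does not state a formula, so treat the function below as a sketch under that assumption; the name `corrected_score` is made up.

```python
def corrected_score(num_right, num_wrong, num_choices):
    """Rights minus wrongs / (choices - 1); omitted items are not counted."""
    if num_choices < 2:
        raise ValueError("an item needs at least two choices")
    return num_right - num_wrong / (num_choices - 1)


# A blind guesser on 40 four-choice items is expected to get
# about 25 percent (10 items) right and 30 wrong.
print(corrected_score(10, 30, 4))  # 0.0
```

 The example shows the property named in the definition: a student who guessed on every item ends up with a corrected score of zero.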
 correlation
 A measure of the strength and direction of the association between two sets of scores.
 covariation
 Variance that two or more tests have in common.
 criterion-referenced
 A way of interpreting a test score which compares an individual's performance to an established standard of performance.
 criterion validity
 Validity based on the correlation between test scores and scores on some measure representing an identified criterion.
 Cronbach alpha procedure
 A procedure for estimating internal consistency reliability, based on parts of a test.
 cross-validation
 Related to predictive validity; using results for one sample of individuals to determine if validity coefficients will remain stable for another sample.
 decile
 Any one of nine centile points (scores) which divide a distribution into ten parts.
 demographics
 Vital and social statistics.
 descriptive statistics
 Summary characteristics of distributions, such as shape, average, and dispersion.
 diagnostic test
 A test used to measure a student's strengths and weaknesses in a given area.
 difficulty index
 A measure of the percentage of incorrect responses determined by dividing the number getting the item wrong by the number who tried the item. Used to establish how difficult an item was for the group who took the test.
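 The calculation described in this entry can be sketched as follows. The function name and the counts are hypothetical. Note that this follows the glossary's definition (percentage answering incorrectly); some texts instead report the proportion answering correctly.

```python
def difficulty_index(num_wrong, num_attempted):
    """Percentage of students who tried the item and got it wrong."""
    if num_attempted <= 0:
        raise ValueError("at least one student must have tried the item")
    return 100.0 * num_wrong / num_attempted


# Hypothetical item: 40 students tried it and 12 answered incorrectly.
print(difficulty_index(12, 40))  # 30.0
```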
 direct observation
 Noticing of phenomena without any intervening factor between the observer and that which is being observed. A record of the situation is made.
 discrimination
 The ability of a test item to separate high and low scores on a total test.
 discrimination index
 A value which indicates the ability of an item to separate high-achieving students from low-achieving students.
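 The glossary does not give a formula for this index. One common approach, assumed here, compares equal-sized upper and lower scoring groups: D = (number correct in upper group - number correct in lower group) / group size. The names and numbers below are illustrative only.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Difference in proportion correct between upper and lower groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


# Hypothetical item: 18 of the top 20 students and 9 of the
# bottom 20 students answered it correctly.
print(discrimination_index(18, 9, 20))  # 0.45
```

 Under this assumption the index ranges from -1 to +1; a positive value means the item favors students who scored high on the total test.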
 dispersion
 The spread among scores in a distribution.
 distractor
 A response for a multiple-choice item that is classed as an incorrect alternative. It is a plausible wrong answer designed to be attractive to students who do not know the correct response.
 distractor analysis
 Item analysis technique concerned with the options on a multiple-choice item.
 domain
 A sphere of human activity. The three major categories are cognitive, affective, and psychomotor.
 domain specification
 A precise delineation of a body of content or a set of behaviors.
 empirical
 Verifiable by experience or experiment; objective collection of data to test a subjective concept.
 equivalence reliability
 The extent to which measurement on two or more forms of a test is consistent.
 equivalent (parallel) forms
 Two or more forms of a test covering the same content whose item difficulty levels are similar.
 error
 Variation produced by the inaccuracies of measurement. The source of the variation may be within the test instrument, within the subjects of measurement, or in the way the test was administered.
 essay item
 An item format that requires the student to structure a rather long written response, up to several paragraphs.
 evaluation
 The process of making a value judgment based on information from one or more sources.
 experiment
 The modification of the conditions of a group or groups that have been chosen for study, and the analysis of the resulting outcomes.
 extended response
 An answer to an essay item which asks or implies a question which has no definite limits to restrict the student response. The response set is open ended. (See limited response.)
 F test
 A test to determine the significance of the difference between the variances (σ²) of two groups.
 factor analysis
 An analytical procedure that can be used for identifying the number and nature of constructs underlying a set of measures.
 factor loading
 From factor analysis; a correlation between a factor and a test score.
 formative
 Done to monitor progress over a period of time.
 frequency distribution
 A listing of scores and the number of persons receiving each score.
 general factor
 From factor analysis; a factor that has substantial loading with all measures or tests.
 global-quality scaling
 A method of scoring an essay item; also called holistic scoring, scoring based on the general impression of overall adequacy and quality of the response.
 grade equivalent scores
 Norm-referenced scores that report performance in terms of grade and month (such as 4.6, meaning fourth grade, sixth month).
 grading
 The process of evaluating performance and assigning a mark of performance level; commonly associated with assigning the letters A, B, C, D, and F, with A indicating better or higher performance than B, and so on.
 grammatical clue
 A flaw in objective items in which the wording or punctuation directs the examinee to the correct answer.
 group factor
 From factor analysis; a factor that has high loadings with two or more but not all measures or tests.
 grouped frequency distribution
 A frequency distribution that categorizes scores by intervals.
 halo effect
 The tendency to give high scores to students known to be good students and vice versa, independent of the quality of the response.
 high-stakes test
 A test for which the consequences of doing well or poorly are costly.
 histogram
 A bar graph that describes a distribution of scores.
 informed consent
 Giving approval for certain procedures after indicating an understanding of those procedures.
 intelligence
 The capacity for reasoning and understanding.
 intelligence quotient (IQ)
 The ratio of mental age to chronological age multiplied by 100 (100 x (MA/CA)); one whose mental age is average for his or her chronological age group has an IQ of 100.
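 The ratio in this entry, 100 x (MA/CA), can be written as a one-line function. The function name and the ages (in months) are invented for the example.

```python
def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio IQ: 100 times mental age divided by chronological age."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age_months / chronological_age_months


# Hypothetical child: mental age of 120 months at a
# chronological age of 100 months.
print(ratio_iq(120, 100))  # 120.0
print(ratio_iq(100, 100))  # 100.0 (mental age average for the age group)
```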
 internal consistency reliability
 The extent to which parts of a test are consistent in measurement.
 interval
 A defined distance on a scale of measurement.
 interval measurement
 Measurements that classify, order, and have equal distances between points on the scales.
 isomorphic
 Something similar or identical in structure or appearance to something else.
 item analysis
 An examination of student performance for each item on a test. It consists of reexamination of the responses to items of a test by applying mathematical techniques to assess two characteristics (difficulty and discrimination) of each objective item on the test.
 item sampling
 A technique used in schoolwide, state, or national testing that administers only a part of a test to each student. This allows a longer test to be administered but does not require a long test session for each student involved. If each student is administered only one-fourth of the test, a four-hour test could be administered with no student giving more than one hour of time.
 item specifications
 Item writing procedures for criterion-referenced tests that include sample items and descriptions of the stimulus and the response.
 item statistics
 Summary descriptions of a group's performance on a particular test item.
 item-total correlation
 The coefficient that describes the association between the scores on a particular item and the scores on the entire test.
 Kelly's range
 The distance between the 10th and 90th centile ranks.
 Kuder-Richardson Formula 21 procedure (KR-21)
 A split-half approach to estimating reliability that may be substituted for the KR-20 procedure if item difficulty levels are similar.
 Kuder-Richardson Formula 20 procedure (KR-20)
 A split-half approach to estimating reliability that provides the mean of all possible split-half reliability coefficients for a test.
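 The glossary does not spell out the two Kuder-Richardson formulas. The sketch below uses their usual textbook forms for items scored 0 or 1: KR-20 = (k / (k - 1)) x (1 - sum(pq) / variance of total scores), and KR-21 = (k / (k - 1)) x (1 - M(k - M) / (k x variance)), where k is the number of items, p the proportion passing an item, q = 1 - p, and M the mean total score. The population variance is assumed, and the response matrix is made up.

```python
def population_variance(values):
    """Variance using N (not N - 1) in the denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def kr20(responses):
    """responses: one row per student, each row a list of 0/1 item scores."""
    n_students = len(responses)
    k = len(responses[0])
    var_total = population_variance([sum(row) for row in responses])
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_students
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(responses):
    """Simplified form that needs only k, the mean, and the variance."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / len(totals)
    var_total = population_variance(totals)
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))


# Hypothetical 3-item test taken by four students.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.75
print(round(kr21(scores), 2))  # 0.6
```

 In this example the items differ in difficulty, so the KR-21 estimate comes out lower than the KR-20 estimate.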
 kurtosis
 Refers to the peakedness or flatness of a frequency distribution as compared with a normal distribution.
 leptokurtic
 A frequency distribution more peaked than normal.
 limited response
 Essay item which asks a question or gives instructions for restricting the area to be covered in responding to the stated tasks. The coverage expected is well fenced in for the student. (See extended response.)
 local norm
 The average test performance in some city or region.
 mastery-nonmastery discrimination
 Item analysis technique concerned with decisions regarding a cutoff score.
 matching item
 An item consisting of a two-column format (premises and responses) that requires the student to make a correspondence between the two.
 mean
 The arithmetic average of a set of scores.
 mean deviation
 A measure of variability or dispersion of a distribution of scores.
 measurement
 A process that assigns by rule a numerical description to observation of some attribute of an object, person, or event.
 measurement scales
 Classifications of measures based on the amount of information contained in each score.
 median
 The middle score of a distribution.
 mental age
 The average intellectual functioning of normal persons at a given age, usually expressed in months.
 minimum competency testing
 Testing designed to measure the acquisition of competence or skills to or beyond a defined standard.
 mode
 The most frequent score of a distribution.
 multifactored assessment
 Assessment that usually includes the physical, cognitive, psychological, and social factors that are believed to affect learning.
 multiple-choice item
 A test format in which the examinee selects the correct answer from a list of possible options.
 national norm
 The average performance of a sample selected to be representative of the entire country.
 needs assessment
 A process whereby the educational requirements of students collectively or individually are determined. Usually thought of as a formal structured approach, but may be done informally by the teacher.
 negative skewness
 Asymmetry in which most of the scores in a distribution are at the high end.
 nominal measurement
 Measurement that classifies elements into mutually exclusive and exhaustive categories.
 norm group
 The set of subjects used to establish the averages to be used to interpret student scores on a standardized test.
 norm-referenced measurement
 Measurement in which an individual's score is interpreted by comparing it to the scores of a defined group.
 normal distribution (curve)
 A theoretical distribution of scores which forms a curve that is bell shaped and symmetrical.
 norms
 The test scores (also possibly statistics generated from scores) of one or more defined groups considered to be representative.
 null hypothesis
 A statement that there is no difference in measures of the criterion variable except what would be expected from sampling; requires that a significance level be stated (.05, .01, . . .).
 objective
 Dealing with things external to the mind rather than with thoughts or feelings; pertaining to that which can be known, or that which is an object or a part of an object.
 objective items
 Items that can be objectively scored; items on which persons select a response from a list of options.
 objectivity
 The degree to which the task to be performed is clear and the correct response is definite.
 objectivity (in scoring)
 The extent to which equally competent scorers obtain the same result.
 observation
 Any fact which is used as a basis for evaluation procedures. The output of the process of observing.
 oral tests
 Examinations in which both the questioning and answering are done aloud.
 ordinal measurement
 Measurement that classifies and orders along a continuum.
 parallel forms
 Two or more forms of a test covering the same content whose item difficulty levels are similar.
 partial correlation
 Shows the relationship between two variables with the effects of one or more other variables held constant.
 penalty for guessing
 A mathematical procedure for lowering scores as a function of the number of incorrect answers.
 percentiles
 Norm-referenced scores that indicate the percentage of a norm group that a particular score exceeded.
 performance bias
 Bias introduced when individuals are not able to perform on a test because they have not had the opportunity to learn the test content.
 performance test
 Non-paper-and-pencil tests that require the student to engage in some type of process, produce a product, or both.
 pilot study
 A miniature study conducted with a group of students that is not used as part of the major study. It is used to try out procedures or instruments (adapted from Hopkins & Antes, 1990, p. 461).
 phi coefficient
 Shows the degree of relationship between two dichotomous variables.
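 For a 2 x 2 table of counts, the phi coefficient is usually computed as (ad - bc) / sqrt((a + b)(c + d)(a + c)(b + d)). The sketch below assumes that form; the cell labels and the counts are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """2 x 2 table cells: a, b in the first row; c, d in the second row."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column must contain at least one case")
    return (a * d - b * c) / denominator


# Hypothetical table: rows are pass/fail on item 1,
# columns are pass/fail on item 2.
print(phi_coefficient(30, 10, 10, 30))  # 0.5
```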
 platykurtic
 A frequency distribution that is flatter than normal.
 point biserial correlation
 Shows the degree of relationship between a continuous and a truly dichotomous variable.
 population
 Any defined aggregate of persons, objects, or events.
 positional preference
 The regular placement of the correct response in a particular position; for instance, always in choice C.
 positive skewness
 Asymmetry in which most of the scores in a distribution are low.
 power test
 A test in which time does not affect quality of performance, that is, students would not perform better if given additional time.
 practice effect
 The consequences of taking similar tests or testlike exercises.
 pre-post discrimination
 Item analysis technique concerned with assessing performance before and after instruction.
 predictive validity
 A form of criterion validity based on the correlation of test scores with scores on a criterion measure obtained at some later time.
 premises
 In a matching item, the column of words consisting of item stems.
 prescriptive test
 A test designed to identify student deficiencies, weaknesses or problems, and to suggest corrective learning activities.
 problem solving
 Settlement of a perplexing question or situation.
 product moment correlation
 Shows the degree of relationship between two continuous variables.
 project
 Any thrust area activity which is funded by the National Science Foundation or uses resources designated as matching funds
 psychomotor
 Having to do with movement or motor skills.
 psychomotor domain
 The area of human action which emphasizes all types of body movements which are involuntary or voluntary.
 psychomotor taxonomy
 A system for classifying psychomotor behaviors in terms of the amount of concentration required.
 qualitative
 Information in the form of statements or narrative.
 quantitative
 Information that has been expressed in terms of mathematically manipulable numbers.
 quartile deviation
 A measure of variability or dispersion of a distribution of scores.
 quartile one
 The point (score) in a distribution that sets off the lower fourth of the group.
 quartile three
 The point (score) in a distribution that sets off the higher fourth of the group.
 quartile two
 The point (score) in a distribution which divides the distribution into two equal parts.
 random sample
 A sample in which every member of the parent population has an equal chance of being chosen.
 range
 The difference between the highest and lowest scores in a distribution.
 rank correlation
 Shows the degree of relationship between two continuous variables by comparing ranks.
 rating scale
 A measure that contains one's estimate of the value of a person or thing.
 ratio measurement
 Measurement that classifies, orders, has equal units, and a true zero point.
 raw score
 The original score, as of a test, before it is statistically adjusted. It may include weighting and a correction for guessing but no other transformation.
 reading difficulty
 The level of reading ability required to understand test questions.
 reliability coefficient
 A numerical index of reliability based on a correlation coefficient; theoretically, the index can range from 0 to +1.0.
 reliability
 The consistency with which a data collection device measures whatever it is that the device measures.
 representative sample
 Any subset of persons or items selected to represent a larger group or population which has the same inclinations as the total group or population with reference to some characteristic or characteristics. In testing, the test instrument is composed of tasks which are intended to reflect the characteristics of the larger population of possible test tasks which could be asked.
 role-playing
 The act of assuming a pose or role when responding to affective questions.
 sample
 Any subaggregate of a larger population.
 scatterplot
 A two-dimensional graph of the relationship between two sets of scores.
 scorer reliability
 The consistency with which two or more individuals would score the same response to a test item.
 secure test
 A test (often commercially published) that is not circulated so it can be used repeatedly.
 separate answer sheets
 Forms provided for item response that are not attached to nor contained in the test copy; many can be electronically scored.
 short-answer item
 A test item for which the student supplies a brief response, usually consisting of a word or phrase.
 skewness
 The tendency of a distribution to depart from symmetry or balance.
 socially acceptable response
 An answer to a question that may be inaccurate but conforms to desired social norms.
 Spearman-Brown formula
 A formula for estimating reliability if test length is changed.
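 In its general form the formula is r_new = n x r / (1 + (n - 1) x r), where r is the reliability of the current test and n is the factor by which its length is changed. A minimal sketch, with a made-up function name and example values:

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: two half-tests correlate 0.6, so the
# full-length test (twice as long) is estimated at about 0.75.
print(round(spearman_brown(0.6, 2), 2))  # 0.75
```

 With a length factor of 2, this is the step-up typically applied to the correlation between two halves of a test.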
 specific determiners
 Terms such as always, never, every, and all that provide clues to correct answers.
 specific factor
 From factor analysis; a factor that has a high loading with only one measure or test.
 speeded test
 A test administered so that students are required to complete the exam within a specified amount of time.
 split-half method
 A procedure for estimating test reliability by which a test is divided into two comparable halves and the scores on the halves are then correlated.
 stability reliability
 The extent to which measurement on the same test is consistent over time.
 standard deviation
 A measure of dispersion in a distribution that is the positive square root of the variance.
 standard error of estimate
 Gives the amount of error involved in predicting a score from the regression equation.
 standard error of measurement
 The standard deviation of the distribution of error scores.
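 In practice the standard error of measurement is usually estimated from the standard deviation of the test scores and the reliability coefficient: SEM = SD x sqrt(1 - reliability). The glossary does not state this formula, so the sketch below rests on that assumption; the values are invented.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM estimated as SD times the square root of (1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Hypothetical test: standard deviation of 10 and reliability of 0.91.
print(round(standard_error_of_measurement(10, 0.91), 2))  # 3.0
```

 A perfectly reliable test (reliability of 1.0) would have a standard error of measurement of zero.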
 standard error of the mean
 The standard deviation of a distribution of sample means.
 standard score
 A norm-referenced measurement that indicates how many standard deviations a score is above or below the mean.
 standardized
 A process of preparing a test instrument for use in widely separated locations. The test is standardized so that administration and scoring procedures are the same for all test takers. Score interpretation is made to averages of performances of groups of test takers whose scores are then used for making comparison to interpret scores obtained from other students.
 stanines
 Norm-referenced scores that range from 1 to 9; they have a mean of 5 and a standard deviation of 2.
 statistics
 Descriptive characteristics of a distribution of scores; also, that area of mathematics dealing with the collection, organization, and interpretation of numerical data.
 stem
 The introductory part of an objective test item.
 subjective
 Existing in the mind; belonging to the thinking subject rather than to the object of thought; relating to the nature of an object as it is known in the mind as distinct from a thing in itself.
 summative testing
 Done at the conclusion of a course or some larger instructional period.
 t test for a correlation
 A test to discover if a correlation shows a real (significant) relationship, or a relationship due merely to chance.
 t test between means or proportions
 A test to discover if the difference between two means or two proportions is significant, or merely due to chance.
 table of specifications
 A two-dimensional grid, content by cognitive process, used in planning a test.
 take-home test
 A test that a student completes outside of class, usually in an uncontrolled setting.
 taxonomy
 A system of classification and the concepts of identification, naming, and categorization underlying the coordination.
 teacher competency test
 A test for (prospective) teachers on knowledge and skills essential for effective teaching.
 technical adequacy
 The level of test reliability and validity necessary before the test can be recommended for use.
 technical problem
 A complex situation from a specialized field of study which is presented to a student for solution within the structure of that field. Usually used for assessment of general understandings of a wide set of principles and ideas rather than for special skills and talents.
 test anxiety
 A psychological state of stress caused by a testing situation.
 test bias
 A systematic error in the measurement process.
 test item file
 A collection of individual items on cards which are arranged by content areas for future use in test assembly.
 test-retest method
 A procedure of estimating test reliability by which the same test is administered twice to the same individuals and the scores from the two administrations are then correlated.
 test
 The set of items or questions presented to one or more individuals under specified conditions for purposes of measurement.
 testing arrangement
 The setting in which a test is administered.
 testing
 The process of administering or taking a test.
 tetrachoric correlation
 Shows the degree of relationship between two normally distributed variables which are categorized into dichotomies.
 transformed standard scores
 z-scores that have been converted to a distribution with a prespecified mean and standard deviation.
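 The conversion can be sketched in two steps: compute the standard (z) score, then rescale it to the chosen mean and standard deviation. The defaults of 50 and 10 below are only an example of a prespecified scale, and the function names and raw scores are hypothetical.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd


def transformed_score(z, new_mean=50, new_sd=10):
    """Rescale a z-score to a distribution with the given mean and SD."""
    return new_mean + new_sd * z


# Hypothetical test: raw score of 82, group mean 70, standard deviation 8.
z = z_score(82, 70, 8)
print(z)                               # 1.5
print(transformed_score(z))            # 65.0
print(transformed_score(z, 500, 100))  # 650.0
```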
 true component
 The part of an individual's score that is nonerror; the score if the test were perfectly reliable.
 true-false item
 A test format in which examinees indicate whether given statements are correct (true) or incorrect (false).
 unobtrusive observation
 Instances of noticing made in such a way that persons being observed do not know that they are being observed.
 usability
 The practical factors that must be considered in test selection: cost, testing time, examiner training, and so on.
 validity
 The extent to which a test measures what it is intended to measure.
 validity coefficient
 The correlation between a test of known validity and a test of unknown validity.
 variance
 A measure of dispersion.
 weighted scores
 The composite scores that are weighted combinations of two or more separate scores.
 work sample
 A nontest measurement of student learning.




References
Koenker, R. H. (1971). Simplified statistics. Totowa, NJ: Littlefield, Adams & Co.
Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.). Boston: Allyn & Bacon.
Hopkins, C. D., & Antes, R. L. (1990). Classroom testing: Construction. Itasca, IL: F. E. Peacock Publishers.
Hopkins, C. D., & Antes, R. L. (1990). Educational research: A structure for inquiry (3rd ed.). Itasca, IL: F. E. Peacock Publishers.
Webster's Encyclopedic Unabridged Dictionary of the English Language. (1989). New York: Gramercy Books.



© 2001 Foundation Coalition. All rights reserved. Last modified



