The California Critical Thinking Skills Test Manual Psychology

CCTST reports deliver individual and group results in a presentation-ready format. Each report includes a wide range of statistical and demographic information about individuals and/or test-taker groups. Test-taker scores and group summaries are presented with interpretative analysis by Insight Assessment measurement scientists.

The CCTST measures and reports on an array of reasoning skill scale scores.  Online versions of the CCTST provide an overall measure of thinking skills (Total Score) and the following individual scale scores: Analysis, Interpretation, Inference, Evaluation, Explanation, Induction and Deduction. The online CCTST-N adds a measure of Numeracy. Earlier versions of the CCTST and current paper-and-pencil versions of the CCTST provide the following scale scores: Total Score, Analysis, Inference, Evaluation, Induction and Deduction.

Insight Assessment clients depend on the comprehensive data and analysis in a CCTST report to provide the insights needed to achieve their assessment goals. Clients are currently using individual data for professional development, student or intern placement, hiring, advising, and competency training. Group data is being used for new cohort assessment, outcomes assessment, demonstrating the quality of an educational or training program, demonstrating group proficiency, staff development, admissions, and more.

Clients can customize their results package with additional analyses, graphics, and interpretative text discussing their scores in relation to their particular goals and objectives. For further information, see Insight Assessment Reports and Analytics.

The Assessment Report package includes:

  • Individual test-taker analytics:
    • an overall score of thinking ability (Overall Score)
    • a categorical interpretation of the strength of the Overall Score and scale scores
    • a norm-referenced percentile ranking (applicable to skills assessments only)
    • scale scores that indicate which skill areas are particularly strong and which are weaker and require training attention
    • an administrator-controlled option for whether test-takers receive their individual results after testing
  • For customers who are testing groups:
    • descriptive statistics and presentation-ready graphic representations of the average Overall Score and scale scores for the group
    • descriptive graphics and statistics, including the size of the group, mean, median, standard deviation, standard error of the mean, lowest score, highest score, first quartile score, and third quartile score
    • descriptive statistics of the demographic characteristics of the test-taker group (if collected)
    • the average percentile of the group as compared to a pre-selected external norm group (applicable to skills assessments only)
    • electronic data files (spreadsheet) with all scale scores and demographic responses
  • Test Manual, which includes chapters on interpreting individual and group test-taker scores using our 4-Step Process

Time and performance on the California Critical Thinking Skills Test.


Critical thinking (Study and teaching)


Frisby, Craig L.
Traffanstedt, Bobby K.


Name: Journal of College Reading and Learning
Publisher: College Reading and Learning Association
Audience: Academic
Format: Magazine/Journal
Subject: Education
Copyright: COPYRIGHT 2003 College Reading and Learning Association
ISSN: 1079-0195
Date: Fall, 2003
Source Volume: 34
Source Issue: 1


The relationship between total scores on the California Critical Thinking Skills Test (CCTST) and the time taken to complete it was investigated in a sample of 16 high school seniors and 131 college undergraduates. Slower test takers obtained significantly higher scores. With performance on a proxy measure for general cognitive ability, the Raven Advanced Progressive Matrices - Set 2 (APM), as a covariate, slower CCTST test-taking time related to higher CCTST scores but not to higher scores on the California Critical Thinking Dispositions Inventory (CCTDI). Stepwise multiple regression analysis revealed that APM scores, CCTDI scores, and CCTST completion time explained 43%, 6%, and 3%, respectively, of the independent variance in total CCTST scores. Implications of these findings for college instruction are discussed.

The teaching and development of critical thinking (CT) skills is a popular objective among college educators (Mines, King, Hood, & Wood, 1990; Pascarella, Bohr, Nora, & Terenzini, 1996). Most college teachers would acknowledge that immaturity, lack of experience, and limited knowledge of the subject matter are factors that can make it difficult for students to develop CT skills within the context of unfamiliar subject matter. There is a certain kind of student who quickly comes to the attention of college instructors because of serious difficulties in this regard. These students are often either highly opinionated or apathetic in their attitudes, and may have difficulty looking at issues from different perspectives. They may appear bored and uninterested in course content, giving the impression during lectures that they are being subjected to an adverse experience against their will. In putting forth minimal effort to grapple with the complexities inherent in the subject matter, they miss key insights that are obvious to instructors. These students may avoid taking their time to complete assignments with care, and often turn in work that is sloppy, ill-conceived, and rushed. For these students, the development of critical thinking not only involves learning the content of the course, but also involves developing the attitudes and dispositions that would enable them to engage the subject matter at a deeper level.

However, other students seem to have a naturally easier time grappling with intellectual material regardless of content. Although they may not have an interest in the course subject matter, they have an easier time applying themselves to intellectually demanding material. These students seem to retain more knowledge from their study time, to grasp abstract and conceptual content more quickly than other students, and to obtain better grades on tests than other students. This raises some interesting questions: If the lower performing students devoted more time and adopted a more reflective approach to their work, would they be developing "critical thinking skills"? Or, are the better students who have more of a natural intellectual advantage also better critical thinkers? These vexing questions challenge most college instructors and also researchers in defining precisely what is meant by, and involved in, critical thinking (Halpern, 1996; McPeck, 1981; Paul, 1995), and the role that is played by natural intellectual ability and the time that is devoted to grappling with difficult material.

CT as Performance on Critical Thinking Tests

Many researchers attempt to understand CT by defining it as performance on a specialized test of CT ability. A first step is to design reliable and valid instruments for measuring its essential features (Aretz, Bolen, & Devereux, 1997; Ennis, 1993; Facione, 1990) and to use CT tests as outcome measures in studying factors that hinder or facilitate the development of CT skills (Frisby, 1992; Olsen, 1990; Pascarella et al., 1996).

Tests of CT ability can be broadly categorized as requiring production responses or selection responses. For production response CT tests, test takers are required to either say or write something in response to a stimulus, which is then scored according to specified criteria (e.g., see Ennis, Millman, & Tomko, 1985a; Kitchener, 1993; Norris, 1989, 1990). Selection response CT tests are usually designed in a multiple choice format, where the test taker reads a brief passage and selects the correct choice (among a group of distractors) to a question about the passage. Multiple choice tests of CT share many features, as nearly all measure the ability to apply rules, concepts, and principles of formal/informal logic or scientific inquiry. After reading a brief passage, test takers are asked to do some or all of the following: break down complex arguments into their component parts; identify and fill in gaps in a logical argument; detect emotional or illogical fallacies in arguments; use deductive skills to assess the logical strength of inferences that tie statements together; discriminate correctly degrees of confidence in the truth, falsity, or general reliability of conclusions or hypotheses from a given set of data; or educe consequences that follow from a given set of information (e.g., see Ennis, Millman, & Tomko, 1985b; Facione, Facione, Blohm, Howard, & Giancarlo, 1998; Watson & Glaser, 1980). These tests are designed so that the presence or absence of specific content knowledge in a discipline is not a factor in CT proficiency.

Relationship Between Time and Performance on CT Ability Tests

Throughout the course of a semester, instructors observe that some students have a tendency to consistently turn in exams more quickly than other students. Hypotheses concerning the expected relationship between test-taking time and performance on selection response CT tests, however, are rarely made explicit. Explicit hypotheses can be proffered as a natural extension of what is known about the nature of individual differences in general mental ability, the effects of time limits on ability test performance, and theorists' models of CT dispositions.

The "smart and fast" hypothesis. This hypothesis suggests that brighter students will complete a CT test faster, and with higher scores, than students who are not as bright. In the context of cognitive test performance, persons with high levels of general mental ability are expected to display a higher probability of providing correct responses to more difficult items. In contrast, persons with low levels of general mental ability are expected to respond correctly to less difficult items, but display a higher probability of responding incorrectly to the more difficult items (Rasch, 1980). Within groups of items to which both high and low ability persons can respond correctly (and all other factors being equal), higher ability individuals would also be expected to arrive at correct answers more quickly than lower ability individuals (Carroll, 1993; Lindley, Smith, & Thomas, 1988; Neubauer, 1997). When examinees are instructed to complete correctly mental tasks as quickly as they can within a predetermined time limit, higher ability examinees would be expected to complete more items correctly than lower ability examinees. However, on a pure power test with no time limits, the relationship between mental ability and speed of correct responses is obscured. Here, it is quite possible that individual differences in impulsivity versus reflection would influence test-taking speed. This possibility leads to the next hypothesis.

The "slow and careful" hypothesis. This hypothesis suggests that higher scores on a CT test should be associated with longer test-taking times, since the better critical thinker habitually displays a disposition for slow and careful reflection when interacting with intellectual material--and this behavioral trait assists the test taker in getting more items correct. CT theorists have argued that the ability to think critically does not necessarily imply that individuals have the disposition, or natural proclivity, to think critically (e.g., Facione, Sanchez, Facione, & Gainen, 1995). In the late 1980s, the American Philosophical Association sponsored a project in which a cross-disciplinary panel of scholars from philosophy, education, and the social sciences participated in a Delphi project designed to achieve consensus regarding traits of the ideal critical thinker. According to the Delphi panel, the ideal critical thinker should be habitually inquisitive, well-informed, trustful of reason, open-minded, flexible, fair-minded in evaluation, honest in facing personal biases, prudent in making judgments, willing to reconsider, clear about issues, orderly in complex matters, diligent in seeking relevant information, reasonable in the selection of criteria, focused in inquiry, and persistent in seeking results which are as precise as the subject and circumstances of inquiry permit (Facione, 1990).

The California Critical Thinking Dispositions Inventory (CCTDI) was developed from this project as a measure of ideal CT dispositions. The seven subscales of the CCTDI do not directly identify time as a component of the definition of dispositions toward CT; however, some items imply this contribution. For example, the Systematicity subscale supposedly addresses the subject's "orderliness in working with complexity," "diligence in seeking relevant information," and "care in focusing attention on the concern at hand" (Facione, Facione, & Giancarlo, 1996). Subjects are asked to endorse items such as "I always focus on the question before I attempt to answer it," "My problem is I'm easily distracted," or "People say I rush into decisions too quickly." It stands to reason that subjects who are prone to be careless and impulsive (rather than careful and reflective) in their approach to cognitively demanding tests may complete measures quickly without fully engaging in the task at hand (which may also increase the probability of incorrect responses). There is modest evidence that extending time limits on tests allows test takers to obtain higher scores. Evans and Reilly (1972) studied the effects of reducing the speededness of the Law School Admission Test on scores in test takers from regular versus "fee-free" centers. They found that reducing the amount of speededness produced higher scores for both groups. Powers and Fowles (1996) found that essay performance of Graduate Record Examination (GRE) test takers was significantly better when examinees were given 60 minutes (as opposed to 40 minutes) to write. Llabre and Froman (1987) used microcomputer technology to compare the time allocated to items from the California Test of Mental Maturity inference subtest. They found that an arbitrary 10-minute time limit would have lowered the scores of Hispanic subjects to a greater degree than would be the case if no time limits were imposed.

The "independence" hypothesis. This hypothesis suggests that individual differences in general mental ability and in the disposition to be reflective (and thus spend more time on test items) are independent. To put the matter bluntly, cognitive test takers either know the right answer or they don't (which is related to mental ability). As a result, individual differences in the propensity to ruminate carefully on mental test items have little relationship with (or are independent of) total test scores. Therefore, there would be no relationship between test-taking time and scores on a CT test. There is also moderate support for this hypothesis. Wild, Durso, and Rubin (1982) studied the effects of increased time on GRE verbal and quantitative test items for subjects who differed in race, sex, and years since last attending school. They found that increasing the time (from 20 minutes to 30 minutes per test) had a negligible effect on test scores. Bridges (1982) found that the test performance of college students could not be predicted by the amount of time required to complete exams, nor by the relative order in which students completed exams. Herman (1997) found similar nonsignificant results in his study of the performance of college students on multiple choice exams. Although longer allocated time was associated with higher scores in the Powers and Fowles (1996) study, examinees who described themselves as slow in their writing and test-taking style did not benefit differently from generous time limits compared to quicker test takers.

Purpose of the Study

The variety of hypotheses concerning the relationship between time and scores on CT tests is the direct function of a dual definition of CT as a form of mental ability and as a constellation of attitudinal dispositions (Facione, 1990). The "smart and fast" hypothesis would predict that higher CT scores are associated with briefer test-taking times. In contrast, the "independence" hypothesis would predict no relationship between CT scores and test-taking times. The "slow and careful" hypothesis would predict that longer test-taking times and higher scores are associated with more optimal CT dispositions (controlling for the effects of mental ability). The purpose of this study is to determine the extent to which these hypotheses are supported by an investigation of the relationship between test-taking times and scores on measures of CT ability and dispositions.

Method


The 147 participants were 131 undergraduates from a major Midwestern university and 16 high school seniors from public schools located in the same town. The college participants were drawn from a wide variety of majors, although most were recruited from statistics, education, or psychology classes. The participants were 103 women and 44 men who ranged in age from 16-43 years, with a mean age of 21 years (SD = 3.4). A majority of them were Caucasian. Most participants received extra credit for their participation in the study; however, the study was open to any student wishing to participate. Informed consent was obtained from each participant.


California Critical Thinking Skills Test. The California Critical Thinking Skills Test (CCTST; Facione et al., 1998) measures the CT abilities of high school and postsecondary students, and is based on an American Philosophical Association Delphi panel report (Facione, 1990) that defined critical thinking as "the process of purposeful, self-regulatory judgment... [that] gives reasoned consideration to evidence, context, conceptualizations, methods, and criteria" (as cited in Facione et al., 1998, p. 2). The CCTST contains 34 multiple choice questions based on short vignettes from which respondents were to make inferences from the information provided, evaluate arguments for or against a proposition, or analyze the relationships among the statements or descriptions. For some items, each question addressed its own vignette. For other items, pairs of items related to a single vignette. Correct answers were scored on 3 factors: analysis (9 items), evaluation (14 items), and inference (11 items). These 3 variables were combined into an overall CT score, calculated for comparison with norms which are provided in the test manual "as the starting point for institutional research" (Facione et al., 1998, p. 9). The CCTST is available in two forms (Forms A & B), with Form A used in the current study.

Facione et al. (1998) report KR-20 reliability coefficients ranging from .68 to .70. These coefficients were obtained from the initial pilot study for the CCTST, with no reliability estimates reported in any of the other technical reports (Michael, 1995). However, Jacobs (1995) reported alpha reliabilities on CCTST subtests (Analysis = .14; Evaluation = .67; Inference = .64), corrected for unequal subtest lengths, from a sample of 1,383 college freshmen. Facione et al. (1998) reported validity coefficients ranging from .40 to .71 for subtests of the GRE, Scholastic Assessment Test (SAT), and American College Test (ACT) as well as coefficients of .40 and .54 with the Watson-Glaser Critical Thinking Appraisal.
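As a rough illustration of how a KR-20 coefficient of the kind cited above is obtained, the sketch below applies the standard formula, KR-20 = k/(k - 1) x (1 - sum(p*q) / variance of total scores), to dichotomous (0/1) item scores. The function name `kr20`, the choice of population variance, and the toy response matrix are assumptions made for this example; they are not taken from the CCTST manual or from the study's data.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for a list of examinees.

    Each examinee is a list of 0/1 item scores (1 = correct).
    Assumption: population variance of total scores is used.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(person) for person in responses]
    total_var = pvariance(totals)

    # Sum of p * q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical toy data: 5 examinees, 4 items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(toy), 3))  # 0.8
```

With only a handful of items the coefficient is sensitive to each item, which is one reason short subtests tend to show lower internal consistency than a full-length scale.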

According to the test developers, users are instructed to advise test takers that they have 45 minutes to complete the test. However, extended time is allowed if the users wish to develop local norms (Facione et al., 1998). Despite the directions suggested by the test developers, participants in this study were instructed to take as much time as they needed to complete the test. Test takers were instructed to answer all items, and not leave any items blank.

California Critical Thinking Disposition Inventory. The California Critical Thinking Disposition Inventory (CCTDI; Facione et al., 1996) is a 75-item questionnaire developed from the consensus of several experts from the American Philosophical Association Delphi panel (Facione, 1990) on dispositional characteristics of an ideal critical thinker. According to Facione et al. (1996), this description is "discipline neutral and comprises a generalizable description of the ideal critical thinker across multiple contexts and situations" (p. 2). The CCTDI asks respondents to rate statements regarding dispositions toward truth-seeking, being open-minded, being analytical in thought, systematic thinking, confidence in one's conclusions, being inquisitive, and maturity on a 6-point Likert scale ranging from strongly agree to strongly disagree. The CCTDI is scored on seven scales (truth-seeking, open-mindedness, "analyticity", "systematicity", self-confidence, inquisitiveness, and maturity) consisting of 7-12 items per scale. Scores for each scale are converted to standardized scores using a scale standardization table and added together to form a total disposition score. Although the authors provide no norms, they note that a minimum score of 40 for each scale (minimum total score of 280) indicates an "average" critical thinking disposition.
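The total-score rule described above (seven standardized scale scores summed, with 40 per scale and 280 overall as the "average" threshold) can be sketched as follows. This is only an illustration: the function name `summarize_cctdi`, the dictionary layout, and the example scores are hypothetical, and the sketch assumes the scale scores have already been converted with the publisher's standardization table, which is not reproduced here.

```python
SCALES = (
    "truth-seeking",
    "open-mindedness",
    "analyticity",
    "systematicity",
    "self-confidence",
    "inquisitiveness",
    "maturity",
)
SCALE_CUTOFF = 40                           # per-scale threshold cited in the text
TOTAL_CUTOFF = SCALE_CUTOFF * len(SCALES)   # 7 x 40 = 280


def summarize_cctdi(scale_scores):
    """Sum already-standardized scale scores and compare them to the cutoffs."""
    missing = [name for name in SCALES if name not in scale_scores]
    if missing:
        raise ValueError(f"missing scale scores: {missing}")
    total = sum(scale_scores[name] for name in SCALES)
    below = [name for name in SCALES if scale_scores[name] < SCALE_CUTOFF]
    return {
        "total": total,
        "meets_total_cutoff": total >= TOTAL_CUTOFF,
        "scales_below_cutoff": below,
    }


# Hypothetical respondent (made-up standardized scores).
example = {
    "truth-seeking": 38,
    "open-mindedness": 44,
    "analyticity": 45,
    "systematicity": 41,
    "self-confidence": 43,
    "inquisitiveness": 47,
    "maturity": 42,
}
summary = summarize_cctdi(example)
print(summary)
```

A respondent can reach the total threshold while still falling below 40 on an individual scale, so the sketch reports both.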

Internal reliability coefficients provided in the test manual indicate split-half subscale reliabilities ranging from .71 (truth-seeking) to .80 (inquisitiveness), with a total score reliability reported as .91 (Facione et al., 1996).

Raven's Advanced Progressive Matrices. Raven's Advanced Progressive Matrices (APM; Raven, Raven, & Court, 1998a, 1998b) was used as a proxy measure for general cognitive ability. The APM is a paper-and-pencil test that is convenient to administer in a group format within a relatively short period of time. In addition, when the APM is submitted to a factor analysis with other cognitive tests, it is usually among the two or three tests having the highest loading on the g (general mental ability) factor (Ablard & Mills, 1996; Jensen, 1998; McLaurin & Farrar, 1973; Mills, Ablard, & Brody, 1993). Spearman (1923) defined the essence of g as the eduction (drawing out) of relations (i.e., induction, or inferring the general rule from specific instances) and correlates (i.e., deduction, or recognizing or generating a specific instance when given another specific instance and the general rule). However, the APM measures g through the medium of nonverbal reasoning tasks only. As such, this test is limited in providing a more complete picture of general intellectual functioning, as compared to a test that includes verbal tasks (Braden, 2000).

A typical item in the Raven matrices consists of a 3 x 3 matrix of abstract figures (e.g., combinations of straight or curved lines, dots, triangles, squares, and circles); however, one of the figures is missing. The respondent must select the correct figure from a series of 8 choices, one of which correctly completes the matrix when correct inductive/ deductive rules are applied. The APM consists of 2 sets of problems (Set I with 12 problems and Set II with 36 problems) totaling 48 items. According to the test manual, Set I is generally used to determine whether to administer Set II, or as a means of familiarizing participants to the matrices test before beginning Set II (Raven et al., 1998a, p. 69). The APM is a pure "power" test (i.e., untimed); hence, participants had as much time as was needed to complete the instrument. For purposes of this study, only scores on Set II were included in analyses.

According to Raven et al. (1998b), test-retest reliability coefficients obtained at an eight-week interval indicated high reliability for adults (r = .91, N = 243). Internal consistency data indicate split-half reliability coefficients which vary between .83 and .87 (no mean given). Construct validity is demonstrated in correlations (corrected for range restriction) of the APM with the Wechsler Adult Intelligence Scale (WAIS) verbal IQ (.61), performance IQ (.69), and full scale IQ (.74; see McLaurin & Farrar, 1973).

Task completion times. Task completion times were recorded for the California Critical Thinking Skills Test (CRITTIME). Participants were asked to self-record the time at which they began each test and the time at which they completed it. Each time variable consisted of the number of minutes that elapsed between the start time and the completion time.
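A minimal sketch of how a time variable such as CRITTIME could be derived from self-recorded clock times is shown below. The function name `elapsed_minutes` and the "HH:MM" format are assumptions for illustration; the article does not say how the recorded times were formatted or processed, and the sketch assumes both times fall on the same day.

```python
from datetime import datetime


def elapsed_minutes(start, end, fmt="%H:%M"):
    """Whole minutes between a recorded start time and completion time."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    minutes = int(delta.total_seconds() // 60)
    if minutes < 0:
        raise ValueError("completion time is earlier than start time")
    return minutes


# Hypothetical participant who started at 14:05 and finished at 14:33.
print(elapsed_minutes("14:05", "14:33"))  # 28
```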


Participants were scheduled for test administration in two sessions within a two-week period. Testing was conducted in a group format. During each of the two sessions, an ability test (either CCTST or APM) and a rating scale (16 Personality Factor Questionnaire or CCTDI) were given. However, the order of the instruments both within and across sessions was counterbalanced across all subjects in order to minimize order effects. The 16 Personality Factor Questionnaire (Cattell, Cattell, & Cattell, 1993) results were analyzed for a different study and are not included in the current investigation.

Participants were given a 4-digit code number with which to identify their test responses. Although each test was untimed, participants were asked to mark down beginning and completion times on each of the four paper-and-pencil measures. Participants were instructed to take as much time as needed for each test and were given the opportunity for a short break between tests.

Data Analysis

The means, standard deviations, and first-order intercorrelations among all variables are listed in Tables 1 and 2.

Distribution of CCTST time scores. Among participants, the slowest completion time was 71 minutes and the fastest completion time was 10 minutes (M = 28.4; SD = 8.4). Only three participants completed the CCTST in over 45 minutes, which is the test developers' suggested time limit for administering the test. The original sample was subdivided into three subsamples representing slow, medium, and fast CCTST completion times (in minutes). Cutoff scores that subdivide the sample into the nearest equivalent groups (in size) were as follows: Slow Group (range: 31 - 71 minutes; M = 37; SD = 6.7; n = 50); Medium Group (range: 26 - 30 minutes; M = 28; SD = 1.6; n = 46); Fast Group (range: 10 - 25 minutes; M = 20.2; SD = 4.4; n = 51). The PROC GLM and PROC REG procedures, available from SAS data analysis software (SAS Institute, 1999), were used to analyze all data from the study.
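The grouping rule implied by the cutoffs above can be written as a small helper. This is a sketch only: the function name `time_group` and the sample times are hypothetical, and the study itself performed its analyses in SAS, not Python.

```python
from collections import Counter


def time_group(minutes):
    """Assign a CCTST completion time (in whole minutes) to a speed group.

    Cutoffs follow the ranges reported in the text:
    fast = 10-25, medium = 26-30, slow = 31-71.
    """
    if minutes <= 25:
        return "fast"
    if minutes <= 30:
        return "medium"
    return "slow"


# Hypothetical completion times, not the study's data.
sample_times = [10, 22, 25, 26, 28, 30, 31, 45, 71]
counts = Counter(time_group(t) for t in sample_times)
print(counts)
```

Because the cutoffs were chosen to give groups of roughly equal size, the three time ranges differ greatly in width (16, 5, and 41 minutes).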

Phase 1. Analysis of variance (ANOVA) procedures were used to analyze the relationship between membership in test-taking time groups (as the independent variable) and CT scores (as the dependent variable). The "smart and fast" hypothesis would predict that faster CT test takers would achieve significantly higher CT scores (as measured by the CCTST). In contrast, the "independence" hypothesis would predict that there would be no significant relationship between test-taking time and CCTST scores. The "slow and careful" hypothesis would predict that slower performance on the CCTST would be associated with significantly higher CT dispositions (as measured by CCTDI scores), controlling for individual differences in mental ability as measured by the Raven Advanced Progressive Matrices--Set II (using analysis of covariance procedures). In addition, the "slow and careful" hypothesis would predict that slower test-taking time would be associated with higher CCTST scores, also controlling for individual differences in mental ability.
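To make the Phase 1 logic concrete, the sketch below computes a one-way ANOVA F statistic by hand for three groups. It illustrates only the unadjusted group comparison, not the analysis of covariance with APM scores; the function name `one_way_anova` and the toy scores are assumptions for the example, and the study's own analyses were run with SAS PROC GLM.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)

    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for fast, medium, and slow groups (not the study's data).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(round(f_stat, 2), df_b, df_w)  # 13.0 2 6
```

The degrees of freedom are the number of groups minus one and the number of participants minus the number of groups; a large F means the group means differ by more than the within-group spread would suggest.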

Phase 2. Results were integrated through an analysis of independent variance in CCTST scores predicted by test-taking times on the CCTST and Sets I and II of the Raven Advanced Progressive Matrices (CRITTIME and APMTIME), Raven Advanced Progressive Matrices Set II (APM) scores, and California Critical Thinking Dispositions Inventory (CCTDI) scores. The data were analyzed using a stepwise multiple regression approach, with CCTST scores as the dependent variable and CRITTIME, APMTIME, APM, and CCTDI scores as independent variables. In this procedure, the model begins with no independent variables. The procedure selects the variables that contribute the largest significant F statistic to the model (at the .05 level). Variables are added one by one to the model if the F statistic for inclusion is significant at the .05 level, but are subject to deletion from the model if the F statistic is not significant (at the .05 level) in the context of variables already included in the model. This procedure ends when none of the variables outside the model has an F statistic that is statistically significant, and all variables inside the model are significant.
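The entry step of the procedure described above can be sketched in plain Python. This is a simplified illustration, not a reproduction of SAS PROC REG: it implements forward entry only (the deletion check is omitted), and it compares each candidate's F-to-enter against a caller-supplied threshold `f_crit` instead of computing a p value. The function names, the predictor names `x1` to `x3`, the toy data, and the threshold of 5.0 are all assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(predictors, y, f_crit):
    """Add predictors one at a time while the best F-to-enter exceeds f_crit.

    Returns a list of (name, increment in R^2) in order of entry.
    """
    selected, steps, r2 = [], [], 0.0
    remaining = dict(predictors)
    n = len(y)
    while remaining:
        best = None
        for name, col in remaining.items():
            cols = [predictors[s] for s in selected] + [col]
            r2_new = r_squared(cols, y)
            df_resid = n - len(cols) - 1
            f_enter = (r2_new - r2) / ((1.0 - r2_new) / df_resid)
            if best is None or f_enter > best[1]:
                best = (name, f_enter, r2_new)
        if best[1] <= f_crit:
            break
        name, _, r2_new = best
        steps.append((name, r2_new - r2))
        selected.append(name)
        r2 = r2_new
        del remaining[name]
    return steps


# Hypothetical toy data: y is driven almost entirely by x1.
toy = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, 0, 1, 0, 1, 0, 1, 0],
    "x3": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
steps = forward_select(toy, y, f_crit=5.0)  # 5.0 is an arbitrary illustrative threshold
print(steps)
```

The increments in R^2 returned by the sketch play the same role as the "independent variance" percentages reported for each predictor: each value is the additional variance explained when that variable enters the model.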


Results

Phase 1

All Phase 1 analyses were evaluated using an alpha level of .05. Means and standard deviations of CCTST scores for all three time groups are shown in Table 3.

The relationship between membership in a slow, medium, or fast test-taking time group and CCTST scores resulted in an F statistic (3, 143) of 39.66, which was significant at p < .0001. Here, slower test-taking times were associated with higher CCTST scores, which failed to support the "smart and fast" and "independence" hypotheses. The relationship between CCTST test-taking time and CCTDI scores resulted in an F statistic (2, 144) of 7.12, which was significant at p = .0011. These two initial findings appear to support the "slow and careful" hypothesis.

When APM scores were added as a covariate, however, the resulting F statistic (df = 2, 144; F = 2.58) was not significant (p = .07) for CCTDI scores. Thus, controlling for individual differences in mental ability resulted in no significant differences between speed of completing the CCTST and self-reported CT dispositions as measured by the CCTDI, which failed to support the "slow and careful" hypothesis. In contrast, the relationship between longer test-taking times and higher CCTST scores was significant, even after controlling for individual differences in mental ability (p = .04). The Bonferroni procedure was used to conduct post hoc analyses of all pairwise means, while controlling for the Type I error rate. All pairwise comparisons were significant (p = .05), which appeared to support the "slow and careful" hypothesis when applied to actual performance on a CT test.
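For readers unfamiliar with the Bonferroni procedure mentioned above, the sketch below shows its core idea: the familywise alpha is divided by the number of pairwise comparisons, so each comparison is tested at a stricter level. The function name `bonferroni_pairs` is hypothetical, and the sketch shows only the alpha adjustment, not the pairwise tests themselves.

```python
from itertools import combinations


def bonferroni_pairs(group_names, alpha=0.05):
    """Return all pairwise comparisons and the per-comparison alpha level."""
    pairs = list(combinations(group_names, 2))
    return pairs, alpha / len(pairs)


pairs, per_test_alpha = bonferroni_pairs(["slow", "medium", "fast"])
print(pairs)
print(round(per_test_alpha, 4))  # 0.0167
```

With three time groups there are three pairwise comparisons, so a familywise level of .05 corresponds to roughly .0167 per comparison.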

Phase 2

The results of a stepwise regression of CCTST scores on CRITTIME, CCTDI, and APM scores are given in Table 4. Raven Advanced Progressive Matrices--Set 2 (APM) scores explained the largest portion of significant independent variance in CCTST scores (42%), followed by CCTDI scores (6%) and time taken to complete the CCTST (CRITTIME; at 3%). A total of 53% of the variance in CCTST scores was accounted for by the variables in the study.

Discussion


Many studies that investigate the effects of test-taking time on test scores are designed as experiments, where test-taking time is deliberately manipulated by the experimenter (e.g., Powers & Fowles, 1996; Wild, Durso, & Rubin, 1982). While experiments are perhaps the most powerful method for understanding the influence of test-taking time on outcomes, the present study is correlational in nature. Here, our time data are continuous, and there is no examiner manipulation of conditions. As stated previously, the CCTST test developers recommend 45 minutes as a comfortable amount of time within which test takers can reasonably finish, but allow for unlimited time in the context of establishing norms. Thus, our investigation of time as a continuous variable corresponds more closely to natural test-taking conditions that involve the CCTST.

The results of our regression analyses showed that nonverbal reasoning ability on the APM (as a proxy for general mental ability) was the most robust predictor, accounting for the largest portion of variance in CCTST scores (42%). This was not surprising, given the difficulty level of the logical mental operations required to successfully answer items. Brighter test takers are more likely to answer items correctly, and thus are more likely to achieve higher overall CCTST scores. However, the manner in which mental ability advantages translate into test-taking style (which in turn influences the time taken to complete the test) can be characterized by two possible scenarios. Bright test takers may be more likely to identify the correct answer more quickly than test takers who are less bright, resulting in faster test-taking time. Or, bright test takers may be more prone to carefully weigh options and check their reasoning several times before answering (or before being satisfied with their answer). Less mentally capable test takers may not be willing to spend the time necessary to answer correctly. As the disposition to carefully weigh options is the component of interest, it is necessary to see whether it influences overall scores over and above the influence of general mental ability. The results of our regression analysis showed that 9% of significant variance in CCTST scores was independent of general mental ability (i.e., attributable to CCTDI scores and CCTST time).

The "smart and fast" hypothesis was not supported by our data. A direct refutation of this hypothesis is the correlation of +.48 (see Table 2) between APM scores and time taken to complete the CCTST (CRITTIME). For this hypothesis to be tenable, the correlation would necessarily be negative (higher APM scores associated with faster CCTST times). Our data suggest that CCTST users who finish the test in 25 minutes or less (which corresponds to our fast group) are not likely to include many high scorers. Roughly 75 percent of all fast group test takers achieved a score of 17 or less (out of 34), which corresponds to the 50th percentile (or lower). If a test taker with no CT ability chose to guess on all items, the average score would be expected to be approximately 8 items correct (19 four-choice items plus 15 five-choice items equals an expected guessing score of 7.7). Six persons in the fast group achieved CCTST scores of 8 or less, compared to 1 person in the middle group and 0 persons in the slow group.
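The chance-level score mentioned above follows from simple expected-value arithmetic: each item contributes one divided by its number of answer choices. A minimal Python sketch, assuming pure random guessing on every item (the function name is hypothetical):

```python
from fractions import Fraction


def expected_guessing_score(item_counts):
    """Expected number correct under random guessing.

    item_counts maps the number of answer choices to the number of
    items that have that many choices.
    """
    return sum(
        Fraction(n_items, n_choices)
        for n_choices, n_items in item_counts.items()
    )


# Item counts from the text: 19 four-choice items and 15 five-choice items.
cctst_items = {4: 19, 5: 15}
expected = expected_guessing_score(cctst_items)

print(float(expected))  # 7.75
```

The exact value is 19/4 + 15/5 = 7.75, which the text reports as 7.7 and rounds to approximately 8 items.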

There was mixed support for the "slow and careful" hypothesis. The only way to isolate the effects of slower and more reflective test taking on test scores was to remove the intellectual advantage that is afforded to test takers with higher general mental ability. This was done with both self-reported CT dispositions (CCTDI scores) and actual scores on a CT ability test (CCTST scores) as dependent variables. There were no significant differences between groups in their self-reported CT dispositions. Although the overall F test was not significant, a Bonferroni post hoc analysis of pairwise means is designed to identify precisely which groups were (or were not) significantly different in mean CCTDI scores. This post hoc analysis showed nonsignificant differences between the slow and medium groups, but significant differences (at p < .05) between the fast and medium groups. In other words, CCTST test takers who completed the test in 25 minutes or less rated themselves as adopting less of a CT attitude in their approach to life compared to the other two time groups. The "slow and careful" hypothesis was supported, however, with CCTST scores as the dependent variable. Taking one's time did give a score advantage to all groups, even when the effects of general mental ability were removed.

Implications for College Instruction

The results of this study have specific implications for college instructors who wish to infuse CT measurement in the evaluation of course objectives. First, instructors must keep clear distinctions between the ability to think critically versus the disposition to think critically. Failure to make this distinction is sure to result in disillusionment with the often repeated and somewhat ill-defined goal of improving critical thinking. It is indeed admirable to teach students to "look for facts which do not support their views," "seek relevant information for a problem," or to "use clarity in stating questions" (which are all desirable dispositions necessary for a critical approach to life). However, the use of a commercial CT ability test is unlikely to measure these behaviors to a significant degree. Thus, a high score on a commercial CT ability test does not necessarily imply that a student has internalized the attitudes toward general academic material that are reflective of a good critical thinker (such as those assessed by the CCTDI). On the other hand, variability in scores on CT ability scales (similar to the CCTST) is influenced to a significant degree by preexisting individual differences in general mental ability. Students who may not be adept at difficult pencil-and-paper verbal logic problems can still benefit from explicit teaching on the advantages of a more critical approach to life (e.g., how to evaluate deceptive advertising in TV commercials).

Second, instructors who are clear about their interest in measuring CT ability (and not dispositions) must face decisions that involve some difficult trade-offs. Measuring CT using tasks or items that are tied to specific course content requires a teacher-constructed test. As most instructors do not have the psychometric skills or resources to design a useful CT test, it is much easier to use a commercially available test such as the CCTST. The rub here is that the CCTST (like other similar tests) is purposely constructed using tasks that are not tied to specific content learned in a specific course of study. If an instructor wants to measure gains in CT ability using such tests (e.g., using a pretest-posttest design), the conditions under which such an undertaking is useful are that (a) the class is fairly homogeneous in general mental ability and (b) the objectives of the class are tied to the kind of verbal logic problems found on the CCTST.

If the CCTST is used in accord with these conditions, then our third recommendation is for instructors to allow a generous amount of time (even longer than the 45 minutes suggested in the manual) for all students to complete the CCTST (since time taken to complete the CCTST contributes a small but significant portion to variance in total scores). Our results suggest a specific time threshold below which CCTST scores are likely to reflect both poor CT ability and low CT dispositions. That is, students who complete the CCTST in under 25 minutes are less likely to obtain scores substantially above the 50th percentile (according to our sample). However, there may be intellectually gifted students who possess a rare combination of high general mental ability, coupled with superior verbal logic skills, who may obtain a high CCTST score within 25 minutes. According to results shown in Table 3, a CCTST score of 24 correct (out of 34 items) reflects roughly two standard deviations above the mean of the Fast Group (low scorers) and one standard deviation above the mean of the Slow Group (high scorers). For students who complete the CCTST under 25 minutes, we suggest this cut score as an approximate point above which facility with multiple choice verbal logic tasks reflects truly superior CT skills.
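The position of the suggested cut score relative to each time group can be checked against the group means and standard deviations reported in Table 3. The Python sketch below is only an illustration of that arithmetic; the helper name is hypothetical.

```python
def z_score(score, mean, sd):
    """Distance of a score from a group mean, in standard deviation units."""
    return (score - mean) / sd


# Group means and standard deviations of CCTST scores, from Table 3.
groups = {
    "Fast": (14.0, 4.7),
    "Slow": (19.2, 4.6),
}

cut_score = 24  # suggested cut score, out of 34 items

z_by_group = {
    name: round(z_score(cut_score, mean, sd), 2)
    for name, (mean, sd) in groups.items()
}

for name, z in z_by_group.items():
    print(f"{name} group: {z:+.2f} SD")
```

This gives about +2.13 SD for the Fast Group and +1.04 SD for the Slow Group, consistent with the "roughly two" and "one" standard deviations stated above.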


References

Ablard, K. E., & Mills, C. J. (1996). Evaluating abridged versions of the Raven's Advanced Progressive Matrices for identifying students with academic talent. Journal of Psychoeducational Assessment, 14, 54-64.

Aretz, A. J., Bolen, M. T., & Devereux, K. E. (1997). Critical thinking assessment of college students. Journal of College Reading and Learning, 28(1), 12-23.

Braden, J. (2000). Perspectives on the nonverbal assessment of intelligence [Editor's introduction]. Journal of Psychoeducational Assessment, 18(3), 204-210.

Bridges, K. R. (1982, April). Order of finish, time to completion, and performance on course-base objective examinations: A systematic analysis of their relationship on both individual exam scores and total score across three exams. Paper presented at the annual meeting of the Eastern Psychological Association, Baltimore, MD.

Carroll, J. B. (1993). Human cognitive abilities. Cambridge, England: Cambridge University Press.

Cattell, R. B., Cattell, A. K., & Cattell, H. E. (1993). Sixteen Personality Factor Questionnaire (5th ed.). Champaign, IL: Institute for Personality and Ability Testing.

Ennis, R. (1993). Critical thinking assessment. Theory Into Practice, 32(3), 179-186.

Ennis, R., Millman, J., & Tomko, T. (1985a). Cornell Critical Thinking Essay Test manual. Pacific Grove, CA: Midwest Publications.

Ennis, R., Millman, J., & Tomko, T. (1985b). Cornell Critical Thinking Tests Level X and Level Z manual. Pacific Grove, CA: Midwest Publications.

Evans, F. R., & Reilly, R. R. (1972). A study of speededness as a source of test bias. Journal of Educational Measurement, 9, 123-131.

Facione, P. A. (1990). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction: Research findings and recommendations. Newark, DE: American Philosophical Association. (ERIC Document Reproduction Service No. ED315423)

Facione, P. A., Facione, N. C., Blohm, S. W., Howard, K., & Giancarlo, C. A. F. (1998). The California Critical Thinking Skills Test manual (Rev. ed.). Millbrae, CA: California Academic Press.

Facione, P. A., Facione, N. C., & Giancarlo, C. A. F. (1996). The California Critical Thinking Disposition Inventory Test manual. Millbrae, CA: California Academic Press.

Facione, P. A., Sanchez, C. A., Facione, N. C., & Gainen, J. (1995). The disposition toward critical thinking. The Journal of General Education, 44(1), 1-25.

Frisby, C. L. (1992). Construct validity and psychometric properties of the Cornell Critical Thinking Test (Level Z): A contrasted group analysis. Psychological Reports, 71, 291-303.

Halpern, D. (1996). Thought and knowledge: An introduction to critical thinking (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Herman, W. E. (1997). The relationship between time to completion and achievement on multiple choice exams. Journal of Research and Development in Education, 30(2), 113-117.

Jacobs, S. S. (1995). Technical characteristics and some correlates of the California Critical Thinking Skills Test, Forms A and B. Research in Higher Education, 36(1), 89-108.

Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger Publishers.

Kitchener, K. S. (1993). Developmental range of reflective judgement: The effect of contextual support and practice on developmental stage. Developmental Psychology, 29(5), 893-906.

Lindley, R. H., Smith, W. R., & Thomas, T. J. (1988). The relationship between speed of information processing as measured by timed paper-and-pencil tests and psychometric intelligence. Intelligence, 12, 17-25.

Llabre, M. M., & Froman, T. W. (1987). Allocation of time to test items: A study of ethnic differences. Journal of Experimental Education, 55(3), 137-140.

McLaurin, W. A., & Farrar, W. E. (1973). Validities of the Progressive Matrices Tests against IQ and grade point average. Psychological Reports, 32, 803-806.

McPeck, J. E. (1981). Critical thinking and education. New York: St. Martin's Press.

Michael, W. B. (1995). Review of the California Critical Thinking Skills Test. In J. C. Conoley & J. C. Impara (Eds.), The twelfth mental measurements yearbook. Lincoln, NE: University of Nebraska Press.

Mills, C. J., Ablard, K. E., & Brody, L. E. (1993). The Raven's Progressive Matrices: Its usefulness for identifying gifted/talented students. Roeper Review, 15, 183-186.

Mines, R. A., King, P. M., Hood, A. B., & Wood, P. K. (1990). Stages of intellectual development and associated critical thinking skills in college students. Journal of College Student Development, 31, 538-547.

Neubauer, A. C. (1997). The mental speed approach to the assessment of intelligence. In J. Kingma & W. Tomic (Eds.), Advances in cognition and educational practice (Vol. 4, pp. 149-173). Greenwich, CT: JAI Press.

Norris, S. P. (1989). Evaluating critical thinking: The practitioner's guide to teaching thinking series. Pacific Grove, CA: Critical Thinking Press.

Norris, S. P. (1990). Effect of eliciting verbal reports of thinking on critical thinking test performance. Journal of Educational Measurement, 27(1), 41-58.

Olsen, S. A. (1990, October). Examining the relationship between college core course areas and sophomore critical thinking test scores. Paper presented at the American Evaluation Association's annual conference, Washington, DC. (ERIC Document Reproduction Service No. ED328145)

Pascarella, E. T., Bohr, L., Nora, A., & Terenzini, P. T. (1996). Is differential exposure to college linked to the development of critical thinking? Research in Higher Education, 37(2), 159-174.

Paul, R. W. (1995). Critical thinking: How to prepare students for a rapidly changing world. Santa Rosa, CA: Foundation for Critical Thinking.

Powers, D. E., & Fowles, M. E. (1996). Effects of applying different time limits to a proposed GRE writing test. Journal of Educational Measurement, 33(4), 433-452.

Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press.

Raven, J., Raven, J. C., & Court, J. H. (1998a). Advanced Progressive Matrices Test manual (1998 ed.). Oxford, England: Oxford Psychologists Press.

Raven, J., Raven, J. C., & Court, J. H. (1998b). Manual for Raven's Progressive Matrices and Vocabulary Scales: General overview. Oxford, England: Oxford Psychologists Press.

SAS Institute. (1999). SAS for Windows (Version 8.2). Cary, NC: Author.

Spearman, C. (1923). The nature of "intelligence" and the principles of cognition. London: Macmillan.

Watson, G., & Glaser, E. (1980). Watson-Glaser Critical Thinking Appraisal. San Antonio, TX: The Psychological Corporation.

Wild, C. L., Durso, R., & Rubin, D. B. (1982). Effect of increased test-taking time on test scores by ethnic groups, years out of school, and sex. Journal of Educational Measurement, 19, 19-28.

Craig L. Frisby, Ph.D., is the director of the School Psychology Program and Associate Professor of School Psychology in the Department of Educational, School, and Counseling Psychology at the University of Missouri. He is a former Associate Editor of the School Psychology Review, the official journal of the National Association of School Psychologists. He is co-editor of the text Test Interpretation and Diversity, published by the American Psychological Association (APA). He is also a member of the APA Division 16 Task Force on Evidence-Based Interventions. His research interests are in psychoeducational assessment, curriculum-based assessment, the measurement of thinking skills, and multidimensional scaling applications.

Bobby K. Traffanstedt, MA, Ed.S., is now a school psychologist with the Columbia Public Schools in Columbia, Missouri. He received an MA in School Psychology (1999) and Education Specialist degree (2003) from the University of Missouri-Columbia. Mr. Traffanstedt's research interests include educational topics such as slow learners, behavioral issues, and issues surrounding the teaching of critical thinking at lower (K-3) grade levels.

Correspondence concerning this article should be addressed to Craig L. Frisby, Department of Educational, School, and Counseling Psychology at the University of Missouri, Columbia, Missouri 65211-2130.

Table 1
Means and Standard Deviations for All Variables (N = 147)

| Variable | M (a) | SD |
|----------|-------|------|
| CCTST | 16.6 | 5.0 |
| APM | 23.1 | 6.5 |
| CCTDI | 299.5 | 30.3 |
| CRITTIME | 28.4 | 8.4 |

Note. CCTST = California Critical Thinking Skills Test. CCTDI = California Critical Thinking Dispositions Inventory. APM = Raven Advanced Progressive Matrices, Set 2. CRITTIME = Time taken to complete CCTST (in minutes).
(a) CCTST scores range from a low of 0 (all items incorrect) to a high of 34 (all items correct). CCTDI Total Score < 280 = serious overall deficiency in critical disposition and > 350 = consistent strength in critical thinking disposition. APM scores range from a low of 0 (all items incorrect) to a high of 36 (all items correct).

Table 2
First-Order Pearson Correlation Matrix of Variables (N = 147)

| Variable | CCTST | CCTDI | APM | CRITTIME |
|----------|-------|-------|------|----------|
| CCTST | 1.00 | 0.49 | 0.65 | 0.51 |
| CCTDI | | 1.00 | 0.39 | 0.29 |
| APM | | | 1.00 | 0.48 |
| CRITTIME | | | | 1.00 |

Note. All correlations are significant at p < .01. CCTST = California Critical Thinking Skills Test. CCTDI = California Critical Thinking Dispositions Inventory. APM = Raven Advanced Progressive Matrices, Set 2. CRITTIME = Time taken to complete CCTST (in minutes).

Table 3
Means and Standard Deviations of California Critical Thinking Skills Test (CCTST) Scores for Test-Taking Time Groups

| Group (a) | n | M (b) | SD |
|-----------|----|-------|-----|
| Slow | 50 | 19.2 | 4.6 |
| Medium | 46 | 16.6 | 4.2 |
| Fast | 51 | 14.0 | 4.7 |

Note. (a) Slow Group = 31-71 minutes; Medium Group = 26-30 minutes; Fast Group = 10-25 minutes. (b) Mean CCTST score.

Table 4
Summary of Stepwise Regression Procedures with California Critical Thinking Skills Test (CCTST) Scores as the Dependent Variable

| Step | Variable | Partial R² | Model R² | F | Significance |
|------|----------|------------|----------|--------|--------------|
| 1 | APM | .42 | .429 | 109.15 | <.0001 |
| 2 | CCTDI | .06 | .498 | 19.66 | <.0001 |
| 3 | CRITTIME | .03 | .534 | 11.34 | <.001 |

Note. CCTDI = California Critical Thinking Dispositions Inventory. APM = Raven Advanced Progressive Matrices, Set 2. CRITTIME = Time taken to complete CCTST (in minutes).

Gale Copyright:

Copyright 2003 Gale, Cengage Learning. All rights reserved.
