The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers

Hao-Ping (Hank) Lee, Advait Sarkar, Lev Tankelevitch, Ian Drosos, Sean Rintel, Richard Banks, Nicholas Wilson
CHI '25
Carnegie Mellon University and Microsoft Research

Overall Summary

Study Background and Main Findings

This research investigates how Generative AI (GenAI) tools influence critical thinking among knowledge workers. The study, conducted through a survey of 319 professionals using GenAI at least weekly, sought to understand when and how these workers perceive their engagement in critical thinking while using these tools, and how GenAI affects the perceived effort required for such thinking. The researchers used a mixed-methods approach, combining quantitative analysis of survey responses with qualitative analysis of participants' free-text explanations. Critical thinking was operationalized (defined in a measurable way) using Bloom's taxonomy, a hierarchical framework of cognitive skills ranging from basic knowledge recall to higher-order evaluation.

The study found a complex relationship between confidence and critical thinking. Higher confidence in the GenAI tool itself was associated with less reported critical thinking, while higher self-confidence in performing the task without AI was associated with more critical thinking. Quantitatively, the majority of participants reported that GenAI reduced the perceived effort required for various cognitive activities associated with critical thinking. For example, 72% reported less effort for 'Knowledge,' 79% for 'Comprehension,' and 69% for 'Application' tasks when using GenAI compared to not using it. However, the qualitative analysis revealed a more nuanced picture.

Qualitatively, the researchers identified three key shifts in the nature of critical thinking. First, effort shifted from gathering information to verifying information provided by the AI. Second, for tasks involving application of knowledge, effort shifted from problem-solving to integrating the AI's response into a usable form. Finally, for higher-order tasks like analysis, synthesis, and evaluation, effort shifted from direct task execution to stewardship of the AI, meaning guiding and monitoring the AI's output to ensure it met the desired quality and objectives.

The main conclusion is that GenAI tools are not simply replacing critical thinking but are transforming it. While GenAI may reduce the effort required for some cognitive tasks, it increases the need for others, particularly those related to verification, integration, and oversight. This has significant implications for the design of GenAI tools and the training of knowledge workers, suggesting a need to focus on skills that complement AI capabilities rather than being replaced by them.

Research Impact and Future Directions

The study provides valuable insights into the complex relationship between Generative AI (GenAI) use and critical thinking in knowledge work, though its correlational design warrants caution in interpretation. While the data show associations between confidence levels and critical thinking, these are observed relationships, not necessarily direct cause-and-effect links. For example, increased self-confidence might correlate with more critical thinking, but other factors, such as prior experience or personality traits, could also play a significant role.

The research's practical utility lies in its identification of how GenAI is changing the nature of cognitive tasks. The shift from information gathering to verification, and from task execution to stewardship, highlights the need for new skills and training for knowledge workers. These findings place the study within a broader context of understanding how technology reshapes work and the need for continuous adaptation.

Based on the findings, a key recommendation is to design GenAI tools that promote a balance between leveraging AI's capabilities and fostering users' critical thinking skills. This includes incorporating features that encourage verification of AI-generated content and support users in effectively managing and overseeing AI-assisted tasks. However, it's important to acknowledge that the study relies on self-reported data, which may be subject to biases.

Several critical questions remain unanswered. How can we objectively measure critical thinking engagement in GenAI-assisted tasks, beyond self-reporting? What are the long-term effects of GenAI use on cognitive skills and professional development? While the study's limitations, such as the potential for self-reporting bias and the focus on a specific population, don't invalidate the findings, they do highlight the need for further research to explore these questions and confirm the generalizability of the results.

Critical Analysis and Recommendations

Clear Research Questions (written-content)
The abstract clearly states the research questions, providing a concise overview of the study's objectives. This clarity helps readers quickly grasp the core focus of the research, improving its accessibility and impact.
Section: Abstract
Clarify Types of Knowledge Work (written-content)
The abstract does not specify the types of knowledge work investigated. Adding examples (e.g., writing, data analysis, coding) would enhance the abstract's informativeness and help readers quickly assess the study's relevance.
Section: Abstract
Historical Contextualization (written-content)
The introduction effectively establishes the context by referencing historical concerns about technology's impact on human thought. This framing helps readers understand the broader significance of the research and its relevance to ongoing debates.
Section: Introduction
Define 'Critical Thinking' Earlier (written-content)
The introduction does not define 'critical thinking' until Section 2. Defining this central concept earlier, in the first or second paragraph, would significantly improve reader comprehension and strengthen the paper's conceptual foundation from the outset.
Section: Introduction
Use of Validated Instruments (written-content)
The method section details the use of validated instruments, such as the Reflective Thinking Inventory and the Propensity to Trust Technology scale. Using established measures enhances the study's reliability and allows for comparisons with other research.
Section: Method
Specify Versions of Online Resources (written-content)
The method section does not specify the version or date of access for online resources like Prolific and O*NET. Including this information is crucial for reproducibility, as different versions can have variations in features or participant pools.
Section: Method
Structured Framework for Critical Thinking Practices (written-content)
The section clearly maps knowledge workers' critical thinking practices into three phases: goal and query formation, inspect response, and integrate response. This structured framework provides a clear and understandable model for analyzing the complex process of critical thinking in the context of GenAI use.
Section: Findings for RQ1: When and how do knowledge workers perceive the enaction of critical thinking when using GenAI?
Consistently Report Number of Participants for Qualitative Findings (written-content)
The section does not consistently report the number of participants contributing to each qualitative finding. Consistent reporting (e.g., "Six of 319 participants...") would improve clarity and make the prevalence of each finding immediately apparent.
Section: Findings for RQ1: When and how do knowledge workers perceive the enaction of critical thinking when using GenAI?
Identification of Qualitative Shifts in Critical Thinking Effort (written-content)
The section identifies three distinct shifts in critical thinking effort: from information gathering to verification, from problem-solving to response integration, and from task execution to stewardship. This provides a concise and insightful summary of how GenAI is changing the nature of cognitive work.
Section: Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?
Integrate Quantitative and Qualitative Findings More Explicitly (written-content)
The section does not fully integrate the quantitative and qualitative findings. Adding explicit links between the quantitative data (e.g., decreased perceived effort) and the qualitative shifts (e.g., from gathering to verification) would create a more compelling and unified narrative.
Section: Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?
Novel Concept of Task Stewardship (written-content)
The discussion introduces the concept of "task stewardship" to describe the shift in cognitive effort from task execution to oversight. This novel metaphor provides a valuable framework for understanding the changing role of knowledge workers in the age of GenAI.
Section: Discussion
Analyze Relationships Between Shifts in Critical Thinking (written-content)
The discussion does not fully analyze the relationships between the three identified shifts in critical thinking effort. Exploring these interconnections (e.g., how verification relates to integration and stewardship) would provide a more nuanced and holistic understanding.
Section: Discussion

Section Analysis

Method

Non-Text Elements

Table 1: Categories and sub-categories for GenAI tool usage [13].
First Reference in Text
Task type. Brachman et al. [13] classify knowledge workers' current usage of GenAI tools into nine types (See Table 1), grouped into three major categories: 1) for creation, 2) to find or work with information, 3) to get advice.
Description
  • Categorization of GenAI Tool Usage: Table 1 categorizes the different ways knowledge workers use GenAI tools. The categories are grouped into three major areas: 1) creation, 2) information handling, and 3) advice-seeking. Within these major categories, there are nine specific types of GenAI tool usage. For creation, the tool can generate an artefact or an idea. For information handling, the tool can search, help a user learn, or summarize information. For advice, the tool can improve something, provide guidance, or validate an artefact.
Scientific Validity
  • Classification based on existing research and MECE categories: The classification is based on existing research (Brachman et al. [13]), which adds to its validity. The categories are mutually exclusive and collectively exhaustive (MECE), which is important for a well-defined taxonomy. The table serves as a basis for further analysis and comparison within the study.
Communication
  • Clear referencing and contribution to the methodology: The table provides a structured overview of how GenAI tools are used, which helps readers understand the scope of the study and how different tasks were classified. The reference to Brachman et al. [13] provides credibility and allows readers to explore the original source for more details.
Table 2: Cognitive activities defined in Bloom's taxonomy [12].
First Reference in Text
See Table 2 for more details.
Description
  • Bloom's Taxonomy Levels: Table 2 outlines Bloom's taxonomy, which categorizes learning objectives into different levels of cognitive complexity. The levels are: Knowledge (recalling information), Comprehension (understanding the meaning), Application (using knowledge in new situations), Analysis (breaking down information), Synthesis (creating something new), and Evaluation (judging the value of information). Each level is defined by specific cognitive activities.
Scientific Validity
  • Validity of Bloom's Taxonomy: Bloom's taxonomy is a widely accepted and validated framework in education and cognitive psychology. Its inclusion provides a solid theoretical basis for analyzing critical thinking activities. The table accurately represents the core concepts of Bloom's taxonomy, providing a valid framework for the study.
Communication
  • Clarity and conciseness of descriptions: The table effectively communicates the different levels of cognitive skills within Bloom's taxonomy. The descriptions are concise and provide a clear understanding of each level. Referencing Bloom's taxonomy [12] provides a strong foundation and allows readers to consult the original source for a more in-depth explanation.
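
To make the taxonomy's role as a coding scheme concrete, the following minimal Python sketch represents the six levels as a simple mapping, in the spirit of Table 2. The dictionary name and the exact wording of the definitions are illustrative assumptions, not the paper's instrument.

```python
# Illustrative sketch: Bloom's taxonomy levels as a lookup table.
# Definitions paraphrase Table 2; this is not the authors' coding tool.
BLOOMS_TAXONOMY = {
    "Knowledge": "Recalling facts, terms, and basic concepts",
    "Comprehension": "Understanding and interpreting the meaning of information",
    "Application": "Using knowledge in new, concrete situations",
    "Analysis": "Breaking information into parts to explore relationships",
    "Synthesis": "Combining elements into a new, coherent whole",
    "Evaluation": "Judging the value of information against criteria",
}

for level, definition in BLOOMS_TAXONOMY.items():
    print(f"{level}: {definition}")
```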
Table 3: Participant demographics.
First Reference in Text
The 319 participants (159 men, 153 women, 5 non-binary/gender diverse, 2 prefer not to say) came from diverse age groups, occupations, and countries of residence (see Table 3).
Description
  • Demographic Characteristics: Table 3 presents the demographic characteristics of the 319 participants in the study. The table includes information on gender (159 men, 153 women, 5 non-binary/gender diverse, and 2 prefer not to say), age groups, occupations, and countries of residence. The top 5 GenAI tools used are also listed, along with the top 5 occupations and countries of residency. The table provides a snapshot of the sample's diversity across various attributes.
Scientific Validity
  • Importance of demographic information: Providing participant demographics is crucial for assessing the generalizability of the study's findings. Including a diverse range of demographic factors strengthens the study's rigor. Listing the top 5 GenAI tools and occupations adds valuable context for interpreting the results.
Communication
  • Comprehensive overview of participant demographics: The table effectively communicates the demographic composition of the study participants. Including information on gender, age, occupation, and country of residence provides a comprehensive overview of the sample. The table format allows for easy comparison of different demographic categories.
Figure 1: Schematic overview of the survey design and our corresponding analysis approach.
First Reference in Text
Both RQ1 - when and how do knowledge workers perceive the enaction of critical thinking when using GenAI? - and RQ2 - when and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI? - were answered via both quantitative and qualitative analysis (See Figure 1 for an overview of our approach).
Description
  • Schematic overview of the methodology: Figure 1 provides a schematic overview of the study's survey design and analysis approach. The figure outlines the key components of the survey, including the GenAI tool use examples, task factors, critical thinking assessment, user factors, and the types of analysis performed (qualitative and quantitative). The diagram visually connects the survey questions to the research questions (RQ1 and RQ2) and indicates the flow of information from the survey to the analysis.
Scientific Validity
  • Accurate representation of the methodology: The schematic accurately represents the methodology described in the paper. The connection between survey components, research questions, and analysis methods is clearly depicted. The figure provides a valid overview of the study's design and contributes to the transparency and reproducibility of the research.
Communication
  • Clarity and organization of the schematic: The figure clearly illustrates the survey design and analysis approach. The use of color-coding and labels helps to differentiate between different components. The schematic provides a high-level understanding of the study's methodology, making it easier for readers to follow the subsequent analysis.
Table 4: Non-standardised coefficients of the mixed-effects regressions modeling knowledge workers' perceived enaction of critical thinking and perceived effort in cognitive activities when using generative AI tools.
First Reference in Text
Table 4 summarises the seven models and reports the corrected p-values.
Description
  • Regression Model Results: Table 4 presents the non-standardized coefficients from mixed-effects regression models used to analyze the relationship between various factors and knowledge workers' perceived enaction of critical thinking and perceived effort in cognitive activities when using GenAI tools. The table includes several models, each examining a different dependent variable, such as perceived enaction of critical thinking, knowledge, comprehension, application, analysis, synthesis, and evaluation. The independent variables include task factors (task type, confidence in self, confidence in AI, confidence in evaluation) and user factors (gender, age, occupation, tendency to reflect, trust in GenAI). Each cell in the table contains the coefficient and the p-value, indicating the statistical significance of the relationship.
Scientific Validity
  • Appropriateness of statistical methods: Mixed-effects regression is an appropriate method for analyzing the data, as it accounts for the nested structure of the data (multiple observations per participant). Reporting non-standardized coefficients is useful for comparing the relative magnitude of effects across different variables. Including corrected p-values addresses the issue of multiple comparisons and strengthens the validity of the findings. It is crucial that the assumptions of the mixed models were checked and met.
Communication
  • Density and accessibility of the table: The table is quite dense and may be challenging for readers unfamiliar with regression analysis. Clearer labeling of the rows and columns, along with a brief explanation of the key variables, could improve its accessibility. Highlighting the significant p-values more prominently would also help readers quickly identify the key findings.
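
As a concrete illustration of the modelling approach discussed above, the sketch below fits one mixed-effects regression with a per-participant random intercept and applies a multiple-comparison correction. It is a minimal sketch assuming a long-format DataFrame with hypothetical column names (`perceived_effort`, `task_type`, `conf_self`, `conf_ai`, `participant_id`); the paper does not publish its analysis code, and the Benjamini-Hochberg correction used here is an assumption.

```python
# Minimal sketch of a mixed-effects analysis in the spirit of Table 4.
# Column names and the FDR correction method are assumptions, not the
# authors' published analysis code.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

def fit_effort_model(df: pd.DataFrame):
    """Fit one mixed-effects model with a random intercept per participant."""
    model = smf.mixedlm(
        "perceived_effort ~ task_type + conf_self + conf_ai",  # fixed effects
        data=df,
        groups=df["participant_id"],  # nested data: responses within participants
    )
    return model.fit()

def correct_pvalues(pvals):
    """Adjust p-values across the seven models (Benjamini-Hochberg assumed)."""
    reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
    return reject, p_adj
```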

Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?

Non-Text Elements

Figure 2: Distribution of perceived effort (%) in cognitive activities (based on Bloom's taxonomy) when using a GenAI tool compared to not using one.
Description
  • Distribution of Perceived Effort: Figure 2 presents the distribution of perceived effort in cognitive activities when using a GenAI tool compared to not using one. The cognitive activities are based on Bloom's taxonomy and include Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation. The figure shows the percentage of responses for each effort level (Much less effort, Less effort, About the same, More effort, Much more effort) for each cognitive activity. For instance, for Knowledge, approximately 41% of respondents reported 'Less effort,' while about 30% reported 'About the same.'
Scientific Validity
  • Limitations of self-reported data: The figure provides a visual representation of the survey responses regarding perceived effort. However, it's important to consider the limitations of self-reported data and potential biases in participants' responses. The figure should be interpreted in conjunction with the statistical analysis and qualitative data to provide a more complete understanding.
  • Use of percentages and sample size: The use of percentages allows for comparison across different cognitive activities. The sample size for each activity should be considered when interpreting the distributions. It is important to note whether the differences in distributions across the cognitive activities are statistically significant.
Communication
  • Effectiveness of stacked bar chart visualization: The figure uses stacked bar charts to display the distribution of perceived effort for each cognitive activity. This visualization is generally effective in showing the relative proportions of different effort levels. However, the overlapping segments could make it difficult to precisely compare specific effort levels across different activities.
  • Clarity of labels and percentages: The figure clearly labels the cognitive activities based on Bloom's taxonomy and includes percentages, which aids in understanding the data. However, adding the actual number of responses for each category could further enhance clarity.
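
For readers who want to reproduce this style of visualization from raw survey responses, the sketch below builds a normalized stacked bar chart with pandas and matplotlib. The inline rows are placeholder data included solely to make the example runnable; they are not the study's responses.

```python
# Sketch: Likert-style stacked bar chart of perceived effort per cognitive
# activity. The inline rows are placeholder data, NOT the study's results.
import pandas as pd
import matplotlib.pyplot as plt

EFFORT_LEVELS = [
    "Much less effort", "Less effort", "About the same",
    "More effort", "Much more effort",
]

# Placeholder long-format responses: one row per (participant, activity) rating.
responses = pd.DataFrame({
    "activity": ["Knowledge", "Knowledge", "Comprehension", "Comprehension"],
    "effort": ["Less effort", "About the same", "Less effort", "More effort"],
})

# Percentage of each effort level within each activity.
dist = (
    pd.crosstab(responses["activity"], responses["effort"], normalize="index")
    .reindex(columns=EFFORT_LEVELS, fill_value=0.0)
    * 100
)

ax = dist.plot(kind="barh", stacked=True, figsize=(8, 4))
ax.set_xlabel("Share of responses (%)")
ax.set_ylabel("Cognitive activity (Bloom's taxonomy)")
ax.legend(title="Perceived effort", bbox_to_anchor=(1.02, 1), loc="upper left")
plt.tight_layout()
plt.show()
```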

Discussion

Non-Text Elements

Table 5: Codebook for the qualitative analysis.
First Reference in Text
Not explicitly referenced in main text
Description
  • Overview of the codebook: Table 5 presents the codebook used for the qualitative analysis. It lists the codes used to analyze the free-text responses, organized by research question (RQ1 and RQ2). For each code, a description is provided to define its meaning and scope. The codebook covers topics such as goal and query formation, inspect response, integrate response, critical thinking motivators, critical thinking inhibitors, and reasons for increased/decreased effort.
Scientific Validity
  • Importance of inter-coder reliability: The codebook provides essential information about the qualitative analysis process. It's important to establish inter-coder reliability to ensure the consistency and validity of the coding. Mentioning the process used to develop the codebook and assess inter-coder reliability would strengthen the methodological rigor.
  • Theoretical grounding of the categories: The codebook's categories seem comprehensive, covering various aspects of the research questions. However, it is important to know how these categories were derived (e.g., from the literature, grounded theory approach) to assess their theoretical grounding and potential biases.
Communication
  • Transparency and structure of the codebook: The codebook provides a structured overview of the coding scheme used in the qualitative analysis, which enhances transparency and allows readers to understand how the data was interpreted. Clear and concise descriptions of each code are essential for ensuring inter-coder reliability and facilitating replication.
  • Lack of explicit reference in the main text: Since the codebook is not explicitly referenced in the main text, its importance may be overlooked by readers. Including a clear reference to Table 5 in the results section when discussing the qualitative findings would improve its integration and impact.
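
To illustrate the inter-coder reliability check suggested above, the sketch below computes Cohen's kappa between two coders' labels using scikit-learn. This is a generic example of how such agreement could be assessed, not a procedure the paper reports; the labels are hypothetical.

```python
# Sketch: inter-coder agreement on codebook labels via Cohen's kappa.
# The example labels are hypothetical; the paper does not report kappa.
from sklearn.metrics import cohen_kappa_score

coder_a = ["verify", "verify", "integrate", "stewardship", "verify"]
coder_b = ["verify", "integrate", "integrate", "stewardship", "verify"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values above ~0.8 suggest strong agreement
```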