This research investigates how Generative AI (GenAI) tools influence critical thinking among knowledge workers. The study, conducted through a survey of 319 professionals using GenAI at least weekly, sought to understand when and how these workers perceive their engagement in critical thinking while using these tools, and how GenAI affects the perceived effort required for such thinking. The researchers used a mixed-methods approach, combining quantitative analysis of survey responses with qualitative analysis of participants' free-text explanations. Critical thinking was operationalized (defined in a measurable way) using Bloom's taxonomy, a hierarchical framework of cognitive skills ranging from basic knowledge recall to higher-order evaluation.
The study found a complex relationship between confidence and critical thinking. Higher confidence in the GenAI tool itself was associated with less reported critical thinking, while higher self-confidence in performing the task without AI was associated with more critical thinking. Quantitatively, the majority of participants reported that GenAI reduced the perceived effort required for various cognitive activities associated with critical thinking. For example, 72% reported less effort for 'Knowledge,' 79% for 'Comprehension,' and 69% for 'Application' tasks when using GenAI compared to not using it. However, the qualitative analysis revealed a more nuanced picture.
Qualitatively, the researchers identified three key shifts in the nature of critical thinking. First, effort shifted from gathering information to verifying information provided by the AI. Second, for tasks involving application of knowledge, effort shifted from problem-solving to integrating the AI's response into a usable form. Finally, for higher-order tasks like analysis, synthesis, and evaluation, effort shifted from direct task execution to stewardship of the AI, meaning guiding and monitoring the AI's output to ensure it met the desired quality and objectives.
The main conclusion is that GenAI tools are not simply replacing critical thinking but are transforming it. While GenAI may reduce the effort required for some cognitive tasks, it increases the need for others, particularly those related to verification, integration, and oversight. This has significant implications for the design of GenAI tools and the training of knowledge workers, suggesting a need to focus on skills that complement AI capabilities rather than those AI can replace.
The study provides valuable insights into the complex relationship between Generative AI (GenAI) use and critical thinking in knowledge work, though its correlational design means the findings must be interpreted with care. The data shows associations between confidence levels and critical thinking, but these are observed relationships, not necessarily direct cause-and-effect links. For example, increased self-confidence might correlate with more critical thinking, but other factors, like prior experience or personality traits, could also play a significant role.
The research's practical utility lies in its identification of how GenAI is changing the nature of cognitive tasks. The shift from information gathering to verification, and from task execution to stewardship, highlights the need for new skills and training for knowledge workers. These findings place the study within a broader context of understanding how technology reshapes work and the need for continuous adaptation.
Based on the findings, a key recommendation is to design GenAI tools that promote a balance between leveraging AI's capabilities and fostering users' critical thinking skills. This includes incorporating features that encourage verification of AI-generated content and support users in effectively managing and overseeing AI-assisted tasks. However, it's important to acknowledge that the study relies on self-reported data, which may be subject to biases.
Several critical questions remain unanswered. How can we objectively measure critical thinking engagement in GenAI-assisted tasks, beyond self-reporting? What are the long-term effects of GenAI use on cognitive skills and professional development? While the study's limitations, such as the potential for self-reporting bias and the focus on a specific population, don't invalidate the findings, they do highlight the need for further research to explore these questions and confirm the generalizability of the results.
The abstract clearly states the research questions, providing a concise overview of the study's objectives.
The abstract succinctly summarizes the key findings, including both quantitative and qualitative results.
The abstract effectively highlights the implications of the study, pointing to design challenges and opportunities.
This medium-impact improvement would enhance the reader's understanding of the scope of the study. The abstract, being the first point of contact, should provide a clear indication of the study population. Explicitly stating that the subjects were "knowledge workers" improves clarity. This addition provides immediate context for the reader, helping them understand the target population and the generalizability of the findings.
Implementation: Add the phrase "knowledge workers" to the first sentence, specifying the participant group. For example: "We survey 319 knowledge workers..."
This low-impact improvement would provide a more complete overview of the research approach. While the abstract mentions a survey, briefly including the use of both quantitative and qualitative methods adds a layer of detail. This clarifies the mixed-methods nature of the study, enhancing the reader's understanding of the methodological rigor.
Implementation: Add a phrase indicating the mixed-methods approach. For example, after mentioning the survey, add: "...using both quantitative and qualitative analyses."
This high-impact improvement would strengthen the abstract by providing a more specific indication of the type of knowledge work investigated. This enhances the abstract's informativeness and helps readers quickly assess the study's relevance to their own interests. It also clarifies the scope and context, improving the abstract's overall impact.
Implementation: Include a brief phrase or a few examples of the types of knowledge work considered. For instance: "...in work tasks (e.g., writing, data analysis, coding)."
The introduction effectively establishes the context by referencing historical concerns about the impact of new technologies on human thought, drawing parallels to writing, printing, and calculators.
The introduction clearly defines the scope of the research, focusing on critical thinking in knowledge work and distinguishing it from studies in educational settings.
The research questions are clearly stated and directly address the identified gap in the literature regarding the impact of GenAI on critical thinking.
The introduction effectively motivates the research by highlighting the lack of empirical evidence connecting mechanised convergence with critical thinking.
The introduction concisely outlines the study's methodology, mentioning the survey of knowledge workers and the elicitation of real-world examples.
The introduction provides a brief overview of the key findings, linking confidence in GenAI and self-confidence to critical thinking engagement.
The introduction succinctly lists the paper's contributions, connecting them to the literature review, survey deployment, and design implications.
This high-impact improvement would significantly enhance the introduction's framing by explicitly defining 'critical thinking' earlier. Currently, the term is used without a clear definition until Section 2. The Introduction section sets the stage for the entire paper, and defining this central concept early is crucial for reader comprehension. This change would strengthen the paper's conceptual foundation and improve clarity from the outset.
Implementation: Provide a concise definition of 'critical thinking' in the first or second paragraph, referencing the chosen framework (Bloom's taxonomy). For example: '...critical thinking (defined here, following Bloom et al. [12, 54], as a hierarchical set of cognitive skills including knowledge, comprehension, application, analysis, synthesis, and evaluation).'
This medium-impact improvement would provide a more complete picture of the research context. While the introduction mentions 'knowledge work,' it doesn't specify the types of knowledge work included. The Introduction is responsible for establishing the scope, and adding this detail would enhance reader understanding and allow for better assessment of the study's relevance. This would strengthen the paper's external validity by clarifying the target population.
Implementation: Add a brief phrase or examples of the types of knowledge work considered. For instance, after mentioning 'knowledge work,' add: '...involving tasks such as writing, data analysis, programming, and strategic planning.'
This low-impact improvement would add a layer of detail to the methodological overview. While the introduction mentions a survey, briefly stating the use of both quantitative and qualitative analysis would be beneficial. The Introduction should provide a high-level overview of the approach, and this addition enhances completeness. It signals the mixed-methods nature of the study, improving the reader's understanding of the research design.
Implementation: Add a phrase indicating the mixed-methods approach. For example: '...conducting a survey...and analyzing the data using both quantitative and qualitative methods.'
This medium-impact improvement would strengthen the link between the problem statement and the research questions. While the introduction mentions 'mechanised convergence,' it doesn't explicitly connect this concept to the research questions. The Introduction should clearly establish this link to reinforce the study's motivation. This addition would improve the logical flow and coherence of the introduction.
Implementation: Add a sentence or phrase directly linking 'mechanised convergence' to the research questions. For example: '...to investigate whether and how this mechanised convergence, potentially reflecting a decline in critical thinking, is perceived by knowledge workers using GenAI (RQ1) and how it affects their cognitive effort (RQ2).'
The section clearly outlines the research questions, providing a direct link to the introduction and setting the stage for the methodological approach.
The section provides a concise overview of the survey design, including the cognitive priming of participants with examples of critical thinking.
The section details the dependent and independent variables used in the regression models, providing a clear understanding of the quantitative analysis approach.
The section describes the operationalization of task types based on Brachman et al.'s taxonomy, providing a clear framework for classifying participant examples.
The section explains the measurement of task confidence, including confidence in self, GenAI, and evaluation, providing a comprehensive assessment of user confidence.
The section describes the use of validated instruments to measure user factors, such as the Reflective Thinking Inventory and the Propensity to Trust Technology scale.
The section clearly defines the cognitive activities based on Bloom's taxonomy, providing a framework for assessing perceived effort in critical thinking.
The section details the recruitment process through the Prolific platform, including the criteria for participant selection.
The section describes the data cleaning and analysis procedures, including the handling of missing data and the use of regression models.
The section outlines the qualitative analysis approach, including the open-coding process and the use of negotiated agreement.
This high-impact improvement would significantly enhance the clarity and reproducibility of the study. The Methods section should explicitly state the specific version or date of access for the Prolific platform and the O*NET occupational listings. Different versions can have variations in features, participant pools, or classification systems. Providing this detail ensures that other researchers can accurately replicate the study conditions and understand the specific context of the data collection. This is crucial for the study's transparency and scientific rigor.
Implementation: Include the specific version or date of access for both Prolific and O*NET. For example: 'We recruited participants through the Prolific platform (version accessed March 6, 2025)...' and '...from the Occupational Information Network (O*NET) occupational listings (version 2019, accessed March 6, 2025)...'
This medium-impact improvement would enhance the study's methodological rigor and transparency. While the section mentions excluding responses due to low quality, it doesn't define the specific criteria used to determine 'low-effort free-text responses.' The Methods section should provide these operational definitions. This addition would strengthen the paper by allowing readers to evaluate the validity of the exclusion criteria and understand how data quality was ensured. It also improves the reproducibility of the study.
Implementation: Provide specific criteria for excluding low-quality responses. For example: '...due to low response quality (i.e., responses with fewer than 10 words, responses containing only gibberish, or responses that did not address the question).'
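Exclusion criteria like those suggested above can be made fully operational with a short screening function. The sketch below illustrates the idea; the word-count threshold, the gibberish check, and the sample responses are hypothetical, not the study's actual rules or data.

```python
def is_low_quality(response: str, min_words: int = 10) -> bool:
    """Flag a free-text response as low quality.

    The word-count threshold and the no-alphabetic-content check are
    hypothetical illustrations of such criteria, not the study's rules.
    """
    words = response.split()
    if len(words) < min_words:
        return True  # too short to be a substantive answer
    if not any(ch.isalpha() for ch in response):
        return True  # no alphabetic content at all (e.g. "1234 !!!")
    return False


responses = [
    "I asked the tool to draft a client email, then rewrote the tone "
    "and checked every claim myself before sending it.",
    "asdf qwer",
    "n/a",
]
kept = [r for r in responses if not is_low_quality(r)]
```

Publishing the screening function itself (e.g., in supplementary material) would make the exclusion step exactly reproducible, which is the point of the recommendation.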
This medium-impact improvement would strengthen the study's internal validity and address potential biases. The Methods section mentions randomizing the order of task types but doesn't specify the method of randomization. The Methods section should detail the randomization procedure. This addition is important because different randomization methods can have different properties, and explicitly stating the method used allows readers to assess the potential for bias and the effectiveness of the randomization in controlling for order effects.
Implementation: Specify the method of randomization. For example: 'The order of task types was randomised using a Latin square design...' or '...using a random number generator in the survey software.'
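The simpler of the two options above, a shuffle in the survey software, could be sketched as follows. The task-type labels and the per-participant seeding are assumptions for illustration, not the study's documented procedure.

```python
import random

# Hypothetical task-type labels; the study's actual categories may differ.
TASK_TYPES = ["creation", "information", "advice"]


def randomise_order(participant_id: int) -> list:
    """Return a reproducible per-participant ordering of task types.

    Seeding the generator with the participant ID makes the assignment
    auditable: rerunning for the same participant yields the same order.
    """
    rng = random.Random(participant_id)
    order = TASK_TYPES.copy()
    rng.shuffle(order)
    return order
```

Reporting even this level of detail (shuffle vs. Latin square, and whether the order is reproducible) lets readers judge how well order effects were controlled.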
This low-impact improvement would provide additional context and clarity regarding the study's ethical considerations. While the section mentions IRB approval, it doesn't specify the name of the institution or the approval number. Including this information, while not strictly necessary, enhances transparency. It allows readers to verify the ethical oversight of the study and provides a point of contact for any ethical inquiries.
Implementation: Include the name of the institution and the IRB approval number. For example: 'Our study protocol was approved by the [Institution Name] ethics and compliance review board (Approval #XXXX).'
Figure 1: Schematic overview of the survey design and our corresponding analysis approach.
Table 4: Non-standardised coefficients of the mixed-effects regressions modelling knowledge workers' perceived enaction of critical thinking and perceived effort in cognitive activities when using generative AI tools.
The section clearly introduces the research question (RQ1) and provides a concise overview of the approach, linking qualitative and quantitative analyses.
The section effectively summarizes the qualitative findings, highlighting that knowledge workers view critical thinking as ensuring work quality and objectives.
The section provides a brief overview of the quantitative findings, mentioning the correlation between confidence, reflection, and critical thinking enaction.
The section clearly maps knowledge workers' critical thinking practices into three phases: goal and query formation, inspect response, and integrate response, providing a structured framework for understanding the process.
The section provides detailed descriptions of each critical thinking practice within the three phases, including specific examples and participant quotes.
The section effectively connects the findings to prior work, referencing relevant frameworks and studies on human cognitive problem solving and working with GenAI.
The section identifies and defines key motivators and inhibitors for critical thinking, providing a nuanced understanding of the factors influencing knowledge workers' engagement in critical thinking.
This medium-impact improvement would enhance the clarity and flow of the section by explicitly stating the number of participants who contributed to each qualitative finding (e.g., Form goal (6/319)). While this information is present, it's presented inconsistently. The Findings section should prioritize clarity and consistent presentation of data. This change makes the prevalence of each finding immediately apparent, improving reader comprehension and the overall impact of the qualitative results.
Implementation: Consistently present the number of participants for all qualitative findings in the same format. For example, change 'Form goal (6/319).' to 'Form goal: Six of 319 participants...' and apply this consistently throughout the section.
This low-impact improvement would improve the readability and flow of the section. The section uses bare participant identifiers (e.g., P140) extensively within sentences. While this provides traceability, it can disrupt the flow. The Findings section should prioritize clear communication of results. Recasting these identifiers as parentheticals attached to a descriptive phrase, where appropriate, would improve readability without sacrificing the connection to the participant data.
Implementation: Attach participant identifiers parenthetically to a descriptive noun phrase where possible. For example, change 'For example, when P140 tried to learn...' to 'For example, when one participant (P140) tried to learn...'
This medium-impact improvement would strengthen the section's analysis and interpretation of the findings. While the section describes various critical thinking practices, it could benefit from a more explicit discussion of the relationships between these practices. The Findings section should not only present findings but also analyze their interconnections. This would provide a more holistic understanding of how knowledge workers enact critical thinking and how different practices might support or hinder each other.
Implementation: Add a paragraph or subsection discussing the relationships between the identified critical thinking practices. For example, discuss how 'Form goal' and 'Form query' are interrelated, or how 'Ensure quality through objective criteria' might interact with 'Verify information by cross-referencing external sources.'
The section clearly introduces the research question (RQ2) and provides a concise overview of the approach, linking descriptive analysis and qualitative shifts in critical thinking effort.
The section effectively summarizes the quantitative findings, highlighting the perceived decreased effort for cognitive activities associated with critical thinking when using GenAI.
The section identifies and describes three distinct shifts in critical thinking effort due to GenAI: from information gathering to information verification, from problem-solving to response integration, and from task execution to task stewardship.
The section provides detailed descriptions of each qualitative shift, including specific examples and participant quotes for each cognitive activity category (Knowledge & Comprehension, Application, Analysis, Synthesis, and Evaluation).
The section presents quantitative data in a clear and visually accessible manner using Figure 2, showing the distribution of perceived effort in cognitive activities.
The section connects the findings to prior work and theoretical frameworks, referencing Bloom's taxonomy and studies on user confidence in AI-assisted decision-making.
This medium-impact improvement would enhance the clarity and flow of the section by explicitly stating the number of participants who contributed to each qualitative finding (e.g., "Participants perceived less effort to fetch task-specific information at scale, and in real-time (111/319)."). While this information is present, it's presented inconsistently. The Findings section should prioritize clarity and consistent presentation of data. This change makes the prevalence of each finding immediately apparent, improving reader comprehension and the overall impact of the qualitative results.
Implementation: Consistently present the number of participants for all qualitative findings in the same format. For example, change 'Participants perceived less effort to fetch task-specific information at scale, and in real-time (111/319).' to 'One hundred and eleven of 319 participants...' and apply this consistently throughout the section.
This low-impact improvement would improve the readability and flow of the section. The section uses bare participant identifiers (e.g., P232) extensively within sentences. While this provides traceability, it can disrupt the flow. The Findings section should prioritize clear communication of results. Recasting these identifiers as parentheticals attached to a descriptive phrase, where appropriate, would improve readability without sacrificing the connection to the participant data.
Implementation: Attach participant identifiers parenthetically to a descriptive noun phrase where possible. For example, change 'For instance, P232 shared...' to 'For instance, one participant (P232) shared...'
This high-impact improvement would strengthen the section's analysis and interpretation of the findings. While the section describes the three shifts in critical thinking effort, it could benefit from a more explicit discussion of the implications of these shifts. The Findings section should not only present findings but also analyze their significance. This would provide a more holistic understanding of how GenAI is changing the nature of critical thinking in knowledge work and what challenges and opportunities this presents.
Implementation: Add a paragraph or subsection discussing the implications of the three shifts in critical thinking effort. For example, discuss how the shift from information gathering to verification might require new skills or training for knowledge workers, or how the shift from task execution to stewardship might change the roles and responsibilities of workers.
This medium-impact improvement would enhance the section's clarity and completeness by providing a more explicit link between the quantitative findings (Section 5.1) and the qualitative shifts (Section 5.2). While the section presents both quantitative and qualitative results, it doesn't fully integrate them. The Findings section should synthesize different types of evidence to provide a coherent narrative. This addition would strengthen the paper by showing how the quantitative data supports and is explained by the qualitative findings, creating a more compelling argument.
Implementation: Add a paragraph or sentences explicitly connecting the quantitative findings (e.g., decreased perceived effort in specific cognitive activities) to the qualitative shifts (e.g., from information gathering to verification). For example: 'The quantitative finding that knowledge workers perceive decreased effort in Knowledge and Comprehension (Figure 2) aligns with the qualitative shift from information gathering to verification. Because GenAI automates the process of information retrieval, workers perceive less effort in this area, but, as the qualitative data shows, they now invest more effort in verifying the accuracy of the AI-generated information.'
Figure 2: Distribution of perceived effort (%) in cognitive activities (based on Bloom's taxonomy) when using a GenAI tool compared to not using one.
The section effectively connects the study's findings to practical implications for designing GenAI tools, addressing the core concerns of the research.
The section discusses the duality of confidence (self-confidence vs. confidence in AI) and its impact on critical thinking, providing a nuanced perspective on user engagement with AI.
The section proposes specific design strategies, such as incorporating feedback mechanisms and explicit controls, to address task confidence recalibration and support user empowerment.
The discussion of awareness, motivation, and execution of critical thinking provides a comprehensive framework for understanding the barriers and facilitators of critical thinking in the context of GenAI use.
The section identifies and describes three key shifts in critical thinking due to GenAI, providing a clear and concise summary of how GenAI is changing the nature of cognitive work.
The section connects the findings to prior work and theoretical frameworks, referencing concepts such as cognitive offloading, explainable AI, and human-AI collaboration.
The section introduces the concept of "task stewardship" to describe the shift in cognitive effort from task execution to oversight, providing a novel and insightful metaphor for understanding the changing role of knowledge workers.
The section acknowledges the limitations of the study, including potential biases in self-reporting, sample demographics, and the evolving nature of GenAI tools, providing a balanced and critical assessment of the research.
This high-impact improvement would significantly strengthen the discussion by providing a more in-depth analysis of the relationship between the three identified shifts in critical thinking effort. Currently, these shifts are presented as distinct phenomena. The Discussion section should synthesize findings and explore their interconnections. This would provide a more nuanced and holistic understanding of how GenAI is transforming critical thinking in knowledge work and would offer more robust implications for design and training.
Implementation: Add a paragraph or subsection explicitly discussing the relationships between the three shifts (information verification, response integration, and task stewardship). For example, discuss how information verification might be a prerequisite for effective response integration, or how task stewardship encompasses both verification and integration. Consider using a diagram or visual model to illustrate these relationships.
This medium-impact improvement would enhance the discussion's practical relevance by providing more concrete examples of how the identified design implications could be implemented in specific GenAI tools or workflows. The Discussion section should bridge the gap between theoretical findings and practical application. This would make the recommendations more actionable and useful for designers and developers of GenAI systems.
Implementation: For each design implication (e.g., enhancing awareness, motivation, ability), provide one or two concrete examples of how this could be achieved in a specific GenAI tool or workflow. For example, for enhancing awareness, describe a feature that proactively highlights potential biases in AI-generated text, or for improving ability, describe a tool that provides step-by-step guidance for verifying information from multiple sources.
This medium-impact improvement would strengthen the discussion by explicitly addressing the potential trade-offs between efficiency gains and critical thinking engagement. The Discussion section should acknowledge the complexities and potential downsides of GenAI use. This would provide a more balanced and realistic assessment of the impact of GenAI on knowledge work and would inform the design of more responsible and ethical AI systems.
Implementation: Add a paragraph or subsection discussing the potential trade-offs between efficiency and critical thinking. For example, acknowledge that while GenAI can automate tasks and reduce effort, this might also lead to reduced engagement and a decline in certain cognitive skills. Discuss how designers can mitigate these trade-offs and promote a balance between efficiency and critical thinking.
This low-impact improvement would improve the discussion's clarity and flow by more explicitly connecting the discussion of limitations (Section 6.3) to the earlier findings and implications. The Discussion section should present a cohesive narrative. This would strengthen the paper by showing how the limitations are considered in the context of the overall findings and how they might inform future research.
Implementation: Add sentences or phrases throughout the discussion (Sections 6.1 and 6.2) that briefly refer to the limitations and how they might affect the interpretation or generalizability of the findings. For example, when discussing the relationship between confidence and critical thinking, mention that self-reported confidence might not always align with objective expertise (as noted in the limitations).