The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers

Hao-Ping (Hank) Lee, Advait Sarkar, Lev Tankelevitch, Ian Drosos, Sean Rintel, Richard Banks, Nicholas Wilson
CHI '25
Carnegie Mellon University and Microsoft Research

Overall Summary

Study Background and Main Findings

This research investigates how Generative AI (GenAI) tools influence critical thinking among knowledge workers. The study, conducted through a survey of 319 professionals using GenAI at least weekly, sought to understand when and how these workers perceive their engagement in critical thinking while using these tools, and how GenAI affects the perceived effort required for such thinking. The researchers used a mixed-methods approach, combining quantitative analysis of survey responses with qualitative analysis of participants' free-text explanations. Critical thinking was operationalized (defined in a measurable way) using Bloom's taxonomy, a hierarchical framework of cognitive skills ranging from basic knowledge recall to higher-order evaluation.

The study found a complex relationship between confidence and critical thinking. Higher confidence in the GenAI tool itself was associated with less reported critical thinking, while higher self-confidence in performing the task without AI was associated with more critical thinking. Quantitatively, the majority of participants reported that GenAI reduced the perceived effort required for various cognitive activities associated with critical thinking. For example, 72% reported less effort for 'Knowledge,' 79% for 'Comprehension,' and 69% for 'Application' tasks when using GenAI compared to not using it. However, the qualitative analysis revealed a more nuanced picture.

Qualitatively, the researchers identified three key shifts in the nature of critical thinking. First, effort shifted from gathering information to verifying information provided by the AI. Second, for tasks involving application of knowledge, effort shifted from problem-solving to integrating the AI's response into a usable form. Finally, for higher-order tasks like analysis, synthesis, and evaluation, effort shifted from direct task execution to stewardship of the AI, meaning guiding and monitoring the AI's output to ensure it met the desired quality and objectives.

The main conclusion is that GenAI tools are not simply replacing critical thinking but are transforming it. While GenAI may reduce the effort required for some cognitive tasks, it increases the need for others, particularly those related to verification, integration, and oversight. This has significant implications for the design of GenAI tools and the training of knowledge workers, suggesting a need to focus on skills that complement AI capabilities rather than being replaced by them.

Research Impact and Future Directions

The study provides valuable insights into the complex relationship between Generative AI (GenAI) use and critical thinking in knowledge work, though its correlational design warrants caution in interpretation. While the data show associations between confidence levels and critical thinking, these are observed relationships, not necessarily direct cause-and-effect links. For example, increased self-confidence might correlate with more critical thinking, but other factors, such as prior experience or personality traits, could also play a significant role.

The research's practical utility lies in its identification of how GenAI is changing the nature of cognitive tasks. The shift from information gathering to verification, and from task execution to stewardship, highlights the need for new skills and training for knowledge workers. These findings place the study within a broader context of understanding how technology reshapes work and the need for continuous adaptation.

Based on the findings, a key recommendation is to design GenAI tools that promote a balance between leveraging AI's capabilities and fostering users' critical thinking skills. This includes incorporating features that encourage verification of AI-generated content and support users in effectively managing and overseeing AI-assisted tasks. However, it's important to acknowledge that the study relies on self-reported data, which may be subject to biases.

Several critical questions remain unanswered. How can we objectively measure critical thinking engagement in GenAI-assisted tasks, beyond self-reporting? What are the long-term effects of GenAI use on cognitive skills and professional development? While the study's limitations, such as the potential for self-reporting bias and the focus on a specific population, don't invalidate the findings, they do highlight the need for further research to explore these questions and confirm the generalizability of the results.

Critical Analysis and Recommendations

Clear Research Questions (written-content)
The abstract clearly states the research questions, providing a concise overview of the study's objectives. This clarity helps readers quickly grasp the core focus of the research, improving its accessibility and impact.
Section: Abstract
Clarify Types of Knowledge Work (written-content)
The abstract does not specify the types of knowledge work investigated. Adding examples (e.g., writing, data analysis, coding) would enhance the abstract's informativeness and help readers quickly assess the study's relevance.
Section: Abstract
Historical Contextualization (written-content)
The introduction effectively establishes the context by referencing historical concerns about technology's impact on human thought. This framing helps readers understand the broader significance of the research and its relevance to ongoing debates.
Section: Introduction
Define 'Critical Thinking' Earlier (written-content)
The introduction does not define 'critical thinking' until Section 2. Defining this central concept earlier, in the first or second paragraph, would significantly improve reader comprehension and strengthen the paper's conceptual foundation from the outset.
Section: Introduction
Use of Validated Instruments (written-content)
The method section details the use of validated instruments, such as the Reflective Thinking Inventory and the Propensity to Trust Technology scale. Using established measures enhances the study's reliability and allows for comparisons with other research.
Section: Method
Specify Versions of Online Resources (written-content)
The method section does not specify the version or date of access for online resources like Prolific and O*NET. Including this information is crucial for reproducibility, as different versions can have variations in features or participant pools.
Section: Method
Structured Framework for Critical Thinking Practices (written-content)
The section clearly maps knowledge workers' critical thinking practices into three phases: goal and query formation, inspect response, and integrate response. This structured framework provides a clear and understandable model for analyzing the complex process of critical thinking in the context of GenAI use.
Section: Findings for RQ1: When and how do knowledge workers perceive the enaction of critical thinking when using GenAI?
Consistently Report Number of Participants for Qualitative Findings (written-content)
The section does not consistently report the number of participants contributing to each qualitative finding. Consistent reporting (e.g., "Six of 319 participants...") would improve clarity and make the prevalence of each finding immediately apparent.
Section: Findings for RQ1: When and how do knowledge workers perceive the enaction of critical thinking when using GenAI?
Identification of Qualitative Shifts in Critical Thinking Effort (written-content)
The section identifies three distinct shifts in critical thinking effort: from information gathering to verification, from problem-solving to response integration, and from task execution to stewardship. This provides a concise and insightful summary of how GenAI is changing the nature of cognitive work.
Section: Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?
Integrate Quantitative and Qualitative Findings More Explicitly (written-content)
The section does not fully integrate the quantitative and qualitative findings. Adding explicit links between the quantitative data (e.g., decreased perceived effort) and the qualitative shifts (e.g., from gathering to verification) would create a more compelling and unified narrative.
Section: Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?
Novel Concept of Task Stewardship (written-content)
The discussion introduces the concept of "task stewardship" to describe the shift in cognitive effort from task execution to oversight. This novel metaphor provides a valuable framework for understanding the changing role of knowledge workers in the age of GenAI.
Section: Discussion
Analyze Relationships Between Shifts in Critical Thinking (written-content)
The discussion does not fully analyze the relationships between the three identified shifts in critical thinking effort. Exploring these interconnections (e.g., how verification relates to integration and stewardship) would provide a more nuanced and holistic understanding.
Section: Discussion

Section Analysis

Method

Non-Text Elements

Table 1: Categories and sub-categories for GenAI tool usage [13].
First Reference in Text
Task type. Brachman et al. [13] classify knowledge workers' current usage of GenAI tools into nine types (See Table 1), grouped into three major categories: 1) for creation, 2) to find or work with information, 3) to get advice.
Description
  • Categorization of GenAI Tool Usage: Table 1 categorizes the different ways knowledge workers use GenAI tools. The categories are grouped into three major areas: 1) creation, 2) information handling, and 3) advice-seeking. Within these major categories, there are nine specific types of GenAI tool usage. For creation, the tool can generate an artefact or an idea. For information handling, the tool can search, help a user learn, or summarize information. For advice, the tool can improve something, provide guidance, or validate an artefact.
Scientific Validity
  • Classification based on existing research and MECE categories: The classification is based on existing research (Brachman et al. [13]), which adds to its validity. The categories are mutually exclusive and collectively exhaustive (MECE), which is important for a well-defined taxonomy. The table serves as a basis for further analysis and comparison within the study.
Communication
  • Clear referencing and contribution to the methodology: The table provides a structured overview of how GenAI tools are used, which helps readers understand the scope of the study and how different tasks were classified. The reference to Brachman et al. [13] provides credibility and allows readers to explore the original source for more details.
Table 2: Cognitive activities defined in Bloom's taxonomy [12].
First Reference in Text
See Table 2 for more details.
Description
  • Bloom's Taxonomy Levels: Table 2 outlines Bloom's taxonomy, which categorizes learning objectives into different levels of cognitive complexity. The levels are: Knowledge (recalling information), Comprehension (understanding the meaning), Application (using knowledge in new situations), Analysis (breaking down information), Synthesis (creating something new), and Evaluation (judging the value of information). Each level is defined by specific cognitive activities.
Scientific Validity
  • Validity of Bloom's Taxonomy: Bloom's taxonomy is a widely accepted and validated framework in education and cognitive psychology. Its inclusion provides a solid theoretical basis for analyzing critical thinking activities. The table accurately represents the core concepts of Bloom's taxonomy, providing a valid framework for the study.
Communication
  • Clarity and conciseness of descriptions: The table effectively communicates the different levels of cognitive skills within Bloom's taxonomy. The descriptions are concise and provide a clear understanding of each level. Referencing Bloom's taxonomy [12] provides a strong foundation and allows readers to consult the original source for a more in-depth explanation.
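
To make the taxonomy's role as a coding scheme concrete, the following minimal Python sketch represents the six levels as a simple mapping, in the spirit of Table 2. The dictionary name and the exact wording of the definitions are illustrative assumptions, not the paper's instrument.

```python
# Illustrative sketch: Bloom's taxonomy levels as a lookup table.
# Definitions paraphrase Table 2; this is not the authors' coding tool.
BLOOMS_TAXONOMY = {
    "Knowledge": "Recalling facts, terms, and basic concepts",
    "Comprehension": "Understanding and interpreting the meaning of information",
    "Application": "Using knowledge in new, concrete situations",
    "Analysis": "Breaking information into parts to explore relationships",
    "Synthesis": "Combining elements into a new, coherent whole",
    "Evaluation": "Judging the value of information against criteria",
}

for level, definition in BLOOMS_TAXONOMY.items():
    print(f"{level}: {definition}")
```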
Table 3: Participant demographics.
First Reference in Text
The 319 participants (159 men, 153 women, 5 non-binary/gender diverse, 2 prefer not to say) came from diverse age groups, occupations, and countries of residence (see Table 3).
Description
  • Demographic Characteristics: Table 3 presents the demographic characteristics of the 319 participants in the study. The table includes information on gender (159 men, 153 women, 5 non-binary/gender diverse, and 2 prefer not to say), age groups, occupations, and countries of residence. The top 5 GenAI tools used are also listed, along with the top 5 occupations and countries of residency. The table provides a snapshot of the sample's diversity across various attributes.
Scientific Validity
  • Importance of demographic information: Providing participant demographics is crucial for assessing the generalizability of the study's findings. Including a diverse range of demographic factors strengthens the study's rigor. Listing the top 5 GenAI tools and occupations adds valuable context for interpreting the results.
Communication
  • Comprehensive overview of participant demographics: The table effectively communicates the demographic composition of the study participants. Including information on gender, age, occupation, and country of residence provides a comprehensive overview of the sample. The table format allows for easy comparison of different demographic categories.
Figure 1: Schematic overview of the survey design and our corresponding analysis approach.
First Reference in Text
Both RQ1 - when and how do knowledge workers perceive the enaction of critical thinking when using GenAI? - and RQ2 - when and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI? - were answered via both quantitative and qualitative analysis (See Figure 1 for an overview of our approach).
Description
  • Schematic overview of the methodology: Figure 1 provides a schematic overview of the study's survey design and analysis approach. The figure outlines the key components of the survey, including the GenAI tool use examples, task factors, critical thinking assessment, user factors, and the types of analysis performed (qualitative and quantitative). The diagram visually connects the survey questions to the research questions (RQ1 and RQ2) and indicates the flow of information from the survey to the analysis.
Scientific Validity
  • Accurate representation of the methodology: The schematic accurately represents the methodology described in the paper. The connection between survey components, research questions, and analysis methods is clearly depicted. The figure provides a valid overview of the study's design and contributes to the transparency and reproducibility of the research.
Communication
  • Clarity and organization of the schematic: The figure clearly illustrates the survey design and analysis approach. The use of color-coding and labels helps to differentiate between different components. The schematic provides a high-level understanding of the study's methodology, making it easier for readers to follow the subsequent analysis.
Table 4: Non-standardised coefficients of the mixed-effects regressions modeling knowledge workers' perceived enaction of critical thinking and perceived effort in cognitive activities when using generative AI tools.
First Reference in Text
Table 4 summarises the seven models and reports the corrected p-values.
Description
  • Regression Model Results: Table 4 presents the non-standardized coefficients from mixed-effects regression models used to analyze the relationship between various factors and knowledge workers' perceived enaction of critical thinking and perceived effort in cognitive activities when using GenAI tools. The table includes several models, each examining a different dependent variable, such as perceived enaction of critical thinking, knowledge, comprehension, application, analysis, synthesis, and evaluation. The independent variables include task factors (task type, confidence in self, confidence in AI, confidence in evaluation) and user factors (gender, age, occupation, tendency to reflect, trust in GenAI). Each cell in the table contains the coefficient and the p-value, indicating the statistical significance of the relationship.
Scientific Validity
  • Appropriateness of statistical methods: Mixed-effects regression is an appropriate method for analyzing the data, as it accounts for the nested structure of the data (multiple observations per participant). Reporting non-standardized coefficients is useful for comparing the relative magnitude of effects across different variables. Including corrected p-values addresses the issue of multiple comparisons and strengthens the validity of the findings. It is crucial that the assumptions of the mixed models were checked and met.
Communication
  • Density and accessibility of the table: The table is quite dense and may be challenging for readers unfamiliar with regression analysis. Clearer labeling of the rows and columns, along with a brief explanation of the key variables, could improve its accessibility. Highlighting the significant p-values more prominently would also help readers quickly identify the key findings.
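
As a concrete illustration of the modelling approach discussed above, the sketch below fits one mixed-effects regression with a per-participant random intercept and applies a multiple-comparison correction. It is a minimal sketch assuming a long-format DataFrame with hypothetical column names (`perceived_effort`, `task_type`, `conf_self`, `conf_ai`, `participant_id`); the paper does not publish its analysis code, and the Benjamini-Hochberg correction used here is an assumption.

```python
# Minimal sketch of a mixed-effects analysis in the spirit of Table 4.
# Column names and the FDR correction method are assumptions, not the
# authors' published analysis code.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

def fit_effort_model(df: pd.DataFrame):
    """Fit one mixed-effects model with a random intercept per participant."""
    model = smf.mixedlm(
        "perceived_effort ~ task_type + conf_self + conf_ai",  # fixed effects
        data=df,
        groups=df["participant_id"],  # nested data: responses within participants
    )
    return model.fit()

def correct_pvalues(pvals):
    """Adjust p-values across the seven models (Benjamini-Hochberg assumed)."""
    reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
    return reject, p_adj
```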

Findings for RQ2: When and why do knowledge workers perceive increased/decreased effort for critical thinking due to GenAI?

Non-Text Elements

Figure 2: Distribution of perceived effort (%) in cognitive activities (based on Bloom's taxonomy) when using a GenAI tool compared to not using one.
Description
  • Distribution of Perceived Effort: Figure 2 presents the distribution of perceived effort in cognitive activities when using a GenAI tool compared to not using one. The cognitive activities are based on Bloom's taxonomy and include Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation. The figure shows the percentage of responses for each effort level (Much less effort, Less effort, About the same, More effort, Much more effort) for each cognitive activity. For instance, for Knowledge, approximately 41% of respondents reported 'Less effort,' while about 30% reported 'About the same.'
Scientific Validity
  • Limitations of self-reported data: The figure provides a visual representation of the survey responses regarding perceived effort. However, it's important to consider the limitations of self-reported data and potential biases in participants' responses. The figure should be interpreted in conjunction with the statistical analysis and qualitative data to provide a more complete understanding.
  • Use of percentages and sample size: The use of percentages allows for comparison across different cognitive activities. The sample size for each activity should be considered when interpreting the distributions. It is important to note whether the differences in distributions across the cognitive activities are statistically significant.
Communication
  • Effectiveness of stacked bar chart visualization: The figure uses stacked bar charts to display the distribution of perceived effort for each cognitive activity. This visualization is generally effective in showing the relative proportions of different effort levels. However, the overlapping segments could make it difficult to precisely compare specific effort levels across different activities.
  • Clarity of labels and percentages: The figure clearly labels the cognitive activities based on Bloom's taxonomy and includes percentages, which aids in understanding the data. However, adding the actual number of responses for each category could further enhance clarity.
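
For readers who want to reproduce this style of visualization from raw survey responses, the sketch below builds a normalized stacked bar chart with pandas and matplotlib. The inline rows are placeholder data included solely to make the example runnable; they are not the study's responses.

```python
# Sketch: Likert-style stacked bar chart of perceived effort per cognitive
# activity. The inline rows are placeholder data, NOT the study's results.
import pandas as pd
import matplotlib.pyplot as plt

EFFORT_LEVELS = [
    "Much less effort", "Less effort", "About the same",
    "More effort", "Much more effort",
]

# Placeholder long-format responses: one row per (participant, activity) rating.
responses = pd.DataFrame({
    "activity": ["Knowledge", "Knowledge", "Comprehension", "Comprehension"],
    "effort": ["Less effort", "About the same", "Less effort", "More effort"],
})

# Percentage of each effort level within each activity.
dist = (
    pd.crosstab(responses["activity"], responses["effort"], normalize="index")
    .reindex(columns=EFFORT_LEVELS, fill_value=0.0)
    * 100
)

ax = dist.plot(kind="barh", stacked=True, figsize=(8, 4))
ax.set_xlabel("Share of responses (%)")
ax.set_ylabel("Cognitive activity (Bloom's taxonomy)")
ax.legend(title="Perceived effort", bbox_to_anchor=(1.02, 1), loc="upper left")
plt.tight_layout()
plt.show()
```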

Discussion

Non-Text Elements

Table 5: Codebook for the qualitative analysis.
First Reference in Text
Not explicitly referenced in main text
Description
  • Overview of the codebook: Table 5 presents the codebook used for the qualitative analysis. It lists the codes used to analyze the free-text responses, organized by research question (RQ1 and RQ2). For each code, a description is provided to define its meaning and scope. The codebook covers topics such as goal and query formation, inspect response, integrate response, critical thinking motivators, critical thinking inhibitors, and reasons for increased/decreased effort.
Scientific Validity
  • Importance of inter-coder reliability: The codebook provides essential information about the qualitative analysis process. It's important to establish inter-coder reliability to ensure the consistency and validity of the coding. Mentioning the process used to develop the codebook and assess inter-coder reliability would strengthen the methodological rigor.
  • Theoretical grounding of the categories: The codebook's categories seem comprehensive, covering various aspects of the research questions. However, it is important to know how these categories were derived (e.g., from the literature, grounded theory approach) to assess their theoretical grounding and potential biases.
Communication
  • Transparency and structure of the codebook: The codebook provides a structured overview of the coding scheme used in the qualitative analysis, which enhances transparency and allows readers to understand how the data was interpreted. Clear and concise descriptions of each code are essential for ensuring inter-coder reliability and facilitating replication.
  • Lack of explicit reference in the main text: Since the codebook is not explicitly referenced in the main text, its importance may be overlooked by readers. Including a clear reference to Table 5 in the results section when discussing the qualitative findings would improve its integration and impact.
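
To illustrate the inter-coder reliability check suggested above, the sketch below computes Cohen's kappa between two coders' labels using scikit-learn. This is a generic example of how such agreement could be assessed, not a procedure the paper reports; the labels are hypothetical.

```python
# Sketch: inter-coder agreement on codebook labels via Cohen's kappa.
# The example labels are hypothetical; the paper does not report kappa.
from sklearn.metrics import cohen_kappa_score

coder_a = ["verify", "verify", "integrate", "stewardship", "verify"]
coder_b = ["verify", "integrate", "integrate", "stewardship", "verify"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values above ~0.8 suggest strong agreement
```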