Phenotypic divergence between individuals with self-reported autistic traits and clinically ascertained autism

Sarah M. Banker, Miles Harrington, Matthew Schafer, Soojung Na, Matthew Heflin, Sarah Barkley, Jadyn Trayvick, Arabella W. Peters, Abigaël A. Thinakaran, Daniela Schiller, Jennifer H. Foss-Feig, Xiaosi Gu
Nature Mental Health
Icahn School of Medicine at Mount Sinai, New York, NY, USA

Overall Summary

Study Background and Main Findings

This study compared clinically ascertained individuals with ASD against individuals recruited online on the basis of self-reported autistic traits. Key findings include no significant relationship between self-reported and clinician-rated autistic traits in the ASD group (b = 0.025, P = 0.251), and differences in social behavior, with the ASD group showing reduced ability to exert social influence in the social controllability task and less affiliative behavior in the social navigation task. The online high-trait group reported significantly higher levels of social anxiety (F(2,163) = 59.80, P = 3.33 × 10^-20) and AVPD symptoms (F(2,163) = 107.84, P = 1.46 × 10^-30) compared to both the ASD and low-trait groups.

Research Impact and Future Directions

The study characterizes how clinically ascertained individuals with ASD differ from individuals recruited online based on self-reported autistic traits. It shows that while self-reported traits can be informative, they do not always align with clinician-administered assessments and may not accurately reflect observable social behavior. The study makes a strong case for caution when using online self-report measures for autism research, particularly when drawing conclusions about the ASD population as a whole.

The study's findings have practical implications for both research and clinical practice. For researchers, it highlights the need to carefully consider the limitations of online recruitment and self-report measures, particularly in the context of ASD. It suggests that online studies should be used in conjunction with, rather than as a replacement for, traditional lab-based research. For clinicians, the study reinforces the importance of using a multi-faceted approach to assessment, incorporating both self-report and observational measures, and being aware that high self-reported autistic traits may be indicative of other conditions, such as social anxiety or AVPD.

While the study provides valuable guidance, it also acknowledges key uncertainties. The reliance on a single self-report measure (BAPQ) and the lack of a direct measure of insight are limitations that need to be addressed in future research. The study also suggests that future research should explore the use of additional trait measures and self-reported diagnoses in online studies, as well as investigate potential platform differences in social profiles.

Critical unanswered questions remain, such as the extent to which the findings generalize to other populations and the specific mechanisms underlying the observed discrepancies between self-report and behavior. The methodological limitations, particularly the reliance on a single self-report measure and the lack of a direct measure of insight, do not fundamentally undermine the study's conclusions, but they do highlight the need for further research to replicate and extend these findings. The high range of IQ scores in the in-person sample is another limitation that affects the generalizability of the conclusions.

Critical Analysis and Recommendations

Clear Research Question (written-content)
The abstract clearly states the research question, comparing in-person recruited, clinically-assessed individuals with autism to online-recruited individuals. This is important because it immediately informs the reader of the study's focus and target populations.
Section: Abstract
Problem Statement: Self-Report Limitations (written-content)
The introduction effectively introduces the core problem of relying on self-report measures for ASD research in online settings. This is crucial because it highlights the potential for misrepresentation and the need for validation of online methods.
Section: Introduction
Key Finding: Discrepancy Between Self-Report and Clinician Ratings (written-content)
The Results section clearly reports the lack of a significant relationship between self-reported ASD traits (BAPQ) and clinician-rated traits (ADOS) within the in-person ASD group (b = 0.025, P = 0.251). This is a key finding because it challenges the assumption of agreement between these measures and suggests they may capture different aspects of ASD.
Section: Results
Clear Presentation of Social Controllability Results (written-content)
The Results section effectively presents the results of the social controllability task, showing that the ASD group rejected a smaller percentage of high offers in the controllable condition and perceived less control. This is important because it indicates a reduced ability to exert social influence, a key aspect of social interaction.
Section: Results
Consideration of Social Anxiety and AVPD (written-content)
The Discussion acknowledges the potential role of social anxiety and AVPD in the online high-trait group. This is important because it considers alternative explanations for the observed differences and suggests that self-reported autistic traits in the general population may reflect generalized social avoidance rather than autism-specific difficulties.
Section: Discussion
Detailed Participant Recruitment Description (written-content)
The Methods section clearly describes the participant recruitment process for both online and in-person samples, providing detailed eligibility criteria. This is crucial for transparency and reproducibility of the study.
Section: Methods
Add Subheadings for Clarity (written-content)
The Results section lacks subheadings, making it harder to follow the flow of the findings. Adding subheadings would improve the clarity and organization of the Results section, making it easier for readers to navigate the different types of results.
Section: Results
Balance Discussion of Online Research Limitations and Benefits (written-content)
The Discussion weighs the limitations of online research more heavily than its benefits. A more explicit acknowledgment of the potential benefits of online research would provide a more nuanced perspective.
Section: Discussion

Section Analysis

Abstract

Key Aspects

Strengths

Suggestions for Improvement

Introduction

Key Aspects

Strengths

Suggestions for Improvement

Results

Key Aspects

Strengths

Suggestions for Improvement

Non-Text Elements

Table 1 | Group demographic information
First Reference in Text
See Table 1 for demographic characteristics of each group.
Description
  • Overview: Table 1 presents the demographic information for three groups of participants: a 'Clinical ASD' group, a 'High-trait' group, and a 'Low-trait' group. These groups are compared across several demographic variables to assess the differences in their group compositions.
  • Sample Sizes: The table includes the sample size ('n') for each group, which is 56 for each of the three groups (Clinical ASD (n=56), High-trait (n=56), Low-trait (n=56)).
  • Age Statistics: Age is reported as the mean (average) along with the standard deviation (a measure of the spread or variability) for each group. For example, the Clinical ASD group has a mean age of 28.07 years with a standard deviation of 8.53 years. The other groups similarly report their mean age and standard deviation.
  • Gender Breakdown: Gender is reported as the percentage of women, men, non-binary individuals, and those who did not report their gender in each group. For instance, in the Clinical ASD group, 30.3% are women, 51.8% are men, and 14.3% are non-binary.
  • Sex Breakdown: Sex, reported as the percentage of female and male participants, is also included. For example, in the Clinical ASD group, 51.8% are female and 48.2% are male.
  • Ethnicity: Ethnicity is described using percentages for different ethnic groups such as American Indian or Alaska Native, Asian, Black, White, and Other. For example, in the Clinical ASD group, 0% are American Indian or Alaska Native, 10.7% are Asian, 19.7% are Black, 57.1% are White, and 12.5% are Other.
  • IQ and Cognitive Ability: The table also includes the mean and standard deviation for IQ (Intelligence Quotient) for the Clinical ASD group (Mean: 112.38, Standard Deviation: 16.26) and a measure of Cognitive ability for the High-trait and Low-trait groups (Mean: 7.40, Standard Deviation: 3.42 and Mean: 7.02, Standard Deviation: 3.53, respectively). IQ scores are designed to represent a person's reasoning and problem-solving abilities, with the average score set at 100 and the standard deviation at 15.
  • Employment Status: Employment status is reported as the percentage of employed and unemployed individuals, with some participants not reporting their employment status. For example, in the Clinical ASD group, 48.2% are employed and 51.8% are unemployed.
  • Household Income: Household income is categorized into income brackets such as 10-50k, 50-100k, and >100k, with percentages reported for each group. For example, in the Clinical ASD group, 42.9% have a household income between 10-50k.
  • Education Level: Education level is reported as the percentage of individuals with different levels of education such as graduate school, college, some college, high school, some high school, and no high school. For example, in the Clinical ASD group, 17.9% have a graduate school education.
  • Statistical Significance: The table includes a column labeled 'Group difference,' indicating whether there were statistically significant differences between the groups for each demographic variable. The statistical significance is determined using Kruskal-Wallis tests (for continuous variables) and Chi-squared tests (for categorical variables). A p-value is reported to indicate the likelihood that the observed difference occurred by chance, where a p-value less than 0.05 is commonly considered statistically significant.
Scientific Validity
  • Essential Demographic Information: The table provides essential demographic information, which is crucial for understanding the composition of each study group and assessing the generalizability of the findings. The range of demographic variables is comprehensive and relevant.
  • Appropriate Statistical Tests: The use of appropriate statistical tests (Kruskal-Wallis and Chi-squared) for different types of variables is methodologically sound. However, the specific post-hoc tests used to determine pairwise group differences following a significant omnibus test are not explicitly stated, which should be clarified.
  • Lack of Effect Sizes: The inclusion of p-values allows the reader to quickly identify statistically significant group differences. However, reporting effect sizes alongside p-values would provide a more complete picture of the magnitude of these differences.
  • Inconsistent Cognitive Measures: While the table provides IQ scores for the Clinical ASD group and cognitive ability scores for the other groups, these measures are not directly comparable. Ideally, the same cognitive assessment should be used across all groups to allow for more meaningful comparisons. If this was not possible, justification for the use of different measures should be provided.
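The effect-size suggestion above can be made concrete with a minimal sketch. It assumes Cramér's V as the effect size for the chi-squared tests and epsilon-squared for the Kruskal-Wallis tests; the paper does not prescribe either, and the contingency counts below are hypothetical placeholders, not values taken from Table 1.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramér's V: chi-squared scaled by sample size and table dimensions (0 to 1)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square_statistic(table) / (n * k))

def epsilon_squared(h_statistic, n_total):
    """Epsilon-squared effect size for a Kruskal-Wallis H statistic."""
    return h_statistic / (n_total - 1)

# Hypothetical counts (NOT from the paper): rows are groups,
# columns are employed / unemployed.
hypothetical_counts = [[27, 29], [40, 16], [42, 14]]
print(round(chi_square_statistic(hypothetical_counts), 2))
print(round(cramers_v(hypothetical_counts), 2))
```

Reporting a value like this next to each P value in the 'Group difference' column would let readers judge how large a demographic imbalance is, not only whether it is statistically detectable.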
Communication
  • Clear Reference: The reference to Table 1 in the Results section is clear and appropriately placed, guiding the reader to the relevant information about the participant groups.
  • Potential for Overwhelm: The table provides a comprehensive overview of the demographic characteristics, but the sheer volume of information might overwhelm some readers. Strategic use of bolding or shading could highlight key differences between groups.
  • Lack of Visual Emphasis on Significance: While the table includes statistical test results (p-values), more visual cues could enhance the immediate understanding of significant group differences. For example, using asterisks to denote statistical significance directly in the table would be beneficial.
Fig. 1 | Trait comparisons. a, The ASD (n = 56 participants) and high-trait...
Full Caption

Fig. 1 | Trait comparisons. a, The ASD (n = 56 participants) and high-trait (HT; n = 56 participants) groups had comparable levels of self-reported autistic traits (measured via BAPQ; two-sided pairwise comparisons using estimated marginal means, with confidence intervals and P values adjusted for multiple comparisons using the Tukey method: t(111) = -0.28, P=0.957, estimated difference = -0.026, 95% CI [-0.25, 0.19], Cohen's d = -0.05; mean ASD: 3.82, mean HT: 3.85, mean low-trait (LT): 2.11).

First Reference in Text
As anticipated owing to how the groups were defined, the three groups differed in their self-reported autistic traits, as measured by BAPQ scores (F(2,163) = 232.86, P = 1.66 × 10^-48, ηpartial² = 0.74; Fig. 1a).
Description
  • Overall Focus: The caption refers to 'Fig. 1a', which is a component of a larger figure (Fig. 1) that presents trait comparisons across different groups. This specific component focuses on comparing the levels of self-reported autistic traits between an Autism Spectrum Disorder (ASD) group, a High-Trait (HT) group, and a Low-Trait (LT) group.
  • Measurement Tool: The caption specifies that self-reported autistic traits were measured using the Broad Autism Phenotype Questionnaire (BAPQ). The BAPQ is a questionnaire designed to quantify autistic-like traits in individuals, even if they don't have a formal ASD diagnosis.
  • Sample Sizes: The number of participants in the ASD and High-Trait groups is explicitly stated as n = 56 for each group, indicating that there were 56 individuals in each of these groups.
  • Statistical Analysis: The statistical analysis used to compare the groups is described as 'two-sided pairwise comparisons using estimated marginal means, with confidence intervals and P values adjusted for multiple comparisons using the Tukey method.' This means that the researchers compared each pair of groups (ASD vs. HT, ASD vs. LT, HT vs. LT) to see if their means were different, and they used a method (Tukey's) to correct for the fact that they were doing multiple comparisons, which can increase the chance of finding a significant difference just by chance. The 'estimated marginal means' are the average scores for each group, adjusted for any other variables in the model.
  • Statistical Results: The results of the comparison between the ASD and HT groups are presented as follows: t(111) = -0.28, P=0.957, estimated difference = -0.026, 95% CI [-0.25, 0.19], Cohen's d = -0.05. Here, 't(111) = -0.28' refers to the t-statistic, a measure of the difference between the two groups' means relative to the variability within the groups, with 111 degrees of freedom (related to the sample size). 'P=0.957' is the p-value, indicating that there is a 95.7% chance of observing the data (or more extreme data) if there is truly no difference between the groups. 'Estimated difference = -0.026' is the estimated difference in the means of the two groups. '95% CI [-0.25, 0.19]' is the 95% confidence interval, which provides a range of values within which the true difference between the group means is likely to fall. 'Cohen's d = -0.05' is a measure of effect size, quantifying the size of the difference between the two groups' means in standard deviation units.
  • Mean BAPQ Scores: The mean BAPQ scores for each group are also provided: mean ASD: 3.82, mean HT: 3.85, mean low-trait (LT): 2.11. These scores represent the average level of self-reported autistic traits in each group, according to the BAPQ.
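As a rough consistency check on the statistics quoted above, the standard error and the confidence-interval multiplier can be backed out of the reported estimate, t-statistic, and interval. This is only a sketch: the inputs are the rounded values from the caption, so the results are approximate, and the helper names are illustrative.

```python
def implied_standard_error(estimate, t_statistic):
    """Standard error implied by an estimate and its t-statistic (t = estimate / SE)."""
    return estimate / t_statistic

def implied_ci_multiplier(ci_low, ci_high, standard_error):
    """Critical value implied by a symmetric interval: half-width divided by SE."""
    return (ci_high - ci_low) / 2 / standard_error

# Rounded values from the Fig. 1a caption (ASD vs. HT comparison).
se = implied_standard_error(-0.026, -0.28)
multiplier = implied_ci_multiplier(-0.25, 0.19, se)
print(round(se, 3), round(multiplier, 2))  # 0.093 2.37
```

The implied multiplier of about 2.37 is larger than the roughly 1.98 an unadjusted 95% interval would use at these degrees of freedom, which is consistent with the caption's statement that the intervals were adjusted for multiple comparisons.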
Scientific Validity
  • Statistical Support: The claim that the ASD and HT groups had comparable levels of autistic traits is supported by the statistical analysis provided, which shows a non-significant difference (P = 0.957).
  • Appropriate Post-Hoc Testing: The use of appropriate post-hoc tests (Tukey method) for multiple comparisons strengthens the validity of the conclusion, as it controls for the increased risk of Type I error.
  • Effect Size Provided: Providing the effect size (Cohen's d = -0.05) is valuable, as it quantifies the magnitude of the non-significant difference. A small effect size further supports the conclusion that the groups are similar in their self-reported autistic traits.
Communication
  • Concise Summary: The caption concisely summarizes the key finding that the ASD and high-trait groups showed comparable levels of self-reported autistic traits. The inclusion of group abbreviations (ASD, HT, LT) aids in quick comprehension.
  • Potential Overload of Statistical Detail: The caption includes a substantial amount of statistical detail. While comprehensive, this might overwhelm some readers. Moving some of the detailed statistical information (e.g., degrees of freedom) to the main text or a footnote could improve readability.
  • Effective Cross-Referencing: The reference to 'Fig. 1a' in the Results section is appropriate and helps the reader quickly locate the relevant visual representation of the data.
Fig. 1 | Trait comparisons. b, c, Investigation into traits of other disorders...
Full Caption

Fig. 1 | Trait comparisons. b, c, Investigation into traits of other disorders characterized by social impairment revealed that, compared with both other groups (n = 56 participants each), the high-trait group (n = 56 participants) self-reported a higher level of social anxiety (two-sided mixed-effects model with random intercept for matched pair ID: F(2,163) = 59.80, P = 3.33 × 10^-20, ηpartial² = 0.42; mean ASD: 35.39, mean HT: 46.43, mean LT: 19.21 (b)) and avoidant personality disorder (AVPD) symptoms (two-sided mixed-effects model with random intercept for

First Reference in Text
The groups differed in their social anxiety symptoms (F(2,163) = 59.80, P = 3.33 × 10^-20, ηpartial² = 0.42; Fig. 1b), such that the high-trait group had higher scores (indicating more symptoms) than both the low-trait group (t(110) = 10.87, P = 5.72 × 10^-14, estimated difference = 27.3, 95% CI [21.4, 33.3], Cohen's d = 2.06) and the ASD
Description
Scientific Validity
Communication
Fig. 1 | Trait comparisons. matched pair ID: F(2,163) = 107.84, P = 1.46 × 10^-30,...
Full Caption

Fig. 1 | Trait comparisons. matched pair ID: F(2,163) = 107.84, P = 1.46 × 10^-30, ηpartial² = 0.57; mean ASD: 20.09, mean HT: 23.80, mean LT: 11.36 (c)).

First Reference in Text
The pairwise group differences for AVPD traits follow the same pattern as social anxiety: the high-trait group had higher scores (indicating more symptoms) than both the low-trait group (t(110) = 14.58, P = 2.27 × 10^-14, estimated difference = 12.50, 95% CI [10.42, 14.58], Cohen's d = 2.70) and the ASD group (t(111) = -4.18,
Description
  • Overall Focus: The caption describes part of Figure 1, specifically component '(c)', which presents results related to Avoidant Personality Disorder (AVPD) symptoms across three groups: ASD, High-Trait (HT), and Low-Trait (LT). The sample size for each group is implicitly stated as n = 56, as it is mentioned in prior captions for Figure 1.
  • Statistical Analysis: A mixed-effects model with a random intercept for matched pair ID was used for the analysis. A mixed-effects model is a statistical technique that allows for the analysis of data with both fixed effects (effects that are of direct interest) and random effects (effects that account for variability in the data). In this case, the random intercept for matched pair ID controls for the fact that some participants were matched, meaning their data points might be more similar to each other than to data points from other participants.
  • Statistical Results: The results of the mixed-effects model are presented as F(2,163) = 107.84, P = 1.46 × 10^-30, and ηpartial² = 0.57. 'F(2,163) = 107.84' refers to the F-statistic, a measure of the variance between group means relative to the variance within groups, with 2 and 163 degrees of freedom. 'P = 1.46 × 10^-30' is the p-value, indicating extremely strong statistical significance. 'ηpartial² = 0.57' is partial eta-squared, a measure of effect size, indicating the proportion of variance in AVPD symptoms that is explained by group membership, after controlling for other variables.
  • Mean AVPD Scores: The mean AVPD scores for each group are provided: mean ASD: 20.09, mean HT: 23.80, mean LT: 11.36. These scores represent the average level of self-reported AVPD symptoms in each group.
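The link between the F-statistic and partial eta-squared described above can be checked with the standard conversion, partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three omnibus tests quoted in this review. For a mixed-effects model this conversion is an approximation, and the paper does not state how its effect sizes were computed, so treat the comparison as a sanity check only.

```python
def partial_eta_squared(f_statistic, df_effect, df_error):
    """Partial eta-squared recovered from an F-statistic and its degrees of freedom."""
    return (f_statistic * df_effect) / (f_statistic * df_effect + df_error)

# (F value, reported partial eta-squared), as quoted from the text and captions.
reported = {
    "BAPQ": (232.86, 0.74),
    "social anxiety": (59.80, 0.42),
    "AVPD": (107.84, 0.57),
}

for label, (f_value, reported_eta) in reported.items():
    recomputed = partial_eta_squared(f_value, df_effect=2, df_error=163)
    print(f"{label}: recomputed {recomputed:.2f}, reported {reported_eta:.2f}")
```

All three recomputed values round to the reported ones, so the F-statistics and effect sizes in this part of Figure 1 are internally consistent.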
Scientific Validity
  • Appropriate Statistical Model: The use of a mixed-effects model with a random intercept for matched pair ID is appropriate, given the study design with matched participants. This approach accounts for the non-independence of observations within matched pairs.
  • Strong Statistical Evidence: The reported F-statistic, p-value, and partial eta-squared provide strong evidence for a significant group difference in AVPD symptoms. The effect size (partial eta-squared = 0.57) indicates a large effect.
  • Incomplete Pairwise Comparison Details: The reference text correctly points out that the pairwise group differences follow a similar pattern to social anxiety. However, the statistical details of these pairwise comparisons are not fully provided in the caption, requiring the reader to consult the main text.
Communication
  • Clear Summary: The caption effectively summarizes that the figure component (c) displays the results of a mixed-effects model analysis for Avoidant Personality Disorder (AVPD) traits, including the F-statistic, p-value, and partial eta-squared. The group means are clearly presented, which allows for a quick comparison of AVPD symptoms across the groups.
  • Assumed Knowledge: The use of abbreviations is consistent with prior captions, which aids in reader comprehension. However, the caption assumes that the reader is familiar with the meaning and interpretation of mixed-effects models and associated statistics.
  • Missing Key Finding: The caption could be improved by briefly stating the key finding (e.g., 'High-trait group reported highest AVPD symptoms') to provide more context.
Fig. 1 | Trait comparisons. d, In the in-person ASD group (n = 56...
Full Caption

Fig. 1 | Trait comparisons. d, In the in-person ASD group (n = 56 participants), there was no relationship between clinician-rated autistic traits measured via ADOS (mean = 13.85) and self-reported autistic traits measured via BAPQ (two-sided general linear model: b = 0.025, s.e.m. = 0.02, t(51) = 1.16, P = 0.251, 95% CI [-0.018, 0.067], ηpartial² = 0.01).

First Reference in Text
In addition to the self-report measures, in-person participants completed the Autism Diagnostic Observation Schedule (ADOS-2; module 4)28, considered the 'gold standard' clinical assessment measure for ASD.
Description
  • Overall Focus: The caption describes part of Figure 1, specifically component '(d)', which focuses on the relationship between two different measures of autistic traits within the in-person Autism Spectrum Disorder (ASD) group. The number of participants in this group is stated as n = 56.
  • Clinician-Rated Measure: The caption specifies that clinician-rated autistic traits were measured using the Autism Diagnostic Observation Schedule (ADOS). The ADOS is a semi-structured, standardized assessment used to diagnose autism, administered by trained clinicians.
  • Self-Report Measure: The caption also specifies that self-reported autistic traits were measured using the Broad Autism Phenotype Questionnaire (BAPQ). The BAPQ is a self-report questionnaire that assesses autistic-like traits in individuals.
  • Statistical Analysis: The statistical analysis used to assess the relationship between the ADOS and BAPQ scores is described as a 'two-sided general linear model'. A general linear model is a statistical technique used to model the relationship between a dependent variable (in this case, ADOS score) and one or more independent variables (in this case, BAPQ score). The 'two-sided' aspect indicates that the researchers were testing for both positive and negative relationships.
  • Statistical Results: The results of the general linear model are presented as follows: b = 0.025, s.e.m. = 0.02, t(51) = 1.16, P = 0.251, 95% CI [-0.018, 0.067], ηpartial² = 0.01. Here, 'b = 0.025' refers to the unstandardized regression coefficient, representing the change in ADOS score for each one-unit increase in BAPQ score. 's.e.m. = 0.02' is the standard error (labeled s.e.m. in the caption), which here measures the precision of the estimated regression coefficient. 't(51) = 1.16' refers to the t-statistic, used to test the null hypothesis that the regression coefficient is zero, with 51 degrees of freedom. 'P = 0.251' is the p-value, indicating the probability of observing a coefficient at least this extreme if there is truly no relationship between the ADOS and BAPQ scores. '95% CI [-0.018, 0.067]' is the 95% confidence interval for the regression coefficient. 'ηpartial² = 0.01' is partial eta-squared, a measure of effect size, indicating the proportion of variance in ADOS scores that is explained by BAPQ scores, controlling for other variables.
Scientific Validity
  • Appropriate Statistical Model: The use of a general linear model is appropriate for assessing the relationship between two continuous variables (ADOS and BAPQ scores).
  • Sufficient Statistical Information: The reported statistics (b, s.e.m., t, P, CI, ηpartial²) provide sufficient information to evaluate the strength and direction of the relationship between the ADOS and BAPQ scores.
  • Support for Conclusion: The non-significant p-value (P = 0.251) supports the conclusion that there is no statistically significant relationship between clinician-rated and self-reported autistic traits in this sample. The small effect size (ηpartial² = 0.01) further supports this conclusion.
  • Unexplained Degrees of Freedom: The model reports 51 residual degrees of freedom, whereas a single-predictor model fitted to all 56 participants would have 54. The gap of three could reflect additional covariates in the model, participants excluded for missing data, or a combination of both. The caption should state which covariates were included and how many participants contributed to the analysis.
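The degrees-of-freedom question can be made concrete with a small sketch. It assumes an ordinary general linear model with an intercept, for which the residual degrees of freedom equal n − k − 1 (n observations, k predictors including any covariates). The function names and the cap on the number of predictors are illustrative choices, not details from the paper.

```python
def residual_df(n_observations, n_predictors):
    """Residual degrees of freedom for a linear model with an intercept."""
    return n_observations - n_predictors - 1

def consistent_designs(reported_df, max_n, max_predictors=6):
    """List (n, k) pairs whose residual df matches the reported value."""
    return [
        (n, k)
        for n in range(max_n, 0, -1)
        for k in range(1, max_predictors + 1)
        if residual_df(n, k) == reported_df
    ]

# Which combinations of sample size and predictor count give t(51),
# given that at most 56 participants were available?
print(consistent_designs(51, max_n=56))
# [(56, 4), (55, 3), (54, 2), (53, 1)]
```

Under this assumption, t(51) is compatible with all 56 participants and four predictors (BAPQ plus three covariates), with 53 participants and BAPQ as the only predictor, or with anything in between, so the degrees of freedom alone do not establish how many participants were excluded.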
Communication
  • Clear Statement of Key Finding: The caption clearly states the key finding: there was no significant relationship between clinician-rated (ADOS) and self-reported (BAPQ) autistic traits within the in-person ASD group. The inclusion of the mean ADOS score provides context for the overall level of autistic traits in this sample.
  • Comprehensive Statistical Detail: The caption provides a comprehensive overview of the statistical analysis, including the model used (general linear model) and relevant statistics (b, s.e.m., t, P, CI, ηpartial²). However, the sheer volume of statistical information might be overwhelming for some readers.
  • Assumed Familiarity with Abbreviations: The use of abbreviations (ADOS, BAPQ, s.e.m., CI) is consistent and aids in conciseness, but assumes familiarity with these abbreviations. Briefly defining these abbreviations upon first use in the figure caption could improve accessibility.
Fig. 1 | Trait comparisons. e,f, Broken down by subscales, there was no...
Full Caption

Fig. 1 | Trait comparisons. e,f, Broken down by subscales, there was no agreement in the restricted and repetitive behavior domain (RRB; general linear model: b = 0.12, s.e.m. = 0.06, t(51) = 1.95, P = 0.057, 95% CI [0.0, 1.0], ηpartial² = 0.05 (e)) or the social domain (general linear model: b = 0.05, s.e.m. = 0.04, t(51) = 1.17, P = 0.249, 95% CI [0.0, 1.0], ηpartial² = 0.03 (f)).

First Reference in Text
Broken down by subdomain, there was also no relationship between self- and clinician-rated traits in the restricted and repetitive behavior domain (b = 0.12, s.e.m. = 0.06, t(51) = 1.95, P = 0.057, 95% CI [0.0, 1.0], ηpartial² = 0.05; Fig. 1e) or the social domain (b = 0.05, s.e.m. = 0.04, t(51) = 1.17, P = 0.249, 95% CI
Description
  • Overall Focus: The caption describes parts of Figure 1, specifically components '(e)' and '(f)', which present results related to the relationship between clinician-rated and self-reported autistic traits, broken down into subscales: Restricted and Repetitive Behavior (RRB) and the Social Domain. The number of participants (n=56) is consistent with previous parts of Figure 1.
  • Statistical Analysis: The caption indicates that a general linear model was used to assess the relationship between clinician-rated and self-reported traits within each subdomain. As mentioned before, a general linear model is a statistical technique used to model the relationship between a dependent variable and one or more independent variables.
  • RRB Results: For the Restricted and Repetitive Behavior (RRB) domain, the results of the general linear model are presented as: b = 0.12, s.e.m. = 0.06, t(51) = 1.95, P = 0.057, 95% CI [0.0, 1.0], ηpartial² = 0.05. As before, 'b' is the unstandardized regression coefficient, 's.e.m.' is the standard error of that coefficient, 't(51)' is the t-statistic with 51 degrees of freedom, 'P' is the p-value, '95% CI' is the 95% confidence interval, and 'ηpartial²' is partial eta-squared.
  • Social Domain Results: For the Social Domain, the results of the general linear model are presented as: b = 0.05, s.e.m. = 0.04, t(51) = 1.17, P = 0.249, 95% CI [0.0, 1.0], ηpartial² = 0.03. The statistics are interpreted as described above.
  • Non-Significant Results: The p-values for both the RRB domain (P = 0.057) and the Social Domain (P = 0.249) are greater than 0.05, indicating that there is no statistically significant relationship between clinician-rated and self-reported traits in either domain.
Scientific Validity
  • Appropriate Statistical Model: The use of a general linear model is appropriate for assessing the relationship between two continuous variables within each subdomain.
  • Sufficient Statistical Information: The reported statistics provide sufficient information to evaluate the strength and direction of the relationship between the ADOS and BAPQ subscales.
  • Support for Conclusion: The non-significant p-values for both subdomains support the conclusion that there is no statistically significant relationship between clinician-rated and self-reported autistic traits at the subdomain level.
  • Near-Significant Trend: The p-value for the RRB domain (P = 0.057) is close to the significance threshold of 0.05. While not statistically significant, it is worth noting that this trend might warrant further investigation with a larger sample size.
  • Multiple Comparisons Inflation: The analysis of the subdomains uses the same participants as the overall analysis in Figure 1d. This raises the risk of inflating Type I error due to multiple comparisons. It is good that the authors used the Bonferroni correction, but they could have mentioned it here as well.
Communication
  • Clear Reinforcement of Key Finding: The caption clearly states that, even when breaking down autistic traits into subscales (restricted and repetitive behavior, and social domain), there was still no significant agreement between clinician-rated and self-reported measures. This reinforces the overall finding of a discrepancy between these two types of assessments.
  • Need for RRB Definition: The inclusion of the 'RRB' abbreviation is helpful, but defining it earlier in the figure caption or in the main text would improve clarity for readers unfamiliar with this specific terminology.
  • High Statistical Density: The caption is very dense with statistical information. While comprehensive, it might be overwhelming for some readers. Consider moving some of the detailed statistical information (e.g., s.e.m., CI) to a footnote or the main text.
Fig. 2 | Social controllability. a, As shown in the representative task screen,...
Full Caption

Fig. 2 | Social controllability. a, As shown in the representative task screen, the social control task involved participants accepting or rejecting splits of $20 proposed by members of two virtual teams. b, Participants played the game with two different teams sequentially, the order of which was counterbalanced.

Figure/Table Image (Page 5)
First Reference in Text
To measure social controllability, we used a monetary exchange task 26,27,34,35 modified from the ultimatum game, in which participants decide whether to accept or reject proposed splits of US$20 offered by players from two independent teams (Fig. 2a; see Methods for details).
Description
  • Overall Focus: The caption describes Figure 2, which focuses on 'Social controllability.' This refers to the ability of an individual to influence the outcomes of social interactions.
  • Representative Task Screen: Figure 2a shows a representative task screen. This means that the figure includes a visual example of what the participants saw when they were performing the task. The task involved participants making decisions about how to split $20.
  • Virtual Teams: Participants interacted with two 'virtual teams.' This means that the participants weren't interacting with real people but instead were told they were interacting with members of two different teams. The offers for splitting the $20 came from these virtual teams.
  • Counterbalancing: Figure 2b indicates that participants played the game with two different teams, one after the other. The order in which they played with the teams was 'counterbalanced'. Counterbalancing is a technique used to control for order effects. In this context, it means that some participants played with Team A first and then Team B, while others played with Team B first and then Team A.
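The counterbalancing idea can be sketched in a few lines. This is an illustration only: the caption does not say how orders were assigned (for example, randomly or by alternation), and the team labels and the function `assign_orders` are hypothetical.

```python
from collections import Counter

# Hypothetical labels for the two possible team orders.
ORDERS = (("Team A", "Team B"), ("Team B", "Team A"))

def assign_orders(participant_ids):
    """Alternate the two team orders so each is used equally often."""
    return {pid: ORDERS[i % 2] for i, pid in enumerate(participant_ids)}

# 56 participants, matching the per-group sample size reported in the paper.
assignment = assign_orders(range(1, 57))
counts = Counter(assignment.values())
print(counts)  # each order should appear 28 times
```

With an even number of participants, alternation gives each order to exactly half of the group, so any effect of playing a team first or second is balanced across the sample.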
Scientific Validity
  • Established Paradigm: The caption provides a basic description of the task. The reference text elaborates that the task was modified from the ultimatum game, a well-established paradigm in behavioral economics. This provides a basis for understanding the task's validity.
  • Sound Methodology: The caption mentions that the order of teams was counterbalanced. Counterbalancing is a crucial aspect of the experimental design, as it controls for potential order effects that could confound the results.
  • Missing Manipulation Details: The caption lacks information about the specific manipulations used to assess social controllability. It would be useful to briefly mention the key manipulation that allowed participants to exert influence over one of the teams (e.g., by rejecting offers).
Communication
  • Clear Overview: The caption provides a clear, high-level overview of the social controllability task. It effectively introduces the main elements of the task, including the monetary exchange and the presence of two virtual teams.
  • Lack of Specificity (a): The mention of a 'representative task screen' in (a) is helpful, as it indicates the figure includes a visual depiction of the task interface. However, it does not specify what key information the representative task screen shows.
  • Importance of Counterbalancing: Stating that the order of teams was counterbalanced is important for understanding the experimental design. This strengthens the claim that there is no systematic bias based on team order.
Fig. 2 | Social controllability. c, All groups (n = 56 participants each)...
Full Caption

Fig. 2 | Social controllability. c, All groups (n = 56 participants each) showed comparable overall rejection rates for both conditions (two-sided mixed-effects model with random intercept for matched pair ID: F(2,281) = 0.77, P = 0.46, ηpartial² = 0.006; mean ASD controllable: 52.4%, mean HT controllable: 55.5%, mean LT controllable: 54.7%, mean ASD uncontrollable: 49.6%, mean HT uncontrollable: 51.9%, mean LT uncontrollable: 48.2%).

Figure/Table Image (Page 5)
First Reference in Text
We found that the three groups showed similar overall rejection rates during the task (F(2,281) = 0.77, P = 0.46, ηpartial² = 0.006; Fig. 2c).
Description
  • Overall Focus: The caption refers to Figure 2c, which presents the overall rejection rates for three groups of participants (ASD, High-Trait (HT), and Low-Trait (LT)) in two conditions: a 'controllable' condition and an 'uncontrollable' condition. The sample size for each group is stated as n = 56.
  • Rejection Rate: The rejection rate refers to the percentage of times participants rejected the proposed split of $20 in the monetary exchange task. A higher rejection rate suggests that participants were less willing to accept unfair offers.
  • Controllable vs. Uncontrollable Conditions: The two conditions, 'controllable' and 'uncontrollable,' refer to whether or not the participants could influence future offers by rejecting current offers. In the controllable condition, rejecting an offer could lead to better offers in the future. In the uncontrollable condition, offers were random.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercept for matched pair ID'. As mentioned before, a mixed-effects model is a statistical technique that allows for the analysis of data with both fixed effects and random effects, and the random intercept accounts for the non-independence of observations within matched pairs.
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,281) = 0.77, P = 0.46, ηpartial² = 0.006. 'F(2,281) = 0.77' refers to the F-statistic, 'P = 0.46' is the p-value, and 'ηpartial² = 0.006' is partial eta-squared. The p-value is above the significance threshold of 0.05, indicating that there is no statistically significant difference in overall rejection rates between the three groups.
  • Mean Rejection Rates: The mean rejection rates for each group and condition are provided (e.g., mean ASD controllable: 52.4%). These values represent the average percentage of times participants in each group rejected offers in each condition.
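For readers unfamiliar with the model named in the caption, a random-intercept mixed-effects model can be written as below. This is an assumed general form, not the authors' exact specification, which the caption does not give: y_{ij} is the rejection rate of participant i in condition j, the β terms are fixed effects of group and condition, and u_{p(i)} is a random intercept shared by everyone in the same matched pair p(i).

```latex
y_{ij} = \beta_0 + \beta_{\mathrm{group}(i)} + \beta_{\mathrm{cond}(j)} + u_{p(i)} + \varepsilon_{ij},
\qquad u_{p} \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The shared term u_{p(i)} is what lets the model treat matched participants as correlated rather than independent observations.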
Scientific Validity
  • Appropriate Statistical Model: The use of a mixed-effects model with a random intercept for matched pair ID is appropriate, given the study design.
  • Support for Null Finding: The reported statistics provide sufficient information to evaluate the null finding. The non-significant p-value and small effect size support the conclusion that there were no significant differences in overall rejection rates.
  • Sample Size Considerations: The sample size of 56 participants per group is adequate for detecting moderate to large effect sizes. However, the small effect size observed (ηpartial² = 0.006) suggests that a much larger sample size would be needed to detect any real differences in overall rejection rates, if they exist.
Communication
  • Clear Statement of Null Finding: The caption clearly states that the groups did not differ significantly in overall rejection rates in either condition. This directly addresses a key aspect of the social controllability task.
  • Helpful Descriptive Statistics: The inclusion of the means for each group and condition provides valuable descriptive information, allowing readers to compare the rejection rates across groups.
  • Potential for Statistical Overload: The caption provides a comprehensive statistical summary, including the test used, F-statistic, p-value, and effect size. However, the sheer volume of statistical information might overwhelm some readers.
Fig. 2 | Social controllability. d, When rejection rate is broken down by offer...
Full Caption

Fig. 2 | Social controllability. d, When rejection rate is broken down by offer size, we see that the ASD group (n = 56 participants) rejected a lower percentage of high offers than the two online groups (n = 56 participants each) during the controllable condition (two-sided mixed-effects model with random intercept for matched pair ID, P values false discovery rate (FDR)-corrected for multiple

Figure/Table Image (Page 5)
First Reference in Text
Breaking rejection rate down by offer size, we found that, while the groups showed similar rejection rates for low offers (F(2,74) = 0.20, P = 0.82, ηpartial² = 0.005) and medium offers (F(2,162) = 1.67, P = 0.29, ηpartial² = 0.02), the ASD group rejected a smaller percentage of high offers (F(2,122) = 6.12, P = 0.009, ηpartial² = 0.09; Fig. 2d) compared with both low-trait
Description
  • Overall Focus: The caption describes part of Figure 2, specifically component '(d)', which presents results related to how often participants rejected offers of different sizes (high, medium, low) in the 'controllable' condition of the social controllability task. The focus is on comparing the rejection rates of the ASD group to the online groups.
  • Controllable Condition: The caption specifies that the analysis focuses on the 'controllable condition'. As a reminder, in the controllable condition, participants could influence future offers by rejecting current offers.
  • High Offers: The caption highlights that the ASD group rejected a lower percentage of 'high offers' compared to the online groups. The offer sizes were categorized as low, medium, and high. Although the exact monetary values defining each category aren't provided in this caption, it's implied that 'high offers' represent the most advantageous offers for the participant.
  • Sample Size: The sample size is stated as n = 56 for each group (ASD and the two online groups).
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercept for matched pair ID'. As before, a mixed-effects model is used to account for the non-independence of observations within matched pairs.
  • FDR Correction: The caption mentions that the P values are 'false discovery rate (FDR)-corrected for multiple' (the caption is cut off at this point). FDR correction adjusts P values when many statistical tests are performed, so that the expected proportion of false positives among the results declared significant stays below a chosen level.
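To make the FDR idea concrete, here is a minimal sketch of the Benjamini-Hochberg procedure, one common FDR method. The caption does not say which FDR method the authors used, so treating it as Benjamini-Hochberg is an assumption, and the raw P values below are made up for illustration rather than taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

raw_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical values, not from the paper
adj_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adj_p])
```

Each raw value is scaled by the number of tests divided by its rank, which is why the smallest P values are inflated the most relative to their size.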
Scientific Validity
  • Appropriate Statistical Model and Correction: The use of a mixed-effects model is appropriate, given the study design. The mention of FDR correction strengthens the validity of the finding.
  • Incomplete Statistical Information: The caption is cut off, so the full statistical results are not presented. This limits the ability to fully evaluate the strength of the evidence.
  • Missing Post-Hoc Test Details: The reference text provides additional statistical details. It indicates that there is a statistically significant difference in rejection rates of high offers (F(2,122) = 6.12, P = 0.009, ηpartial² = 0.09). However, it is not clear from the caption or reference text which specific post-hoc tests were used to determine that the ASD group rejected fewer high offers than *both* online groups.
  • Adequate Sample Size: The sample size is adequate for detecting moderate to large effect sizes. The reported partial eta-squared of 0.09 corresponds to a medium effect by Cohen's conventional benchmarks (0.01 small, 0.06 medium, 0.14 large).
Communication
  • Clear Key Finding: The caption clearly states the key finding: the ASD group rejected high offers less often than the online groups in the controllable condition. This highlights a difference in decision-making based on offer size.
  • Mention of FDR Correction: The caption mentions that p-values are FDR-corrected, which strengthens the validity of the finding. However, the specific FDR correction method is not provided in the caption.
  • Incomplete Caption: The caption is incomplete, as it is cut off mid-sentence. This makes it difficult to fully understand the context and scope of the finding.
Fig. 2 | Social controllability. e, Unlike the online groups (n = 56...
Full Caption

Fig. 2 | Social controllability. e, Unlike the online groups (n = 56 participants each), the ASD group (n = 56 participants) did not detect a difference in controllability between the conditions (two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,269) = 18.52, P = 2.91 × 10⁻⁸, ηpartial² = 0.12; mean ASD controllable: 45.47, mean HT controllable: 67.45, mean LT controllable: 61.07, mean ASD uncontrollable: 41.79, mean HT uncontrollable: 19.66, mean LT uncontrollable: 24.70).

Figure/Table Image (Page 5)
First Reference in Text
Indeed, we detected a significant group-by-condition interaction on perceived control ratings (F(2,269) = 18.52, P = 2.91 × 10⁻⁸, ηpartial² = 0.12; Fig. 2e).
Description
  • Overall Focus: The caption refers to Figure 2e, which presents results related to perceived controllability, i.e., how much control participants felt they had in the task. The key finding is that the ASD group's perception differed from that of the online groups.
  • Lack of Perceived Difference: The caption highlights that the ASD group 'did not detect a difference in controllability between the conditions.' This means that, unlike the online groups, the ASD participants did not perceive a difference in their ability to influence the offers they received, regardless of whether the condition was controllable or uncontrollable.
  • Sample Size: The sample size is stated as n = 56 for each group.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercepts for matched pair IDs'.
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,269) = 18.52, P = 2.91 × 10⁻⁸, ηpartial² = 0.12. As before, 'F' is the F-statistic, 'P' is the p-value, and 'ηpartial²' is partial eta-squared. The extremely small P-value indicates a statistically significant group-by-condition interaction on perceived controllability: the difference between conditions was not the same across groups.
  • Mean Scores: The mean perceived controllability scores for each group and condition are provided (e.g., mean ASD controllable: 45.47).
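The pattern behind the finding can be read directly from the means in the caption. The short sketch below computes, for each group, the gap between perceived control in the controllable and uncontrollable conditions. It is descriptive arithmetic only and does not reproduce the mixed-effects test.

```python
# Mean perceived-control ratings as reported in the Fig. 2e caption.
means = {
    "ASD": {"controllable": 45.47, "uncontrollable": 41.79},
    "HT": {"controllable": 67.45, "uncontrollable": 19.66},
    "LT": {"controllable": 61.07, "uncontrollable": 24.70},
}

# Controllable minus uncontrollable, per group.
gaps = {
    group: round(m["controllable"] - m["uncontrollable"], 2)
    for group, m in means.items()
}

for group, gap in gaps.items():
    print(f"{group}: controllable - uncontrollable = {gap:.2f}")
```

The gap is under 4 points for the ASD group but roughly 36 to 48 points for the two online groups, which is the descriptive pattern summarized by the caption.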
Scientific Validity
  • Standard Statistics: The caption reports the standard statistics from the mixed-effects model (F-statistic, P-value, and effect size) together with the group means for both conditions, which gives readers what they need to evaluate the interaction.
  • Missing Post-Hoc Test Details: The reference text includes additional statistical details, but the specific post-hoc tests used to determine the significant difference between the ASD and low-trait groups are not explicitly stated. This omission limits a comprehensive assessment of the analysis's validity.
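  • Adequate Sample Size: Statistical power should be judged from the sample size (n = 56 per group) together with the reported effect size (ηpartial² = 0.12); no power analysis is reported in the caption.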
Communication
  • Clear Statement of Key Finding: The caption clearly states that, unlike the online groups, the ASD group did not perceive a difference in controllability between the two conditions. This is a crucial finding, highlighting a potential difference in the perception of social control.
  • Helpful Descriptive Statistics: The inclusion of the means for each group and condition allows for a quick comparison of perceived controllability across groups and conditions. The large difference between the HT/LT groups and the ASD group is particularly noticeable.
  • Potential Lack of Context: The caption provides a comprehensive statistical summary. However, some readers may not be familiar with the concept of 'perceived controllability' and might benefit from a brief explanation of what this refers to in the context of the task.
Fig. 3 | Social navigation. a, The social navigation task involved participants...
Full Caption

Fig. 3 | Social navigation. a, The social navigation task involved participants interacting with different characters with the goal of finding a job and a home. At each interaction, participants could choose between two options that affected either the affiliation or power dynamics of the relationship.

Figure/Table Image (Page 6)
First Reference in Text
To evaluate participants' social feelings and actions during dynamic interactions, we utilized the social navigation task36.
Description
  • Overall Focus: The caption describes Figure 3, which focuses on a 'social navigation task.' This task is designed to assess how people navigate social situations and relationships.
  • Virtual Characters: The task involved participants interacting with different characters. These characters are not real people but are instead virtual characters presented to the participants.
  • Task Goal: The participants' goal in the task was to find a job and a home. This provides a specific context for the social interactions.
  • Affiliation and Power Dynamics: At each interaction, participants had to choose between two options. These options were designed to affect either the affiliation or power dynamics of the relationship with the character. 'Affiliation' refers to how close or friendly the relationship is, while 'power dynamics' refers to the balance of control between the participant and the character.
Scientific Validity
  • Theoretically Sound: The task has a clear goal and a structured interaction, which are important for experimental validity. The use of affiliation and power dynamics as key dimensions is theoretically grounded in social psychology.
  • Lack of Task Details: The caption and reference text provide a general description of the task but lack specific details about the task's design, scoring, and validation. More information about the task's psychometric properties and sensitivity would be beneficial.
  • Use of Pre-Existing Task: The use of a pre-existing task (as indicated by the citation in the reference text) strengthens the validity of the study, as it suggests that the task has been previously validated and used in research.
Communication
  • Clear and Concise Overview: The caption provides a clear and concise overview of the social navigation task. It effectively conveys the task's goal (finding a job and home) and the nature of the interactions (affecting affiliation or power).
  • Lack of Specificity: The caption effectively introduces the key concepts of 'affiliation' and 'power dynamics,' which are central to understanding the task's design. However, it doesn't elaborate on how these dynamics were manipulated or measured.
  • Appropriate Introduction: The reference text appropriately introduces the social navigation task as a method for evaluating social feelings and actions during dynamic interactions.
Fig. 3 | Social navigation. b, Compared with the low-trait group, the...
Full Caption

Fig. 3 | Social navigation. b, Compared with the low-trait group, the high-trait and ASD groups (n = 56 participants each) both reported a reduced liking of the characters in the social navigation task (two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,111) = 8.11, P = 0.0005, ηpartial² = 0.13; mean ASD: 51.09, mean HT: 51.98, mean LT: 59.10).

Figure/Table Image (Page 6)
First Reference in Text
We began by investigating participants' subjective feelings toward characters in the task and found that the three groups differed in their ratings of character likability (F(2,111) = 8.11, P = 0.0005, ηpartial² = 0.13; Fig. 3b).
Description
  • Overall Focus: The caption describes part of Figure 3, specifically component '(b)', which presents results related to how much the participants liked the virtual characters in the social navigation task. The focus is on comparing the liking ratings of the ASD and High-Trait groups to the Low-Trait group.
  • Reduced Liking: The caption indicates that the high-trait and ASD groups reported a 'reduced liking' of the characters compared to the low-trait group. This suggests that participants in these groups generally felt less positively toward the virtual characters they interacted with in the task.
  • Sample Size: The sample size is stated as n = 56 for each group.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercepts for matched pair IDs'. As before, a mixed-effects model is used to account for the non-independence of observations within matched pairs.
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,111) = 8.11, P = 0.0005, ηpartial² = 0.13. 'F(2,111) = 8.11' refers to the F-statistic, 'P = 0.0005' is the p-value, and 'ηpartial² = 0.13' is partial eta-squared. The small P-value indicates a statistically significant difference in character liking between the groups.
  • Mean Liking Scores: The mean liking scores for each group are provided: mean ASD: 51.09, mean HT: 51.98, mean LT: 59.10. These values represent the average liking ratings for each group.
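Partial eta-squared can be recovered from an F-statistic and its degrees of freedom with the standard conversion F × df_effect / (F × df_effect + df_error). The sketch below applies that conversion to values reported in the paper. It assumes the authors' software used an equivalent calculation, which the source does not state; `partial_eta_squared` is a name introduced here for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Convert an F-statistic and its degrees of freedom to partial eta-squared."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error, value reported in the paper)
reported = [
    ("Fig. 2d, high offers", 6.12, 2, 122, 0.09),
    ("Fig. 2e, perceived control", 18.52, 2, 269, 0.12),
    ("Fig. 3b, character liking", 8.11, 2, 111, 0.13),
]

for label, f_value, df1, df2, paper_value in reported:
    estimate = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: computed {estimate:.3f}, reported {paper_value}")
```

For these three results the computed values round to the figures given in the paper, which suggests the reported effect sizes are internally consistent with the F-statistics.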
Scientific Validity
  • Appropriate Statistical Model: The use of a mixed-effects model is appropriate, given the study design.
  • Significant Finding: The reported statistics provide sufficient information to evaluate the finding. The significant p-value and moderate effect size suggest a meaningful difference in character liking between the groups.
  • Need for Post-Hoc Details: It is important to note that post-hoc analyses are needed to determine which specific groups differed significantly from each other. The caption only states that the high-trait and ASD groups both reported reduced liking *compared to the low-trait group*. It is possible that the high-trait and ASD groups did not differ from each other, which should be explicitly stated.
Communication
  • Clear Summary of Main Finding: The caption clearly states that both the high-trait and ASD groups reported less liking of the characters compared to the low-trait group. This provides a concise summary of the main finding related to character likability.
  • Lack of Scale Information: The caption provides the means for each group, which is helpful for comparing the average liking scores. However, it does not provide information about the scale used to measure liking, making it difficult to interpret the magnitude of the differences.
  • Assumed Familiarity with Abbreviations: The use of abbreviations (ASD, HT, LT) is consistent and aids in conciseness. However, the caption assumes familiarity with these abbreviations.
Fig. 3 | Social navigation. c, Despite having comparable feelings toward...
Full Caption

Fig. 3 | Social navigation. c, Despite having comparable feelings toward characters, the ASD group (n = 56 participants) acted less affiliative than the high-trait group (n = 56 participants; two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,111) = 17.21, P = 3.098 × 10⁻⁷, ηpartial² = 0.24; mean ASD: 0.16, mean HT: 0.30, mean LT: 0.46).

Figure/Table Image (Page 6)
First Reference in Text
A significant three-group difference in affiliation tendency (F(2,111) = 17.21, P = 3.10 × 10⁻⁷, ηpartial² = 0.24; Fig. 3c) revealed that the ASD group acted significantly less affiliative with the characters than both the high-trait group (t(111) = -2.63, P = 0.026, estimated difference = -0.13, 95% CI [-0.25, -0.01], Cohen's d = -0.50) and the low-
Description
  • Overall Focus: The caption describes part of Figure 3, specifically component '(c)', which presents results related to 'affiliative behavior' in the social navigation task. 'Affiliative behavior' refers to actions that promote social connection and closeness with the virtual characters.
  • Dissociation of Feelings and Behavior: The caption highlights that, 'Despite having comparable feelings toward characters,' the ASD group acted less affiliative than the high-trait group. This suggests that the ASD group did not translate their feelings of liking into actions that would build stronger relationships with the virtual characters.
  • Sample Size: The sample size is stated as n = 56 for both the ASD and High-Trait groups.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercepts for matched pair IDs'.
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,111) = 17.21, P = 3.098 × 10⁻⁷, ηpartial² = 0.24. As before, 'F' is the F-statistic, 'P' is the p-value, and 'ηpartial²' is partial eta-squared. The small P-value indicates a statistically significant difference in affiliative behavior between the groups.
  • Mean Affiliative Behavior Scores: The mean affiliative behavior scores for each group are provided: mean ASD: 0.16, mean HT: 0.30, mean LT: 0.46.
Scientific Validity
Communication
  • Effective Highlighting of Dissociation: The caption effectively highlights the contrast between comparable feelings (liking) and differing behaviors (affiliative actions). This emphasizes the dissociation between subjective feelings and objective actions in the ASD group.
  • Omission of ASD vs. Low-Trait Comparison: The caption focuses on the difference between the ASD and high-trait groups, but the reference text indicates that the ASD group also acted less affiliative than the low-trait group. This omission limits the caption's completeness.
  • Need for Scale Context: The inclusion of the means for each group helps quantify the differences in affiliative behavior. However, without knowing the scale's range or meaning, the absolute values are difficult to interpret.
Fig. 3 | Social navigation. d, The groups (n = 56 participants each) did not...
Full Caption

Fig. 3 | Social navigation. d, The groups (n = 56 participants each) did not differ in their power tendencies (two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,163) = 1.89, P = 0.15, ηpartial² = 0.02; mean ASD: 0.13, mean HT: 0.19, mean LT: 0.09).

Figure/Table Image (Page 6)
First Reference in Text
The groups did not differ in their power tendencies (F(2,163) = 1.89, P = 0.15, ηpartial² = 0.02; Fig. 3d).
Description
  • Overall Focus: The caption describes part of Figure 3, specifically component '(d)', which presents results related to 'power tendencies' in the social navigation task. 'Power tendencies' refers to the degree to which participants tried to exert control or influence over the virtual characters.
  • No Group Differences: The caption highlights that the groups 'did not differ in their power tendencies.' This suggests that there were no significant differences between the ASD, High-Trait, and Low-Trait groups in how much they tried to control the interactions with the virtual characters.
  • Sample Size: The sample size is stated as n = 56 for each group.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercepts for matched pair IDs'.
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,163) = 1.89, P = 0.15, ηpartial² = 0.02. As before, 'F' is the F-statistic, 'P' is the p-value, and 'ηpartial²' is partial eta-squared. The large P-value indicates that there was no statistically significant difference in power tendencies between the groups.
Scientific Validity
  • Appropriate Statistical Model: The use of a mixed-effects model is appropriate, given the study design.
  • Support for Null Finding: The reported statistics provide sufficient information to evaluate the null finding. The non-significant p-value supports the conclusion that there were no significant differences in power tendencies.
  • Small Effect Size: The small effect size (partial η² = 0.02) indicates that any real differences in power tendencies are likely minimal, further supporting the null finding.
Communication
  • Clear Communication of Null Finding: The caption clearly states that the groups did not differ in their power tendencies. This directly communicates the null finding, which is important for understanding the overall results of the social navigation task.
  • Helpful Descriptive Statistics: The inclusion of the means for each group provides valuable descriptive information, even though the overall difference was not statistically significant. This allows readers to assess the relative power tendencies of each group.
  • Lack of Interpretation of Effect Size: The caption provides a comprehensive statistical summary but does not interpret the effect size. Noting that the effect is small would make clear that any real differences in power tendencies are likely minimal.
Fig. 3 | Social navigation. e, No group-by-trait interaction on character...
Full Caption

Fig. 3 | Social navigation. e, No group-by-trait interaction on character liking was detected (two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,155) = 1.76, P = 0.18, partial η² = 0.02).

Figure/Table Image (Page 6)
Fig. 3 | Social navigation. e, No group-by-trait interaction on character liking was detected (two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,155) = 1.76, P = 0.18, partial η² = 0.02).
First Reference in Text
There was no group-by-trait interaction on character liking (F(2,155) = 1.76, P = 0.18, partial η² = 0.02; Fig. 3e).
Description
  • Overall Focus: The caption describes Figure 3e, which reports a test of a group-by-trait interaction on character liking, a measure of how much participants liked the virtual characters. No interaction was detected (F(2,155) = 1.76, P = 0.18).
  • Statistical Analysis: The statistical test is identified as a 'two-sided mixed-effects model with random intercepts for matched pair IDs'. As previously, this indicates the model being used accounts for non-independence of observations.
Scientific Validity
Communication
  • Missing Sample Size: The caption states the null finding and its test statistics, but unlike the captions for panels d and f it does not report the sample size.
  • Clear reference text: The reference text is clear and reports the same statistics as the caption.
Fig. 3 | Social navigation. f, However, the relationship between affiliative...
Full Caption

Fig. 3 | Social navigation. f, However, the relationship between affiliative behavior and self-reported traits differed by group (n = 56 participants each; two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,160) = 3.42, P = 0.035, partial η² = 0.04).

Figure/Table Image (Page 6)
Fig. 3 | Social navigation. f, However, the relationship between affiliative behavior and self-reported traits differed by group (n = 56 participants each; two-sided mixed-effects models with random intercepts for matched pair IDs: F(2,160) = 3.42, P = 0.035, partial η² = 0.04).
First Reference in Text
Finally, there was a significant group-by-trait interaction on affiliation tendency (F(2,160) = 3.42, P = 0.035, partial η² = 0.04; Fig. 3f).
Description
  • Overall Focus: The caption refers to Figure 3f, which shows that the relationship between 'affiliative behavior' (actions promoting social connection) and 'self-reported traits' (personal characteristics reported by the participants themselves) is not the same across the three groups.
  • Self-Reported Traits: The caption does not specify which self-reported traits are being considered here, but it refers back to earlier descriptions of the measures used in the study.
  • Sample Size: The sample size is stated as n = 56 for each group.
  • Statistical Analysis: The statistical analysis used is described as a 'two-sided mixed-effects model with random intercepts for matched pair IDs.'
  • Statistical Results: The results of the mixed-effects model are presented as: F(2,160) = 3.42, P = 0.035, partial η² = 0.04. The P-value is below the conventional significance level of 0.05, so the interaction is statistically significant, though not by a wide margin.
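The reported P-value can be reproduced from the F-statistic alone. The sketch below is an illustrative check under stated assumptions, not the authors' code: the function name `f_upper_tail_p_df2` is hypothetical, and the closed form it uses holds only when the numerator degrees of freedom equal 2, as they do for the three-group tests in this figure.

```python
def f_upper_tail_p_df2(f_stat: float, df_error: float) -> float:
    """Upper-tail p-value for an F(2, df_error) statistic.

    Hypothetical helper for illustration. With exactly two numerator
    degrees of freedom, the F survival function reduces to
    (1 + 2F / df_error) ** (-df_error / 2), so no statistics library
    is required. It is NOT valid for other numerator degrees of freedom.
    """
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


# Values taken from the Fig. 3f caption: F(2,160) = 3.42
p_interaction = f_upper_tail_p_df2(3.42, 160)
print(round(p_interaction, 3))   # prints 0.035, matching the caption
print(p_interaction < 0.05)      # prints True: below the 0.05 threshold
```

For tests with other numerator degrees of freedom, a general F-distribution routine from a statistics library would be needed instead.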
Scientific Validity
  • Small Effect Size and Missing Post-Hoc Analysis Information: The relatively small effect size (partial η² = 0.04) suggests that the interaction explains only a small portion of the variance in affiliative behavior. The result is statistically significant (P = 0.035) but only modestly so, and the caption does not describe any post hoc tests.
  • Appropriate Statistical Model but Weak Conclusion: The statistical analysis used is appropriate, given the study design. However, the modest level of significance and the lack of post hoc analyses make it difficult to draw strong conclusions about the nature of the group differences.
Communication
  • Limited Informativeness: The caption clearly states that the relationship between affiliative behavior and self-reported traits differed by group, which is a crucial finding. However, it does not specify which groups differed or the nature of the differences. This lack of detail limits the caption's informativeness.
  • Missing Post-Hoc Analysis Information: The inclusion of the F-statistic, p-value, and effect size provides essential statistical information. However, the caption lacks information about which specific statistical comparisons revealed the group differences.

Discussion

Key Aspects

Strengths

Suggestions for Improvement

Methods

Key Aspects

Strengths

Suggestions for Improvement
