Predictive equation derived from 6,497 doubly labelled water measurements enables the detection of erroneous self-reported energy intake

Section Analysis

Abstract

Key Aspects

Study Objective: The primary objective of this study is to address the pervasive issue of inaccurate dietary intake data in nutritional epidemiology. The researchers aim to develop a more accurate method for identifying unreliable self-reported dietary data by comparing reported energy intake with total energy expenditure (TEE).
Methodology: The study utilizes the International Atomic Energy Agency Doubly Labeled Water Database, which contains 6,497 TEE measurements from individuals aged 4 to 96 years. This data is used to derive a predictive equation for TEE based on easily acquired variables such as body weight, age, and sex.
Predictive Equation Development: A regression equation is developed to predict expected TEE. This equation has 95% predictive limits, which are then used to screen for misreporting by participants in dietary studies. The equation is designed to be more accurate than previous methods, such as the Goldberg cut-off, which have limitations due to errors in predicting basal metabolic rate and arbitrary multipliers.
Application and Findings: The predictive equation is applied to two large datasets: the National Diet and Nutrition Survey and the National Health and Nutrition Examination Survey. The analysis reveals that the level of misreporting in these studies is greater than 50%.
Bias Identification: The study finds that the macronutrient composition reported in these dietary studies is systematically biased. Specifically, the level of misreporting is correlated with the reported proportions of different macronutrients, leading to potentially spurious associations between diet components and body mass index.
Implications: The findings highlight the significant issue of misreporting in dietary studies and its impact on the accuracy of nutritional epidemiology research. The developed predictive equation offers a more robust tool for identifying and potentially correcting for this misreporting, thereby improving the reliability of research in this field.

Strengths

Comprehensive Data Utilization
The study leverages a large and diverse dataset from the International Atomic Energy Agency Doubly Labeled Water Database, enhancing the robustness and generalizability of the predictive equation.

"In this study, we used the International Atomic Energy Agency Doubly Labeled Water Database to derive a predictive equation for TEE using 6,497 measures of TEE in individuals aged 4 to 96 years." (Page 58)
Clear Methodology
The abstract clearly outlines the methodological approach, including the development of a regression equation and its application to large datasets, providing a concise overview of the study's design.

"The resultant regression equation predicts expected TEE from easily acquired variables, such as body weight, age and sex, with 95% predictive limits that can be used to screen for misreporting by participants in dietary studies." (Page 58)
Significant Findings
The study reveals a high level of misreporting in dietary studies and identifies systematic biases in reported macronutrient composition, highlighting critical issues in nutritional epidemiology.

"We applied the equation to two large datasets (National Diet and Nutrition Survey and National Health and Nutrition Examination Survey) and found that the level of misreporting was >50%." (Page 58)

Suggestions for Improvement

Enhance Clarity on Equation Variables
This medium-impact improvement would make the inputs of the predictive equation explicit. The Abstract names only body weight, age and sex as examples, whereas the Results show that height, elevation and self-reported ethnicity also enter the equation. Listing the full set of predictors in the Abstract would make the methodology more transparent, show how TEE is actually predicted, and help readers judge the novelty and robustness of the approach against previous methods.

"The resultant regression equation predicts expected TEE from easily acquired variables, such as body weight, age and sex, with 95% predictive limits" (Page 58)

Implementation: Specifically mention the key variables used in the regression equation, such as body weight, age, sex, and any other significant predictors. For example, "The resultant regression equation predicts expected TEE from easily acquired variables, such as body weight, age, sex, height, and ethnicity, with 95% predictive limits..."
Provide Context on Misreporting Implications
This high-impact improvement would significantly enhance the reader's understanding of the study's broader implications. The Abstract section needs this context to effectively communicate the significance of the findings to the field of nutritional epidemiology. Briefly elaborating on the consequences of misreporting for research and public health would strengthen the paper by highlighting the importance of accurate dietary data and the potential impact of the new predictive equation. This would also underscore the study's contribution to improving the reliability of nutritional research. Ultimately, providing context on the implications of misreporting would significantly improve the study's impact by emphasizing the practical value of the findings.

"The macronutrient composition from dietary reports in these studies was systematically biased as the level of misreporting increased, leading to potentially spurious associations between diet components and body mass index." (Page 58)

Implementation: Include a sentence or phrase that briefly explains the implications of misreporting for nutritional epidemiology. For example, "This misreporting can lead to inaccurate conclusions about diet-disease relationships and hinder the development of effective public health interventions."
Clarify the Novelty of the Approach
This medium-impact improvement would enhance the reader's understanding of the study's unique contribution to the field. The Abstract section needs this clarification to effectively position the research within the existing literature. Briefly explaining how the new predictive equation differs from and improves upon previous methods would strengthen the paper by highlighting its novelty and potential to advance the field. This would also help readers appreciate the significance of the study's findings in the context of existing research limitations. Ultimately, clarifying the novelty of the approach would improve the study's scientific contribution by clearly demonstrating its advancement over previous methods.

"Nutritional epidemiology aims to link dietary exposures to chronic disease, but the instruments for evaluating dietary intake are inaccurate." (Page 58)

Implementation: Add a sentence or phrase that explicitly states how the new predictive equation differs from previous approaches, such as the Goldberg cut-off. For example, "Unlike previous methods that rely on basal metabolic rate estimations and arbitrary multipliers, this equation uses a data-driven approach to predict TEE and identify misreporting."

Introduction

Key Aspects

Problem of Misreporting in Dietary Studies: The Introduction section highlights the pervasive issue of misreporting in dietary studies, which complicates the accurate quantification of food intake and hinders the ability to link nutritional exposures to disease outcomes. Misreporting encompasses inaccuracies in estimating food amounts, memory lapses, deliberate falsification of reports, and changes in eating behavior during recording periods. This issue is not limited to self-reporting but also includes errors introduced by investigators during data conversion, such as assuming uniform portion sizes.
Consequences of Misreporting: The section emphasizes that misreporting has led to significant misunderstandings in nutritional epidemiology, such as the erroneous belief that individuals with obesity have very low energy intakes. This has resulted in a misattribution of obesity to defects in energy expenditure rather than inaccuracies in dietary reporting. The severity of misreporting has prompted calls to cease publishing studies relying on self-reported dietary intake, yet these studies continue to proliferate, supported by endorsements from various government bodies.
Early Attempts to Address Misreporting: Early efforts to address misreporting involved defining cut-off limits for screening intake records based on predicted basal energy expenditure (BEE). The 'Goldberg cut-off' was developed by multiplying estimated BEE by 1.35, assuming that a daily total energy expenditure (TEE) lower than this value would be incompatible with survival. However, this method is prone to errors in predicting resting metabolic rate and relies on an arbitrary multiplier, thus only detecting very low reported intakes and missing many other inaccuracies.
Introduction of Doubly Labeled Water Technique: The doubly labeled water (DLW) technique, which measures energy expenditure directly from the elimination of isotopes of oxygen and hydrogen, is introduced as a more accurate method. An analytical error of about 7% is associated with this technique, depending on the equation used. McCrory et al. proposed using DLW measurements to predict TEE and screen dietary recalls, offering an improvement over the Goldberg cut-off. However, this approach was limited by its reliance on equations derived from a small sample and the use of arbitrary cut-off limits.
Current Study's Approach: The current study aims to address the limitations of previous methods by assembling a large database of DLW measurements from over 7,500 individuals. This database is used to derive prediction equations for TEE based on easily measured parameters such as body weight, height, age, and sex. The study's goal is to provide a more robust tool for identifying individuals who may be under- or over-reporting their dietary intake in surveys, thereby improving the accuracy of nutritional epidemiology research.
Application to Large Datasets: The derived prediction equations are applied to two large publicly available dietary surveys, the National Diet and Nutrition Survey (NDNS) and the National Health and Nutrition Examination Survey (NHANES). The study demonstrates the use of these equations in identifying misreporting and shows that the level of under-reporting is underestimated by previous tools, leading to biases in evaluating dietary composition.
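
The Goldberg-style screen summarized above reduces to a single comparison. The following is a minimal sketch in Python, assuming the caller supplies an estimated BEE in the same units as the reported intake. The function names and example values are illustrative and do not come from the paper.

```python
GOLDBERG_MULTIPLIER = 1.35  # multiplier quoted in the Introduction summary


def goldberg_cutoff(estimated_bee: float) -> float:
    """Return the lowest energy intake treated as plausible for this BEE."""
    if estimated_bee <= 0:
        raise ValueError("estimated_bee must be positive")
    return GOLDBERG_MULTIPLIER * estimated_bee


def is_implausibly_low(reported_intake: float, estimated_bee: float) -> bool:
    """Flag a report whose energy intake falls below the cut-off."""
    return reported_intake < goldberg_cutoff(estimated_bee)


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise both outcomes.
    print(is_implausibly_low(7.5, 6.0))
    print(is_implausibly_low(9.0, 6.0))
```

The check is one-sided: it can only flag very low intakes, which matches the limitation noted above, and any error in the estimated BEE carries straight through to the cut-off.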

Strengths

Comprehensive Overview of the Problem
The Introduction effectively outlines the pervasive issue of misreporting in dietary studies, providing a clear rationale for the study's focus on developing a more accurate method for identifying such errors.

"All these methods are prone to 'misreporting' because people cannot accurately estimate the amount of food they are eating" (Page 58)
Clear Explanation of Previous Methods and Limitations
The section clearly explains previous methods, such as the Goldberg cut-off and its modifications, and highlights their limitations, setting the stage for the need for a new approach.

"However, this approach is susceptible to two major problems: error in the predicted resting metabolic rate and the arbitrary nature of the 1.35 multiplier." (Page 59)
Strong Justification for the Current Study
The Introduction provides a strong justification for the current study by highlighting the limitations of existing methods and the need for a more robust approach based on a large dataset of doubly labeled water measurements.

"In this context, we have assembled a database of DLW measurements of healthy individuals" (Page 59)

Suggestions for Improvement

Expand on the Novelty of the DLW Database
This medium-impact improvement would enhance the reader's understanding of the study's unique contribution to the field. The Introduction section needs this clarification to effectively position the research within the existing literature and highlight the innovative use of the extensive DLW database. Elaborating on the specific advantages and novel aspects of this database, such as its size, diversity, and the inclusion of various age groups and ethnicities, would strengthen the paper by emphasizing its potential to overcome limitations of previous studies. This would also help readers appreciate the significance of the study's findings in the context of existing research limitations and underscore the advancement this database represents in the field of nutritional epidemiology. Ultimately, expanding on the novelty of the DLW database would improve the study's scientific contribution by clearly demonstrating its advancement over previous methods and its potential to provide more accurate and generalizable insights into dietary misreporting.

"In this context, we have assembled a database of DLW measurements of healthy individuals." (Page 59)

Implementation: Include a paragraph that details the unique features of the DLW database, such as its size, diversity, and the range of ages and ethnicities included. For example, "This study leverages an unprecedentedly large and diverse database of doubly labeled water measurements, encompassing over 7,500 individuals aged 8 days to 96 years from various ethnic backgrounds. This extensive dataset allows for the development of more robust and generalizable prediction equations for TEE, addressing limitations of previous studies that relied on smaller, less diverse samples."
Clarify the Implications of Misreporting for Public Health
This high-impact improvement would significantly enhance the reader's understanding of the study's broader implications for public health. The Introduction section needs this context to effectively communicate the significance of accurate dietary assessment beyond the research setting. Briefly elaborating on how misreporting can lead to flawed public health policies, inaccurate dietary guidelines, and ineffective interventions would strengthen the paper by highlighting the real-world consequences of the problem. This would also underscore the study's contribution to improving the evidence base for public health nutrition and emphasize the practical value of the new predictive equation in addressing these issues. Ultimately, clarifying the implications of misreporting for public health would significantly improve the study's impact by emphasizing the importance of accurate dietary data for promoting population health and preventing chronic diseases.

"Misreporting has real negative consequences." (Page 58)

Implementation: Add a few sentences that explain the potential consequences of misreporting for public health policy and interventions. For example, "Accurate dietary assessment is crucial not only for research but also for informing public health policies, developing dietary guidelines, and designing effective interventions to prevent chronic diseases. Misreporting can lead to erroneous conclusions about diet-disease relationships, resulting in misguided policies and interventions that fail to address the true nutritional needs of the population."
Provide More Context on the Study Population
This medium-impact improvement would enhance the reader's understanding of the study's scope and generalizability. The Introduction section needs this context to effectively frame the research within the broader population and highlight any potential limitations. Briefly describing the characteristics of the study population, such as age range, sex distribution, and ethnicity, would strengthen the paper by providing a clearer picture of the individuals included in the database and the potential applicability of the findings. This would also help readers assess the representativeness of the sample and identify any potential biases or limitations in generalizing the results to other populations. Ultimately, providing more context on the study population would improve the study's scientific contribution by ensuring transparency and allowing for a more nuanced interpretation of the findings.

"The database includes measurements of over 7,500 individuals of diverse ethnicity aged 8 days to 96 years." (Page 59)

Implementation: Include a brief description of the study population, mentioning the age range, sex distribution, and ethnicity of the individuals included in the DLW database. For example, "The database includes measurements from over 7,500 individuals aged 8 days to 96 years, with a diverse representation of ethnicities, including White, African, Asian, and Hispanic populations. The sample includes both males and females, providing a comprehensive dataset for developing prediction equations across different demographic groups."

Results

Key Aspects

Predictive Model Development: The researchers developed predictive models for Total Energy Expenditure (TEE) using two primary approaches: classical general linear regression and machine learning models. The general linear regression included variables such as body weight, height, age, age squared, self-reported ethnicity, sex, and elevation. Machine learning models, including Random Forest, XGBoost, and Support Vector Regression, were also tested but did not improve upon the classical regression, likely because the predictors were linearly related to the output variable. The natural logarithm of body weight (ln(BW)) was the most significant predictor, with other variables like height, age, age squared, elevation, and sex also being highly significant. Notably, females had lower TEE than males, and there were significant differences in TEE based on self-reported ethnicity, particularly among White and African participants living outside Africa.
Predictive Equation Derivation: A predictive equation for TEE was derived with coefficients reduced to four significant figures, resulting in minimal discrepancy (0.03%) compared to an equation using full precision. The equation is expressed as: ln(TEE) = -0.2127 + 0.4167 × ln(BW) + 0.006565 × Height - 0.02054 × Age + 0.0003308 × Age² - 0.000001852 × Age³ + 0.09126 × ln(Elevation) - 0.04092 × Sex + 0.01940 × A - 0.03899 × AA + 0.006238 × AS + 0.02626 × W - 0.0155 × H + 0.003589 × NA - 0.0006759 × Height × ln(Elevation) + 0.002018 × Age × ln(Elevation) - 0.00002262 × Age² × ln(Elevation) - 0.006947 × Sex × ln(Elevation), where ln denotes the natural logarithm. This equation allows for the prediction of TEE based on easily measured parameters and includes adjustments for various demographic factors.
Validation and Predictive Intervals: The derived equation was validated using a separate dataset of 598 individuals. The validation confirmed that 94.6% of independent TEE measurements fell within the 95% predictive limits. Ninety-five percent predictive intervals (95% PI) were calculated to provide a range of values likely to contain the true value for a new observation. The lower and upper 95% PI were defined as: Lower 95% PI = (pTEE × 0.7466) – 1.5405 and Upper 95% PI = (pTEE × 1.3395) + 2.7668, where pTEE is the predicted mean TEE. These intervals offer an objective evaluation of confidence for each prediction, surpassing previous methods that relied on arbitrary cut-off points.
Application to Survey Data: The predictive equation was applied to two large dietary surveys: the National Diet and Nutrition Survey (NDNS) and the National Health and Nutrition Examination Survey (NHANES). The analysis revealed that a significant proportion of dietary reports fell outside the predicted TEE limits. In NHANES, approximately 43.7% of adult dietary reports were within the predictive interval, while in NDNS, the figures were 36.6% for males and 38.1% for females. Children generally had a higher percentage of reports within the predicted range compared to adults. These findings suggest a high level of misreporting or undereating in both surveys.
Effects of Age and BMI on Under-reporting: The study examined the relationship between the discrepancy in reported energy intake and predicted TEE as a function of age and Body Mass Index (BMI). In adults, the extent of under-reporting was largely independent of age, although a slight improvement was observed with age in the NDNS dataset. The discrepancy between reported intake and predicted expenditure was strongly negatively correlated with individual BMI. Individuals with higher BMI exhibited larger discrepancies, indicating greater under-reporting. This effect was more pronounced in children than in adults.
Under-reporting and Macronutrient Intake: The researchers explored the relationship between the discrepancy in energy intake and the proportional macronutrient composition of the reported diet. They found a strong negative relationship between the reported percentage of energy from protein and the absolute size of the energy discrepancy. As the reported protein intake increased, the discrepancy became more negative, indicating greater under-reporting. Conversely, as the percentage of fat energy increased, the discrepancy became more positive. These findings suggest that individuals who under-reported their total energy intake also reported a greater percentage of protein and a reduced percentage of fat in their diets.
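
The predictive equation and the 95% predictive limits quoted above can be combined into a simple screening routine. The sketch below is in Python and rests on several assumptions that this section does not state: body weight in kg, height in cm, age in years, elevation in metres (and greater than zero), a numeric `sex_code` following whatever coding the paper defines, and ethnicity applied as a single indicator term. The function names are illustrative, not the authors' implementation.

```python
import math

# Ethnicity coefficients copied from the equation quoted in this section.
ETHNICITY_COEFFICIENTS = {
    "A": 0.01940, "AA": -0.03899, "AS": 0.006238,
    "W": 0.02626, "H": -0.0155, "NA": 0.003589,
}


def predict_tee(weight_kg, height_cm, age_years, elevation_m, sex_code, ethnicity):
    """Predicted mean TEE (pTEE) from the four-significant-figure equation.

    Units and the coding of sex_code are assumptions; check them against
    the paper before any real use.
    """
    if weight_kg <= 0 or elevation_m <= 0:
        raise ValueError("weight and elevation must be positive to take logs")
    ln_bw = math.log(weight_kg)
    ln_elev = math.log(elevation_m)
    ln_tee = (
        -0.2127
        + 0.4167 * ln_bw
        + 0.006565 * height_cm
        - 0.02054 * age_years
        + 0.0003308 * age_years ** 2
        - 0.000001852 * age_years ** 3
        + 0.09126 * ln_elev
        - 0.04092 * sex_code
        + ETHNICITY_COEFFICIENTS[ethnicity]
        - 0.0006759 * height_cm * ln_elev
        + 0.002018 * age_years * ln_elev
        - 0.00002262 * age_years ** 2 * ln_elev
        - 0.006947 * sex_code * ln_elev
    )
    return math.exp(ln_tee)


def predictive_interval(ptee):
    """Lower and upper 95% predictive limits as quoted in this section."""
    lower = ptee * 0.7466 - 1.5405
    upper = ptee * 1.3395 + 2.7668
    return lower, upper


def classify_report(reported_intake, ptee):
    """Label a reported energy intake relative to the predictive limits."""
    lower, upper = predictive_interval(ptee)
    if reported_intake < lower:
        return "under"
    if reported_intake > upper:
        return "over"
    return "within"


if __name__ == "__main__":
    # Hypothetical participant; sex_code=0 is an illustrative value only.
    ptee = predict_tee(70, 170, 40, 100, 0, "W")
    print(round(ptee, 2), predictive_interval(ptee), classify_report(5.0, ptee))
```

Unlike a single fixed cut-off, this screen is two-sided, so it can flag both implausibly low and implausibly high reported intakes.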

Strengths

Clear Presentation of Predictive Equation
The Results section clearly presents the derived predictive equation for TEE, including all significant terms and their coefficients, which enhances the transparency and reproducibility of the study.

"In this analysis, we were able to derive a predictive equation with each coefficient reduced to four significant figures." (Page 59)
Comprehensive Validation
The researchers conducted a thorough validation of the predictive equation using an independent dataset, demonstrating its robustness and confirming that a high percentage of measurements fell within the predicted limits.

"The validation dataset confirmed that 94.6% of independent TEE measurements were within these 95% predictive limits" (Page 60)
Detailed Analysis of Survey Data
The application of the predictive equation to two large dietary surveys (NDNS and NHANES) provides a detailed and insightful analysis of misreporting, including stratification by age, sex, and BMI.

"For adults in NHANES, approximately 43.7% of dietary reports were within the predictive interval" (Page 60)

Suggestions for Improvement

Expand on Machine Learning Model Comparisons
This medium-impact improvement would provide a more comprehensive understanding of the methodological choices made in the study. The Results section needs this detail to fully justify the selection of the classical general linear regression model over the machine learning alternatives. Elaborating on the specific reasons why the machine learning models (Random Forest, XGBoost, and Support Vector Regression) did not outperform the classical regression would strengthen the paper by providing a clearer rationale for the chosen approach. This would also help readers appreciate the nuances of model selection in the context of predicting TEE and understand the limitations of each method. Ultimately, expanding on the machine learning model comparisons would improve the study's methodological rigor by ensuring transparency in the model selection process and providing a more complete picture of the analytical approach.

"These machine learning models did not improve on the classical general linear regression modelling" (Page 59)

Implementation: Include a paragraph that details the performance metrics of each machine learning model compared to the classical regression, such as R-squared values, mean absolute error, and any other relevant statistics. For example, "While the machine learning models showed comparable performance to the classical regression, with R-squared values of X for Random Forest, Y for XGBoost, and Z for Support Vector Regression, they did not offer significant improvements in predictive accuracy. This is likely due to the linear relationships between the predictors and TEE, which are adequately captured by the classical regression model."
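
The performance metrics named in this suggestion are straightforward to compute once each model's predictions are available. The sketch below uses only the Python standard library; the observed and predicted values are hypothetical placeholders, not results from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    # Hypothetical TEE values and two hypothetical sets of model predictions.
    observed = [8.0, 10.0, 12.0, 14.0]
    model_a = [8.5, 9.5, 12.5, 13.5]
    model_b = [9.0, 9.0, 13.0, 13.0]
    for name, pred in (("model_a", model_a), ("model_b", model_b)):
        print(name, r_squared(observed, pred), mean_absolute_error(observed, pred))
```

Reporting both metrics on the same held-out data for every model would let readers see directly whether the machine learning alternatives offered any gain.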
Clarify the Rationale for Using 95% Predictive Intervals
This medium-impact improvement would enhance the reader's understanding of the statistical methods employed in the study. The Results section needs this clarification to justify the choice of using 95% predictive intervals (PI) over other potential metrics for assessing misreporting. Providing a more detailed explanation of why 95% PI were selected and how they offer advantages over previous methods, such as the Goldberg cut-off, would strengthen the paper by providing a stronger statistical foundation for the analysis. This would also help readers appreciate the novelty and robustness of the approach in identifying misreporting. Ultimately, clarifying the rationale for using 95% PI would improve the study's methodological rigor by ensuring transparency in the statistical methods and providing a clearer justification for their use.

"Ninety-five per cent predictive intervals (95% PI) are the range of values that are 95% likely to contain the true value" (Page 60)

Implementation: Add a few sentences that explain the advantages of using 95% PI, such as their ability to account for individual variability and provide a more accurate assessment of misreporting compared to fixed cut-off values. For example, "The 95% PI were chosen over traditional cut-off methods because they provide a statistically sound range that accounts for individual variability in TEE. Unlike fixed cut-offs, which can lead to misclassification, the 95% PI offer a more nuanced and accurate approach to identifying potential misreporting."
Provide More Context on the Implications of the Findings for Specific Populations
This high-impact improvement would significantly enhance the reader's understanding of the study's broader implications for different demographic groups. The Results section needs this context to effectively communicate the significance of the findings beyond the overall levels of misreporting. Briefly elaborating on how the observed patterns of misreporting vary across specific populations, such as children, adults, and individuals with different BMIs, and the potential consequences for nutritional research and public health in these groups would strengthen the paper by highlighting the practical relevance of the findings. This would also underscore the study's contribution to improving the accuracy of dietary assessment in diverse populations. Ultimately, providing more context on the implications of the findings for specific populations would significantly improve the study's impact by emphasizing the importance of tailored approaches to addressing misreporting and promoting accurate dietary data collection in different demographic groups.

"The deficit between reported intake and predicted expenditure was strongly negatively correlated with individual BMI" (Page 61)

Implementation: Include a few sentences or a paragraph that discusses the implications of the findings for specific populations, such as the higher prevalence of under-reporting among individuals with higher BMI and the potential impact on obesity research. For example, "The finding that under-reporting is more prevalent among individuals with higher BMI has important implications for studies investigating the relationship between diet and obesity. This highlights the need for targeted strategies to address misreporting in this population and improve the accuracy of dietary data in obesity research."

Non-Text Elements

Table 1 | Significant terms in the general linear model analysis (10 decimal...

Full Caption

Table 1 | Significant terms in the general linear model analysis (10 decimal places) predicting TEE

Figure/Table Image (Page 3)

First Reference in Text

The derived significant predictors and their regression coefficients are reported in Table 1.

Description

Purpose of the table: This table shows the results of a statistical analysis called 'general linear model analysis'. You can think of this like drawing a line through a cloud of points on a graph to see if there's a relationship between different things. Here, they're looking at what factors might be related to something called 'Total Energy Expenditure' (TEE), which is the amount of energy (like calories) a person uses in a day.
Content of the table: The table lists different factors that the analysis found to be important in predicting TEE. For each factor, it gives a 'coefficient,' which is like the slope of the line in our graph analogy – it tells us how much TEE is expected to change if that factor changes. It also shows the 'standard error' of the coefficient, which is a measure of how precise that estimate is. Additionally, it includes a 'T-value,' which helps determine if the factor is statistically significant, and a 'P-value,' which is the probability of seeing the observed relationship (or a stronger one) if there was actually no real relationship between the factor and TEE. A low P-value (typically less than 0.05) suggests that the relationship is statistically significant, meaning it's unlikely to have occurred by chance.
Specific factors listed: The factors listed in the table are things like body weight, height, age, sex, elevation, and ethnicity. 'ln[BW (kg)]' refers to the natural logarithm of body weight in kilograms. The natural logarithm is a mathematical function that helps to transform the data in a way that makes it more suitable for this type of analysis. Similarly, 'ln[Elevation (m)]' is the natural logarithm of the elevation where the measurement was taken, in meters. 'Sex' is a categorical variable, likely coded as male or female. 'Ethnicity' is also a categorical variable, with categories like 'A' for African, 'AA' for African living outside Africa, 'AS' for Asian, 'W' for White, 'H' for Hispanic, and 'NA' for not available.
Precision of the coefficients: The coefficients in the table are reported to 10 decimal places. This level of precision is unusual in many scientific fields, but the authors state later in the paper that they did this for reproducibility, so that others can get the exact same results if they use their model.
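
The relationship between a coefficient, its standard error, the T-value and the P-value can be illustrated with a short calculation. The sketch below is in Python and uses a standard normal approximation for the P-value, which is an assumption made here for simplicity (reasonable when a model is fitted to thousands of observations); the numbers are hypothetical and are not taken from Table 1.

```python
import math


def t_value(coefficient, standard_error):
    """T-value: how many standard errors the coefficient lies from zero."""
    return coefficient / standard_error


def two_sided_p_normal(t):
    """Two-sided P-value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical coefficient and standard error.
    t = t_value(0.4, 0.1)
    print(t, two_sided_p_normal(t))
```

A T-value near 1.96 corresponds to a P-value of about 0.05 under this approximation, which is where the conventional significance threshold comes from.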

Scientific Validity

Model Specification: The authors have included a comprehensive set of predictors in their model, including demographic, anthropometric, and environmental variables. The inclusion of interaction terms (e.g., Height × ln[Elevation (m)]) is appropriate and suggests a thorough exploration of potential relationships. However, the rationale for including specific interaction terms could be more explicitly stated.
Statistical Significance: The reported T-values and P-values allow for a clear assessment of the statistical significance of each predictor. The use of a general linear model is appropriate given the nature of the dependent variable (TEE) and the mix of continuous and categorical predictors.
Precision of Coefficients: While reporting coefficients to 10 decimal places is unconventional, the authors justify this decision based on the need for precise replication of their predictive model. This level of precision, although unusual, does not inherently invalidate the scientific rigor of the analysis. It is crucial, however, that the authors provide a sensitivity analysis demonstrating the impact of this precision on the model's predictions.

Communication

Clarity of Column Headers: The column headers are generally clear and provide sufficient information to understand the table's contents. However, 'SE coefficient' could be more explicitly labeled as 'Standard Error of Coefficient' for better clarity.
Footnotes: The footnote defining the ethnicity abbreviations is helpful. However, providing a brief explanation of the coding for 'Sex' within the table or in a footnote would further enhance clarity.
Caption Clarity: The caption is concise but could be slightly more descriptive. For example, it could be revised to: 'Table 1 | Significant terms and their coefficients from the general linear model analysis predicting Total Energy Expenditure (TEE), presented with 10 decimal places for reproducibility.'

Table 2 | Summary of observations inside and outside the tolerance limits in...

Full Caption

Table 2 | Summary of observations inside and outside the tolerance limits in the NDNS and NHANES datasets

Figure/Table Image (Page 4)

First Reference in Text

Using the predictive equations developed above, the number and percentage of individuals that fell outside the predicted limits (both over and under) and within the predicted limits are shown in Table 2, stratified by data source, age (adults versus children) and sex.

Description

Purpose of the table: This table summarizes how well the predictions from their equation match up with real-world data from two large dietary surveys, the NDNS (National Diet and Nutrition Survey) and NHANES (National Health and Nutrition Examination Survey). It's like checking if a weatherman's forecast (the equation's prediction) accurately reflects the actual weather (the survey data).
Tolerance Limits: The 'tolerance limits' refer to the range of values within which the researchers expect the real-world data to fall, based on their equation. It's like setting a margin of error around the prediction. If the weatherman predicts a temperature of 20 degrees Celsius, he might say the actual temperature will likely be between 18 and 22 degrees. That range is the tolerance limit.
Organization of the table: The table is organized by dividing the survey participants into groups based on which survey they were part of (NDNS or NHANES), whether they were adults or children, and their sex (male or female). For each group, the table shows how many people had reported dietary intakes that fell within the predicted range (inside the tolerance limits), below the predicted range (underestimated), and above the predicted range (overestimated). These counts are also shown as percentages of the total number of people in each group.
Interpretation of the data: If the equation is a good predictor, most people's reported intakes should fall within the tolerance limits. If a large percentage of people fall outside the limits, it suggests that the equation may not be accurately reflecting real-world dietary intake, or that there's a lot of misreporting in the surveys. For example, a high percentage of 'underestimated' would suggest that many people are reporting eating less than what the equation predicts they should need based on factors like their weight, height, and age.
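Illustration: The screening logic described above can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the records are made up, and the lower and upper limits are assumed to have been computed already from the predictive equation.

```python
from collections import Counter

def classify_report(reported_ei, lower_limit, upper_limit):
    """Label one report relative to its tolerance limits (all in MJ/day)."""
    if reported_ei < lower_limit:
        return "under"
    if reported_ei > upper_limit:
        return "over"
    return "within"

def summarise(records):
    """Return {label: (count, percentage)} for a list of (EI, lower, upper)."""
    counts = Counter(classify_report(*record) for record in records)
    total = len(records)
    return {
        label: (counts[label], 100.0 * counts[label] / total)
        for label in ("under", "within", "over")
    }

# Made-up records: (reported EI, lower limit, upper limit), all in MJ/day.
records = [
    (6.1, 7.5, 13.0),
    (9.8, 7.5, 13.0),
    (14.2, 8.0, 13.5),
    (5.0, 6.9, 12.1),
]

summary = summarise(records)
for label, (count, pct) in summary.items():
    print(f"{label}: n = {count} ({pct:.1f}%)")
```

Running the same summary separately for each survey, age group, and sex would give the kind of stratified layout that Table 2 uses.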

Scientific Validity

Appropriateness of Tolerance Limits: The scientific validity of this table hinges on the appropriateness of the tolerance limits used. The authors have previously described how these limits were derived (95% prediction intervals), which is a statistically sound approach. However, the validity of applying these limits to assess misreporting relies on the assumption that deviations from the predicted TEE primarily reflect misreporting rather than individual variability or other factors not captured by the model.
Stratification by Data Source, Age, and Sex: Stratifying the results by data source, age, and sex is crucial for identifying potential biases or differences in the performance of the predictive equation across different populations. This allows for a more nuanced interpretation of the results and helps to pinpoint specific groups where misreporting may be more prevalent. The choice of these stratification variables is justified given their known associations with dietary intake and reporting behaviors.
Use of Number and Percentage: Presenting both the number and percentage of individuals within each category is helpful for interpretation. Percentages provide a standardized way to compare across groups of different sizes, while raw numbers give a sense of the actual sample sizes involved.

Communication

Clarity of Column Headers: The column headers are relatively clear but could be improved. 'Number underestimated' and 'Number overestimated' could be more explicitly defined as 'Number below the lower tolerance limit' and 'Number above the upper tolerance limit,' respectively. Similarly, 'Number within range' could be clarified as 'Number within tolerance limits.'
Caption Clarity: The caption is generally clear and informative. It could be slightly improved by explicitly stating that the tolerance limits are based on the 95% prediction intervals of the predictive equation.
Footnote: The footnote is helpful in explaining what the table shows. However, it could be made more informative by briefly mentioning the years the datasets cover, which is relevant context for interpreting the results.
Overall Readability: The table is well-organized and relatively easy to read. The use of bold font for the main categories (e.g., Male children, Female children) enhances readability.

Fig. 1 | Misreporting in relation to age, BMI and sex. a, Comparison of the...

Full Caption

Fig. 1 | Misreporting in relation to age, BMI and sex. a, Comparison of the difference between predicted TEE and self-reported energy intake (EI) in the NDNS (n = 12,694) and NHANES (n = 5,873) datasets in relation to age for children (≤16 yr) and adults (>16 yr). b, Comparison of the difference between predicted TEE and self-reported energy intake in the same datasets in relation to BMI for children (≤16 yr) and adults (>16 yr). Negative values show observations lower than prediction and positive values show prediction higher than observation.

Figure/Table Image (Page 5)

First Reference in Text

We plotted the difference between the survey estimate of daily energy intake and the predicted TEE as a function of age and body mass index (BMI) for both the NDNS and NHANES datasets (Fig. 1).

Description

Overall Purpose: This figure shows how well people's self-reported food intake matches what a predictive equation says they should be eating. The equation predicts Total Energy Expenditure (TEE), which is the amount of energy a person expends in a day. The researchers compare this prediction with self-reported energy intake (EI), which is what people say they eat in dietary surveys. The difference between the two (self-reported EI minus predicted TEE) indicates potential 'misreporting': either under-reporting (reporting less than was actually eaten) or over-reporting (reporting more than was actually eaten).
Structure of the Figure: The figure is divided into two parts, labeled 'a' and 'b'. Each part contains four graphs. Part 'a' looks at the relationship between misreporting and age, while part 'b' looks at the relationship between misreporting and Body Mass Index (BMI), which is a measure of body fat based on height and weight. Each graph is a scatter plot, with each dot representing a person in the study. The graphs are further split by the dataset used (NDNS or NHANES) and whether the participants were children or adults.
X and Y Axes: In part 'a', the x-axis (horizontal) represents age in years, while in part 'b', it represents BMI. In both parts, the y-axis (vertical) represents the difference between self-reported EI and predicted TEE (EI minus TEE). A value of 0 on the y-axis means that the predicted TEE and self-reported EI are the same. Negative values mean that people report eating less than the equation predicts (under-reporting), while positive values mean they report eating more (over-reporting).
Interpretation of the Data Points: Each dot on the scatter plots shows an individual's data. For example, in part 'a', a dot on the NDNS-Adults graph with an x-axis value of 40 and a y-axis value of -5 would represent a 40-year-old adult in the NDNS study who reported eating 5 megajoules (a unit of energy) less per day than the equation predicted. The red line on each graph is a trend line, which is like drawing a line through the middle of the dots to see the general pattern. If the trend line slopes downwards, it means that as age or BMI increases, the difference between predicted TEE and self-reported EI tends to become more negative (more under-reporting).
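Illustration: The discrepancy and trend line described above can be sketched as follows. The data are made up, and because the source does not state how the red lines were produced, an ordinary least-squares straight line is assumed here purely for illustration.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up values (MJ/day): five adults of increasing age.
ages = [20, 30, 40, 50, 60]
reported_ei = [9.0, 8.5, 8.0, 7.5, 7.0]
predicted_tee = [11.0, 11.0, 11.0, 11.0, 11.0]

# Discrepancy = self-reported EI minus predicted TEE; negative means
# the report is lower than the prediction (possible under-reporting).
discrepancy = [ei - tee for ei, tee in zip(reported_ei, predicted_tee)]

slope, intercept = least_squares_line(ages, discrepancy)
print(f"slope = {slope:.3f} MJ/day per year, intercept = {intercept:.3f} MJ/day")
```

A negative slope in this toy example means the discrepancy becomes more negative with age, which is the pattern a downward-sloping trend line would indicate.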

Scientific Validity

Appropriateness of Visualization: Using scatter plots with trend lines is an appropriate way to visualize the relationship between continuous variables like age, BMI, and the difference between predicted and reported energy intake. This allows for a visual assessment of the magnitude and direction of misreporting across different age and BMI groups.
Statistical Analysis: The reference text indicates that the authors plotted the difference between predicted and reported energy intake, but it doesn't specify the method used to generate the trend lines. It's crucial to know whether these are simple linear regressions or if a more sophisticated smoothing technique was employed. The choice of method can influence the interpretation of the trends.
Sample Size: The large sample sizes from the NDNS and NHANES datasets provide robust data for this analysis. However, it's important to consider potential biases or limitations inherent in these datasets, such as the reliance on self-reported dietary intake.

Communication

Clarity of Axes Labels: The axes labels are generally clear and informative. The y-axis label could be slightly improved by specifying the units (MJ d⁻¹) for the difference between predicted TEE and self-reported EI.
Legend: The figure lacks a legend to differentiate between the NDNS and NHANES datasets and between children and adults. Adding a legend would significantly improve the clarity and interpretability of the graphs.
Caption: The caption is relatively clear but could be more concise. It could also benefit from explicitly stating that the red lines represent trend lines.
Visual Clutter: The use of different colors and symbols for each dataset and age group, combined with the trend lines, creates some visual clutter. Using a more minimalist color scheme or separating the graphs for children and adults could improve readability.
Trend Line Description: Beyond labeling the red lines in the caption, the method used to generate them (e.g. linear regression or a smoothing technique) should be specified in the methods section.

Table 3 | Relationships between the discrepancy of intake to expenditure and...

Full Caption

Table 3 | Relationships between the discrepancy of intake to expenditure and self-reported dietary macronutrient composition

Figure/Table Image (Page 5)

First Reference in Text

Next, we explored the relationship between the discrepancy in energy intake and the proportional macronutrient composition (percentage energy) of the reported diet (Table 3).

Description

Purpose of the Table: This table explores whether the tendency for people to over- or under-report their food intake is related to the types of food they eat. Specifically, it looks at whether the difference between what people report eating and what an equation predicts they need (the 'discrepancy') is linked to the proportion of their diet that comes from carbohydrates, protein, and fat. These three are called 'macronutrients' and are the main components of food that provide energy.
Structure of the Table: The table is divided into four sections, each representing a different set of data: the full NDNS dataset, the screened NDNS dataset, the full NHANES dataset, and the screened NHANES dataset. 'Screened' here likely refers to removing data points that were considered unreliable based on some criteria, like falling outside the tolerance limits mentioned in previous tables. Each section shows the results of a statistical analysis called 'multiple regression analysis'. This is a method used to examine the relationship between a dependent variable (in this case, the discrepancy between reported and predicted energy intake) and several independent variables (the percentage of energy from carbohydrates, protein, and fat).
Key Terms Explained: 'Coefficient' in this context refers to the estimated change in the discrepancy (in kilojoules per day) for a one-unit change in the percentage of energy from each macronutrient. For example, a coefficient of -207.3 for 'Percentage protein' in the full NDNS dataset means that for every 1% increase in the proportion of protein in the diet, the discrepancy between reported and predicted intake is estimated to decrease by 207.3 kJ/day (meaning more under-reporting). 'SE coefficient' stands for the standard error of the coefficient, which is a measure of the precision of the estimate. 'P-value' is the probability of observing the relationship (or a stronger one) if there were actually no real relationship between the macronutrient and the discrepancy. A low P-value (typically less than 0.05) suggests that the relationship is statistically significant. 'R²' is a measure of how well the model fits the data, with higher values indicating a better fit.
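Illustration: The coefficient interpretation above can be checked with a short worked calculation. The coefficient of -207.3 kJ/day per percentage point is the value quoted in the text for the full NDNS dataset; the 5-point change in protein share is a made-up input, and the helper name is hypothetical. The calculation holds the other terms in the regression fixed, which is a simplification because macronutrient percentages are not independent of one another.

```python
# Coefficient quoted above for 'Percentage protein' (full NDNS dataset).
PROTEIN_COEFF_KJ_PER_PCT = -207.3

def discrepancy_change_kj(coeff_kj_per_pct, delta_pct_points):
    """Estimated change in discrepancy (kJ/day) for a change in macronutrient share."""
    return coeff_kj_per_pct * delta_pct_points

# Hypothetical comparison: reported protein share higher by 5 percentage points.
change_kj = discrepancy_change_kj(PROTEIN_COEFF_KJ_PER_PCT, 5.0)
change_mj = change_kj / 1000.0
print(f"Estimated change in discrepancy: {change_kj:.1f} kJ/day ({change_mj:.2f} MJ/day)")
```

Under these assumptions, a report with a protein share five points higher would be expected to sit roughly 1 MJ/day further below the prediction.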

Scientific Validity

Appropriateness of Statistical Method: Multiple regression analysis is an appropriate method for examining the relationship between the discrepancy in energy intake and the proportional macronutrient composition of the diet. The use of both full and screened datasets allows for an assessment of the robustness of the findings.
Interpretation of Coefficients: The interpretation of the coefficients is crucial. The authors should provide a more detailed discussion of the implications of the positive and negative coefficients for different macronutrients. For instance, they should discuss potential reasons why a higher reported protein intake is associated with a greater discrepancy (more under-reporting).
Consideration of Confounding Factors: While the table presents the results of multiple regression, which adjusts for the other included macronutrients, there may be other confounding factors that influence both macronutrient composition and the discrepancy in energy intake. These potential confounders should be acknowledged and discussed.
Comparison of Full and Screened Datasets: Comparing the results from the full and screened datasets is valuable for assessing the impact of removing potentially unreliable data points. The authors should provide a more detailed comparison of these results and discuss any notable differences.

Communication

Clarity of Column Headers: The column headers are generally clear and informative. However, 'Term' could be more explicitly labeled as 'Predictor Variable' or 'Macronutrient.'
Caption Clarity: The caption is concise but could be more descriptive. It could be revised to: 'Table 3 | Results of multiple regression analyses examining the relationships between the discrepancy of reported energy intake to predicted expenditure and the self-reported dietary macronutrient composition (percentage of total energy) in the NDNS and NHANES datasets, using both full and screened data.'
Table Organization: The organization of the table into four sections is logical and facilitates comparison across datasets and data treatments (full vs. screened). However, the table is quite dense, and the use of bold font or shading could help to visually separate the different sections.
Explanation of Screening: The table would benefit from a brief explanation of the screening criteria used to define the 'screened' datasets. This information could be included in a footnote or in the methods section.
Units: It would be helpful to include the units for the coefficients (kJ/day per 1% change in macronutrient) in the column header or a footnote.

Fig. 2 | Misreporting and macronutrient intake. a-c, The discrepancy between...

Full Caption

Fig. 2 | Misreporting and macronutrient intake. a-c, The discrepancy between the predicted TEE and the reported energy intake in the NHANES and NDNS surveys plotted against the self-reported intakes of fat (a), protein (b) and carbohydrates (c) as a percentage of the total energy. For each macronutrient, the top two plots show data from the whole sample (full data) and the bottom two plots show the data from the sample screened to include only those individuals within the predictive interval of the equation (screened). Significant effects in the whole sample were severely attenuated in the screened sample (see Table 3 for regression details).

Figure/Table Image (Page 6)

First Reference in Text

As the level of protein in the diet increased, the discrepancy became more negative. For each 1.0% increase in reported protein energy, the difference between reported energy intake and actual intake decreased by around 200 kJ d⁻¹ in both NDNS and NHANES (Table 3). Note that as most data fall below the line of equality, this negative relationship means that as the self-reported percentage of protein in the diet increased, the discrepancy between the self-reported total energy intake and the predicted total energy expenditure got larger (Fig. 2).

Description

Purpose of the Figure: This figure investigates the relationship between how much people misreport their food intake and the proportion of fat, protein, and carbohydrates in their diet. It's like asking: are people who say they eat a high-protein diet more or less likely to underreport their total calorie intake compared to people who say they eat a high-fat or high-carbohydrate diet?
Structure of the Figure: The figure is divided into three sections (a, b, and c), each representing a different macronutrient: fat, protein, and carbohydrates, respectively. Each section contains four graphs. The top two graphs in each section show data from the entire sample of two studies (NHANES and NDNS), while the bottom two graphs show data only from a 'screened' sample. This 'screened' sample includes only individuals whose reported energy intake was close to what the researchers' equation predicted they should be eating. This is like taking out the data from people who might have very inaccurate reporting to see if the patterns hold up when looking at more reliable data.
X and Y Axes: On the x-axis (horizontal) of each graph is the percentage of total energy that comes from the specific macronutrient (fat, protein, or carbohydrates) according to what people reported eating. The y-axis (vertical) shows the difference between the predicted Total Energy Expenditure (TEE, the number of calories a person burns in a day) and the reported energy intake. A negative value on the y-axis means people are reporting eating less than the equation predicts (under-reporting), while a positive value means they are reporting eating more (over-reporting).
Interpretation of the Graphs: Each dot on the graphs represents a person in the study. The red line is a trend line, which is like drawing a line through the middle of the dots to see the general pattern. If the trend line slopes downwards, it suggests that as the percentage of a particular macronutrient in the diet increases, people tend to under-report their intake more. The caption mentions that the effects seen in the whole sample were 'severely attenuated' in the screened sample. This means that the relationship between macronutrient intake and misreporting became weaker or disappeared when looking only at the more reliable data. This suggests that the relationship observed in the whole sample might be driven by inaccurate reporting.
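Illustration: The attenuation described above can be reproduced with a toy simulation. Everything here is an assumption made for illustration: the data are synthetic, the under-reporting fraction is drawn uniformly, the reported protein share is made to rise with under-reporting, and the tolerance band is a fixed ±2 MJ/day. The sketch fits a least-squares slope to the full sample and again after screening.

```python
import random

def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the sketch is repeatable
protein_pct, discrepancy = [], []
for _ in range(5000):
    under = rng.uniform(0.0, 0.6)  # toy fraction of intake left unreported
    # Assumption: reported protein share rises with under-reporting.
    protein_pct.append(15.0 + 20.0 * under + rng.gauss(0.0, 2.0))
    # Discrepancy = reported EI minus predicted TEE (MJ/day), plus noise.
    discrepancy.append(-10.0 * under + rng.gauss(0.0, 1.0))

slope_full = ols_slope(protein_pct, discrepancy)

LIMIT = 2.0  # hypothetical half-width of the tolerance band, MJ/day
kept = [(p, d) for p, d in zip(protein_pct, discrepancy) if abs(d) <= LIMIT]
slope_screened = ols_slope([p for p, _ in kept], [d for _, d in kept])

print(f"full sample slope:     {slope_full:.3f} MJ/day per % protein")
print(f"screened sample slope: {slope_screened:.3f} MJ/day per % protein")
```

In this toy setup the slope shrinks after screening because the association was generated entirely by misreporting, which mirrors the interpretation given above without proving it for the real surveys.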

Scientific Validity

Rationale for Screening: The rationale for screening the data to include only individuals within the predictive interval of the equation is sound. This helps to reduce the influence of extreme misreporting and provides a more accurate picture of the relationship between macronutrient intake and the discrepancy between predicted and reported energy intake in individuals with more plausible reports.
Comparison of Full and Screened Samples: Comparing the results from the full and screened samples is crucial for assessing the robustness of the findings. The observation that significant effects in the whole sample are attenuated in the screened sample highlights the importance of addressing misreporting in dietary studies.
Statistical Analysis: The caption refers to Table 3 for regression details, indicating that the relationships depicted in the figure are based on statistical modeling. The validity of the conclusions drawn from the figure depends on the appropriateness and rigor of the statistical analysis performed in Table 3.
Causality: It is important to note that the figure and the associated analyses demonstrate correlations but do not prove causation. While the results suggest that macronutrient composition may be related to misreporting, it is not possible to determine from these data whether dietary composition influences misreporting or vice-versa. Other factors may contribute to both.

Communication

Clarity of Axes Labels: The axes labels are generally clear and informative. The y-axis label could be slightly improved by specifying the units (MJ d⁻¹) for the discrepancy between predicted TEE and reported energy intake.
Legend: The figure lacks a legend to differentiate between the NHANES and NDNS datasets. Adding a legend would improve the clarity of the graphs.
Caption Clarity: The caption is relatively clear and provides a good overview of the figure's content. It could be improved by briefly explaining the rationale for screening the data.
Visual Clutter: The use of multiple graphs with different datasets and data treatments (full vs. screened) creates some visual clutter. Using a more minimalist color scheme or further separating the graphs could improve readability.
Trend Line Description: The caption should mention that the red lines are trend lines, and the method used to generate them should be specified in the methods section. The specific type of trendline used (e.g. linear, loess) could impact the visual interpretation of the data.

Discussion

Key Aspects

Impact of Repeated Recalls on Survey Validity: The Discussion section addresses the potential impact of repeated dietary recalls on survey validity. It explores whether the problem with misreported intakes is due to undereating rather than under-reporting. The authors suggest that if undereating is the primary issue, repeated surveys should alleviate the problem unless the act of being surveyed itself causes undereating. They analyze data from the NDNS and NHANES surveys, where some participants completed multiple surveys, to determine if reporting accuracy declined with an increasing number of surveys, indicating reporting fatigue. The analysis found that the accuracy of reporting did not decline with the number of surveys, suggesting that there was no survey fatigue. Furthermore, averaging intakes across multiple days did not improve the percentage of intakes that fell within the predicted range, indicating that the general problem is not undereating but consistent under-reporting across days.
Predictive Equation Performance and Additional Variables: The section discusses the performance of the predictive equation based on general linear modeling, which explained over 69% of the variation in Total Energy Expenditure (TEE). This is acknowledged as being lower than the 83% explained by an equation based on fat-free mass (FFM), fat mass, and age derived from the same dataset but restricted to adults. The authors explain that the significant effects of additional variables beyond body weight, such as height, sex, and self-reported ethnicity, are likely due to these traits impacting FFM as a component of body weight. For instance, females of a given height and weight tend to have greater fat mass and lower FFM than males. Consequently, when body weight is used as a predictor, sex becomes a significant term, whereas when FFM is used, sex is no longer significant. The effects of elevation and its interactions with other variables are also discussed, with the suggestion that they may be related to trends in FFM with elevation. The authors clarify that the elevation effect is unlikely due to declining ambient temperature, as no such effect was found in a subset of the data restricted to the USA.
Rationale for Not Using FFM and FM in the Equation: The Discussion section explains the rationale for not using an equation based on FFM and FM as the basis for screening, despite it explaining more of the variation in TEE. The authors argue that the accuracy of the estimates of FFM and FM is a limiting factor. The previously derived equations used percentage FFM and FM from isotope dilution estimates of body water, which derive from the Doubly Labeled Water (DLW) method. While performing isotope dilution on all survey participants in large surveys would be challenging and costly, alternative approaches to measuring FFM in survey settings are less accurate. Thus, the extra predictability of TEE afforded by FFM and FM estimates is negated by the reduced accuracy of cheaper assessment methods. This is a practical consideration, as large-scale dietary surveys often rely on more readily available and cost-effective measurements.
Consideration of Physical Activity Levels: The authors address how the equations account for different levels of physical activity. They explain that the modified Goldberg approach uses different values of Physical Activity Level (PAL), which is the ratio of TEE to Basal Energy Expenditure (BEE). However, they point out the problem of equating a PAL value to a level of physical activity and the inaccuracies in self-reported activity levels. In the current approach, the equation was derived from a large sample of individuals with diverse PALs, whose activity is already reflected in their measured TEE. By predicting TEE directly, the model automatically accounts for the diverse effects of factors like age, sex, and ethnicity on PAL and TEE. The 95% prediction limits are designed to cover the vast majority of individuals, except for groups with particularly active lifestyles, such as athletes or those in physically demanding occupations. The authors acknowledge that the equations significantly underestimate the expenditure of such groups.
Detection of Under- or Over-Reporting and Undereating: The Discussion section examines the detection of under- or over-reporting and undereating. The authors note that there was very little change in the level of undereating/under-reporting with age, which contrasts with previous findings that identified age as a strong factor in under-reporting energy intake. They highlight that in their study, there was no increase in under-reporting in individuals aged over 70 years, where memory functions might impair recall fidelity. In contrast, for young children, whose intake diaries are generally completed by an adult, the agreement between expectation from the equation and the estimates from the survey report was much better. This suggests that the accuracy of reporting is higher when adults are responsible for recording the dietary intake of children.
Relationship Between Misreporting and Macronutrient Composition: The authors discuss the relationship between misreporting and macronutrient composition, addressing the claim that dietary survey instruments are designed to assess food types consumed, not total energy intake. They argue that if misreporting were unbiased, the discrepancy between survey intake and DLW prediction would be unrelated to macronutrient composition. However, their analysis reveals that the level of reporting was strongly related to reported protein intake, with smaller effects in the opposite direction for fat and weaker, contrasting effects for carbohydrates. This indicates that individuals who under-reported their intake tended to report an elevated percentage of protein and a lower percentage of fat. The authors emphasize that this relationship between misreporting and macronutrient composition is consistent with previous work and highlights the potential for misinterpreting associations between dietary survey reports of macronutrient intake and BMI.
Limitations and Assumptions: The Discussion section acknowledges several limitations and assumptions of the study. The authors note that they used estimates of TEE derived from the DLW method to infer food energy intake, which involves assumptions about the Respiratory Quotient (RQ). While deviations from the assumed RQ value due to dietary composition might complicate comparisons, they argue that the impact on detecting misreporting in relation to fat intake would be marginal. Another assumption is that individuals are in energy balance over the measurement period. The authors consider it likely that the individuals in the sample used to derive the equation were in energy balance, but this may not be the case for individuals in dietary surveys. They also acknowledge that the predictive model explained only 69% of the variation in TEE, and the resultant absolute error in predicted values averaged 11.2%. They note that using the 95% PI to define implausible records means that about 5% of valid records will be erroneously identified as implausible. The authors suggest that future improvements to the prediction may involve integrating independent measures of physical activity, such as accelerometry.
Implications and Future Directions: The final part of the Discussion section outlines the implications of the study and suggests future directions. The authors emphasize the importance of accurately measuring what people eat for understanding the consequences of food intake for health, food security, and quantifying food waste. They acknowledge that the main tools currently used for dietary assessment are over 50 years old, depend on self-report, and are widely acknowledged to provide inaccurate information. While tools to identify misreported data already exist, this study developed an enhanced approach to identify potentially erroneous and implausible reports. The authors concede that the tool is not perfect and will misidentify about 5% of reports as wrong when they are correct. However, they argue that it improves on previous approaches. Applying the tool to two large surveys suggested that more than 50% of the dietary reports had implausible energy intakes and probably erroneous intake of macro- and micronutrients. The authors suggest that the main benefit of this tool is that it may highlight the true level of dietary misreporting and drive innovation towards approaches that do not rely so much (or at all) on self-report.
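Illustration: The "about 5%" misidentification rate mentioned above follows from the definition of a 95% prediction interval. The sketch below assumes normally distributed prediction errors, which the source does not state, and uses the conventional ±1.96 standard deviation cut-off. The survey size of 1,000 accurate reports is a made-up figure.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def share_outside(z_limit=1.96):
    """Share of a normal distribution lying outside +/- z_limit standard deviations."""
    return 2.0 * (1.0 - normal_cdf(z_limit))

# Share of accurate reports expected to fall outside a 95% interval.
false_flag_rate = share_outside()

# Hypothetical survey with 1,000 accurate reports.
expected_false_flags = false_flag_rate * 1000

print(f"share of accurate reports flagged: {false_flag_rate:.4f}")
print(f"expected false flags per 1,000 accurate reports: {expected_false_flags:.0f}")
```

Under the normality assumption, roughly one accurate report in twenty would be flagged as implausible, split about evenly between the lower and upper limits.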

Strengths

Comprehensive Discussion of Findings
The Discussion section provides a thorough and comprehensive discussion of the study's findings, effectively placing them within the context of existing literature and addressing potential implications.

"The relationship between misreporting and macronutrient composition is consistent with previous work" (Page 65)
Clear Explanation of Methodological Choices
The authors provide clear and well-reasoned explanations for their methodological choices, such as the decision not to use FFM and FM in the predictive equation despite their higher explanatory power for TEE variation.

"Thus, the extra predictability of TEE afforded by having estimates of FFM and FM is negated by the reduced accuracy of cheap FFM and FM assessments." (Page 64)
Thorough Consideration of Limitations
The Discussion section demonstrates a thorough consideration of the study's limitations, including the assumptions made and potential sources of error, which enhances the transparency and credibility of the research.

"We used estimates of TEE derived from the DLW method to infer food energy intake." (Page 65)

Suggestions for Improvement

Expand on the Implications of Findings for Dietary Guidelines
This high-impact improvement would significantly enhance the paper's relevance to public health and policy. The Discussion section needs this expansion to fully explore the practical consequences of the study's findings for the development and implementation of dietary guidelines. Elaborating on how the identified biases in self-reported dietary intake, particularly the underreporting of fat and overreporting of protein, could lead to flawed dietary recommendations and hinder efforts to address diet-related health issues would strengthen the paper by highlighting the real-world implications of the research. This would also underscore the importance of accurate dietary assessment in shaping effective public health interventions. Ultimately, expanding on the implications of the findings for dietary guidelines would significantly improve the study's impact by emphasizing the need for evidence-based recommendations that account for the limitations of self-reported data.

"Accurately measuring what people eat is essential for understanding the consequences of components of food intake for health." (Page 65)

Implementation: Include a paragraph that specifically addresses the potential impact of the study's findings on the development and implementation of dietary guidelines. For example: "The systematic biases observed in self-reported dietary intake, particularly the underreporting of fat and overreporting of protein, have significant implications for the development of dietary guidelines. If these guidelines are based on flawed data, they may inadvertently promote dietary patterns that do not accurately reflect actual consumption and could potentially exacerbate diet-related health issues. The findings of this study underscore the need for caution when interpreting self-reported dietary data and highlight the importance of developing more accurate methods for assessing dietary intake to inform evidence-based dietary recommendations."
Discuss Potential Strategies to Mitigate Misreporting
This medium-impact improvement would enhance the paper's practical utility and contribute to the development of more effective strategies for collecting accurate dietary data. The Discussion section needs this exploration of potential solutions to address the pervasive issue of misreporting identified in the study. Providing a more detailed discussion of possible strategies to mitigate misreporting, such as incorporating objective biomarkers, using technology-assisted methods, or implementing more rigorous data cleaning and validation techniques, would strengthen the paper by offering concrete steps towards improving the quality of dietary data. This would also help researchers and practitioners in the field to better address the challenges of misreporting in future studies. Ultimately, discussing potential strategies to mitigate misreporting would improve the study's contribution to the field by providing actionable insights for enhancing the accuracy and reliability of dietary assessment.

"In this study, we developed an enhanced approach to identify potentially erroneous and implausible reports." (Page 65)

Implementation: Include a section that explores potential strategies to mitigate misreporting in dietary studies. For example: "Several strategies could be employed to mitigate the impact of misreporting on dietary data. One approach is to incorporate objective biomarkers of nutrient intake, such as urinary nitrogen for protein intake or doubly labeled water for energy expenditure, to validate self-reported data. Another possibility is to leverage technology-assisted methods, such as mobile apps with image recognition capabilities, to improve the accuracy of portion size estimation and reduce reliance on memory. Additionally, implementing more rigorous data cleaning and validation techniques, such as identifying and excluding implausible energy intake values based on predicted TEE, could help to improve the quality of dietary data."
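
The screening step named in the example above (flagging implausible energy intake values against predicted TEE) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `PredictiveInterval` type, the `classify_report` function, and every numeric value are hypothetical. In practice the limits would come from the paper's predictive equation.

```python
from dataclasses import dataclass


@dataclass
class PredictiveInterval:
    """Hypothetical container for the 95% predictive limits of TEE (MJ/day)."""
    lower: float
    upper: float


def classify_report(reported_intake: float, interval: PredictiveInterval) -> str:
    """Label a self-reported energy intake relative to the predictive limits."""
    if reported_intake < interval.lower:
        return "under-reported"
    if reported_intake > interval.upper:
        return "over-reported"
    return "plausible"


# Illustrative values only; they are not taken from the paper.
limits = PredictiveInterval(lower=7.0, upper=13.0)
for intake in (5.2, 9.8, 14.1):
    print(intake, classify_report(intake, limits))
```

Returning a label rather than dropping the record keeps the choice between excluding and adjusting for misreporting open to the analyst.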
Address the Generalizability of the Findings to Other Populations
This medium-impact improvement would enhance the paper's external validity and provide a more nuanced understanding of the study's applicability to diverse populations. The Discussion section needs this consideration of generalizability to acknowledge potential limitations and inform future research directions. Briefly discussing how the findings might vary across different populations, such as those with different cultural backgrounds, socioeconomic statuses, or health conditions, would strengthen the paper by providing a more comprehensive assessment of the study's scope. This would also help readers to better understand the potential limitations of applying the findings to populations that differ significantly from those included in the study. Ultimately, addressing the generalizability of the findings to other populations would improve the study's scientific contribution by providing a more nuanced and context-specific interpretation of the results.

"Applying the tool to two large surveys suggested that more than 50% of the dietary reports had implausible energy intakes" (Page 65)

Implementation: Include a paragraph that discusses the potential generalizability of the findings to other populations. For example: "While this study provides valuable insights into the patterns and correlates of misreporting in dietary data, it is important to consider the potential limitations of generalizing these findings to other populations. The study sample was primarily drawn from the NDNS and NHANES datasets, which may not be fully representative of other populations with different cultural backgrounds, socioeconomic statuses, or health conditions. Future research should investigate the extent to which these findings apply to diverse populations and explore potential factors that may influence the patterns of misreporting across different groups."

Non-Text Elements

Table 4 | Relationships between macronutrient intake and BMI in both datasets

Figure/Table Image (Page 8)

First Reference in Text

The gradient and R² values of the relationship between BMI and protein were both strongly reduced (Fig. 3 and Table 4), while the negative gradient for the relationship between BMI and carbohydrates became more negative and the R² value approximately doubled.

Description

Purpose of the Table: This table examines how the amount of fat, protein, and carbohydrates people eat relates to their Body Mass Index (BMI), which is a measure of body weight relative to height. It's like trying to see if there's a connection between diet composition and body weight.
Structure of the Table: The table is divided into two main sections: 'Whole data' and 'Within 95% PI'. 'Whole data' refers to the analysis using all the data collected in the NHANES and NDNS surveys. 'Within 95% PI' refers to the analysis using only the data from individuals whose reported energy intake fell within the 95% predictive interval of the researchers' equation, meaning their reported intake was consistent with the energy expenditure the equation predicted for them. This is a way of focusing on more reliable data. Each section shows the results of a statistical analysis, likely a regression analysis, that examines the relationship between BMI and the percentage of each macronutrient in the diet.
Key Terms Explained: 'Gradient' here likely refers to the slope of the line in a regression analysis, which shows how much BMI is expected to change for a one-unit change in the percentage of each macronutrient. A positive gradient means that as the percentage of the macronutrient increases, BMI tends to increase as well. A negative gradient means the opposite. 'R²' (R-squared) is a statistical measure that represents the proportion of the variation in BMI that is explained by the macronutrient intake. It's a measure of how well the statistical model fits the data, with values closer to 1 indicating a better fit. 'P' is the p-value, which is the probability of observing the relationship between BMI and macronutrient intake (or a stronger one) if there was actually no real relationship. A low p-value (typically less than 0.05) suggests that the relationship is statistically significant.
Macronutrient Focus: The table focuses on three macronutrients: fat, carbohydrates, and protein. These are the main components of food that provide energy. The table shows how the relationship between each of these macronutrients and BMI changes when looking at all the data versus only the more reliable data (within 95% PI).
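
The comparison described above (gradient and R² for the whole sample versus the screened sample) can be illustrated with a short Python sketch. It assumes simple least-squares regression, which the table does not explicitly confirm. The `simple_ols` helper and the toy records are hypothetical and exist only to show the mechanics; they do not reproduce any value in Table 4.

```python
import math


def simple_ols(x, y):
    """Ordinary least squares of y on x: gradient, its standard error, and R²."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    ss_res = sum((yi - (intercept + gradient * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "gradient": gradient,  # change in BMI per 1% of energy from the nutrient
        "se": math.sqrt(ss_res / (n - 2) / sxx),
        "r2": 1 - ss_res / syy,
    }


# Made-up records: (protein % of energy, BMI, reported intake within the 95% PI?)
records = [
    (12.0, 22.0, True), (14.0, 23.5, True), (16.0, 24.0, True),
    (18.0, 23.0, True), (20.0, 25.0, True),
    (15.0, 27.0, False), (19.0, 31.0, False),
    (22.0, 34.0, False), (24.0, 36.0, False),
]

whole = simple_ols([r[0] for r in records], [r[1] for r in records])
kept = [r for r in records if r[2]]
screened = simple_ols([r[0] for r in kept], [r[1] for r in kept])

print("whole data:   ", whole)
print("within 95% PI:", screened)
```

In this toy set the gradient is steeper before screening than after, which is the kind of shift the table is designed to expose.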

Scientific Validity

Appropriateness of Statistical Analysis: The use of regression analysis (presumably linear regression, although this is not explicitly stated in the table) is appropriate for examining the relationship between macronutrient intake and BMI. The inclusion of both the whole dataset and the data within the 95% PI allows for a comparison of the results with and without potentially unreliable data points.
Interpretation of Gradient and R²: The reference text highlights the changes in gradient and R² values for protein and carbohydrates between the whole and screened datasets. This comparison is crucial for understanding the impact of misreporting on the observed relationships. However, the authors should provide a more comprehensive discussion of the results for all three macronutrients in both datasets, including the direction and magnitude of the gradients and the goodness-of-fit (R²) values.
Consideration of Confounding Factors: While the table presents the results of a statistical analysis that likely adjusts for other variables, there may be other confounding factors that influence both macronutrient intake and BMI. These potential confounders should be acknowledged and discussed in the main text. For example, socioeconomic status, physical activity levels, and genetic factors could all play a role.

Communication

Clarity of Column Headers: The column headers are generally clear. However, 'Gradient' could be more explicitly labeled as 'Regression Coefficient' or 'Slope' for better clarity. Also, adding a column for the standard error of the gradient would be informative.
Caption Clarity: The caption is concise but could be more informative. It could be revised to: 'Table 4 | Results of regression analyses examining the relationships between macronutrient intake (percentage of total energy) and BMI in the NHANES and NDNS datasets, using both the whole data and data within the 95% predictive interval (95% PI).'
Table Organization: The organization of the table into two sections (whole data and within 95% PI) is logical and facilitates comparison. However, the table could be made more visually appealing and easier to read by using bold font or shading to separate the different sections and macronutrients.
Units: The table would benefit from including the units for the gradient (e.g., change in BMI per 1% change in macronutrient intake).
Explanation of 95% PI: While the concept of the 95% PI has been introduced earlier in the paper, it would be helpful to briefly reiterate its meaning in the context of this table, either in the caption or in a footnote.

Fig. 3 | Relationships between the reported dietary intakes of macronutrients...

Full Caption

Fig. 3 | Relationships between the reported dietary intakes of macronutrients and BMI. a-f, Relationships between BMI and the intakes of fat (a,b), protein (c,d) and carbohydrate (e,f) for the NHANES and NDNS surveys. Panels a, c and e show the data for the whole sample and panels b, d and f show the data for those individuals whose total energy intake was within the predictive interval (that is, excluding under- and over-reporters).

Figure/Table Image (Page 7)

First Reference in Text

As there is a systematic trend between macronutrient intake and the extent of under-reporting and because under-reporting is related to BMI, there was a strong positive relationship between the reported dietary intakes of protein and BMI in both surveys (Fig. 3 and Table 4).

Description

Purpose of the Figure: This figure shows the relationship between what people say they eat (specifically fat, protein, and carbohydrates) and their Body Mass Index (BMI), which is a measure of body weight relative to height. It's like looking for a connection between diet and weight, but focusing on what people *report* eating, which might not always be accurate.
Structure of the Figure: The figure is divided into six panels (a-f), each of which is a scatter plot. Each dot on a plot represents one person in the study. The plots are grouped in pairs for each of the three macronutrients: fat (a, b), protein (c, d), and carbohydrates (e, f). For each macronutrient, one plot (a, c, e) shows the data for the whole sample from the NHANES and NDNS surveys, and the other plot (b, d, f) shows the data only for individuals whose reported total energy intake fell within the predictive interval of the researchers' equation. These individuals are considered more likely to be reporting their diet accurately.
X and Y Axes: In all the plots, the x-axis (horizontal) represents the percentage of total energy that comes from a specific macronutrient (fat, protein, or carbohydrates) in a person's reported diet. The y-axis (vertical) represents the person's BMI. The red line on each plot is a trend line, which is a line drawn through the middle of the dots to show the general pattern of the relationship between the two variables. If the line slopes upwards, it suggests that people with a higher percentage of that macronutrient in their diet tend to have a higher BMI. If it slopes downwards, it suggests the opposite.
Interpretation of the Plots: By comparing the plots for the whole sample and the screened sample (those within the predictive interval), we can see how the relationship between reported macronutrient intake and BMI changes when potentially inaccurate data is removed. For example, if the trend line is steeper in the whole sample than in the screened sample, it suggests that the relationship might be exaggerated by misreporting in the whole sample. The reference text highlights that there was a strong positive relationship between reported protein intake and BMI, meaning that people who reported eating a higher proportion of protein tended to have a higher BMI. However, this relationship was affected when looking at only the more reliable data, as mentioned in the discussion of Table 4.

Scientific Validity

Visualization of Relationship: Using scatter plots with trend lines is an appropriate way to visualize the relationship between reported macronutrient intake and BMI. This allows for a visual assessment of the direction and strength of the relationship in both the whole sample and the screened sample.
Comparison of Whole and Screened Samples: Comparing the plots for the whole sample and the screened sample is crucial for understanding the potential impact of misreporting on the observed relationships. The differences between these plots highlight the importance of considering data quality when interpreting dietary data.
Statistical Analysis: The reference text and the caption imply that the relationships depicted in the figure are based on statistical analyses, likely regression analyses as presented in Table 4. The validity of the conclusions drawn from the figure depends on the appropriateness and rigor of these underlying analyses. It's important that the authors have adequately controlled for potential confounding factors in their analyses.
Causality: It is important to remember that these plots show correlations and do not prove causation. Even if a strong relationship is observed between reported macronutrient intake and BMI, it does not necessarily mean that one causes the other. There could be other factors that influence both dietary reporting and BMI.

Communication

Clarity of Axes Labels: The axes labels are generally clear and informative. The x-axis label could be slightly improved by specifying that it represents the 'Percentage of total energy from macronutrient' to be consistent with previous tables.
Legend: The figure lacks a legend to differentiate between the NHANES and NDNS datasets. Adding a legend with different colors or symbols for each dataset would significantly improve the clarity and interpretability of the graphs.
Caption Clarity: The caption is relatively clear and provides a good overview of the figure's content. It could be improved by briefly explaining the rationale for screening the data and by explicitly stating that the red lines represent trend lines.
Visual Clutter: The use of six panels with different datasets and data treatments (whole vs. screened) creates some visual clutter. Using a more minimalist color scheme or further separating the graphs could improve readability.
Trend Line Description: The caption should mention that the red lines are trend lines, and the method used to generate them (e.g., linear regression, LOESS smoothing) should be specified in the methods section. The type of trendline used could impact the interpretation of the relationships depicted.

Methods

Key Aspects

Development of Predictive Algorithm: The researchers developed a predictive algorithm for Total Energy Expenditure (TEE) using data from the International Atomic Energy Agency Doubly Labeled Water (DLW) Database. This database contains 7,646 measurements of TEE from individuals across 32 countries, compiled from 128 published and unpublished studies. The researchers focused on data from individuals not engaged in dietary or exercise interventions, so that the data reflected typical energy expenditure patterns. Individuals with specific diseases or those engaged in extreme physical activities were screened out to maintain the representativeness of the data. All data in the database were recalculated using a common equation, which had been validated against chamber calorimetry, a highly accurate method for measuring energy expenditure. The resulting estimates of CO2 production were then converted into energy expenditure using the modified Weir equation, a standard formula in the field. To derive a predictive equation applicable across a wide age range, the researchers initially analyzed data from all age groups but found high residual error, particularly among younger participants. Consequently, they restricted the final analysis to individuals aged 4 years and older, resulting in 6,497 measurements for this age group.
Data Inclusion and Exclusion Criteria: The researchers established specific criteria for including and excluding data in their analysis. They included measurements from individuals who were not part of dietary or exercise intervention studies. They excluded individuals with specific diseases, such as type 2 diabetes or cancer, as these conditions could affect energy expenditure. They also excluded data from individuals engaged in unusually high levels of physical activity, such as professional athletes or participants in extreme endurance events like the Race Across America. Pregnant and lactating females were also excluded due to their altered energy requirements. The researchers did not exclude data from hunter-gatherer or subsistence agriculture populations, as evidence suggests their energy expenditures, when normalized for body weight, do not significantly differ from those of westernized populations. These criteria aimed to ensure that the data used to develop the predictive equation reflected the energy expenditure patterns of a broad, healthy population.
Variables and Statistical Modeling: The researchers incorporated several variables into their predictive model, including body mass, height, self-identified sex, age, and self-reported ethnicity. They also included the elevation of the study location but excluded ambient temperature, as previous analysis indicated it was not a significant predictor of TEE, at least for data from the USA. They used a generalized linear model to analyze the data, incorporating interaction terms up to three-way interactions. They refined the model by iteratively deleting non-significant terms, starting with the three-way interactions and then the two-way interactions. Because the relationship between body mass and TEE follows a power law, they log-transformed TEE and body weight before analysis. They also log-transformed other variables, such as elevation, that were not normally distributed. They included both age and age squared as predictors due to the curvilinear relationship between normalized TEE and age. The researchers addressed missing data by eliminating 70 sets of data with incomplete predictor information, all of which were missing the elevation of the measurement site. They also acknowledged the potential impact of missing self-reported ethnicity data and provided a default value for cases where this information was unavailable.
Model Validation and Sensitivity Analysis: To validate their predictive model, the researchers randomly divided the dataset into an analysis set (90% of the data) and a validation set (10% of the data). They compared the predicted TEE with the observed TEE in the validation set and found a strong correspondence, with 94.6% of observations falling within the 95% predictive intervals. The average absolute deviation between the predicted and observed TEE in the validation set was 11.2%. They also conducted a sensitivity analysis to assess the impact of missing data on the model's predictions. Specifically, they examined the effect of not having information on elevation and self-reported ethnicity. They found that using a 'dummy' elevation of 100 meters resulted in a 2.3% absolute error in the predicted TEE. The impact of not knowing ethnicity varied depending on the specific ethnic group, but the errors were generally small relative to the predictive interval. These findings suggest that while having complete predictor data is preferable, the model can still provide reasonable predictions even when some information is missing.
Machine Learning Approaches: In addition to classical general linear regression, the researchers explored three machine learning approaches: Random Forest, XGBoost, and Support Vector Regression. They used the same predictor variables in all three methods. Random Forest builds multiple decision trees and combines their predictions to improve accuracy and reduce overfitting. XGBoost is an optimized gradient-boosting algorithm known for its efficiency and performance. Support Vector Regression (SVR) fits a function that tolerates small deviations within a margin around the prediction and penalizes only larger errors. The researchers compared the performance of these machine learning models with the classical regression model by examining the correlation between predicted and observed TEE and calculating the summed deviations. They found that all three machine learning approaches performed similarly to the classical regression model, with an average absolute percentage error of around 11.5%. The deviations between predicted and actual data were highly correlated across all methods, suggesting that they were extracting similar predictive information from the data. The researchers concluded that additional predictor variables would likely be needed to further improve the model's accuracy.
Application to Survey Data: The researchers applied their predictive equation to two large dietary surveys: the UK National Diet and Nutrition Survey (NDNS) and the US National Health and Nutrition Examination Survey (NHANES). They used data from years 1-11 (2008-2019) of the NDNS and the 2017-2018 cycle of NHANES. For the NDNS, they analyzed data from 12,694 individuals aged 4 and over, while for NHANES, they analyzed data from 5,873 individuals in the same age range. They compared the reported energy intake from these surveys with the predicted TEE based on their equation. They also assessed the impact of using repeated dietary recalls in the NDNS to determine if there was an improvement in reporting accuracy with greater familiarity with the survey protocol. They calculated the percentage of individuals whose reported energy intake fell within the predicted range based on the 95% predictive intervals of the equation. They further analyzed the data by stratifying it according to age (children versus adults) and sex.
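
The modelling and validation steps summarized above can be sketched in Python on synthetic data. This is a deliberately reduced illustration, not the authors' model: it uses body mass as the only predictor, whereas the published equation also includes height, sex, age, age squared, ethnicity, and elevation. The coefficients used to generate the data are arbitrary assumptions, the function names are hypothetical, and the predictive limits use a large-sample normal quantile in place of a t quantile.

```python
import math
import random
from statistics import NormalDist


def fit_loglog(mass_kg, tee_mj):
    """Least-squares fit of ln(TEE) = a + b * ln(mass)."""
    x = [math.log(m) for m in mass_kg]
    y = [math.log(t) for t in tee_mj]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return {"a": a, "b": b, "n": n, "mean_x": mean_x, "sxx": sxx,
            "s": math.sqrt(ss_res / (n - 2))}


def predict_with_limits(model, mass_kg, level=0.95):
    """Back-transformed prediction with approximate predictive limits."""
    x0 = math.log(mass_kg)
    centre = model["a"] + model["b"] * x0
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation
    half = z * model["s"] * math.sqrt(
        1 + 1 / model["n"] + (x0 - model["mean_x"]) ** 2 / model["sxx"])
    return math.exp(centre - half), math.exp(centre), math.exp(centre + half)


# Synthetic power-law data; the constants -0.1, 0.6 and 0.12 are arbitrary.
rng = random.Random(0)
mass = [rng.uniform(20.0, 120.0) for _ in range(1000)]
tee = [math.exp(-0.1 + 0.6 * math.log(m) + rng.gauss(0.0, 0.12)) for m in mass]

# The samples are independent draws, so the first 90% acts as a random analysis set.
cut = 900
model = fit_loglog(mass[:cut], tee[:cut])

inside = 0
abs_pct_errors = []
for m, observed in zip(mass[cut:], tee[cut:]):
    lower, predicted, upper = predict_with_limits(model, m)
    inside += lower <= observed <= upper
    abs_pct_errors.append(abs(predicted - observed) / observed * 100)

coverage = inside / (len(mass) - cut)
mean_abs_pct_error = sum(abs_pct_errors) / len(abs_pct_errors)
print(f"exponent b = {model['b']:.3f}")
print(f"validation coverage = {coverage:.1%}")
print(f"mean absolute % error = {mean_abs_pct_error:.1f}%")
```

Fitting on the log scale and exponentiating the limits gives an interval that is asymmetric on the original scale, which is the expected behaviour for a power-law relationship.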

Strengths

Comprehensive Data and Methodology
The Methods section describes a robust and comprehensive approach to developing a predictive equation for TEE, utilizing a large and diverse dataset from the IAEA DLW Database and employing rigorous statistical methods.

"This included data derived from DLW studies in 32 countries with 7,646 male and female participants" (Page 65)
Clear Inclusion and Exclusion Criteria
The researchers clearly outline the criteria for including and excluding data, ensuring the sample's representativeness and minimizing potential confounding factors related to disease, extreme physical activity, and pregnancy.

"The component studies have generally screened out people who have specific diseases, such as type 2 diabetes or cancer" (Page 65)
Thorough Validation and Sensitivity Analysis
The study includes a thorough validation of the predictive equation using a separate dataset and a sensitivity analysis to assess the impact of missing data, enhancing the model's credibility and applicability.

"We compared the predicted TEE with the observed TEE for the randomly selected 598 data in the validation dataset" (Page 66)

Suggestions for Improvement

Provide More Detail on Data Handling for Machine Learning
This medium-impact improvement would enhance the transparency and reproducibility of the machine learning analyses. The Methods section needs a more detailed description of how the data were prepared and handled specifically for the Random Forest, XGBoost, and Support Vector Regression models. Elaborating on data preprocessing steps, such as handling of missing values, scaling, and feature engineering, would strengthen the paper by allowing other researchers to better understand and potentially replicate these analyses. This would also help readers assess the robustness of the machine learning models and their comparability to the classical regression approach. Ultimately, providing more detail on data handling for the machine learning models would improve the study's methodological rigor by ensuring a more complete and transparent description of these advanced analytical techniques.

"We used three different machine learning approaches to analyse the data using the same predictor variables: Random Forest, XGBoost and Support Vector Regression." (Page 67)

Implementation: Include a subsection dedicated to the machine learning approaches that details the data preprocessing steps, such as how missing values were handled (e.g., imputation), whether and how features were scaled or transformed, and any feature engineering techniques employed. For example: "For the machine learning models, missing values for elevation were imputed using the median elevation from the dataset. All predictor variables were standardized to have a mean of 0 and a standard deviation of 1 to ensure that features with larger values did not disproportionately influence the models. No further feature engineering was performed."
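
The preprocessing steps named in the example wording above (median imputation followed by standardization) can be sketched in Python as follows. The paper does not state that the authors used these steps; the helper names and elevation values are hypothetical and serve only to show what such a description would correspond to in practice.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Illustrative elevations in metres; None marks a missing measurement site.
elevation_m = [100.0, None, 300.0, 1500.0, None, 50.0]

filled = impute_median(elevation_m)
scaled = standardize(filled)
print(filled)
print([round(v, 3) for v in scaled])
```

Scaling matters mainly for Support Vector Regression, which is sensitive to feature magnitudes; tree-based methods such as Random Forest and XGBoost are largely unaffected by monotonic rescaling. In a real analysis the median, mean, and standard deviation would be computed on the analysis set only and then applied to the validation set, to avoid leaking information across the split.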
Clarify the Rationale for Excluding Younger Children
This medium-impact improvement would provide a more complete understanding of the study's scope and limitations. The Methods section needs a clearer explanation for the decision to exclude children under 4 years of age from the final analysis. Providing a more detailed rationale, including the specific challenges or limitations associated with modeling TEE in this age group, would strengthen the paper by justifying this methodological choice and helping readers understand the applicability of the predictive equation. This would also highlight potential areas for future research focused on developing accurate TEE prediction models for younger children. Ultimately, clarifying the rationale for excluding younger children would improve the study's transparency and help readers better interpret the findings within the context of the defined study population.

"An initial analysis suggested that deriving a common equation that covered all age classes had a high level of residual error." (Page 66)

Implementation: Add a few sentences that explain the specific reasons for excluding children under 4 years of age. For example: "Children under 4 years of age were excluded from the final analysis due to the higher residual error observed in this age group, which may be attributed to the rapid and non-linear changes in body composition and metabolic rate during early childhood. Additionally, the relationship between body weight and TEE may differ significantly in this age group compared to older children and adults, requiring a different modeling approach."
Elaborate on the Handling of 'Other' and Mixed Race Ethnicities
This medium-impact improvement would enhance the clarity and transparency of the study's approach to handling ethnicity data. The Methods section needs a more detailed explanation of how individuals who identified as 'other' or mixed race were handled in the analysis, beyond simply coding them as 'not available'. Providing a more thorough description of the rationale for this decision and discussing any potential implications for the model's generalizability would strengthen the paper by addressing potential concerns about the representation of diverse ethnic groups. This would also help readers better understand the limitations of the ethnicity variable and its interpretation in the context of the study's findings. Ultimately, elaborating on the handling of 'other' and mixed race ethnicities would improve the study's methodological rigor by ensuring a more complete and nuanced discussion of this important demographic variable.

"A small number of individuals identified as mixed race or 'other' (2.9%) and these were all coded as 'not available' as there were insufficient data to include different combinations separately." (Page 66)

Implementation: Expand the discussion on the handling of ethnicity data to include a more detailed explanation for the decision to code 'other' and mixed race individuals as 'not available'. For example: "Individuals who identified as 'other' or mixed race were coded as 'not available' due to the small sample sizes within these categories and the heterogeneity within the 'other' category, which precluded meaningful analysis. This decision was made to avoid potential misinterpretation of results based on small and potentially non-representative subgroups. However, we acknowledge the limitation of this approach and the need for future research to explore the relationship between ethnicity and TEE in more diverse populations with larger sample sizes within specific ethnic groups."
