Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way

Chenglong Wang, Bongshin Lee, Steven Drucker, Dan Marshall, Jianfeng Gao
arXiv: 2408.16119v2
Microsoft Research

Overall Summary

Study Background and Main Findings

This paper introduces Data Formulator 2 (Df2), an AI-powered visualization system designed to address the challenges of iterative authoring in exploratory data analysis. Existing AI tools often require users to provide complete, text-only descriptions of visualizations upfront, which is impractical when analytical goals evolve during exploration. Df2 tackles this limitation by blending a graphical user interface (GUI) with natural language (NL) input, allowing users to specify chart designs precisely while delegating data transformation tasks to the AI. The system also introduces "data threads," a mechanism for tracking the history of data transformations and visualizations, enabling users to easily revisit, revise, and branch from previous steps.

The core of Df2's methodology involves decoupling chart specification from data transformation. Users define their visualization intent through a combination of GUI interactions (e.g., drag-and-drop field mapping) and concise NL instructions. The system then generates a Vega-Lite specification (a high-level grammar for interactive graphics) and prompts a large language model (LLM) to produce Python code for the necessary data transformations. Df2 executes this code, handles potential errors, and instantiates the Vega-Lite specification with the transformed data to generate the visualization. Data threads provide a visual representation of the user's interaction history, facilitating navigation and reuse of previous results.
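The decoupling described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Df2's implementation: `SPEC`, `GENERATED_CODE`, `run_transform`, and `instantiate` are hypothetical names, the code string stands in for what an LLM might return, and the renewable-percentage formula (renewables divided by total electricity) is assumed rather than taken from the paper. The sample row uses the Australia 2000 values shown in the paper's Figure 2.

```python
# Chart intent: a Vega-Lite-style spec skeleton. The y field does not exist
# in the raw table yet, which is why a data transformation is needed.
SPEC = {
    "mark": "line",
    "encoding": {
        "x": {"field": "Year", "type": "ordinal"},
        "y": {"field": "Renewable Percentage", "type": "quantitative"},
        "color": {"field": "Entity", "type": "nominal"},
    },
}

RAW_ROWS = [{
    "Year": 2000,
    "Entity": "Australia",
    "Electricity from fossil fuels (TWh)": 181.05,
    "Electricity from nuclear (TWh)": 0,
    "Electricity from renewables (TWh)": 17.11,
}]

# Stand-in for LLM output (hypothetical); the formula is an assumption.
GENERATED_CODE = '''
def transform(rows):
    out = []
    for r in rows:
        renewables = r["Electricity from renewables (TWh)"]
        total = (r["Electricity from fossil fuels (TWh)"]
                 + r["Electricity from nuclear (TWh)"]
                 + renewables)
        out.append({"Year": r["Year"], "Entity": r["Entity"],
                    "Renewable Percentage": 100 * renewables / total})
    return out
'''


def run_transform(code, rows):
    """Run generated code; return (rows, None) or (None, error message)."""
    namespace = {}
    try:
        exec(code, namespace)  # a real system would sandbox this step
        return namespace["transform"](rows), None
    except Exception as err:
        return None, f"{type(err).__name__}: {err}"


def instantiate(spec, rows):
    """Attach transformed rows to the spec after checking encoded fields exist."""
    needed = {enc["field"] for enc in spec["encoding"].values()}
    missing = needed - set(rows[0]) if rows else needed
    if missing:
        raise ValueError(f"fields missing from data: {sorted(missing)}")
    return {**spec, "data": {"values": rows}}


rows, error = run_transform(GENERATED_CODE, RAW_ROWS)
chart = instantiate(SPEC, rows) if error is None else None
```

Returning the error message instead of raising leaves room for a repair step, such as sending the message back to the model. How Df2 actually handles errors is described in the paper's Method section.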

A user study with eight participants of varying expertise in data analysis and programming demonstrated Df2's effectiveness in supporting iterative visualization authoring. Participants completed a series of tasks involving the creation of 16 visualizations that required diverse data transformations. The study revealed distinct iteration styles: some users preferred broad exploration with multiple branches (wide trees), while others favored deeper, more linear progressions (deep trees). Participants also employed various prompting techniques, ranging from imperative commands to questions and chat-style interactions. The study highlighted the importance of Df2's transparency features, such as code explanations and data provenance tracking, in building user trust and facilitating verification of AI-generated outputs.

The discussion explores future directions for Df2, including integration with visualization recommendation systems and the development of agent-based systems for coordinating data transformation and chart editing. The authors acknowledge the limitations of the user study, particularly its focus on reproduction tasks and the lab setting, and propose future research involving open-ended exploration and longitudinal studies to investigate long-term user behavior and learning effects.

Research Impact and Future Directions

Data Formulator 2 (Df2) addresses a key limitation of existing AI-powered tools: weak support for iterative visualization authoring. Its blend of GUI and NL input, together with data threads, lets users carry out complex data transformations and explore alternative chart designs without restating their full intent at each step. The user study, while limited to eight participants, provides qualitative evidence for Df2's usability and its potential to change data analysis workflows. The system's transparency features, including code explanations and data provenance tracking, foster user trust and facilitate verification of AI-generated outputs.

However, the study's reliance on reproduction tasks and the lab setting constrains the generalizability of findings to real-world, open-ended exploration scenarios. Future research addressing these limitations, along with the proposed enhancements for recommendation systems and agent-based chart editing, will be crucial for realizing Df2's full potential. The core strength of this work lies in its robust, user-centered design and its potential to democratize access to sophisticated data visualization techniques by lowering the barrier to entry for users with varying levels of programming expertise. The integration of AI capabilities within an intuitive interface offers a promising pathway for more efficient, insightful, and accessible data exploration.

Critical Analysis and Recommendations

Clear Problem Articulation (written-content)
The abstract effectively establishes the context by highlighting the limitations of current AI visualization tools in handling iterative authoring, a crucial aspect of exploratory data analysis. This clear problem articulation sets the stage for the paper's contribution.
Section: Abstract
Concise Solution Introduction (written-content)
The abstract concisely introduces Df2 and its core functionalities, including the hybrid GUI/NL input and support for iteration history. This allows readers to quickly grasp the system's key innovations.
Section: Abstract
Precise Identification of Tool Limitations (written-content)
The Introduction effectively transitions from the general problem of iterative visualization to the specific limitations of existing tools, providing a strong rationale for Df2's design.
Section: Introduction
Clear Workflow Illustration (graphical-figure)
Figure 1 effectively communicates the core workflow of Df2, illustrating how users can iteratively refine visualizations using the data threads and chart builder. The visual representation clarifies the system's key functionalities.
Section: Introduction
Clear Articulation of Dual Design Strategy (written-content)
The Method section clearly articulates the dual design strategy of decoupling chart specification from data transformation and using data threads. This provides a solid foundation for understanding the system's architecture and functionality.
Section: Method
Sophisticated AI Prompting and Error Handling (written-content)
The Method section thoroughly describes the sophisticated AI prompting and error handling mechanisms, demonstrating a robust approach to AI integration and enhancing the system's reliability.
Section: Method
In-depth Analysis of Emergent Iteration and Prompting Styles (written-content)
The Results section provides a detailed analysis of emergent iteration and prompting styles, offering valuable insights into how users adapt to and utilize the novel features of Df2. This qualitative data enriches the understanding of user behavior.
Section: Results
Visualization of Workflow Variability (graphical-figure)
Figure 12 effectively visualizes the variability in user workflows, supporting the qualitative analysis of iteration styles and providing compelling evidence for the system's flexibility.
Section: Results
Visionary Integration with Recommendation Systems (written-content)
The Discussion section proposes a visionary integration with recommendation systems, leveraging Df2's strengths to address limitations of existing tools and potentially broaden the scope of data exploration.
Section: Discussion and Future Work
Transparent Acknowledgment of Study Limitations (written-content)
The Discussion acknowledges the limitations of the user study, particularly the use of reproduction tasks and the lab setting, and proposes specific future studies to address these gaps and enhance the generalizability of findings. This strengthens the paper's methodological rigor.
Section: Discussion and Future Work
Quantify Observed Differences in Iteration Styles (written-content)
The Results section lacks quantitative analysis of the observed differences in iteration styles. Quantifying these differences (e.g., average branch depth, frequency of backtracking) would strengthen the claims about distinct user approaches. This would provide more objective support for the qualitative observations.
Section: Results
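One way to obtain such measures is to treat each participant's data threads as a tree and compute simple shape statistics. The sketch below is illustrative only: the `parent_of` mapping, the `thread_metrics` function, and both example trees are hypothetical, and branch points serve only as a rough proxy for backtracking.

```python
from collections import defaultdict


def thread_metrics(parent_of):
    """Summarize a forest given as {node: parent}, with None marking roots."""
    children = defaultdict(list)
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children[parent].append(node)

    def depth(node):
        # Number of steps from this node down to its deepest descendant.
        return max((1 + depth(child) for child in children[node]), default=0)

    return {
        "max_depth": max((depth(root) for root in roots), default=0),
        "leaves": sum(1 for node in parent_of if not children[node]),
        "branch_points": sum(1 for node in parent_of if len(children[node]) > 1),
    }


# Hypothetical sessions: one wide tree and one deep tree.
WIDE = {"root": None, "a": "root", "b": "root", "c": "root"}
DEEP = {"root": None, "a": "root", "b": "a", "c": "b"}

wide_metrics = thread_metrics(WIDE)
deep_metrics = thread_metrics(DEEP)
```

Reporting these values per participant would let the wide-versus-deep distinction be compared numerically instead of only by inspecting the tree drawings.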
Elaborate on Bias Mitigation in AI-Enhanced Recommendations (written-content)
The Discussion section could be strengthened by elaborating on strategies to mitigate potential biases in AI-enhanced recommendations. Addressing this concern is crucial for responsible development and deployment of such features. The lack of discussion on bias mitigation weakens the proposal for integrating recommendation systems.
Section: Discussion and Future Work

Section Analysis

Introduction

Non-Text Elements

Figure 1: With Data Formulator 2, analysts can iterate on a previous design by...
Full Caption

Figure 1: With Data Formulator 2, analysts can iterate on a previous design by (1) selecting a chart from data threads and (2) providing combined natural language and graphical user interface inputs in the chart builder to specify the new design. The AI model generates code to transform the data and update the chart. Data threads are updated with new charts for future use.

Figure/Table Image (Page 1)
First Reference in Text
For example, when exploring renewable energy trends, an analyst may find that similar trends across countries make a simple line chart (Figure 1) too dense for detailed comparisons.
Description
  • Overall Workflow Demonstration: Figure 1 illustrates the user interface and workflow of Data Formulator 2, a system for iterative data visualization. It shows how a user can refine a chart by selecting a previous version from "Data Threads" and then providing instructions in a "Chart Builder" using both graphical inputs and natural language.
  • Data Threads Panel: The "Data Threads" panel on the left displays a history of chart versions, akin to saved states in a design process. For instance, "thread-1" shows an initial chart from "energy.csv" with axes like "Year", "Electricity", and "Entity". "thread-2" shows an evolution, starting with "energy.csv" to a chart labeled "table-42" plotting "Renewable Per..." (Renewable Percentage) by "Entity" over "Year", and then further transformed into "table-86" based on a textual command.
  • Chart Builder Panel and User Input: The central "Chart Builder" panel shows an active chart modification step. An initial line chart (from "table-42") displays "Renewable Percentage" for multiple entities over years. The user inputs a natural language command: "Show only top 5 CO2 emission countries' trends." This panel also shows GUI elements for selecting chart type ("Line Chart") and mapping data fields to visual properties (e.g., x-axis: Year, y-axis: Renewable Per..., color: Entity).
  • Data Table Preview: A data table preview within the "Chart Builder" provides a snapshot of the underlying data being visualized. It includes columns like "Year", "Entity", and "Renewable Percentage", with example rows such as "2000, China, 16.639" and "2020, Japan, 21.325".
  • Resulting Refined Chart (New Thread): The outcome of the user's interaction is displayed on the right as a "new thread". This is a refined line chart showing "Renewable Percentage" (y-axis, values from approximately 0 to 40) versus "Year" (x-axis, from 2000 to 2015) for a filtered set of entities: China, Germany, India, Japan, and United States. This visualization directly reflects the application of the user's natural language instruction.
Scientific Validity
  • ✅ Illustrates core system functionality: The figure effectively demonstrates the core functionality of Data Formulator 2—iterative visualization refinement through combined GUI and natural language inputs—which aligns with the paper's claimed contributions.
  • ✅ Realistic use-case scenario: The chosen example, filtering a dense line chart of renewable energy trends to show specific countries, represents a common and realistic task in exploratory data analysis, thereby underscoring the system's practical relevance.
  • 💡 AI mechanism not visually detailed: The caption states that an AI model generates code to transform data. While the figure shows the input (NL query) and output (refined chart), the AI's role and the nature of the code generation are not visually detailed within the figure itself. This is acceptable for a high-level overview but might leave a reader wondering about the underlying AI mechanism's complexity or verifiability from this figure alone.
  • 💡 Ambiguity in data source for filtering criteria: The instruction "Show only top 5 CO2 emission countries' trends" implies filtering based on CO2 emissions. However, the visible input chart and data table in the "Chart Builder" focus on "Renewable Percentage." The figure does not explicitly show how CO2 emission data is accessed or linked for this filtering operation. It's assumed the AI handles this, but clarity on data provenance for the filtering criteria could be improved. For example, is CO2 data part of the 'energy.csv' or 'table-42' dataset?
  • 💡 Minor inconsistency in year ranges: There's a minor inconsistency: the "new thread" chart displays data up to the year 2015, whereas the data table preview in the "Chart Builder" includes entries for the year 2020 (e.g., "2020, Japan, 21.325"). While not critical, aligning the year ranges or explaining the discrepancy would enhance consistency.
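The filtering step in this example can be reproduced with a short sketch that also makes the provenance question concrete: the ranking must come from a table that still contains CO2 values, while the filter is applied to the derived renewable-percentage table. Everything here is assumed for illustration: the function names, the choice to rank by total emissions over all available years, and the placeholder entities `A` to `G` with made-up magnitudes.

```python
from collections import defaultdict


def top_entities_by_co2(source_rows, k=5):
    """Rank entities by summed CO2 emissions, skipping missing values."""
    totals = defaultdict(float)
    for row in source_rows:
        co2 = row["CO2 emissions (kt)"]
        if co2 is not None:
            totals[row["Entity"]] += co2
    ranked = sorted(totals, key=totals.get, reverse=True)
    return set(ranked[:k])


def filter_to_entities(derived_rows, keep):
    """Keep only the rows of the derived table whose entity is in `keep`."""
    return [row for row in derived_rows if row["Entity"] in keep]


# Placeholder data (not the dataset's values).
source_rows = []
derived_rows = []
for i, name in enumerate("ABCDEFG"):
    source_rows.append({"Year": 2019, "Entity": name, "CO2 emissions (kt)": 700 - 100 * i})
    source_rows.append({"Year": 2020, "Entity": name, "CO2 emissions (kt)": None})
    derived_rows.append({"Year": 2019, "Entity": name, "Renewable Percentage": 10.0 + i})

top5 = top_entities_by_co2(source_rows)
filtered = filter_to_entities(derived_rows, top5)
```

A different reading of "top 5" (for example, ranking by the most recent year only) would change `top_entities_by_co2` but not the two-table structure.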
Communication
  • ✅ Clear process illustration: The figure effectively uses a left-to-right visual flow (Data Threads -> Chart Builder -> new thread) to illustrate the iterative chart creation process, making the workflow easy to follow.
  • ✅ Effective communication of multi-modal input: The direct annotation of the natural language query ("Show only top 5 CO2 emission countries' trends") on the UI mockup clearly communicates the multi-modal input capability of the system.
  • ✅ Highlights system benefit: The visual contrast between the implied complexity of an initial chart and the filtered, cleaner "new thread" chart successfully highlights a key benefit of the system in reducing information overload.
  • ✅ Clear and aligned caption: The caption accurately and concisely describes the overall process depicted, aligning well with the visual elements shown in the figure.
  • 💡 Low resolution of text in Data Thread previews: The text within the small chart previews in the "Data Threads" panel (e.g., for table-49, table-42, table-86) is of low resolution and difficult to read. While these act as thumbnails, improving legibility could enhance the understanding of the iteration history. Suggestion: Increase the font size or simplify these preview elements if detailed content is not critical, or use higher-resolution inserts.
  • 💡 Non-semantic labels for tables: The labels like "table-42", "table-49", etc., are likely internal system identifiers and lack semantic meaning for the reader, potentially adding slight clutter without clear informational value in this context. Suggestion: Consider de-emphasizing them or using more descriptive placeholders if these are not crucial for understanding the figure's message.
  • 💡 High information density: The figure is information-dense, particularly the "Chart Builder" section. While this showcases various system features, it might initially overwhelm a viewer trying to grasp the core iterative step. Suggestion: Ensure visual hierarchy clearly guides the viewer's attention to the most critical elements for the illustrated workflow, perhaps with more prominent callouts or by slightly graying out less relevant UI components for this specific example.
Figure 2: An analyst explores electricity from different energy sources,...
Full Caption

Figure 2: An analyst explores electricity from different energy sources, renewable percentage trends, and country rankings by renewable percentages using a dataset on CO2 and electricity for 20 countries (2000-2020, table 1). The analyst creates five data versions in three branches to support different chart designs. DF2 allows users to manage iteration directions and create rich visualizations using a blended UI and natural language inputs.

Figure/Table Image (Page 3)
First Reference in Text
The initial dataset, shown in Figure 2-1, includes each country's energy produced from three sources (fossil fuel, renewables, and nuclear) each year and annual CO2 emission value (the CO2 emission data only ranges from 2000 to 2019).
Description
  • Panel Content and Identification: Panel 1 of Figure 2, identified as "Figure 2-1" in the reference text, displays a snippet of the initial tabular dataset. This dataset is central to the subsequent analyses and visualizations depicted in the overall figure, concerning energy production and CO2 emissions for various countries.
  • Data Columns and Units: The table in Panel 1 includes several data columns: "Year" (showing years like 2000 and 2020), "Entity" (representing the country, e.g., Australia, Brazil, China, United Kingdom, United States), "CO2 emissions (kt)" where 'kt' signifies kilotonnes, a unit of mass (e.g., Australia in 2000 reported 339450 kt of CO2 emissions), "Electricity from fossil fuels (TWh)", "Electricity from nuclear (TWh)", and "Electricity from renewables (TWh)". 'TWh' stands for Terawatt-hours, a common unit for large-scale energy measurement. For example, in 2000, Australia generated 181.05 TWh from fossil fuels, 0 TWh from nuclear, and 17.11 TWh from renewables.
  • CO2 Data Range and Null Values: An important detail visible in Panel 1 is that the "CO2 emissions (kt)" column shows "null" values for the United Kingdom and United States for the year 2020. This is consistent with the reference text, which clarifies that the CO2 emission data in this dataset only covers the period from 2000 to 2019.
  • Dataset Scope Illustration: The main caption for Figure 2 states that the complete dataset encompasses 20 countries over the years 2000-2020. Therefore, Panel 1 presents only an illustrative subset of this larger dataset, showcasing its structure and the types of data it contains.
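To make the table structure concrete, the sketch below encodes the Australia 2000 row quoted above, represents a missing CO2 value as `None`, and reshapes the three electricity columns into a long format that charts broken down by energy source would likely need. The helper names, the long-format column names `Source` and `Electricity (TWh)`, and the reshaping itself are assumptions; the United Kingdom row carries only the fields stated in the description.

```python
AUSTRALIA_2000 = {
    "Year": 2000,
    "Entity": "Australia",
    "CO2 emissions (kt)": 339450,
    "Electricity from fossil fuels (TWh)": 181.05,
    "Electricity from nuclear (TWh)": 0,
    "Electricity from renewables (TWh)": 17.11,
}

# Only the fields the description states; electricity values are not given.
UK_2020_PARTIAL = {"Year": 2020, "Entity": "United Kingdom", "CO2 emissions (kt)": None}

SOURCE_COLUMNS = {
    "Electricity from fossil fuels (TWh)": "fossil fuels",
    "Electricity from nuclear (TWh)": "nuclear",
    "Electricity from renewables (TWh)": "renewables",
}


def has_co2(row):
    """True when the row carries a CO2 value (the data stops at 2019)."""
    return row.get("CO2 emissions (kt)") is not None


def to_long(row):
    """Turn one wide row into one row per energy source present in it."""
    return [
        {"Year": row["Year"], "Entity": row["Entity"],
         "Source": label, "Electricity (TWh)": row[column]}
        for column, label in SOURCE_COLUMNS.items()
        if column in row
    ]


long_rows = to_long(AUSTRALIA_2000)
```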
Scientific Validity
  • ✅ Foundation for Subsequent Analysis: This table (Panel 1) appropriately presents a sample of the raw input data. This serves as the necessary foundation for the data transformations and visualizations illustrated in the subsequent panels of Figure 2, thereby promoting transparency in the depicted analytical workflow.
  • ✅ Comprehensive Variables for Energy Analysis: The dataset snippet includes key variables relevant to energy and environmental analysis: CO2 emissions and electricity generation categorized by source (fossil fuels, nuclear, renewables). This provides a comprehensive basis for exploring trends in electricity generation and sustainability.
  • 💡 Handling and Clarification of Missing CO2 Data: The representation of "null" values for CO2 emissions in 2020 is consistent with the accompanying reference text, which states that CO2 data is available only up to 2019. This accurate representation of data limitations is good. However, if this table were to be presented as a standalone element, a direct footnote explaining these nulls would be crucial for immediate clarity and to prevent users from misinterpreting them as simple data omissions or errors.
  • 💡 Data Source and Representativeness Context: Panel 1 displays data for only a few countries, while the main figure caption specifies the dataset covers "20 countries." While this is acceptable for an illustrative snippet within a larger workflow diagram, a complete assessment of the dataset's scientific validity would necessitate more details, such as the specific source of the data (e.g., International Energy Agency, World Bank), the criteria for the selection of these 20 countries, and any preprocessing steps. This information is typically provided in a methods section rather than a figure caption.
  • 💡 Clarity and Consistency of "Entity" Variable: The term "Entity" is used for countries in the examples shown. For robust analysis, it would be important to confirm that all entries under "Entity" across the full dataset consistently refer to national entities and do not include regional aggregates or other types of organizations, which could affect data comparability.
Communication
  • ✅ Clear Tabular Structure: Panel 1 uses a standard and easily understandable tabular format for presenting the data. Column headers are descriptive and clearly labeled.
  • ✅ Units Indicated: The units for measurements (kt for CO2 emissions, TWh for electricity) are explicitly provided in the column headers, which is a key aspect of good data presentation.
  • ✅ Role as Input Clearly Indicated: Its position at the start of the workflow diagram in Figure 2, with arrows leading to transformation steps like "pivot" and "calculate", effectively communicates its role as the initial input data for the subsequent analyses shown in other panels.
  • 💡 Font Legibility: While the text within this table panel is generally legible, it is quite small. Given the density of the overall Figure 2, any improvement in font size or contrast for this panel could enhance readability. Suggestion: If space allows, slightly increase font size for table contents or ensure high-resolution rendering.
  • 💡 Context within Overall Figure Density: This panel is one of many in a complex figure. While its individual clarity is good, its effectiveness is tied to how well it integrates into the narrative of the entire Figure 2. The transformation labels originating from this panel are helpful in this regard. Suggestion: Ensure that the visual flow from this input table to the first derived charts is unmistakably clear, perhaps with slightly bolder connecting arrows or a subtle background grouping for this initial stage.
Figure 3: DF2 overview. Users create visualizations by providing fields...
Full Caption

Figure 3: DF2 overview. Users create visualizations by providing fields (drag-and-drop or type) and NL instructions to the Chart Builder, delegating data transformation to AI. Data View shows derived data. Users navigate data history and select contexts for the next iteration using Data Threads (the thread in use is displayed as local data threads). They refine or create new charts by providing instructions in Chart Builder. The main panel provides pop-up windows to inspect code, explanations, and chat history.

Figure/Table Image (Page 4)
First Reference in Text
As Figure 4-2 shows, Megan first drags and drops existing fields Year and Entity to the x-axis and color, respectively.
Description
  • Overall UI Overview: Figure 3 presents a screenshot of the Data Formulator 2 (DF2) user interface, illustrating its main components and their arrangement. The interface is designed to help users iteratively create data visualizations.
  • 1. Chart Builder: Component 1, labeled "Chart Builder for specifying chart with visual encodings and NL instructions," is located in the central-right part of the UI. It shows a configuration area where users can define a chart (e.g., a "Custom Line" chart) by assigning data fields like "Year" to the x-axis and "Renewable Per..." (Renewable Percentage) to the y-axis. It also includes a text input field for Natural Language (NL) instructions, here showing "include global median as an entity". A list of available "Data Fields" (e.g., "Global Median?", "Rank", "Renewable Percentage") is visible to the right of the chart builder.
  • 2. Local Data Thread: Component 2, "Local Data Thread visualizes the current data thread and supports quick backtracking," is situated directly above the Chart Builder. It displays a sequence of chart states (e.g., from "energy.csv" to "table-42" to "table-77" to "table-18") within the currently active iteration path. This component allows users to see the immediate history and potentially revert or branch from recent steps.
  • 3. Data Threads: Component 3, "Data Threads for navigating and selecting contexts to guide AI in the next iteration," is shown in the top-left panel. It displays multiple independent iteration histories (e.g., "thread-1", "thread-2", "thread-3"), each representing a distinct line of exploration. Users can select a chart from these threads to use as a starting point for new visualizations.
  • 4. Data View: Component 4, "Data View for inspecting original and derived data," is located at the bottom of the UI. It shows a tabular representation of the data associated with a selected chart (here, "table-18"). Columns include "Entity", "Global Median?", "Renewable Percentage", and "Year", with example data like "China, No, 16.639126586, 2000". This allows users to inspect the data values underlying their visualizations.
  • Additional UI Features: The figure also indicates that pop-up windows are available for inspecting code, explanations, and chat history, although these pop-ups are not actively displayed in this particular screenshot. The main central panel shows a line chart visualizing "Renewable Percentage" over "Year" for different entities, including a "Global Median".
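The instruction "include global median as an entity" implies a derived table like the one in the Data View: the original rows plus one extra row per year holding the median. A minimal sketch of such a transformation follows. It is not the code Df2 generated; the function name, the "Yes" flag value, and the placeholder entities `X` and `Y` are assumptions, while the China 2000 value and the column names come from the Data View described above.

```python
from collections import defaultdict
from statistics import median


def add_global_median(rows):
    """Append one 'Global Median' row per year and flag every row."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["Year"]].append(row["Renewable Percentage"])
    flagged = [{**row, "Global Median?": "No"} for row in rows]
    medians = [
        {"Entity": "Global Median", "Global Median?": "Yes",
         "Renewable Percentage": median(values), "Year": year}
        for year, values in sorted(by_year.items())
    ]
    return flagged + medians


rows = [
    {"Entity": "China", "Renewable Percentage": 16.639126586, "Year": 2000},  # from the Data View
    {"Entity": "X", "Renewable Percentage": 10.0, "Year": 2000},  # placeholder
    {"Entity": "Y", "Renewable Percentage": 30.0, "Year": 2000},  # placeholder
]
result = add_global_median(rows)
```

Because the median rows share the `Entity` column with the country rows, the existing color encoding picks up the new line without any change to the chart specification.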
Scientific Validity
  • ✅ Coherent UI representation: The figure provides a plausible and coherent visual representation of a user interface designed for iterative data visualization, aligning with the system's description in the caption and the broader context of the paper.
  • ✅ Demonstrates key system concepts: The layout effectively demonstrates the key concepts of DF2, such as the separation of global "Data Threads" for managing different exploration paths, a "Local Data Thread" for the current iterative sequence, a "Chart Builder" for multi-modal input (GUI and NL), and a "Data View" for inspection. This supports the paper's claims about the system's architecture and user interaction model.
  • ✅ Illustrates multi-modal input: The figure illustrates the multi-modal input capability through the Chart Builder, which shows both GUI elements for field mapping (e.g., drag-and-drop implied for axes) and a text box for NL instructions. This is a central aspect of the DF2 system.
  • ✅ Consistent with AI-delegated tasks: The caption mentions that the AI is delegated data transformation tasks, and the main panel can show pop-ups for code and explanations. While the pop-ups are not shown, the overall UI structure is consistent with a system where AI plays a significant role in processing user requests and generating outputs (charts and derived data).
  • 💡 Static representation of an interactive system: The figure is a static representation. The dynamic aspects of interaction, such as the actual drag-and-drop mechanism, the process of typing NL queries, or the appearance of pop-up windows, are implied rather than explicitly shown. This is a common limitation of static figures for interactive systems but does not detract from its validity as an overview.
  • 💡 Mismatched reference text: The reference text provided ("As Figure 4-2 shows...") does not pertain to Figure 3. This figure's validity should be assessed based on its own content and caption, and its consistency with the overall paper narrative about DF2.
Communication
  • ✅ Effective use of callouts: The use of numbered callouts (1, 2, 3, 4) effectively highlights the key components of the DF2 interface, guiding the viewer's attention to the most important functional areas described in the caption.
  • ✅ Comprehensive system overview: The figure successfully provides a comprehensive visual overview of the DF2 system, illustrating the spatial arrangement and interplay of its main functional panels, which aids in understanding the user workflow.
  • ✅ Clear and aligned caption: The caption is clear and directly corresponds to the visual elements and callouts in the figure, making it relatively self-contained for understanding the system's basic operation.
  • 💡 Legibility of small text elements: The text within some UI elements, particularly the smaller chart previews in the "Data Threads" panel (top left) and some labels in the "Data Fields" list (top right), is quite small and difficult to read. Suggestion: For a figure intended as an overview, consider using slightly larger fonts in the mock-up, or if these are direct screenshots, ensure high resolution and potentially use magnified insets for critical small text areas.
  • 💡 High information density: The figure is information-dense, presenting many UI elements simultaneously. While this shows the system's richness, it could be slightly overwhelming for a first-time viewer. Suggestion: Ensure a clear visual hierarchy, perhaps by using subtle color differences or line weights to differentiate primary interaction areas from secondary ones, or by focusing callouts on a more streamlined workflow if the goal is to illustrate a specific interaction path.
  • 💡 Subtle workflow indicators: The arrows indicating workflow (e.g., from "Data Threads" to "Chart Builder") are somewhat subtle. Suggestion: Make these workflow indicators more prominent to better emphasize the iterative process described.
Figure 4: Experiences with DF2: (1) creating the basic renewable energy chart...
Full Caption

Figure 4: Experiences with DF2: (1) creating the basic renewable energy chart using drag-and-drop to encode fields; (2 and 3) creating charts requiring new fields by providing field names and optional natural language instructions to derive new data.

Figure/Table Image (Page 5)
First Reference in Text
As Figure 4-2 shows, Megan first drags and drops existing fields Year and Entity to the x-axis and color, respectively.
Description
  • Basic Chart Creation: Panel 1 of Figure 4 demonstrates the initial step of creating a basic chart in DF2. It shows the Chart Builder interface where a user is constructing a line chart. The data source is "energy.csv".
  • Field Encoding: The user has encoded the chart by assigning data fields to visual channels: "Year" is mapped to the x-axis, "Electricity from r..." (presumably 'Electricity from renewables (TWh)') is mapped to the y-axis, and "Entity" (representing countries or regions) is mapped to the color channel. This implies a drag-and-drop interaction, as described in the caption.
  • Available Data Fields: A "Data Fields" list is visible on the right, showing available fields like "CO2 emissions (kt)", "Electricity from fossil fuels (...", "Electricity from nuclear (T...", and "Electricity from renewables". This is where the user would select fields to encode.
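Since every encoded field in this panel already exists in the source table, the chart needs no data transformation. The mapping corresponds to a Vega-Lite specification along the following lines, written here as a Python dictionary. This is a hand-written sketch, not output captured from DF2: the full y-field name is the reviewer's presumed expansion of the truncated label, the field types are assumptions (Year is treated as ordinal so that integer years are not parsed as timestamps), and the single data row is the Australia 2000 value from the dataset snippet in Figure 2.

```python
import json

basic_chart = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "mark": "line",
    "encoding": {
        "x": {"field": "Year", "type": "ordinal"},
        "y": {"field": "Electricity from renewables (TWh)", "type": "quantitative"},
        "color": {"field": "Entity", "type": "nominal"},
    },
    "data": {
        "values": [
            {"Year": 2000, "Entity": "Australia",
             "Electricity from renewables (TWh)": 17.11},
        ]
    },
}


def encoded_fields(spec):
    """Return the set of data fields a spec's encodings refer to."""
    return {channel["field"] for channel in spec["encoding"].values()}


# Every encoded field is present in the data, so the spec can be rendered as is.
ready = encoded_fields(basic_chart) <= set(basic_chart["data"]["values"][0])
spec_json = json.dumps(basic_chart, indent=2)
```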
Scientific Validity
  • ✅ Represents fundamental chart creation: This panel accurately depicts a fundamental and common method for chart creation in many visualization tools—mapping existing data fields to visual encodings. It serves as a valid baseline interaction.
  • ✅ Plausible UI for basic tasks: The UI shown is plausible for a system aiming to simplify visualization authoring, providing direct manipulation for basic tasks.
  • ✅ Supports caption claim (part 1): The panel clearly supports the first part of the caption: "(1) creating the basic renewable energy chart using drag-and-drop to encode fields."
  • 💡 Mismatch with provided reference text for Figure 4-2: The reference text cites "Figure 4-2", which is the next panel, so it describes a different step from the one shown in this panel (4-1).
Communication
  • ✅ Clear illustration of drag-and-drop: This panel clearly illustrates the drag-and-drop functionality for basic chart creation by showing fields being assigned to visual channels (x-axis, y-axis, color). The UI elements are distinct and the action is intuitive.
  • ✅ Standard terminology: The use of standard chart builder terminology (e.g., "Line Chart", "x-axis", "y-axis", "color") makes the interface understandable.
  • ✅ Clear Data Fields list: The visual representation of the "Data Fields" list on the right, from which fields are presumably dragged, is clear and aids in understanding the source of the encoded fields.
  • 💡 Truncated field names: Some field names (e.g., "Electricity from r...") are truncated due to space constraints. While the meaning can often be inferred, full field names would be ideal for complete clarity. Suggestion: If possible, widen the field display areas in the UI mockup, or rely on tooltips in the actual system (which a static figure cannot show).
Figure 5: Iteration with DF2: (1) provide an instruction to filter the...
Full Caption

Figure 5: Iteration with DF2: (1) provide an instruction to filter the renewable energy percentage chart by top CO2 countries, (2) update the chart with Global Median? and instruct DF2 to add the global median alongside the top 5 CO2 countries' trends, and (3) move Global Median? from column to opacity to update the chart design without deriving new data.

Figure/Table Image (Page 5)
First Reference in Text
On top of that, Megan provides a new instruction below the local data thread, "show only top 5 CO2 emission countries' trends," and clicks the “derive” button (Figure 5-1).
Description
  • Initial Chart Context: Panel 1 of Figure 5 (referenced as Figure 5-1 in the text) shows an intermediate step in the DF2 workflow. It displays the Chart Builder interface, which currently shows a line chart derived from "table-42". This chart plots "Renewable Per..." (Renewable Percentage) on the y-axis against "Year" on the x-axis, with different lines colored by "Entity" (countries).
  • Natural Language Instruction: The key action illustrated is the user providing a natural language (NL) instruction. In the text input field below the chart, the user has typed: "Show only top 5 CO2 emission countries' trends." This instruction is intended to filter the currently displayed renewable energy percentage chart based on a different criterion (CO2 emissions).
  • Action Button: A button labeled "formulate data" is visible below the NL instruction input field, indicating the action the user would take to submit this instruction to the DF2 system for processing.
Scientific Validity
  • ✅ Represents NL-driven filtering input: This panel accurately depicts the input stage for a natural language-driven data transformation and filtering task, which is a core capability of the DF2 system as described.
  • ✅ Realistic and non-trivial task: The scenario—filtering a chart based on criteria not directly plotted (CO2 emissions influencing a renewable energy chart)—is a realistic and non-trivial task in exploratory data analysis, showcasing the potential utility of the AI-driven transformation.
  • ✅ Supports caption claim (part 1): The figure panel directly supports the first part of the main Figure 5 caption: "(1) provide an instruction to filter the renewable energy percentage chart by top CO2 countries".
  • 💡 Validity depends on subsequent AI processing: The panel clearly shows the user's intent expressed through natural language. The scientific validity of the outcome of this instruction would depend on the AI's ability to correctly interpret the query, access relevant CO2 data (which is not explicitly shown as part of "table-42"), perform the ranking and filtering, and then apply this to the renewable percentage data. This panel only shows the input step.
  • 💡 Minor discrepancy in button label (text vs. figure): The reference text mentions a "derive" button, while the figure shows a "formulate data" button. This is a minor discrepancy but worth noting for consistency between text and visual. Assuming "formulate data" is the intended label, it reasonably conveys the action.
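As a rough illustration of the ranking and filtering that the instruction in Figure 5-1 asks for, the Python sketch below keeps only the rows of the top-n entities by CO2 emissions. It is not the code DF2's model generates. The `Entity`, `Year`, and `CO2 emissions (kt)` names come from the figures; `renewable_pct`, the function names, the toy rows, and the choice to rank by total emissions across all years are assumptions made for this example.

```python
from collections import defaultdict

CO2_FIELD = "CO2 emissions (kt)"  # field name as listed in the Data Fields panel


def top_entities(rows, metric, n):
    """Return the n entities with the largest total of `metric`.

    Assumption: "top" means the largest sum over all years in the table.
    """
    totals = defaultdict(float)
    for row in rows:
        value = row.get(metric)
        if value is not None:  # skip rows with missing measurements
            totals[row["Entity"]] += value
    ranked = sorted(totals, key=totals.get, reverse=True)
    return set(ranked[:n])


def keep_top_emitters(rows, n=5):
    """Filter rows down to the n entities with the highest CO2 totals."""
    keep = top_entities(rows, CO2_FIELD, n)
    return [row for row in rows if row["Entity"] in keep]


# Toy rows with placeholder entities; the values are illustrative only.
# "renewable_pct" is a hypothetical column name for the plotted percentage.
rows = [
    {"Entity": "A", "Year": 2000, CO2_FIELD: 10.0, "renewable_pct": 5.0},
    {"Entity": "A", "Year": 2020, CO2_FIELD: 12.0, "renewable_pct": 9.0},
    {"Entity": "B", "Year": 2000, CO2_FIELD: 5.0, "renewable_pct": 20.0},
    {"Entity": "B", "Year": 2020, CO2_FIELD: 6.0, "renewable_pct": 25.0},
    {"Entity": "C", "Year": 2000, CO2_FIELD: 1.0, "renewable_pct": 40.0},
    {"Entity": "C", "Year": 2020, CO2_FIELD: 1.0, "renewable_pct": 45.0},
]

filtered = keep_top_emitters(rows, n=2)
print(sorted({row["Entity"] for row in filtered}))
```

The filter ranks on one column and keeps rows that carry another, which is what makes the task non-trivial: the ranking criterion is never drawn on the chart itself. A different reading of "top" (for example, emissions in the latest year only) would change `top_entities` but not the overall shape.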
Communication
  • ✅ Clear focus on NL instruction: The panel clearly highlights the natural language instruction input field, making the user's action (providing a textual command) the central focus, which aligns with the caption's description of this step.
  • ✅ Context of existing chart is clear: The context of the existing chart (Renewable Energy Percentage) is visible, providing a clear before-state for the intended filtering operation.
  • ✅ Clear visual cue for step 1: The numbering "1" effectively links this panel to the first step described in the main caption for Figure 5.
  • 💡 Legibility of chart preview details: The chart preview within this panel is relatively small, and the details (e.g., specific country names in the legend, axis labels) are difficult to discern. Suggestion: While this is an intermediate step, ensuring slightly better legibility or using a simplified iconic representation of the chart could improve clarity without losing context.
  • 💡 Button label clarity: The reference text calls the button "derive", whereas the figure labels it "formulate data". "Derive" on its own might be slightly ambiguous; "formulate data" is reasonably clear, and alternatives such as "Apply filter" or "Update chart" might be more direct. This is a minor point about UI terminology. Suggestion: Ensure button labels clearly reflect the action being performed, and keep the label consistent between the text and the figure.

Method

Key Aspects

Strengths

Suggestions for Improvement

Non-Text Elements

Figure 6: DF2's workflow: (1) DF2 generates a Vega-Lite spec skeleton based on...
Full Caption

Figure 6: DF2's workflow: (1) DF2 generates a Vega-Lite spec skeleton based on user specifications and chart type. (2) If new fields (e.g., Rank) are required, DF2 prompts its AI model to generate data transformation code. (3) The Vega-Lite skeleton is then instantiated with the new data to produce the desired chart.

Figure/Table Image (Page 7)
First Reference in Text
Below shows the LLM's refined goal for the task in Figure 6, and the generated code is shown in Figure 6-2.
Description
  • User Specification Input: Panel 1 of Figure 6 illustrates the initial phase of DF2's workflow. It shows a user interacting with the Chart Builder interface. The user has selected 'Line Chart' as the chart type, with 'table-42' as the data source. They have mapped 'Year' to the x-axis, a new field 'Rank' to the y-axis, and 'Entity' to the color encoding. Additionally, a natural language (NL) instruction, "rank by renewable percentage," has been provided.
  • Vega-Lite Skeleton Generation: As a result of these user specifications, DF2 generates an initial Vega-Lite JSON skeleton. Vega-Lite is a high-level grammar for interactive graphics, allowing concise descriptions of visualizations. The generated skeleton shown is: `{"mark": "line", "encoding": {"x": {"field": "Year", "type": "temporal"}, "y": {"field": "Rank", "type": "?"}, "color": {"field": "Entity", "type": "nominal"}}}`. Notably, the 'type' for the 'Rank' field is marked with a '?', indicating it's yet to be determined, as 'Rank' is a new field to be derived.
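The skeleton-then-instantiate idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DF2's implementation: the function names, the `known_types` lookup, the toy rows, and the rule that numeric values resolve to "quantitative" are all hypothetical. Only the skeleton's shape and the '?' placeholder come from the figure.

```python
import json


def build_skeleton(mark, encodings, known_types):
    """Build a Vega-Lite-style skeleton from channel -> field mappings.

    Fields missing from `known_types` are treated as new fields that still
    have to be derived, so their type is left as the placeholder "?".
    """
    encoding = {
        channel: {"field": field, "type": known_types.get(field, "?")}
        for channel, field in encodings.items()
    }
    return {"mark": mark, "encoding": encoding}


def infer_type(values):
    """Hypothetical heuristic: numeric values are quantitative, others nominal."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "nominal"


def instantiate(skeleton, rows):
    """Attach the transformed rows and resolve every "?" type from the data."""
    spec = {"mark": skeleton["mark"], "data": {"values": rows}, "encoding": {}}
    for channel, enc in skeleton["encoding"].items():
        resolved = dict(enc)
        if resolved["type"] == "?":
            resolved["type"] = infer_type([row[resolved["field"]] for row in rows])
        spec["encoding"][channel] = resolved
    return spec


# Types of fields already present in the input table (assumed for this sketch).
known_types = {"Year": "temporal", "Entity": "nominal"}
skeleton = build_skeleton(
    "line", {"x": "Year", "y": "Rank", "color": "Entity"}, known_types
)

# Toy output of a data transformation that added the new "Rank" field.
transformed = [
    {"Year": "2000", "Entity": "A", "Rank": 1},
    {"Year": "2000", "Entity": "B", "Rank": 2},
]
spec = instantiate(skeleton, transformed)
print(json.dumps(skeleton))
print(spec["encoding"]["y"])
```

Keeping the skeleton separate from the data means the chart design can be fixed before the new field exists; only the placeholder needs resolving once the transformed table is available.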
Scientific Validity
  • ✅ Plausible first step in AI-assisted workflow: This panel accurately represents the first step in a plausible workflow for AI-assisted visualization: translating user intent (expressed via GUI and NL) into a structured chart specification (Vega-Lite).
  • ✅ Correct handling of new field specification: The generation of a Vega-Lite skeleton with placeholders (like the '?' for the type of 'Rank') correctly reflects a scenario where new data fields need to be derived before the chart can be fully specified, aligning with the overall problem DF2 aims to solve.
  • ✅ Demonstrates multi-modal input: The combination of GUI inputs for visual encoding and NL input for data derivation intent is a key aspect of DF2's proposed multi-modal interaction, and this panel effectively demonstrates that combination.
Communication
  • ✅ Clear depiction of user input and skeleton generation: This part of the figure clearly shows the user's input (chart type, field mappings like 'Year' to x-axis, 'Rank' to y-axis, and 'Entity' to color, plus an NL instruction 'rank by renewable percentage') and the corresponding initial Vega-Lite JSON skeleton. The visual connection between user input and the skeleton is evident.
  • ✅ Readable Vega-Lite skeleton: The Vega-Lite JSON is presented in a readable format, making it easy to understand the structure of the chart specification being generated. The placeholder '?' for the type of 'Rank' clearly indicates an unresolved part of the specification.
  • ✅ Intuitive UI representation: The UI elements for user specification (e.g., x-axis, y-axis, color dropdowns) are standard and intuitively represent how a user would define a chart.
  • ✅ Clear workflow indication: The connection from the user input panel to the Vega-Lite skeleton is indicated by an arrow, which helps in following the workflow step.
Figure 7: DF2 converts user encodings into a Vega-Lite specification, which is...
Full Caption

Figure 7: DF2 converts user encodings into a Vega-Lite specification, which is combined with AI-transformed data to visualize country ranks in 2000 and 2020.

Figure/Table Image (Page 7)