This paper introduces Data Formulator 2 (Df2), an AI-powered visualization system designed to address the challenges of iterative authoring in exploratory data analysis. Existing AI tools often require users to provide complete, text-only descriptions of visualizations upfront, which is impractical when analytical goals evolve during exploration. Df2 tackles this limitation by blending a graphical user interface (GUI) with natural language (NL) input, allowing users to specify chart designs precisely while delegating data transformation tasks to the AI. The system also introduces "data threads," a mechanism for tracking the history of data transformations and visualizations, enabling users to easily revisit, revise, and branch from previous steps.
The core of Df2's methodology involves decoupling chart specification from data transformation. Users define their visualization intent through a combination of GUI interactions (e.g., drag-and-drop field mapping) and concise NL instructions. The system then generates a Vega-Lite specification (a high-level grammar for interactive graphics) and prompts a large language model (LLM) to produce Python code for the necessary data transformations. Df2 executes this code, handles potential errors, and instantiates the Vega-Lite specification with the transformed data to generate the visualization. Data threads provide a visual representation of the user's interaction history, facilitating navigation and reuse of previous results.
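To make the decoupling concrete, the following minimal Python sketch illustrates the described pipeline; it is not the authors' implementation, and the function names, prompt wording, and the `llm.complete` client interface are illustrative assumptions:

```python
def build_vegalite_skeleton(chart_type, encodings):
    """Map chart-builder encodings (shelf assignments) to a Vega-Lite spec
    skeleton whose data is filled in after transformation."""
    return {
        "mark": chart_type,
        "encoding": {channel: {"field": field} for channel, field in encodings.items()},
        "data": {"values": None},  # placeholder until data is transformed
    }


def transform_data(llm, table, instruction, new_fields):
    """Ask the LLM for pandas code that derives the new fields, then run it
    on the current table (a pandas DataFrame)."""
    prompt = (
        f"Columns: {list(table.columns)}\n"
        f"Derive fields {new_fields} following the instruction: {instruction}\n"
        "Return a Python function transform(df) that returns a DataFrame."
    )
    code = llm.complete(prompt)   # assumed LLM client interface
    scope = {}
    exec(code, scope)             # execute generated code in an isolated namespace
    return scope["transform"](table)


def make_chart(llm, table, chart_type, encodings, instruction, new_fields):
    skeleton = build_vegalite_skeleton(chart_type, encodings)
    data = transform_data(llm, table, instruction, new_fields) if new_fields else table
    skeleton["data"]["values"] = data.to_dict(orient="records")
    return skeleton  # ready to render with any Vega-Lite runtime
```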
A user study with eight participants of varying expertise in data analysis and programming demonstrated Df2's effectiveness in supporting iterative visualization authoring. Participants successfully completed a series of tasks that required creating 16 visualizations involving diverse data transformations. The study revealed distinct iteration styles among users: some preferred broader exploration with multiple branches (wide trees), while others favored deeper, more linear progressions (deep trees). Participants also employed various prompting techniques, ranging from imperative commands to questions and chat-style interactions. The study highlighted the importance of Df2's transparency features, such as code explanations and data provenance tracking, in building user trust and facilitating verification of AI-generated outputs.
The discussion explores future directions for Df2, including integration with visualization recommendation systems and the development of agent-based systems for coordinating data transformation and chart editing. The authors acknowledge the limitations of the user study, particularly its focus on reproduction tasks and the lab setting, and propose future research involving open-ended exploration and longitudinal studies to investigate long-term user behavior and learning effects.
Data Formulator 2 (Df2) presents a compelling approach to iterative visualization authoring, effectively addressing the limitations of existing AI-powered tools. The system's innovative blend of GUI and NL input, coupled with its sophisticated data threading mechanism, empowers users to navigate complex data transformations and explore diverse visualization strategies with remarkable efficiency. The user study, while limited in sample size, provides strong qualitative evidence for Df2's usability and its potential to transform data analysis workflows. The system's transparency features, including code explanations and data provenance tracking, foster user trust and facilitate verification of AI-generated outputs.
However, the study's reliance on reproduction tasks and the lab setting constrains the generalizability of findings to real-world, open-ended exploration scenarios. Future research addressing these limitations, along with the proposed enhancements for recommendation systems and agent-based chart editing, will be crucial for realizing Df2's full potential. The core strength of this work lies in its robust, user-centered design and its potential to democratize access to sophisticated data visualization techniques by lowering the barrier to entry for users with varying levels of programming expertise. The integration of AI capabilities within an intuitive interface offers a promising pathway for more efficient, insightful, and accessible data exploration.
The abstract effectively establishes the existing gap in AI-powered visualization tools, specifically their inadequacy for iterative authoring, which is a common practice in exploratory data analysis.
The abstract clearly introduces Data Formulator 2 (Df2) as the proposed solution and immediately states its primary design goal: to overcome the limitations of existing systems in iterative authoring.
The abstract successfully communicates the core mechanisms of Df2 that address the identified problem, such as the blend of GUI and NL inputs, AI-driven data transformation, and support for navigating iteration history.
The inclusion of a user study with a specific number of participants lends credibility to the system's claims and indicates that the findings are backed by empirical evidence.
The abstract mentions that Df2 helped participants complete "challenging data exploration sessions." While conciseness is key in an abstract, adding a very brief, impactful descriptor for the nature of these challenges (e.g., involving complex data transformations, multi-step analyses) could subtly enhance the reader's understanding of Df2's capabilities and impact. This would be a low-impact change, but could add a touch more specificity without significantly increasing length. It belongs in the abstract as it clarifies the context of the user study's findings, which is a crucial part of summarizing the paper's contribution.
Implementation: Consider revising the last sentence to incorporate a brief qualifier for the challenges. For example: "A user study with eight participants demonstrated that Df2 allowed participants to develop their own iteration styles to complete challenging data exploration sessions, such as those involving evolving analytical goals and multi-step data transformations."
The Introduction clearly articulates the core problem: the mismatch between the iterative nature of data exploration and the capabilities of existing AI-powered visualization tools. It effectively sets the stage by detailing why current solutions fall short.
The paper doesn't just state a general problem but pinpoints specific deficiencies in current tools, namely the issues with text-only prompts (lack of precision, difficulty in describing complex designs) and the lack of support for iterative behaviors like branching and backtracking.
The Introduction provides a strong logical bridge from the identified problems to the proposed key insights of Df2. The multi-modal chart builder and data threads are presented as direct responses to the limitations discussed.
The section concludes with a clear, bulleted list of the paper's main contributions, which helps the reader understand the scope and impact of the work upfront.
The Introduction mentions that the user study 'discovered data analysts’ different iteration styles.' While the full details are rightly reserved for later sections, providing a very brief, high-level characterization of these styles (e.g., 'ranging from cautious, linear refinements to more exploratory, branched investigations') within the introduction could further pique reader interest and make this specific contribution more tangible from the outset. This is a medium-impact suggestion that could enhance the foreshadowing of key findings, fitting well within the summary of contributions.
Implementation: Consider expanding the sentence slightly, for example: 'We conducted a user study that discovered data analysts’ different iteration styles (e.g., varying in their approach to branching and refinement) and rich experiences using our new interaction approaches...'
Figure 1: With Data Formulator 2, analysts can iterate on a previous design by (1) selecting a chart from data threads and (2) providing combined natural language and graphical user interface inputs in the chart builder to specify the new design. The AI model generates code to transform the data and update the chart. Data threads are updated with new charts for future use.
Figure 2: An analyst explores electricity from different energy sources, renewable percentage trends, and country rankings by renewable percentages using a dataset on CO2 and electricity for 20 countries (2000–2020, Table 1). The analyst creates five data versions in three branches to support different chart designs. DF2 allows users to manage iteration directions and create rich visualizations using a blended UI and natural language inputs.
Figure 3: DF2 overview. Users create visualizations by providing fields (drag-and-drop or type) and NL instructions to the Chart Builder, delegating data transformation to AI. Data View shows derived data. Users navigate data history and select contexts for the next iteration using global data threads (the thread in use is displayed as local data threads). They refine or create new charts by providing instructions in the Chart Builder. The main panel provides pop-up windows to inspect code, explanations, and chat history.
Figure 4: Experiences with DF2: (1) creating the basic renewable energy chart using drag-and-drop to encode fields; (2 and 3) creating charts requiring new fields by providing field names and optional natural language instructions to derive new data.
Figure 5: Iteration with DF2: (1) provide an instruction to filter the renewable energy percentage chart by top CO2 countries, (2) update the chart with Global Median? and instruct DF2 to add the global median alongside the top 5 CO2 countries' trends, and (3) move Global Median? from column to opacity to update the chart design without deriving new data.
The section clearly lays out the foundational design choices of Df2—decoupling chart specification from data transformation and using data threads for iteration—providing a strong conceptual framework for the subsequent detailed descriptions. This upfront clarity helps the reader understand the core architectural decisions.
The description of how users compose charts using a blend of GUI (shelf-configuration) and NL inputs is thorough and effectively justifies the benefits of this approach, such as saving users effort in writing verbose prompts for complex designs.
The method details a sophisticated, multi-segment prompting strategy for the LLM, including a "goal refinement" step and the inclusion of dialog history. The automated error correction mechanism, where Df2 queries the LLM with error messages, demonstrates a robust approach to AI integration.
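As a rough illustration of the automated error-correction loop praised here (a sketch under assumed interfaces, not the paper's code), the system can feed the runtime error back to the model and retry a bounded number of times:

```python
def run_with_repair(llm, code, table, max_attempts=3):
    """Execute LLM-generated transformation code; on failure, send the error
    message back to the model and request a corrected version."""
    for attempt in range(max_attempts):
        try:
            scope = {}
            exec(code, scope)
            return scope["transform"](table)
        except Exception as err:
            # Include the failing code and the error so the model can repair it.
            code = llm.complete(
                "The following transformation code failed.\n"
                f"Code:\n{code}\n"
                f"Error: {err}\n"
                "Return a corrected transform(df) function."
            )
    raise RuntimeError("Transformation failed after repeated repair attempts.")
```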
The distinction between global and local data threads, along with their specific roles in navigation, context awareness, and facilitating quick revisions, is well-explicated. This highlights a nuanced understanding of user needs during iterative analysis.
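One way to picture the bookkeeping behind this distinction (a hypothetical sketch, not the authors' data model) is a tree of derived tables: the global view exposes all branches, while a local thread is simply the path from the root to the currently selected node:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThreadNode:
    """One analysis step: the derived table, the chart built from it,
    and the instruction that produced it."""
    table_name: str
    chart_spec: dict
    instruction: str
    parent: Optional["ThreadNode"] = None
    children: list = field(default_factory=list)

    def branch(self, table_name, chart_spec, instruction):
        """Create a new step that reuses this node as context (branching)."""
        child = ThreadNode(table_name, chart_spec, instruction, parent=self)
        self.children.append(child)
        return child

    def local_thread(self):
        """The linear history from the root to this node (the 'local' view)."""
        path, node = [], self
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))
```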
The system provides multiple avenues for users to inspect AI-generated results (data, code, explanations, chat history) and allows direct manipulation of chart styles without AI intervention. This empowers users and builds trust.
The "goal refinement" step, where the LLM elaborates the user's intent into a JSON object before code generation, is an important design feature for improving transformation accuracy. While its rationale is clearly stated, the Method section could enhance reader understanding by briefly clarifying how, or if, this refined goal is exposed to the user. Knowing whether users can inspect or influence this AI-interpreted goal before final code generation is pertinent to understanding the system's transparency and the user's agency in the AI-assisted workflow. This is a medium-impact suggestion as it relates to the interpretability of the AI's intermediate reasoning and user oversight.
Implementation: Add a sentence after describing the "goal refinement" step (page 6) to specify if the refined JSON is visible to the user (e.g., in logs, or as a pre-confirmation step) or if it's a purely internal process. For instance: "This refined JSON goal is logged as part of the interaction history, accessible via the 'view chat history' pop-up, allowing users to retrospectively understand the LLM's interpretation, though it is not presented for pre-confirmation in the current design."
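It may also help readers to see what such a refined goal could look like; the structure below is purely hypothetical (field names and keys invented for illustration), since the paper defines its own schema:

```python
# Hypothetical shape of a refined transformation goal (names are illustrative only).
refined_goal = {
    "output_fields": ["Country", "Year", "Renewable Percentage"],
    "transformation_summary": "Compute renewable share of total electricity per country and year",
    "computation_steps": [
        "Sum electricity from solar, wind, and hydro sources",
        "Divide by total electricity generation",
        "Multiply by 100 to obtain a percentage",
    ],
    "source_table": "energy_2000_2020",
}
```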
The paper comprehensively describes error handling for the AI-generated Python data transformation code. However, after successful code execution, the process involves instantiating the Vega-Lite script with the new data, including inferring semantic types. It would strengthen the Method section to briefly address how Df2 handles potential errors that might arise specifically during this Vega-Lite instantiation or subsequent rendering phase (e.g., type mismatches not caught by Python, Vega-Lite spec errors, or rendering engine issues with the transformed data). This is a low-to-medium impact suggestion that would provide a more complete picture of the system's robustness.
Implementation: Following the description of Vega-Lite script instantiation (page 7), add a sentence clarifying the handling of errors at this stage. For example: "If errors arise during the Vega-Lite instantiation or rendering (e.g., due to incompatible data types with the chart template or malformed Vega-Lite specifications), Df2 currently surfaces these errors to the user, prompting a revision of either the chart design or the transformation logic. Future work could explore AI-assisted diagnostics for such visualization-specific errors."
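If the authors wished to illustrate such a check, a lightweight pre-render validation along the following lines could catch type mismatches before the Vega-Lite runtime fails; this is a hypothetical sketch, as DF2's actual handling at this stage is not described in the paper:

```python
def validate_against_spec(spec, data):
    """Check that every encoded field exists in the transformed data and that
    quantitative channels hold numeric values; return a list of problems."""
    problems = []
    sample = data[0] if data else {}
    for channel, enc in spec.get("encoding", {}).items():
        name = enc.get("field")
        if name is None:
            continue
        if name not in sample:
            problems.append(f"Field '{name}' (channel '{channel}') is missing from the transformed data.")
        elif enc.get("type") == "quantitative" and not isinstance(sample[name], (int, float)):
            problems.append(f"Field '{name}' is encoded as quantitative but holds {type(sample[name]).__name__} values.")
    return problems
```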
Figure 6: DF2's workflow: (1) DF2 generates a Vega-Lite spec skeleton based on user specifications and chart type. (2) If new fields (e.g., Rank) are required, DF2 prompts its AI model to generate data transformation code. (3) The Vega-Lite skeleton is then instantiated with the new data to produce the desired chart.
Figure 7: DF2 converts user encodings into a Vega-Lite specification, which is combined with AI-transformed data to visualize country ranks in 2000 and 2020.