Bring Your Own AI

Type: DataEditorChat<TRow>

When provided, the editor shows a chat panel alongside the grid. Users type natural language (“fix all emails”, “fill empty roles with Viewer”) and your AI turns those requests into changes the SDK applies to the data.

You bring the AI integration. The SDK gives you context about the current dataset and applies your streamed response.

  1. The user types a message.
  2. The SDK calls your onMessage(context) with the prompt and dataset context (columns, a row sample, error summary, counts).
  3. You build a prompt, call your LLM, and stream chunks back to the SDK.
  4. The SDK renders status and message text, applies ops and rows, and groups each chunk into a single undo step.

type DataEditorChat<TRow> = {
  sampleSize?: number;
  emptyTitle?: string;
  suggestionsCount?: number;
  loadSuggestions?: (context: ChatDataContext<TRow>) => Promise<string[]>;
  onMessage: (context: ChatContext<TRow>) => AsyncIterable<ChatResponseChunk<TRow>>;
  onCancel?: () => void;
};

| Field | Purpose |
| --- | --- |
| sampleSize | How many rows to include in the context sample. |
| emptyTitle | Title above the empty-state suggestion chips. |
| suggestionsCount | How many suggestion chips (and loading skeletons) the panel shows. |
| loadSuggestions | Generates the empty-state suggestion chips. |
| onMessage | Handles the user’s message. Receives dataset context, streams back response chunks. |
| onCancel | Called when the user cancels a pending request. |

sampleSize

Type: number

How many rows to include in context.sample. The SDK picks a representative slice and prioritizes rows with validation errors. When omitted, the SDK chooses.

Raise it when your LLM benefits from broader coverage. Lower it to save tokens. The sample is for the LLM to learn the shape of the data, not a working set. For full-dataset reads, call context.getRows().
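
To illustrate the split between the sample and getRows(), here is a minimal sketch that aggregates over every row locally and produces a small summary that is cheap to include in a prompt. The ChatRow shape is copied from this page; summarizeRows and RowStats are hypothetical names, not part of the SDK.

```typescript
// Local copy of the ChatRow shape documented on this page.
type ChatRow<TRow> = {
  data: TRow;
  status: "new" | "edited" | "original";
  errors: Record<string, string[]>;
  source: string;
};

// Hypothetical summary shape: small enough to serialize into a prompt.
type RowStats = {
  total: number;
  withErrors: number;
  byStatus: { new: number; edited: number; original: number };
};

// Hypothetical helper: walk the full dataset locally instead of sending it to the LLM.
function summarizeRows<TRow>(rows: ChatRow<TRow>[]): RowStats {
  const stats: RowStats = {
    total: rows.length,
    withErrors: 0,
    byStatus: { new: 0, edited: 0, original: 0 },
  };
  for (const row of rows) {
    stats.byStatus[row.status] += 1;
    // A row counts as invalid when any field carries at least one message.
    if (Object.values(row.errors).some((messages) => messages.length > 0)) {
      stats.withErrors += 1;
    }
  }
  return stats;
}
```

Inside onMessage this would be called as summarizeRows(context.getRows()), with only the resulting counts added to the prompt.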

onMessage

Type: (context: ChatContext<TRow>) => AsyncIterable<ChatResponseChunk<TRow>>

Called when the user sends a message. Returns an async iterable of response chunks. The SDK streams them into the UI as they arrive.

onMessage is where the real integration work lives. There are three things to get right:

  1. Read the context the SDK gives you (below).
  2. Write a transformation function the SDK can apply per row.
  3. Write a prompt that makes your LLM produce that function.

The rest of this section walks through each one, then covers what to stream back.

type ChatContext<TRow> = {
  message: string;
  columns: DataEditorColumn[];
  primaryKey: keyof TRow;
  totalRowCount: number;
  filteredRowCount: number;
  sample: ChatRow<TRow>[];
  errorSummary: ChatErrorSummary[];
  getRows: () => ChatRow<TRow>[];
};

| Field | What it is | What to do with it |
| --- | --- | --- |
| message | The user’s chat prompt. | Pass it through to your LLM as the user message. |
| columns | Full column definitions, including id, editor, unique. | Serialize a compact schema into your prompt so the LLM knows the field shape. |
| primaryKey | The row identifier field. | Only needed if you emit rows chunks (the SDK matches rows by this key). |
| totalRowCount | Rows in the dataset. | Lets the LLM reason about scope. |
| filteredRowCount | Rows in the current filtered view. | Ops only apply to these. Tell the LLM “you are looking at N of M rows” when it matters. |
| sample | A representative slice, weighted toward rows with errors. Size controlled by sampleSize. | Send to the LLM as example data. Each item is a ChatRow. |
| errorSummary | Aggregated error counts grouped by field and message, with a few example values. | The single most useful field for “fix all the bad emails” requests. Send the whole thing. |
| getRows | Function that returns every row, status and errors included. | Use for full-dataset operations (counting, statistics). Avoid sending the result wholesale to an LLM. |

type ChatRow<TRow> = {
  data: TRow;
  status: "new" | "edited" | "original";
  errors: Record<string, string[]>;
  source: string;
};

data is the row keyed by column ID. errors is keyed by column ID with one or more validation messages per field. status tells you whether the row is freshly added, edited since import, or unchanged. source is which import source the row came from.

type ChatErrorSummary = {
  field: string;
  message: string;
  count: number;
  examples: string[];
};

One entry per (field, message) pair across the current view. count is how many rows hit it. examples is a short list of values that triggered the error.

The point of errorSummary is to let the LLM see what is wrong without sending every bad row. A prompt like “fix all the email errors” works because the LLM sees { field: "email", message: "Invalid email", count: 47, examples: ["jane@", "bob..com"] } and infers the fix pattern.

The SDK applies changes by running a function you provide, once per row in the current filtered view. You send the function as a string of JavaScript source inside an ops chunk; the SDK runs it.

type ChatOp =
  | { action: "edit"; fn: string } // (r, ctx) => void
  | { action: "delete"; fn: string }; // (r, ctx) => boolean

| action | Signature | Behavior |
| --- | --- | --- |
| "edit" | (r, ctx) => void | Mutate r in place. Changed fields become column deltas. Rows with no changes are no-ops. |
| "delete" | (r, ctx) => boolean | Truthy flags the row for soft deletion. Subsequent ops skip this row. |

  • Use exact column IDs. r is keyed by the id of each column from your schema. Not the title, not a shortened version. r["firstName"], not r["first"].
  • ctx.opts[fieldId] is a Set<string> of allowed values for columns with editor.type === "select". Use ctx.opts.country.has(value); never inline the option list into the function source.
  • Per-row errors are silent. If the function throws on a row, that row is skipped and the rest continue. This is good for resilience (one malformed row does not abort the batch) but means buggy functions can quietly no-op. Always include a message chunk in your response that summarizes intent, so the user can compare it to what they see in the grid.
  • Multiple ops in one ops chunk run in array order, per row. Use multiple ops when the request has independent steps (“set country to ‘US’ for empty rows, then delete inactive users”).
  • delete is terminal for the row. Once an op flags a row, later ops in the same chunk do not see that row.
  • One ops chunk produces one undo step. Splitting a single user request across multiple chunks splits its undo.

// Normalize emails: trim whitespace, lowercase, append ".com" if missing TLD.
// The regex dot is written as \\. because fn is a template literal: a single
// backslash would be dropped and the pattern would match any character.
{
  action: "edit",
  fn: `(r) => {
    if (typeof r.email !== "string") return;
    let v = r.email.trim().toLowerCase();
    if (v && !/\\.[a-z]{2,}$/.test(v)) v += ".com";
    r.email = v;
  }`
}

// Delete rows where country is not in the allowed set.
{
  action: "delete",
  fn: `(r, ctx) => !ctx.opts.country.has(r.country)`
}
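
Because per-row errors are silent, it can help to dry-run generated ops against a few rows before yielding them. The sketch below mimics the rules listed above (array order per row, delete is terminal, a throwing function skips the row). It is not the SDK's implementation: dryRunOps, compileOp, and the result shape are hypothetical, and it assumes that a row whose function throws keeps its original values.

```typescript
type ChatOp =
  | { action: "edit"; fn: string }
  | { action: "delete"; fn: string };

type Row = Record<string, unknown>;
type OpContext = { opts: Record<string, Set<string>> };
type OpFn = (r: Row, ctx: OpContext) => unknown;
type DryRunResult = { kept: Row[]; deleted: number; skipped: number };

// Turn fn source into a callable. The parentheses make the source an expression.
// A syntax error in the source throws here, which is useful feedback in a dry run.
function compileOp(source: string): OpFn {
  return new Function(`return (${source});`)() as OpFn;
}

// Hypothetical dry run: applies ops to copies of the rows and reports what happened.
function dryRunOps(rows: Row[], ops: ChatOp[], ctx: OpContext): DryRunResult {
  const compiled = ops.map((op) => ({ action: op.action, run: compileOp(op.fn) }));
  const result: DryRunResult = { kept: [], deleted: 0, skipped: 0 };
  for (const original of rows) {
    const draft: Row = { ...original }; // shallow copy so the input is never mutated
    let outcome: "kept" | "deleted" | "skipped" = "kept";
    for (const op of compiled) {
      try {
        const returned = op.run(draft, ctx);
        if (op.action === "delete" && returned) {
          outcome = "deleted"; // terminal: later ops do not see this row
          break;
        }
      } catch {
        outcome = "skipped"; // assumption: a throwing op leaves the row unchanged
        break;
      }
    }
    if (outcome === "deleted") {
      result.deleted += 1;
    } else if (outcome === "skipped") {
      result.skipped += 1;
      result.kept.push(original);
    } else {
      result.kept.push(draft);
    }
  }
  return result;
}
```

Running this over context.sample and comparing skipped against the sample size gives a cheap signal that a generated function is quietly failing.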

Your prompt has two parts: a system prompt that pins down the output contract, and a user message built from the request and the dataset context.

The LLM has no idea what shape the SDK expects. Your system prompt has to spell it out:

  1. Output a JSON object with optional ops and optional message. Nothing else at the top level.
  2. Each op is { "action": "edit" | "delete", "fn": "<JavaScript function source>" }.
  3. r is keyed by exact column IDs from the schema in the user message.
  4. edit functions mutate r in place and return nothing. delete functions return a boolean.
  5. For select columns, use ctx.opts[fieldId].has(value); never inline the option list.
  6. Skip values you cannot fix. Do not clear or guess unless the user asked for that.

Build the user message from ChatContext:

  • A compact schema: per column, id plus editor.type, and options for selects.
  • The errorSummary rendered as a short list: field: "message" (count×) e.g. "example1", "example2".
  • A trimmed sample: just data, plus an _errors field for rows that have validation errors.
  • The user’s message as the final line.

Keep it compact. The LLM does not need a row’s status or source to write a transformation; it needs the schema and a few representative rows.
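
The list above could be assembled roughly as follows. This is a sketch under assumptions: PromptColumn is a simplified, hypothetical column shape (id, editor type, and an options list for selects) that you would map from your real column definitions, and buildUserMessage, formatErrorSummary, and trimSample are illustrative helper names.

```typescript
// Simplified, assumed column shape. Map your real column definitions into it.
type PromptColumn = { id: string; editorType: string; options?: string[] };
type ChatErrorSummary = { field: string; message: string; count: number; examples: string[] };
type PromptRow = { data: Record<string, unknown>; errors: Record<string, string[]> };
type PromptInput = {
  message: string;
  columns: PromptColumn[];
  totalRowCount: number;
  filteredRowCount: number;
  sample: PromptRow[];
  errorSummary: ChatErrorSummary[];
};

// One line per (field, message) pair: field: "message" (count×) e.g. "a", "b"
function formatErrorSummary(summary: ChatErrorSummary[]): string {
  return summary
    .map((e) => {
      const examples = e.examples.map((x) => JSON.stringify(x)).join(", ");
      return `${e.field}: "${e.message}" (${e.count}×) e.g. ${examples}`;
    })
    .join("\n");
}

// Keep only data, plus _errors for rows that have validation errors.
function trimSample(sample: PromptRow[]): Record<string, unknown>[] {
  return sample.map((row) => {
    const hasErrors = Object.values(row.errors).some((m) => m.length > 0);
    return hasErrors ? { ...row.data, _errors: row.errors } : { ...row.data };
  });
}

function buildUserMessage(input: PromptInput): string {
  const schema = input.columns
    .map((c) =>
      c.options ? `- ${c.id} (${c.editorType}: ${c.options.join(" | ")})` : `- ${c.id} (${c.editorType})`,
    )
    .join("\n");
  return [
    `You are looking at ${input.filteredRowCount} of ${input.totalRowCount} rows.`,
    "Schema:",
    schema,
    "Errors:",
    formatErrorSummary(input.errorSummary) || "none",
    "Sample rows (JSON):",
    JSON.stringify(trimSample(input.sample)),
    input.message, // the user's message is the final line
  ].join("\n");
}
```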

The skeleton below is enough to get a working integration. Extend it with fix patterns specific to your domain (date formats, phone normalization, etc.).

You generate per-row transformations for spreadsheet data.

Output a single JSON object:
  { "ops": [ ... ], "message": "short summary" }
Both fields are optional. Omit "ops" for non-data requests (questions, clarifications).

Each op has one of two shapes:
  { "action": "edit", "fn": "(r, ctx) => { ... }" } // mutate r in place
  { "action": "delete", "fn": "(r, ctx) => <boolean>" } // truthy flags the row

Rules:
- r is keyed by EXACT column IDs from the schema below. Use r["id"]; do not shorten or rename.
- edit functions mutate r in place. Do not return.
- delete functions return a boolean. True means flag the row for deletion.
- ctx.opts[fieldId] is a Set of allowed values for select columns. Use .has(value). Never inline the option list.
- Skip values you cannot confidently fix. Do not clear or guess.

Examples:

Request: "Lowercase all emails."
{
  "ops": [
    { "action": "edit", "fn": "(r) => { if (typeof r.email === 'string') r.email = r.email.toLowerCase(); }" }
  ],
  "message": "Lowercased all email values."
}

Request: "Remove rows where country is not in the allowed list."
{
  "ops": [
    { "action": "delete", "fn": "(r, ctx) => !ctx.opts.country.has(r.country)" }
  ],
  "message": "Flagged rows with an unrecognized country for deletion."
}

If your model supports JSON mode or structured output (DeepSeek, OpenAI, Anthropic with tool use, Gemini), enable it. It eliminates the most common failure mode (the model wrapping JSON in prose, code fences, or commentary).
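
Even with structured output enabled, it is worth validating the reply before turning it into chunks. The following is a minimal sketch, assuming the reply arrives as one complete JSON string in the shape the system prompt asks for. parseModelReply and isChatOp are hypothetical helper names, and the fallback wording is only a placeholder.

```typescript
type ChatOp =
  | { action: "edit"; fn: string }
  | { action: "delete"; fn: string };

type ReplyChunk =
  | { type: "ops"; content: ChatOp[] }
  | { type: "message"; content: string };

const FALLBACK: ReplyChunk = {
  type: "message",
  content: "Sorry, I could not turn that request into a change.",
};

// Accept only the two documented op shapes, with a non-empty fn string.
function isChatOp(value: unknown): value is ChatOp {
  if (typeof value !== "object" || value === null) return false;
  const op = value as Record<string, unknown>;
  const validAction = op.action === "edit" || op.action === "delete";
  return validAction && typeof op.fn === "string" && op.fn.trim().length > 0;
}

// Hypothetical helper: converts the model's raw JSON text into response chunks.
function parseModelReply(raw: string): ReplyChunk[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [FALLBACK]; // prose, code fences, or truncated JSON
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return [FALLBACK];
  }
  const reply = parsed as Record<string, unknown>;
  const chunks: ReplyChunk[] = [];
  const ops = Array.isArray(reply.ops) ? reply.ops.filter(isChatOp) : [];
  // All valid ops go into a single chunk so the request stays one undo step.
  if (ops.length > 0) chunks.push({ type: "ops", content: ops });
  if (typeof reply.message === "string" && reply.message.length > 0) {
    chunks.push({ type: "message", content: reply.message });
  }
  return chunks.length > 0 ? chunks : [FALLBACK];
}
```

Dropping malformed ops rather than rejecting the whole reply is a design choice; a stricter integration might refuse the reply and retry the request instead.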

onMessage returns an async iterable of chunks. The SDK consumes them in order.

type ChatResponseChunk<TRow> =
  | { type: "status"; content: string }
  | { type: "message"; content: string }
  | { type: "rows"; content: TRow[] }
  | { type: "ops"; content: ChatOp[] };

| Type | When to use it |
| --- | --- |
| status | Progress text shown while the request is running (“Analyzing 500 rows…”). Replaced by the next status or by the final message. |
| message | Final chat reply shown to the user. Send one per response, at the end. |
| ops | Per-row transformations applied to the current filtered view. The recommended path for data changes. |
| rows | Concrete row data to merge into the grid, matched by primaryKey. Use as an escape hatch (see below). |

A typical response looks like:

yield { type: "status", content: "Fixing emails..." };
yield { type: "ops", content: [{ action: "edit", fn: "..." }] };
yield { type: "message", content: "Fixed 47 invalid emails." };

Prefer ops over rows for data changes, for three reasons:

  • Scales. One function string applies to a 500-row or 50,000-row filtered view. You do not pay for the row payload going through the LLM.
  • Supports deletion. Only ops can flag rows for soft delete.
  • Undo grouping. All ops in a single chunk become one undo step.

Use rows when you already have the concrete row data and a transformation function would be awkward. The classic case is enrichment from an external API: you fetched fresh data for a set of rows and want to merge it in by primaryKey. For more than a few hundred rows, prefer ops.
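
As an illustration of the enrichment case, the sketch below yields a rows chunk built from an external lookup. Everything domain-specific is hypothetical: the Company row type, its id primary key, and the lookupIndustry callback that stands in for your external API.

```typescript
type RowsResponseChunk<TRow> =
  | { type: "status"; content: string }
  | { type: "message"; content: string }
  | { type: "rows"; content: TRow[] };

// Hypothetical row type. "id" is assumed to be the primaryKey.
type Company = { id: string; domain: string; industry?: string };

// lookupIndustry stands in for an external API call; it is not part of the SDK.
async function* enrichRows(
  rows: Company[],
  lookupIndustry: (domain: string) => Promise<string | undefined>,
): AsyncGenerator<RowsResponseChunk<Company>> {
  yield { type: "status", content: `Looking up ${rows.length} companies...` };
  const enriched: Company[] = [];
  for (const row of rows) {
    const industry = await lookupIndustry(row.domain);
    // Keep the primary key on each row so it can be matched on merge.
    if (industry !== undefined) enriched.push({ ...row, industry });
  }
  if (enriched.length > 0) yield { type: "rows", content: enriched };
  yield { type: "message", content: `Enriched ${enriched.length} of ${rows.length} rows.` };
}
```

Inside onMessage, the input rows would typically come from context.getRows().map((row) => row.data), and the generator can be delegated to with yield*.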

The fn string is JavaScript that runs in your user’s browser. Ops only apply to the rows in the current filtered view, which gives the user a natural scope. For destructive operations the user did not explicitly ask for, surface a confirmation or preview step before applying the chunk.

loadSuggestions

Type: (context: ChatDataContext<TRow>) => Promise<string[]>

Generates the suggestion chips shown when the chat is empty. Receives the same dataset context as onMessage, minus message. Return an array of suggestion strings.

type ChatDataContext<TRow> = Omit<ChatContext<TRow>, "message">;

Use the dataset to make the suggestions specific. Suggestions that mention the data the user is looking at convert better than generic ones.

// A property of your DataEditorChat object.
loadSuggestions: async (context) => {
  const hasEmailErrors = context.errorSummary.some((e) => e.field === "email");
  return [
    hasEmailErrors ? "Fix invalid emails" : "Remove duplicate rows",
    "Fill empty roles with 'Viewer'",
    "Delete rows with no salary",
  ];
},

For more variety, call your LLM here too with a shorter prompt: “Given this schema and error summary, suggest three useful one-line edits.”
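
A middle ground between hard-coded chips and an LLM call is to derive suggestions from errorSummary directly. The sketch below ranks fields by total error count and pads with fallbacks. suggestionsFromErrors is a hypothetical helper, and the chip wording is only an example.

```typescript
type ChatErrorSummary = {
  field: string;
  message: string;
  count: number;
  examples: string[];
};

// Hypothetical helper: one suggestion per field, most errors first, padded with fallbacks.
function suggestionsFromErrors(
  summary: ChatErrorSummary[],
  limit: number,
  fallbacks: string[],
): string[] {
  // A field can appear under several messages, so total the counts per field.
  const perField = new Map<string, number>();
  for (const entry of summary) {
    perField.set(entry.field, (perField.get(entry.field) ?? 0) + entry.count);
  }
  const fromErrors = Array.from(perField.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([field, count]) => `Fix ${count} invalid ${field} value${count === 1 ? "" : "s"}`);
  return [...fromErrors, ...fallbacks].slice(0, limit);
}
```

Passing the same number as suggestionsCount for limit keeps the returned array and the skeleton placeholders the same length.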

emptyTitle

Type: string

Title shown above the suggestion chips when the chat is empty. Falls back to a sensible default if omitted.

suggestionsCount

Type: number

How many suggestions the chat panel expects. While loadSuggestions is still resolving, the SDK renders this many skeleton placeholders so the layout does not jump when the suggestions appear. Once loadSuggestions resolves, the SDK shows the returned array capped at this number.

Set it to the number of suggestions you actually return.

onCancel

Type: () => void

Called when the user cancels a pending AI request from the chat UI. Pair it with an AbortController to abort your in-flight fetch.

const abortRef = useRef<AbortController | null>(null);

const chat: DataEditorChat<Row> = {
  async *onMessage(context) {
    const controller = new AbortController();
    abortRef.current = controller;
    // ... fetch with controller.signal ...
  },
  onCancel() {
    abortRef.current?.abort();
    abortRef.current = null;
  },
};
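
Aborting a fetch makes the pending promise reject with an error named "AbortError", so the generator needs to decide what to do with it. This page does not say how the SDK treats an error thrown from onMessage, so the sketch below assumes you want to end the stream quietly on cancel and show a message for any other failure. cancellableReply and its send callback (a stand-in for your fetch call) are hypothetical.

```typescript
type TextChunk = { type: "status" | "message"; content: string };

// fetch rejects with a DOMException named "AbortError" when its signal is aborted.
function isAbortError(error: unknown): boolean {
  return typeof error === "object" && error !== null && (error as { name?: unknown }).name === "AbortError";
}

// send stands in for the request to your server; it must honor the signal.
async function* cancellableReply(
  controller: AbortController,
  send: (signal: AbortSignal) => Promise<string>,
): AsyncGenerator<TextChunk> {
  yield { type: "status", content: "Thinking..." };
  try {
    const reply = await send(controller.signal);
    yield { type: "message", content: reply };
  } catch (error) {
    if (isAbortError(error)) return; // the user cancelled: end without a message
    yield { type: "message", content: "Something went wrong. Please try again." };
  }
}
```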

The following is a working onMessage that POSTs to your server, reads a server-sent-events stream, and yields response chunks. Your server is responsible for calling the LLM and serializing each chunk as a data: <json>\n\n line.

import type { ChatResponseChunk, DataEditorChat } from "@updog/data-editor";
import { useMemo, useRef } from "react";

export function useChat<Row>(): DataEditorChat<Row> {
  // Holds the controller for the in-flight request so onCancel can abort it.
  const abortRef = useRef<AbortController | null>(null);

  return useMemo<DataEditorChat<Row>>(
    () => ({
      sampleSize: 50,
      loadSuggestions: async (ctx) => {
        if (ctx.errorSummary.length > 0) {
          return ["Fix the flagged values", "Delete rows with errors"];
        }
        return ["Lowercase all emails", "Trim whitespace everywhere"];
      },
      async *onMessage(context) {
        const controller = new AbortController();
        abortRef.current = controller;
        yield { type: "status", content: "Thinking..." };

        // Send the user's message and a compact view of the dataset context.
        const res = await fetch("/api/chat", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({
            message: context.message,
            primaryKey: context.primaryKey,
            totalRowCount: context.totalRowCount,
            filteredRowCount: context.filteredRowCount,
            errorSummary: context.errorSummary,
            sample: context.sample,
            columns: context.columns.map((c) => ({
              id: c.id,
              title: c.title,
              editor: c.editor,
              unique: c.unique,
            })),
          }),
          signal: controller.signal,
        });
        if (!res.ok || !res.body) {
          yield { type: "message", content: `Server error: ${res.status}` };
          return;
        }

        // Read the SSE stream and yield one chunk per complete "data: " line.
        const reader = res.body.getReader();
        const decoder = new TextDecoder();
        let buffer = "";
        while (true) {
          const { done, value } = await reader.read();
          if (done) break;
          buffer += decoder.decode(value, { stream: true });
          const lines = buffer.split("\n");
          // The last element may be a partial line; keep it for the next read.
          buffer = lines.pop() ?? "";
          for (const line of lines) {
            if (!line.startsWith("data: ")) continue;
            const data = line.slice(6).trim();
            if (data === "[DONE]") return;
            try {
              yield JSON.parse(data) as ChatResponseChunk<Row>;
            } catch {
              // skip malformed chunks
            }
          }
        }
      },
      onCancel() {
        abortRef.current?.abort();
        abortRef.current = null;
      },
    }),
    [],
  );
}

On the server side, your handler builds the system prompt (use the skeleton above as a starting point), calls your LLM, parses the JSON response, and emits each ChatResponseChunk as a data: <json>\n\n SSE line followed by data: [DONE]\n\n.
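
The SSE framing described above can be sketched with the standard Response and ReadableStream globals (available in Node 18+ and in edge runtimes). This is an illustration under assumptions: sseResponse and encodeSse are hypothetical helpers, the error wording is a placeholder, and the chunks iterable is expected to come from your own code that calls the LLM and parses its reply.

```typescript
type ChatOp =
  | { action: "edit"; fn: string }
  | { action: "delete"; fn: string };

type ServerChunk =
  | { type: "status"; content: string }
  | { type: "message"; content: string }
  | { type: "ops"; content: ChatOp[] };

// JSON.stringify escapes newlines, so each chunk stays on a single data: line.
function encodeSse(chunk: ServerChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Hypothetical helper: streams chunks as SSE and always terminates with [DONE].
function sseResponse(chunks: AsyncIterable<ServerChunk>): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for await (const chunk of chunks) {
          controller.enqueue(encoder.encode(encodeSse(chunk)));
        }
      } catch {
        // Report a failure as a normal message chunk so the client can render it.
        const failure: ServerChunk = { type: "message", content: "The request failed on the server." };
        controller.enqueue(encoder.encode(encodeSse(failure)));
      } finally {
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
      }
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" },
  });
}
```

A route handler for /api/chat would return sseResponse(...) with an async generator that yields a status chunk, calls the LLM, and then yields the parsed ops and message chunks.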