For Research Analysts
What you'll accomplish
By the end of this guide, you'll have a repeatable workflow for coding open-ended survey responses in ChatGPT — turning a 4–8 hour manual task into 45–90 minutes of AI-assisted work. You'll use a two-step approach: first develop a theme list, then code all responses against it in batches.
What you'll need
Before going to ChatGPT, get your data in order.
What you should see: A numbered list of clean verbatim responses in a spreadsheet.
Troubleshooting: If you have over 500 responses, split them into batches of 75–100 rows. This prevents the AI from truncating output.
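If you would rather script the numbering and batching than do it by hand in the spreadsheet, a minimal Python sketch could look like the following. It assumes the responses are already cleaned and loaded into a list of strings. The function names make_batches and format_batch are hypothetical, and the default batch size of 75 is just one value from the range suggested in this guide.

```python
def make_batches(responses, batch_size=75):
    """Number responses from 1 and split them into batches of at most batch_size.

    Row numbers are assigned before splitting, so they stay unique
    across batches and can be matched back to the spreadsheet.
    """
    numbered = list(enumerate(responses, start=1))
    return [numbered[i:i + batch_size] for i in range(0, len(numbered), batch_size)]


def format_batch(batch):
    """Render one batch as 'row. text' lines, ready to paste into a prompt."""
    return "\n".join(f"{row}. {text}" for row, text in batch)
```

Calling format_batch on each batch in turn gives one block of numbered text per paste, with numbering that continues from one batch to the next.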
Start a fresh ChatGPT conversation. Copy your first 50–75 responses and paste them with this prompt:
Here are open-ended responses to the survey question: "[paste your exact question text]"
[paste 50–75 verbatims here]
Please:
1. Identify 6–8 recurring themes across all responses
2. For each theme, give it a short label (2–4 words) and a 1-sentence definition
3. Note which 3 responses best illustrate each theme
Output only the theme list for now. We will code responses in the next step.
What you should see: A numbered theme list with definitions and example quotes.
Troubleshooting: If themes are too broad (e.g., "Negative Experience"), ask: "Break theme 3 into more specific subcategories."
In the same conversation (so the AI remembers your theme list), send this:
Now using the theme list above, code each of the following responses.
For each response, assign it to 1–2 of the themes from our list.
Output as a table: Row # | Response (first 8 words) | Theme 1 | Theme 2
Responses to code:
[paste rows 1–75 with their row numbers]
What you should see: A table with each response coded to 1–2 theme labels from your approved list.
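If you copy the coded table out of the chat as text, a short script can turn it back into structured rows for a spreadsheet. The sketch below is one possible approach, not a fixed format: it assumes the table arrives as pipe-separated lines in the column order requested in the prompt (Row # | Response | Theme 1 | Theme 2), and the function name parse_coded_table is hypothetical. A response snippet that itself contains a "|" character would break this simple parser.

```python
def parse_coded_table(table_text):
    """Parse 'Row # | Response | Theme 1 | Theme 2' lines into a list of dicts."""
    rows = []
    for line in table_text.splitlines():
        # Drop outer pipes (markdown-style tables) and split into cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip blank lines, the header row, and separator rows like |---|---|.
        if len(cells) < 3 or not cells[0].isdigit():
            continue
        # Keep up to two theme cells, ignoring empty or placeholder values.
        themes = [t for t in cells[2:4] if t and t.lower() not in {"-", "n/a", "none"}]
        rows.append({"row": int(cells[0]), "snippet": cells[1], "themes": themes})
    return rows
```

Each returned dict holds the row number, the response snippet, and a list of one or two theme labels, which makes it straightforward to join back to the original spreadsheet on row number.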
Continue in the same ChatGPT conversation:
Troubleshooting: If the AI starts inventing new theme names, say: "Stay with the original themes we defined. Do not create new categories."
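Invented theme names and skipped rows are easy to miss when reading a long table by eye. As a rough check, assuming you have the coded output as a dictionary mapping row number to a list of theme labels, something like the following could flag both problems. The function name find_coding_problems and the sample labels in the usage lines are hypothetical, for illustration only.

```python
def find_coding_problems(coded, approved_themes, expected_rows):
    """Return (unknown_themes, missing_rows) for one batch of coded output.

    coded           -- dict mapping row number to a list of theme labels
    approved_themes -- the theme labels from your approved list
    expected_rows   -- the row numbers you pasted in this batch
    """
    # Compare case-insensitively so 'slow support' matches 'Slow Support'.
    approved = {t.casefold() for t in approved_themes}
    unknown = sorted(
        {t for themes in coded.values() for t in themes if t.casefold() not in approved}
    )
    missing = sorted(set(expected_rows) - set(coded))
    return unknown, missing


# Usage with made-up labels: row 3 was skipped and one label is not on the list.
coded = {1: ["Slow Support"], 2: ["Pricing", "Billing Confusion"], 4: ["slow support"]}
unknown, missing = find_coding_problems(coded, ["Slow Support", "Pricing"], range(1, 5))
```

If unknown is non-empty, that is the cue to send the correction above. If missing is non-empty, those rows can be re-pasted for coding.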
After all responses are coded:
What you should see: A summary table ready to paste into your report.
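Counts produced in a long chat can be worth double-checking before they go into a report. The sketch below recomputes the frequency table from your own record of the coding, assuming it is stored as a dictionary mapping row number to a list of theme labels. The function name theme_frequencies is hypothetical. It also makes one choice explicit: percentages are calculated against the number of responses, so with up to two themes per response the column can add up to more than 100%.

```python
from collections import Counter


def theme_frequencies(coded):
    """Return (theme, count, percent_of_responses) tuples, most frequent first.

    coded -- dict mapping row number to a list of 1-2 theme labels
    """
    total = len(coded)
    if total == 0:
        return []
    # set() guards against the same theme being listed twice for one response.
    counts = Counter(theme for themes in coded.values() for theme in set(themes))
    # Sort by descending count, then alphabetically so ties are stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(theme, n, round(100 * n / total, 1)) for theme, n in ordered]
```

If these numbers differ from the summary table the AI returns, check whether the AI used a different base for the percentages (for example, total theme mentions rather than total responses) before assuming a miscount.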
Theme generation:
Here are [N] responses to "[question text]". Identify 6–8 recurring themes. For each: short label (2–4 words), 1-sentence definition, best illustrating quote. Output theme list only.
Batch coding:
Using the theme list above, code each response below to 1–2 themes. Output as table: Row # | First 8 words | Theme 1 | Theme 2. Responses: [paste batch]
Frequency summary:
Based on all coding so far, give me a frequency table: Theme | Count | % of Total | 2 best quotes per theme.
Theme refinement:
Theme [X] seems too broad. Break it into 2–3 more specific subcategories based on the responses we've seen.