LLMs Turned Conversation Design Into Perpetual QA
Once upon a yesterday, most Conversation Designers owned the entire loop:
Empathize → Define → Ideate → Prototype → Test & QA → Implement
We wore every hat. Researcher. Writer. Prototyper. Builder. Validator. Sometimes PM. Sometimes engineer. Often all in one day.
When systems were slow, deterministic, and predictable, the work stayed centered on design. You could reason about behavior upfront and trust that it would mostly hold.
LLMs changed that.
With probabilistic models and constantly shifting outputs, the work is drifting toward system analysis: reviewing conversations, auditing behavior, tagging root causes. That’s where system behavior is actually inspected now. And it’s where trust is either built or broken.
But this kind of QA can’t stop at checklists or accuracy scores.
Maybe the wrong answer wasn’t a prompt issue.
Maybe the source data was outdated.
Maybe the model ignored relevant context.
Maybe it hallucinated entirely.
To fix it, you have to trace the full system: prompt → retrieval → model behavior → output → user impact.
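One way to make that trace concrete is to capture every stage on a single review record, so root causes get tagged consistently instead of living in someone's head. The sketch below is purely illustrative; the names (`TurnReview`, `RootCause`, the individual fields) are assumptions for this post, not any vendor's schema:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RootCause(Enum):
    """Illustrative failure categories for a reviewed turn."""
    PROMPT_ISSUE = "prompt_issue"          # instructions too open or ambiguous
    STALE_SOURCE = "stale_source"          # retrieved content was outdated
    IGNORED_CONTEXT = "ignored_context"    # model skipped relevant context it was given
    HALLUCINATION = "hallucination"        # output not grounded in any source
    OUT_OF_SCOPE = "out_of_scope"          # issue originated outside the assistant


@dataclass
class TurnReview:
    """One reviewed turn, traced across the full pipeline."""
    conversation_id: str
    user_message: str
    prompt_version: str                     # which prompt was live at the time
    retrieved_sources: list[str]            # doc IDs or URLs the model actually saw
    model_output: str
    user_impact: str                        # e.g. "quoted last year's refund policy"
    root_cause: Optional[RootCause] = None  # filled in during QA review
    notes: str = ""
```

The specific fields matter less than the shape: prompt, retrieval, model behavior, output, and user impact all live on one record, so a reviewer can tell which stage actually broke.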
And while you’re doing that, the ground keeps moving. Vendors update models. Engineering tweaks backend rules. Even subtle upstream changes can quietly unravel yesterday’s work. A prompt that behaved perfectly three days ago can fail today, without warning.
No single Conversation Designer can keep a system healthy long-term under those conditions. Not when models shift constantly and every fix cascades. Maintaining trust is a team sport now.
We may see fewer traditional CxD roles and more specialized paths emerge: analyst, conversation reviewer, QA lead, LLM workflow strategist. In chasing automation, we created perpetual QA.
Traditional QA asked, “Did this pass?”
LLM-era QA has to ask, “Is this still behaving as expected?”
That’s a full-time job.
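Here is one minimal way to make "is this still behaving as expected?" operational: a small regression check that re-runs known conversations on a schedule and checks properties of the output rather than exact strings. This is a sketch under assumptions; `call_assistant` is a hypothetical stub you would wire to your own endpoint, and `REGRESSION_CASES` is invented for illustration:

```python
from datetime import datetime, timezone


def call_assistant(prompt: str, user_message: str) -> str:
    # Hypothetical client for the deployed assistant; replace with your own call.
    raise NotImplementedError("wire this to your model endpoint")


# Expectations are phrased as properties of the output, not exact strings,
# because probabilistic outputs rarely match verbatim.
REGRESSION_CASES = [
    {
        "user_message": "How do I reset my password?",
        "must_mention": ["reset link", "email"],
        "must_not_mention": ["call support"],  # a deflection we already fixed
    },
]


def run_regression(prompt_version: str, prompt: str) -> list[dict]:
    """Re-run the same cases on a cadence and log drift, not just pass/fail."""
    results = []
    for case in REGRESSION_CASES:
        output = call_assistant(prompt, case["user_message"]).lower()
        missing = [p for p in case["must_mention"] if p not in output]
        forbidden = [p for p in case["must_not_mention"] if p in output]
        results.append({
            "checked_at": datetime.now(timezone.utc).isoformat(),
            "prompt_version": prompt_version,
            "user_message": case["user_message"],
            "passed": not missing and not forbidden,
            "missing": missing,
            "forbidden": forbidden,
        })
    return results
```

The point isn't these particular checks. It's that they run on a schedule, so a prompt that drifted after a vendor update gets caught before users notice.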
One way this work can be divided across roles:
- Discover → AI Analysts or Program Managers
- Design & Implement → Conversation Designers
- QA → Specialists auditing fallbacks and tagging patterns
- Gather Insights → Quality or CX leads triangulating metrics with confusion signals (Was that 1-star CSAT about the system, or about a branding change outside its scope?)
- Resolve → Once QA confirms the signal, Conversation Designers update prompts or route issues to the right team
Startups may still expect one person to do it all. It’s rarely sustainable.
And depending on how teams are structured, LLM evaluation often requires joint QA and review. QA might flag an issue, but someone still has to trace why. Was the content wrong? The model confused? The prompt too open? You can’t debug what you can’t diagnose. Without shared evaluation flows, ownership gets murky fast.
If your org has split the work, a few questions matter:
- Who owns what, especially if you have multiple bots?
- How do you keep context intact across handoffs?
- What happens when quality is judged through a different lens than the one it was designed with?
- How do you prevent Conversation Designers and QA from duplicating effort when both need to review the same conversations to understand model behavior?
This shift doesn’t mean conversation design matters less. It means the work has moved closer to the system itself.
Designing trust now means maintaining it.