Advanced Instruction Engineering for Declarative Agents

If you have the five-part framework down (role, tone, scope, guardrails, format), you have a working agent. This article gets you a production-grade one.

The 8000-Character Budget

Your instruction field tops out at 8,000 characters. That sounds generous until you start writing real guardrails, examples, and escalation paths. Every character is part of a contract between you and the model, so waste none.

You can author instructions in an external file and reference it in your manifest:

{
  "instructions": "$[file('instructions.txt')]"
}

This keeps your declarativeAgent.json clean, but the 8,000-character limit still applies to the resolved content. The file reference is a developer experience win, not a capacity hack.

💡 Tip

Run a quick character count before you deploy. A single copy-pasted policy paragraph can eat 600 characters and add zero value.
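One way to make that count routine is a small script. The sketch below assumes the instructions live in a local text file and that the limit is measured the way Python's len() measures it (Unicode code points); the platform may count differently, so treat a near-limit result with caution. The function names are illustrative, not part of any toolkit.

```python
from pathlib import Path

INSTRUCTION_LIMIT = 8_000  # the limit stated in this article


def check_budget(text: str, limit: int = INSTRUCTION_LIMIT) -> tuple[int, int]:
    """Return (characters used, characters remaining).

    Remaining is negative when the text is over the limit.
    """
    used = len(text)
    return used, limit - used


def check_file(path: str, limit: int = INSTRUCTION_LIMIT) -> bool:
    """Print a budget report for an instruction file; return True if it fits."""
    text = Path(path).read_text(encoding="utf-8")
    used, remaining = check_budget(text, limit)
    print(f"{path}: {used} / {limit} characters ({remaining} remaining)")
    return remaining >= 0
```

Calling check_file("instructions.txt") before each deploy is enough to catch a pasted paragraph that pushed the file over the limit.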

Strategies for staying under budget

Remove redundancy. If your guardrails already say “only answer about HR onboarding,” you don’t need to repeat that in the scope section.
Prefer lists over prose. “Handle: benefits, payroll, PTO, equipment requests” beats a paragraph saying the same thing.
Push reference material to knowledge sources. Your instructions should tell the agent how to behave. The what (policy docs, FAQ content, process guides) belongs in knowledge sources.
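Redundancy is easier to remove once you can see it. The following is a rough sketch that flags sentences appearing more than once in an instruction text. It only catches exact repeats after lowercasing and whitespace normalization, so a paraphrased duplicate still needs a human read; the function name is hypothetical.

```python
import re
from collections import Counter


def repeated_sentences(text: str) -> list[str]:
    """Return normalized sentences that occur more than once in the text."""
    # Split on sentence-ending punctuation or newlines. This is a heuristic,
    # not a real sentence parser.
    parts = re.split(r"[.!?\n]+", text)
    # Lowercase and collapse runs of whitespace so trivial differences match.
    normalized = [" ".join(part.lower().split()) for part in parts]
    counts = Counter(sentence for sentence in normalized if sentence)
    return [sentence for sentence, n in counts.items() if n > 1]
```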

Persona Engineering

“You are a helpful assistant” tells the model nothing it doesn’t already assume. Persona engineering means defining a specific identity the agent embodies consistently.

Identity

Give your agent a name, a role, and a domain of expertise. Here’s how that looks for our Zava Insurance HR Onboarding Buddy:

❌ Before:
You are a helpful assistant for HR questions.

✅ After:
You are Zara, the Zava Insurance HR Onboarding Buddy. You specialize in guiding new employees through their first 90 days: from benefits enrollment to equipment requests to team introductions.

“Zara” isn’t just branding. A named persona anchors the model’s behavior. The model is less likely to drift into generic assistant mode when it has a character to stay in.

Communication style

Define how the agent talks, not just what it knows. Consider formality, verbosity, and tone:

Speak in a warm, professional tone. Keep answers under 150 words unless the employee asks for more detail. Use bullet points for multi-step processes. Address the employee by first name when known.

Emotional intelligence

Production agents encounter frustrated users. Build empathy into the instructions:

If an employee expresses frustration, acknowledge their concern before answering. Example: "I understand that's frustrating: let me help sort this out." If an employee shares a milestone (first week done, completed training), briefly celebrate with them.

This isn’t fluff. Agents that handle emotion well get higher adoption rates.

Scope Boundaries and Guardrails

The basics article covered guardrails as “never” statements. Advanced guardrails use positive framing: telling the agent what to do instead of what to avoid.

❌ Before:
Don't answer questions about IT support.
Don't make up information.
Don't share confidential salary data.

✅ After:
You ONLY answer questions related to HR onboarding at Zava Insurance.
If asked about IT support, respond: "That's a great question for our IT Help Desk: you can reach them at [email protected]."
If you're unsure about an answer, say: "I want to make sure you get the right information. Let me connect you with your HR contact."
Never disclose specific salary figures. Instead, direct employees to their offer letter or HR representative.

The “after” version is longer, but every character works harder. The agent knows exactly what to say when it hits a boundary, not just that a boundary exists.

📝 Note

Positive framing tends to reduce hallucination. When you say “don’t talk about X,” the model has to work out what to do instead. When you say “redirect to Y,” it has a concrete action.

Handling ambiguity

Users don’t always ask clean questions. Define what happens when intent is unclear:

If the employee's question is ambiguous, ask one clarifying question before answering. Example: "Are you asking about your medical benefits or your dental plan?"

One clarifying question. Not three. Not zero. Specificity here prevents the agent from either guessing wildly or interrogating the user.

Modular Instruction Patterns

Think of your instruction file as a contract with clearly labeled sections. Here’s a structure that scales:

## Role
You are Zara, the Zava Insurance HR Onboarding Buddy...

## Scope
You help new employees with: benefits enrollment, PTO policies, equipment requests, team introductions, training schedules.

## Behavior
- Respond in a warm, professional tone
- Keep answers under 150 words unless asked for detail
- Use bullet points for multi-step processes

## Response Format
- Start with a direct answer, then provide context
- Include relevant links to internal resources when available
- End multi-step answers with "Would you like help with the next step?"

## Escalation
- Benefits disputes → "Please contact [email protected]"
- Payroll errors → "Please contact [email protected]"
- Anything outside HR onboarding → "That's outside my area. Here's who can help: [relevant contact]"

## Examples
Employee: "When do I get my laptop?"
Zara: "Equipment typically arrives within 3 business days of your start date. If it hasn't arrived, contact [email protected] and reference your employee ID."

Each section is a module you can iterate independently. When your agent gives bad escalation responses, you fix the Escalation section without touching Behavior.
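Because the sections are labeled, you can also measure them independently. The sketch below assumes each module starts with a "## " heading as in the structure above, and reports how many characters each section consumes, which shows where the budget is going. The function names are illustrative.

```python
import re


def split_sections(text: str) -> dict[str, str]:
    """Map each '## Heading' to the body text beneath it."""
    sections: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
        # Lines before the first heading are ignored in this sketch.
    return sections


def section_sizes(text: str) -> dict[str, int]:
    """Return the character count of each section body, without outer whitespace."""
    return {name: len(body.strip()) for name, body in split_sections(text).items()}
```

A section that keeps growing between iterations is usually a sign that reference material is creeping into the instructions.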

💡 Tip

Including 1-2 concrete examples in your instructions dramatically improves response quality. The model learns the expected format and tone from your examples far more reliably than from abstract rules.

Error Handling and Fallback Behaviors

What happens when things go wrong? Most instruction sets ignore this entirely, and the agent improvises poorly.

When the agent can’t find an answer

If you cannot find the answer in your knowledge sources, respond: "I don't have that information yet, but your HR contact can help. Reach out to [email protected]."

When a tool call fails

If your agent uses API plugins, tool calls can fail. Define the fallback:

If a tool call returns an error, do not expose the error to the employee. Instead respond: "I'm having trouble looking that up right now. Please try again in a few minutes, or contact your HR representative directly."

When the user goes off-script

If the conversation drifts significantly off-topic (more than two consecutive messages unrelated to HR onboarding), gently re-anchor: "I'm best at helping with onboarding questions: is there anything about your first 90 days I can help with?"

These aren’t edge cases. They’re Tuesday. Build for them.

Structured Workflows: Goal, Action, Transition

When your agent handles multi-step processes (ticket creation, onboarding flows, troubleshooting), flat instructions fall apart. The model merges steps, skips transitions, or invents its own order.

Structure each step with three parts: what the step achieves, what the agent does, and when to move on.

## Step 1: Gather Details
- Goal: Identify the employee's equipment request.
- Action: If the request is clear, proceed. If unclear, ask one clarifying question.
- Transition: Once the item and delivery location are confirmed, proceed to Step 2.

## Step 2: Check Inventory
- Goal: Verify availability using the `EquipmentAPI` action.
- Action: Query the equipment catalog. If available, share the estimated delivery date. If unavailable, offer alternatives.
- Transition: If the employee confirms, proceed to Step 3. If they want to browse, repeat this step.

## Step 3: Submit Request
- Goal: Create the equipment request ticket.
- Action: Collect employee ID, item, and location. Submit via `EquipmentAPI`.
- Transition: Confirm the ticket number and estimated delivery. End the conversation.

The Goal/Action/Transition pattern makes each step atomic and testable. When Step 2 behaves oddly, you fix Step 2 without touching the rest.
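"Testable" can start with a structural check on the instruction text itself. The sketch below assumes steps are written exactly as above ("## Step N: ..." headings followed by "- Goal:", "- Action:", and "- Transition:" lines) and reports any step that is missing one of the three parts. It checks structure only, not whether the wording is any good.

```python
REQUIRED_PARTS = ("Goal", "Action", "Transition")


def missing_parts(workflow: str) -> dict[str, list[str]]:
    """Return {step heading: [missing part names]} for incomplete steps."""
    found: dict[str, set[str]] = {}
    current = None
    for line in workflow.splitlines():
        if line.startswith("## Step "):
            current = line.strip()
            found[current] = set()
        elif current is not None:
            for part in REQUIRED_PARTS:
                prefix = f"- {part}:"
                # A part counts only if some text follows the label.
                if line.startswith(prefix) and line[len(prefix):].strip():
                    found[current].add(part)
    return {
        step: [part for part in REQUIRED_PARTS if part not in parts]
        for step, parts in found.items()
        if len(parts) < len(REQUIRED_PARTS)
    }
```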

💡 Tip

Reference your capabilities and actions by name in the instructions (e.g., “Use EquipmentAPI to check inventory”). This explicit naming helps the model route to the right tool instead of guessing.

Output Contracts and Self-Evaluation

If you don’t specify what the output should look like, the model decides for you. Some days it gives three bullet points, some days a five-paragraph essay. Output contracts lock down the format:

## Output Contract
- Format: Bullet list, max 5 items
- Tone: Professional, concise
- Detail level: One sentence per bullet
- Include: Action item, owner, deadline
- Exclude: Background context, recommendations, disclaimers

Pair every output contract with a self-evaluation gate. This short check catches errors that guardrails alone tend to miss:

## Self-Check
Before responding, verify: (1) all requested items are present, (2) no information was assumed, (3) the output matches the format above. If anything is missing, ask the user before proceeding.

Self-evaluation is especially valuable after model updates, where the agent might start reordering steps or adding unrequested detail.
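The self-check runs inside the model, so it helps to verify the same contract from the outside while testing. The sketch below checks a captured response against the format rules of the example contract (bullets only, at most five, one sentence each). The sentence test is a punctuation heuristic and the function name is hypothetical; it does not check the tone, include, or exclude rules.

```python
import re


def contract_violations(response: str, max_items: int = 5) -> list[str]:
    """Return a list of format problems; an empty list means the response passes."""
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    bullets = [line for line in lines if line.startswith("- ")]
    violations = []
    if len(bullets) != len(lines):
        violations.append("non-bullet text present")
    if len(bullets) > max_items:
        violations.append(f"more than {max_items} bullets")
    for bullet in bullets:
        # Sentence-ending punctuation followed by more text suggests 2+ sentences.
        if re.search(r"[.!?]\s+\S", bullet[2:]):
            violations.append(f"multiple sentences in bullet: {bullet}")
    return violations
```

Running this over saved responses after a model update gives an early signal that the agent has started adding unrequested detail.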

Define Your Domain Vocabulary

Every organization has terms the model will misinterpret. “PTO” might mean paid time off or the Patent and Trademark Office. “L3” might mean a support tier or a cache level. Define them explicitly:

## Vocabulary
- PTO: Paid Time Off (not Patent and Trademark Office)
- L1/L2/L3: Support escalation tiers (L1 = self-service, L2 = help desk, L3 = engineering)
- Badge: Physical building access card, not a digital achievement

This takes 30 seconds to write and prevents entire categories of wrong answers.

Anti-Patterns

After building and reviewing dozens of production agents, these are the patterns I see fail repeatedly.

Vague language. “Be helpful and professional” means nothing to the model. It’s already trying to be helpful. Tell it how: “Keep answers under 150 words, use bullet points for processes, always include a next step.”

Conflicting rules. Your scope says “only HR onboarding” but you’ve attached an API plugin that queries the IT ticketing system. The agent will try to use that tool. Align your instructions with your capabilities.

Instruction bloat. Copying your entire employee handbook into instructions is the #1 mistake I see. Instructions define behavior. Reference material belongs in knowledge sources. An agent stuffed with raw policy text hallucinates more, not less, because the signal-to-noise ratio drops.

Over-constraining. Thirty guardrails, fourteen “never” statements, and six mandatory response templates. The agent freezes or, worse, ignores half the rules because they conflict. Start with 5-7 strong rules, then add incrementally based on observed failures.

⚠️ Warning

If your agent starts refusing reasonable questions or giving robotic, template-heavy responses, you’ve probably over-constrained it. Pull back and simplify.

Controlling Reasoning Depth

Your phrasing signals how much thinking the model applies. This matters more than you’d expect.

For deep analysis, use reasoning verbs: “analyze,” “evaluate,” “compare alternatives,” “justify your recommendation.” Add meta-reasoning cues like “think step by step” or “reflect before answering.” These cues push the model toward slower, more deliberate reasoning.

For fast, deterministic answers, signal brevity: “Short answer only. No explanation. Return the final result.” Avoid analytical verbs and multi-step structures. Single-intent, single-phase phrasing keeps responses tight.

## Deep reasoning example
Analyze the employee's benefits eligibility based on their hire date, employment type, and region. Evaluate each plan option, compare coverage levels, and justify your recommendation. Think step by step.

## Fast reasoning example
Return the employee's PTO balance as a single number. No explanation.

Match the reasoning depth to the task. A ticket lookup needs fast reasoning. A benefits comparison needs deep reasoning. Mixing both styles in the same instruction sends the model conflicting signals.

The Iterative Testing Loop

Your first draft of instructions will be wrong. That’s not a failure; it’s the process.

The loop:

1. Write your initial instructions using the modular pattern above
2. Test in the Agents Playground: try happy paths, edge cases, and adversarial prompts
3. Observe where the agent drifts, hallucinates, or gives flat responses
4. Refine the specific section that failed: don’t rewrite everything
5. Repeat until edge cases are handled cleanly

For the Zava Insurance buddy, my first draft missed ambiguous benefit questions entirely. Users asking “what’s my coverage?” got a generic dump of all plan types. Adding one line fixed it:

If the employee asks about their specific coverage, ask which plan they're enrolled in before answering: medical, dental, or vision.

That’s one sentence. It took three rounds of testing to realize I needed it.

💡 Tip

Keep a log of failures you observe during testing. Group them by instruction section (Role, Scope, Behavior, etc.) and fix them in batches. This prevents whack-a-mole editing where fixing one issue creates another.
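A failure log does not need tooling beyond a small data structure. The sketch below is one possible shape, with hypothetical field names: each entry records which instruction section you believe is responsible, and the grouping function orders sections by failure count so the worst one gets fixed first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Failure:
    section: str   # instruction section believed responsible, e.g. "Escalation"
    prompt: str    # what the tester asked
    observed: str  # what the agent did wrong


def group_by_section(failures: list[Failure]) -> dict[str, list[Failure]]:
    """Group failures by section, most failures first."""
    grouped: dict[str, list[Failure]] = defaultdict(list)
    for failure in failures:
        grouped[failure.section].append(failure)
    # sorted() is stable, so sections with equal counts keep first-seen order.
    ordered = sorted(grouped.items(), key=lambda item: len(item[1]), reverse=True)
    return dict(ordered)
```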

The Value You Just Unlocked

Character budget mastery: You know how to stay within the 8,000-character limit by eliminating redundancy, using lists over prose, and pushing reference material to knowledge sources.
Persona engineering: Named personas with defined communication styles and emotional intelligence produce agents that feel purposeful, not generic.
Structured workflows: The Goal/Action/Transition pattern makes multi-step processes atomic and testable, with explicit tool references so the model routes to the right capability.
Output contracts and self-evaluation: Locked-down formats prevent inconsistent responses, and self-check gates catch errors before the user sees them.
Reasoning depth control: Matching your phrasing to the task (analytical verbs for deep reasoning, imperative phrasing for fast answers) keeps responses appropriate.
Domain vocabulary: Defining organization-specific terms prevents entire categories of misinterpretation.
Anti-pattern awareness: You can spot vague language, conflicting rules, instruction bloat, and over-constraining before they reach users.

Production-grade instructions are never a first draft. They are the result of testing, observing, and refining one section at a time.
