AoT 3

Slug: aot-3

(Note: This process is inherently iterative. Insights from later steps, particularly Step 6: Evaluate and Refine, frequently necessitate revisiting and adjusting earlier steps such as Step 2: Provide Context or Step 3: Identify XML Components.)
1. Problem Statement (AoT - Define Goal):

  • Elaboration: This foundational step establishes unambiguous direction, scope, and success criteria. It requires transforming vague ideas into a precise, measurable articulation of the prompt’s core function. A weak purpose hinders the entire process.
  • Pitfall: Overly broad or ambiguous purpose (e.g., “Improve text”).
  • Tip: Apply SMART criteria where feasible (Specific, Measurable, Achievable, Relevant, Time-bound) to the intended outcome. Clearly define the primary action verb (Generate, Analyze, Convert, etc.) and the expected output’s nature/format. Ask: How will we know if the prompt is successful?
  • Evolution Example: Vague: “Make code better.” -> Initial: “Review code.” -> Refined: “Analyze Python code snippets for PEP 8 violations.” -> Final XML: `<purpose>You are an expert code reviewer specializing in identifying PEP 8 style guide violations in Python code snippets provided in [[code-snippet]].</purpose>`
  • Task: Clearly define the primary goal. Consider LLM capabilities and fine-tuning.
  • Examples: “Generate valid Mermaid sequence diagrams…”, “Analyze Python code snippets for OWASP Top 10 vulnerabilities…”, “Convert natural language math in [[user-input]] to MathML…”, “Summarize articles from [[article-text]] into a three-bullet executive summary…”
  • Output: The distilled, specific goal populates the `<purpose>` tag, setting the LLM’s role and objective.
2. Provide Context:
  • Elaboration: Context grounds the prompt, providing background, constraints, assumptions, and implicit knowledge needed for correct interpretation. Omitting context forces risky LLM assumptions. Eliciting context involves understanding users, systems, and the domain.
  • 2a. Types of Contextual Information:
  • Technical Constraints: Language versions, libraries, output formats (JSON schema validation, specific XML DTD), performance needs (latency limits), system integrations.
  • Audience Assumptions: User’s expertise (novice/expert), role, language, cultural background impacting interpretation or desired output style.
  • Stylistic Requirements: Tone (formal/informal), voice (active/passive), length limits, formatting rules (Markdown, specific list styles), branding.
  • Ethical Guardrails: Content prohibitions (harmful, biased, illegal), privacy rules (PII handling), fairness considerations.
  • Domain-Specific Knowledge: Subject matter standards, implicit field conventions, required ontologies or terminologies.
  • 2b. Eliciting Context:
  • Techniques: Stakeholder interviews, reviewing system requirements documents, analyzing user personas, examining existing workflows, consulting domain experts. Ask “What implicit assumptions are we making?”
  • Task: Gather and document all relevant context.
  • Examples: “[[user_description]] is informal, expect errors,” “Output Mermaid must render in GitLab markdown (v16.x),” “Prioritize code review: 1st=Security, 2nd=Bugs…”, “Summary must be factual, cite sections from [[article-text]], avoid external info,” “Generated code needs error handling (FileNotFound, ValueError) and standard logging.”
  • Output: A clear understanding of the operational environment. This critical input informs subsequent steps, ensuring the prompt is situationally aware. It might populate a dedicated context or constraints section in the XML (a minimal sketch follows below).
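As one illustrative way to carry gathered context into the prompt, the sketch below renders a small context record into XML with the standard library; the tag names (context, audience, constraints, constraint) are assumptions for illustration, since the exact raw_data schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Context gathered in Step 2, recorded as plain data before prompt assembly.
gathered_context = {
    "audience": "Non-technical managers",
    "constraints": [
        "Summary must not exceed 100 words.",
        "Output format must be plain text.",
    ],
}

def build_context_element(ctx: dict) -> ET.Element:
    """Render gathered context into a <context> element (tag names are illustrative)."""
    root = ET.Element("context")
    ET.SubElement(root, "audience").text = ctx["audience"]
    constraints = ET.SubElement(root, "constraints")
    for rule in ctx["constraints"]:
        ET.SubElement(constraints, "constraint").text = rule
    return root

print(ET.tostring(build_context_element(gathered_context), encoding="unicode"))
```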
3. Identify XML Components:
  • Elaboration: This core phase maps the abstract task, purpose, and context onto the concrete structural elements required by the raw_data XML specification. It requires meticulous planning of what information the LLM needs and how it should be structured.
  • 3a. Identifying Variables ([[variable-name]]):
  • Task: Identify all dynamic inputs. Use clear, consistent names (e.g., customer_query, product_database_json). Consider data types/formats. Plan for nested data representation if needed/supported (e.g., [[user.profile.id]]).
  • Tip: Look for nouns or data entities mentioned in the purpose or context that will change with each execution.
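A minimal sketch of placeholder substitution at execution time, including dotted paths for nested data; the helper function and its behaviour are illustrative assumptions rather than part of the raw_data spec.

```python
import re

def fill_placeholders(template: str, variables: dict) -> str:
    """Replace [[name]] and [[a.b.c]] placeholders with values from a (possibly nested) dict."""
    def resolve(match: re.Match) -> str:
        path = match.group(1)
        value = variables
        for key in path.split("."):
            value = value[key]  # raises KeyError if a placeholder is undeclared
        return str(value)
    return re.sub(r"\[\[([\w.\-]+)\]\]", resolve, template)

prompt = "Analyze [[user_query]] for user [[user.profile.id]]."
print(fill_placeholders(prompt, {"user_query": "reset my password",
                                 "user": {"profile": {"id": 42}}}))
# -> Analyze reset my password for user 42.
```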
  • 3b. Crafting Instructions:
  • Task: Decompose the task into the smallest logical, actionable steps. Use clear, imperative verbs. Reference variables explicitly (Analyze [[user_query]]). Frame positively where possible, but use negative constraints for clarity (“Do not hallucinate information”). Consider meta-instructions (“Process instructions sequentially,” “Refer to the constraints section”). Handle conditionality explicitly (“If [[input_type]] is ‘A’, do X; if ‘B’, do Y”).
  • Pitfall: Ambiguous or overly complex instructions. Instructions assuming knowledge not provided in context.
  • Tip: Write instructions as if explaining the process to a literal-minded junior assistant. Test clarity by having someone else read them.
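For illustration, a sketch of what decomposed instructions with explicit conditionality, a negative constraint, and a meta-instruction might look like, held as a Python template string; the tag names and wording are assumptions, not the raw_data spec.

```python
# Illustrative instruction block: small imperative steps, explicit conditionality,
# and a negative constraint, referencing declared placeholders by name.
INSTRUCTIONS_TEMPLATE = """
<instructions>
  <instruction>Read [[user_query]] and classify [[input_type]] as 'question' or 'command'.</instruction>
  <instruction>If [[input_type]] is 'question', answer using only the provided context; if 'command', restate the requested action before responding.</instruction>
  <instruction>Do not hallucinate information that is absent from the context.</instruction>
  <instruction>Process these instructions sequentially.</instruction>
</instructions>
"""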
  • 3c. Determining Sections:
  • Task: Group related information logically using tags allowed/defined by raw_data. Common sections include purpose, context/constraints, instructions, and examples. Evaluate trade-offs between specific semantic tags vs. generic ones.
  • Goal: Readability, logical separation, clear roles for information.
  • 3d. Curating Examples:
  • Task: Create high-quality, diverse examples (3-5+) demonstrating the desired input-to-output transformation. Select examples covering:
  • Representativeness: Typical use cases.
  • Diversity: Different valid inputs/outputs.
  • Clarity: Easy to understand illustration of instructions.
  • Edge Cases: Handling boundaries, ambiguities, less common scenarios.
  • (Optional) Contrasting/Negative Examples: Show common errors to avoid.
  • Pitfall: Examples that are too simple, not diverse enough, or inconsistent with instructions.
  • Tip: Develop examples in parallel with instructions, ensuring they align perfectly.
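A sketch of one way to keep curated examples organized by coverage category before formatting them into the examples section; the field names and category labels are illustrative assumptions.

```python
# Each curated example records the category it covers, so gaps in coverage are visible.
curated_examples = [
    {"kind": "typical",  "input": "Our Q3 revenue grew 12%...",  "output": "Revenue rose 12% in Q3..."},
    {"kind": "diverse",  "input": "A deeply technical RFC...",   "output": "Plain-language summary..."},
    {"kind": "edge",     "input": "",                            "output": "No article text was provided."},
    {"kind": "negative", "input": "Article with speculation...", "output": "Summary citing only stated facts."},
]

covered = {example["kind"] for example in curated_examples}
missing = {"typical", "diverse", "edge", "negative"} - covered
assert not missing, f"Example set is missing categories: {missing}"
```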
  • Output: A detailed inventory: list of variables, sequence of instructions, chosen section structure, and curated input/output example pairs.
4. Structure as XML:
  • Elaboration: Assemble the components into the formal XML structure per raw_data. Precision is key; syntax errors break the prompt. Use XML editors/linters. Consider XML comments (`<!-- ... -->`) for maintainability if appropriate, mindful of token cost.
  • Task:
  • Map variables to [[placeholders]] in correct sections.
  • Place `<instruction>` tags sequentially under `<instructions>`.
  • Format the examples section correctly with consistent nested tags.
  • Perform rigorous consistency/validity check:
  • XML Validity: Adheres to raw_data schema? Use validator (xmllint, IDE plugins). Well-formed?
  • Placeholder Check: Placeholders match variable list? Correctly placed?
  • Instruction-Example Alignment: Examples reflect all relevant instructions? Output matches expected result?
  • Purpose Alignment: Instructions/examples fulfill the stated `<purpose>`?
  • Tip: Validate the XML structure frequently during construction; a minimal validity-and-placeholder check sketch follows after the example below.
  • Output: A well-formed, syntactically valid draft XML prompt. Example (with context/comments):

```xml
<prompt>
  <!-- Purpose: concise, factual article summaries for a managerial audience -->
  <purpose>Generate concise, factual summaries of technical articles for a managerial audience.</purpose>
  <context>
    <audience>Non-technical managers</audience>
  </context>
  <constraints>Summary must not exceed 100 words. Avoid technical jargon where possible; explain if unavoidable. Focus on key findings and business implications. Output format must be plain text.</constraints>
  <instructions>
    <instruction>Read the input [[article-text]] carefully. Identify the main topic, key findings, and potential business implications based only on the text. Synthesize these points into a concise summary, adhering to all constraints in the constraints section.</instruction>
    <instruction>Prioritize clarity and directness suitable for the specified audience. Output only the summary text.</instruction>
  </instructions>
  [[article-text]]
</prompt>
```
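A minimal sketch of the consistency/validity pass described above, assuming the draft lives in a hypothetical prompt.xml: it checks only well-formedness and placeholder coverage with the standard library; validating against the actual raw_data schema would still require a real validator such as xmllint.

```python
import re
import xml.etree.ElementTree as ET

def check_prompt(xml_text: str, declared_variables: set) -> list:
    """Return a list of problems found in a draft prompt (empty list means the checks passed)."""
    problems = []
    try:
        ET.fromstring(xml_text)  # well-formedness only, not schema validity
    except ET.ParseError as error:
        problems.append(f"Not well-formed XML: {error}")
    used = set(re.findall(r"\[\[([\w.\-]+)\]\]", xml_text))
    problems += [f"Undeclared placeholder: {name}" for name in used - declared_variables]
    problems += [f"Unused variable: {name}" for name in declared_variables - used]
    return problems

with open("prompt.xml", encoding="utf-8") as handle:
    print(check_prompt(handle.read(), {"article-text"}))
```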

5. Propose Final XML:
  • Elaboration: A formal checkpoint. Consolidate the structured components into the complete candidate XML. This is the testable hypothesis. Documenting the design rationale here aids future understanding.
  • Task: Formally propose the complete XML structure based on analysis and structuring. Hypothesize its effectiveness.
  • Output: The complete candidate XML prompt structure, ready for formal review and evaluation.
6. Evaluate and Refine:
  • Elaboration: The critical, iterative empirical validation phase. Systematically test to uncover weaknesses before deployment. Expect multiple cycles.
  • 6a. Evaluation Methodologies:
  • Peer Review: Essential for catching blind spots in logic, clarity, completeness.
  • Mental Simulation: Walk through logic with diverse hypothetical inputs (valid, invalid, edge cases).
  • LLM Testing (Systematic): Design a test suite (unit tests for prompts). Execute against target LLM(s) with varied inputs. Analyze outputs for correctness, consistency, robustness, adherence to format/constraints. Automate testing where feasible (e.g., using frameworks like pytest to wrap LLM calls and assert outputs); a minimal test sketch follows after this list.
  • Checklist Comparison: Use prompt quality checklists (clarity, specificity, bias, safety, etc.).
  • Metrics: Define relevant metrics (e.g., ROUGE/BLEU for summarization/translation, F1/Accuracy for classification, code quality scores, task success rate, factual consistency checks).
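A sketch of the pytest-style approach mentioned above, assuming a hypothetical call_llm client and a prompt.xml draft; the assertions mirror the example prompt's 100-word, plain-text constraints.

```python
from pathlib import Path
import pytest

PROMPT_TEMPLATE = Path("prompt.xml").read_text(encoding="utf-8")  # the draft from Step 4

def call_llm(rendered_prompt: str) -> str:
    """Placeholder for the real LLM client call; returns the model's text output."""
    raise NotImplementedError

CASES = [
    ("Short, plain article text.", "typical input"),
    ("Article full of jargon and acronyms (API, SLA, RPO).", "jargon-heavy input"),
    ("", "empty input (edge case)"),
]

@pytest.mark.parametrize("article_text, label", CASES)
def test_summary_respects_constraints(article_text, label):
    rendered = PROMPT_TEMPLATE.replace("[[article-text]]", article_text)
    output = call_llm(rendered)
    assert len(output.split()) <= 100, f"{label}: summary exceeds 100 words"
    assert "<" not in output, f"{label}: output should be plain text, not markup"
```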
  • 6b. Common Refinement Strategies:
  • Instruction Tuning: Rephrasing, adding/removing/reordering steps, adding explicit positive/negative constraints (e.g., “Output MUST be valid JSON.”); a quick format-assertion sketch follows after this list.
  • Example Curation: Adding/improving/removing examples, especially targeting observed failure modes or edge cases. Few-shot example quality is often critical.
  • Constraint Addition/Modification: Adjusting rules in the context or constraints sections based on test results.
  • Structural Reorganization: Modifying XML structure (if spec allows) for potentially better LLM interpretation.
  • Persona/Role Adjustment: Refining the persona set in the purpose section or adding/tuning dedicated role tags.
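When a refinement adds a hard format constraint such as “Output MUST be valid JSON,” the matching test can assert it mechanically; a small illustrative check:

```python
import json

def is_valid_json(output: str) -> bool:
    """True if the model's raw output parses as JSON (the constraint the refinement added)."""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

assert is_valid_json('{"summary": "Revenue rose 12% in Q3."}')
assert not is_valid_json("Here is your JSON: {oops}")
```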
  • Task: Evaluate against criteria (Clarity, Completeness, Correctness, Effectiveness, Robustness, Efficiency, Safety). Refine iteratively based on findings. Document changes.
  • Pitfall: Confirmation bias during testing; inadequate test coverage.
  • Tip: Design test cases before starting refinement to avoid bias. Aim for diverse test coverage. Be prepared for multiple Evaluate->Refine cycles.
  • Output: An evaluated, rigorously tested, iterated, and refined XML prompt structure with high confidence in its operational effectiveness.
7. Finalize XML Prompt:
  • Elaboration: Conclude the design/refinement phase. Lock down the validated structure. Establish version control and documentation.
  • Task: Finalize the XML based on successful testing. Implement version control (e.g., Git). Tag a stable version (e.g., v1.0). Create associated documentation (purpose, inputs, outputs, constraints, usage guide, known limitations).
  • Output: The final, validated, version-controlled, production-ready XML prompt, accompanied by essential documentation.
URL: https://ib.bsb.br/aot-3