*(Note: This process is inherently iterative. Insights from later steps, particularly Step 6: Evaluate and Refine, frequently necessitate revisiting and adjusting earlier steps like Step 2: Provide Context or Step 3: Identify XML Components.)*
**1. `Problem Statement (AoT - Define Goal & Purpose):`**
* **Elaboration:** This foundational step establishes unambiguous direction, scope, and success criteria. It requires transforming vague ideas into a precise, measurable articulation of the prompt's core function. A weak purpose hinders the entire process.
* *Pitfall:* Overly broad or ambiguous purpose (e.g., "Improve text").
* *Tip:* Apply SMART criteria where feasible (Specific, Measurable, Achievable, Relevant, Time-bound) to the intended outcome. Clearly define the primary action verb (Generate, Analyze, Convert, etc.) and the expected output's nature/format. Ask: How will we know if the prompt is successful?
* *Evolution Example:* Vague: "Make code better." -> Initial: "Review code." -> Refined: "Analyze Python code snippets for PEP 8 violations." -> Final XML: `You are an expert code reviewer specializing in identifying PEP 8 style guide violations in Python code snippets provided in [[code-snippet]].`
* **Task:** Clearly define the primary goal. Consider LLM capabilities and fine-tuning.
* Examples: "Generate valid Mermaid sequence diagrams...", "Analyze Python code snippets for OWASP Top 10 vulnerabilities...", "Convert natural language math in `[[user-input]]` to MathML...", "Summarize articles from `[[article-text]]` into a three-bullet executive summary..."
* **Output:** The distilled, specific goal populates the `<purpose>` tag, setting the LLM's role and objective.
**2. Provide Context:**
* **Elaboration:** Context grounds the prompt, providing background, constraints, assumptions, and implicit knowledge needed for correct interpretation. Omitting context forces risky LLM assumptions. Eliciting context involves understanding users, systems, and the domain.
* **2a. Types of Contextual Information:**
* *Technical Constraints:* Language versions, libraries, output formats (JSON schema validation, specific XML DTD), performance needs (latency limits), system integrations.
* *Audience Assumptions:* User's expertise (novice/expert), role, language, cultural background impacting interpretation or desired output style.
* *Stylistic Requirements:* Tone (formal/informal), voice (active/passive), length limits, formatting rules (Markdown, specific list styles), branding.
* *Ethical Guardrails:* Content prohibitions (harmful, biased, illegal), privacy rules (PII handling), fairness considerations.
* *Domain-Specific Knowledge:* Subject matter standards, implicit field conventions, required ontologies or terminologies.
* **2b. Eliciting Context:**
* *Techniques:* Stakeholder interviews, reviewing system requirements documents, analyzing user personas, examining existing workflows, consulting domain experts. Ask "What implicit assumptions are we making?"
* **Task:** Gather and document all relevant context.
* Examples: "`[[user_description]]` is informal, expect errors," "Output Mermaid must render in GitLab markdown (v16.x)," "Prioritize code review: 1st=Security, 2nd=Bugs...", "Summary must be factual, cite sections from `[[article-text]]`, avoid external info," "Generated code needs error handling (FileNotFound, ValueError) and standard logging."
* **Output:** A clear understanding of the operational environment. This critical input informs subsequent steps, especially `<instructions>` and `<constraints>`, ensuring the prompt is situationally aware. It might populate a dedicated `<context>` or `<background>` section in the XML.
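Where tooling helps, the elicited context can be captured as a small, reviewable inventory before it is mapped onto XML sections in Step 3. A minimal Python sketch; the category names and values are illustrative assumptions, not a required schema:
```python
# A context inventory for the article-summary prompt used as a running example.
# All keys and values are illustrative assumptions, not a required schema.
context_inventory = {
    "technical_constraints": ["Output format: plain text", "Summary length <= 100 words"],
    "audience_assumptions": ["Non-technical managers", "No domain jargon expected"],
    "stylistic_requirements": ["Formal tone", "Active voice"],
    "ethical_guardrails": ["No speculation beyond [[article-text]]", "No PII in output"],
    "domain_knowledge": ["Business-impact framing preferred"],
}
```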
**3. Identify XML Components:**
* **Elaboration:** This core phase maps the abstract task, purpose, and context onto the concrete structural elements required by the `raw_data` XML specification. It requires meticulous planning of *what* information the LLM needs and *how* it should be structured.
* **`3a. Identifying Variables ([[variable-name]]):`**
* *Task:* Identify all dynamic inputs. Use clear, consistent names (e.g., `customer_query`, `product_database_json`). Consider data types/formats. Plan for nested data representation if needed/supported (e.g., `[[user.profile.id]]`).
* *Tip:* Look for nouns or data entities mentioned in the purpose or context that will change with each execution.
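Placeholder discovery and naming checks are easy to script. A minimal sketch; the draft text and the lowercase-hyphen convention it enforces are assumptions to adapt to whatever standard you adopt:
```python
import re

# Hypothetical draft prompt text; the placeholder names are illustrative.
draft = "Analyze [[code-snippet]] for violations reported by [[linter_config]]."

# Extract every [[variable-name]] placeholder from the draft.
placeholders = re.findall(r"\[\[([A-Za-z0-9_.-]+)\]\]", draft)

# Enforce one convention (assumed here: lowercase words joined by hyphens).
NAME_RULE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
for name in placeholders:
    if not NAME_RULE.match(name):
        print(f"Inconsistent placeholder name: {name!r}")  # flags 'linter_config'
```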
**`3b. Crafting Instructions (<instruction>):`**
* *Task:* Decompose the task into the smallest logical, actionable steps. Use clear, imperative verbs. Reference variables explicitly (`Analyze [[user_query]]`). Frame positively where possible, but use negative constraints for clarity ("Do not hallucinate information"). Consider *meta-instructions* ("Process instructions sequentially," "Refer to constraints in `<constraints>`"). Handle conditionality explicitly ("If [[input_type]] is 'A', do X; if 'B', do Y").
* *Pitfall:* Ambiguous or overly complex instructions. Instructions assuming knowledge not provided in context.
* *Tip:* Write instructions as if explaining the process to a literal-minded junior assistant. Test clarity by having someone else read them.
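One way to keep decomposed instructions reviewable before assembly (Step 4) is to draft them as an ordered list in code. The wording below is illustrative only:
```python
# Decomposed instructions drafted as an ordered list before XML assembly.
# Wording is illustrative; each entry later becomes one <instruction> element.
instructions = [
    "Read the input [[article-text]] carefully.",
    "If [[article-text]] is empty, respond with 'NO INPUT' and stop.",  # explicit conditional
    "Identify the main topic and key findings based *only* on the text.",
    "Do not hallucinate information.",  # negative constraint for clarity
]
```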
**`3c. Determining Sections:`**
* *Task:* Group related information logically using tags allowed/defined by `raw_data`. Common sections: `<purpose>`, `<context>`/`<background>`, `<constraints>`, `<instructions>`, `<examples>`, `<input>`. Evaluate trade-offs between specific semantic tags vs. generic ones.
* *Goal:* Readability, logical separation, clear roles for information.
**`3d. Curating Examples (<examples>):`**
* *Task:* Create high-quality, diverse examples (3-5+) demonstrating the desired input-to-output transformation. Select examples covering:
* *Representativeness:* Typical use cases.
* *Diversity:* Different valid inputs/outputs.
* *Clarity:* Easy to understand illustration of instructions.
* *Edge Cases:* Handling boundaries, ambiguities, less common scenarios.
* *(Optional) Contrasting/Negative Examples:* Show common errors to avoid.
* *Pitfall:* Examples that are too simple, not diverse enough, or inconsistent with instructions.
* *Tip:* Develop examples in parallel with instructions, ensuring they align perfectly.
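Curated examples can likewise be held as data, tagged by the coverage category they demonstrate, which makes gaps visible at a glance. The inputs and outputs below are illustrative stand-ins:
```python
# Example pairs tagged by coverage category; gaps in coverage become obvious.
# All inputs/outputs are illustrative stand-ins, not real article text.
examples = [
    {"kind": "representative",
     "input": "Article describing a study of idle cloud spend...",
     "output": "The study finds roughly a third of cloud spend is idle..."},
    {"kind": "edge-case",
     "input": "",  # empty input
     "output": "NO INPUT"},
    {"kind": "negative",  # a common error to avoid: jargon-heavy output
     "input": "Article on Kubernetes autoscaling...",
     "output_to_avoid": "HPA reconciles replica counts via the metrics API..."},
]
```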
* **Output:** A detailed inventory: list of variables, sequence of instructions, chosen section structure, and curated input/output example pairs.
**4. Structure as XML:**
* **Elaboration:** Assemble the components into the formal XML structure per `raw_data`. Precision is key; syntax errors break the prompt. Use XML editors/linters. Consider XML comments (`<!-- ... -->`) for maintainability if appropriate, mindful of token cost.
* **Task:**
* Map variables to `[[placeholders]]` in correct sections.
* Place `<instruction>` tags sequentially under `<instructions>`.
* Format `<examples>` correctly with consistent nested tags.
* Perform rigorous consistency/validity check:
* *XML Validity:* Adheres to `raw_data` schema? Use validator (`xmllint`, IDE plugins). Well-formed?
* *Placeholder Check:* Placeholders match variable list? Correctly placed?
* *Instruction-Example Alignment:* Examples reflect *all* relevant instructions? Output matches expected result?
* *Purpose Alignment:* Instructions/examples fulfill `<purpose>`?
* *Tip:* Validate the XML structure frequently during construction.
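The well-formedness and placeholder checks can be scripted with the standard library alone. A minimal sketch; the file name and expected variable set are assumptions, and full schema validation against `raw_data` would still require an external tool such as `xmllint` or `lxml`:
```python
import re
import xml.etree.ElementTree as ET

XML_PATH = "summarize_prompt.xml"      # assumed file name for the draft prompt
EXPECTED_VARIABLES = {"article-text"}  # from the Step 3 inventory

text = open(XML_PATH, encoding="utf-8").read()

# 1. Well-formedness check: raises ParseError on malformed XML.
#    (Schema validation against raw_data needs xmllint or lxml.)
ET.fromstring(text)

# 2. Placeholder check: every [[...]] in the file matches the inventory.
found = set(re.findall(r"\[\[([A-Za-z0-9_.-]+)\]\]", text))
assert found == EXPECTED_VARIABLES, f"Placeholder mismatch: {found ^ EXPECTED_VARIABLES}"

print("Draft prompt is well-formed and placeholders match the inventory.")
```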
* **Output:** A well-formed, syntactically valid draft XML prompt. Example (with context/comments):
```xml
<prompt>
    <purpose>Generate concise, factual summaries of technical articles for a managerial audience.</purpose>
    <!-- Context gathered in Step 2 informs the audience and constraints below. -->
    <context>
        <audience>Non-technical managers</audience>
    </context>
    <constraints>
        <constraint>Summary must not exceed 100 words.</constraint>
        <constraint>Avoid technical jargon where possible; explain it if unavoidable.</constraint>
        <constraint>Focus on key findings and business implications.</constraint>
        <constraint>Output format must be plain text.</constraint>
    </constraints>
    <instructions>
        <instruction>Read the input [[article-text]] carefully.</instruction>
        <instruction>Identify the main topic, key findings, and potential business implications based *only* on the text.</instruction>
        <instruction>Synthesize these points into a concise summary, adhering to all constraints in the constraints section.</instruction>
        <instruction>Prioritize clarity and directness suitable for the specified audience.</instruction>
        <instruction>Output *only* the summary text.</instruction>
    </instructions>
    <input>
        [[article-text]]
    </input>
</prompt>
```
**5. Propose Final XML:**
* **Elaboration:** A formal checkpoint. Consolidate the structured components into the complete candidate XML. This is the testable hypothesis. Documenting the design rationale here aids future understanding.
* **Task:** Formally propose the complete XML structure based on analysis and structuring. Hypothesize its effectiveness.
* **Output:** The complete candidate XML prompt structure, ready for formal review and evaluation.
**6. Evaluate and Refine:**
* **Elaboration:** The critical, iterative empirical validation phase. Systematically test to uncover weaknesses before deployment. Expect multiple cycles.
* **6a. Evaluation Methodologies:**
* *Peer Review:* Essential for catching blind spots in logic, clarity, completeness.
* *Mental Simulation:* Walk through logic with diverse hypothetical inputs (valid, invalid, edge cases).
* *LLM Testing (Systematic):* Design a test suite (unit tests for prompts). Execute against target LLM(s) with varied inputs. Analyze outputs for correctness, consistency, robustness, and adherence to format/constraints. Automate testing where feasible (e.g., using frameworks like `pytest` to wrap LLM calls and assert outputs; see the sketch after this list).
* *Checklist Comparison:* Use prompt quality checklists (clarity, specificity, bias, safety, etc.).
* *Metrics:* Define relevant metrics (e.g., ROUGE/BLEU for summarization/translation, F1/Accuracy for classification, code quality scores, task success rate, factual consistency checks).
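A minimal `pytest` sketch of such a suite. The `my_llm_client.complete` wrapper is a hypothetical stand-in for your actual LLM call, and the assertions mirror the constraints from the Step 4 example:
```python
import re
import pytest
from my_llm_client import complete  # hypothetical wrapper around your LLM API

PROMPT_PATH = "summarize_prompt.xml"  # assumed file name for the finalized draft

def render_prompt(article_text: str) -> str:
    """Substitute the [[article-text]] placeholder into the XML template."""
    template = open(PROMPT_PATH, encoding="utf-8").read()
    return template.replace("[[article-text]]", article_text)

@pytest.mark.parametrize("article", [
    "Short factual article about quarterly cloud costs...",        # typical case
    "Jargon-heavy article: idempotent consumers, quorum reads...", # stress case
])
def test_summary_respects_constraints(article):
    summary = complete(render_prompt(article))
    assert len(summary.split()) <= 100                 # word-count constraint
    assert "<" not in summary and "{" not in summary   # crude plain-text check
```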
* **6b. Common Refinement Strategies:**
* *Instruction Tuning:* Rephrasing, adding/removing/reordering steps, adding explicit positive/negative constraints (e.g., `Output MUST be valid JSON.`).
* *Example Curation:* Adding/improving/removing examples, especially targeting observed failure modes or edge cases. Few-shot example quality is often critical.
* *Constraint Addition/Modification:* Adjusting rules in `<constraints>` or `<instructions>` based on test results.
* *Structural Reorganization:* Modifying XML structure (if spec allows) for potentially better LLM interpretation.
* *Persona/Role Adjustment:* Refining `<purpose>` or adding/tuning `<role>` tags.
* **Task:** Evaluate against criteria (Clarity, Completeness, Correctness, Effectiveness, Robustness, Efficiency, Safety). Refine iteratively based on findings. Document changes.
* *Pitfall:* Confirmation bias during testing; inadequate test coverage.
* *Tip:* Design test cases *before* starting refinement to avoid bias. Aim for diverse test coverage. Be prepared for multiple Evaluate->Refine cycles.
* **Output:** An evaluated, rigorously tested, iterated, and refined XML prompt structure with high confidence in its operational effectiveness.
**7. Finalize XML Prompt:**
* **Elaboration:** Conclude the design/refinement phase. Lock down the validated structure. Establish version control and documentation.
* **Task:** Finalize the XML based on successful testing. Implement version control (e.g., Git). Tag a stable version (e.g., v1.0). Create associated documentation (purpose, inputs, outputs, constraints, usage guide, known limitations).
* **Output:** The final, validated, version-controlled, production-ready XML prompt, accompanied by essential documentation.