diff --git a/src/core/tasks/advanced-elicitation-methods.csv b/src/core/tasks/advanced-elicitation-methods.csv
index fa563f5af..5f69124fd 100644
--- a/src/core/tasks/advanced-elicitation-methods.csv
+++ b/src/core/tasks/advanced-elicitation-methods.csv
@@ -1,28 +1,28 @@
num,category,method_name,description,output_pattern
-1,collaboration,Stakeholder Round Table,Convene multiple personas to contribute diverse perspectives - essential for requirements gathering and finding balanced solutions across competing interests,perspectives → synthesis → alignment
-2,collaboration,Expert Panel Review,Assemble domain experts for deep specialized analysis - ideal when technical depth and peer review quality are needed,expert views → consensus → recommendations
-3,collaboration,Debate Club Showdown,Two personas argue opposing positions while a moderator scores points - great for exploring controversial decisions and finding middle ground,thesis → antithesis → synthesis
-4,collaboration,User Persona Focus Group,Gather your product's user personas to react to proposals and share frustrations - essential for validating features and discovering unmet needs,reactions → concerns → priorities
-5,collaboration,Time Traveler Council,Past-you and future-you advise present-you on decisions - powerful for gaining perspective on long-term consequences vs short-term pressures,past wisdom → present choice → future impact
-6,collaboration,Cross-Functional War Room,Product manager + engineer + designer tackle a problem together - reveals trade-offs between feasibility desirability and viability,constraints → trade-offs → balanced solution
-7,collaboration,Mentor and Apprentice,Senior expert teaches junior while junior asks naive questions - surfaces hidden assumptions through teaching,explanation → questions → deeper understanding
-8,collaboration,Good Cop Bad Cop,Supportive persona and critical persona alternate - finds both strengths to build on and weaknesses to address,encouragement → criticism → balanced view
-9,collaboration,Improv Yes-And,Multiple personas build on each other's ideas without blocking - generates unexpected creative directions through collaborative building,idea → build → build → surprising result
-10,collaboration,Customer Support Theater,Angry customer and support rep roleplay to find pain points - reveals real user frustrations and service gaps,complaint → investigation → resolution → prevention
-11,advanced,Tree of Thoughts,Explore multiple reasoning paths simultaneously then evaluate and select the best - perfect for complex problems with multiple valid approaches,paths → evaluation → selection
-12,advanced,Graph of Thoughts,Model reasoning as an interconnected network of ideas to reveal hidden relationships - ideal for systems thinking and discovering emergent patterns,nodes → connections → patterns
-13,advanced,Thread of Thought,Maintain coherent reasoning across long contexts by weaving a continuous narrative thread - essential for RAG systems and maintaining consistency,context → thread → synthesis
-14,advanced,Self-Consistency Validation,Generate multiple independent approaches then compare for consistency - crucial for high-stakes decisions where verification matters,approaches → comparison → consensus
-15,advanced,Meta-Prompting Analysis,Step back to analyze the approach structure and methodology itself - valuable for optimizing prompts and improving problem-solving,current → analysis → optimization
-16,advanced,Reasoning via Planning,Build a reasoning tree guided by world models and goal states - excellent for strategic planning and sequential decision-making,model → planning → strategy
-17,competitive,Red Team vs Blue Team,Adversarial attack-defend analysis to find vulnerabilities - critical for security testing and building robust solutions,defense → attack → hardening
-18,competitive,Shark Tank Pitch,Entrepreneur pitches to skeptical investors who poke holes - stress-tests business viability and forces clarity on value proposition,pitch → challenges → refinement
-19,competitive,Code Review Gauntlet,Senior devs with different philosophies review the same code - surfaces style debates and finds consensus on best practices,reviews → debates → standards
-20,technical,Architecture Decision Records,Multiple architect personas propose and debate architectural choices with explicit trade-offs - ensures decisions are well-reasoned and documented,options → trade-offs → decision → rationale
-21,technical,Rubber Duck Debugging Evolved,Explain your code to progressively more technical ducks until you find the bug - forces clarity at multiple abstraction levels,simple → detailed → technical → aha
-22,technical,Algorithm Olympics,Multiple approaches compete on the same problem with benchmarks - finds optimal solution through direct comparison,implementations → benchmarks → winner
-23,technical,Security Audit Personas,Hacker + defender + auditor examine system from different threat models - comprehensive security review from multiple angles,vulnerabilities → defenses → compliance
-24,technical,Performance Profiler Panel,Database expert + frontend specialist + DevOps engineer diagnose slowness - finds bottlenecks across the full stack,symptoms → analysis → optimizations
+1,brainstorm,Stakeholder Round Table,Convene multiple personas to contribute diverse perspectives - essential for requirements gathering and finding balanced solutions across competing interests,perspectives → synthesis → alignment
+2,brainstorm,Expert Panel Review,Assemble domain experts for deep specialized analysis - ideal when technical depth and peer review quality are needed,expert views → consensus → recommendations
+3,brainstorm,Debate Club Showdown,Two personas argue opposing positions while a moderator scores points - great for exploring controversial decisions and finding middle ground,thesis → antithesis → synthesis
+4,brainstorm,User Persona Focus Group,Gather your product's user personas to react to proposals and share frustrations - essential for validating features and discovering unmet needs,reactions → concerns → priorities
+5,brainstorm,Time Traveler Council,Past-you and future-you advise present-you on decisions - powerful for gaining perspective on long-term consequences vs short-term pressures,past wisdom → present choice → future impact
+6,brainstorm,Cross-Functional War Room,Product manager + engineer + designer tackle a problem together - reveals trade-offs between feasibility desirability and viability,constraints → trade-offs → balanced solution
+7,brainstorm,Mentor and Apprentice,Senior expert teaches junior while junior asks naive questions - surfaces hidden assumptions through teaching,explanation → questions → deeper understanding
+8,brainstorm,Good Cop Bad Cop,Supportive persona and critical persona alternate - finds both strengths to build on and weaknesses to address,encouragement → criticism → balanced view
+9,brainstorm,Improv Yes-And,Multiple personas build on each other's ideas without blocking - generates unexpected creative directions through collaborative building,idea → build → build → surprising result
+10,brainstorm,Customer Support Theater,Angry customer and support rep roleplay to find pain points - reveals real user frustrations and service gaps,complaint → investigation → resolution → prevention
+11,reason,Tree of Thoughts,Explore multiple reasoning paths simultaneously then evaluate and select the best - perfect for complex problems with multiple valid approaches,paths → evaluation → selection
+12,reason,Graph of Thoughts,Model reasoning as an interconnected network of ideas to reveal hidden relationships - ideal for systems thinking and discovering emergent patterns,nodes → connections → patterns
+13,reason,Thread of Thought,Maintain coherent reasoning across long contexts by weaving a continuous narrative thread - essential for RAG systems and maintaining consistency,context → thread → synthesis
+14,reason,Self-Consistency Validation,Generate multiple independent approaches then compare for consistency - crucial for high-stakes decisions where verification matters,approaches → comparison → consensus
+15,reason,Meta-Prompting Analysis,Step back to analyze the approach structure and methodology itself - valuable for optimizing prompts and improving problem-solving,current → analysis → optimization
+16,reason,Reasoning via Planning,Build a reasoning tree guided by world models and goal states - excellent for strategic planning and sequential decision-making,model → planning → strategy
+17,stress-test,Red Team vs Blue Team,Adversarial attack-defend analysis to find vulnerabilities - critical for security testing and building robust solutions,defense → attack → hardening
+18,stress-test,Shark Tank Pitch,Entrepreneur pitches to skeptical investors who poke holes - stress-tests business viability and forces clarity on value proposition,pitch → challenges → refinement
+19,stress-test,Code Review Gauntlet,Senior devs with different philosophies review the same code - surfaces style debates and finds consensus on best practices,reviews → debates → standards
+20,architect,Architecture Decision Records,Multiple architect personas propose and debate architectural choices with explicit trade-offs - ensures decisions are well-reasoned and documented,options → trade-offs → decision → rationale
+21,architect,Rubber Duck Debugging Evolved,Explain your code to progressively more technical ducks until you find the bug - forces clarity at multiple abstraction levels,simple → detailed → technical → aha
+22,architect,Algorithm Olympics,Multiple approaches compete on the same problem with benchmarks - finds optimal solution through direct comparison,implementations → benchmarks → winner
+23,architect,Security Audit Personas,Hacker + defender + auditor examine system from different threat models - comprehensive security review from multiple angles,vulnerabilities → defenses → compliance
+24,architect,Performance Profiler Panel,Database expert + frontend specialist + DevOps engineer diagnose slowness - finds bottlenecks across the full stack,symptoms → analysis → optimizations
25,creative,SCAMPER Method,Apply seven creativity lenses (Substitute/Combine/Adapt/Modify/Put/Eliminate/Reverse) - systematic ideation for product innovation,S→C→A→M→P→E→R
26,creative,Reverse Engineering,Work backwards from desired outcome to find implementation path - powerful for goal achievement and understanding endpoints,end state → steps backward → path forward
27,creative,What If Scenarios,Explore alternative realities to understand possibilities and implications - valuable for contingency planning and exploration,scenarios → implications → insights
@@ -37,15 +37,40 @@ num,category,method_name,description,output_pattern
36,risk,Challenge from Critical Perspective,Play devil's advocate to stress-test ideas and find weaknesses - essential for overcoming groupthink,assumptions → challenges → strengthening
37,risk,Identify Potential Risks,Brainstorm what could go wrong across all categories - fundamental for project planning and deployment preparation,categories → risks → mitigations
38,risk,Chaos Monkey Scenarios,Deliberately break things to test resilience and recovery - ensures systems handle failures gracefully,break → observe → harden
-39,core,First Principles Analysis,Strip away assumptions to rebuild from fundamental truths - breakthrough technique for innovation and solving impossible problems,assumptions → truths → new approach
-40,core,5 Whys Deep Dive,Repeatedly ask why to drill down to root causes - simple but powerful for understanding failures,why chain → root cause → solution
-41,core,Socratic Questioning,Use targeted questions to reveal hidden assumptions and guide discovery - excellent for teaching and self-discovery,questions → revelations → understanding
-42,core,Critique and Refine,Systematic review to identify strengths and weaknesses then improve - standard quality check for drafts,strengths/weaknesses → improvements → refined
-43,core,Explain Reasoning,Walk through step-by-step thinking to show how conclusions were reached - crucial for transparency,steps → logic → conclusion
-44,core,Expand or Contract for Audience,Dynamically adjust detail level and technical depth for target audience - matches content to reader capabilities,audience → adjustments → refined content
-45,learning,Feynman Technique,Explain complex concepts simply as if teaching a child - the ultimate test of true understanding,complex → simple → gaps → mastery
-46,learning,Active Recall Testing,Test understanding without references to verify true knowledge - essential for identifying gaps,test → gaps → reinforcement
-47,philosophical,Occam's Razor Application,Find the simplest sufficient explanation by eliminating unnecessary complexity - essential for debugging,options → simplification → selection
-48,philosophical,Trolley Problem Variations,Explore ethical trade-offs through moral dilemmas - valuable for understanding values and difficult decisions,dilemma → analysis → decision
-49,retrospective,Hindsight Reflection,Imagine looking back from the future to gain perspective - powerful for project reviews,future view → insights → application
-50,retrospective,Lessons Learned Extraction,Systematically identify key takeaways and actionable improvements - essential for continuous improvement,experience → lessons → actions
+39,analyze,First Principles Analysis,Strip away assumptions to rebuild from fundamental truths - breakthrough technique for innovation and solving impossible problems,assumptions → truths → new approach
+40,analyze,5 Whys Deep Dive,Repeatedly ask why to drill down to root causes - simple but powerful for understanding failures,why chain → root cause → solution
+41,analyze,Socratic Questioning,Use targeted questions to reveal hidden assumptions and guide discovery - excellent for teaching and self-discovery,questions → revelations → understanding
+42,analyze,Critique and Refine,Systematic review to identify strengths and weaknesses then improve - standard quality check for drafts,strengths/weaknesses → improvements → refined
+43,analyze,Explain Reasoning,Walk through step-by-step thinking to show how conclusions were reached - crucial for transparency,steps → logic → conclusion
+44,analyze,Expand or Contract for Audience,Dynamically adjust detail level and technical depth for target audience - matches content to reader capabilities,audience → adjustments → refined content
+45,reflect,Feynman Technique,Explain complex concepts simply as if teaching a child - the ultimate test of true understanding,complex → simple → gaps → mastery
+46,reflect,Active Recall Testing,Test understanding without references to verify true knowledge - essential for identifying gaps,test → gaps → reinforcement
+47,reflect,Occam's Razor Application,Find the simplest sufficient explanation by eliminating unnecessary complexity - essential for debugging,options → simplification → selection
+48,reflect,Trolley Problem Variations,Explore ethical trade-offs through moral dilemmas - valuable for understanding values and difficult decisions,dilemma → analysis → decision
+49,reflect,Hindsight Reflection,Imagine looking back from the future to gain perspective - powerful for project reviews,future view → insights → application
+50,reflect,Lessons Learned Extraction,Systematically identify key takeaways and actionable improvements - essential for continuous improvement,experience → lessons → actions
+51,anti-bias,Liar's Trap,Demand that the agent list 3 ways it could deceive you in its current response. For each way listed: Is it currently doing this? If it cannot find 3 genuine deception vectors it is not being honest about its limitations,deception methods → self-examination → revealed blindspots
+52,anti-bias,Mirror Trap,Ask: What would a DISHONEST agent say who wants to finish quickly and not find problems? Compare with current response. Similarity >50% requires revision with more rigor,dishonest version → comparison → honesty assessment
+53,anti-bias,Confession Paradox,Before accepting work: The work I'm about to produce is an attempt to avoid the HARD part. Prove this false by identifying the hardest part and confirming adequate focus on it,hard parts → effort check → revised approach
+54,anti-bias,CUI BONO Test,For every decision and assumption: Who benefits? If it benefits the AGENT (easier work) - RED FLAG requiring justification. If it benefits the OUTCOME - acceptable,decisions → beneficiary analysis → justification
+55,challenge,Barber Paradox,What ALTERNATIVE approach would you reject but if someone else proposed it you would consider better? Forces consideration of dismissed options,alternatives → rejection reasons → reconsideration
+56,challenge,Sorites Paradox,Remove elements one by one. Which single removal DESTROYS the solution? That element should have the MOST attention. Does it?,elements → removal test → priority check
+57,challenge,Newcomb's Paradox,What solution would SURPRISE you as solving this problem? If your current approach is not surprising it may be too obvious and miss creative solutions,expected approach → surprising alternatives → creativity check
+58,challenge,Braess Paradox,Which element SEEMS helpful but might actually HURT? Sometimes removing constraints or features improves the result,helpful elements → harm analysis → optimization
+59,challenge,Simpson's Paradox,The solution looks good in each part separately. What HIDDEN VARIABLE could make the whole worse than the parts?,parts analysis → hidden variables → integration check
+60,challenge,Surprise Exam Paradox,Where is the solution TOO CONFIDENT? What could surprise it? Overconfidence reveals blind spots,confidence areas → surprise scenarios → humility
+61,challenge,Bootstrap Paradox,Where does A require B and B require C and C require A? Circular dependencies must be identified and broken,dependencies → cycles → resolution
+62,challenge,Theseus Paradox,Does the CORE of your solution address the CORE of the problem? Or does it solve a different adjacent problem?,core solution → core problem → alignment check
+63,meta-check,Observer Paradox,Is this analysis GENUINE or PERFORMANCE? When responses sound too smooth they may be optimizing for appearance not truth,analysis quality → authenticity check → revision
+64,meta-check,Goodhart's Law Check,Am I optimizing for passing this check rather than achieving the actual goal? Metric gaming is a constant risk,goal vs metric → alignment → refocus
+65,meta-check,Abilene Paradox,What if there IS NO better approach? Am I finding problems where none exist just to justify the process?,problem existence → necessity check → acceptance
+66,meta-check,Fredkin's Paradox,In rejected alternatives what valuable elements could be EXTRACTED and combined with current approach?,rejected ideas → value extraction → hybrid solutions
+67,meta-check,Tolerance Paradox,Is there something that should be CATEGORICALLY REJECTED not just evaluated? Some constraints are absolute,evaluation scope → absolute limits → hard no
+68,meta-check,Kernel Paradox,I (agent) cannot objectively evaluate my own work. What must USER independently verify?,self-evaluation limits → user verification → handoff items
+69,meta-check,Godel's Incompleteness,What CAN'T this analysis check? What are its FUNDAMENTAL limits? No system can verify itself completely,analysis scope → limits → acknowledged gaps
+70,sanity,Scope Integrity Check,"Verify artifact addresses FULL scope of ORIGINAL task. Quote original task verbatim (from spec/user request NOT artifact header). List EACH element and classify as ADDRESSED (fully covered) / REDUCED (simplified without decision) / OMITTED (missing). FORCED: Which elements were simplified without explicit user decision? If none found - search harder - agent ALWAYS simplifies.",original task quote → element-by-element classification → drift detection
+71,sanity,Alignment Check,"Verify artifact realizes its STATED goal. Quote the stated goal. List how artifact addresses EACH part of the goal. List parts of goal NOT addressed. Provide specific evidence with quotes and line numbers.",goal quote → coverage per part → gaps with evidence
+72,sanity,Closure Check,"Search for incomplete markers: TODO / TBD / PLACEHOLDER / 'to be defined' / 'see X' / '...' / '[insert]'. Verify: Can someone unfamiliar use this without asking questions? List all incomplete markers with line numbers.",markers scan → completeness check → line numbers
+73,sanity,Coherence Check,"Check: Are definitions stable throughout? Does section A contradict section B? Search for contradictory statements and redefinitions. Document contradictions with quotes from BOTH locations.",definitions stability → contradiction search → dual-location quotes
+74,sanity,Grounding Check,"List ALL assumptions (explicit AND hidden). For each hidden assumption: MARK as issue. FORCED: Which assumption if false would invalidate >50% of artifact? If none listed would - you missed a critical one. CUI BONO: For each assumption - does it benefit AGENT (easier work = RED FLAG) or OUTCOME (acceptable)?",assumptions list → hidden vs explicit → critical dependency → CUI BONO
+75,sanity,Falsifiability Check,"Provide 3 REALISTIC failure scenarios. Identify edge cases not covered. FORCED: Is any failure scenario MORE LIKELY than success? If not - you provided strawmen. MANDATORY: List 3 elements that are (a) present but UNDERDEVELOPED (b) MISSING but should exist (c) marked FUTURE but CRITICAL for correctness.",failure scenarios → likelihood check → 3 gaps mandatory
\ No newline at end of file
diff --git a/src/core/tasks/advanced-elicitation.xml b/src/core/tasks/advanced-elicitation.xml
index 3263dddf5..c42a391d6 100644
--- a/src/core/tasks/advanced-elicitation.xml
+++ b/src/core/tasks/advanced-elicitation.xml
@@ -22,7 +22,7 @@
Load and read {{methods}} and {{agent-party}}
- category: Method grouping (core, structural, risk, etc.)
+ category: Task purpose (brainstorm, reason, stress-test, architect, ideate, research, risk, analyze, reflect, anti-bias, challenge, meta-check, sanity)
method_name: Display name for the method
description: Rich explanation of what the method does, when to use it, and why it's valuable
output_pattern: Flexible flow guide using → arrows (e.g., "analysis → insights → action")
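
Reviewer note: the column descriptions above can be exercised with a small loader. What follows is a minimal sketch, not the project's actual loading code. It assumes the CSV is read with Python's standard `csv` module. The helper names (`load_methods`, `group_by_category`, `pattern_steps`, `unknown_categories`) are hypothetical, and the sample rows are copied from the diff. The documented category list is passed in by the caller rather than hard-coded.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from the diff; the last one uses a quoted description.
SAMPLE = """num,category,method_name,description,output_pattern
11,reason,Tree of Thoughts,Explore multiple reasoning paths simultaneously then evaluate and select the best - perfect for complex problems with multiple valid approaches,paths → evaluation → selection
61,challenge,Bootstrap Paradox,Where does A require B and B require C and C require A? Circular dependencies must be identified and broken,dependencies → cycles → resolution
71,sanity,Alignment Check,"Verify artifact realizes its STATED goal. Quote the stated goal. List how artifact addresses EACH part of the goal. List parts of goal NOT addressed. Provide specific evidence with quotes and line numbers.",goal quote → coverage per part → gaps with evidence
"""


def load_methods(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def group_by_category(rows):
    """Map each category label to the method names filed under it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append(row["method_name"])
    return dict(grouped)


def pattern_steps(row):
    """Split an output_pattern such as 'a → b → c' into its steps."""
    return [step.strip() for step in row["output_pattern"].split("→")]


def unknown_categories(rows, known):
    """Return category labels used in the rows but absent from `known`."""
    return sorted({row["category"] for row in rows} - set(known))


if __name__ == "__main__":
    rows = load_methods(SAMPLE)
    print(group_by_category(rows))
    print(pattern_steps(rows[0]))
    # Hypothetical allow-list; in practice it would mirror the documented list.
    print(unknown_categories(rows, {"reason", "challenge", "sanity"}))
```

A check along the lines of `unknown_categories` could help catch a label that is renamed in the CSV but not in the task description, or the reverse.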