# Research Plan Generation Pipeline

## Overview

The Research Plan feature in Shamra Academia Playground lets users generate comprehensive, evidence-based academic research plans (~2500 words) using AI. The system now supports a **Multi-Agent Architecture** in which three specialized agents (retrieval, planning, and writing) run in sequence.

## Multi-Agent Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              Frontend (Vue.js)                               │
│                         public/js/playground.js                              │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         PlaygroundAPIController                              │
│                    src/Controller/PlaygroundAPIController.php                │
│                                                                              │
│  Endpoints:                                                                  │
│  - POST /api/playground/research-plan/initiate                               │
│  - POST /api/playground/research-plan/generate                               │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      ResearchPlanOrchestrator                                │
│              src/Service/Playground/Agent/ResearchPlanOrchestrator.php       │
│                                                                              │
│   Coordinates three specialized agents in sequence:                          │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
        ┌─────────────────────────────┼─────────────────────────────┐
        │                             │                             │
        ▼                             ▼                             ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│  RETRIEVAL AGENT  │     │   PLANNER AGENT   │     │   WRITER AGENT    │
│  (The Librarian)  │────▶│  (The Architect)  │────▶│  (The Researcher) │
│                   │     │                   │     │                   │
│ • Extract keywords│     │  • Create outline │     │  • Write sections │
│  • Search ES      │     │  • Define TOC     │     │  • Use evidence   │
│  • Find papers    │     │ • Set word targets│     │  • Expand content │
│  • Analyze gaps   │     │  • Academic reqs  │     │  • Ground in data │
└───────────────────┘     └───────────────────┘     └───────────────────┘
        │                                                     │
        └─────────────────────────────────────────────────────┘
                                      │
                                      ▼
                        ┌─────────────────────────┐
                        │     Elasticsearch       │
                        │  (Arabic & English)     │
                        └─────────────────────────┘
```

## The Three Agents

### 1. Retrieval Agent (The Librarian)
**Location:** `src/Service/Playground/Agent/RetrievalAgentService.php`

**Responsibilities:**
- Extracts 5-8 search keywords from the research topic
- Searches Elasticsearch using multiple queries
- Finds related papers in both Arabic and English
- Analyzes retrieved papers to identify:
  - Common methodologies used
  - Research gaps in the field
  - Key findings from existing studies
  - Theoretical frameworks
  - Sample populations studied

**Output:**
```php
[
    'keywords' => ['keyword1', 'keyword2', ...],
    'papers' => [...],  // Retrieved papers
    'analysis' => [
        'methodologies' => [...],
        'gaps' => [...],
        'key_findings' => [...],
        'theoretical_frameworks' => [...],
        'sample_populations' => [...],
        'recommendations' => [...]
    ],
    'paper_count' => 15
]
```
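
Because the agent runs several queries, the same paper can come back more than once. The sketch below shows one way the result lists could be merged before analysis. It is illustrative only: `mergeSearchResults` and the `id` and `score` fields on each hit are assumptions, not the actual shape used by `RetrievalAgentService`.

```javascript
// Hypothetical helper: merge hits from several keyword queries and keep
// one copy of each paper (the copy with the highest relevance score).
// The `id` and `score` fields are assumed, not taken from the real service.
function mergeSearchResults(resultLists) {
  const bestById = new Map();
  for (const hits of resultLists) {
    for (const hit of hits) {
      const seen = bestById.get(hit.id);
      if (!seen || hit.score > seen.score) {
        bestById.set(hit.id, hit);
      }
    }
  }
  // Highest-scoring papers first.
  return [...bestById.values()].sort((a, b) => b.score - a.score);
}

// Two queries that both matched the paper with id 'p-2'.
const firstQueryHits = [{ id: 'p-1', score: 2.1 }, { id: 'p-2', score: 1.0 }];
const secondQueryHits = [{ id: 'p-2', score: 3.4 }, { id: 'p-3', score: 0.9 }];
const mergedPapers = mergeSearchResults([firstQueryHits, secondQueryHits]);
console.log(mergedPapers.map((p) => p.id)); // [ 'p-2', 'p-1', 'p-3' ]
```

Keeping the highest-scoring copy is one possible policy; summing scores across queries would instead favor papers that match many keywords.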

### 2. Planner Agent (The Architect)
**Location:** `src/Service/Playground/Agent/PlannerAgentService.php`

**Responsibilities:**
- Creates a detailed outline/Table of Contents
- Defines all required academic sections
- Sets word targets for each section
- Ensures all academic requirements are met
- Keeps the Writer Agent on topic by fixing the structure up front

**Standard Sections (13 total, ~2500 words at the low end of each target range):**

| Section | Arabic | Word Target |
|---------|--------|-------------|
| Abstract | ملخص البحث | 200-300 |
| Introduction | مقدمة | 300-500 |
| Problem Statement | مشكلة البحث | 200-400 |
| Literature Review | الدراسات السابقة | 400-600 |
| Research Questions | أسئلة البحث | 100-200 |
| Objectives | أهداف البحث | 150-300 |
| Methodology | منهجية البحث | 400-600 |
| Theoretical Framework | الإطار النظري | 200-400 |
| Expected Results | النتائج المتوقعة | 200-350 |
| Timeline | الجدول الزمني | 100-200 |
| Ethical Considerations | الاعتبارات الأخلاقية | 150-300 |
| Limitations | حدود البحث | 100-200 |
| References | المراجع | 50-150 |
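
To see how the per-section targets relate to the overall length, the ranges in the table can be summed. This is a small sketch for checking the arithmetic. Apart from `abstract`, the section keys are hypothetical identifiers chosen for illustration.

```javascript
// Word-target ranges copied from the table above as [min, max].
// Keys other than 'abstract' are illustrative, not confirmed section ids.
const wordTargets = {
  abstract: [200, 300],
  introduction: [300, 500],
  problem_statement: [200, 400],
  literature_review: [400, 600],
  research_questions: [100, 200],
  objectives: [150, 300],
  methodology: [400, 600],
  theoretical_framework: [200, 400],
  expected_results: [200, 350],
  timeline: [100, 200],
  ethical_considerations: [150, 300],
  limitations: [100, 200],
  references: [50, 150],
};

// Sum the lower and upper bounds across all sections.
function totalRange(targets) {
  let min = 0;
  let max = 0;
  for (const [low, high] of Object.values(targets)) {
    min += low;
    max += high;
  }
  return { min, max };
}

const targetRange = totalRange(wordTargets);
console.log(targetRange); // { min: 2550, max: 4500 }
```

The lower bounds add up to 2550 words, so a plan of roughly 2500 words corresponds to every section landing near its minimum.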

**Output:**
```php
[
    'title' => 'Finalized research title',
    'sections' => [
        [
            'id' => 'abstract',
            'title' => 'Abstract / ملخص البحث',
            'description' => 'What this section should contain',
            'key_points' => ['point 1', 'point 2'],
            'word_target' => 250,
            'grounding_notes' => 'Evidence to include'
        ],
        // ... more sections
    ],
    'methodology_type' => 'mixed',
    'estimated_total_words' => 2500
]
```
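
A consumer of this outline may want to confirm its shape before handing it to the Writer Agent. The following is a minimal sketch of such a check, using the field names from the output above. The `validateOutline` function and the 10% tolerance on the total are assumptions made for illustration.

```javascript
// Field names taken from the sample outline; the check itself is hypothetical.
const REQUIRED_SECTION_FIELDS = [
  'id', 'title', 'description', 'key_points', 'word_target', 'grounding_notes',
];

// Returns a list of human-readable problems; an empty list means the outline passed.
function validateOutline(outline, tolerance = 0.1) {
  const problems = [];
  outline.sections.forEach((section, index) => {
    for (const field of REQUIRED_SECTION_FIELDS) {
      if (!(field in section)) {
        problems.push(`section ${index}: missing ${field}`);
      }
    }
  });
  // Compare the sum of per-section targets with the declared total.
  const total = outline.sections.reduce((sum, s) => sum + (s.word_target || 0), 0);
  const expected = outline.estimated_total_words;
  if (Math.abs(total - expected) > expected * tolerance) {
    problems.push(`word targets sum to ${total}, expected about ${expected}`);
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one makes it easier to log everything that is wrong with a generated outline in a single pass.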

### 3. Writer Agent (The Researcher)
**Location:** `src/Service/Playground/Agent/WriterAgentService.php`

**Responsibilities:**
- Writes each section following the outline
- Uses retrieved evidence as "ground truth"
- References existing papers appropriately
- Produces formal academic prose
- Automatically expands sections that are too short
- Maintains coherence across sections

**Writing Process:**
1. Takes the outline from Planner Agent
2. Takes the evidence from Retrieval Agent
3. Writes sections one at a time
4. Provides context from previous sections
5. Includes specific instructions per section type
6. If content is <60% of target, auto-expands
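
The steps above can be sketched as a simple loop. This is not the actual `WriterAgentService` code (which is PHP): `writeSection` and `expandSection` are hypothetical stand-ins for the model calls, and the loop is synchronous here only to keep the example short. Word counting by whitespace is also an assumption.

```javascript
// Count whitespace-separated words (works for Arabic and English text).
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Step 6: a section needs expansion when it is under 60% of its target.
function needsExpansion(text, wordTarget, threshold = 0.6) {
  return countWords(text) < wordTarget * threshold;
}

// Steps 1-5: write sections one at a time, passing earlier sections as context.
// writeSection and expandSection are placeholders for the real model calls.
function writeAllSections(outline, evidence, writeSection, expandSection) {
  const written = [];
  for (const section of outline.sections) {
    const previous = written.map((w) => w.content);
    let content = writeSection(section, evidence, previous);
    if (needsExpansion(content, section.word_target)) {
      content = expandSection(section, content, evidence);
    }
    written.push({ id: section.id, content });
  }
  return written;
}

// Usage with stub writers: a 2-word draft against a 10-word target is expanded.
const demoOutline = { sections: [{ id: 'abstract', word_target: 10 }] };
const demoDraft = writeAllSections(
  demoOutline,
  { papers: [] },
  () => 'too short',
  (section, content) => content + ' now padded out to reach the target'
);
console.log(countWords(demoDraft[0].content)); // 9
```

The sketch expands at most once per section. Whether the real agent retries until the threshold is met is not stated here.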

**Section-Specific Instructions:**
- **Abstract**: Standalone summary with problem, objectives, methodology, significance
- **Literature Review**: Thematic organization, compare/contrast studies
- **Methodology**: Specific about design, sampling, instruments, analysis
- **Ethical Considerations**: Informed consent, confidentiality, IRB, data security

## Two-Phase Generation Process

### Phase 1: Initiate (Clarification Questions)

**Endpoint:** `POST /api/playground/research-plan/initiate`

**Purpose:** Analyze the research topic and generate clarifying questions to better understand the user's needs.

**Flow:**
1. User submits research title and optional description
2. System searches Elasticsearch for related papers (bilingual search)
3. AI generates 3-5 clarifying questions based on:
   - The research topic
   - Gaps identified in existing literature
   - Methodological considerations

**Request:**
```json
{
  "title": "Research title",
  "description": "Optional description"
}
```

**Response:**
```json
{
  "success": true,
  "clarifications": [
    {
      "id": "q1",
      "question": "What is your target population?",
      "type": "text",
      "options": null
    },
    {
      "id": "q2", 
      "question": "Which methodology do you prefer?",
      "type": "select",
      "options": ["Quantitative", "Qualitative", "Mixed Methods"]
    }
  ],
  "context_summary": "Found 15 related papers..."
}
```

### Phase 2: Generate (Full Research Plan)

**Endpoint:** `POST /api/playground/research-plan/generate`

**Purpose:** Generate the complete research plan (~2500 words) based on the topic and user's answers to clarification questions.

**Request:**
```json
{
  "title": "Research title",
  "description": "Optional description",
  "clarifications": [
    {"id": "q1", "answer": "University students"},
    {"id": "q2", "answer": "Mixed Methods"}
  ],
  "use_agent": true
}
```
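
On the client side, the `clarifications` array can be assembled from the Phase 1 questions and the user's answers. The helper below is a hypothetical sketch. It assumes every question must be answered and that answers to `select` questions must match one of the offered options, neither of which is confirmed by the API.

```javascript
// Hypothetical helper: turn Phase 1 questions plus a map of answers
// into the `clarifications` array expected by the generate endpoint.
function buildClarificationAnswers(questions, answersById) {
  return questions.map((question) => {
    const answer = String(answersById[question.id] ?? '').trim();
    if (answer === '') {
      throw new Error(`Missing answer for ${question.id}`);
    }
    // Assumed rule: select answers must be one of the listed options.
    if (question.type === 'select' && !question.options.includes(answer)) {
      throw new Error(`Invalid option for ${question.id}: ${answer}`);
    }
    return { id: question.id, answer };
  });
}

// Questions as returned by the initiate endpoint in the example above.
const phaseOneQuestions = [
  { id: 'q1', question: 'What is your target population?', type: 'text', options: null },
  { id: 'q2', question: 'Which methodology do you prefer?', type: 'select',
    options: ['Quantitative', 'Qualitative', 'Mixed Methods'] },
];
const clarificationPayload = buildClarificationAnswers(phaseOneQuestions, {
  q1: 'University students',
  q2: 'Mixed Methods',
});
console.log(clarificationPayload);
```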

**Response:**
```json
{
  "success": true,
  "project": {
    "id": 123,
    "title": "Research title",
    "keywords": [...],
    "content": "Full 2500-word research plan..."
  },
  "research_plan": "# Title\n\n## Abstract\n...",
  "sources_used": 15,
  "agent_used": true,
  "metadata": {
    "total_words": 2543,
    "section_count": 13,
    "methodology_type": "mixed",
    "papers_referenced": 15,
    "keywords_used": ["keyword1", "keyword2"]
  }
}
```
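
The `research_plan` field is Markdown, so a client can split it into sections for display or for a quick word-count check. The sketch below assumes each section starts with a level-two heading (`## `), as the sample response suggests. `splitSections` is a hypothetical helper.

```javascript
// Hypothetical helper: split the Markdown plan at '## ' headings and
// report a whitespace-based word count per section.
function splitSections(markdown) {
  const sections = [];
  let current = null;
  for (const line of markdown.split('\n')) {
    const heading = line.match(/^## (.+)$/);
    if (heading) {
      current = { title: heading[1].trim(), lines: [] };
      sections.push(current);
    } else if (current) {
      // Lines before the first '## ' heading (such as the title) are skipped.
      current.lines.push(line);
    }
  }
  return sections.map((s) => ({
    title: s.title,
    body: s.lines.join('\n').trim(),
    words: s.lines.join(' ').trim().split(/\s+/).filter(Boolean).length,
  }));
}

const samplePlan = '# Title\n\n## Abstract\nOne two three.\n\n## Introduction\nFour five.';
const planSections = splitSections(samplePlan);
console.log(planSections.map((s) => `${s.title}: ${s.words}`));
// [ 'Abstract: 3', 'Introduction: 2' ]
```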

## Mode Comparison

### Multi-Agent Mode (`use_agent: true`) - RECOMMENDED

**Services Used:**
- `ResearchPlanOrchestrator` - Coordinates agents
- `RetrievalAgentService` - Finds evidence
- `PlannerAgentService` - Creates outline
- `WriterAgentService` - Writes content

**Flow:**
1. Retrieval Agent extracts keywords and searches ES
2. Retrieval Agent analyzes papers for gaps, methodologies
3. Planner Agent creates detailed outline with 13 sections
4. Writer Agent writes each section with evidence grounding
5. Writer Agent auto-expands any section that comes in under 60% of its word target

**Output:** ~2500 words, 13 sections, evidence-based

**Pros:**
- Comprehensive output (~2500 words)
- Structured with all academic sections
- Evidence-grounded content
- Consistent quality

**Cons:**
- Slower (multiple AI calls per section)
- Higher cost (more API calls)

### Standard RAG Mode (`use_agent: false`)

**Services Used:**
- `PlaygroundRAGService` - Searches Elasticsearch
- `AzureOpenAIService` - Single API call

**Flow:**
1. Pre-fetch papers from ES
2. Build prompt with retrieved papers as context
3. Single API call to generate the plan

**Output:** ~350 words (basic)

**Pros:**
- Fast (single API call)
- Lower cost
- Predictable timing

**Cons:**
- Short output
- May miss sections
- Less structured

## Service Registration

```yaml
# config/services.yaml

# Multi-Agent Research Plan System
App\Service\Playground\Agent\RetrievalAgentService:
    arguments:
        $ragService: '@App\Service\Playground\PlaygroundRAGService'
        $openAIService: '@App\Service\Playground\AzureOpenAIService'
        $logger: '@logger'

App\Service\Playground\Agent\PlannerAgentService:
    arguments:
        $openAIService: '@App\Service\Playground\AzureOpenAIService'
        $logger: '@logger'

App\Service\Playground\Agent\WriterAgentService:
    arguments:
        $openAIService: '@App\Service\Playground\AzureOpenAIService'
        $logger: '@logger'

App\Service\Playground\Agent\ResearchPlanOrchestrator:
    arguments:
        $retrievalAgent: '@App\Service\Playground\Agent\RetrievalAgentService'
        $plannerAgent: '@App\Service\Playground\Agent\PlannerAgentService'
        $writerAgent: '@App\Service\Playground\Agent\WriterAgentService'
        $usageMonitor: '@App\Service\Playground\UsageMonitorService'
        $logger: '@logger'
```

## Configuration

### Required Parameters (services.yaml / .env)

```yaml
parameters:
    azure_openai_endpoint: '%env(AZURE_OPENAI_ENDPOINT)%'
    azure_openai_api_key: '%env(AZURE_OPENAI_API_KEY)%'
    azure_openai_deployment: '%env(AZURE_OPENAI_DEPLOYMENT)%'
    azure_openai_api_version: '%env(AZURE_OPENAI_API_VERSION)%'
    elastic_arabic_research_index: '%env(ELASTIC_ARABIC_INDEX)%'
    elastic_english_research_index: '%env(ELASTIC_ENGLISH_INDEX)%'
```
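
A quick pre-deployment check can catch a missing variable before the first request fails. The snippet below is an optional sketch of such a check as a standalone Node.js script. It is not part of the Symfony application, and `missingEnvVars` is a hypothetical helper.

```javascript
// Environment variable names taken from the parameters above.
const REQUIRED_ENV_VARS = [
  'AZURE_OPENAI_ENDPOINT',
  'AZURE_OPENAI_API_KEY',
  'AZURE_OPENAI_DEPLOYMENT',
  'AZURE_OPENAI_API_VERSION',
  'ELASTIC_ARABIC_INDEX',
  'ELASTIC_ENGLISH_INDEX',
];

// Returns the names that are unset or blank in the given environment object.
function missingEnvVars(env) {
  return REQUIRED_ENV_VARS.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
}

// Example with a partial environment; pass process.env in a real script.
const exampleMissing = missingEnvVars({
  AZURE_OPENAI_ENDPOINT: 'https://example.invalid',
  AZURE_OPENAI_API_KEY: '',
});
console.log(exampleMissing.length); // 5
```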

## Frontend Integration

### JavaScript (playground.js)

**State Variables:**
```javascript
const researchPlanUseAgent = ref(true);  // Toggle for multi-agent mode
```

**API Calls:**
```javascript
// Phase 1: Initiate
const initiateResponse = await api.request('/api/playground/research-plan/initiate', {
    title: researchPlanTitle.value,
    description: researchPlanDescription.value
});

// Phase 2: Generate (with multi-agent)
// A distinct variable name avoids redeclaring `const` in the same scope.
const generateResponse = await api.request('/api/playground/research-plan/generate', {
    title: researchPlanTitle.value,
    description: researchPlanDescription.value,
    clarifications: researchPlanClarifications.value,
    use_agent: researchPlanUseAgent.value  // Enable multi-agent
});
```

## Error Handling

### Common Errors

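1. **"max_tokens not supported"**
   - Fix: Send `max_completion_tokens` instead of `max_tokens`; newer models (for example the o-series reasoning models) reject `max_tokens`

2. **"temperature does not support X"**
   - Fix: Omit `temperature` or leave it at the default (1); some models accept no other value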

3. **Short output (<500 words)**
   - Fix: Enable multi-agent mode with `use_agent: true`
   - The Writer Agent auto-expands short sections

4. **Missing sections**
   - Fix: Planner Agent ensures all 13 sections are included
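
The first two errors both come from sending a parameter the deployed model does not accept. One way to avoid them is to build the request parameters from a per-model description, as in the sketch below. The `modelTraits` object and its flags are hypothetical, as is the fallback temperature of 0.7. Which deployments need which settings has to be checked against the provider's documentation.

```javascript
// Hypothetical helper: choose the token-limit parameter and decide whether
// to send `temperature`, based on a caller-supplied description of the model.
function buildCompletionParams(messages, maxOutputTokens, modelTraits) {
  const params = { messages };
  if (modelTraits.usesMaxCompletionTokens) {
    params.max_completion_tokens = maxOutputTokens;
  } else {
    params.max_tokens = maxOutputTokens;
  }
  // Only send temperature when the model accepts non-default values.
  if (modelTraits.supportsCustomTemperature) {
    params.temperature = modelTraits.temperature ?? 0.7; // 0.7 is an arbitrary fallback
  }
  return params;
}

const reasoningParams = buildCompletionParams(
  [{ role: 'user', content: 'Outline a research plan.' }],
  800,
  { usesMaxCompletionTokens: true, supportsCustomTemperature: false }
);
console.log(Object.keys(reasoningParams)); // [ 'messages', 'max_completion_tokens' ]
```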

## Usage Monitoring

All API calls are logged via `UsageMonitorService`:

```php
$this->usageMonitor->logApiCall(
    $user,
    'multi_agent_research_plan',  // Operation type
    $deployment,
    $tokensInput,
    $tokensOutput,
    $latencyMs,
    $success,
    $errorMessage
);
```

## File Locations

| Component | Path |
|-----------|------|
| Orchestrator | `src/Service/Playground/Agent/ResearchPlanOrchestrator.php` |
| Retrieval Agent | `src/Service/Playground/Agent/RetrievalAgentService.php` |
| Planner Agent | `src/Service/Playground/Agent/PlannerAgentService.php` |
| Writer Agent | `src/Service/Playground/Agent/WriterAgentService.php` |
| Controller | `src/Controller/PlaygroundAPIController.php` |
| RAG Service | `src/Service/Playground/PlaygroundRAGService.php` |
| Azure Service | `src/Service/Playground/AzureOpenAIService.php` |

## Future Improvements

1. **Streaming Support** - Stream the research plan generation for better UX
2. **Section-by-Section Display** - Show sections as they're written
3. **External APIs** - Add arXiv, PubMed, Google Scholar search
4. **Revision Agent** - Review and improve the final output
5. **Citation Agent** - Properly format and verify citations
6. **Export Options** - Export to Word, PDF, LaTeX formats
7. **Collaboration** - Share and collaborate on research plans