# 25. Create Related Work — AI-Powered Related Work Section Generator

> إنشاء دراسات مرجعية (Create Related Work): a multi-agent AI pipeline that generates cohesive "Related Work" academic sections from Shamra, ArXiv, and the user's reference library.

**Status:** ✅ Core implemented (April 2-3, 2026) | ✅ Advanced options & content_md (April 7, 2026)

---

## ✅ Implementation Log

### Core Feature (April 2, 2026)
- **Entity**: `RelatedWorkProject` with status lifecycle (draft → clarifying → processing → completed or failed), slug, and JSON columns for references/agentSteps/metadata
- **Repository**: `RelatedWorkProjectRepository` with findByUser, countActive, findBySlug
- **Migration**: `Version20260402100000` — `related_work_project` table (`utf8mb4_unicode_ci`)
- **Orchestrator**: `RelatedWorkOrchestrator` — 5-agent pipeline (Clarifier, Scout, Curator, Reader, Writer) yielding SSE events
- **Service**: `RelatedWorkService` — project CRUD, credit reservation/refund, email sending
- **Controller**: `RelatedWorkController` — 9 routes including SSE streaming endpoint
- **Prompts**: 5 isolated files in `playground_prompts/related_work/` (clarify, search_keywords, relevance_score, analyze_paper, compose_related_work)
- **Template**: `templates/related_work/index.html.twig` — 3-panel Elicit/Perplexity-inspired layout with agent timeline, typing animation, project sidebar
- **Emails**: Completion + failure templates (reusing OCR design pattern)
- **Translations**: `RelatedWork.ar.yml` + `RelatedWork.en.yml`
- **Credits**: `'related_work' => 20` + `'related_work_refine' => 5` added to `UsageMonitorService`
- **Homepage**: "كتابة دراسة مرجعيّة" ("Write a Related Work Study") button added next to "اسألني" (Ask Me) in search mode toggle
- **Nav**: Added to header AI tools dropdown + mobile sidebar

### Advanced Options & content_md Support (April 7, 2026)
- **`content_md` in RAG**: `PlaygroundRAGService::searchShamra()` now prefers `content_md` (high-quality OCR markdown) over `content` for both Arabic and English ES results. Added `content` and `has_full_text` fields to search result arrays.
- **Full-text analysis**: `analyzePapers()` in orchestrator now passes up to 4000 chars of full paper content (from `content_md`) to the Reader agent prompt, enabling much richer methodology/findings extraction. Max tokens increased 3000→4000.
- **Advanced options form**: Collapsible "Advanced Options" panel below topic textarea on the create form:
  - **Max Papers** dropdown (Auto, 5, 10, 15, 20, 30)
  - **Section Length** dropdown (Short ~1pg, Medium ~2pg, Long ~3-4pg, Comprehensive ~5+pg)
  - **Focus toggles**: Review only, Include methodology comparison, Present findings & results, Identify research gaps
- **Freemium gating**: Trial/demo users see the options panel with "PRO" badge and upgrade banner — all controls are visually disabled (`opacity: 0.5; pointer-events: none`) with lock icon and upgrade link to `/subscription`. Paid tier users get full access.
- **Controller**: `create()` accepts `options` in POST body, validates and stores in project `metadata` JSON column (only for paid tiers). Both `index()` and `view()` pass `isPaidTier` boolean to template. Added `EntityManagerInterface` dependency.
- **Orchestrator compose**: `compose()` now reads `focus_options` (review_only, include_methodology, include_findings, find_gaps) and `target_length` from project metadata. Max tokens scale with the target length: short=3000, medium=6000, long=8000, comprehensive=10000. The `max_papers` option limits the number of papers kept after the Curator step.
- **Improved prompts**:
  - `analyze_paper.md`: Detailed methodology extraction (research design, sample size, analysis techniques), findings extraction (specific numbers, p-values, metrics), gap identification (sample limitations, generalizability). Papers with full content trigger richer analysis.
  - `compose_related_work.md`: Added `{{focus_options}}` and `{{target_length}}` template variables. Length guidelines now tied to short/medium/long/comprehensive. Focus option instructions control methodology comparison, findings presentation, and gap analysis inclusion.
- **Translations**: 18 new keys in both `RelatedWork.ar.yml` and `RelatedWork.en.yml` for all advanced option labels.
- **JavaScript**: `startProject()` collects advanced options via `getAdvancedOptions()` and sends them in the create payload for paid users. Toggle function for collapsible panel.
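
The option validation and token scaling described above could be sketched as follows. This is an illustrative JavaScript sketch, not the PHP code in the controller or orchestrator: the function names `sanitizeOptions` and `maxTokensFor`, the representation of "Auto" as an omitted `max_papers`, and the fallback to `medium` for unknown lengths are all assumptions.

```javascript
// Allowed values come from the option lists above; helper names are hypothetical.
const ALLOWED_MAX_PAPERS = [5, 10, 15, 20, 30];
const MAX_TOKENS_BY_LENGTH = { short: 3000, medium: 6000, long: 8000, comprehensive: 10000 };
const FOCUS_KEYS = ['review_only', 'include_methodology', 'include_findings', 'find_gaps'];

function sanitizeOptions(raw, isPaidTier) {
  // Trial/demo users: options are ignored entirely.
  if (!isPaidTier || raw === null || typeof raw !== 'object') return null;
  const clean = {};
  // "Auto" is modelled by leaving max_papers out (assumption).
  if (ALLOWED_MAX_PAPERS.includes(raw.max_papers)) clean.max_papers = raw.max_papers;
  // Unknown lengths fall back to "medium" (assumed default).
  const knownLength = Object.prototype.hasOwnProperty.call(MAX_TOKENS_BY_LENGTH, raw.target_length);
  clean.target_length = knownLength ? raw.target_length : 'medium';
  // Focus toggles are coerced to strict booleans.
  for (const key of FOCUS_KEYS) clean[key] = raw[key] === true;
  return clean;
}

function maxTokensFor(options) {
  const length = options ? options.target_length : 'medium';
  return MAX_TOKENS_BY_LENGTH[length];
}
```

Whitelisting each field keeps arbitrary client data out of the `metadata` JSON column, and returning `null` for unpaid tiers mirrors the rule that options are stored only for paid users.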

### Bug Fixes (April 2-3, 2026)
1. **Migration PRIMARY KEY** — `AUTO_INCREMENT` column needs `PRIMARY KEY` inline for MariaDB
2. **`references` reserved word**: `references` is a MySQL reserved word and caused SQL syntax errors; fixed by backtick-escaping the column name (`` name: '`references`' ``) in the ORM mapping
3. **Login route** — `fos_user_security_login` doesn't exist; corrected to `app_login` with `next` parameter
4. **Paper analysis parsing failure** — LLM returned `{"analyses": [...]}` (wrapped object) instead of `[...]` (flat array), causing empty analyses to reach the Writer agent. Fixed with:
   - Auto-unwrap nested JSON arrays
   - Fallback to positional pairing when index mapping fails
   - Fallback to raw paper data (title+abstract) when all parsing fails
   - Normalized analysis structure before passing to Writer
5. **Agent timeline English text** — Timeline replay showed English action names (`search`, `filter`, `analyze`, `compose`). Fixed by adding bilingual `message` field to all `addAgentStep()` calls
6. **Button/label updates**: Homepage button text changed to "كتابة دراسة مرجعيّة" ("Write a Related Work Study"); reference library tip button changed to "إضافة مراجع PDF بشكل يدوي" ("Add PDF references manually")
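
The layered fallbacks from fix 4 can be illustrated with a short JavaScript sketch. It is not the orchestrator's PHP code: the function names, the 0-based `index` convention, and the `fallback` flag are assumptions made for the example.

```javascript
// Return the first array found in the parsed LLM output, or null.
function unwrapArray(parsed) {
  if (Array.isArray(parsed)) return parsed;
  if (parsed !== null && typeof parsed === 'object') {
    for (const value of Object.values(parsed)) {
      if (Array.isArray(value)) return value; // e.g. {"analyses": [...]}
    }
  }
  return null;
}

function normalizeAnalyses(rawJson, papers) {
  let items = null;
  try {
    items = unwrapArray(JSON.parse(rawJson));
  } catch (err) {
    items = null; // malformed JSON: every paper uses the raw fallback below
  }
  items = items || [];

  // Use index mapping only if every item carries a valid 0-based index (assumption).
  const indexed = items.length > 0 && items.every(
    (a) => a && Number.isInteger(a.index) && a.index >= 0 && a.index < papers.length
  );
  const byIndex = new Map(indexed ? items.map((a) => [a.index, a]) : []);

  return papers.map((paper, position) => {
    // Otherwise pair analyses with papers by position.
    const found = indexed ? byIndex.get(position) : items[position];
    if (!found || typeof found !== 'object') {
      // Last resort: hand the Writer the title and abstract as-is.
      return { title: paper.title, contribution: paper.abstract || '', methodology: '',
               findings: '', relation_to_topic: '', fallback: true };
    }
    return {
      title: paper.title,
      contribution: found.contribution || '',
      methodology: found.methodology || '',
      findings: found.findings || '',
      relation_to_topic: found.relation_to_topic || '',
      fallback: false,
    };
  });
}
```

Every paper always yields an object with the same keys, so the Writer never receives an empty analysis list even when the model output is unusable.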

---

## Overview

This feature is accessible via a button next to the existing "اسألني" (Ask Me) chat button on the homepage search bar. Clicking it opens a dedicated `/related-work` page where users can create, manage, and switch between multiple Related Work projects. Each project is generated by a 5-agent AI pipeline that clarifies the topic, searches for papers, filters them by relevance, analyzes each paper, and composes a cohesive academic "Related Work" section. Agent progress and content are streamed in real time via SSE (Perplexity/Elicit-style).

### Key Characteristics

| Attribute | Value |
|-----------|-------|
| **Entry Point** | Homepage search bar → "دراسات مرجعية" (Related Work) button → `/related-work` |
| **UX Model** | Dedicated page with sidebar navigation (Elicit/Perplexity-inspired) |
| **Processing** | SSE streaming — live agent steps + content chunks |
| **Credit Cost** | 20 credits per generation (deducted upfront, refunded on failure) |
| **Paper Sources** | Shamra ES + ArXiv + User Reference Library |
| **AI Agents** | 5 specialized agents (Clarifier, Scout, Curator, Reader, Writer) |
| **Language** | Respects user input language (Arabic/English) |
| **Output** | Academic Related Work section with inline citations + reference list |
| **Email** | Completion/failure notification (reusing OCR email template pattern) |
| **Multi-project** | Users can create and manage multiple projects |

---

## User Flow

```
1. User clicks "دراسات مرجعية" button on homepage search bar
   ↓
2. Redirected to /related-work page (must be logged in + subscribed)
   ↓
3. Left sidebar shows existing projects (if any)
   Center shows "Create New" form: topic textarea + language toggle
   ↓
4. User enters research topic (e.g., "تأثير الذكاء الاصطناعي على التعليم العالي")
   Clicks "إنشاء" (Create)
   ↓
5. POST /related-work/api/create → project created with status=draft
   ↓
6. POST /related-work/api/clarify → Clarification Agent analyzes topic
   Returns 2-5 clarifying questions (scope, field, time range, focus)
   ↓
7. User answers questions in-page (inline form)
   ↓
   📚 "Boost Your Results" tip banner appears (see UX below):
   Shows user's reference library count + explains that uploaded PDFs
   with BibTeX citations will be used as additional sources.
   Links to /myreferences to add more papers before generating.
   ↓
8. User clicks "توليد الدراسات المرجعية" (Generate Related Work)
   20 credits deducted immediately
   ↓
9. POST /related-work/api/{slug}/generate → SSE stream begins
   ↓
   🔍 Scout Agent: "البحث عن أبحاث..." → searches Shamra + ArXiv + user refs
   📋 Curator Agent: "تصفية الأبحاث..." → filters to top 10-15 papers
   📖 Reader Agent: "تحليل البحث 3 من 12..." → analyzes each paper
   ✍️ Writer Agent: content streams in, chunk by chunk
   ↓
10. Content appears in real-time with typing animation
    References sidebar populates with paper cards
    Agent steps timeline shows completed steps
    ↓
11. On completion: status → completed, email sent
    User can edit title, edit content, copy, download DOCX
    ↓
12. On failure: credits refunded, error shown, failure email sent
```

---

## Architecture

### Database

#### `related_work_project` table

| Column | Type | Description |
|--------|------|-------------|
| `id` | INT, PK, AI | Primary key |
| `user_id` | INT, FK → `fos_user` | Project owner |
| `slug` | VARCHAR(64), UNIQUE | URL-safe identifier (hex) |
| `title` | VARCHAR(500) | User-editable project title |
| `research_topic` | TEXT | Original topic/question submitted |
| `clarifications` | JSON, nullable | Q&A pairs: `[{id, question, answer}]` |
| `content` | TEXT, nullable | Generated Related Work section (markdown/HTML) |
| `references` | JSON, nullable | Papers used: `[{id, title, authors, year, url, source, abstract, relevanceScore}]` |
| `search_keywords` | JSON, nullable | Keywords extracted by Scout agent |
| `status` | VARCHAR(20), default='draft' | `draft`, `clarifying`, `processing`, `completed`, `failed` |
| `language` | VARCHAR(10), default='ar' | `ar` or `en` |
| `credits_charged` | INT, default=0 | Credits deducted for this generation |
| `error_message` | TEXT, nullable | Error details if failed |
| `agent_steps` | JSON, nullable | Agent step log for timeline replay |
| `metadata` | JSON, nullable | Flexible: source counts, timing, word count, and advanced options (`max_papers`, `target_length`, focus options) for paid tiers |
| `created_at` | DATETIME | Creation timestamp |
| `updated_at` | DATETIME | Last modification (auto-updated via `@PreUpdate`) |
| `completed_at` | DATETIME, nullable | When generation finished |

**Collation**: `utf8mb4_unicode_ci` (matching existing tables)

**Status Lifecycle**:
```
draft → clarifying → processing → completed
                        ↓
                      failed
```
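
The lifecycle above can be expressed as a small transition table. The JavaScript below is only a sketch of that diagram: the `canTransition` helper is hypothetical, and whether the entity enforces transitions this way is not stated here. A retry path out of `failed` is not part of the diagram, so it is left out.

```javascript
// Allowed next states, taken directly from the lifecycle diagram.
const TRANSITIONS = {
  draft: ['clarifying'],
  clarifying: ['processing'],
  processing: ['completed', 'failed'],
  completed: [],
  failed: [],
};

function canTransition(from, to) {
  const next = TRANSITIONS[from];
  // Array.isArray also rejects unknown states and inherited object properties.
  return Array.isArray(next) && next.includes(to);
}
```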

#### `agent_steps` JSON structure (for timeline replay)

```json
[
  {
    "agent": "scout",
    "icon": "🔍",
    "action": "searching",
    "status": "completed",
    "message": "تم العثور على 23 بحث من شمرا، ArXiv، ومكتبتك",
    "message_en": "Found 23 papers from Shamra, ArXiv, and your library",
    "timestamp": 1712000000.123,
    "details": { "shamra": 15, "arxiv": 6, "user_refs": 2 }
  },
  {
    "agent": "curator",
    "icon": "📋",
    "action": "filtering",
    "status": "completed",
    "message": "تمت تصفية 12 بحث ذو صلة من أصل 23",
    "message_en": "Filtered 12 relevant papers out of 23",
    "timestamp": 1712000005.456,
    "details": { "relevant": 12, "total": 23, "min_score": 0.5 }
  }
]
```
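
A step record of this shape, and the language-aware lookup needed for timeline replay, might be built as in the JavaScript sketch below. The helper names `makeAgentStep` and `stepMessage` are hypothetical (the PHP code uses `addAgentStep()`, whose signature is not shown here), and falling back to the Arabic `message` when `message_en` is missing is an assumption.

```javascript
const AGENT_ICONS = { scout: '🔍', curator: '📋', reader: '📖', writer: '✍️' };

// Build one entry for the agent_steps JSON column.
function makeAgentStep(agent, action, messageAr, messageEn, details = {}, nowMs = Date.now()) {
  return {
    agent,
    icon: AGENT_ICONS[agent] || '',
    action,
    status: 'completed',
    message: messageAr,
    message_en: messageEn,
    timestamp: nowMs / 1000, // seconds with a fractional part, as in the example above
    details,
  };
}

// Pick the text to show when replaying the timeline.
function stepMessage(step, language) {
  return language === 'en' && step.message_en ? step.message_en : step.message;
}
```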

---

### Multi-Agent Pipeline

Five specialized agents, coordinated by `RelatedWorkOrchestrator`:

#### Agent 1: Clarification Agent (🗣️ The Interviewer)

- **When**: Synchronous, runs on `POST /related-work/api/clarify` before generation starts
- **Prompt**: `playground_prompts/related_work/clarify.md`
- **Input**: Research topic + language
- **Output**: `{sufficient: bool, clarifications: [{id, question, hint}], initial_assessment: string}`
- **Purpose**: Ensure the AI has enough context to search effectively — asks about scope, field, time range, methodology focus, specific aspects of interest
- **Example questions** (Arabic):
  - "ما المجال المحدد الذي تركز عليه؟ (علوم الحاسوب، الطب، التعليم...)"
  - "هل تريد التركيز على فترة زمنية محددة؟"
  - "هل تبحث عن منهجيات معينة (تجريبية، نظرية، مراجعة)؟"

#### Agent 2: Scout Agent (🔍 The Searcher)

- **When**: First stage of SSE generation stream
- **Prompt**: `playground_prompts/related_work/search_keywords.md`
- **Input**: Topic + clarifications + language
- **Output**: 5-8 bilingual keywords → parallel search across 3 sources
- **Sources**:
  1. **Shamra ES** — via `PlaygroundRAGService::searchShamra()` (Arabic + English indices)
  2. **ArXiv** — via `PlaygroundRAGService::search()` with ArXiv source
  3. **User Reference Library** — via `ReferenceIndexingService::search()` (user's uploaded papers)
- **Deduplication**: By title similarity (fuzzy matching) to avoid duplicates across sources
- **SSE events**:
  ```
  data: {"type":"step","agent":"scout","icon":"🔍","message":"استخراج الكلمات المفتاحية..."}
  data: {"type":"step","agent":"scout","icon":"🔍","message":"البحث في شمرا أكاديميا..."}
  data: {"type":"step","agent":"scout","icon":"🔍","message":"البحث في ArXiv..."}
  data: {"type":"step","agent":"scout","icon":"🔍","message":"البحث في مكتبتك..."}
  data: {"type":"step","agent":"scout","icon":"🔍","message":"تم العثور على 23 بحث","papers_found":23}
  ```
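
One way to approximate the title-based deduplication is token-set (Jaccard) similarity over normalized titles. The JavaScript below is a sketch under that assumption: the actual fuzzy-matching method, the 0.8 threshold, and the function names are not taken from the implementation.

```javascript
// Normalize a title into a set of lowercase word tokens.
function titleTokens(title) {
  const cleaned = String(title)
    .normalize('NFKD')
    .replace(/\p{M}/gu, '')             // drop diacritics and combining marks
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' '); // punctuation becomes whitespace
  return new Set(cleaned.split(/\s+/).filter(Boolean));
}

// Jaccard similarity: shared tokens divided by the size of the union.
function jaccard(a, b) {
  if (a.size === 0 || b.size === 0) return 0; // never merge untitled papers
  let shared = 0;
  for (const token of a) if (b.has(token)) shared += 1;
  return shared / (a.size + b.size - shared);
}

// Keep the first occurrence of each title; later near-duplicates are dropped.
function dedupeByTitle(papers, threshold = 0.8) {
  const kept = [];
  for (const paper of papers) {
    const tokens = titleTokens(paper.title);
    const isDuplicate = kept.some((k) => jaccard(k.tokens, tokens) >= threshold);
    if (!isDuplicate) kept.push({ paper, tokens });
  }
  return kept.map((k) => k.paper);
}
```

Because the first occurrence wins, the order in which source results are concatenated decides which copy of a duplicated paper survives.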

#### Agent 3: Curator Agent (📋 The Filter)

- **When**: After Scout completes
- **Prompt**: `playground_prompts/related_work/relevance_score.md`
- **Input**: Topic + clarifications + all found papers (title + abstract)
- **Output**: `[{index, score, reason}]` — scores 0-1 per paper
- **Filter**: Keep papers with score ≥ 0.5, cap at top 15
- **SSE events**:
  ```
  data: {"type":"step","agent":"curator","icon":"📋","message":"تقييم صلة 23 بحث بموضوعك..."}
  data: {"type":"step","agent":"curator","icon":"📋","message":"تمت تصفية 12 بحث ذو صلة","relevant":12,"total":23}
  ```
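
The filter rule (score ≥ 0.5, top 15) can be sketched in JavaScript as below. The function name `selectRelevant` is hypothetical, and treating a user-supplied `max_papers` as a replacement for the default cap of 15 is an assumption; the document only says that `max_papers` limits papers after the Curator step.

```javascript
function selectRelevant(papers, scores, { minScore = 0.5, defaultCap = 15, maxPapers = null } = {}) {
  const limit = maxPapers === null ? defaultCap : maxPapers;
  return scores
    // Ignore entries that point at no paper or fall below the threshold.
    .filter((s) => Number.isInteger(s.index) && papers[s.index] !== undefined && s.score >= minScore)
    // Highest score first.
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    // Attach the score so it can be stored in the `references` column.
    .map((s) => ({ ...papers[s.index], relevanceScore: s.score }));
}
```

For example, with four papers scored 0.9, 0.3, 0.5, and 0.7, the second paper is dropped and the rest come back ordered 0.9, 0.7, 0.5.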

#### Agent 4: Reader Agent (📖 The Analyzer)

- **When**: After Curator completes
- **Prompt**: `playground_prompts/related_work/analyze_paper.md`
- **Input**: Topic + each paper's title/abstract/content
- **Output**: Per paper: `{contribution, methodology, findings, relation_to_topic, limitations}`
- **Batching**: Process 3-5 papers per LLM call to save tokens/cost
- **SSE events**:
  ```
  data: {"type":"step","agent":"reader","icon":"📖","message":"تحليل البحث 1 من 12...","progress":"1/12"}
  data: {"type":"step","agent":"reader","icon":"📖","message":"تحليل البحث 4 من 12...","progress":"4/12"}
  ...
  data: {"type":"step","agent":"reader","icon":"📖","message":"اكتمل تحليل 12 بحث"}
  ```
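
Batching and the progress events can be sketched with a JavaScript generator, loosely mirroring the `\Generator`-based approach named in the implementation phases. A batch size of 3 is assumed because it reproduces the `1/12`, `4/12` progression in the events above; the function names are hypothetical and the event objects carry only a subset of the real fields.

```javascript
// Split papers into consecutive batches of at most batchSize items.
function batchPapers(papers, batchSize = 3) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < papers.length; i += batchSize) {
    batches.push(papers.slice(i, i + batchSize));
  }
  return batches;
}

// Yield one progress event per batch; the caller would run the LLM call for `batch`.
function* readerProgress(papers, batchSize = 3) {
  let done = 0;
  for (const batch of batchPapers(papers, batchSize)) {
    yield {
      type: 'step',
      agent: 'reader',
      icon: '📖',
      progress: `${done + 1}/${papers.length}`, // position of the first paper in this batch
      batch,
    };
    done += batch.length;
  }
}
```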

#### Agent 5: Writer Agent (✍️ The Composer)

- **When**: After Reader completes
- **Prompt**: `playground_prompts/related_work/compose_related_work.md`
- **Input**: Topic + title + clarifications + all paper analyses + language + citation style
- **Output**: Cohesive academic Related Work section with [1][2]... inline citations and reference list
- **Content rules**:
  - Respect user's language (Arabic topic → Arabic output)
  - Group papers by theme/methodology (not one-by-one summaries)
  - Identify research gaps and trends
  - Academic tone, formal register
  - Cite only provided papers — never hallucinate references
- **SSE events** (content streams chunk by chunk):
  ```
  data: {"type":"step","agent":"writer","icon":"✍️","message":"إنشاء قسم الدراسات المرجعية..."}
  data: {"type":"content","chunk":"## الدراسات المرجعية\n\nتناولت عدة دراسات"}
  data: {"type":"content","chunk":" تأثير الذكاء الاصطناعي على التعليم العالي من زوايا مختلفة. "}
  ...
  data: {"type":"content","chunk":"\n\n### المراجع\n[1] أحمد، م. (2024)..."}
  data: {"type":"complete","project":{"slug":"a1b2c3","title":"...","wordCount":1500,"referencesCount":12}}
  ```
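
The "cite only provided papers" rule is enforced through the prompt, but a post-generation check is also possible. The JavaScript below is a hypothetical guard, not something the document says is implemented: it scans the generated text for `[n]` markers and reports any number outside the range of supplied references.

```javascript
// Return the sorted list of citation numbers that do not match a provided paper.
function findInvalidCitations(content, referencesCount) {
  const invalid = new Set();
  for (const match of content.matchAll(/\[(\d+)\]/g)) {
    const n = Number(match[1]);
    if (n < 1 || n > referencesCount) invalid.add(n);
  }
  return [...invalid].sort((a, b) => a - b);
}
```

An empty result means every inline citation maps to one of the analyzed papers; a non-empty result could be logged or used to trigger a retry.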

---

### Prompt Files

All stored in `playground_prompts/related_work/` (isolated directory). Each follows the existing format: `# System Prompt` + `# User Prompt` sections with `{{variable}}` placeholders.

| File | Purpose | Input Variables | Output Format |
|------|---------|-----------------|---------------|
| `clarify.md` | Analyze topic, ask clarifying questions | `{{topic}}`, `{{language}}` | JSON: `{sufficient, clarifications[], initial_assessment}` |
| `search_keywords.md` | Extract academic search keywords | `{{topic}}`, `{{clarifications}}`, `{{language}}` | JSON: array of keywords (bilingual) |
| `relevance_score.md` | Score papers for relevance | `{{topic}}`, `{{clarifications}}`, `{{papers}}` | JSON: `[{index, score, reason}]` |
| `analyze_paper.md` | Extract contribution/methodology/findings | `{{topic}}`, `{{papers}}` (with optional `full_content` from content_md) | JSON: `{contribution, methodology, findings, relation_to_topic, limitations}` |
| `compose_related_work.md` | Compose cohesive Related Work section | `{{topic}}`, `{{title}}`, `{{clarifications}}`, `{{papers_analysis}}`, `{{language}}`, `{{citation_style}}`, `{{focus_options}}`, `{{target_length}}` | Academic text with [1][2] citations + reference list |

**Prompt loading**: Via `AzureOpenAIService::loadPrompt()` — reads markdown file, regex-extracts system/user prompt sections, substitutes `{{variables}}`.
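
The section extraction and placeholder substitution could work roughly as in this JavaScript sketch. It does not reproduce `AzureOpenAIService::loadPrompt()`: the function names, the exact regular expression, and the choice to leave unknown placeholders untouched are assumptions.

```javascript
// Split a prompt file into its "# System Prompt" and "# User Prompt" sections.
function parsePrompt(markdown) {
  const section = (name) => {
    // Capture everything after the heading up to the next top-level heading or end of input.
    const re = new RegExp('^# ' + name + '\\s*\\n([\\s\\S]*?)(?=^# |(?![\\s\\S]))', 'm');
    const match = markdown.match(re);
    return match ? match[1].trim() : '';
  };
  return { system: section('System Prompt'), user: section('User Prompt') };
}

// Replace {{name}} placeholders; unknown names are left as-is (assumption).
function renderPrompt(template, variables) {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(variables, name) ? String(variables[name]) : placeholder
  );
}
```

Leaving unknown placeholders visible makes a missing variable easy to spot in logs, at the cost of sending the literal `{{name}}` to the model.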

---

### Credit System

| Aspect | Detail |
|--------|--------|
| **Cost** | 20 credits per generation |
| **Operation type** | `related_work` (added to `UsageMonitorService::OPERATION_CREDITS`) |
| **Deduction timing** | Upfront at generation start (before SSE stream) |
| **Failure handling** | Full refund via `PlaygroundSubscription::refundCredits()` |
| **Access check** | `UsageMonitorService::canUserMakeRequest($user, 'related_work')` |
| **Concurrent limit** | Max 2 active generations per user |

**Credit flow**:
```
User clicks Generate
  → canUserMakeRequest() → check subscription + credits ≥ 20
  → deductCredits() → 20 credits reserved
  → Start SSE stream (5-agent pipeline)
  → On success: log usage, send email
  → On failure: refundCredits(20), log error, send failure email
```
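
The reserve-then-refund pattern, together with the concurrent limit from the table, can be sketched in JavaScript as follows. The in-memory `account` object and the `runWithCredits` helper are hypothetical stand-ins; the real flow goes through `canUserMakeRequest()`, `deductCredits()`, and `refundCredits()`, whose signatures are not shown in this document.

```javascript
const GENERATION_COST = 20;
const MAX_ACTIVE_GENERATIONS = 2;

function runWithCredits(account, activeCount, pipeline) {
  if (activeCount >= MAX_ACTIVE_GENERATIONS) return { ok: false, reason: 'too_many_active' };
  if (account.credits < GENERATION_COST) return { ok: false, reason: 'insufficient_credits' };

  account.credits -= GENERATION_COST; // deducted upfront, before the pipeline starts
  try {
    return { ok: true, result: pipeline() };
  } catch (err) {
    account.credits += GENERATION_COST; // full refund on any failure
    return { ok: false, reason: 'failed', error: err.message };
  }
}
```

Deducting before the pipeline runs prevents a user from starting more generations than their balance covers, while the `catch` branch guarantees the refund path is taken for every failure mode.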

---

### Email Notifications

Reuse OCR email template design (RTL, purple gradient header, CTA button).

#### Completion Email — `templates/emails/related_work_complete.html.twig`

- **Subject (AR)**: `شمرا أكاديميا - تم إنشاء قسم الدراسات المرجعية`
- **Subject (EN)**: `Shamra Academia - Your Related Work is Ready`
- **Variables**: `user.firstName`, `project.title`, `project.references|length`, `project.metadata.word_count`, `viewUrl`
- **CTA button**: "عرض الدراسات المرجعية" → `/related-work/{slug}`

#### Failure Email — `templates/emails/related_work_failed.html.twig`

- **Subject**: `شمرا أكاديميا - فشل إنشاء الدراسات المرجعية`
- **Body**: Error reason (if available) + credits refund notice
- **CTA button**: "حاول مرة أخرى" → `/related-work`

---

## API Reference

### Routes

| Route | Method | Purpose | Auth | Response |
|-------|--------|---------|------|----------|
| `/related-work` | GET | Main page (project list + create form) | ROLE_USER + subscription | HTML |
| `/related-work/{slug}` | GET | View single project | ROLE_USER | HTML |
| `/related-work/api/create` | POST | Create project | ROLE_USER + subscription | `{slug, project}` |
| `/related-work/api/clarify` | POST | Run Clarification Agent | ROLE_USER + subscription | `{sufficient, clarifications[]}` |
| `/related-work/api/{slug}/generate` | POST | Start generation (SSE) | ROLE_USER + subscription | `text/event-stream` |
| `/related-work/api/{slug}` | GET | Get project data | ROLE_USER | `{project}` |
| `/related-work/api/{slug}/update` | PUT | Update title/content | ROLE_USER | `{project}` |
| `/related-work/api/{slug}` | DELETE | Delete project | ROLE_USER | `{success}` |
| `/related-work/api/projects` | GET | List user projects | ROLE_USER | `{projects[], total}` |

### Request/Response Examples

**Create Project**:
```json
// POST /related-work/api/create
// Request:
{
  "topic": "تأثير الذكاء الاصطناعي على التعليم العالي",
  "language": "ar",
  "options": {
    "max_papers": 15,
    "target_length": "long",
    "review_only": false,
    "include_methodology": true,
    "include_findings": true,
    "find_gaps": true
  }
}
// Note: "options" is accepted only for paid tier users; ignored for trial/demo.

// Response (201):
{
  "slug": "a1b2c3d4e5f6a7b8",
  "project": {
    "id": 1,
    "slug": "a1b2c3d4e5f6a7b8",
    "title": "تأثير الذكاء الاصطناعي على التعليم العالي",
    "status": "draft",
    "language": "ar",
    "createdAt": "2026-04-02T10:00:00Z"
  }
}
```

**Clarify**:
```json
// POST /related-work/api/clarify
// Request:
{ "topic": "تأثير الذكاء الاصطناعي على التعليم العالي", "language": "ar" }

// Response:
{
  "sufficient": false,
  "clarifications": [
    { "id": "field", "question": "ما المجال المحدد؟ (علوم الحاسوب، التربية، الهندسة...)", "hint": "حدد التخصص الأكاديمي" },
    { "id": "scope", "question": "هل تريد التركيز على فترة زمنية محددة؟", "hint": "مثلاً: آخر 5 سنوات" },
    { "id": "methodology", "question": "هل تبحث عن منهجيات معينة؟", "hint": "تجريبية، نظرية، مراجعة منهجية" },
    { "id": "focus", "question": "ما الجوانب التي تهمك أكثر؟", "hint": "مثلاً: التعلم الآلي، معالجة اللغات، الروبوتات" }
  ],
  "initial_assessment": "موضوع واسع يتطلب تحديد نطاق أدق للحصول على نتائج مركزة"
}
```

**Generate (SSE stream)**:
```
// POST /related-work/api/{slug}/generate
// Request:
{ "clarifications": [{"id":"field","answer":"علوم الحاسوب والتربية"}, ...] }

// Response: text/event-stream
data: {"type":"step","agent":"scout","icon":"🔍","message":"استخراج الكلمات المفتاحية..."}

data: {"type":"step","agent":"scout","icon":"🔍","message":"البحث في شمرا أكاديميا..."}

data: {"type":"step","agent":"scout","icon":"🔍","message":"تم العثور على 23 بحث","papers_found":23}

data: {"type":"step","agent":"curator","icon":"📋","message":"تقييم صلة الأبحاث بموضوعك..."}

data: {"type":"step","agent":"curator","icon":"📋","message":"تمت تصفية 12 بحث ذو صلة","relevant":12,"total":23}

data: {"type":"step","agent":"reader","icon":"📖","message":"تحليل البحث 3 من 12","progress":"3/12"}

data: {"type":"step","agent":"reader","icon":"📖","message":"اكتمل تحليل 12 بحث"}

data: {"type":"step","agent":"writer","icon":"✍️","message":"إنشاء قسم الدراسات المرجعية..."}

data: {"type":"content","chunk":"## الدراسات المرجعية\n\nتناولت عدة دراسات"}

data: {"type":"content","chunk":" تأثير الذكاء الاصطناعي على التعليم العالي من زوايا مختلفة..."}

data: {"type":"complete","project":{"slug":"a1b2c3d4","title":"...","wordCount":1500,"referencesCount":12}}
```
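
Because the generate route is a POST with a JSON body, the browser `EventSource` API (which only issues GET requests) does not fit directly; a client would more likely read the `fetch()` response stream and pass decoded text to a parser. The JavaScript below sketches such a parser under the assumption that events are separated by a blank line and use `\n` line endings, as in the stream above. `createSseParser` is a hypothetical name.

```javascript
// Returns a feed(chunk) function that buffers text and emits one parsed event per blank-line boundary.
function createSseParser(onEvent) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    let boundary;
    while ((boundary = buffer.indexOf('\n\n')) !== -1) {
      const rawEvent = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      const data = rawEvent
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).replace(/^ /, '')) // strip one optional leading space
        .join('\n');
      if (data) onEvent(JSON.parse(data));
    }
  };
}
```

Buffering matters because network chunks do not align with event boundaries: a single `content` event may arrive split across two reads.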

---

## Frontend Design

### Homepage Button

Add a third button in the `.search-mode-toggle` div (file: `homepage_new.html.twig`, line ~1037):

```html
<div class="search-mode-toggle">
    <button type="button" class="search-mode-btn active" data-mode="search">
        <i class="fa fa-search"></i> بحث
    </button>
    <button type="button" class="search-mode-btn chat-mode" data-mode="chat">
        <i class="fa fa-magic"></i> اسألني
    </button>
    <a href="/related-work" class="search-mode-btn related-work-mode">
        <i class="fa fa-sitemap"></i> دراسات مرجعية
    </a>
</div>
```

This is a link (not a mode toggle) — clicking navigates to the dedicated page.

### Main Page Layout (`/related-work`)

Inspired by Elicit and Perplexity — three-panel layout:

```
┌──────────────────────────────────────────────────────────────────┐
│  Header / Nav                                                     │
├────────────┬──────────────────────────────┬───────────────────────┤
│            │                              │                       │
│  Projects  │     Main Content Area        │    References         │
│  Sidebar   │                              │    Panel              │
│            │  ┌─────────────────────────┐  │                       │
│  ┌──────┐  │  │ Title (editable)        │  │  ┌─────────────────┐ │
│  │ Proj │  │  │                         │  │  │ Paper Card 1    │ │
│  │  1   │  │  │ Agent Steps Timeline    │  │  │ Score: 0.92     │ │
│  │ ✅   │  │  │ 🔍 → 📋 → 📖 → ✍️     │  │  │ [shamra]        │ │
│  ├──────┤  │  │                         │  │  ├─────────────────┤ │
│  │ Proj │  │  │ Related Work Content    │  │  │ Paper Card 2    │ │
│  │  2   │  │  │ (streaming / rendered)  │  │  │ Score: 0.87     │ │
│  │ ⏳   │  │  │                         │  │  │ [arxiv]         │ │
│  ├──────┤  │  │                         │  │  ├─────────────────┤ │
│  │ Proj │  │  │                         │  │  │ Paper Card 3    │ │
│  │  3   │  │  │                         │  │  │ Score: 0.81     │ │
│  │ 📝   │  │  │                         │  │  │ [user_lib]      │ │
│  └──────┘  │  │                         │  │  └─────────────────┘ │
│            │  └─────────────────────────┘  │                       │
│  [+ New]   │  [Copy] [Download DOCX]       │  12 references        │
│            │                              │                       │
├────────────┴──────────────────────────────┴───────────────────────┤
│  Footer                                                           │
└──────────────────────────────────────────────────────────────────┘
```

#### Left Sidebar — Project List
- Scrollable list of user's projects
- Each item: title (truncated), date, status badge (✅/⏳/❌/📝)
- Click to navigate/switch
- "+ جديد" (New) button at bottom
- Search/filter input for many projects

#### Center — Main Content
- **No project selected**: Create form with topic textarea + language toggle + "إنشاء" button
- **Draft project**: Clarification Q&A form + Reference Library Tip (see below) + "توليد" (Generate) button
- **Processing project**: Agent steps timeline animating in + content streaming with typing animation
- **Completed project**: Editable title (inline), rendered content (editable textarea), action buttons (Copy, Download DOCX, Delete)
- **Failed project**: Error message + retry button

#### Right Panel — References (collapsible)
- Paper cards with: title, authors, year, source badge (shamra/arxiv/user), relevance score
- Click paper title → opens in new tab (Shamra research page or ArXiv)
- Populated after generation completes

#### Responsive Behavior
- Mobile: sidebar collapses to horizontal scrollable list at top
- Tablet: references panel collapses to toggle button
- RTL: Full support — Arabic projects render RTL, English projects render LTR

#### Pre-Generation Tip — "Boost Your Results" Banner

Displayed between the clarification answers and the Generate button. A soft, non-blocking info card that encourages users to enrich their reference library before generating.

**Design** (RTL example):
```
┌─────────────────────────────────────────────────────────────┐
│  📚  حسّن نتائجك!                                          │
│                                                             │
│  سنبحث تلقائياً في شمرا أكاديميا و ArXiv عن أبحاث ذات صلة. │
│                                                             │
│  💡 هل تعلم؟ يمكنك أيضاً رفع ملفات PDF أو استيراد           │
│  اقتباسات BibTeX إلى مكتبتك المرجعية، وسنستخدمها كمصادر    │
│  إضافية عند إنشاء الدراسات المرجعية.                        │
│                                                             │
│  📄 لديك حالياً 7 مراجع في مكتبتك.                          │
│                                                             │
│  ┌──────────────────────────┐                                │
│  │ 📂 فتح المكتبة المرجعية │  (opens /myreferences)         │
│  └──────────────────────────┘                                │
│                                                      [إخفاء]│
└─────────────────────────────────────────────────────────────┘
```

**English variant:**
```
┌─────────────────────────────────────────────────────────────┐
│  📚  Boost your results!                                    │
│                                                             │
│  We'll automatically search Shamra Academia & ArXiv for     │
│  relevant papers.                                           │
│                                                             │
│  💡 Did you know? You can also upload PDF files or import   │
│  BibTeX citations to your Reference Library — we'll use     │
│  them as additional sources when generating your study.     │
│                                                             │
│  📄 You currently have 7 references in your library.        │
│                                                             │
│  ┌──────────────────────────────┐                            │
│  │ 📂 Open Reference Library   │  (opens /myreferences)     │
│  └──────────────────────────────┘                            │
│                                                      [Hide] │
└─────────────────────────────────────────────────────────────┘
```

**Behavior:**
- Appears after clarification Q&A is completed, before the Generate button
- Shows the user's current reference count (fetched via existing `/myreferences/api/list` endpoint)
- "Open Reference Library" button opens `/myreferences` in a new tab so users don't lose their draft
- Dismissible via [Hide] link — remembers preference in `localStorage` (key: `rw_hide_ref_tip`)
- If user has 0 references: extra emphasis — "لم تضف أي مراجع بعد" / "You haven't added any references yet"
- If user returns from `/myreferences` (tab focus), auto-refresh the reference count
- Subtle design: light blue/purple background, rounded corners, no hard border — informational, not alarming
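
The dismissal and zero-reference rules can be sketched in JavaScript with the storage object passed in, so the same logic works against `window.localStorage` or a stub. Only the key `rw_hide_ref_tip` comes from the behavior list above; the stored value `'1'` and the helper names are assumptions.

```javascript
const HIDE_TIP_KEY = 'rw_hide_ref_tip';

// True when the user has not dismissed the banner.
function shouldShowTip(storage) {
  return storage.getItem(HIDE_TIP_KEY) !== '1';
}

// Called from the [Hide] link.
function hideTip(storage) {
  storage.setItem(HIDE_TIP_KEY, '1');
}

// Choose which copy to render: extra emphasis when the library is empty.
function tipVariant(referenceCount) {
  return referenceCount > 0 ? 'count' : 'empty';
}
```

In the page this would be called as `shouldShowTip(window.localStorage)` before rendering the banner.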

#### Agent Steps Timeline
```
🔍 البحث عن أبحاث — تم العثور على 23 بحث          ✅ 3.2s
📋 تصفية الأبحاث — 12 بحث ذو صلة من أصل 23        ✅ 4.1s
📖 تحليل الأبحاث — 12/12                            ✅ 18.5s
✍️ إنشاء قسم الدراسات المرجعية — 1,500 كلمة         ✅ 12.8s
```

When replaying on page load for completed projects, steps appear instantly (no animation). During live generation, each step animates in with a spinner → checkmark transition.

---

## Files To Create

| File | Purpose | Pattern/Reference |
|------|---------|-------------------|
| `src/Entity/RelatedWorkProject.php` | Entity | `OcrJob.php` (lifecycle) + `PlaygroundProject.php` (ownership) |
| `src/Repository/RelatedWorkProjectRepository.php` | Repository | `OcrJobRepository.php` |
| `src/Service/RelatedWorkService.php` | Project CRUD + emails | `OcrJobService.php` |
| `src/Service/Playground/Agent/RelatedWorkOrchestrator.php` | Multi-agent pipeline | `ResearchPlanOrchestrator.php` |
| `src/Controller/RelatedWorkController.php` | Routes & API | `OcrController.php` + `ResearchChatController.php` |
| `templates/related_work/index.html.twig` | Main page template | New (Elicit/Perplexity-inspired) |
| `templates/emails/related_work_complete.html.twig` | Completion email | `templates/emails/ocr_complete.html.twig` |
| `templates/emails/related_work_failed.html.twig` | Failure email | `templates/emails/ocr_failed.html.twig` |
| `playground_prompts/related_work/clarify.md` | Clarification prompt | `initiate_research_plan.md` |
| `playground_prompts/related_work/search_keywords.md` | Keyword extraction | `research_chat_keywords_prompt.md` |
| `playground_prompts/related_work/relevance_score.md` | Paper scoring | `research_chat_filter_prompt.md` |
| `playground_prompts/related_work/analyze_paper.md` | Paper analysis | New |
| `playground_prompts/related_work/compose_related_work.md` | Final composition | New |
| `translations/RelatedWork.ar.yml` | Arabic translations | New |
| `translations/RelatedWork.en.yml` | English translations | New |
| `migrations/Version20260402100000.php` | DB migration | Existing pattern |

## Files To Modify

| File | Change | Line/Section |
|------|--------|-------------|
| `homepage_new.html.twig` | Add "دراسات مرجعية" button in `.search-mode-toggle` | ~line 1037 |
| `src/Service/UsageMonitorService.php` | Add `'related_work' => 20` to `OPERATION_CREDITS` | Constants section |
| `templates/header.html.twig` | Add nav link (near OCR and Playground links) | Tools dropdown |

## Services Reused (read-only)

| Service | Usage |
|---------|-------|
| `AzureOpenAIService` | LLM calls for all 5 agents + `loadPrompt()` |
| `PlaygroundRAGService` | `searchShamra()` + ArXiv search via Scout agent |
| `ReferenceIndexingService` | `search()` for user reference library via Scout agent |
| `UsageMonitorService` | Credit check, deduction, refund, usage logging |
| `PlaygroundSubscriptionService` | Subscription validation |

---

## Implementation Phases

### Phase 1: Foundation (Entity + Migration + Prompts)
1. Create `RelatedWorkProject` entity with all fields
2. Create repository with query methods
3. Create migration (`utf8mb4_unicode_ci` collation)
4. Write all 5 prompt files in `playground_prompts/related_work/`
5. Add `'related_work' => 20` to `UsageMonitorService`

### Phase 2: Backend (Orchestrator + Service + Controller)
6. Create `RelatedWorkOrchestrator` — implement 5-agent pipeline with `\Generator` yielding SSE events
7. Create `RelatedWorkService` — project CRUD, lifecycle management, email sending
8. Create `RelatedWorkController` — all 9 routes, SSE streaming endpoint
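
The generator-based streaming in step 6 can be pictured with a minimal Python sketch (the production orchestrator is PHP, using `\Generator`). The agent order comes from this document; the event names `agent_start`, `agent_done`, and `complete`, the `format_sse` helper, and the demo agents are assumptions for illustration only.

```python
import json
from typing import Callable, Iterator, List, Tuple

Agent = Tuple[str, Callable[[dict], dict]]

def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame: event line, data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"

def run_pipeline(agents: List[Agent], state: dict) -> Iterator[str]:
    """Run agents in order, yielding an SSE frame before and after each one."""
    for name, step in agents:
        yield format_sse("agent_start", {"agent": name})
        state = step(state)
        yield format_sse("agent_done", {"agent": name})
    yield format_sse("complete", {"content": state.get("content", "")})

# Hypothetical stand-ins for two of the real agents.
demo_agents: List[Agent] = [
    ("scout", lambda s: {**s, "papers": ["p1", "p2"]}),
    ("writer", lambda s: {**s, "content": f"Reviewed {len(s['papers'])} papers."}),
]
frames = list(run_pipeline(demo_agents, {"topic": "demo"}))
```

Because the pipeline is a generator, the controller can flush each frame to the client as soon as it is produced instead of waiting for the whole run.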

### Phase 3: Frontend (Templates + JS + CSS)
9. Create `templates/related_work/index.html.twig` — full page with three-panel layout
10. Implement SSE JavaScript (EventSource API, agent timeline animations, content streaming)
11. Add homepage button in `homepage_new.html.twig`
12. Add nav link in `templates/header.html.twig`

### Phase 4: Notifications + Polish
13. Create email templates (completion + failure)
14. Create translation files (AR + EN)
15. Test full flow end-to-end
16. Mobile responsive testing

---

## Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Entry point | Dedicated `/related-work` page (link, not mode toggle) | Feature is too complex for an inline panel — needs project management, sidebar navigation, references panel |
| Processing model | SSE streaming (not async Messenger) | Live progress UX like Perplexity — users see agents working in real-time. Email still sent on completion for users who navigate away |
| Credit cost | 20 credits | Multiple AI agents involved (5 LLM calls minimum), significant token consumption. Comparable to 4 research chat questions |
| Paper sources | Shamra + ArXiv + User Reference Library | Maximum coverage — users benefit from their own uploaded papers being integrated |
| Prompt storage | `playground_prompts/related_work/` | Isolated directory keeps prompts organized and independently editable |
| Data model | Single entity with JSON columns | References and agent steps are read-heavy, write-once — JSON avoids unnecessary joins. No child entities needed |
| Content editing | Simple textarea with auto-save | A rich-text editor (CKEditor/TinyMCE) is deferred to a later release (Future Enhancements #15). Markdown rendering covers most formatting needs |
| Download format | DOCX via Pandoc (day 1) | Reuses existing Pandoc infrastructure from OCR. PDF download is deferred to a later release (Future Enhancements #14) |

---

## Verification Checklist

- [ ] Homepage button renders correctly and links to `/related-work`
- [ ] Unauthenticated users redirected to login
- [ ] Unsubscribed users see upgrade prompt
- [ ] Create project → status = draft → appears in sidebar
- [ ] Clarify endpoint returns meaningful questions for vague topics
- [ ] Clarify endpoint returns `sufficient=true` for specific topics
- [ ] Generate endpoint streams SSE events correctly
- [ ] Agent steps appear in order with correct icons and messages
- [ ] Content streams in with typing animation
- [ ] Completed project: title editable, content editable, copy works, DOCX download works
- [ ] References sidebar shows correct papers with source badges
- [ ] 20 credits deducted at generation start
- [ ] Credits refunded on failure
- [ ] Denied if insufficient credits (< 20)
- [ ] Completion email arrives with correct subject, title, CTA link
- [ ] Failure email arrives with error reason + refund notice
- [ ] Arabic project: content in Arabic, UI RTL, citations in Arabic
- [ ] English project: content in English, UI LTR, citations in English
- [ ] Multiple projects: sidebar lists all, switching works
- [ ] Mobile: sidebar collapses, layout responsive
- [ ] Concurrent limit: max 2 active generations per user
- [ ] DB migration runs cleanly with `utf8mb4_unicode_ci` collation
- [ ] Connection drop mid-stream: refresh shows partial state, can retry
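
The credit and concurrency items in this checklist imply an ordering: check limits, deduct, run, refund on failure. A hedged Python sketch of that ordering follows; the `Wallet` class, `GenerationDenied`, and `generate_with_credits` are hypothetical names, and the real logic lives in `UsageMonitorService` and `RelatedWorkService`.

```python
from dataclasses import dataclass
from typing import Callable

GENERATION_COST = 20  # credits per generation (from this document)
MAX_ACTIVE = 2        # concurrent generations per user (from this document)

class GenerationDenied(Exception):
    """Raised when a generation may not start."""

@dataclass
class Wallet:  # hypothetical stand-in for the user's credit balance
    credits: int

def generate_with_credits(wallet: Wallet, active_count: int, run: Callable[[], str]) -> str:
    if active_count >= MAX_ACTIVE:
        raise GenerationDenied("too many active generations")
    if wallet.credits < GENERATION_COST:
        raise GenerationDenied("insufficient credits")
    wallet.credits -= GENERATION_COST      # deducted at generation start
    try:
        return run()
    except Exception:
        wallet.credits += GENERATION_COST  # refunded on failure
        raise
```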

---

## Future Enhancements

Sorted by priority — high-impact features that differentiate the product first, then quality/retention improvements, then polish.

### Priority 1 — Core Pipeline Upgrades (promote to Phase 1.5)

These should ship soon after launch — they fundamentally improve output quality and user trust.

#### 1. Paper Review Step (Include/Exclude toggle) ⭐
**Impact: Critical** | **Effort: Medium**

After the Curator agent filters papers, pause the pipeline and show users the candidate list *before* the Writer runs. Users can:
- **Exclude** irrelevant papers (uncheck)
- **Pin** "must-include" papers (star)
- **See why** each paper was selected (relevance reason from Curator)

This gives researchers control and trust — critical for academic tools. Without it, users may distrust the AI's paper selection.

**UX**: Insert a review step between Curator and Reader in the SSE stream. The stream pauses, UI shows paper cards with toggles, user clicks "Continue" to resume.

**API change**: `POST /related-work/api/{slug}/generate` becomes two-phase — first half runs Scout+Curator, returns candidate list. `POST /related-work/api/{slug}/continue` resumes with user's selections.
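
How the `continue` phase might apply the user's selections can be sketched as below. This is an illustration in Python, not the actual handler: the `apply_review` function, the `id`/`score`/`reason` keys, and the rule that pinned papers are listed first are all assumptions.

```python
from typing import Dict, Iterable, List

def apply_review(candidates: List[Dict], excluded: Iterable[str], pinned: Iterable[str]) -> List[Dict]:
    """Drop excluded papers, flag pinned ones, and list pinned papers first."""
    excluded_ids, pinned_ids = set(excluded), set(pinned)
    kept = [
        {**paper, "pinned": paper["id"] in pinned_ids}
        for paper in candidates
        if paper["id"] not in excluded_ids
    ]
    # sorted() is stable, so the Curator's order is preserved within each group.
    return sorted(kept, key=lambda paper: not paper["pinned"])

candidates = [
    {"id": "shamra-1", "score": 0.91, "reason": "Directly on topic"},
    {"id": "arxiv-7", "score": 0.62, "reason": "Related method"},
    {"id": "lib-3", "score": 0.55, "reason": "Background"},
]
selection = apply_review(candidates, excluded=["arxiv-7"], pinned=["lib-3"])
```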

#### 2. Semantic Outline Before Writing ⭐
**Impact: Critical** | **Effort: Medium**

After the Reader agent analyzes papers, show a proposed thematic outline before the Writer composes:
```
📋 Proposed outline:
├── Theme 1: Machine learning in medical diagnosis (4 papers)
├── Theme 2: Natural language processing for medical records (3 papers)
├── Theme 3: Robotics in surgery (2 papers)
├── Identified research gaps
└── Summary and future directions

Users can reorder themes, rename them, or remove a theme entirely. Then the Writer follows the approved outline.

**Why**: Avoids the "AI wrote something I don't want" problem. Researchers know what they'll get before spending the final (most expensive) agent call.
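
One possible representation of the approved outline, sketched in Python under stated assumptions: the `Theme` dataclass and `apply_outline_edits` are hypothetical, and the rule that a theme missing from `order` counts as removed is a design choice made for this example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple

@dataclass(frozen=True)
class Theme:  # hypothetical shape of one outline entry
    key: str
    title: str
    paper_ids: Tuple[str, ...]

def apply_outline_edits(outline: List[Theme], order: List[str],
                        renames: Optional[Dict[str, str]] = None) -> List[Theme]:
    """Reorder themes by key, apply renames, and drop themes absent from `order`."""
    by_key = {theme.key: theme for theme in outline}
    renames = renames or {}
    return [
        replace(by_key[key], title=renames.get(key, by_key[key].title))
        for key in order
        if key in by_key
    ]

outline = [
    Theme("ml", "Machine learning in medical diagnosis", ("p1", "p2", "p3", "p4")),
    Theme("nlp", "NLP for medical records", ("p5", "p6", "p7")),
    Theme("robotics", "Robotics in surgery", ("p8", "p9")),
]
approved = apply_outline_edits(outline, order=["nlp", "ml"], renames={"ml": "ML-based diagnosis"})
```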

#### 3. Cross-Language Search
**Impact: High** | **Effort: Low**

If the user writes the topic in Arabic, the Scout agent should also translate the keywords to English and search both the `arabic_research` and `english_research` ES indices (and vice versa for English topics). The Writer still composes in the user's chosen language, but draws from a wider paper pool.

This is low effort because both indices already exist and `PlaygroundRAGService` supports both. Just add keyword translation in the Scout agent's keyword extraction prompt.
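
A minimal Python sketch of the merge step, assuming the Scout agent already holds keywords per language. The injected `search` callable and the `id`/`score` keys on each hit are assumptions for illustration; only the two index names come from this document.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str, List[str]], List[Dict]]  # (index, keywords) -> hits

INDEX_FOR_LANG = {"ar": "arabic_research", "en": "english_research"}

def cross_language_search(search: SearchFn, keywords_by_lang: Dict[str, List[str]]) -> List[Dict]:
    """Query each language's index and merge hits, keeping the best score per paper."""
    best: Dict[str, Dict] = {}
    for lang, keywords in keywords_by_lang.items():
        for hit in search(INDEX_FOR_LANG[lang], keywords):
            current = best.get(hit["id"])
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)
```

Deduplicating by paper id guards against a paper that is indexed in both languages appearing twice in the candidate list.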

#### 4. Gap Analysis Section
**Impact: High** | **Effort: Low**

After the Writer finishes the main content, auto-generate a final section:
```
### Research Gaps
Based on the review of the prior literature, the following gaps were identified:
1. No studies address X in the context of Y
2. Most studies focused on Z and did not cover W
3. There is a lack of empirical research on...
```

This is especially valuable for thesis writers, who must state the gap their work fills, and most tools do not generate one. It is low effort because the Reader agent already has all the analysis data; just add a gap extraction instruction to the Writer prompt.

#### 5. Iterative Refinement ("Regenerate Section")
**Impact: High** | **Effort: Medium**

After generation, let users select a paragraph/theme and request:
- "Expand this section with more papers"
- "Add more papers about X"
- "This section is weak, find more sources on Y"
- "Rewrite this paragraph in a more formal tone"

Only the Writer agent re-runs on the selected section (not the full pipeline), so a refinement costs **5 credits** (`related_work_refine`) instead of 20.

**New prompt**: `playground_prompts/related_work/refine_section.md`
**New API**: `POST /related-work/api/{slug}/refine` with `{section_index, instruction}`
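
How `section_index` could map onto the stored content is not defined above. The Python sketch below assumes sections are delimited by Markdown headings; `split_sections` and `replace_section` are hypothetical helpers, not existing code.

```python
import re
from typing import List

def split_sections(markdown: str) -> List[str]:
    """Split content at each Markdown heading line; each section keeps its heading."""
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [part for part in parts if part.strip()]

def replace_section(markdown: str, section_index: int, new_text: str) -> str:
    """Swap one section for the Writer's refined text, leaving the others untouched."""
    sections = split_sections(markdown)
    if not 0 <= section_index < len(sections):
        raise IndexError(f"section_index {section_index} out of range")
    sections[section_index] = new_text.rstrip("\n") + "\n"
    return "".join(sections)

content = "## Theme A\nOriginal text.\n\n## Theme B\nWeak paragraph.\n"
refined = replace_section(content, 1, "## Theme B\nExpanded paragraph with more sources.")
```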

---

### Priority 2 — Quality & Retention Features

#### 6. "Add to My Library" on Each Reference Card
**Impact: Medium** | **Effort: Low**

After generation, each paper card in the right panel gets a one-click "📥 Save to Library" button. Uses existing `POST /myreferences/api/upload` bulk save endpoint.

Creates a virtuous loop: generate → discover papers → save to library → next generation uses them as "user reference" sources. Drives reference library adoption.

#### 7. Confidence Indicators Per Citation
**Impact: Medium** | **Effort: Low**

In the references panel, show *why* each paper was included:
- "Directly addresses your topic" (score ≥ 0.8): green badge
- "Provides methodological context" (0.6 ≤ score < 0.8): blue badge
- "Background reference" (0.5 ≤ score < 0.6): gray badge

Data already available from the Curator agent's relevance scoring. Just surface it in the UI.
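
The score-to-badge mapping can be sketched as a small Python function, assuming a boundary score such as 0.8 falls in the higher tier and that scores below 0.5 get no badge (both are assumptions; `confidence_badge` is a hypothetical name).

```python
from typing import Optional, Tuple

def confidence_badge(score: float) -> Optional[Tuple[str, str]]:
    """Map a Curator relevance score to a (label, badge color) pair."""
    if score >= 0.8:
        return ("Directly addresses your topic", "green")
    if score >= 0.6:
        return ("Provides methodological context", "blue")
    if score >= 0.5:
        return ("Background reference", "gray")
    return None  # assumed: papers below 0.5 are not shown with a badge
```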

#### 8. Output Style/Template Presets
**Impact: Medium** | **Effort: Low**

Let users choose output style before generation:
- **📖 Thesis Chapter** — verbose, 2000+ words, detailed per-paper analysis, comprehensive gaps section
- **📄 Journal Article** — concise, 800-1200 words, thematic grouping, focused gaps
- **📋 Conference Paper** — brief, 500-800 words, key highlights only

Implemented by adding a `{{style}}` variable to the Writer prompt. No pipeline changes needed.
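
A minimal sketch of the `{{style}}` substitution in Python. The preset keys and `render_writer_prompt` are hypothetical; the preset descriptions are taken from the list above, and the example template is invented for illustration.

```python
STYLE_PRESETS = {
    "thesis_chapter": "verbose, 2000+ words, detailed per-paper analysis, comprehensive gaps section",
    "journal_article": "concise, 800-1200 words, thematic grouping, focused gaps",
    "conference_paper": "brief, 500-800 words, key highlights only",
}

def render_writer_prompt(template: str, style_key: str) -> str:
    """Fill the {{style}} placeholder; an unknown style key raises KeyError."""
    return template.replace("{{style}}", STYLE_PRESETS[style_key])

# Hypothetical template line, not the contents of compose_related_work.md.
prompt = render_writer_prompt("Write the section in this style: {{style}}.", "journal_article")
```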

#### 9. Citation Style Selector
**Impact: Medium** | **Effort: Low**

Let users choose APA/MLA/Chicago/Harvard/IEEE/Vancouver before generation. Already supported in the research plan pipeline — reuse the same citation style parameter in the Writer prompt.

#### 10. Semantic Scholar as 4th Source
**Impact: Medium** | **Effort: Medium**

Add [Semantic Scholar API](https://api.semanticscholar.org/) as a fourth paper source in the Scout agent. The API is free to use and has good coverage, especially for non-Arabic topics. It is preferable to Google Scholar, which offers no official API and would require scraping.

**New service**: `SemanticScholarService.php` — simple HTTP client for `/graph/v1/paper/search` endpoint.
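
The request such a service would issue can be sketched in Python. Only URL construction is shown (no network call); `build_search_url` is a hypothetical helper, and the chosen `fields` list is an assumption about what the Scout agent would need.

```python
from typing import Sequence
from urllib.parse import urlencode

BASE_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def build_search_url(
    query: str,
    limit: int = 10,
    fields: Sequence[str] = ("title", "abstract", "year", "authors", "externalIds"),
) -> str:
    """Build the paper search URL with the query, result limit, and requested fields."""
    params = {"query": query, "limit": limit, "fields": ",".join(fields)}
    return f"{BASE_URL}?{urlencode(params)}"

url = build_search_url("arabic nlp", limit=5)
```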

---

### Priority 3 — Resilience & Polish

#### 11. Persistent Generation (Survive Tab Close)
**Impact: Medium** | **Effort: Medium**

If the user closes the tab mid-stream, the generation keeps running server-side. The orchestrator saves progress to DB after each agent step. When the user returns:
- Completed steps shown instantly
- Partial content displayed
- If still running: SSE stream reconnects to the in-progress generation

**Implementation**: Hybrid model — orchestrator writes to DB after each agent, SSE is just a "view" layer. New endpoint: `GET /related-work/api/{slug}/stream` reconnects to an in-progress generation.
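
The hybrid model can be illustrated with a small Python sketch in which the pipeline writes to a record and the stream only reads from it. `ProjectRecord`, `run_and_persist`, `stream_view`, and the `last_seen` cursor are hypothetical; in the real system the record would be the project row and its agent-steps JSON column.

```python
from typing import Callable, Dict, Iterator, List, Tuple

class ProjectRecord:
    """Hypothetical in-memory stand-in for a persisted project row."""
    def __init__(self) -> None:
        self.status = "processing"
        self.agent_steps: List[Dict] = []

def run_and_persist(record: ProjectRecord, agents: List[Tuple[str, Callable[[], str]]]) -> None:
    """Run each agent and save its result immediately, independent of any client."""
    for name, step in agents:
        record.agent_steps.append({"agent": name, "summary": step()})
    record.status = "completed"

def stream_view(record: ProjectRecord, last_seen: int = 0) -> Iterator[Dict]:
    """Replay persisted steps the client has not seen yet."""
    for index in range(last_seen, len(record.agent_steps)):
        yield {"id": index, **record.agent_steps[index]}

record = ProjectRecord()
run_and_persist(record, [("scout", lambda: "12 papers"), ("curator", lambda: "7 kept")])
missed = list(stream_view(record, last_seen=1))
```

A returning client passes the number of steps it already rendered and receives only the remainder.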

#### 12. Shareable Read-Only Link
**Impact: Medium** | **Effort: Low**

Generate a public `/related-work/view/{slug}` URL (no auth required). Users can share with advisors or colleagues.
- "📋 Copy Link" button in project actions
- Read-only view: rendered content + references sidebar, no edit controls
- Optional: password-protect via a simple token query param
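
The optional token check might look like the following Python sketch. `new_share_token` and `can_view` are hypothetical names, and storing one token per project is an assumption.

```python
import hmac
import secrets
from typing import Optional

def new_share_token() -> str:
    """Generate an unguessable, URL-safe token for the share link."""
    return secrets.token_urlsafe(16)

def can_view(stored_token: Optional[str], supplied_token: Optional[str]) -> bool:
    """Allow access when no token is set, or when the supplied token matches."""
    if stored_token is None:
        return True  # link is public without protection
    if supplied_token is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(stored_token.encode(), supplied_token.encode())

token = new_share_token()
```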

#### 13. Duplicate Project
**Impact: Low** | **Effort: Very Low**

One-click "📄 Clone" button to create a new project with the same topic and clarification answers. Useful for:
- Regenerating with different style/citation preferences
- Creating variations with modified scope
- A/B testing different clarification answers

**API**: `POST /related-work/api/{slug}/duplicate` — copies topic, clarifications, language; resets status to draft.
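
A sketch of the copy semantics in Python, assuming a simplified `Project` shape (hypothetical; the real entity is `RelatedWorkProject`). Generated content and references are deliberately not copied.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Project:  # simplified, hypothetical shape
    slug: str
    topic: str
    language: str
    clarifications: Dict[str, str] = field(default_factory=dict)
    status: str = "draft"
    content: str = ""
    references: List[dict] = field(default_factory=list)

def duplicate(project: Project, new_slug: str) -> Project:
    """Copy topic, clarifications, and language into a fresh draft project."""
    return Project(
        slug=new_slug,
        topic=project.topic,
        language=project.language,
        clarifications=copy.deepcopy(project.clarifications),
    )

original = Project("ml-dx", "ML in diagnosis", "ar", {"scope": "2015-2025"}, "completed", "text")
clone = duplicate(original, "ml-dx-copy")
```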

#### 14. PDF Download
**Impact: Low** | **Effort: Low**

Add PDF export alongside DOCX. Uses existing Pandoc infrastructure from OCR (xelatex or wkhtmltopdf fallback). References formatted as proper bibliography.

#### 15. Rich-Text Editor
**Impact: Low** | **Effort: Medium**

Replace simple textarea with CKEditor/TinyMCE for formatted content editing. Enables bold, headings, lists, tables without writing markdown. Low priority because markdown rendering already covers most needs.

#### 16. Admin Dashboard Tab
**Impact: Low** | **Effort: Low**

Add "Related Work" stats tab to the admin dashboard (`/jim19ud83/playground/dashboard`):
- Total projects generated, success/failure rate
- Average generation time, popular topics
- Credit revenue from this feature
- Papers per generation distribution

#### 17. Integration with Playground
**Impact: Low** | **Effort: Low**

"Import to Playground" button — creates a new `PlaygroundProject` with the Related Work content pre-filled. Users can then continue editing in the full playground editor with chat, AI tools, etc.

---

### Priority 4 — Future Vision

#### 18. Auto-Suggest Related Topics
After generation, suggest 2-3 related research directions based on the identified gaps. Clicking a suggestion pre-fills a new project with that topic.

#### 19. Version History
Track edits to generated content. Show diff view between versions. Allow rollback. Useful for collaborative review workflows.

#### 20. Collaborative Editing
Multiple users working on the same project — shared access via invite link, real-time sync, comment threads on sections.

#### 21. Batch Generation
Generate Related Work for multiple topics at once — useful for literature review courses or research groups. Queue-based via Symfony Messenger.

#### 22. Research Network Visualization
After generation, show a visual graph of paper relationships — which papers cite each other, thematic clusters, chronological timeline. Uses D3.js or similar.
