# Trending Research — Algorithm & Architecture

## Overview

The "Trending Now" feature surfaces research articles that are **gaining unusual attention right now**. It replaces the old "Most Read" tab, which always showed the same all-time popular articles. The design is inspired by Google Search Console's "Trending Up" metric.

---

## Algorithm: Velocity-Based Scoring

```
velocity_score = (views_24h)² / (views_7d + 1)
```

- **Squaring 24h views** heavily rewards recent bursts of activity
- **Dividing by 7-day baseline** normalizes against always-popular articles
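- Example: a long-tail article with 2 views earlier in the week that suddenly gets 8 today (7-day total 10) scores 64 / 11 ≈ 5.8, which is **higher** than a popular article with 50k views in the last 7 days that got 200 today (40,000 / 50,001 ≈ 0.8)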
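
The formula is small enough to check by hand. The sketch below is in Python for illustration only (the production logic lives in the PHP service); the function name and the 7-day totals of 10 and 50,000 are assumptions chosen to match the example above.

```python
def velocity_score(views_24h: int, views_7d: int) -> float:
    """Velocity score as defined above: (views_24h)^2 / (views_7d + 1)."""
    return views_24h ** 2 / (views_7d + 1)


# Long-tail article: 8 views today, 10 in the last 7 days (assumed numbers).
long_tail = velocity_score(8, 10)        # 64 / 11
# Popular article: 200 views today, 50,000 in the last 7 days (assumed numbers).
popular = velocity_score(200, 50_000)    # 40,000 / 50,001

print(f"long-tail: {long_tail:.2f}, popular: {popular:.2f}")
# prints "long-tail: 5.82, popular: 0.80"
```

The `+ 1` in the denominator avoids division by zero when no views are recorded in the 7-day window.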

### Qualification Threshold

- Minimum **2 unique views in the last 24 hours** to qualify
- "Unique" means session-deduplicated: repeat views of the same article from the same session don't count
- Top **20 articles per language** (Arabic / English) are stored
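
A minimal sketch of how the threshold and the top-20 cut could be applied to one language's candidates. It is written in Python for illustration, not taken from the PHP service; `Candidate`, `rank_trending`, and the slug tie-break are hypothetical.

```python
from typing import NamedTuple

MIN_VIEWS_24H = 2   # qualification threshold from this section
TOP_N = 20          # articles kept per language


class Candidate(NamedTuple):
    slug: str
    views_24h: int   # session-deduplicated views in the last 24 hours
    views_7d: int    # session-deduplicated views in the last 7 days


def rank_trending(candidates: list[Candidate], top_n: int = TOP_N) -> list[tuple[str, float]]:
    """Return (slug, score) pairs for qualifying articles, best first."""
    scored = [
        (c.slug, c.views_24h ** 2 / (c.views_7d + 1))
        for c in candidates
        if c.views_24h >= MIN_VIEWS_24H
    ]
    # Sort by score descending; tie-break on slug so the output is deterministic
    # (the tie-break rule is an assumption, not taken from the source).
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


ranked = rank_trending([
    Candidate("paper-a", 8, 10),
    Candidate("paper-b", 1, 1),          # below the threshold, dropped
    Candidate("paper-c", 200, 50_000),
])
print([slug for slug, _ in ranked])
```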

### Hybrid Backfill

If fewer than 20 articles qualify by velocity, the remaining slots are filled with **top all-time popular articles** (sorted by cumulative `hits` from Elasticsearch), excluding any already shown by velocity. This ensures the tab always shows 20 articles, even during cold-start or low-traffic periods.
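
The backfill rule can be sketched as follows. This is an illustrative Python version, assuming both inputs are already-ordered slug lists; the function name `backfill` is hypothetical.

```python
def backfill(trending: list[str], all_time_popular: list[str], size: int = 20) -> list[str]:
    """Fill the list up to `size` slugs: velocity results first, then popular ones."""
    result = list(trending[:size])
    seen = set(result)
    for slug in all_time_popular:
        if len(result) >= size:
            break
        if slug not in seen:   # skip articles already shown by velocity
            result.append(slug)
            seen.add(slug)
    return result


print(backfill(["a", "b"], ["b", "c", "d"], size=4))
# prints ['a', 'b', 'c', 'd']
```

Velocity results always keep their positions at the top; popular articles only occupy the slots that remain.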

---

## Data Flow

```
User views article
        │
        ▼
ResearchController::showAction()
        │
        ├── Session check (dedup: one view per article per session)
        │
        ├── hitsUpdater() → increments cumulative `hits` in MySQL + ES
        │
        └── TrendingResearchService::logView() → INSERT into research_view_log
                                                  (slug, es_index, viewed_at)

Every 30 minutes (cron):
        │
        ▼
app:compute-trending command
        │
        ├── Query research_view_log for 24h and 7d counts per slug
        ├── Calculate velocity_score for each slug with ≥2 views/24h
        ├── Store top 20 per locale in trending_research_cache table
        └── With --purge (separate daily cron): delete view logs older than 30 days

User visits homepage:
        │
        ▼
HomepageController
        │
        ├── Read trending_research_cache (precomputed slugs + scores)
        ├── Fetch those articles from Elasticsearch by slug list
        ├── Preserve velocity score ordering
        ├── Backfill remaining slots with top all-time hits
        └── Cache result in APCu for 30 minutes
```
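
The session check in the first branch of the diagram can be sketched like this. It is an illustrative Python version, not the controller's actual code; the session is modeled as a plain dict, and the `viewed_research` key and `record_view` name are hypothetical.

```python
from typing import Callable


def record_view(session: dict, slug: str, log_view: Callable[[str], None]) -> bool:
    """Log a view at most once per article per session.

    Returns True if the view was logged, False if it was a repeat.
    """
    viewed = session.setdefault("viewed_research", set())  # hypothetical session key
    if slug in viewed:
        return False
    viewed.add(slug)
    log_view(slug)   # stands in for the insert into research_view_log
    return True


log: list[str] = []
session: dict = {}
print(record_view(session, "paper-a", log.append))  # first view in this session
print(record_view(session, "paper-a", log.append))  # repeat view, not logged
print(log)
```

Because the check runs before the insert, the view log itself does not need a session column.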

---

## Database Tables

### `research_view_log`

Stores individual view events for trending computation.

| Column | Type | Description |
|--------|------|-------------|
| `id` | BIGINT AUTO_INCREMENT | Primary key |
| `slug` | VARCHAR(255) | Research article slug |
| `es_index` | VARCHAR(50) | `arabic_research` or `english_research` |
| `viewed_at` | DATETIME | Timestamp of the view |

**Indexes:** `idx_slug_viewed_at` (slug, viewed_at), `idx_viewed_at` (viewed_at)
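
To show how the 24h and 7d counts can come out of this table in one pass, here is a sketch using Python's built-in `sqlite3` with an in-memory database. The query shape is an assumption rather than the service's actual SQL, and the production database is MySQL, where date arithmetic would more likely use `DATE_SUB(NOW(), ...)`.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE research_view_log ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " slug TEXT, es_index TEXT, viewed_at TEXT)"
)
conn.execute("CREATE INDEX idx_viewed_at ON research_view_log (viewed_at)")

now = datetime(2026, 3, 2, 12, 0, 0)  # fixed "now" so the example is reproducible
fmt = "%Y-%m-%d %H:%M:%S"
rows = [
    ("paper-a", "english_research", now - timedelta(hours=1)),
    ("paper-a", "english_research", now - timedelta(hours=5)),
    ("paper-a", "english_research", now - timedelta(days=3)),
    ("paper-b", "english_research", now - timedelta(days=2)),
    ("paper-b", "english_research", now - timedelta(days=10)),  # outside the 7-day window
]
conn.executemany(
    "INSERT INTO research_view_log (slug, es_index, viewed_at) VALUES (?, ?, ?)",
    [(slug, idx, ts.strftime(fmt)) for slug, idx, ts in rows],
)

# One scan over the last 7 days; the 24h count is a conditional sum.
counts = conn.execute(
    """
    SELECT slug, es_index,
           SUM(CASE WHEN viewed_at >= :since_24h THEN 1 ELSE 0 END) AS views_24h,
           COUNT(*) AS views_7d
    FROM research_view_log
    WHERE viewed_at >= :since_7d
    GROUP BY slug, es_index
    HAVING SUM(CASE WHEN viewed_at >= :since_24h THEN 1 ELSE 0 END) >= 2
    ORDER BY slug
    """,
    {
        "since_24h": (now - timedelta(hours=24)).strftime(fmt),
        "since_7d": (now - timedelta(days=7)).strftime(fmt),
    },
).fetchall()
print(counts)
# prints [('paper-a', 'english_research', 2, 3)]
```

The range predicate on `viewed_at` is the kind of filter that `idx_viewed_at` is suited to serve.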

### `trending_research_cache`

Precomputed trending results, written by cron, read by web.

| Column | Type | Description |
|--------|------|-------------|
| `id` | INT AUTO_INCREMENT | Primary key |
| `locale` | VARCHAR(10) | `arabic` or `english` |
| `slug` | VARCHAR(255) | Research article slug |
| `views_24h` | INT | Views in last 24 hours |
| `views_7d` | INT | Views in last 7 days |
| `velocity_score` | DOUBLE | Computed score |
| `sort_order` | INT | Rank (0 = highest velocity) |
| `computed_at` | DATETIME | When this was last computed |

**Index:** `idx_locale_sort` (locale, sort_order)
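
Reading this table ordered by `sort_order` gives the ranking, but a search backend queried with a list of slugs does not necessarily return documents in that order. A minimal Python sketch of restoring the cached order, assuming each fetched document is a dict with a `slug` field (`order_by_rank` is a hypothetical name):

```python
def order_by_rank(ranked_slugs: list[str], fetched: list[dict]) -> list[dict]:
    """Reorder fetched documents to match the cached ranking."""
    by_slug = {doc["slug"]: doc for doc in fetched}
    # Slugs missing from the fetched results (e.g. removed articles) are skipped.
    return [by_slug[slug] for slug in ranked_slugs if slug in by_slug]


ranked_slugs = ["paper-b", "paper-a", "paper-c"]          # from trending_research_cache
fetched = [{"slug": "paper-a"}, {"slug": "paper-b"}]      # arbitrary order, one missing
print(order_by_rank(ranked_slugs, fetched))
# prints [{'slug': 'paper-b'}, {'slug': 'paper-a'}]
```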

> **Why a DB table instead of APCu?** The cron command runs in CLI mode where `apc.enable_cli=Off`, and even if enabled, CLI APCu uses a separate memory space from the web SAPI. A DB table is readable by both CLI (writer) and web (reader).

---

## Key Files

| File | Purpose |
|------|---------|
| `src/Service/TrendingResearchService.php` | Core service: `logView()`, `computeTrending()`, `getTrendingSlugs()`, `purgeOldEntries()` |
| `src/Command/ComputeTrendingCommand.php` | Console command `app:compute-trending` with `--purge` and `--purge-days` options |
| `src/syndex/AcademicBundle/Controller/HomepageController.php` | `indexAction()` (anonymous users), `homepageTabAction()` (AJAX tab), `getTrendingResults()` |
| `src/syndex/AcademicBundle/Controller/ResearchController.php` | `showAction()` — calls `logView()` after session dedup check |
| `migrations/Version20260302100000.php` | Creates `research_view_log` and `trending_research_cache` tables |

---

## Cron Jobs (www-data)

```cron
# Recompute trending scores every 30 minutes
0,30 * * * * cd /var/www/html/academia_v2 && php bin/console app:compute-trending --env=prod >> /var/www/html/academia_v2/var/log/trending.log 2>&1

# Daily purge of view logs older than 30 days (3:30 AM)
30 3 * * * cd /var/www/html/academia_v2 && php bin/console app:compute-trending --purge --env=prod >> /var/www/html/academia_v2/var/log/trending.log 2>&1
```
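
What the purge step amounts to can be sketched as a single cutoff-based delete. This is an illustrative Python version against an in-memory SQLite table, not the PHP `purgeOldEntries()` implementation; the function signature is an assumption.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def purge_old_entries(conn: sqlite3.Connection, now: datetime, days: int = 30) -> int:
    """Delete view-log rows older than `days` days; return how many were removed."""
    cutoff = (now - timedelta(days=days)).strftime(FMT)
    cur = conn.execute("DELETE FROM research_view_log WHERE viewed_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE research_view_log (slug TEXT, es_index TEXT, viewed_at TEXT)")
now = datetime(2026, 3, 2, 3, 30, 0)  # fixed "now" so the example is reproducible
conn.executemany(
    "INSERT INTO research_view_log VALUES (?, ?, ?)",
    [
        ("old-paper", "arabic_research", (now - timedelta(days=45)).strftime(FMT)),
        ("new-paper", "arabic_research", (now - timedelta(days=2)).strftime(FMT)),
    ],
)

deleted = purge_old_entries(conn, now)
remaining = conn.execute("SELECT slug FROM research_view_log").fetchall()
print(deleted, remaining)
# prints 1 [('new-paper',)]
```

Since scoring only looks back 7 days, a 30-day retention leaves a wide margin before any row needed for scoring is removed.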

---

## Translations

| Key | English | Arabic |
|-----|---------|--------|
| `homepage.tab.trending` | Trending Now | رائج الآن |
| `homepage.trending.title` | Trending Research | أبحاث رائجة الآن |
| `homepage.trending.subtitle` | Research gaining unusual attention right now | أبحاث تشهد اهتماماً متزايداً مؤخراً |

Translation files: `src/syndex/AcademicBundle/Resources/translations/Academia.{en,ar}.yml`

---

## Monitoring & Debugging

```bash
# Count views and distinct slugs logged in the last 24 hours, per index
sudo -u www-data php bin/console dbal:run-sql --env=prod \
  "SELECT es_index, COUNT(*) as total, COUNT(DISTINCT slug) as unique_slugs FROM research_view_log WHERE viewed_at > DATE_SUB(NOW(), INTERVAL 24 HOUR) GROUP BY es_index"

# See current trending cache
sudo -u www-data php bin/console dbal:run-sql --env=prod \
  "SELECT * FROM trending_research_cache ORDER BY locale, sort_order"

# Force recompute
sudo -u www-data php bin/console app:compute-trending --env=prod

# Recompute + purge old logs
sudo -u www-data php bin/console app:compute-trending --purge --purge-days=30 --env=prod

# Check trending log
tail -50 /var/www/html/academia_v2/var/log/trending.log
```

---

## Design Decisions

1. **Velocity over raw counts**: Prevents popular articles from permanently dominating
2. **Session-based dedup**: Same dedup as the existing `hits` counter — one view per session per article
3. **DB cache over APCu**: CLI/web APCu isolation makes APCu unusable for cron-computed data
4. **Hybrid backfill**: Ensures the tab always shows 20 articles even with little trending data
5. **30-minute compute cycle**: Balances freshness with DB load; results are also APCu-cached for 30 min on the web side, so the displayed list can be based on a computation up to about 60 minutes old
6. **30-day purge**: Keeps `research_view_log` compact; only recent data matters for trending
