Files
livedash-node/docs/scheduler-workflow.md
Kaj Kowalski cd05fc8648 fix: resolve Prettier markdown code block parsing errors
- Fix syntax errors in skills markdown files (.github/skills, .opencode/skills)
- Change typescript to tsx for code blocks with JSX
- Replace ellipsis (...) in array examples with valid syntax
- Separate CSS from TypeScript into distinct code blocks
- Convert JavaScript object examples to valid JSON in docs
- Fix enum definitions with proper comma separation
2026-01-20 21:09:29 +01:00

216 lines
4.5 KiB
Markdown

# Scheduler Workflow Documentation
## Overview
The LiveDash system has two main schedulers that work together to fetch and process session data:
1. **Session Refresh Scheduler** - Fetches new sessions from CSV files
2. **Processing Scheduler** - Processes session transcripts with AI
## Current Status (as of latest check)
- **Total sessions**: 107
- **Processed sessions**: 0
- **Sessions with transcript**: 0
- **Ready for processing**: 0
## How the `processed` Field Works
The ProcessingScheduler picks up sessions where `processed` is **NOT** `true`, which includes:
- `processed = false`
- `processed = null`
**Query used:**
```javascript
{
processed: {
not: true;
}
} // Either false or null
```
## Complete Workflow
### Step 1: Session Refresh (CSV Fetching)
**What it does:**
- Fetches session data from company CSV URLs
- Creates session records in database with basic metadata
- Sets `transcriptContent = null` initially
- Sets `processed = null` initially
**Runs:** Every 30 minutes (cron: `*/30 * * * *`)
### Step 2: Transcript Fetching
**What it does:**
- Downloads full transcript content for sessions
- Updates `transcriptContent` field with actual conversation data
- Sessions remain `processed = null` until AI processing
**Runs:** As part of session refresh process
### Step 3: AI Processing
**What it does:**
- Finds sessions with transcript content where `processed != true`
- Sends transcripts to OpenAI for analysis
- Extracts: sentiment, category, questions, summary, etc.
- Updates session with processed data
- Sets `processed = true`
**Runs:** Every hour (cron: `0 * * * *`)
## Manual Trigger Commands
### Check Current Status
```bash
node scripts/manual-triggers.js status
```
### Trigger Session Refresh (Fetch new sessions from CSV)
```bash
node scripts/manual-triggers.js refresh
```
### Trigger AI Processing (Process unprocessed sessions)
```bash
node scripts/manual-triggers.js process
```
### Run Both Schedulers
```bash
node scripts/manual-triggers.js both
```
## Troubleshooting
### No Sessions Being Processed?
1. **Check if sessions have transcripts:**
```bash
node scripts/manual-triggers.js status
```
2. **If "Sessions with transcript" is 0:**
- Sessions exist but transcripts haven't been fetched yet
- Run session refresh: `node scripts/manual-triggers.js refresh`
3. **If "Ready for processing" is 0 but "Sessions with transcript" > 0:**
- All sessions with transcripts have already been processed
- Check if `OPENAI_API_KEY` is set in environment
### Common Issues
#### "No sessions found requiring processing"
- All sessions with transcripts have been processed (`processed = true`)
- Or no sessions have transcript content yet
#### "OPENAI_API_KEY environment variable is not set"
- Add OpenAI API key to `.env.development` file
- Restart the application
#### "Error fetching transcript: Unauthorized"
- CSV credentials are incorrect or expired
- Check company CSV username/password in database
## Database Field Mapping
### Before AI Processing
```json
{
"id": "session-uuid",
"transcriptContent": "full conversation text or null",
"processed": null,
"sentimentCategory": null,
"questions": null,
"summary": null
}
```
### After AI Processing
```json
{
"id": "session-uuid",
"transcriptContent": "full conversation text",
"processed": true,
"sentimentCategory": "positive",
"questions": "[\"question 1\", \"question 2\"]",
"summary": "Brief conversation summary",
"language": "en",
"messagesSent": 5,
"sentiment": 0.8,
"escalated": false,
"forwardedHr": false,
"category": "Schedule & Hours"
}
```
## Scheduler Configuration
### Session Refresh Scheduler
- **File**: `lib/scheduler.js`
- **Frequency**: Every 30 minutes
- **Cron**: `*/30 * * * *`
### Processing Scheduler
- **File**: `lib/processingScheduler.js`
- **Frequency**: Every hour
- **Cron**: `0 * * * *`
- **Batch size**: 10 sessions per run
## Environment Variables Required
```bash
# Database
DATABASE_URL="postgresql://..."
# OpenAI (for processing)
OPENAI_API_KEY="sk-..."
# NextAuth
NEXTAUTH_SECRET="..."
NEXTAUTH_URL="http://localhost:3000"
```
## Next Steps for Testing
1. **Trigger session refresh** to fetch transcripts:
```bash
node scripts/manual-triggers.js refresh
```
2. **Check status** to see if transcripts were fetched:
```bash
node scripts/manual-triggers.js status
```
3. **Trigger processing** if transcripts are available:
```bash
node scripts/manual-triggers.js process
```
4. **View results** in the dashboard session details pages