Scheduling & Event Ingestion

Overview

The Infomaxim API includes two key background processing services for maintaining data freshness and extracting event information:

  • Stock Scheduler - Automatically refreshes product stock levels at configurable intervals
  • Event Ingestion Service - Extracts detailed event information from websites using AI

Both services run as background processes and require valid settings in the active config file. They work together during each scheduler cycle to keep your event and product data current.

Note: Both services require database connectivity and appropriate configuration values. The scheduler acts as the orchestrator, triggering both stock updates and event processing on each cycle.

Stock Scheduler

The Stock Scheduler is a background process that automatically runs stock update cycles at regular intervals. It's designed for multi-tenant environments and can process multiple application IDs in a single cycle.

Purpose

  • Maintain current product inventory levels
  • Support multi-tenant architectures with app-specific scheduling
  • Trigger event ingestion as part of each cycle
  • Run at configurable intervals, with an enforced minimum as a safety threshold

How It Works

  1. Scheduler starts on application initialization (if enabled)
  2. At each interval, retrieves the latest scheduler settings
  3. For each configured app ID:
    • Processes stock updates via the Stock Service
    • Triggers event ingestion for any pending events
  4. Logs cycle status and results
  5. Waits for the next interval

Minimum Interval: 60 seconds (enforced at runtime, even if configured lower)
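
The minimum-interval rule can be illustrated with a small sketch. The helper name resolveIntervalMs is hypothetical and is not part of the documented API, and falling back to the documented 5-minute default for missing or non-numeric values is an assumption.

```javascript
// Hypothetical helper (not the service's real API): converts the configured
// intervalMinutes into milliseconds and applies the documented 60-second floor.
const MIN_INTERVAL_MS = 60 * 1000;
const DEFAULT_INTERVAL_MINUTES = 5; // documented default

function resolveIntervalMs(intervalMinutes) {
  const minutes = Number(intervalMinutes);
  // Assumption: missing or non-numeric values fall back to the default.
  const requestedMs =
    (Number.isFinite(minutes) ? minutes : DEFAULT_INTERVAL_MINUTES) * 60 * 1000;
  return Math.max(requestedMs, MIN_INTERVAL_MS);
}
```

With this sketch, a configured value of 0.5 would still produce a 60-second wait between cycles.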

Scheduler Configuration

Configure the Stock Scheduler directly in config/default.js (development) or config/production.js (production). The services read these settings exclusively through the config module; environment variables are not used for them.

Configuration Parameters

Property | Type | Default | Description
stockScheduler.enabled | boolean | true | Enable or disable the scheduler entirely
stockScheduler.intervalMinutes | number | 5 | How often to run cycles, in minutes. Minimum: 1 (a 60-second floor is enforced at runtime)
stockScheduler.appIds | number[] | [] (falls back to app.id) | Array of app IDs to process (e.g., [8, 9, 42]). If empty, the scheduler uses the main app ID from app.id.
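
The appIds fallback can be sketched as follows. The function name resolveAppIds is hypothetical; only the config keys stockScheduler.appIds and app.id come from the table above.

```javascript
// Hypothetical helper: returns the app IDs a cycle would process.
// Uses stockScheduler.appIds when it is a non-empty array, else [app.id].
function resolveAppIds(config) {
  const scheduler = config.stockScheduler || {};
  const configured = scheduler.appIds;
  if (Array.isArray(configured) && configured.length > 0) {
    return configured;
  }
  return [config.app.id];
}
```

For example, a config with app.id set to 42 and an empty appIds array would resolve to [42].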

Configuration File (config/default.js)

module.exports = {
  // ... other config ...

  stockScheduler: {
    enabled: true,
    intervalMinutes: 5,
    // Optional list of app IDs to process, e.g. [8, 9, 42];
    // defaults to app.id when empty.
    appIds: [],
  },

  // ... other config ...
};

Scheduler Configuration Examples

Example 1: Default Single-App Setup

Use the default 5-minute interval for your primary app. When appIds is empty the scheduler falls back to app.id:

// config/default.js
stockScheduler: {
  enabled: true,
  intervalMinutes: 5,
  appIds: [],  // falls back to app.id
},

Result: Scheduler runs every 5 minutes for the app ID defined in app.id


Example 2: Multi-Tenant Setup

Process multiple app IDs in a single scheduler cycle:

// config/production.js
stockScheduler: {
  enabled: true,
  intervalMinutes: 10,
  appIds: [8, 9, 42],
},

Result: Scheduler runs every 10 minutes, processing stock updates for app IDs 8, 9, and 42 in each cycle


Example 3: Frequent Updates

Update stock more frequently for real-time inventory:

// config/production.js
stockScheduler: {
  enabled: true,
  intervalMinutes: 1,  // minimum enforced at runtime
  appIds: [42],
},

Result: Scheduler runs every minute, the shortest interval the runtime allows, for app ID 42


Example 4: Disable Scheduler

Turn off the scheduler (useful for testing or maintenance):

// config/default.js
stockScheduler: {
  enabled: false,
  intervalMinutes: 5,
  appIds: [],
},

Result: Scheduler is disabled; no cycles will run until re-enabled


Example 5: Production Multi-App Configuration

For a production environment managing multiple customer apps:

// config/production.js
stockScheduler: {
  enabled: true,
  intervalMinutes: 15,
  appIds: [10, 20, 30, 40, 50],
},

Result: Runs every 15 minutes for 5 different apps in production

Event Ingestion Service

The Event Ingestion Service automatically extracts detailed event information from website URLs using OpenAI's language models. It processes events queued in the aurora_events table and enriches them with AI-extracted data.

Purpose

  • Extract structured event data from unstructured website content
  • Validate and sanitize images and page content
  • Enforce size and length constraints for performance
  • Support multiple image formats (JPEG, PNG, GIF, WebP)
  • Integrate with the stock scheduler for automatic event processing

How It Works

  1. Service reads configuration on startup (API keys, size limits, model)
  2. On scheduler cycle or manual invocation:
    • Queries the aurora_events table for unprocessed events
    • For each event, fetches the associated website URL
    • Extracts images and page content with size validation
    • Sends content to OpenAI API for structured event extraction
    • Updates the event record with extracted details
  3. Handles network errors and validates DNS lookups

Requirement: OpenAI API key is mandatory. Without it, event ingestion cannot process any events. Requests will fail gracefully with informative error messages.

Event Ingestion Configuration

Configure the Event Ingestion Service directly in config/default.js (development) or config/production.js (production). Settings span two config keys: openAI for LLM connection details, and eventIngestion for content limits.

openAI Parameters

Property | Type | Default | Description
openAI.apiKey | string | '' (required) | Your OpenAI API key. Must be set for ingestion to work.
openAI.model | string | 'gpt-4.1-mini' | The OpenAI model to use for event extraction
openAI.timeoutMs | number | 30000 (30 seconds) | Timeout for OpenAI API requests, in milliseconds
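
How the service applies openAI.timeoutMs internally is not described here. As a rough illustration of the idea only, a hypothetical withTimeout helper can race any promise-returning call against a timer. Note that this sketch rejects the caller's promise but does not cancel the underlying request.

```javascript
// Hypothetical helper: rejects if `promise` does not settle within `timeoutMs`.
function withTimeout(promise, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  // Clear the timer in both outcomes so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller might wrap its extraction call as withTimeout(extractEvent(page), config.openAI.timeoutMs), where extractEvent is a placeholder for whatever function performs the API request.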

eventIngestion Parameters

Property | Type | Default | Description
eventIngestion.enabled | boolean | true | Enable or disable event ingestion
eventIngestion.imageMaxBytes | number | 10485760 (10 MB) | Maximum image file size to process, in bytes
eventIngestion.pageMaxChars | number | 120000 | Maximum page content length to extract, in characters
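
The two content limits can be pictured with a short sketch. Both function names are hypothetical, and the policy shown (skip oversized images, trim overlong page text) is an assumption; the service may instead reject content that exceeds a limit.

```javascript
// Limits mirror the documented eventIngestion defaults.
const limits = { imageMaxBytes: 10 * 1024 * 1024, pageMaxChars: 120000 };

// Hypothetical check: true when an image buffer fits within imageMaxBytes.
function isImageWithinLimit(imageBuffer, { imageMaxBytes }) {
  return imageBuffer.length <= imageMaxBytes;
}

// Hypothetical policy (assumption): trim page text to pageMaxChars
// rather than rejecting the page outright.
function clampPageText(text, { pageMaxChars }) {
  return text.length > pageMaxChars ? text.slice(0, pageMaxChars) : text;
}
```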

Supported Image Formats

  • JPEG (.jpg)
  • PNG (.png)
  • GIF (.gif)
  • WebP (.webp)
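
A format check based on the extensions listed above might look like the sketch below. This is an illustration under assumptions: the service may inspect content types or file signatures instead of URL extensions, and the function name is hypothetical. Only the four extensions shown in the list are included.

```javascript
const path = require('path');

// Extensions taken from the list above.
const SUPPORTED_EXTENSIONS = new Set(['.jpg', '.png', '.gif', '.webp']);

// Hypothetical helper: checks the extension of the URL path,
// ignoring query strings and letter case.
function hasSupportedImageExtension(imageUrl) {
  let pathname;
  try {
    pathname = new URL(imageUrl).pathname;
  } catch {
    return false; // not a parseable absolute URL
  }
  return SUPPORTED_EXTENSIONS.has(path.posix.extname(pathname).toLowerCase());
}
```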

Configuration File (config/default.js)

module.exports = {
  // ... other config ...

  openAI: {
    apiKey: '',
    model: 'gpt-4.1-mini',
    timeoutMs: 30000,
  },

  eventIngestion: {
    enabled: true,
    imageMaxBytes: 10 * 1024 * 1024,
    pageMaxChars: 120000,
  },

  // ... other config ...
};

Event Ingestion Configuration Examples

Example 1: Basic Setup with OpenAI

Minimal configuration to enable event ingestion using default content limits:

// config/default.js
openAI: {
  apiKey: 'sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  model: 'gpt-4.1-mini',
  timeoutMs: 30000,
},
eventIngestion: {
  enabled: true,
  imageMaxBytes: 10 * 1024 * 1024,
  pageMaxChars: 120000,
},

Result: Event ingestion enabled with default limits (10 MB images, 120k chars for pages)


Example 2: Strict Size Limits for Performance

Reduce processing overhead by lowering image and content size limits:

// config/production.js
openAI: {
  apiKey: 'sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  model: 'gpt-4.1-mini',
  timeoutMs: 30000,
},
eventIngestion: {
  enabled: true,
  imageMaxBytes: 2 * 1024 * 1024,  // 2 MB
  pageMaxChars: 50000,
},

Result: Images limited to 2 MB, page content to 50,000 characters


Example 3: Higher-Quality Extraction with Larger Limits

Configure for higher-quality event extraction with larger content allowances:

// config/production.js
openAI: {
  apiKey: 'sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  model: 'gpt-4',
  timeoutMs: 60000,
},
eventIngestion: {
  enabled: true,
  imageMaxBytes: 50 * 1024 * 1024,  // 50 MB
  pageMaxChars: 500000,
},

Result: Higher limits (50 MB images, 500k chars), using GPT-4 for better quality, 60s timeout for longer processing


Example 4: Disable Event Ingestion

Turn off event ingestion (scheduler will still run for stock updates, but skip ingestion steps):

// config/default.js
eventIngestion: {
  enabled: false,
  imageMaxBytes: 10 * 1024 * 1024,
  pageMaxChars: 120000,
},

Result: Events in the queue are not processed


Example 5: Production Setup

Recommended configuration for production environments:

// config/production.js
openAI: {
  apiKey: 'sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  model: 'gpt-4.1-mini',
  timeoutMs: 45000,
},
eventIngestion: {
  enabled: true,
  imageMaxBytes: 10 * 1024 * 1024,
  pageMaxChars: 120000,
},

Result: Balanced configuration for production: reasonable size limits, gpt-4.1-mini for efficiency, 45s timeout with safety margin

Integration & Dependencies

Service Dependencies

The scheduler and event ingestion services have the following dependencies:

Service | Depends On | Purpose
Stock Scheduler | Stock Service | Processes stock updates for products
Stock Scheduler | Event Ingestion Service | Triggers event processing during each cycle
Event Ingestion Service | aurora_events table | Reads event records to process
Event Ingestion Service | OpenAI API | Extracts structured event data
Event Ingestion Service | Network/DNS | Fetches website content for processing

Combined Configuration Example

Complete config/production.js configuration for both scheduler and event ingestion:

module.exports = {
  app: {
    id: 42,
    // ... other app config ...
  },

  sqlDb: {
    database: 'infomaxim',
    server: 'your-sql-server.database.windows.net',
    user: 'admin',
    password: 'your-secure-password',
    // ... other db config ...
  },

  openAI: {
    apiKey: 'sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    model: 'gpt-4.1-mini',
    timeoutMs: 30000,
  },

  stockScheduler: {
    enabled: true,
    intervalMinutes: 5,
    appIds: [42],
  },

  eventIngestion: {
    enabled: true,
    imageMaxBytes: 10 * 1024 * 1024,
    pageMaxChars: 120000,
  },
};

Data Flow

Scheduler Cycle Flow:
  1. Stock Scheduler starts (if enabled)
  2. Scheduler runs cycle every N minutes
  3. For each configured app ID:
    • Stock Service updates product inventory
    • Event Ingestion Service processes pending events from aurora_events table
    • OpenAI API extracts structured event data
    • Database records are updated with extracted information
  4. Cycle completes; logs are written
  5. Scheduler waits for next interval
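
The cycle flow above can be sketched as a single function. Everything here is illustrative: the collaborators are injected, and the method names updateStock and processPendingEvents are hypothetical stand-ins for the Stock Service and Event Ingestion Service. Isolating failures per app ID is also an assumption rather than documented behavior.

```javascript
// Sketch of one scheduler cycle. All collaborators are passed in,
// and their method names are hypothetical.
async function runCycle({ appIds, stockService, eventIngestion, ingestionEnabled, log }) {
  const results = [];
  for (const appId of appIds) {
    try {
      await stockService.updateStock(appId);
      if (ingestionEnabled) {
        await eventIngestion.processPendingEvents(appId);
      }
      results.push({ appId, ok: true });
    } catch (err) {
      // Assumption: one failing app should not block the remaining apps.
      log(`cycle failed for app ${appId}: ${err.message}`);
      results.push({ appId, ok: false, error: err.message });
    }
  }
  const succeeded = results.filter((r) => r.ok).length;
  log(`cycle complete: ${succeeded}/${results.length} apps succeeded`);
  return results;
}
```

Logging the per-app outcome in this way also makes it easier to see which app IDs were processed in each cycle.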

Troubleshooting

Scheduler Not Running

Problem: Scheduler appears to be disabled or not executing cycles.

Solutions:

  • Check that stockScheduler.enabled is true in the active config file
  • Verify app IDs are set correctly: either populate stockScheduler.appIds or ensure app.id is a valid number
  • Ensure database connectivity is working (both services require it)
  • Check application logs for scheduler startup messages
  • Verify minimum interval constraint: stockScheduler.intervalMinutes must be at least 1 (60 seconds enforced at runtime)

Event Ingestion Failures

Problem: Events are not being processed or ingestion is failing silently.

Solutions:

  • Missing OpenAI API Key: Ensure openAI.apiKey is set in the config file. Ingestion cannot proceed without it.
  • Quota or Rate Limit Exceeded: Check OpenAI account usage and billing. The account's rate limit or quota may have been reached.
  • Timeout Issues: If processing large pages/images, increase openAI.timeoutMs in the config (e.g., 60000)
  • Content Too Large: Verify your events don't exceed the eventIngestion.imageMaxBytes or eventIngestion.pageMaxChars limits in the config
  • Network Issues: Check DNS resolution and outbound connectivity to OpenAI API and event URLs
  • Disabled Ingestion: Check that eventIngestion.enabled is true in the active config file
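
Several of the checks above can be run in one pass before starting the application. The following is a minimal sketch; the function name and message strings are hypothetical, and it inspects only the config keys documented in this section.

```javascript
// Hypothetical preflight check: returns a list of human-readable problems
// found in the openAI and eventIngestion config keys.
function findIngestionConfigProblems(config) {
  const openAI = config.openAI || {};
  const ingestion = config.eventIngestion || {};
  const problems = [];

  if (!openAI.apiKey) problems.push('openAI.apiKey is empty');
  if (ingestion.enabled !== true) problems.push('eventIngestion.enabled is not true');
  if (!(openAI.timeoutMs > 0)) problems.push('openAI.timeoutMs is not a positive number');
  if (!(ingestion.imageMaxBytes > 0)) problems.push('eventIngestion.imageMaxBytes is not a positive number');
  if (!(ingestion.pageMaxChars > 0)) problems.push('eventIngestion.pageMaxChars is not a positive number');

  return problems;
}
```

An empty array means none of these particular problems were found; it does not confirm that the API key is valid or that the network is reachable.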

High OpenAI API Costs

Problem: Event ingestion is consuming too many API credits.

Solutions:

  • Reduce eventIngestion.imageMaxBytes in the config to process smaller images
  • Lower eventIngestion.pageMaxChars in the config to process shorter content
  • Switch to a more cost-efficient model (e.g., set openAI.model to 'gpt-4.1-mini' instead of 'gpt-4')
  • Increase the scheduler interval to reduce processing frequency (e.g., set stockScheduler.intervalMinutes to 30)
  • Disable ingestion by setting eventIngestion.enabled to false
  • Monitor API usage and schedule bulk event processing for off-peak hours
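
To see how the scheduler interval affects processing frequency, the number of cycles per day can be estimated with simple arithmetic. The helper below is a hypothetical illustration; it gives an upper bound on ingestion runs, while actual API usage also depends on how many events are pending in each cycle.

```javascript
const MINUTES_PER_DAY = 24 * 60;

// Hypothetical estimate: number of scheduler cycles in a day,
// applying the documented 1-minute floor to the configured interval.
function cyclesPerDay(intervalMinutes) {
  const effectiveMinutes = Math.max(intervalMinutes, 1);
  return Math.floor(MINUTES_PER_DAY / effectiveMinutes);
}
```

Moving from the default 5-minute interval to 30 minutes reduces the estimate from 288 cycles per day to 48.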

Database Connection Errors

Problem: Scheduler or event ingestion cannot connect to the database.

Solutions:

  • Verify database credentials in the config file: sqlDb.server, sqlDb.user, sqlDb.password
  • Ensure the aurora_events table exists in your database
  • Check network connectivity to the database server
  • Verify SQL Server is running and accepting connections on the configured port
  • Check firewall rules and IP whitelisting for the application server

Multi-App Scheduler Conflicts

Problem: When processing multiple app IDs, some apps are not being updated correctly.

Solutions:

  • Verify all IDs in stockScheduler.appIds are valid and exist in your system
  • Ensure each app has proper database access and permissions
  • Check logs to see which app IDs are being processed in each cycle
  • Increase stockScheduler.intervalMinutes in the config if processing is timing out
  • Consider splitting heavy processing across multiple scheduler instances