## Why async?
AI model processing can take anywhere from a few seconds to several minutes depending on:

- Model complexity and the specific endpoint
- Request parameters and input size
- Current system load
- Processing requirements

Rather than holding a connection open for the whole run, the async job API lets you:

- Submit multiple jobs in parallel
- Check status at your own pace
- Stream real-time progress updates
- Build responsive UIs that don’t block
## Job lifecycle
Every job moves through a series of states:

| State | Description |
|---|---|
| `queued` | Job is queued but not yet started |
| `processing` | Actively processing your request |
| `succeeded` | Completed successfully; outputs are ready |
| `failed` | Encountered an error; check the `error` field for details |
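
The lifecycle maps naturally onto a discriminated union. A minimal TypeScript sketch, assuming illustrative field names (`id`, `status`, `error`) rather than an official schema:

```typescript
// Hypothetical job shape inferred from the state table above;
// the field names are illustrative, not an official schema.
type JobState = 'queued' | 'processing' | 'succeeded' | 'failed';

interface Job {
  id: string;
  status: JobState;
  error?: string; // populated only when status is 'failed'
}
```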
## Checking job status
### Polling with `GET /jobs/:id`
The simplest approach is to poll the job endpoint until it reaches a terminal state.
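
A minimal sketch of such a loop, reusing the illustrative `Job` type from above; `BASE_URL` is a placeholder for your API host:

```typescript
const BASE_URL = 'https://api.example.com'; // placeholder, not a real endpoint

async function pollJob(jobId: string): Promise<Job> {
  let delayMs = 2_000; // start with a 2-second interval

  while (true) {
    const res = await fetch(`${BASE_URL}/jobs/${jobId}`);
    if (!res.ok) throw new Error(`Status check failed: ${res.status}`);

    const job = (await res.json()) as Job;
    if (job.status === 'succeeded' || job.status === 'failed') {
      return job; // terminal state reached; stop polling
    }

    await new Promise((resolve) => setTimeout(resolve, delayMs));
    delayMs = Math.min(delayMs * 2, 60_000); // exponential backoff, capped at 60s
  }
}
```

Capping the backoff keeps status checks timely on long jobs without hammering the endpoint.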
Polling best practices:

- Start with 2-3 second intervals
- Use exponential backoff (double the interval after each check)
- Max out at 30-60 second intervals for long jobs
- Stop polling once status is `succeeded` or `failed`
### Streaming with `GET /jobs/:id/stream`
For real-time updates, use Server-Sent Events (SSE) streaming. The stream emits the following event types (a consumer sketch follows the table):

| Event | Description |
|---|---|
| `status` | Job state changed; includes progress (0.0-1.0) |
| `log` | Human-readable progress message |
| `result` | Output is ready (for endpoints with multiple results) |
| `end` | Job reached terminal state (`succeeded` or `failed`) |
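
A browser-side consumer sketch using `EventSource`. The event names come from the table above, but the payload shapes (JSON with `status` and `progress` fields, and a final `Job` object on `end`) are assumptions:

```typescript
function streamJob(jobId: string): Promise<Job> {
  return new Promise((resolve, reject) => {
    const source = new EventSource(`${BASE_URL}/jobs/${jobId}/stream`);

    source.addEventListener('status', (event) => {
      // Assumed payload: { status: JobState, progress: number }
      const { status, progress } = JSON.parse((event as MessageEvent).data);
      console.log(`${status}: ${Math.round(progress * 100)}%`);
    });

    source.addEventListener('log', (event) => {
      console.log((event as MessageEvent).data);
    });

    source.addEventListener('end', (event) => {
      source.close(); // job is terminal; stop listening
      const job = JSON.parse((event as MessageEvent).data) as Job; // assumed payload
      if (job.status === 'succeeded') resolve(job);
      else reject(new Error(job.error ?? 'job failed'));
    });

    source.onerror = () => {
      // Treat connection errors as fatal in this sketch.
      source.close();
      reject(new Error('stream connection lost'));
    };
  });
}
```

`EventSource` reconnects automatically after network errors, so closing the stream explicitly on `end` avoids reopening the stream for a job that has already finished.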