HTTP Pagination Loop Feature Design
Overview
Add support for paginated HTTP API calls within iterator loops, allowing automatic continuation based on response inspection.
Use Case
- step: fetch_all_assessments
tool: http
url: "{{ workload.api_url }}/assessments"
method: GET
loop:
pagination:
type: response_based
continue_while: "{{ response.paging.hasMore == true }}"
next_page:
params:
page: "{{ (response.paging.page | int) + 1 }}"
pageSize: "{{ response.paging.pageSize }}"
merge_strategy: append
merge_path: "data"
max_iterations: 100
retry:
max_attempts: 3
backoff: exponential
initial_delay: 1
Loop Pagination Attributes
pagination Block
type (string, required)
response_based- Continue based on response inspectioncursor_based- Use cursor/token from response (future)offset_based- Use offset/limit pattern (future)
continue_while (jinja2 expression, required)
- Boolean expression evaluated against response
- Available variables:
{{ response }}- Full HTTP response body{{ iteration }}- Current iteration number (0-based){{ accumulated }}- Merged results so far
next_page (object, required)
- Defines how to modify request for next iteration
next_page.params (object, optional)
- Query parameters to update/add
- Supports Jinja2 expressions with response access
next_page.body (object, optional)
- Request body modifications for POST/PUT
- Supports Jinja2 expressions
next_page.headers (object, optional)
- Header modifications
- Useful for cursor tokens
merge_strategy (string, required)
append- Append arrays at merge_pathextend- Flatten nested arraysreplace- Keep only last responsecollect- Store all responses as array
merge_path (string, optional)
- JSONPath to data array in response
- Default: root level
- Examples:
data,results,items
max_iterations (integer, optional)
- Safety limit to prevent infinite loops
- Default: 1000
retry Block (reuse existing retry mechanism)
max_attempts (integer, optional)
- Number of retry attempts per request
- Default: 1 (no retry)
backoff (string, optional)
fixedorexponential- Default: fixed
initial_delay (number, optional)
- Seconds to wait before first retry
- Default: 1
max_delay (number, optional)
- Maximum backoff delay in seconds
- Default: 60
Implementation Plan
Phase 1: Core Pagination Loop (Current)
-
Extend iterator config (
noetl/tools/controller/iterator/config.py)- Extract
paginationblock from loop config - Validate pagination attributes
- Extract
-
Add pagination executor (
noetl/tools/controller/iterator/pagination.py)- New module for pagination logic
execute_paginated_http()function- Response inspection and continuation logic
- Result merging strategies
-
Update iterator executor (
noetl/tools/controller/iterator/executor.py)- Detect pagination block
- Delegate to pagination executor
- Handle merged results
-
Add retry integration (
noetl/tools/controller/iterator/retry.py)- Reuse existing retry mechanism
- Apply retry per HTTP request
- Exponential backoff support
Phase 2: Test Infrastructure
-
Mock HTTP server (
tests/fixtures/servers/paginated_api.py)- Flask-based mock API
- Configurable page size
- Realistic pagination response
- Error injection for retry testing
-
Test playbooks (
tests/fixtures/playbooks/pagination/)test_pagination_basic.yaml- Simple paginationtest_pagination_merge.yaml- Array mergingtest_pagination_retry.yaml- Failed requeststest_pagination_limits.yaml- Max iterations
-
Test script (
tests/scripts/test_pagination.sh)- Start mock server
- Run test playbooks
- Validate merged results
- Cleanup
Phase 3: Documentation
-
Feature doc (
documentation/docs/features/pagination.md)- Overview and use cases
- Configuration examples
- Merge strategies
- Best practices
-
Reference (
documentation/docs/reference/http_pagination.md)- Complete attribute reference
- Expression examples
- Error handling
Response Merging Strategies
Append Strategy
# Initial: accumulated = []
# Page 1 response: {"data": [1, 2, 3]}
# Result: accumulated = [1, 2, 3]
# Page 2 response: {"data": [4, 5, 6]}
# Result: accumulated = [1, 2, 3, 4, 5, 6]
Extend Strategy
# Initial: accumulated = []
# Page 1 response: {"data": [[1, 2], [3, 4]]}
# Result: accumulated = [1, 2, 3, 4]
# Page 2 response: {"data": [[5, 6], [7, 8]]}
# Result: accumulated = [1, 2, 3, 4, 5, 6, 7, 8]
Replace Strategy
# Initial: accumulated = None
# Page 1 response: {"data": [1, 2, 3]}
# Result: accumulated = {"data": [1, 2, 3]}
# Page 2 response: {"data": [4, 5, 6]}
# Result: accumulated = {"data": [4, 5, 6]} # Replaced
Collect Strategy
# Initial: accumulated = []
# Page 1 response: {"data": [1, 2, 3], "page": 1}
# Result: accumulated = [{"data": [1, 2, 3], "page": 1}]
# Page 2 response: {"data": [4, 5, 6], "page": 2}
# Result: accumulated = [{"data": [1, 2, 3], "page": 1}, {"data": [4, 5, 6], "page": 2}]
Example Playbooks
Basic Pagination
- step: fetch_users
tool: http
url: "{{ workload.api_url }}/users"
loop:
pagination:
type: response_based
continue_while: "{{ response.has_more == true }}"
next_page:
params:
offset: "{{ (response.offset | int) + (response.limit | int) }}"
merge_strategy: append
merge_path: "users"
Cursor-Based Pagination
- step: fetch_events
tool: http
url: "{{ workload.api_url }}/events"
loop:
pagination:
type: response_based
continue_while: "{{ response.next_cursor is defined and response.next_cursor }}"
next_page:
params:
cursor: "{{ response.next_cursor }}"
merge_strategy: append
merge_path: "events"
Page Number Pagination
- step: fetch_assessments
tool: http
url: "{{ workload.api_url }}/assessments"
params:
page: 1
pageSize: 100
loop:
pagination:
type: response_based
continue_while: "{{ response.paging.hasMore == true }}"
next_page:
params:
page: "{{ (response.paging.page | int) + 1 }}"
merge_strategy: append
merge_path: "data"
max_iterations: 50
retry:
max_attempts: 3
backoff: exponential
Variable Availability
Within pagination context, these variables are available in Jinja2 expressions:
{{ response }}- Current HTTP response (parsed JSON){{ iteration }}- Current iteration number (0-based){{ accumulated }}- Merged data so far (depends on merge_strategy){{ workload }}- Global workflow variables{{ vars }}- Execution-scoped variables
Error Handling
- HTTP Errors: Retried based on retry configuration
- Max Iterations: Stops with warning, returns accumulated data
- Invalid Response: Stops with error if continue_while evaluation fails
- Merge Errors: Stops with error if merge_path not found
Benefits
- Declarative: No custom code needed for pagination
- Flexible: Supports multiple pagination patterns
- Safe: Built-in limits prevent infinite loops
- Resilient: Integrated retry mechanism
- Debuggable: Each iteration logged with response inspection