Querying APIs Best Practices
- 2025-10-14
- 🔌 API
- Python API Data Engineering Best Practices
0. Building a Robust Python API Client
APIs are one of the most common ways to move data into your pipelines, but they come with pitfalls: timeouts, rate limits, transient errors, and memory issues. A fragile client can make extractions unreliable and hard to debug.
This guide distills battle-tested practices for building API clients in Python. Each tip is practical, with code snippets you can adapt. The sections are grouped into:
- Structure
- Safety & performance
- Error handling
- Advanced usage
so you can build from the basics up to production-ready patterns.
1. Structure & Basics
1.1. Encapsulate in a Class
Encapsulate all API logic inside a client class. This keeps configuration and state in one place and avoids global variables.
import requests

class QlikClient:
    def __init__(self, timeout_s=180, page_size=100, max_calls=900, debug=False):
        self.timeout_s = timeout_s
        self.page_size = page_size
        self.max_calls = max_calls
        self.debug = debug
        self.session = requests.Session()

    def _get(self, url=None, params=None, headers=None, timeout=None):
        """Do one query to a URL."""
        pass

    def query_all(self, endpoint, params=None):
        """Query until all items are retrieved."""
        pass
Using a class makes it easy to reuse settings and ensures a single responsibility: the client only handles API calls.
1.2. Reuse the Session
Use a requests.Session to reuse TCP connections and reduce latency.
# Inside __init__
self.session = requests.Session()
# Later
response = self.session.get(url, params=params, timeout=timeout)
This avoids reconnecting on each call, which can significantly improve performance (More info at requests docs).
Always close the session when done (self.session.close()) or implement __enter__/__exit__ for context manager support.
1.3. Log Intent
Log what the client is doing, especially the URL and parameters.
from prefect import get_run_logger

def _get(self, url=None, params=None, headers=None, timeout=None):
    logger = get_run_logger()
    params = params or {}
    headers = headers or {}
    timeout = timeout or self.timeout_s
    logger.info(f"Querying API with {url=} ({params=}, {timeout=})")
    response = self.session.get(url, headers=headers, params=params, timeout=timeout)
    response.raise_for_status()
    return response.json()
Be careful not to log sensitive info like API keys or tokens. Mask or omit them.
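A small helper can redact secrets before they reach the logs. The sketch below is one way to do it; the helper name and the set of sensitive keys are illustrative, so adapt them to the headers your API actually uses:

```python
# Hypothetical helper: redact sensitive header values before logging.
SENSITIVE_KEYS = {"authorization", "x-api-key", "cookie"}

def mask_headers(headers: dict) -> dict:
    """Return a copy of headers with sensitive values replaced by '***'."""
    return {
        key: "***" if key.lower() in SENSITIVE_KEYS else value
        for key, value in headers.items()
    }

masked = mask_headers({"Authorization": "Bearer abc123", "Accept": "application/json"})
print(masked)  # {'Authorization': '***', 'Accept': 'application/json'}
```

Call it on the headers dict just before the `logger.info(...)` line, never on the dict you actually send.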
2. Safety & Performance
2.1. Timeouts
Always set timeouts to avoid hanging requests.
response = self.session.get(
    url, headers=headers, params=params, timeout=(5, 30)
)
Here we use a connect timeout of 5s and a read timeout of 30s. See requests timeout docs.
Choose sensible defaults. For APIs, a few seconds connect timeout and a higher read timeout is typical.
2.2. Handle Pagination with for/else
Prevent infinite loops by limiting max pages.
class MaxApiCallsExceeded(Exception):
    pass

def query_all(self, url, params=None):
    logger = get_run_logger()
    params = params or {}
    next_url = None
    results = []
    for i in range(1, self.max_calls + 1):
        response = self._get(url=next_url or url, params=params)
        data = response.get("data") or []
        results += data
        next_url = response.get("next")
        prefix = f"[Page {i}/{self.max_calls}]"
        logger.info(f"{prefix} Retrieved {len(data)} rows (total={len(results)})")
        if not next_url:
            logger.info(f"{prefix} No next cursor → stopping")
            break
    else:
        raise MaxApiCallsExceeded(f"{self.max_calls} reached when querying {url}")
    return results
Python’s for/else lets you raise an error if the loop never breaks — a safeguard against infinite pagination.
2.3. Separate Concerns
The client should only handle API calls — not data processing.
Don’t mix Pandas/Spark into your client. Keep it lightweight and focused. Transformation belongs downstream.
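As a sketch of this split (the function names and field mapping are illustrative): the client hands back raw dicts, and a separate downstream function does the reshaping:

```python
# The client only fetches; transformation lives in a separate function.
def fetch_users(client):
    # Hypothetical client exposing query_all(); returns raw dicts, untouched.
    return client.query_all("/users")

def transform_users(raw_users: list[dict]) -> list[dict]:
    """Downstream step: rename fields, clean values, drop what we don't need."""
    return [
        {"user_id": u["id"], "name": u["name"].strip().title()}
        for u in raw_users
    ]

raw = [{"id": 1, "name": "  ada lovelace "}]
print(transform_users(raw))  # [{'user_id': 1, 'name': 'Ada Lovelace'}]
```

If the API schema changes, only `transform_users` needs to change; the client stays a thin, reusable fetcher.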
2.4. Export in Batches
For large datasets, don’t keep everything in memory. Process or export in chunks.
results = []
for page in client.query_all(endpoint):
    results += page
    if len(results) >= 10_000:
        export_results(results)
        results = []
if results:  # don't forget the final partial batch
    export_results(results)
2.5. Respect Rate Limits
Don’t just react to 429s — pace your calls proactively. A minimal, readable approach is to space requests evenly and honor common headers when present. For advanced concurrency + rate limiting, see Concurrent Async Calls to OpenAI with Rate Limiting.
import time

MAX_CALLS_PER_MIN = 100
SLEEP_S = 60 / MAX_CALLS_PER_MIN  # simple pacing

for payload in payloads:
    resp = session.get(BASE_URL, params=payload, timeout=30)
    # If the API guides you, follow it
    retry_after = resp.headers.get("Retry-After")
    if resp.status_code == 429 and retry_after:
        time.sleep(float(retry_after))  # seconds
        resp = session.get(BASE_URL, params=payload, timeout=30)
    resp.raise_for_status()
    handle(resp.json())
    time.sleep(SLEEP_S)  # spread calls to avoid bursts
Rate limits are provider‑specific. Prefer header‑driven sleeps (Retry-After, X-RateLimit-Remaining, X-RateLimit-Reset) over hardcoded delays. Check docs, e.g. GitHub API rate limits.
3. Error Handling & Reliability
3.1. Raise for Errors
Never ignore HTTP errors. Call raise_for_status() so any unexpected status code raises an exception.
response = self.session.get(url, params=params, timeout=self.timeout_s)
status_code = response.status_code
if status_code == 204:  # No content, stop gracefully
    return []
elif status_code == 429:
    # Too many requests → could retry with backoff
    handle_rate_limit()
elif status_code >= 400:  # Switch to 'if' if you want to log all errors
    # Log any non-successful response
    body_preview = (response.text or "")[:500]
    logger.error(
        f"HTTP error {status_code} at {url=}: {response.reason=} {body_preview=}"
    )
response.raise_for_status()  # Raise for all other errors
Fail fast on unexpected errors — silent failures can corrupt downstream data.
3.2. Backoff for Transient Failures
Use exponential backoff for rate limits or known errors.
import backoff

class RateLimitError(Exception):
    pass

@backoff.on_exception(backoff.expo, RateLimitError, max_tries=5)
def _query_data(self, url, params):
    response = self.session.get(url, params=params, timeout=self.timeout_s)
    if response.status_code == 429:
        raise RateLimitError("Hit rate limit")
    response.raise_for_status()
    return response.json()
Backoff reduces stress on APIs and increases resilience. See backoff docs.
3.3. Avoid Self‑Recursion
Don’t retry by recursively calling the same function. Use loops or backoff instead.
Recursive retries can hit Python’s recursion limit (~1000) and crash the program.
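A loop-based retry with exponential backoff keeps the stack flat no matter how many attempts it takes. A minimal sketch, where `TransientError` and the callables are hypothetical:

```python
import time

class TransientError(Exception):
    pass

def fetch_with_retry(fetch, max_tries=5, base_delay_s=1.0):
    """Retry a callable with exponential backoff -- a loop, not recursion."""
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except TransientError:
            if attempt == max_tries:
                raise  # out of retries: let the caller see the failure
            time.sleep(base_delay_s * 2 ** (attempt - 1))

# A fetch that fails twice, then succeeds:
attempts = {"n": 0}
def flaky_fetch():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("try again")
    return {"data": "ok"}

print(fetch_with_retry(flaky_fetch, base_delay_s=0.01))  # {'data': 'ok'}
```

Each attempt returns to the loop before trying again, so 1,000 retries cost 1,000 iterations, not 1,000 stack frames.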
3.4. Export Partial Results on Failure
Always save what you have if extraction fails mid‑way.
results = []
try:
    for i in range(self.max_calls):
        results += query_api()
# If you need extra logging, uncomment:
# except Exception as e:
#     logger.error(f"Unexpected {e=}")
#     raise  # Re-raises the exception
finally:
    if results:
        logger.info(f"Exporting {len(results)} results after failure")
        export_results(results)
    # 'return', 'break' or 'continue' are not allowed here:
    # they would swallow the exception and you wouldn't notice failures
You can use finally with a try block to run code whether the try block fails or succeeds.
This is really useful for tasks like exporting partial results.
Don’t exit from finally: no return, break, or continue, since these suppress any exception raised in try. Also avoid raising new errors here: wrap the export in its own try/except so an export failure doesn’t mask the original exception. If you need to return a value, do it after the try/finally.
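One way to keep the export from masking the original exception is to wrap it in its own try/except. A minimal sketch with a hypothetical safe_export helper:

```python
import logging

logger = logging.getLogger("extract")

def safe_export(results, export):
    """Run an export inside finally without masking the original exception."""
    if not results:
        return
    try:
        export(results)
    except Exception:
        # Log and swallow: a failure here must not replace
        # the exception that aborted the extraction.
        logger.exception("Failed to export partial results")

# The original error survives even if the export itself blows up:
def boom(_):
    raise IOError("disk full")

try:
    try:
        raise ValueError("extraction failed mid-way")
    finally:
        safe_export([1, 2, 3], boom)
except ValueError as e:
    print(e)  # extraction failed mid-way  (not 'disk full')
```

The trade-off: an export failure is only logged, so make sure those logs are monitored.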
4. Advanced Usage
4.1. Stream or Yield Data
For very large responses, yield results instead of accumulating.
def query_all(self, url, params=None):
    next_url = url
    for i in range(1, self.max_calls + 1):
        response = self._get(url=next_url, params=params)
        yield response.get("data", [])
        if not (next_url := response.get("next")):
            break
Streaming reduces memory usage. See requests streaming.
4.2. Clean Up Resources
Support context managers so the session is always closed.
class QlikClient:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.session.close()
Usage:
with QlikClient() as client:
    data = client.query_all("/endpoint")
If you are running your code on ephemeral infrastructure, you can skip this since your machine is terminated after running the extraction.
4.3. Query From the Last Successful Extraction
Rather than hard‑coding time windows (e.g., yesterday), use your destination as the source of truth:
- Read the destination watermark: max(last_modified_at) (or similar).
- Subtract a small margin (e.g., 60 s) for clock skew/late writes.
- Use that timestamp as min_ts/updated_since in the API.
Minimal sketch with generic helpers:
TABLE = "dest.users"
# max(last_modified_at) from destination with a margin
min_ts = infer_min_ts(table=TABLE, col="last_modified_at", margin_s=60)
data = query_all(endpoint, min_ts)
write(data, table=TABLE)
It’s important to handle deduplication later.
You can read more about this at Processing new data | Self-Healing Pipelines.
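As a minimal illustration of downstream deduplication, a sketch that keeps only the most recent row per key (the column names are assumptions; match them to your schema):

```python
def dedupe_latest(rows, key="id", ts="last_modified_at"):
    """Keep only the most recent row per key -- run downstream, after loading."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[ts] > current[ts]:
            latest[row[key]] = row
    return list(latest.values())

rows = [
    {"id": 1, "last_modified_at": "2025-10-01", "name": "old"},
    {"id": 1, "last_modified_at": "2025-10-14", "name": "new"},
    {"id": 2, "last_modified_at": "2025-10-10", "name": "only"},
]
print(dedupe_latest(rows))  # keeps the 'new' row for id 1, plus id 2
```

Because the incremental window overlaps by the safety margin, re-extracted rows are expected; deduplicating on the key plus the latest timestamp makes reruns idempotent.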
Conclusion
A resilient API client is more than a requests.get() wrapped in a loop. By layering in the practices we covered — from structuring your client class, to enforcing timeouts, handling pagination, adding backoff, and building idempotent incremental loads — you can turn brittle scripts into reliable building blocks for your pipelines.
These techniques help you build clients that are:
- Reliable: failures are handled gracefully, not silently ignored.
- Efficient: sessions are reused, requests are paced, and data is streamed.
- Safe: partial results are saved, sensitive data isn’t logged, and production tables aren’t polluted.
Combine them, and you’ll be able to extract data from APIs at scale — without nasty surprises, late-night alerts, or duplicated rows.