Search API
The Search API lets you programmatically search podcast transcriptions with the same search engine that powers Audioscrape. You can search for specific terms, phrases, or topics across the entire database of transcribed podcast content.
Search Transcriptions
Search podcast transcriptions for specific terms or phrases.
curl -X POST "https://www.audioscrape.com/api/v1/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "machine learning AND neural networks",
    "limit": 2
  }'
{
  "meta": {
    "query": "machine learning AND neural networks",
    "total": 1234,
    "limit": 2,
    "offset": 0,
    "has_more": true,
    "took_ms": 45
  },
  "results": [
    {
      "segment_id": "101_8669.079",
      "score": 23.22,
      "text": "The advancements in machine learning and neural networks over the past decade have been remarkable.",
      "highlighted_text": "The advancements in <mark>machine learning</mark> and <mark>neural networks</mark> over the past decade have been remarkable.",
      "timestamp": {
        "start": 125.4,
        "end": 142.8
      },
      "speaker": "Dr. Jane Smith",
      "episode": {
        "id": "789",
        "title": "The Future of AI",
        "slug": "the-future-of-ai",
        "publish_date": "2024-06-15T10:00:00Z"
      },
      "podcast": {
        "id": "42",
        "title": "Tech Talk Weekly",
        "slug": "tech-talk-weekly",
        "image_url": "https://example.com/image.png"
      },
      "urls": {
        "segment": "/podcast/tech-talk-weekly/episode/the-future-of-ai?t=125.4",
        "episode": "/podcast/tech-talk-weekly/episode/the-future-of-ai"
      }
    }
  ]
}
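The same request can be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_search_request` and `search` are illustrative and not part of any official client; the endpoint and headers are copied from the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust if your account uses another host.
API_URL = "https://www.audioscrape.com/api/v1/search"


def build_search_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the search endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(api_key: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_search_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `search("YOUR_API_KEY", {"query": "machine learning AND neural networks", "limit": 2})` should return a dictionary with the `meta` and `results` keys shown in the sample response.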
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | The search query to find in transcriptions |
| limit | integer | No | Maximum number of results to return (default: 20, max: 100) |
| offset | integer | No | Number of results to skip for pagination (default: 0) |
| filters | object | No | Filter results by podcast_ids or speakers |
| sort | object | No | Sort by "relevance" or "date", order "asc" or "desc" |
Filters Object
| Field | Type | Description |
|---|---|---|
| podcast_ids | array of strings | Only search within specific podcasts |
| speakers | array of strings | Only return segments from specific speakers |
Sort Object
| Field | Type | Description |
|---|---|---|
| field | string | "relevance" (default) or "date" |
| order | string | "desc" (default) or "asc" |
Example with Filters
curl -X POST "https://www.audioscrape.com/api/v1/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "artificial intelligence",
    "limit": 10,
    "offset": 0,
    "filters": {
      "speakers": ["Lex Fridman"]
    },
    "sort": {
      "field": "date",
      "order": "desc"
    }
  }'
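When request bodies are built in code, it can help to validate them against the documented ranges before spending an API call. The sketch below assembles a body from the parameters listed above. The function name `build_search_payload` is hypothetical, and the lower bound of 1 for `limit` is an assumption, since only the maximum of 100 is documented.

```python
from typing import Optional, Sequence

MAX_LIMIT = 100  # documented maximum for "limit"


def build_search_payload(
    query: str,
    limit: int = 20,
    offset: int = 0,
    podcast_ids: Optional[Sequence[str]] = None,
    speakers: Optional[Sequence[str]] = None,
    sort_field: str = "relevance",
    sort_order: str = "desc",
) -> dict:
    """Assemble a request body, rejecting values outside the documented ranges."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # Assumption: a limit below 1 is not meaningful; only the max is documented.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if sort_field not in ("relevance", "date"):
        raise ValueError('sort_field must be "relevance" or "date"')
    if sort_order not in ("asc", "desc"):
        raise ValueError('sort_order must be "asc" or "desc"')

    payload = {
        "query": query,
        "limit": limit,
        "offset": offset,
        "sort": {"field": sort_field, "order": sort_order},
    }
    # Only send the filters the caller actually set.
    filters = {}
    if podcast_ids:
        filters["podcast_ids"] = list(podcast_ids)
    if speakers:
        filters["speakers"] = list(speakers)
    if filters:
        payload["filters"] = filters
    return payload
```

For instance, `build_search_payload("artificial intelligence", limit=10, speakers=["Lex Fridman"], sort_field="date")` produces the same body as the curl example with filters.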
Advanced Query Syntax
Our search API supports advanced query syntax for more precise searches:
| Syntax | Description | Example |
|---|---|---|
| "phrase" | Exact phrase matching | "machine learning" - matches the exact phrase |
| AND | Both terms must be present | AI AND ethics - finds content with both terms |
| OR | Either term can be present | python OR javascript - finds content with either term |
| NOT or - | Excludes content with the term | AI NOT chatgpt or AI -chatgpt |
| term* | Wildcard search (prefix matching) | program* - matches program, programming, programmer |
Search Tips
You can combine these operators for complex searches. For example, "neural networks" AND (python OR tensorflow) NOT "image recognition" finds content about neural networks that mentions Python or TensorFlow and excludes content that mentions image recognition.
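Combined queries like this are easier to maintain when assembled from small helpers rather than concatenated by hand. The functions below are a hypothetical sketch: the names are illustrative, and no escaping is applied to quotes inside a phrase because the escaping rules are not documented here.

```python
def phrase(text: str) -> str:
    """Wrap text in double quotes for exact phrase matching."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Join terms with OR, parenthesized so the group combines safely."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND so that every term must be present."""
    return " AND ".join(terms)


def exclude(query: str, *terms: str) -> str:
    """Append one NOT clause per excluded term."""
    return " ".join([query] + [f"NOT {term}" for term in terms])


query = exclude(
    all_of(phrase("neural networks"), any_of("python", "tensorflow")),
    phrase("image recognition"),
)
print(query)
```

The printed string matches the combined example in the search tips and can be passed as the `query` parameter.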
Response Structure
The search API returns a structured response with the following fields:
Meta Object
| Field | Type | Description |
|---|---|---|
| query | string | The search query that was executed |
| total | integer | Total number of matching results |
| limit | integer | Maximum results returned in this response |
| offset | integer | Number of results skipped for pagination |
| has_more | boolean | Whether more results are available |
| took_ms | integer | Query execution time in milliseconds |
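The `has_more` and `offset` fields are enough to drive a pagination loop. The generator below is a sketch under assumptions: `fetch_page` is a hypothetical callable that takes a request body and returns the decoded response, and `max_results` is an illustrative safety cap, useful because every page counts as one API call against the monthly quota.

```python
from typing import Callable, Iterator


def iter_results(
    fetch_page: Callable[[dict], dict],
    query: str,
    page_size: int = 20,
    max_results: int = 200,
) -> Iterator[dict]:
    """Yield results page by page until has_more is false or the cap is hit."""
    offset = 0
    yielded = 0
    while yielded < max_results:
        response = fetch_page({"query": query, "limit": page_size, "offset": offset})
        results = response["results"]
        for result in results:
            yield result
            yielded += 1
            if yielded >= max_results:
                return
        # Stop when the server reports no further pages, or returns an empty
        # page (which would otherwise loop forever).
        if not response["meta"]["has_more"] or not results:
            return
        offset += len(results)
```

Because the function is a generator, pages are requested lazily: stopping iteration early avoids the remaining API calls.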
Result Object
Each result object contains detailed information about a matching transcription segment:
| Field | Type | Description |
|---|---|---|
| segment_id | string | Unique identifier for the transcript segment |
| score | float | Relevance score of the match |
| text | string | The transcript text |
| highlighted_text | string | Text with search terms highlighted in <mark> tags |
| timestamp | object | Start and end times in seconds |
| speaker | string | Name of the speaker (if identified) |
| episode | object | Episode id, title, slug, and publish_date |
| podcast | object | Podcast id, title, slug, and image_url |
| urls | object | Direct links to segment and episode pages |
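Client code often flattens each result into a small typed record. The sketch below does that for the fields listed above. The `Segment` class and helper names are hypothetical, and resolving the relative `urls` paths against `https://www.audioscrape.com` is an assumption based on the sample response, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin

# Assumption: relative paths in "urls" resolve against the main site.
BASE_URL = "https://www.audioscrape.com"


@dataclass
class Segment:
    segment_id: str
    score: float
    text: str
    start: float
    end: float
    speaker: Optional[str]
    episode_title: str
    podcast_title: str
    segment_url: str


def parse_segment(result: dict) -> Segment:
    """Flatten one entry of the "results" array into a Segment."""
    return Segment(
        segment_id=result["segment_id"],
        score=result["score"],
        text=result["text"],
        start=result["timestamp"]["start"],
        end=result["timestamp"]["end"],
        speaker=result.get("speaker"),  # None when no speaker was identified
        episode_title=result["episode"]["title"],
        podcast_title=result["podcast"]["title"],
        segment_url=urljoin(BASE_URL, result["urls"]["segment"]),
    )


def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as M:SS for display."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"
```

For the sample result, `format_timestamp(125.4)` gives `2:05`, and the segment URL becomes an absolute link that jumps to that position in the episode.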
Usage Limits
The Search API has usage limits based on your subscription plan. Each search query counts as one API call against your monthly quota.
| Plan | API Calls per Month |
|---|---|
| Basic | 50 |
| Pro | 1,000 |
| Enterprise | Unlimited |
For detailed information about our plans and pricing, please visit our pricing page.
Best Practices
Use specific search terms for better results. General terms may return too many matches.
Use quoted phrases for exact matching when searching for specific expressions or names.
Implement caching for frequent searches to reduce API calls and improve performance.
Use pagination (limit and offset parameters) to navigate through large result sets.
Use the NOT operator (-) to exclude irrelevant content and narrow down results.
Consider combining the Search API with the Notifications API for ongoing monitoring of specific topics.
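The caching recommendation can be sketched as a small in-memory wrapper with a time-to-live. This is one possible approach rather than a prescribed one: `CachedSearch`, its five-minute default, and the injectable `clock` are all illustrative choices, and `fetch` stands for any callable that sends a request body and returns the decoded response.

```python
import json
import time
from typing import Callable, Dict, Tuple


class CachedSearch:
    """Wrap a fetch function with a small in-memory, time-limited cache."""

    def __init__(
        self,
        fetch: Callable[[dict], dict],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def __call__(self, payload: dict) -> dict:
        # Serialize with sorted keys so equivalent payloads share one entry.
        key = json.dumps(payload, sort_keys=True)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: no API call is made
        response = self._fetch(payload)
        self._entries[key] = (now, response)
        return response
```

Repeated identical searches within the time-to-live are then served from memory and do not count against the monthly quota. The cache is unbounded, so a long-running process would want an eviction policy on top.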
Error Handling
The API returns standard HTTP status codes and JSON error responses:
| Status Code | Description |
|---|---|
| 400 | Bad Request - Invalid parameters or empty query |
| 401 | Unauthorized - Missing or invalid API key |
| 429 | Too Many Requests - API call limit exceeded |
| 500 | Internal Server Error |
{
  "code": "UNAUTHORIZED",
  "message": "You're not authorized!"
}
{
  "code": "API_CALL_LIMIT_EXCEEDED",
  "message": "API call limit exceeded. You have used 50/50 calls this month. Please upgrade your plan.",
  "usage": {
    "current": 50,
    "limit": 50
  }
}
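On the client side, these error bodies can be mapped to exceptions so that callers can tell quota exhaustion apart from other failures. The class and function names below are hypothetical, and the sketch assumes error bodies carry the `code`, `message`, and optional `usage` fields shown in the samples above; anything else falls back to a generic error.

```python
import json


class SearchAPIError(Exception):
    """A non-success response from the search endpoint."""

    def __init__(self, status: int, code: str, message: str, usage=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.usage = usage


class QuotaExceededError(SearchAPIError):
    """Raised for 429 responses, when the call limit is exhausted."""


def error_from_response(status: int, body: bytes) -> SearchAPIError:
    """Turn an HTTP status and raw body into the matching exception object."""
    try:
        data = json.loads(body)
    except ValueError:
        data = {}  # body was not JSON, e.g. an HTML error page from a proxy
    if not isinstance(data, dict):
        data = {}
    cls = QuotaExceededError if status == 429 else SearchAPIError
    return cls(
        status,
        data.get("code", "UNKNOWN"),
        data.get("message", ""),
        data.get("usage"),
    )
```

With `urllib`, a failed request raises `urllib.error.HTTPError`, which exposes `.code` and `.read()`, so a handler can `raise error_from_response(exc.code, exc.read())` and let callers catch `QuotaExceededError` to stop retrying until the quota resets.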