
Dataset

Your source of truth, configured.

A dataset is a schema, a set of source URLs, and a version history. Everything flows from those three things.

Visibility

Set when you create the dataset. Changeable at any time from dataset settings.

Public

Anyone can hit your endpoint without an account or API key. Your dataset appears in the marketplace and can be cloned by other users.

Private

Requests must send your private key as a Bearer token. Your dataset does not appear in the marketplace and cannot be cloned. Find the key in the dataset settings.
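As a rough sketch of the difference between the two modes, the snippet below builds a request for a public dataset and for a private one. The URL pattern (/api/<dataset ID>/<slug>/active/) is inferred from the endpoint examples elsewhere on this page, and the function name and placeholder key are hypothetical, so treat this as an illustration rather than a reference client.

```python
import urllib.request
from typing import Optional

# Host taken from the endpoint examples on this page.
BASE_URL = "https://quorel.vercel.app/api"


def build_dataset_request(
    dataset_id: int, slug: str, private_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request for the active version of a dataset.

    The /<id>/<slug>/active/ path is an assumption based on this page's examples.
    """
    url = f"{BASE_URL}/{dataset_id}/{slug}/active/"
    request = urllib.request.Request(url, method="GET")
    if private_key is not None:
        # Private datasets: send the key as a Bearer token.
        request.add_header("Authorization", f"Bearer {private_key}")
    return request


# Public dataset: no credentials needed.
public_request = build_dataset_request(42, "my-dataset")

# Private dataset: "YOUR_PRIVATE_KEY" is a placeholder for the key from dataset settings.
private_request = build_dataset_request(42, "my-dataset", private_key="YOUR_PRIVATE_KEY")
```

Pass either request to urllib.request.urlopen to send it.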

Schema

Your schema is a plain-English description of the fields you want extracted from each source. Write one field per line in the format field_name (type) — description.

Example

title (string) — the title of the post
score (number) — the upvote count
comments (number) — the number of comments
url (string) — the link to the original post
author (string) — the username of the submitter
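If you keep schemas in version control, a small local check can catch malformed lines before you paste them in. The parser below is a minimal sketch of the field_name (type) — description format; it is not how Quorel reads schemas, and the names parse_schema and Field are made up for this example.

```python
import re
from typing import List, NamedTuple, Optional

SUPPORTED_TYPES = {"string", "number", "boolean", "array", "object"}

# field name, then an optional "(type)", then an optional "— description".
FIELD_PATTERN = re.compile(
    r"^\s*(?P<name>[A-Za-z_][\w.]*)\s*"
    r"(?:\((?P<type>\w+)\))?\s*"
    r"(?:—\s*(?P<description>.+))?$"
)


class Field(NamedTuple):
    name: str
    type: Optional[str]  # None when the type is omitted
    description: str


def parse_schema(text: str) -> List[Field]:
    """Parse one field per line, skipping blank lines."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        match = FIELD_PATTERN.match(line)
        if match is None:
            raise ValueError(f"Unrecognized schema line: {line!r}")
        field_type = match.group("type")
        if field_type is not None and field_type not in SUPPORTED_TYPES:
            raise ValueError(f"Unsupported type {field_type!r} in line: {line!r}")
        description = (match.group("description") or "").strip()
        fields.append(Field(match.group("name"), field_type, description))
    return fields


EXAMPLE_SCHEMA = """\
title (string) — the title of the post
score (number) — the upvote count
comments (number) — the number of comments
url (string) — the link to the original post
author (string) — the username of the submitter
"""

for field in parse_schema(EXAMPLE_SCHEMA):
    print(field.name, field.type, field.description, sep=" | ")
```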

Be descriptive, not terse

"score (number) — the Hacker News upvote count" works better than "score (number)". The more context you give, the more accurately the AI maps the source.

Specify the type

Supported types: string, number, boolean, array, object. If you omit the type, the extractor will infer it, but declaring it explicitly is more reliable.

Nest when it makes sense

You can describe nested objects: author.name (string), author.handle (string). The extractor will build the nested structure for you.
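To picture what the nested result might look like, the sketch below folds dotted field names into nested objects. It only illustrates the expected shape; the nest helper and the sample values are hypothetical and say nothing about how the extractor works internally.

```python
from typing import Any, Dict


def nest(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {"author.name": ...} style keys into nested dictionaries."""
    nested: Dict[str, Any] = {}
    for dotted_name, value in flat.items():
        *parents, leaf = dotted_name.split(".")
        node = nested
        for part in parents:
            # Create the intermediate object on first use.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Sample values are invented for illustration.
entity = nest(
    {
        "title": "Example post",
        "author.name": "Ada Lovelace",
        "author.handle": "ada",
    }
)
print(entity)
```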

Leave _source alone

Quorel adds a _source field automatically with the origin URL of each entity. You do not need to declare it in your schema.

Source URLs

Paste one URL per line. Quorel crawls each page and extracts entities against your schema. All URLs must be publicly accessible.

One kind of entity per URL

Each URL should be a page that contains one or more instances of the thing you want. A list page with 30 job postings is fine. A single job detail page is also fine.

Avoid login-walled pages

Quorel crawls public pages only. If a URL requires authentication, the crawl will fail silently for that URL.

Pagination

Paste each paginated URL separately. If the source uses query-param pagination (?page=1, ?page=2), paste each page you want covered.
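For query-param pagination, you can generate the list to paste instead of typing it by hand. This is a small sketch; the example.com URL and the default parameter name "page" are assumptions, so adjust both to match your source.

```python
from typing import List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def paginated_urls(
    base_url: str, first_page: int, last_page: int, param: str = "page"
) -> List[str]:
    """Return one URL per page, preserving any other query parameters."""
    parts = urlsplit(base_url)
    # Drop an existing page parameter so it is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    urls = []
    for page in range(first_page, last_page + 1):
        page_query = urlencode(query + [(param, str(page))])
        urls.append(urlunsplit(parts._replace(query=page_query)))
    return urls


# Prints one URL per line, ready to paste.
print("\n".join(paginated_urls("https://example.com/jobs?tag=react", 1, 3)))
```

Each generated URL counts toward your plan's URL limit.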

Plan limits

Free: 20 URLs. Pro: 100 URLs. Scale: 500 URLs. URLs discovered via SERP intent count separately (Free: 10, Pro: 40, Scale: 100).

SERP intent discovery

Don't know the exact URLs? Describe what you want in plain English and Quorel will discover the sources for you via web search. SERP-discovered URLs are additive — they stack on top of any URLs you pasted directly.

Example intents

top remote React jobs posted this week
Y Combinator W25 batch companies
AI tools launched on Product Hunt in 2025
Keep the intent narrow. "top remote React jobs" works. "all jobs" does not.
Quorel runs the query via Serper and picks the most relevant public URLs from the results.
Re-running discovery on a refresh may return different URLs if the search results have changed.

Refresh

Nightly (automatic)

All datasets on all plans refresh automatically every night. No configuration needed.

On-demand via ping URL

Pro and above. Every dataset gets a dedicated ping URL. Hit it with a GET request to trigger an extra refresh immediately. Wire it into any scheduler or CI pipeline.
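A scheduler job that triggers a refresh could look roughly like this. The ping URL format is not shown on this page, so the function takes it as a parameter; copy the real URL from your dataset. The function names are hypothetical.

```python
import urllib.request


def build_ping_request(ping_url: str) -> urllib.request.Request:
    """Build the GET request that triggers an on-demand refresh."""
    return urllib.request.Request(ping_url, method="GET")


def trigger_refresh(ping_url: str, timeout: float = 30.0) -> int:
    """Send the ping and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(build_ping_request(ping_url), timeout=timeout) as response:
        return response.status
```

Call trigger_refresh with your dataset's ping URL from a cron job or a CI step.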

Webhook on completion

Pro and above. Register an endpoint and Quorel will POST to it the moment a refresh finishes. See the Webhooks doc for the payload shape.
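On the receiving side, a bare-bones endpoint might look like the sketch below. It assumes the POST body is JSON and treats it as opaque, because the payload shape is documented in the Webhooks doc and not here. The handler name, port, and response codes are choices made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any


def parse_webhook_body(body: bytes) -> Any:
    """Decode a JSON request body; raises ValueError if it is not valid JSON."""
    return json.loads(body.decode("utf-8"))


class RefreshWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            # Reject bodies that are not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        # Replace this with your own logic, e.g. re-fetching the dataset.
        print("Refresh finished:", payload)
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), RefreshWebhookHandler).serve_forever()
```

Call serve() to start listening, then register the public URL of that machine as your endpoint.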

CLONE & EXTEND

Fork any public dataset.

Any public dataset in the marketplace can be cloned. Think of it like forking a GitHub repo — you get a full independent copy you own entirely.

01 Find a public dataset

Browse the marketplace and open any public dataset. Every public dataset has a Clone button in the top right of its page.

02 Fork it to your account

Cloning copies the current active version, the schema, and the source URLs into a new dataset owned by you. The original is untouched.

03 Extend it

Add your own URLs, edit the schema to add or remove fields, or change the visibility. Your clone is fully independent from the original.

04 Hit your own endpoint

Your extended dataset gets its own dataset ID and slug. It refreshes on its own schedule. The original dataset's changes never affect yours.

Alt version

Every dataset version can have an alt — a cleaned or modified copy of the same entities that lives alongside the original without replacing it. Alt versions are produced manually via the dashboard, or automatically by an MCP agent using push_alt_version.

GET https://quorel.vercel.app/api/42/my-dataset/active/?alt=true
# or
GET https://quorel.vercel.app/api/42/my-dataset/v3/alt/
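The two URL forms above can be built with a small helper. This sketch assumes that 42 is the dataset ID, my-dataset is the slug, and v3 means version 3; the alt_url name is hypothetical.

```python
from typing import Optional

# Host taken from the endpoint examples on this page.
BASE_URL = "https://quorel.vercel.app/api"


def alt_url(dataset_id: int, slug: str, version: Optional[int] = None) -> str:
    """Return the alt endpoint for the active version, or for a specific version."""
    if version is None:
        # Active version: alt is selected with a query parameter.
        return f"{BASE_URL}/{dataset_id}/{slug}/active/?alt=true"
    # Pinned version: alt is selected with a path segment.
    return f"{BASE_URL}/{dataset_id}/{slug}/v{version}/alt/"


print(alt_url(42, "my-dataset"))
print(alt_url(42, "my-dataset", version=3))
```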

Frozen datasets

A frozen dataset will never refresh again.
Its API endpoint stays live and returns the last version indefinitely.
Frozen datasets cannot be unfrozen. Clone it first if you want an active copy.
Datasets that violate the Terms of Service may be frozen by Quorel.

Plan limits

Plan    Datasets    URLs    SERP URLs
Free    1           20      10
Pro     5           100     40
Scale   Unlimited   500     100

See the full pricing page for a complete feature comparison.
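If you manage URL lists in a script, you can check them against these limits before pasting. The numbers below are copied from the table on this page; the structure and the check_limits function are a hypothetical sketch. This page does not say whether URL limits apply per dataset or per account, so the function simply compares whatever counts you pass in.

```python
from typing import Dict, List, Optional

# None means unlimited.
PLAN_LIMITS: Dict[str, Dict[str, Optional[int]]] = {
    "free": {"datasets": 1, "urls": 20, "serp_urls": 10},
    "pro": {"datasets": 5, "urls": 100, "serp_urls": 40},
    "scale": {"datasets": None, "urls": 500, "serp_urls": 100},
}


def check_limits(plan: str, dataset_count: int, pasted_urls: int, serp_urls: int) -> List[str]:
    """Return a list of human-readable problems; empty if everything fits."""
    limits = PLAN_LIMITS[plan.lower()]
    usage = {"datasets": dataset_count, "urls": pasted_urls, "serp_urls": serp_urls}
    problems = []
    for name, used in usage.items():
        limit = limits[name]
        if limit is not None and used > limit:
            problems.append(f"{name}: {used} exceeds the {plan} limit of {limit}")
    return problems


print(check_limits("free", dataset_count=1, pasted_urls=25, serp_urls=10))
```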

Next steps