How processing works

Processing is the work that turns a raw file into something Curator can search by meaning and use in Chat. You may also see it called ingestion or the pipeline. Every file goes through the same four steps.

What happens to a file

Extract text and metadata. Curator pulls the readable text out of the file, along with details like its type and size.
Write an AI description. A model reads the file and writes a short description of what it contains.
Split it into chunks. Long files are broken into smaller pieces so they can be searched and quoted precisely.
Create embeddings. Each chunk is turned into an embedding, a numeric representation of its meaning.
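
To make the chunking step concrete, here is a minimal sketch of splitting text into word-based chunks. It is an illustration only, not Curator's implementation: the function name, the default sizes, and the idea of overlapping neighboring chunks are all assumptions made for the example.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of at most chunk_size words.

    Hypothetical helper: chunk_size and overlap are illustrative values,
    not Curator defaults. Neighboring chunks share `overlap` words so a
    sentence cut at a boundary still appears whole in one of them.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap must be smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Smaller chunks make quotes more precise but give each chunk less surrounding context, which is the trade-off a chunk size setting usually controls.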

Why it matters

Embeddings let Curator match ideas instead of exact words, which is what powers meaning-based search and grounded answers in Chat. Once a file is processed, you can find it by what it is about and ask questions about its contents.

An unprocessed file is still visible in the file browser and searchable by name. Processing adds meaning-based search on top.

When processing runs

Processing runs in the background, so you can keep working while it catches up. A few toggles in the Processing section of Settings control when it starts.

Process on upload. Processes files as soon as you add them.
Process on scan. Processes files that a sync finds in a source.
Auto-categorize on upload. Assigns a category to new files automatically.

You can watch the pipeline run on the Processing screen, where each file moves through its stages live.

[!NOTE] Embeddings are numeric representations of meaning. Curator compares them to find related files even when they share no words.
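
As a rough illustration of how embeddings can be compared, the sketch below ranks stored chunks against a query using cosine similarity. Cosine similarity is a common choice for this kind of comparison, but it is an assumption here; this page does not say which measure Curator uses. The function names and the tiny vectors in the usage note are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: list[float], chunks: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Score every chunk embedding against the query, most similar first."""
    scored = [(name, cosine_similarity(query, vector)) for name, vector in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, with the made-up vectors `{"invoice": [0.9, 0.1], "recipe": [0.1, 0.9]}` and the query `[1.0, 0.0]`, `rank_by_similarity` lists `"invoice"` first, because its vector points in nearly the same direction as the query. The text of the files never enters the comparison, only the numbers.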

Learn more

Processing and jobs: Watch the pipeline run and reprocess files.

Processing settings: Automation toggles, concurrency, and chunk sizes.