# Supported file types

Curator extracts text and metadata from common document, text, and image files. Scanned PDFs and images are read with OCR for their text and with a vision model for their content. This extracted content is what makes a file searchable and lets chat cite it.

Even when Curator cannot read a file's content, you can still browse the file by filename in **All Files**. The file will not appear in content or semantic search until its text can be extracted.

## Supported types

> [!NOTE]
> PLACEHOLDER: the exact list of supported extensions is not confirmed here. Fill in the confirmed types in the table below and remove this note. Do not publish a precise extension list until it is verified in the app.

| Category | Types | Notes |
| --- | --- | --- |
| Documents | PLACEHOLDER | PLACEHOLDER |
| Text | PLACEHOLDER | PLACEHOLDER |
| PDFs | PLACEHOLDER | Text PDFs read directly. Scanned PDFs go through OCR. |
| Images | PLACEHOLDER | Read with OCR for text and a vision model for content. |

To learn how extracted content becomes searchable, see [How processing works](/concepts/processing/).