File handling

SecondLoop keeps original attachments, then optionally runs enrichment pipelines based on type and settings.

Processing by file type

File type	What SecondLoop does
`image/*`	Stores original image, reads EXIF metadata when available, and can run image caption/OCR enrichment if enabled.
`audio/*`	Stores attachment, optionally creates a normalized proxy format, and can run transcription (local / BYOK / Cloud).
`video/*`	Can convert to a segmented video manifest (`application/x.secondloop.video+json`) with preview frames and optional extracted audio, then runs transcription and OCR-style enrichment.
`application/pdf`	Extracts text layer first; if text is missing, marks it as OCR-needed and can run OCR.
`application/vnd.openxmlformats-officedocument.wordprocessingml.document`	Extracts document text; OCR can use embedded media (for scan-like content).
`text/*`	Decodes UTF-8 text and normalizes it for search/readability.
Structured text-like (`json/xml/yaml/toml/ini/csv`)	Treated as text extraction targets for searchable content.
URL share manifest (`application/x.secondloop.url+json`)	Fetches page metadata/readable text with security checks, then stores extracted result.
Other binary files	Stored as attachments and available for system open/download, without guaranteed text enrichment.
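
As a rough illustration of the table above, the routing can be pictured as a lookup from MIME type to a processing category. This is a minimal sketch, not SecondLoop's actual code: the function `classify`, the category names, and the mapping of the structured formats to specific MIME subtypes are all assumptions made for the example.

```python
# Hypothetical sketch: map a MIME type to one of the processing categories
# described in the table. Names and subtype mappings are assumptions.

DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
VIDEO_MANIFEST = "application/x.secondloop.video+json"
URL_MANIFEST = "application/x.secondloop.url+json"

# Assumed subtypes for the structured text-like formats (ini has no standard type).
STRUCTURED_SUBTYPES = {"json", "xml", "yaml", "x-yaml", "toml", "csv"}


def classify(mime_type: str) -> str:
    """Return a category name for a MIME type (parameters such as charset are ignored)."""
    mime = mime_type.split(";", 1)[0].strip().lower()

    # Check the SecondLoop manifests first, since they also end in "+json".
    if mime == URL_MANIFEST:
        return "url_manifest"
    if mime == VIDEO_MANIFEST or mime.startswith("video/"):
        return "video"
    if mime.startswith("image/"):
        return "image"
    if mime.startswith("audio/"):
        return "audio"
    if mime == "application/pdf":
        return "pdf"
    if mime == DOCX:
        return "docx"

    subtype = mime.split("/", 1)[1] if "/" in mime else ""
    if subtype in STRUCTURED_SUBTYPES:
        return "structured_text"
    if mime.startswith("text/"):
        return "text"

    # Everything else is stored without guaranteed text enrichment.
    return "binary"
```

For example, `classify("text/plain; charset=utf-8")` falls into the `text` category, while an unrecognized type such as `application/zip` falls through to `binary`.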

PDF and OCR behavior

PDF text extraction runs first.
If the extracted text is empty, the attachment enters the needs-OCR state.
OCR can be run automatically or manually (depending on settings and limits).
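
The sequence above can be sketched as a small state decision. This is an illustrative sketch only: the names `PdfState`, `state_after_extraction`, and `should_auto_ocr` are hypothetical, treating whitespace-only text as empty is an assumption, and the page-count limit stands in for whatever limits actually apply.

```python
from enum import Enum


class PdfState(Enum):
    TEXT_READY = "text_ready"
    NEEDS_OCR = "needs_ocr"


def state_after_extraction(extracted_text: str) -> PdfState:
    """Text extraction runs first; an empty result marks the PDF as needing OCR."""
    # Assumption: whitespace-only output counts as empty.
    if extracted_text.strip():
        return PdfState.TEXT_READY
    return PdfState.NEEDS_OCR


def should_auto_ocr(
    state: PdfState,
    auto_ocr_enabled: bool,
    page_count: int,
    max_auto_pages: int,
) -> bool:
    """Decide whether OCR starts automatically or waits for a manual trigger."""
    # Assumption: the limit is expressed as a maximum page count.
    return (
        state is PdfState.NEEDS_OCR
        and auto_ocr_enabled
        and page_count <= max_auto_pages
    )
```

Under these assumptions, a scanned PDF that exceeds the limit stays in the needs-OCR state until OCR is run manually.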

Video behavior

Video processing may create:
- a segmented playback proxy
- a poster and keyframes
- a linked audio payload for transcription
The viewer can combine transcript and OCR/keyframe text into one readable payload.
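
One way to picture the combined payload is as a timestamp-ordered merge of transcript segments and keyframe OCR text. The sketch below assumes both sources carry a start time in seconds; the `TimedText` type, the `merge_readable` function, and the output line format are hypothetical and not taken from the SecondLoop manifest format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedText:
    start_seconds: float
    source: str  # "transcript" or "ocr"
    text: str


def merge_readable(transcript: list[TimedText], ocr: list[TimedText]) -> str:
    """Interleave transcript and OCR text by start time into one readable string."""
    # sorted() is stable, so transcript entries precede OCR entries on ties.
    items = sorted([*transcript, *ocr], key=lambda item: item.start_seconds)
    lines = []
    for item in items:
        text = item.text.strip()
        if not text:
            continue  # skip empty segments
        minutes, seconds = divmod(int(item.start_seconds), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] ({item.source}) {text}")
    return "\n".join(lines)
```

Keeping the source label on each line lets a reader tell spoken content apart from on-screen text.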

Practical notes

Processing depends on which settings are enabled (for example, the OCR and transcription toggles).
If a file is stored but not fully enriched, the original remains available to keep and sync.
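
The effect of the toggles can be sketched as a function from a file category and the current settings to a list of planned steps. This is a simplified illustration: `EnrichmentSettings`, `planned_steps`, the category names, and the step names are hypothetical, and real behavior may depend on further conditions (for example, a PDF only needs OCR when its text layer is missing).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichmentSettings:
    ocr_enabled: bool = False
    transcription_enabled: bool = False


def planned_steps(category: str, settings: EnrichmentSettings) -> list[str]:
    """Return the steps that would run for a file category under the given settings."""
    # The original attachment is kept regardless of enrichment settings.
    steps = ["store_original"]
    if category in {"image", "pdf", "video"} and settings.ocr_enabled:
        steps.append("ocr")
    if category in {"audio", "video"} and settings.transcription_enabled:
        steps.append("transcription")
    return steps
```

With every toggle off, each category yields only `store_original`, which matches the note that the original file is still kept and synced.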
