In addition to structured and unstructured data that can be added from the UI or through data connectors, Curiosity supports, out of the box, most of the file types used in the day-to-day work of a team or enterprise:
- Documents (*.pdf, *.doc, *.docx, *.ps, *.txt, ...)
- Slides (*.ppt, *.pptx, ...)
- Excel Worksheets (*.xls, *.xlsx, ...)
- Emails (*.msg, *.eml, ...)
- Images (*.gif, *.png, *.jpg, *.tiff, *.bmp, *.psd, ...)
- Diagrams (*.vsd)
- Drawings (*.dwg, *.dxf)
- Webpages (*.html)
If you have any special needs, we're ready to support you in adding more data types. Just reach out!
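As a rough illustration of the list above, the following minimal Python sketch groups local file names by the categories named there, which can help when checking a folder before adding it to the system. It is not part of Curiosity: the names `CATEGORY_EXTENSIONS`, `categorize`, and `group_by_category` are hypothetical, and the mapping only covers the extensions spelled out above (the "..." entries mean the real list is longer).

```python
from collections import defaultdict
from pathlib import PurePath

# Extensions copied from the list above. The "..." entries in that list mean
# this mapping is NOT exhaustive; it is an illustrative assumption only.
CATEGORY_EXTENSIONS = {
    "Documents": {".pdf", ".doc", ".docx", ".ps", ".txt"},
    "Slides": {".ppt", ".pptx"},
    "Excel Worksheets": {".xls", ".xlsx"},
    "Emails": {".msg", ".eml"},
    "Images": {".gif", ".png", ".jpg", ".tiff", ".bmp", ".psd"},
    "Diagrams": {".vsd"},
    "Drawings": {".dwg", ".dxf"},
    "Webpages": {".html"},
}

# Reverse lookup: extension -> category name.
EXTENSION_TO_CATEGORY = {
    ext: category
    for category, extensions in CATEGORY_EXTENSIONS.items()
    for ext in extensions
}


def categorize(filename):
    """Return the category for a file name, or None if its extension is not listed."""
    ext = PurePath(filename).suffix.lower()  # case-insensitive match on the extension
    return EXTENSION_TO_CATEGORY.get(ext)


def group_by_category(filenames):
    """Group file names by category; unlisted extensions go under 'Unlisted'."""
    groups = defaultdict(list)
    for name in filenames:
        groups[categorize(name) or "Unlisted"].append(name)
    return dict(groups)
```

Files that end up under "Unlisted" are not necessarily unsupported; they are simply not among the extensions named explicitly in the list.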
Unlike most enterprise search systems, which show only a tiny snippet, we believe that full access to your documents is key to a good search experience. For this reason, the system natively supports previews of all supported file types.
These previews are generated automatically while the system indexes your files for search. Every supported file type is rendered as a PDF preview so that it can be viewed directly in the browser, for example a PowerPoint presentation or an Excel sheet.
The original file is still available and can be downloaded with the Download button at the bottom right of the card / preview interface.
Search on images using OCR
If your Curiosity system has been configured to perform OCR (using one of the supported models: AWS Textract, Azure Cloud Vision, or On-Premise Azure Cloud Vision), you'll also be able to search the content of images and scanned PDFs once they've been processed. Curiosity automatically processes your files and enriches the PDF previews with the extracted text, so that it is searchable within your system just like any other document.