To start using Unstructured right away, skip ahead to the UI quickstart or API quickstart now!
What is Unstructured?
Unstructured provides a platform and tools to ingest and process unstructured documents for retrieval-augmented generation (RAG) and agentic AI. This 60-second video (no sound) describes what Unstructured does and its benefits:
This 40-second video demonstrates a simple use case that Unstructured helps solve (no sound):
This 60-second video shows why using Unstructured is preferable to building your own similar solution:
You can use Unstructured through a user interface (UI), an API, or both. Read on to learn more.
Unstructured UI quickstart
This quickstart shows how, in just a few minutes, you can use the Unstructured user interface (UI) to quickly and easily see Unstructured’s best-in-class transformation results for a single file that is stored on your local computer. This quickstart focuses on a single, local file for ease-of-use demonstration purposes. To use Unstructured later to do large-scale batch processing of multiple files and semi-structured data that are stored in remote locations, skip over to the remote quickstart after you finish this one.
- After you are signed in, the Start page appears.
- In the Welcome area, do one of the following:
  - Click one of the sample files, such as realestate.pdf, to have Unstructured parse and transform that sample file.
  - Click Browse files, and then browse to and select one of your own files, to have Unstructured parse and transform it.
  If you choose to use your own file, the file must be 10 MB or less in size. Also, the file must be one of the following supported file types:
  File extensions: .bmp, .csv, .doc, .docx, .eml, .epub, .heic, .html, .jpeg, .jpg, .md, .msg, .odt, .org, .p7s, .pdf, .png, .ppt, .pptx, .rst, .rtf, .tif, .tiff, .tsv, .txt, .xls, .xlsx, .xml

- After Unstructured has finished parsing and transforming the file (a process known as partitioning), you will see the file’s contents in the Preview pane in the center and Unstructured’s results in the Result pane on the right.

- The Result pane shows a formatted view of Unstructured’s results by default. This formatted view is designed for human readability. To see the underlying JSON view of the results, which is designed for RAG and agentic AI, click JSON at the top of the Result pane.
  Learn about what’s in the JSON view.

- Unstructured’s initial results are based on its High Res partitioning strategy, which processes the file’s contents and converts them into a series of Unstructured document elements and metadata. This partitioning strategy provides good results overall, depending on the complexity of the file’s contents. This partitioning strategy also generates a bounding box for each detected object in the file. A bounding box is an imaginary rectangular box drawn around the object to show its location and extent within the file.
  After the High Res partitioning results are shown, Unstructured begins improving these initial results by using vision language models (VLMs) to apply a series of generative refinements known as enrichments. These enrichments include:
  - An image description enrichment, which uses a VLM to provide a text-based summary of the contents of each detected image.
  - A generative OCR enrichment, which uses a VLM to improve the accuracy of each block of initially-processed text.
  - A table to HTML enrichment, which uses a VLM to provide an HTML-structured representation of each detected table.
  To see these enrichments applied to the initial results, click Update results in the banner as soon as this button appears, which might take a minute or more.
  Each page that Unstructured processes by using this approach is counted as two pages for usage and billing purposes. This is because Unstructured processes each page once with its High Res partitioning strategy and then reprocesses each page with a VLM to improve the quality, accuracy, and relevance of the initial partitioning results. This two-pass process happens regardless of whether you click Update results in the banner. This two-page usage and billing behavior is a known issue and will be addressed in a future release.
- To synchronize the scrolling of the Preview pane’s selected contents with the Result pane’s Formatted results, rest your mouse pointer anywhere inside the contents of the Preview pane until a bounding box appears. Then click the bounding box. Unstructured automatically scrolls the Result pane’s Formatted results to match the selected bounding box. (You cannot synchronize the scrolling of the JSON results.)
  To show all of the bounding boxes in the Preview pane at once, turn on the Show all bounding boxes toggle at the top of the Preview pane. You can then click any bounding box without first resting your mouse pointer on it.

- To download the JSON view of the results as a local JSON file, click the download icon to the left of the Formatted and JSON buttons in the Result pane. (You cannot download the formatted view of the results.)

- To have Unstructured partition a different file, click Add new file in the Files pane on the left, and then browse to and select the target file.
- To view the results for a file that was previously partitioned during this session, click the file’s name in the Recent files list in the Files pane.
- To return to the Start page, click the X (close) button at the left on the title bar, next to Transform.
- To have Unstructured do more (such as chunking, embedding, applying additional kinds of enrichments, and processing larger files and semi-structured data in batches at scale), click Edit in Workflow Editor at the right on the title bar, and then skip over to the walkthrough.
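
If you plan to upload your own file in the Welcome area, you might want to check it against the stated limits first. The following is a minimal sketch of such a local check, not part of Unstructured itself. The function name `check_file` is hypothetical, and the sketch assumes that "10 MB" means 10 * 1024 * 1024 bytes, which this quickstart does not specify.

```python
from pathlib import Path

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 10 * 1024 * 1024

# File extensions listed as supported in this quickstart.
SUPPORTED_EXTENSIONS = {
    ".bmp", ".csv", ".doc", ".docx", ".eml", ".epub", ".heic", ".html",
    ".jpeg", ".jpg", ".md", ".msg", ".odt", ".org", ".p7s", ".pdf",
    ".png", ".ppt", ".pptx", ".rst", ".rtf", ".tif", ".tiff", ".tsv",
    ".txt", ".xls", ".xlsx", ".xml",
}


def check_file(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    p = Path(path)
    problems = []
    # Compare the extension case-insensitively, so "REPORT.PDF" is accepted.
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {p.suffix or '(none)'}")
    if p.is_file():
        if p.stat().st_size > MAX_SIZE_BYTES:
            problems.append("file is larger than 10 MB")
    else:
        problems.append("file not found")
    return problems
```

For example, `check_file("realestate.pdf")` returns an empty list when that file exists locally and is within the size limit.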

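After you download the JSON view of the results, you can inspect it with a few lines of Python. The following is a rough sketch only. It assumes that the downloaded file is a JSON array of element objects and that each object has a `type` field alongside its text and metadata. This quickstart does not spell out that layout, so verify it against your own download. The function name `summarize_elements` and the sample data are hypothetical.

```python
import json
from collections import Counter


def summarize_elements(json_text: str) -> Counter:
    """Count document elements by their "type" field (an assumed field name)."""
    elements = json.loads(json_text)
    return Counter(element.get("type", "unknown") for element in elements)


# Hypothetical two-element sample, shaped like the assumed JSON layout.
sample = json.dumps([
    {"type": "Title", "text": "Listing overview", "metadata": {"page_number": 1}},
    {"type": "NarrativeText", "text": "Three-bedroom house.", "metadata": {"page_number": 1}},
])

print(summarize_elements(sample))
```

To run this against a real download, read the file's contents (for example with `Path("results.json").read_text()`, where the file name is a placeholder) and pass the text to `summarize_elements`.
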
Unstructured API quickstart
This quickstart shows how you can use the Unstructured API to quickly and easily see Unstructured’s transformation results for a single file that is stored locally. This quickstart uses the Unstructured API’s Partition Endpoint and focuses on a single, local file for ease-of-use demonstration purposes. This quickstart also focuses only on a limited set of Unstructured’s full capabilities. To unlock Unstructured’s full feature set, as well as use Unstructured to do large-scale batch processing of multiple files and semi-structured data that are stored in remote locations, skip over to an expanded, advanced version of this quickstart that uses the Unstructured API’s Workflow Endpoint instead.
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your Unstructured Starter account, at https://platform.unstructured.io.
- Watch the following 3-minute video:
Run this quickstart as a notebook on Google Colab instead.
Get the sample code for this video.
Get the full setup instructions for this video.
Learn more.
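
As a rough illustration of what a call to the Partition Endpoint might look like, the following sketch posts a single local file over HTTP. It is not the official sample code for the video above. Everything specific in it is an assumption to check against the API reference: the `UNSTRUCTURED_API_URL` and `UNSTRUCTURED_API_KEY` environment variable names, the `unstructured-api-key` header, the `files` form field, and the `hi_res` strategy value. The helper names are hypothetical, and sending the request needs the third-party `requests` package.

```python
import os


def build_partition_request(file_name: str, strategy: str = "hi_res") -> dict:
    """Assemble the parts of a Partition Endpoint request without sending it."""
    return {
        # Assumption: the endpoint URL and API key are supplied through these
        # environment variables instead of being hard-coded.
        "url": os.environ.get("UNSTRUCTURED_API_URL", ""),
        "headers": {
            "accept": "application/json",
            "unstructured-api-key": os.environ.get("UNSTRUCTURED_API_KEY", ""),
        },
        "data": {"strategy": strategy},  # assumed form parameter name and value
        "file_field": "files",           # assumed multipart field name
        "file_name": file_name,
    }


def partition_file(path: str) -> list:
    """Send one local file to the endpoint and return the parsed JSON response."""
    import requests  # third-party dependency: pip install requests

    req = build_partition_request(os.path.basename(path))
    with open(path, "rb") as f:
        response = requests.post(
            req["url"],
            headers=req["headers"],
            data=req["data"],
            files={req["file_field"]: (req["file_name"], f)},
            timeout=120,
        )
    response.raise_for_status()
    return response.json()
```

Keeping the request assembly separate from the network call makes it possible to inspect the URL, headers, and parameters before any file leaves your computer.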
Pricing
Unstructured offers several account types with different pricing plans:
- Starter - A single user, with a single workspace, hosted alongside other accounts on Unstructured’s cloud infrastructure.
- Team - Multiple users and workspaces, hosted alongside other accounts on Unstructured’s cloud infrastructure.
- Enterprise - Multiple users and workspaces, isolated from all other accounts, with two hosting options for additional security and control:
  - Dedicated instance - Hosted within a virtual private cloud (VPC) running inside Unstructured’s cloud infrastructure.
  - In-VPC - Hosted within your own VPC on your own cloud infrastructure.
For usage and billing purposes, Unstructured counts pages as follows:
- For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
- For .docx files that have page metadata, Unstructured calculates the number of pages based on that metadata.
- For all other file types, Unstructured calculates the number of pages as the file’s size divided by 100 KB.
- For non-file data, Unstructured calculates a page as 100 KB of incoming data to be processed.
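
The page rules above can be approximated in a few lines of Python when you want a rough estimate before processing. This is a sketch under stated assumptions, not Unstructured's actual billing logic. It assumes that "100 KB" means 100 * 1024 bytes, that fractional pages round up, that any input counts as at least one page, and that a .docx file without page metadata falls back to the size-based rule. None of these details are stated above. The function name `estimate_pages` is hypothetical.

```python
import math
from typing import Optional

# Assumption: "100 KB" is interpreted as 100 * 1024 bytes.
PAGE_BYTES = 100 * 1024

# File types for which a page is a page, slide, or image.
PAGED_TYPES = {".pdf", ".pptx", ".tiff"}


def estimate_pages(extension: str, size_bytes: int,
                   page_count: Optional[int] = None) -> int:
    """Estimate the billed page count for one file or chunk of incoming data."""
    ext = extension.lower()
    if ext in PAGED_TYPES:
        # These types are counted by their real pages, slides, or images.
        if page_count is None:
            raise ValueError(f"a page, slide, or image count is required for {ext}")
        return page_count
    if ext == ".docx" and page_count is not None:
        # .docx files with page metadata use that metadata.
        return page_count
    # Everything else: size divided by 100 KB.
    # Assumptions: round up, and count at least one page.
    return max(1, math.ceil(size_bytes / PAGE_BYTES))
```

For example, under these assumptions a 250 KB .txt file is estimated at 3 pages. For non-file data, pass an empty string as the extension together with the size of the incoming data in bytes.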
Questions? Need help?
- For general questions about Unstructured products and pricing, email Unstructured Sales at sales@unstructured.io.
- For technical support for Unstructured accounts, email Unstructured Support at support@unstructured.io.
- For technical support for the Unstructured open source library, use our Slack community.

