Do you want to run this quickstart without modifying your local machine? Skip ahead to run this quickstart as a notebook on Google Colab now!
Do you want to just copy the sample code for use on your local machine? Skip ahead to the code now!
This quickstart uses the Unstructured Partition Endpoint and focuses on a single, local file for ease-of-use demonstration purposes. This quickstart also
focuses only on a limited set of Unstructured’s full capabilities. To unlock the full feature set, as well as use Unstructured to do
large-scale batch processing of multiple files and semi-structured data that are stored in remote locations,
skip over to an expanded, advanced version of this quickstart that uses the
Unstructured Workflow Endpoint instead.
The following code shows how to use the Unstructured Python SDK
to have Unstructured process one or more local files by using
the Unstructured Partition Endpoint.
To run this code, you will need the following:
- An Unstructured account and an Unstructured API key for your account. Learn how.
- Python 3.9 or higher installed on your local machine.
- A Python virtual environment is recommended for isolating and versioning Python project code dependencies, but this is not required. To create and activate a virtual environment, you can use a framework such as uv (recommended) or venv, the framework that is built into Python.
- You must install the Unstructured Python SDK on your local machine, for example by running one of the following commands:
  - For uv, run uv add unstructured-client
  - For venv (or for no virtual environment), run pip install unstructured-client
Add the following code to a Python file on your local machine, make the following code changes, and then run the code file to see the results.
- Replace <unstructured-api-key> with your Unstructured API key.
- To process all files within a directory, change None for input_dir to a string that contains the path to the directory on your local machine. This can be a relative or absolute path.
- To process specific files within a directory or across multiple directories, change None for input_file to a string that contains a comma-separated list of filepaths on your local machine, for example "./input/2507.13305v1.pdf,./input2/table-multi-row-column-cells.pdf". These filepaths can be relative or absolute.
  If input_dir and input_file are both set to something other than None, then the input_dir setting takes precedence, and the input_file setting is ignored.
- For the output_dir parameter, specify a string that contains the path to the directory on your local machine where you want Unstructured to send its JSON output files. If the specified directory does not exist at that location, the code will create the missing directory for you. This path can be relative or absolute.
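The precedence rule for input_dir and input_file described above can be sketched with the standard library alone. The helper name collect_input_files is hypothetical and is not part of the Unstructured Python SDK. The sketch also assumes that only files directly inside input_dir are picked up (no recursion into subdirectories), which the text above does not specify.

```python
from pathlib import Path
from typing import List, Optional


def collect_input_files(input_dir: Optional[str], input_file: Optional[str]) -> List[Path]:
    """Return the files to process; input_dir takes precedence over input_file."""
    if input_dir is not None:
        # input_dir wins, so input_file is ignored entirely.
        # Assumption: only regular files directly inside the directory are used.
        return sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    if input_file is not None:
        # A comma-separated list of relative or absolute file paths.
        return [Path(part.strip()) for part in input_file.split(",") if part.strip()]
    # Neither setting was provided, so there is nothing to process.
    return []


# Example: two specific files from two different directories.
files = collect_input_files(
    None,
    "./input/2507.13305v1.pdf,./input2/table-multi-row-column-cells.pdf",
)
print([f.name for f in files])
```

Resolving the file list up front keeps the precedence decision in one place, so the rest of a script can loop over the result without checking either setting again.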
Sample code
Python SDK
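A minimal sketch of what such a script might look like for a single file, assuming the unstructured-client package is installed and the API key is valid. The helper names partition_file and write_elements are hypothetical, the choice of the HI_RES strategy is an assumption, and the output naming scheme (source filename plus .json) is an illustration rather than a documented convention.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, List


def partition_file(path: str, api_key: str) -> List[Dict[str, Any]]:
    """Send one local file to the Partition Endpoint and return its elements."""
    # Imported here so the rest of this module loads without the SDK installed.
    from unstructured_client import UnstructuredClient
    from unstructured_client.models import operations, shared

    client = UnstructuredClient(api_key_auth=api_key)
    with open(path, "rb") as f:
        request = operations.PartitionRequest(
            partition_parameters=shared.PartitionParameters(
                files=shared.Files(content=f.read(), file_name=os.path.basename(path)),
                strategy=shared.Strategy.HI_RES,  # assumption: pick the strategy you need
            )
        )
    response = client.general.partition(request=request)
    return list(response.elements or [])


def write_elements(elements: List[Dict[str, Any]], source_path: str, output_dir: str) -> Path:
    """Write elements as JSON into output_dir, creating the directory if it is missing."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: name the output after the source file, with .json appended.
    out_path = out_dir / (Path(source_path).name + ".json")
    out_path.write_text(json.dumps(elements, indent=2), encoding="utf-8")
    return out_path


# Example usage (needs a real API key and network access, so it is left commented out):
# elements = partition_file("./input/2507.13305v1.pdf", "<unstructured-api-key>")
# write_elements(elements, "./input/2507.13305v1.pdf", "./output")
```

Keeping the network call and the file writing in separate functions makes it possible to check the output handling locally without calling the API.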

