July 31st, 2024

The Open Source Aryn Partitioning Service

The Aryn Partitioning Service is a serverless, GPU-powered API, accessed with an API key, that segments and labels PDF documents to improve the accuracy and efficiency of processing complex data.

The Aryn Partitioning Service (APS) has launched as a serverless, GPU-powered API that simplifies segmenting and labeling PDF documents. It uses the Aryn Partitioner, which is based on a state-of-the-art deep learning model trained on over 80,000 enterprise documents. According to Aryn, this yields significantly more accurate data chunking and better recall in hybrid search applications.

The service takes a PDF and returns its output as JSON, which makes it easy for developers to integrate into their applications. It is designed to handle complex, unstructured data efficiently and extracts document components such as paragraphs, tables, and images. Users can try the service in the Aryn Playground, where they can upload PDFs and visualize the segmentation results.

Because the service is hosted, users do not need to manage their own GPU resources, which makes it a cost-effective option for document processing. Access requires an API key. Developers can call the service directly from their scripts, using either the Aryn SDK or curl commands, or use it together with the Sycamore document processing engine.

For large documents, particularly those that require OCR, users can process pages in batches for efficiency. Aryn encourages feedback and feature requests as the service rolls out.
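
Since the service returns its output as JSON, a script that consumes it mostly has to sort the labeled elements into the kinds the application cares about. The Python sketch below illustrates that step only. It does not call the service, and the response shape it uses (an "elements" list whose items carry "type", "page", and "text" fields) is an assumption made for illustration, not the documented APS schema; the function name is likewise hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical response body. The field names ("elements", "type", "page",
# "text") are assumptions for illustration, not the documented APS schema.
SAMPLE_RESPONSE = """
{
  "elements": [
    {"type": "Title", "page": 1, "text": "Quarterly Report"},
    {"type": "Text", "page": 1, "text": "Revenue grew in the third quarter."},
    {"type": "Table", "page": 2, "text": "Region | Revenue"},
    {"type": "Image", "page": 2, "text": ""},
    {"type": "Text", "page": 2, "text": "Costs were flat."}
  ]
}
"""


def group_elements_by_type(response_json: str) -> dict[str, list[dict]]:
    """Parse a partitioner-style JSON string and bucket elements by label."""
    payload = json.loads(response_json)
    grouped = defaultdict(list)
    # A missing "elements" key is treated as an empty document.
    for element in payload.get("elements", []):
        # Elements without a label are collected under "Unknown".
        grouped[element.get("type", "Unknown")].append(element)
    return dict(grouped)


if __name__ == "__main__":
    grouped = group_elements_by_type(SAMPLE_RESPONSE)
    for element_type, items in grouped.items():
        print(f"{element_type}: {len(items)} element(s)")
```

Grouping by label first lets later steps treat each kind differently, for example chunking paragraphs for search while handling tables and images separately. The real field names should be taken from the service's documentation before adapting this.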
