Commit 982d428

Update README.md
1 parent 53a7892 commit 982d428

1 file changed: +25 -9 lines changed

README.md

Lines changed: 25 additions & 9 deletions
@@ -10,13 +10,14 @@
 
 > [!IMPORTANT]
 >
->OmniParse is a platform that ingests and parses any unstructured data into structured, actionable data optimized for GenAI (LLM) applications. Whether working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications, such as RAG, fine-tuning, and more.
-
-
+>OmniParse is a platform that ingests and parses any unstructured data into structured, actionable data optimized for GenAI (LLM) applications. Whether you are working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications such as RAG, fine-tuning, and more.
 
 ## Try it out
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adithya-s-k/omniparse/blob/main/examples/OmniParse_GoogleColab.ipynb)
 
+## Intro
+https://github.com/adithya-s-k/omniparse/assets/27956426/457d8b5b-9573-44da-8bcf-616000651a13
+
 ## Features
 ✅ Completely local, no external APIs \
 ✅ Fits in a T4 GPU \
@@ -25,14 +26,14 @@
 ✅ Table extraction, image extraction/captioning, audio/video transcription, web page crawling \
 ✅ Easily deployable using Docker and Skypilot \
 ✅ Colab friendly \
-✅ Interactive UI powered by Gradio \
+✅ Interactive UI powered by Gradio
 
-### Problem Statement
+### Why OmniParse?
 It's challenging to process data as it comes in different shapes and sizes. OmniParse aims to be an ingestion/parsing platform where you can ingest any type of data, such as documents, images, audio, video, and web content, and get the most structured and actionable output that is GenAI (LLM) friendly.
 
 ## Installation
-> Note: The server only works on Linux-based systems. This is due to certain dependencies and system-specific configurations that are not compatible with Windows or macOS.
-To install OmniParse, you can use `pip`:
+> [!IMPORTANT]
+> The server only works on Linux-based systems. This is due to certain dependencies and system-specific configurations that are not compatible with Windows or macOS.
 
 ```bash
 git clone https://github.com/adithya-s-k/omniparse
@@ -42,7 +43,7 @@ cd omniparse
 Create a Virtual Environment:
 
 ```bash
-conda create --name omniparse-venv python=3.10
+conda create -n omniparse-venv python=3.10
 conda activate omniparse-venv
 ```
 
@@ -52,6 +53,8 @@ Install Dependencies:
 poetry install
 # or
 pip install -e .
+# or
+pip install .
 ```
 
 ### 🛳️ Docker
### 🛳️ Docker
@@ -247,7 +250,7 @@ curl -X POST -F "file=@/path/to/audio.mp3" http://localhost:8000/parse_media/aud
 
 #### Parse Website
 
-Endpoint: `/parse_website`
+Endpoint: `/parse_website/parse`
 Method: POST
 
 Parses a website given its URL.
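
A note on the endpoint rename in this hunk: the sketch below builds, but does not send, a POST request to the new path. The base URL `http://localhost:8000` is taken from the curl example visible in the hunk header. The JSON body with a `url` field is an assumption, since the endpoint's arguments are not shown in this diff, and `build_parse_website_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # from the curl example in the hunk header
ENDPOINT = "/parse_website/parse"   # path introduced by this commit


def build_parse_website_request(target_url: str) -> urllib.request.Request:
    """Build a POST request for the website-parsing endpoint (hypothetical helper).

    Assumption: the server accepts a JSON body with a "url" field. The diff
    does not show the endpoint's arguments, so check the README before use.
    """
    payload = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_parse_website_request("https://example.com")
    print(request.get_method(), request.full_url)
    # Sending is left to the caller, e.g. urllib.request.urlopen(request),
    # once an OmniParse server is running locally.
```

Clients written against the old `/parse_website` path would need the same one-line change to `ENDPOINT`.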
@@ -261,6 +264,7 @@ Arguments:
 
 </details>
 
+
 ## Coming Soon/ RoadMap
 🦙 LlamaIndex | Langchain | Haystack integrations coming soon
 📚 Batch processing data
@@ -273,6 +277,18 @@ Arguments:
 **Final goal**: replace all the different models currently being used with a single multimodal model to parse any type of data and get the data you need.
 
 
+## Limitations
+OmniParse needs a GPU with at least 8 to 10 GB of VRAM because it runs deep learning models.
+\
+Document Parsing Limitations
+\
+[Marker](https://github.com/VikParuchuri/marker), the underlying PDF parser, will not convert 100% of equations to LaTeX because it has to detect them before converting them.
+Tables are not always formatted 100% correctly; text can end up in the wrong column.
+Whitespace and indentation are not always respected.
+Not all lines/spans will be joined properly.
+Document parsing works best on digital PDFs that don't need much OCR. It is optimized for speed, and only limited OCR is used to fix errors.
+To fit all the models on the GPU, we use the smallest variants, which might not offer best-in-class performance.
+
 ## License
 OmniParse is licensed under the GPL-3.0 license. See `LICENSE` for more information.
 The project uses Marker under the hood, which has a commercial license that needs to be followed. Here are the details:
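
A note on the VRAM requirement in the new Limitations section: a minimal sketch of a pre-flight check, assuming an NVIDIA GPU with the `nvidia-smi` tool on the PATH. The 8 GiB threshold is an assumption taken from the lower end of the README's 8 to 10 GB range, and all function names here are hypothetical; nothing like this is part of OmniParse itself.

```python
import subprocess

MIN_VRAM_MIB = 8 * 1024  # assumption: lower bound of the README's 8 to 10 GB range


def parse_total_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse per-GPU total memory in MiB from nvidia-smi CSV output (one GPU per line)."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_total_vram_mib() -> list[int]:
    """Ask nvidia-smi for each GPU's total memory; needs the NVIDIA driver tools installed."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_total_vram_mib(result.stdout)


def has_enough_vram(vram_mib: list[int], minimum_mib: int = MIN_VRAM_MIB) -> bool:
    """Return True if at least one GPU meets the minimum."""
    return any(total >= minimum_mib for total in vram_mib)


if __name__ == "__main__":
    try:
        gpus = query_total_vram_mib()
    except (FileNotFoundError, subprocess.CalledProcessError):
        gpus = []  # no NVIDIA tooling found, so treat as no usable GPU
    print("Enough VRAM for OmniParse:", has_enough_vram(gpus))
```

Keeping the parsing separate from the `nvidia-smi` call lets the threshold logic be exercised on machines without a GPU.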
