README.md: 25 additions & 9 deletions
@@ -10,13 +10,14 @@

 > [!IMPORTANT]
 >
->OmniParse is a platform that ingests and parses any unstructured data into structured, actionable data optimized for GenAI (LLM) applications. Whether working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications, such as RAG, fine-tuning, and more.
-
-
+>OmniParse is a platform that ingests and parses any unstructured data into structured, actionable data optimized for GenAI (LLM) applications. Whether you are working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications such as RAG, fine-tuning, and more

 ## Try it out
 [](https://colab.research.google.com/github/adithya-s-k/omniparse/blob/main/examples/OmniParse_GoogleColab.ipynb)
 It's challenging to process data as it comes in different shapes and sizes. OmniParse aims to be an ingestion/parsing platform where you can ingest any type of data, such as documents, images, audio, video, and web content, and get the most structured and actionable output that is GenAI (LLM) friendly.

 ## Installation
-> Note: The server only works on Linux-based systems. This is due to certain dependencies and system-specific configurations that are not compatible with Windows or macOS.
-To install OmniParse, you can use `pip`:
+> [!IMPORTANT]
+> The server only works on Linux-based systems. This is due to certain dependencies and system-specific configurations that are not compatible with Windows or macOS.
 **Final goal**: replace all the different models currently being used with a single MultiModel Model to parse any type of data and get the data you need.


+## Limitations
+There is a need for a GPU with 8~10 GB minimum VRAM as we are using deep learning models.
+\
+Document Parsing Limitations
+\
+[Marker](https://github.com/VikParuchuri/marker) which is the underlying PDF parser will not convert 100% of equations to LaTeX because it has to detect and then convert them.
+Tables are not always formatted 100% correctly; text can be in the wrong column.
+Whitespace and indentations are not always respected.
+Not all lines/spans will be joined properly.
+This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.
+To fit all the models in the GPU, we are using the smallest variants, which might not offer the best-in-class performance.
+
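Note on the added Limitations text: the "8~10 GB minimum VRAM" requirement could be checked before starting the server. The following is only a minimal sketch and is not part of OmniParse or of this change. It assumes an NVIDIA GPU with the `nvidia-smi` tool on the PATH, and the helper names (`parse_vram_mib`, `has_enough_vram`, `query_nvidia_smi`) and the 8 GB threshold are hypothetical choices made for illustration.

```python
import shutil
import subprocess

# Assumption: lower bound of the "8~10 GB" range quoted in the Limitations text.
MIN_VRAM_GB = 8


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse per-GPU total memory in MiB, one integer per output line.

    Lines that are not plain integers (for example "[N/A]") are skipped.
    """
    values = []
    for line in nvidia_smi_output.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def has_enough_vram(nvidia_smi_output: str, min_gb: float = MIN_VRAM_GB) -> bool:
    """Return True if at least one GPU reports min_gb (decimal GB) or more."""
    min_bytes = min_gb * 10**9
    return any(mib * 1024**2 >= min_bytes for mib in parse_vram_mib(nvidia_smi_output))


def query_nvidia_smi() -> str:
    """Return nvidia-smi's total-memory report, or '' if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    print("GPU meets the VRAM requirement:", has_enough_vram(query_nvidia_smi()))
```

The parsing is kept separate from the `nvidia-smi` call so the threshold logic can be exercised on machines without a GPU.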
 ## License
 OmniParse is licensed under the GPL-3.0 license. See `LICENSE` for more information.
 The project uses Marker under the hood, which has a commercial license that needs to be followed. Here are the details: