feat: add onnxslim #2258
Looks clean! We will need some basic testing for this feature, maybe in test_export_cli.py.
@inisis the table is very cool! Is it possible to get the "diff", i.e. how many operators were removed per operator type and the overall reduction in the number of ops?
@IlyasMoutawwakil Currently we don't support that, but the feature can be added, and we provide an API for doing so. The terminal output will use different colors, with green meaning a reduction.
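A minimal sketch of the per-operator "diff" requested above, assuming the operator types of the original and slimmed models are already available as plain lists of strings. The function name `op_count_diff` is hypothetical and is not part of onnxslim or optimum; with the `onnx` package, such lists could be collected as `[node.op_type for node in model.graph.node]`.

```python
from collections import Counter


def op_count_diff(before_ops, after_ops):
    """Return (per-op-type delta, overall delta) between two op-type lists.

    Hypothetical helper for illustration only. Negative numbers mean that
    operators were removed by the optimization.
    """
    before = Counter(before_ops)
    after = Counter(after_ops)
    diff = {}
    # Visit every operator type seen in either model, in a stable order.
    for op in sorted(set(before) | set(after)):
        delta = after[op] - before[op]
        if delta != 0:
            diff[op] = delta
    total = sum(after.values()) - sum(before.values())
    return diff, total


if __name__ == "__main__":
    # Made-up operator lists, not measurements from a real model.
    before = ["MatMul", "Add", "Reshape", "Reshape", "Gather"]
    after = ["MatMul", "Add", "Reshape"]
    per_op, overall = op_count_diff(before, after)
    print(per_op, overall)
```

Operator types whose count is unchanged are left out so that the report only lists what the optimization actually touched.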
@IlyasMoutawwakil It runs very slowly locally. Can I use your CI server for this? I would simply enable onnxslim in test_export_cli.py and remove it once everything passes.
The CI runs on basic GitHub runners.
@IlyasMoutawwakil Currently the CI test raises No module named 'onnxslim', so where should the dependency go? One possible solution: when the user sets slim to True, optimum automatically installs onnxslim if an import error occurs.
@inisis please add it to "tests" extras, and add a check when
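One way such an availability check could look, as a rough sketch only: it uses the standard library's `importlib.util.find_spec` to detect whether a package is installed and fails early with a clear message instead of installing anything automatically. The helper names `is_package_available` and `check_slim_requested` are hypothetical and are not taken from this PR.

```python
import importlib.util


def is_package_available(name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def check_slim_requested(slim, package="onnxslim"):
    """Raise a clear error if slimming is requested but the package is missing.

    Hypothetical helper: `slim` mirrors the flag mentioned in the discussion.
    """
    if slim and not is_package_available(package):
        raise ImportError(
            f"slim=True requires the `{package}` package, "
            f"which can be installed with `pip install {package}`."
        )


if __name__ == "__main__":
    # Nothing is checked when slimming is not requested.
    check_slim_requested(False)
    print(is_package_available("json"))
```

Raising an explicit error keeps the export free of side effects, in contrast to installing the dependency on the user's behalf.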
Just want to add my voice in support of onnxslim: it's an amazing library I use for all my Transformers.js models 🤗. So, having this built into Optimum could be really helpful too! 👍 Happy to provide a review when the PR is ready.
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
LGTM! Thanks for iterating on this!
What does this PR do?
This PR introduces onnxslim, a new high-performance ONNX optimization tool.
For the gpt2 model, it reduces the number of operators and the model size without losing accuracy, and the same applies to many other models.
Who can review?
@IlyasMoutawwakil @xenova