Improve Dockerfile for better layer caching and workflow speed #5087

Draft: wants to merge 1 commit into base: master

itayporezky

Description of changes: This PR aims to improve Docker layer caching, which has two main benefits:

  1. Shorter workflow execution time, lowering GitHub cost.
  2. Better reuse of cached Docker layers, which will reduce hosting storage cost, lower clients' storage cost, and speed up clients' pull time.

Note: the workflow always pushes the :latest tag, so building this PR will probably(?) break your deployments. You can modify my commits to push a different tag for testing.

I can't fully test this myself because of your dependencies on AWS and the private base Docker image (FROM). I've tested it partially using a python base image.

A possible future improvement is to reorder the COPY instructions: if some Python modules are not updated frequently, they could be copied first and installed in a separate RUN pip install xxxxx step, again to improve caching.
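
A minimal sketch of that split, assuming two hypothetical module folders (`stable_module/` changes rarely, `active_module/` changes often) and a public `python` base image standing in for the private one. None of these names come from the actual Dockerfiles:

```dockerfile
# Hypothetical sketch, not the Dockerfile from this PR.
# The public python image stands in for the private base image.
FROM python:3.11-slim

WORKDIR /src

# Rarely changing module first: this COPY/RUN pair stays cached
# as long as the files under stable_module/ are unchanged.
COPY stable_module/ stable_module/
RUN pip install --no-cache-dir ./stable_module

# Frequently changing module last: an edit here rebuilds only
# the two layers below, not the ones above.
COPY active_module/ active_module/
RUN pip install --no-cache-dir ./active_module
```

With this ordering, an edit under `active_module/` invalidates only the last COPY/RUN pair. The layers for `stable_module/` are reused, provided the builder has a cache available.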

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@prateekdesai04
Contributor

Thank you for your contribution @itayporezky!
Curious if you happen to have any speed-up numbers for individual dockerfile builds before and after your changes?

@prateekdesai04 prateekdesai04 self-requested a review April 23, 2025 20:52
@itayporezky itayporezky force-pushed the improve_dockerfiles branch from f0bf092 to acb9d2c Compare April 23, 2025 21:08
@itayporezky
Author

I didn't measure specific size differences; I'm speaking mostly from a theoretical point of view.
Previously, the Dockerfile contained:

RUN git clone https://github.com/autogluon/autogluon.git
COPY full_install_image.sh autogluon/
RUN cd autogluon && chmod +x full_install_image.sh && ./full_install_image.sh

This means that any single-byte change anywhere in the git repo would cause the final layer of the image to be recreated.
After my changes, docker build will COPY only the relevant folders, so changes in other folders (docs, CI, .git, etc.) would not change the final layer, and docker push would complete almost instantly and push nothing.

This is significant since the final layer contains CUDA and other very heavy packages.
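
As a rough illustration of the selective-COPY pattern described above (not the actual diff in this PR), the sketch below assumes the build context is the repository root, uses a public `python` image in place of the private base image, and uses assumed package folder names:

```dockerfile
# Hypothetical sketch, not the Dockerfile from this PR.
# The public python image stands in for the private base image.
FROM python:3.11-slim

WORKDIR /autogluon

# Copy only the folders the install script needs. The folder names
# are assumptions for illustration. Nothing under docs/, CI/ or .git/
# is copied, so changes there do not affect these layers.
COPY common/ common/
COPY core/ core/
COPY tabular/ tabular/

# Install script from the original Dockerfile, copied from the
# build context (its location in the context is an assumption).
COPY full_install_image.sh .

# This heavy layer is rebuilt only when one of the COPY inputs
# above changes.
RUN chmod +x full_install_image.sh && ./full_install_image.sh
```

Docker decides whether a COPY layer can be reused from the checksums of the copied files, so narrowing the COPY inputs narrows what can invalidate the expensive RUN layer that follows.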

@itayporezky itayporezky force-pushed the improve_dockerfiles branch from acb9d2c to bfa2058 Compare April 23, 2025 21:09
@itayporezky
Author

I have some new ideas for improving this further, so I'll change the PR to draft for now.

@itayporezky itayporezky marked this pull request as draft April 23, 2025 21:31
Job PR-5087-bfa2058 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-5087/bfa2058/index.html

2 participants