Add test for Openai GPT OSS model #2749
Model card: https://huggingface.co/blog/welcome-openai-gpt-oss
The model requires upgrading `transformers` to 4.55.0, after which loading fails with:
RuntimeError: Using MXFP4 quantized models requires a GPU
The model ships with MXFP4 quantization, which requires a GPU to run. Since we plan to run the model in BF16 or FP32 and do not actually need quantization, `quantization_config` is removed from the model config before loading. After this change the model loads without the MXFP4 error, but we then see a warning:
Some weights of GptOssForCausalLM were not initialized from the model checkpoint at openai/gpt-oss-20b and are newly initialized,
which suggests the model weights are not being loaded correctly, likely due to an architecture/config mismatch caused by altering the original configuration. This test is intended to verify whether the model's architecture is compatible with Forge compilation and to determine how far the compilation process progresses.
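A minimal sketch of the config-stripping step described above, shown on a plain dict for illustration. In the actual test the same removal would be applied to the config object returned by `AutoConfig.from_pretrained("openai/gpt-oss-20b")` before passing it to `AutoModelForCausalLM.from_pretrained`; the helper name and the config fields shown are assumptions, not the PR's exact code.

```python
def strip_quantization_config(config_dict):
    """Return a copy of an HF-style config dict without quantization settings,
    so the model is materialized in BF16/FP32 instead of MXFP4."""
    cleaned = dict(config_dict)
    # Dropping the key avoids the "MXFP4 quantized models requires a GPU" path.
    cleaned.pop("quantization_config", None)
    return cleaned

# Assumed shape of the relevant fields in the gpt-oss-20b config:
cfg = {
    "model_type": "gpt_oss",
    "quantization_config": {"quant_method": "mxfp4"},
}
cleaned = strip_quantization_config(cfg)
print("quantization_config" in cleaned)  # False
```

Note that stripping the key only changes how the weights are materialized; as the warning above shows, it can still leave the loader unable to map checkpoint tensors onto the de-quantized architecture, which is exactly what this test probes.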