[megatron] support export lora to_mcore #5445
Conversation
Summary of Changes
Hello @Jintao-Huang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces the capability to export LoRA (Low-Rank Adaptation) models, specifically those loaded as Megatron Core adapters, into a full Megatron Core model format. This enhances the model conversion utilities by allowing users to consolidate fine-tuned LoRA weights into a base Megatron Core model for deployment or further use. Additionally, it includes several robustness improvements and minor code cleanups across the codebase.
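For intuition, consolidating LoRA weights into a base model amounts to folding the low-rank update back into each dense weight matrix. A minimal sketch of that arithmetic (the shapes and the alpha/r scaling convention here are illustrative, not the PR's actual code):

import torch

# Standard LoRA merge: W_merged = W + (alpha / r) * B @ A
d_out, d_in, r, alpha = 16, 16, 4, 8
W = torch.randn(d_out, d_in)   # frozen base weight
A = torch.randn(r, d_in)       # LoRA down-projection
B = torch.zeros(d_out, r)      # LoRA up-projection (zero-initialized)
W_merged = W + (alpha / r) * (B @ A)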
Highlights
- LoRA to Megatron Core Export: The convert_mcore2hf function has been extended to save the Megatron Core model when mcore_adapters are present and the to_mcore flag is set, enabling the export of LoRA-adapted models to the Megatron Core format.
- Conditional Model Preparation: The prepare_model_template function now applies model adaptations conditionally, improving robustness by preventing operations on a None model.
- Megatron Core Debugging Enhancement: The __repr__ method of TELinear (Transformer Engine Linear) modules has been patched to include the Tensor Parallelism (TP) size, aiding debugging and inspection of Megatron Core models; see the sketch below.
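The __repr__ patch from the last highlight follows the usual monkey-patching pattern. Below is a minimal runnable sketch of the same idea, demonstrated on torch.nn.Linear rather than TELinear (which requires Transformer Engine); the tp_size attribute is a hypothetical stand-in for however the real module exposes its TP size:

import torch.nn as nn

_original_repr = nn.Linear.__repr__

def _repr_with_tp(self):
    # 'tp_size' is a hypothetical attribute, used here only for illustration.
    tp_size = getattr(self, 'tp_size', 1)
    return f'{_original_repr(self)} (tp_size={tp_size})'

nn.Linear.__repr__ = _repr_with_tp
print(nn.Linear(4, 8))  # Linear(in_features=4, out_features=8, bias=True) (tp_size=1)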
Code Review
This pull request adds support for exporting LoRA adapters to the mcore format, which involves merging the LoRA weights with the base model and saving the result as a new mcore model. The changes are primarily in swift/megatron/utils/convert.py to handle the new export path, with related adjustments in swift/llm/export/export.py to trigger the new functionality and in swift/llm/infer/utils.py to accommodate model loading scenarios. My review focuses on the clarity of the new logic and potential issues with handling concurrent export flags.
if args.to_hf:
    megatron_model_meta.convert_mcore2hf(hf_model, mg_model)
    if args.test_convert_precision:
        test_convert_precision(hf_model, mg_model, template)
    del mg_model
    logger.info('Successfully transferred MG model weights to HF model.')
    ckpt_dir = megatron_args.load if megatron_args.adapter_load is None else megatron_args.adapter_load
    save_checkpoint(
        hf_model,
        processor,
        args.output_dir,
        safe_serialization=args.safe_serialization,
        model_dirs=[ckpt_dir, args.model_dir],
        max_shard_size=args.max_shard_size,
        additional_saved_files=hf_model.model_meta.additional_saved_files)
    logger.info(f'Successfully saved HF model weights in `{args.output_dir}`.')
elif args.to_mcore:
    if args.thread_count is None:
        checkpoint_size = sum(get_n_params_grads(mg_model)[0]) * torch.finfo(args.torch_dtype).bits // 8e9
        args.thread_count = max(math.ceil(checkpoint_size / 10), 2)  # 10GB
    patch_torch_dist_shard(args.thread_count)

    args.save_args()
    mg_save_checkpoint(1, [mg_model], None, None, 0)
    logger.info(f'Successfully saved Megatron model weights in `{args.output_dir}`.')
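As a quick sanity check on the thread_count heuristic above (roughly one I/O thread per 10 GB of checkpoint, with a floor of two), here is the arithmetic for a hypothetical 7B-parameter model saved in bf16:

import math

n_params = 7e9                                    # hypothetical 7B-parameter model
bits = 16                                         # bf16
checkpoint_gb = n_params * bits // 8e9            # 14.0 GB
thread_count = max(math.ceil(checkpoint_gb / 10), 2)  # ceil(1.4) = 2 -> 2 threads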
The use of elif args.to_mcore: makes the to_hf and to_mcore operations mutually exclusive. If a user provides both the --to_hf and --to_mcore flags, only the to_hf operation will be performed, which might not be the intended behavior. If both operations should be supported in a single run, this elif should be an if.

However, changing elif to if will cause a NameError, because del mg_model is inside the if args.to_hf: block.

To support both flags concurrently, you could refactor this section to perform both conversions if requested, and then delete the model afterwards, as suggested below.
if args.to_hf:
    megatron_model_meta.convert_mcore2hf(hf_model, mg_model)
    if args.test_convert_precision:
        test_convert_precision(hf_model, mg_model, template)
    logger.info('Successfully transferred MG model weights to HF model.')
    ckpt_dir = megatron_args.load if megatron_args.adapter_load is None else megatron_args.adapter_load
    save_checkpoint(
        hf_model,
        processor,
        args.output_dir,
        safe_serialization=args.safe_serialization,
        model_dirs=[ckpt_dir, args.model_dir],
        max_shard_size=args.max_shard_size,
        additional_saved_files=hf_model.model_meta.additional_saved_files)
    logger.info(f'Successfully saved HF model weights in `{args.output_dir}`.')
if args.to_mcore:
    if args.thread_count is None:
        checkpoint_size = sum(get_n_params_grads(mg_model)[0]) * torch.finfo(args.torch_dtype).bits // 8e9
        args.thread_count = max(math.ceil(checkpoint_size / 10), 2)  # 10GB
    patch_torch_dist_shard(args.thread_count)
    args.save_args()
    mg_save_checkpoint(1, [mg_model], None, None, 0)
    logger.info(f'Successfully saved Megatron model weights in `{args.output_dir}`.')
del mg_model
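Note the design choice in this refactor: del mg_model moves after both branches because the to_mcore path still needs the model for mg_save_checkpoint, so the reference is only released once both conversions have finished.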
@@ -32,12 +32,12 @@ def run(self):
            export_to_ollama(args)
        elif args.to_cached_dataset:
            export_cached_dataset(args)
        elif args.to_hf or args.mcore_adapters and args.to_mcore:
The boolean logic args.to_hf or args.mcore_adapters and args.to_mcore can be hard to read due to operator precedence. While and has higher precedence than or, adding parentheses would make the intent much clearer and prevent potential misinterpretations.
Suggested change:
-        elif args.to_hf or args.mcore_adapters and args.to_mcore:
+        elif args.to_hf or (args.mcore_adapters and args.to_mcore):
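For the record, and does bind tighter than or in Python, so the parenthesized form is behavior-preserving; a quick exhaustive check:

# `and` binds tighter than `or`, so these expressions are always equal:
for a in (False, True):
    for b in (False, True):
        for c in (False, True):
            assert (a or b and c) == (a or (b and c))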
Force-pushed from e510a6b to 997cd2c.
No description provided.