Llama2 Readme Update #1344

wtomin · 2025-09-30T09:40:01Z

What does this PR do?

Adds # (feature)

Add a README file for Llama2 inference and list its performance.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
Did you build and run the code without any errors?
Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@xxx

gemini-code-assist · 2025-09-30T09:40:14Z

Summary of Changes

Hello @wtomin, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's support for Llama2 models by introducing dedicated documentation and configuration updates. The primary goal is to provide users with clear guidance on how to set up, run inference, and understand the performance characteristics of Llama2 models within the MindSpore ecosystem. This addition streamlines the user experience for integrating and utilizing Llama2 for various natural language processing tasks.

Highlights

New Llama2 README: A new README file (README_llama2.md) has been added under examples/transformers/llama, providing comprehensive documentation for Llama2 inference.
Llama2 Inference Guide: The new README includes detailed installation instructions, system requirements, a quick start guide for running Llama2 inference, and performance benchmarks.
Auto-modeling Update: The mindone/transformers/models/auto/modeling_auto.py file has been updated to include LlamaForTokenClassification in the MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING_NAMES mapping, so that the auto classes can resolve the llama model type for token classification.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new README file for Llama2 inference and adds a mapping for LlamaForTokenClassification. The new documentation is a valuable addition, but it contains a critical error in the code snippet due to a missing import, as well as several typos and markdown formatting issues that affect readability. Additionally, the code change in modeling_auto.py disrupts the alphabetical ordering of the dictionary, which could impact maintainability. I've provided specific suggestions to address these points.

examples/transformers/llama/README_llama2.md

gemini-code-assist · 2025-09-30T09:41:18Z

mindone/transformers/models/auto/modeling_auto.py

        ("mpnet", "MPNetForTokenClassification"),
        ("mt5", "MT5ForTokenClassification"),
        ("persimmon", "PersimmonForTokenClassification"),
+        ("llama", "LlamaForTokenClassification"),


For consistency and maintainability, please add this entry in alphabetical order within the MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING_NAMES dictionary. It should be placed after lilt and before longformer.
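The ordering requested above can be checked mechanically. The following is a minimal sketch, not the project's actual code: the list is a hypothetical five-entry excerpt (the real mapping in modeling_auto.py is much longer), and `token_classification_names` and `is_alphabetical` are names introduced here only for illustration. It inserts the llama entry at its sorted position with the standard-library `bisect.insort` and then verifies that the keys are in alphabetical order.

```python
from bisect import insort

# Hypothetical excerpt of the mapping entries; the real list in
# modeling_auto.py contains many more model types.
token_classification_names = [
    ("lilt", "LiltForTokenClassification"),
    ("longformer", "LongformerForTokenClassification"),
    ("mpnet", "MPNetForTokenClassification"),
    ("mt5", "MT5ForTokenClassification"),
    ("persimmon", "PersimmonForTokenClassification"),
]

# insort keeps an already sorted list sorted. Tuples compare element by
# element, so the model-type key decides the position.
insort(token_classification_names, ("llama", "LlamaForTokenClassification"))


def is_alphabetical(entries):
    """Return True if the model-type keys are in ascending order."""
    keys = [key for key, _ in entries]
    return keys == sorted(keys)


print(is_alphabetical(token_classification_names))
print([key for key, _ in token_classification_names][:3])
```

A check like `is_alphabetical` could also run as a small unit test, so that an entry appended at the end of the list (as in the diff above) is caught before review.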

wtomin added 4 commits September 30, 2025 16:09

updates

b3f0555

update the auto config

d814d0e

updates

8480f3a

update readme

419fa19

wtomin requested a review from vigo999 as a code owner September 30, 2025 09:40

gemini-code-assist bot reviewed Sep 30, 2025

wtomin and others added 5 commits September 30, 2025 17:42

Apply suggestion from @gemini-code-assist[bot]

3a574f9

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Apply suggestion from @gemini-code-assist[bot]

b3df7d5

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Apply suggestion from @gemini-code-assist[bot]

5be88fd

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Apply suggestion from @gemini-code-assist[bot]

39de7f7

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Apply suggestion from @gemini-code-assist[bot]

4296fa1

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

vigo999 assigned wtomin Sep 30, 2025

vigo999 added the "new model" label (add new model to mindone) Sep 30, 2025

vigo999 added this to mindone Sep 30, 2025

vigo999 moved this to In Progress in mindone Sep 30, 2025

vigo999 requested a review from SamitHuang September 30, 2025 09:52

vigo999 approved these changes Sep 30, 2025

vigo999 enabled auto-merge September 30, 2025 09:52

wtomin requested review from zhanghuiyao and CaitinZhao September 30, 2025 10:08
