LLaMA-3 support and questions #53

Description

@catid

This looks like one of the best quantization options for the important new LLaMA-3 70B model, since it could let the model run on one or two consumer-grade GPUs. However, llama.py does not appear to support grouped-query attention (GQA), which LLaMA-3 uses, so I don't think it will work as-is.

Are you planning to add support for LLaMA-3?
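
For context, here is a rough sketch of the shape mismatch I have in mind. LLaMA-3 70B shares key/value heads across groups of query heads (grouped-query attention, the generalization of MQA to more than one KV head), so its `k_proj` and `v_proj` weights are narrower than `q_proj`. The config values below are assumed from the published LLaMA-3 70B configuration, and the helper names `proj_shapes` and `kv_head_for_query_head` are hypothetical, written for illustration only; they are not taken from this repository's llama.py.

```python
# Sketch: attention projection shapes under grouped-query attention (GQA).
# Config values are assumed from the published LLaMA-3 70B configuration.
HIDDEN_SIZE = 8192
NUM_ATTENTION_HEADS = 64
NUM_KEY_VALUE_HEADS = 8


def proj_shapes(hidden_size, n_heads, n_kv_heads):
    """Return (out_features, in_features) for each attention projection."""
    head_dim = hidden_size // n_heads
    return {
        "q_proj": (n_heads * head_dim, hidden_size),
        # With GQA, K and V have fewer heads than Q, so these are narrower.
        "k_proj": (n_kv_heads * head_dim, hidden_size),
        "v_proj": (n_kv_heads * head_dim, hidden_size),
        "o_proj": (hidden_size, n_heads * head_dim),
    }


def kv_head_for_query_head(q_head, n_heads, n_kv_heads):
    """Map a query head index to the KV head its group shares."""
    group_size = n_heads // n_kv_heads
    return q_head // group_size


shapes = proj_shapes(HIDDEN_SIZE, NUM_ATTENTION_HEADS, NUM_KEY_VALUE_HEADS)
for name, shape in shapes.items():
    print(name, shape)
```

Code that assumes every projection is `hidden_size x hidden_size` would get the wrong shape for `k_proj` and `v_proj` here, which is the kind of failure I would expect if GQA is not handled.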
