Support more models and flash attention 2 #3

Open
yuyijiong wants to merge 3 commits into VITA-Group:main from yuyijiong:main
Conversation

@yuyijiong

  1. Support more models, such as Mistral, Gemma, and Qwen2
  2. Support flash attention 2
  3. Truncate the sequence length to under 4k when calculating outliers, to avoid OOM
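For item 3, a minimal sketch of what such a truncation step could look like. This is an illustration, not the PR's actual code: the function name `truncate_for_outliers` and the 4096-token cap are assumptions based on the description above.

```python
import torch

# Hypothetical cap on sequence length for the outlier pass (item 3 above).
MAX_OUTLIER_SEQ_LEN = 4096


def truncate_for_outliers(input_ids: torch.Tensor,
                          max_len: int = MAX_OUTLIER_SEQ_LEN) -> torch.Tensor:
    """Truncate a (batch, seq_len) token batch to at most max_len tokens
    before computing outlier statistics, so long inputs do not OOM."""
    if input_ids.shape[-1] > max_len:
        return input_ids[..., :max_len]
    return input_ids


# Usage: an 8k-token dummy batch is cut down to 4k; a short batch is untouched.
long_batch = torch.zeros(2, 8192, dtype=torch.long)
short_batch = torch.zeros(1, 100, dtype=torch.long)
truncated = truncate_for_outliers(long_batch)
```

The outlier statistics are per-channel aggregates, so computing them over a 4k prefix rather than the full sequence trades a little estimation accuracy for a bounded activation-memory footprint.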

@ZackZikaiXiao

Good job, the bugs in the causal mask shape are fixed.
