Test Gemma 270M and Gemma 3n #41

@supreme-gg-gg

This issue is about testing our current platform on these two new Gemma models, because most development validation happens on the 1B and 4B text and vision models. The final goal is to document these in the project docs as well! 😄

270M

This model is built for fine-tuning and task-specific applications: the right tool for the right job. It might become our most-recommended model for a lot of users!

Test this model with all of our methods (dataset modalities, dataset types, trainers, frameworks, backends, etc.) to make sure it always works and that fine-tuning stays blazing fast!

No changes should be required except adding this to the list of valid models to choose from.
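
A minimal sketch of what that change could look like, assuming the platform keeps an allow-list keyed by Hugging Face model ID. `ModelEntry`, `VALID_MODELS`, and `validate_model` are hypothetical names for illustration, not the project's actual code, and the model IDs and modality sets are assumptions to confirm against the model cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One selectable base model and the input modalities we train it on."""
    model_id: str
    modalities: frozenset


# Hypothetical allow-list; the real one may live elsewhere and be shaped differently.
VALID_MODELS = {
    entry.model_id: entry
    for entry in [
        ModelEntry("google/gemma-3-1b-it", frozenset({"text"})),
        ModelEntry("google/gemma-3-4b-it", frozenset({"text", "vision"})),
        # New entry: assumed to be text-only.
        ModelEntry("google/gemma-3-270m-it", frozenset({"text"})),
    ]
}


def validate_model(model_id: str, modality: str = "text") -> ModelEntry:
    """Return the entry for model_id, or raise if the model or modality is not allowed."""
    entry = VALID_MODELS.get(model_id)
    if entry is None:
        raise ValueError(f"Unsupported model: {model_id}")
    if modality not in entry.modalities:
        raise ValueError(f"{model_id} does not support modality: {modality}")
    return entry
```

With a structure like this, a text SFT job on 270M passes validation, while a vision job on 270M fails early instead of failing mid-training.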

Good candidate use case: instruction-style text datasets. SFT should definitely work. I'm not sure about preference tuning and reasoning, though; we need some research to back that up.

3n

3n has unique audio modality support, but unfortunately we do not support audio right now; we hope to bring it in very soon! For now we should still validate text + vision usage. Again, this should work OOTB because it uses the same training process as 1B and 4B.

Would be cool to test GRPO and DPO in addition to SFT for this one.
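
To keep track of what needs to be run, here is a rough sketch of the validation matrix implied above. The short model keys, the `build_matrix` helper, and the "required" / "exploratory" labels are hypothetical and only for illustration; it also assumes 270M is text-only and leaves out audio for 3n since we do not support it yet.

```python
from itertools import product

# Modalities we can validate today per model (assumption: 270M is text-only;
# 3n audio is intentionally left out because the platform does not support it yet).
MODELS = {
    "gemma-3-270m": {"text"},
    "gemma-3n": {"text", "vision"},
}
TRAINERS = ["sft", "dpo", "grpo"]
MODALITIES = ["text", "vision"]


def build_matrix():
    """Return (model, trainer, modality, priority) tuples for every run worth trying."""
    runs = []
    for (model, supported), trainer, modality in product(
        MODELS.items(), TRAINERS, MODALITIES
    ):
        if modality not in supported:
            continue  # skip combinations the model cannot handle
        # SFT is the must-pass case; DPO and GRPO are the "would be cool" cases.
        priority = "required" if trainer == "sft" else "exploratory"
        runs.append((model, trainer, modality, priority))
    return runs


if __name__ == "__main__":
    for run in build_matrix():
        print(run)
```

Each tuple could then be turned into one training job, with the results feeding the docs page for each model.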

Labels

docs (Improvements or additions to documentation), question (Further information is requested)
