The dataset block starts with:
LLaMA (Large Language Model Meta AI) is a family of large language models (LLMs), released by Meta AI starting in February 2023. For the first version of LLaMA, four model sizes were trained: 7, 13, 33 and 65 billion parameters.
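For context, here is a minimal sketch of how a text block like this could be prepared for causal-LM fine-tuning; the file name dataset.txt, the gpt2 base model and the tokenization settings are my assumptions, not taken from the post:

# Hypothetical dataset preparation (dataset.txt and gpt2 are assumptions)
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

raw = load_dataset("text", data_files={"train": "dataset.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_ds = raw["train"].map(tokenize, batched=True, remove_columns=["text"])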
per_device_train_batch_size = 12
{'train_loss': 0.25286429792642595, 'epoch': 100.0}
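The logs above look like the metrics of a Hugging Face Trainer run; a sketch of what such a setup could look like, where only per_device_train_batch_size changes between the three experiments (everything else here is an assumption):

from transformers import (
    AutoModelForCausalLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("gpt2")  # assumed base model

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=12,  # the only knob varied: 12 / 64 / 128
    num_train_epochs=100,            # matches 'epoch': 100.0 in the logs
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # tokenized dataset from the sketch above
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

metrics = trainer.train().metrics  # contains 'train_loss' and 'epoch'
print(metrics)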
The result is bad: the model didn't generate an answer based on the dataset and simply returned what it already knew.
What is LLAMA? LLAMA is a decentralized language model trained by a team of researcher at Meta AI....
per_device_train_batch_size = 64
{'train_loss': 0.24969906300306322, 'epoch': 100.0}
What is LLAMA? LLaMA (Large Language Model Meta AI) is a family of large language models (LLMs), released by Meta AI starting in February 2023. For the first version of LLaMA, four model sizes were trained: 7, 13, 33 and 65 billion parameters.
And then it continued with made-up content of its own:
LLaMA is trained on a diverse range of topics and styles, including but not limited to....
per_device_train_batch_size = 128
{'train_loss': 0.24585976153612138, 'epoch': 100.0}
It lost touch with reality:
What is LLAMA? LLAMA is a decentralized platform that enables users to interact with various decentralized applications (dApps)
And what were the generation parameters?
max_length=200, temperature=0.1
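These would typically be passed to generate() like this (do_sample=True is my assumption; without it, transformers ignores temperature and decodes greedily):

inputs = tokenizer("What is LLAMA?", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_length=200,
    temperature=0.1,
    do_sample=True,  # assumption: sampling must be on for temperature to matter
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))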
So this comparison is meaningless 🙂: with temperature-based sampling, the output is different on every run.
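If a fair comparison were the goal, two options come to mind (my suggestion, not something the post does): fix the sampling seed, or switch to greedy decoding:

from transformers import set_seed

set_seed(42)  # same seed -> same sampled output on repeated runs
# ...or drop randomness entirely:
output_ids = model.generate(**inputs, max_length=200, do_sample=False)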
On a repeat run, the batch-64 stage produced an unexpected result. Ah... generative models, damn them ))