This repository contains all of our data analysis and fine-tuning work, along with links to the deployed models and dataset resources.
- We installed the dependencies, loaded the dataset from Hugging Face, and then fine-tuned the Llama 3.2 1B model in the `Spark2.ipynb` notebook (a rough sketch of this step follows the list).
- We deployed the model on Hugging Face and also on Ollama.
- We created a new dataset that combines the English conversations from the first dataset with Arabic conversations, and shuffled the rows using Pandas in the `data_preparation.ipynb` notebook (see the sketch after the list).
- Soon we will train the model on the new dataset so that users can send both Arabic and English messages to our model.
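
The exact training code lives in `Spark2.ipynb`; the sketch below is only a minimal illustration of such a run using the Hugging Face `transformers` Trainer, assuming a dataset with a plain `text` column. The dataset ID, hyperparameters, and repository names are placeholders, not the values we used.

```python
# Minimal fine-tuning sketch (placeholder names, not the Spark2.ipynb code).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical dataset ID with a "text" column of rendered conversations.
dataset = load_dataset("your-username/english-conversations", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-3.2-1b-ft",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.push_to_hub("your-username/llama-3.2-1b-ft")  # placeholder repo ID
```

For the Ollama deployment, the usual route is to convert the fine-tuned weights to GGUF and register them with `ollama create <name> -f Modelfile`; the exact steps we followed may differ.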
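
The merge-and-shuffle step is implemented in `data_preparation.ipynb`; the sketch below shows the general idea with placeholder file names, using the Pandas shuffle mentioned above.

```python
# Merge-and-shuffle sketch (placeholder file names; see data_preparation.ipynb).
import pandas as pd

en = pd.read_csv("english_conversations.csv")  # original English conversations
ar = pd.read_csv("arabic_conversations.csv")   # newly added Arabic conversations

# Concatenate both sets, then shuffle the rows so the languages are interleaved.
merged = pd.concat([en, ar], ignore_index=True)
merged = merged.sample(frac=1, random_state=42).reset_index(drop=True)

merged.to_csv("ar_en_conversations.csv", index=False)
```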
- Ar-En dataset on Hugging Face: Dataset
- Fine-tuned model on Hugging Face: Model
- Model deployed on Ollama: Ollama
For more information about how we fine-tuned our model, we can provide a step-by-step tutorial; just contact me here or at my email address [email protected].