Training Instructions for Vishwamai Model

Prerequisites

Python Environment: Ensure you have Python 3.8 or later installed.
Virtual Environment: It is recommended to use a virtual environment to manage dependencies.
Git LFS: Ensure Git LFS is installed and initialized in the repository.
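
The three prerequisites above can be checked from Python before going further. This is only a sketch: the helper names (`python_ok`, `in_virtualenv`, `git_lfs_available`) are made up for illustration and are not part of the repository, and the Git LFS check assumes it was installed as a `git-lfs` executable on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def git_lfs_available():
    """Return True if a git-lfs executable is found on PATH (assumption)."""
    return shutil.which("git-lfs") is not None


if __name__ == "__main__":
    print("Python 3.8 or later:", python_ok())
    print("Virtual environment:", in_virtualenv())
    print("git-lfs on PATH:", git_lfs_available())
```

The virtual environment check only becomes true after the environment from the Setup section has been created and activated.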

Setup

Clone the Repository:

```
git clone https://github.com/VishwamAI/chat-agent.git
cd chat-agent
```

Initialize Git LFS:
```
git lfs install
git lfs pull
```

Create and Activate Virtual Environment:

```
python3 -m venv venv
source venv/bin/activate
```

Install Dependencies:
```
pip install -r requirements.txt
```

Dataset Preparation

Ensure the dataset is available in the datasets directory. The configs/config_for_9b.yaml file uses datasets/dev.json for both training and validation. Because the same file serves both roles, validation metrics will not reflect performance on held-out data.
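
A quick way to confirm the dataset file is present and readable before starting a long run is sketched below. The function name `check_dataset` is hypothetical. It only tests that the file exists and parses as a single JSON document. The schema that train_t5.py expects is not described here, so it is not checked, and a JSON Lines file would be reported as invalid by this sketch.

```python
import json
from pathlib import Path


def check_dataset(path="datasets/dev.json"):
    """Return (ok, message) after checking the file exists and parses as JSON."""
    file = Path(path)
    if not file.is_file():
        return False, f"{file} not found"
    try:
        with file.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except (json.JSONDecodeError, UnicodeDecodeError) as err:
        return False, f"{file} is not valid JSON: {err}"
    # Count top-level entries for lists and objects; scalars count as one item.
    size = len(data) if isinstance(data, (list, dict)) else 1
    return True, f"{file} parsed; {size} top-level item(s)"


if __name__ == "__main__":
    ok, message = check_dataset()
    print("OK:" if ok else "Problem:", message)
```

Run it from the repository root so that the relative path resolves to the datasets directory.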

Training the Model

Run the Training Script:

```
python scripts/train_t5.py --config configs/config_for_9b.yaml
```

Notes

The train_t5.py script is designed to train the Vishwamai model using the specified configuration file.
Ensure that the datasets/dev.json file is correctly formatted and available in the datasets directory.
The training process may take a significant amount of time, depending on the size of the dataset and the available computational resources.

Troubleshooting

If you hit missing-dependency errors (for example, a ModuleNotFoundError), make sure the virtual environment is activated and rerun pip install -r requirements.txt. If a package is still missing, check that it is listed in the requirements.txt file.
For any other issues, refer to the repository's README file or seek assistance from the repository maintainers.
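
For the missing-dependency case, the following sketch lists which packages named in requirements.txt are not installed in the active environment. The helpers `requirement_names` and `missing_packages` are illustrative names, not part of the repository. The parser is deliberately simple: it handles plain `name` and `name==version` style lines, and skips pip options and URL-based requirements. It does not check installed versions against the pinned ones.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract distribution names from simple requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blanks, pip options such as -r or -e, and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.is_file():
        names = requirement_names(req_file.read_text(encoding="utf-8").splitlines())
        print("Missing packages:", missing_packages(names) or "none")
    else:
        print("requirements.txt not found in the current directory")
```

Run it with the virtual environment activated. Otherwise it reports on whichever interpreter happens to execute it.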

This PR was written by Devin 👼
