📌 Official repository for our ICLR 2025 paper:
Can Textual Gradient Work in Federated Learning?
Minghui Chen, Ruinan Jin, Wenlong Deng, Yuanyuan Chen, Zhi Huang, Han Yu, Xiaoxiao Li
📄 [arXiv]
Recent studies highlight the promise of LLM-based prompt optimization, especially with TextGrad, which automates differentiation via texts and backpropagates textual feedback. This approach facilitates training in various real-world applications that do not support numerical gradient propagation or loss calculation. In this paper, we systematically explore the potential and challenges of incorporating textual gradient into Federated Learning (FL). Our contributions are fourfold. Firstly, we introduce a novel FL paradigm, Federated Textual Gradient (FedTextGrad), that allows clients to upload locally optimized prompts derived from textual gradients, while the server aggregates the received prompts. Unlike traditional FL frameworks, which are designed for numerical aggregation, FedTextGrad is specifically tailored for handling textual data, expanding the applicability of FL to a broader range of problems that lack well-defined numerical loss functions. Secondly, building on this design, we conduct extensive experiments to explore the feasibility of FedTextGrad. Our findings highlight the importance of properly tuning key factors (e.g., local steps) in FL training. Thirdly, we highlight a major challenge in FedTextGrad aggregation: retaining essential information from distributed prompt updates. Last but not least, in response to this issue, we improve the vanilla variant of FedTextGrad by providing actionable guidance to the LLM when summarizing client prompts by leveraging the Uniform Information Density principle. Through this principled study, we enable the adoption of textual gradients in FL for optimizing LLMs, identify important issues, and pinpoint future directions, thereby opening up a new research area that warrants further investigation.
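At a glance, each FedTextGrad round follows a FedAvg-style loop, except that clients optimize prompts with textual gradients and the server aggregates text by LLM summarization rather than numerical averaging. Below is a minimal sketch of one round; the helper names are hypothetical, and the actual logic lives in train_homo_fed.py and train_hetero_fed.py.

```python
# Minimal sketch of one FedTextGrad round; helper names are hypothetical
# (see train_homo_fed.py / train_hetero_fed.py for the real implementation).

def local_textgrad_step(prompt: str, client_data, llm) -> str:
    """Client side: refine the shared prompt with a few TextGrad steps
    (textual loss -> textual gradient -> prompt update) on local data."""
    ...

def summarize_prompts(prompts: list[str], llm) -> str:
    """Server side: aggregate text prompts by LLM summarization, guided by
    the Uniform Information Density principle to retain key information."""
    ...

def fedtextgrad_round(global_prompt: str, clients, llm) -> str:
    # Clients upload locally optimized prompts instead of numerical gradients.
    local_prompts = [local_textgrad_step(global_prompt, c, llm) for c in clients]
    # The server summarizes the received prompts into the next global prompt.
    return summarize_prompts(local_prompts, llm)
```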
📁 repo/
│-- 📜 README.md # Project documentation
│-- 📜 .gitignore # Ignored files
│-- 📜 requirements.txt # Required packages
│-- 📜 main.py # Main Python file
│-- 📜 train_centralized.py # Centralized training
│-- 📜 train_homo_fed.py # Homogeneous Federated Learning
│-- 📜 train_hetero_fed.py # Heterogeneous Federated Learning
│-- 📂 scripts/ # Scripts for running Python files
│ │-- 📜 run_centralized.sh # Main script for centralized training
│ │-- 📜 run_homo_fed.sh # Homogeneous FL script
│ │-- 📜 run_hetero_fed.sh # Heterogeneous FL script
│ │-- 📜 vllm_serve.sh # vLLM serving script
│ │-- 📜 sglang_serve.sh # SGLang serving script
│-- 📂 textgrad/ # TextGrad package
│-- 📂 utils/ # Utility functions for training
│-- 📂 logs/ # Results and logs
Clone the repository and install dependencies:
# Clone the repository
git clone https://github.com/ubc-tea/FedTextGrad
cd FedTextGrad
# Create a virtual environment (optional)
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
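As a quick sanity check that the environment resolves the bundled textgrad package (noted below), you can run the following from the repository root:

```python
# Sanity check: run from the repository root so Python resolves the
# bundled textgrad/ directory rather than a pip-installed copy.
import textgrad
print(textgrad.__file__)  # should point inside this repository
```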
- Use the bundled local textgrad package (no separate installation needed).
- BBH Word Counting (downloaded automatically)
- BBH Word Sorting (downloaded automatically)
- GSM8K (requires `pip install datasets`; loaded from Hugging Face)
- Specify `OPENAI_API_KEY`.
- Specify `BASE_URL` for third-party providers (a configuration sketch follows below).
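For illustration, the same keys can be set from Python before launching a run; the values below are placeholders, and exporting the variables in your shell works equally well.

```python
import os

# Placeholder values: OPENAI_API_KEY is required for OpenAI-backed engines;
# BASE_URL is only needed when routing to a third-party provider.
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["BASE_URL"] = "https://api.example.com/v1"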
- Automatic installation of Ollama:
curl -fsSL https://ollama.com/install.sh | sh
Alternatively, download the pre-built Ollama binaries from the Ollama GitHub releases page and start the Ollama server in the background:
tar -xzvf ollama-<version>.tgz
chmod +x ./bin/ollama
./bin/ollama serve &
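Whichever install route you take, you can verify the server is reachable before training. A minimal check, assuming Ollama's default port 11434 and its OpenAI-compatible `/v1/models` route (both assumptions; adjust if you changed the serve URL):

```python
import urllib.request

# Assumes Ollama's default address and its OpenAI-compatible API;
# adjust host/port if you customized the serve URL.
with urllib.request.urlopen("http://localhost:11434/v1/models") as resp:
    print(resp.status, resp.read().decode()[:200])
```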
- Improve Ollama Speed – Techniques to optimize Ollama’s performance.
- Specify Ollama GPU – Guide on selecting a specific GPU for Ollama.
- Set Ollama Serve URL – Configure a custom serve URL for Ollama.
- Install vLLM:
pip install vllm
- Start vLLM as an API server:
sh scripts/vllm_serve.sh
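The vLLM server speaks the OpenAI chat-completions protocol, so a quick smoke test is possible with the openai client. A sketch assuming vLLM's default port 8000 and the Llama-3 Instruct model used elsewhere in this README (check scripts/vllm_serve.sh for the actual values):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible server (default port 8000; the model
# name below is illustrative, see scripts/vllm_serve.sh for actual values).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```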
- vLLM Tutorial – Guide to deploying vLLM.
- Use Compute Capability 8.0+ (e.g., A100) for better performance.
- Serving on Tesla V100-SXM2 – Use `--dtype=half --gpu-memory-utilization=0.9 --max-model-len=101728`.
- Offline Mode – Download models manually and set `TRANSFORMERS_OFFLINE=1` (or `HF_HUB_OFFLINE=1`).
- Serving with vLLM – Use the Instruct version of the LLM.
- Install SGLang:
pip install --upgrade pip
pip install "sglang[all]"
- Install FlashInfer CUDA kernels:
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
- Start SGLang Server:
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --port 30000
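Like the vLLM server, SGLang's launch_server exposes an OpenAI-compatible endpoint, here on port 30000 as configured above; a minimal smoke test:

```python
from openai import OpenAI

# SGLang's launch_server is likewise OpenAI-compatible; port 30000
# matches the launch command above.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Sort these words: banana apple cherry"}],
)
print(resp.choices[0].message.content)
```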
- If occasional format misalignment occurs, try a more advanced or larger LLM.
- Start local LLM server:
sh scripts/<LLM_API_tool>_serve.sh
- Run the corresponding module:
sh scripts/run_<module_name>.sh
Examples:
Using Ollama API:
OLLAMA_BASE_URL='http://localhost:11434/v1' OLLAMA_API_KEY='xxxxxxxxxxxxxxxx' python main.py --evaluation_engine ollama-llama3.1 --test_engine ollama-llama3.1 --task BBH_object_counting --module train_centralized
- When integrating third-party LLM API services such as OpenRouter or Together AI, configure the URL and key as in the Ollama example above, and ensure that the model name is prefixed with "ollama-"; for example:
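(Values below are illustrative; the OpenRouter URL and model name are assumptions, so consult your provider's documentation.)
OLLAMA_BASE_URL='https://openrouter.ai/api/v1' OLLAMA_API_KEY='<your_key>' python main.py --evaluation_engine ollama-<provider_model> --test_engine ollama-<provider_model> --task BBH_object_counting --module train_centralized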
Results can be found in the `logs/` directory. Additionally, you can configure comet_ml for experiment logging.
If you find our work useful and relevant, please cite:
@inproceedings{chencan,
  title={Can Textual Gradient Work in Federated Learning?},
  author={Chen, Minghui and Jin, Ruinan and Deng, Wenlong and Chen, Yuanyuan and Huang, Zhi and Yu, Han and Li, Xiaoxiao},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}