v0.3 updates
ShengdingHu committed Oct 14, 2022
1 parent cf9f30f commit 9587a5e
Showing 41 changed files with 803 additions and 686 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -56,3 +56,4 @@ t.sh


unittest/outputs/
unittest/tmp/
17 changes: 8 additions & 9 deletions README.md
@@ -26,17 +26,17 @@

OpenDelta is a toolkit for parameter-efficient tuning methods (we dub it *delta tuning*), by which users can flexibly assign (or add) a small amount of parameters to update while keeping most parameters frozen. By using OpenDelta, users can easily implement prefix-tuning, adapters, LoRA, or any other type of delta tuning with preferred PTMs.
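
A minimal sketch of this workflow (the backbone checkpoint and the `exclude` names are illustrative assumptions, not a prescribed setup):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import LoraModel

# Load a backbone PTM, then attach LoRA modules at the default positions.
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = LoraModel(backbone_model=model)

# Freeze everything except the delta parameters (and the task head), then inspect the result.
delta_model.freeze_module(exclude=["deltas", "classification_head"])
delta_model.log()
```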

- Our repo is tested on Python 3.8 and PyTorch 1.9.0. Lower version may also be supported.
- Our repo is tested on Python 3.8 and PyTorch 1.9.0. Lower versions may also be supported.

- **A demo of using OpenDelta to modify the PLM (e.g., BART).**
![How PLM changes using Delta-tuning](docs/source/imgs/demo.gif)

## News
- 2022.10.10 We merge new version into main. Key changes can be seen in [Update log](#updata_log)
- 2022.03.24 We notice several bugs in Soft Prompt Tuning and Prefix Tuning, mainly due to their need to customize attention ids, token_type_ids, we are fixing it! Currently, please use the other methods since they are stabler and better in performance.
- 2022.03.20 Add a [colab example](https://colab.research.google.com/drive/1uAhgAdc8Qr42UKYDlgUv0f7W1-gAFwGo?usp=sharing) to illustrate efficient training and space-saving multitask-serving.
- 2022.03.20 A new pip version released.
- 2022.02.16 Support [regular expression](https://opendelta.readthedocs.io/en/latest/notes/namebasedaddr.html#regexexpr) in named-based addressing.
- **2022.10.14** Release v0.3.0. We make the usage of the default configuration of each delta tuning method (i.e., the positions where they are attached) more friendly! If a custom model has one of our supported models as a submodule inside, the default configuration is also available. Other key changes can be seen in the [Update Log](https://opendelta.readthedocs.io/en/latest/notes/update.html#version-0-3-0)
- **2022.03.24** We noticed several bugs in Soft Prompt Tuning and Prefix Tuning, mainly due to their need to customize attention ids and token_type_ids; we are fixing them! Currently, please use the other methods, since they are more stable and perform better.
- **2022.03.20** Add a [colab example](https://colab.research.google.com/drive/1uAhgAdc8Qr42UKYDlgUv0f7W1-gAFwGo?usp=sharing) to illustrate efficient training and space-saving multitask-serving.
- **2022.03.20** A new pip version released.
- **2022.02.16** Support [regular expression](https://opendelta.readthedocs.io/en/latest/notes/namebasedaddr.html#regexexpr) in named-based addressing.

## Installation
create a virtualenv (optional)
@@ -74,7 +74,7 @@ python setup.py develop
```

#### Tips
- If you want to use mirror for installing the packages, please change the `index_url` in [setup.cfg](set.cfg)
- If you want to use mirror for installing the packages, please change the `index_url` in [setup.cfg](setup.cfg)

- If you encounter a network error when using setup.py, please first install the dependencies via
```shell
@@ -115,7 +115,6 @@ used models that OpenDelta are sure to support.
| CTRL |||||||| | |


## Update Log

### version 0.3.0


8 changes: 5 additions & 3 deletions docs/source/conf.py
@@ -19,7 +19,9 @@
import sphinx_rtd_theme
import doctest
import opendelta
import opendelta.delta_models




# -- Project information -----------------------------------------------------

@@ -29,8 +31,8 @@


# The full version, including alpha/beta/rc tags
release = '0.1.1'
version = "0.1.1"
release = '0.3.0'
version = "0.3.0"

html_theme = 'sphinx_rtd_theme'
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
10 changes: 8 additions & 2 deletions docs/source/index.md
@@ -1,7 +1,7 @@
OpenDelta's documentation!
=====================================

OpenDelta is a **Plug-and-play** Library of the parameter-efficient fine-tuning ([delta-tuning](WhatisDelta)) technology for pre-trained models.
[OpenDelta](https://github.com/thunlp/OpenDelta/) is a **Plug-and-play** Library of the parameter-efficient fine-tuning ([delta-tuning](WhatisDelta)) technology for pre-trained models.


## Essential Advantages:
@@ -35,12 +35,18 @@ OpenDelta is a **Plug-and-play** Library of the parameter-efficient fine-tuning
notes/pluginunplug.md
notes/acceleration.md
notes/explored_config.md
.. toctree::
:maxdepth: 1
:caption: Information
notes/citation.md
notes/update.md
notes/faq.md
.. toctree::
:maxdepth: 2
:caption: Package Reference
:caption: Documentation
modules/base
modules/deltas
11 changes: 10 additions & 1 deletion docs/source/notes/citation.md
@@ -1,3 +1,12 @@
# Citation

<img src="../imgs/todo-icon.jpeg" height="30px"> We are working on a technical report.
If you find our repo useful, please cite the following paper.

```
@article{ding2022delta,
title={Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models},
author={Ding, Ning and Qin, Yujia and Yang, Guang and Wei, Fuchao and Yang, Zonghan and Su, Yusheng and Hu, Shengding and Chen, Yulin and Chan, Chi-Min and Chen, Weize and others},
journal={arXiv preprint arXiv:2203.06904},
year={2022}
}
```
9 changes: 4 additions & 5 deletions docs/source/notes/composition.md
@@ -1,10 +1,9 @@
(composition)=
# Composition of delta models

With OpenDelta, you can perform composition of different delta models.


### Add different deltas to the backbone
## Add different deltas to the backbone

```
from transformers import AutoModelForSequenceClassification
@@ -18,14 +17,14 @@ delta_model.log()
```{figure} ../imgs/composition_of_delta.png
---
width: 600px
name: defaultmodification
name: composition_of_delta
---
```
````



### Even add multiple delta to the same layer
## Even add multiple deltas to the same layer

```
from transformers import AutoModelForSequenceClassification
@@ -40,7 +39,7 @@ delta_model.log()
```{figure} ../imgs/multiple_to_one_layer.png
---
width: 600px
name: defaultmodification
name: multiple_to_one_layer
---
```
````
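
For reference, a minimal sketch of combining an adapter and a LoRA module on one backbone (the `modified_modules` values here are illustrative, not the documented defaults):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel, LoraModel

model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

# Each delta model attaches its own modules to the shared backbone.
adapter_model = AdapterModel(backbone_model=model, modified_modules=["output"])
lora_model = LoraModel(backbone_model=model, modified_modules=["query", "value"])

# Visualize the combined result from either handle.
lora_model.log()
```
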
10 changes: 3 additions & 7 deletions docs/source/notes/explored_config.md
@@ -1,11 +1,7 @@
(favoredconfiguration)=
# Favored Configuration

<img src="../imgs/todo-icon.jpeg" height="30px"> We will add the commonly used configuration of delta models HERE in future.
Generally, the default configurations are already good enough. If you want to squeeze the size of delta models further, you can refer to the following papers.

E.g.
- the modified_modules (position of delta),
- hyperparameter that are the most efficient
- the favored composition between delta models

Currenlty, use the default setting, explore it by yourself, or refer to existing papers' configuration!
- [AdapterDrop: On the Efficiency of Adapters in Transformers](https://arxiv.org/abs/2010.11918)
- [Sparse Structure Search for Parameter-Efficient Tuning(Delta Tuning)](https://arxiv.org/abs/2206.07382)
2 changes: 1 addition & 1 deletion docs/source/notes/faq.md
@@ -1,3 +1,3 @@
# FAQ

1. We haven't provide common structure mapping for this backbone model...
1.
2 changes: 1 addition & 1 deletion docs/source/notes/keyfeature.md
@@ -38,7 +38,7 @@ We use three key functions to achieve the modifications to the backbone model ou
- **parallel insertion**

Adapters can also be used in a parallel fashion (see [Paper](https://arxiv.org/abs/2110.04366)).
For these methods, use [insert_parallel_module](opendelta.basemodel.DeltaBase.insert_parrellel_module) interface.
For these methods, use [insert_parallel_module](opendelta.basemodel.DeltaBase.insert_parallel_module) interface.


:::{admonition} Doc-preserving Insertion
2 changes: 0 additions & 2 deletions docs/source/notes/knownissue.md

This file was deleted.

12 changes: 6 additions & 6 deletions docs/source/notes/namebasedaddr.md
@@ -1,4 +1,4 @@
(namebasedaddr)=

# Name-based Addressing

Name-based addressing is what sets OpenDelta apart from other packages and makes it applicable to a broader range of models (even emerging ones).
@@ -52,7 +52,7 @@ In this case, string `"name_b.0.name_a"` will be the name to address the submodu

Thus, when applying a delta model to this toy net:

```
```python
from opendelta import AdapterModel
AdapterModel(backbone_model=root, modified_modules=['name_b.0.name_a'])
Visualization(root).structure_graph()
@@ -67,7 +67,7 @@
```
````


(targetmodules)=
## Target modules.

For different delta methods, the operation for the modification target is different.
@@ -88,7 +88,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
1. **End-matching** Rules.

OpenDelta will take every module that
**ends with** the provided name suffix as the modification [target module](target_module).
**ends with** the provided name suffix as the modification [target module](targetmodules).
:::{admonition} Example
:class: tip
Taking DistilBert with a classifier on top as an example:
@@ -115,7 +115,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
:::{admonition} Regex in Json Configs
:class: warning
In json, you should write `"\\."` instead of `"\."` for a real dot due to json parsing rules. That is
```json
```
{
...
"modified_modules": ['[r][0-5]\\.attention'],
@@ -138,7 +138,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
delta_model = LoraModel(backbone_model=model, interactive_modify=True)
```
By setting `interactive_modify`, a web server will be opened on localhost, and the link will be printed in the terminal.
By setting `interactive_modify`, a web server will be opened on localhost, and the link will be printed in the terminal, e.g.,
```
http://0.0.0.0:8888/
12 changes: 6 additions & 6 deletions docs/source/notes/pluginunplug.md
@@ -19,7 +19,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug1.png
---
width: 800px
name: defaultmodification
name: plugunplug1
---
```
````
@@ -33,7 +33,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug2.png
---
width: 800px
name: defaultmodification
name: plugunplug2
---
```
````
@@ -48,7 +48,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug3.png
---
width: 800px
name: defaultmodification
name: plugunplug3
---
```
````
@@ -67,7 +67,7 @@ delta_model2.log()
```{figure} ../imgs/plugunplug4.png
---
width: 800px
name: defaultmodification
name: plugunplug4
---
```
````
@@ -81,7 +81,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug5.png
---
width: 800px
name: defaultmodification
name: plugunplug5
---
```
````
@@ -96,7 +96,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug6.png
---
width: 800px
name: defaultmodification
name: plugunplug6
---
```
````
3 changes: 1 addition & 2 deletions docs/source/notes/saveload.md
@@ -1,4 +1,3 @@
(saveload)=
# Save and Share the Delta

## Space efficient saving without changing the code.
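
A minimal sketch of the intended round trip, assuming OpenDelta's `save_finetuned`/`from_finetuned` interface (treat the exact signatures and the checkpoint path as assumptions):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel, AutoDeltaModel

# Attach a delta model and (after training) save only the delta parameters.
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = AdapterModel(backbone_model=model)
delta_model.save_finetuned("./delta_ckpt")  # backbone weights are not duplicated here

# Later: reload the same backbone and re-attach the saved delta.
backbone = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = AutoDeltaModel.from_finetuned("./delta_ckpt", backbone_model=backbone)
```
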
@@ -95,4 +94,4 @@ If you are satisfied with your checkpoint, do not forget to share your model to

## Save & Load for Composition of Delta

<img src="../imgs/todo-icon.jpeg" height="30px"> Currently save & load method is not suitable for [composition of delta model](compositon). Please wait for future releases.
<img src="../imgs/todo-icon.jpeg" height="30px"> Currently save & load method is not suitable for [composition](composition) of delta model. Please wait for future releases.
4 changes: 2 additions & 2 deletions docs/source/notes/unifyname.md
@@ -1,4 +1,4 @@
(unifyname)=
(commonstructure)=

# Common Structure Mapping

@@ -41,7 +41,7 @@ Visualize bert-base using a common structure name: The submodules that are not c

```{figure} ../imgs/commonstructure_vis.png
:width: 600px
:name: transformers_structure
:name: commonstructure_vis
```

(mappingexample)=
21 changes: 21 additions & 0 deletions docs/source/notes/update.md
@@ -0,0 +1,21 @@
# Update Logs and Known Issues


## Version 0.3.0
### Updates:
- Add this changelog for a granular record of updates.
- The default configuration of delta models can be applied to more wrapped models.
- There is less need to configure `modified_modules` for wrapped models like [BertForSequenceClassification](https://huggingface.co/docs/transformers/main/en/model_doc/bert#transformers.BertForSequenceClassification) or even [OpenMatch.DRModel](https://github.com/OpenMatch/OpenMatch/blob/master/src/openmatch/modeling/dense_retrieval_model.py#L37), as long as it contains a model for which we support a default configuration (see the sketch after this list). **Note that if you customize `modified_modules` yourself, most PyTorch models are supported.**
- LoRA and BitFit models no longer need pseudo data to instantiate the model.
- BitFit models can now support [Conv1D](https://huggingface.co/docs/transformers/v4.23.1/en/internal/modeling_utils#transformers.Conv1D) using the default configuration.
- Improve type hints for AutoDeltaModel.
- Fix bugs in documentation.
- Fix small bugs when saving a model without a config attribute.
- Make the default modified modules of adapter-like methods more accurate: attach the adapter-like modules after the output of the attention layer and the second feed-forward layer, both before the layernorm layers.
- A simple unit test folder containing development-time tests has been added for interested users.
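
For illustration, a sketch of the wrapped-model case mentioned above (the checkpoint name is illustrative; the point is that no `modified_modules` needs to be passed):

```python
from transformers import BertForSequenceClassification
from opendelta import LoraModel

# BertForSequenceClassification wraps a supported BertModel, so the default
# configuration of the delta method can be resolved automatically.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
delta_model = LoraModel(backbone_model=model)   # no modified_modules needed
delta_model.freeze_module(exclude=["deltas", "classifier"])
delta_model.log()
```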


### Known Issues
- SoftPrompt is still not supported for wrapped models if the model has no attribute `get_input_embeddings`.
- Prefix Tuning is still limited to T5, GPT2, Bart, Bert, Roberta.

4 changes: 2 additions & 2 deletions docs/source/notes/usage.md
Expand Up @@ -12,7 +12,7 @@ model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
## STEP 2: Add delta modules
We provide two alternatives to add the delta modules.
### 2.1 Modification based on visualization
Suppose we want to take the feedforward layer of each block as our [modification target module](target_module),
Suppose we want to take the feedforward layer of each block as our [modification target module](targetmodules),
We should first find out the name of the feedforward layer in the BART model by visualization. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For more about visualization, see [Visualization](visualization).*

```python
@@ -48,7 +48,7 @@ delta_model.log() # This will visualize the backbone after modification and othe
### 2.2 Use the default modification.
We also provide the default modifications of each delta method for some commonly used PTMs (e.g., BERT, RoBERTa, DistilBERT, T5, GPT2), so users don't need to specify the submodules to modify.

The default modification is achieved by mapping the name of a submodule to its name on a common transformer structure. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For details about the common structure mapping, see [Common Structure Mapping](unifyname)*
The default modification is achieved by mapping the name of a submodule to its name on a common transformer structure. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For details about the common structure mapping, see [Common Structure Mapping](commonstructure)*
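
As a rough sketch, relying on the default modification for the BART backbone from STEP 1 looks like this (the `exclude` names are illustrative assumptions):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel

model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")  # backbone from STEP 1

# No modified_modules argument: the common structure mapping supplies BART's default positions.
delta_model = AdapterModel(backbone_model=model)
delta_model.freeze_module(exclude=["deltas", "classification_head"])
delta_model.log()
```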



1 change: 0 additions & 1 deletion docs/source/notes/visualization.md
@@ -1,4 +1,3 @@
(visualization)=
# Visualize the Parameters

When OpenDelta makes modifications to a pretrained model (PTM), it is beneficial to know what your PTM looks like, especially the location of the parameters.
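
A minimal sketch of printing the module tree before deciding what to modify (the checkpoint name is illustrative):

```python
from transformers import AutoModel
from opendelta import Visualization

model = AutoModel.from_pretrained("bert-base-uncased")
# Print the named-module tree so you can pick the submodule names for modified_modules.
Visualization(model).structure_graph()
```
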
2 changes: 1 addition & 1 deletion opendelta/__init__.py
@@ -1,5 +1,5 @@

__version__ = "0.2.4"
__version__ = "0.3.0"

class GlobalSetting:
def __init__(self):