v0.3 updates
ShengdingHu committed Oct 14, 2022
1 parent cf9f30f commit 9587a5e
Showing 41 changed files with 803 additions and 686 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -56,3 +56,4 @@ t.sh


unittest/outputs/
unittest/tmp/
17 changes: 8 additions & 9 deletions README.md
@@ -26,17 +26,17 @@

OpenDelta is a toolkit for parameter-efficient tuning methods (we dub it *delta tuning*), by which users can flexibly assign (or add) a small amount of parameters to update while keeping most parameters frozen. By using OpenDelta, users can easily implement prefix-tuning, adapters, LoRA, or any other type of delta tuning with preferred PTMs.
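
A minimal sketch of this workflow (the backbone checkpoint and the `exclude` names are illustrative assumptions, not a prescribed setup):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import LoraModel

# Load a backbone PTM, then attach LoRA modules at the default positions.
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = LoraModel(backbone_model=model)

# Freeze everything except the delta parameters (and the task head), then inspect the result.
delta_model.freeze_module(exclude=["deltas", "classification_head"])
delta_model.log()
```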

- Our repo is tested on Python 3.8 and PyTorch 1.9.0. Lower version may also be supported.
- Our repo is tested on Python 3.8 and PyTorch 1.9.0. Lower versions may also be supported.

- **A demo of using OpenDelta to modify the PLM (e.g., BART).**
![How PLM changes using Delta-tuning](docs/source/imgs/demo.gif)

## News
- 2022.10.10 We merge new version into main. Key changes can be seen in [Update log](#updata_log)
- 2022.03.24 We notice several bugs in Soft Prompt Tuning and Prefix Tuning, mainly due to their need to customize attention ids, token_type_ids, we are fixing it! Currently, please use the other methods since they are stabler and better in performance.
- 2022.03.20 Add a [colab example](https://colab.research.google.com/drive/1uAhgAdc8Qr42UKYDlgUv0f7W1-gAFwGo?usp=sharing) to illustrate efficient training and space-saving multitask-serving.
- 2022.03.20 A new pip version released.
- 2022.02.16 Support [regular expression](https://opendelta.readthedocs.io/en/latest/notes/namebasedaddr.html#regexexpr) in named-based addressing.
- **2022.10.14** Release v0.3.0. We make the usage of the default configuration of each delta tuning method (i.e., the positions where they are attached) more friendly! If a custom model has one of our supported models as a submodule inside, the default configuration is also available. Other key changes can be seen in the [Update Log](https://opendelta.readthedocs.io/en/latest/notes/update.html#version-0-3-0)
- **2022.03.24** We noticed several bugs in Soft Prompt Tuning and Prefix Tuning, mainly due to their need to customize attention ids and token_type_ids; we are fixing them! Currently, please use the other methods, since they are more stable and perform better.
- **2022.03.20** Add a [colab example](https://colab.research.google.com/drive/1uAhgAdc8Qr42UKYDlgUv0f7W1-gAFwGo?usp=sharing) to illustrate efficient training and space-saving multitask-serving.
- **2022.03.20** A new pip version released.
- **2022.02.16** Support [regular expression](https://opendelta.readthedocs.io/en/latest/notes/namebasedaddr.html#regexexpr) in named-based addressing.

## Installation
create a virtualenv (optional)
@@ -74,7 +74,7 @@ python setup.py develop
```

#### Tips
- If you want to use mirror for installing the packages, please change the `index_url` in [setup.cfg](set.cfg)
- If you want to use mirror for installing the packages, please change the `index_url` in [setup.cfg](setup.cfg)

- If you encounter a network error when using setup.py, please first install the dependencies via
```shell
@@ -115,7 +115,6 @@ used models that OpenDelta are sure to support.
| CTRL |||||||| | |


## Update Log

### version 0.3.0


8 changes: 5 additions & 3 deletions docs/source/conf.py
@@ -19,7 +19,9 @@
import sphinx_rtd_theme
import doctest
import opendelta
import opendelta.delta_models




# -- Project information -----------------------------------------------------

@@ -29,8 +31,8 @@


# The full version, including alpha/beta/rc tags
release = '0.1.1'
version = "0.1.1"
release = '0.3.0'
version = "0.3.0"

html_theme = 'sphinx_rtd_theme'
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
10 changes: 8 additions & 2 deletions docs/source/index.md
@@ -1,7 +1,7 @@
OpenDelta's documentation!
=====================================

OpenDelta is a **Plug-and-play** Library of the parameter-efficient fine-tuning ([delta-tuning](WhatisDelta)) technology for pre-trained models.
[OpenDelta](https://github.com/thunlp/OpenDelta/) is a **Plug-and-play** Library of the parameter-efficient fine-tuning ([delta-tuning](WhatisDelta)) technology for pre-trained models.


## Essential Advantages:
@@ -35,12 +35,18 @@ OpenDelta is a **Plug-and-play** Library of the parameter-efficient fine-tuning
notes/pluginunplug.md
notes/acceleration.md
notes/explored_config.md
.. toctree::
:maxdepth: 1
:caption: Information
notes/citation.md
notes/update.md
notes/faq.md
.. toctree::
:maxdepth: 2
:caption: Package Reference
:caption: Documentation
modules/base
modules/deltas
11 changes: 10 additions & 1 deletion docs/source/notes/citation.md
@@ -1,3 +1,12 @@
# Citation

<img src="../imgs/todo-icon.jpeg" height="30px"> We are working on a technical report.
If you find our repo useful, please cite the following paper.

```
@article{ding2022delta,
title={Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models},
author={Ding, Ning and Qin, Yujia and Yang, Guang and Wei, Fuchao and Yang, Zonghan and Su, Yusheng and Hu, Shengding and Chen, Yulin and Chan, Chi-Min and Chen, Weize and others},
journal={arXiv preprint arXiv:2203.06904},
year={2022}
}
```
9 changes: 4 additions & 5 deletions docs/source/notes/composition.md
@@ -1,10 +1,9 @@
(composition)=
# Composition of delta models

With OpenDelta, you can perform composition of different delta models.


### Add different deltas to the backbone
## Add different deltas to the backbone

```
from transformers import AutoModelForSequenceClassification
@@ -18,14 +17,14 @@ delta_model.log()
```{figure} ../imgs/composition_of_delta.png
---
width: 600px
name: defaultmodification
name: composition_of_delta
---
```
````



### Even add multiple delta to the same layer
## Even add multiple deltas to the same layer

```
from transformers import AutoModelForSequenceClassification
@@ -40,7 +39,7 @@ delta_model.log()
```{figure} ../imgs/multiple_to_one_layer.png
---
width: 600px
name: defaultmodification
name: multiple_to_one_layer
---
```
````
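
For reference, a minimal sketch of combining an adapter and a LoRA module on one backbone (the `modified_modules` values here are illustrative, not the documented defaults):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel, LoraModel

model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

# Each delta model attaches its own modules to the shared backbone.
adapter_model = AdapterModel(backbone_model=model, modified_modules=["output"])
lora_model = LoraModel(backbone_model=model, modified_modules=["query", "value"])

# Visualize the combined result from either handle.
lora_model.log()
```
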
10 changes: 3 additions & 7 deletions docs/source/notes/explored_config.md
@@ -1,11 +1,7 @@
(favoredconfiguration)=
# Favored Configuration

<img src="../imgs/todo-icon.jpeg" height="30px"> We will add the commonly used configuration of delta models HERE in future.
Generally, the default configurations are already good enough. If you want to squeeze the size of delta models further, you can refer to the following papers.

E.g.
- the modified_modules (position of delta),
- hyperparameter that are the most efficient
- the favored composition between delta models

Currenlty, use the default setting, explore it by yourself, or refer to existing papers' configuration!
- [AdapterDrop: On the Efficiency of Adapters in Transformers](https://arxiv.org/abs/2010.11918)
- [Sparse Structure Search for Parameter-Efficient Tuning(Delta Tuning)](https://arxiv.org/abs/2206.07382)
2 changes: 1 addition & 1 deletion docs/source/notes/faq.md
@@ -1,3 +1,3 @@
# FAQ

1. We haven't provide common structure mapping for this backbone model...
1.
2 changes: 1 addition & 1 deletion docs/source/notes/keyfeature.md
@@ -38,7 +38,7 @@ We use three key functions to achieve the modifications to the backbone model ou
- **parallel insertion**

Adapters can also be used in a parallel fashion (see [Paper](https://arxiv.org/abs/2110.04366)).
For these methods, use [insert_parallel_module](opendelta.basemodel.DeltaBase.insert_parrellel_module) interface.
For these methods, use [insert_parallel_module](opendelta.basemodel.DeltaBase.insert_parallel_module) interface.


:::{admonition} Doc-preserving Insertion
2 changes: 0 additions & 2 deletions docs/source/notes/knownissue.md

This file was deleted.

12 changes: 6 additions & 6 deletions docs/source/notes/namebasedaddr.md
@@ -1,4 +1,4 @@
(namebasedaddr)=

# Name-based Addressing

Name-based addressing is what sets OpenDelta apart from other packages and makes it applicable to a broader range of models (even emerging ones).
@@ -52,7 +52,7 @@ In this case, string `"name_b.0.name_a"` will be the name to address the submodu

Thus, when applying a delta model to this toy net:

```
```python
from opendelta import AdapterModel
AdapterModel(backbone_model=root, modified_modules=['name_b.0.name_a'])
Visualization(root).structure_graph()
@@ -67,7 +67,7 @@
```
````


(targetmodules)=
## Target modules.

For different delta methods, the operation for the modification target is different.
@@ -88,7 +88,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
1. **End-matching** Rules.

OpenDelta will take every module that
**ends with** the provided name suffix as the modification [target module](target_module).
**ends with** the provided name suffix as the modification [target module](targetmodules).
:::{admonition} Example
:class: tip
Taking DistilBert with a classifier on top as an example:
@@ -115,7 +115,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
:::{admonition} Regex in Json Configs
:class: warning
In json, you should write `"\\."` instead of `"\."` for a real dot due to json parsing rules. That is
```json
```
{
...
"modified_modules": ['[r][0-5]\\.attention'],
@@ -138,7 +138,7 @@ Handcrafting the full names of submodules can be frustrating. We made some simpl
delta_model = LoraModel(backbone_model=model, interactive_modify=True)
```
By setting `interactive_modify`, a web server will be opened on localhost, and the link will be printed in the terminal.
By setting `interactive_modify`, a web server will be opened on localhost, and the link will be printed in the terminal, e.g.,
```
http://0.0.0.0:8888/
12 changes: 6 additions & 6 deletions docs/source/notes/pluginunplug.md
@@ -19,7 +19,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug1.png
---
width: 800px
name: defaultmodification
name: plugunplug1
---
```
````
@@ -33,7 +33,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug2.png
---
width: 800px
name: defaultmodification
name: plugunplug2
---
```
````
@@ -48,7 +48,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug3.png
---
width: 800px
name: defaultmodification
name: plugunplug3
---
```
````
@@ -67,7 +67,7 @@ delta_model2.log()
```{figure} ../imgs/plugunplug4.png
---
width: 800px
name: defaultmodification
name: plugunplug4
---
```
````
@@ -81,7 +81,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug5.png
---
width: 800px
name: defaultmodification
name: plugunplug5
---
```
````
@@ -96,7 +96,7 @@ delta_model.log()
```{figure} ../imgs/plugunplug6.png
---
width: 800px
name: defaultmodification
name: plugunplug6
---
```
````
3 changes: 1 addition & 2 deletions docs/source/notes/saveload.md
@@ -1,4 +1,3 @@
(saveload)=
# Save and Share the Delta

## Space efficient saving without changing the code.
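
A minimal sketch of the intended round trip, assuming OpenDelta's `save_finetuned`/`from_finetuned` interface (treat the exact signatures and the checkpoint path as assumptions):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel, AutoDeltaModel

# Attach a delta model and (after training) save only the delta parameters.
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = AdapterModel(backbone_model=model)
delta_model.save_finetuned("./delta_ckpt")  # backbone weights are not duplicated here

# Later: reload the same backbone and re-attach the saved delta.
backbone = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
delta_model = AutoDeltaModel.from_finetuned("./delta_ckpt", backbone_model=backbone)
```
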
@@ -95,4 +94,4 @@ If you are satisfied with your checkpoint, do not forget to share your model to

## Save & Load for Composition of Delta

<img src="../imgs/todo-icon.jpeg" height="30px"> Currently save & load method is not suitable for [composition of delta model](compositon). Please wait for future releases.
<img src="../imgs/todo-icon.jpeg" height="30px"> Currently save & load method is not suitable for [composition](composition) of delta model. Please wait for future releases.
4 changes: 2 additions & 2 deletions docs/source/notes/unifyname.md
@@ -1,4 +1,4 @@
(unifyname)=
(commonstructure)=

# Common Structure Mapping

@@ -41,7 +41,7 @@ Visualize bert-base using a common structure name: The submodules that are not c

```{figure} ../imgs/commonstructure_vis.png
:width: 600px
:name: transformers_structure
:name: commonstructure_vis
```

(mappingexample)=
21 changes: 21 additions & 0 deletions docs/source/notes/update.md
@@ -0,0 +1,21 @@
# Update Logs and Known Issues


## Version 0.3.0
### Updates:
- Add this changelog for a granular record of updates.
- The default configuration of delta models can be applied to more wrapped models.
- There is less need to configure `modified_modules` for wrapped models like [BertForSequenceClassification](https://huggingface.co/docs/transformers/main/en/model_doc/bert#transformers.BertForSequenceClassification) or even [OpenMatch.DRModel](https://github.com/OpenMatch/OpenMatch/blob/master/src/openmatch/modeling/dense_retrieval_model.py#L37), as long as it contains a model for which we support a default configuration (see the sketch after this list). **Note that if you customize `modified_modules` yourself, most PyTorch models are supported.**
- LoRA and BitFit models no longer need pseudo data to instantiate the model.
- BitFit models can now support [Conv1D](https://huggingface.co/docs/transformers/v4.23.1/en/internal/modeling_utils#transformers.Conv1D) using the default configuration.
- Improve type hints for AutoDeltaModel.
- Fix bugs in documentation.
- Fix small bugs when saving a model without a config attribute.
- Make the default modified modules of adapter-like methods more accurate: attach the adapter-like modules after the output of the attention layer and the second feed-forward layer, both before the layernorm layers.
- A simple unit test folder containing development-time tests has been added for interested users.
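
For illustration, a sketch of the wrapped-model case mentioned above (the checkpoint name is illustrative; the point is that no `modified_modules` needs to be passed):

```python
from transformers import BertForSequenceClassification
from opendelta import LoraModel

# BertForSequenceClassification wraps a supported BertModel, so the default
# configuration of the delta method can be resolved automatically.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
delta_model = LoraModel(backbone_model=model)   # no modified_modules needed
delta_model.freeze_module(exclude=["deltas", "classifier"])
delta_model.log()
```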


### Known Issues
- SoftPrompt is still not supported for wrapped models if the model has no attribute `get_input_embeddings`.
- Prefix Tuning is still limited to T5, GPT2, Bart, Bert, Roberta.

4 changes: 2 additions & 2 deletions docs/source/notes/usage.md
Expand Up @@ -12,7 +12,7 @@ model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")
## STEP 2: Add delta modules
We provide two alternatives to add the delta modules.
### 2.1 Modification based on visualization
Suppose we want to take the feedforward layer of each block as our [modification target module](target_module),
Suppose we want to take the feedforward layer of each block as our [modification target module](targetmodules),
We should first find out the name of the feedforward layer in the BART model by visualization. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For more about visualization, see [Visualization](visualization).*

```python
@@ -48,7 +48,7 @@ delta_model.log() # This will visualize the backbone after modification and othe
### 2.2 Use the default modification.
We also provide the default modifications of each delta method for some commonly used PTMs (e.g., BERT, RoBERTa, DistilBERT, T5, GPT2), so users don't need to specify the submodules to modify.

The default modification is achieved by mapping the name of a submodule to its name on a common transformer structure. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For details about the common structure mapping, see [Common Structure Mapping](unifyname)*
The default modification is achieved by mapping the name of a submodule to its name on a common transformer structure. <img src="../imgs/hint-icon-2.jpg" height="30px"> *For details about the common structure mapping, see [Common Structure Mapping](commonstructure)*
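
As a rough sketch, relying on the default modification for the BART backbone from STEP 1 looks like this (the `exclude` names are illustrative assumptions):

```python
from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel

model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-base")  # backbone from STEP 1

# No modified_modules argument: the common structure mapping supplies BART's default positions.
delta_model = AdapterModel(backbone_model=model)
delta_model.freeze_module(exclude=["deltas", "classification_head"])
delta_model.log()
```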



1 change: 0 additions & 1 deletion docs/source/notes/visualization.md
@@ -1,4 +1,3 @@
(visualization)=
# Visualize the Parameters

When OpenDelta makes modifications to a pretrained model (PTM), it is beneficial to know what your PTM looks like, especially the location of the parameters.
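
A minimal sketch of printing the module tree before deciding what to modify (the checkpoint name is illustrative):

```python
from transformers import AutoModel
from opendelta import Visualization

model = AutoModel.from_pretrained("bert-base-uncased")
# Print the named-module tree so you can pick the submodule names for modified_modules.
Visualization(model).structure_graph()
```
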
2 changes: 1 addition & 1 deletion opendelta/__init__.py
@@ -1,5 +1,5 @@

__version__ = "0.2.4"
__version__ = "0.3.0"

class GlobalSetting:
def __init__(self):