Keras hub rename #1840
Changes from 1 commit
**File 1:**

```diff
@@ -7,7 +7,7 @@ __pycache__/
 *.swp
 *.swo
 
-keras_nlp.egg-info/
+keras_hub.egg-info/
 dist/
 
 .coverage
```
**File 2 (Model Contribution Guide):**

```diff
@@ -1,13 +1,13 @@
 # Model Contribution Guide
 
-KerasNLP has a plethora of pre-trained large language models
+KerasHub has a plethora of pre-trained large language models
 ranging from BERT to OPT. We are always looking for more models and are always
 open to contributions!
 
 In this guide, we will walk you through the steps one needs to take in order to
-contribute a new pre-trained model to KerasNLP. For illustration purposes, let's
+contribute a new pre-trained model to KerasHub. For illustration purposes, let's
 assume that you want to contribute the DistilBERT model. Before we dive in, we encourage you to go through
-[our getting started guide](https://keras.io/guides/keras_nlp/getting_started/)
+[our getting started guide](https://keras.io/guides/keras_hub/getting_started/)
 for an introduction to the library, and our
 [contribution guide](https://github.com/keras-team/keras-nlp/blob/master/CONTRIBUTING.md).
 
```
```diff
@@ -22,29 +22,29 @@ Keep this checklist handy!
 
 ### Step 2: PR #1 - Add XXBackbone
 
-- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)\].
-- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone_test.py)\].
+- [ ] An `xx/xx_backbone.py` file which has the model graph \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)\].
+- [ ] An `xx/xx_backbone_test.py` file which has unit tests for the backbone \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone_test.py)\].
 - [ ] A Colab notebook link in the PR description which matches the outputs of the implemented backbone model with the original source \[[Example](https://colab.research.google.com/drive/1SeZWJorKWmwWJax8ORSdxKrxE25BfhHa?usp=sharing)\].
 
 ### Step 3: PR #2 - Add XXTokenizer
 
-- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py)\].
-- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer_test.py)\].
+- [ ] An `xx/xx_tokenizer.py` file which has the tokenizer for the model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py)\].
+- [ ] An `xx/xx_tokenizer_test.py` file which has unit tests for the model tokenizer \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the tokenizer matches the original tokenizer \[[Example](https://colab.research.google.com/drive/1MH_rpuFB1Nz_NkKIAvVtVae2HFLjXZDA?usp=sharing)].
 
 ### Step 4: PR #3 - Add XX Presets
 
-- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_presets.py)\].
+- [ ] An `xx/xx_presets.py` file with links to weights uploaded to a personal GCP bucket/Google Drive \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_presets.py)\].
 - [ ] A `tools/checkpoint_conversion/convert_xx_checkpoints.py` which is reusable script for converting checkpoints \[[Example](https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_distilbert_checkpoints.py)\].
 - [ ] A Colab notebook link in the PR description, showing an end-to-end task such as text classification, etc. The task model can be built using the backbone model, with the task head on top \[[Example](https://gist.github.com/mattdangerw/bf0ca07fb66b6738150c8b56ee5bab4e)\].
 
 ### Step 5: PR #4 and Beyond - Add XX Tasks and Preprocessors
 
 This PR is optional.
 
-- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier.py)\]
-- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor.py)\].
-- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_preprocessor_test.py)\].
+- [ ] An `xx/xx_<task>.py` file for adding a task model like classifier, masked LM, etc. \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier.py)\]
+- [ ] An `xx/xx_<task>_preprocessor.py` file which has the preprocessor and can be used to get inputs suitable for the task model \[[Example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor.py)\].
+- [ ] `xx/xx_<task>_test.py` file and `xx/xx_<task>_preprocessor_test.py` files which have unit tests for the above two modules \[[Example 1](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_classifier_test.py) and [Example 2](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_preprocessor_test.py)\].
 - [ ] A Colab notebook link in the PR description, demonstrating that the output of the preprocessor matches the output of the original preprocessor \[[Example](https://colab.research.google.com/drive/1GFFC7Y1I_2PtYlWDToqKvzYhHWv1b3nC?usp=sharing)].
 
 ## Detailed Instructions
```
```diff
@@ -81,7 +81,7 @@ around by a class to implement our models.
 
 A model is typically split into three/four sections. We would recommend you to
 compare this side-by-side with the
-[`keras_nlp.layers.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py)!
+[`keras_hub.layers.DistilBertBackbone` source code](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py)!
 
 **Inputs to the model**
 
```
```diff
@@ -92,32 +92,32 @@ Generally, the standard inputs to any text model are:
 **Embedding layer(s)**
 
 Standard layers used: `keras.layers.Embedding`,
-`keras_nlp.layers.PositionEmbedding`, `keras_nlp.layers.TokenAndPositionEmbedding`.
+`keras_hub.layers.PositionEmbedding`, `keras_hub.layers.TokenAndPositionEmbedding`.
 
 **Encoder layers**
 
-Standard layers used: `keras_nlp.layers.TransformerEncoder`, `keras_nlp.layers.FNetEncoder`.
+Standard layers used: `keras_hub.layers.TransformerEncoder`, `keras_hub.layers.FNetEncoder`.
 
 **Decoder layers (possibly)**
 
-Standard layers used: `keras_nlp.layers.TransformerDecoder`.
+Standard layers used: `keras_hub.layers.TransformerDecoder`.
 
 **Other layers which might be used**
 
 `keras.layers.LayerNorm`, `keras.layers.Dropout`, `keras.layers.Conv1D`, etc.
 
 <br/>
 
-The standard layers provided in Keras and KerasNLP are generally enough for
+The standard layers provided in Keras and KerasHub are generally enough for
 most of the usecases and it is recommended to do a thorough search
-[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_nlp/layers/).
+[here](https://keras.io/api/layers/) and [here](https://keras.io/api/keras_hub/layers/).
 However, sometimes, models have small tweaks/paradigm changes in their architecture.
 This is when things might slightly get complicated.
 
 If the model introduces a paradigm shift, such as using relative attention instead
 of vanilla attention, the contributor will have to implement complete custom layers. A case
-in point is `keras_nlp.models.DebertaV3Backbone` where we had to [implement layers
-from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_nlp/models/deberta_v3).
+in point is `keras_hub.models.DebertaV3Backbone` where we had to [implement layers
+from scratch](https://github.com/keras-team/keras-nlp/tree/master/keras_hub/models/deberta_v3).
 
 On the other hand, if the model has a small tweak, something simpler can be done.
 For instance, in the Whisper model, the self-attention and cross-attention mechanism
```
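The embedding and encoder layers listed in the hunk above are usually all a first backbone needs. Below is a minimal sketch of how they fit together, assuming the renamed `keras_hub` package (today's `keras_nlp` exposes the same layers); every hyperparameter value is an illustrative placeholder rather than DistilBERT's real configuration.

```python
import keras
import keras_hub  # assumed post-rename package name; `keras_nlp` works identically today

# Standard inputs: token ids plus a padding mask.
token_ids = keras.Input(shape=(None,), dtype="int32", name="token_ids")
padding_mask = keras.Input(shape=(None,), dtype="int32", name="padding_mask")

# Embedding section: combined token + position embeddings.
x = keras_hub.layers.TokenAndPositionEmbedding(
    vocabulary_size=30522,  # placeholder vocabulary size
    sequence_length=512,    # placeholder maximum sequence length
    embedding_dim=256,      # placeholder hidden size
)(token_ids)

# Encoder section: a small stack of Transformer encoder blocks.
for _ in range(4):  # placeholder depth
    x = keras_hub.layers.TransformerEncoder(
        intermediate_dim=1024,  # placeholder feed-forward size
        num_heads=4,            # placeholder head count
    )(x, padding_mask=padding_mask)

# A toy functional model; a real contribution wraps this graph in a
# Backbone subclass, as the DistilBERT source linked above does.
toy_backbone = keras.Model(
    inputs={"token_ids": token_ids, "padding_mask": padding_mask},
    outputs=x,
)
```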
```diff
@@ -154,23 +154,23 @@ and loaded correctly, etc.
 #### Tokenizer
 
 Most text models nowadays use subword tokenizers such as WordPiece, SentencePiece
-and BPE Tokenizer. Since KerasNLP has implementations of most of the popular
+and BPE Tokenizer. Since KerasHub has implementations of most of the popular
 subword tokenizers, the model tokenizer layer typically inherits from a base
 tokenizer class.
 
 For example, DistilBERT uses the WordPiece tokenizer. So, we can introduce a new
-class, `DistilBertTokenizer`, which inherits from `keras_nlp.tokenizers.WordPieceTokenizer`.
+class, `DistilBertTokenizer`, which inherits from `keras_hub.tokenizers.WordPieceTokenizer`.
 All the underlying actual tokenization will be taken care of by the superclass.
 
 The important thing here is adding "special tokens". Most models have
 special tokens such as beginning-of-sequence token, end-of-sequence token,
 mask token, pad token, etc. These have to be
-[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
+[added as member attributes](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_tokenizer.py#L91-L105)
 to the tokenizer class. These member attributes are then accessed by the
 preprocessor layers.
 
-For a full list of the tokenizers KerasNLP offers, please visit
-[this link](https://keras.io/api/keras_nlp/tokenizers/) and make use of the
+For a full list of the tokenizers KerasHub offers, please visit
+[this link](https://keras.io/api/keras_hub/tokenizers/) and make use of the
 tokenizer your model uses!
 
 #### Unit Tests
```
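For the tokenizer section above, the subclassing pattern looks roughly like the sketch below. The class name, vocabulary, and special-token set are invented for illustration, and `keras_hub` is assumed to be the renamed package; the `DistilBertTokenizer` source linked in the diff is the authoritative reference.

```python
import keras_hub  # assumed post-rename package name; `keras_nlp` works identically today


class ToyTokenizer(keras_hub.tokenizers.WordPieceTokenizer):
    """Illustrative WordPiece subclass; not the real DistilBertTokenizer."""

    def __init__(self, vocabulary, **kwargs):
        super().__init__(vocabulary=vocabulary, **kwargs)
        # Special tokens exposed as member attributes so that preprocessor
        # layers can look up their ids. The vocabulary must contain them.
        self.cls_token = "[CLS]"
        self.sep_token = "[SEP]"
        self.pad_token = "[PAD]"
        self.cls_token_id = self.token_to_id(self.cls_token)
        self.sep_token_id = self.token_to_id(self.sep_token)
        self.pad_token_id = self.token_to_id(self.pad_token)


# Tiny made-up vocabulary, just to exercise the class.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "the", "quick", "brown", "fox"]
tokenizer = ToyTokenizer(vocabulary=vocab)
print(tokenizer("the quick brown fox"))  # token ids from the toy vocabulary
```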
```diff
@@ -193,7 +193,7 @@ files. These files will then be uploaded to GCP by us!
 After wrapping up the preset configuration file, you need to
 add the `from_preset` function to all three classes, i.e., `DistilBertBackbone`,
 and `DistilBertTokenizer`. Here is an
-[example](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_backbone.py#L187-L189).
+[example](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_backbone.py#L187-L189).
 
 The testing for presets is divided into two: "large" and "extra large".
 For "large" tests, we pick the smallest preset (in terms of number of parameters)
```
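Once the presets and `from_preset` constructors from the hunk above are in place, the user-facing flow is a one-liner per class. A hedged usage sketch: `distil_bert_base_en_uncased` is an existing KerasNLP preset name used purely for illustration, and `keras_hub` is again assumed to be the renamed package.

```python
import keras_hub  # assumed post-rename package name; `keras_nlp` works identically today

# Load converted weights and the matching vocabulary by preset name.
backbone = keras_hub.models.DistilBertBackbone.from_preset("distil_bert_base_en_uncased")
tokenizer = keras_hub.models.DistilBertTokenizer.from_preset("distil_bert_base_en_uncased")
```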
```diff
@@ -228,12 +228,12 @@ and return the dictionary in the form expected by the model.
 
 The preprocessor class might have a few intricacies depending on the model. For example,
 the DeBERTaV3 tokenizer does not have the `[MASK]` in the provided sentencepiece
-proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
+proto file, and we had to make some modifications [here](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/deberta_v3/deberta_v3_preprocessor.py). Secondly, we have
 a separate preprocessor class for every task. This is because different tasks
-might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
+might require different input formats. For instance, we have a [separate preprocessor](https://github.com/keras-team/keras-nlp/blob/master/keras_hub/models/distil_bert/distil_bert_masked_lm_preprocessor.py)
 for masked language modeling (MLM) for DistilBERT.
 
 ## Conclusion
 
 Once all three PRs (and optionally, the fourth PR) have been merged, you have
-successfully contributed a model to KerasNLP. Congratulations! 🔥
+successfully contributed a model to KerasHub. Congratulations! 🔥
```
**Review comment:**

> The keras.io paths will be updated before the keras-hub release, right? @divyashreepathihalli
**Reply:**

> good catch! let's stick to the old paths for now, no sense breaking our links