[BT] add BetterTransformer support for ViLT architecture#508

Merged
younesbelkada merged 5 commits into
huggingface:mainfrom
ka00ri:add-BetterTransformer-support-for-ViLT-architecture
Nov 24, 2022

@ka00ri ka00ri commented Nov 23, 2022

Contributor

What does this PR do?

Added BetterTransformer support for ViLT architecture.
Added test model for ViltModel.

Tested the conversion of ViltLayer:

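from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

# transform() expects a loaded model, not a checkpoint name
model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-ViltModel")
model = BetterTransformer.transform(model)
print(model)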
ViltModel(
  (embeddings): ViltEmbeddings(
    (text_embeddings): TextEmbeddings(
      (word_embeddings): Embedding(1124, 32)
      (position_embeddings): Embedding(512, 32)
      (token_type_embeddings): Embedding(16, 32)
      (LayerNorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (patch_embeddings): ViltPatchEmbeddings(
      (projection): Conv2d(3, 32, kernel_size=(2, 2), stride=(2, 2))
    )
    (token_type_embeddings): Embedding(2, 32)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): ViltEncoder(
    (layer): ModuleList(
      (0): ViltLayerBetterTransformer()
      (1): ViltLayerBetterTransformer()
      (2): ViltLayerBetterTransformer()
      (3): ViltLayerBetterTransformer()
      (4): ViltLayerBetterTransformer()
    )
  )
  (layernorm): LayerNorm((32,), eps=1e-12, elementwise_affine=True)
  (pooler): ViltPooler(
    (dense): Linear(in_features=32, out_features=32, bias=True)
    (activation): Tanh()
  )
)
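
Beyond reading the printout, the conversion can also be checked by counting module class names before and after the transform. The following is a minimal sketch, not code from this PR: the helper names `count_module_classes` and `demo` are hypothetical, and it assumes `transformers`, `torch`, and an Optimum version that still ships `BetterTransformer` are installed and that the test checkpoint can be downloaded.

```python
from collections import Counter


def count_module_classes(model):
    """Return a Counter mapping class name -> number of submodules of that class.

    Works on any object exposing `named_modules()`, such as a `torch.nn.Module`.
    """
    return Counter(type(module).__name__ for _, module in model.named_modules())


def demo(model_id="hf-internal-testing/tiny-random-ViltModel"):
    # Heavy imports live here so the helper above stays dependency-free.
    from transformers import AutoModel
    from optimum.bettertransformer import BetterTransformer

    model = AutoModel.from_pretrained(model_id)
    before = count_module_classes(model)

    # keep_original_model=True returns a converted copy and leaves `model` intact.
    converted = BetterTransformer.transform(model, keep_original_model=True)
    after = count_module_classes(converted)

    # Each ViltLayer should have been replaced by its BetterTransformer counterpart.
    print("ViltLayer before:", before["ViltLayer"])
    print("ViltLayerBetterTransformer after:", after["ViltLayerBetterTransformer"])
    return before, after
```

Calling `demo()` downloads the checkpoint; if the two printed counts match, every encoder layer was converted.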

To: @younesbelkada @michaelbenayoun @fxmarty

@younesbelkada younesbelkada left a comment

Contributor

Very nice! Thank you very much for adding BetterTransformer support for the ViLT model!
I think we need two more steps before merging:
1- Could you add a new class to the testing suite, since the model expects text + image inputs? Inside test_bettertransformer_vision.py you can just add something like:

ALL_VISION_TEXT_MODELS_TO_TEST = ("hf-internal-testing/tiny-random-ViltModel",)  # trailing comma: a one-element tuple, not a bare string

class BetterTransformersViLTTest(BetterTransformersTestMixin, unittest.TestCase):
    r"""
    Testing suite for Vision Models - tests all the tests defined in `BetterTransformersTestMixin`
    """
    all_models_to_test = ALL_VISION_TEXT_MODELS_TO_TEST

    def prepare_inputs_for_class(self, model_id=None):
        url = "http://images.cocodataset.org/val2017/000000039769.jpg"
        image = Image.open(requests.get(url, stream=True).raw)
        processor = AutoProcessor.from_pretrained("hf-internal-testing/tiny-random-ViltModel")
        .... 
        return inputs

And tweak the function prepare_inputs_for_class based on this example: https://huggingface.co/dandelin/vilt-b32-finetuned-vqa

2- Can you add the new architecture to the documentation? 🙏 This would require adding just a line here (don't forget to respect the alphabetical order ;) )
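
For the `prepare_inputs_for_class` suggestion in step 1, the input preparation could look roughly like the sketch below, modeled on the ViLT model card example linked above. This is an assumption-laden illustration rather than the code that was merged: the helper names `build_vilt_inputs` and `demo` and the sample question are hypothetical, and it assumes `transformers`, `Pillow`, and `requests` are installed.

```python
def build_vilt_inputs(processor, image, question="How many cats are there?"):
    """Encode one image/question pair into model inputs.

    `processor` is expected to behave like a ViLT processor: called with
    (image, text), it returns a dict-like object of tensors.
    """
    return processor(image, question, return_tensors="pt")


def demo(model_id="hf-internal-testing/tiny-random-ViltModel"):
    # Imports are local so the helper above can be used without these packages.
    import requests
    from PIL import Image
    from transformers import AutoProcessor

    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = Image.open(requests.get(url, stream=True).raw)
    processor = AutoProcessor.from_pretrained(model_id)
    return build_vilt_inputs(processor, image)
```

Inside the test class, `prepare_inputs_for_class` could then return the result of `build_vilt_inputs(processor, image)`, which the model accepts as `model(**inputs)`.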

Also don't forget to run make style before pushing! ;)

Thanks a bunch!

HuggingFaceDocBuilderDev commented Nov 23, 2022

The documentation is not available anymore as the PR was closed or merged.

Comment thread tests/bettertransformer/test_bettertransformer_vision.py Outdated
Comment thread tests/bettertransformer/test_bettertransformer_vision.py Outdated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
@younesbelkada younesbelkada changed the title add BetterTransformer support for ViLT architecture [BT] add BetterTransformer support for ViLT architecture Nov 23, 2022

@younesbelkada younesbelkada left a comment

Contributor

Very clean and super fast implementation of ViLT support for BetterTransformer! Thanks also for taking care of slightly refactoring the documentation.
Thanks a lot for adding this, you really did a great job here! 💪
I have nothing to add on my side, gently pinging @fxmarty and @michaelbenayoun for a last review!

ka00ri commented Nov 23, 2022

Contributor Author

@younesbelkada Thank you for your support and making this first contribution so smooth 😄 Glad I could help 💪

@michaelbenayoun michaelbenayoun left a comment

Member

LGTM!

@younesbelkada younesbelkada merged commit c95565d into huggingface:main Nov 24, 2022

4 participants