
Conversation

abuelnasr0
Contributor

I think the use of scaling_factor in the RotaryEmbedding layer is wrong. It should be used to scale the positions, not the frequencies.
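For context, here is a minimal NumPy sketch of what "scale the positions" means for linear RoPE scaling. The function name and signature are illustrative, not KerasNLP's actual API: the positions are divided by scaling_factor before taking the outer product with the inverse frequencies.

```python
import numpy as np

def rope_angles(positions, rotary_dim, scaling_factor=1.0, base=10000.0):
    # Inverse frequency for each pair of rotary dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, rotary_dim, 2) / rotary_dim))
    # Linear RoPE scaling divides the positions; frequencies stay untouched.
    scaled_positions = np.asarray(positions, dtype="float64") / scaling_factor
    # Angle matrix of shape (seq_len, rotary_dim // 2).
    return np.outer(scaled_positions, inv_freq)
```

Because the outer product is linear, doubling scaling_factor halves every rotation angle, stretching the effective context length.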

References:

Member

@mattdangerw mattdangerw left a comment


This looks good! Can we add some tests?

@mattdangerw
Member

@abuelnasr0 also apologies, we just changed our entire directory structure in #1608

(Hopefully for good reason, we want to allow pip install -e . and pip install git+https:// while still keeping our explicit API surface)

But it does mean everything will need an annoying merge/rebase. If it'd help for me to do any of those and push to this branch, just lmk!

@mattdangerw mattdangerw mentioned this pull request May 2, 2024
@abuelnasr0 abuelnasr0 force-pushed the rope_scaling_factor branch from 4c76fe6 to f93a813 Compare May 2, 2024 19:30
@abuelnasr0
Contributor Author

abuelnasr0 commented May 2, 2024

Can we add some tests?

I can add tests, but on Sunday. Sorry for that, but I will be AFK until then.

@mattdangerw
Member

I can add tests, but on Sunday. Sorry for that, but I will be AFK until then.

No rush at all! And thanks so much for all the major contributions to the library :)

I am just getting back from vacation myself, slowly catching up on all the review.

Member

@mattdangerw mattdangerw left a comment


LGTM! If tests pass, let's ship it!

@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label May 6, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label May 6, 2024
@abuelnasr0
Contributor Author

thanks so much for all the major contributions to the library :)

You're welcome. I am trying to give back to the community as much as I can. Contributing to the library has actually helped me improve; I am learning new things with each PR. Thank you and the other authors for creating the library, and thank you for all your reviews, which were really helpful to me.

@abuelnasr0
Contributor Author

https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L96-L174

Check out these lines. What is implemented in this PR is LlamaLinearScalingRotaryEmbedding, and I think it was implemented in the main layer at first, but they decided to move it to a new layer. There's also LlamaDynamicNTKScalingRotaryEmbedding, which uses scaling_factor in another way, but I think LlamaLinearScalingRotaryEmbedding is more popular.
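To make the contrast concrete, here is a hedged NumPy sketch of the two variants, paraphrasing my reading of the linked transformers code (function names and the NTK base-adjustment formula are my transcription, not the exact Hugging Face implementation):

```python
import numpy as np

def linear_scaled_angles(positions, rotary_dim, scaling_factor, base=10000.0):
    # Linear scaling (LlamaLinearScalingRotaryEmbedding-style):
    # divide the positions by the factor; frequencies are untouched.
    inv_freq = 1.0 / (base ** (np.arange(0, rotary_dim, 2) / rotary_dim))
    return np.outer(np.asarray(positions) / scaling_factor, inv_freq)

def dynamic_ntk_angles(positions, rotary_dim, scaling_factor,
                       max_positions, base=10000.0):
    # Dynamic NTK scaling (LlamaDynamicNTKScalingRotaryEmbedding-style):
    # once the sequence exceeds the trained context, enlarge the base
    # instead of rescaling the positions.
    seq_len = len(positions)
    if seq_len > max_positions:
        base = base * (
            scaling_factor * seq_len / max_positions - (scaling_factor - 1)
        ) ** (rotary_dim / (rotary_dim - 2))
    inv_freq = 1.0 / (base ** (np.arange(0, rotary_dim, 2) / rotary_dim))
    return np.outer(positions, inv_freq)
```

Note that the dynamic variant is a no-op for sequences within the trained context, while linear scaling rescales every position unconditionally.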

@mattdangerw mattdangerw merged commit 778ccd7 into keras-team:master May 6, 2024