-
Notifications
You must be signed in to change notification settings - Fork 0
feature: Embed Service #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #182 +/- ##
=======================================
Coverage 93.61% 93.61%
=======================================
Files 9 9
Lines 407 407
=======================================
Hits 381 381
Misses 26 26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
|
||
| def embed(input_text: str) -> Tensor: | ||
| """Embed text.""" | ||
| return model.encode(input_text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome and simple! Should we update the doc string to be a bit more descriptive? This may only be relevant for maybe our actual APIs though....but just checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely want to add more to the doc string. It's fairly standard to include the parameters and expected output.
|
|
||
| def embed(input_text: str) -> Tensor: | ||
| """Embed text.""" | ||
| return model.encode(input_text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely want to add more to the doc string. It's fairly standard to include the parameters and expected output.
|
|
||
| from dibbs_text_to_code.configs import MODEL_NAME | ||
|
|
||
| model = SentenceTransformer(MODEL_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We likely want to refactor this into a lazy load to make the API and lambda cold start a little faster. Up to you if you want to do that in this PR or another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can tackle this in a different PR with an update to the doc strings (see comment below).
| @@ -0,0 +1,40 @@ | |||
| from dibbs_text_to_code import configs | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think you should add at least one test to show we've returned a tensor object that is the same dimensions as the model expects (768 for Qwen). I could be wrong, but I don't think you'll have to mock anything as long as you load this in the test file from embeddings import embed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea as well!
…s docstrings to be on a single continuous line with no returns
Description
Creates a simple "embed" service with a single function.
Related Issues
Closes #157