@JNygaard-Skylight
Collaborator

Description

Creates a simple "embed" service with a single function.

Related Issues

Closes #157

@codecov-commenter

codecov-commenter commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.61%. Comparing base (da2d612) to head (69d000b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #182   +/-   ##
=======================================
  Coverage   93.61%   93.61%           
=======================================
  Files           9        9           
  Lines         407      407           
=======================================
  Hits          381      381           
  Misses         26       26           


def embed(input_text: str) -> Tensor:
    """Embed text."""
    return model.encode(input_text)
Collaborator

This is awesome and simple! Should we update the docstring to be a bit more descriptive? This may only be relevant for our actual APIs, though; just checking.

Collaborator

Definitely want to add more to the docstring. It's fairly standard to include the parameters and expected output.
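
A more descriptive docstring along the lines suggested here might look like the sketch below. The Google-style `Args`/`Returns` sections are an assumption about the project's conventions, and `_StandInModel` is a hypothetical placeholder for the `SentenceTransformer` instance so the snippet runs without downloading a model. The return type is shown as a plain list for the same reason.

```python
class _StandInModel:
    """Hypothetical placeholder for the SentenceTransformer used in the PR."""

    def encode(self, text: str) -> list[float]:
        # Returns a tiny fixed-size vector; a real model returns its own dimension.
        return [float(len(text)), 0.0, 0.0]


model = _StandInModel()


def embed(input_text: str) -> list[float]:
    """Embed a single string as a dense vector.

    Args:
        input_text: The nonstandard text to embed.

    Returns:
        The embedding produced by ``model.encode``: a one-dimensional
        vector whose length is fixed by the configured model.
    """
    return model.encode(input_text)
```

Only the docstring is the point here; the function body is unchanged from the PR.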



from dibbs_text_to_code.configs import MODEL_NAME

model = SentenceTransformer(MODEL_NAME)
Collaborator

We likely want to refactor this into a lazy load to make the API and Lambda cold start a little faster. Up to you if you want to do that in this PR or another.

Collaborator

I can tackle this in a different PR with an update to the docstrings (see comment below).

@@ -0,0 +1,40 @@
from dibbs_text_to_code import configs

Collaborator

I do think you should add at least one test to show that we return a tensor with the dimension the model expects (768 for Qwen). I could be wrong, but I don't think you'll have to mock anything as long as you import the function in the test file with `from embeddings import embed`.

Collaborator

I like this idea as well!
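
A test along these lines could be sketched as follows. `EXPECTED_DIM` uses the 768 figure quoted above and should be checked against the configured model; `_stand_in_embed` and `embedding_dims` are hypothetical helpers so the snippet runs on its own. In the actual test file the stand-in would be replaced by `from embeddings import embed`.

```python
EXPECTED_DIM = 768  # Figure quoted in the review; verify against the configured model.


def _stand_in_embed(input_text: str) -> list[float]:
    """Hypothetical stand-in so this sketch runs without loading a model."""
    return [0.0] * EXPECTED_DIM


# In the real test file: from embeddings import embed
embed = _stand_in_embed


def embedding_dims(vector) -> tuple[int, ...]:
    """Return the shape of a tensor/array, or the length of a plain sequence."""
    shape = getattr(vector, "shape", None)
    return tuple(shape) if shape is not None else (len(vector),)


def test_embed_returns_expected_dimension() -> None:
    vector = embed("chest x-ray, 2 views")
    assert embedding_dims(vector) == (EXPECTED_DIM,)
```

Checking `.shape` first keeps the assertion valid whether `embed` returns a tensor, an array, or a list.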

Development

Successfully merging this pull request may close these issues.

Write function to embed nonstandard text

6 participants