Fix IPython import warnings and notebook detection (#86)
- Add type ignore comments for IPython imports
- Fix return type annotation (remove unnecessary quotes)
- Add _is_jupyter() to properly detect notebook environments
- Replace lambda with def function for pylint compliance

Fixes #65
aksg87 authored Aug 7, 2025
commit 6b02efb36b9185a991430c10c83cf20714788416
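The notebook detection described in the commit message can be sketched in isolation. This is a minimal illustration, not the code from `langextract/visualization.py`: the helper names `load_ipython`, `is_rich_frontend`, and `render` are hypothetical, and the only IPython names used are `get_ipython` and `IPython.display.HTML`, which the diff itself imports.

```python
def load_ipython():
    """Return (get_ipython, HTML); fall back to stubs when IPython is absent."""
    try:
        from IPython import get_ipython
        from IPython.display import HTML
    except ImportError:

        def get_ipython():
            # Stub with the same call shape as IPython's get_ipython().
            return None

        HTML = None
    return get_ipython, HTML


def is_rich_frontend(get_ipython) -> bool:
    """True only when a shell is active and it is not the plain terminal shell."""
    try:
        shell = get_ipython()
    except Exception:
        return False
    if shell is None:
        return False
    return type(shell).__name__ != "TerminalInteractiveShell"


def render(markup: str):
    """Wrap markup in HTML for notebook frontends; otherwise return the string."""
    get_ipython, HTML = load_ipython()
    if HTML is not None and is_rich_frontend(get_ipython):
        return HTML(markup)
    return markup
```

Checking the shell's class name rather than only testing for an installed IPython is what lets a plain `ipython` terminal session receive a string instead of an object it cannot display.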
4 changes: 2 additions & 2 deletions .github/scripts/add-new-checks.sh
@@ -29,11 +29,11 @@ echo "✓ New checks added!"
echo ""
echo "Updated required status checks will include:"
echo "- test (3.10) [existing]"
-echo "- test (3.11) [existing]"
+echo "- test (3.11) [existing]"
echo "- test (3.12) [existing]"
echo "- Validate PR Template [existing]"
echo "- live-api-tests [existing]"
echo "- ollama-integration-test [existing]"
echo "- enforce [NEW - linked issue validation]"
echo "- size [NEW - PR size limit]"
-echo "- protect-infrastructure [NEW - infrastructure file protection]"
+echo "- protect-infrastructure [NEW - infrastructure file protection]"
12 changes: 6 additions & 6 deletions .github/scripts/add-size-labels.sh
@@ -23,7 +23,7 @@ gh pr list --limit 50 --json number,additions,deletions --jq '.[]' | while read
additions=$(echo "$pr_data" | jq -r '.additions')
deletions=$(echo "$pr_data" | jq -r '.deletions')
total_changes=$((additions + deletions))

# Determine size label
if [ $total_changes -lt 50 ]; then
size_label="size/XS"
@@ -36,20 +36,20 @@ gh pr list --limit 50 --json number,additions,deletions --jq '.[]' | while read
else
size_label="size/XL"
fi

echo "PR #$pr_number: $total_changes lines -> $size_label"

# Remove any existing size labels first
existing_labels=$(gh pr view $pr_number --json labels --jq '.labels[].name' | grep "^size/" || true)
if [ ! -z "$existing_labels" ]; then
echo " Removing existing label: $existing_labels"
gh pr edit $pr_number --remove-label "$existing_labels"
fi

# Add the new size label
gh pr edit $pr_number --add-label "$size_label"

sleep 1 # Avoid rate limiting
done

-echo "Done adding size labels!"
+echo "Done adding size labels!"
4 changes: 2 additions & 2 deletions .github/scripts/revalidate-all-prs.sh
@@ -28,7 +28,7 @@ for pr in $PR_NUMBERS; do
COUNT=$((COUNT + 1))
echo "[$COUNT/$TOTAL] Triggering revalidation for PR #$pr..."
gh workflow run revalidate-pr.yml -f pr_number=$pr

# Small delay to avoid rate limiting
sleep 2
done
@@ -39,4 +39,4 @@ echo ""
echo "To monitor progress:"
echo " gh run list --workflow=revalidate-pr.yml --limit=$TOTAL"
echo ""
-echo "To see results, check comments on each PR"
+echo "To see results, check comments on each PR"
6 changes: 3 additions & 3 deletions .github/workflows/check-linked-issue.yml
@@ -46,9 +46,9 @@ jobs:
repo: context.repo.repo,
username: prAuthor
});

const isMaintainer = ['admin', 'maintain'].includes(authorPermission.permission);

const body = context.payload.pull_request.body || '';
const match = body.match(/(?:Fixes|Closes|Resolves)\s+#(\d+)/i);

@@ -87,4 +87,4 @@ jobs:
core.setFailed(`Issue #${issueNumber} needs at least ${REQUIRED_THUMBS_UP} 👍 reactions (currently has ${thumbsUp})`);
} else if (isMaintainer && thumbsUp < REQUIRED_THUMBS_UP) {
core.info(`Maintainer ${prAuthor} bypassing community support requirement (issue has ${thumbsUp} 👍 reactions)`);
-}
+}
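The linked-issue check in this workflow rests on one case-insensitive pattern applied to the PR body. A small Python sketch of the same match, for illustration only: the workflow runs JavaScript, and `find_linked_issue` is a hypothetical helper name. `re.ASCII` is added so that `\d` matches only ASCII digits, as it does in JavaScript.

```python
import re

# Same pattern as the workflow's body.match(...) call, with the /i flag.
LINKED_ISSUE = re.compile(
    r"(?:Fixes|Closes|Resolves)\s+#(\d+)", re.IGNORECASE | re.ASCII
)


def find_linked_issue(body):
    """Return the first linked issue number in a PR body, or None."""
    match = LINKED_ISSUE.search(body or "")
    return int(match.group(1)) if match else None


print(find_linked_issue("Fixes #65"))       # keyword, whitespace, then #number
print(find_linked_issue("see issue 65"))    # no keyword, so no match
```

Note that the pattern requires whitespace directly between the keyword and `#`, so a body that writes `Fixes: #65` would not match.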
4 changes: 2 additions & 2 deletions .github/workflows/ci.yaml
@@ -154,7 +154,7 @@ jobs:
repo: context.repo.repo,
username: context.actor
});

const isMaintainer = ['admin', 'maintain'].includes(permission.permission);
if (!isMaintainer) {
throw new Error(`User ${context.actor} does not have maintainer permissions.`);
@@ -170,7 +170,7 @@ jobs:
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"

# pull_request_target runs in base repo context, so this is safe
git fetch origin pull/${{ github.event.pull_request.number }}/head:pr-to-test
git merge pr-to-test --no-ff --no-edit
32 changes: 16 additions & 16 deletions .github/workflows/revalidate-pr.yml
@@ -29,14 +29,14 @@ jobs:
repo: context.repo.repo,
pull_number: ${{ inputs.pr_number }}
});

core.info(`Validating PR #${pr.number}: ${pr.title}`);
core.info(`Author: ${pr.user.login}`);
core.info(`Changes: +${pr.additions} -${pr.deletions}`);

// Store head SHA for creating status
core.setOutput('head_sha', pr.head.sha);

return pr;

- name: Create pending status
@@ -60,44 +60,44 @@ jobs:
const pr = ${{ steps.pr_data.outputs.result }};
const errors = [];
let passed = true;

// Check size
const totalChanges = pr.additions + pr.deletions;
const MAX_LINES = 1000;
if (totalChanges > MAX_LINES) {
errors.push(`PR size (${totalChanges} lines) exceeds ${MAX_LINES} line limit`);
passed = false;
}

// Check template
const body = pr.body || '';
const requiredSections = ["# Description", "Fixes #", "# How Has This Been Tested?", "# Checklist"];
const missingSections = requiredSections.filter(section => !body.includes(section));

if (missingSections.length > 0) {
errors.push(`Missing PR template sections: ${missingSections.join(', ')}`);
passed = false;
}

if (body.match(/Replace this with|Choose one:|Fixes #\[issue number\]/i)) {
errors.push('PR template contains unmodified placeholders');
passed = false;
}

// Check linked issue
const issueMatch = body.match(/(?:Fixes|Closes|Resolves)\s+#(\d+)/i);
if (!issueMatch) {
errors.push('No linked issue found');
passed = false;
}

// Store results
core.setOutput('passed', passed);
core.setOutput('errors', errors.join('; '));
core.setOutput('totalChanges', totalChanges);
core.setOutput('hasTemplate', missingSections.length === 0);
core.setOutput('hasIssue', !!issueMatch);

if (!passed) {
core.setFailed(errors.join('; '));
}
@@ -109,7 +109,7 @@ jobs:
script: |
const passed = ${{ steps.validate.outputs.passed }};
const errors = '${{ steps.validate.outputs.errors }}';

await github.rest.repos.createCommitStatus({
owner: context.repo.owner,
repo: context.repo.repo,
@@ -131,27 +131,27 @@ jobs:
const hasTemplate = ${{ steps.validate.outputs.hasTemplate }};
const hasIssue = ${{ steps.validate.outputs.hasIssue }};
const errors = '${{ steps.validate.outputs.errors }}'.split('; ').filter(e => e);

let body = `### Manual Validation Results\n\n`;
body += `**Status**: ${passed ? '✅ Passed' : '❌ Failed'}\n\n`;
body += `| Check | Status | Details |\n`;
body += `|-------|--------|----------|\n`;
body += `| PR Size | ${totalChanges <= 1000 ? '✅' : '❌'} | ${totalChanges} lines ${totalChanges > 1000 ? '(exceeds 1000 limit)' : ''} |\n`;
body += `| Template | ${hasTemplate ? '✅' : '❌'} | ${hasTemplate ? 'Complete' : 'Missing required sections'} |\n`;
body += `| Linked Issue | ${hasIssue ? '✅' : '❌'} | ${hasIssue ? 'Found' : 'Missing Fixes/Closes #XXX'} |\n`;

if (errors.length > 0) {
body += `\n**Errors:**\n`;
errors.forEach(error => {
body += `- ❌ ${error}\n`;
});
}

body += `\n[View workflow run](https://github.com/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId})`;

await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: pr.number,
body: body
-});
+});
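The validation step in this workflow applies four checks to a PR: a size limit, required template sections, leftover placeholders, and a linked issue. They can be sketched as a single pure function. This is a Python rendering for illustration; the workflow itself is JavaScript, and `validate_pr` is a hypothetical name. The limit, section strings, and patterns are copied from the hunk above.

```python
import re

MAX_LINES = 1000
REQUIRED_SECTIONS = [
    "# Description",
    "Fixes #",
    "# How Has This Been Tested?",
    "# Checklist",
]
PLACEHOLDERS = re.compile(
    r"Replace this with|Choose one:|Fixes #\[issue number\]", re.IGNORECASE
)
LINKED_ISSUE = re.compile(r"(?:Fixes|Closes|Resolves)\s+#(\d+)", re.IGNORECASE)


def validate_pr(additions, deletions, body):
    """Return a list of error strings; an empty list means the PR passed."""
    body = body or ""
    errors = []

    total_changes = additions + deletions
    if total_changes > MAX_LINES:
        errors.append(
            f"PR size ({total_changes} lines) exceeds {MAX_LINES} line limit"
        )

    # Section check is a plain, case-sensitive substring test.
    missing = [section for section in REQUIRED_SECTIONS if section not in body]
    if missing:
        errors.append(f"Missing PR template sections: {', '.join(missing)}")

    if PLACEHOLDERS.search(body):
        errors.append("PR template contains unmodified placeholders")

    if not LINKED_ISSUE.search(body):
        errors.append("No linked issue found")

    return errors
```

One consequence of the two issue-related checks, as written in the hunk: a body that says `Closes #12` satisfies the linked-issue pattern but still fails the template check, because the required-sections list looks for the literal text `Fixes #`.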
35 changes: 29 additions & 6 deletions langextract/visualization.py
@@ -28,7 +28,6 @@
 import html
 import itertools
 import json
-import os
 import pathlib
 import textwrap

@@ -37,10 +36,30 @@

 # Fallback if IPython is not present
 try:
-  from IPython.display import HTML  # type: ignore
-except Exception:
+  from IPython import get_ipython  # type: ignore[import-not-found]
+  from IPython.display import HTML  # type: ignore[import-not-found]
+except ImportError:
+
+  def get_ipython():  # type: ignore[no-redef]
+    return None
+
   HTML = None  # pytype: disable=annotation-type-mismatch

+
+def _is_jupyter() -> bool:
+  """Check if we're in a Jupyter/IPython environment that can display HTML."""
+  try:
+    if get_ipython is None:
+      return False
+    ip = get_ipython()
+    if ip is None:
+      return False
+    # Simple check: if we're in IPython and NOT in a plain terminal
+    return ip.__class__.__name__ != 'TerminalInteractiveShell'
+  except Exception:
+    return False
+
+
 _PALETTE: list[str] = [
     '#D2E3FC',  # Light Blue (Primary Container)
     '#C8E6C9',  # Light Green (Tertiary Container)
@@ -538,7 +557,7 @@ def visualize(
     animation_speed: float = 1.0,
     show_legend: bool = True,
     gif_optimized: bool = True,
-) -> 'HTML | str':
+) -> HTML | str:
   """Visualises extraction data as animated highlighted HTML.

   Args:
@@ -582,7 +601,9 @@ def visualize(
         ' animate.</p></div>'
     )
     full_html = _VISUALIZATION_CSS + empty_html
-    return HTML(full_html) if HTML is not None else full_html
+    if HTML is not None and _is_jupyter():
+      return HTML(full_html)
+    return full_html

   color_map = _assign_colors(valid_extractions)

@@ -603,4 +624,6 @@ def visualize(
         'class="lx-animated-wrapper lx-gif-optimized"',
     )

-  return HTML(full_html) if HTML is not None else full_html
+  if HTML is not None and _is_jupyter():
+    return HTML(full_html)
+  return full_html
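With this change `visualize()` returns a plain string outside notebook frontends and an `HTML` object inside them, so a caller that needs the raw markup (for example, to write it to a file) may want to accept both. A minimal sketch, assuming the only non-string return is an `IPython.display.HTML` object, which keeps its markup in the `data` attribute. `html_text` and `FakeHTML` are hypothetical names used for illustration.

```python
def html_text(result) -> str:
    """Return raw markup from either a str or an HTML-like display object."""
    if isinstance(result, str):
        return result
    # Assumption: the object exposes its markup as `.data`,
    # as IPython.display.HTML does.
    return result.data


class FakeHTML:
    """Stand-in for IPython.display.HTML so the sketch runs without IPython."""

    def __init__(self, data: str):
        self.data = data


print(html_text("<div>plain</div>"))
print(html_text(FakeHTML("<div>rich</div>")))
```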
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -63,6 +63,10 @@ test = [
     "pytest>=7.4.0",
     "tomli>=2.0.0"
 ]
+notebook = [
+    "ipython>=7.0.0",
+    "notebook>=6.0.0"
+]

 [tool.setuptools]
 packages = ["langextract"]