bmlib

Shared Python library for biomedical literature tools — LLM abstraction, quality assessment, transparency analysis, full-text retrieval, publication ingestion, and database utilities.

Version: 0.2.1 | License: AGPL-3.0-or-later | Python: >=3.11

Installation

# Core install (jinja2 is the only required dependency)
pip install bmlib

# Editable install with all extras
uv pip install -e ".[all]"

Optional dependency groups

Group	Install command	Provides
`anthropic`	`pip install bmlib[anthropic]`	Anthropic Claude LLM provider
`ollama`	`pip install bmlib[ollama]`	Ollama local LLM provider
`openai`	`pip install bmlib[openai]`	OpenAI, DeepSeek, Mistral, Gemini, and OpenAI-compatible providers
`postgresql`	`pip install bmlib[postgresql]`	PostgreSQL database backend
`transparency`	`pip install bmlib[transparency]`	Transparency analysis (httpx)
`publications`	`pip install bmlib[publications]`	Publication ingestion and sync (httpx)
`dev`	`pip install bmlib[dev]`	pytest, pytest-cov, ruff
`all`	`pip install bmlib[all]`	All of the above

Modules

Module	Description
bmlib.db	Thin database abstraction (SQLite + PostgreSQL) with pure functions over DB-API connections
bmlib.llm	Unified LLM client with pluggable providers (Anthropic, OpenAI, Ollama, DeepSeek, Mistral, Gemini)
bmlib.templates	Jinja2-based prompt template engine with user-override directory fallback
bmlib.agents	Base agent class for LLM-driven tasks with template rendering and JSON parsing
bmlib.quality	3-tier quality assessment pipeline for biomedical publications (metadata → LLM classifier → deep assessment)
bmlib.transparency	Multi-API transparency and bias analysis (PubMed, CrossRef, EuropePMC, OpenAlex, ClinicalTrials.gov)
bmlib.publications	Publication ingestion from PubMed, bioRxiv, medRxiv, and OpenAlex with deduplication and sync
bmlib.fulltext	Full-text retrieval (Europe PMC → Unpaywall → DOI), JATS XML parsing, and disk-based caching
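
The "user-override directory fallback" mentioned for bmlib.templates can be pictured as a two-step lookup: check the user's directory first, then fall back to the packaged defaults. The following is a stdlib-only sketch of that idea, not bmlib's implementation; `resolve_template`, the directory layout, and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_template(name: str, user_dir: Path, default_dir: Path) -> Path:
    """Return the user's copy of a template if present, else the default copy.

    Hypothetical helper for illustration; not part of bmlib.
    """
    for directory in (user_dir, default_dir):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(name)


# Demonstrate with throwaway directories (paths and file names are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    user_dir = Path(tmp) / "user"
    default_dir = Path(tmp) / "defaults"
    user_dir.mkdir()
    default_dir.mkdir()
    (default_dir / "summary.txt").write_text("default prompt")
    (default_dir / "classify.txt").write_text("default classifier prompt")
    (user_dir / "summary.txt").write_text("user prompt")  # user override

    overridden = resolve_template("summary.txt", user_dir, default_dir).read_text()
    fallback = resolve_template("classify.txt", user_dir, default_dir).read_text()

print(overridden, "|", fallback)
```

With Jinja2 itself, a similar effect is available by passing a list of search paths to `jinja2.FileSystemLoader`, which tries them in order; whether bmlib does it that way is not stated here.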

Quick Start

Database

from bmlib.db import connect_sqlite, execute, fetch_all, transaction

conn = connect_sqlite("~/.myapp/data.db")
with transaction(conn):
    execute(conn, "INSERT INTO papers (doi, title) VALUES (?, ?)", ("10.1101/x", "A paper"))
rows = fetch_all(conn, "SELECT * FROM papers")
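
The snippet above assumes a `papers` table already exists. For comparison, here is a stdlib-only sketch of the same insert-then-read pattern using `sqlite3` directly. The table schema and the in-memory database are assumptions made for illustration, and the sketch does not show how bmlib implements `transaction`.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (doi TEXT PRIMARY KEY, title TEXT)")  # hypothetical schema

# Using the connection as a context manager commits on success and rolls back on error.
with conn:
    conn.execute(
        "INSERT INTO papers (doi, title) VALUES (?, ?)",
        ("10.1101/x", "A paper"),
    )

rows = conn.execute("SELECT doi, title FROM papers").fetchall()
print(rows)
```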

LLM

from bmlib.llm import LLMClient, LLMMessage

client = LLMClient(default_provider="ollama")
response = client.chat(
    messages=[LLMMessage(role="user", content="Summarise this paper.")],
    model="ollama:medgemma4B_it_q8",
)
print(response.content)

Model strings use the format "provider:model_name":

"anthropic:claude-sonnet-4-20250514"
"openai:gpt-4o"
"ollama:medgemma4B_it_q8"
"deepseek:deepseek-chat"
"mistral:mistral-large-latest"
"gemini:gemini-2.0-flash"
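
A string in this format can be split with a single `str.partition` call. This is a minimal sketch, assuming only the first colon separates provider from model, so that Ollama-style tags such as `llama3:8b` stay intact. `split_model_string` is a hypothetical helper, not a bmlib function, and bmlib's own parsing may differ.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split "provider:model_name" into (provider, model_name).

    Hypothetical helper for illustration; not part of bmlib.
    """
    # partition splits on the first colon only, leaving later colons in the name.
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model_name', got {model!r}")
    return provider, name


print(split_model_string("openai:gpt-4o"))
print(split_model_string("ollama:llama3:8b"))
```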

Publication Sync

from datetime import date
from bmlib.db import connect_sqlite
from bmlib.publications import sync

conn = connect_sqlite("publications.db")
report = sync(
    conn,
    sources=["pubmed", "biorxiv"],
    date_from=date(2025, 1, 1),
    date_to=date(2025, 1, 7),
    email="researcher@example.com",
)
print(f"Added: {report.records_added}, Merged: {report.records_merged}")

Full-Text Retrieval

from bmlib.fulltext import FullTextService, FullTextCache

service = FullTextService(email="researcher@example.com")
result = service.fetch_fulltext(pmc_id="PMC7614751", doi="10.1234/example")

if result.source == "europepmc" and result.html:
    cache = FullTextCache()  # uses platform default directory
    cache.save_html(result.html, "PMC7614751")
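
The module table describes retrieval as a fallback chain (Europe PMC → Unpaywall → DOI). The sketch below shows the general "try each source in order, stop at the first hit" pattern with stand-in fetchers. `first_available`, the fetcher signature, and the `"unpaywall"` and `"doi"` labels are assumptions for illustration, not bmlib's API.

```python
from typing import Callable, Optional

# Each fetcher takes an identifier and returns text, or None when it has nothing.
Fetcher = Callable[[str], Optional[str]]


def first_available(
    identifier: str, fetchers: list[tuple[str, Fetcher]]
) -> tuple[Optional[str], Optional[str]]:
    """Try each (name, fetcher) in order; return (source_name, text) for the first hit."""
    for name, fetch in fetchers:
        text = fetch(identifier)
        if text:
            return name, text
    return None, None


# Stand-in fetchers (hypothetical): real ones would call the remote services.
chain = [
    ("europepmc", lambda _id: None),  # pretend Europe PMC has no full text
    ("unpaywall", lambda _id: "<html>...</html>"),
    ("doi", lambda _id: "should not be reached"),
]

source, text = first_available("10.1234/example", chain)
print(source)
```

Returning the source name alongside the text is what makes a check like `result.source == "europepmc"` possible on the caller's side.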

Quality Assessment

from bmlib.llm import LLMClient
from bmlib.quality import QualityManager

llm = LLMClient()
manager = QualityManager(
    llm=llm,
    classifier_model="anthropic:claude-3-haiku-20240307",
    assessor_model="anthropic:claude-sonnet-4-20250514",
)

assessment = manager.assess(
    title="A Randomized Controlled Trial of ...",
    abstract="We conducted a double-blind RCT ...",
    publication_types=["Randomized Controlled Trial"],
)
print(assessment.study_design, assessment.quality_tier)

Transparency Analysis

from bmlib.transparency import TransparencyAnalyzer

analyzer = TransparencyAnalyzer(email="researcher@example.com")
result = analyzer.analyze("doc-001", doi="10.1038/s41586-024-00001-0")
print(result.transparency_score, result.risk_level)

Development

# Editable install with all extras (includes the dev dependencies)
uv pip install -e ".[all]"

# Run tests
pytest tests/ -v

# Lint and format
ruff check .
ruff format --check .

Documentation

Full API documentation is available in docs/manual/.

License

AGPL-3.0-or-later
