Contributing to ALMA

Thank you for your interest in contributing to ALMA! This document provides guidelines and instructions for contributing.

Code of Conduct
Getting Started
How to Contribute
Development Setup
Code Style Guidelines
Testing Requirements
Pull Request Process
Issue Guidelines
Release Process
Community

Code of Conduct

We are committed to providing a welcoming and inclusive environment. Please:

Be respectful and constructive in discussions
Welcome newcomers and help them get started
Focus on what is best for the community
Show empathy towards other community members

Getting Started

Good First Issues

New to ALMA? Look for issues labeled good-first-issue. These are specifically selected to be approachable for newcomers.

Understanding the Codebase

alma/
├── core.py              # Main ALMA class and interface
├── types.py             # Data types (Heuristic, Outcome, etc.)
├── exceptions.py        # Exception hierarchy (NEW in v0.4.0)
├── config/              # Configuration loading
├── storage/             # Storage backends (SQLite, PostgreSQL, Azure, File)
├── retrieval/           # Memory retrieval and embeddings
├── learning/            # Learning protocols
├── extraction/          # LLM-powered fact extraction
├── graph/               # Graph memory with Neo4j
├── mcp/                 # MCP server for Claude integration
├── progress/            # Work item tracking
├── session/             # Session handoff management
├── domains/             # Domain-specific memory schemas
├── harness/             # Agent harness pattern
├── confidence/          # Forward-looking confidence engine
└── initializer/         # Session initialization

Key Concepts

Memory Types: ALMA has five memory types: Heuristics, Outcomes, Preferences, Domain Knowledge, and Anti-patterns
Scoped Learning: Agents can only learn within their defined domains
The Harness Pattern: Setting -> Context -> Agent -> Memory Schema
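
To make scoped learning concrete, here is a minimal sketch. It is not ALMA's actual implementation: the `MemoryType` enum, the `AgentScope` class, its `can_learn` field, and the local `ScopeViolationError` stand-in are hypothetical names chosen for illustration. Only the five memory types and the rule that an agent learns only within its defined domains come from this guide.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    """The five memory types named in this guide (enum itself is hypothetical)."""

    HEURISTIC = "heuristic"
    OUTCOME = "outcome"
    PREFERENCE = "preference"
    DOMAIN_KNOWLEDGE = "domain_knowledge"
    ANTI_PATTERN = "anti_pattern"


class ScopeViolationError(Exception):
    """Local stand-in so the sketch runs without ALMA installed."""


@dataclass(frozen=True)
class AgentScope:
    """Hypothetical scope object: the domains an agent is allowed to learn in."""

    agent: str
    can_learn: frozenset[str] = field(default_factory=frozenset)

    def check(self, domain: str) -> None:
        """Raise if the agent tries to learn outside its defined domains."""
        if domain not in self.can_learn:
            raise ScopeViolationError(
                f"{self.agent!r} cannot learn in domain {domain!r}"
            )


scope = AgentScope(agent="tester", can_learn=frozenset({"testing"}))
scope.check("testing")  # allowed, returns None
```

Calling `scope.check("deployment")` on the same object would raise `ScopeViolationError`, which is the behavior the `learn()` docstring later in this guide describes.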

Architecture Principles

Pluggable backends: Storage and embedding providers are interchangeable
Clean abstractions: All backends implement abstract base classes
No side effects: Pure functions where possible
Type safety: Full type hints throughout
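
The first two principles can be illustrated with a small sketch. The names `StorageBackend`, `InMemoryBackend`, `save`, `get`, and `remember` are hypothetical and do not mirror ALMA's real base classes; the point is only that calling code depends on the abstract interface, so any backend can be swapped in.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StorageBackend(ABC):
    """Hypothetical interface; ALMA's real base class may differ."""

    @abstractmethod
    def save(self, key: str, value: dict) -> None:
        """Persist a record under key."""

    @abstractmethod
    def get(self, key: str) -> Optional[dict]:
        """Return the record for key, or None if it is absent."""


class InMemoryBackend(StorageBackend):
    """Toy backend used only for illustration."""

    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def save(self, key: str, value: dict) -> None:
        # Copy so later mutation by the caller cannot change stored state.
        self._items[key] = dict(value)

    def get(self, key: str) -> Optional[dict]:
        item = self._items.get(key)
        return dict(item) if item is not None else None


def remember(backend: StorageBackend, key: str, value: dict) -> Optional[dict]:
    """Depends only on the interface, never on a concrete backend."""
    backend.save(key, value)
    return backend.get(key)
```

A new backend written this way only has to subclass the abstract class and implement its methods; Python refuses to instantiate a subclass that leaves an abstract method undefined.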

How to Contribute

Types of Contributions

Type	Description	Difficulty
Documentation	Fix typos, improve explanations, add examples	Easy
Bug Reports	Report issues with clear reproduction steps	Easy
Bug Fixes	Fix reported issues	Medium
Tests	Add test coverage for existing features	Medium
Features	Implement new functionality	Hard
Integrations	Add new storage backends, LLM providers	Hard

What We Need Most Right Now

Documentation improvements - Examples, tutorials, explanations
Test coverage - We need more tests for edge cases
Storage backends - MongoDB, Pinecone, Qdrant integrations
LLM providers - Ollama, Groq, local models for extraction
Language SDKs - TypeScript/JavaScript SDK

Priority Areas (from v0.4.0 Roadmap)

Multi-agent memory sharing
Memory consolidation engine
Event system / webhooks
TypeScript SDK

Development Setup

Prerequisites

Python 3.10+
Git
(Optional) Docker for testing with Neo4j/PostgreSQL
(Optional) Node.js for TypeScript SDK development

Installation

# Clone the repository
git clone https://github.com/RBKunnela/ALMA-memory.git
cd ALMA-memory

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode with all extras
pip install -e ".[dev,local,postgres,azure]"

# Install pre-commit hooks
pip install pre-commit
pre-commit install

Running the Full Test Suite

# Run all tests
pytest

# Run with coverage
pytest --cov=alma --cov-report=html

# Run specific test file
pytest tests/test_storage.py

# Run tests matching a pattern
pytest -k "test_retrieval"

# Run integration tests (requires Docker)
pytest tests/integration/ --run-integration

Setting Up Test Databases

# PostgreSQL with pgvector
docker run -d \
  --name alma-postgres \
  -e POSTGRES_PASSWORD=test \
  -e POSTGRES_DB=alma_test \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Neo4j for graph memory
docker run -d \
  --name alma-neo4j \
  -e NEO4J_AUTH=neo4j/testpassword \
  -p 7474:7474 -p 7687:7687 \
  neo4j:5
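
Containers usually need a few seconds before they accept connections, so integration tests started immediately after `docker run` can fail spuriously. The following is a minimal sketch of a readiness check using only the standard library; `wait_for_port` is a hypothetical helper, not part of ALMA. An open port shows that the server is listening, not that it is ready to serve queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False
```

With the commands above, `wait_for_port("localhost", 5432)` checks PostgreSQL and `wait_for_port("localhost", 7687)` checks the Neo4j Bolt port.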

Code Style Guidelines

Python Style

We use the following tools for code quality:

Tool	Purpose	Config
Black	Code formatting	`pyproject.toml`
isort	Import sorting	`pyproject.toml`
flake8	Linting	`.flake8`
mypy	Type checking	`pyproject.toml`

Running Code Quality Checks

# Format code
black alma/ tests/

# Sort imports
isort alma/ tests/

# Run linter
flake8 alma/ tests/

# Type check
mypy alma/

# Run all checks (same as pre-commit)
pre-commit run --all-files

Code Style Rules

Type hints required: All function signatures must have type hints

# Good
def retrieve(self, task: str, agent: str, top_k: int = 5) -> MemorySlice:
    ...

# Bad
def retrieve(self, task, agent, top_k=5):
    ...

Docstrings required: All public functions/classes need docstrings

def learn(self, agent: str, task: str, outcome: str) -> bool:
    """
    Learn from a task outcome.

    Args:
        agent: The agent that performed the task
        task: Description of the task
        outcome: "success" or "failure"

    Returns:
        True if learning was successful

    Raises:
        ScopeViolationError: If agent cannot learn this type
    """

Use the exception hierarchy: Never raise a generic Exception

# Good
from alma.exceptions import ValidationError, StorageError
raise ValidationError("agent name cannot be empty")

# Bad
raise Exception("agent name cannot be empty")
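
One reason to prefer specific exceptions is that callers can handle a single error type or a whole family of them. The sketch below assumes a shared base class named `ALMAError`; that name and the helper functions are hypothetical, and the classes here are local stand-ins rather than the real ones in `alma.exceptions`.

```python
class ALMAError(Exception):
    """Hypothetical common base class for this sketch."""


class ValidationError(ALMAError):
    """Stand-in for the ValidationError used in this guide."""


class StorageError(ALMAError):
    """Stand-in for the StorageError used in this guide."""


def validate_agent(name: str) -> str:
    """Reject empty agent names with a specific exception type."""
    if not name.strip():
        raise ValidationError("agent name cannot be empty")
    return name


def safe_validate(name: str) -> str:
    """Catch the whole family through the base class."""
    try:
        return validate_agent(name)
    except ALMAError as exc:
        return f"rejected: {exc}"
```

A bare `raise Exception(...)` would force callers to catch `Exception`, which also swallows unrelated bugs such as `KeyError` or `AttributeError`.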

Timezone-aware datetimes: Always use UTC

# Good
from datetime import datetime, timezone
now = datetime.now(timezone.utc)

# Bad: returns a naive datetime and is deprecated since Python 3.12
now = datetime.utcnow()
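
The rule matters because naive and aware datetimes cannot be ordered against each other. The sketch below shows the failure and a small normalizing helper. `to_utc` is a hypothetical helper, and treating naive values as UTC is an assumption that only holds if the code producing them already works in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Normalize to UTC, treating naive values as already UTC (an assumption)."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


aware = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
naive = datetime(2025, 1, 1, 12, 0)

try:
    _ = aware < naive  # ordering naive against aware raises TypeError
except TypeError as exc:
    print(f"comparison failed: {exc}")

# A +02:00 timestamp is the same instant once expressed in UTC.
local = datetime(2025, 1, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_utc(local) == aware)
```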

No eval(): eval() executes arbitrary code and is a security risk. Use json.loads() or ast.literal_eval() instead.
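
As a brief illustration of the two alternatives, the sketch below parses a JSON string and a Python literal, then shows that `ast.literal_eval()` rejects an expression that `eval()` would have executed. The sample strings and the agent name are made up for the example.

```python
import ast
import json

raw_json = '{"agent": "helena", "top_k": 5}'
config = json.loads(raw_json)  # parses JSON only; never executes code

raw_literal = "{'tags': ['api', 'testing'], 'top_k': 5}"
literal = ast.literal_eval(raw_literal)  # accepts Python literals only

try:
    # A function call is not a literal, so it is rejected instead of being run.
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Prefer `json.loads()` for data that crosses a process or network boundary; `ast.literal_eval()` is useful mainly for Python-formatted literals such as tuples or sets that JSON cannot express.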

Testing Requirements

Test Coverage Requirements

Minimum coverage: 80% for new code
Critical paths: 100% coverage for:
- Storage backends (CRUD operations)
- Security-sensitive code (input validation)
- Exception handling paths

Test Structure

tests/
├── unit/                    # Fast, isolated tests
│   ├── test_core.py
│   ├── test_types.py
│   └── test_scoring.py
├── integration/             # Tests with external dependencies
│   ├── test_postgresql.py
│   ├── test_neo4j.py
│   └── test_azure.py
├── e2e/                     # End-to-end scenarios
│   └── test_full_workflow.py
└── fixtures/                # Shared test fixtures
    ├── conftest.py
    └── sample_data.py

Writing Good Tests

import pytest
from alma import ALMA
from alma.exceptions import ValidationError
from alma.types import MemorySlice  # assumed import path; adjust to where MemorySlice is defined

class TestALMARetrieval:
    """Tests for ALMA.retrieve() method."""

    @pytest.fixture
    def alma_instance(self, tmp_path):
        """Create a fresh ALMA instance for testing."""
        config = {
            "project_id": "test",
            "storage": "sqlite",
            "storage_dir": str(tmp_path),
        }
        return ALMA.from_dict(config)

    def test_retrieve_returns_memory_slice(self, alma_instance):
        """retrieve() should return a MemorySlice object."""
        result = alma_instance.retrieve(task="test task", agent="test-agent")
        assert isinstance(result, MemorySlice)

    def test_retrieve_empty_task_raises_validation_error(self, alma_instance):
        """retrieve() should raise ValidationError for empty task."""
        with pytest.raises(ValidationError, match="task cannot be empty"):
            alma_instance.retrieve(task="", agent="test-agent")

    @pytest.mark.parametrize("top_k,expected", [
        (1, 1),
        (5, 5),
        (100, 100),
    ])
    def test_retrieve_respects_top_k(self, alma_instance, top_k, expected):
        """retrieve() should return at most top_k results."""
        result = alma_instance.retrieve(task="test", agent="agent", top_k=top_k)
        assert len(result.heuristics) <= expected

Integration Test Markers

import pytest

@pytest.mark.integration
def test_postgresql_connection():
    """Test that requires PostgreSQL."""
    pass

@pytest.mark.slow
def test_large_dataset():
    """Test that takes >30 seconds."""
    pass

@pytest.mark.azure
def test_cosmos_db():
    """Test that requires Azure credentials."""
    pass
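
Custom markers and the `--run-integration` flag need supporting code in a `conftest.py`. The sketch below follows the pattern from the pytest documentation and shows one way this could be wired up. The project's actual `conftest.py` may differ, and the marker descriptions are paraphrased from the docstrings above.

```python
import pytest


def pytest_addoption(parser):
    """Register the --run-integration command-line flag."""
    parser.addoption(
        "--run-integration",
        action="store_true",
        default=False,
        help="run tests marked as integration",
    )


def pytest_configure(config):
    """Register custom markers so pytest does not warn about unknown marks."""
    config.addinivalue_line("markers", "integration: requires external services")
    config.addinivalue_line("markers", "slow: takes more than 30 seconds")
    config.addinivalue_line("markers", "azure: requires Azure credentials")


def pytest_collection_modifyitems(config, items):
    """Skip integration tests unless --run-integration was given."""
    if config.getoption("--run-integration"):
        return
    skip_integration = pytest.mark.skip(reason="needs --run-integration to run")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip_integration)
```

Markers can also be selected directly, for example `pytest -m "not slow"` to leave out long-running tests.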

Pull Request Process

Before You Start

Check existing issues and PRs to avoid duplicate work
For large changes, open an issue first to discuss the approach
Fork the repository and create a branch from main

Branch Naming

Use descriptive branch names:

feature/add-postgres-storage
fix/retrieval-cache-bug
docs/improve-quickstart
test/add-learning-tests

PR Checklist

Before submitting your PR, ensure:

Code passes all tests (pytest)
Code passes linting (pre-commit run --all-files)
New code has type hints
New code has docstrings
Tests added for new functionality
Documentation updated if needed
CHANGELOG.md updated for user-facing changes
No security vulnerabilities introduced

PR Template

## Summary
Brief description of changes.

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing
How was this tested?

## Checklist
- [ ] Tests pass
- [ ] Linting passes
- [ ] Documentation updated
- [ ] CHANGELOG updated

## Related Issues
Fixes #123

Review Process

Maintainers will review within 48 hours (usually faster)
Address feedback promptly
Be open to suggestions
Once approved, a maintainer will merge the PR

Issue Guidelines

Bug Reports

Please include:

ALMA version (pip show alma-memory)
Python version
Operating system
Storage backend being used
Minimal code to reproduce
Expected vs actual behavior
Full error traceback
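
Most of the version details above can be gathered with a short script. This is a sketch using only the standard library; `collect_environment` is a hypothetical helper, and the distribution name `alma-memory` is taken from the `pip show` command above. The storage backend still has to be stated by hand.

```python
import platform
from importlib import metadata


def collect_environment(dist_name: str = "alma-memory") -> dict[str, str]:
    """Gather version details to paste into a bug report."""
    try:
        alma_version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        alma_version = "not installed"
    return {
        "alma": alma_version,
        "python": platform.python_version(),
        "os": platform.platform(),
    }


for key, value in collect_environment().items():
    print(f"{key}: {value}")
```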

Feature Requests

Please include:

Clear description of the feature
Use case - why is this needed?
Proposed implementation (optional)
Willingness to implement (optional)

Security Issues

Do not open public issues for security vulnerabilities.

Instead, email security@jurevo.io with:

Description of the vulnerability
Steps to reproduce
Potential impact
Suggested fix (if any)

Release Process

Version Numbering

We follow Semantic Versioning:

MAJOR (1.0.0): Breaking changes
MINOR (0.1.0): New features, backwards compatible
PATCH (0.0.1): Bug fixes, backwards compatible
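
The bump rules can be expressed in a few lines. This is an illustrative sketch, not a tool the project ships: `bump` is a hypothetical helper that handles plain MAJOR.MINOR.PATCH strings only and ignores pre-release or build suffixes.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # breaking change: reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new feature: reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fix
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.4.0", "patch"))
print(bump("0.4.0", "minor"))
print(bump("0.4.0", "major"))
```

Starting from 0.4.0, a bug fix release becomes 0.4.1, a backwards-compatible feature release becomes 0.5.0, and a breaking release becomes 1.0.0.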

Release Checklist

Update version in pyproject.toml
Update CHANGELOG.md with release date
Create release PR
After merge, tag release: git tag v0.4.0
Push tag: git push origin v0.4.0
GitHub Actions builds and publishes to PyPI

Community

Getting Help

GitHub Issues: Bug reports and feature requests
GitHub Discussions: Questions and general discussion
Email: renata@jurevo.io (maintainer)

Recognition

Contributors are recognized in:

README.md Contributors section
Release notes
Social media shoutouts

Becoming a Maintainer

Active contributors may be invited to become maintainers. Maintainers receive:

Triage access to issues
Merge access for PRs
Input on project direction

License

By contributing to ALMA, you agree that your contributions will be licensed under the MIT License.

Thank You!

Every contribution matters, whether it is fixing a typo or implementing a major feature. We appreciate your time and effort in making ALMA better!

Questions? Open an issue or reach out to @RBKunnela.
