metadata

S3 Vector Metadata Query Framework

This module provides a framework for building type-safe query expressions for S3 vector metadata filtering. It supports AWS S3 vector search metadata filtering operators and allows building complex nested queries using Python operators.

The framework is designed with the following principles:
- Type safety through dataclasses and type hints
- Pythonic query building using operator overloading (&, |)
- Support for inheritance to create hierarchical metadata models
- Clean separation between data models and query logic

Example:
>>> class DocumentMeta(BaseMetadata):
...     document_id = MetaKey()
...     chunk_seq = MetaKey()
...
>>> meta = DocumentMeta()
>>> query = meta.document_id.eq("doc-1") & meta.chunk_seq.gt(5)
>>> query.to_doc()
{"$and": [{"document_id": {"$eq": "doc-1"}}, {"chunk_seq": {"$gt": 5}}]}

class s3vectorm.metadata.OperatorEnum(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Enumeration of supported query operators for S3 vector metadata filtering.

These operators correspond to the AWS S3 vector metadata filtering operators as documented in the AWS S3 User Guide.

Reference:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-metadata-filtering.html#s3-vectors-metadata-filtering-filterable
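
As a rough illustration of what such an enumeration could look like, the sketch below maps member names to the operator strings used in S3 vector metadata filters. The member names (EQ, NE, ...) and the choice of a plain Enum are assumptions made for this example; the page above does not list the actual members of OperatorEnum.

```python
from enum import Enum


class OperatorEnum(Enum):
    """Hypothetical sketch: member names are assumed, values are filter operator strings."""

    # Comparison operators
    EQ = "$eq"
    NE = "$ne"
    GT = "$gt"
    GTE = "$gte"
    LT = "$lt"
    LTE = "$lte"
    # Membership and existence operators
    IN = "$in"
    NIN = "$nin"
    EXISTS = "$exists"
    # Logical operators used to combine sub-filters
    AND = "$and"
    OR = "$or"


# Build a filter document from enum values instead of raw strings.
filter_doc = {"year": {OperatorEnum.GTE.value: 2020}}

# Enum lookup by value turns an operator string back into a member.
member = OperatorEnum("$in")
```

Using an enumeration rather than bare strings means a mistyped operator fails at attribute lookup instead of producing a filter that the service rejects later.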

class s3vectorm.metadata.Expr(field: str, operator: str, value: Any)

Represents a single query expression for metadata filtering.

An expression consists of a field name, an operator, and a value. For example: field="document_id", operator="$eq", value="doc-1".

Attributes:

field: The metadata field name to filter on
operator: The filtering operator (e.g., "$eq", "$gt", "$in")
value: The value to compare against

Example:
>>> expr = Expr(field="status", operator="$eq", value="active")
>>> expr.to_doc()
{"status": {"$eq": "active"}}

to_doc() -> dict

Convert the expression to a dictionary format suitable for S3 filtering.

Returns:

A dictionary with the field as key and operator/value as nested dict
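
The conversion described above can be sketched as a small dataclass. This is a stand-in written for illustration, assuming only the three documented attributes and the documented output shape; it is not the library's source.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Expr:
    """Illustrative stand-in for a single metadata filter expression."""

    field: str
    operator: str
    value: Any

    def to_doc(self) -> dict:
        # The field name is the outer key; the operator/value pair is nested inside.
        return {self.field: {self.operator: self.value}}


# A list value works the same way as a scalar: it is passed through unchanged.
expr = Expr(field="year", operator="$in", value=[2022, 2023])
doc = expr.to_doc()
```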

class s3vectorm.metadata.CompoundExpr(left: Expr | CompoundExpr, operator: str, right: Expr | CompoundExpr)

Represents a compound query expression combining multiple expressions.

A compound expression contains two sub-expressions (left and right) combined with either AND or OR logic. This allows building complex nested queries.

Attributes:

left: The left-side expression (can be Expr or CompoundExpr)
operator: The logical operator ("$and" or "$or")
right: The right-side expression (can be Expr or CompoundExpr)

Example:
>>> expr1 = Expr(field="status", operator="$eq", value="active")
>>> expr2 = Expr(field="priority", operator="$gt", value=5)
>>> compound = CompoundExpr(left=expr1, operator="$and", right=expr2)
>>> compound.to_doc()
{"$and": [{"status": {"$eq": "active"}}, {"priority": {"$gt": 5}}]}

to_doc() -> dict

Convert the compound expression to a dictionary format for S3 filtering.

Returns:

A dictionary with the operator as key and list of sub-expressions as value
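
To show how nesting and the & and | operators could fit together, here is a minimal sketch under stated assumptions: the _Combinable mixin is a hypothetical helper invented for this example, and the binary left/right nesting follows the attributes documented above rather than the library's actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Union


class _Combinable:
    """Hypothetical mixin: lets expressions be joined with & and |."""

    def __and__(self, other):
        return CompoundExpr(left=self, operator="$and", right=other)

    def __or__(self, other):
        return CompoundExpr(left=self, operator="$or", right=other)


@dataclass
class Expr(_Combinable):
    field: str
    operator: str
    value: Any

    def to_doc(self) -> dict:
        return {self.field: {self.operator: self.value}}


@dataclass
class CompoundExpr(_Combinable):
    left: "Union[Expr, CompoundExpr]"
    operator: str
    right: "Union[Expr, CompoundExpr]"

    def to_doc(self) -> dict:
        # Each side is serialized recursively, so nesting depth is unbounded.
        return {self.operator: [self.left.to_doc(), self.right.to_doc()]}


active = Expr("status", "$eq", "active")
urgent = Expr("priority", "$gt", 5)
mine = Expr("owner", "$eq", "alice")

# Parentheses make the grouping explicit.
query = active & (urgent | mine)
doc = query.to_doc()
```

In Python, & binds more tightly than |, so `a | b & c` is parsed as `a | (b & c)`. Writing the parentheses explicitly avoids surprises when mixing the two operators.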

class s3vectorm.metadata.MetaKey(name: str = '')

Represents a metadata field that can be used in query expressions.

A MetaKey provides methods to create filtering expressions using various operators. Each method returns an Expr object that can be combined with other expressions to build complex queries.

Attributes:

name: The field name used in query expressions

Example:
>>> field = MetaKey(name="status")
>>> expr = field.eq("active")
>>> expr.to_doc()
{"status": {"$eq": "active"}}

eq(other: Any) -> Expr

Create an equality expression (field == value).

ne(other: Any) -> Expr

Create a not-equal expression (field != value).

gt(other: Any) -> Expr

Create a greater-than expression (field > value).

gte(other: Any) -> Expr

Create a greater-than-or-equal expression (field >= value).

lt(other: Any) -> Expr

Create a less-than expression (field < value).

lte(other: Any) -> Expr

Create a less-than-or-equal expression (field <= value).

in_(other: Any) -> Expr

Create an ‘in’ expression (field in [values]).

nin(other: Any) -> Expr

Create a ‘not in’ expression (field not in [values]).

exists(other: bool) -> Expr

Create an existence check expression (field exists/doesn’t exist).
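
The membership and existence methods take a list and a boolean respectively, which is easy to get wrong. The sketch below is a simplified stand-in for MetaKey that returns plain dictionaries instead of Expr objects to stay short; the _doc helper is hypothetical and exists only in this example.

```python
from typing import Any


class MetaKey:
    """Simplified stand-in: returns filter dicts directly rather than Expr objects."""

    def __init__(self, name: str = "") -> None:
        self.name = name

    def _doc(self, operator: str, value: Any) -> dict:
        # Hypothetical helper shared by all operator methods in this sketch.
        return {self.name: {operator: value}}

    def in_(self, other: Any) -> dict:
        # Trailing underscore because "in" is a reserved word in Python.
        return self._doc("$in", other)

    def nin(self, other: Any) -> dict:
        return self._doc("$nin", other)

    def exists(self, other: bool) -> dict:
        return self._doc("$exists", other)


genre = MetaKey(name="genre")

in_doc = genre.in_(["comedy", "drama"])       # match any of the listed values
nin_doc = genre.nin(["horror"])               # exclude the listed values
missing_doc = genre.exists(False)             # match items that lack the field
```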

class s3vectorm.metadata.MetaClass(name, bases, namespace, **kwargs)

Metaclass that scans class definitions for MetaKey fields and registers them.

This metaclass automatically processes class definitions to:
1. Collect MetaKey fields from base classes (supporting inheritance)
2. Scan for annotated and non-annotated MetaKey fields in the current class
3. Ensure all MetaKey instances have proper field names
4. Store field information on the class for runtime access

The metaclass enables the declarative syntax where you can define metadata fields as class attributes and they become queryable at runtime.
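
The registration steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the library's code: the `_meta_fields` attribute name is hypothetical, MetaKey is reduced to a name holder, and the sketch only handles fields that are assigned a MetaKey instance (annotation-only fields are left out).

```python
class MetaKey:
    """Reduced stand-in that only stores the field name."""

    def __init__(self, name: str = "") -> None:
        self.name = name


class MetaClass(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        fields = {}

        # Step 1: start from fields already registered on base classes.
        for base in bases:
            fields.update(getattr(base, "_meta_fields", {}))

        # Steps 2 and 3: scan this class body and fill in missing names.
        for attr, value in namespace.items():
            if isinstance(value, MetaKey):
                if not value.name:
                    value.name = attr  # default the field name to the attribute name
                fields[attr] = value

        cls = super().__new__(mcls, name, bases, namespace, **kwargs)

        # Step 4: store the registry on the class (hypothetical attribute name).
        cls._meta_fields = fields
        return cls


class BaseMetadata(metaclass=MetaClass):
    pass


class DocumentMeta(BaseMetadata):
    document_id = MetaKey()


class ChunkMeta(DocumentMeta):
    chunk_seq = MetaKey()


inherited = sorted(ChunkMeta._meta_fields)
```

Because each class builds a fresh dictionary, fields added in ChunkMeta do not leak back into DocumentMeta.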

class s3vectorm.metadata.BaseMetadata

Base class for metadata models providing field access and query functionality.

This class serves as the foundation for creating metadata models with queryable fields. It automatically manages MetaKey instances through the MetaClass metaclass.

Features:
- Automatic field registration through the MetaClass metaclass
- Class-level field access for building queries
- Support for inheritance of fields from parent classes

Example:
>>> class DocumentMeta(BaseMetadata):
...     document_id = MetaKey()
...     status = MetaKey()
...
>>> query = DocumentMeta.document_id.eq("doc-1") & DocumentMeta.status.eq("active")