Tutorial: Processing your first media asset

This tutorial walks you through the core workflow of MADAM from scratch. By the end you will be able to:

  • Read an image file and inspect its metadata

  • Resize and convert the image using the configured processor

  • Build a reusable pipeline that processes a whole batch of files

  • Store the results in a storage backend

You do not need any prior knowledge of MADAM. To follow along, you will need Python 3.11 or later and an image file of your own (JPEG, PNG, WebP, etc.).

Step 1 — Install MADAM

Install MADAM from PyPI:

pip install madam

If you plan to process audio or video files, also install FFmpeg on your system using your platform's package manager. For example:

sudo apt-get install ffmpeg   # Debian / Ubuntu
brew install ffmpeg           # macOS (Homebrew)
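
Before working with audio or video, it can help to confirm that the ffmpeg executable is actually reachable from Python. The following is a minimal sketch using only the standard library; the helper name ffmpeg_available is made up for this tutorial and is not part of MADAM.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH.

    Hypothetical helper for illustration; MADAM does not provide it.
    """
    # shutil.which returns the full path of the executable, or None.
    return shutil.which('ffmpeg') is not None


if __name__ == '__main__':
    print('FFmpeg found' if ffmpeg_available() else 'FFmpeg not found on PATH')
```

If the check fails right after installation, opening a new terminal session so that the updated PATH is picked up is usually worth trying first.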

Step 2 — Create a registry

Every interaction with MADAM starts with a Madam instance. It acts as a registry that automatically selects the right processor for each file format and carries your configuration:

from madam import Madam

madam = Madam()

That’s all you need for default settings. Step 6 (Configure format defaults) shows how to pass custom quality settings.
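
To make the idea of "selecting the right processor for each file format" more concrete, here is one plausible way a registry could recognise a format: by looking at the first bytes of the file. This is only an illustrative sketch and not MADAM's actual selection logic; the function name sniff_mime_type is an assumption introduced here. The byte signatures themselves are the standard ones for JPEG, PNG, and WebP.

```python
from typing import Optional


def sniff_mime_type(header: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes of a file.

    Illustrative only: a real registry may ask each processor whether it
    can read the file instead of using a fixed signature table.
    """
    if header.startswith(b'\xff\xd8\xff'):
        return 'image/jpeg'
    if header.startswith(b'\x89PNG\r\n\x1a\n'):
        return 'image/png'
    # WebP is a RIFF container: 'RIFF', a 4-byte size, then 'WEBP'.
    if header[:4] == b'RIFF' and header[8:12] == b'WEBP':
        return 'image/webp'
    return None
```

A registry built on this idea would read a few bytes, look up the matching processor, and rewind the file before handing it over.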

Step 3 — Read an image

Open your image file in binary mode and pass it to read():

with open('photo.jpg', 'rb') as f:
    asset = madam.read(f)

asset is now an Asset — an immutable object that holds the raw image data (the essence) and the extracted metadata.

Inspect the metadata using attribute access:

print(asset.mime_type)    # 'image/jpeg'
print(asset.width)        # e.g. 4000
print(asset.height)       # e.g. 3000
print(asset.color_space)  # 'RGB'

If the file contains EXIF data, it is available under the exif key of asset.metadata:

exif = asset.metadata.get('exif', {})
print(exif.get('camera.model'))      # e.g. 'Canon EOS 5D Mark III'
print(exif.get('datetime_original')) # datetime.datetime(2024, 6, 15, …)
print(asset.created_at)              # '2024-06-15T10:30:00'

Note

madam.read() automatically strips embedded metadata (EXIF, IPTC, XMP) from the essence, so the essence bytes contain only the image data itself. The extracted metadata is stored separately in asset.metadata.
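
The split between an immutable essence and separately stored metadata can be pictured with a small stand-in class. The sketch below is not MADAM's Asset implementation; SketchAsset and its behaviour are assumptions chosen to mirror what this step describes (read-only fields, metadata reachable through attribute access).

```python
from dataclasses import FrozenInstanceError, dataclass, field
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class SketchAsset:
    """Toy stand-in for an asset: raw bytes plus read-only metadata."""

    essence: bytes
    metadata: Mapping[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Copy the mapping and wrap it in a read-only view so callers
        # cannot change the metadata after construction.
        object.__setattr__(self, 'metadata', MappingProxyType(dict(self.metadata)))

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails, so this
        # forwards unknown attributes (e.g. .width) to the metadata.
        metadata = self.__dict__.get('metadata', {})
        if name in metadata:
            return metadata[name]
        raise AttributeError(name)


photo = SketchAsset(b'\x00\x01', {'mime_type': 'image/jpeg', 'width': 4000})
print(photo.width)  # metadata value reached through attribute access
```

Because the dataclass is frozen, an assignment such as photo.essence = b'' raises FrozenInstanceError, which is the kind of guarantee that makes it safe to pass one asset to many operators.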

Step 4 — Get a processor and run an operator

To transform an asset you need a processor — an object that knows how to manipulate a particular format. Rather than importing a processor class directly, use get_processor() to obtain the processor that was configured for this asset’s format:

processor = madam.get_processor(asset)

This returns the same processor instance that madam.read() used internally, already initialised with the Madam instance’s configuration. It works with images, audio, and video alike — you never need to import PillowProcessor or FFmpegProcessor directly.

Now create a resize operator — a callable that takes an asset and returns a resized version:

from madam.image import ResizeMode

make_thumbnail = processor.resize(width=200, height=200, mode=ResizeMode.FIT)

The operator is configured once and can be applied to any number of assets. Apply it to your photo:

thumbnail = make_thumbnail(asset)

print(thumbnail.width)   # 200 (or less, because FIT keeps the aspect ratio)
print(thumbnail.height)  # 200 (or less)

The original asset is unchanged — MADAM never mutates assets.
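
The "200 (or less)" behaviour follows from simple arithmetic: the image is scaled by the smaller of the two width and height ratios so that it fits inside the box. The sketch below shows that calculation together with the "configure once, apply many times" pattern. It is an approximation under stated assumptions: make_fit_resizer is a hypothetical name, it works on (width, height) tuples rather than assets, and MADAM's exact rounding and upscaling behaviour may differ.

```python
from typing import Callable, Tuple

Size = Tuple[int, int]


def make_fit_resizer(max_width: int, max_height: int) -> Callable[[Size], Size]:
    """Return a callable that fits a (width, height) pair into a box."""

    def resize(size: Size) -> Size:
        width, height = size
        # Use the smaller ratio so both dimensions stay inside the box
        # while the aspect ratio is preserved.
        scale = min(max_width / width, max_height / height)
        return max(1, round(width * scale)), max(1, round(height * scale))

    return resize


# Configure once, then apply to any number of sizes.
to_thumbnail_size = make_fit_resizer(200, 200)
print(to_thumbnail_size((4000, 3000)))  # (200, 150)
print(to_thumbnail_size((3000, 4000)))  # (150, 200)
```

Note that this sketch also scales small inputs up to the box; whether MADAM's FIT mode does the same is not covered in this tutorial.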

Step 5 — Convert format and save

Create a format-conversion operator and chain it:

to_webp = processor.convert(mime_type='image/webp')
webp_thumbnail = to_webp(thumbnail)

print(webp_thumbnail.mime_type)  # 'image/webp'
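
Because every operator takes one asset and returns a new one, chaining is ordinary function composition. The following sketch shows the idea with plain strings standing in for assets; chain, shrink, and as_webp are hypothetical names for illustration and are not part of MADAM.

```python
from functools import reduce
from typing import Callable, TypeVar

T = TypeVar('T')


def chain(*operators: Callable[[T], T]) -> Callable[[T], T]:
    """Combine operators into one callable that applies them left to right."""

    def run(value: T) -> T:
        return reduce(lambda current, operator: operator(current), operators, value)

    return run


# Stand-in operators that work on strings instead of real assets.
def shrink(name: str) -> str:
    return name + ' -> 200x200'


def as_webp(name: str) -> str:
    return name + ' -> image/webp'


make_webp_thumbnail = chain(shrink, as_webp)
print(make_webp_thumbnail('photo.jpg'))  # photo.jpg -> 200x200 -> image/webp
```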

Write the result to disk with write():

with open('thumbnail.webp', 'wb') as f:
    madam.write(webp_thumbnail, f)

You can also write the raw essence directly if you prefer:

with open('thumbnail.webp', 'wb') as f:
    f.write(webp_thumbnail.essence.read())

Step 6 — Configure format defaults

Pass a configuration dictionary to Madam() to set quality and codec defaults. These settings are automatically applied by the processor that get_processor() returns:

madam = Madam({
    'image/jpeg': {'quality': 85, 'progressive': True},
    'image/webp': {'quality': 80, 'method': 6},
})

with open('photo.jpg', 'rb') as f:
    asset = madam.read(f)

processor = madam.get_processor(asset)
convert = processor.convert(mime_type='image/jpeg')
result = convert(asset)
# The result is encoded at quality=85 because the Madam configuration says so.

See Configuration for the full list of options for every format.
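
Conceptually, per-format defaults are a lookup keyed by MIME type. The sketch below shows one way such a lookup could work, using the same dictionary as above. It is an assumption for illustration only: effective_options is a hypothetical helper, and the rule that explicit arguments override the defaults is a design choice of this sketch, not documented MADAM behaviour.

```python
from typing import Any, Dict, Mapping


def effective_options(
    config: Mapping[str, Mapping[str, Any]],
    mime_type: str,
    **overrides: Any,
) -> Dict[str, Any]:
    """Return the defaults for a format, updated with explicit overrides."""
    # Copy so the shared configuration is never modified.
    options = dict(config.get(mime_type, {}))
    options.update(overrides)
    return options


config = {
    'image/jpeg': {'quality': 85, 'progressive': True},
    'image/webp': {'quality': 80, 'method': 6},
}

print(effective_options(config, 'image/jpeg'))
# {'quality': 85, 'progressive': True}
print(effective_options(config, 'image/jpeg', quality=60))
# {'quality': 60, 'progressive': True}
```

Formats without an entry simply get an empty dictionary here, which corresponds to falling back to built-in defaults.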

Step 7 — Build a pipeline

When you need to apply the same sequence of operators to many assets, use a Pipeline:

import pathlib

from madam.core import Pipeline
from madam.image import ResizeMode

# Build the pipeline once.
pipeline = Pipeline()
pipeline.add(processor.resize(width=800, height=800, mode=ResizeMode.FIT))
pipeline.add(processor.sharpen(radius=1, percent=100))
pipeline.add(processor.convert(mime_type='image/webp'))

# Read all source images.
sources = []
for path in pathlib.Path('originals/').glob('*.jpg'):
    with open(path, 'rb') as f:
        sources.append(madam.read(f))

# Make sure the output directory exists before writing to it.
output_dir = pathlib.Path('output')
output_dir.mkdir(parents=True, exist_ok=True)

# Process and save.
for processed in pipeline.process(*sources):
    name = processed.content_id + '.webp'
    with open(output_dir / name, 'wb') as f:
        madam.write(processed, f)

content_id is a SHA-256 digest of the essence bytes, so it is safe to use as a filename and is unique to the content: two assets with identical essences get the same name.
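
The naming scheme can be reproduced with the standard library. This sketch assumes the identifier is the hexadecimal form of the digest; the function name content_id_of is hypothetical, and the exact string format MADAM uses is not confirmed here.

```python
import hashlib


def content_id_of(essence: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(essence).hexdigest()


first = content_id_of(b'example image bytes')
second = content_id_of(b'example image bytes')
other = content_id_of(b'different image bytes')

print(len(first))       # 64 hexadecimal characters
print(first == second)  # True: same content, same name
print(first == other)   # False: different content, different name
```

One consequence of content-based names is that processing the same input twice overwrites the earlier file instead of producing a duplicate.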

Step 8 — Store assets

Use a storage backend to keep assets organised and searchable. The simplest backend is InMemoryStorage:

from madam.core import InMemoryStorage

storage = InMemoryStorage()

for path in pathlib.Path('originals/').glob('*.jpg'):
    with open(path, 'rb') as f:
        asset = madam.read(f)
    # Store with a key and tags.
    storage[path.stem] = (asset, {'photo', 'original'})

# Retrieve by key (assumes originals/hero.jpg was stored in the loop above).
hero, tags = storage['hero']

# Filter by metadata value.
jpegs = list(storage.filter(mime_type='image/jpeg'))

# Filter by tag.
originals = list(storage.filter_by_tags({'original'}))
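
To show what a key, asset, and tag store does internally, here is a small dictionary-backed stand-in. It is a sketch, not InMemoryStorage itself: SketchStorage is a hypothetical class, assets are represented by plain strings, and the rule that an entry must carry all requested tags is an assumption of this example.

```python
from typing import Any, Dict, FrozenSet, Iterable, Iterator, Tuple


class SketchStorage:
    """Toy key -> (asset, tags) store backed by a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[Any, FrozenSet[str]]] = {}

    def __setitem__(self, key: str, value: Tuple[Any, Iterable[str]]) -> None:
        asset, tags = value
        # Freeze the tags so later changes to the caller's set have no effect.
        self._items[key] = (asset, frozenset(tags))

    def __getitem__(self, key: str) -> Tuple[Any, FrozenSet[str]]:
        return self._items[key]

    def filter_by_tags(self, tags: Iterable[str]) -> Iterator[Any]:
        wanted = frozenset(tags)
        for asset, asset_tags in self._items.values():
            # Keep entries that carry every requested tag.
            if wanted <= asset_tags:
                yield asset


store = SketchStorage()
store['hero'] = ('hero-asset', {'photo', 'original'})
store['logo'] = ('logo-asset', {'graphic'})
print(list(store.filter_by_tags({'original'})))  # ['hero-asset']
```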

For storage that persists across restarts, use FileSystemAssetStorage instead. It writes each asset to its own file in a directory, and each write is atomic:

from madam.core import FileSystemAssetStorage

storage = FileSystemAssetStorage('/var/lib/myapp/assets')
storage['hero'] = (asset, {'homepage'})
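
Atomic writes are commonly implemented by writing to a temporary file in the target directory and then renaming it over the destination, so readers never see a half-written file. The sketch below shows that general technique with the standard library. It is not taken from FileSystemAssetStorage; write_atomically is a hypothetical helper, and how MADAM achieves atomicity internally is not confirmed here.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data to path so that the file appears complete or not at all."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process crashes before the rename, the previous version of the file (if any) remains untouched.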

What’s next?

Now that you know the basics, explore the rest of the documentation: