Tutorial: Processing your first media asset
###########################################

This tutorial walks you through the core workflow of MADAM from scratch.  By
the end you will be able to:

* Read an image file and inspect its metadata
* Resize and convert the image using the configured processor
* Build a reusable pipeline that processes a whole batch of files
* Store the results in a storage backend

You do not need any prior knowledge of MADAM.  You will need Python 3.11 or
later and a copy of your own image file (JPEG, PNG, WebP, etc.) to follow
along.

.. contents::
   :local:
   :depth: 2


Step 1 — Install MADAM
======================

Install MADAM from PyPI::

   pip install madam

If you plan to process audio or video files, also install FFmpeg on your
system.  On most Linux distributions::

   sudo apt-get install ffmpeg   # Debian / Ubuntu
   brew install ffmpeg           # macOS (Homebrew)


Step 2 — Create a registry
===========================

Every interaction with MADAM starts with a :class:`~madam.core.Madam`
instance.  It acts as a registry that automatically selects the right
processor for each file format and carries your configuration:

.. code-block:: python

   from madam import Madam

   madam = Madam()

That's all you need for default settings.  In :ref:`step-6-configure` you
will see how to pass custom quality settings.


Step 3 — Read an image
=======================

Open your image file in binary mode and pass it to
:meth:`~madam.core.Madam.read`:

.. code-block:: python

   with open('photo.jpg', 'rb') as f:
       asset = madam.read(f)

``asset`` is now an :class:`~madam.core.Asset` — an immutable object that
holds the raw image data (the *essence*) and the extracted metadata.

Inspect the metadata using attribute access:

.. code-block:: python

   print(asset.mime_type)    # 'image/jpeg'
   print(asset.width)        # e.g. 4000
   print(asset.height)       # e.g. 3000
   print(asset.color_space)  # 'RGB'

If the file contains EXIF data it is available under the ``exif`` key:

.. code-block:: python

   exif = asset.metadata.get('exif', {})
   print(exif.get('camera.model'))      # e.g. 'Canon EOS 5D Mark III'
   print(exif.get('datetime_original')) # datetime.datetime(2024, 6, 15, …)
   print(asset.created_at)              # '2024-06-15T10:30:00'

.. note::

   ``madam.read()`` automatically strips embedded metadata (EXIF, IPTC, XMP)
   from the essence so that the raw bytes represent *only* the pixel data.
   The metadata is stored separately in ``asset.metadata``.


Step 4 — Get a processor and run an operator
=============================================

To transform an asset you need a *processor* — an object that knows how to
manipulate a particular format.  Rather than importing a processor class
directly, use :meth:`~madam.core.Madam.get_processor` to obtain the processor
that was configured for this asset's format:

.. code-block:: python

   processor = madam.get_processor(asset)

This returns the same processor instance that ``madam.read()`` used
internally, already initialised with the Madam instance's configuration.
It works with images, audio, and video alike — you never need to import
:class:`~madam.image.PillowProcessor` or :class:`~madam.ffmpeg.FFmpegProcessor`
directly.

Now create a *resize operator* — a callable that takes an asset and returns a
resized version:

.. code-block:: python

   from madam.image import ResizeMode

   make_thumbnail = processor.resize(width=200, height=200, mode=ResizeMode.FIT)

The operator is configured once and can be applied to any number of assets.
Apply it to your photo:

.. code-block:: python

   thumbnail = make_thumbnail(asset)

   print(thumbnail.width)   # 200 (or less, because FIT keeps the aspect ratio)
   print(thumbnail.height)  # 200 (or less)

The original ``asset`` is unchanged — MADAM never mutates assets.


Step 5 — Convert format and save
==================================

Create a format-conversion operator and chain it:

.. code-block:: python

   to_webp = processor.convert(mime_type='image/webp')
   webp_thumbnail = to_webp(thumbnail)

   print(webp_thumbnail.mime_type)  # 'image/webp'

Write the result to disk with :meth:`~madam.core.Madam.write`:

.. code-block:: python

   with open('thumbnail.webp', 'wb') as f:
       madam.write(webp_thumbnail, f)

You can also write the raw essence directly if you prefer:

.. code-block:: python

   with open('thumbnail.webp', 'wb') as f:
       f.write(webp_thumbnail.essence.read())


.. _step-6-configure:

Step 6 — Configure format defaults
=====================================

Pass a configuration dictionary to ``Madam()`` to set quality and codec
defaults.  These settings are automatically applied by the processor that
``get_processor()`` returns:

.. code-block:: python

   madam = Madam({
       'image/jpeg': {'quality': 85, 'progressive': True},
       'image/webp': {'quality': 80, 'method': 6},
   })

   with open('photo.jpg', 'rb') as f:
       asset = madam.read(f)

   processor = madam.get_processor(asset)
   convert = processor.convert(mime_type='image/jpeg')
   result = convert(asset)
   # The result is saved at quality=85 because the Madam config says so.

See :doc:`configuration` for the full list of options for every format.


Step 7 — Build a pipeline
===========================

When you need to apply the same sequence of operators to many assets, use a
:class:`~madam.core.Pipeline`:

.. code-block:: python

   from madam.core import Pipeline
   from madam.image import ResizeMode

   # Build the pipeline once.
   pipeline = Pipeline()
   pipeline.add(processor.resize(width=800, height=800, mode=ResizeMode.FIT))
   pipeline.add(processor.sharpen(radius=1, percent=100))
   pipeline.add(processor.convert(mime_type='image/webp'))

   # Read all source images.
   import pathlib

   sources = []
   for path in pathlib.Path('originals/').glob('*.jpg'):
       with open(path, 'rb') as f:
           sources.append(madam.read(f))

   # Process and save.
   for processed in pipeline.process(*sources):
       name = processed.content_id + '.webp'
       with open(f'output/{name}', 'wb') as f:
           madam.write(processed, f)

:attr:`~madam.core.Asset.content_id` is a SHA-256 digest of the essence bytes,
making it a safe unique filename.


Step 8 — Store assets
======================

Use a storage backend to keep assets organised and searchable.  The simplest
backend is :class:`~madam.core.InMemoryStorage`:

.. code-block:: python

   from madam.core import InMemoryStorage

   storage = InMemoryStorage()

   for path in pathlib.Path('originals/').glob('*.jpg'):
       with open(path, 'rb') as f:
           asset = madam.read(f)
       # Store with a key and tags.
       storage[path.stem] = (asset, {'photo', 'original'})

   # Retrieve by key.
   hero, tags = storage['hero']

   # Filter by metadata value.
   jpegs = list(storage.filter(mime_type='image/jpeg'))

   # Filter by tag.
   originals = list(storage.filter_by_tags({'original'}))

For persistent storage across restarts, use
:class:`~madam.core.FileSystemAssetStorage` instead — it writes one file per
asset to a directory atomically:

.. code-block:: python

   from madam.core import FileSystemAssetStorage

   storage = FileSystemAssetStorage('/var/lib/myapp/assets')
   storage['hero'] = (asset, {'homepage'})


What's next?
=============

Now that you know the basics, explore the rest of the documentation:

* :doc:`howto` — Practical recipes for specific tasks (effects, metadata,
  video, pipelines, optional formats, …)
* :doc:`explanation` — Why MADAM is designed the way it is (immutable assets,
  the operator pattern, format detection, …)
* :doc:`configuration` — Full reference for all format-specific settings
* :ref:`modindex` — Complete API reference for every class and function