`madam.ffmpeg` module

class madam.ffmpeg.AudioCodec[source]

Bases: object

Named constants for audio codec strings accepted by FFmpegProcessor.convert().

Use these instead of raw FFmpeg codec names to avoid depending on FFmpeg internals:

processor.convert(mime_type='audio/mpeg', audio={'codec': AudioCodec.MP3})

Added in version 0.23.

class madam.ffmpeg.FFmpegContext(processor: FFmpegProcessor, asset: Asset, graph: FFmpegFilterGraph)[source]

Bases: ProcessingContext

Deferred in-memory state for an FFmpeg processing run.

Holds the original input Asset and an FFmpegFilterGraph that accumulates the filter chain built up by consecutive FFmpegProcessor operators. Call materialize() to execute a single ffmpeg subprocess that applies all accumulated filters at once.

Instances are created by FFmpegProcessor and passed to execute_run(). Custom operator implementations can inspect or extend the accumulated state before materialisation.

Variables:

asset (Asset) – The original input asset whose essence will be passed as the ffmpeg input file. Do not replace this attribute; append filters to graph instead.
graph (FFmpegFilterGraph) – The filter graph being built up for this run. Append additional filters or set codec options by calling the mutation methods on this object.

Added in version 1.0.

__init__(processor: FFmpegProcessor, asset: Asset, graph: FFmpegFilterGraph) → None[source]

materialize() → Asset[source]: Run one ffmpeg subprocess from the accumulated filter graph.

property processor: FFmpegProcessor: The Processor that owns this context.

class madam.ffmpeg.FFmpegFilterGraph[source]

Bases: object

Accumulates FFmpeg video and audio filters for a single deferred pipeline run.

An FFmpegFilterGraph is created automatically by FFmpegProcessor when a group of consecutive FFmpeg operators is gathered by Pipeline for deferred execution. Custom operator implementations can also receive one via FFmpegContext.graph and call its mutation methods.

Mutation interface — call these from operator implementations:

add_video_filter() — append a video filter (e.g. scale, crop).
add_audio_filter() — append an audio filter (e.g. volume, atrim).
set_output_format() — set the target MIME type for the encoded output.
set_codec_options() — merge codec/muxer options; raises on conflict.

Read-only views — inspect after accumulation:

video_filter_string — comma-joined -vf filter string.
audio_filter_string — comma-joined -af filter string.

Variables:

output_mime_type (str | None) – MIME type string set by set_output_format(), or None if not yet set.
codec_options (dict[str, Any]) – Codec and muxer options accumulated by set_codec_options(). Keys are ffmpeg option names (e.g. 'vcodec', 'acodec'); values are their settings.
extra_input_args (list[str]) – Additional raw ffmpeg CLI arguments inserted immediately before the -i <input> flag. Use for input-side options such as ['-ss', '00:00:10'] to seek before decoding.
extra_output_args (list[str]) – Additional raw ffmpeg CLI arguments inserted immediately after the -i <input> flag, before filter and codec flags. Use for output-side options that are not expressible as filters, such as ['-t', '30'] to limit duration.

Added in version 1.0.

__init__() → None[source]

add_audio_filter(name: str, **params: Any) → None[source]

Append an audio filter (e.g. volume, atrim) to the chain.

add_video_filter(name: str, **params: Any) → None[source]

Append a video filter (e.g. scale, crop) to the chain.

property audio_filter_string: str

Comma-joined FFmpeg -af filter string, or empty string.

set_codec_options(**opts: Any) → None[source]

Merge codec options into the accumulated options dict.

Raises: ValueError – if the same key already has a different value.

set_output_format(mime_type: str) → None[source]

Set the target output MIME type for this run.

property video_filter_string: str

Comma-joined FFmpeg -vf filter string, or empty string.
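The accumulation rules documented for FFmpegFilterGraph can be pictured with a small stand-in class. This is only a sketch, not madam's implementation: the class name SketchFilterGraph and the `name=key=value:key=value` rendering of filter parameters are assumptions made for illustration. Only the method names, the comma-joined views, and the conflict rule come from the reference above.

```python
from __future__ import annotations

from typing import Any


class SketchFilterGraph:
    """Hypothetical stand-in that mimics the documented accumulation rules."""

    def __init__(self) -> None:
        self._video: list[str] = []
        self._audio: list[str] = []
        self.output_mime_type: str | None = None
        self.codec_options: dict[str, Any] = {}

    @staticmethod
    def _render(name: str, params: dict[str, Any]) -> str:
        # Assumed rendering: FFmpeg's "name=key=value:key=value" filter syntax.
        if not params:
            return name
        return name + "=" + ":".join(f"{k}={v}" for k, v in params.items())

    def add_video_filter(self, name: str, **params: Any) -> None:
        self._video.append(self._render(name, params))

    def add_audio_filter(self, name: str, **params: Any) -> None:
        self._audio.append(self._render(name, params))

    def set_output_format(self, mime_type: str) -> None:
        self.output_mime_type = mime_type

    def set_codec_options(self, **opts: Any) -> None:
        # A key may be repeated only if the value is identical.
        for key, value in opts.items():
            if key in self.codec_options and self.codec_options[key] != value:
                raise ValueError(f"conflicting values for {key!r}")
            self.codec_options[key] = value

    @property
    def video_filter_string(self) -> str:
        return ",".join(self._video)

    @property
    def audio_filter_string(self) -> str:
        return ",".join(self._audio)


graph = SketchFilterGraph()
graph.add_video_filter("crop", w=320, h=180, x=0, y=0)
graph.add_video_filter("scale", w=160, h=90)
graph.add_audio_filter("volume", volume=0.5)
graph.set_codec_options(vcodec="libx264")
print(graph.video_filter_string)
print(graph.audio_filter_string)
```

Because both filters end up in one comma-joined string, a single ffmpeg invocation can apply them in order, which is the point of deferring execution.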

class madam.ffmpeg.FFmpegMetadataProcessor(config: Mapping[str, Any] | None = None)[source]

Bases: MetadataProcessor

Represents a metadata processor that uses FFmpeg.

__init__(config: Mapping[str, Any] | None = None) → None[source]

Initializes a new FFmpegMetadataProcessor.

Parameters: config – Mapping with settings.

combine(file: IO, metadata: Mapping[str, Mapping]) → IO[source]

Returns a byte stream containing the contents of the specified file with the specified metadata added.

Parameters:

metadata (Mapping) – Mapping of the metadata format to the metadata dict
file (IO) – Container file

Returns:

file-like object with combined content

Return type:

IO

property formats: Iterable[str]

The metadata formats which are supported.

Returns: supported metadata formats
Return type: set[str]

read(file: IO) → Mapping[str, Mapping][source]

Reads the file and returns the metadata.

The returned metadata is grouped by metadata format: each key is a format name and each value is the metadata mapping for that format.

Parameters: file (IO) – File-like object to be read
Returns: Metadata contained in the file
Return type: Mapping
Raises: UnsupportedFormatError – if the data is corrupt or its format is not supported

strip(file: IO) → IO[source]

Removes all metadata of the supported type from the specified file.

Parameters: file (IO) – file-like object to be stripped of metadata
Returns: file-like object without metadata
Return type: IO

class madam.ffmpeg.FFmpegProcessor(config: Mapping[str, Any] | None = None)[source]

Bases: Processor

Represents a processor that uses FFmpeg to read audio and video data.

The minimum version of FFmpeg required is v3.3.

__init__(config: Mapping[str, Any] | None = None) → None[source]

Initializes a new FFmpegProcessor.

Parameters: config – Mapping with settings.
Raises: EnvironmentError – if ffprobe is not found, times out, or its version is below the minimum requirement (3.3).

can_read(file: IO) → bool[source]

Returns whether the data format of the specified file is supported by this processor.

Parameters: file (IO) – file-like object to be tested
Returns: whether the data format of the specified file is supported or not
Return type: bool

convert(asset: Asset, mime_type: MimeType | str, video: Mapping[str, Any] | None = None, audio: Mapping[str, Any] | None = None, subtitle: Mapping[str, Any] | None = None, progress_callback: Callable[[dict[str, str]], None] | None = None) → Asset[source]

Creates a new asset of the specified MIME type from the essence of the specified asset.

Additional options can be specified for video, audio, and subtitle streams. Options are passed as dictionary instances and can contain various keys for each stream type.

Options for video streams:

codec – Processor-specific name of the video codec as string
bitrate – Target bitrate in kBit/s as float number

Options for audio streams:

codec – Processor-specific name of the audio codec as string
bitrate – Target bitrate in kBit/s as float number

Options for subtitle streams:

codec – Processor-specific name of the subtitle format as string

Parameters:

asset (Asset) – Asset whose contents will be converted
mime_type (MimeType or str) – MIME type of the video container
video (dict or None) – Dictionary with options for video streams.
audio (dict or None) – Dictionary with options for audio streams.
subtitle (dict or None) – Dictionary with the options for subtitle streams.

Returns:

New asset with converted essence

Return type:

Asset
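To make the option dictionaries more concrete, the following sketch shows one plausible way such options could map onto ffmpeg command-line flags. The helper name stream_flags is hypothetical, and the mapping is an assumption rather than a description of what convert() does internally. `-c:v`, `-b:v`, `-c:a` and `-b:a` are standard ffmpeg options; the codec strings are example encoder names, not the values of the VideoCodec or AudioCodec constants.

```python
from __future__ import annotations

from typing import Any, Mapping


def stream_flags(
    video: Mapping[str, Any] | None = None,
    audio: Mapping[str, Any] | None = None,
) -> list[str]:
    """Hypothetical helper: translate option dicts into ffmpeg CLI flags."""
    flags: list[str] = []
    for spec, options in (("v", video), ("a", audio)):
        if not options:
            continue
        if "codec" in options:
            flags += [f"-c:{spec}", str(options["codec"])]
        if "bitrate" in options:
            # The documented unit is kBit/s; ffmpeg's -b option takes bits/s.
            flags += [f"-b:{spec}", str(round(options["bitrate"] * 1000))]
    return flags


flags = stream_flags(
    video={"codec": "libx264", "bitrate": 500.0},
    audio={"codec": "aac", "bitrate": 128},
)
print(flags)
```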

crop(asset: Asset, *, x: int, y: int, width: int, height: int) → Asset[source]

Creates a cropped video asset whose essence is cropped to the specified rectangular area.

Parameters:

asset (Asset) – Video asset whose contents will be cropped
x (int) – Horizontal offset of the cropping area from left
y (int) – Vertical offset of the cropping area from top
width (int) – Width of the cropping area
height (int) – Height of the cropping area

Returns:

New asset with cropped essence

Return type:

Asset

execute_run(steps: list[Callable], asset_or_context: Asset | FFmpegContext) → Asset | FFmpegContext[source]

Group consecutive FFmpegProcessor operators into a single subprocess.

Each step’s _accumulate_* method appends to the FFmpegFilterGraph. Operators without an accumulation method fall back to direct sequential execution (which may spawn a subprocess). The accumulated context is returned for the pipeline to materialise at the next processor boundary or pipeline end.
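The grouping idea can be illustrated without madam at all. In this sketch each step is a hypothetical (owner, operation) pair, and consecutive steps with the same owner are collected into one run; the owner labels and operation names are placeholders, and how Pipeline actually identifies the owning processor is not covered by this reference.

```python
from itertools import groupby


def group_runs(steps):
    """Collect consecutive steps that share an owner into a single run."""
    return [
        (owner, [operation for _, operation in run])
        for owner, run in groupby(steps, key=lambda step: step[0])
    ]


steps = [
    ("ffmpeg", "resize"),
    ("ffmpeg", "crop"),
    ("other", "sharpen"),  # a different processor ends the first run
    ("ffmpeg", "trim"),
]
runs = group_runs(steps)
print(runs)
```

Under this model the first two steps would share one subprocess, while the final trim starts a new run because a different processor sits in between.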

extract_frame(asset: Asset, mime_type: MimeType | str, seconds: float = 0) → Asset[source]

Creates a new image asset of the specified MIME type from the essence of the specified video asset.

Parameters:

asset (Asset) – Video asset which will serve as the source for the frame
mime_type (MimeType or str) – MIME type of the destination image
seconds (float) – Offset of the frame in seconds

Returns:

New image asset with converted essence

Return type:

Asset

normalize_audio(asset: Asset, target_lufs: float = -23.0) → Asset[source]

Creates a new asset whose audio stream is loudness-normalized to target_lufs LUFS (EBU R128).

Uses a two-pass approach with the FFmpeg loudnorm filter. The first pass measures integrated loudness, LRA, and true peak; the second pass applies a linear gain correction using those measurements for accurate normalization without re-quantizing the signal unnecessarily.

Parameters:

asset (Asset) – Audio or video asset to normalize
target_lufs (float) – Target integrated loudness in LUFS

Returns:

New asset with normalized audio

Return type:

Asset

Raises:

UnsupportedFormatError – If the asset type is not supported
OperatorError – If loudness measurement or normalization fails
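The two passes can be sketched as two loudnorm filter strings. This is an illustration under assumptions, not madam's code: the helper names are hypothetical and the measurement values are made-up placeholders. The option names (I, print_format, measured_I, measured_LRA, measured_TP, measured_thresh, offset, linear) and the JSON keys (input_i, input_lra, input_tp, input_thresh, target_offset) are those of the FFmpeg loudnorm filter.

```python
import json


def measurement_filter(target_lufs: float) -> str:
    """First pass: analyse only and print the measurements as JSON."""
    return f"loudnorm=I={target_lufs}:print_format=json"


def correction_filter(target_lufs: float, measured: dict) -> str:
    """Second pass: feed the measurements back and request linear gain."""
    return (
        f"loudnorm=I={target_lufs}"
        f":measured_I={measured['input_i']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_thresh={measured['input_thresh']}"
        f":offset={measured['target_offset']}"
        ":linear=true"
    )


# Placeholder report in the shape loudnorm prints; the numbers are invented.
report = json.loads(
    '{"input_i": "-27.61", "input_tp": "-4.47", "input_lra": "18.06",'
    ' "input_thresh": "-39.20", "target_offset": "0.58"}'
)
first_pass = measurement_filter(-23.0)
second_pass = correction_filter(-23.0, report)
print(first_pass)
print(second_pass)
```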

overlay(asset: Asset, overlay_asset: Asset, x: int = 0, y: int = 0, gravity: str = 'north_west', from_seconds: float | None = None, to_seconds: float | None = None) → Asset[source]

Composites overlay_asset on top of asset at the specified position.

Position can be set explicitly with x and y pixel offsets from the top-left corner, or implicitly via gravity (same nine-point vocabulary as pad()). When both x/y and gravity are meaningful, x and y act as additional offsets relative to the gravity anchor.

The overlay can be restricted to a time window with from_seconds and to_seconds. Outside the window the base video is shown unmodified.

Parameters:

asset (Asset) – Base video asset
overlay_asset (Asset) – Image or video asset to composite on top
x (int) – Horizontal pixel offset from the left edge (or gravity anchor)
y (int) – Vertical pixel offset from the top edge (or gravity anchor)
gravity (str) – One of north_west, north, north_east, west, center, east, south_west, south, south_east
from_seconds (float or None) – Start time of the overlay window in seconds; None means the overlay is visible from the beginning
to_seconds (float or None) – End time of the overlay window in seconds; None means the overlay is visible until the end

Returns:

New video asset with overlay composited

Return type:

Asset

Raises:

UnsupportedFormatError – If the base asset type is not supported
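As a rough illustration of the positioning rules, the sketch below resolves a gravity name plus offsets into a top-left pixel position and builds an FFmpeg timeline expression for the time window. The function names are hypothetical, and the choice to add x and y to the anchor position in the right/down direction for every gravity is an assumption; the reference above only says the offsets are relative to the gravity anchor. `between`, `gte` and `lte` are standard FFmpeg expression functions.

```python
# Fractions of the free space (base size minus overlay size) per gravity.
GRAVITY = {
    "north_west": (0.0, 0.0), "north": (0.5, 0.0), "north_east": (1.0, 0.0),
    "west": (0.0, 0.5), "center": (0.5, 0.5), "east": (1.0, 0.5),
    "south_west": (0.0, 1.0), "south": (0.5, 1.0), "south_east": (1.0, 1.0),
}


def overlay_position(base_size, overlay_size, gravity="north_west", x=0, y=0):
    """Return the overlay's top-left corner in base-image pixels."""
    if gravity not in GRAVITY:
        raise ValueError(f"unknown gravity: {gravity!r}")
    fx, fy = GRAVITY[gravity]
    base_w, base_h = base_size
    over_w, over_h = overlay_size
    # Assumption: offsets are simply added to the anchor position.
    return (round((base_w - over_w) * fx) + x, round((base_h - over_h) * fy) + y)


def enable_expression(from_seconds=None, to_seconds=None):
    """Return an FFmpeg timeline expression, or None for 'always visible'."""
    if from_seconds is None and to_seconds is None:
        return None
    if to_seconds is None:
        return f"gte(t,{from_seconds})"
    if from_seconds is None:
        return f"lte(t,{to_seconds})"
    return f"between(t,{from_seconds},{to_seconds})"


print(overlay_position((1920, 1080), (200, 100), "south_east"))
print(enable_expression(2, 5))
```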

read(file: IO) → Asset[source]

Returns an Asset object whose essence is identical to the contents of the specified file.

Parameters: file (IO) – file-like object to be read
Returns: Asset with essence
Return type: Asset
Raises: UnsupportedFormatError – if the specified data format is not supported

resize(asset: Asset, width: int, height: int) → Asset[source]

Creates a new image or video asset of the specified width and height from the essence of the specified image or video asset.

Width and height must be positive numbers.

Parameters:

asset (Asset) – Image or video asset that will be resized
width (int) – Width of the resized asset
height (int) – Height of the resized asset

Returns:

New asset with specified width and height

Return type:

Asset

rotate(asset: Asset, angle: float, expand: bool = False) → Asset[source]

Creates an asset whose essence is rotated by the specified angle in degrees.

Parameters:

asset (Asset) – Asset whose contents will be rotated
angle (float) – Angle in degrees, counter clockwise
expand (bool) – If true, changes the dimensions of the new asset so it can hold the entire rotated essence, otherwise the dimensions of the original asset will be used.

Returns:

New asset with rotated essence

Return type:

Asset
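The effect of expand on the output dimensions follows from the bounding box of a rotated rectangle. The sketch below computes it with plain trigonometry; the function name rotated_size is hypothetical, and rounding to the nearest pixel is an assumption about how fractional sizes would be handled.

```python
import math


def rotated_size(width: int, height: int, angle: float, expand: bool = False):
    """Return the (width, height) of the canvas after rotating by angle degrees."""
    if not expand:
        # Without expand the original dimensions are kept.
        return width, height
    rad = math.radians(angle)
    cos_a, sin_a = abs(math.cos(rad)), abs(math.sin(rad))
    # Bounding box of a width x height rectangle rotated by the angle.
    return (
        round(width * cos_a + height * sin_a),
        round(width * sin_a + height * cos_a),
    )


print(rotated_size(640, 360, 90, expand=True))
print(rotated_size(100, 100, 45, expand=True))
```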

set_speed(asset: Asset, factor: float) → Asset[source]

Creates a new audio or video asset whose playback speed is scaled by factor relative to the source.

A factor greater than 1.0 speeds up playback (timelapse); a factor less than 1.0 slows it down (slow motion). The output duration equals source_duration / factor.

For video streams the setpts filter is used. For audio streams the atempo filter is used; because atempo accepts values only in [0.5, 2.0], the filter is chained automatically for extreme factors.

Parameters:

asset (Asset) – Audio or video asset to retime
factor (float) – Speed multiplier; must be greater than zero

Returns:

New asset with adjusted playback speed

Return type:

Asset

Raises:

ValueError – If factor is not positive
UnsupportedFormatError – If the asset type is not supported
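The chaining of atempo stages can be sketched as follows. This is an illustrative decomposition, not necessarily the one madam uses: it repeatedly splits off factors of 2.0 or 0.5 until the remainder fits the [0.5, 2.0] range stated above. The helper names are hypothetical.

```python
def atempo_chain(factor: float) -> list[float]:
    """Split a speed factor into stages that each lie within [0.5, 2.0]."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    stages = []
    remaining = float(factor)
    while remaining > 2.0:
        stages.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:
        stages.append(0.5)
        remaining /= 0.5
    stages.append(remaining)
    return stages


def speed_filters(factor: float) -> tuple[str, str]:
    """Return (video filter, audio filter) strings for the given factor."""
    video = f"setpts=PTS/{factor}"
    audio = ",".join(f"atempo={stage}" for stage in atempo_chain(factor))
    return video, audio


print(atempo_chain(8.0))
print(speed_filters(4.0))
```

The product of the stages equals the requested factor, so the audio stays in step with the video, whose timestamps are divided by the same factor.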

property supported_mime_types: frozenset

MIME types this processor can handle (used to build the Madam index).

Added in version 0.24.

thumbnail_sprite(asset: Asset, columns: int, rows: int, thumb_width: int, thumb_height: int, mime_type: MimeType | str = 'image/jpeg') → Asset[source]

Extracts columns × rows evenly-spaced frames from asset and stitches them into a single sprite-sheet image.

The returned image asset has dimensions (columns × thumb_width) × (rows × thumb_height). Its metadata includes a 'sprite' dict with the grid parameters, which can be used by the application layer to generate a WebVTT thumbnail track.

Parameters:

asset (Asset) – Source video asset
columns (int) – Number of thumbnail columns in the sprite sheet
rows (int) – Number of thumbnail rows in the sprite sheet
thumb_width (int) – Width of each thumbnail in pixels
thumb_height (int) – Height of each thumbnail in pixels
mime_type (MimeType or str) – MIME type of the output image (default image/jpeg)

Returns:

Image asset containing the sprite sheet

Return type:

Asset

Raises:

UnsupportedFormatError – If the source asset is not a video
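The sprite geometry can be worked out with a few lines of arithmetic. In this sketch the sheet size follows the formula given above, while the sampling times (the start of each equal interval) and the row-major cell order are assumptions; the function name sprite_layout and the dictionary keys are hypothetical and do not describe the contents of the 'sprite' metadata dict.

```python
def sprite_layout(duration, columns, rows, thumb_width, thumb_height):
    """Return the sheet size and one {'time', 'x', 'y'} entry per thumbnail."""
    count = columns * rows
    size = (columns * thumb_width, rows * thumb_height)
    interval = duration / count
    cells = []
    for index in range(count):
        column, row = index % columns, index // columns
        cells.append({
            "time": index * interval,      # assumed sampling point in seconds
            "x": column * thumb_width,     # left edge of the cell in the sheet
            "y": row * thumb_height,       # top edge of the cell in the sheet
        })
    return size, cells


size, cells = sprite_layout(60.0, columns=3, rows=2, thumb_width=160, thumb_height=90)
print(size)
print(cells[4])
```

An application could turn each entry into a WebVTT cue whose payload points at the cell rectangle inside the sprite image.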

to_dash(asset: Asset, output: MultiFileOutput, segment_duration: float = 6, video: Mapping[str, Any] | None = None, audio: Mapping[str, Any] | None = None) → None[source]

Transcodes asset to MPEG-DASH format and writes all output files to output.

The output consists of an MPD manifest and one or more MP4 segment files. Stream options can be provided via video and audio; by default the video is encoded as H.264 and audio as AAC.

Parameters:

asset (Asset) – Source video asset
output (MultiFileOutput) – Destination for the manifest and segment files
segment_duration (float) – Target segment duration in seconds
video (dict or None) – Optional video stream options (codec, bitrate)
audio (dict or None) – Optional audio stream options (codec, bitrate)

Raises:

UnsupportedFormatError – If the source asset is not a video

to_hls(asset: Asset, output: MultiFileOutput, segment_duration: float = 6, video: Mapping[str, Any] | None = None, audio: Mapping[str, Any] | None = None) → None[source]

Transcodes asset to HLS (HTTP Live Streaming) format and writes all output files to output.

The output consists of an M3U8 playlist and one or more MPEG-TS segment files. Stream options can be provided via video and audio; by default the video is encoded as H.264 and audio as AAC, which are the most widely supported codecs for HLS.

Parameters:

asset (Asset) – Source video asset
output (MultiFileOutput) – Destination for the playlist and segment files
segment_duration (float) – Target segment duration in seconds
video (dict or None) – Optional video stream options (codec, bitrate)
audio (dict or None) – Optional audio stream options (codec, bitrate)

Raises:

UnsupportedFormatError – If the source asset is not a video

trim(asset: Asset, from_seconds: float = 0, to_seconds: float = 0) → Asset[source]

Creates a trimmed audio or video asset that only contains the data between from_seconds and to_seconds.

Parameters:

asset (Asset) – Audio or video asset, which will serve as the source
from_seconds (float) – Start time of the clip in seconds
to_seconds (float) – End time of the clip in seconds

Returns:

New asset with trimmed essence

Return type:

Asset

class madam.ffmpeg.SubtitleFormat[source]

Bases: object

Named constants for subtitle codec strings accepted by FFmpegProcessor.convert().

Use these instead of raw FFmpeg codec names to avoid depending on FFmpeg internals:

processor.convert(mime_type='text/vtt', subtitle={'codec': SubtitleFormat.WEBVTT})

Added in version 1.0.

class madam.ffmpeg.VideoCodec[source]

Bases: object

Named constants for video codec strings accepted by FFmpegProcessor.convert().

Use these instead of raw FFmpeg codec names to avoid depending on FFmpeg internals:

processor.convert(mime_type='video/mp4', video={'codec': VideoCodec.H264})

Added in version 0.23.

madam.ffmpeg.combine(assets: Iterable[Asset], mime_type: MimeType | str, *, fps: float = 24.0, video: Mapping[str, Any] | None = None, audio: Mapping[str, Any] | None = None) → Asset[source]

Assembles a sequence of image (or video) assets into a video by treating each asset as one frame at a fixed frame rate.

Each asset’s essence is written to a temporary file and listed in an FFmpeg concat-demuxer playlist. The duration of each entry is computed from fps so that the resulting clip plays at the specified frame rate.

Parameters:

assets (Iterable[Asset]) – Iterable of assets to use as frames; must be non-empty
mime_type (MimeType or str) – MIME type of the output video container
fps (float) – Frames per second (must be positive; default 24.0)
video (dict or None) – Optional video stream options (same keys as FFmpegProcessor.convert(); e.g. {'codec': VideoCodec.H264})
audio (dict or None) – Optional audio stream options

Returns:

New video asset

Return type:

Asset

Raises:

ValueError – If assets is empty or fps ≤ 0
UnsupportedFormatError – If mime_type is not a supported video format
OperatorError – If FFmpeg fails

Added in version 1.0.
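The playlist described above can be sketched as a short text-building helper. The function name frame_playlist and the file names are hypothetical, and the six-decimal formatting of the duration is an arbitrary choice; the `ffconcat version 1.0` header and the `file` and `duration` directives are part of the FFmpeg concat demuxer syntax. Paths containing single quotes would need escaping, which this sketch does not handle.

```python
def frame_playlist(paths, fps: float) -> str:
    """Build concat-demuxer playlist text that shows each file for 1/fps seconds."""
    paths = list(paths)
    if not paths:
        raise ValueError("at least one frame is required")
    if fps <= 0:
        raise ValueError("fps must be positive")
    duration = 1.0 / fps
    lines = ["ffconcat version 1.0"]
    for path in paths:
        lines.append(f"file '{path}'")
        lines.append(f"duration {duration:.6f}")
    return "\n".join(lines) + "\n"


playlist = frame_playlist(["frame0.png", "frame1.png"], fps=25.0)
print(playlist)
```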

madam.ffmpeg.concatenate(assets: Iterable[Asset], mime_type: MimeType | str, video: Mapping[str, Any] | None = None, audio: Mapping[str, Any] | None = None) → Asset[source]

Joins a sequence of audio or video assets end-to-end into a single asset.

Assets are concatenated in the order they appear in assets. By default the streams are copied without re-encoding (-c copy). Provide video and/or audio stream options to force re-encoding, which is required when the source clips use different codecs.

Uses the FFmpeg concat demuxer, which supports any format that can be read from files.

Parameters:

assets (Iterable[Asset]) – Iterable of assets to concatenate; must be non-empty
mime_type (MimeType or str) – MIME type of the output container
video (dict or None) – Optional video stream options (same keys as FFmpegProcessor.convert())
audio (dict or None) – Optional audio stream options (same keys as FFmpegProcessor.convert())

Returns:

New asset with concatenated essence

Return type:

Asset

Raises:

ValueError – If assets is empty
UnsupportedFormatError – If mime_type is not supported

Added in version 0.24.
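The difference between stream copy and re-encoding can be seen in the ffmpeg arguments. The sketch below assembles a plausible command line for the concat demuxer; the helper name concat_args and the file names are hypothetical, and the exact arguments madam passes are not documented here. `-f concat`, `-safe 0`, `-c copy`, `-c:v` and `-c:a` are standard ffmpeg options.

```python
def concat_args(playlist_path, output_path, video=None, audio=None):
    """Assemble an ffmpeg argument list for joining clips via a playlist."""
    args = ["ffmpeg", "-f", "concat", "-safe", "0", "-i", playlist_path]
    if not video and not audio:
        # No stream options given: copy the streams without re-encoding.
        args += ["-c", "copy"]
    else:
        if video and "codec" in video:
            args += ["-c:v", str(video["codec"])]
        if audio and "codec" in audio:
            args += ["-c:a", str(audio["codec"])]
    args.append(output_path)
    return args


copy_args = concat_args("clips.txt", "joined.mp4")
encode_args = concat_args("clips.txt", "joined.mp4", video={"codec": "libx264"})
print(copy_args)
print(encode_args)
```

Stream copy is fast and lossless but only works when all clips share the same codecs and parameters, which is why re-encoding options exist.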
