Mergers Module

class moosefs.merging_strategies.base_merger.MergingStrategy(strategy_type: str)

Bases: object

Abstract base for merging strategies.

Strategies can be “set-based” or “rank-based” depending on how they merge the per-selector outputs.

__init__(strategy_type: str) → None

Initialize the strategy.

Parameters: strategy_type – Either “set-based” or “rank-based”.

merge(data: list, num_features_to_select: int, **kwargs) → list

Merge input data according to the strategy.

Subclasses must implement this method.

Parameters:

data – List of Feature lists (one list per selector) or a single list.
num_features_to_select – Number of top features to return.
**kwargs – Strategy-specific options.

Returns:

A list of merged features (or feature names, depending on the strategy).

Raises:

NotImplementedError – If not implemented in a subclass.

is_set_based() → bool: Return True if the strategy is set-based.

is_rank_based() → bool: Return True if the strategy is rank-based.

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.
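
The contract described above can be sketched roughly as follows. This is an illustrative stand-in, not the library's source: the Feature dataclass and the SketchMergingStrategy name are hypothetical, and the validation rules are an assumption based only on the descriptions in this section.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical stand-in for the library's Feature type."""
    name: str
    score: float = 0.0


class SketchMergingStrategy:
    """Illustrative base class mirroring the documented interface."""

    def __init__(self, strategy_type: str) -> None:
        # Expected to be "set-based" or "rank-based".
        self.strategy_type = strategy_type

    def is_set_based(self) -> bool:
        return self.strategy_type == "set-based"

    def is_rank_based(self) -> bool:
        return self.strategy_type == "rank-based"

    def merge(self, data: list, num_features_to_select: int, **kwargs) -> list:
        # Concrete strategies are expected to override this method.
        raise NotImplementedError("Subclasses must implement merge().")

    def _validate_input(self, subsets: list) -> None:
        if not subsets:
            raise ValueError("subsets must not be empty")
        # Accept either a flat list of Feature or a list of Feature lists.
        nested = all(isinstance(item, list) for item in subsets)
        groups = subsets if nested else [subsets]
        for group in groups:
            if not all(isinstance(item, Feature) for item in group):
                raise ValueError("subsets must contain Feature objects")
```

A subclass would pass its own strategy type to __init__ and override merge(); calling merge() on the base sketch raises NotImplementedError.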

class moosefs.merging_strategies.borda_merger.BordaMerger(**kwargs)

Bases: MergingStrategy

Rank-based merging using the Borda count method.

name = 'Borda'

__init__(**kwargs) → None

Initialize a rank-based merger.

Parameters: **kwargs – Forwarded to the Borda routine (if applicable).

merge(subsets: list, num_features_to_select: int, **kwargs) → list

Merge by Borda count and return the top-k feature names.

Parameters:

subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.

Returns:

Feature names sorted by merged Borda scores.
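
As a rough illustration of Borda-count merging, the sketch below gives each feature points according to its position in every selector's ranking and sums them. It assumes plain (name, score) pairs instead of Feature objects, awards n points to the top of an n-item ranking, and breaks ties alphabetically; the borda_merge function is hypothetical, and the library's own routine may score or break ties differently.

```python
from collections import defaultdict


def borda_merge(subsets, num_features_to_select):
    """Merge rankings given as lists of (name, score) pairs, one per selector."""
    points = defaultdict(float)
    for subset in subsets:
        # Rank this selector's features from highest to lowest score.
        ranked = sorted(subset, key=lambda pair: pair[1], reverse=True)
        n = len(ranked)
        for position, (name, _score) in enumerate(ranked):
            # Top-ranked feature gets n points, the last one gets 1.
            points[name] += n - position
    # Highest total first; alphabetical order breaks ties (an assumption).
    ordered = sorted(points, key=lambda name: (-points[name], name))
    return ordered[:num_features_to_select]


selectors = [
    [("a", 0.9), ("b", 0.5), ("c", 0.1)],
    [("b", 0.8), ("a", 0.7), ("c", 0.2)],
    [("a", 0.6), ("c", 0.5), ("b", 0.4)],
]
top_two = borda_merge(selectors, 2)  # a: 8 points, b: 6, c: 4
```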

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.

is_rank_based() → bool: Return True if the strategy is rank-based.

is_set_based() → bool: Return True if the strategy is set-based.

class moosefs.merging_strategies.union_of_intersections_merger.UnionOfIntersectionsMerger

Bases: MergingStrategy

Union of intersections across selector subsets.

name = 'UnionOfIntersections'

__init__() → None

Initialize the strategy. The constructor takes no arguments.

merge(subsets: list, num_features_to_select: int | None = None, fill: bool = False, **kwargs) → set

Merge by union of pairwise intersections.

Parameters:

subsets – Feature lists (one list per selector).
num_features_to_select – Required when fill=True.
fill – If True, trim/pad output to requested size.
**kwargs – Unused.

Returns:

Set of selected feature names.

Raises:

ValueError – If inputs are invalid or size is missing when fill=True.
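
The core of this strategy, without the fill step, can be sketched as below. The union_of_intersections function is hypothetical and works on plain feature names; how the library trims or pads when fill=True is not described here, so the sketch leaves it out.

```python
from itertools import combinations


def union_of_intersections(subsets):
    """Return names that appear in at least one pair of selector subsets."""
    name_sets = [set(subset) for subset in subsets]
    merged = set()
    # Intersect every pair of selectors, then take the union of the results.
    for left, right in combinations(name_sets, 2):
        merged |= left & right
    return merged


selected = union_of_intersections([
    ["a", "b", "c"],
    ["b", "c", "d"],
    ["c", "e"],
])
```

In this example "b" is shared by the first two selectors and "c" by all three, so both survive, while "a", "d", and "e" each appear in only one subset and are dropped.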

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.

is_rank_based() → bool: Return True if the strategy is rank-based.

is_set_based() → bool: Return True if the strategy is set-based.

class moosefs.merging_strategies.l2_norm_merger.L2NormMerger(**kwargs)

Bases: MergingStrategy

Rank-based merging using the L2-norm (RMS) of scores.

name = 'L2Norm'

__init__(**kwargs) → None

Initialize a rank-based merger.

Parameters: **kwargs – Additional keyword arguments accepted by the constructor.

merge(subsets: list, num_features_to_select: int, **kwargs) → list

Return the top-k feature names after L2-norm aggregation.

Parameters:

subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.

Returns:

Feature names sorted by aggregated L2 score.
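
A minimal sketch of RMS aggregation follows. It assumes each selector is given as a dict mapping feature name to score, that every selector scores every feature, and that ties are broken alphabetically; the l2_norm_merge function is hypothetical and the library may handle missing scores or normalization differently.

```python
import math


def l2_norm_merge(score_tables, num_features_to_select):
    """Rank features by the root mean square of their per-selector scores."""
    n = len(score_tables)
    names = score_tables[0].keys()
    rms = {
        name: math.sqrt(sum(table[name] ** 2 for table in score_tables) / n)
        for name in names
    }
    # Highest aggregated score first; alphabetical order breaks ties.
    ordered = sorted(rms, key=lambda name: (-rms[name], name))
    return ordered[:num_features_to_select]


tables = [
    {"a": 3.0, "b": 1.0, "c": 0.0},
    {"a": 4.0, "b": 1.0, "c": 2.0},
]
top_two = l2_norm_merge(tables, 2)
```

Here "b" and "c" have the same arithmetic mean (1.0), but the RMS of "c" is sqrt(2) while that of "b" is 1, so "c" ranks higher: squaring rewards a single large score more than two moderate ones.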

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.

is_rank_based() → bool: Return True if the strategy is rank-based.

is_set_based() → bool: Return True if the strategy is set-based.

class moosefs.merging_strategies.arithmetic_mean_merger.ArithmeticMeanMerger(**kwargs)

Bases: MergingStrategy

Rank-based merging using the arithmetic mean of scores.

name = 'ArithmeticMean'

__init__(**kwargs) → None

Initialize a rank-based merger.

Parameters: **kwargs – Additional keyword arguments accepted by the constructor.

merge(subsets: list, num_features_to_select: int, **kwargs) → list

Return the top-k feature names after arithmetic-mean aggregation.

Parameters:

subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.

Returns:

Feature names sorted by mean score.
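
Arithmetic-mean aggregation can be sketched in the same spirit. The arithmetic_mean_merge function is hypothetical; it assumes one name-to-score dict per selector, that every selector scores every feature, and alphabetical tie-breaking.

```python
from statistics import fmean


def arithmetic_mean_merge(score_tables, num_features_to_select):
    """Rank features by the mean of their per-selector scores."""
    names = score_tables[0].keys()
    means = {
        name: fmean([table[name] for table in score_tables])
        for name in names
    }
    # Highest mean first; alphabetical order breaks ties.
    ordered = sorted(means, key=lambda name: (-means[name], name))
    return ordered[:num_features_to_select]


tables = [
    {"a": 4.0, "b": 1.0, "c": 2.0},
    {"a": 2.0, "b": 3.0, "c": 1.0},
]
top_two = arithmetic_mean_merge(tables, 2)  # means: a=3.0, b=2.0, c=1.5
```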

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.

is_rank_based() → bool: Return True if the strategy is rank-based.

is_set_based() → bool: Return True if the strategy is set-based.

class moosefs.merging_strategies.consensus_merger.ConsensusMerger(k: int = 2, *, fill: bool = False)

Bases: MergingStrategy

Set-based consensus merger with optional fill.

Keeps features selected by at least k selectors. If fill=True, trims/pads to num_features_to_select using summed, per-selector min–max–normalized scores as a tie-breaker.

__init__(k: int = 2, *, fill: bool = False) → None

Initialize the strategy.

Parameters:

k – Minimum number of selectors that must select a feature for it to be kept.
fill – If True, trim/pad the output to num_features_to_select.

merge(subsets: list, num_features_to_select: int | None = None, **kwargs) → set

Merge by consensus threshold.

Parameters:

subsets – Feature lists (one list per selector).
num_features_to_select – Required when fill=True.
**kwargs – Unused.

Returns:

Set of selected feature names.
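
The consensus rule and the fill step might look roughly like this. The consensus_merge function is hypothetical: it takes k and fill as function arguments rather than constructor arguments, represents each selector as a dict of selected feature name to score, and orders candidates by consensus membership first and summed min-max-normalized score second. The exact trimming and padding order used by the library is an assumption.

```python
from collections import Counter, defaultdict


def consensus_merge(subsets, k=2, fill=False, num_features_to_select=None):
    """Keep features chosen by at least k selectors; optionally trim/pad."""
    votes = Counter(name for subset in subsets for name in subset)
    consensus = {name for name, count in votes.items() if count >= k}
    if not fill:
        return consensus
    if num_features_to_select is None:
        raise ValueError("num_features_to_select is required when fill=True")

    # Sum each feature's min-max-normalized score over the selectors.
    totals = defaultdict(float)
    for subset in subsets:
        if not subset:
            continue
        low, high = min(subset.values()), max(subset.values())
        span = high - low
        for name, score in subset.items():
            totals[name] += (score - low) / span if span else 1.0

    # Consensus features come first, then the rest, each by descending total.
    ranked = sorted(
        votes, key=lambda name: (name not in consensus, -totals[name], name)
    )
    return set(ranked[:num_features_to_select])


selectors = [
    {"a": 0.9, "b": 0.5, "c": 0.1},
    {"a": 0.8, "d": 0.6, "f": 0.2},
    {"b": 0.7, "e": 0.3},
]
kept = consensus_merge(selectors, k=2)
padded = consensus_merge(selectors, k=2, fill=True, num_features_to_select=3)
```

With k=2 only "a" and "b" reach consensus. Padding to three features adds "d", the non-consensus feature with the highest normalized score.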

_validate_input(subsets: list) → None

Validate that subsets contains Feature objects.

Parameters: subsets – A list of Feature or a list of Feature lists.
Raises: ValueError – If empty or containing invalid types.

is_rank_based() → bool: Return True if the strategy is rank-based.

is_set_based() → bool: Return True if the strategy is set-based.