Mergers Module
- class moosefs.merging_strategies.base_merger.MergingStrategy(strategy_type: str)
Bases: object
Abstract base for merging strategies.
Strategies can be “set-based” or “rank-based” depending on how they merge the per-selector outputs.
- __init__(strategy_type: str) -> None
Initialize the strategy.
- Parameters:
strategy_type – Either “set-based” or “rank-based”.
- merge(data: list, num_features_to_select: int, **kwargs) -> list
Merge input data according to the strategy.
Subclasses must implement this method.
- Parameters:
data – List of Feature lists (one list per selector) or a single list.
num_features_to_select – Number of top features to return.
**kwargs – Strategy-specific options.
- Returns:
A list of merged features (or feature names, depending on the strategy).
- Raises:
NotImplementedError – If not implemented in a subclass.
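The contract described above can be sketched as follows. This is a minimal stand-in, not the library's source: SketchStrategy and FirstListMerger are hypothetical names, and storing strategy_type as an attribute is an assumption made only for illustration.

```python
class SketchStrategy:
    """Hypothetical stand-in for the documented base class."""

    def __init__(self, strategy_type: str) -> None:
        # Assumption: the type string is simply stored on the instance.
        self.strategy_type = strategy_type

    def is_set_based(self) -> bool:
        return self.strategy_type == "set-based"

    def is_rank_based(self) -> bool:
        return self.strategy_type == "rank-based"

    def merge(self, data: list, num_features_to_select: int, **kwargs) -> list:
        # The base class only defines the contract.
        raise NotImplementedError("Subclasses must implement merge().")


class FirstListMerger(SketchStrategy):
    """Toy rank-based strategy: keep the first k items of the first list."""

    def __init__(self) -> None:
        super().__init__("rank-based")

    def merge(self, data: list, num_features_to_select: int, **kwargs) -> list:
        return list(data[0])[:num_features_to_select]


merger = FirstListMerger()
top_two = merger.merge([["a", "b", "c"], ["b", "a", "c"]], 2)
print(top_two, merger.is_rank_based(), merger.is_set_based())
```

A concrete strategy fixes its own strategy_type in its constructor, which is why the subclasses documented below do not expose that parameter in their signatures.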
- class moosefs.merging_strategies.borda_merger.BordaMerger(**kwargs)
Bases: MergingStrategy
Rank-based merging using the Borda count method.
- name = 'Borda'
- __init__(**kwargs) -> None
Initialize a rank-based merger.
- Parameters:
**kwargs – Forwarded to the Borda routine (if applicable).
- merge(subsets: list, num_features_to_select: int, **kwargs) -> list
Merge by Borda and return top-k names.
- Parameters:
subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.
- Returns:
Feature names sorted by merged Borda scores.
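To illustrate what a Borda count merge does, here is a self-contained sketch. It is not the library's implementation: the Feature dataclass (with name and score fields), the borda_merge function, the n-minus-position point scheme, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in: a feature name plus one selector's score."""
    name: str
    score: float


def borda_merge(subsets: list, num_features_to_select: int) -> list:
    points = defaultdict(int)
    for subset in subsets:
        # Rank this selector's features from best to worst score.
        ranked = sorted(subset, key=lambda f: f.score, reverse=True)
        n = len(ranked)
        for position, feature in enumerate(ranked):
            # Best feature earns n points, the next n - 1, and so on.
            points[feature.name] += n - position
    # Highest total first; ties broken alphabetically (an assumption).
    ordered = sorted(points, key=lambda name: (-points[name], name))
    return ordered[:num_features_to_select]


selector_a = [Feature("age", 0.9), Feature("bmi", 0.5), Feature("sex", 0.1)]
selector_b = [Feature("bmi", 0.8), Feature("age", 0.7), Feature("sex", 0.2)]
selector_c = [Feature("bmi", 0.6), Feature("sex", 0.5), Feature("age", 0.1)]

top = borda_merge([selector_a, selector_b, selector_c], 2)
print(top)  # ['bmi', 'age']  (totals: bmi 8, age 6, sex 4)
```

Because only ranks contribute, the selectors' score scales do not need to be comparable with one another.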
- _validate_input(subsets: list) -> None
Validate that subsets contains Feature objects.
- Parameters:
subsets – A list of Feature objects or a list of Feature lists.
- Raises:
ValueError – If empty or containing invalid types.
- is_rank_based() -> bool
Return True if the strategy is rank-based.
- is_set_based() -> bool
Return True if the strategy is set-based.
- class moosefs.merging_strategies.union_of_intersections_merger.UnionOfIntersectionsMerger
Bases: MergingStrategy
Union of intersections across selector subsets.
- name = 'UnionOfIntersections'
- __init__() -> None
Initialize the strategy. The constructor takes no arguments.
- merge(subsets: list, num_features_to_select: int | None = None, fill: bool = False, **kwargs) -> set
Merge by union of pairwise intersections.
- Parameters:
subsets – Feature lists (one list per selector).
num_features_to_select – Required when fill=True.
fill – If True, trim/pad output to the requested size.
**kwargs – Unused.
- Returns:
Set of selected feature names.
- Raises:
ValueError – If inputs are invalid or size is missing when fill=True.
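The core idea, a union of pairwise intersections, can be sketched in a few lines. This is an illustrative stand-in rather than the library's code: union_of_intersections is a hypothetical function, it takes plain name lists instead of Feature objects, and it leaves out the fill behaviour because the trimming and padding rule is not specified here.

```python
from itertools import combinations


def union_of_intersections(subsets: list) -> set:
    name_sets = [set(names) for names in subsets]
    merged = set()
    # Keep every name that at least one pair of selectors agrees on.
    for left, right in combinations(name_sets, 2):
        merged |= left & right
    return merged


selected = union_of_intersections(
    [
        ["age", "bmi", "sex"],
        ["bmi", "sex", "bp"],
        ["age", "hdl"],
    ]
)
print(sorted(selected))  # ['age', 'bmi', 'sex']
```

In this example "bp" and "hdl" are dropped because each appears in only one selector's list.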
- _validate_input(subsets: list) -> None
Validate that subsets contains Feature objects.
- Parameters:
subsets – A list of Feature objects or a list of Feature lists.
- Raises:
ValueError – If empty or containing invalid types.
- is_rank_based() -> bool
Return True if the strategy is rank-based.
- is_set_based() -> bool
Return True if the strategy is set-based.
- class moosefs.merging_strategies.l2_norm_merger.L2NormMerger(**kwargs)
Bases: MergingStrategy
Rank-based merging using the L2-norm (RMS) of scores.
- name = 'L2Norm'
- __init__(**kwargs) -> None
Initialize a rank-based merger.
- Parameters:
**kwargs – Optional keyword arguments.
- merge(subsets: list, num_features_to_select: int, **kwargs) -> list
Return the top-k feature names after L2-norm aggregation.
- Parameters:
subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.
- Returns:
Feature names sorted by aggregated L2 score.
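As a rough illustration of L2-norm (RMS) aggregation, the sketch below takes the root mean square of each feature's scores across selectors and ranks features by the result. It is an assumption-laden example, not the library's code: l2_norm_merge is a hypothetical function, features are represented as (name, score) tuples instead of Feature objects, and ties are broken alphabetically.

```python
import math
from collections import defaultdict


def l2_norm_merge(subsets: list, num_features_to_select: int) -> list:
    squares = defaultdict(list)
    for subset in subsets:
        for name, score in subset:
            squares[name].append(score ** 2)
    # RMS: square root of the mean of the squared scores.
    rms = {name: math.sqrt(sum(vals) / len(vals)) for name, vals in squares.items()}
    ordered = sorted(rms, key=lambda name: (-rms[name], name))
    return ordered[:num_features_to_select]


selector_a = [("age", 3.0), ("bmi", 1.0)]
selector_b = [("age", 4.0), ("bmi", 5.0)]

top = l2_norm_merge([selector_a, selector_b], 1)
print(top)  # ['bmi']  (RMS: bmi = sqrt(13) ~ 3.61, age = sqrt(12.5) ~ 3.54)
```

The RMS rewards a single very high score: "bmi" wins here even though its plain average (3.0) is lower than that of "age" (3.5).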
- _validate_input(subsets: list) -> None
Validate that subsets contains Feature objects.
- Parameters:
subsets – A list of Feature objects or a list of Feature lists.
- Raises:
ValueError – If empty or containing invalid types.
- is_rank_based() -> bool
Return True if the strategy is rank-based.
- is_set_based() -> bool
Return True if the strategy is set-based.
- class moosefs.merging_strategies.arithmetic_mean_merger.ArithmeticMeanMerger(**kwargs)
Bases: MergingStrategy
Rank-based merging using the arithmetic mean of scores.
- name = 'ArithmeticMean'
- __init__(**kwargs) -> None
Initialize a rank-based merger.
- Parameters:
**kwargs – Optional keyword arguments.
- merge(subsets: list, num_features_to_select: int, **kwargs) -> list
Return the top-k feature names after arithmetic-mean aggregation.
- Parameters:
subsets – Feature lists (one list per selector).
num_features_to_select – Number of names to return.
- Returns:
Feature names sorted by mean score.
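Arithmetic-mean aggregation can be sketched as below. This is a hedged illustration rather than the library's implementation: mean_merge is a hypothetical function, features are given as (name, score) tuples instead of Feature objects, and the alphabetical tie-break is an assumption.

```python
from collections import defaultdict


def mean_merge(subsets: list, num_features_to_select: int) -> list:
    scores = defaultdict(list)
    for subset in subsets:
        for name, score in subset:
            scores[name].append(score)
    # Average each feature's scores over the selectors that scored it.
    means = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    ordered = sorted(means, key=lambda name: (-means[name], name))
    return ordered[:num_features_to_select]


selector_a = [("age", 3.0), ("bmi", 1.0)]
selector_b = [("age", 4.0), ("bmi", 5.0)]

top = mean_merge([selector_a, selector_b], 1)
print(top)  # ['age']  (means: age 3.5, bmi 3.0)
```

Averaging raw scores only makes sense when the selectors score on comparable scales; whether the library normalizes scores first is not stated here.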
- _validate_input(subsets: list) -> None
Validate that subsets contains Feature objects.
- Parameters:
subsets – A list of Feature objects or a list of Feature lists.
- Raises:
ValueError – If empty or containing invalid types.
- is_rank_based() -> bool
Return True if the strategy is rank-based.
- is_set_based() -> bool
Return True if the strategy is set-based.
- class moosefs.merging_strategies.consensus_merger.ConsensusMerger(k: int = 2, *, fill: bool = False)
Bases: MergingStrategy
Set-based consensus merger with optional fill.
Keeps features selected by at least k selectors. If fill=True, trims/pads to num_features_to_select using summed, per-selector min-max-normalized scores as a tie-breaker.
- __init__(k: int = 2, *, fill: bool = False) -> None
Initialize the strategy.
- Parameters:
k – Minimum number of selectors that must select a feature for it to be kept.
fill – If True, trim/pad the output to num_features_to_select.
- merge(subsets: list, num_features_to_select: int | None = None, **kwargs) -> set
Merge by consensus threshold.
- Parameters:
subsets – Feature lists (one list per selector).
num_features_to_select – Required when fill=True.
**kwargs – Unused.
- Returns:
Set of selected feature names.
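The consensus rule and the optional fill step can be sketched as follows. Treat this as one plausible reading, not the library's code: consensus_merge is a hypothetical function, features are (name, score) tuples instead of Feature objects, and the fill ordering (consensus features first, then the rest, each group sorted by summed min-max-normalized score) is an assumption.

```python
from collections import Counter, defaultdict


def consensus_merge(subsets, k=2, fill=False, num_features_to_select=None):
    if fill and num_features_to_select is None:
        raise ValueError("num_features_to_select is required when fill=True")
    counts = Counter()
    totals = defaultdict(float)
    for subset in subsets:
        scores = dict(subset)
        if not scores:
            continue
        low, high = min(scores.values()), max(scores.values())
        span = (high - low) or 1.0  # avoid dividing by zero on constant scores
        for name, score in scores.items():
            counts[name] += 1
            # Min-max normalize within this selector, then sum across selectors.
            totals[name] += (score - low) / span
    kept = {name for name, count in counts.items() if count >= k}
    if not fill:
        return kept
    # Assumed fill rule: consensus features first, then by summed score.
    ranked = sorted(totals, key=lambda n: (n not in kept, -totals[n], n))
    return set(ranked[:num_features_to_select])


selector_a = [("age", 0.9), ("bmi", 0.5), ("sex", 0.1)]
selector_b = [("bmi", 0.8), ("bp", 0.6), ("age", 0.2)]
selector_c = [("hdl", 0.7), ("bmi", 0.3)]
subsets = [selector_a, selector_b, selector_c]

plain = consensus_merge(subsets, k=2)
padded = consensus_merge(subsets, k=2, fill=True, num_features_to_select=3)
trimmed = consensus_merge(subsets, k=2, fill=True, num_features_to_select=1)
print(sorted(plain), sorted(padded), sorted(trimmed))
```

With k=2 only "age" and "bmi" reach consensus. Padding to three adds "hdl", the best-scoring non-consensus feature, and trimming to one keeps "bmi", which has the highest summed normalized score.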
- _validate_input(subsets: list) -> None
Validate that subsets contains Feature objects.
- Parameters:
subsets – A list of Feature objects or a list of Feature lists.
- Raises:
ValueError – If empty or containing invalid types.
- is_rank_based() -> bool
Return True if the strategy is rank-based.
- is_set_based() -> bool
Return True if the strategy is set-based.