Feature Selectors Module

class moosefs.feature_selectors.base_selector.FeatureSelector(task: str, num_features_to_select: int)[source]

Bases: object

Base class for feature selection.

Subclasses must implement compute_scores returning a score per feature.

__init__(task: str, num_features_to_select: int) None[source]

Initialize the selector.

Parameters:
  • task – Either “classification” or “regression”.

  • num_features_to_select – Number of top features to select.

select_features(X: Any, y: Any) tuple[source]

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.

compute_scores(X: Any, y: Any) ndarray[source]

Compute per-feature scores (override in subclasses).
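
The contract above (score every feature, keep the top k) can be sketched in a few lines. This is an illustrative re-implementation, not the moosefs source, and the `MeanAbsSelector` subclass is invented for the example:

```python
import numpy as np

class FeatureSelector:
    """Illustrative sketch of the base-class contract."""

    def __init__(self, task: str, num_features_to_select: int) -> None:
        self.task = task
        self.num_features_to_select = num_features_to_select

    def compute_scores(self, X, y) -> np.ndarray:
        raise NotImplementedError  # subclasses supply the scoring rule

    def select_features(self, X, y) -> tuple:
        scores = self.compute_scores(X, y)
        # One plausible reading of "top-k positions": indices of the k
        # highest-scoring features, best first.
        indices = np.argsort(scores)[::-1][: self.num_features_to_select]
        return scores, indices

class MeanAbsSelector(FeatureSelector):
    """Toy subclass: score each feature by its mean absolute value."""

    def compute_scores(self, X, y) -> np.ndarray:
        return np.abs(np.asarray(X, dtype=float)).mean(axis=0)

X = np.array([[1.0, 10.0, 0.1],
              [2.0, 20.0, 0.2]])
scores, idx = MeanAbsSelector("regression", 2).select_features(X, None)
# scores → [1.5, 15.0, 0.15]; idx → [1, 0]
```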

class moosefs.feature_selectors.f_statistic_selector.FStatisticSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using F-statistic scores.

name = 'FStatistic'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for the scoring function.

compute_scores(X: Any, y: Any) ndarray[source]

Computes F-statistic scores.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

F-statistic scores for each feature.

Raises:

ValueError – If task is not ‘classification’ or ‘regression’.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
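
FStatisticSelector most likely delegates to a standard F-test (e.g. scikit-learn's f_classif for classification). To make the score concrete, here is a self-contained numpy version of the per-feature one-way ANOVA F-statistic; the function name and data are invented for the example:

```python
import numpy as np

def anova_f_scores(X, y):
    """Per-feature one-way ANOVA F-statistic for a classification target."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    n, k = X.shape[0], len(classes)
    grand_mean = X.mean(axis=0)
    ssb = np.zeros(X.shape[1])  # between-class sum of squares
    ssw = np.zeros(X.shape[1])  # within-class sum of squares
    for c in classes:
        Xc = X[y == c]
        ssb += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        ssw += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return (ssb / (k - 1)) / (ssw / (n - k))

X = np.array([[0.0, 1.0], [0.1, 0.0], [1.0, 1.0], [1.1, 0.0]])
y = np.array([0, 0, 1, 1])
scores = anova_f_scores(X, y)
# scores ≈ [200., 0.]: feature 0 separates the classes, feature 1 does not
```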

class moosefs.feature_selectors.mutual_info_selector.MutualInfoSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using mutual information scores.

name = 'MutualInfo'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for mutual information function.

compute_scores(X: Any, y: Any) ndarray[source]

Computes mutual information scores.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

Mutual information scores for each feature.

Raises:

ValueError – If task is not ‘classification’ or ‘regression’.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
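
For continuous features, mutual information is usually estimated with kNN methods (as in scikit-learn's mutual_info_classif, which this selector plausibly wraps). As an illustration of what the score measures, here is the discrete definition computed directly from a joint histogram; the function name and data are invented for the example:

```python
import numpy as np

def discrete_mi(x, y):
    """MI in nats between two discrete variables, from the joint histogram."""
    x, y = np.asarray(x), np.asarray(y)
    ux, uy = np.unique(x), np.unique(y)
    joint = np.zeros((len(ux), len(uy)))
    np.add.at(joint, (np.searchsorted(ux, x), np.searchsorted(uy, y)), 1)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (|ux|, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, |uy|)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

y = np.array([0, 0, 1, 1])
mi_same = discrete_mi(y, y)            # a copy of the label: MI = ln 2 ≈ 0.693
mi_indep = discrete_mi([0, 1, 0, 1], y)  # independent feature: MI = 0.0
```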

class moosefs.feature_selectors.random_forest_selector.RandomForestSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using RandomForest feature importance.

name = 'RandomForest'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for RandomForest model.

compute_scores(X: Any, y: Any) ndarray[source]

Computes feature importances using a RandomForest model.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

Feature importances from the trained RandomForest model.

Raises:

ValueError – If task is not ‘classification’ or ‘regression’.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
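
Assuming compute_scores fits a scikit-learn forest and returns its feature_importances_ (with **kwargs forwarded to the model), the score can be reproduced like this on synthetic data where only one feature carries signal:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 2] > 0).astype(int)  # only feature 2 determines the label

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
scores = model.feature_importances_   # impurity-based importances, sum to 1
top2 = np.argsort(scores)[::-1][:2]
# Feature 2 ranks first by importance.
```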

class moosefs.feature_selectors.svm_selector.SVMSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using SVM coefficients.

name = 'SVM'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for the SVM model.

compute_scores(X: Any, y: Any) ndarray[source]

Computes feature importances using an SVM model.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

Feature importances derived from SVM model coefficients.

Raises:

ValueError – If task is not ‘classification’ or ‘regression’.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.

class moosefs.feature_selectors.xgboost_selector.XGBoostSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using XGBoost feature importance.

name = 'XGBoost'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for the XGBoost model.

compute_scores(X: Any, y: Any) ndarray[source]

Computes feature importances using an XGBoost model.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

Feature importances from the trained XGBoost model.

Raises:

ValueError – If task is not ‘classification’ or ‘regression’.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.

class moosefs.feature_selectors.lasso_selector.LassoSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using Lasso regression.

name = 'Lasso'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for Lasso.

compute_scores(X: Any, y: Any) ndarray[source]

Computes feature scores using Lasso regression.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

Feature scores based on absolute Lasso coefficients.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
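
The docstring above says scores are absolute Lasso coefficients. For the regression case that corresponds to fitting scikit-learn's Lasso and taking |coef_|, sketched here on synthetic data (the alpha value and data are invented for the example):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)  # only feature 0 matters

lasso = Lasso(alpha=0.1).fit(X, y)
scores = np.abs(lasso.coef_)   # "absolute Lasso coefficients" per the docstring
best = int(np.argmax(scores))  # feature 0; the L1 penalty zeroes the rest
```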

class moosefs.feature_selectors.elastic_net_selector.ElasticNetSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Elastic‑net based selector.

  • regression → sklearn.linear_model.ElasticNet (L1+L2 on y∈ℝ)

  • classification → sklearn.linear_model.LogisticRegression with penalty=’elasticnet’ (solver=’saga’)

Scores are |coef| (mean over classes if multiclass).

name = 'ElasticNet'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]

Initialize the selector.

Parameters:
  • task – Either “classification” or “regression”.

  • num_features_to_select – Number of top features to select.

  • **kwargs – Additional arguments for the underlying model.

compute_scores(X: Any, y: Any) ndarray[source]

Computes per-feature scores as absolute coefficients of the fitted elastic-net model (mean over classes if multiclass).

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
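
The class description spells out the classification path: LogisticRegression with penalty='elasticnet' and solver='saga', scored as |coef| averaged over classes. That rule can be reproduced directly (the l1_ratio and data are invented for the example):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Three classes driven entirely by feature 1.
y = np.where(X[:, 1] > 0.5, 2, np.where(X[:, 1] < -0.5, 0, 1))

clf = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000
).fit(X, y)
# coef_ has shape (n_classes, n_features); mean |coef| over classes,
# as the class description documents.
scores = np.abs(clf.coef_).mean(axis=0)
best = int(np.argmax(scores))  # feature 1
```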

class moosefs.feature_selectors.variance_selectors.VarianceSelector(task: str, num_features_to_select: int, **kwargs)[source]

Bases: FeatureSelector

Feature selector using per-feature variance.

name = 'Variance'
__init__(task: str, num_features_to_select: int, **kwargs)[source]

Initialize the selector.

Parameters:
  • task – Either “classification” or “regression”.

  • num_features_to_select – Number of top features to select.

  • **kwargs – Additional keyword arguments.

compute_scores(X, y)[source]

Computes the variance of each feature.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
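
Assuming the score here is simply the per-feature variance (as the class name suggests), the whole selection step reduces to a numpy one-liner plus the top-k indexing from the base class:

```python
import numpy as np

X = np.array([
    [0.0, 5.0, 1.0],
    [0.0, 7.0, 1.1],
    [0.0, 9.0, 0.9],
])
scores = X.var(axis=0)               # per-feature variance
k = 2
indices = np.argsort(scores)[::-1][:k]
# Constant feature 0 has zero variance and is dropped; 1 and 2 survive.
```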

class moosefs.feature_selectors.mrmr_selector.MRMRSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]

Bases: FeatureSelector

Feature selector using Minimum Redundancy Maximum Relevance (MRMR).

name = 'MRMR'
__init__(task: str, num_features_to_select: int, **kwargs: Any) None[source]
Parameters:
  • task – ML task (‘classification’ or ‘regression’).

  • num_features_to_select – Number of features to select.

  • **kwargs – Additional arguments for mRMR functions.

compute_scores(X: Any, y: Any) ndarray[source]

Computes feature scores using the MRMR algorithm.

Parameters:
  • X – Training samples.

  • y – Target values.

Returns:

MRMR scores for each feature.

select_features(X: Any, y: Any) tuple

Select top features using the computed scores.

Parameters:
  • X – Training samples, shape (n_samples, n_features).

  • y – Targets, shape (n_samples,) or (n_samples, n_outputs).

Returns:

Tuple (scores, indices) where indices are the top-k positions.
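
moosefs's exact mRMR criterion is not documented here; a common greedy formulation (the MID criterion: relevance minus mean redundancy with already-selected features, both measured by absolute correlation) looks like the sketch below. Note it yields a selection order rather than the per-feature score vector compute_scores returns; the function name and data are invented for the example:

```python
import numpy as np

def mrmr_ranks(X, y, k):
    """Greedy mRMR sketch (MID criterion): at each step pick the feature
    maximizing |corr(f, y)| - mean |corr(f, already selected)|."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
    relevance = np.array([corr(X[:, j], y) for j in range(X.shape[1])])
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = rng.normal(size=300)
# Column 1 is a near-duplicate of column 0; both are relevant to y.
X = np.column_stack([a, a + 0.01 * rng.normal(size=300), b])
y = a + b
# Pure relevance ranking would keep both copies of `a`; the redundancy
# penalty makes mRMR keep one copy plus `b`.
picked = mrmr_ranks(X, y, 2)
```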

moosefs.feature_selectors.default_variance.variance_selector_default(X, y=None, alpha=0.01)[source]

Default variance-based selection function.