Feature Selectors Module
- class moosefs.feature_selectors.base_selector.FeatureSelector(task: str, num_features_to_select: int)[source]
Bases:
object
Base class for feature selection.
Subclasses must implement compute_scores, returning a score per feature.
- __init__(task: str, num_features_to_select: int) None [source]
Initialize the selector.
- Parameters:
task – Either “classification” or “regression”.
num_features_to_select – Number of top features to select.
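The contract described above (subclasses supply compute_scores, and select_features returns the scores together with the top-k indices) can be sketched as follows. This is a minimal stand-in, not the moosefs source: the names SketchSelector and ColumnRangeSelector are hypothetical, and the tie-breaking and return types of the real class may differ.

```python
from typing import List, Sequence, Tuple


class SketchSelector:
    """Hypothetical stand-in for the base class (not the moosefs code)."""

    def __init__(self, task: str, num_features_to_select: int) -> None:
        self.task = task
        self.num_features_to_select = num_features_to_select

    def compute_scores(self, X: Sequence[Sequence[float]], y: Sequence) -> List[float]:
        # Subclasses are expected to return one score per feature (column).
        raise NotImplementedError

    def select_features(self, X, y) -> Tuple[List[float], List[int]]:
        scores = list(self.compute_scores(X, y))
        # Rank feature positions by descending score and keep the first k.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        return scores, order[: self.num_features_to_select]


class ColumnRangeSelector(SketchSelector):
    """Toy subclass for illustration: scores a column by max - min."""

    def compute_scores(self, X, y):
        return [max(col) - min(col) for col in zip(*X)]


X = [[1.0, 10.0, 0.0], [2.0, 30.0, 0.5], [3.0, 20.0, 0.2]]
y = [0, 1, 0]
scores, indices = ColumnRangeSelector("classification", 2).select_features(X, y)
```

Here the second column has the largest range, so it is ranked first, followed by the first column.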
- class moosefs.feature_selectors.f_statistic_selector.FStatisticSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using F-statistic scores.
- name = 'FStatistic'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for the scoring function.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes F-statistic scores.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
F-statistic scores for each feature.
- Raises:
ValueError – If task is not ‘classification’ or ‘regression’.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
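To illustrate what an F-statistic score measures for a classification task, the sketch below computes a one-way ANOVA F value per feature in plain Python. It is an assumption that the selector uses this statistic for classification; the reference above does not say which scoring function it calls, and the helper name anova_f_scores is hypothetical.

```python
from collections import defaultdict


def anova_f_scores(X, y):
    """One-way ANOVA F value per column of X, grouping rows by label in y."""
    n = len(X)
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    k = len(groups)  # number of classes

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        grand_mean = sum(column) / n
        ss_between = 0.0
        ss_within = 0.0
        for rows in groups.values():
            values = [r[j] for r in rows]
            mean = sum(values) / len(values)
            ss_between += len(values) * (mean - grand_mean) ** 2
            ss_within += sum((v - mean) ** 2 for v in values)
        ms_between = ss_between / (k - 1)
        ms_within = ss_within / (n - k)
        # A feature with zero within-class spread separates the classes perfectly.
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores


X = [[1.0, 5.0], [2.0, 4.0], [9.0, 5.0], [10.0, 4.0]]
y = [0, 0, 1, 1]
f_scores = anova_f_scores(X, y)
```

The first column separates the two classes well and gets a large F value; the second has identical class means and scores zero.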
- class moosefs.feature_selectors.mutual_info_selector.MutualInfoSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using mutual information scores.
- name = 'MutualInfo'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for mutual information function.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes mutual information scores.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
Mutual information scores for each feature.
- Raises:
ValueError – If task is not ‘classification’ or ‘regression’.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
- class moosefs.feature_selectors.random_forest_selector.RandomForestSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using RandomForest feature importance.
- name = 'RandomForest'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for RandomForest model.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes feature importances using a RandomForest model.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
Feature importances from the trained RandomForest model.
- Raises:
ValueError – If task is not ‘classification’ or ‘regression’.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
- class moosefs.feature_selectors.svm_selector.SVMSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using SVM coefficients.
- name = 'SVM'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for the SVM model.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes feature importances using an SVM model.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
Feature importances derived from SVM model coefficients.
- Raises:
ValueError – If task is not ‘classification’ or ‘regression’.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
- class moosefs.feature_selectors.xgboost_selector.XGBoostSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using XGBoost feature importance.
- name = 'XGBoost'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for the XGBoost model.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes feature importances using an XGBoost model.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
Feature importances from the trained XGBoost model.
- Raises:
ValueError – If task is not ‘classification’ or ‘regression’.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
- class moosefs.feature_selectors.lasso_selector.LassoSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using Lasso regression.
- name = 'Lasso'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for Lasso.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes feature scores using Lasso regression.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
Feature scores based on absolute Lasso coefficients.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
- class moosefs.feature_selectors.elastic_net_selector.ElasticNetSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Elastic-net based selector.
regression → sklearn.linear_model.ElasticNet (L1+L2 on y∈ℝ)
classification → sklearn.linear_model.LogisticRegression with penalty='elasticnet' (solver='saga')
Scores are |coef| (mean over classes if multiclass).
- name = 'ElasticNet'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
Initialize the selector.
- Parameters:
task – Either “classification” or “regression”.
num_features_to_select – Number of top features to select.
- compute_scores(X: Any, y: Any) ndarray [source]
Compute per-feature scores as absolute model coefficients (mean over classes if multiclass).
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
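The scoring rule stated for this class, |coef| with a mean over classes in the multiclass case, can be sketched without fitting a model. The helper abs_coef_scores is hypothetical, and it assumes the mean is taken over the absolute values of each class's coefficients (rather than the absolute value of the mean); the reference does not spell out the order.

```python
def abs_coef_scores(coef):
    """Turn a coefficient vector or per-class coefficient rows into feature scores."""
    if coef and isinstance(coef[0], (list, tuple)):
        # Multiclass: one row of coefficients per class; average |coef| per feature.
        n_classes = len(coef)
        n_features = len(coef[0])
        return [sum(abs(row[j]) for row in coef) / n_classes for j in range(n_features)]
    # Regression or a single coefficient vector: score is just |coef|.
    return [abs(c) for c in coef]


regression_scores = abs_coef_scores([0.5, -2.0, 0.0])
multiclass_scores = abs_coef_scores([[1.0, -2.0], [-3.0, 0.0]])
```

Taking absolute values first keeps coefficients of opposite sign in different classes from cancelling each other out.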
- class moosefs.feature_selectors.variance_selectors.VarianceSelector(task: str, num_features_to_select: int, **kwargs)[source]
Bases:
FeatureSelector
- name = 'Variance'
- __init__(task: str, num_features_to_select: int, **kwargs)[source]
Initialize the selector.
- Parameters:
task – Either “classification” or “regression”.
num_features_to_select – Number of top features to select.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
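The reference gives no description for this class, so the following is only a guess at the idea its name suggests: score each feature by its variance and ignore y. The helper variance_scores is hypothetical, and the choice of population variance is an assumption.

```python
from statistics import pvariance


def variance_scores(X):
    """Population variance of each column of X (y is not needed)."""
    return [pvariance(col) for col in zip(*X)]


X = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
var_scores = variance_scores(X)
# Position of the highest-variance feature.
top_index = max(range(len(var_scores)), key=lambda j: var_scores[j])
```

A constant column such as the first one scores zero and is ranked last.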
- class moosefs.feature_selectors.mrmr_selector.MRMRSelector(task: str, num_features_to_select: int, **kwargs: Any)[source]
Bases:
FeatureSelector
Feature selector using Minimum Redundancy Maximum Relevance (MRMR).
- name = 'MRMR'
- __init__(task: str, num_features_to_select: int, **kwargs: Any) None [source]
- Parameters:
task – ML task (‘classification’ or ‘regression’).
num_features_to_select – Number of features to select.
**kwargs – Additional arguments for mRMR functions.
- compute_scores(X: Any, y: Any) ndarray [source]
Computes feature scores using the MRMR algorithm.
- Parameters:
X – Training samples.
y – Target values.
- Returns:
MRMR scores for each feature.
- select_features(X: Any, y: Any) tuple
Select top features using the computed scores.
- Parameters:
X – Training samples, shape (n_samples, n_features).
y – Targets, shape (n_samples,) or (n_samples, n_outputs).
- Returns:
Tuple (scores, indices) where indices are the top-k positions.
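The trade-off behind MRMR, preferring features that are relevant to the target but not redundant with features already chosen, can be sketched with a greedy loop. This is a simplified illustration, not the moosefs implementation: it swaps in absolute Pearson correlation for both relevance and redundancy, whereas real mRMR variants often use mutual information or an F-statistic. The names pearson and greedy_mrmr are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (v - mean_b) for x, v in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def greedy_mrmr(X, y, k):
    """Pick k column indices, each maximizing relevance minus mean redundancy."""
    cols = [list(c) for c in zip(*X)]
    relevance = [abs(pearson(c, y)) for c in cols]
    selected = []
    remaining = list(range(len(cols)))

    def gain(j):
        if not selected:
            return relevance[j]
        redundancy = sum(abs(pearson(cols[j], cols[s])) for s in selected) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Column 1 duplicates column 0; column 2 is weaker but carries new information.
X = [[1.0, 1.0, 1.0], [2.0, 2.0, -1.0], [3.0, 3.0, -1.0], [4.0, 4.0, 1.0]]
y = [1.5, 1.5, 2.5, 4.5]
chosen = greedy_mrmr(X, y, 2)
```

After column 0 is picked, its exact duplicate is penalized for redundancy, so the less relevant but uncorrelated column 2 is chosen second.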