Edited Nearest Neighbor #115

@EmilHvitfeldt

Feature Request

Add support for Edited Nearest Neighbor (ENN) under-sampling.

Description

Edited Nearest Neighbor (ENN) is an under-sampling technique that removes samples whose class label differs from the majority class among their k nearest neighbors. In practice this removes noisy and borderline samples.

Algorithm

  1. For each sample in the dataset:
    • Find its k nearest neighbors (excluding the sample itself)
    • If the majority class among those neighbors differs from the sample's own class, mark the sample for removal
  2. Remove all marked samples

This is a cleaning method that removes misclassified or ambiguous samples, resulting in smoother decision boundaries.
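
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a proposed implementation for the package: the function name `edited_nearest_neighbors`, the Euclidean distance, the default `k=3`, and the tie-breaking behaviour (first label encountered wins) are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def edited_nearest_neighbors(X, y, k=3):
    """Return indices of samples kept by a single ENN pass.

    X: list of numeric feature tuples; y: list of class labels.
    A sample is dropped when its label differs from the most common
    label among its k nearest neighbors (Euclidean, self excluded).
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Rank every other sample by its distance to sample i.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )
        neighbor_labels = [y[j] for j in others[:k]]
        majority_label, _ = Counter(neighbor_labels).most_common(1)[0]
        # Decisions use the original data, so removals do not
        # influence each other within the pass.
        if majority_label == yi:
            keep.append(i)
    return keep


# Toy data: two clusters plus one "b" point sitting inside the "a" cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

kept = edited_nearest_neighbors(X, y, k=3)
print(kept)
```

Here the stray "b" point at index 4 is surrounded by "a" samples and is the only one dropped. The brute-force neighbor search is quadratic in the number of samples, which is fine for illustration but would need a proper nearest-neighbor search for real data.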

Key Properties

  • Removes noisy samples that are likely misclassified
  • Cleans decision boundaries
  • Can be applied to majority class only, or to all classes
  • Often used as a preprocessing step before other methods
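
To illustrate the "majority class only, or all classes" property, here is a small sketch in which removal can be restricted to chosen classes. The function `enn_keep_mask` and its `target_classes` parameter are hypothetical names invented for this example, and the distance metric and `k=3` are again assumptions.

```python
from collections import Counter
from math import dist


def enn_keep_mask(X, y, k=3, target_classes=None):
    """Return a list of booleans: True where the sample is kept.

    target_classes (hypothetical parameter): only samples whose label
    is in this collection may be removed. None means every class is eligible.
    """
    eligible = set(y) if target_classes is None else set(target_classes)
    mask = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi not in eligible:
            mask.append(True)  # protected class: never removed
            continue
        order = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: dist(xi, X[j]))
        votes = Counter(y[j] for j in order[:k])
        mask.append(votes.most_common(1)[0][0] == yi)
    return mask


# 1-D toy data: a "maj" cluster near 0-4, a "min" cluster near 10-12,
# one stray "min" at 2.5 and one stray "maj" at 11.5.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,),
     (10.0,), (11.0,), (12.0,), (2.5,), (11.5,)]
y = ["maj"] * 5 + ["min"] * 3 + ["min", "maj"]

all_classes = enn_keep_mask(X, y, k=3)
majority_only = enn_keep_mask(X, y, k=3, target_classes={"maj"})

removed_all = [i for i, kept in enumerate(all_classes) if not kept]
removed_maj = [i for i, kept in enumerate(majority_only) if not kept]
print(removed_all, removed_maj)
```

With all classes eligible, both stray points are removed. Restricting to the majority class removes only the stray "maj" sample and leaves every minority sample in place, which is usually what is wanted when the goal is under-sampling rather than general cleaning.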

Relationship to Other Methods

References

Labels: New Step (a new recipe step), feature (a feature request or enhancement)