Hi Nick,
Great package!
I just ran into an IndexError when the DataFrame's index is not a default RangeIndex. I imagine this happens quite often, for example when the user passes in training data from a shuffled train-test split.
Code to reproduce the error:
import pandas as pd
import smogn
housing = pd.read_csv('https://raw.githubusercontent.com/nickkunz/smogn/master/data/housing.csv')
smogn.smoter(housing[housing.index > 10], 'SalePrice')
Calling smogn.smoter(housing[housing.index > 10].reset_index(), 'SalePrice') instead avoids the error, but that is not really desirable because I would like (need) to preserve the original index.
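In case it helps, here is a minimal pandas-only sketch of the index situation described above. It does not call smogn at all. The small frame and the names `df`, `subset`, `clean`, `original_index` and `restored` are made up for illustration, and I am only assuming (not confirming) that the error comes from positional access into a frame whose labels no longer start at 0. Note that a plain reset_index() also adds the old labels as a new "index" column, whereas drop=True does not.

```python
import pandas as pd

# Hypothetical stand-in data, not the housing dataset from the report.
df = pd.DataFrame({"x": range(15), "SalePrice": range(100, 115)})

# Same kind of filter as in the report: the labels now start at 11, not 0.
subset = df[df.index > 10]

# Keep the original labels so they can be put back later.
original_index = subset.index.copy()

# drop=True discards the old labels instead of adding an "index" column.
clean = subset.reset_index(drop=True)

# After any step that keeps the rows in the same order and count,
# the original labels can be reattached.
restored = clean.copy()
restored.index = original_index
```

Reattaching the labels like this only works when the number and order of rows are unchanged, so it would not carry over directly to output that contains newly synthesized rows.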
Best,
Michael