Wednesday, January 23, 2019

Interpolate missing values in pandas DataFrame

Suppose we have a DataFrame of dates and flows with some missing values, as in the example below:

        0
2019-01-31 50.208308
2019-02-28 50.623457
2019-03-31 56.203933
2019-04-30 NaN
2019-05-31 NaN
2019-06-30 117.727655
2019-07-31 62.273259
2019-08-31 49.054898
2019-09-30 55.612575
2019-10-31 54.187409


We can use the pandas interpolate function to fill the gaps, and it supports several interpolation methods:

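dfIn.interpolate() - fills the missing values (NaN) by linear interpolation; the default 'linear' method ignores the index and treats the rows as equally spaced;
dfIn.interpolate(method='polynomial', order=3) - fills the missing values by 3rd-degree polynomial interpolation; this method requires SciPy to be installed.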
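
As a minimal sketch, the two calls can be run on the example data like this. The values and the name dfIn come from the post; the name result and the way the columns are combined are only assumptions for illustration. The polynomial column is added only if SciPy can be imported.

```python
import numpy as np
import pandas as pd

# Month-end flows from the example; April and May are missing.
index = pd.to_datetime([
    "2019-01-31", "2019-02-28", "2019-03-31", "2019-04-30", "2019-05-31",
    "2019-06-30", "2019-07-31", "2019-08-31", "2019-09-30", "2019-10-31",
])
flows = [50.208308, 50.623457, 56.203933, np.nan, np.nan,
         117.727655, 62.273259, 49.054898, 55.612575, 54.187409]
dfIn = pd.DataFrame({0: flows}, index=index)

# Side-by-side comparison (the name `result` is hypothetical).
result = pd.DataFrame({
    "linear": dfIn[0].interpolate(),  # default method='linear'
    "original": dfIn[0],
})

try:
    # method='polynomial' needs SciPy; skip the column if it is not installed.
    poly = dfIn[0].interpolate(method="polynomial", order=3)
    result.insert(1, "polynomial", poly)
except ImportError:
    pass

print(result)
```

Note that interpolate() returns a new object by default, so dfIn itself still contains the NaN values afterwards.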

Result:
                linear  polynomial    original
2019-01-31   50.208308   50.208308   50.208308
2019-02-28   50.623457   50.623457   50.623457
2019-03-31   56.203933   56.203933   56.203933
2019-04-30   76.711840   89.513986         NaN
2019-05-31   97.219748  124.233259         NaN
2019-06-30  117.727655  117.727655  117.727655
2019-07-31   62.273259   62.273259   62.273259
2019-08-31   49.054898   49.054898   49.054898
2019-09-30   55.612575   55.612575   55.612575
2019-10-31   54.187409   54.187409   54.187409
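
The linear column can be checked by hand: the two filled values sit at one third and two thirds of the way between the March and June flows, because the default linear method treats the rows as equally spaced regardless of the actual dates. A small sketch of that arithmetic (the variable names are only illustrative):

```python
import numpy as np

# Known flows on either side of the gap (2019-03-31 and 2019-06-30).
left, right = 56.203933, 117.727655

# Four equally spaced points from left to right; the two inner points
# are the values that linear interpolation assigns to April and May.
filled = np.linspace(left, right, num=4)[1:-1]

print(filled.round(6))
```

The two numbers agree with the 2019-04-30 and 2019-05-31 rows of the linear column above.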