If df0 is a Pandas DataFrame with null values:
df0 = df0.dropna(axis=0, how='all') - will remove rows that contain only NaN values
df0 = df0.dropna(axis=1, how='all') - will remove columns that contain only NaN values
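A minimal sketch showing both calls on a small made-up DataFrame (the values are only for illustration):
import pandas as pd
import numpy as np
# hypothetical example data: row 2 and column 'c' contain only NaN
df0 = pd.DataFrame({'a': [1.0, np.nan, np.nan],
                    'b': [2.0, 3.0, np.nan],
                    'c': [np.nan, np.nan, np.nan]})
df0 = df0.dropna(axis=0, how='all')   # drops row 2 (all values NaN)
df0 = df0.dropna(axis=1, how='all')   # drops column 'c' (all values NaN)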
Python programming, with examples in hydraulic engineering and in hydrology.
Monday, October 21, 2019
Thursday, June 13, 2019
Find Maximum Values by year in timeseries dataframe, keeping the date
df2 = df1.loc[df1.groupby(df1.index.year).idxmax().iloc[:, 0]]
(the old .ix indexer is deprecated in recent pandas; .loc does the same label-based lookup here)
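A short usage sketch with a made-up daily series (the index and values are only for illustration):
import pandas as pd
import numpy as np
# hypothetical daily flow series spanning three years
idx = pd.date_range('2017-01-01', '2019-12-31', freq='D')
df1 = pd.DataFrame(np.random.rand(len(idx)), index=idx, columns=['flow'])
# one row per year: the row holding that year's maximum, with its date kept
df2 = df1.loc[df1.groupby(df1.index.year).idxmax().iloc[:, 0]]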
Tuesday, May 28, 2019
Find common timespan/ years in multiple time series/ dataframes
# concatenate the dataframes side by side, aligned on the date index
dfAlldf = pd.concat([df1, df2, df3, df4], axis=1)
# sort by the date (DatetimeIndex)
dfAlldf = dfAlldf.sort_index()
# group the dataframe by year
grps = dfAlldf.groupby(dfAlldf.index.year)
# empty dataframe to collect the complete years
dfCompl = pd.DataFrame()
# for each yearly group
for g in grps:
    # if the year has no null values in any column
    if not any(g[1].isnull().any(axis=1)):
        # append it to dfCompl
        dfCompl = pd.concat([dfCompl, g[1]], axis=0)
# re-sort by index
dfCompl = dfCompl.sort_index()
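For reference, the same filtering can be written more compactly with GroupBy.filter (a sketch of an alternative, not from the original post), keeping only the year-groups that have no missing values:
# keep only the years where no column has NaN, then sort by date
dfCompl = dfAlldf.groupby(dfAlldf.index.year).filter(lambda g: g.notna().all().all())
dfCompl = dfCompl.sort_index()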
Thursday, April 25, 2019
Pandas - Reading headers and dates correctly from Clipboard/ CSV
When using the pandas functions read_clipboard() or read_csv(), you have to tell pandas whether your data has headers (column labels) and an index (row labels).
If the index holds dates, make sure it is parsed correctly by indicating that it is a datetime and whether it uses day-first format (dd/mm/YYYY).
For example:
pd.read_clipboard(index_col=0, header=None, parse_dates=True, dayfirst=True)
This tells pandas that the table on the clipboard has no column headers, but has an index (row labels) in the first column, in datetime format with day first (dd/mm/YYYY).
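A self-contained sketch of the same idea with read_csv (the file content here is made up for illustration):
import io
import pandas as pd
# hypothetical CSV text: no header row, dates in dd/mm/YYYY in the first column
csv_text = "21/04/2019,10.5\n22/04/2019,11.2\n23/04/2019,9.8\n"
df = pd.read_csv(io.StringIO(csv_text),
                 header=None,       # no column headers in the data
                 index_col=0,       # first column is the index
                 parse_dates=True,  # parse the index as dates
                 dayfirst=True)     # day-first format (dd/mm/YYYY)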
Sunday, April 21, 2019
Logarithmic and Exponential Curve Fit in Python - Numpy
With the numpy function polyfit:
X, y: data to be fitted
import numpy as np
1. Exponential fit
cf = np.polyfit(X, np.log(y), 1)
returns two coefficients, which compose the equation:
y = exp(cf[1]) * exp(cf[0]*X)
2. Logarithmic fit
cf = np.polyfit(np.log(X), y, 1)
returns two coefficients, which compose the equation:
y = cf[0]*log(X) + cf[1]
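A runnable sketch of both fits on synthetic data (the sample values are made up for illustration; note that fitting log(y) weights small y values more heavily than a direct least-squares fit):
import numpy as np
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# exponential-shaped sample data: y = 2.5 * exp(0.4 * X)
y = 2.5 * np.exp(0.4 * X)
cf = np.polyfit(X, np.log(y), 1)
y_exp = np.exp(cf[1]) * np.exp(cf[0] * X)    # recovered exponential curve
# logarithm-shaped sample data: y2 = 3.0 * log(X) + 1.0
y2 = 3.0 * np.log(X) + 1.0
cf2 = np.polyfit(np.log(X), y2, 1)
y_log = cf2[0] * np.log(X) + cf2[1]          # recovered logarithmic curve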
Labels:
curve fit,
exponential,
fit,
logarithmic,
numpy,
polyfit,
regression
Wednesday, January 23, 2019
Interpolate missing values in pandas DataFrame
If we have a DataFrame with dates and flows, with missing values, as in the example below:
0
2019-01-31 50.208308
2019-02-28 50.623457
2019-03-31 56.203933
2019-04-30 NaN
2019-05-31 NaN
2019-06-30 117.727655
2019-07-31 62.273259
2019-08-31 49.054898
2019-09-30 55.612575
2019-10-31 54.187409
We can use the pandas function interpolate() to fill the gaps with different methods:
dfIn.interpolate() - fills the missing values with linear interpolation;
dfIn.interpolate(method='polynomial', order=3) - fills the missing values with 3rd-degree polynomial interpolation;
Result:
linear polynomial original
2019-01-31 50.208308 50.208308 50.208308
2019-02-28 50.623457 50.623457 50.623457
2019-03-31 56.203933 56.203933 56.203933
2019-04-30 76.711840 89.513986 NaN
2019-05-31 97.219748 124.233259 NaN
2019-06-30 117.727655 117.727655 117.727655
2019-07-31 62.273259 62.273259 62.273259
2019-08-31 49.054898 49.054898 49.054898
2019-09-30 55.612575 55.612575 55.612575
2019-10-31 54.187409 54.187409 54.187409
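A short sketch reproducing the comparison above (the flow values are the example data; the column name 'flow' is chosen here for illustration; the polynomial method needs scipy installed):
import pandas as pd
import numpy as np
idx = pd.date_range('2019-01-31', periods=10, freq='M')   # month-end dates
dfIn = pd.DataFrame([50.208308, 50.623457, 56.203933, np.nan, np.nan,
                     117.727655, 62.273259, 49.054898, 55.612575, 54.187409],
                    index=idx, columns=['flow'])
result = pd.DataFrame({'linear': dfIn['flow'].interpolate(),
                       'polynomial': dfIn['flow'].interpolate(method='polynomial', order=3),
                       'original': dfIn['flow']})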