df=df.select_dtypes(include=['float64'])
will return only the float values of the 'df' dataframe.
Python programming, with examples in hydraulic engineering and in hydrology.
Wednesday, November 14, 2018
Tuesday, September 4, 2018
Excel to Python Pandas DataFrame
The most easy way is to use the clipboard. Just copy your table in Excel to the clipboard, and then call the read_clipboard method from pandas:
df1 = pd.read_clipboard()
Pandas can read Excel files directly too, with read_excel
df1 = pd.read_excel('tmp.xlsx',index_col=0, header=0)
Taking care with header and indexes, and other options as described here.
df1 = pd.read_clipboard()
Pandas can read Excel files directly too, with read_excel
df1 = pd.read_excel('tmp.xlsx',index_col=0, header=0)
Taking care with header and indexes, and other options as described here.
Tuesday, August 21, 2018
Maximum, minimum and average monthly precipitation
Below is shown some panda commands for retrieving maximum, minimum and average monthly precipitation from daily precipitation data.
The daily precipitation is assumed to be in a pandas DataFrame, with its index in Datetime index format.
1 - Daily to monthly precipitation
df_m=df1.resample('M').sum()
2 - Maximum monthly precipitation
p_max=df_m.groupby(df_m.index.month).max()
3 - Minimum monthly precipitation
p_min=df_m.groupby(df_m.index.month).min()
4 - Average monthly precipitation
p_avg=df_m.groupby(df_m.index.month).mean()
The daily precipitation is assumed to be in a pandas DataFrame, with its index in Datetime index format.
1 - Daily to monthly precipitation
df_m=df1.resample('M').sum()
2 - Maximum monthly precipitation
p_max=df_m.groupby(df_m.index.month).max()
3 - Minimum monthly precipitation
p_min=df_m.groupby(df_m.index.month).min()
4 - Average monthly precipitation
p_avg=df_m.groupby(df_m.index.month).mean()
Tuesday, July 24, 2018
How to extract all numbers / floats from a string with python regex
import re
stringWithNumbers='anystring2with 6.89 numbers3.55 3.141312'
digits = re.findall("[-+]?\d+\.?\d*", stringWithNumbers)
stringWithNumbers='anystring2with 6.89 numbers3.55 3.141312'
digits = re.findall("[-+]?\d+\.?\d*", stringWithNumbers)
Labels:
excerpts,
extract,
float,
hints,
regex,
String,
string manipulation,
text,
text files,
text string
Tuesday, July 10, 2018
Fitting IDF curves with Scipy and Pandas
# -*- coding: utf-8 -*- """ Created on Tue Jul 10 11:19:37 2018 @author: rodrigo.goncalves """ import pandas as pd from scipy.optimize import minimize # This is the IDF function, returning the sum of squared errors (SSE) def func2(par, res): p1 = (par[0] * res.index.values **par[1]) p2 = ((res.columns.values.astype(float)+par[2])**par[3]) f = pd.DataFrame([p/p2 for p in p1],index=res.index,columns=res.columns) erroTotQ=((f-res)**2).sum(axis=1).sum() return erroTotQ # copy your rainfall intensities table from excel # with column headers with rainfall durations # and row names with Return period value in years dfInt=pd.read_clipboard() #initial guess param1 = [5000, 0.1, 10, 0.9] res2 = minimize(func2, param1, args=(dfInt,), method='Nelder-Mead') print(res2) cs=['K=','a=','b=','c='] dfResult=pd.DataFrame(res2.x,index=cs).transpose() print(dfResult) dfResult.to_clipboard(index=None)
Labels:
fitting idf curves,
idf,
minimize,
pandas,
precipitation,
rain,
rainfall,
scipy
Friday, June 8, 2018
CSV file to Python list of lists
import csv with open('file.csv', 'r') as f1: reader = csv.reader(f1) your_list = list(reader)
Wednesday, May 30, 2018
Pandas - Operations between rows - distance between 2 points
If we have a table with a column with xy coordinates, for example:
We can get the difference between consecutive rows by using Pandas SHIFT function on columns.
".shift(-1)" will roll the rows 1 position backwards, and ".shift(1)" or simply ".shift()" will roll down your column by 1 position of the rows.
In our example, df1['x'].shift() will return:
0 NaN
1 455395.996360
2 527627.076641
3 536278.269190
4 553932.441097
5 568699.553239
6 569709.130272
7 573016.302437
8 575141.096777
9 580107.934566
if we want to calculate the euclidean distance between consecutive points, we can use the shift associated with numpy functions numpy.sqrt and numpy.power as following:
df1['diff']= np.sqrt(np.power(df1['x'].shift()-df1['x'],2)+
np.power(df1['y'].shift()-df1['y'],2))
Resulting in:
0 NaN
1 89911.101224
2 21323.016099
3 204394.524574
4 37767.197793
5 46692.771398
6 13246.254235
7 2641.201366
8 15153.187527
9 15853.974422
We can get the difference between consecutive rows by using Pandas SHIFT function on columns.
".shift(-1)" will roll the rows 1 position backwards, and ".shift(1)" or simply ".shift()" will roll down your column by 1 position of the rows.
In our example, df1['x'].shift() will return:
0 NaN
1 455395.996360
2 527627.076641
3 536278.269190
4 553932.441097
5 568699.553239
6 569709.130272
7 573016.302437
8 575141.096777
9 580107.934566
if we want to calculate the euclidean distance between consecutive points, we can use the shift associated with numpy functions numpy.sqrt and numpy.power as following:
df1['diff']= np.sqrt(np.power(df1['x'].shift()-df1['x'],2)+
np.power(df1['y'].shift()-df1['y'],2))
Resulting in:
0 NaN
1 89911.101224
2 21323.016099
3 204394.524574
4 37767.197793
5 46692.771398
6 13246.254235
7 2641.201366
8 15153.187527
9 15853.974422
Wednesday, May 2, 2018
Solve Manning's Equation with Python Scipy library
from scipy.optimize import root def manningC(d, args): Q, w,h,sSlopeL,sSlopeR,nMann,lSlope = args #left side slope can be different from right side slope area = ((((d*sSlopeL)+(d*sSlopeR)+w)+w)/2)*d # wet perimeter wPer = w+(d*(sSlopeL*sSlopeL+1)**0.5)+(d*(sSlopeR*sSlopeR+1)**0.5) #Hydraulic Radius hR = area/ wPer # following formula must be zero # manipulation of Manning's formula mannR = (Q*nMann/lSlope**0.5)-(area*hR**(2.0/3.0)) return mannR ###### MAIN CODE # the following are input data to our open channel manning calculation # flow, width, height, left side slope, right side slope, # Manning coefficient, longitudinal slope args0 = [2.5,2,.5,1.0,1.0,.015,.005] initD = .00001 # initial water depth value # then we call the root scipy function to the manningC sol =root(manningC,initD, args=(args0,)) # print the root found print(sol.x)
Friday, April 13, 2018
IDF table from equation - Pandas
How to fill a table (pandas DataFrame) with rainfall intensity values, extracted from idf equation:
import pandas as pd
durs = [5,10,15,20,30,60,120,180,360,540,720,1440] # rainfall durations
rps=[5,10,25,50,100] # return periods
t1 = pd.DataFrame(index=rps,columns=durs) # initialize DataFrame with rp's and durations
cf = [942.76,.1242,17.0,.650] #example idf equation coefficients
table = t1.apply(lambda x: (cf[0]*x.index**cf[1])/((x.name+cf[2])**cf[3]))
Wednesday, February 28, 2018
Installing Python in Windows
WinPython is a free open-source portable distribution of the Python programming language for Windows 7/8/10 and scientific and educational usage.
Designed for scientists, data-scientists, and education.
WinPython is a portable application, so the user should not expect any integration into Windows explorer during installation. However, the WinPython Control Panel allows to "register" your distribution to Windows.
Free Download at:
Thursday, February 8, 2018
Daily precipitation data to monthly and yearly
With pandas, we can group datetime values according to datetime units, as months and years.
If we have a pandas dataframe with 2 columns - DATE and VALUE
Certify that the column DATE is recognized as datetime64 format - you can use, for example:
df1['DATE'] = pd.to_datetime(df1['DATE'])
and then you can group and make operations on this group.
Getting the average of total monthly precipitation:
monthly = df1.groupby(df1['DATE'].dt.month).sum() / len(pd.unique(df1['DATE'].dt.year))
(groups by month sum and then divide by the number of years)
Getting the average yearly precipitation:
PAnual = df1.groupby(df1[0].dt.year).sum().mean()
If we have a pandas dataframe with 2 columns - DATE and VALUE
Certify that the column DATE is recognized as datetime64 format - you can use, for example:
df1['DATE'] = pd.to_datetime(df1['DATE'])
and then you can group and make operations on this group.
Getting the average of total monthly precipitation:
monthly = df1.groupby(df1['DATE'].dt.month).sum() / len(pd.unique(df1['DATE'].dt.year))
(groups by month sum and then divide by the number of years)
Getting the average yearly precipitation:
PAnual = df1.groupby(df1[0].dt.year).sum().mean()
Friday, January 26, 2018
Accumulating Precipitation Data
If we have a pandas dataframe named df1 with a column '15min' containing 15 minute precipitation data, we can easily accumulate for other durations, using the rolling method as shown in the example below:
Note that the window=4 parameter means that it will accumulate 4 lines of 15 minute data, resulting in 1 hour precipitation. This parameter can be changed to whatever duration you want.
df1['01 h'] =df1['15min'].rolling(window=4,center=False).sum()
Note that the window=4 parameter means that it will accumulate 4 lines of 15 minute data, resulting in 1 hour precipitation. This parameter can be changed to whatever duration you want.
Tuesday, January 23, 2018
Polynomial Curve Fitting
The code below shows how easily you can do a Polynomial Curve Fitting with Python and Numpy.
import numpy as np # sample x and y data - example x = [7.76,10.11,11.89,14.81,15.49] y = [1.851,1.971,1.953,1.842,1.805] # the polyfit functions does the nth degree polynomial best fit on the data, # returning the polynomial coefficients n = 4 # 4th degree polynomial, you can change for whatever degree you want coefs = np.polyfit(x,y,n) # The poly1d function applies the polynomial function to our calculated coefficients polyf = np.poly1d(coefs) #if we want to apply our polynomial function to a range of x values xf = np.linspace(0,20) yf = polyf(xf)
Labels:
curve fit,
numpy,
polyfit,
polynomial,
regression
Subscribe to:
Posts (Atom)