Wednesday, May 30, 2018

Pandas - Operations between rows - distance between 2 points

Suppose we have a table with columns holding x and y coordinates, for example:



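We can get the difference between consecutive rows by using the pandas shift() method on the columns.
".shift(-1)" moves the values up by 1 position, so each row receives the value of the next row, while ".shift(1)", or simply ".shift()", moves the values down by 1 position, so each row receives the value of the previous row. The values are not wrapped around: the position left empty is filled with NaN.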
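
As a quick illustration of both directions, here is a minimal sketch on a small Series. The values and the names s, down, up and diff_prev are made up for the example and are not taken from df1.

```python
import pandas as pd

# Hypothetical values, chosen only to make the shift easy to follow
s = pd.Series([10.0, 20.0, 30.0, 40.0])

down = s.shift(1)    # each row gets the previous row's value; first row is NaN
up = s.shift(-1)     # each row gets the next row's value; last row is NaN

# Difference between each row and the one before it
diff_prev = s - s.shift(1)

print(pd.DataFrame({"s": s, "shift(1)": down, "shift(-1)": up, "diff": diff_prev}))
```

For a plain difference between consecutive rows, s.diff() gives the same result as s - s.shift(1).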

In our example, df1['x'].shift() will return:

0              NaN
1    455395.996360
2    527627.076641
3    536278.269190
4    553932.441097
5    568699.553239
6    569709.130272
7    573016.302437
8    575141.096777
9    580107.934566

If we want to calculate the Euclidean distance between consecutive points, we can combine shift() with the NumPy functions numpy.sqrt and numpy.power as follows:

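import numpy as np
# Euclidean distance between each point and the previous one (first row is NaN)
df1['diff'] = np.sqrt(np.power(df1['x'].shift() - df1['x'], 2) +
                      np.power(df1['y'].shift() - df1['y'], 2))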

Resulting in:

0              NaN
1     89911.101224
2     21323.016099
3    204394.524574
4     37767.197793
5     46692.771398
6     13246.254235
7      2641.201366
8     15153.187527
9     15853.974422
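
The same calculation can also be sketched with Series.diff() and numpy.hypot, which computes sqrt(a**2 + b**2) element-wise. This is an alternative formulation, not the code used above, and the small DataFrame below holds made-up coordinates rather than the values of df1.

```python
import numpy as np
import pandas as pd

# Hypothetical coordinates (3-4-5 triangles), not the data from df1
df = pd.DataFrame({"x": [0.0, 3.0, 3.0, 6.0],
                   "y": [0.0, 4.0, 4.0, 8.0]})

# diff() is the row minus the previous row; hypot combines the x and y steps
df["diff"] = np.hypot(df["x"].diff(), df["y"].diff())

print(df)
```

The first row is NaN because it has no previous point, exactly as with the shift-based version.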

