Thursday, April 14, 2016

Time-Series in Python

Dealing with time series is a very common task in hydrology.

One way to process time series in Python is to use a plain list.

For example, we can have a list of lists like this:

series1 = [ ['01/01/1900',0.0],['01/02/1900',0.1],['01/03/1900',0.3],['01/04/1900',0.4],['01/05/1900',2.2]...]

Here each inner list, such as ['01/01/1900',0.0], holds a string representing the date and a float representing the value.

To compute properly with dates, including sorting and grouping, the date strings must first be parsed into datetime objects.

import datetime

# Parse each date string (month/day/year) into a datetime object, in place
for i in series1:
    i[0] = datetime.datetime.strptime(i[0], '%m/%d/%Y')
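
As a minimal, self-contained sketch of the same conversion, the snippet below works on a small made-up sample. The sample values and the helper name parse_series are assumptions for illustration only. Unlike the in-place loop, it returns a new list, so the original strings stay available; strptime raises a ValueError if a string does not match the format.

```python
import datetime

# Hypothetical sample data in the same [date string, value] layout
sample = [['01/02/1900', 0.1], ['01/01/1900', 0.0], ['01/03/1900', 0.3]]

def parse_series(rows, fmt='%m/%d/%Y'):
    """Return a new list of [datetime, value] pairs; the input is left untouched."""
    return [[datetime.datetime.strptime(date_str, fmt), value]
            for date_str, value in rows]

parsed = parse_series(sample)

# The first entry is now a datetime object with year, month and day attributes
print(parsed[0][0].year, parsed[0][0].month, parsed[0][0].day)
```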


datetime objects can be compared with each other, which makes it possible to sort the list by date, for example:

series1.sort(key=lambda x: x[0])
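
To see why the parsing step matters for sorting, here is a small sketch comparing a sort on the raw strings with a sort on parsed dates. The three dates are made up for illustration. Because the text format puts the month first, sorting the strings orders by month and ignores the year.

```python
import datetime

# Hypothetical dates spanning two years, in month/day/year text form
dates = ['12/31/1900', '01/15/1901', '06/01/1900']

# Sorting the raw strings compares characters, so the month decides the order
as_text = sorted(dates)

# Sorting on parsed datetime objects gives true chronological order
as_dates = sorted(dates, key=lambda s: datetime.datetime.strptime(s, '%m/%d/%Y'))

print(as_text)   # ['01/15/1901', '06/01/1900', '12/31/1900']
print(as_dates)  # ['06/01/1900', '12/31/1900', '01/15/1901']
```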

And we can make sums or averages based on specific months or years:

# e.g. list of the entries from the year 1900
lst1900 = [item for item in series1 if item[0].year == 1900]



# Sum of the 1900 values (sum is a function, so it is called with parentheses)
sum1900 = sum(item[1] for item in series1 if item[0].year == 1900)



# Average of the 1900 values
avg1900 = sum1900 / float(len(lst1900))
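
Filtering on one year at a time gets repetitive when every month or year is needed. One possible approach, sketched here with made-up values, is to group the entries by (year, month) in a dictionary and compute the sums and averages per group. The variable names and the sample series are assumptions for illustration.

```python
import datetime
from collections import defaultdict

# Hypothetical parsed series: [datetime, value] pairs
series = [
    [datetime.datetime(1900, 1, 1), 0.0],
    [datetime.datetime(1900, 1, 2), 0.5],
    [datetime.datetime(1900, 2, 1), 1.5],
    [datetime.datetime(1901, 1, 1), 2.0],
]

# Collect the values belonging to each (year, month) pair
groups = defaultdict(list)
for date, value in series:
    groups[(date.year, date.month)].append(value)

# Sum and average per group; every group holds at least one value,
# so the division is safe
monthly_sum = {key: sum(vals) for key, vals in groups.items()}
monthly_avg = {key: sum(vals) / float(len(vals)) for key, vals in groups.items()}

# Tuples sort by year first, then month
for key in sorted(groups):
    print(key, monthly_sum[key], monthly_avg[key])
```

Grouping by date.year alone instead of the (year, month) tuple gives yearly totals in the same way.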