The Pandas library in Python provides excellent, built-in support for time series data.
Once your data is loaded, Pandas also provides tools to explore and better understand your dataset.
In this post, you will discover how to load and explore your time series dataset.
After completing this tutorial, you will know:
 How to load your time series dataset from a CSV file using Pandas.
 How to peek at the loaded data and calculate summary statistics.
 How to plot and review your time series data.
Let’s get started.
Daily Female Births Dataset
In this post, we will use the Daily Female Births Dataset as an example.
This univariate time series dataset describes the number of daily female births in California in 1959.
The units are a count and there are 365 observations. The source of the dataset is credited to Newton (1988).
Below is a sample of the first 5 rows of data, including the header row.

"Date","Daily total female births in California, 1959"
"1959-01-01",35
"1959-01-02",32
"1959-01-03",30
"1959-01-04",31
"1959-01-05",44
Below is a plot of the entire dataset taken from Data Market.
You can download the dataset from this website.
Download the dataset and place it in your current working directory with the file name “dailytotalfemalebirthsincal.csv”.
Load Time Series Data
Pandas represents time series datasets as a Series.
A Series is a one-dimensional array with a time label for each row.
We can load the Daily Female Births dataset directly using the Series class (Series.from_csv(), available in Pandas versions before 1.0) as follows:

# Load birth data (Series.from_csv() requires Pandas older than 1.0)
from pandas import Series
series = Series.from_csv('dailytotalfemalebirthsincal.csv', header=0)
print(series.head())
Running this example prints the first 5 rows of the dataset, as follows:

Date
1959-01-01    35
1959-01-02    32
1959-01-03    30
1959-01-04    31
1959-01-05    44
Name: Daily total female births in California, 1959, dtype: int64
The series has a name, which is the column name of the data column.
You can see that each row has an associated date. This is in fact not a column, but instead a time index for the values. Because it is an index, there can be multiple values for one time, and values may be spaced evenly or unevenly across times.
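To make the idea of a time index concrete, here is a minimal sketch on a small hand-built Series. The values, dates, and the name "births" are made up for illustration and are not the births data; the point is only that the dates live in the index, that one timestamp can carry more than one value, and that the spacing between timestamps need not be regular.

```python
import pandas as pd

# Hypothetical observations on a hand-built time index (not the births data).
index = pd.to_datetime(["1959-01-01", "1959-01-01", "1959-01-03", "1959-01-10"])
series = pd.Series([35, 32, 30, 31], index=index, name="births")

# The dates are the index of the Series, not a data column.
print(type(series.index).__name__)

# Two observations share the same timestamp, so the index is not unique.
print(series.index.is_unique)
same_day = series.loc[pd.Timestamp("1959-01-01")]
print(same_day.tolist())

# The gaps between consecutive timestamps are uneven (0, 2 and 7 days).
gaps = series.index.to_series().diff().dropna()
print(gaps.tolist())
```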
The main function for loading CSV data in Pandas is the read_csv() function. We can use this to load the time series as a Series object, instead of a DataFrame, as follows:

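# Load birth data using read_csv
from pandas import read_csv
# Note: squeeze=True was removed in Pandas 2.0; on newer versions, drop it
# and call .squeeze('columns') on the returned DataFrame instead
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, parse_dates=[0], index_col=0, squeeze=True)
print(type(series))
print(series.head())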
Note the arguments to the read_csv() function.
We provide it a number of hints to ensure the data is loaded as a Series.
 header=0: We must specify the header information at row 0.
 parse_dates=[0]: We give the function a hint that data in the first column contains dates that need to be parsed. This argument takes a list, so we provide it a list of one element, which is the index of the first column.
 index_col=0: We hint that the first column contains the index information for the time series.
 squeeze=True: We hint that we only have one data column and that we are interested in a Series and not a DataFrame.
One more argument you may need to use for your own data is date_parser, which specifies the function used to parse datetime values (in Pandas 2.0 and later, date_parser is deprecated in favor of the date_format argument, which takes a format string). In this example, the date format has been inferred, and this works in most cases. In those few cases where it does not, specify your own date parsing function and use the date_parser argument.
Running the example above prints the same output, but also confirms that the time series was indeed loaded as a Series object.

<class 'pandas.core.series.Series'>
Date
1959-01-01    35
1959-01-02    32
1959-01-03    30
1959-01-04    31
1959-01-05    44
Name: Daily total female births in California, 1959, dtype: int64
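If the date format in your own file cannot be inferred, or the squeeze argument is not available in your version of Pandas, one possible workaround is sketched below. It is not the approach used in this post: it loads the dates as text, parses them with to_datetime() and an explicit format string, and then calls squeeze() on the DataFrame. The rows, the "Births" column name, and the day/month/year format are made up for illustration, and the data is read from a string so the sketch runs without a file.

```python
from io import StringIO

import pandas as pd

# Made-up rows with day/month/year dates, standing in for a file whose
# date format should not be left to inference.
raw = StringIO(
    '"Date","Births"\n'
    '"01/02/1959",35\n'
    '"02/02/1959",32\n'
    '"03/02/1959",30\n'
)

# Load the first column as the index; the dates are still plain text here.
frame = pd.read_csv(raw, header=0, index_col=0)

# Parse the index with an explicit format string.
frame.index = pd.to_datetime(frame.index, format="%d/%m/%Y")

# Collapse the single data column into a Series.
series = frame.squeeze("columns")

print(type(series))
print(series)
```

Stating the format explicitly avoids any ambiguity between day-first and month-first dates, which is the most common reason inference goes wrong.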
It is often easier to perform manipulations of your time series data in a DataFrame rather than a Series object.
In those situations, you can easily convert your loaded Series to a DataFrame as follows:

from pandas import DataFrame
dataframe = DataFrame(series)
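As a rough illustration of why a DataFrame can be more convenient, the sketch below converts a small stand-in Series and adds two derived columns next to the original values. The Series name "births" and the column names "month" and "previous_day" are hypothetical choices for this example.

```python
import pandas as pd

# Small stand-in Series using the first five values of the dataset.
index = pd.date_range("1959-01-01", periods=5, freq="D")
series = pd.Series([35, 32, 30, 31, 44], index=index, name="births")

# The Series name becomes the column label and the time index is kept.
dataframe = pd.DataFrame(series)
print(dataframe.columns.tolist())  # ['births']

# Derived columns can now sit alongside the original observations.
dataframe["month"] = dataframe.index.month
dataframe["previous_day"] = dataframe["births"].shift(1)
print(dataframe)
```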
Exploring Time Series Data
Pandas also provides tools to explore and summarize your time series data.
In this section, we’ll take a look at a few common operations to explore and summarize your loaded time series data.
Peek at the Data
It is a good idea to take a peek at your loaded data to confirm that the types, dates, and data loaded as you intended.
You can use the head() function to peek at the first 5 records, or pass a number n to review the first n records.
For example, you can print the first 10 rows of data as follows:

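from pandas import read_csv
# Load the dataset as a Series indexed by date
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Show the first 10 rows
print(series.head(10))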
Running the example prints the following:

Date
1959-01-01    35
1959-01-02    32
1959-01-03    30
1959-01-04    31
1959-01-05    44
1959-01-06    29
1959-01-07    45
1959-01-08    43
1959-01-09    38
1959-01-10    27
You can also use the tail() function to get the last n records of the dataset.
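A quick sketch of tail() follows. To keep it runnable without the CSV file, it uses a stand-in daily series for 1959 whose values are simply 0 to 364, not the real birth counts.

```python
import pandas as pd

# Stand-in daily series for 1959 with made-up values 0..364.
index = pd.date_range("1959-01-01", periods=365, freq="D")
series = pd.Series(range(365), index=index)

# With no argument, head() and tail() return 5 records.
first_five = series.head()

# Pass n to get the last n records instead.
last_three = series.tail(3)
print(last_three)
```

Checking the tail as well as the head is a cheap way to confirm that the final rows of the file were read and that the last date is the one you expect.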
Number of Observations
Another quick check to perform on your data is the number of loaded observations.
This can help flush out issues with column headers not being handled as intended, and give you an idea of how to effectively divide up the data later for use with supervised learning algorithms.
You can get the number of observations in your Series using the size attribute.

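from pandas import read_csv
# Load the dataset as a Series indexed by date
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Report the number of observations
print(series.size)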
Running this example, we can see that, as we would expect, there are 365 observations, one for each day of 1959.
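Once you know the number of observations, you can use it to divide the data while preserving time order. The sketch below is one simple way to do that; the 66% split point is an arbitrary choice for illustration, and the series is a stand-in with made-up values 0 to 364 rather than the real data.

```python
import pandas as pd

# Stand-in daily series for 1959 with made-up values 0..364.
index = pd.date_range("1959-01-01", periods=365, freq="D")
series = pd.Series(range(365), index=index)

# size, len() and shape all report the number of observations.
n = series.size
print(n, len(series), series.shape)  # 365 365 (365,)

# Hypothetical split: the first 66% of observations for training and the
# rest for testing, cut by position so the time order is preserved.
split = int(n * 0.66)
train, test = series.iloc[:split], series.iloc[split:]
print(len(train), len(test))  # 240 125
```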
Querying By Time
You can slice, dice, and query your series using the time index.
For example, you can access all observations in January as follows:

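from pandas import read_csv
# Load the dataset as a Series indexed by date
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Select every observation in January 1959 using a partial date string
print(series.loc['1959-01'])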
Running this displays the 31 observations for the month of January in 1959.

Date
1959-01-01    35
1959-01-02    32
1959-01-03    30
1959-01-04    31
1959-01-05    44
1959-01-06    29
1959-01-07    45
1959-01-08    43
1959-01-09    38
1959-01-10    27
1959-01-11    38
1959-01-12    33
1959-01-13    55
1959-01-14    47
1959-01-15    45
1959-01-16    37
1959-01-17    50
1959-01-18    43
1959-01-19    41
1959-01-20    52
1959-01-21    34
1959-01-22    53
1959-01-23    39
1959-01-24    32
1959-01-25    37
1959-01-26    43
1959-01-27    39
1959-01-28    35
1959-01-29    44
1959-01-30    38
1959-01-31    24
This type of index-based querying can help to prepare summary statistics and plots while exploring the dataset.
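Beyond selecting a whole month, the time index supports a few other kinds of query. The sketch below shows three of them on a stand-in daily series for 1959 with made-up values 0 to 364; the variable names are only illustrative.

```python
import pandas as pd

# Stand-in daily series for 1959 with made-up values 0..364.
index = pd.date_range("1959-01-01", periods=365, freq="D")
series = pd.Series(range(365), index=index)

# A partial date string selects every observation in that month.
january = series.loc["1959-01"]

# A date range slice includes both end points.
first_week = series.loc["1959-01-01":"1959-01-07"]

# Attributes of the index can be used as a filter; dayofweek counts
# Monday as 0, so 5 and 6 are Saturday and Sunday.
weekends = series[series.index.dayofweek >= 5]

print(len(january), len(first_week), len(weekends))  # 31 7 104
```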
Descriptive Statistics
Calculating descriptive statistics on your time series can help get an idea of the distribution and spread of values.
This may help with ideas of data scaling and even data cleaning that you can perform later as part of preparing your dataset for modeling.
The describe() function creates an 8-number summary of the loaded time series: the count, mean, standard deviation, minimum, 25th percentile, median (50th percentile), 75th percentile, and maximum of the observations.

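from pandas import read_csv
# Load the dataset as a Series indexed by date
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Print summary statistics of the observations
print(series.describe())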
Running this example prints a summary of the births dataset.

count    365.000000
mean      41.980822
std        7.348257
min       23.000000
25%       37.000000
50%       42.000000
75%       46.000000
max       73.000000
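The same statistics can also be computed individually or per period, which combines naturally with the time index. The sketch below groups a stand-in series by calendar month; the values 0 to 364 and the name "value" are made up, so the numbers it prints are not those of the births data.

```python
import pandas as pd

# Stand-in daily series for 1959 with made-up values 0..364.
index = pd.date_range("1959-01-01", periods=365, freq="D")
series = pd.Series(range(365), index=index, name="value")

# Individual statistics are available as methods on the Series.
print(series.mean(), series.median())  # 182.0 182.0

# Group by calendar month (1..12) and summarise each month separately.
monthly = series.groupby(series.index.month).agg(["mean", "min", "max"])
print(monthly)
```

Comparing per-month summaries like these against the overall describe() output is one way to spot seasonal shifts before any modeling.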
Plotting Time Series
Plotting time series data, especially univariate time series, is an important part of exploring your data.
This functionality is provided on the loaded Series by calling the plot() function.
Below is an example of plotting the entire loaded time series dataset.

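from pandas import read_csv
from matplotlib import pyplot
# Load the dataset as a Series indexed by date
series = read_csv('dailytotalfemalebirthsincal.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Plot the whole series as a line plot
series.plot()
pyplot.show()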
Running the example creates a time series plot with the number of daily births on the y-axis and time in days along the x-axis.
Summary
In this post, you discovered how to load and handle time series data using the Pandas Python library.
Specifically, you learned:
 How to load your time series data as a Pandas Series.
 How to peek at and calculate summary statistics of your time series data.
 How to plot your time series data.
Do you have any questions about handling time series data in Python, or about this post?
Ask your questions in the comments below and I will do my best to answer.