Visualizing Data with Python

In this post I go over a few ways to manipulate price data and create visuals that aid the investing and trading research process. I have attached a ten-minute YouTube video with fuller explanations, and this post also briefly walks you through the Python code.

First, we will use some Python code to download free data from the Yahoo Finance API. The code below defines a function called “get_data” that downloads and adjusts price data for a specified symbol over a specified period of time. I then download $SPY and $VIX data and store the closing prices in a pandas DataFrame.

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
from pandas_datareader import data
import seaborn as sns

print("Start Time:", datetime.now())

def get_data(symbol, start_date, end_date):
    # Download daily prices for the symbol from Yahoo Finance
    dat = data.DataReader(symbol, "yahoo", start_date, end_date)
    # Scale the raw prices by the ratio of adjusted close to close
    dat['Ratio'] = dat['Adj Close'] / dat['Close']
    dat['Open'] = dat['Open'] * dat['Ratio']
    dat['High'] = dat['High'] * dat['Ratio']
    dat['Low'] = dat['Low'] * dat['Ratio']
    dat['Close'] = dat['Close'] * dat['Ratio']
    return dat

Ticker1 = get_data("SPY",datetime(2005,1,1),datetime.today())
Ticker2 = get_data("^VIX",datetime(2005,1,1),datetime.today())

df = pd.DataFrame(index=Ticker1.index)

df['spy'] = Ticker1['Close']
df['vix'] = Ticker2['Close']
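
To make the adjustment step inside get_data more concrete, here is a minimal sketch that applies the same ratio to a made-up three-day price table. The prices and the helper name adjust_ohlc are assumptions for illustration only; nothing is downloaded.

```python
import pandas as pd

# Hypothetical prices for three days (made-up numbers, not real market data)
raw = pd.DataFrame(
    {
        "Open": [100.0, 102.0, 101.0],
        "High": [103.0, 104.0, 102.0],
        "Low": [99.0, 101.0, 100.0],
        "Close": [102.0, 103.0, 101.0],
        "Adj Close": [96.9, 97.85, 101.0],
    },
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

def adjust_ohlc(dat):
    """Scale Open/High/Low/Close by Adj Close / Close, the same idea as get_data."""
    out = dat.copy()  # leave the caller's frame untouched
    ratio = out["Adj Close"] / out["Close"]
    for col in ["Open", "High", "Low", "Close"]:
        out[col] = out[col] * ratio
    return out

adjusted = adjust_ohlc(raw)
print(adjusted[["Open", "High", "Low", "Close"]])
```

After the adjustment the Close column matches Adj Close, and the other three columns are scaled by the same factor, so each bar keeps its shape.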

The next piece of code shows two ways to accomplish the same thing: a graph of both SPY and VIX. Both create the desired plots, but in later posts we will build on why it is important to know how to plot the same graph in two different ways.

1.
df[['spy','vix']].plot(subplots=True)

2.
fig = plt.figure(figsize=(12,6))  # create only the figure; plt.subplot below adds each axis
ax = plt.subplot(211)
plt.plot(df['spy'],label='spy')
plt.legend(loc=0)

ax = plt.subplot(212)
plt.plot(df['vix'],label='vix',color='g')
plt.legend(loc=0)

The first method is simple and straightforward. The second method creates a “figure” and “axis”. We then use plt.subplot to specify the number of rows, the number of columns, and which chart we are working with. For example, ax = plt.subplot(212) sets our axis to a display that has 2 rows and 1 column, and selects the 2nd graph. plt.subplot(743) would be 7 rows, 4 columns, and the 3rd graph (of 28). You can also separate the values with commas, like this: plt.subplot(7,4,3).

Anyways, here is the output.

[Figure: SPY and VIX plotted as two subplots]
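
As a quick check of the numbering described above, the sketch below asks matplotlib what grid position each call produced. The non-interactive "Agg" backend line is my assumption so the snippet runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
import matplotlib.pyplot as plt

fig_a = plt.figure()
ax_digits = plt.subplot(743)      # three-digit shorthand: 7 rows, 4 columns, 3rd chart

fig_b = plt.figure()
ax_commas = plt.subplot(7, 4, 3)  # the same request written with commas

# get_geometry() reports (rows, columns, first cell, last cell); cells count from 0
geom_digits = ax_digits.get_subplotspec().get_geometry()
geom_commas = ax_commas.get_subplotspec().get_geometry()
print(geom_digits, geom_commas)

# One call that builds the whole 2-row, 1-column layout and returns both axes
fig_c, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(12, 6))
geom_bottom = ax_bottom.get_subplotspec().get_geometry()
```

Note that the chart number in plt.subplot counts from 1, while the cell index reported by get_geometry counts from 0, so the 3rd chart shows up as cell 2.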

The next task is to mark these graphs whenever some significant event happens. In this example, I show code that marks each time SPY falls 10 points or more below its 20-period simple moving average. I then plot SPY and mark each occurrence with a red diamond. I also added a line of code that prints a title, “Buying Opportunities?”, on our chart.

df['MovAvg'] = Ticker1['Close'].rolling(20).mean()
markers = [idx for idx, close in enumerate(df['spy']) if df['MovAvg'].iloc[idx] - close >= 10]
plt.suptitle("Buying Opportunities?")
plt.plot(df['spy'],marker='D',markerfacecolor='r',markevery=markers)

This code creates a Python list named markers. We loop through our SPY data and, whenever our condition is true (price is 10 or more points below the moving average), we store the bar number in the markers list. In the plot line we specify the shape of our marker as a diamond using ‘D’, give it the color red using ‘r’, and mark each point in our markers list using the markevery option. The output of this piece of the code is below.

[Figure: SPY with red diamond markers under the title “Buying Opportunities?”]
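
The same list of bar numbers can also be built without an explicit loop. Below is a minimal sketch on synthetic closes (the price series is made up so the result is easy to verify by hand); it uses a boolean mask plus numpy's flatnonzero, which is one possible alternative and not the method used in the post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for SPY closes: flat at 100, then a sharp two-day drop.
# These numbers are illustrative, not real market data.
closes = [100.0] * 25 + [85.0, 84.0, 95.0, 100.0, 100.0]
demo = pd.DataFrame(
    {"spy": closes},
    index=pd.date_range("2020-01-01", periods=len(closes), freq="B"),
)
demo["MovAvg"] = demo["spy"].rolling(20).mean()

# True where price is 10 or more points below its 20-period moving average.
# The first 19 averages are NaN, and a comparison against NaN is False.
below = (demo["MovAvg"] - demo["spy"]) >= 10

# Positions of the True rows as a plain list, the form markevery accepts
markers = np.flatnonzero(below.to_numpy()).tolist()
print(markers)
```

Only the two bars of the drop qualify here; by the third bar the price has recovered to within 10 points of the average.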

Next, I show how to shade an area of the chart. This can be useful if you are trying to define different market regimes and want to visualize when one started or ended. In this example I use the financial crisis, which I arbitrarily define as October 2007 to March 2009. The code below is extremely simple and only introduces the axvspan function, which takes the start and end points of the region to shade. The code and output are below.

fig, ax = plt.subplots()
ax.plot(df['spy'],label='spy')
ax.axvspan(datetime(2007,10,1), datetime(2009,3,9), alpha=0.5, color='red')

[Figure: SPY with the financial crisis window shaded red]
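
If you want to shade more than one regime, the axvspan call can be wrapped in a small loop. This is a rough sketch: the helper name shade_windows, the synthetic price line, and the second window are all assumptions for illustration; only the first window comes from the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime

# Synthetic upward-sloping line standing in for a price series
dates = pd.date_range("2007-01-01", "2012-12-31", freq="B")
prices = pd.Series(range(len(dates)), index=dates, dtype=float)

windows = [
    (datetime(2007, 10, 1), datetime(2009, 3, 9), "red"),     # window used in the post
    (datetime(2011, 5, 2), datetime(2011, 10, 3), "orange"),  # hypothetical second regime
]

def shade_windows(ax, windows, alpha=0.5):
    """Shade each (start, end, color) window and return the created patches."""
    spans = []
    for start, end, color in windows:
        spans.append(ax.axvspan(start, end, alpha=alpha, color=color))
    return spans

fig, ax = plt.subplots()
ax.plot(prices.index, prices.values, label="synthetic price")
spans = shade_windows(ax, windows)
ax.legend(loc=0)
```

Keeping the windows in a list means adding or removing a regime is a one-line change.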

Personally, I do not like shading graphs and prefer changing the line's color instead. There are a few ways to do this, but this is the simplest workaround for this post. I create two empty lists for our x and y values, named marked_dates and marked_prices. These will contain the points we want to plot in an alternate color. I then loop through the SPY data and, if the date falls within our financial crisis window, add the date to our x list and the price to our y list. I do this with the code below.

marked_dates = []
marked_prices = []

for date, close in zip(df.index, df['spy']):
    # keep only the bars that fall inside the financial crisis window
    if date >= datetime(2007,10,1) and date <= datetime(2009,3,9):
        marked_dates.append(date)
        marked_prices.append(close)

I then plot our original price series and also plot our new x’s and y’s on top of it. The new x’s and y’s are colored red, whereas our original price series is plotted in the default blue. The code and output are below.

fig,ax = plt.subplots()
ax.plot(df['spy'],label='spy')
ax.plot(marked_dates,marked_prices,color='r',label='financial crisis')
plt.legend(loc=0)

[Figure: SPY with the financial crisis period drawn in red]
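
For comparison, pandas can select the same window with a label-based slice instead of a loop. The sketch below runs both approaches on a small synthetic frame (the dates and prices are made up) and is only one possible alternative to the lists built above.

```python
import pandas as pd
from datetime import datetime

# Ten business days of synthetic prices: 0.0, 1.0, ..., 9.0
dates = pd.date_range("2007-09-24", "2007-10-05", freq="B")
demo = pd.DataFrame({"spy": [float(i) for i in range(len(dates))]}, index=dates)

start, end = datetime(2007, 10, 1), datetime(2007, 10, 3)

# Loop version, following the same pattern as the post
marked_dates, marked_prices = [], []
for date, close in zip(demo.index, demo["spy"]):
    if start <= date <= end:
        marked_dates.append(date)
        marked_prices.append(close)

# Slice version: .loc on a sorted DatetimeIndex includes both endpoints
crisis = demo.loc[start:end, "spy"]
print(crisis)
```

The resulting Series carries its own dates, so it could be passed straight to a plot call, for example ax.plot(crisis, color='r').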

That’s it for this post, and I hope this info helps you in visualizing your data. Please let me know if you enjoy these Python tutorial-type posts and I will keep doing them. I know there is a huge interest in Python due to its simplicity.

Also, I understand there may be simpler or more “pythonic” ways to accomplish some of these things. I often write the code this way because I am simplifying it for a wide audience, am unaware of the better way, or plan to build on these blocks in later posts.

Cheers,

Dave

It has been brought to my attention that Yahoo Finance has changed their API and this code will no longer work. However, we can simply change the get_data function to the code below to call the Google Finance API instead:

def get_data(symbol, start_date, end_date):
    dat = data.DataReader(symbol, "google", start_date, end_date)
    dat = dat.dropna()
    return dat

Google adjusts their data so we do not have to, so I removed the adjustment lines. I also swapped out ‘yahoo’ for ‘google’ in the DataReader function parameters. Google’s data is also not as clean, so I added a line to drop NaN values. That’s it: a simple adjustment to change data sources.

Thanks for reading,
Dave


5 thoughts on “Visualizing Data with Python”

  1. Thanks for the post! I decided to pick up with my dip into the python waters again this week. Your post is well-timed.

    For some reason, in the ‘buying opportunities’ part of the code, I got the right display except that vix was also plotted below it. Not sure why! An exercise to figure out…

    1. Anytime. To be honest, it is probably something simple like forgetting to re-comment out the first group of plots.
