GET FREE FINANCIAL DATA W/ PYTHON (EARNINGS ESTIMATES FROM YAHOO FINANCE)

Today I present a simple function to extract earnings estimates from Yahoo Finance. If you have any questions, feel free to leave them in the comments.

This code uses Python 3 on Windows 8.1 but could easily be adapted for Python 2 by changing the 'urllib' import (in Python 2, 'urlopen' lives in 'urllib2').
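If you want one script that runs under both versions, a common pattern is a guarded import. This is only a sketch of that idea, not part of the original code; it binds the name 'urlopen' directly, so the scraper below would need to call 'urlopen(url)' instead of 'u.request.urlopen(url)'.

```python
# Try the Python 3 location first, then fall back to the Python 2 module.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# Either way, the same name is now available to the rest of the script.
print(callable(urlopen))
```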

First we import the necessary packages into our programming environment. 


import pandas as pd
import urllib as u
import urllib.request  # load the submodule explicitly so that u.request is available
from bs4 import BeautifulSoup as bs
import warnings
warnings.filterwarnings("ignore")

I also suppress warnings to hide the deprecation warning raised by the pandas "DataFrame.convert_objects()" method used in the scraper function that follows.
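Note that a blanket "ignore" filter hides every warning, not just the one from pandas. A narrower filter might look like the sketch below. It assumes the deprecation is raised as a 'FutureWarning', and 'legacy_call' is a hypothetical stand-in for the pandas call so the example runs on its own.

```python
import warnings


def legacy_call():
    """Hypothetical stand-in that emits one deprecation-style and one other warning."""
    warnings.warn("convert_objects is deprecated", FutureWarning)
    warnings.warn("unrelated problem", UserWarning)


# catch_warnings restores the previous filters when the block exits.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record everything by default
    # Silence only the one category we expect; this filter takes precedence.
    warnings.filterwarnings("ignore", category=FutureWarning)
    legacy_call()

remaining = [w.category.__name__ for w in caught]
print(remaining)
```

Only the 'UserWarning' is left in 'remaining', so unrelated problems still surface while the known deprecation stays quiet.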

This function takes the Yahoo Finance URL for our symbol of interest and uses BeautifulSoup to parse the resulting HTML. I also added some formatting code to make the headers more readable.


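def _get_eps_estimates(url):
    eps_est = None  # returned as-is if the request or the parsing fails
    try:
        html_source = u.request.urlopen(url).read()
        soup = bs(html_source, 'lxml')
        # find the estimate tables and work with the first one
        table = soup.find_all('table', attrs={'class': 'yfnc_tableout1'})
        # header cells: table title, four period columns; the last five are row labels
        header = [th.text for th in table[0].find_all(class_='yfnc_tablehead1')]
        header_title = header[0]
        header_cols = header[1:5]
        index_row_labels = header[-5:]
        # body cells, skipping the header row
        body = [[td.text for td in row.select('td')] for row in table[0].find_all('tr')]
        body = body[1:]
        df = pd.DataFrame.from_records(body)
        # drop the first column; the row labels become the index instead
        df = df.iloc[:, 1:]
        df.index = index_row_labels
        # add a space after 'Year' and 'Qtr.' in the column headers
        header_cols = pd.Series(header_cols)
        header_cols = header_cols.str.replace(
            'Year', 'Year ').str.replace('Qtr.', 'Qtr. ')
        df.columns = header_cols
        # convert the scraped strings to numbers (deprecated pandas method)
        eps_est = df.convert_objects(convert_numeric=True)
    except Exception as e:
        print(e)
    return eps_est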
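
To try the same parsing steps without hitting the network, here is a rough offline sketch. The HTML snippet is hand-written and its numbers are made up; only the class names are borrowed from the function above, and the real page layout may differ. It also uses 'pd.to_numeric' in place of 'convert_objects', and the built-in 'html.parser' instead of 'lxml', both of which are my substitutions rather than the original approach.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hand-written stand-in for the Yahoo page; the values are invented.
SAMPLE_HTML = """
<table class="yfnc_tableout1">
  <tr>
    <th class="yfnc_tablehead1">Earnings Est</th>
    <th class="yfnc_tablehead1">Current Qtr.</th>
    <th class="yfnc_tablehead1">Next Qtr.</th>
  </tr>
  <tr><td class="yfnc_tablehead1">Avg. Estimate</td><td>1.50</td><td>1.60</td></tr>
  <tr><td class="yfnc_tablehead1">No. of Analysts</td><td>20</td><td>N/A</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", attrs={"class": "yfnc_tableout1"})
rows = table.find_all("tr")

# First row: table title followed by the column headers.
columns = [th.get_text(strip=True) for th in rows[0].find_all("th")][1:]

# Remaining rows: a label cell followed by the values.
records = {}
for row in rows[1:]:
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    records[cells[0]] = cells[1:]

df = pd.DataFrame.from_dict(records, orient="index", columns=columns)

# Convert strings to numbers; anything unparseable (like 'N/A') becomes NaN.
df = df.apply(pd.to_numeric, errors="coerce")
print(df)
```

Because 'errors="coerce"' turns unparseable cells into NaN, every column ends up numeric and no deprecation warning needs to be suppressed.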

Now let's test the function using the proper URL. I'm using the symbol 'SWKS' in this example.


symbol = 'SWKS'

base_url = r'http://finance.yahoo.com/q/ae?s={}+Analyst+Estimates'.format(symbol)
eps_est = _get_eps_estimates(base_url)
eps_est

Your output should appear like the following: