# Synthetic ETF Data Generation (Part-2) - Gaussian Mixture Models

This post summarizes a more detailed Jupyter (IPython) notebook in which I demonstrate how to use Python, Scikit-Learn, and Gaussian Mixture Models to generate realistic-looking return series. In this post we will compare real ETF returns with synthetic realizations.

# Post Outline

• Why IEX?
• Why Parquet?
• System Outline
• Code

# Post Outline

• Notes on Part-2
• The Data
• How Do Aggregate Bid-Ask Spreads Vary with Days To Expiration?
• How Do Bid-Ask Spreads Vary with Volume?
• How Do Bid-Ask Spreads Vary with Volatility?
• Summary Conclusions

# Post Outline

• The Objective
• The Data
• Basic Data Analysis
• How Do Aggregate Bid-Ask Spreads Vary with Days To Expiration?
• How Do Bid-Ask Spreads Vary with Volume?
• How Do Bid-Ask Spreads Vary with Volatility?
• Summary Conclusions

# Get Free Financial Data w/ Python (Fundamental Ratios-From Finviz.com)

This is a simple script that scrapes fundamental ratios from Finviz.com. The basic code can be tailored to suit your application.

``````
"""IPython 3.1, Python 3.4, Windows 8.1"""

import pandas as pd
import urllib.request as u
from bs4 import BeautifulSoup as bs

"""
First visit www.Finviz.com and get the base url for the quote page.
example: http://finviz.com/quote.ashx?t=aapl

Then write a simple function to retrieve the desired ratio.
In this example I'm grabbing Price-to-Book (mrq) ratio
"""

def get_price2book( symbol ):
    try:
        url = r'http://finviz.com/quote.ashx?t={}'\
            .format(symbol.lower())
        # Download the raw HTML of the quote page
        html = u.urlopen(url).read()
        soup = bs(html, 'lxml')
        # Change the text below to get a diff metric
        pb =  soup.find(text = r'P/B')
        # The value sits in the next cell with class 'snapshot-td2'
        pb_ = pb.find_next(class_='snapshot-td2').text
        print( '{} price to book = {}'.format(symbol, pb_) )
        return pb_
    except Exception as e:
        # On any failure, print the error; the function then returns None
        print(e)

"""
Construct a pandas series whose index is the list/array
of stock symbols of interest.

Run a loop assigning the function output to the series
"""
stock_list = ['XOM','AMZN','AAPL','SWKS']
# dtype=object because the scraped values are strings
p2b_series = pd.Series( index=stock_list, dtype=object )

for sym in stock_list:
    p2b_series[sym] = get_price2book(sym)
``````

Running the loop should print output like the following (the exact values change with the market):

``````
XOM price to book = 1.89
AMZN price to book = 20.74
AAPL price to book = 5.27
SWKS price to book = 5.52
``````
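Note that the scraped values are strings, so they need converting before any numeric work. Below is a minimal sketch of one way to do that, using only the standard library. The helper name `to_float` is hypothetical, and treating a `'-'` placeholder as a missing value is an assumption about how the page marks unavailable ratios, not something confirmed by the script above.

```python
import math

def to_float(raw):
    """Convert a scraped ratio string such as '1.89' to a float.

    Returns NaN for None (the scraper returns None when it hits an error)
    and for anything that does not parse as a number, e.g. a '-'
    placeholder (assumed here to mark a missing value).
    """
    if raw is None:
        return math.nan
    try:
        # Strip whitespace and thousands separators before parsing
        return float(raw.strip().replace(',', ''))
    except ValueError:
        return math.nan

# The string values printed above, keyed by symbol
scraped = {'XOM': '1.89', 'AMZN': '20.74', 'AAPL': '5.27', 'SWKS': '5.52'}
ratios = {sym: to_float(val) for sym, val in scraped.items()}

# With numeric values, comparisons work as expected
lowest = min(ratios, key=ratios.get)
print(ratios)
print('Lowest price to book:', lowest)
```

With the sample values shown above, `lowest` is `'XOM'`.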

The code is simple and easy to adapt, so you can spend more time analyzing the data and less time aggregating it.
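As one possible adaptation, the label can be passed in as a parameter so the same lookup serves any ratio. The sketch below is not the original script: `get_metric` is a hypothetical helper, and `SAMPLE_HTML` is a hand-written fragment with made-up values that only imitates the label cell followed by a `snapshot-td2` value cell that the scraper relies on. The real page layout may differ or change, so test against a saved copy of the page before relying on it.

```python
from bs4 import BeautifulSoup

# Hypothetical, hand-written fragment with made-up values. It imitates a
# label cell followed by a value cell of class 'snapshot-td2'.
SAMPLE_HTML = """
<table>
  <tr>
    <td class="snapshot-td2-cp">P/E</td><td class="snapshot-td2"><b>17.10</b></td>
    <td class="snapshot-td2-cp">P/B</td><td class="snapshot-td2"><b>5.27</b></td>
  </tr>
</table>
"""

def get_metric(html, label):
    """Return the text of the value cell that follows `label`, or None."""
    # 'html.parser' ships with Python, so no extra parser is required here
    soup = BeautifulSoup(html, 'html.parser')
    cell = soup.find(string=label)
    if cell is None:
        return None  # label not present on the page
    value = cell.find_next(class_='snapshot-td2')
    return value.text if value is not None else None

for label in ('P/E', 'P/B'):
    print(label, '=', get_metric(SAMPLE_HTML, label))
```

Separating the parsing from the download also makes the lookup easy to check offline, since the function accepts any HTML string.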