2021-01-31 - Exploring Streamlit for Machine Learning Apps

F.G.O. Stuart (1843-1923). With retouches and colorisation by Fidodog14 and SandyShores03. Public domain, via Wikimedia Commons

For this week's learning I started exploring Streamlit. This series of videos on YouTube from So You Want To Be A Data Scientist came up in my feed and gave a really good overview of using Streamlit to produce a web app. After watching them I wanted to give it a go.

In order to try it out I wanted some form of machine learning model that would receive inputs from the app, make a prediction and display it back to the user. I’m very much a beginner with machine learning. I have tried the Kaggle Titanic - Machine Learning from Disaster Competition in the past but didn’t get much further than the introductory example at the time. This time around, though, I had a purpose and was more motivated to progress. I wasn’t trying to create a competition-winning model, just something that would be good enough to use in the web app.

After a bit of data cleansing and manipulation I had a model trained on 80% of the train data set, which returned 83% accuracy on the remaining 20% I used for testing. The model uses the Random Forest Classifier from scikit-learn.
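
The training code isn't shown in this post, so the following is only a rough sketch of the steps described above: an 80/20 split, a Random Forest Classifier, an accuracy check and a pickled model file. The rows are randomly generated stand-ins rather than the real Kaggle data, and the output file name and hyperparameters are assumptions.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, NOT the Kaggle data set: 200 rows of 7 numeric
# features in the order pclass, sex, age, sibsp, parch, fare, embarked.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(0, 3, 200),    # pclass (encoded 0-2)
    rng.integers(0, 2, 200),    # sex (encoded 0-1)
    rng.integers(1, 80, 200),   # age
    rng.integers(0, 9, 200),    # sibsp
    rng.integers(0, 7, 200),    # parch
    rng.integers(0, 500, 200),  # fare
    rng.integers(0, 3, 200),    # embarked (encoded 0-2)
])
y = rng.integers(0, 2, 200)     # made-up "survived" labels (0 or 1)

# Hold back 20% of the rows for testing and train on the other 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Accuracy on the held-back rows (meaningless here because the data is random).
accuracy = accuracy_score(y_test, model.predict(X_test))

# Save the trained model so a separate app script can load it later.
# The file name is an assumption made for this sketch.
model_path = os.path.join(tempfile.gettempdir(), "model_sketch.sv")
with open(model_path, "wb") as f:
    pickle.dump(model, f)
```

With the real data, the encoded columns would come out of the cleansing step and the accuracy figure would be the one to watch.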

Now that I had a reasonable model to work with I moved on to Streamlit. Using the getting started guide and API reference, in no time at all I had the bones of a web app up and running. Another hour or so and I had a completely working application.

I’ve published the code to this repository. This is what I did…

Firstly, import modules.

import streamlit as st
import pickle

Streamlit is obviously required. Pickle is used to open the saved model file.

filename = "model.sv"
# Open the file in a with block so it is closed once the model is loaded
with open(filename, "rb") as f:
	model = pickle.load(f)

Next I define dictionaries to hold the feature labels and values:

sex_d = {0:"Female",1:"Male"}
pclass_d = {0:"First",1:"Second", 2:"Third"}
embarked_d = {0:"Cherbourg", 1:"Queenstown", 2:"Southampton"}

Next I start to define the configuration and layout of the app. Using set_page_config() the page title is set. I then define a container for the overview, two columns (left and right) and finally a container for the prediction.

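st.set_page_config(page_title="Titanic Survival App")
# The beta_ prefix applies to the Streamlit version used here;
# later releases rename these to st.container() and st.columns()
overview = st.beta_container()
left, right = st.beta_columns(2)
prediction = st.beta_container()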

The overview container is populated using the overview variable, under which I use the two methods st.title() and st.markdown() to add text to the page:

with overview:
	st.title("Titanic App")
	st.markdown("Predicting survival in the Titanic disaster")

The columns defined as left and right are then used to hold a combination of radio selections and sliders which will act as inputs to the model. The format_func argument of the radio widget changes the display label for each option:

with left:
	sex_radio = st.radio("Sex", list(sex_d.keys()), format_func=lambda x: sex_d[x])
	pclass_radio = st.radio("Ticket Class", list(pclass_d.keys()),
		format_func=lambda x: pclass_d[x])
	embarked_radio = st.radio("Port of Embarkation", list(embarked_d.keys()),
		index=2, format_func=lambda x: embarked_d[x])

with right:
	age_slider = st.slider("Age", value=50, min_value=1, max_value=100)
	sibsp_slider = st.slider("# Siblings / Spouses on board", min_value=0, max_value=8)
	parch_slider = st.slider("# Parents / Children on board", min_value=0, max_value=6)
	fare_slider = st.slider("Passenger Fare", min_value=0, max_value=500, step=10)
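
To make the role of format_func a little more concrete, here is a small plain-Python sketch of the idea. The option list holds the numeric values the model expects, while format_func only controls what is shown on screen. The radio_labels helper is hypothetical and written purely for illustration; it is not part of Streamlit.

```python
sex_d = {0: "Female", 1: "Male"}


def radio_labels(options, format_func=str):
    """Hypothetical helper: the label a radio widget would show for each option."""
    return [format_func(option) for option in options]


options = list(sex_d.keys())
labels = radio_labels(options, format_func=lambda x: sex_d[x])

# The user picks a label, but the script receives the underlying option,
# so choosing "Male" hands the encoded value 1 to the model.
selected = options[labels.index("Male")]
```

Keeping the encoded values as the options means nothing needs converting back before the prediction is made.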

These inputs are then used to generate a prediction and confidence:

data = [[pclass_radio, 
	sex_radio, 
	age_slider, 
	sibsp_slider, 
	parch_slider, 
	fare_slider, 
	embarked_radio]]
survival = model.predict(data)
s_confidence = model.predict_proba(data)
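
The values in data need to be in the same order as the columns the model was trained on, since a plain list carries no column names. One way to make that order explicit is sketched below. FEATURE_ORDER and build_row are hypothetical names, and the order shown is simply the one used in the list above.

```python
# Assumed training column order, copied from the data list in this post.
FEATURE_ORDER = ["pclass", "sex", "age", "sibsp", "parch", "fare", "embarked"]


def build_row(inputs):
    """Return a single-row 2D list with the values arranged in FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in inputs]
    if missing:
        raise KeyError("missing inputs: {0}".format(", ".join(missing)))
    return [[inputs[name] for name in FEATURE_ORDER]]


# The dictionary can be written in any order; build_row sorts it out.
inputs = {"age": 50, "sex": 1, "pclass": 0, "fare": 0,
          "sibsp": 0, "parch": 0, "embarked": 2}
data = build_row(inputs)
```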

Which is then displayed in the prediction container:

with prediction:
	st.header("Survived? {0}".format("Yes" if survival[0] == 1 else "No"))
	# Index the probabilities of the first row by the predicted class (0 or 1)
	st.subheader("Confidence {0:.2f} %".format(s_confidence[0][survival[0]] * 100))
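
The confidence lookup can be a bit hard to read, so here is a sketch of what it does using a stub in place of the real model. predict_proba returns one row of probabilities per input row, with one column per class, and because the classes are assumed here to be 0 and 1 the predicted label doubles as the column index. StubModel and its probabilities are invented for illustration.

```python
class StubModel:
    """Hypothetical stand-in for the trained classifier, for illustration only."""

    classes_ = [0, 1]  # 0 = did not survive, 1 = survived

    def predict_proba(self, rows):
        # One [P(class 0), P(class 1)] pair per input row; the values are invented.
        return [[0.3, 0.7] for _ in rows]

    def predict(self, rows):
        # The predicted class is the one with the highest probability.
        return [max(self.classes_, key=lambda c: p[c])
                for p in self.predict_proba(rows)]


data = [[0, 1, 50, 0, 0, 0, 2]]  # pclass, sex, age, sibsp, parch, fare, embarked
model = StubModel()

survival = model.predict(data)
probabilities = model.predict_proba(data)

# First (and only) row, then the column belonging to the predicted class.
confidence = probabilities[0][survival[0]]

message = "Survived? {0}".format("Yes" if survival[0] == 1 else "No")
confidence_text = "Confidence {0:.2f} %".format(confidence * 100)
```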

That’s all that’s needed. The app can be run with:

streamlit run titanic_app.py

the output of which gives the URL from which the app is accessible. Hit that in a browser and it’s good to go.

Each time a widget is interacted with, the script is re-run and a new prediction is made based on the current inputs.

In no time at all and with minimal effort we have something usable! I’ve only scratched the surface of what’s possible but for now I’m putting a tick in the box for this project. For much better examples, see the Streamlit Gallery.
