Halo 5 — Building a Streamlit App to Get More Info on Your Competitors — Part 1

12 min read · Jul 29, 2021

For my Capstone Project as a data science student at Flatiron, I decided to dig into something that my friends and I got perhaps a little too into during the many months of lockdown — Halo 5: Guardians. One or two nights a week we would get together online to catch up for a bit, and then we would play the Super Fiesta Party playlist, where players always spawn with random weapons.

There were many occasions where we were pitted against players that made an absolute mockery of our abilities, to the point where we were unable to turn around before getting whacked in some embarrassing way. The skill discrepancies were so great on these occasions that as a part of my final project where I used machine learning to predict matchmaking results, I also decided to make a simple web app to show more detailed player skill information.

I’m certainly not the first to make a stats app for Halo 5. However, none of the apps I found took the liberty to get creative with the data and offer new metrics focused on player skill. Here are a few examples:

Win rate and total time played by game type (Slayer, Capture the Flag, Oddball, Strongholds, etc)
Aiming accuracy
“Per game” features (most apps only show total stats, but dividing these stats by the number of games reveals a lot about a player’s strengths and weaknesses)

I was able to create my version of a dashboard using the Halo API and a great dashboard prototyping tool called Streamlit.

For Part 1 of this blog, we’ll focus on extracting information from the Halo API. If you would like to skip to the end result, click here to check out Part 2.

If you would like to follow along, go ahead and visit https://developer.haloapi.com/ to get your personal API key.

A hint on getting the API key: you might be prompted to submit an application, but this shouldn’t be necessary. Try checking your profile and looking for “Subscriptions.” There, you should find something leading you to subscribe to Developer Access.

The goal of the code below is to create a single dataframe for a player’s most recent match in Halo 5 and graph various skill metrics for the players on each team. It uses 6 separate API calls and a variety of Python functions that will all be chained together to easily create a match dataframe. I should note the first 3 calls will need to be run each time you run the Streamlit app, but the other 3 only need to be called to convert some internal codes to readable information.

Once the code is executed, it will produce a graph like this. The plot function is designed to display a variety of stats, not just the WinRate stat below.

Here’s a quick walkthrough of what we’ll be doing before we dive into the API calls and function code:

Create a function to format any gamertag to a string compatible with the Halo 5 API
Call the Player Match History API to get the Match ID for the most recent game that gamertag has played (using their Xbox Live Gamertag)
Use that Match ID to call the Match Result: Arena API to get specific information about the match.
Compile that information into a ‘base’ dataframe for the match
Use the teammates and enemy team’s gamertags to call the Player Service Records: Arena API and pull more detailed information and skill metrics from each player’s Halo 5 play history
Compile that information into its own ‘history’ dataframe
Call metadata APIs to decode game type, playlist, and map
Merge the ‘base’ and the ‘history’ dataframes
Plot our merged dataframe to show stats by player, sorted by team

1. Insert your API Key

Paste your API key from the Halo Developer Portal as a string.

api_key = 'insert your API key here as a string'
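If you would rather not paste the key directly into a notebook you might share, one option is to read it from an environment variable. This is only a sketch: the variable name HALO_API_KEY and the helper load_api_key are placeholders I made up, not something the Halo Developer Portal defines.

```python
import os

def load_api_key(env_var="HALO_API_KEY"):
    # HALO_API_KEY is a hypothetical variable name chosen for this example
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

With the variable set in your shell, `api_key = load_api_key()` would take the place of the hard-coded string.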

2. Imports

Import standard Python packages and some additional packages required for API calls, converting time codes, and plotting graphs.

#Standard Packages
import pandas as pd
pd.set_option('display.max_columns', None)
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import numpy as np
import pickle
import warnings
warnings.filterwarnings(action='ignore')

# Packages used for API calls and data processing
import requests
import json
import ast
import time
import http.client, urllib.request, urllib.parse, urllib.error, base64
gamertag = 'Drymander'
from tqdm import tqdm
# !pip install isodate
import isodate

# Plotly
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

3. Function: gamertag_for_api

This is a quick function that will convert any gamertag into a string compatible with the API.

# Prepare gamertag for API
def gamertag_for_api(gamertag):
    
    # Replace spaces with '+'
    gamertag = gamertag.replace(' ', '+')
    return gamertag
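To see what the helper does, here is a quick check with a made-up gamertag. For comparison, the standard library's urllib.parse.quote_plus also turns spaces into '+' and additionally escapes other reserved characters. Whether the Halo API needs that extra escaping is an assumption I have not verified.

```python
from urllib.parse import quote_plus

def gamertag_for_api(gamertag):
    # Same logic as the function above: spaces become '+'
    return gamertag.replace(' ', '+')

# 'Master Chief 117' is a made-up gamertag used only for illustration
example = 'Master Chief 117'
formatted = gamertag_for_api(example)
print(formatted)            # Master+Chief+117
print(quote_plus(example))  # Master+Chief+117
```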

4. Function: pull_recent_match

Using a single gamertag, call the Player Match History API, extract Match ID and date, use Match ID to call Match Result: Arena API, return API response as JSON.

def pull_recent_match(gamertag, how_recent, explore=False):
    
    # Use gamertag_for_api function to remove any spaces
    gamertag = gamertag_for_api(gamertag)
    headers = {
        # Request headers
        'Ocp-Apim-Subscription-Key': api_key,
    }
    # Pulls from arena mode, how_recent is how far to go back in the match history
    # 'count' refers to the number of matches to pull
    params = urllib.parse.urlencode({
        # Request parameters
        'modes': 'arena',
        'start': how_recent,
        'count': 1,
        'include-times': True,
    })
    
    # Try this, otherwise return error message
    try:
        
        # Connect to API and pull most recent match for specified gamer
        conn = http.client.HTTPSConnection('www.haloapi.com')
        conn.request("GET", f"/stats/h5/players/{gamertag}/matches?%s" % params, "{body}", headers)
        response = conn.getresponse()
        latest_match = json.loads(response.read())
        
        # Identify match ID and match date
        match_id = latest_match['Results'][0]['Id']['MatchId']
        match_date = latest_match['Results'][0]['MatchCompletedDate']['ISO8601Date']
        
        # Rest for 1.01 seconds to not get blocked by API
        time.sleep(1.01)
        
        # Using match_id, pull details from match
        conn.request("GET", f"/stats/h5/arena/matches/{match_id}?%s" % params, "{body}", headers)
        response = conn.getresponse()
        data = response.read()
        
        # Option to print the raw byte string for alternative viewing
        if explore == True:
            print(data)
            match_results = None
        else:
            # Append match ID and date from player history API call
            match_results = json.loads(data)
            match_results['MatchId'] = match_id
            match_results['Date'] = match_date
        conn.close()
    
    # Print error if there is an issue calling the API
    except Exception as e:
        print(f"Error calling the Halo API: {e}")
        match_results = None
    
    # Return match results parsed from JSON (None if exploring or on error)
    return match_results

# Show result
match_results = pull_recent_match('Drymander', 0, explore=False)
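The two lookups in the middle of the function assume the match history response always contains at least one result. The sketch below shows the same extraction with a guard for an empty history. The payload is a hand-written stand-in that only mirrors the keys read above; the ID and date values are invented, and extract_match_info is a hypothetical helper name.

```python
def extract_match_info(match_history):
    # Return (match_id, match_date), or None if the history has no matches
    results = match_history.get('Results', [])
    if not results:
        return None
    latest = results[0]
    return latest['Id']['MatchId'], latest['MatchCompletedDate']['ISO8601Date']

# Invented sample shaped like the keys the function reads
sample_history = {
    'Results': [
        {
            'Id': {'MatchId': '00000000-0000-0000-0000-000000000000'},
            'MatchCompletedDate': {'ISO8601Date': '2021-07-01T02:30:00Z'},
        }
    ]
}

print(extract_match_info(sample_history))
print(extract_match_info({'Results': []}))  # None
```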

Preview of the API response:

5. Function: build_base_dataframe

This function will take the JSON we just got from the API and convert it into a ‘base’ dataframe for our final match dataframe.

# Function to build the base dataframe for a single match
# Designed to take in the JSON provided by the pull_recent_match function
def build_base_dataframe(match_results, gamertag):
    
    # Define the columns for the empty base match dataframe
    columns = [
        'DNF',
        'TeamId',
        'Gamertag',
        'SpartanRank',
        'PrevTotalXP',
    ]
    df = pd.DataFrame(columns = columns)
    
    # Populate base match dataframe with player stats for each player
    i = 0
    for player in match_results['PlayerStats']:
        player_dic = {}
        # Did Not Finish flag
        player_dic['DNF'] = match_results['PlayerStats'][i]['DNF']
        # Team ID
        player_dic['TeamId'] = match_results['PlayerStats'][i]['TeamId']
        # Team Color
        player_dic['TeamColor'] = match_results['PlayerStats'][i]['TeamId']
        # Gamer Tag
        player_dic['Gamertag'] = match_results['PlayerStats'][i]['Player']['Gamertag']
        # Spartan Rank
        player_dic['SpartanRank'] = match_results['PlayerStats'][i]['XpInfo']['SpartanRank']
        # Previous Total XP
        player_dic['PrevTotalXP'] = match_results['PlayerStats'][i]['XpInfo']['PrevTotalXP']
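        # Add this player's row (pd.concat replaces DataFrame.append, which pandas 2.0 removed)
        df = pd.concat([df, pd.DataFrame([player_dic])], ignore_index=True)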
        i += 1
    
    ########## DATE, GAME VARIANT, MAP ID, MATCH ID, PLAYLIST ID ##########
    df['Date'] = match_results['Date']
    df['Date'] = pd.to_datetime(df['Date']).dt.tz_convert(None)
#     df['Date'] = df['Date'].floor('T')
    df['MatchId'] = match_results['MatchId']
    df['GameBaseVariantId'] = match_results['GameBaseVariantId']
    df['MapVariantId'] = match_results['MapVariantId']
    df['PlaylistId'] = match_results['PlaylistId']
    
    ########## DEFINE PLAYER TEAM ##########
    playerteam = df.loc[df['Gamertag'] == gamertag, 'TeamId'].values[0]
    if playerteam == 0:
        enemyteam = 1   
    else:
        enemyteam = 0
        
    df['PlayerTeam'] = df['TeamId'].map({playerteam:'Player', enemyteam:'Enemy'})
    
    if match_results['TeamStats'][0]['TeamId'] == playerteam:
        playerteam_stats = match_results['TeamStats'][0]
        enemyteam_stats = match_results['TeamStats'][1]
    else: 
        playerteam_stats = match_results['TeamStats'][1]
        enemyteam_stats = match_results['TeamStats'][0]
    
    ########## DETERMINE WINNER ##########
    # Tie
    if playerteam_stats['Rank'] == 1 and enemyteam_stats['Rank'] == 1:
        df['Winner'] = 'Tie'
    # Player wins
    elif playerteam_stats['Rank'] == 1 and enemyteam_stats['Rank'] == 2:
        df['Winner'] = df['TeamId'].map({playerteam:'Victory', enemyteam:'Defeat'})
    # Enemy wins
    elif playerteam_stats['Rank'] == 2 and enemyteam_stats['Rank'] == 1:
        df['Winner'] = df['TeamId'].map({enemyteam:'Victory', playerteam:'Defeat'})
    # Error handling
    else:
        df['Winner'] = 'Error determining winner'
    
    ########## TEAM COLOR ##########
    df['TeamColor'] = df['TeamId'].map({0:'Red', 1:'Blue'})
    
    # Set columns
    df = df[['Date', 'MatchId', 'GameBaseVariantId', 'PlaylistId', 'MapVariantId', 'DNF',
             'TeamId', 'PlayerTeam', 'Winner', 'TeamColor', 
             'Gamertag', 'SpartanRank', 'PrevTotalXP',
            ]]
    # Sort match by winning team
    df = df.sort_values(by=['Winner'], ascending=False)
    
    return df

df = build_base_dataframe(pull_recent_match('Drymander', 8), 'Drymander')
df.head(3)
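The winner logic can be easier to reason about as a small standalone function. This sketch assumes, as the code above does, that a team rank of 1 means first place and that both teams sharing rank 1 means a tie. The name match_outcome is one I made up for illustration.

```python
def match_outcome(player_rank, enemy_rank):
    # Map the two team ranks to the outcome from the player's perspective
    if player_rank == 1 and enemy_rank == 1:
        return 'Tie'
    if player_rank == 1 and enemy_rank == 2:
        return 'Victory'
    if player_rank == 2 and enemy_rank == 1:
        return 'Defeat'
    # Any other combination is unexpected for a two-team match
    return 'Error determining winner'

for ranks in [(1, 1), (1, 2), (2, 1), (3, 3)]:
    print(ranks, match_outcome(*ranks))
```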

Preview of the ‘base’ dataframe:

6. Function: get_player_list

Similar to the function where we converted a single gamertag into an API compatible string, this function will prepare all 8 players’ gamertags into a single string for the API.

def get_player_list(df):
    
    # Create list from our df['Gamertag'] column and remove the brackets
    player_list = str(list(df['Gamertag']))[1:-1]
    
    # Format string for API
    player_list = player_list.replace(', ',',')
    player_list = player_list.replace("'",'')
    player_list = player_list.replace(' ','+')
    
    # Return in one full string
    return player_list
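The same formatting can also be written with str.join, which avoids relying on how Python prints a list. This is a sketch that takes a plain list of gamertags rather than a dataframe, and it should give the same result for ordinary gamertags. The gamertags other than Drymander are invented, and format_player_list is a hypothetical name.

```python
def format_player_list(gamertags):
    # Replace spaces with '+' in each gamertag, then join with commas
    return ','.join(tag.replace(' ', '+') for tag in gamertags)

# 'Some Player' and 'Another One 42' are invented gamertags
players = ['Drymander', 'Some Player', 'Another One 42']
print(format_player_list(players))  # Drymander,Some+Player,Another+One+42
```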

7. Function: get_player_history

Now we’ll take that single string representing all 8 gamertags from the match and call the Player Service Records: Arena API. This will return a list of 8 dictionaries containing lots of interesting player stats.

def get_player_history(df, readable=False):
    headers = {
        # Request headers
        'Ocp-Apim-Subscription-Key': api_key,
    }
    params = urllib.parse.urlencode({
    })
    # Use our function in the block above to prepare the gamertags for the API
    player_list_api = get_player_list(df)
    
    # Try calling service records API using our player list
    try:
        conn = http.client.HTTPSConnection('www.haloapi.com')
        conn.request("GET", f"/stats/h5/servicerecords/arena?players={player_list_api}&%s" % params, "{body}", headers)
        response = conn.getresponse()
        data = response.read()
        player_history = json.loads(data)
        conn.close()
    
    # Print error and return nothing if there is an issue with the API
    except Exception as e:
        print(f"Error calling the Halo API: {e}")
        return None
    
    # Option to return the raw byte string instead of the parsed JSON
    if readable == False:
        return player_history
    else:
        return data

# Show result
player_history = get_player_history(df)
player_history
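Both API functions parse the response body without looking at the HTTP status first, so a rejected request can surface later as a confusing KeyError. A small check before parsing is one way to make failures clearer. This sketch assumes the API answers rejected requests (an invalid key or too many requests, for example) with a non-200 status; parse_api_body is a hypothetical helper, while response.status is the attribute http.client responses expose.

```python
import json

def parse_api_body(status, body):
    # Return parsed JSON for a 200 response, otherwise raise with context
    if status != 200:
        raise RuntimeError(f"API call failed with HTTP status {status}")
    return json.loads(body)

# Hand-written bodies standing in for real API responses
print(parse_api_body(200, b'{"Results": []}'))
```

Inside the functions above it would be called as `parse_api_body(response.status, response.read())`.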

Preview of the API results:

8. Function: build_history_dataframe

Next, we’ll create a separate dataframe with the Player Service Record: Arena stats.

def build_history_dataframe(player_history, variant_id):
    
    # Stats to pull from each player's service record for the
    # game base variant of the match
    stat_list = ['Gamertag', 'TotalKills', 'TotalHeadshots', 'TotalWeaponDamage', 'TotalShotsFired','TotalShotsLanded', 'TotalMeleeKills', 'TotalMeleeDamage', 'TotalAssassinations', 'TotalGroundPoundKills', 'TotalGroundPoundDamage', 'TotalShoulderBashKills','TotalShoulderBashDamage', 'TotalGrenadeDamage', 'TotalPowerWeaponKills','TotalPowerWeaponDamage', 'TotalPowerWeaponGrabs', 'TotalPowerWeaponPossessionTime','TotalDeaths', 'TotalAssists', 'TotalGamesCompleted', 'TotalGamesWon','TotalGamesLost', 'TotalGamesTied', 'TotalTimePlayed','TotalGrenadeKills']
    vdf = pd.DataFrame(columns = stat_list)
    
    # Set counter variable
    i = 0
    # Loop that goes through each player in the player history JSON
    for player in player_history['Results']:
        
        # Loop that goes through each Arena Game Base Variant and locates
        # the details specific to the game base variant of the match
        for variant in player['Result']['ArenaStats']['ArenaGameBaseVariantStats']:
            if variant['GameBaseVariantId'] == variant_id:
                variant_stats = variant
        
        # Create empty dictionary where stats will be added
        variant_dic = {}
        
        # Identify the player and calculate ratio stats
        variant_dic['Gamertag'] = player_history['Results'][i]['Id']
        variant_dic['K/D'] = variant_stats['TotalKills'] / variant_stats['TotalDeaths']
        variant_dic['Accuracy'] = variant_stats['TotalShotsLanded'] / variant_stats['TotalShotsFired']
        # Note: 'WinRate' here is games won divided by games lost (a win/loss ratio)
        variant_dic['WinRate'] = variant_stats['TotalGamesWon'] / variant_stats['TotalGamesLost']
        # Loop that appends all stats to variant dic
        for stat in stat_list[1:]:    
            variant_dic[stat] = variant_stats[stat]
        # Parsing ISO duration times
        variant_dic['TotalTimePlayed']= isodate.parse_duration(variant_stats['TotalTimePlayed']).total_seconds() / 3600
        variant_dic['TotalPowerWeaponPossessionTime']= isodate.parse_duration(variant_stats['TotalPowerWeaponPossessionTime']).total_seconds() / 3600
        # Per game stats
        per_game_stat_list = ['TotalKills', 'TotalHeadshots', 'TotalWeaponDamage', 'TotalShotsFired', 'TotalShotsLanded', 'TotalMeleeKills', 'TotalMeleeDamage', 'TotalAssassinations', 'TotalGroundPoundKills', 'TotalGroundPoundDamage', 'TotalShoulderBashKills', 'TotalShoulderBashDamage', 'TotalGrenadeDamage', 'TotalPowerWeaponKills', 'TotalPowerWeaponDamage', 'TotalPowerWeaponGrabs', 'TotalPowerWeaponPossessionTime', 'TotalDeaths', 'TotalAssists', 'TotalGrenadeKills']
        for stat in per_game_stat_list:
            per_game_stat_string = stat.replace('Total', '')
            per_game_stat_string = f'{per_game_stat_string}PerGame'
            variant_dic[per_game_stat_string] = variant_dic[stat] / variant_dic['TotalGamesCompleted']
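        # Add this player's row (pd.concat replaces DataFrame.append, which pandas 2.0 removed)
        vdf = pd.concat([vdf, pd.DataFrame([variant_dic])], ignore_index=True)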
        i += 1
            
    # Return the history dataframe
    return vdf

df = build_history_dataframe(player_history, '1571fdac-e0b4-4ebc-a73a-6e13001b71d3')
df
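The per game loop above boils down to renaming each total and dividing it by the number of completed games. Here is a minimal sketch of that step on a plain dictionary, with a guard for players who have zero completed games. The numbers are invented, and safe_divide and per_game_stats are hypothetical helper names.

```python
def safe_divide(numerator, denominator):
    # Return None instead of raising when the denominator is zero
    return numerator / denominator if denominator else None

def per_game_stats(totals, stat_names):
    # Turn 'TotalKills' into 'KillsPerGame', and so on
    games = totals['TotalGamesCompleted']
    return {
        f"{name.replace('Total', '')}PerGame": safe_divide(totals[name], games)
        for name in stat_names
    }

# Invented totals for illustration only
totals = {'TotalKills': 300, 'TotalDeaths': 240, 'TotalGamesCompleted': 20}
print(per_game_stats(totals, ['TotalKills', 'TotalDeaths']))
# {'KillsPerGame': 15.0, 'DeathsPerGame': 12.0}
```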

Preview of the ‘history’ dataframe:

9. Function: decode_column

The metadata APIs feature codes and their corresponding values. This function will convert the codes.

# This function will convert codes provided by the API into a readable format
def decode_column(df, column, api_dict):
    
    # Empty list of decoded values
    decoded_list = []
    
    # Loop through each row
    for row in df[column]:
        i = 0
        
        # Loop through API dictionary
        for item in api_dict:
            
            # If code found, append it to list
            if item['id'] == row:
                name = item['name']
                decoded_list.append(name)
            
            # Otherwise keep searching until found
            else:
                i += 1
    
    # Return decoded list
    return decoded_list
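One thing to watch with the loop above is that a code with no match adds nothing to the list, so the result can end up shorter than the column. A dictionary lookup with a fallback value keeps the lengths aligned. This is a sketch on plain lists: the IDs are invented (real ones are GUID strings from the API), and decode_values and the 'Unknown' placeholder are my own choices.

```python
def decode_values(codes, metadata):
    # Build an id -> name lookup once, then translate each code
    names = {item['id']: item['name'] for item in metadata}
    return [names.get(code, 'Unknown') for code in codes]

# Invented metadata entries for illustration
metadata = [
    {'id': 'aaa-111', 'name': 'Slayer'},
    {'id': 'bbb-222', 'name': 'Capture the Flag'},
]
print(decode_values(['bbb-222', 'aaa-111', 'zzz-999'], metadata))
# ['Capture the Flag', 'Slayer', 'Unknown']
```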

10. Function: decode_maps

We’ll need a separate conversion function for the maps.

# This function will convert maps to readable format
def decode_maps(df, column, api_dict):
    
    # Build a lookup of map ID -> map name from the metadata responses
    map_names = {item['id']: item['name'] for item in api_dict if 'id' in item}
    
    # Look up each row; if the map cannot be found, name it 'Custom Map'
    decoded_list = [map_names.get(row, 'Custom Map') for row in df[column]]
    
    # Return decoded list
    return decoded_list

11. API Metadata Calls

Now that we have our conversion functions, we’ll call the metadata APIs and save the responses as dictionaries.

Game Base Variant API:

headers = {
    # Request headers
    'Accept-Language': 'en',
    'Ocp-Apim-Subscription-Key': api_key,
}
params = urllib.parse.urlencode({
})
try:
    conn = http.client.HTTPSConnection('www.haloapi.com')
    conn.request("GET", "/metadata/h5/metadata/game-base-variants?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    GameBaseVariantId_dic = json.loads(data)
    conn.close()
except Exception as e:
    print(f"Error calling the Halo API: {e}")

Playlist Metadata API

headers = {
    # Request headers
    'Accept-Language': 'en',
    'Ocp-Apim-Subscription-Key': api_key,
}
params = urllib.parse.urlencode({
})
try:
    conn = http.client.HTTPSConnection('www.haloapi.com')
    conn.request("GET", "/metadata/h5/metadata/playlists?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    PlaylistId_dic = json.loads(data)
    conn.close()
except Exception as e:
    print(f"Error calling the Halo API: {e}")

Map Variant API

headers = {
    # Request headers
    'Accept-Language': 'en',
    'Ocp-Apim-Subscription-Key': api_key,
}
params = urllib.parse.urlencode({
})
# Note: unique_map_ids is not defined in this post; it is assumed to be a
# collection of MapVariantId strings gathered from previously pulled matches
map_list = []
for map_id in tqdm(unique_map_ids):
    try:
        conn = http.client.HTTPSConnection('www.haloapi.com')
        conn.request("GET", f"/metadata/h5/metadata/map-variants/{map_id}?%s" % params, "{body}", headers)
        response = conn.getresponse()
        data = response.read()
        map_dic = json.loads(data)
        map_list.append(map_dic)
        conn.close()
        time.sleep(1.1)
    except Exception as e:
        print(f"Error calling the Halo API: {e}")
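The fixed time.sleep calls work, but they always pause for the full duration even when the previous request was slow. A small helper that only sleeps for the time remaining is one alternative. Throttle is a hypothetical class I wrote for this sketch, and the actual rate limit for your key is something to confirm in the developer portal.

```python
import time

class Throttle:
    """Wait so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last_call = None

    def wait(self):
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()

# Short interval so the demo finishes quickly; the post pauses about 1 second
throttle = Throttle(min_interval=0.05)
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # an API request would go here
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")
```

In the map loop, `throttle.wait()` would go right before each request in place of the sleep after it.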

14. Main function: recent_match_stats

Finally, we’ll combine all of our functions into one function that compiles the match history dataframe for any specified gamertag going back any number of matches.

# Get the match dataframe
def recent_match_stats(gamertag, back_count=0):
    
    # Pull the match result as JSON from API
    match_results = pull_recent_match(gamertag, back_count, explore=False)
    
    # Build the base dataframe
    base_df = build_base_dataframe(match_results, gamertag=gamertag)
    
    # Convert dates
    base_df['Date'] = base_df['Date'].dt.strftime('%B, %d %Y')
    
    # Decode GameBaseVariantId, PlaylistId, and MapVariantId
    base_df['GameBaseVariantId'] = decode_column(base_df, 'GameBaseVariantId', GameBaseVariantId_dic)    
    base_df['PlaylistId'] = decode_column(base_df, 'PlaylistId', PlaylistId_dic)
    base_df['MapVariantId'] = decode_maps(base_df, 'MapVariantId', map_list)
    
    # Sleep for 1.01 seconds to avoid issues with API
    time.sleep(1.01)
    
    # Create playerlist for player history API call
    player_list = get_player_list(base_df)
    
    # Call API to get player history JSON
    player_history = get_player_history(base_df)
    
    # Build base player stats dataframe based on player history API call
    history_df = build_history_dataframe(player_history, match_results['GameBaseVariantId'])
    
    # Merge the base dataframe and stats dataframe
    full_stats_df = pd.merge(base_df, history_df, how='inner', on = 'Gamertag')
    
    return full_stats_df

Preview of the final match dataframe:

15. Graph function: compare_stat

Now that we have our dataframe, we’ll set up our graph function. This function is designed to do a few things:

Separate the players by team
Locate desired stat specified within the function parameters
Display stats in horizontal bar chart from highest to lowest broken out by player team and enemy team

def compare_stat(df, column_name):    
    df = df.round(2)
    # Separate player and enemy teams
    df_player = df.loc[df['PlayerTeam'] == 'Player']
    df_enemy = df.loc[df['PlayerTeam'] == 'Enemy']
    
    # Sort each team by the selected stat in ascending order
    df_player = df_player.sort_values(by=[column_name])
    df_enemy = df_enemy.sort_values(by=[column_name])
    
    # Assign player / enemy colors
    if df_player['TeamColor'].iloc[0] == 'Blue':
        player_color = 'Blue'
        enemy_color = 'Red'
    else:
        player_color = 'Red'
        enemy_color = 'Blue'
    
    # Make subplot and X axis range
    fig = make_subplots(rows=2, cols=1, subplot_titles=[f'Player Team - {column_name}', 
                                                        f'Enemy Team - {column_name}'],
                       vertical_spacing = 0.12)
    x_range = df[column_name].max()
    
    # Player team sub plot
    fig.add_trace(go.Bar(
                x=df_player[column_name],
                y=df_player['Gamertag'],
                orientation='h',
                text=df_player[column_name],
                textposition='auto',
                marker_color=player_color),
                    row=1, col=1)
    fig.update_xaxes(range=[0, x_range], row=1, col=1)
    
    # Enemy team sub plot
    fig.add_trace(go.Bar(
                x=df_enemy[column_name],
                y=df_enemy['Gamertag'],
                orientation='h',
                text=df_enemy[column_name],
                textposition='auto',
                marker_color=enemy_color),
                    row=2, col=1)
    fig.update_xaxes(range=[0, x_range], row=2, col=1)
    fig.update_yaxes(automargin=True)
    fig['layout'].update(margin=dict(l=125,r=50,b=20,t=30))
    fig['layout'].update(showlegend=False)
    return fig

The graph should look something like this:

And that’s it! By combining the functions above, we’ll be able to efficiently design our Streamlit app.

Click here to check out Part 2, where we walk through the entire setup process for the Streamlit app.