Preamble

import numpy as np                   # for multi-dimensional containers
import pandas as pd                  # for DataFrames
import itertools
import json
import ssl

import plotapi
from plotapi import Chord

plotapi.api_key("5dc347fb-d747-474f-93e8-4e1a9f5d41dd")

# Disable HTTPS certificate verification for the downloads in this notebook.
ssl._create_default_https_context = ssl._create_unverified_context

Introduction

In previous sections, we visualised co-occurrences of Pokémon types. In this section, we're going to apply the same approach to League of Legends, using the champion data published through Riot's Data Dragon to visualise the co-occurrence of champion classes.

The Dataset

The champion data is published as a single JSON file, in which each champion's record is stored under the data key.

Let's download the file and have a look for ourselves.

#data_url = 'http://ddragon.leagueoflegends.com/cdn/12.4.1/data/en_US/champion.json'
data_url = "https://ddragon.leagueoflegends.com/cdn/14.22.1/data/en_US/champion.json"
data = pd.read_json(data_url)
data.head()

data

It looks good so far, but each champion's record is still nested inside the data column. Let's expand these records into columns of their own, keeping the champion identifiers as the index.

data = pd.DataFrame(data.data.tolist()).set_index(data.index)

data

Perfect, we now have one row per champion and one column per variable.
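To see what the expansion step does in isolation, here is a minimal sketch on a tiny hand-made frame. The two records and the names Alpha and Beta are placeholders invented for illustration, not values taken from the downloaded file.

```python
import pandas as pd

# Toy stand-in for the downloaded frame: one nested record per row.
# These records are illustrative placeholders, not real champion data.
raw = pd.DataFrame(
    {
        "version": ["x.y.z", "x.y.z"],
        "data": [
            {"name": "Alpha", "tags": ["Mage"]},
            {"name": "Beta", "tags": ["Tank", "Fighter"]},
        ],
    },
    index=["Alpha", "Beta"],
)

# Expand each dict into its own set of columns, keeping the original index.
flat = pd.DataFrame(raw["data"].tolist()).set_index(raw.index)

print(flat.columns.tolist())
print(flat.loc["Beta", "tags"])
```

Each key of the nested records becomes a column, and the row labels of the original frame are carried over unchanged.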

Data Wrangling

We need to do a bit of data wrangling before we can visualise our data. We can see from the column names that each champion's classes are stored in the tags column.

pd.DataFrame(data.columns.values.tolist())

Each entry in the tags column is a list containing one or two classes. Because the classes are stored as lists rather than in separate columns, there are no missing values to remove at this stage. We only need to handle champions with a single class separately from those with two.

Our chord diagram will need two inputs: the co-occurrence matrix, and a list of names to label the segments.

First we'll populate our list of class names by flattening the tags lists and looking for the unique values.

types = [item for sublist in data.tags.tolist() for item in sublist]
names = np.unique(types).tolist()
pd.DataFrame(names)
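As a small illustration of the flatten-then-unique step, the sketch below runs the same two operations on a made-up list of tags. The tag lists here are placeholders chosen for the example.

```python
import numpy as np

# Made-up tag lists for three hypothetical champions.
toy_tags = [["Mage"], ["Tank", "Fighter"], ["Fighter", "Mage"]]

# Flatten the list of lists into a single list of class names.
flat_tags = [item for sublist in toy_tags for item in sublist]

# np.unique removes duplicates and returns the values in sorted order.
unique_names = np.unique(flat_tags).tolist()

print(unique_names)
```

Because np.unique sorts its output, the order of the labels around the diagram is alphabetical rather than the order in which the classes first appear.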

Now we can create our empty co-occurrence matrix using these class names for the row and column indices.

matrix = pd.DataFrame(0, index=names, columns=names)
matrix

We can populate the co-occurrence matrix with the following approach. For every champion with two classes, we increment the cell for the pairing in both its original and reversed order, so the matrix stays symmetric. Champions with a single class are counted on the diagonal.

len(data.tags.iloc[0])

for x in data.tags:
    if len(x) == 2:
        # two classes: count the pairing in both directions
        matrix.at[x[0], x[1]] += 1
        matrix.at[x[1], x[0]] += 1
    elif len(x) == 1:
        # single class: count on the diagonal
        matrix.at[x[0], x[0]] += 1

matrix = matrix.values.tolist()

We can wrap the list in a DataFrame for better presentation.

pd.DataFrame(matrix)
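The counting rule can be checked by hand on a small example. The sketch below applies the same approach to four made-up tag lists; the data is a placeholder, and the variable names counts and toy_matrix are chosen only for this illustration.

```python
import pandas as pd

# Four hypothetical champions: one single-class, three dual-class.
toy_tags = [["Mage"], ["Tank", "Fighter"], ["Fighter", "Tank"], ["Mage", "Fighter"]]
toy_names = sorted({t for pair in toy_tags for t in pair})

# Start from a matrix of zeros labelled by class name.
counts = pd.DataFrame(0, index=toy_names, columns=toy_names)

for pair in toy_tags:
    if len(pair) == 2:
        # Count the pairing in both directions to keep the matrix symmetric.
        counts.at[pair[0], pair[1]] += 1
        counts.at[pair[1], pair[0]] += 1
    elif len(pair) == 1:
        # A single class is counted against itself, on the diagonal.
        counts.at[pair[0], pair[0]] += 1

toy_matrix = counts.values.tolist()
print(counts)
```

Fighter and Tank appear together twice, Fighter and Mage once, and the lone Mage lands on the diagonal.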

colors = ["#80FF72","#ffce2b","#FA7E07","#ff006e","#8338ec","#3a86ff"]

Chord(matrix, names, colors=colors, curved_labels=True).show()

Chord Diagram with Names

It would be nice to show a list of champion names and images when hovering over co-occurring classes. To do this, we can make use of the optional details and details_thumbs parameters.

Let's also add a column to our dataset to store URLs that point to the images.

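data['URL'] = ""

import os
import urllib.request

# urlretrieve does not create the target folder, so make sure it exists
os.makedirs("img", exist_ok=True)

for index, row in data.iterrows():
    #url = f"http://127.0.0.1:8000/images/data-is-beautiful/lol/champion/{row.name}.png"
    # URL the diagram will use for this champion's thumbnail
    url = f"https://datacrayon.com/datasets/lol-img/champion/{row.name}.png"
    # keep a local copy of the champion image
    urllib.request.urlretrieve(f"https://ddragon.leagueoflegends.com/cdn/14.22.1/img/champion/{row.name}.png", f"img/{row.name}.png")

    data.at[index,'URL'] = url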

data.URL

data.loc['Akali']

names

Next, we'll split the tags into the columns tag_1 and tag_2, and create empty multi-dimensional arrays with the same shape as our matrix for our details and thumbnail images.

data[['tag_1','tag_2']] = pd.DataFrame(data.tags.tolist(), index= data.index)

data
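The effect of splitting a column of lists into two columns is easiest to see on a small frame. In this sketch the two rows are made-up placeholders; the point is that a champion with a single class ends up with a missing value in the second column.

```python
import pandas as pd

# Two hypothetical champions: one with a single class, one with two.
toy = pd.DataFrame(
    {"tags": [["Mage"], ["Tank", "Fighter"]]},
    index=["Alpha", "Beta"],
)

# Lists of unequal length are padded with missing values when expanded.
toy[["tag_1", "tag_2"]] = pd.DataFrame(toy["tags"].tolist(), index=toy.index)

# Rows with no second class can be found with isnull().
single_class = toy[toy["tag_2"].isnull()].index.tolist()
print(single_class)
```

This missing second value is what later lets single-class champions be selected for the diagonal of the details matrix.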

#data.loc[data.tag_2.isna(), 'tag_2'] = data[data.tag_2.isna()].tag_1

details = np.empty((len(names),len(names)),dtype=object)
details_thumbs = np.empty((len(names),len(names)),dtype=object)

Now we can populate the details arrays with lists of champion names and image URLs in the correct positions.

data[(data['tag_1'] == "Assassin") & (data["tag_2"].isnull())]

data.iloc[1].title

for count_x, item_x in enumerate(names):
    for count_y, item_y in enumerate(names):
        if item_y == item_x:
            # diagonal: champions with this class only
            mask = (data['tag_1'] == item_x) & (data["tag_2"].isnull())
        else:
            # off-diagonal: champions with both classes, in either order
            mask = (
                (data['tag_1'].isin([item_x, item_y])) &
                (data['tag_2'].isin([item_y, item_x])))

        # an empty selection gives an empty list
        details[count_x][count_y] = data[mask]['name'].to_list()
        details_thumbs[count_x][count_y] = data[mask]['URL'].to_list()

details=pd.DataFrame(details).values.tolist()
details_thumbs=pd.DataFrame(details_thumbs).values.tolist()

pd.DataFrame(details)
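To make the selection rules concrete, the following sketch fills a details matrix for three made-up champions. The frame toy and its values are placeholders for illustration; the selection logic follows the approach used above, with single-class champions on the diagonal and dual-class champions in both off-diagonal cells.

```python
import numpy as np
import pandas as pd

# Three hypothetical champions; Alpha has no second class.
toy = pd.DataFrame(
    {
        "name": ["Alpha", "Beta", "Gamma"],
        "tag_1": ["Mage", "Tank", "Fighter"],
        "tag_2": [None, "Fighter", "Tank"],
    }
)
toy_names = ["Fighter", "Mage", "Tank"]

toy_details = np.empty((len(toy_names), len(toy_names)), dtype=object)

for i, a in enumerate(toy_names):
    for j, b in enumerate(toy_names):
        if a == b:
            # Diagonal: champions whose only class is a.
            mask = (toy["tag_1"] == a) & toy["tag_2"].isnull()
        else:
            # Off-diagonal: champions with classes a and b in either order.
            mask = toy["tag_1"].isin([a, b]) & toy["tag_2"].isin([a, b])
        toy_details[i][j] = toy[mask]["name"].to_list()

toy_details = toy_details.tolist()
print(toy_details[0][2])
```

The Fighter and Tank cell lists both Beta and Gamma regardless of which class is listed first, and the same list appears in the mirrored cell.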

data.head(1)

data.iloc[0]['info']

def get_class_icon(txt):
    # each known class has an SVG icon named after the class in lower case
    if txt in ("Mage", "Assassin", "Fighter", "Marksman", "Support", "Tank"):
        return f'<div style="display:inline-block; margin-left:2px; margin-right: 2px; text-align:center"><img width="50" height="40" src="https://datacrayon.com/datasets/lol-img/class/{txt.lower()}.svg"></div>'

    # unknown or missing class: no icon
    return ""

for index, row in data.iterrows():
    # build the slug used in the champion page URL
    name = f"{row['name']}"
    name = name.lower().replace("'", "-").replace(". ","-").replace(" & willump", "").replace(" ", "-")

    # champions with a single class have no second tag
    if pd.isna(row['tag_2']):
        data.at[index,'tag_2'] = ""

    url = f"https://na.leagueoflegends.com/en-us/champions/{name}"
    data.at[index,'name_title'] = f'<b><a href="{url}">{row["name"]}</a></b>, {row["title"]}'

    data.at[index,'Class'] = f'{get_class_icon(row["tag_1"])}{get_class_icon(row["tag_2"])}'

    data.at[index,'stats'] = f"<b>Attack</b>: {row['info']['attack']}<br><b>Defense</b>: {row['info']['defense']}<br><b>Magic</b>: {row['info']['magic']}<br><b>Difficulty</b>: {row['info']['difficulty']}<br><b>Resource</b>: {row.partype.replace('Blood Well','Blood')}"



data['name_title'].iloc[0]



# copy the selection so that adding a column does not modify a view of data
data_table = data[['tag_1', "tag_2","Class","stats","name_title"]].copy()
data_table["URL"] = '<img title="' + data['blurb'] + '" src="'+data['URL'] +'">'

data_table.columns = ['Class 1', 'Class 2', 'Class', 'Info', 'Title', '']

Finally, we can put it all together, but this time with the thumbnail details and the data table passed in.

colors = ["#8338ec","#3a86ff","#80FF72","#ffce2b","#FA7E07","#ff006e"]




Chord(
    matrix,
    names,
    colors=colors,
    details_thumbs=details_thumbs,
    noun="Champions",
    thumbs_width=50,
    thumbs_margin=0,
    credit=True,
    padding=0.05,
    arc_numbers=True,
    margin=30,
    curved_labels=True,
    reverse_gradients=True,
    verb="appear together in",
    data_table=data_table.to_csv(index=False),
    data_table_show_indices=False,
    animated_intro=True,
    animated_duration=3000,
).show()

names

Conclusion

In this section, we demonstrated how to conduct some data wrangling on a downloaded dataset to prepare it for a chord diagram. Our chord diagram is interactive, so you can use your mouse or touchscreen to investigate the co-occurrences!

import json

# export the diagram inputs so they can be reused elsewhere
data_table = data_table.to_csv(index=False)

data = {"matrix": matrix,
       "details_thumbs": details_thumbs,
       "data_table": data_table}

with open("lol_classes.json", "w") as fp:
    json.dump(data, fp)
