Log-Log Plot of Covid-19 Using Plotly


From time to time I check on the Covid-19 trends using a log-log diagram. This plot shows the total number of confirmed cases on the X axis and the number of new confirmed cases in the past week on the Y axis, both on logarithmic scales. To reproduce this plot I will be using plotly.

Data and Preparation

The data is sourced from:

I am especially interested in plotting data for the city I live in, in order to get a clearer picture. The raw data is a list of numbers, each representing the new daily cases on a given date:

Date New Cases in Belgrade
10.03.2020. 2
11.03.2020. 3

However, the data obtained is missing values from 30.3.2020. to 15.4.2020. Since this is not a scientific paper, I took the liberty of filling in the missing data using the growth factor. The growth factor represents the rate at which new cases progress. It is calculated by dividing the number of new cases on the current day by the number of new cases on the day before. However, the growth factor data is only available for the whole country, not for the city, so the interpolated data will not be precise.
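One way such a fill could work is sketched below. This is only an illustration, not the exact procedure used for the data in this post: the function name `fill_with_growth_factor`, the use of `None` for missing days, and the sample numbers are all assumptions. Each missing day is estimated as the previous day's value multiplied by that day's growth factor.

```python
def fill_with_growth_factor(new_cases, growth_factors):
    """Fill None gaps in new_cases using per-day growth factors.

    new_cases: daily new cases, with None where the value is missing.
    growth_factors: same length; growth_factors[i] is the ratio of
        new cases on day i to new cases on day i - 1.
    (Both inputs are hypothetical, made up for this sketch.)
    """
    filled = list(new_cases)
    for i, value in enumerate(filled):
        if value is None:
            if i == 0:
                raise ValueError("cannot estimate the very first day")
            # Estimate: previous (possibly estimated) value times the factor
            filled[i] = round(filled[i - 1] * growth_factors[i])
    return filled


known = [10, 12, None, None, 20]
factors = [1.0, 1.2, 1.5, 1.0, 1.1]
print(fill_with_growth_factor(known, factors))  # [10, 12, 18, 18, 20]
```

Because each estimate builds on the previous one, errors accumulate over a long gap, which is one more reason to treat the filled values as rough.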

The Fun Part

My data is stored in a CSV file, and the easiest way to manipulate CSV files is with pandas. It is recommended to install pandas in your virtual environment using pip:

pip install pandas

From here it is easy to obtain daily new cases from the CSV:

import pandas as pd

def load(data):
    return pd.read_csv(data)

if __name__ == '__main__':
    covid = load('./data/interpolated.csv')
    belgrade = covid['Belgrade'].tolist()

Now we are two steps away from the plot. For the X axis, we need to calculate the total number of cases for each day since the virus was first detected in the city. This is the cumulative sum, which can be calculated easily with a list comprehension.

def get_daily_total(data):
    return [t + sum(data[:i]) for i, t in enumerate(data)]

In other words, each day is the sum of the number of new cases that day and all of the previous days.
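The list comprehension re-sums the whole prefix for every day, which is fine for a few months of data. As an alternative sketch, the standard library's `itertools.accumulate` produces the same running totals in a single pass. The function name `get_daily_total_linear` is made up for this example.

```python
from itertools import accumulate


def get_daily_total_linear(data):
    # accumulate yields running sums: data[0], data[0] + data[1], ...
    return list(accumulate(data))


print(get_daily_total_linear([2, 3, 5, 1]))  # [2, 5, 10, 11]
```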

The Y axis uses a similar running sum, but over a seven-day window.

def get_seven_past_days_total(data):
    # Sum of new cases over the seven days ending on (and including)
    # each day. For the first six days the window is shorter, since
    # less history is available.
    seven_days_running_total = []
    for i in range(len(data)):
        window_start = max(0, i - 6)
        seven_days_running_total.append(sum(data[window_start:i + 1]))
    return seven_days_running_total

Each number in the resulting list is the sum of new cases in the last seven days.
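A windowed sum can also be kept as a single running total: add today's value and subtract the value that has just dropped out of the window. The sketch below assumes the window counts the current day as one of the seven. The name `get_window_total` and the `window` parameter are illustrative and not part of the original code.

```python
def get_window_total(data, window=7):
    totals = []
    running = 0
    for i, today in enumerate(data):
        running += today
        if i >= window:
            # The value from `window` days ago is no longer in the window
            running -= data[i - window]
        totals.append(running)
    return totals


# Ten days with one new case each: the total grows to 7 and then stays there
print(get_window_total([1] * 10))  # [1, 2, 3, 4, 5, 6, 7, 7, 7, 7]
```

If the data is already in a pandas Series, `Series.rolling(7, min_periods=1).sum()` computes the same kind of windowed sum.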

Now we have all the numbers we need to make the plot:

if __name__ == '__main__':
    covid = load('./data/interpolated.csv')
    belgrade = covid['Belgrade'].tolist()

    belgrade_daily_total = get_daily_total(belgrade)
    belgrade_seven_days_total = get_seven_past_days_total(belgrade)

But before that, you will need to install plotly:

pip install plotly

Now we can plot the log-log diagram:

import plotly.graph_objects as go

def plot(x, y, title):
    fig = go.Figure()

    # We will use Scatter plot with
    # log-log axes
    fig.add_trace(go.Scatter(x=x, y=y))

    fig.update_layout(
        xaxis_type='log',
        yaxis_type='log',
        xaxis_title='Total Confirmed Cases',
        yaxis_title='New Confirmed Cases in the last 7 days',
        title_text=title,
        font=dict(
            family='Courier New, monospace',
            size=18,
            color='#7f7f7f'
        )
    )

    fig.show()

Finally, we have:

if __name__ == '__main__':
    covid = load('./data/interpolated.csv')
    belgrade = covid['Belgrade'].tolist()

    belgrade_daily_total = get_daily_total(belgrade)
    belgrade_seven_days_total = get_seven_past_days_total(belgrade)

    plot(x=belgrade_daily_total, y=belgrade_seven_days_total, title='Belgrade')

Which gives us this beautiful plot:

[Figure: Covid-19 Belgrade data]

You can find the full code in this repository.
