From time to time I check on the Covid-19 trends using a log-log diagram. This plot shows the total number of confirmed cases on the X axis and the number of new confirmed cases in the past week on the Y axis, both on a logarithmic scale. To reproduce this plot I will be using plotly.
Data and Preparation
The data is sourced from:
I am especially interested in plotting data for the city I live in, to get a clearer picture. The raw data is a list of numbers, each representing the number of new cases on a given date:
| Date | New Cases in Belgrade |
However, the data obtained is missing values from 30.3.2020. to 15.4.2020. Since this is not a scientific paper, I took the liberty of filling in the missing data using the growth factor. The growth factor represents the rate at which new cases progress. It is calculated by dividing the number of new cases on the current day by the number of new cases the day before. However, the growth factor data is available only for the whole country, so the interpolated data for the city will not be precise.
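To make the interpolation concrete, here is a minimal sketch of how a gap could be filled from growth factors. It is not necessarily the exact procedure used for this post: the function name `fill_with_growth_factor` is hypothetical, and the numbers are made up for illustration. It assumes we know the last reported daily count before the gap and one country-wide growth factor per missing day.

```python
def fill_with_growth_factor(last_known, growth_factors):
    """Estimate daily new cases for a gap from day-over-day growth factors.

    last_known: new cases on the last day before the gap
    growth_factors: one factor per missing day (hypothetical input)
    """
    estimates = []
    previous = last_known
    for factor in growth_factors:
        # growth factor = new cases today / new cases yesterday,
        # so today's estimate is yesterday's value times the factor
        previous = previous * factor
        # Round only the reported value, so rounding errors do not compound
        estimates.append(round(previous))
    return estimates


# Made-up numbers, for illustration only
print(fill_with_growth_factor(20, [1.5, 1.0, 0.5]))  # [30, 30, 15]
```

Because the factors describe the whole country, the estimates only approximate what happened in a single city.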
The Fun Part
My data is stored in a CSV file, and the easiest way to manipulate a CSV file is with pandas. It is recommended to install pandas in your virtual environment using pip:
pip install pandas
From here it is easy to obtain daily new cases from the CSV:
    import pandas as pd


    def load(data):
        return pd.read_csv(data)


    if __name__ == '__main__':
        covid = load('./data/interpolated.csv')
        belgrade = covid['Belgrade'].tolist()
Now we are two steps away from the plot. For the X axis, we need to calculate the total number of cases for each day since the virus was first detected in the city. This is the cumulative sum, which can be calculated easily using a list comprehension.
    def get_daily_total(data):
        return [t + sum(data[:i]) for i, t in enumerate(data)]
In other words, each day is the sum of the number of new cases that day and all of the previous days.
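As an aside, the same running total can be sketched with `itertools.accumulate` from the standard library, which avoids re-summing the whole prefix for every day. The function name `get_daily_total_accumulate` and the sample numbers are made up for this example.

```python
from itertools import accumulate


def get_daily_total_accumulate(data):
    # accumulate yields running sums:
    # data[0], data[0] + data[1], data[0] + data[1] + data[2], ...
    return list(accumulate(data))


# Made-up daily counts, for illustration only
new_cases = [2, 0, 5, 3]
print(get_daily_total_accumulate(new_cases))  # [2, 2, 7, 10]
```

For a few months of daily data the difference in speed is negligible, so this is mostly a matter of taste.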
The Y axis uses a similar running sum, but limited to a seven-day window.
    def get_seven_past_days_total(data):
        # This could also be a list comprehension
        # (the ugly one though). But this is left as
        # an exercise to the reader ;)
        seven_days_running_total = []
        for i, today in enumerate(data):
            if i >= 6:
                # Today plus the six previous days: a seven-day window
                seven_days_running_total.append(today + sum(data[i-6:i]))
            else:
                # Fewer than seven days of data so far: sum everything up to today
                seven_days_running_total.append(today + sum(data[:i]))
        return seven_days_running_total
Each number in the resulting list is the sum of new cases in the last seven days.
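For the curious, one possible list-comprehension version of the seven-day window might look like the sketch below. The name `seven_day_totals` and the `window` parameter are hypothetical, and the sample numbers are made up. It sums today together with the six previous days, or fewer if the series is still shorter than a week.

```python
def seven_day_totals(data, window=7):
    # For day i, sum the slice that ends at i (inclusive) and starts
    # window - 1 days earlier; max() keeps the start from going negative.
    return [sum(data[max(0, i - window + 1):i + 1]) for i in range(len(data))]


# Made-up daily counts, for illustration only
new_cases = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(seven_day_totals(new_cases))  # [1, 3, 6, 10, 15, 21, 28, 35, 42]
```

The first six values are simply the running total; from the seventh day on, the oldest day drops out of the window each time a new one is added.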
Now we have all the numbers we need to make the plot:
    if __name__ == '__main__':
        covid = load('./data/interpolated.csv')
        belgrade = covid['Belgrade'].tolist()
        belgrade_daily_total = get_daily_total(belgrade)
        belgrade_seven_days_total = get_seven_past_days_total(belgrade)
But before that, you will need to install plotly:
pip install plotly
Now we can plot the log-log diagram:
    import plotly.graph_objects as go


    def plot(x, y, title):
        fig = go.Figure()
        # We will use Scatter plot with
        # log-log axes
        fig.add_trace(go.Scatter(x=x, y=y))
        fig.update_layout(
            xaxis_type='log',
            yaxis_type='log',
            xaxis_title='Total Confirmed Cases',
            yaxis_title='New Confirmed Cases in the last 7 days',
            title_text=title,
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
        fig.show()
Finally, we have:
    if __name__ == '__main__':
        covid = load('./data/interpolated.csv')
        belgrade = covid['Belgrade'].tolist()
        belgrade_daily_total = get_daily_total(belgrade)
        belgrade_seven_days_total = get_seven_past_days_total(belgrade)
        plot(x=belgrade_daily_total,
             y=belgrade_seven_days_total,
             title='Belgrade')
Which gives us this beautiful plot:
You can find the full code in this repository.