Overview
An interactive exploration of the 2020 Johns Hopkins and Worldometer COVID-19 datasets that produces a suite of standalone animated HTML visualizations plus a Dash dashboard tying them together. The project spans global choropleths, regional comparisons, USA county-level drill-downs, and Holt-Winters forecasting, drawing on roughly 628,000 daily county records and 49,000 province/state-level records from January to July 2020.
All charts share loading and country-name-mapping logic through a central shared_utils.py module, and each visualization script writes a self-contained HTML output that auto-opens in the browser.
Key Achievements
- Built eight standalone visualizations plus an interactive Dash dashboard from a single shared data-loading module
- Processed over 712,000 rows across six CSVs, including 627,920 daily US county records with FIPS codes
- Animated ~1,978 US counties × weekly frames in a single choropleth using the Plotly FIPS GeoJSON
- Implemented Holt-Winters exponential smoothing with a 30-day forward forecast and 95% confidence band
- Solved three-way country-name reconciliation between Johns Hopkins, Worldometer, and Plotly’s built-in country list
- Applied log10 color scaling to span the six orders of magnitude between the smallest and largest outbreaks
- Used weekly downsampling (every 7th day) to reduce 188 daily frames to ~27, making animations responsive without losing the trend
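The weekly downsampling and log10 color scaling listed above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the column names (Date, Country, Confirmed), the two made-up countries, and the choice to clip zero counts to 1 before taking the logarithm are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data: two made-up countries over the 188-day window.
dates = pd.date_range("2020-01-22", "2020-07-27", freq="D")
daily = pd.DataFrame({
    "Date": dates.repeat(2),
    "Country": ["Smallland", "Bigland"] * len(dates),
})
day_number = np.repeat(np.arange(len(dates)), 2)
scale = np.tile([1, 20_000], len(dates))
daily["Confirmed"] = day_number * scale

# Weekly downsampling: keep every 7th distinct date as an animation frame.
keep = pd.DatetimeIndex(daily["Date"].unique()).sort_values()[::7]
weekly = daily[daily["Date"].isin(keep)].copy()

# Log scaling: clip to 1 first so zero counts map to 0 instead of -inf
# (an assumed convention for this sketch).
weekly["log_confirmed"] = np.log10(weekly["Confirmed"].clip(lower=1))

print(len(dates), "daily frames ->", len(keep), "weekly frames")
```

Passing the log-scaled column as the color value keeps small outbreaks visible next to the largest ones, while the frame count drops by roughly a factor of seven.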
Architecture
Every visualization script reads from data/, passes the data through shared_utils.py for aggregation and country-name normalization, and writes a self-contained HTML file to result/. The Dash app reuses the same shared layer to combine a global choropleth (date slider plus metric radio) with a multi-country time-series line chart.
| Technology | Purpose |
|---|---|
| Python 3.8+ | Core language for all scripts |
| pandas | CSV loading, grouping, and date aggregation |
| NumPy | Log scaling and numerical transformations |
| Plotly | Animated choropleths, bubble maps, line and bar charts |
| Dash | Interactive dashboard combining map and time series |
| statsmodels | Holt-Winters exponential smoothing for forecasting |
| FIPS GeoJSON | County-level US geography (cached locally) |
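As a rough illustration of the country-name normalization that the shared layer performs, the sketch below maps source-specific spellings onto one canonical name. The alias table, the function names (normalize_country, unmatched), and the canonical spellings are hypothetical; the actual contents of shared_utils.py are not shown in this document.

```python
# Hypothetical alias table: source-specific spellings -> one canonical name.
# The real mapping in shared_utils.py may differ.
NAME_ALIASES = {
    "US": "United States",
    "USA": "United States",
    "Korea, South": "South Korea",
    "S. Korea": "South Korea",
    "UK": "United Kingdom",
}


def normalize_country(name: str) -> str:
    """Return the canonical name for a country, or the cleaned input if no alias exists."""
    cleaned = name.strip()
    return NAME_ALIASES.get(cleaned, cleaned)


def unmatched(names, known):
    """List normalized names that are still missing from the set of known names."""
    return sorted({normalize_country(n) for n in names} - set(known))


if __name__ == "__main__":
    known_names = {"United States", "South Korea", "United Kingdom"}
    print(unmatched(["US", "S. Korea", "Atlantis"], known_names))
```

Running a check like unmatched against each source before plotting is one way to catch names that would otherwise silently drop off the map.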
Core Features
- Animated world choropleth (covid.py): Weekly-frame animated choropleth of global confirmed cases (Jan–Jul 2020), log-scaled so small and large outbreaks are both visible.
- Bubble map (covid_bubble_map.py): Animated scatter-geo sized by √(confirmed), colored by WHO region.
- Mortality rate (covid_mortality_rate.py): Animated choropleth of deaths / confirmed, capped at 15% to prevent small-sample outliers from washing out the scale.
- Recovery rate (covid_recovery_rate.py): Animated choropleth of recovered / confirmed.
- WHO region comparison (covid_who_regions.py): Line chart of total confirmed cases split by WHO region over time.
- Worldometer snapshot (covid_worldometer.py): Static choropleth of cases per million population (log-scaled) plus a horizontal bar chart of the top 30 countries by critical-case rate, colored by continent.
- USA county drill-down (covid_usa_counties.py): Animated county-level choropleth for the United States using the Plotly FIPS GeoJSON, covering ~1,978 counties × weekly frames.
- 30-day forecast (covid_forecast.py): Holt-Winters exponential smoothing fit to global daily confirmed cases, extrapolated 30 days past the dataset end with a 95% confidence band.
- Interactive Dash dashboard (covid_dashboard.py): A combined global choropleth with date slider and metric radio, alongside a multi-country time-series line chart for side-by-side comparison.