and Resilience Consortium

## Technical Notes

### Epidemic Indicators

Collaborators: Robert Leong, MSc, PhD(c), Dominic Ligot, Jan Gil Sarmiento

and

### Regional Statistics

Project Proponents: Peter Julian Cayton, Jason Alacapa, and Robert Neil Leong

Research Team: Jan Gil Sarmiento, Simon Bismonte, Maryliz Zubiri, and Nicole Uy

Webmasters: Dominic Ligot, Mark Toledo, and Angelica Mhay Salazar

Description:
These are time series data on the development of COVID-19 in the country. They provide up-to-date information on trends in infections, movements in reaction to policies, and targets for flattening the curve. More importantly, they can provide valuable insights into community control of COVID-19.

Value to the LGU:

If provided with data from the LGUs, we can produce metrics that give insight into the state of COVID-19 in the local area. In particular, the metrics displayed provide a real-time evaluation of how transmissible COVID-19 is in the locality, subject to the control measures and testing practices in place as the data are collected. The key metrics currently available (and their utility) are:

• Effective (real-time) reproduction number - measures the current transmissibility, on average, of a single infection in a community (i.e., how many people, on average, will be infected by a single infection under the prevailing community conditions). This is technically defined as the average number of secondary cases that each infected individual would infect if the conditions remained as they were instantaneously, and is computed as (Cori et al., 2013):

$$R_t = {{E[\Delta I_t]} \over {\sum_{s=1}^{t} \Delta I_{t-s} \omega_s}}$$

• ${E[\Delta I_t]}$ := expected number of new infections on day t
• ${\sum_{s=1}^{t} \Delta I_{t-s} \omega_s}$ := weighted sum of the new infections on previous days, with weights $\omega_s$ given by the probability distribution of the serial interval

The serial interval (SI) is the time between the onset of symptoms in the primary case and the onset of symptoms in the secondary case. Our calculations assume that the SI follows a lognormal distribution with mean 4.8 days and standard deviation 2.3 days (Nishiura et al., 2020).
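To make the ratio above concrete, the following is a minimal sketch, not the dashboard's actual estimator. It assumes the serial interval is discretized into daily weights by differencing the lognormal CDF, and it computes a plain plug-in ratio rather than the Bayesian sliding-window estimate of Cori et al. (2013). The function names `lognormal_cdf`, `serial_interval_weights`, and `naive_rt`, and the 20-day truncation, are illustrative assumptions.

```python
import math


def lognormal_cdf(x, mean, sd):
    """CDF of a lognormal distribution given its mean and SD on the natural scale."""
    if x <= 0:
        return 0.0
    sigma2 = math.log(1.0 + (sd / mean) ** 2)   # variance of log(X)
    mu = math.log(mean) - sigma2 / 2.0          # mean of log(X)
    z = (math.log(x) - mu) / math.sqrt(2.0 * sigma2)
    return 0.5 * (1.0 + math.erf(z))


def serial_interval_weights(mean=4.8, sd=2.3, max_days=20):
    """Daily weights w_1..w_max_days, renormalized to sum to 1 (assumed discretization)."""
    raw = [lognormal_cdf(s, mean, sd) - lognormal_cdf(s - 1, mean, sd)
           for s in range(1, max_days + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def naive_rt(new_cases, weights):
    """New cases on day t divided by the weighted sum of new cases on earlier days."""
    estimates = []
    for t in range(len(new_cases)):
        denom = sum(w * new_cases[t - s]
                    for s, w in enumerate(weights, start=1) if t - s >= 0)
        estimates.append(new_cases[t] / denom if denom > 0 else None)
    return estimates


weights = serial_interval_weights()
print(naive_rt([10] * 40, weights)[-1])  # a flat case series gives a ratio close to 1
```

A flat series yields a ratio near 1 once enough history is available, which matches the interpretation that each case replaces itself on average.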

• Delay-adjusted confirmed case fatality rate (CFR) - measures the real-time risk of death of a confirmed case, adjusted for the expected delay between the date of case confirmation and the date of death (i.e., the average time between the report of a confirmed case and the report of its death, if death occurs). This is computed as (Nishiura et al., 2010):

$$CFR_t = {{D_t} \over {{{\sum \Delta I_u F (t-u) } \over C_t } \times (C_t - R_t - D_t) + R_t + D_t } }$$


• $D_t$ := total number of deaths at time t
• $R_t$ := total number of recoveries at time t (in this formula only; not the reproduction number defined earlier)
• $C_t$ := total number of confirmed cases at time t
• ${{\sum \Delta I_u F (t-u) } \over C_t }$ := delay-adjustment factor applied to cases not yet reported as closed (recovered or died) at time t. It gives the share of cases expected to have been closed by time t, given that they were confirmed at time u, that is, t - u days earlier (not all closed cases, especially recoveries, may have been completely reported)

The delay distribution F in our calculations is assumed to be a gamma distribution with mean 10.1 days and standard deviation 5.4 days (Shim et al., 2020).
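The delay-adjusted CFR formula can be sketched as follows. This is an illustrative implementation under stated assumptions, not the dashboard's code: F is treated as the cumulative distribution function of the gamma delay, evaluated with a series expansion of the regularized incomplete gamma function, and the names `gamma_cdf` and `delay_adjusted_cfr` are hypothetical.

```python
import math


def gamma_cdf(x, mean, sd):
    """CDF of a gamma distribution given its mean and SD (series expansion)."""
    if x <= 0:
        return 0.0
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    z = rate * x
    # Regularized lower incomplete gamma P(shape, z) via its power series.
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-14 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def delay_adjusted_cfr(new_cases, deaths_total, recoveries_total,
                       mean=10.1, sd=5.4):
    """CFR_t for the last day of `new_cases` (daily newly confirmed cases)."""
    t = len(new_cases) - 1
    confirmed_total = sum(new_cases)                      # C_t
    if confirmed_total == 0:
        return None
    # Numerator of the adjustment factor: sum over u of (new cases at u) * F(t - u).
    expected_closed = sum(c * gamma_cdf(t - u, mean, sd)
                          for u, c in enumerate(new_cases))
    factor = expected_closed / confirmed_total
    open_cases = confirmed_total - recoveries_total - deaths_total
    denom = factor * open_cases + recoveries_total + deaths_total
    return deaths_total / denom if denom > 0 else None
```

When every case is already closed, the open-case term vanishes and the value reduces to deaths divided by closed cases. When many cases are still open, the adjusted value is larger than the crude ratio of deaths to confirmed cases.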

• Crude Recovery Rate - is the ratio of the total cumulative recoveries $R_t$ and the cumulative number of infected cases $I_t$.

$$RecovRate = {R_t \over I_t}$$


• Percent Change $p_t$ - is the daily growth in cases, deaths, and recoveries. For a time series $Y_t$, the percent change series $p_{t,Y}$ is

$$p_{t,Y} = {{Y_t - Y_{t-1}} \over {Y_{t-1}}}$$


• Implied Doubling Rate $d_t$ - is the length of time needed for the present value of a variable to double given the prevailing rate of increase. It is based on the compound accumulation model. Suppose $A_0$ is the starting value, $p$ is the percent change per time period, and $d$ is the doubling rate, measured in units of the time period. Then the compound accumulation model for doubling is:

$$2A_0 = {A_0(1+p)^d}$$


Solving for $d$ in terms of $p$ gives:

$$d = {{ln(2)} \over {ln(1+p)}}$$


Since there is a time series $p_{t,Y}$ from time series $Y_t$, then the implied doubling rate for $Y_t$ is:

$$d_{t,Y} = {{ln(2)} \over {ln(1+p_{t,Y})}}$$


The two time series $p_{t,Y}$ and $d_{t,Y}$ carry the same information, but both are presented for interested readers. Because doubling time is a function of the growth rate, the equivalent percentage growth can be recovered from the doubling rate:

$$p_{t,Y} = {e^{ln(2) \over d_{t,Y}}-1}$$

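The percent change and implied doubling rate formulas can be sketched in a few lines. The function names are illustrative, and returning `None` where the quantity is undefined (a zero previous value, or non-positive growth) is an assumption about edge-case handling that the notes do not specify.

```python
import math


def percent_change(series):
    """p_t = (Y_t - Y_{t-1}) / Y_{t-1}; None where the previous value is zero or missing."""
    out = [None]
    for prev, cur in zip(series, series[1:]):
        out.append((cur - prev) / prev if prev else None)
    return out


def doubling_time(p):
    """d = ln(2) / ln(1 + p); only defined here for positive growth."""
    if p is None or p <= 0:
        return None
    return math.log(2) / math.log(1 + p)


def growth_from_doubling(d):
    """Inverse relation: p = exp(ln(2) / d) - 1."""
    return math.exp(math.log(2) / d) - 1


cumulative_cases = [100, 110, 121]
growth = percent_change(cumulative_cases)
print([doubling_time(p) for p in growth])
```

For example, a steady daily growth of 10% implies a doubling time of about 7.3 days, and a doubling time of 1 day corresponds to 100% daily growth.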

### CO-INFORM Risk Scoring Dashboard

Lead: Michael Promentilla, PhD and Jomar Rabajante, PhD

Collaborators: Geminn Louis Apostol, MD, MBA; Dominic Ligot, April Anne Tigue, MSc

Description:
The COVID-19 risk index is defined as a function of three dimensions: Hazard & Exposure, Vulnerability, and Resilience. This conceptual framework is adapted from the InfoRM model developed by the European Commission’s Joint Research Centre (JRC), a global multi-hazard disaster risk assessment tool for identifying countries at risk of disaster and humanitarian crisis. The risk concepts used in that model are based on several publications in the scientific literature and consider three dimensions of risk: Hazards & Exposure, Vulnerability, and Lack of Coping Capacity. However, the InfoRM index is not specific to epidemics, and some of its indicators may be unavailable or irrelevant to the needs of decision-makers in the context of the COVID-19 epidemic in the country. We therefore propose a conceptual framework for measuring the COVID-19 risk index, as shown in Figure 1. The approach is modular and allows a simple and transparent calculation of epidemic risk using a composite index methodology. Each dimension comprises several categories, and each category can be broken down into a reliable set of indicators. Geometric aggregation is used to compute the composite index, or risk score, as shown by the following equation:

COVID-19 Risk Index $= {{H^{w1}} \times {V^{w2}} \times {L^{w3}}}$

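• ${H^{w1}}$ := Hazard and Exposure score $H$, raised to its weight $w1$
• ${V^{w2}}$ := Vulnerability score $V$, raised to its weight $w2$
• ${L^{w3}}$ := Lack of Resilience score $L$, raised to its weight $w3$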
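A minimal sketch of the geometric aggregation follows. The notes do not state the weights, so the equal weights of 1/3 used as defaults here are an assumption for illustration only, as is the function name `risk_index`.

```python
def risk_index(hazard, vulnerability, lack_of_resilience,
               weights=(1 / 3, 1 / 3, 1 / 3)):
    """Geometric aggregation H^w1 * V^w2 * L^w3 of three dimension scores.

    The default equal weights are an assumption, not the dashboard's values.
    """
    w1, w2, w3 = weights
    return (hazard ** w1) * (vulnerability ** w2) * (lack_of_resilience ** w3)


print(risk_index(80.0, 40.0, 20.0))
print(risk_index(0.0, 90.0, 90.0))  # zero hazard and exposure gives zero risk
```

Because the aggregation is multiplicative, a score of zero in any dimension drives the whole index to zero, unlike an arithmetic average, where high scores in the other dimensions would compensate.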

In theory, there is no risk from the COVID-19 epidemic if there is no exposure, no matter how severe the hazard event is. Where there is physical exposure and physical vulnerability, the “hard” risk can be computed; it is considered hazard dependent. The risk is also very low if the person or community is not vulnerable, or if the resilience to cope and recover is ideal. The “soft” risk can be computed from the second dimension, vulnerability, which reflects the fragility of the socio-economic system, including the susceptibility associated with low levels of awareness and poor nutritional and health status. These are the social determinants of health and are hazard independent. Likewise, resilience is operationalized and defined by physical infrastructure, health systems capacity, and institutional and management capacity. Conceptually, better epidemic management to absorb, recover, and adapt means higher resilience capacity, which reduces the level of risk arising from vulnerability and exposure to the hazard. Conversely, a lack of resilience translates to higher overall risk.

Value to the LGU:

As a novel risk reduction exercise developed concurrently with the pandemic response, this tool offers several uses for decision-makers in the Philippines.

• Risk index comparisons and visualizations may accelerate not only national but also district-level actions in response to the developing pandemic. Once the situation has settled, the risk index could be used as a guide for reducing risks and drafting preparedness plans to mitigate the future impacts of such health emergencies.
• Information from the CO-INFoRM index could nudge multi-sectoral actions that respond aptly to the complexity and nuances of the crisis.
• Proactive communication, such as a regularly updated visualization, aligns the expectations of the public and the media more closely with the actual risks. This mitigates potential negative impressions, ensures accountability, and promotes public trust, preserving an institution’s brand equity.

Model Framework:

Notes on the Initial Set of Indicators:

| Dimension | Indicator | Description and data processing requirement | Data and sources |
|---|---|---|---|
| ${H^{w1}}$ Hazard & Exposure | Active Case/Outbreak threshold (Number) | The ratio between active infected cases and a number of cases that will potentially lead to an outbreak in the area will be calculated. A lower number means a lower risk. The index will be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization and inversion. The Min value is 0 and the Max value is set to 2. | Data were obtained from Epid. Model of Rabajante (R = 2, c = 0.01) for outbreak threshold and DOH for active cases. Sources: Rabajante (2020); DOH |
| ${H^{w1}}$ Hazard & Exposure | Projected Number of Cases (Number per 100,000 population) | The projected number of infected cases (population-density adjusted) throughout the epidemic period in the area. A higher number means a higher risk. The data will be first normalized with the population of the region and presented as number of cases per 100,000 population. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization. The Min value is 0 cases per 100,000 and Max value is the largest value in the data set. | Data were obtained from Epid. Model of Rabajante (R = 2, peak at 25%, base). Sources: Rabajante (2020) |
| ${H^{w1}}$ Hazard & Exposure | Projected Number of potential fatalities (Number per 1,000,000 population) | Projected number of fatalities considering the age structure of the population and its corresponding case fatality rates. A higher number means a higher risk. The data will be first normalized with the population of the region and presented as number of cases per million population. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization. The Min value is 0% and Max value is the largest value in the data set. | Data were obtained from Epid. Model of Rabajante (R = 2, peak at 25%, base). Sources: Rabajante (2020) |
| ${H^{w1}}$ Hazard & Exposure | Time-varying Reproductive Number R (Rt) | The average number of secondary cases that each infected individual would infect if the conditions remained as they were instantaneously. A higher number means a higher risk. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization. The Min value is 0 and the Max value is set to 2. | Data were obtained from COVID-19 Time Series Dashboard of Cayton. Sources: Cayton (2020) |
| ${V^{w2}}$ Vulnerability | Poverty incidence (%) | Proportion of poor in the total number of households in the area as a proxy data for the social vulnerability. A higher number means a higher risk. Raw data were obtained from Rabajante. The index is then computed from rescaling the data to a score ranging from 0 to 10 using max-min normalization. The Min value is 0 and Max value is 50%. | Data were obtained from Philippine Statistics Authority (Highlights on Household Population, Number of Households, and Average Household Size of the Philippines, 2015 Census of Population). Sources: PSA (2015) |
| ${V^{w2}}$ Vulnerability | Elderly Population (%) | The percentage of senior citizens in the population (60 y/o and above). A higher number means a higher risk. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization. The Min value is 0% and Max value is the largest value in the data set. | Data were obtained from Philippine Statistics Authority (2015 Census of Population). Sources: PSA (2015) |
| ${L^{w3}}$ Lack of Resilience | LGU Health expenditure (%) | The amount of LGU budget allocated to health per capita. This serves as proxy data for effective governance to respond against health-related crises. A higher number translates to higher resilience to reduce risk level. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization and inversion. The Min value is Php 0 and Max value is the largest value in the data set. | Data were obtained from Department of Finance, Bureau of Local Government Finance, Statement of Receipts and Expenditures. Sources: DOF (2018) |
| ${L^{w3}}$ Lack of Resilience | Critical care bed demand (number per million population) | This considers the number of critical care beds: the projected number of critical care beds needed by healthcare during peak. A lower demand translates to higher resilience to reduce risk level. The data will be first normalized with the population of the region and presented as number of critical care beds per million population. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization and inversion. | Data were obtained from Epid. Model of Rabajante (R = 2, peak at 25%, base). Sources: Rabajante (2020) |
| ${L^{w3}}$ Lack of Resilience | Bayanihan Grant (Php per capita) | The amount of financial assistance given to LGU to address COVID-19. This serves as proxy data for effective governance to respond against health-related crises. A higher number translates to higher resilience to reduce risk level. The index will then be computed from rescaling the data to a score ranging from 0 to 100 using max-min normalization and inversion. The Min value is Php 0 and Max value is set to Php 5,236.00 which is based on the 40% allocation for catastrophic payment for health as recommended by WHO. | Data were obtained from Department of Budget and Management, Local Budget Circular No. 125. Sources: DOF (2020) |
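
The max-min normalization and inversion used for the indicators can be sketched as below. This is an illustration under assumptions: values outside the Min and Max bounds are clipped, which the notes do not specify, and the function name `min_max_score` and its parameters are hypothetical.

```python
def min_max_score(value, lo, hi, scale=100.0, invert=False):
    """Rescale `value` from [lo, hi] to [0, scale]; optionally invert the score.

    Clipping values outside [lo, hi] is an assumption made for this sketch.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    score = (clipped - lo) / (hi - lo) * scale
    return scale - score if invert else score


# Rt indicator: Min 0, Max 2, higher value means higher risk.
print(min_max_score(1.0, 0.0, 2.0))
# Bayanihan Grant: Min Php 0, Max Php 5,236, higher value means lower risk (inverted).
print(min_max_score(2618.0, 0.0, 5236.0, invert=True))
```

Inversion is what turns a resilience measure, where more is better, into a Lack of Resilience score, where a higher number means higher risk.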

### National and Regional Forecasts

Project Proponents: Peter Julian Cayton, Jason Alacapa, and Robert Neil Leong

Research Team: Jan Gil Sarmiento, Simon Bismonte, Maryliz Zubiri, and Nicole Uy

Webmasters: Dominic Ligot, Mark Toledo, and Angelica Mhay Salazar

Description:
These are forecasts of the number of newly reported COVID-19 cases in the country and in the regions. They provide up-to-date information on future trends in infections and on future movements in reaction to past policies. More importantly, they can provide valuable insights for short-term planning to mitigate COVID-19.

Value to the LGU:

If provided with data from the LGUs, we can provide estimates that give insight into the state of COVID-19 in the local area over the next 7 days. In particular, the forecasts displayed provide a forward-looking evaluation of COVID-19 trends to aid in preparing for and mitigating possible future surges with a lead time of 7 days. The dashboard provides the following metrics, with their utility:

• Cases by Date of Report: The plot and corresponding numbers show predicted new cases per day. The green region of the graph shows the model's in-sample predictions for the past 170 days, which are used to validate how well the model predicts past daily cases. The orange region shows in-sample estimates of new cases in the most recent 10 days, which may be subject to data delays or adjustments. The purple region shows the forecasted new cases for the next 7 days, indicating the possible direction of new cases based on the analysis of the most recent 180 days of data. The plot also shows 20% (darkest), 50% (darker), and 90% (dark) credible intervals, each giving the range within which cases are expected to fall with that probability.
• Effective Reproduction Number: The plot shows smoothed daily estimates of the effective reproduction number (Rt), which are used to generate the predictions. Rt is discussed in the Epidemic Indicators section of the Technical Notes. As in the Cases by Date of Report plot, the green portion shows the in-sample prediction of Rt for the past 170 days and the orange portion shows Rt for the most recent 10 days. The forecasted Rt is based on the most recent in-sample prediction of Rt and is extended over the next 7 days. To analyze the future direction based on the forecasted Rt, the following rule of thumb may be used:

| Where the number 1 is: | Direction of COVID-19 cases is |
|---|---|
| Below Lower 90 | Increasing |
| Between Lower 90 and Lower 20 | Likely Increasing |
| Between Lower 20 and Upper 20 | Stable |
| Between Upper 20 and Upper 90 | Likely Decreasing |
| Above Upper 90 | Decreasing |
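
The rule of thumb can be expressed as a small lookup, sketched here under the assumption that "Lower 90", "Lower 20", "Upper 20", and "Upper 90" are the bounds of the 90% and 20% credible intervals of the forecasted Rt. How to treat a bound exactly equal to 1 is not stated in the notes; the choice below is illustrative, as is the function name `direction`.

```python
def direction(lower90, lower20, upper20, upper90):
    """Classify the trend by locating the value 1 relative to the Rt interval bounds."""
    if 1 < lower90:
        return "Increasing"          # 1 lies below the whole 90% interval
    if 1 < lower20:
        return "Likely Increasing"   # 1 lies between Lower 90 and Lower 20
    if 1 <= upper20:
        return "Stable"              # 1 lies inside the 20% interval
    if 1 <= upper90:
        return "Likely Decreasing"   # 1 lies between Upper 20 and Upper 90
    return "Decreasing"              # 1 lies above the whole 90% interval


print(direction(0.8, 0.95, 1.05, 1.2))
```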

The model used for the analysis is based on the work of Abbott et al. (2020). The main specifications used for the dashboard are:

1. The initial number of infections is estimated as a free parameter, with a prior based on the initial cases.
2. The initial unobserved growth rate is estimated from the first 7 days of reported case counts and is used as a prior, with a normal distribution and standard deviation 0.2, to estimate latent infections before the first reported case using a log-linear model.
3. At each step, the incidence at time t, $I_t$, is computed from the equation below, using an estimate of $R_t$ with a lognormal prior with mean 1 and standard deviation 0.2 (Abbott et al., 2021), and with the previous infections weighted by the discretized generation time probability distribution $w_s$:

   $I_t = R_t \sum_{s=1}^t w_s I_{t-s}$
4. The infection series is mapped to the mean of reported case counts $D_t$ by convolution with the incubation period and report delay distributions, absorbed into $\xi_s$:

   $D_t = \sum_{s=1}^{t} \xi_s I_{t-s}$
5. Observed case counts $C_t$ are assumed to be generated by a negative binomial distribution with mean $D_t \omega_{ (t \mod 7) }$, where $\omega_{ (t \mod 7) }$ accounts for the weekly seasonality of observed counts, and with overdispersion $\phi$, which has an exponential prior with mean 1:

   $C_t \sim NegBin \left( D_t \omega_{ (t \mod 7) }, \phi \right)$
6. Temporal behavior is controlled by an approximate Gaussian process (GP) with a squared exponential kernel, in which $R_t$ is specified as a scaled random walk:

   $R_t \sim R_{t-1} \times GP$
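
The deterministic part of the specification (the renewal equation, the convolution to expected reports, and the weekly multiplier) can be sketched as a forward simulation. This does not reproduce the Bayesian inference, the priors, or the Gaussian process; it only evaluates the equations for given inputs. All function names, the seeding scheme, and the example weights are hypothetical.

```python
def renewal_infections(rt, gen_weights, seed):
    """Extend a seed infection series with I_t = R_t * sum_s w_s * I_{t-s}."""
    infections = list(seed)
    for r in rt:
        t = len(infections)
        pressure = sum(w * infections[t - s]
                       for s, w in enumerate(gen_weights, start=1) if t - s >= 0)
        infections.append(r * pressure)
    return infections


def expected_reports(infections, delay_weights):
    """D_t = sum_s xi_s * I_{t-s}, with xi_s the combined delay weights."""
    return [sum(x * infections[t - s]
                for s, x in enumerate(delay_weights, start=1) if t - s >= 0)
            for t in range(len(infections))]


def apply_weekday_effect(reports, omega):
    """Mean of the observation model: D_t * omega[t mod 7]."""
    return [d * omega[t % 7] for t, d in enumerate(reports)]


# Illustrative inputs only; these are not the dashboard's fitted distributions.
gen_w = [0.2, 0.5, 0.3]
infections = renewal_infections([1.5] * 7, gen_w, seed=[10.0, 10.0, 10.0])
reports = expected_reports(infections, delay_weights=[0.0, 0.4, 0.6])
print(apply_weekday_effect(reports, omega=[1.0] * 7))
```

With $R_t$ held at 1 and generation weights summing to 1, a constant seed stays constant, which is a quick sanity check on the renewal step.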

The prior for the length scale of the GP is an inverse-gamma distribution with parameters optimized to concentrate 98% of the density between 2 and 21 days, while the magnitude has a standard normal prior.

The incubation period is based on the work of Lauer et al. (2020), which specifies a lognormal distribution with a mean of 5.2 days and a standard deviation of 1.52 days. With this incubation period, the generation time estimates of Ganyani et al. (2020) are refitted to a mean of 3.6 days and a standard deviation of 3.1 days. The incubation period distribution is also used in adjusting the reporting delay distribution.

In estimating the Bayesian model, each time series was fitted independently by MCMC, using at least 4 chains with a warmup of 500 samples each and a total of 4,000 post-warmup samples.

The reproduction number is forecasted to be similar to the last nowcasted $R_t$ from the Bayesian structure for the whole forecast horizon, and the new reported counts are derived through the structure and the forecasted $R_t$. The 20%, 50%, and 90% credible intervals are produced.

References:

Abbott S, Hellewell J, Thompson RN, et al. (2020) Estimating the time-varying reproduction number of SARS-CoV-2 using national and subnational case counts [version 2; peer review: 1 approved with reservations]. Wellcome Open Res. 5: 112. https://doi.org/10.12688/wellcomeopenres.16006.2

Ganyani T, Kremer C, Chen D, et al. (2020) Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data, March 2020. Euro Surveill. 25(17): 2000257.

Lauer SA, Grantz KH, Bi Q, et al. (2020) The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application. Ann Intern Med. 172(9): 577–82.

Special thanks to Ms. Trixie Delmendo for her data cleaning efforts in preprocessing the DOH data for Regional Statistics and Forecasting.