VERIFICATION OF PROBABILISTIC SEVERE STORM FORECASTS AT THE SPC

Michael P. Kay
NOAA/NWS/SPC
Norman, Oklahoma

Harold E. Brooks
NOAA/NSSL
Norman, Oklahoma

1. INTRODUCTION

The Storm Prediction Center (SPC) is responsible for issuing severe weather forecasts for the conterminous United States. The SPC issues products at several scales. At one end is the convective watch, which is issued on an as-needed basis for time scales of several hours and spatial scales of one or more states, and which highlights areas where severe weather is imminent. At the other end is the convective outlook, a scheduled product issued for the entire U.S. and updated several times daily, which describes areas where severe thunderstorm development is possible. Convective outlooks are issued for periods from several days to several hours. The convective outlook is the primary forecast used by a wide variety of users, such as National Weather Service (NWS) offices, emergency managers, and other organizations, as the initial step in the severe storms forecast and warning process.

The convective outlook is a categorical forecast of severe weather that depicts the threat of severe weather in terms of "risk" (slight, moderate, high). This structure does not explicitly state the forecaster's expectations of the threats of individual severe weather hazards (hail, damaging winds, and tornadoes) (Fig. 1).

In March 1999 the SPC began issuing experimental subjective probabilistic forecasts of individual severe weather types, with the goal of eventually moving away from the categorical convective outlooks. Probabilistic forecasts express uncertainty directly, unlike categorical forecasts, where such information is often hidden in imprecise wording. They provide the important information directly to users, who can make their own decisions without first having to gauge the uncertainty implied in categorical statements. It can be shown that the use of probabilistic expressions of uncertainty provides value for rational decision makers in the context of simple decision models (Murphy, 1993). In this context, "rational" implies that decision makers receive the forecast information, make decisions based upon their knowledge of the costs and potential losses associated with each decision, and choose consistently so as to maximize the benefit associated with the situation.
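As a minimal illustration of the kind of simple decision model referred to above, the sketch below uses the classic cost-loss setting: a user can pay a cost to protect against an event, or risk a larger loss if the event occurs while unprotected. The function names and the numerical values are illustrative assumptions and are not taken from SPC practice.

```python
def expected_expense(p, cost, loss, protect):
    """Expected expense of one decision, given event probability p."""
    # Protecting always costs `cost`; not protecting risks `loss` with probability p.
    return cost if protect else p * loss


def should_protect(p, cost, loss):
    """A rational user protects when the expected loss exceeds the cost."""
    return p * loss > cost


# Hypothetical user: protection costs 1 unit, an unprotected event costs 20.
cost, loss = 1.0, 20.0
for p in (0.02, 0.05, 0.15, 0.25):
    action = "protect" if should_protect(p, cost, loss) else "do nothing"
    print(f"p = {p:.2f}: {action}")
```

A user with a different ratio of cost to loss would reach a different decision from the same probability, which is information a single categorical label cannot supply.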

Fig. 1. 2000 UTC Convective Outlook for 16 May 1999.

An important part of any forecast, both from the user's and the forecaster's perspective, is meaningful verification. Verification systems are often designed in a post-mortem effort to assess forecast performance. An important aspect of the design of the SPC probabilistic convective outlooks has been to develop a scientifically sound and integrated system for all aspects of the forecasts and the verification. Some of the difficulties associated with attempts to develop meaningful verification of the traditional SPC categorical outlooks have been discussed previously by Weiss et al. (1980). Several important aspects of the system will be discussed including the development of a basic climatology of severe weather, the choice of appropriate probabilities for the various threats, and a framework for transforming daily severe weather reports into probabilities. Initial results will also be presented.

2. REPORT CLIMATOLOGY

It is especially important in probabilistic forecasting and verification to have knowledge of the climatological distribution of forecast events. The NWS severe weather database for the period 1980 to 1994 has been used to develop the climatology of storm reports. Doswell and Burgess (1988) discuss numerous issues that affect the quality of the database and therefore any climatologies developed from it. The climatologies of hail, wind, and tornadoes are expressed as probabilities of one or more events occurring within 25 miles of a point, which is identical to the way probabilities are defined in the experimental outlooks.

Fig. 2a. Damaging wind probabilities for 16 May 1999. Brown contour is 5%, yellow is 15%.

Fig. 2b. Tornado probabilities for 16 May 1999. Green contour is 2%, brown is 5%, yellow is 15%. Hatched cyan area represents 10% or greater chance of 1 or more F2 or stronger tornadoes.

Fig. 2. 2000 UTC probabilistic outlooks for (a) wind and (b) tornadoes for 16 May 1999.

Probabilities are produced by first binning storm reports onto a grid covering the U.S. (80 km nominal grid spacing). The area of each grid box is roughly equivalent to the area of a circle 25 statute miles in radius. These grids are then smoothed in both time and space using non-parametric density estimation techniques (Simonoff, 1996). The resulting distribution is consistent with the unknown, underlying statistical distribution of storm reports. Standard deviations of the Gaussian smoothers are 15 days and 120 km in time and space, respectively. These values of the parameters provide smooth, slowly varying annual cycles of probabilities for hail, wind, and tornadoes on any day of the year at any location within the grid. More aggressive choices for the smoothing parameters would have resulted in increased small-scale features in the climatologies. However, given the nature of the dataset, and the convective outlook, it is prudent to focus on the large-scale features which should be more reliable. For more details on the methodology for producing the probabilities see Brooks et al. (1998).
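The temporal half of the smoothing described above can be sketched as follows. This is not the authors' code: it assumes a single grid box, a 365-day year, and a Gaussian kernel wrapped around the end of the year, and the input values are synthetic. The spatial smoothing with a 120 km standard deviation would be applied analogously across grid boxes; see Brooks et al. (1998) for the actual methodology.

```python
import math

DAYS = 365          # leap days ignored in this sketch
SIGMA_DAYS = 15.0   # temporal standard deviation quoted in the text


def gaussian_weights(sigma, n):
    """Normalized Gaussian weights on a circular axis of length n."""
    weights = []
    for lag in range(n):
        d = min(lag, n - lag)          # circular distance in days
        weights.append(math.exp(-0.5 * (d / sigma) ** 2))
    total = sum(weights)
    return [w / total for w in weights]


def smooth_annual_cycle(raw_freq, sigma=SIGMA_DAYS):
    """Smooth a day-of-year frequency series with a circular Gaussian kernel."""
    n = len(raw_freq)
    w = gaussian_weights(sigma, n)
    return [sum(w[(j - i) % n] * raw_freq[j] for j in range(n)) for i in range(n)]


# Synthetic raw frequencies for one grid box: fraction of years with at
# least one report on each day (made-up values, not SPC data).
raw = [0.0] * DAYS
raw[130] = 0.4
raw[135] = 0.2
raw[150] = 0.2

smoothed = smooth_annual_cycle(raw)
peak_day = max(range(DAYS), key=lambda d: smoothed[d])
print(peak_day, round(smoothed[peak_day], 4))
```

Because the weights are normalized, the smoothing spreads the isolated spikes over neighboring days without changing the annual total, which is what yields a slowly varying annual cycle.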

3. DEVELOPMENT OF OUTLOOK PROBABILITIES

In theory, the probabilities for the outlooks may be anywhere from 0 to 100%. In practice, however, it is useful to provide guidance to forecasters about the practical range of probabilities for the events being forecast. Reliability is an important aspect of probabilistic forecasts and refers to the degree of correspondence between the forecast probabilities and the observed relative frequencies of the event being forecast. NWS forecasters responsible for probability of precipitation (PoP) forecasts have shown the ability to calibrate themselves and produce very reliable forecasts (Murphy et al., 1985).

The SPC has been producing the categorical convective outlook for more than 20 years with the risk categories of slight, moderate, and high. The procedures involved in creating the product are well understood by the SPC forecasters. It is desirable to develop probabilities that are directly related to the traditional outlooks, for the benefit of both the forecasters and the users of the SPC products.

To develop the probabilities associated with each category, 1200 UTC convective outlooks were gridded for the 5-year period from 1990 to 1994 (1106 slight risk areas, 224 moderate risk areas, and 21 high risk areas). Storm reports were placed on the same grid for each outlook for the same 24-hour valid period as the outlook. Observed relative frequencies (probabilities) were then computed for each report type and for each risk category. This allowed a direct comparison between the probabilities of individual hazards and risk category. The 75th percentile of the resulting coverage probabilities was chosen as the threshold value. This allows SPC forecasters to immediately relate the categorical outlooks that they are familiar with to the newer probabilistic forecasts in a consistent manner. For example, the coverage probability of tornadoes in slight risk areas is approximately 2% (Fig. 3). Therefore the lower bound on the probabilistic tornado forecasts is set to 2%. Typical coverages of tornadoes in moderate and high risk areas are 5% and 15%, respectively. The corresponding probabilities for hail and wind are 5%, 15%, and 25%. Forecasters can also choose a 35% probability for any hazard. The probabilities may be adjusted in the future to account for increased event reporting.
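The calculation described above can be sketched roughly as follows, under one reading of the text: the coverage of a risk area is taken to be the fraction of its grid boxes that received at least one report, and the percentile is taken over all risk areas of a given category. The helper names, the interpolation rule, and the sample data are assumptions for illustration only.

```python
def coverage(area_boxes, report_boxes):
    """Fraction of grid boxes in a risk area containing at least one report."""
    area = set(area_boxes)
    if not area:
        raise ValueError("risk area contains no grid boxes")
    return len(area & set(report_boxes)) / len(area)


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# Made-up example: four slight risk areas, each a list of (row, col) boxes,
# paired with the boxes that received a tornado report that day.
outlooks = [
    ([(0, c) for c in range(50)], [(0, 3)]),
    ([(1, c) for c in range(40)], []),
    ([(2, c) for c in range(100)], [(2, 10), (2, 11), (2, 90)]),
    ([(3, c) for c in range(25)], [(3, 0), (9, 9)]),   # (9, 9) lies outside the area
]
coverages = [coverage(area, reports) for area, reports in outlooks]
print(coverages, percentile(coverages, 75))
```

Repeating this for each hazard and each risk category would give one threshold per combination, which is how a familiar category can be tied to a probability value.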

Fig. 3. Coverage probabilities for tornadoes as a function of percentile for all 1200 UTC outlooks from 1990 - 1994.

4. DEVELOPING EVENT PROBABILITIES

Another important component of forecast verification is the determination of the skill embodied within the forecasts themselves. Verification of rare event forecasts is complicated by the fact that forecast difficulty varies greatly from situation to situation. Forecaster credit must be limited in situations where correct forecasts of non-events dominate the dataset. To further complicate matters, the SPC explicitly expects to have both false alarms (parts of the outlook where there are no events) and missed detections (events outside of outlook areas). Thus, absolute limits of 0 and 1 for such parameters as Probability of Detection may not be very useful. Using the storm reports, a "practically" perfect forecast is developed. This forecast is consistent with the one that a forecaster would make given perfect knowledge of the events beforehand. The methodology is very similar to that used to produce the severe weather climatologies discussed in section 2. Non-parametric density estimation is used to produce forecasts of events that represent probabilities of severe weather at any grid point. Developing such a forecast also allows forecasters to verify their forecasts subjectively by visually comparing them to the "practically" perfect forecast. This subjective element of verification is all too often underutilized and underappreciated.
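A "practically" perfect forecast of this kind might be sketched as below: a grid of observed events is smoothed with a two-dimensional Gaussian kernel so that each grid point receives a probability-like value. The kernel width is an assumption borrowed from the climatology in section 2 (the text does not state the value used for this purpose), the cap at 1 is a choice made for this sketch, and the report locations are made up.

```python
import math

GRID_KM = 80.0     # nominal grid spacing from section 2
SIGMA_KM = 120.0   # assumed here; the text gives this value only for the climatology


def practically_perfect(events, sigma_km=SIGMA_KM, grid_km=GRID_KM):
    """Smooth a binary event grid (1 = at least one report) with a 2-D Gaussian kernel."""
    rows, cols = len(events), len(events[0])
    s = sigma_km / grid_km                       # sigma in grid units
    norm = 1.0 / (2.0 * math.pi * s * s)         # 2-D Gaussian density normalization
    hits = [(r, c) for r in range(rows) for c in range(cols) if events[r][c]]
    field = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for er, ec in hits:
                d2 = (r - er) ** 2 + (c - ec) ** 2
                total += norm * math.exp(-0.5 * d2 / (s * s))
            field[r][c] = min(total, 1.0)        # cap so values stay valid probabilities
    return field


# Made-up 9 x 9 grid with a small cluster of reports near the centre.
events = [[0] * 9 for _ in range(9)]
for r, c in [(4, 4), (4, 5), (5, 4)]:
    events[r][c] = 1

pp = practically_perfect(events)
for row in pp:
    print(" ".join(f"{100 * v:4.0f}" for v in row))
```

The printed field, in percent, can be contoured at the same values as the outlook and compared with it by eye, which is the subjective check described above.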

5. INITIAL RESULTS

One of the goals of the experimental outlooks was to investigate the ability of the SPC forecasters to express their uncertainty via probabilities. Reliability diagrams are one way to compare graphically the degree of correspondence between forecast probabilities and observed relative frequencies. Reliability diagrams show the probability of an event occurring given that it was forecast. From these diagrams forecasters can immediately assess the amount of over- and underforecasting that they may be doing and take actions to calibrate themselves. Outlook data from 1 March 1999 through 31 December 1999 were used to construct reliability diagrams for the individual hazard forecasts. Data were available for 273 1300 UTC outlooks and 266 2000 UTC outlooks. Reliability diagrams for hail, wind, and tornadoes are shown in Fig. 4.
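The quantities plotted in a reliability diagram can be computed as in the following sketch, which groups cases by forecast probability and reports the observed relative frequency in each group. It assumes one forecast value and one 0/1 observation per grid box and outlook; the function name and the sample data are illustrative only.

```python
from collections import defaultdict


def reliability_table(forecasts, observations):
    """Observed relative frequency of the event for each forecast probability.

    forecasts: forecast probabilities, one per grid box and outlook
    observations: matching 0/1 flags (1 = event observed in that box)
    """
    if len(forecasts) != len(observations):
        raise ValueError("inputs must have equal length")
    counts = defaultdict(int)
    hits = defaultdict(int)
    for p, o in zip(forecasts, observations):
        counts[p] += 1
        hits[p] += o
    return {p: (hits[p] / counts[p], counts[p]) for p in sorted(counts)}


# Made-up sample: both the 5% and the 15% forecasts verify 1 time in 10.
fcst = [0.05] * 10 + [0.15] * 10
obs = [1] + [0] * 9 + [1] + [0] * 9

table = reliability_table(fcst, obs)
for p, (freq, n) in table.items():
    if p > freq:
        tendency = "overforecast"
    elif p < freq:
        tendency = "underforecast"
    else:
        tendency = "reliable"
    print(f"forecast {p:.0%}: observed {freq:.0%} over {n} cases ({tendency})")
```

Keeping the case count alongside each frequency matters, because a probability value that is rarely used gives a poorly sampled point on the diagram.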

Fig. 4. Reliability diagrams for experimental hail, wind, and tornado outlooks issued from 1 March 1999 through 31 December 1999. Thin straight line indicates perfect reliability.

The SPC forecasters learned quite rapidly how to express their uncertainty using probabilities. There is a slight amount of underforecasting at the low probabilities and overforecasting at 15% and higher probabilities. Very few forecasts were made which utilized 35% probabilities so care must be taken not to place too much emphasis on this portion of the results.

Table 1. Frequency of use of the various probabilities for each forecast type for the period 1 March 1999 through 31 December 1999.

Prob.     Hail     Wind  Tornado
 2%          -        -     7834
 5%      26812    26960     3888
15%       9151     8311     1125
25%       1969     1558      248
35%        202      148       21

6. CONCLUSIONS

The SPC has been producing experimental subjective probabilistic convective outlooks since March 1999. These outlooks augment and may eventually replace the categorical convective outlook product. Forecasting rare events such as tornadoes is very difficult and conveying that information to the user is equally difficult. The move towards probabilistic forecasts of severe weather represents an important step towards providing information directly to the user. The design of the new outlooks was done concurrently with the design of the verification system to ensure consistency and scientific validity.

Several important projects have taken place to help usher in a new era of severe storm forecasting at the SPC. These include the development of a baseline climatology of severe weather for the U.S., a methodology for forecasters to develop probabilistic outlooks consistent with their categorical forecasts, and a methodology for computing artificial forecasts based on actual data that help define objective limits of forecast skill. Initial results show that the SPC forecasters produce quite reliable outlooks. Skill assessment remains an important component of the verification for which no acceptable method has been developed. Baseline climatology values are so small that they are of limited value and the "practically" perfect forecasts represent an "unbeatable" baseline forecast against which skill scores will always be negative.
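The difficulty with the two candidate baselines can be illustrated with a generic skill score. The sketch below uses the Brier score as the accuracy measure, which is a choice made here for illustration and not one stated in the text; all forecast and observation values are made up.

```python
def brier_score(forecasts, observations):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / len(forecasts)


def skill_score(forecasts, reference, observations):
    """Brier skill score: 1 is perfect, 0 matches the reference, below 0 is worse."""
    bs_ref = brier_score(reference, observations)
    if bs_ref == 0.0:
        raise ValueError("reference is perfect; skill score is undefined")
    return 1.0 - brier_score(forecasts, observations) / bs_ref


# Ten hypothetical grid boxes with a single observed event.
obs = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
outlook = [0.02, 0.02, 0.05, 0.05, 0.05, 0.15, 0.15, 0.15, 0.05, 0.02]
climatology = [0.01] * 10                       # tiny, nearly uninformative baseline
near_perfect = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.6, 0.2, 0.0]

print("vs climatology: ", round(skill_score(outlook, climatology, obs), 2))
print("vs near-perfect:", round(skill_score(outlook, near_perfect, obs), 2))
```

In this toy case the outlook scores positively against the very small climatological values and negatively against the baseline built from the observations themselves, which mirrors the dilemma described above.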

7. ACKNOWLEDGMENTS

The authors would like to thank Rich Thompson for his hard work organizing the forecasting experiment. We would like to acknowledge the SPC forecasters for their hard work in the ongoing forecasting experiment.

8. REFERENCES

Brooks, H. E., M. Kay, and J. A. Hart, 1998: Objective limits on forecasting skill for rare events. Preprints, Nineteenth Conf. on Severe Local Storms, Minneapolis, MN, Amer. Meteor. Soc., 552-555.

Doswell, C. A. III and D. W. Burgess, 1988: On some issues of United States tornado climatology. Mon. Wea. Rev., 116, 495-501.

Murphy, A. H., 1993: What Is a Good Forecast? An Essay on the Nature of Goodness in Weather Forecasting. Wea. Forecasting, 8, 281-293.

Murphy, A. H., W.-R. Hsu, R. L. Winkler, and D. S. Wilks, 1985: The use of probabilities in subjective quantitative precipitation forecasts: Some experimental results. Mon. Wea. Rev., 113, 2075-2089.

Simonoff, J. S., 1996: Smoothing Methods in Statistics. Springer, New York, 338 pp.

Weiss, S. J., D. L. Kelly, and J. T. Schaefer, 1980: New objective verification techniques at the National Severe Storms Forecast Center. Preprints, Eighth Conf. on Weather Forecasting and Analysis, Denver, CO, Amer. Meteor. Soc., 412-419.