BEAR Toolbox for Estimating Economic Relationships
Alistair Dieppe, European Central Bank
The Bayesian Estimation, Analysis and Regression (BEAR) toolbox is a comprehensive Bayesian (Panel) vector autoregression toolbox for forecasting and policy analysis. It is based on MATLAB® and widely used by central banks, academia, and finance. This presentation includes an overview of the toolbox and the latest developments.
Published: 7 Nov 2023
It's an honor to give this presentation today. So I'm Alistair Dieppe from the European Central Bank. We're going to be talking about the BEAR Toolbox for Estimating Economic Relationships.
In some ways, this is a complement to the [INAUDIBLE] Toolbox. There are many similarities to that, which I'll talk about in a bit. I do want to say that what I'm going to present today is the latest version, 5.2. This version builds upon earlier work with my co-authors, Björn van Roye and Romain Legrand, who were key contributors to the early stages of BEAR.
But this version particularly had contributions from Edu, who we saw earlier today from MathWorks. It's been great to work with them on this, as well as Michela Baretti, who's with us at the moment at the ECB. And then there are various other contributors. Actually, there have been many contributors over the years of development of BEAR.
BEAR is much younger than [INAUDIBLE], but still has been around for a few years. You can download the BEAR Toolbox from GitHub. We have the link here. But I'll talk a little bit more about that later.
And I just want to say upfront that this presentation should not be reported as representing the views of the ECB. The views are my views and don't necessarily reflect the ECB's position.
So the outline of my talk today is I'm going to briefly introduce BEAR, for those who don't know what it is. And then I'm just going to talk through some of the features of it, in particular different types of models we have, the different types of identifications, applications. And then I'm going to focus a bit on new features in 5.2, so some of the things we've worked with MathWorks on.
And then I'm going to have an illustration of an example, a conditional forecast with BEAR. And then I'll end by talking a bit about our plans going forwards. So what is BEAR? BEAR is the Bayesian Estimation, Analysis and Regression toolbox. It's a MATLAB set of routines.
So we tried to be comprehensive in terms of vector autoregressions. So you have a vector of, typically, time series variables, and you're regressing these on the lags of themselves. So that's the autoregression part. Then we add the Bayesian part, so that means we're combining the data along with priors.
And then we also have panels, so we also have a cross-section dimension. And this is a key tool: along with macroeconomic models of the Dynare type, vector autoregressions are a key tool for forecasting and policy analysis. And we're bringing the Bayesian aspect to that.
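Just to fix ideas, here is a minimal sketch in plain MATLAB of the regression that sits underneath all of this, a reduced-form VAR estimated by ordinary least squares: each variable regressed on a constant and p lags of every variable. The data matrix and lag length are placeholders; this is for illustration only, and BEAR's own routines do considerably more.

    % Y is T-by-n: T observations of n endogenous variables (placeholder data)
    Y = randn(200, 3);                 % hypothetical data set
    p = 2;                             % number of lags
    [T, n] = size(Y);
    X = ones(T - p, 1);                % constant term
    for lag = 1:p
        X = [X, Y(p+1-lag : T-lag, :)];        % append lag number 'lag' of every variable
    end
    Ydep  = Y(p+1:end, :);             % left-hand side: y_t for t = p+1, ..., T
    B     = X \ Ydep;                  % OLS coefficient estimates, (1 + n*p)-by-n
    U     = Ydep - X * B;              % reduced-form residuals
    Sigma = (U' * U) / (T - p - size(X, 2));   % residual covariance matrix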
BEAR has a user-friendly graphical interface. So this allows the tool to be used by desk economists, non-technical users. It includes state-of-the-art applications. So we have many types, and I'll talk about some of these: sign and magnitude restrictions, proxy VARs, conditional forecasts, forecast evaluation measures.
And similar to Dynare, we're making the code available. It's all transparent. We try and keep up with the state of the art research. The aim is to always be at the frontier. And the aim is to make it easy to understand, augment, and adapt.
We're hoping that this becomes a kind of trusted tool. We've done replications, and in those replications we've identified bugs in other people's papers. We've also identified bugs in BEAR itself. And I think over time, we're minimizing those bugs and trying to identify them.
By making it all open source, people can look into the code and provide feedback on ways to improve it. Before I go into a little bit more detail about BEAR, I just highlight here three quotes, three endorsements. We have some others. But these are key professors.
In the literature, we have Gary Koop from Strathclyde: "Bayesian VAR analysis is made easy using the BEAR Toolbox, a powerful tool for academics, central banks, and policymakers." So we have users across all those dimensions. Fabio Canova, who has developed a lot of the literature in this area: "A toolbox built by policymakers for policymakers. A must."
And then Giorgio Primiceri: "An invaluable toolbox for the Bayesian estimation of state-of-the-art multivariate time series models. Not only is it accessible to less technical users, but extremely useful to more advanced researchers." So hopefully, this talk will encourage those who are not using it to take a look.
So as well as an interface: to run these Bayesian VAR models, you don't need to have a knowledge of MATLAB. We have an interface, which I'll talk about later, that you can click on. And the input is an Excel file. So you put your data there, along with some of the key features of the model you want to estimate and run.
And then we have a user's guide. So we have the output in MATLAB, and the output is also put into Excel. We have a user guide that talks you through all the steps. And then we also have a technical guide with some of the details behind the programming of the VARs that are done there. So there's a full set there for you to look through.
So the first step is to think about what type of VAR model you want to estimate. We have quite a lot of different types in the toolbox already, and we're going to be expanding that further.
So firstly, Bayesian VARs. For the Bayesian VAR we have many different types of priors. Just to recap, a Bayesian VAR is a mixture of your prior, what you think is going to happen or what you think the parameters should be, along with the data.
And in terms of priors, we have things like the Minnesota, the normal-Wishart, and the independent normal-Wishart prior. We have a diffuse prior. We have dummy observations. We have deterministic ones. And these are different ways of combining the data with your priors, and also the standard deviations, the credibility of what you think.
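As an illustration only, here is a stylized sketch of how a Minnesota prior is typically built: the prior mean shrinks each equation towards a random walk, and the prior variance tightens coefficients on other variables and on longer lags. The hyperparameter names lambda1 (overall tightness), lambda2 (cross-variable tightness), and lambda3 (lag decay) follow the usual convention in the literature and are not literal BEAR code.

    n = 3; p = 2;                         % number of variables and lags (placeholders)
    sigma = [1.0; 0.5; 2.0];              % scale of each variable, e.g. AR residual std devs
    lambda1 = 0.1;                        % overall tightness
    lambda2 = 0.5;                        % cross-variable tightness
    lambda3 = 1.0;                        % lag decay
    priorMean = zeros(n, n, p);           % prior mean: own first lag = 1, everything else = 0
    priorVar  = zeros(n, n, p);           % prior variances of the lag coefficients
    for i = 1:n                           % equation
        for j = 1:n                       % variable being lagged
            for l = 1:p                   % lag
                if i == j && l == 1
                    priorMean(i, j, l) = 1;                      % random-walk prior mean
                end
                v = (lambda1 / l^lambda3)^2;                     % tighter at longer lags
                if i ~= j
                    v = v * lambda2^2 * (sigma(i) / sigma(j))^2; % extra shrinkage on other variables
                end
                priorVar(i, j, l) = v;
            end
        end
    end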
Then we have block exogeneity. So this is where certain variables in your VAR, in your vector, are exogenous with respect to other variables. So you use additional prior shrinkage to cut the transmission channel. A typical example here is if we take a small open economy and we're linking it with a much larger economy.
Then maybe we want the small open economy not to affect the large economy, but the large economy to affect the small open economy. So that's where you would use this block exogeneity: we allow interlinkages one way, but not the other way.
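In the stylized sketch above, block exogeneity would simply mean pushing the prior variance on the "small economy enters the large-economy equation" coefficients to zero (or very close to it), so that channel is shut down a priori while the other direction stays open. Again, this is illustrative rather than BEAR's internal code.

    % Say variable 1 is the large economy and variables 2-3 the small open economy:
    largeEq   = 1;
    smallVars = [2 3];
    priorVar(largeEq, smallVars, :) = 0;   % or a very small number such as 1e-10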
We also have mean-adjusted VARs. So that's where you want to put a prior on the mean or on the steady state. So if you have an idea of where you want the VAR to head in terms of the forecast, then we have some tools for doing that. We have dummy observation priors, and this includes things like the sum-of-coefficients prior.
So that's if you want to account for unit roots in your data, or dummy initial observations for cointegration. This depends on whether you want to estimate a VAR in stationary form, in growth rates for example, or whether you want to estimate a VAR more in levels to account for some of the common trends going on in the data.
Now, in economics, a lot of the challenge is how to estimate what's happening in an economy when you have a large set of variables. And there are different ways to shrink the data set to perhaps get a more robust estimate.
One of these ways is the factor-augmented VAR, which is a way to shrink a large set of variables into a small set of factors. This was popularized by Bernanke and co-authors from 2005 onwards. And that's also one of the types of VARs we have. And then we have the standard ordinary least squares VAR as a benchmark, so the non-Bayesian way.
And then we have a whole set of other types of VARs. We have panels; we have six types of panel VARs. A panel VAR adds a cross-section dimension by replicating the series across different units. For example, you can have multiple countries or multiple industries or sectors. And then the different types differ in terms of whether you pool the coefficients, whether you allow for cross-sectional heterogeneity, and whether you allow for static and dynamic interdependencies and heterogeneities.
Another class, or rather two more classes, are stochastic volatility and time-varying parameters. So what I've talked about before is typically with fixed parameters: you're estimating the model over a particular sample and getting the parameters from that.
But with stochastic volatility, you can allow for change in the volatility of the shocks hitting the economy. With time-varying parameters, the parameters themselves can also change over time. And then we also have mixed frequency. So this is where you allow, for example, quarterly data to be combined with weekly or monthly data.
This is becoming quite popular for bringing in higher-frequency, bigger data, particularly for real-time nowcasting assessments. So once you've chosen the type of Bayesian VAR model you want to have, the second step is typically to consider a type of identification, typically a structural type of identification.
And here we have a whole set of different ways to identify VARs. The literature is expanding quite quickly on this, so it's one of the areas where we want to further deepen the BEAR Toolbox. Now, the typical way to do identification of the shocks hitting your variables is with Cholesky, which provides identification based on the ordering and the timing of the shocks.
But this is quite a strong way to identify: effectively you're imposing a zero timing assumption there. So a more modern, common approach these days is to use sign and magnitude restrictions, identifying the shock by imposing these restrictions typically on impact. But in the BEAR Toolbox this can be over any period, and not only the sign but also the magnitude can be used as a way of identifying the shock.
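For readers who want to see the mechanics, here is a minimal sketch of the standard accept/reject algorithm behind sign restrictions on impact: draw a random orthonormal rotation of a Cholesky factor and keep it if the candidate impact responses have the required signs. The covariance matrix and sign pattern are placeholders, and this is a textbook illustration rather than BEAR's implementation.

    % Sigma is the reduced-form residual covariance from the estimated VAR (placeholder)
    Sigma = [1.0 0.3 0.1; 0.3 0.8 0.2; 0.1 0.2 0.5];
    n = size(Sigma, 1);
    P = chol(Sigma, 'lower');             % one admissible impact matrix
    signs = [1; -1; 0];                   % +1 positive on impact, -1 negative, 0 unrestricted
    accepted = false;
    while ~accepted
        [Q, R] = qr(randn(n));            % random orthonormal matrix
        Q = Q * diag(sign(diag(R)));      % normalization so the rotation is drawn uniformly
        impact = P * Q;                   % candidate structural impact matrix
        shock1 = impact(:, 1);            % impact responses to the first structural shock
        ok = all(shock1(signs ==  1) > 0) && all(shock1(signs == -1) < 0);
        if ~ok                            % the shock is only identified up to its sign, so flip it
            shock1 = -shock1;
            ok = all(shock1(signs ==  1) > 0) && all(shock1(signs == -1) < 0);
        end
        accepted = ok;
    end
    % shock1 now contains impact responses satisfying the sign restrictions for this draw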
In the toolbox itself, this is all very flexible. You have a sheet in Excel where you put the signs, or zeros, or magnitude restrictions. You put the time period, and then you can also put the blocks over which this occurs. But there are many other types of identification.
We have long-run restrictions, so this is the Blanchard and Quah approach. And the way this is done in BEAR is you put 1000 1000 for the period, and then it imposes these long-run zero restrictions.
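Behind the scenes, imposing long-run zero restrictions amounts to choosing the structural impact matrix so that the cumulative long-run responses are lower triangular. A minimal sketch, assuming a stationary VAR(1) with coefficient matrix A1 and residual covariance Sigma (placeholder numbers):

    A1    = [0.5 0.1; 0.0 0.3];              % VAR(1) coefficients (placeholder)
    Sigma = [1.0 0.2; 0.2 0.5];              % reduced-form residual covariance (placeholder)
    n  = size(Sigma, 1);
    F  = inv(eye(n) - A1);                   % long-run multiplier (I - A1 - ... - Ap)^(-1)
    LR = chol(F * Sigma * F', 'lower');      % lower-triangular long-run responses
    B0 = F \ LR;                             % structural impact matrix: F*B0 is lower triangular
    % B0*B0' recovers Sigma, and shock 2 has no long-run effect on variable 1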
We have correlation restrictions, by including external variables in order to identify the shock. We have relative magnitude restrictions. So you can have a domestic economy and a foreign economy, and maybe you want the domestic shock to have a stronger effect domestically than on the foreign economy, so you have a stronger versus a weaker response there. We have this implemented following the Caldara and others 2016 paper.
And then we also have forecast error variance restrictions, where you can do both absolute and relative types of restrictions over a particular period. And we also have things like proxy VARs, as well as identifying the shock based on covariance restrictions.
So you've chosen the model, you've chosen your identification approach, and then we have the third section, which is what we call applications, or the output that you want to get. And here we have a whole set of different types of applications.
The first one is the impulse response functions, the IRFs, of how your VAR behaves for different shocks. So there's a whole set of standard [INAUDIBLE], and you can pick what time period you want that coming out for. We have unconditional forecasts; so one of the advantages of VARs is for doing forecasting.
So that's one of the outputs there. We have the forecast error variance decomposition. So if you want to assess which shocks are driving which variables over which period, then we have a decomposition of that. We have historical decomposition.
So that's the decomposition of the shocks driving the different variables in the VAR. So we have the point, the median estimate, but we also have the whole distribution of this. All the output is there. The aim of BEAR is to allow all that, but also to simplify it and make it useful for policy analysis. And then what's actually very common, used a lot by our user base, is the conditional forecasting routine.
So this is where you set some conditions, and we have a very flexible way of doing this. You not only set the conditions, but you can set the types of shocks that are driving those conditions. I'm from a central bank, the European Central Bank, so the example I always give here is you can set conditions on interest rates.
You can say alternative paths of interest rates, but you can have different shocks causing that interest rate path. You could have demand shocks, you could have supply shocks, or it could just be a monetary policy shock. So there are different ways to explore that.
So it's very flexible for doing different types of scenarios, and I'll give an illustration of that later. That's hard, or so-called hard, conditioning. But we also have soft conditioning, where you're allowing for variability around the conditioned values, so that's tilting the predictive distribution. We also have that as part of the applications.
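To make the hard-conditioning idea concrete, here is a stylized sketch in the spirit of the standard approach: pick the smallest structural shocks that push the unconditional forecast onto the conditioned path. Everything here, including the bivariate VAR(1), the Cholesky identification, and the conditioned path, is a placeholder; BEAR's own routine also handles the posterior distribution and lets you choose which shocks do the work.

    % A bivariate VAR(1): y_t = c + A*y_{t-1} + B0*eps_t (placeholders throughout)
    A  = [0.6 0.1; 0.2 0.5];
    c  = [0.2; 0.1];
    B0 = chol([1.0 0.3; 0.3 0.8], 'lower');
    yT = [1.0; 0.5];                          % last observed data point
    H  = 4;  n = 2;                           % forecast horizon and number of variables

    % Unconditional forecast (all future shocks set to zero)
    yU = zeros(n, H);  prev = yT;
    for h = 1:H
        prev = c + A * prev;
        yU(:, h) = prev;
    end

    % Condition: variable 1 follows a fixed path over the forecast horizon
    condVar  = 1;
    condPath = [1.2 1.3 1.3 1.2];

    % Linear map from stacked future structural shocks (n*H of them) to the conditioned values
    R = zeros(H, n * H);  r = zeros(H, 1);
    for h = 1:H
        for s = 1:h
            Psi = A^(h - s);                  % response h-s periods after the period-s shock
            R(h, (s-1)*n + (1:n)) = Psi(condVar, :) * B0;
        end
        r(h) = condPath(h) - yU(condVar, h);
    end
    shocks = R' * ((R * R') \ r);             % smallest shocks consistent with the conditions

    % Conditional forecast: unconditional path plus the effect of the chosen shocks
    yC = yU;
    for h = 1:H
        for s = 1:h
            yC(:, h) = yC(:, h) + A^(h - s) * B0 * shocks((s-1)*n + (1:n));
        end
    end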
Given that VARs are used a lot for forecasting, we also have forecast evaluation criteria, and we have both classical and Bayesian approaches. So the classical ones are things like the sum of squared residuals, R squared, mean squared errors, et cetera. We can do rolling estimations, so we can have rolling windows, different time periods for estimating it.
We also have density as well as point forecast evaluations. And we also have Bayesian-specific criteria, so the marginal likelihood, the continuous ranked probability score, the log score, et cetera.
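As a small illustration of the kind of criteria involved, here is a sketch computing a root mean squared error for point forecasts and an average log predictive score for density forecasts from a set of posterior draws; all the numbers and names are placeholders.

    actual    = [2.1 1.8 1.5 1.2];                   % realized out-of-sample values (placeholder)
    pointFcst = [2.0 1.9 1.7 1.4];                   % point forecasts, e.g. posterior medians
    draws     = pointFcst + 0.3 * randn(1000, 4);    % hypothetical predictive draws per horizon

    rmse = sqrt(mean((actual - pointFcst).^2));      % root mean squared forecast error

    mu = mean(draws, 1);  sd = std(draws, 0, 1);     % normal approximation to each predictive density
    logScore = mean(-0.5*log(2*pi) - log(sd) - 0.5*((actual - mu)./sd).^2);   % average log score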
With BEAR, we're now on version 5.2; it's been around a while. We try and take on board users' comments, and we very much appreciate feedback and requests. So with the MathWorks team, with Edu, we have a new hyperparameter optimization. The hyperparameters affect the priors, which affect the posterior, which affects the VAR outcome.
You can optimize over that conditional on a set of criteria. We have a new approach there, which is faster; we have both approaches, the old one, which was a grid search, and the new one, which is much faster. We have variable-specific priors, so you can now put a prior on the AR coefficient, the autoregressive coefficient, for each variable.
Before, it was just one for the whole set. We have impulse response functions and priors for exogenous variables. We have sign restrictions for panel VARs, so imposing some more structure on the panel VARs. We have parallelization to speed it up. And we have things like simple options to suppress figures or the Excel output, to lead to efficiency gains.
So in the last 10 minutes I just want to focus a little bit on some of the new features that are in version 5.2, and over the next few slides I'll talk through this. We've been working over the last year or so on infrastructure, so a bit of the backend, to upgrade that. That facilitates running multiple estimations simultaneously.
We now have an object-based approach for the different models, which makes it easier to track the models. We have a new interface. And then we have source control in order to keep track of all of this. So I just want to touch on some of these MATLAB-based features.
So the first one is path control, and source control and testing. This is all now on GitHub. You can always download the latest release; it's publicly available there. You can report issues there, and then we can try and fix any issues, or you can contribute changes as well.
The code is being packaged as a toolbox; I'll talk about that on the next slide. And we have testing. So one of the things we have done is implement numerical testing. For any changes we make, we can run the whole testing procedure to make sure that it doesn't break anything elsewhere.
And one of the features we now have is path control, so we can set where you want to load the data from; it's less dependent on the paths in your file system. One of the key changes is that we have a settings class. So we have seven generic types of models, which I just talked about earlier.
And these are all now more of an object-structure type, so we're showing this here. For a Bayesian VAR we have the frequency, the start date, the end date, the endogenous variables, the number of lags, the path to where the file is and where the data is, the specifications, the applications. So all of this is in a so-called settings object.
And that makes it very easy to make changes or to iterate over different types of settings. So here we have an example: we have a Bayesian VAR, we have some data, and then we're running it over different values of lambda 1. That's one of the hyperparameters of the prior. And you can just do this in a simple loop.
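As a hedged sketch of what such an iteration looks like, assuming the BEARsettings and BEARmain entry points shown in the examples on the BEAR GitHub page, an Excel input file called data.xlsx, and a lambda1 property on the settings object (the exact property names and model-type string may differ in your installed version):

    % Run a Bayesian VAR over different values of the overall-tightness hyperparameter
    lambdaGrid = [0.05 0.1 0.2];
    for k = 1:numel(lambdaGrid)
        s = BEARsettings('BVAR', 'ExcelFile', 'data.xlsx');  % settings object (assumed syntax)
        s.lambda1 = lambdaGrid(k);                           % overall tightness of the prior
        BEARmain(s);                                         % run estimation and applications
    end
    % replacing the for with a parfor would run the estimations in parallel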
And you could even do this in parallel: you can run different types of VARs in parallel and get these all coming out. It also facilitates bug tracking and makes replication very easy. You just need this settings file along with the Excel file with the data input, and then we can easily replicate any model you have.
Now, we have a new app, a new interface, and this builds on top of the settings object. So you can import settings objects, and you can export the settings from the app into settings objects for the different types of models. This new app includes new BEAR types. And it is also directly installable from MATLAB.
So if you have MATLAB, you can go into MATLAB, search the apps for BEAR, and then install the app. Then you will always have the latest version available; you don't need to keep downloading it, it will automatically be there.
We also have a standalone version. So we've compiled it into a standalone version for those who prefer to use it without having MATLAB there in the background. But we have all the code in MATLAB, so if you want to dig in and understand it, it's all there, transparent. You can see exactly what calculations have been done and how it's been programmed up.
And we also have a set of replication files, which we're hoping to expand over time to replicate some of the key papers out there. And the idea is to facilitate knowledge and exchange. So as I've mentioned, this is available on GitHub. And it's all open source.
So in the last five minutes, I would like to just focus on an example using conditional forecasting. It's one of the most popular features that we have in the BEAR Toolbox. So what you do is you estimate your VAR. You could have external and domestic variables; there are all sorts of different types of VARs we have.
And then you set your conditions. And again, we have hard or soft conditioning. You select what type of shock, what the conditions are, whether on [INAUDIBLE] exogenous or external assumptions, or even on endogenous variables. You can say, look, I want this type of path. You can set exactly which types of shocks, or you can let the model figure out which set of shocks will lead to that.
And you need some type of identification, for example slow-, fast-, and medium-moving variables. And then you can compare your conditional and unconditional forecasts. So it's very flexible for doing all sorts of different types: there's a whole range of scenarios you can do with small, medium, or large-scale, or even larger types of models.
Here's an example; there are more examples of the different types of models in the background slides, which I think will be made available. So here we have conditional forecasts; we have examples for the US and the UK. As I'm sure you're aware, inflation increased a lot in these economies.
And here we have a set of VARs for the different components, and then we have a conditional forecast of how it could play out. Again, this is not necessarily the views of the ECB; this is just an output coming from the BEAR Toolbox.
And then here's a second example building upon that, where we have two different types of models. One is estimating the Bayesian VAR before COVID and using that parameterization, so before you get the increase in inflation, before the latest developments, and projecting that forwards.
The second one is where you include the post-pandemic period, so you estimate over the whole period and then project. And what this is asking is: has there been a change in the economy? It's putting more weight on the latest data and seeing how the projections change under these two samples, pre-pandemic or including the post-pandemic period.
So it's just an illustration of the many types of exercises you can do. Again, you can do this on a sectoral or an industry basis. You can do this at a weekly, monthly, or quarterly frequency; this is a monthly VAR here. There are all sorts of different ways of estimating and specifying your model.
So in the last few minutes, we are planning to start work on the next version, so BEAR 6. And we have two work streams. We're currently setting up the specifications and what we want to do for this. The first one is we want to add some new models. And we want to add new approaches to deal with large shocks.
So we've had a number of large shocks hit. We've had COVID. We've also had wars. We've had a strong inflation shock. So the literature has evolved of how to deal with these. So we want to incorporate some of these. We already have some. But we want to include additional approaches there.
We want to include new priors or new identification schemes, including narrative restrictions; we've had many requests to include that. We want to include structural change. Possibly the economy is undergoing changes at the moment, and there are different ways to handle that: we have time-varying parameters, but there are other ways to incorporate this.
And then there are potentially a lot of non-linearities out there in the data. So there are different ways of doing this, whether it's threshold VARs or other types of non-linearities. And then possibly the ability to combine different types of models: rather than just focusing on one model, have a combination and combine them together to get a forecast.
In parallel to that, we want to do further work modernizing. I think we can't just keep adding models. We need to keep the code simplified. And that's a lot of the work that has gone on over the last year. We want to do further modernization, further improving the data management.
So you don't just have to rely on using Excel, but you can use other ways to upload data. There will be further work on the interface, on speeding things up, and maybe further work on taking in high-frequency data. But if I can just end with our logo, and thank you for your attention.
We very much welcome requests, suggestions, and improvements. It's a good time now, if you have things you would like to be incorporated, given we're working on the next version. We very much welcome feedback, and it's all there to be downloaded and accessed. And thank you very much for your attention.