
MACROECONOMIC

DATA ANALYSIS
A Practical Guide to Report-Writing

Scott W. Hegerty, Ph.D.


© 2016, 2020 by Scott W. Hegerty. All rights reserved.
CONTENTS
Introduction

1. What’s Your Story? Choosing What Project To Pursue
Who’s Your Audience?

2. Finding Data
Before Getting Started
Where to Look for Data
Navigating a Statistics Website
Choosing the Correct Variables
Downloading Your Data
Creating Your Database
Additional Excel Tools

3. Calculating and Transforming Variables
Choosing the Level of Sophistication
Software
Transforming Variables
Basic Statistical Tools
More Advanced Statistical Tools
Other Econometric Methods
Naming Your Variables

4. Presenting Your Data
Preparing Presentable Line Graphs
Other Types of Graphs and Charts
Tables
Regression Output
The Next Steps

5. Putting It All Together
Rough Draft, or Final Product?
Using Data to Tell Your Story
Integrating Graphs, Tables, and Text
Producing High-Resolution Images
Fonts, Colors, and Other Considerations
A Few Writing Tips
Proofreading
Presenting With Slides

An Example
Oil Prices and Oil-Price Volatility
A GARCH Analysis of Oil-Price Volatility
Moving on to Your Own Projects

Appendix
Books and Software Resources
Data Sources and Variables

INTRODUCTION

This book has grown out of both my teaching and my research.


In all my classes, I try to convey the skills required to use
macroeconomic data effectively, but while many students have
mastered some of the skills, it is often difficult to be exposed to,
let alone master, all of them. As an undergraduate, I remember
hand-copying East German macroeconomic statistics out of a
bound volume in the library. But I also remember starting true
empirical work after I had taken my Ph.D. theory courses, and
having to learn how to navigate a database from scratch. Now,
having gone through the publication process a number of times,
I have seen the finished product—and all the steps that need to
be done correctly to get there. In my work, I have also had the
opportunity to review others’ academic papers, and oftentimes
some of these steps could be improved, even if the author holds a
Ph.D. and is studying an important economic issue.
Macroeconomic Data Analysis is about the entire process,
from conception to write-up. The end goal is to present a well-
constructed report that examines an interesting idea. We do this
in five steps: 1) defining the idea, 2) gathering data, 3)
calculating variables, 4) preparing charts and tables, and 5)
putting it all together. All five need to be done correctly. I have
seen brilliant technical papers that cannot justify the economic
importance of the study itself. Other papers perform complicated
econometric tests, but simply paste the software results into the
final document. Or, the tests are introduced or conducted
effectively, but the paper never mentions why they matter or
what they are supposed to be used for.
Undergraduate students most likely will benefit from
experience with all five steps. Those just starting a research
project can go in order, honing each skill as they move from an
idea (or trying to come up with an idea) to the final paper. Many
of the ideas—such as where to go for data, or how to format a
chart—do not seem to get enough attention in Economics
classes.

That said, this is not a tutorial per se. It would be impossible
to introduce web searches, Excel formulas, formatting, and
graphic design in any detail all in one document, let alone
review macroeconomic concepts. In fact, I have made many of
my graphs in the open-source software R, but don’t even touch
upon it here. Since I teach R in a separate class (with some
overlap, however), I treat “Macroeconomic Data Analysis” as
more of a true introduction to the topic. I have put the
associated .R files, as well as links to some useful videos, on my
website at www.scotthegerty.com or bit.ly/2SlGXU0.
In presenting this material, I have to assume that readers
have some sort of a starting point. Some of my students have
better Excel skills than I had starting out (or have now), so it
makes no sense for me to go into too much detail here. Other
students skip Excel and go straight to more advanced software.
Besides, I do not put much value in “cookbook” tutorials that
have specific, yet unexplained, goals (such as “change the line
color to 50% gray”). It’s impossible to instill much real learning
that way.
Instead, the objective here is to show what could be done,
with ideas, data, interpretation, and presentation, so that
students can practice and learn these skills themselves. A few
sources are listed at the end for further reading. But this
document itself serves as one possible example, incorporating
macroeconomic examples throughout. I hope that it will help you
with your own future macroeconomic projects.

Scott W. Hegerty, Ph.D.


Associate Professor of Economics
Northeastern Illinois University
October 2016/April 2020

1. WHAT’S YOUR STORY?
CHOOSING WHAT PROJECT TO PURSUE
One of the most important parts of studying or writing about
economics is coming up with an interesting question that is
worth answering. In fact, even academic papers often have a
hard time doing this—so they wind up making tiny changes to
an existing idea and going from there. For undergraduate
students or those who are just starting out, you don’t need to
come up with something groundbreaking. Just ask yourself 1)
what you want to find out about and 2) who might find it
interesting if you tell them. But knowing what you are looking
for before you start makes the whole process easier.

TYPES OF QUESTIONS IN MACROECONOMICS


Macroeconomics tends to concern itself with national-level
aggregate measures that are reported over time. Here are a few
types of macroeconomic data that you might need:
▪ Real variables (such as GDP, consumption, or investment)
▪ Price levels and inflation rates (such as CPI or PPI)
▪ Monetary variables (such as money supply or interest rates)
▪ Financial variables (such as stock prices or commodity prices)
▪ International variables (exchange rates, exports, or imports)

Oftentimes, you will have to combine multiple separate
variables and use your knowledge of economic theory. In some
cases, you may need both a nominal variable and a price index
to create a real variable. In others, you might want to look at
relationships among a number of separate variables. A list of
some specific variables is given in the Appendix below.
Think about what interests you. Would you like to know
more about the economic performance of a specific country?
Compare prices for tuition or medical care with overall prices?
Make policy statements regarding taxes or the minimum wage?
Deciding first what questions to ask will help you choose the
right data and give you a clear idea of what to do with it.

Suppose gasoline prices are rising, and you want to know if
the price of oil is higher than in the past. If you know it is rising,
you might be able to make better investment decisions. You
could find the right data, make a graph, and see where it’s
headed. But what data do you need?
You can start by finding data for oil prices. As we will see
later, we can get them starting in 1980. But if prices were lower
then, a relatively high oil price would seem low compared to
today. So you need to use a price index to make real oil prices so
that you can compare different years. Looking at a single
variable over time (often called a univariate analysis) can be
very interesting in its own right. But you need to combine your
knowledge of formulas, math, and economics to do it correctly.
If you wanted to go further and look at how high oil prices
might affect the whole U.S. economy, you could get data for real
GDP. This type of analysis would be bivariate (or multivariate if
it’s more than two). Oftentimes this requires more use of
statistics, but even the tools discussed here will allow you to
make an effective report or presentation.

WHO’S YOUR AUDIENCE?


Whether it’s a class paper or a professional presentation, you
need to know who you’re writing for. Sometimes, your audience
is “looking” for a certain result that you need to confirm; while
you need to be aware of that type of agenda, I’m not going to get
into that here. The biggest—and often trickiest—issue is
knowing the right amount of economic content to include.
It’s easy to overdo a paper with too much statistics or jargon,
but it’s also a problem if you over-explain concepts that are
familiar to people with a strong background in what you are
talking about. You will need a different approach in certain
parts of the business world, but colleagues in areas outside
Economics might not be as well-versed in the terminology as
others. Likewise, you have to walk a fine line in a non-
Economics class, where your professor has a Ph.D. but not as
much statistical background as an economist. You have to
present your data intelligently, but not be too basic about it.

The easiest approach is to simply ask for clarification
beforehand. That way, you can present your analysis
appropriately. But be prepared to explain some things you
thought everybody knew, and to skip some background if it’s
clear that people are familiar with it.
In general, you can present the same ideas in different ways,
depending on your audience. In increasing order of complexity,
these include:
▪ A simple graph. While not very statistical, this conveys a lot of
information. In particular, when people often base ideas or
arguments on guesswork, having your data presented at all
can make a powerful statement.
▪ Basic statistics. Many people with a college degree have taken
at least one statistics class, so summary stats such as mean,
standard deviation, minimum, and maximum will help clarify
your data further. For bivariate analyses, simple correlations
can be useful. It might help to remind your audience of the
difference between “mean” and “median.” Just because people
took stats doesn’t mean they actually remember it.
▪ More advanced econometric methods. T-tests, regression
analysis, and ARIMA (for time series) can help make your case
very effectively. But, regardless of your audience, your
explanation is just as important as your results. You also must
make sure you present your results properly, so table
formatting is important. It is also essential to note that these
techniques need to be done correctly to be useful. Running an
Ordinary Least Squares (OLS) regression without specifying it
properly will lead to useless results that could lead to bad
decisions. Don’t feel the need to use more advanced
techniques…a simple graph might work just as well!
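
As a rough illustration of the "basic statistics" level described above, the following sketch computes summary statistics and a simple correlation using only Python's standard library. The numbers and the variable names (growth, inflation, pearson) are invented for illustration and do not come from any real data source.

```python
import math
import statistics

# Hypothetical annual growth and inflation rates (percent); illustration only.
growth = [2.5, 3.1, -0.4, 1.8, 2.2, 9.0]
inflation = [1.9, 2.4, 0.3, 1.5, 2.0, 6.5]

summary = {
    "mean": statistics.mean(growth),
    "median": statistics.median(growth),
    "std_dev": statistics.stdev(growth),  # sample standard deviation
    "min": min(growth),
    "max": max(growth),
}

def pearson(x, y):
    """Simple (Pearson) correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

corr = pearson(growth, inflation)

for name, value in summary.items():
    print(f"{name:8s} {value:6.2f}")
print(f"correlation with inflation: {corr:.2f}")
```

In this made-up series, the single large value (9.0) pulls the mean above the median, which is the kind of difference worth pointing out to an audience.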

NEXT STEP: FINDING DATA


Once you’ve thought about what questions you’d like to answer,
you can find the right data. Luckily, almost anything you’d need
is available for free online. Chapter 2 explains what to do next.

2. FINDING DATA

BEFORE GETTING STARTED


Now that you have an idea of what question you’d like to ask,
and what data you might need, you are ready to start looking for
data. Your goal, by the end of this chapter, is to have an
organized spreadsheet with your time-series variables succinctly
named and all extraneous information deleted. Your database
might look like this:

2.1. U.S. oil prices (WTI) and Consumer Price Index in Microsoft Excel.

Notice the columns. This spreadsheet is organized vertically,
with the dates on the left-hand side. Here, the data are monthly,
starting in January 1980. The variables have succinct names.
But getting to this preliminary stage takes a number of
steps. In general, it is important to learn how to:
▪ Navigate through an appropriate website
▪ Work through menus to find data in spreadsheet form
▪ Choose variables, start and end dates, and frequency
▪ Make sure you have appropriate layout (usually columns)
▪ Download your data or paste into Excel in a readable format
▪ Examine your data, and rename variables appropriately.

Let’s look at each of these steps one by one. Depending on your
Excel knowledge and general ability navigating websites, some
steps might be easier than others.

WHERE TO LOOK FOR DATA
Probably the most useful “one-stop” data source is the Federal
Reserve Bank of St. Louis’ FRED Economic Data site
(fred.stlouisfed.org). It contains thousands of data series, from
various sources, and is easy to use. It also can create graphs for
you. Here, though, we want to be able to download our own
data—and that is straightforward as well.

2.2. The FRED data site main page (April 2020).

Some other useful sites are:


▪ The U.S. Bureau of Labor Statistics (bls.gov)
▪ The U.S. Bureau of Economic Analysis (bea.gov)
▪ The IMF’s International Financial Statistics (imf.org)
▪ Eurostat (ec.europa.eu/eurostat) for European data
▪ Countries’ central banks (sites vary; need to do a Web search).

Central-bank websites usually have data in English and
other regional languages, in addition to the country’s official
language. A lot of times, your web browser will take you straight
to the English version based on your location if you are in the
United States. If you can read the official language, though, you
may prefer to use that version. In fact, the information might be
more current if it doesn’t have to be translated by bank staff.
But you might have to override your browser in that case.
There are obviously many more data sources out there, and
you might need to look further, depending on the nature of your
question. While these websites are all laid out differently, there
is a common pattern behind where the data are located.
Learning how to navigate data sites will help you locate the
appropriate data, no matter where you find it.

NAVIGATING A STATISTICS WEBSITE
While it is possible to do a tutorial on how to navigate every
step, for every data site, there are far too many to memorize
them all. It is much better to look for common patterns and
layouts, and know how to move through the steps of the data-
collection process. In general, you will have to find a tab or
header marked “Data,” and then click through the links as you
choose data.

2.3. BLS site main page header (April 2020).

The BLS site looks a lot like FRED’s. Note the “Data Tools”
section. Clicking here takes you to a number of data resources
(including inflation and prices, employment, and other labor
statistics); you need to choose accordingly. I usually then go to
the “multi-screen data search.” The BLS site has a couple of
other issues I haven’t seen elsewhere; I explain how to handle
them below using some simple data manipulation tools.

2.4. The European Central Bank’s main page header (April 2020)

Pretty much every data site has some sort of a “Data” tab, or
you could try searching directly through Google. Once you know
what to look for, the process is usually pretty similar. The
European Central Bank’s website (in 2.4), for example, has
statistical data that are easy to find.

CHOOSING THE CORRECT VARIABLES
As you move through the site, you will have to be aware of a few
properties of the data that you download:
▪ The frequency of your data
▪ Whether your data are in nominal or real terms
▪ Seasonality and seasonal adjustment.

Frequency refers to how often the data are reported, usually
annually, quarterly, or monthly. Certain financial variables,
such as stock prices or exchange rates, can be reported daily.
Annual data are easier to find, but by nature there will be fewer
observations. Some data are never reported monthly. For
example, Gross Domestic Product (GDP) can be annual or
quarterly; economists often use indices of industrial production
(IP) as a monthly proxy for economic activity.
If you are examining multiple variables at once, it might be a
good idea to find the variable with the lowest frequency (for
example, if one of your variables is only available annually) and
then retrieve all your variables in the same way. While it is
possible to turn quarterly data into annual variables (by taking
the average values or the sum of the four quarters within each
year), it’s a lot easier just to download them in the format you
want.
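
If you do need to convert quarterly data to annual values yourself, the averaging or summing described above can be sketched as follows. This is a minimal example with invented numbers; the function name to_annual and the (year, quarter) key layout are assumptions made for illustration. Which method fits depends on how the series is reported, so check the data description before choosing.

```python
from collections import defaultdict

# Hypothetical quarterly observations keyed by (year, quarter); illustration only.
quarterly = {
    (2018, 1): 100.0, (2018, 2): 102.0, (2018, 3): 104.0, (2018, 4): 106.0,
    (2019, 1): 108.0, (2019, 2): 110.0,  # 2019 is incomplete
}

def to_annual(series, how="mean"):
    """Collapse quarterly values into annual ones by averaging or summing.

    Years with fewer than four quarters are skipped so that a partial
    year is not mistaken for a full one.
    """
    by_year = defaultdict(list)
    for (year, _quarter), value in sorted(series.items()):
        by_year[year].append(value)

    annual = {}
    for year, values in by_year.items():
        if len(values) != 4:
            continue
        annual[year] = sum(values) if how == "sum" else sum(values) / len(values)
    return annual

print(to_annual(quarterly))         # annual averages
print(to_annual(quarterly, "sum"))  # annual totals
```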
Likewise, you might be looking at a number of countries, but
some do not have data in the same frequency as the others. This
is particularly true for Sub-Saharan Africa and other areas
where the data are often not as available as they are for the
United States. If most countries have quarterly data, but one
has only annual data, you have a choice. One option is to use
annual data for all countries. Or, you may wish to not study that
country (drop it from your data set) if that means you have to
sacrifice data for the others.
Try to download the entire length of data that are available.
You can always cut if you need to, but this is easier than going
back and downloading more. The default setting on some
websites might be only a couple of years; you can go in and
adjust this. Some variables are available for longer time periods
than others; this might dictate the time period of your analysis.

When gathering data, you also might wish to download
variables that are already in real terms, or you might want to
get nominal data and convert it yourself. This gives you the
choice of deflator (such as CPI, PPI, or the GDP deflator), so you
have more flexibility. Real (rather than nominal) GDP,
investment, and other variables are typically used by
themselves, but you may need to download both the nominal
variable as well as the real version. One instance where you
might need nominal variables, for example, is if you are
calculating a country’s trade balance as a share of the entire
economy. You can use nominal exports, imports, and GDP to
calculate [(𝑋 – 𝑀) / 𝑌], which gives you a share of GDP (multiply
by 100 for a percentage). (Recall that prices and dollar values
“cancel out” in this calculation.) If
you already have price-level data, you could calculate real GDP
yourself and not have to download it.
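
The trade-balance calculation above is simple enough to check by hand, but a short sketch makes the units explicit. The values below are invented (think of them as billions of dollars) and the variable names are only illustrative.

```python
# Hypothetical nominal values in the same units (say, billions of dollars).
exports = 2500.0
imports = 3100.0
gdp = 21000.0

# (X - M) / Y: the currency units cancel, leaving a pure share.
trade_balance_share = (exports - imports) / gdp
trade_balance_pct = 100 * trade_balance_share

print(f"Trade balance: {trade_balance_pct:.2f}% of GDP")
```

Because all three series are nominal and in the same units, no deflator is needed here.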

2.5. Monthly employment in Madison, WI, 1990-2020. Source: BLS.
Seasonally adjusted data (by the author) = darker line.

Finally, if you are using quarterly or monthly data,
seasonality can be an issue. Economic activity (such as
agriculture, home sales, and construction) can peak during the
summer months, resulting in extra “peaks” and “troughs” over
the course of the year. In general, this is removed before the
data are analyzed.
Figure 2.5 shows the number of people employed (in
thousands) in the Madison, WI area from 1990 to 2020. The
deseasonalized data show the same overall trends over time, but
seasonality is clear in the original series. If you only paid
attention to seasonal properties, you might think the economy
was collapsing in January—when really this is predictable.
You might be able to directly download data that are already
deseasonalized by the data provider using statistical procedures
(such as the Census-X12 method). Or, you can use the
appropriate software (such as EViews) and do it yourself. One
other way to avoid this issue is if you are using percentage
changes—you can take the change over the corresponding period
of the previous year (such as June-June). If you don’t have much
experience with these procedures, I recommend finding
deseasonalized data and just using that.

DOWNLOADING YOUR DATA


As you navigate through a data website, you can choose your
variables of interest, their frequency, whether they are nominal
or real, and whether they are seasonally adjusted or not. Make
sure your data are presented in single-column format (with each
date immediately following the one above it). The BLS, for
example, first presents data in a grid, with each year on one axis
and each month on another. You would have to take time to line
it up into a single column. For that site, you need to click on
“More Formatting Options,” which lets you pick the date range
and data layout.
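
If you end up with a year-by-month grid anyway, lining it up into a single column can be automated. The sketch below assumes a hypothetical grid stored as a dictionary of years, each holding twelve monthly values (None where a month has not been published); the numbers and the function name grid_to_column are made up for illustration.

```python
# Hypothetical index values in a year-by-month grid; illustration only.
grid = {
    2019: [100.0, 100.4, 100.9, 101.2, 101.5, 101.6,
           101.8, 101.9, 102.0, 102.3, 102.2, 102.1],
    2020: [102.5, 102.8, 102.6, None, None, None,
           None, None, None, None, None, None],
}

def grid_to_column(grid):
    """Flatten {year: [12 monthly values]} into [(YYYY-MM, value), ...]."""
    rows = []
    for year in sorted(grid):
        for month, value in enumerate(grid[year], start=1):
            if value is None:  # month not yet published
                continue
            rows.append((f"{year}-{month:02d}", value))
    return rows

column = grid_to_column(grid)
for date, value in column[:3]:
    print(date, value)
```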
For any site, if the data don’t come out the way you need
them, you may need to go back, adjust the settings, and
download them again. This takes time to figure out and gets
easier with experience. It’s better to take the time up front (and
gain valuable experience) than spend time adjusting data later.
Once you have the data the way you want them, you will be
given a choice whether to, and how to, download your data
series.
I suggest doing this as a .csv (Comma Separated Values) file.
Ideally, your completed database should be in this format, so it
might help to use it from the start. Files in the .csv format can
be opened in Microsoft Excel, but are often preferred to .xls or
.xlsx (Excel) files because they can be read in a number of other
programs as well. Some statistical software will not recognize
Excel files at all. For this reason, make sure you save your final
dataset as a .csv when you are ready to use other programs. I
have never downloaded data as a .pdf, because I would have to
put it in Excel anyway.
Regardless of your choice of software, you will next have to
create a single, concise database that includes all your original
variables for the same time period.

CREATING YOUR DATABASE


Even if all your data come from the same source, you might
wind up with a number of individual download files. Your first
step is to begin combining your time series into a single file. You
should create a new file, but you want to name it something
different from whatever your download was called (often, it’s
just “download”!). Keep your original download files just in case.
I suggest copying and pasting all the variables, including the
date column, from each of the other downloads immediately next
to each other. Make sure you delete any descriptions or other
information at the top, but keep track of these as they explain
your data. For example, you might have a variable in “millions
of U.S. dollars,” which is essential to know if you are explaining
your variables, graphing them, or comparing them to other
variables. (If your other variables are in billions, 300 million will
look HUGE compared to 0.4 billion!). You can always refer to
your original download, so don’t delete them.
You should also only have one date column when you are
done, but it is useful to keep your extra ones while you create
your database. If one data series is longer than the others,
delete the rows from that one only (and its date column) until all
the date column values match at the first and last values. Go to
the end of the file and delete any “extra” time periods, since
often one variable might have a more recent value that the
others don’t have. Also spot check the middle, making sure that
all date cells match, just in case one download “missed” some
dates. I’ve seen this happen—it will cause your data to be at the
wrong dates, then end early. Once you are done, then you can
delete all but the leftmost date column.
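
The same date-matching logic can be sketched in code, which is one way to double-check a hand-built spreadsheet. The example below keeps only the dates that every series has in common, so a series that starts earlier, ends later, or is missing a month cannot shift the others out of line. The values, the dictionary layout, and the function name merge_on_common_dates are all hypothetical.

```python
# Hypothetical monthly series keyed by date string; illustration only.
oil = {"1980-01": 32.5, "1980-02": 37.0, "1980-03": 38.0, "1980-04": 39.5}
cpi = {"1980-02": 78.9, "1980-03": 80.1, "1980-04": 81.0, "1980-05": 81.8}

def merge_on_common_dates(**series):
    """Return one row per date that appears in every series."""
    common = set.intersection(*(set(s) for s in series.values()))
    rows = []
    for date in sorted(common):
        row = {"DATE": date}
        for name, values in series.items():
            row[name] = values[date]
        rows.append(row)
    return rows

database = merge_on_common_dates(OIL=oil, CPI=cpi)
for row in database:
    print(row)
```

Here January 1980 (oil only) and May 1980 (CPI only) are dropped, leaving a single date column shared by both variables.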

I suggest renaming all your variables using 1) a consistent
method and 2) relatively few characters. Choose names that you
wouldn’t mind having in print (such as on a graph). Some
software cuts off titles longer than eight characters. Also avoid
punctuation, other than _ (underscore).
For example, I would use Y for GDP (and maybe NOM_Y for
nominal GDP). If we use the transformations in Chapter 3, log
GDP would be LNY, and log differences could be DLNY. There is
no single “correct” way to name variables, but other economists
would have no problem interpreting what I wrote.

ADDITIONAL EXCEL TOOLS


You may need to use a few more data tools, particularly if you
don’t download your time series as an Excel-readable file. For
example, the BLS allows you to access your data as columns on
a Web page. You can choose “HTML Table” or Text (here, I show
“comma delimited,” but the idea is the same). You will need to
paste this into a blank Excel file.

2.6. BLS Energy Price Index data directly on its website.

If you are not very familiar with Excel, one very important
command is called “Paste Special.” You can Transpose data
(turn rows into columns), which is important if your data are
presented horizontally instead of vertically. It is also a good idea
to remove all formatting (such as cell borders) when you paste
into Excel; Paste Special lets you do this too (as Text).

2.7. BLS energy-price data in Excel, before and after splitting columns.

If you find each observation in a single cell rather than in
columns, you can use Excel’s “Text to Columns” (under Data) to
split cells. The “Fixed Width” option allows you to split cells at a
certain number of characters, while “Delimited” allows you to
choose the “delimiter” (separator) and split cells whenever it
appears. Comma-separated files have commas as delimiters.
Clicking through the wizard and making sure you choose
“comma” as an option will break up the single cells into columns.
BLS energy-price data provide a good example. Pasting the
data from Figure 2.6 directly into Excel (not including the
header data) will leave commas where there should be new cells.

2.8. Formatted, renamed energy-price data in Excel.

Turning text to columns (2.7 and 2.8) will make it readable in
Excel. You can then delete the variable code and rename the
variable (here, it is energy prices). You are now ready to
calculate new variables for your analysis.
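
For readers working outside Excel, the "Text to Columns" step with a comma delimiter corresponds roughly to the sketch below. The pasted lines, the series code, and the column order (series code, year, period, value) are assumptions made for illustration; a real download may be laid out differently.

```python
import csv
import io

# Hypothetical comma-delimited text as it might look after pasting;
# the series code and values are invented.
pasted = """SERIES01,2020,M01,213.0
SERIES01,2020,M02,208.9
SERIES01,2020,M03,199.6
"""

energy = []
for series_id, year, period, value in csv.reader(io.StringIO(pasted)):
    month = int(period.lstrip("M"))                      # "M01" -> 1
    energy.append((f"{year}-{month:02d}", float(value)))  # drop the variable code

for date, price in energy:
    print(date, price)
```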

3. CALCULATING AND
TRANSFORMING VARIABLES

CHOOSING THE LEVEL OF SOPHISTICATION


If you are just looking to make a simple graph or chart, you
might not need to do much more with your data than what you
have already done. For example, if you are plotting real GDP,
and you’ve downloaded it, you can move on to making graphs
and tables.
If you want or need to transform the data—creating new
variables based on your original ones—I will explain some basic
techniques here. Some simple transformations include:
▪ Real variables (using nominal variables and price indices)
▪ Percentage changes (over the previous period or previous year)
▪ Natural logs and log changes.

You might be very good at statistics already, in which case
you probably already know this. You might wish to skip ahead
to the next chapter, which deals more with document design—in
my experience, many people who are very good at econometrics
still can benefit from creating a well-put-together document.

SOFTWARE
Here, most of the examples will make use of Microsoft Excel,
which is good for the type of analysis that this chapter focuses
on. Your data most likely are in .xlsx, .xls, or .csv format, so you
can simply create new variables in your existing document. It is
also easy to apply one variable’s transformation to other
variables. I don’t focus too much on Excel formulas or commands
(you can find whole books on them, or do a web search for
specific tools), but I do explain some basics and point out some
useful tips.

If you want to learn statistical or econometric software
(which I recommend), there are a number of options. Some, like
SPSS and Minitab, are not really used much in economics.
Some are better for time-series data than others. But in
general, you will have to make a choice involving 1) cost, 2) ease
of use, and 3) workplace value.
The most useful in business and academia, hands down, are
R and SAS. Both involve programming languages and structures
that often take time to learn. A big difference is that R is free to
download, while SAS is proprietary and can cost thousands of
dollars for a license. Many jobs require proficiency in one or
both, and in recent years, I have seen SAS’ dominance erode.
Many people really like Python (and it is worth learning!), but it
is likely better for other data-science topics. R has a number of
packages that are specifically designed for econometrics.
If you open up R itself, it is just a command prompt; this can
be intimidating. It is much better to use the interface RStudio,
but even then, you need to properly code everything you want
the software to do. In addition, you can download pre-written
“packages” rather than code the actual statistical tests you want
to do. So while it’s not easy to learn, it might look more difficult
than it actually is. Most of the graphics here (other than the
Excel examples) are done in R, and I advise my own students to
learn it if at all possible.

              Easier          More Difficult
Free          Gretl, JMulTi   R, Python
License Fee   EViews, Stata   SAS

3.1. A simple classification of statistical software.

Other software is more “user-friendly,” with drop-down
menus and check boxes that do not involve any code. The two
that are most professionally useful also cost money for a license
(although students can often get a substantial discount): EViews
and Stata. In my opinion, EViews is better for time series,
although I have used both. Knowledge of either looks good on a
résumé, although not as much as R or SAS.
Two free software packages that I have used are Gretl and
JMulTi, although I probably prefer the former. They are good if
you want to practice tools you have learned, but they don’t do as
much statistically as the others, and I don’t think the
professional world puts as much value on them. Likewise, many
people put “knowledge of Excel” on their résumés—this is
considered pretty basic now, so don’t highlight it unless you
have specialized knowledge above and beyond what we do here.

TRANSFORMING VARIABLES
If you chose “raw” variables, you will want to transform them
into the format you need before beginning. The easiest way for
beginners to do this is in Excel, although more advanced users
might find that other software will allow them to write
programs that can handle numerous variables very quickly.
Here, we focus on a few very basic transformations:
▪ Creating real variables
▪ Applying a common scale (such as millions of dollars)
▪ Calculating percentage changes
▪ Calculating natural logarithms
▪ Using log changes as percentage changes.

One advantage of creating real variables yourself is that you
can choose your deflator. In general, the formula for real values
is \( \text{Real} = \frac{\text{Nom}}{P} \times 100 \), where P is the price level. Usually this
price level is the Consumer Price Index or the GDP Deflator
(which captures a broader range of goods), but different
variables might use different deflators. For example, Investment
might be divided by the Producer Price Index (PPI). Export and
import price indices might be used for trade values. You might
also wish to calculate variables as a percentage of GDP (for
example, for foreign investment flows), in which case you can
use nominal values for all.
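
A minimal sketch of the real-value formula, applied to an oil-price example like the one in Chapter 1, might look like this. The prices and index values are invented, and the names nominal_oil, cpi, and real_oil are only illustrative.

```python
# Hypothetical nominal oil prices (dollars per barrel) and a price index
# with a base value of 100; illustration only.
nominal_oil = [30.0, 60.0]
cpi = [80.0, 240.0]

# Real = (Nom / P) x 100, applied period by period.
real_oil = [100 * nom / p for nom, p in zip(nominal_oil, cpi)]

for nom, real in zip(nominal_oil, real_oil):
    print(f"nominal: {nom:6.2f}   real: {real:6.2f}")
```

In this made-up case the nominal price doubles, yet the real price falls from 37.5 to 25.0 because the price level triples, which is exactly why nominal prices from different years cannot be compared directly.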
When you download data, make sure that you know the scale
in which the data are presented. You may need to convert
billions into millions, for example, if you have series that are
scaled differently. The value 1.412 might be greater than 1,122
if the former is in billions and the latter is in millions. If you
only want to compare percentage changes, this won’t matter so
much, but you can easily multiply 1.412 billion by 1000 to get
1,412 million.
Again, “Paste Special” in Excel can help you here. Simply
type 1000 in a blank cell, copy it, and select every cell you wish
to convert. Then make sure you check “Multiply.”

3.2. Microsoft Excel’s “Paste Special” allows you to multiply across multiple cells,
as well as transpose, remove formatting, and perform other useful functions.

Variables might also be presented in different currencies. For
example, Mexican Foreign Direct Investment (FDI) is often in
U.S. dollars, while Mexican GDP is in pesos. In this case, you
can use the peso-dollar exchange rate. If you have pesos per
dollar, then you multiply \( \frac{\text{pesos}}{\$} \times (\text{Value in } \$) \), since the $
signs cancel out. If you have U.S. $ per peso, multiply by the
peso value to get dollars. In both cases, you would multiply each
period’s exchange rate by the original value.
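
The currency conversion can be sketched as follows, using the FDI example above. All figures are invented, and the variable names are hypothetical; the point is only that each period's value is multiplied by that period's exchange rate before the two series are compared.

```python
# Hypothetical values; illustration only.
fdi_usd = [2.0, 3.0]          # FDI, billions of U.S. dollars
pesos_per_usd = [19.0, 20.0]  # exchange rate, pesos per dollar
gdp_pesos = [5000.0, 6000.0]  # GDP, billions of pesos

# (pesos / $) x (value in $) = value in pesos, period by period.
fdi_pesos = [rate * value for rate, value in zip(pesos_per_usd, fdi_usd)]

# With both series in pesos, FDI can be expressed as a percentage of GDP.
fdi_share_pct = [100 * f / y for f, y in zip(fdi_pesos, gdp_pesos)]

print(fdi_pesos)
print(fdi_share_pct)
```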
If you are going to use growth or inflation rates, you may
wish to calculate percentage changes. Typically, these are done
using preceding dates, as in \( \%\Delta X = \frac{X_2 - X_1}{X_1} \times 100 \). In Excel, if
your data are in column C, one value might be calculated as
100*(C3-C2)/C2 or 100*(C3/C2-1). You may wish to annualize
these percentage changes if you are working with quarterly or
monthly data. Since these involve 1/4 or 1/12 of the year, you
can simply replace “100” with “400” or “1200.”
You may also wish to calculate year-over-year percentage
changes. You would then subtract the same quarter or month of
the previous year, skipping the observations in between.
January 2020 would be compared to January 2019. Make sure
you subtract the correct number of cells. With quarterly data,
you would have to type 100*(C6/C2-1) in Excel, a gap of four
quarters (2019Q1 to 2020Q1). The year-over-year series is
typically much smoother than the period-over-period series.
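
As a rough sketch of these calculations outside Excel, the Python function below handles period-over-period, annualized, and year-over-year changes with one formula. The function name, its `lag` and `scale` parameters, and the sample series are my own choices for illustration.

```python
def pct_change(series, lag=1, scale=100.0):
    """Percentage change versus the value `lag` periods earlier.

    The first `lag` entries are None because no earlier value exists.
    Use scale=400 (quarterly) or scale=1200 (monthly) to annualize.
    """
    out = [None] * min(lag, len(series))
    for t in range(lag, len(series)):
        out.append(scale * (series[t] / series[t - lag] - 1.0))
    return out


# Hypothetical quarterly series, for illustration only
quarterly = [100.0, 101.0, 102.0, 103.0, 104.0]

qoq = pct_change(quarterly)                    # quarter-over-quarter
annualized = pct_change(quarterly, scale=400)  # simple annualization
yoy = pct_change(quarterly, lag=4)             # same quarter, previous year
```

With quarterly data the year-over-year lag is 4, and with monthly data it is 12. The leading None values play the same role as the blank cells at the top of the Excel column.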

3.3. Year-over-year percentage changes in the monthly Japanese yen-dollar
exchange rate (black) versus month-over-month percentage changes (gray).

Another conversion you may wish to do is to calculate
natural logarithms (and maybe also log changes, which
approximate percentage changes). Natural logs are similar to
base-10 logs, but natural logarithms are the standard in economics.
The base of a natural log is the irrational number e (kind of like
π), which has a value of 2.718… and continues forever without
repeating. The logarithm is the “opposite” of the exponent, in
that \( \ln(e^x) = x \). The notation ln comes from the French
logarithme naturel.
Make sure that you choose this option over log10. Excel, for
example, has formulas for log(x), log10(x), and ln(x). With no
base specified, the first two are the same function, so use the
third. Log10(10) equals 1, while ln(10) ≈ 2.3 and ln(e) = 1. So the
two are not the same, although one is simply a constant multiple
of the other. It is important to know that you cannot take logs of
zero or of negative numbers. If you have values at or below zero,
you will have to try a different approach.
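
To see the difference between the two logarithms, here is a short Python sketch using the standard `math` module. The helper `safe_ln` is a hypothetical name of mine. It returns None where the log is undefined, and what to do with those observations is left to you.

```python
import math


def safe_ln(x):
    """Natural log of x, or None when x is zero or negative (log undefined)."""
    return math.log(x) if x > 0 else None


natural = safe_ln(10)          # natural log of 10, roughly 2.3
base_ten = math.log10(10)      # base-10 log of 10
ratio = natural / base_ten     # the constant that links the two logs
undefined = safe_ln(-5)        # None: no log of a negative number
```

The ratio is the same for any positive number, which is why a series looks the same in either base apart from the scale on the axis.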
I always think of logs as a “flattener,” or a function that
“demotes” mathematical functions by knocking them down one
step. For example, you can rank the functions in order:

Powers > Multiplication > Addition

You can try 2^3, 2 x 3, and 2 + 3 and see that 8 > 6 > 5. While
I’m not even going to try to get into all the log rules from math
class, here are the most important:

\( \ln(x^2) = 2\ln(x) \)        \( \ln(xy) = \ln(x) + \ln(y) \)

Powers get turned into multiplication, and multiplication
becomes addition. The second one is particularly important
when decomposing percentage changes. For example, the money
equation MV = PY becomes ln(M) + ln(V) = ln(P) + ln(Y). This
can be turned into growth rates (for money and real GDP) and
inflation rates.

3.4. A randomly-generated exponential growth series (left) and its natural log
(right). The log series is flatter. Note the scale on the left side of each graph.

Log changes, or Δln(x), approximate percentage changes very
well. In fact, these are used in finance and economics all the
time. They are not exactly the same, although they are very
close, particularly for small changes. Technically, the two are
identical only in the limit, as the change shrinks to zero. If you
have studied calculus, the derivative of ln(x) is 1/x. But for
macroeconomic variables, which change over time, you need to
apply the Chain Rule: d ln(x)/dt = (1/x)(dx/dt). The term dx/dt is
approximated as Δx, so putting the two together, you get Δx/x or
the percentage change. The monetary equation
ln(M) + ln(V) = ln(P) + ln(Y) can be turned into
%ΔM + %ΔV = %ΔP + %ΔY. If velocity (V) doesn’t
change (according to the theory), changes in the money supply
turn into inflation, real GDP growth, or some of both.
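
A quick way to convince yourself of how close the approximation is, and where it breaks down, is to compute both measures side by side. The Python sketch below uses a made-up series whose last step is a deliberate 50% jump. The function names are mine.

```python
import math


def pct_change(series):
    """Period-over-period percentage changes, in percent."""
    return [100.0 * (series[t] / series[t - 1] - 1.0)
            for t in range(1, len(series))]


def log_change(series):
    """Period-over-period log changes, scaled by 100 to compare with percent."""
    return [100.0 * (math.log(series[t]) - math.log(series[t - 1]))
            for t in range(1, len(series))]


# Hypothetical series: two small changes, then a 50% jump
series = [100.0, 101.0, 103.0, 154.5]

pct = pct_change(series)
logs = log_change(series)
gaps = [p - l for p, l in zip(pct, logs)]   # grows as the change gets larger
```

For the 1% change the two measures differ by less than 0.01 percentage points, while for the 50% jump the log change is only about 40.5. For typical monthly or quarterly macroeconomic data the difference is usually negligible, but for very large swings it is not.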

3.5. Month-over-month percentage changes in the yen/dollar exchange rate
(black) versus monthly log changes (gray). Note the overlap.

These transformed variables can be used to plot economic trends
as well as to test economic theory or to examine the effects of
policy. For that, however, you may need to use a different set of
tools.

BASIC STATISTICAL TOOLS


No matter how much you have studied or used statistics, keep in
mind one thing: Many, many people have taken, at most, one
statistics course. This is usually as part of an undergraduate
degree. Since more people have a bachelor’s rather than a
Master’s or doctorate, in many situations you can assume that
enough people in your audience will appreciate, and get a lot out
of, simple yet well-explained statistical methods. There is a
tendency to “overdo” the quantitative part, and oftentimes
economists have a hard time explaining their analyses in
simpler terms. Here, the basic statistical measures are
explained: Mean and median, minimum and maximum, and
standard deviation, as well as the coefficient of variation. We
also explain correlation, which is the simplest measure of a
relationship between two variables.
The mean, or average, is a well-known measure of a variable’s
“center,” or where it is located. Given the symbol μ, it is the sum
of all values divided by the number of observations. The median
is the middle value, or the average of the two middle values if
there is an even number of observations. For variables such as
income, the median is often preferred, because one “outlier” can
drastically change the mean, but not the median.
For example, with the numbers [1, 3, 5, 9, 12], the mean
(average) is 6, and the median is 5. But if the largest number
increases to 27, the mean goes up to 9, while the median
remains 5. Medians are less sensitive to these outliers, so they
are good for measuring income or housing prices, where one
billionaire or large mansion can make an entire neighborhood
richer on average.
Analyses often report the minimum and maximum values of
a variable. This helps establish the range over which the values
are likely to fall. The numbers [1, 3, 5, 9, 12] have a minimum of
1 and a maximum of 12, but these values do not say how spread
out they are. For that, we use the standard deviation, which
captures dispersion. For example, [5, 5, 5, 5, 5] has a mean of 5,
but no spread. The numbers [1, 3, 5, 9, 12] have a mean of 6, but
each number differs more from the average value.
The standard deviation, labeled σ, is the square root of the
variance, which takes the difference of each value from the
mean, squares it, and then averages all the squared values. For
example, the numbers [5, 5, 5, 5, 5] have a standard deviation of
zero, while [1, 3, 5, 9, 12] have a standard deviation of 4. The
numbers [1, 3, 5, 9, 27] are spread out even more, which is
reflected in a standard deviation of 9.38.
Standard deviations are useful when you want to tell how
“unusual” something is. If you are starting with [5, 5, 5, 5, 5],
the number 7 stands out, because the difference is much larger
than the standard deviation. It might be important to learn the
causes of this difference. But, if you are comparing this to [1, 3,
5, 9, 12], the number 7 fits right in, even if it is “above average.”
Means and standard deviations are used together for
statistical tests, but they are often combined to put standard
deviations in perspective. The coefficient of variation divides the
standard deviation by the mean, or σ/μ. This is useful because
large means and large variances often go together. For example,
a neighborhood with million-dollar homes might have price
ranges that differ by hundreds of thousands of dollars. The
coefficient of variation puts this large standard deviation in
context by taking into account the large average value. That
way, these price differences can be compared to those in a
neighborhood with less-expensive homes that might differ in
price from one another by only a few thousand dollars.
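
The examples in this section can be checked with Python’s standard `statistics` module. In this sketch, `summarize` is a helper name I made up. It uses the population standard deviation (`pstdev`), which matches the values quoted above, and it assumes the mean is not zero when computing the coefficient of variation.

```python
import statistics


def summarize(values):
    """Basic descriptive statistics, using the population standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "cv": sd / mean,   # coefficient of variation; assumes mean != 0
    }


base = summarize([1, 3, 5, 9, 12])
outlier = summarize([1, 3, 5, 9, 27])   # largest value replaced by 27
```

Comparing the two results shows the pattern described above: the outlier raises the mean and the standard deviation, while the median stays at 5.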
Finally, the correlation coefficient (ρ) captures the
association between two variables. This value can range from -1,
which means that the variables move in perfectly opposite
directions, to +1, where they move perfectly together. A
correlation of 0 means there is no association. Typically, you
don’t need a perfect correlation to say there is at least some
relationship, although there is no universally agreed-upon
minimum value.
As anyone who understands regression will tell you,
correlations do not take into account any other variables.
Sometimes outside events are the “real reason” why two things
move together. Also, as the saying goes, “correlation does not
imply causation.” From a correlation alone, it is impossible to
tell what caused what, or whether the link is a coincidence or
due to something else. For that reason, economic studies
typically use correlations only for preliminary
analysis, and use regression or other methods for their main
work. Nonetheless, many non-economist audiences will
appreciate the insights provided by this measurement.
Microsoft Excel easily allows for all these measures to be
calculated. These are calculated over ranges of cells. The
formula AVERAGE(B2:B144) gives the mean over some range of
cells; you might also wish to cover a “generic” range with
AVERAGE(B:B). STDEV.P(B2:B144) calculates the “population”
standard deviation (which I use here, instead of STDEV.S), and
dividing the two would give the coefficient of variation. The
correlation formula requires two ranges, which must be of equal
length. If you wish to calculate ρ for data in columns B and C,
you might type CORREL(B2:B144, C2:C144) or
CORREL(B:B,C:C). Note the comma that separates the columns.
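
For readers who want to see what a correlation function computes, here is a minimal Python sketch of the Pearson correlation coefficient, written out by hand. The function name and the sample data are hypothetical. The check on lengths mirrors the requirement that the two ranges be of equal length.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("the two ranges must be of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data, for illustration only
together = correlation([1, 2, 3, 4], [2, 4, 6, 8])    # moves perfectly together
opposite = correlation([1, 2, 3, 4], [8, 6, 4, 2])    # moves perfectly opposite
```

A constant series, such as [5, 5, 5, 5, 5], has no variation to correlate with anything, which is why the sketch refuses to return a number in that case.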

All these methods calculate a single statistic for each
variable. For time-series data, you can create means, standard
deviations, and correlations that change over time, and these
can be added for nearly every quarter or month of your dataset.
These new variables can be plotted and examined like any of
your original ones.

MORE ADVANCED STATISTICAL TOOLS


Using some basic statistics to create new variables is a simple
way to strengthen your analysis. Here, we look at two, the
rolling standard deviation and the rolling correlation. They are
called “rolling” because they calculate the variable over a short
period (called a window), which moves (or rolls) through the
sample. The standard deviation could be calculated for only the
first 12 months of a sample, for example, as in
STDEV.P(B2:B13), which gives the 12th month’s value. The first
11 months will have no observations because they cannot be
calculated. Then, the 13th month’s value would be calculated as
STDEV.P(B3:B14), followed by STDEV.P(B4:B15), and so forth.
The window length is usually a multiple of the frequency that is
long enough to calculate values, but short enough that too many
observations aren’t “lost,” or left blank. Monthly rolling standard
deviations can be calculated with 12- or 24-month windows,
while quarterly windows are usually 8 or 12 quarters.

3.6. Rolling standard deviation JPYVOL (black) vs. log changes in the exchange
rate, DLNJPY (gray). Note high volatility during the 1997 Asian crisis and 2008.

Similarly, rolling correlations calculate changes in
co-movements in variables over time, so that periods where
variables are closely connected can be examined alongside times
when they are not. The calculation is similar, using code such as
CORREL(B2:B13,C2:C13), and so forth.
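
The rolling calculations in this section follow one pattern: apply a function to a window, then slide the window down by one row. A Python sketch of that pattern is below. The helper `rolling` and the sample data are hypothetical, and the window here is shortened to 5 observations so the result is easy to check by hand.

```python
import statistics


def rolling(func, window, *columns):
    """Apply func to each moving window of one or more equal-length columns.

    The first window - 1 entries are None, like the blank cells at the
    top of a rolling column in a spreadsheet.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    out = [None] * min(window - 1, n)
    for end in range(window, n + 1):
        out.append(func(*(col[end - window:end] for col in columns)))
    return out


# Hypothetical series, for illustration only
changes = [1, 3, 5, 9, 12, 7]
volatility = rolling(statistics.pstdev, 5, changes)
```

For monthly data you would pass a window of 12 or 24. A two-variable function, such as a correlation, can be rolled the same way by passing two columns after the window length.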

OTHER ECONOMETRIC METHODS


If you have advanced skills in Econometrics, you can create a
number of other variables or conduct more rigorous analyses.
Macroeconomics makes use of time-series methods beyond
traditional Ordinary Least Squares regression. The best
techniques to master are cointegration methods (for long-run
relationships among variables), ARIMA approaches to model-
fitting and forecasting, Vector Autoregressive (VAR) methods for
testing causation, and GARCH models of volatility. We look at
some of these in our Example section. One of the best books for
practical applications of these and other methods is Walter
Enders’ Applied Econometric Time Series. If you are at that
level, you probably have no problem with correlations. But, as I
note in Chapter 4, many statistical analyses could benefit from
making their software output more presentable.

NAMING YOUR VARIABLES


As I mention in Chapter 2, you want to give your variables
names that are short, simple, useful, and recognizable. For
example, I named the log changes in the Japanese yen-dollar
exchange rate DLNJPY, because this is how the difference term
and natural logarithm are usually noted by economists. The
official currency code for the yen is JPY, so the reader will
recognize the currency immediately. I call the moving standard
deviation JPYVOL, because the variation represents volatility in
the currency. Not only do you want to be clear to your audience,
you also want to make sure that you will recognize your work if
you pick up an old project months after you left it.

4. PRESENTING YOUR DATA

PREPARING PRESENTABLE LINE GRAPHS


One primary graphical element used in macroeconomics is the
time-series graph, which plots the movement of one or more
variables (the y-axis) over time (the x-axis). You might see a few
others, such as the scatterplot, which compares combinations of
two variables (one as x and one as y). In general, economists
value function over flash, so you don’t need to be a graphic
artist, but you still need to do a few things to make your graph
acceptable.
While statistical software can be used to make decent graphs
easily (in other chapters, I use R’s basic grapher), here we use
Microsoft Excel. If you’re not that familiar with the software, it
has a wizard that helps you through the process. Simply select
the column(s) of data you want to graph, then use Insert→Line
for a 2-D line. But, while Excel draws a graph for you, you are
nowhere near done. Here is an example of a “bad” graph:

4.1. A graph that needs to be improved.


(log) Mexican Industrial Production, 1998-2011. Source: IFS.

There are a number of things wrong with this graph: There
are no dates on the x-axis, and the title (“Mexico”) is given twice
when it might not need to be presented at all. More minor points
include extra decimal places and (in my opinion) the horizontal
lines. Note that in every graph, I name the data source.
Plus, the line was originally blue. I generally avoid colors for
anything printed, because it costs extra. Journals have extra
charges, or if you print at home or at work, color ink is
expensive. Many people print in black and white for that reason.
Colors might become illegible, and any references (such as “the
green line”) will be useless.
Here are some more Industrial Production data:

4.2. A rough version of a multivariate graph.


(log) Latin American and U.S. Industrial Production. Source: IFS.

The colors were originally blue, red, green, purple, and light
blue. The dates are “squished” and randomly assigned between
January, May, and September. The vertical axis still has extra
zeros. In addition, there is “white space” below 4.2 that could be
eliminated (basically, you can “zoom in”).
To fix these issues, you can (right) click on the main data
box, as well as on every axis. For example, right-clicking the
vertical axis allows you to “Format Axis,” where you can change
the minimum and maximum values, the rounding (under
“number”), and other features. You can also change fonts and
other aspects of your graph. You can click on and delete the
horizontal lines (that’s more my personal preference). If you
right-click the main body, you can “Select Data” (add, remove, or
format variables) or format the plot area. It is quite possible to
go way beyond what I do here, and good graphic design requires
it. But the formatting of graph 4.3 below is plenty
for an academic report, PowerPoint, or academic journal.
Clicking on each line allows you to change the line style,
including width, dash type, and color. There are enough shades of
gray and dash types to make each line distinguishable (I think
that more than 5 or 6 lines on one graph is too much, anyway).
You will have to do each line individually. I am not a proponent
of “cookbook” manuals, which tell the reader exactly what color
to make each line. Instead, the best way to learn is to practice
doing it yourself and try what looks good to you personally.
The x-axis is particularly problematic. Dates in your
database, which might read something like “2009m2,” need to be
entered as years only. To get a usable year variable, I suggest
copying the date column and pasting it in an empty column
(with no data immediately to its right). Then, use Text to
Columns with the fixed-width option, placing the break after the
year (here: 4 characters). This splits off everything to the
right of the year.
If you right-click on your data area of your chart, you can “Select
Data” and use this year column as your date axis.
You will still have to adjust the spacing on your horizontal
axis. It is enough to have dates listed only every five years, so
you can adjust the interval between labels to 20 (if you have
quarterly data) or 60 (for monthly data).
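
If you prepare your axis labels outside Excel, the same two steps (keep only the year, then show a label every so many observations) can be sketched in Python. The function name is hypothetical, and the sketch assumes every date begins with a four-digit year, as in “2009m2.”

```python
def year_labels(dates, interval):
    """Keep the 4-character year for every `interval`-th date, else blank."""
    labels = []
    for i, date in enumerate(dates):
        labels.append(date[:4] if i % interval == 0 else "")
    return labels


# Monthly dates in the "2009m2" style, generated here for illustration
dates = [f"{year}m{month}" for year in range(1998, 2012)
         for month in range(1, 13)]

labels = year_labels(dates, interval=60)   # one label every five years
```

With quarterly data, an interval of 20 gives the same five-year spacing.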
I personally leave the labeling outside the graph itself, opting
to label everything (particularly the title) separately. The
individual line labels might be helpful, but seem to take up a lot
of space on the right-hand side of a graph. Oftentimes, a chart
title or variable label will be redundant if it is in the graphic and
the text above or below. At the very least, make sure you do not
have the default text “Chart Title” anywhere in your graph.
Also, make sure your variable names are clear. For example, it
might be a good idea to rename “Mexico” to “Mexican IP” in 4.1
above. If your chart is titled “Mexican Industrial Production,”
delete the line label entirely. The country names read well for
Figures 4.2 and 4.3. The original variable names included MXIP
and BRIP (for Mexico and Brazil), for example, but it is better to
rename your variables to make them clear to the reader.

4.3. A somewhat improved version of a multivariate graph.


(log) Latin American and U.S. Industrial Production, 1998-2011. Source: IFS.

Figure 4.3 shows some simple improvements that can be
made in Excel. While it is still a little “busy,” with a lot of
activity right around the base year (Industrial Production is an
index, like CPI), it is much clearer. Starting on the left, and
working clockwise:
▪ The vertical axis labels were reduced to one decimal place.
▪ The vertical axis minimum value was set to 4.2 to reduce
blank space.
▪ The data lines were set to five different types of grayscale. This
was mostly arbitrary, but I chose black for Peru because it was
separated a lot from the others at the beginning and end of the
sample. These lines include at least two types of gray (be sure
to not have them be too similar), and three types of dash
(including solid).
▪ I kept the line labels, because they would be difficult to
re-create outside the graph, and because they take up less
than a fourth of the width of the graphic.
▪ I created a new column of years only (as explained above).

▪ The interval between axis labels is now 60 (months), or 5
years. I opted not to make these intervals round (such as 2000,
2005, etc.), but that can be done as well.
▪ The interval between “tick marks” was changed from 1 to 12,
or once per year.
▪ As with most graphics here, I added a border. There is also a
description of the entire graph at the bottom.

As I mentioned previously, there is a lot more you could do,
particularly with the default fonts. But making your graph
legible—as well as removing white space and crowding—will
result in a report that is perfectly functional and that serves its
purpose in informing your audience.

OTHER TYPES OF GRAPHS AND CHARTS


While macroeconomics makes frequent use of line graphs, you
may wish to present data in other formats. If you are interested
in learning more about making really effective visuals, I
recommend two books: Storytelling with Data by Cole
Nussbaumer Knaflic, and The Visual Display of Quantitative
Information by Edward R. Tufte. Tufte’s book is a classic, and
Knaflic’s is extremely useful for modern visuals. These books
cover everything from choosing the correct type of graphic, to
properly presenting it, to adding more artistic effects.
Even if you’re just trying to be as simple as possible, there
are a few rules you need to follow:
▪ Avoid any 3-D effects. They distort people’s perceptions, since
the “nearer” object looks larger than it really is.
▪ Pie charts are also difficult to interpret. The area of each
wedge is its relative size, but people can’t calculate areas of
wedges very well. It is also difficult to tell which wedge is
bigger if they’re both similar in size.
▪ Avoid any “cute” or “flashy” graphics (such as images of stacks
of dollar bills instead of a simple bar).
▪ In general, try to be as efficient as possible in your use of ink
and space.
Sometimes a chart does not add much useful information.
Instead, a table can present the same information more
efficiently, since it is more compact.

For example, here is a table I made up for some country’s
composition of capital inflows, split between Foreign Direct
Investment, portfolio investment, and Other investment:

FDI 32%
Portfolio 41%
Other 27%

4.4. Some country’s composition of foreign investment.

These data can be presented a few different ways as graphics.


First, a pie chart might be appealing visually, but here it is
impossible to tell whether FDI or Other investment is greater:

4.5. A pie chart of a country’s composition of foreign investment.

Without numbers attached, any graph is hard to interpret. You
can add numbers when you create the graphic, but then ask
yourself, “Why did I need the graphic at all?” Table 4.4 above
measures 0.75 inches by 1.25 inches in a Microsoft Word
document. The pie chart (4.5) is 3 inches wide.
A bar chart might be better, since rectangles are easier to
interpret visually, but these also are not an efficient use of
space. People are good at assessing relative size from the
numbers themselves. Sometimes students with minimum page
requirements try to be as inefficient as possible, filling up two
pages of a seven-page assignment with a couple of pie charts.
Trust me, professors have figured this one out.
Here are a few bar charts that are each 5 inches wide. The 3-
bar chart has redundant percentages (above the bars and on the
left-hand scale).

4.6. A bar chart of a country’s composition of foreign investment.

Combining the three bars into one shows each type of flow’s
share of the total. Even if you were to eliminate all the white
space on the sides of the bar, this chart is going to be much
larger than the original table. From an efficiency standpoint, it
is far better to use the table and to avoid using any graphs at
all. But, of the three types of graph presented, the third option
(the 100% bar chart in 4.7) is probably the best. It has no
distorting angles, and it conveys the information more succinctly
than do the other two. I personally would tweak it to make the
fonts larger and get rid of the extra space.

4.7. A 100% bar chart of the country’s foreign investment. This still can be
improved, and is still inefficient in terms of useful information per unit of space.

TABLES
Given that graphics are not an efficient use of space, much of
your non-time-series data will be in the form of tables. There are
a few rules that I personally follow:
1) Avoid any type of “grid” lines (boxes around every cell).
Instead, use a limited number of horizontal lines, usually
below the header and at the bottom.
2) Round your data appropriately. Usually two or three
decimal places are enough. The letter E should never
appear in a number; it marks scientific notation, in which
the digits after E give the power of ten.
For example, the (very large) number 123456789012
sometimes reads in Excel as 1.23E+11, and the (very
small) number 0.0000123456789 reads as 1.23E-05. If you
have really large numbers, consider either re-scaling (into
billions, for example), or looking at your data for a
problem. The same goes for very small numbers.
Oftentimes, regression coefficients are so small that they
have no economic meaning. You may have to put “0.000,”
but you also have to ask why the number is so small in
the first place.

3) Try to follow some formatting rules. As we discuss in
Chapter 5, you might want to use a sans serif font for
tables, with numbers right-justified. At the very least,
make sure that your headers are all on one line (by
widening the columns, if necessary), that you have proper
headings, and that the table is legible.
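
If you build your tables from a script rather than by hand, the rounding rule above can be enforced with fixed-point formatting. This Python sketch uses a hypothetical helper, `format_cell`, whose `scale` argument handles re-scaling (into billions, for example) before rounding.

```python
def format_cell(value, decimals=3, scale=1.0):
    """Fixed-point text for a table cell: re-scaled, rounded, never E notation."""
    return f"{value / scale:.{decimals}f}"


# The large and small numbers used as examples in rule 2
large = format_cell(123456789012, decimals=1, scale=1e9)   # value in billions
small = format_cell(0.0000123456789)                       # three decimals
```

The small number comes out as “0.000,” which is the cue mentioned above to ask why the value is so small in the first place.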

Type Percent of Total


FDI 32%
Portfolio 41%
Other 27%

4.8. A properly-formatted data table.

REGRESSION OUTPUT
Whether you’ve taken a single econometrics course or you have a
Ph.D., never paste software output into your document.

4.9. Output from Eviews software.

This seems to happen all the time, and in my opinion it is
tacky. In fact, academic papers that do this come off as a little
unprofessional (and might be less likely to get published as a
result). If you are running some type of regression analysis,
make sure you 1) format your results as a table and 2) only
include the most relevant information.
Output from the software Eviews is presented in 4.9. It is an
AR(2) estimation of log changes in the yen-dollar exchange rate
from 1971 to 2016. If you are familiar with econometrics and
ARIMA modeling, you can see that the AR(2) coefficient is
insignificant (in fact, an autoregressive model of order one
performs much better). Keep in mind that you will have to draw
on your previous statistical knowledge for this type of
macroeconomic analysis.
The raw output has a number of redundancies. Standard
errors, t-statistics, and p-values are all provided, but most
analyses only use one (I personally prefer the p-value). Far more
statistics related to the estimation are provided than are
typically reported. There are also too many decimal places for
each number.
Creating and formatting a table properly involves selecting
only the essential results and presenting them concisely:

Variable      Coeff. (p-value)
Constant      -0.229 (0.137)
AR(1)          0.330 (0.000)
AR(2)         -0.032 (0.462)
AIC            4.70
DW             2.00
4.10. An AR(2) estimation of log changes in the dollar-yen exchange rate.

Here, only the coefficients and p-values (in parentheses) are
depicted, as well as the Akaike Information Criterion for
goodness of fit. I also threw in the Durbin-Watson statistic for
autocorrelation, but that is not that common nowadays. All
numbers are rounded to three decimal places (padded with zeros
if necessary), except for the two diagnostic statistics. There are
only two horizontal lines, and the font is Arial 10pt, which is a
common, sans serif font. The variable names are left-justified,
while the estimates are right-justified.
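
The same layout can be produced from a script instead of being retyped. This Python sketch formats the three coefficient rows from 4.10. The function name and the column widths are arbitrary choices of mine, and the numbers are simply the rounded values already shown in the table.

```python
def format_row(name, coeff, p_value, decimals=3):
    """One table row: left-justified name, right-justified estimate, p-value."""
    return f"{name:<10}{coeff:>8.{decimals}f} ({p_value:.{decimals}f})"


# Coefficients and p-values as reported in 4.10
rows = [
    ("Constant", -0.229, 0.137),
    ("AR(1)", 0.330, 0.000),
    ("AR(2)", -0.032, 0.462),
]

table = "\n".join(format_row(*row) for row in rows)
print(table)
```

Fixed-point formatting pads with zeros automatically, so 0.33 prints as 0.330 and every row lines up.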
A further estimation (volatility modeling using GARCH)
provides another table, as well as a software-generated graph:
Mean equation
Variable      Coefficient (p-value)
Constant           -0.212 (0.208)
AR(1)               0.309 (0.000)
Variance equation
Variable      Coefficient (p-value)
Constant            1.247 (0.163)
ARCH                0.070 (0.078)
GARCH               0.736 (0.000)
AIC                 4.71
4.11. An AR(1)-GARCH(1,1) estimate of yen volatility 1971-2016.

Here, I renamed some of the variables (the ARCH coefficient
is named RESID(-1)^2 in Eviews), and applied some formatting.
Italics help draw the eye to each section. The horizontal lines
separate the mean equation, the variance equation, and the
(single) diagnostic statistic. The GARCH variance (volatility)
can be plotted within the software:

4.12. GARCH volatility of the yen.

This volatility series is a more sophisticated version of the
rolling standard deviation in 3.6. The same “spikes” can be
found during the 1997 Asian Crisis and the 2008 financial crisis.
This graph is a little more visually appealing than those that
Microsoft Excel can generate. But, many of the formatting
options in Eviews and other software still need to be adjusted.
The GARCH graph in 4.12 uses my own personal settings,
including the retro, typewriter-looking Courier New fonts for the
axes. Whatever your tastes, line color and width, background
color, and fonts can all be changed—and sometimes have to be,
so that your graphs look good on the page.

THE NEXT STEPS


What you do next depends on your ultimate goal for your data
analysis. Are you writing a class project, or an academic paper?
Will you be presenting via PowerPoint, or writing a document?
How you combine your tables and graphs with your text depends
on this format.
But no matter what your format, never forget that your main
goal is data analysis. Every chart and table supports your
conclusions and interpretation regarding the data you are
looking at. Well-written, and well-thought-out, text is key to any
report. Make sure that what you have to say is given as much
emphasis as the tools you use and the format in which you
choose to say it.

5. PUTTING IT ALL TOGETHER

ROUGH DRAFT, OR FINAL PRODUCT?


What you put together depends on whether or not you are
assembling the final product yourself. Professionally-produced
reports might have a dedicated graphic artist and proofreader,
so you might only need to provide text and either basic graphs or
(sometimes) only the raw data themselves. But that leads to a
new set of problems: you will be responsible for making sure
that the final product looks the way you want it.
Most of what is discussed here assumes that you are putting
everything together yourself. Economists don’t really value style
over substance, so don’t try to be flashy, but there is a bare
minimum standard that is too often ignored. There are plenty of
examples—even by those with Ph.D.s—of illegible graphs,
crowded text, and spelling and grammar errors. Papers are often
rejected by academic journals for these reasons, even if the ideas
are good. In the professional world, standards are even higher.
Our goal here is to assemble a high-quality document that meets
the needs of both you and your reader.

WORKING ON A TEAM?
Graphs and figures might need to be checked extra-
carefully. They can be copied incorrectly, or created by an
artist who is not concerned with the content. Sometimes
they are placed independently of the editors. In particular,
you want to:
▪ Check the placement, that is, that the graph is located
where you want it.
▪ Verify the header/chart title; these are often copied
incorrectly or typed by hand.
▪ Make sure the images are clear and imported correctly.
▪ Check spacing; also make sure that cells aren’t shifted.
▪ Check the fonts; sometimes these “disappear” and are
replaced with defaults.
▪ Look at the footer and any additional notes.

USING DATA TO TELL YOUR STORY
Before you begin putting together your final document, take
time to really think about how your data will be analyzed. Any
graphs or tables are there to support your interpretation. Look
everything over for any interesting patterns. Think about what
your results mean. Much of your analysis will be presented
through your writing and explanation, rather than your charts
and figures. And of course, keep in mind your audience and how
your findings will be interesting to them.
No matter what format you are using, and who your
audience is, a good report generally has four sections:
1) An introduction that explains the issue at hand and why
it needs to be addressed. Academic papers often use this
section to discuss previous literature on the subject.
2) A methods section, which explains the data used and any
statistical procedures. One challenge is to explain neither
too much nor too little. For that, you need to know your
audience. For example, if your audience knows basic
statistics, you can simply say, “means and standard
deviations are presented here.” But it is not difficult to
imagine others who might need a bit more explanation.
Likewise, I’ve seen economists use obscure econometric
methods without going into enough detail. One example
in this document is that I assume you know what a .pdf
is, but not a TIFF file.
3) An explanation of results, often tying together the main
idea, previous research, and each table or graph.
4) A conclusion that brings back the “big picture” and makes
specific recommendations.

An undergraduate assignment might not have these sections
clearly labeled, but a formal research paper will. Even if you
don’t have headers for each, it is important to state 1) What you
are doing, 2) How you did it, 3) What you found, and 4) Why it
matters. Make sure you incorporate all of these ideas into any
project you do, which will make it useful, clear to understand,
and interesting to your reader.

INTEGRATING GRAPHS, TABLES, AND TEXT
Different projects require different formats. Many reports are
written, while others are presented orally, with a set of slides to
support the presentation. They may be done in a corporate
setting, or by students as part of a class assignment. Here, we
focus on self-produced, written, academic assignments.
Many professional (or aspiring) economists prefer not to use
Microsoft Word or PowerPoint. Instead, they often use LaTeX,
which supports mathematical equations, for documents, and a
related package called Beamer for presentations. I’m not going
to get into that here. While much of this
explanation is for Word, much of it applies universally, no
matter what software you use.
First, while many academic papers (particularly unpublished
working papers) place all tables and graphs at the end, a
macroeconomic report should have all these elements in the
main body, as close to the text that references them as possible.
Each table or graph can be placed on a separate line, with a
paragraph space (return) above and below.
Make sure each element has a concise, clear title (above or
below it). I have seen lengthy explanations of multiple
variables in table footers; this is mostly redundant and can be
avoided. A good idea is to number tables and figures (these can
have separate numbers, such as “Table 1” and “Figure 1”). Here,
I number all of mine together. In your text, make sure you refer
to tables and figures by their number, as in “Figure 4 shows…”
rather than “The graph above shows…”.
Another option for image placement is to embed it directly in
the text. This might be a good idea if you have wide paragraphs
or are making your own document (such as a market report).
One trick is that since images “bump” the text, you have to
format them: right-click on the image, choose “Wrap Text,” and
then choose “Tight.” The volatility series from 4.12 is repeated
here, but it clearly needs fixing. There’s not enough space for a
title, and it is very close to the text on its left. For that reason, I
recommend making larger graphs on separate lines, without
wasting too much space.
Make sure you crop your images if necessary. Right-
clicking the image shows the “crop” tool (which has an
icon similar to the one here on the right). You can leave
some space on each side, but don’t cut off anything important.
Your text itself can either be single- or double-spaced.
Double-spacing is for editing purposes, so that you can make
corrections (or get comments) in the white space. This is good for
a class paper, particularly if this format is requested as part of
the assignment. But if you take a look at any book, magazine, or
professional document, there isn’t any extra space—just enough
so that the text isn’t crowded. Some of these design
considerations are explained later.
Specific length requirements are determined by the nature of
the analysis project, as well as by a course professor or academic
journal. Resist the temptation to excessively pad a document, or
to cut too much to get below a certain maximum. While you
should read and re-read your document for any redundant
phrases or sections, make sure you explain everything that you
need to. Another problem that writers sometimes have is that
they assume they wrote something, when really they just
thought it. Look through your document with this in mind.

PRODUCING HIGH-RESOLUTION IMAGES


If you are simply writing a report for a class, most likely you can
copy and paste graphs straight from Excel without a problem.
Another good way to put images into a document is Microsoft’s
“Snipping Tool.” You can drag your box around items you want
to copy, and either save to file or paste into Word. Windows also
has a “Print Screen” option, which copies everything on your
screen—exactly as you see it yourself—to your Clipboard. Saved
images can also be inserted from a folder, as can clip art and
other elements.
One thing to watch out for is the resolution of your images.
High-quality images have larger file sizes, because they carry
more information in the same area. This quality, or resolution, is
measured in dots per inch (dpi). The more dots, the higher the
resolution of the image. If you’ve ever seen a “blown up” picture
that looks boxy or blurry, it is because the resolution is too low.

5.1. A low-resolution image that was enlarged too much.

Typically, the web and basic printing do not require very
high resolution. Microsoft’s defaults are good enough. The
Snipping Tool, for example, produces images with 96dpi, and
they work fine in most cases. But, for actual print publishing
(such as an academic journal), the minimum resolution is often
300dpi. PowerPoint slides have no specific requirement, but you
can clearly tell if you fill a whole slide with a low-resolution image.
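
The link between print size and resolution is simple arithmetic: the number of pixels along each side equals the size in inches times the dpi. The short Python sketch below illustrates this; the function name and the 6-by-4-inch graph size are assumptions chosen for illustration, not a requirement of any particular journal.

```python
def pixels_needed(width_in, height_in, dpi):
    """Return the (width, height) in pixels needed to print an image
    at the given physical size (in inches) and resolution (dots per inch)."""
    return round(width_in * dpi), round(height_in * dpi)


# A hypothetical 6 x 4 inch graph at screen, print, and high-quality resolutions
for dpi in (96, 300, 600):
    width_px, height_px = pixels_needed(6, 4, dpi)
    print(f"{dpi} dpi: {width_px} x {height_px} pixels")
```

Under these assumptions, a 96dpi screen capture of the graph is 576 pixels wide, less than a third of the 1,800 pixels that printing it at 300dpi would call for.
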
Producing these images can be a challenge. I have tried
increasing Windows’ default settings, but that involved altering
my “Display” settings, logging out, and logging back in. It
basically magnifies everything, so you have to undo it right
away to remain sane. I tried this for some of the website screen
captures here, but the best I could do was 192dpi.
Some software gives the option for printing output as a file. I
produced my graphs in R at 600dpi. My suggestion is to use a
plugin (available for free online) that allows you to print to .pdf,
or even better, as an image file. Typically, people use .jpg files,
which use compression to reduce file size. TIFFs (Tagged Image
File Format files) lose less data in the compression process, so
they are sometimes preferred by publishers.
I found a print-to-file plugin called “PDF Creator,” but I don’t
recommend specific freeware from the web because of the ads
(and other issues) that have been known to be associated with
free software. An option if you are just concerned with legibility
(such as when making a PowerPoint) is to “Save As” and turn
your Word file into a .pdf. Then, in Acrobat, you can “Take a
Snapshot” (under Edit) and paste it into your PowerPoint. Make
sure you have zoomed in as much as you can before you take the
snapshot, so that your pasted image is as clear as possible.

FONTS, COLORS, AND OTHER CONSIDERATIONS


You don’t have to be a graphic artist to want to change your
document from Word’s default settings. As I mentioned
previously, I suggest making all images in grayscale. But you
probably should change the settings for your text. The biggest
adjustments you should make are in fonts and spacing.
There is an art to selecting fonts. Long ago, printers had sets
(fonts) of metal blocks with various engraved typefaces. Today,
software has literally dozens. These are created by designers.
You might notice that Apple and Microsoft have different choices
available. Here, we focus on four considerations:
1) Font choice: serif vs. sans serif
2) Headers, main text, tables, and boxes
3) Fonts that are readable by others
4) Font size and spacing between lines and paragraphs.

The “serif” is the “foot” or part that “sticks out” on letters in
some fonts. Sans serif literally means “without” serif in French.
For example, this is visible on the a in Times New Roman, but not
on the lowercase a in the sans serif font Arial. These two fonts
are very popular, and are fine to use in an academic context.
This document is written in Century Schoolbook, which has
distinct serifs. Many times, the main text of documents uses
fonts with serifs.
Other types of text (such as boxes with additional
information) might “stand out” with a different font. So might
descriptions of tables or figures. The descriptions here are in
Arial. One common piece of advice is to use a sans serif font such
as Arial or Tahoma for tables; the numbers read more cleanly.
You might also choose a text type that stands out for section
headers. Here, I use Franklin Gothic Demi, with small capital
letters instead of lowercase ones. If you’re an artist, you can do a
lot, but remember that Economics frowns upon excessive
artistry. Still, trust your instincts or try to model other
documents you have seen.
Choosing common fonts might not seem very exciting, but it
ensures that your reader will see exactly what you see. If their
computer doesn’t have it, their software might substitute a
similar, but different, font. This might cause weird spacing or
move things around. One way to avoid this is to save your
document as a .pdf in a way that preserves the original fonts.
Make sure your text is the right size for its purpose. Usually,
10- or 12-point font is acceptable for academic papers. Some
fonts are “wider” than others. College professors often specify
fonts and point size, because Courier New takes more space to
write the same words than does Arial. And while many an attempt
has been made to stretch a short paper with 36-point fonts, it is
pretty obvious. Outside of this format, though, you will want to
make sure your reader’s needs are met. For example, font sizes
smaller than 6 points are often illegible, so large tables might
need to be formatted accordingly, on multiple pages.
You also might want to change the default spacing between
lines and paragraphs. There are options in-between single- and
double-spacing. If your leading (the vertical space between
lines) is too small, the bottoms of your letters will touch the tops
of the letters in the line below. Likewise, too large a leading
might result in too much white space. I generally don’t put any
extra space between paragraphs;
this requires me to change Word’s default settings. I do put
space around tables, figures, and headers and footers.
Word’s default margins are 1 inch on each side. You might
wish to adjust these—here, I have 1.5 inches on each side except
the left, which is 2 inches to allow for binding. But, because I
single-space, there are more words per page (about 350) than if
it were formatted like a term paper (which would have about
250 words). Sometimes, if you have a large table, you can reduce
the margins to get it to fit. Make sure you don’t go below 0.5
inches, though, because it might not print properly if you do.
One book I like is Document Design: A Guide for Technical
Communicators by Miles Kimball and Ann Hawkins. There are
a number of similar books, which are used by designers. Even if
that’s not your goal, it’s good to know about color, type, and
layout if you are putting together a document that you want
people to read.

A FEW WRITING TIPS


Just like you need to use all your previous knowledge of
macroeconomics and statistics to perform good data analysis,
you also need to apply your writing skills to write a good report.
Here are a few things to keep in mind when you write up your
findings:
▪ Your writing style should be more formal than it might be for
other types of writing. You should avoid using contractions
(although I don’t do that here!), for example.
▪ Make sure your sentences are concise; don’t be too wordy. Look
for places to cut where you can so that your writing is as
“tight” as possible. Also use the simpler word (such as use vs.
utilize) whenever possible.
▪ Avoid using jargon, slang, or obscure references. Oftentimes,
writers use sports analogies, unaware that American football
or cricket might not be universally understood. The same is
true for Greek mythology. Also, Latin phrases used to be very
common among the educated class; there’s less pressure
nowadays to follow suit. When you use foreign terms, they
should be italicized. Two places where you will see this in
economics are the term ceteris paribus and et al. (short for
“and others”) in a reference section.
▪ When using economic terminology, make sure you spell out
abbreviations the first time you use them, for example if you
apply a GARCH (Generalized Autoregressive Conditional
Heteroskedasticity) model.
▪ It is tricky to know how much you really need to explain,
though. You could get away with saying “I ran OLS” and be
understood, but that is the most well-known regression
method. I suggest spelling everything out just in case.
▪ You can use American or British English (e.g. Generalized vs.
Generalised), as long as you are consistent every time.
▪ I don’t think there is a specific style guide (like APA or MLA)
for Economics; I usually tell students to go with the one they
are already most comfortable with.

PROOFREADING
You also want to make sure your report is well-written and
thoroughly edited. If English is not your first language, get
someone to read it over for common issues (such as when to use
“a” and “the”), where rules don’t seem to apply.
Read through your document multiple times, for different
levels of detail. A quick read-through can help to see the “big
picture,” while going over every word can help catch tiny
mistakes. Make sure you look for things that Word’s squiggles
can’t catch, such as typing “work” for “word.” It’s not a
misspelling, so it won’t be flagged. If you mention people by
name (such as your college professor), make sure you get his or
her name right. The same is true for the person’s title and any
other details. Look it up if you have to.
If you have someone editing your work, don’t be extra-sloppy
and rely on your editor to catch everything. Not only will more
mistakes raise the likelihood that something gets through, your
editor may stop being your friend.

PRESENTING WITH SLIDES


If you are simply presenting with slides, you don’t have to worry
as much about color, paragraphs, and other things that matter
for text documents. But, you still want to keep in mind:
▪ Don’t “crowd” too much text on one slide
▪ Use large, readable fonts ( > 20 point)
▪ Make sure your images are high (enough)-resolution
▪ Don’t be flashy with sounds, clip art, or superfluous images.

I have seen Beamer, projected .pdfs, and even open Word
documents used in presentations, but PowerPoint is usually
fine. I use simple, white backgrounds, with no effects or
transitions (such as “Appear”). But I do change the fonts and
make sure the whole presentation is well-organized, clean-
looking, and professional.
Next, we follow all five steps to answer a macroeconomic
question, studying oil prices using real data.

AN EXAMPLE
OIL PRICES AND OIL-PRICE VOLATILITY
Here, we combine everything we’ve discussed so far, as well as
our knowledge of macroeconomics, to examine trends in oil
prices. We can go through each of the five steps:

1) WHY IS THIS IMPORTANT?


Oil is key to the U.S. economy, so rising prices can lead to
inflation and reduce output. Consumer budgets can be squeezed
as well. Two well-known “oil shocks” in the 1970s had drastic
effects, and high oil prices preceded the 2008 recession. In
addition to price movements, their volatility represents risk and
uncertainty, which can hamper business decision-making.
Our goal here is to gather data on oil prices, plot them in real
values and percentage changes, and calculate and plot a
volatility measure. We will also create a summary table, and
compare three measures of percentage changes using
correlations.

2) GATHERING DATA
For this example, we will use monthly data from FRED. There
are multiple oil prices, but here we will use West Texas
Intermediate (WTI). A search for “WTI” will provide a number of
results, but here, we choose Global price of WTI Crude from
January 1980 to September 2016. The data are not
seasonally adjusted, but this doesn’t seem to cause too big of a
problem. Download these data as either an Excel file or a .csv.
While I have R code, which uses these data to generate similar
(but not identical!) results, on my website, it is important that
you know how to find data yourself.
Since these data are nominal, we will need to create real
values using a price index. Two options are the Consumer Price
Index (CPI) and the Producer Price Index (PPI); here we use the
PPI because of oil’s importance in industry. A search for “PPI”
shows that monthly data are also available; make sure you
download at least January 1980-September 2016. You can
always cut the longer series. If you download this series, you can
combine PPI and WTI into a single file. The first column (DATE)
should begin with 1/1/1980, followed by columns for WTI and PPI.
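
If you prefer to combine the two downloads with code rather than by copying and pasting, the idea is to match the series on their dates and keep only the months that both have. The following is a minimal Python sketch of that step. The column names, the date format, and every number in it are made up for illustration; a real download may label its columns differently.

```python
import csv
import io

# Hypothetical excerpts of two downloaded .csv files (all values are made up)
wti_csv = "DATE,WTI\n1980-01-01,30.0\n1980-02-01,31.0\n1980-03-01,32.0\n"
ppi_csv = "DATE,PPI\n1979-12-01,84.0\n1980-01-01,85.0\n1980-02-01,86.0\n1980-03-01,87.0\n"


def read_series(text, name):
    """Read a two-column .csv (DATE plus one variable) into a dict keyed by date."""
    return {row["DATE"]: float(row[name]) for row in csv.DictReader(io.StringIO(text))}


def merge_by_date(wti, ppi):
    """Keep only the dates present in both series, in chronological order.
    Dates written as YYYY-MM-DD sort correctly as plain text."""
    dates = sorted(set(wti) & set(ppi))
    return [(d, wti[d], ppi[d]) for d in dates]


database = merge_by_date(read_series(wti_csv, "WTI"), read_series(ppi_csv, "PPI"))
for row in database:
    print(row)  # (DATE, WTI, PPI)
```

Here the longer PPI series is cut automatically, because its extra month (December 1979) has no matching WTI value.
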

E.1. A preliminary database.

Next, we can create the other variables.

3) CALCULATING VARIABLES
Here, we are going to deflate nominal WTI by the PPI (making
sure to multiply by 100). Then, we are going to create three
measures of change: Monthly log changes, monthly percentage
changes, and yearly percentage changes. Finally, we will
calculate 12-month moving standard deviations for the monthly
log-changes series.

E.2. Nominal and Real Oil Prices (dollars per barrel), 1980-2016. Source: FRED.

If you’re not that familiar with Excel, the formula for the
January 1980 real WTI would be 100*(B2/C2); then, you can
copy that cell and paste all cells below, or simply click the lower
right corner of that original cell to fill in the rest. I named my
new variable RWTI. If you’re curious how Nominal and Real
WTI compare, the nominal value is higher than the real value
after the “base year” (1982-1984), because price levels have
risen. Not controlling for this will make oil prices look higher
than they really are later in the series.
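
The same deflation step can be sketched outside Excel. The Python lines below apply the formula 100*(nominal/index) to each month; the function name and all of the numbers are assumptions chosen only to show the mechanics.

```python
def deflate(nominal, price_index):
    """Convert nominal values to real values: 100 * nominal / price index."""
    return [100 * n / p for n, p in zip(nominal, price_index)]


# Made-up nominal oil prices and price-index values (illustration only)
wti = [30.0, 31.0, 60.0]
ppi = [85.0, 100.0, 200.0]

rwti = deflate(wti, ppi)
print([round(value, 2) for value in rwti])
```

When the index is below 100 the real value is above the nominal one, when it equals 100 the two are the same, and when it is above 100 the real value is lower, which is the pattern described above.
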
Next, we calculate percentage changes. We do this three
different ways, even though we are only going to use one. First,
we calculate log changes, multiplying by 100 to make it into a
percentage. In Excel, you can do this as 100*(LN(D3)-LN(D2)).
Note that the parentheses “nest” the functions, following the
order of operations. The logs are grouped, then this group is
multiplied by 100. We also create monthly percentage changes.
Assuming that Column D is your real WTI, the formula will be
100*(D3/D2-1). This is the same as (D3-D2)/D2, that is,
subtracting before dividing, since D2/D2 = 1. We can make
annual percentage changes as well, but we cannot do it until we
have a full year of values. If we start in January 1981, we can
compare it with the value from January 1980, as in
100*(D14/D2-1). Make sure that you copy your formula into all
cells in the column, so that it updates to
100*(D15/D3-1) and so on. When you are finished, notice that
the numbers differ between the monthly and yearly versions,
since we are not annualizing them here.
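
For readers who want to check their spreadsheet against code, the three measures of change can be sketched in a few lines of Python. The function names and the 14 monthly values are made up for illustration; only the formulas follow the Excel versions above.

```python
import math


def log_changes(x):
    """Log changes in percent: 100 * (ln(x[t]) - ln(x[t-1]))."""
    return [100 * (math.log(x[t]) - math.log(x[t - 1])) for t in range(1, len(x))]


def pct_changes(x, lag=1):
    """Percentage changes over `lag` periods: 100 * (x[t] / x[t-lag] - 1)."""
    return [100 * (x[t] / x[t - lag] - 1) for t in range(lag, len(x))]


# Made-up real oil prices for 14 consecutive months (illustration only)
rwti = [40.0, 42.0, 41.0, 43.0, 45.0, 44.0, 46.0,
        48.0, 47.0, 49.0, 50.0, 52.0, 50.0, 55.0]

dln = log_changes(rwti)          # 13 values, starting in the second month
mom = pct_changes(rwti)          # 13 values, starting in the second month
yoy = pct_changes(rwti, lag=12)  # 2 values, starting in the thirteenth month

print(round(dln[0], 2), round(mom[0], 2), round(yoy[0], 2))
```

With these numbers, the first log change (about 4.88) is close to the first month-over-month change (5), while the first year-over-year change (25) is on a different scale because it is not annualized from monthly changes.
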
Finally, we can create a rolling standard deviation of log
changes in the oil price (rather than percentage changes). Here,
we do it for 12 months of data. The first value we can calculate
is January 1981, using values beginning in February 1980. The
second value will be February 1981, using values beginning in
March 1980. The formula will be STDEV.P(E3:E14),
STDEV.P(E4:E15), and so on. The first months’ cells will be
blank, since there aren’t enough data to calculate them without
earlier months. We now have all our variables.
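
The rolling calculation can also be sketched in Python. The function below takes the population standard deviation (the same definition as Excel's STDEV.P) over each trailing 12-month window; the function name and the sample log changes are assumptions made up for illustration.

```python
import statistics


def rolling_pstdev(x, window=12):
    """Population standard deviation over each trailing window of `window` values.
    The first result uses x[0] through x[window-1], the next shifts by one, and so on."""
    return [statistics.pstdev(x[t - window + 1 : t + 1])
            for t in range(window - 1, len(x))]


# Made-up monthly log changes for 14 consecutive months (illustration only)
dln = [4.9, -2.4, 4.8, 4.5, -2.2, 4.4, 4.3, -2.1, 4.2, 2.0, 3.9, -3.9, 9.5, 1.0]

sd12 = rolling_pstdev(dln, window=12)
print(len(sd12), [round(value, 2) for value in sd12])
```

Fourteen months of changes give three windows, which mirrors the blank cells at the top of the spreadsheet column: there is no value until a full window is available.
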
I named them RWTI, DLNRWTI, MOMRWTI, YOYRWTI,
and SD12DLN. These abbreviations include “R” for real, “D” for
difference, “LN” for natural log, and “MOM” for “month over
month.” SD12 signifies standard deviations over 12-month
windows. There are other ways to name variables, but these will
be consistent and clear.
My starting database is presented in E.3. Next, we can graph
and summarize important relationships among them.

E.3. A Database of Oil Prices and Related Variables.

4) GRAPHS AND TABLES


First, we will graph real WTI. We already did this as part of E.2,
which had nominal WTI as well. Notice that nominal WTI is a
solid black line, while the real value is both dark gray and
dashed. A graph of real WTI is presented in E.4. You can see
rising prices during the 2000s, with a spike, followed by a crash,
in 2008. When doing your analysis, make sure to look for any
important patterns, and try to explain them in terms of real
events.

E.4. Real WTI Oil Price (dollars per barrel), 1980-2016. Source: FRED.

We can see these patterns as well with our graph of log
changes in the real oil price, which shows a sharp drop in the
mid-1980s, a spike around the 1991 Gulf War, and some large
fluctuations after 2013. The source is again named in the footer.

E.5. Real WTI Oil Price (monthly log changes), 1980-2016. Source: FRED.

For these and all the Excel graphs here, I made some
important changes to the default settings. First, I made a
column of just the years (cutting off the months and days) using
“text to columns,” and used that for my date axis. I made the
axis text darker and larger, so that it can be shrunk on a page
and still be legible. I also made sure to set the interval between
dates to 60 (months), so that the values appear only every five
years. I added vertical and horizontal lines to the axes. I also
made the time-series line thinner, but that is a matter of
personal taste. I printed each figure to a 600dpi .jpg file, then
inserted it into my main document, and then rotated it and
cropped it to fit. If your goal is an academic report, you are
probably fine copying and pasting.
Your main goal is to be legible. There is no “correct” font or
line style for most academic documents, other than what works.
The rule of thumb is to look at your document in its final form
and see if you can actually read it. If it’s too small or the text is
too light, you might have to go back and adjust your figures. I
suggest right-clicking on all parts of an Excel graph (main chart
area, title, and horizontal and vertical axes) and looking at all
the options. Taking time to try different options helps you learn
how to do it, so it is time well spent if you plan on doing more
graphs in the future.
Our graph of rolling standard deviations is formatted much
the same way. Note that while the series doesn’t start until 1981,
I did not adjust the dates on the graph. This makes it easier to
compare across graphs, plus it keeps the listed dates as
multiples of 5 (rather than 1981, 1986, etc.).
We see important economic patterns as well. Here, the time
periods mentioned above (1980s, 1991, 2008, and after 2013)
show large volatility in the oil price. Your analysis would seek to
explain this.

E.6. Oil-Price Volatility (12-month moving standard deviations), 1980-2016.
Source: FRED.

Having all our charts, we will next make two tables: First,
we will present the correlations among our three measures of
price changes. Second, we will make a summary table for WTI,
RWTI, DLNRWTI, and SD12DLN.
We first calculate the correlations among log changes,
monthly percentage changes, and annual percentage changes.
We will calculate three separate correlation coefficients, because
there are three unique pairs among our three series.
Econometric software often calculates multiple pairs’
correlations more quickly, but it’s not hard to do it with the
CORREL() formula here. Just make sure you select each column
for each member of the pair separately. Applying some
formatting (horizontal lines, cell spacing, and changing the font
to Arial 10) and adding names gives us the following table:

           DLNRWTI   MOMRWTI   YOYRWTI
DLNRWTI    1         0.996     0.268
MOMRWTI              1         0.256
YOYRWTI                        1

E.7. Correlations among three alternative measures of oil-price changes.

There is no need to say “source: author’s calculations” or similar
wording. I also sometimes think that the “1”s are redundant,
since (by definition) a variable is perfectly correlated with itself.
But it is often left in; plus, it might help draw the reader’s eye.
The numbers show that monthly log and percentage changes are
indeed very similar, which we saw earlier in 3.3. We also saw
that the year-over-year changes appeared different from the
other measures. Here, we confirm this finding with the low
correlation coefficients.
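
As a cross-check on CORREL(), the three pairwise correlations can be sketched in Python. The function below is the ordinary Pearson correlation coefficient. The series values are made up for illustration, and the sketch assumes the three series have already been trimmed to the same dates (the year-over-year series starts a year later than the monthly ones).

```python
import math
from itertools import combinations


def correl(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up values, already aligned on the same five months (illustration only)
series = {
    "DLNRWTI": [4.9, -2.4, 4.8, 4.5, -2.2],
    "MOMRWTI": [5.0, -2.4, 4.9, 4.7, -2.2],
    "YOYRWTI": [25.0, 31.0, 18.0, 22.0, 27.0],
}

# Three series give three unique pairs
pairs = {(a, b): correl(series[a], series[b]) for a, b in combinations(series, 2)}
for (a, b), r in pairs.items():
    print(a, b, round(r, 3))
```
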
Next, we make our summary table. You can use the
AVERAGE(), STDEV.P(), MIN(), and MAX() formulas in Excel,
although econometric software can often do this with a single
command. My suggestion in Excel is to copy your formulas from
one variable’s column to the others. Just double-check your
formulas to make sure that you indeed are calculating the rows
and columns you want to calculate.
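
The four summary statistics can likewise be sketched with Python's standard library. The function name and the sample values are assumptions made up for illustration; the standard deviation is the population version, matching STDEV.P().

```python
import statistics


def summarize(x):
    """Return the mean, population standard deviation, minimum, and maximum of a list."""
    return {
        "Mean": statistics.mean(x),
        "Std. Dev.": statistics.pstdev(x),
        "Min": min(x),
        "Max": max(x),
    }


# Made-up values for one variable (illustration only)
rwti = [40.0, 42.0, 41.0, 43.0, 45.0, 44.0]

for name, value in summarize(rwti).items():
    print(f"{name:<10}{value:.2f}")  # rounded to two decimal places for display
```

Calling the same function on each variable in turn plays the role of copying the formulas from one column to the next.
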
You can then take your calculations and create a summary
table in Excel. Make sure you “paste special” and select “Values
and number formats” so that you don’t move your formulas. I
rounded here to two decimal places and applied the same
formatting as above. Often this table is presented for its own
sake and not really discussed. There’s often not much to say, so
don’t try to milk too much of a story out of it. It is necessary to
include, however, so your reader gets a sense of the overall data.

            WTI      RWTI     DLNRWTI   SD12DLN
Mean        41.14    27.64    -0.13     6.52
Std. Dev.   27.94    12.40    7.66      3.31
Min         11.31    9.21     -38.04    1.25
Max         133.93   66.80    37.36     17.07

E.8. Summary Statistics.

5) PUTTING IT ALL TOGETHER


Your final product should combine well-made charts and graphs
with your explanation and interpretation of the data. Steps 1-4
above have already done this in two ways. Mainly, I explained
the steps and the basic procedures for collecting, formatting, and
describing data. But I also explained some basic patterns in the
variables themselves. The format used here followed one that
you could use for a formal paper. Our last step is to show these
methods using a more advanced time-series technique.

A GARCH ANALYSIS OF OIL-PRICE VOLATILITY


Here, we conduct an ARIMA-GARCH analysis of real oil prices
(monthly log changes) using the software Eviews. This is similar
to the approach used to calculate yen volatility in Section 4, but
here we apply all the steps we have covered.

E.9. Autocorrelation and Partial Autocorrelation Functions.

First, following the standard Box-Jenkins procedure, we
establish the order of our ARIMA model using Autocorrelation
and Partial Autocorrelation Functions (ACFs and PACFs). I
printed them to .jpg files for E.9, because pasting them from
Eviews doesn’t reproduce them perfectly. I suggest going further
and cropping out everything but the bar graphs.
I tried different combinations of ARMA(p,q) from (1,0) to (2,2),
and settled on a simple AR(1) for my base model. Remember,
Eviews output looks like this (E.10):

E.10. Eviews output for an AR(1) estimation of log changes in real oil prices.

We can then format it so that it looks like E.11. I included an
ARMA(1,1) to show that an AR(1) is preferred.

         AR(1)               ARMA(1,1)
         Coeff.  (p-val.)    Coeff.  (p-val.)
C        -0.127  (0.797)     -0.126  (0.790)
AR(1)    0.288   (0.000)     0.142   (0.375)
MA(1)                        0.159   (0.319)
AIC      6.834               6.836

E.11. Estimation results, ARIMA estimation of log changes in real oil prices.

We can then estimate a GARCH(1,1) model. I use Bollerslev-
Wooldridge heteroskedasticity-consistent standard errors and
get what is in E.12:

Mean Equation
Variable    Coeff.   (p-value)
Constant    -0.393   (0.288)
AR(1)       0.216    (0.000)

Variance Equation
Variable    Coeff.   (p-value)
Constant    1.339    (0.038)
ARCH(1)     0.295    (0.003)
GARCH(1)    0.730    (0.000)

AIC         6.627

E.12. Estimation results, ARMA(1,0)-GARCH(1,1) estimation of log changes in
real oil prices.

The resulting variance series looks very similar to the moving
standard-deviation series, with periods of high volatility at
similar times. The correlation coefficient between the two is
0.833. (Note that you don’t need a table for a single number; I
just put it in the text.) But you can make the same statements
and draw similar conclusions regardless of the methods that you
use.

E.13. GARCH volatility series for log changes in real WTI.
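
To make the variance equation in E.12 more concrete, a GARCH(1,1) variance series can be built recursively: each period's variance is a constant, plus the ARCH coefficient times last period's squared residual, plus the GARCH coefficient times last period's variance. The Python sketch below shows only this recursion, using the coefficients reported in E.12. The residuals are made up, and starting the recursion at the sample variance of the residuals is an assumption (econometric software may initialize it differently), so this is an illustration rather than a reproduction of the Eviews results.

```python
import statistics


def garch_variance(residuals, omega, alpha, beta, h0):
    """Conditional variances from a GARCH(1,1) recursion:
    h[t] = omega + alpha * residuals[t-1]**2 + beta * h[t-1], starting from h[0] = h0.
    Returns one variance per residual."""
    h = [h0]
    for e in residuals[:-1]:
        h.append(omega + alpha * e ** 2 + beta * h[-1])
    return h


# Made-up residuals from a mean equation (illustration only)
resid = [1.5, -7.0, 12.0, -3.0, 0.5, 2.0]

# Coefficients as reported in E.12; the starting variance is an assumption
h = garch_variance(resid, omega=1.339, alpha=0.295, beta=0.730,
                   h0=statistics.pvariance(resid))
print([round(value, 2) for value in h])
```

A large residual in one month raises the variance in the following month, which is why the fitted series spikes at the same times as the moving standard deviation.
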

MOVING ON TO YOUR OWN PROJECTS


Now that you’ve seen examples of how to conceptualize, prepare,
and present a macroeconomic data report, hopefully you can
apply these concepts to your own projects. Regardless of what
type of analysis you are conducting, or your skill level in
econometrics, following the five steps listed above can help you
produce a well-written report. You may be strong in one or more
of these areas; if so, you can focus on the others.
Also remember that these are just hints and examples. There
is no “right” way to make a chart. There are a lot of “wrong”
ways, though—you just have to avoid them. Spend time with
Excel or other software; this will pay off in future projects or on
the job. Develop a “feel” for what looks good, and practice ways
to incorporate this into your report. And be sure to write well—
this is a skill that is valued more and more nowadays. Take the
time to check every detail. Sometimes, this means starting over
on some part of your project. But while that’s definitely a pain, it
is far better than the alternative of a rush job or inaccuracies in
your final product. And most of all, the presentation is
important—do not underestimate it. But your main objective is
the data analysis. Ask a good question, and use the data to find
answers. Your audience, whoever it may be, will be well-served.

APPENDIX
R code for recreating tables and/or graphs for the Japanese yen,
Madison employment, and the WTI exercise is available at
www.scotthegerty.com or bit.ly/2SlGXU0.

BOOKS AND SOFTWARE RESOURCES


Time-Series Econometrics
Walter Enders (2015), Applied Econometric Time Series, 4th
edition, John Wiley & Sons, Inc.
Ruey S. Tsay (2010), Analysis of Financial Time Series, 3rd
edition, John Wiley & Sons, Inc.

Graphics Principles and Data Visualization


Kieran Healy (2018), Data Visualization, Princeton University
Press.
Cole Nussbaumer Knaflic (2015), Storytelling With Data: A Data
Visualization Guide for Business Professionals, John
Wiley & Sons, Inc.
Edward R. Tufte (2001), The Visual Display of Quantitative
Information, 2nd edition, Graphics Press.

Technical Writing and General Document Design


Gerald J. Alred, Charles T. Brusaw, and Walter E. Oliu (2015),
Handbook of Technical Writing, 11th edition, Bedford/St.
Martin’s.
Miles A. Kimball and Ann R. Hawkins (2008), Document Design:
A Guide for Technical Communicators, Bedford/St.
Martin’s.

Software
www.r-project.org www.Eviews.com
gretl.sourceforge.net www.stata.com

Appendix: Data Sources and Variables

The following are some useful sites for macroeconomic data:

U.S.: BLS, BEA, FRED


Foreign: Eurostat, Central Banks (e.g. Bank of Mexico)
➔ Google first, then find “Data” or “Statistics”
➔ Choose variables, frequency, then Download (.xls or .csv)
International: IFS, World Bank, Penn World Table

Variable  Freq.     Unit      Variations                                             Transformations
Y         y,q       $         GDP, IP (Monthly), GNP, NNP (etc.)                     Real, Business cycle*, Growth
C         y,q       $                                                                Real, % of GDP, Growth
S         y,q       $                                                                Real, % of GDP, Growth
I         y,q       $         Gross Fixed Capital Formation, Δ Inventories           Real, % of GDP, Growth
G         y,q       $                                                                Real, % of GDP, Growth
P         y,q,m     Index     CPI, PPI, PCE                                          % Changes, ratio of two
N         y,q,m     #         Sectoral or regional employment                        % Changes
u         y,q,m     %         U3, U6
r         y,q,m,d   %         Fed funds, discount, lending, savings,                 Real, Differential
                              money mkt, gov't/corp. bond (etc.)
PS        y,q,m,d   Index     DJIA, S&P, Nikkei, other U.S. or foreign indices       Log (%) changes (growth rate)
PC        y,q,m,d   Index, $  WTI (Oil), Copper, etc., priced in dollars or indices  Log (%) changes
Ms        y,q,m     $         MB, M1, M2                                             Real, % of GDP
RES       y,q,m     $         Including or excluding gold                            Real, % of M, % of MB
E         y,q,m,d   #/$       Nominal, Real, Bilateral, Effective (Index)            Log (%) ch., Real, Cross rates
X         y,q       Index, $                                                         Real, % of GDP, Growth
M         y,q       Index, $                                                         Real, % of GDP, Growth
Px        y,q       Index, $  fob, cif; priced in dollars or indices                 Px/Pm ratio
Pm        y,q       Index, $  fob, cif; priced in dollars or indices                 Px/Pm ratio
CA        y,q       $                                                                Real, % of GDP
KA        y,q       $         KA also called “FA” or “KFA”                           Real, % of GDP
FDI       y,q       $         Inward (liabilities), outward (assets)                 Real, % of GDP
PORT      y,q       $         Inward (liabilities), outward (assets)                 Real, % of GDP
Notes:
Variables listed in categories: Real (as in “the real economy”), Financial, and International.
The transformations list is not exhaustive. Logs and growth rates can be used on basically anything.
Freq. = Frequency at which data are reported. Yearly, quarterly, monthly, daily.
$ = Currency units. Could be U.S. or foreign. Make sure you know which.
# = number (of workers or currency units.) #/$ is currency units per dollar.
Index = 100 in base year. Effective exchange rates are in indices.
* = Requires more advanced procedures to generate.
