Many Stata users looking to present statistics opt to output results into MS Excel via commands like -tabout-, -putexcel-, and -outreg-. While Excel offers an intuitive and comprehensive way to create summary tables, its tables lack the polish of those commonly found in published journal articles. Journal-quality tables can be created using Latex, and fortunately, many Stata packages have Latex functionality. This article is a tutorial on how to use Stata’s -tabout- command to create publishable and client-ready tables in Latex. It assumes the reader has a good grasp of Stata and -tabout-, but no knowledge of Latex. The tutorial is not intended to teach general Latex, but only enough Latex to take advantage of Stata’s output commands.
What is Latex?
Latex is a typesetting program that uses the Tex engine to create formatted documents. This may sound confusing, but essentially Latex is a program in which useful macros from the Tex “language” have been pre-programmed for you. The underlying philosophy of Latex is that good documents don’t just look nice, but also read intuitively. This includes how spaced out lines are, the differences in font size between headers and sub-headers, and the optimal size of a table. While these format options can certainly be specified in programs like MS Word and MS Excel, the advantage of Latex is that the design of your document has been more-or-less formatted for you – you only need to provide the content. The disadvantage, of course, is that most of what you see is what you get. There are ways to change the pre-defined formats, but doing so would likely be too tedious to justify the effort. If you are looking to quickly produce professional outputs with little time spent on formatting, then Latex is your best bet.
Installing Latex
Many of the Stata users I’ve encountered who do not take advantage of Latex have been intimidated by the installation process. In my experience, many beginner programming guides assume too much background knowledge from new users. An important thing many Latex guides skip is the installation process, especially for an Integrated Development Environment (IDE).
An IDE is the user-end program that you will actually interact with when you code. For Stata, there is only one IDE: the official one. For R, there are two: vanilla R and RStudio. And for Latex, like many other programming languages, there are as many as 20 IDEs. The IDE I chose – for no particular reason – is TeXStudio.
To install TeXStudio, you first need to install what is called a Latex distribution. This is basically a combination of the Latex program and commonly used packages. If you are a Mac user, then MacTeX is the standard distribution. Windows has more options, but I use MiKTeX, which can be downloaded from the MiKTeX website. After installing MiKTeX, you can download and install TeXStudio. Note that MiKTeX comes with an IDE, called TeXworks, but I prefer using TeXStudio.
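Once the distribution and the IDE are installed, a quick way to check that everything works is to compile a very small document. The following is only a minimal sketch of such a test file; the file name (test.tex) and the text inside it are arbitrary placeholders, not something the rest of this tutorial depends on.

```latex
% test.tex: a minimal document for checking the installation.
% The file name and the wording are placeholders.
\documentclass{article}
\begin{document}
\section{Test}
If this sentence shows up in a PDF, Latex is installed correctly.
\end{document}
```

Paste it into a new file in TeXStudio, save it, and compile it. If a PDF appears, the installation is working. Note that the heading is numbered and sized without any formatting instructions, which is the “design has been done for you” idea described earlier.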
Setting Up Latex for -tabout-
Although Stata’s -tabout- produces Latex outputs (.tex files), it does not do all the work. Stata only produces the Latex code needed for the table itself, but none of the formatting and preparation code around it. If you are familiar with HTML and CSS, you can think of the surrounding code as the “head” and “tail.” The top code sets macro formatting and relevant packages, while the bottom code tells Latex to end the document. The top and bottom code are relatively simple:
Top
\documentclass{report}
\usepackage{booktabs}
\usepackage{tabularx}
\begin{document}
\begin{center}
\footnotesize
\newcolumntype{Y}{>{\raggedleft\arraybackslash}X}
\begin{tabularx} {#} {@{} l Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y@{}} \\
\toprule
Bottom
\bottomrule
\addlinespace[.75ex]
\end{tabularx}
\par
\scriptsize{\emph{Source: }#}
\normalsize
\end{center}
\end{document}
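Copy and paste the above code into a plain-text editor such as Notepad. Save the top code as a .tex file called “top” and the bottom code as a .tex file called “bot”. Put both files in Stata’s working directory so that the file names used in the next section resolve. The # in each file is a placeholder: -tabout- replaces it with the values you pass to the -topstr()- and -botstr()- options.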
Making Tables in Stata
With Latex set up, you are ready to make tables. If you have only used -tabout- to produce Excel spreadsheets, then you are likely unfamiliar with the -style()- options. These options have some application for the .xls and .csv outputs you are familiar with, but they are very limited. Most of the options are intended for Latex outputs. The following is a sample table and an explanation of the code:
sysuse nlsw88, clear
tabout race union using table1.tex, ///
cells(freq col) format(1) clab(Freq Col_%) ///
replace ///
style(tex) bt font(bold) cl1(2-7) cl2(2-3 4-5 6-7) ///
topf(top.tex) botf(bot.tex) topstr(14cm) botstr(nlsw88.dta)
Everything in the code before -style()- should be familiar. The remaining options work as follows:
- style(tex) – tells Stata that the style applies to a Latex document
- bt – sets the “booktabs” option, which uses the Latex booktabs package to create a more pleasing table
- cl1(2-7) – defines the first sub-line. This is the line right below “union worker.” The values tell Stata to create a continuous line from columns 2 to 7.
- cl2(2-3 4-5 6-7) – defines the second sub-line. This is the line below the column headers. The values tell Stata to break the line between columns 3 and 4 and between columns 5 and 6.
- topf(top.tex) – calls the top code that you created earlier. Pass the path of the top file into it
- botf(bot.tex) – calls the bottom code that you created earlier. Pass the path of the bottom file into it
- topstr(14cm) – sets the width of the table. The value replaces the # in the \begin{tabularx} line of the top file
- botstr(nlsw88.dta) – names the source of the data, which is listed at the bottom of the table
If you want to create this table yourself, copy and paste the code above into a do file and run it. Stata will create a Latex file called “table1.tex”. Open the file in TeXStudio, then compile it. This will create a PDF containing the table.
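To see how the pieces fit together, it can help to look at the overall shape of the file that results when the top code, the table body, and the bottom code are joined. The following is a hand-written sketch, not actual -tabout- output: the row and column labels and every number in it are placeholders, the column specification is shortened to match the five columns shown, and the exact rule commands that -tabout- writes may differ. It assumes a width of 14cm and a source of nlsw88.dta were substituted for the two # placeholders.

```latex
\documentclass{report}
\usepackage{booktabs}
\usepackage{tabularx}
\begin{document}
\begin{center}
\footnotesize
\newcolumntype{Y}{>{\raggedleft\arraybackslash}X}
% The "#" from the top file has been replaced by the table width.
\begin{tabularx}{14cm}{@{} l Y Y Y Y @{}}
\toprule
% ---- placeholder table body (hand-written, not tabout output) ----
 & \multicolumn{4}{c}{Column variable} \\
\cmidrule{2-5}                        % continuous sub-line, as with cl1()
 & \multicolumn{2}{c}{Group A} & \multicolumn{2}{c}{Group B} \\
\cmidrule(r){2-3} \cmidrule(l){4-5}   % broken sub-line, as with cl2()
Row variable & Freq & Col \% & Freq & Col \% \\
\midrule
Category 1 & 10 & 50.0 & 20 & 50.0 \\
Category 2 & 10 & 50.0 & 20 & 50.0 \\
% ---- end of placeholder table body ----
\bottomrule
\addlinespace[.75ex]
\end{tabularx}
\par
% The "#" from the bottom file has been replaced by the source name.
\scriptsize{\emph{Source: }nlsw88.dta}
\normalsize
\end{center}
\end{document}
```

Reading the file this way makes it easier to debug a table that will not compile: everything above the first placeholder comment comes from the top file, everything below the last one comes from the bottom file, and only the middle is generated from your data.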
All the other usual -tabout- options will work in a Latex output as well. See the help file for more. One caveat, however, is that -tabout- doesn’t ensure your table will fit on the page. For instance, if you have a table with 20 columns, part of the table might be pushed off the page. You could try switching from a wide to a tall table using -layout()-, but other than that, what you see is what you get. Part of the reason for this is that Latex was designed for readability: if your table doesn’t fit on the page with the default formats, then you probably shouldn’t feel comfortable publishing such a large table. However, there are ways around this constraint with more advanced Latex coding.
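One such workaround, sketched here under a few assumptions, is to give the table more room by turning the page sideways and shrinking the margins with the standard geometry package. The file name (top_wide.tex) and the 1.5cm margin are illustrative choices rather than anything -tabout- requires. The file is an alternative top file, so like the original it is meant to be paired with the unchanged bottom file and is not compiled on its own.

```latex
% top_wide.tex: a hypothetical variant of the top file for wide tables.
% Only the geometry line is new; the rest matches the original top code.
\documentclass{report}
\usepackage[landscape, margin=1.5cm]{geometry} % sideways page, narrow margins (assumed values)
\usepackage{booktabs}
\usepackage{tabularx}
\begin{document}
\begin{center}
\footnotesize
\newcolumntype{Y}{>{\raggedleft\arraybackslash}X}
\begin{tabularx} {#} {@{} l Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y@{}} \\
\toprule
```

To use it, point -topf()- at the new file and pass a larger width to -topstr()-, for example 24cm, which should fit on a landscape A4 or letter page with these margins. If the table is still too wide, this approach has reached its limit and a tall layout or a smaller table is probably the better choice.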