3 Data Management
Throughout this chapter, we’ll be using the “auto” data set which is distributed with Stata. It can be loaded via
sysuse auto
. data) (1978 automobile
You can reload it as necessary (if you modify it and want the original) by re-running this with the clear
option. Feel free to make frequent use of preserve
and restore
.
Whenever you first begin working with a data set, you’ll want to spend some time getting familiar with it. You should be able to answer the following questions (even if some of them are approximations):
- How many observations does your data have?
- How many variables are in your data set?
- What is the unit of analysis? (What does each row represent - a person? a couple? a classroom?)
- Is there a variable which uniquely identifies each unit of analysis?
- If the data is repeated measures in some form (multiple rows per person, or data is students across several classrooms), what variable(s) identify the levels and groups?
- Are there any variables which are strings (non-numeric) that you plan on using in some statistical analysis (and will need to be converted to numeric)?
- Which variables are continuous (can take on any value in some reasonable range, such as weight) vs which are categorical (take on a set number of values where each represents something).
You’ll notice that there are no statistical questions here - we’re not even worried about those yet! These are merely logistical.
In this chapter we’ll go over various tools and commands you can use to answer these questions (and more) and overall to gain a familiarity with the logistics of your data.
3.1 Referring to variables
When we discussed basic command syntax, we said that the optional list of variables can include any number of variables (for some commands). Writing out all the variables can get very tedious as the number of variables increases. Thankfully there are two alternatives.
First, we can use the wild card *
1. For example, we could refer to the variables “turn” and “trunk” as t*
, as both variables start with “t” and are followed by anything. However, be careful, as this would also match variables such as turnips
, tprice
, t
, etc, if any such variables existed. It can also be used in the middle or beginning, e.g.:
c*t
would matchcat
,caught
andct
*2
would matchage2
,gender2
andsalary2012
.
Multiple *
can also be used, e.g.:
*age*
would matchbirthage
,age_last
orlatest_age_given
. (It would also matchwages
, so be careful with this!)
Alternatively, if the variables we want to include are next to each other in the data (e.g. in the Variables pane), we can refer to a list of them. Say the variables x1
through x25
are in ordered in the logical fashion. We could refer to the whole list of them as x1-x25
. This includes both x1
and x25
, as well as any variable in between them. We will discuss the order
command later to re-order variables.
Finally, you often don’t need to give the entire name of the variable, just enough characters for Stata to be able to uniquely identify it (similar to short names). We’ll see in a minute more about the describe
command, but for example,
describe headr
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------float %6.1f Headroom (in.) headroom
Stata will error if you don’t use enough characters:
describe t
.
t ambiguous abbreviationr(111);
Be very careful with this approach. I only recommend it’s use when exploring the data using the Command window; when writing a Do-file, use the full variable name to prevent errors!
3.2 Describing the data
The first command you should run is describe
.
describe
.
data from /Applications/Stata/ado/base/a/auto.dta
Contains data
Observations: 74 1978 automobile
Variables: 12 13 Apr 2022 17:45_dta has notes)
(
-------------------------------------------------------------------------------
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------model
make str18 %-18s Make and int %8.0gc Price
price int %8.0g Mileage (mpg)
mpg int %8.0g Repair record 1978
rep78 float %6.1f Headroom (in.)
headroom int %8.0g Trunk space (cu. ft.)
trunk weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
int %8.0g Turn circle (ft.)
turn int %8.0g Displacement (cu. in.)
displacement float %6.2f Gear ratio
gear_ratio byte %8.0g origin Car origin
foreign
-------------------------------------------------------------------------------by: foreign Sorted
This displays a large amount of information, so let’s break in down.
First, the header displays general data set information - the number of observations (obs
, the number of rows) and variables (vars
).
Next, there is a table listing each variable in the data and some information about them. The “storage type” can be one of byte
, int
, long
, float
, double
; all of which are simply numbers. We’ll touch on the differences between these when we discuss compress
, but for now they all represent numeric variables. String variables are represented as str##
where the ##
represent the number of characters that the string can be, e.g. str18
shows that make
has a maximum of 18 letters. (This limit is irrelevant, again, see compress
for details.)
The “display format” column contains format information about each variable which only control how the variables are displayed in data view. For the most part you can ignore these and leave them at the default, though you may need to work with this if you have date or time information. For further details see
help formats
“value label” and “variable label” are used to display more information when running analyses using these variables. See the label section for further details.
Finally, if the data is sorted, describe
will provide information about the sorting.
describe
can also be used on a per variable basis:
describe mpg
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------int %8.0g Mileage (mpg)
mpg
describe trunk displacement
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------int %8.0g Trunk space (cu. ft.)
trunk int %8.0g Displacement (cu. in.) displacement
If you have a very large number of variables you may wish to suppress the table of variables entirely:
describe, short
.
data from /Applications/Stata/ado/base/a/auto.dta
Contains data
Observations: 74 1978 automobile
Variables: 12 13 Apr 2022 17:45by: foreign Sorted
Alternatively, the simple
option returns only the names of the variables, in column-dominant order (meaning you read down the columns not across the rows).
describe, simple
. weight displacement
make rep78 length gear_ratio
price headroom mpg trunk turn foreign
3.3 Compressing data
As mentioned above, there are different ways to store a number variable, such as byte
and long
. The various options take more space to save - types which take less space can store only smaller numbers whereas types that take more space can store larger numbers. For example, a number stored as a byte can only take on values between -127 and 100 and only integers (e.g. not 2.5) whereas a number stored as float can store numbers up to \(1.7x10^{38}\) with up to 38 decimal places. Strings operate similarly; a string variable with 20 characters would store “abc” as 17 blank characters followed by the “abc”.
Understanding the above is not that important these days as computer power and storage has increased to the point where the majority of us will not be reaching its limits. However, Stata does offer the compress
command which attempts to store variables in the smallest possible type. For example, if a variable which is a float takes on only values 1 through 10, it is replaced by a byte (and similarly, strings are as long as the longest value).
The memory
command lets us see the size of our data, particularlly the first entry of “Data” under “Used” shows that we start with 3,182 bytes or roughly 3Kb.
memory
.
Memory usage
Used Allocated
----------------------------------------------------------------------
Data 3,182 67,108,864
strLs 0 0
----------------------------------------------------------------------
Data & strLs 3,182 67,108,864
----------------------------------------------------------------------
Data & strLs 3,182 67,108,864
Variable names, %fmts, ... 4,370 71,230
Overhead 1,081,344 1,082,136
Stata matrices 0 0
ado-files 22,919 22,919
Stored results 0 0
Mata matrices 1,904 1,904
Mata functions 1,632 1,632
set maxvar usage 4,636,521 4,636,521
Other 14,828 14,828
----------------------------------------------------------------------
Total 5,752,072 72,940,034
compress
. variable mpg was int now byte
variable rep78 was int now byte
variable trunk was int now byte
variable turn was int now byte
variable make was str18 now str17
(370 bytes saved)
memory
.
Memory usage
Used Allocated
----------------------------------------------------------------------
Data 2,812 67,108,864
strLs 0 0
----------------------------------------------------------------------
Data & strLs 2,812 67,108,864
----------------------------------------------------------------------
Data & strLs 2,812 67,108,864
Variable names, %fmts, ... 4,370 71,230
Overhead 1,081,344 1,082,136
Stata matrices 0 0
ado-files 22,919 22,919
Stored results 0 0
Mata matrices 1,904 1,904
Mata functions 1,632 1,632
set maxvar usage 4,636,521 4,636,521
Other 14,828 14,828
---------------------------------------------------------------------- Total 5,751,702 72,940,034
We see here a very modest saving (370 bytes, about 12%), but sometimes you can see much more significant gains.
(When running Stata, instead of using memory
, you can look at the “Size” entry in the properties pane.)
Don’t be afraid of artificially restricting yourself going forward; if one of your values exceeds the limitations its type supports, Stata will automatically change types. So don’t hesitate to run compress
when loading new data or after some manipulations.
3.4 Exercise 2
- “census9” is accesible via
webuse
. Load it. - Spend a minute looking at the data. What does this data seem to represent? What variables do we have? (
describe
will come in handy here!) - Are there any missing states?
- What variables (if any) are numeric and what variables (if any) are strings?
- Compress the data. How much space is saved? Why do you think this is?
3.5 Labels
A raw data set is very sparse on context. In addition to the data itself, it will have at most a variable name, which in Stata cannot include spaces and is limited to 32 characters. All other context associated with the data must either be documented in a data dictionary or exist in the recollection of the analyst.
In an Excel file, to get around this, you might add additional content to the sheet outside of the raw data - a note here, a subtitle there, etc. However, Stata does not allow such arbitrary storage. In contrast, Stata allows you to directly label parts of the data with context information which will be displayed in the appropriate Results, to make Stata output much easier to read as well as removing the need for an external data dictionary.
3.5.1 Labeling variables
Variables names, as mentioned, are limited to 32 characters and do not allow spaces (or several other special characters). This is to encourage you to choose short, simple, and memorable variable names, since you’ll likely be typing them a lot!
We can easily add more information with a variable label. If you look at the describe
output, you’ll notice that the auto data set already has variable labels applied to it.
describe
.
data from /Applications/Stata/ado/base/a/auto.dta
Contains data
Observations: 74 1978 automobile
Variables: 12 13 Apr 2022 17:45_dta has notes)
(
-------------------------------------------------------------------------------
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------model
make str17 %-17s Make and int %8.0gc Price
price byte %8.0g Mileage (mpg)
mpg byte %8.0g Repair record 1978
rep78 float %6.1f Headroom (in.)
headroom byte %8.0g Trunk space (cu. ft.)
trunk weight int %8.0gc Weight (lbs.)
length int %8.0g Length (in.)
byte %8.0g Turn circle (ft.)
turn int %8.0g Displacement (cu. in.)
displacement float %6.2f Gear ratio
gear_ratio byte %8.0g origin Car origin
foreign
-------------------------------------------------------------------------------by: foreign
Sorted last saved. Note: Dataset has changed since
We can see variable rep78
(an utterly incomprehensible name at first glance, as opposed to mpg
) has the label “Repair Record 1978”. You can apply your own variable labels (or overwrite existing by using the command:
label variable <variable name> "Variable label"
For example:
label variable turn "Some new label for turn"
.
describe turn
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------byte %8.0g Some new label for turn turn
To remove a variable label, you can call label variable <varname>
without a new label to remove the existing one. (Equivalent to label variable <varname> ""
, so passing an empty variable label.)
label variable turn
.
describe turn
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------byte %8.0g turn
3.5.2 Labeling values
It is tempting to store categorical variables as strings. If you ask, for example, for someone’s state of residence, you might store the answers as “MI”, “OH”, “FL”, etc. However, Stata (like most statistical software) cannot handle string variables.2 A much better way to store this data would be to assign each state a numerical value, say MI = 1, OH = 2, FL = 3, etc, then keep a data dictionary linking the values to the labels they represent.
Stata allows you to store this value labels information within the data set, such that whenever the values are output, the labels are printed instead. Let’s take a look at the foreign
variable. This variable takes on two levels, and we can tabulate it to see how many cars are in each category.
tabulate foreign
.
Car origin | Freq. Percent Cum.
------------+-----------------------------------
Domestic | 52 70.27 70.27
Foreign | 22 29.73 100.00
------------+----------------------------------- Total | 74 100.00
Here it appears that foreign
is stored as two strings, but we know from describe
that it is not:
describe foreign
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------byte %8.0g origin Car origin foreign
Additionally, if you look at the data through Data Editor or Data Browser, you see that instead of foreign
being red (as a string) it is blue, as we discussed earlier.
Let’s look at that table ignoring the value labels:
tabulate foreign, nolabel
.
Car origin | Freq. Percent Cum.
------------+-----------------------------------
0 | 52 70.27 70.27
1 | 22 29.73 100.00
------------+----------------------------------- Total | 74 100.00
Now we see the values which are actually stored.
Look at the describe
output:
describe foreign
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------byte %8.0g origin Car origin foreign
You’ll notice that the “value label” column has origin
attached to foreign
. In Stata, the value labels information are stored separately from the variables. They are two separate components - there is a variable and there is a value label. You connect the value label to a variable to use it.
The benefit of this separated structure is that if you have several variables which have the same coding scheme (e.g. a series of Likert scale questions, where 1-5 represents “Strongly Disagree”-“Strongly Agree”), you can create a single value label and rapidly apply it to all variables necessary.
Correspondingly, this process requires two commands. First we define the value labels:
label define <label name> <value> "<label>" <value> "<label>" .....
For example, if we wanted to recreate the value label associated with foreign
:
label define foreign_label 0 "Domestic" 1 "Foreign" .
Value labels exist in the data set, regardless of whether they are attached to any variables, we can see all value labels:
label list
.
foreign_label:
0 Domestic
1 Foreign
origin:
0 Domestic 1 Foreign
Here we see the original origin
as well as our new foreign_label
. To attach it to the variable foreign
:
label values foreign foreign_label .
If we wanted to attach a single value label to multiple variables, we could simply provide a list of variables:
label values <var1> <var2> <var3> <label>
To remove the value labels from a variable, use the label values <variable>
command with no label name following the variable name:
label values foreign
.
describe foreign
.
Variable Storage Display Valuename type format label Variable label
-------------------------------------------------------------------------------byte %8.0g Car origin foreign
You can view all value labels in the data set:
label list
.
foreign_label:
0 Domestic
1 Foreign
origin:
0 Domestic 1 Foreign
Note that value labels exist within a data set regardless of whether they are attached to a variable. If there is a label value that you no longer want to keep in the data-set, you can drop it:
label drop foreign_label
.
label list
.
origin:
0 Domestic 1 Foreign
This will not remove the value labels from any variables, but they will no longer be active (i.e. if you run describe
it will still show that the value labels are attached, but running tabulate
will not use them). So in order to completely remove a value label, you’ll need to both remove it from the variable as well as the data.
Do not forget that modifying value labels counts as modifying the data. Make sure you save, replace
after making these modifications (if you want to keep them) or they’ll be lost!
3.6 Managing variables
In Stata, managing the names and order of variables is important to make entering commands easier due to the shortcuts for referring to variables. Recall that variables can be referred to using wildcards (e.g. a*
to include age
, address
or a10
, or using var1-var10
to include all variables between var1
and var10
as they are ordered). Of course, you may also want to rename or re-order variables for other reasons such as making the data easier to look at.
3.6.1 Renaming variables
To rename variables:
rename <oldname> <newname>
For example:
rename rep78 repair_record_1978
.
describe, simple
. weight displacement
make repair_~1978 length gear_ratio
price headroom mpg trunk turn foreign
The output truncated the name because it was so long.
Variable names are unique; if you wanted to swap to variable names, you’d have to name one to a temporary name, rename the second, then rename the first again:
rename a tmp
rename b a
rename tmp b
You can use wildcards in the renaming too. For example, imagine you had a collection of variables from a longitudinal data set, “visit1_age”, “visit1_weight”, “visit1_height”, etc. To simplify variable names, we’d prefer to use “age1”, “weight1”, etc.
rename visit1_* *1
Finally, you can change a variable name to upper/lower/proper case easily by passing upper
/lower
/proper
as an argument and giving no new variable name.
rename length, upper
.
describe, simple
. weight displacement
make repair_~1978
price headroom LENGTH gear_ratio mpg trunk turn foreign
You can also do this for the whole data set with the special variable list _all
:
rename _all, upper
.
describe, simple
.
MAKE REPAIR_~1978 WEIGHT DISPLACEMENT
PRICE HEADROOM LENGTH GEAR_RATIO
MPG TRUNK TURN FOREIGN
rename _all, lower .
3.6.2 Changing variable ordering
The order
command takes a list of variables and moves them to the front/end/before a certain variable/after a certain variable. The options first
, last
, before(<variable>)
and after(<variable>)
control this.
order foreign // The default is `first`
.
describe, simple
.
foreign mpg trunk turnweight displacement
make repair_~1978 length gear_ratio
price headroom
order weight, last
.
describe, simple
.
foreign mpg trunk displacementlength gear_ratio
make repair_~1978 weight
price headroom turn
order mpg trunk, before(displacement)
.
describe, simple
.
foreign repair_~1978 turn displacement
make headroom mpg gear_ratiolength trunk weight price
3.7 Exercise 3
If you’ve changed to a different data set, load “census9” back up with webuse
.
- Going forward, we’ll be using a version of “census9” with changes we’re making. Save a copy of the data to somewhere convenient (such as your Desktop). Don’t forget to give it a name!
- The
drate
variable is a bit vague - the name of the variable provides no clue that “d” = “death”, and the values (e.g. 75) are ambiguous. - The variable
region
has a value label associated with it (“cenreg”). It has some inconsistent choices, namely “NE” and “N Cntrl”. Fix this.- Create a new value label which uses “Northeast” and “North Central” instead of “NE” and “N Cntrl”.
- Attach this new value label to
region
. - Remove the existing value label “cenreg”.
- Use
label list
andtabulate
to confirm it worked.
- Save the data, replacing the version you created in step 1.
3.8 Summarizing the data
While these notes will not cover anything statistical, it can be handy from a data management perspective to look at some summary statistics, mostly to identify problematic variables. Among other things, we can try and detect
- Unexpectedly missing data
- Incorrectly coded data
- Errors/typos in input data
- Incorrect assumptions about variable values
There are numerous ways to look at summary statistics, from obtaining one-number summaries to visualizations, but we will focus on two Stata commands, summarize
and codebook
.
summarize
produces basic summary statistics for numeric variables.
summarize
.
dev. Min Max
Variable | Obs Mean Std.
-------------+---------------------------------------------------------
foreign | 74 .2972973 .4601885 0 1
make | 0
price | 74 6165.257 2949.496 3291 15906
repair_~1978 | 69 3.405797 .9899323 1 5
headroom | 74 2.993243 .8459948 1.5 5
-------------+---------------------------------------------------------length | 74 187.9324 22.26634 142 233
turn | 74 39.64865 4.399354 31 51
mpg | 74 21.2973 5.785503 12 41
trunk | 74 13.75676 4.277404 5 23
displacement | 74 197.2973 91.83722 79 425
-------------+---------------------------------------------------------
gear_ratio | 74 3.014865 .4562871 2.19 3.89weight | 74 3019.459 777.1936 1760 4840
The table reports the total number of non-missing values (make
appears to be entirely missing because it is non-numeric), the mean (the average value), the standard deviation (a measure of how spread out the data is) and the minimum and maximum non-missing3 values observed.
Here’s some suggestions of how to look at these values.
- Make sure the amount of missing data is expected. If the number of observations is lower than anticipated, is it an issue with the data collection? Or did the import into Stata cause issues? 5 cars have no
repair_record_1978
. - The mean should be a reasonable number, somewhere in the rough middle of the range of possible values for the variable. If you have age recorded for a random selection of adults and the mean age is 18, something has gone wrong. If the mean age is -18, something has gone tragically wrong!
- The standard deviation is hard to interpret precisely, but in a very rough sense, 2/3rds of the values should lie within 1 standard deviation of the mean. For example, consider
mpg
. The mean is ~21 and the standard deviation is ~6, so roughly 2/3rds of the cars havempg
between 15 and 27. Does this seems reasonable? If the standard deviation is very high compared to the mean (e.g. ifmpg
’s standard deviation was 50) or close to 0, that could indicate an issue. - Are the max and min reasonable? If the minimum
LENGTH
was -9, that’s problematic. Maybe -9 is the code for missing values? If the range ofLENGTH
is 14 to 2500, maybe the units differ? They measured in feet for some cars and inches for others?
summarize
can also take a list of variables, e.g.
summarize t*
.
dev. Min Max
Variable | Obs Mean Std.
-------------+---------------------------------------------------------
turn | 74 39.64865 4.399354 31 51 trunk | 74 13.75676 4.277404 5 23
For more detailed information, we can look at the codebook. codebook
works similarly to describe
and summarize
in the sense that without any additional arguments, it displays information on every variable; you can pass it a list of variables to only operate on those. Because codebook
’s output is quite long, we only demonstrate the restricted version. Before we do that, we reload the “auto” data because we messed with it quite a bit earlier!
First, categorical data:
sysuse auto, clear
. data)
(1978 automobile
codebook rep78
.
-------------------------------------------------------------------------------
rep78 Repair record 1978
-------------------------------------------------------------------------------
int)
Type: Numeric (
Range: [1,5] Units: 1values: 5 Missing .: 5/74
Unique
Tabulation: Freq. Value
2 1
8 2
30 3
18 4
11 5 5 .
We see that rep78
takes on five unique values, as well as having some missing values (.
). If the unique values is more than expected, it’s something to investigate.
Next, continuous variables:
codebook price
.
-------------------------------------------------------------------------------
price Price
-------------------------------------------------------------------------------
int)
Type: Numeric (
Range: [3291,15906] Units: 1values: 74 Missing .: 0/74
Unique
Mean: 6165.26dev.: 2949.5
Std.
Percentiles: 10% 25% 50% 75% 90% 3895 4195 5006.5 6342 11385
We still see the number of unique values reported, but here every observation has a unique value (74 unique values, 74 rows in the data). There is no missing data. The percentiles should be checked to see if they’re reasonable, if 90% of the cars had a price under $100, something’s not right.
Finally, string variables:
codebook make
.
-------------------------------------------------------------------------------model
make Make and
-------------------------------------------------------------------------------
Type: String (str18), but longest is str17
values: 74 Missing "": 0/74
Unique
"Cad. Deville"
Examples: "Dodge Magnum"
"Merc. XR-7"
"Pont. Catalina"
Warning: Variable has embedded blanks.
We get less information here, but still useful to check that the data is as expected. There are no empty strings nor any repeated strings. The warning about “embedded blanks” is spaces; it’s telling us that there are spaces in some of the cars (e.g. “Dodge Magnum”). The reason it is a warning is that “Dodge_Magnum” and “Dodge__Magnum” read the same to us, but that extra space means Stata recognizes these as two different variables.
Two options for codebook
which come in handy:
codebook, compact
.
Variable Obs Unique Mean Min Max Label
-------------------------------------------------------------------------------model
make 74 74 . . . Make and
price 74 74 6165.257 3291 15906 Price
mpg 74 21 21.2973 12 41 Mileage (mpg)
rep78 69 5 3.405797 1 5 Repair record 1978in.)
headroom 74 8 2.993243 1.5 5 Headroom (
trunk 74 18 13.75676 5 23 Trunk space (cu. ft.)weight 74 64 3019.459 1760 4840 Weight (lbs.)
length 74 47 187.9324 142 233 Length (in.)
circle (ft.)
turn 74 18 39.64865 31 51 Turn in.)
displacement 74 31 197.2973 79 425 Displacement (cu. ratio
gear_ratio 74 36 3.014865 2.19 3.89 Gear
foreign 74 2 .2972973 0 1 Car origin -------------------------------------------------------------------------------
compact
shows a reduced size version, most useful for the “Unique” column. (Useful if that’s the only thing you’re running the codebook for.)
codebook, problems
.
in dataset /Applications/Stata/ado/base/a/auto.dta
Potential problems
Potential problem Variables
--------------------------------------------------be compressed make
str# vars that may string vars with embedded blanks make
--------------------------------------------------
This reports potential issues Stata has discovered in the data. In this data, neither are really concerns. (We can run compress
, but this isn’t a “problem” so much as a suggestion. We already saw above the concern about “embedded blanks.”) More serious problems that it can detect include:
- Constant columns (all entries being the same value, including all missing).
- Issues with value labels (if you’ve attached a value label to a variable and subsequently deleted the value label without detaching it; or if your variable takes on values unaccounted for in the value label).
- Issues with date variables.
3.9 Exercise 4
Using summarize
and codebook
to explore the “census9” data (use the version you saved earlier!) and answer the following questions:
- Are there any values which seem to be errors?
- I’d expect each state to have their own unique value of death rate, population and median age. Is this true? If not, why?
- Are there any problems with the data?
This is the reason why
*
as a comment does not work in the middle of a line (and we use//
instead).↩︎In the few situations where it can, it doesn’t handle them cleanly.↩︎
As we discuss later, in Stata, missing values (represented by
.
in the data) are considered to be higher than any other number (so 99999 < .).↩︎