/******************************************************************************
OVERVIEW:
This file estimates quantile and average treatment effects using variants of
the changes-in-changes model proposed by Athey & Imbens (2006, Econometrica).
The model constructs a counterfactual CDF of outcomes for the treatment group
in the treatment period that would have occurred in the absence of treatment
and compares this to the observed distribution of outcomes for the treatment
group in the treatment period. The horizontal difference between observed and
counterfactual CDF at quantile Q equals the Qth quantile treatment effect on
the treated (QTET), under assumptions laid out in the Athey-Imbens paper.
This file estimates (1) the original changes-in-changes model without any
covariate adjustment, (2) a changes-in-changes model using residualization to
adjust for covariate differences between groups, and (3) a changes-in-changes
model using propensity score reweighting to adjust for covariate differences
between groups. The first two models are proposed in the original Athey-Imbens
article. The third model is developed in Garlick (2014), which is available at
www.robgarlick.com/garlick_peereffects_tracking.pdf. The third model allows
for time trends and treatment effects to vary over observed characteristics,
while the model using residualization does not.
This file generates graphs showing, for each model, the observed and counter-
factual distributions and the QTETs. It also saves the QTET in a dataset that
can be used to construct additional graphs and tables. It finally calculates
the average treatment effect on the treated (ATET) from the changes-in-changes
models (which may differ from the linear difference-in-differences model) and
treatment effects on the variance and Gini coefficient.
Created by Robert Garlick and updated in January 2016. Most recent version is
available at www.robgarlick.com/nonlinear_diffindiff.do. Comments are welcome
at robert.garlick@duke.edu.
INPUTS:
This file run on a dataset, which must contain the following variables:
1) An continuous variable for the outcome.
2) A categorial (not string) variable which takes four values: 11 for the
treatment group in the post-treatment period, 10 for the treatment group
in the pre- treatment period, 1 for the control group in the post-treatment
period, and 0 for the control group in the pre-treatment period (following
Athey & Imbens' notation).
Your dataset may also contain the following variables:
3) A set of covariates that may be continuous, discrete or categorical. This
may contain interactions, polynomial terms, etc.
4) A variable defining the cluster structure of the data.
The file can accommodate any names for these variables. You should use the
macros in lines 153-168 to declare the names of the variables in your dataset.
You should also declare the name of the dataset, the directory where it is
saved, and the directory where you want to save the file output.
CUSTOMIZATION:
This file can be customized on several dimensions. These are controlled by
macros declared in lines 185-212 of the file.
1) The file can run the unconditional, residualized, or reweighted models, or
any combination of these models.
2) The file can, if running the reweighted model, use weights that equate the
distribution of observed characteristics through time within each group or
equate the distribution of observed characteristics across all groups and
time periods. See Garlick (2014) for more discussion on this distinction.
3) The file can estimate quantile treatment effects on the treated at up to
199 quantiles (percentiles 0.5 to 99.5). The only restriction is that the
chosen quantiles must be evenly spaced. For example, (25,50,75) and (10,20,
...,80,90) work but (10,25,50,75,90) does not. Using a low number of
quantiles generates very inaccurate estimates of the average, variance, and
Gini coefficient treatment effects on the treated.
4) The file can use any number of bootstrap replications to construct the
confidence intervals. Using 1 replication will run the file on the original
data and not generate confidence intervals.
5) The file can choose the preferred output format for graphs.
OUTPUTS:
This file generates the following outputs:
1) Graphs showing the observed and counterfactual outcome CDFs, evaluated at
the user-chosen quantiles, in a user-chosen format. One graph is created
for each model implemented by the file. The graph filenames have the
structure ``cdf_''.
2) Graphs showing the quantile treatment effects on the treated and bootstrap
95% pointwise confidence intervals, evaluated at user-chosen quantiles, in
a user-chosen format. One graph is created for each model implemented by
the file. The graph filenames have the structure ``treateffects_''.
3) Datasets containing the quantile treatment effects on the treated and the
bounds of the bootstrap 95% pointwise bootstrap confidence intervals,
evaluated at user-chosen quantiles. The dataset filenames have the
structure ``treateffects__fordisplay.dta''.
4) Datasets containing the point estimates of the observed and counterfactual
outcome CDFs evaluated at all user-chosen quantiles, for each bootstrap
replication. The rows of these datasets index bootstrap replications and
the columns index quantiles. The dataset filenames have the structure
``_obsd_rawbreps.dta'' and ``_cafc_rawbreps.dta'' for the
observed and counterfactual CDFs respectively. These can be used for
further analysis or to construct additional graphs.
4) Datasets containing the point estimates of the quantile treatment effects
on the treated at all user-chosen quantiles, for each bootstrap
replication. The rows of these datasets index bootstrap replications and
the columns index quantiles. The dataset filenames have the structure
``_rawbreps.dta''. These can be used for further analysis or to
construct additional graphs.
5) A dataset showing the proportion of quantiles that violate the contained
support assumption for the changes-in-changes model for each bootstrap
replication. This dataset should give guidance to researchers about whether
the bootstrap confidence intervals might give incorrect coverage because
the contained support assumption does not hold in some subsamples. The
dataset is named diagnostic_missing_quantiles.dta and contains results for
all of the models estimated by the file.
Note that the log file also reports which quantiles violate the contained
support assumption in the original dataset.
6) A text file showing the average, variance, and Gini coefficient treatment
effects on the treated for all models and their bootstrap 95% confidence
intervals. Recall that the Gini coefficient is only appropriate for data
with non-negative support.
In all cases, is one of unconditional, residualized, or reweighted.
PROVISOS:
This file does not:
- Implement the analytical standard errors derived in the Athey-Imbens paper.
- Allow for discrete outcome variables.
- Allow for instrumental variables estimation of the changes-in-changes model.
******************************************************************************/
clear
macro drop _all
program drop _all
capture log close
capture file close myfile
set more off
set mem 1000M
set seed 83174914
/******************************************************************************
Define directory structure and declare variable names
******************************************************************************/
global output = ""
/** Define directory for saving output (log, graphs, datasets, reports). **/
global data = ""
/** Define directory for loading & saving data **/
global dataset = ""
/** Define name of dataset **/
global Y = ""
/** Specify outcome variable, should be continuous **/
global Yname = ""
/** Specify the outcome variable name that you want displayed on graphs **/
global G = ""
/** Specify variable defining groups, must have values 11, 10, 1, 0. **/
global X = ""
/** Specify elements in control vector **/
global C = ""
/** Specify variable defining clusters, if data has clustered structure. **/
log using "${output}nonlinear_diffindiff.log", text replace
file open myfile using "${output}QTEs.txt", write replace
/******************************************************************************
Customize file output:
- how many bootstrap replications
- which quantiles to estimate
- which estimators to run
- which reweighting method to use
******************************************************************************/
global breps = 1
/** Set number of iterations for the bootstrap, 1=use original data only **/
global pct_min = 0.5
/** Set the minimum quantile to be estimated **/
global pct_step = 0.5
/** Set the step between estimated quantiles **/
global pct_max = 99.5
/** Set the maximum quantile to the estimated **/
global pct_nmbr = 1 + ( $pct_max - $pct_min ) / $pct_step
global u = 1
global r = 1
global w = 1
/** These macros control which estimation procedure is implemented. Set $u =
1 to implement the original QTT procedure without controls. Set $r = 1 to
control for differences in observed characteristics using regression. Set
$w = 1 to control for cross-group differences in observed characteristics
using propensity score reweighting. **/
global weight = 1
/** This macro controls which propensity score reweighting method is used.
Set $weight = 1 to control for only intertemporal differences in observed
characteristics within each group. Set $weight = 2 to control for inter-
temporal and cross-group differences in observed characteristics. **/
global format = "eps"
/** Define format in which to save graphs. **/
/******************************************************************************
Update program parameters based on customization.
******************************************************************************/
if $u==1 & $r==1 & $w==1 {
global estimators = "u r w"
}
if $u==1 & $r==1 & $w==0 {
global estimators = "u r"
}
if $u==1 & $r==0 & $w==1 {
global estimators = "u w"
}
if $u==0 & $r==1 & $w==1 {
global estimators = "r w"
}
if $u==1 & $r==0 & $w==0 {
global estimators = "u"
}
if $u==0 & $r==1 & $w==0 {
global estimators = "r"
}
if $u==0 & $r==0 & $w==1 {
global estimators = "w"
}
global distributions = "obsd cfac diff"
foreach d in $distributions {
foreach e in $estimators {
matrix define `d'`e' = J($pct_nmbr,$breps+1,.)
local counter = 1
forvalues q = $pct_min($pct_step)$pct_max {
matrix `d'`e'[`counter',1] = `q'
local counter = `counter' + 1
}
}
matrix define gini_`d' = J($breps,1,.)
matrix define mean_`d' = J($breps,1,.)
matrix define vrnc_`d' = J($breps,1,.)
}
foreach e in $estimators {
matrix define pctg`e' = J($breps,1,0)
}
/******************************************************************************
Read data and manipulate data
******************************************************************************/
use "${data}${dataset}.dta", clear
keep $Y $G $X $C
gen group11 = ($G == 11)
gen group10 = ($G == 10)
gen group01 = ($G == 1)
gen group00 = ($G == 0)
compress
/******************************************************************************
Open bootstrap loop and resample
******************************************************************************/
/** Open loop over bootstrap iterations. **/
forvalues b = 1(1)$breps {
preserve
/** Create bootstrap sample, stratifying by group & clustering as needed. **/
if `b' > 1 {
if "$C" != "" {
bsample, strata($G) cluster($C)
}
else {
bsample, strata($G) cluster($C)
}
}
/****************************
Unconditional QTT estimator
****************************/
/** Open if loop over unconditional estimator. **/
if $u == 1 {
/** Record size of group 0 in period 0, used for later normalization. **/
qui count if $G==0
local tempsize = r(N)
/** Open for loop over quantiles. **/
local counter = 1
forvalues q = $pct_min($pct_step)$pct_max {
_pctile $Y if $G==11, percentiles(`q')
matrix obsdu[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `q' of group 1 in period 1. This
is a point on the observed outcome CDF under treatment. **/
_pctile $Y if $G==10, percentiles(`q')
local yprime = r(r1)
/** Record outcome value at quantile `q' for group 1 in period 0. **/
qui count if $Y<=`yprime' & $G==0
local qprime = 100*r(N)/`tempsize'
/** Record outcome quantile at `yprime' for group 0 in period 0. **/
if (`qprime'<=0 | `qprime'>=100) {
if `b' == 1 {
display ///
"Quantile `q' outside original data support for unconditional model"
}
matrix pctgu[`b',1] = pctgu[`b',1] + 100/$pct_nmbr
/** Display warning message if the outcome quantile at `yprime' for
group 0 in period 0 is outside the support of the original data,
as the quantile treatment effect is then not defined at that
quantile. Also count the fraction of quantiles on each bootstrap
replication that are outside the support of the data. **/
}
else {
_pctile $Y if $G==1, percentiles(`qprime')
matrix cfacu[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `qprime' for group 0 in period
1. This is a point on the the counterfactual outcome CDF in the
absence of treatment. **/
}
matrix diffu[`counter',`b'+1] = ///
obsdu[`counter',`b'+1] - cfacu[`counter',`b'+1]
/** QTT defined as difference between observed CDF for the treatment
group in period 1 and counterfactual CDF, which is the observed
CDF for the treatment group in period 0 adjusted by the trend
for the control group between periods 0 and 1. **/
local counter = `counter' + 1
}
/** Close for loop over quantiles. **/
}
/** Close if loop over unconditional estimator. **/
/***************************
Residualized QTT estimator
***************************/
/** Open if loop over residualized estimator. **/
if $r == 1 {
local countr_`b' = 0
/** Residualize outcome by regressing on covariates. **/
qui reg $Y $X group11 group10 group01 group00, noconstant
qui predict Yres, resid
qui replace Yres = Yres + _b[group11] if $G==11
qui replace Yres = Yres + _b[group10] if $G==10
qui replace Yres = Yres + _b[group01] if $G==1
qui replace Yres = Yres + _b[group00] if $G==0
/** Record size of group 0 in period 0, used for later normalization. **/
qui count if $G==0
local tempsize = r(N)
/** Open for loop over quantiles. **/
local counter = 1
forvalues q = $pct_min($pct_step)$pct_max {
local p = `q'*2
_pctile Yres if $G==11, percentiles(`q')
matrix obsdr[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `q' of group 1 in period 1. This
is a point on the observed outcome CDF under treatment. **/
_pctile Yres if $G==10, percentiles(`q')
local yprime = r(r1)
/** Record outcome value at quantile `q' for group 1 in period 0. **/
qui count if Yres<=`yprime' & $G==0
local qprime = 100*r(N)/`tempsize'
/** Record outcome quantile at `yprime' for group 0 in period 0. **/
if (`qprime'<=0 | `qprime'>=100) {
if `b' == 1 {
display ///
"Quantile `q' outside original data support for residualized model"
}
matrix pctgr[`b',1] = pctgr[`b',1] + 100/$pct_nmbr
/** Display warning message if the outcome quantile at `yprime' for
group 0 in period 0 is outside the support of the original data,
as the quantile treatment effect is then not defined at that
quantile. Also count the fraction of quantiles on each bootstrap
replication that are outside the support of the data. **/
}
else {
_pctile Yres if $G==1, percentiles(`qprime')
matrix cfacr[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `qprime' for group 0 in period
1. This is a point on the the counterfactual outcome CDF in the
absence of treatment. **/
}
matrix diffr[`counter',`b'+1] = ///
obsdr[`counter',`b'+1] - cfacr[`counter',`b'+1]
/** QTT defined as difference between observed CDF for the treatment
group in period 1 and counterfactual CDF, which is the observed
CDF for the treatment group in period 0 adjusted by the trend
for the control group between periods 0 and 1. **/
local counter = `counter' + 1
}
/** Close for loop over quantiles. **/
}
/** Close if loop over regression-adjusted estimator. **/
/*************************
Reweighted QTT estimator
*************************/
/** Open if loop over reweighted estimator. **/
if $w == 1 {
local countw_`b' = 0
/** Generates weights by regressing treatment status on covariates. **/
qui gen weight = .
if $weight == 1 {
qui logit group11 $X if $G==11 | $G==10
qui predict phat1, p
qui replace weight = phat1/(1-phat1) if $G==10
qui replace weight = 1 if $G==11
qui logit group01 $X if $G==1 | $G==0
qui predict phat0, p
qui replace weight = phat0/(1-phat0) if $G==0
qui replace weight = 1 if $G==1
drop phat1 phat0
}
if $weight == 2 {
qui mlogit $G $X
qui predict phat0 phat1 phat10 phat11, pr
qui replace weight = 1 if $G==11
qui replace weight = phat11/phat10 if $G==10
qui replace weight = phat11/phat0 if $G==1
qui replace weight = phat11/phat1 if $G==0
drop phat*
}
/** Record size of group 0 in period 0, used for later normalization. **/
qui summ weight if $G==0
local totweight = r(N)
/** Open for loop over quantiles. **/
local counter = 1
forvalues q = $pct_min($pct_step)$pct_max {
local p = `q'*2
_pctile $Y if $G==11, percentiles(`q')
matrix obsdw[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `q' of group 1 in period 1. This
is a point on the observed outcome CDF under treatment. **/
_pctile $Y if $G==10 [aweight=weight], percentiles(`q')
local yprime = r(r1)
/** Record outcome value at quantile `q' for group 1 in period 0. **/
qui summ weight if $Y<=`yprime' & $G==0
local qprime = 100*r(mean)*r(N)/`totweight'
/** Record outcome quantile at `yprime' for group 0 in period 0. **/
if (`qprime'<=0 | `qprime'>=100) {
if `b' == 1 {
display ///
"Quantile `q' outside original data support for reweighted model"
}
matrix pctgw[`b',1] = pctgw[`b',1] + 100/$pct_nmbr
/** Display warning message if the outcome quantile at `yprime' for
group 0 in period 0 is outside the support of the original data,
as the quantile treatment effect is then not defined at that
quantile. Also count the fraction of quantiles on each bootstrap
replication that are outside the support of the data. **/
}
else {
_pctile $Y if $G==1 [aweight=weight], percentiles(`qprime')
matrix cfacw[`counter',`b'+1] = r(r1)
/** Record outcome value at quantile `qprime' for group 0 in period
1. This is a point on the the counterfactual outcome CDF in the
absence of treatment. **/
}
matrix diffw[`counter',`b'+1] = ///
obsdw[`counter',`b'+1] - cfacw[`counter',`b'+1]
/** QTT defined as difference between observed CDF for the treatment
group in period 1 and counterfactual CDF, which is the observed
CDF for the treatment group in period 0 adjusted by the trend
for the control group between periods 0 and 1. **/
local counter = `counter' + 1
}
/** Close for loop over quantiles. **/
}
/** Close if loop over reweighted estimator. **/
/**************************************************************
Graph estimates of the treated & counterfactual distributions
**************************************************************/
/** Only construct graphs for first bootstrap loop, using original data. **/
if `b'==1 {
/** Open loop over estimators (unconditional, residualized, reweighted) **/
foreach e in $estimators {
clear
qui set obs $pct_nmbr
qui svmat double obsd`e'
rename obsd`e'1 quantile
rename obsd`e'2 obsd
qui svmat double cfac`e'
rename cfac`e'2 cfac
capture drop obsd`e'*
capture drop cfac`e'*
if "`e'"=="u" {
local efull = "Unconditional distributions"
local ename = "cdfs_unconditional"
}
else if "`e'"=="r" {
local efull = "Residualized distributions"
local ename = "cdfs_residualized"
}
else if "`e'"=="w" {
local efull = "Reweighted distributions"
local ename = "cdfs_reweighted"
}
#delimit ;
graph twoway line quantile obsd, lcolor(blue) lpattern(solid)
|| line quantile cfac, lcolor(red) lpattern(longdash)
ytitle("Quantiles") xtitle("${Yname}") title("`efull'")
plotregion(style(none)) graphregion(fcolor(white) lcolor(white))
yscale(range(0 100)) ylabel(0(20)100) legend( row(1) order(1 2)
lab(1 "Treated outcomes") lab(2 "Counterfactual outcomes") );
qui graph export "${output}`ename'.$format", replace;
*!epstopdf "${output}`ename'.$format";
*erase "${output}`ename'.$format";
#delimit cr
}
/** Close loop over estimators. **/
}
/** Close if loop that generates graphs. **/
restore
}
/** Close loop over bootstrap iterations. **/
/******************************************************************************
Save datasets for the observed and counterfactual distributions containing
quantiles as columns and bootstrap replications as row to allow users to
bootstrap pointwise confidence intervals for the two distributions if desired.
******************************************************************************/
foreach e in $estimators {
if "`e'" == "u" {
local ename = "cdfs_unconditional"
}
else if "`e'" == "r" {
local ename = "cdfs_residualized"
}
else if "`e'" == "w" {
local ename = "cdfs_reweighted"
}
clear
matrix quantile = obsd`e''
svmat double quantile
forvalues i = 1(1)$pct_nmbr {
qui summ quantile`i' if _n==1
local q = r(mean)
label var quantile`i' "Quantile `q'. Variable name is just a placeholder."
}
qui save "${output}`ename'_obsd_rawbreps.dta", replace
clear
matrix quantile = cfac`e''
svmat double quantile
forvalues i = 1(1)$pct_nmbr {
qui summ quantile`i' if _n==1
local q = r(mean)
label var quantile`i' "Quantile `q'. Variable name is just a placeholder."
}
qui save "${output}`ename'_cfac_rawbreps.dta", replace
}
/******************************************************************************
Graph estimates of the treatment effect with confidence intervals
******************************************************************************/
/** Open loop over estimators (unconditional, residualized, reweighted). **/
foreach e in $estimators {
if "`e'" == "u" {
local efull = "Unconditional quantile treatment effects"
local ename = "treateffects_unconditional"
}
else if "`e'" == "r" {
local efull = "Residualized quantile treatment effects"
local ename = "treateffects_residualized"
}
else if "`e'" == "w" {
local efull = "Reweighted quantile treatment effects"
local ename = "treateffects_reweighted"
}
/** For each estimator, create a dataset with quantiles as columns and
quantile treatment effect estimates for each bootstrap iteration as
rows. Use this dataset to calculate 95% confidence intervals by taking
percentiles within each quantile/column. **/
clear
/** Generate dataset from matrix of point estimates. **/
matrix q = diff`e''
qui svmat double q
/** Compute 95% CI over the bootstrap estimates for each quantile. **/
matrix sdev = J($pct_nmbr,1,.)
matrix cilb = J($pct_nmbr,1,.)
matrix ciub = J($pct_nmbr,1,.)
forvalues i = 1(1)$pct_nmbr {
qui summ q`i'
matrix sdev[`i',1] = r(sd)
qui _pctile q`i' if _n>2, p(2.5 97.5)
matrix cilb[`i',1] = r(r1)
matrix ciub[`i',1] = r(r2)
}
qui save "${output}`ename'_rawbreps.dta", replace
/** Save dataset of bootstrap QTE estimates for all quantiles. **/
/** Create dataset with point estimates and confidence interval limits as
columns and quantiles as rows for easy graphical display. **/
clear
qui svmat double diff`e'
rename diff`e'1 quantile
rename diff`e'2 level
keep quantile level
qui svmat double cilb
qui svmat double ciub
#delimit ;
graph twoway line level quantile, lcolor(blue) lpattern(solid)
|| line cilb1 quantile, lcolor(blue) lpattern(dot)
|| line ciub1 quantile, lcolor(blue) lpattern(dot)
plotregion(style(none)) graphregion(fcolor(white) lcolor(white)) yline(0)
xtitle("Quantiles") ytitle("${Yname}") title("`efull'") legend( rows(1)
order(1 2) lab(1 "Treatment effect") lab(2 "Pointwise 95% CI") );
qui graph export "${output}`ename'.$format", replace;
*!epstopdf "${output}`ename'.$format";
*erase "${output}`ename'.$format";
#delimit cr
/** Save dataset so users can customize the reporting. **/
qui save "${output}`ename'_fordisplay.dta", replace
}
/** Close loop over estimators. **/
/******************************************************************************
Generate diagnostic dataset showing fraction of counterfactual outcome
quantiles for each model & each replication that fall outside the data support
******************************************************************************/
clear
set obs $breps
gen rep = _n
foreach e in $estimators {
qui svmat double pctg`e'
if "`e'" == "u" {
rename pctg`e'1 fraction_missing_unconditional
}
if "`e'" == "r" {
rename pctg`e'1 fraction_missing_residualized
}
if "`e'" == "w" {
rename pctg`e'1 fraction_missing_reweighted
}
}
save "${output}diagnostic_missing_quantiles.dta", replace
/******************************************************************************
Calculate average and selected inequality treatment effects from vector of
quantile treatment effects
******************************************************************************/
/** Open loop over estimators (unconditional, residualized, reweighted). **/
foreach e in $estimators {
if "`e'" == "u" {
local ename = "unconditional estimator"
}
else if "`e'" == "r" {
local ename = "residualized estimator"
}
else if "`e'" == "w" {
local ename = "reweighted estimator"
}
/** Create dataset with 2Q rows and B columns. Each row corresponds to a
quantile of the observed outcome CDF (rows 1->Q) or a quantile of the
counterfactual outcome CDF (rows Q+1->2Q). Each column corresponds to
a bootstrap replication. So element (q,b) contains the qth quantile of
the observed/counterfactual outcome CDF from the bth bootstrap rep. **/
local lim1 = 1
local lim2 = $pct_nmbr
local lim3 = 1+$pct_nmbr
local lim4 = 2*$pct_nmbr
clear
matrix rep = J(2*$pct_nmbr,$breps,.)
matrix rep[`lim1', 1] = obsd`e'[`lim1'..., 2...]
matrix rep[`lim3', 1] = cfac`e'[`lim1'..., 2...]
qui svmat double rep
/** Estimate Gini coefficient for observed and counterfactual outcome CDFs
and treatment effect on the Gini coefficient. Calculate the standard
error across all bootstrap replications. ***/
forvalues b = 1(1)$breps {
/** Calculate area under the CDF using discrete Riemann integration. **/
qui gen area = $pct_step*(rep`b'+rep`b'[_n-1]) if _n<=`lim2'
qui gen cumarea = 0
forvalues i = 2(1)`lim2' {
qui replace cumarea = max(0,area + cumarea[_n-1]) if _n==`i'
}
/** Create a Lorenz curve by calculating the fraction of the total area
under the CDF at each quantile. Note that this calculation works
because the quantile are evenly spaced (e.g. 5, 10, 15) and so the
fraction of the population between any two quantiles is equal. **/
qui summ cumarea
qui gen lorenz = cumarea/r(max) if _n<=`lim2'
/** Calculate the Gini coefficient from the Lorenz curve again using
discrete Riemann integration and the fact the the Gini equals 1
minus 2 times the area under the Lorenz curve. **/
qui summ lorenz if _n<=`lim2'
matrix gini_obsd[`b',1] = 1 - (2 / ($pct_nmbr-1) ) * ///
( r(mean)*r(N) - 0.5*r(max) - 0.5*r(min) )
/** Calculate area under the CDF using discrete Riemann integration. **/
qui replace cumarea = 0 if _n<=`lim3'
qui replace area = $pct_step*(rep`b'+rep`b'[_n-1]) if _n>=`lim3'
forvalues i = `lim3'(1)`lim4' {
qui replace cumarea = max(0,area + cumarea[_n-1]) if _n==`i'
}
/** Create a Lorenz curve by calculating the fraction of the total area
under the CDF at each quantile. Note that this calculation works
because the quantile are evenly spaced (e.g. 5, 10, 15) and so the
fraction of the population between any two quantiles is equal. **/
qui summ cumarea
qui replace lorenz = cumarea/r(max) if _n>=`lim3' & _n<=`lim4'
/** Calculate the Gini coefficient from the Lorenz curve again using
discrete Riemann integration. **/
qui summ lorenz if _n>=`lim3' & _n<=`lim4'
matrix gini_cfac[`b',1] = 1 - (2 / ($pct_nmbr-1) ) * ///
( r(mean)*r(N) - 0.5*r(max) - 0.5*r(min) )
/** Calculate the difference between the Gini coefficient on the observed
and counterfactual CDFs. **/
matrix gini_diff[`b',1] = gini_obsd[`b',1] - gini_cfac[`b',1]
drop area cumarea lorenz
}
/** Calculate the standard errors on the observed and counterfactual Gini
coefficients and the difference between them. **/
preserve
clear
foreach d in $distributions {
qui svmat double gini_`d'
qui summ gini_`d'1 if _n==1
local gini_`d'_b = r(mean)
qui summ gini_`d'1 if _n>1
local gini_`d'_s = r(sd)
}
/** Estimate mean and standard deviation for observed and counterfactual
outcome CDFs and treatment effects on the mean and standard deviation.
Calculate the standard errors across all bootstrap replications. ***/
restore
forvalues b = 1(1)$breps {
/** Calculate the average outcome from the CDF using disrete Riemann
integration, again using the fact that the gaps between quantiles
are all identical. **/
qui summ rep`b' if _n<=`lim2'
matrix mean_obsd[`b',1] = (r(mean)*r(N)/($pct_nmbr-1)) - ///
((r(max)+r(min))/(2*$pct_nmbr-2))
qui summ rep`b' if _n>=`lim3'
matrix mean_cfac[`b',1] = (r(mean)*r(N)/($pct_nmbr-1)) - ///
((r(max)+r(min))/(2*$pct_nmbr-2))
matrix mean_diff[`b',1] = mean_obsd[`b',1] - mean_cfac[`b',1]
/** Calculate the outcome standard deviation using the computational
formula for the variance, discrete Riemann integration and the fact
that the gaps between the quantiles are all identical. **/
qui gen square = rep`b'^2
qui summ square if _n<=`lim2'
matrix vrnc_obsd[`b',1] = (r(mean)*r(N)/($pct_nmbr-1)) - ///
((r(max)+r(min))/(2*$pct_nmbr-2)) - (mean_obsd[`b',1])^2
qui summ square if _n>=`lim3' & _n<=`lim4'
matrix vrnc_cfac[`b',1] = (r(mean)*r(N)/($pct_nmbr-1)) - ///
((r(max)+r(min))/(2*$pct_nmbr-2)) - (mean_cfac[`b',1])^2
matrix vrnc_diff[`b',1] = vrnc_obsd[`b',1] - vrnc_cfac[`b',1]
drop square
}
clear
foreach d in $distributions {
qui svmat double mean_`d'
qui summ mean_`d'1 if _n==1
local mean_`d'_b = r(mean)
qui summ mean_`d'1 if _n>1
local mean_`d'_s = r(sd)
qui svmat double vrnc_`d'
qui summ vrnc_`d'1 if _n==1
local vrnc_`d'_b = r(mean)
qui summ vrnc_`d'1 if _n>1
local vrnc_`d'_s = r(sd)
}
/** Display output. **/
#delimit ;
file write myfile _n "For `ename':"
_n " Treated Gini is `gini_obsd_b' (`gini_obsd_s')"
_n " Counterfactual Gini is `gini_cfac_b' (`gini_cfac_s')"
_n " Treatment effect on Gini is `gini_diff_b' (`gini_diff_s')"
_n " Treated mean is `mean_obsd_b' (`mean_obsd_s')"
_n " Counterfactual mean is `mean_cfac_b' (`mean_cfac_s')"
_n " Treatment effect on mean is `mean_diff_b' (`mean_diff_s')"
_n " Treated variance is `vrnc_obsd_b' (`vrnc_obsd_s')"
_n " Counterfactual variance is `vrnc_cfac_b' (`vrnc_cfac_s')"
_n " Treatment effect on variance is `vrnc_diff_b' (`vrnc_diff_s')";
#delimit cr
}
/******************************************************************************
Close off file
******************************************************************************/
clear
file close myfile
capture log close