Program evaluation
1. Nonexperiments
Means using regression-based statistical analysis (R²) to estimate the response. Uses statistical controls (program variables x1 + x2 + x3) to analyze outcome measures (dependent variable y).
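A minimal sketch of what "statistical controls" means in practice, assuming made-up data in which participation (x1) depends on a control variable (x2). The helper names `solve` and `ols`, the coefficients, and the sample size are all illustrative assumptions, not part of the notes. It fits y on x1 + x2 + x3 by ordinary least squares, reports R², and compares the result with a naive fit that omits the controls.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Return (coefficients, R^2) for y regressed on an intercept plus the columns in rows."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Hypothetical data: x1 is program participation (0/1), x2 and x3 are controls.
rng = random.Random(0)
rows, y = [], []
for _ in range(500):
    x2 = rng.gauss(0, 1)
    x3 = rng.gauss(0, 1)
    # Participation depends on x2, which mimics self-selection into the program.
    x1 = 1.0 if x2 + rng.gauss(0, 1) > 0 else 0.0
    rows.append((x1, x2, x3))
    # Assumed true program effect of 2.0, chosen only for illustration.
    y.append(2.0 * x1 + 1.5 * x2 - 0.5 * x3 + rng.gauss(0, 1))

beta, r2 = ols(rows, y)                           # with statistical controls
naive_beta, _ = ols([(r[0],) for r in rows], y)   # program variable only

print(f"controlled estimate of program effect: {beta[1]:.2f}")
print(f"naive estimate without controls:       {naive_beta[1]:.2f}")
print(f"R^2 of controlled model:               {r2:.2f}")
```

In this simulated setup the naive estimate is inflated because participants differ from nonparticipants on x2. The controls only help for variables that are actually measured and included.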
2. Experimental designs: randomized field experiments
Must Have
1. Comparison: random assignment of cases to experimental vs. control group
2. Manipulation: treatment should produce change in unit of analysis
3. Control: internal validity (controlled through random assignment)
4. Generalizability: external validity
Must have at least 30 items in the sample
Randomly selected
Avoid selection threats
3 types of tests
1. Pretest-posttest
2. Pretest comparison
3. Solomon four-group design
Positive
Uses simple statistics: t tests or F tests
Ethical issues holding on reservation
Threats to validity: external validity depends on a large sample size N
Never randomize to suit problems
Controls for many factors, including selection threats
Simple to understand
Negative
Shows that the program caused a difference between groups but doesn't explain why
Not always feasible
Not always generalizable
Lots of internal threats: instrumentation, multiple treatment, self-selection, attrition
Must have large N
With randomized field experiments, you can get the closest to causal inference, based on random assignment of program and control groups. A lottery is a randomized field experiment. Yet it is not the most efficient way to redirect scarce resources. Each group must be large (greater than 30) and be composed of the same % of sex, race, and other characteristics. Select numbers, then run the program on one group and not on the other. We only need random assignment, not random selection. If the numbers are fewer than 30 you cannot do random assignment but must do a quasi-experiment. Unit of analysis is important.
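The lottery idea can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the unit count, baseline scores, and `assumed_effect` are invented, and `random_assignment` and `welch_t` are hypothetical helper names. Units are shuffled and split into program and control groups, then the groups are compared with a t statistic, first on baseline (where they should look alike) and then on the outcome.

```python
import random
import statistics

def random_assignment(units, rng):
    """Shuffle the units and split them into a program group and a control group."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two independent groups."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

rng = random.Random(42)
# Hypothetical units, each with a baseline score that could confound a comparison.
baseline = {unit: rng.gauss(50, 10) for unit in range(1000)}
program, control = random_assignment(baseline, rng)

assumed_effect = 5.0  # hypothetical true program effect
outcome_program = [baseline[u] + assumed_effect + rng.gauss(0, 5) for u in program]
outcome_control = [baseline[u] + rng.gauss(0, 5) for u in control]

# Balance check: random assignment should leave baseline means similar.
t_baseline = welch_t([baseline[u] for u in program], [baseline[u] for u in control])
# Outcome comparison: a simple t statistic is enough under random assignment.
t_outcome = welch_t(outcome_program, outcome_control)

print(f"group sizes: {len(program)} program, {len(control)} control")
print(f"t on baseline: {t_baseline:.2f}")
print(f"t on outcome:  {t_outcome:.2f}")
```

Note that the units here are not a random sample of any larger population, which matches the point that random assignment, not random selection, is what the design requires.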
3. Quasi-Experimental Design
The absence of random assignment is what makes a quasi-experiment (QE) different from an experimental design. QEs tend to be retrospective, and their internal validity is more questionable. They rely on selecting a unit of analysis, or variables that could affect the outcome and are related to the treatment, and on selecting comparable places.
3 types
1. Cross-sectional (XS): comparison of comparable units
2. Time series (TS): before and after treatment
3. Both: cross-sectional comparison combined with before-and-after time series
Types of studies:
1. No comparison
a. Descriptive case study; not good program evaluation
2. Posttest-Only Comparison Group
a. Threats to valid causal inference: who knows if the program caused the difference, or if it was another variable?
3. Pretest-Posttest Comparison Group
a. You should have baseline data for this design.
b. You control for self-selection bias
c. Random assignment would help separate groups
4. Pretest-Posttest
a. Two data points is not a strong design; history and maturation problems?
b. Have baseline data
c. Yet can't claim effectiveness of the program: other variables may affect the group
5. Interrupted Time Series
a. Strength: controls for maturation. Yet there is no comparison group and there are selection threats; it is a purely reflexive design
6. Interrupted Time Series with Comparison Group: many different levels of intervention
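To see how the designs listed above can give different answers from the same data, here is a small sketch. The scores are invented for illustration, and the difference-in-differences calculation is one common way to use a pretest-posttest comparison group design, assumed here rather than taken from the notes.

```python
from statistics import mean

def difference_in_differences(treat_pre, treat_post, comp_pre, comp_post):
    """Change in the treated group minus change in the comparison group."""
    return (mean(treat_post) - mean(treat_pre)) - (mean(comp_post) - mean(comp_pre))

# Hypothetical scores, made up for illustration only.
treat_pre = [50, 52, 48, 51, 49]
treat_post = [58, 60, 55, 59, 58]
comp_pre = [45, 47, 44, 46, 43]
comp_post = [48, 50, 47, 49, 46]

# Pretest-posttest (reflexive): ignores any trend that would have happened anyway.
reflexive_estimate = mean(treat_post) - mean(treat_pre)
# Posttest-only comparison group: ignores that the groups started at different levels.
posttest_only_estimate = mean(treat_post) - mean(comp_post)
# Pretest-posttest comparison group: nets out both the baseline gap and the shared trend.
did_estimate = difference_in_differences(treat_pre, treat_post, comp_pre, comp_post)

print("pretest-posttest estimate:", reflexive_estimate)
print("posttest-only estimate:", posttest_only_estimate)
print("pretest-posttest comparison group estimate:", did_estimate)
```

The third estimate is only as good as the assumption that the comparison group's change reflects what would have happened to the treated group without the program.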
4 types of Validity
A. External Validity = must be generalizable
1. Time: results may not generalize beyond the period studied, e.g. a particular range of economic growth
2. Place: results may not generalize to all of the US, only to a specific place
Solved by having a large N with random selection
B. Measurement Validity
• Reliability: absence of random measurement error
• Validity: absence of nonrandom measurement error
Best way to reduce both: use multiple indicators, which help separate real differences from random measurement error
3 types of measurement validity
1. Face validity: does the measurement instrument really measure what it is supposed to?
2. Concept (construct) validity: are the indicators of the construct related to one another?
3. Predictive validity: e.g. does a GRE score predict how well you will do in school? A valid measurement will yield the correct outcome.
C. Statistical Conclusion Validity refers to the accuracy with which systematic effects are separated from random (stochastic) effects
Sources of randomness
1. sampling error
2. random measurement error
3. inherent in human behavior
4. small sample size
Solved by having a larger N sample
If studying a population, use a statistical test of the null hypothesis; this needs a large N
Type 1 error is finding a program effective when it is not
• Rejecting the null hypothesis when it is true; reduced by using lower significance levels
• academic research focuses on it
• F= low
• reject null
Type 2 error is finding no effect when the program is effective
• Accepting (failing to reject) the null hypothesis when it is false
• Beta (β): program evaluation focuses on it
• F= high
• Accept the null kill the program
• Policy analysis must focus on type 2 errors
How to determine the power of a test
1. Frequently, increase the level of significance to get a more powerful test
2. Use the .05 level of significance as a base
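The trade-off between significance level, sample size, and power can be sketched with a normal approximation. This is an illustrative calculation under assumptions not in the notes: a two-sided z test for a difference in two group means, equal group sizes, a known common standard deviation, and invented values for the effect (5.0) and standard deviation (15.0). `power_two_sample` is a hypothetical helper name.

```python
from statistics import NormalDist

def power_two_sample(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z test for a difference in two group means."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5          # standard error of the difference
    critical = z.inv_cdf(1 - alpha / 2)         # rejection threshold under the null
    shift = effect / se                         # true difference in standard-error units
    # Probability of rejecting the null when the true difference equals `effect`.
    return (1 - z.cdf(critical - shift)) + z.cdf(-critical - shift)

for alpha in (0.01, 0.05, 0.10):
    for n in (30, 100, 300):
        power = power_two_sample(effect=5.0, sd=15.0, n_per_group=n, alpha=alpha)
        print(f"alpha={alpha:.2f} n={n:>3} power={power:.2f} type2={1 - power:.2f}")
```

Raising the significance level or the sample size increases power, that is, it lowers the chance of a Type 2 error, at the cost of more Type 1 errors in the first case and more data collection in the second.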
D. Threats to Internal Validity
Problem with internal validity: it is impossible to prove causal claims, and no study is perfectly accurate. Some studies have more internal validity threats than others. How to improve internal validity: design matters.
Types
1. History (TS): an event other than the change in the treatment (x) might cause the outcome (y) to change (single event)
2. Maturation (TS): y may be changing partly because of an underlying trend and not because of the treatment (x)
3. Testing (TS): taking a test, with no change in treatment, may itself cause the outcome (y) to change. This is both an external and an internal threat (subjects are aware of being studied)
4. Instrumentation (TS, CS): a change in the calibration of the measurement procedure or instrument, rather than the treatment, may partly or entirely cause the outcome (y) to change
5. Regression artifacts (TS, CS): when cases are chosen for their extremely high or low scores, there is a tendency for those scores to return toward normal. Choose the highest and they are more likely to go down.
6. Selection (CS): when the groups to be compared differ on factors besides the treatment (x), then these differences (z) may account partly or entirely for the observed difference in outcome (y). Example: public vs. private schools
7. Attrition (TS): when two or more groups are being compared, the observed between-treatment difference in outcome (y) may be partly or entirely attributable to a differential loss of respondents rather than to the treatment (x)
8. Multiple treatment interference (TS, CS): when one treatment (x1) is so confounded with another (x2) that it is impossible to separate the impact of one from the other
9. Contamination (CS): when one group finds out about the treatment, so that no difference in outcomes (y) is observed
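The regression artifact in the list above can be shown with a short simulation. It is a sketch under invented assumptions: each case has a stable true score plus independent random measurement error on two tests, and no treatment is given at all. Cases are selected because they scored in the top 10% on the pretest.

```python
import random
from statistics import mean

rng = random.Random(7)
n = 2000
# Hypothetical stable true scores, measured twice with independent random error.
true_score = [rng.gauss(100, 10) for _ in range(n)]
pretest = [t + rng.gauss(0, 10) for t in true_score]
posttest = [t + rng.gauss(0, 10) for t in true_score]  # no treatment is applied

# Select the cases with the highest pretest scores (top 10%).
cutoff = sorted(pretest)[int(0.9 * n)]
selected = [i for i in range(n) if pretest[i] >= cutoff]

pre_mean = mean(pretest[i] for i in selected)
post_mean = mean(posttest[i] for i in selected)

print(f"selected cases: {len(selected)}")
print(f"pretest mean of selected cases:  {pre_mean:.1f}")
print(f"posttest mean of selected cases: {post_mean:.1f}")
print(f"posttest mean of all cases:      {mean(posttest):.1f}")
```

The selected group's posttest mean drops toward the overall mean even though nothing was done to it, so a program aimed at extreme scorers can look effective (or harmful) purely because of how the cases were chosen.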