October 14, 1996
Web Version Date: 12/10/96
Prepared for the Committee on National
Statistics of the National Research Council --- National Academy of Sciences
DRAFT
Please do not cite or quote without
permission.
Henry E. Brady is Director, UC DATA and Professor of Political Science and Public Policy at the University of California.
Barbara West Snow is Research Director, UC DATA and Director of the California Work Pays Demonstration Project.
We would like to thank the staffs of UC
DATA and of the Research Branch of the California Department of Social
Services with whom we have worked on many of the projects described in
this paper. We have learned a lot from them. Two people have made especially
important contributions to our thinking about datasets for monitoring social
programs. Dr. Fred Gey of UC DATA has instructed us on the technical aspects
of database design, construction, and management. Werner Schink, the Chief
of the CDSS Research Branch, has provided the vision and leadership that
is necessary to bring new data systems to fruition.
Points of view or opinions expressed in
this document are those of the authors and do not necessarily represent
the official position or policies of the Regents of the University of California
or the California Department of Social Services.
Table of Contents
I. The Personal Responsibility and Work
Opportunity Act of 1996
II. Program Implementation and Statistical
Needs
IV. Specific Examples of What Might
be Done
B. Answering Specific Questions
Data Systems and Statistical Requirements
for the Personal
Overview
In this paper, we explore the statistical
needs for planning, monitoring, research, and evaluation that grow out
of the Personal Responsibility and Work Opportunity Act of 1996. In the
first section we provide a quick overview of the goals and the programs
of the Act with an eye towards their implications for data collection and
statistical reporting. This tells us what the Congress and the President
intend to happen, but it does not tell us what will happen. We turn, therefore,
in the second section to a discussion of how we think the Act's program
performance standards and case management requirements, the features most
closely related to data collection and statistical reporting, will be implemented.
This provides us with a picture of the statistical data collection systems
that will likely be in place after the implementation of the Act. This
is where we must begin if we are to develop a useful data collection and
analysis system.
In the third section, we go beyond the
Act to discuss a number of recurring problems facing those who create statistical
data systems for monitoring the effects of social programs. We also discuss
some solutions to these problems. Then in the fourth section we turn to
a number of specific examples of what could be done through the creative
use of surveys and administrative data systems by building upon the existing
systems and the opportunities provided by the new Act. In the fifth section
we touch on some of the problems of resources, interagency coordination,
and confidentiality that must be faced to achieve a better nation-wide
system for monitoring and evaluating social program participation and the
status of the least fortunate Americans. We end with some conclusions.
I. The Personal Responsibility and Work
Opportunity Act of 1996
A. Landmark Legislation in the Use of
Statistical Data
The Personal Responsibility and Work Opportunity
Reconciliation Act of 1996 is not only a landmark in the development of
American social policy, it is also a landmark in the governmental uses
of statistical data and information. Major innovations in data collection
and in database construction will be required to meet the goals of the
Act. For example, to avoid substantial reductions in funding, states must
meet strict outcome standards for the employment of welfare recipients.
This will require measuring and recording in extraordinary detail the work
experience of those receiving aid. States must also set five-year (or stricter)
cumulative time limits on the receipt of welfare, and they must get recipients
back to work by the time they have accumulated two years of aid. Meeting
these goals will require, for the first time, tracking recipients over
long periods of time. Longitudinal databases of unprecedented scope will
have to be constructed for this purpose. States must strictly enforce child
support laws, and several databases, including state and national registries
of new hires must be created that can be updated quickly and made available
to many governmental social service agencies. Several new studies are called
for including a national survey of children who are at risk of child abuse
or neglect, and this study must be longitudinal, must yield data at the
State level for as many states as possible, must summarize the out-of-home
placements of the child, and must determine the frequency of contact with
State or local agencies. Any one of these tasks alone would pose a substantial
challenge. Together they are formidable indeed.
In fact, as we read the bill, we were charmed
to find that the legislative staff members who drafted it and the legislators
who approved it had such faith in the ability of public administrators,
survey researchers, database managers, and statisticians to track people
over time, to update databases regularly and accurately, to measure work effort
in enough detail to develop weekly logs of the number of hours and kinds
of work undertaken by someone on assistance, and to keep track of the complicated
living arrangements of modern American households in this era of single
parent families. The truth is that the two major ways that we obtain reliable
information, the sample survey and administrative databases, will be stretched
to their current limits --- and perhaps beyond them --- as they are called
upon to do these things. The bill itself recognizes some of these problems
and there is a small section that calls for a report on "what would be
required to establish [an automated data processing] system capable of
(A) tracking participants in public programs over time; and (B) checking
the case records of the States to determine whether individuals are participating
in public programs of two or more states" (pages 62-3). (EN#2)
And there is another section which calls for the study of "outcomes
measures for evaluating the success of the States in moving individuals
out of the welfare system through employment as an alternative to the minimum
participation rates" described in the new Act (page 63). These sections
suggest what is obvious to anyone who reads this bill: its implementation
will require a new level of sophistication in the provision of social statistics.
In this paper we describe the statistical
needs created by the passage of this legislation, and we make some suggestions
about what can be done to meet those needs. We start with a discussion
of the legislation, but we broaden our field of inquiry to consider statistical
needs that are not explicit in the legislation. We will also provide very
concrete examples of the problems of collecting statistical data in this
area and the possibilities for using administrative data, often in concert
with sample surveys, to improve our information about the effects of the
legislation.
B. What the Act Does--Goals and Programs
By its goals, the Act suggests that we
should monitor needy families to see if children are being cared for in
their own homes and to see if they have adequate living arrangements, especially
satisfactory child care, while their parents are working. We should see
if poor parents do get jobs and stay married. We should see if out-of-wedlock
pregnancies are reduced, and we should keep a close watch on child abuse,
neglect, and abandonment. In addition, the various Titles of the Act suggest
that we should monitor the circumstances of disabled children who might
be affected by changes in SSI, the situation of aliens who will receive
reduced benefits or be denied benefits altogether, and the circumstances
of those on food stamps who face a modified program with work requirements
and reduced benefits. These are substantial tasks. Where should we begin?
II. Program Implementation and Statistical
Needs
One way to think about the planning, monitoring,
research, and evaluation needs created by the Personal Responsibility and
Work Opportunity Act of 1996 is to focus, as we just did, on its goals,
the programmatic changes it makes, and the resulting impacts on poor families.
Certainly the purposes of the Act must be addressed in any satisfactory
monitoring system, but we know that political agendas, resource constraints,
and the weight of history will largely shape the statistical system that
grows out of this legislation. It seems reasonable, therefore, to consider
how the implementation of the program will structure the kinds of data
that are collected. Let us turn to this kind of analysis for a moment.
The implementation perspective looks at
a program and tries to understand how the incentives offered to those asked
to implement it will affect the final shape of the program. This approach
assumes, for example, that goals with money attached are more likely to
be implemented than those without resources, no matter what the intent
of the legislation. It assumes that tasks that are explicitly demanded
will drive out those that are not, even if this detracts from achieving
the goals of the program. And it assumes that powerful actors will bend
programs to their agendas. Because the new welfare legislation is a block
grant program the states are the major players, and we must understand
what they are likely to do.
From a statistical standpoint, the legislation
operates at two major levels. At the programmatic level, it sets a number
of program performance standards with substantial penalties for failures
to meet them. These include work participation and enforcing child support.
At the case management level, the legislation requires five-year limits
on the receipt of welfare, involvement in work preparation programs and
work readiness before accumulating two years of welfare assistance, tracking
and locating those who do not pay child support, extensive redeterminations
of eligibility for Supplemental Security Income, and changes in eligibility
for food stamps. These programmatic and case management imperatives will
determine the shape of the information systems. In addition, the existing
data systems for those programs that have not been changed such as Medicaid
and Unemployment Insurance will provide the background against which new
data systems will be developed. Indeed, linkages with these and other existing
systems will be required to implement certain provisions in the Act. Many
social services data systems must be changed in some way, although special
funding for such changes is provided only in the Medicaid and child support
areas.
A. Program Performance Standards
How Program Performance will be Measured
for TANF --- How will the states provide this information? Our best
guess is that a quarterly survey will be used by many states for collecting
much of this information because existing administrative systems will not
be able to provide it, and the Act authorizes "the use of scientifically
acceptable sampling methods" to collect it (page 48). In effect, such a
survey would be a state-wide "mini-Survey of Income and Program Participation"
of a sample of current or recent recipients of TANF that will be similar
to the nation-wide survey of a sample of the entire population undertaken
by the Census Bureau in its Survey of Income and Program Participation
(SIPP). (EN#6)
Getting a quarterly reporting process underway
poses some major challenges for the states. In the State of California,
for example, it appears that a quarterly survey will be a major component
of this process. There are already two different offices, the Review and
Evaluation Bureau (REB) in DSS and the Statistical Services Bureau in the
Department of Health and Welfare which are possible candidates for carrying
out this mission. REB has traditionally done the federally mandated checking
of case files to make sure that federal standards for grant calculation
and eligibility are met. In California, there are about 300 employees of
this group who now, under the new legislation, have no specific mandate.
They are trained in statistical sampling, in checking the files of recipients,
and in doing follow-up investigations, but they are not experts in survey
interviewing although they have done "characteristics surveys" every two
years to determine the characteristics of those receiving AFDC. The Statistical
Services Bureau has traditionally obtained data from the county administrative
systems and published reports on welfare in California. These units can
both lay claim to having expertise in developing data on welfare, but it
seems likely that they might take different approaches to producing Quarterly
Reports.
Surveys designed to supply information
for the Quarterly Reports can provide part of the foundation of a statistical
system for monitoring the impact of TANF. Their usefulness, however, will
depend crucially upon their content, their design, and their quality. (EN#7)
It is possible to envision a bare bones cross-sectional survey that would
allow the states to calculate their participation rates but which would
be of limited usefulness for assessing the impact of TANF. With a bit of
effort, however, it might be possible to develop a strategy for rolling
panels, expanded content, and linkage to administrative data. The rolling
panels would allow those monitoring the program to follow families over
time and to observe what happens to them when they leave the program. They
would also provide the statistical power of a longitudinal design. Expanded
content could include issues such as adequacy of parenting, housing, health-care,
and nutrition; the sexual and fertility behavior of recipients; the school
plans and performance of minors and young adults; the establishment of
paternity and the collection of child support; and other subjects. Linkage
to administrative data may be essential to monitor TANF payments and food
stamp amounts, homeless and child care assistance, and time on aid. It
could also provide a historical picture of the family's experience with
welfare before and after the survey.
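To make the rolling panel idea concrete, the following is a minimal sketch of one possible rotation schedule. It assumes, purely for illustration, that a new panel enters the survey every quarter and is interviewed for a fixed number of consecutive quarters; the function names and the panel length are hypothetical and are not specified in the Act.

```python
def panels_in_field(quarter: int, panel_length: int) -> list[int]:
    """Return the panels interviewed in a given quarter.

    Assumption: a panel is identified by the quarter in which it entered,
    quarters are numbered from 1, and each panel stays in the survey for
    `panel_length` consecutive quarters.
    """
    first = max(1, quarter - panel_length + 1)
    return list(range(first, quarter + 1))


def interview_wave(panel: int, quarter: int) -> int:
    """Return which interview (1st, 2nd, ...) a panel is receiving in a quarter."""
    return quarter - panel + 1


# Hypothetical design: panels followed for four quarters.
for q in range(1, 7):
    print(q, panels_in_field(q, panel_length=4))
```

Under this kind of schedule, each quarterly cross-section mixes new entrants with families already interviewed in earlier quarters, which is what allows families to be followed after they leave the program.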
The importance of the work participation
standards, the substantial penalties for failure to submit Quarterly Reports
on time, and the penalties for failure to meet the performance standards
suggest that states will develop a method for reporting the data listed
in Figure 4. (EN#8)
States that had a centralized state-wide AFDC program and data system to
begin with, as well as states with smaller populations, may find it possible
to base their reports upon administrative data systems alone, although
the breadth and depth of the required information for the quarterly reports
suggests that surveys might be needed even in these situations.
(EN#9) Presumably the mention of specific data elements in the Act
and their importance for comparing one state versus another will create
some incentives for the Federal government to get the states to agree upon
a common set of definitions to avoid the reporting of incomparable data.
There is good reason, then, to suppose that there will be a core survey
effort with some common definitions to study the impacts of TANF.
Those who want to expand this enterprise
must convince the states that the marginal costs of more sophisticated
designs and more extensive content are relatively small. These resources,
though relatively small, may be hard to come by. Although there is plenty
in TANF and the rest of the Act for Democratic and Republican governors
and legislators to fight over as they develop state programs and enabling
legislation, it seems possible that one source of dispute might be the
content and design of such surveys. Conservatives will no doubt prefer
surveys which focus on work effort while liberals will want to include
questions about quality of life for welfare recipients. Conservatives have
the advantage of statutory language (Figure 4) which emphasizes collecting
information on work effort and program participation, but both liberals
and conservatives might be able to agree that it would be useful to know
how much children are affected by the new programs. Furthermore, some of
the information obtained from these surveys might be of interest to the
many groups (such as the powerful cities and counties in California and
elsewhere) concerned about the budgetary implications of the Act. (EN#10)
The tracking of those who leave TANF could provide information about the
likely impacts of time-limited welfare on General Assistance and on Foster
Care -- two programs that could balloon as families time out of TANF. (EN#11)
It might even provide a way to determine whether child abuse, crime, or
other unwanted behaviors will be affected by the legislation. A bi-partisan
coalition of legislators might be built around the promise that these data
could provide an early warning system of burgeoning costs or unexpected
problems.
Other Program Performance Standards
--- The Act also includes some other program performance standards regarding
decreasing out-of-wedlock births, (EN#12)
improving the enforcement of child support, and more generally measuring
state performance so as to achieve the goals of the Act. (EN#13)
Bonuses are provided to states which reduce out-of-wedlock births, and penalties
are imposed for failure to improve the rates of paternity establishment, but there
is nothing comparable to the Quarterly Reports required of the agencies
administering TANF in the states. (EN#14)
Bonuses are only offered for about five years, but penalties will continue,
and while it is not clear that bonuses are so large that they will foster
a substantial effort by the states to collect data on these issues, the
surveys for the Quarterly Reports might provide some of this information.
B. Case Management Data
Of these three time-limitations on benefits,
the most novel and complex is the five-year limitation on the receipt of
TANF because it involves keeping historical records of benefits, work-history,
and other information over a lifetime; taking into account individual relationships
to families over that lifetime; and creating an absolute limitation on
program participation without a chance to restart program eligibility.
Other social programs have kept earnings and benefit histories over long periods
of time (e.g., Social Security), provided time-limited benefits--but with
a chance to restart program eligibility after some time (e.g., Workers'
Compensation, Unemployment Insurance), and considered current and past
family structure in the determination of eligibility or benefits (e.g.,
Social Security), but none to our knowledge has combined as many of these
features as does the five-year lifetime limitation on the receipt of TANF.
All others provide opportunities to restart eligibility. Taken together,
the special features of time-limited welfare pose some substantial problems
for those designing database systems for case management in the TANF program.
Most states or counties (EN#17)
currently have computerized data systems which collect case-level information
on the structure of the family, income from work, child support status
and payments, and other data needed to calculate AFDC (or food stamp) grant
amounts. In California this information is only available at the county
level, but it is collected in state-wide systems in some states. States
or counties also have information about participation in JOBS programs,
but this is often kept in separate data systems. For example, in California,
most large counties have completely different information systems for keeping
track of AFDC benefit calculations and GAIN (the California JOBS program)
participation. Furthermore, these systems vary in structure and data elements
by county.
TANF envisions a situation in which these
systems are combined or at least communicate with one another. (EN#18)
To enforce the five-year limit in TANF and the other time-limits in the
Act, at first a statewide system must be created, and eventually a national
registry must be constructed as well. At the moment, however, we are far
from having such systems. Even within a single California county, it is
a great challenge to get computer systems working together so that a case
manager will know at any given moment whether a head of household is in
GAIN or receiving welfare, but it is an even bigger challenge to add the
new data elements required to implement TANF, to develop a common format
across systems separated by geography and by bureaucratic task, and to
link them over long periods of time as case composition and individual
names change. Most systems currently only keep information for a few months,
or at most three years, and even those states or counties that do keep
historical information have almost never developed an ongoing process of
linking it over time and across political jurisdictions. Yet TANF and the
changes to the food stamp program require careful record-keeping of months
of receiving aid across spells of aid and periods of sanctions with varying
lengths and application to different family members. There are penalties
for states which fail to comply. (EN#19)
Indeed, the complexity of the TANF provisions
-- for example, months on aid by unmarried persons under eighteen years
of age do not count towards the 60-month time limit and certain kinds of
educational experiences count as "work activities," although sometimes
for only limited periods of time -- means that it will be necessary to
keep current and historical information available for all people in a case,
including their ages, educations, workfare program participation dates
and hours, school attendance, etc. in addition to case-specific data such
as payments, child support, food stamp amounts, and Medicaid eligibility.
Accurate and up-to-date information will be necessary on who is or is not
attending school, or participating in an approved employment or community
service activity, and for how long. And since a maximum of four consecutive
weeks of job search is permitted, and only twelve months of vocational
education, these must be recorded separately from other work activities.
Data systems must track all this, and do so accurately.
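A small sketch may help show why person-level history is needed to apply these rules. The code below is an illustration only, not the statutory calculation: it applies the simplified rule stated above (months on aid while unmarried and under eighteen do not count toward the 60-month limit) and reads the four-week job search cap as a cap on consecutive weeks, which is our assumption. All function names and the example data are hypothetical.

```python
from datetime import date

TIME_LIMIT_MONTHS = 60  # the five-year limit discussed in the text


def age_at_month_start(birth: date, year: int, month: int) -> int:
    """Age in whole years on the first day of the given month."""
    return year - birth.year - ((month, 1) < (birth.month, birth.day))


def months_counted(birth: date, aid_months: list[tuple[int, int, bool]]) -> int:
    """Count aid months toward the time limit.

    Each entry is (year, month, married). Simplified rule from the text:
    a month is skipped if the person was unmarried and under eighteen.
    """
    # Key by (year, month) so a month reported twice is only counted once.
    by_month = {(y, m): married for y, m, married in aid_months}
    counted = 0
    for (year, month), married in by_month.items():
        if age_at_month_start(birth, year, month) < 18 and not married:
            continue
        counted += 1
    return counted


def creditable_job_search_weeks(weeks: list[bool], max_run: int = 4) -> int:
    """Count job search weeks, crediting at most `max_run` weeks per unbroken run."""
    credited = run = 0
    for in_job_search in weeks:
        run = run + 1 if in_job_search else 0
        if in_job_search and run <= max_run:
            credited += 1
    return credited


# Hypothetical recipient who turns eighteen in mid-June 1998 and is on aid all year.
birth = date(1980, 6, 15)
history = [(1998, m, False) for m in range(1, 13)]
used = months_counted(birth, history)
print(used, TIME_LIMIT_MONTHS - used)
```

Even this toy version needs a birth date, marital status by month, and week-by-week activity records for each person, none of which can be recovered from a case-level payment file alone.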
The goal here, however, is not at all the
reality. We are far from being able to do this, and the Act appears to
recognize this by calling for two studies that would be first steps towards
creating a synoptic data system. (1) The Secretary of Health and Human
Services must submit a report to the Congress on the status of the automated
data processing systems operated by the States to assist management in
the administration of State programs. This report must describe what would
be required to establish a system capable of tracking participants in public
programs over time and of checking case records of the States to determine whether
individuals are participating in public programs of two or more States.
The report should contain a plan for building on the automated data processing
systems that exist, and a time estimate for establishing the new system.
(2) In addition, the Act calls for the Commissioner of Social Security
to develop a prototype counterfeit-resistant social security card and to
study the feasibility of issuing this card to all individuals over a 3-,
5-, or 10-year period. Social security number matching, which has been relatively
rare in the past because agencies relied predominantly upon their agency
case or person numbers for tracking, could be facilitated with the development
of a counterfeit-resistant social security card although the extensive
use of social security numbers raises numerous questions about privacy
and confidentiality.
It is a long way, however, from these reports
to a true national tracking capability. Indeed, many states are a long
way from being able to track welfare recipients over time. In California,
for example, most of the information about welfare or JOBS resides in different
county databases, and the only statewide welfare database, the MEDS system,
would have to be significantly redesigned to accommodate the needs of the
TANF legislation.
Case Management Data for Child Support
--- The Act also requires the creation of two other detailed databases
to aid in child support enforcement. The automated State case registry
will contain records on support orders established or modified in the state
after October 1, 1998 and records on each case in which services are provided
under the State plan for child support enforcement (p. 105 of the Act).
Child support enforcement actions are required for those receiving benefits
from TANF, food stamps, Medicaid, foster care maintenance payments, and
any child of an individual who applies for services. The registry must
have standardized data elements for identification of parents (such as
names, social security numbers, dates of birth, and case identification
numbers) and detailed information on case status, on the amounts of support
owed and support paid, and on administrative and judicial actions and proceedings.
Information from the State case registry will be made available to a Federal
Case Registry.
The State Directory of New Hires must be
in operation by October 1, 1997, and employers and labor organizations
in the state must furnish reports for each newly hired employee. Through
a W-4 form employers must furnish to the State Directory of New Hires the
name, address, and social security number of the employee, along with the
name, address, and identification number of the employer, within 20 days
of the date of hire. This directory will be used to locate individuals
for the purposes of establishing paternity and enforcing child support
obligations. The State Directory of New Hires must also report quarterly
to the National Directory of New Hires information on wages and unemployment
compensation, and new hire information can also be disclosed to the state
agencies administering TANF, Medicaid, Unemployment Compensation, Food
Stamps, and SSI. (EN#20)
These two data systems, the State Case
Registry and the Directory of New Hires, cover quite different universes,
and they will be useful for answering quite different questions. The Directory
of New Hires is designed to cover all employment so that it will be somewhat
broader than the existing data from the Unemployment Insurance program.
This may help to fill in some gaps that currently exist in UI data. The
State Case Registry might be useful for producing statistics on the establishment
of paternity and child support agreements for those receiving social welfare,
but it will not cover the universe or even a well-defined demographic (as
opposed to programmatic) group.
How Will Automated Case Management Systems
Be Constructed? --- Modern database technology makes it possible to
construct the massive databases contemplated in the Act. Networking makes
it possible for field units to be in direct contact with centralized computers
so that information can be input or queried on an ongoing basis. Individual
computer tapes now hold up to 40 gigabytes of data, which is enough room
for hundreds of millions of records. Fast computers make it possible to
sort and link these millions of records. And relational databases have
created a powerful tool for organizing and accessing data.
Relational databases (EN#21)
are essentially linked tables of information so that there can be a table
of individual characteristics such as age, sex, race, and marital status
which lists everyone in the database, a table of family relationships which
shows how these people are related to one another, a table listing all
instances of one kind of event along with the characteristics of the event
such as receiving a TANF check of some amount during a specific month,
a table of another kind of event along with its characteristics such as
being enrolled in a work program of some sort, and perhaps still a third
table of events such as receiving wages for an average of 20 hours or more
per week in the last month. The structured query languages used in these
databases make it possible to ask for information on, for example, all
cases in which the head of household was 18 years or older, which had already
accumulated 24 months of welfare assistance, in which the head of household
was not enrolled in a work program, and in which the head of household
had not worked an average of 20 hours or more in the last month. Once the
database is constructed, it is very simple to construct these inquiries,
and this makes it possible to "cut" the data in many different ways without
an excessive amount of programming each time a different slice of the data
is needed.
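The inquiry just described can be sketched in a few lines. The example below is a minimal illustration using Python's built-in sqlite3 module as a stand-in for a production relational database; the table names, column names, and rows are all hypothetical and do not describe any actual state system.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, case_id INTEGER,
                             age INTEGER, is_head INTEGER);
        CREATE TABLE tanf_payment (case_id INTEGER, month TEXT, amount REAL);
        CREATE TABLE work_program (person_id INTEGER, month TEXT);
        CREATE TABLE wages (person_id INTEGER, month TEXT, avg_weekly_hours REAL);
    """)
    months = [f"{1997 + i // 12}-{i % 12 + 1:02d}" for i in range(24)]
    # (person_id, case_id, age); every person listed here is a head of household.
    heads = [(1, 1, 30), (2, 2, 25), (3, 3, 17), (4, 4, 40), (5, 5, 35), (6, 6, 35)]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, 1)", heads)
    for _person_id, case_id, _age in heads:
        n_months = 10 if case_id == 4 else 24  # case 4 has a short aid history
        conn.executemany("INSERT INTO tanf_payment VALUES (?, ?, 500.0)",
                         [(case_id, m) for m in months[-n_months:]])
    conn.execute("INSERT INTO work_program VALUES (2, '1998-12')")
    conn.executemany("INSERT INTO wages VALUES (?, '1998-12', ?)",
                     [(5, 25.0), (6, 10.0)])
    return conn


def flag_cases(conn: sqlite3.Connection, month: str) -> list[int]:
    """Cases with an adult head, 24+ months of aid, no work program, under 20 hours."""
    sql = """
        SELECT p.case_id
        FROM person AS p
        WHERE p.is_head = 1
          AND p.age >= 18
          AND (SELECT COUNT(DISTINCT t.month) FROM tanf_payment AS t
               WHERE t.case_id = p.case_id) >= 24
          AND NOT EXISTS (SELECT 1 FROM work_program AS w
                          WHERE w.person_id = p.person_id AND w.month = :month)
          AND NOT EXISTS (SELECT 1 FROM wages AS g
                          WHERE g.person_id = p.person_id AND g.month = :month
                            AND g.avg_weekly_hours >= 20)
        ORDER BY p.case_id
    """
    return [row[0] for row in conn.execute(sql, {"month": month})]


conn = build_demo_db()
flagged = flag_cases(conn, "1998-12")
print(flagged)
```

Changing the "cut" of the data, for example to a different month or a different hours threshold, means editing a condition in the query rather than writing a new program.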
Relational databases also have another
feature that makes them useful for organizing data. They enforce a kind
of discipline called "normalization"
(EN#22) on how tables are constructed that reduces ambiguity and simplifies
the process of updating the files. This could prove to be especially useful
in the construction of social program databases where considerable confusion
results from having cases composed of persons who can move in and out of
cases and who can form new cases. Many social program databases utilize
cases as their basic unit of analysis and computer systems store information
by these cases. Because the case is the basic unit of concern, information
on the persons within the cases is sometimes not collected in a very useful
form or at all. This can bedevil anyone who wants to follow individuals
over time because they are sometimes not identifiable within the case,
or even if they are identifiable, their relationship to other members of
the case is not always clear. In California, for example, it has been very
hard to identify teenage mothers "nested" within cases with their own mothers
or other relatives because it is impossible to distinguish a case where
a baby is the biological offspring of the teenager's mother from a case
where a baby is the biological offspring of the teenager. A well-designed
and maintained relational database would make it harder for these kinds
of problems to arise. Another kind of problem arises when a child within
a case gets SSI because of a disability. In that circumstance, the child
disappears from the AFDC case and forms a new SSI case. Finally, cases
may dissolve or form in new ways as adults get married or divorced.
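The nested teenage mother problem illustrates what a person-based design buys. The sketch below is a hedged illustration, again using Python's sqlite3 module: persons, case memberships, and parent-child links are kept in separate tables, so a baby's parent can be identified regardless of which case the baby sits in. The schema, the age cutoff of twenty, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, birth_year INTEGER);
    CREATE TABLE case_membership (case_id INTEGER, person_id INTEGER);
    CREATE TABLE parent_of (parent_id INTEGER, child_id INTEGER);
""")
# Two three-person cases that look identical in a case-based file:
# an adult, a teenager, and a baby.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, 1955), (2, 1980), (3, 1996),
                  (4, 1956), (5, 1981), (6, 1996)])
conn.executemany("INSERT INTO case_membership VALUES (?, ?)",
                 [(100, 1), (100, 2), (100, 3),
                  (200, 4), (200, 5), (200, 6)])
# Case 100: the baby (3) is the teenager's child.
# Case 200: the baby (6) is the adult's child, a sibling of the teenager.
conn.executemany("INSERT INTO parent_of VALUES (?, ?)",
                 [(1, 2), (2, 3), (4, 5), (4, 6)])


def nested_teen_parents(conn: sqlite3.Connection, ref_year: int) -> list[tuple[int, int]]:
    """Return (case_id, person_id) for parents under twenty whose child is in the same case."""
    sql = """
        SELECT DISTINCT m.case_id, par.parent_id
        FROM parent_of AS par
        JOIN person AS mom ON mom.person_id = par.parent_id
        JOIN case_membership AS m ON m.person_id = par.parent_id
        JOIN case_membership AS c ON c.person_id = par.child_id
                                 AND c.case_id = m.case_id
        WHERE :ref_year - mom.birth_year < 20
        ORDER BY m.case_id, par.parent_id
    """
    return list(conn.execute(sql, {"ref_year": ref_year}))


teen_parents = nested_teen_parents(conn, 1996)
print(teen_parents)
```

A fuller design would add start and end dates to the membership table so that a child who moves to an SSI case, or an adult who leaves after a divorce, remains the same person with a new membership record.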
These anomalies often make it very hard
to follow people, especially children, in these files. Indeed, there is
a basic paradox embedded in many of these databases. Although there is
a great deal of concern with the welfare of children, the bulk of the attention
is placed on the adults, and the databases are often designed to track
adults much more readily than to track children. If we are really going
to be concerned with outcomes for children, we must make sure that we design
data systems that allow us to follow children and to measure the outcomes
of our programs for them.
These comments suggest that much could
be gained from integrating and redesigning our current data systems. But
this is easier said than done. In our work with the State of California,
we have constructed separate longitudinal persons and cases files for the
four counties, Los Angeles and San Bernardino in Southern California and
San Joaquin and Alameda in Northern California, from information supplied
by them since December, 1992. We have also used data supplied by the state
from the Medi-Cal eligibility file to construct welfare history files for
these same persons and cases back to 1987. To put together the persons
and cases files for the counties we have had to process between four and
eight files from each county (see Figure 5). In doing this, we have
encountered different database management
systems (none of them modern relational databases) for each county, some
of them dating back 20 years, and all of them requiring substantial translation
and reformatting before we could construct our own four county database.
But it is not only the age of these systems that makes them unwieldy. They
also have quite different data elements and different definitions of the
same data elements, and some of the files are much richer than others.
This has led us to create separate persons and cases files for Los Angeles.
Figure 6 shows a table of the variable
list for the six files we have created. A comparison of the columns for
the Los Angeles County Case file and the Four County Case file demonstrates
the additional richness of the Los Angeles file. But this chart does not
begin to reveal the work that went into creating variables that were comparable
across just these four counties. In summary, then, the creation of a uniform
database requires both substantial computer talents to create datasets
that are in a common computer format and significant social program
knowledge and re-processing of data to ensure that ostensibly similar data
really are measuring the same variable.
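A minimal sketch of the recoding step, under the assumption that each county uses its own codes for the same concept. The county keys, raw codes, and common labels below are invented for illustration and do not reproduce any county's actual code list.

```python
# Hypothetical county-specific codes for "aid type"; real county code lists differ.
RECODE = {
    "los_angeles": {"30": "AFDC-FG", "35": "AFDC-U"},
    "alameda": {"FG": "AFDC-FG", "UP": "AFDC-U"},
}

def harmonize(county, raw_code):
    """Map a county's raw code to the common code, or None if no equivalent is known."""
    return RECODE.get(county, {}).get(str(raw_code).strip().upper())
```

Returning None for unmapped codes, rather than guessing, keeps the places where the counties are not comparable visible in the combined file.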
The difficulty here is that there are so
many different databases that must be pieced together to create a useful
longitudinal dataset. One solution to the problem would be to develop an
entirely new system that uses more modern technology. The Act provides
the opportunity to do this for TANF because it places expenditures for
"information technology and computerization needed for tracking or monitoring"
(page 21) outside the fifteen percent cap on administrative expenditures.
This may help to fund new systems, but it could also lead to a battle between
liberals and conservatives. Liberals may choose not to spend TANF resources
on information systems which they may see as mechanisms only for better
enforcement of the time-limits, and conservatives may favor spending because
they view enforcement as the essence of the new program. Perhaps, as with
the surveys for the quarterly reports, a bipartisan coalition can be put
together that is concerned with monitoring the long-term impacts of the
Act.
C. Existing Systems
From a statistical perspective, one of
the advantages of this Act is that it did not drastically change the eligibility
for two major social welfare programs, Medicaid and food stamps. (EN#23)
Consequently, the data systems and the universe of program participants
for these programs will not be substantially changed, and they will be
available as baselines for studying the impact of TANF. Thus, a sample
of those eligible for Medicaid before and after the introduction of TANF
could be studied to see how the program has affected their lives. Similarly,
a sample of food stamp families with minor children could be studied in
the same way.
Some other existing systems include Unemployment
Insurance data, Vital Statistics, foster care data, and tax data. The Unemployment
Insurance system provides quarterly wages for individuals and information
about employers. Vital statistics include dates of birth and death and
other basic demographic information. Foster care data in California (EN#24)
include information on the children (birth date, sex, ethnicity, relation
removed from) and the placement (removal-reason, location of placement,
start-date, facility type, end-date, reason for exit). Tax data provide
detailed information on wages and other income, receipt of the Earned Income
Tax Credit, and some information on family structure. These tax data are
very rich sources of information although they are not easily accessible
because of confidentiality safeguards.
Another way to think of these various databases
is that they cover three broad areas. Vital statistics, foster care data,
and the new State Registry for Child Support may provide information on
basic family structure. Unemployment Insurance, tax data, and the new Directory
of New Hires can provide information on employment and wages. Finally,
Medicaid, food stamps, TANF, and SSI can provide information on means-tested
social program participation.
Because these datasets often contain complementary
information, much can be learned by linking them. At UC DATA, for example,
we have worked with the California Department of Social Services and the
State Franchise Tax Board to link tax records, Unemployment Insurance data,
and AFDC data as part of a study of the take-up rate for the Earned Income
Tax Credit. (EN#25)
We are now in the process of linking foster care data and AFDC information.
(EN#26)
D. Conclusions from the Implementation
Approach
Survey and administrative data produced
to implement the Personal Responsibility and Work Opportunity Act of 1996
could be very useful in monitoring and evaluating the new programs established
as part of the Act. If surveys are used by states to produce their quarterly
reports, then these surveys could provide a flexible vehicle for monitoring
the impacts of the legislation. Existing data systems such as Medicaid
or food stamps may provide opportunities for tracking needy families who
might otherwise disappear from welfare rolls because of the changes from
AFDC to TANF. Existing data systems can also be used to produce ancillary
information on family structure, employment, and wages that can be very
useful in monitoring the impact of the legislation. Finally, as new case
management systems are developed, efforts can be made to ensure that common
data elements and common definitions of data elements are used across the
counties and states.
III. Generic Problems of Monitoring and
Evaluation and Some Solutions
The implementation perspective treats legislation
much as we might trace the possible courses of a great river in flood.
We examine the topography and presume that the legislation will follow
the contours of the land. Fatalists might stop there, but with a bit of
engineering, we can sometimes channel even the most rambunctious river.
In the development of statistical data systems, there are two types of
engineering that must be done. We must deal with the technical problems
of constructing useful databases, and we must deal with the political and
bureaucratic ones of coordinating their construction. In this section we
try to identify some of the basic technical problems of monitoring and
evaluation, many of them mentioned already, to see if we can find ways
to overcome them. In the fifth section, we summarize our thoughts on the
political and bureaucratic problems.
A. The Problems and Some Solutions
Choosing the Units and Universe for
Analysis --- To answer any statistical question, the starting point
is defining the unit of analysis. We have already discussed the complexities
of monitoring individuals, especially children, in a system that operates
in terms of cases. The problem is exacerbated by the fact that there are
so many different definitions of a case. An AFDC case, food stamps household,
or tax return might cover different subsets of people within the same "family."
Certainly, one of the ongoing challenges is to improve our ability to sort
out these different definitions and to develop ways to track children and
adults within these programs.
Once we have decided upon the basic units
of interest, say children or adults or families, then we must describe
the universe that we wish to study. This is difficult for at least two
reasons. First, it matters whether we sample the stock of people on welfare
or the flow of people into it (or off it). It is well-known that a cross-section
of people on welfare has a longer average spell-length and is less likely
to get off welfare in the next month than a sample of new entrants to welfare.
Yet, it is easy to sample a cross-section because administrative systems
are usually designed as repeatedly updated cross-sections, but it is hard
to sample by length of time on welfare because welfare databases have not
typically kept track of this information. It can only be obtained by laboriously
linking repeated cross-sections to create a longitudinal database. Second,
we often care about demographic groups such as all legal immigrants, all
disabled children, or all people below the poverty line instead of programmatic
groups such as all legal immigrants on welfare or all disabled children
on SSI, or all people on welfare. We often care about the demographic groups
because we want to know the fraction of a population that is served by
a program or what happens to those who are not served by it. This is often
called the problem of obtaining "denominator data," but it is also the
problem of getting some variation in the treatment so that we can determine
what happens to those who get the program (the "treatment") and what happens
to those who do not. Another closely related problem is comparing those
in one program, say AFDC, with those in another program such as TANF.
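The difference between sampling the stock and sampling the flow can be seen in a small worked example. The spell pattern below is invented: it assumes that one 12-month spell and three 2-month spells begin in every month.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical spells as (start_month, length): in every month 0..23,
# one 12-month spell and three 2-month spells begin.
spells = [(start, length) for start in range(24) for length in (12, 2, 2, 2)]

def entrants(spells, month):
    """Flow sample: lengths of spells that begin in the given month."""
    return [length for start, length in spells if start == month]

def stock(spells, month):
    """Stock sample: lengths of spells in progress during the given month."""
    return [length for start, length in spells if start <= month < start + length]
```

In month 12 the entrants have a mean spell length of 4.5 months, while the cases found in the cross-section have a mean of about 8.7 months, because long spells are far more likely to be in progress in any given month.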
The sampling of stocks and flows can be
facilitated by constructing longitudinal relational databases from which
cross-sections, new entrants, new exits, or any other group can be easily
hived off. If all we care about is rates of participation in programs,
we can often solve the "denominator" problem by taking census or other
data on some demographic groups of interest and comparing the number of
people in each demographic group in the census with the number in our programs.
This solution sometimes founders because the definition of a group in
one source of data is different from that in another, but with some effort
and a bit of artifice this can often be overcome. The problem of getting
some variation in the treatment is often a difficult one (and once solved
there is often the additional problem of selection into the treatment),
but it can sometimes be solved by finding a data source that can be linked
with the program participation data and which is either a superset of those
in the program or an intersecting set which includes some people in the
program and some outside of it. As noted earlier, because Medicaid has
not been substantially changed by the new legislation, a sample of people
receiving Medicaid before the welfare program changes should be similar (EN#27)
to a sample of people receiving Medicaid after the changes. By comparing
AFDC usage in the Medicaid sample selected before the changes in welfare
with TANF usage in the Medicaid sample selected after the changes, we should
be able to see in what ways a group of needy individuals (as indicated
by their enrollment in Medicaid) is affected by the changes in the welfare
program. (EN#28)
Describing the Treatment --- A remarkable
feature of the Act is that it says a great deal about what it wants to
achieve, but very little about how it wants to achieve it (except perhaps
in the section on child support). There is virtually no specification of
what programs should be implemented to move families from welfare to work,
or what should be done to reduce teenage pregnancies beyond abstinence
education. This provides the states with a tremendous opportunity to innovate,
but it also presents those monitoring these programs with a tremendous
problem of knowing what the treatment is.
There are two levels to this problem. On
a state by state level, some effort must be put into keeping track of the
programs that are devised. This in itself may be a substantial job, as
indicated by those who have tried to document the details of the federal
waiver process. Unfortunately, however, even if this can be done, it will
not be enough because programs may differ substantially from person to
person within a state. This may be because some people will get more services
than others, because some counties offer different programs, or because
eligibility for programs will be tailored to the individuals. In any case,
this means that individual level data will be needed to assess the real
impact of the new welfare programs. This poses a tremendous challenge to
those monitoring the program because of the difficulties of linking data
on community service or employment and other programmatic information with
participation in welfare. For example, in the Cal-Learn program in California
for pregnant and parenting teens, the treatment consists of case management
services and monetary sanctions and rewards to encourage teens to finish
high school. To describe the Cal-Learn treatment fully, UC DATA has had
to obtain data from AFDC files where there are records of monetary sanctions
and rewards, from GAIN files where there are records of supportive services
and of recommendations to provide bonuses or sanctions, and from the Adolescent
Family Life Program files where there are records of case management services.
Linking over Time, Programs, and Space
--- Linking over time, programs, and space can greatly increase the power
of a statistical system. Yet each kind of linkage presents particular problems
and opportunities.
Linking over time creates a longitudinal
database which is especially useful for understanding the dynamics of program
participation. This might seem straightforward, but it requires some rules
for following cases and some understanding of the ways that files are updated.
As for cases, what should be done when a case splits up or seems to disappear?
We have followed the rule of searching for the youngest child in the original
case and continuing with that person on the grounds that the youngest child
is most likely to continue receiving assistance. But other, and possibly
better, rules are possible. Understanding the way files are updated is
important because cases may regularly disappear at some calendar date only
because bureaucratic routines call for cleaning out discontinued cases
at that time. Or updating may lead to some clerical errors in identifiers
so that attempts must be made to search for cases that have continued but
with a different identifier. Or cases may be assigned different case numbers
from one spell of welfare to the next.
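The youngest-child rule can be sketched as follows. This is only an illustration: it assumes that each person carries a stable person identifier across monthly files and that the original case contains at least one child, and the record layout is hypothetical.

```python
def anchor_person(members):
    """Pick the youngest child in the original case (latest ISO birth date)."""
    children = [m for m in members if m["role"] == "child"]
    return max(children, key=lambda m: m["birth_date"])["person_id"]

def follow(anchor_id, monthly_files):
    """Return the case number holding the anchor person each month, or None if absent."""
    history = []
    for records in monthly_files:      # one list of (case_id, person_id) per month
        case = next((c for c, p in records if p == anchor_id), None)
        history.append(case)
    return history
```

A change in the returned case number marks a case that split or was renumbered, and a None marks a month in which the child could not be found and a search under other identifiers may be needed.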
Linking across programs or datasets can
greatly increase the possibilities for analysis, but it usually requires
linkage of identifiers, such as names, that might be recorded in quite
different ways. The field of probabilistic matching has developed a great
deal in the last decade, so there are now very useful algorithms for determining
the likelihood that one case is the same as another based upon the degree
to which a set of identifiers is the same in the two cases. (EN#29)
This still requires that the designer of the system choose the set of identifiers
and that the designer decide how to use information about the likelihood
of a match. In one model, a match is considered to have been made if the
likelihood exceeds a threshold and from then on the records from the two
files are treated as if they were about the same case. Alternatively, if
a subset of the records can be matched accurately (which may be possible
through intensive examination and investigation), then this information
can be used to build a model for imputation and editing the rest of the
data.
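The threshold model can be sketched with one common formulation of probabilistic matching, usually associated with Fellegi and Sunter, in which each identifier contributes a log likelihood ratio. The identifiers chosen, the agreement probabilities, and the threshold below are all assumptions made for illustration, and real systems standardize names and dates before comparing them.

```python
import math

# Hypothetical probabilities for each identifier:
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELDS = {
    "surname":    (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "sex":        (0.99, 0.50),
}

def match_weight(rec_a, rec_b):
    """Sum of log2 likelihood ratios over the compared identifiers."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total

def is_match(rec_a, rec_b, threshold=8.0):
    """Threshold rule: treat the pair as the same person if the weight is high enough."""
    return match_weight(rec_a, rec_b) >= threshold
```

The choice of identifiers and of the threshold remains the designer's decision, as noted above; a lower threshold links more true pairs at the cost of more false links.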
Linkage across datasets could be dramatically
improved if efforts were made to develop common identifiers. We have already
noted that the Act provides for a substantial amount of matching by Social
Security number, and it calls for a study of counterfeit-resistant Social
Security Cards. The use of Social Security numbers, of course, is not foolproof
because of mispunches and other problems that can arise. An alternative
or complementary approach is to require records to have enough individual
information such as name, sex, date of birth, mother's maiden name, or
other information to facilitate matching. The California Health Information
Policy Project has championed this approach and gotten some support for
it. All of these methods, of course, raise sensitive issues of confidentiality.
Linkage across programs sometimes provides
multiple sources of information on the same data element. This makes it
possible to get a better understanding of how the method of data collection
affects the data element. For example, Henry E. Brady and Samantha Luks
have explored the differences between survey responses on the length of
welfare spells and administrative data on the receipt of welfare. (EN#30)
They have found evidence for social desirability bias in survey responses
with respondents reporting shorter spells than recorded in the administrative
data, and they have found evidence for administrative churning in the administrative
data, in which one to two month interruptions in aid reflect late or
misplaced paperwork rather than true interruptions. In a study of the Earned
Income Tax Credit, information on earnings from tax records, AFDC files,
and surveys provides a chance to see if respondents report different earnings
because of different incentives in the programs. (EN#31)
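One way to examine the effect of administrative churning on measured spell lengths is to rebuild spells from monthly receipt flags with and without bridging short gaps. The sketch below is an illustration under that assumption, not the procedure used in the study cited above; the function name and the input layout are hypothetical.

```python
def spells(on_aid, max_gap=0):
    """Convert monthly 0/1 aid flags into inclusive (start, end) spells.

    Gaps of up to max_gap months between two months of receipt are bridged.
    """
    result = []
    for month, flag in enumerate(on_aid):
        if not flag:
            continue
        if result and month - result[-1][1] - 1 <= max_gap:
            result[-1] = (result[-1][0], month)   # extend the current spell
        else:
            result.append((month, month))         # start a new spell
    return result
```

For the flags 1,1,0,1,1,1,0,0,0,1 the default gives three spells, while max_gap=2 merges the first two into a single six-month spell and leaves the three-month interruption intact.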
There are two ways that files can be linked
across space. One is simply to look for the same individuals in different
jurisdictions so that they can be followed if they move. This is essentially
another version of the matching problem described above. A second way that
data can be linked across space is to connect Census or other information
that is available on a geographic basis with individual records. This can
be done by geocoding addresses (which raises additional problems in matching),
by using Zip codes, or by using other information about geographic location.
These kinds of information can help us understand how context affects individual
behavior. Hilary Hoynes, using UC DATA information, has shown how the availability
of jobs in local areas affects the employment prospects of welfare recipients
and their ability to get off welfare. (EN#32)
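The second kind of spatial linkage amounts to attaching area-level information to individual records through a geographic key. The sketch below uses Zip codes as the key; the field names and the area values are made up for illustration.

```python
# Hypothetical area-level data keyed by Zip code (values are invented).
AREA = {
    "94704": {"unemployment_rate": 0.061},
    "90012": {"unemployment_rate": 0.089},
}

def attach_context(person_records, area=AREA):
    """Add area-level fields to each person record; unmatched Zip codes get None."""
    linked = []
    for rec in person_records:
        context = area.get(rec["zip"], {"unemployment_rate": None})
        linked.append({**rec, **context})
    return linked
```

Records whose Zip code has no area data are kept with a missing value rather than dropped, so that the extent of non-matching can be assessed.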
Gathering Outcome Data and "Control"
Variables --- Getting people off welfare is an explicit goal of TANF,
and the Act proposes to do this by preparing them for work. In the past,
without time-limited welfare, a transition off welfare was clearly a good
thing because it indicated that assistance was no longer needed, but with
time-limited welfare, people may leave welfare without being prepared for
work, without having any job available, or without having overcome the
difficulties that led them to seek assistance in the first place. This
means that data on outcomes other than simply leaving welfare must be collected.
In fact, information on the reason for leaving welfare is required for
the Quarterly Reports (see Figure 4), but ideally we would like to have
additional information on job prospects and current quality of life.
Quality of life information is especially
important with respect to the children in the case. As we have already
noted, welfare information systems have often neglected to collect much
information on children. Yet, it is of utmost importance to know what happens
to them. The Act seems to recognize this and there is a provision for studies
of the circumstances of children of families that timed-off welfare and
of teenage parents and their children (page 53). The studies have to consider
the incomes, educational attainment, employment, criminal behavior, fertility,
and social program participation of these groups. Outcomes that are not
mentioned, but which might be equally important, are nutrition, adequacy
of housing, adequacy of health care, child abuse and neglect, and movement
to foster care or adoption.
As well as getting outcome variables, it
is very important to record the characteristics of recipients which might
increase or decrease their ability to leave welfare or to be successful
once they leave. Educational attainment, job training, disabilities, marital
status, and number of children are some of the most important characteristics.
Many of these are often very badly measured by administrative data systems
so that it is hard to do analyses which take them into account. One of
the advantages of surveys is that they can capture this information.
Data Quality and Missing Data --- Missing data, in the form of either
unit non-response or item non-response, (EN#33) has always been
a major problem for survey researchers, but it may be an even bigger problem
for those designing administrative data systems. Administrative data systems
often have tremendous gaps in the reporting of some items --- especially
those that are unrelated to the business purpose of the data system, and
individuals or cases sometimes get lost because of faulty matching. In
addition, administrative data systems often suffer from severe problems
of non-comparable data, poor documentation, and unreliable data. These
problems are reduced in sample surveys through the use of a uniform instrument,
careful documentation, and the thorough training of interviewers and coders.
There are well-known ways to deal with missing data, but non-comparable
data pose even greater challenges.
B. Survey Data, Administrative Data,
and Linking
Surveys versus Administrative Data ---
There is no one-time fix for all of these problems, and no one means
of data collection is unequivocally better than another. Administrative
data, for example, may have its weaknesses, but it also has great strengths
such as large sample sizes and being an excellent record of certain kinds
of events. Figure 7, reproduced from a
study that UC DATA did for the Division of Workers' Compensation of the
State of California, (EN#34)
compares administrative data with sample surveys along three dimensions.
"Data" refers to the amount and type of data that can be collected. "Cases"
refers to the number of observations and the degree to which they are representative
of the universe of interest. "Times" refers to the frequency and schedule
of data collection.
(EN#35)
Administrative databases often have only
a small amount of information on each case compared to surveys, (EN#36)
but what is there for business purposes is often of superior quality. For
example, the kinds of data that people often have trouble remembering in
an interview, such as the exact amount of their benefits or the dates on
which they received assistance, are carefully recorded in administrative
databases which are designed to keep track of these facts. Unfortunately,
those data which are collected in administrative databases but which are
not essential for business purposes, for example educational attainment
or race, are often of inferior quality. (EN#37)
Administrative databases are often richer in the description of services
--- receipt of benefits, leaving welfare, preparation for work, job training,
child care --- than in two other important types of information. They often
contain little on the characteristics of people, situations, or events
such as educational attainment, job history, or disability that might explain
why the individual needs the service, and they seldom contain outcome data,
such as quality of life measures concerning the adequacy of parenting,
health care, nutrition, or housing, that might more fully characterize
the situation of the individual. It is true that the receipt of some services,
such as job training, might explain why some others, such as welfare assistance,
are eventually no longer needed, but by and large, surveys must be used
to collect background information and detailed outcome measures. Surveys
are also useful if information needs change over time because it is much
easier and less costly to rewrite a survey instrument than to change an
ongoing administrative database.
In terms of cases, administrative databases are usually superior
to surveys because they include information on an entire universe of cases,
although this can present problems of confidentiality. Finally, administrative
databases and surveys differ in the timing of data collection. Administrative
systems collect data as part of the ongoing administrative process. This
is an advantage insofar as it ensures that these events are recorded in
a timely manner before memory loss or other events obscure them. But it
is a disadvantage insofar as it means that there is often no observation
of the case during a "normal" period between important administrative events.
This means that important changes in the case can remain invisible to these
systems.
Linking Datasets --- What would
the best possible dataset look like? Let us describe datasets by how much
of the three-dimensional space they fill up in Figure 8. In this figure,
the vertical axis is the number of variables or
the amount of data, the axis into the plane of the picture is the number
of observations or cases, and the horizontal axis is time. The best dataset
would provide us with as much data, as many cases, and as many time periods
as possible --- it would fill up the entire space. Neither administrative
data nor surveys can do all of these things at once.
This suggests a hybrid approach. Why not
link surveys and administrative data to get the advantages of each method?
In fact, why not link several surveys to one another and several administrative
datasets to one another, and then link the surveys to the administrative
data? Figure 8 is a schematic of how we have done this to create the California
Welfare Research Databases. The picture (which one wag described as "Downtown
Pittsburgh") shows how UC DATA and the California Department of Social
Services have linked state level administrative data (the Longitudinal
Data Base), to county level administrative data about 15,000 research families
from four counties (the County Welfare Administrative Data Base), and to
several in-depth panel surveys (the Panel Survey Data Base) from a 15%
sample of those in the four county database and from a foreign language
survey of all language groups comprising one percent or more of the entire
15,000 research families. Taken alone, each of these datasets was constructed
by putting a number of files together, and they required a great deal of
linking, editing, cleaning, and documenting. We describe them and their
uses in more detail in the next section.
IV. Specific Examples of What Might
be Done
The California Work Pays Demonstration
Project --- In the last four years, the Research Branch of the California
Department of Social Services (CDSS), University of California Data Archive
and Technical Assistance (UC DATA), and the Survey Research Center (SRC)
at the University of California, Berkeley have been working together on
California's federal AFDC waiver, the California Work Pays Demonstration
Project (CWPDP). It would take us too far afield to describe California's
waiver in detail, but the waiver has two major components: (1) changes
in the calculation of grants meant to encourage work effort such as waiving
the 100 hour work rule for unemployed parent cases and rescinding the four
month limitation on the $30 and 1/3 income disregard and (2) the Cal-Learn
program to encourage pregnant and parenting teens to finish high school
by providing monetary incentives for good grades and disincentives for
getting bad grades or dropping out of school, and case management services
to help teen parents get access to services and manage their lives.
When the first parts of the California
waiver came into effect on December 1, 1992, UC DATA was asked to design
and implement a series of data collection strategies for an experimental
evaluation of the work incentives feature of the waiver. The many elements
of this design are summarized in Figure 9.
(EN#39) Its central
feature (see the rectangle with a dashed line around it near the top
of the picture on the left) is the designation of 15,000 cases on AFDC
in four counties (Alameda, Los Angeles, San Bernardino, and San Joaquin)
as research cases. Choosing these cases was greatly simplified by the existence
of the state-wide MEDS file (see the upper left-hand corner of Figure 9)
which has records on all Californians eligible for Medi-Cal. MEDS delineated
our universe of potential research subjects because all AFDC recipients
automatically appeared on it through their eligibility for Medi-Cal.
Ten thousand of the 15,000 research cases
were placed in an experimental group which, like the rest of the welfare
population, was subject to the rule changes, and five thousand were kept
on the rules that were fixed as of September of 1992. For them it was as
if the welfare law never changed. Though the U (or unemployed parent) cases
(which are two parent families) constitute only about 5-7% of the welfare
caseload, they were oversampled for the experiment because, as a group,
they were expected to be more responsive to the work incentive features
of the program. (Adults in this type of case tend to be employed more.)
One third of the CWPDP sample was drawn from the unemployed parent cases.
(EN#40)
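Because the U cases were oversampled, estimates for the whole caseload need design weights. The arithmetic can be sketched as follows, assuming for illustration that U cases are 6% of the caseload (the text gives about 5-7%) and one third of the sample; the stratum label "FG" for all other cases is a placeholder.

```python
def design_weights(pop_share, sample_share):
    """Weight for each stratum = population share / sample share."""
    return {k: pop_share[k] / sample_share[k] for k in pop_share}

# Assumed shares: U cases are 6% of the caseload and one third of the sample.
weights = design_weights({"U": 0.06, "FG": 0.94}, {"U": 1 / 3, "FG": 2 / 3})

def weighted_mean(records, weights):
    """records: list of (stratum, value) pairs. Returns the design-weighted mean."""
    num = sum(weights[s] * v for s, v in records)
    den = sum(weights[s] for s, _ in records)
    return num / den
```

Under these assumptions each U case counts for 0.18 of a case and each other case for 1.41, so the weighted results reflect the composition of the caseload rather than that of the sample.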
A supplemental group of 4,000 pregnant
and parenting teens has been drawn to evaluate the Cal-Learn component
of the CWPDP. Each of these 4,000 is being assigned to one cell of four
cells in a two factor experimental design. One of the two factors is case
management services and the other is monetary sanctions and incentives.
Teens assigned to one cell get both case management services and monetary
sanctions and incentives, those assigned to another cell get neither, and
those assigned to one of the other two cells get one of the treatments
but not the other. Thus, the impacts of both case management services and
monetary sanctions or incentives can be evaluated with this design.
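The logic of the two factor design can be sketched as follows. The equal allocation across cells, the random seed, and the function names are assumptions made for illustration; the actual assignment procedure is not described here.

```python
import random

# The four cells as (case management, monetary sanctions and incentives).
CELLS = [(cm, money) for cm in (0, 1) for money in (0, 1)]

def assign(teen_ids, seed=0):
    """Randomly assign teens to the four cells in (nearly) equal numbers."""
    rng = random.Random(seed)
    ids = list(teen_ids)
    rng.shuffle(ids)
    return {tid: CELLS[i % 4] for i, tid in enumerate(ids)}

def main_effects(cell_means):
    """Average effect of each factor across the two levels of the other factor."""
    cm = ((cell_means[(1, 0)] - cell_means[(0, 0)])
          + (cell_means[(1, 1)] - cell_means[(0, 1)])) / 2
    money = ((cell_means[(0, 1)] - cell_means[(0, 0)])
             + (cell_means[(1, 1)] - cell_means[(1, 0)])) / 2
    return {"case_management": cm, "money": money}
```

If the two differences averaged for a factor are far apart, the factors interact, which is something a design with only a single treatment group could not reveal.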
The CWPDP Datasets --- In order
to develop the best possible tools to evaluate the CWPDP, a number of datasets
have been constructed based upon the MEDS file and the over 15,000 research
families in the four counties. These datasets (EN#41)
are:
Longitudinal Databases (LDB) --- Ten percent
and one percent samples of cases and persons have been taken from the MEDS
file from 1987 to 1995. (See the upper right hand corner of Figure 9.)
These samples are of all Californians who are enrolled in Medi-Cal, and
they are constructed to be continuously updated rolling cross-sections
with continuous monitoring of families once they get on aid. This continuous
follow-up provides the longitudinal component to the data. Data on quarterly
earnings from the state Unemployment Insurance data files have been added
to confidential versions of these files. The major features and data elements
in these files are summarized in Figure
10.
Research Sample Longitudinal Database (Sample
LDB) --- The MEDS files have also been used to construct a longitudinal
database for the 15,000 research cases. This file has the same information
as the LDB described above.
County Welfare Administrative Database
--- This dataset provides information derived from county AFDC and food
stamps databases on the 15,000 research subjects. On Figure 9 it appears
near the bottom of the page on the left where it is called the Uniform
Database. Our initial hope had been that we could create a truly uniform
set of codes and variables across all four counties, but as shown earlier
in Figures 5 and 6, the county AFDC and food stamp case management systems
were simply too different to make this possible.
Panel Surveys --- Two waves of in-depth
telephone interviews with a 15% subsample of the 15,000 original research
cases, or about 2,250 female heads of assistance units who speak English
or Spanish, have been conducted. In addition, two waves of a parallel foreign
language survey of 1,350 people who speak Armenian, Cambodian, Laotian,
or Vietnamese have been finished. These four language groups were chosen
because out of the 15,000 research cases, each of them constituted one
percent or more of the sample. The English-Spanish and Foreign Language
surveys ask basically the same questions, but the Foreign Language survey
includes some additional items about refugee status, including ESL classes
and camp experiences. The content of the surveys is summarized in Figure
11.
These surveys include background information
and outcome information that is almost never available from administrative
data systems. This includes questions about education, AFDC history, work
history, housing quality and stability, economic hardship, hunger, respondent
and child's health and disabilities, labor market activities of partner/spouse,
income, child support, knowledge and use of child care, and
knowledge of work incentives. The rate of interview refusal is extraordinarily
low, and the greatest problem with conducting the interviews is locating
the respondents.
Cal-Learn Studies --- A number of datasets
are being developed for the Cal-Learn study. Administrative data from a
variety of data sources including AFDC, GAIN (the California JOBS program),
and Adolescent Family Life Programs has been put together into the Cal-Learn
Administrative database. There are two Cal-Learn surveys. There is a "Retrospective
Survey" of those in the Cal-Learn program who have already had children
(see the left-hand side of Figure 9) and a "Prospective Survey" (see the
bottom right of Figure 9) for those teens at risk of becoming teenage parents.
In the prospective survey, teens who are potentially, but not yet, eligible
for Cal-Learn (i.e., eleven- to seventeen-year-old sons and daughters of
adults in the WPDP English/Spanish telephone survey) will be interviewed
by telephone.
Our experience in California (and the experience
of others around the country) has demonstrated the possibility of linking
administrative datafiles to construct research quality longitudinal datasets,
and the added benefits to be gained by conducting surveys that can be linked
with the administrative data. We have found that administrative datafiles
become more and more useful as they are extended in time to create longitudinal
datasets, as they are linked together to provide more variables, and as
they are cleaned and documented to make them readily accessible. Our datasets
have been designed so that they can be linked, so that they complement
one another, and so that they provide information on important policy issues
such as teenage parenting, quality of life for welfare recipients, disabilities,
job preparation, and employment. We have found that they provide the basis
for monitoring many aspects of the welfare system and for answering very
diverse research and evaluation questions. We consider some of them in
the next section.
B. Answering Specific Questions
In this section we will consider how three
important questions --- disability and AFDC, the use of the Earned Income
Tax Credit by welfare recipients, and the role that job availability plays
in exiting from welfare --- have been approached using the California Work
Pays Demonstration Datasets.
Disability and AFDC --- Although
TANF allows states to exempt up to 20 percent of their caseload from the
five year limit on assistance due to conditions such as disability, we
know surprisingly little about the actual impacts of disability on the
receipt of welfare. Probably the major reason for this is that there are
very few datasets which link information on disability to information about
AFDC receipt. Henry Brady, Marcia Meyers, and Sam Luks are using the CWPDP
data to investigate the impact of child and adult disabilities on the duration
of welfare spells. (EN#42)
The CWPDP surveys were designed to ask
detailed questions about disabilities of children and mothers. This alone,
however, would not have been that useful because the surveys only reliably
indicate AFDC status at a point in time. Linking the panel surveys with
the Research Sample Longitudinal Database provides reliable retrospective
AFDC history back to 1987. Linking the panel surveys with the County Welfare
Administrative Database (CWAD) provides reliable information on AFDC history
after the interview dates. In addition, efforts are being made to link
the County Welfare Administrative Database to state Supplemental Security
Income files to check on SSI status for each of the families in the CWAD
(and in the panel surveys which are nested within the CWAD). At the moment,
the researchers are relying upon survey responses about SSI status.
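The linkage just described can be sketched in a few lines of code. This is only an illustration under stated assumptions: the record layouts, the field names (`case_id`, `interview_month`), and the sample values are hypothetical and are not taken from the CWPDP files. The sketch shows how a survey record, which observes AFDC status at a single point in time, can be given a retrospective and a prospective aid history by joining it to monthly administrative records on a shared case identifier.

```python
from collections import defaultdict

# Hypothetical survey records: one row per respondent, with the interview month.
surveys = [
    {"case_id": "A1", "interview_month": "1994-03", "mother_disabled": True},
    {"case_id": "B2", "interview_month": "1994-04", "mother_disabled": False},
]

# Hypothetical monthly administrative records: (case_id, month, on_aid).
admin_months = [
    ("A1", "1994-01", True), ("A1", "1994-02", True),
    ("A1", "1994-04", True), ("A1", "1994-05", False),
    ("B2", "1994-03", True), ("B2", "1994-05", False),
]


def link_histories(surveys, admin_months):
    """Attach pre- and post-interview aid histories to each survey record."""
    by_case = defaultdict(list)
    for case_id, month, on_aid in admin_months:
        by_case[case_id].append((month, on_aid))

    linked = []
    for row in surveys:
        history = sorted(by_case.get(row["case_id"], []))
        # "YYYY-MM" strings sort chronologically, so plain comparison works.
        before = [m for m in history if m[0] < row["interview_month"]]
        after = [m for m in history if m[0] > row["interview_month"]]
        linked.append({**row, "history_before": before, "history_after": after})
    return linked


linked = link_histories(surveys, admin_months)
for rec in linked:
    months_on_aid_before = sum(1 for _, on_aid in rec["history_before"] if on_aid)
    print(rec["case_id"], months_on_aid_before, len(rec["history_after"]))
```

In this sketch the survey supplies the disability measures and the administrative file supplies the aid history on either side of the interview date, which is the division of labor the paragraph above describes.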
There are few studies which directly examine
the relationship between disabilities, work, and welfare receipt, but Acs
and Loprest have used 1990 SIPP data to show that mothers with severe or
multiple limitations are less likely than others to leave welfare for
a job. They find little consistent evidence that the disability status
of children affects these transitions. This may be because it matters what
transitions are being studied. Using the CWPDP data, Brady, Meyers and
Luks show that the disabilities of mothers and children do not seem to
predict exits from AFDC, but they do predict the kind of exits that will
occur. Simply put, disabilities of both mothers and children appear to
simultaneously increase time on AFDC and to increase the likelihood that
the case will move from AFDC to SSI. These two competing effects cancel
one another out when we only look at exits from AFDC because families can
exit into SSI or completely off AFDC and SSI.
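A small numerical sketch may make the cancellation concrete. The monthly exit probabilities below are invented purely for illustration and are not estimates from the CWPDP data. Under one reading of the finding, disability lowers the chance of leaving aid altogether and raises the chance of moving to SSI, so the combined rate of exit from AFDC can look unchanged even though the destinations differ sharply.

```python
# Hypothetical monthly probabilities of each kind of exit from AFDC.
# These numbers are illustrative assumptions, not findings.
exit_rates = {
    "no_disability": {"off_aid": 0.040, "to_ssi": 0.002},
    "disability": {"off_aid": 0.030, "to_ssi": 0.012},
}


def total_exit_rate(rates):
    """Any exit from AFDC, whichever destination the case moves to."""
    return rates["off_aid"] + rates["to_ssi"]


for group, rates in exit_rates.items():
    # The totals match across groups; only the split by destination differs.
    print(group, round(total_exit_rate(rates), 3), rates["off_aid"], rates["to_ssi"])
```

An analysis that records only whether a case left AFDC would see the same total for both groups; separating the two destinations is what reveals the difference.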
Take-up of the Earned Income Tax Credit
--- The EITC project has linked most of the CWPDP databases to State tax
records to get a better understanding of how many welfare families take up
the EITC. This study is relying upon the detailed income and earnings data
available in the CWAD and in the LDB after it has been linked to UI data.
There are very few reliable studies of take-up rates among poor people.
The CWPDP databases provide a very large sample of poor people along with
detailed information on their incomes. This makes it possible to do a detailed
study of this subject.
Job Availability and Exits from Welfare
--- One of the most important questions facing those implementing the new
welfare bill is whether there will be jobs for those who seek to make the
transition to work. In a recent paper using CWPDP data, (EN#43)
Hilary Hoynes has demonstrated how transitions off welfare are facilitated
by strong demand for labor and impeded by weak labor markets. Her study
uses the LDB linked to local area data through zip codes and county of
residence. The LDB provides a large enough sample to determine whether
exits are affected by local labor market conditions.
V. Challenges to Getting Data Collection
Strategies On Line
We hope that the reader is convinced by
this time that there are many opportunities for improving welfare data
systems for monitoring, evaluation, planning and research. It should also
be clear, however, that there are some real challenges to creating an improved
system. In this section, we summarize some of those challenges.
Advanced Case Management Systems
--- With time limits and other program mandates in TANF, it will be necessary
to keep current disaggregated information available for all people in a
case, including their ages, educations, workfare program participation
dates and hours, school attendance, etc. in addition to case-specific data
such as payments, child support, food stamp amounts, and Medi-Cal eligibility.
Simply implementing the different mandates of the program will require
a capability to make benefits contingent on a number of changes in the
status of a case, or of the individuals in a case. Accurate and up-to-date
information will be necessary on who is or is not attending school, or
participating in an approved employment or community service activity,
and for how long. As we described earlier, there are minimum hours-per-week
standards for all TANF recipients engaged in work, workfare, or community
service. Data systems must track all this, and do so accurately.
Information Sharing --- Extensive
information sharing and/or data system interface will be necessary to meet
the data reporting requirements of the Personal Responsibility and Work
Opportunity Reconciliation Act of 1996. The sharing of information is mandated at two
levels: Federal and State. Generally, all of the programs addressed in
the Act will be required to share information about individuals and families
with other government agencies. Child support agencies will be the recipients
of data from the INS, government licensing bureaus, a multiplicity of government
agencies, businesses, credit bureaus, the UI program and banks. Social
security number matching, which has been relatively rare in a past that
predominantly relied on agency case numbers for tracking, may become standard.
If the provisions of this legislation are implemented, data needed to
operate single programs will be located in the files of different agencies.
(EN#44)
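As a rough illustration of what Social Security number matching involves, the sketch below joins records from two hypothetical agency files on a normalized SSN. The file layouts, the field names, and the deliberately invalid numbers are assumptions made for the example. Real matching would also have to handle missing or mistyped numbers, and it would be subject to the confidentiality concerns raised elsewhere in this paper.

```python
import re


def normalize_ssn(raw):
    """Strip punctuation; return nine digits, or None if the value is malformed."""
    digits = re.sub(r"\D", "", raw or "")
    return digits if len(digits) == 9 else None


# Hypothetical extracts. The welfare file is keyed by agency case number,
# the earnings file by SSN; the numbers are fake (area 000 is never issued).
welfare_cases = [
    {"case_number": "W-001", "ssn": "000-00-0001"},
    {"case_number": "W-002", "ssn": "000 00 0002"},
    {"case_number": "W-003", "ssn": "unknown"},
]
ui_earnings = [
    {"ssn": "000000001", "quarter": "1996Q1", "earnings": 1800},
    {"ssn": "000000001", "quarter": "1996Q2", "earnings": 2100},
    {"ssn": "000000009", "quarter": "1996Q1", "earnings": 950},
]


def match_on_ssn(cases, earnings):
    """Return (matched, unmatched): earnings rows per case, and cases with none."""
    earnings_by_ssn = {}
    for row in earnings:
        key = normalize_ssn(row["ssn"])
        if key is not None:
            earnings_by_ssn.setdefault(key, []).append(row)

    matched, unmatched = {}, []
    for case in cases:
        key = normalize_ssn(case["ssn"])
        rows = earnings_by_ssn.get(key, []) if key is not None else []
        if rows:
            matched[case["case_number"]] = rows
        else:
            unmatched.append(case["case_number"])
    return matched, unmatched


matched, unmatched = match_on_ssn(welfare_cases, ui_earnings)
print(sorted(matched), unmatched)
```

Keeping the list of unmatched cases is deliberate in this sketch: the share of cases that fail to match is itself a useful measure of data quality when files come from agencies that have tracked people by different identifiers.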
Survey Support --- If surveys are
selected as a primary means of acquiring quarterly report information,
one of the main challenges will be finding welfare recipients for interview
within the required time frame. This task is frequently time-consuming
and labor-intensive, and time is what States don't have with respect
to quarterly reports. Survey response rates among welfare recipients are
frequently problematic even under ideal conditions in which time is not
a consideration because they move frequently and information on their current
addresses and telephone numbers is not updated. So even if surveys are
conducted, the only way reports could be finished in time is if a sampling
frame with current information is continuously updated and submitted to
a centralized data bank as soon as it becomes available. This suggests
that a centralized source of welfare data will be necessary to support
a survey effort, as well as to collect additional program data.
(EN#45)
B. Resources
Possibly the best approach to improving
welfare data systems would be a dedicated TANF information system that
would be designed specifically for the information needs of the TANF program.
There are some real advantages to a new system. It is often easier to implement
entirely new software designs than to update old programs that may have
been written in the languages of long ago for different purposes entirely.
A new system provides the opportunity to use the new technologies that
we have mentioned in this paper. And a new system could be designed to
maintain good locator information on TANF recipients which could facilitate
the collection of information on recipients through quarterly surveys.
But a new system will be very costly. The resources to create a new system
do not have to meet the limit of fifteen percent on administrative costs,
but they will have to be taken from resources that would otherwise go to
the program itself. This suggests that TANF systems may be built upon existing
systems or upon other systems that are better funded.
Although it is very unlikely that states
would locate welfare data in child support agencies, the Child Support
and Medicaid programs have special Federal funding for data system development,
whereas TANF does not. State Medicaid systems together are allocated $500
million over the first 12 quarters of TANF to change their information
systems because of the unlinking of welfare and Medicaid. The Child Support
enforcement agencies will be given 90% of the planned cost of their new
information systems submitted before the Personal Responsibility bill was
signed. If information systems were located in either of these agencies,
access to special funding would mean that states might not have to use
funds from cash benefits or from employment or pregnancy prevention programs
to develop their TANF information systems.
We do not know what strategy states will
follow, but we do know that resource constraints will figure prominently
in what they decide to do. One of our hopes is that by analyzing the possibilities
for broadening state efforts beyond merely meeting the letter of the law,
it will be possible to create systems that serve statistical needs as well
as case management needs.
C. Political Will
Whether or not to use block grant funds
for the development of surveys or new case management systems is partly
a political choice. We have noted that liberals and conservatives might
differ over the content of surveys and the utility of creating new case
tracking systems. But we have also noted that there are strong incentives
in TANF for the development of some sort of surveys and some sort of case
management system. Furthermore, there are excellent bi-partisan reasons
to want to know what happens to TANF recipients as they reach the end of
their aid. Very few politicians want big surprises --- especially those
that break the bank. A good statistical monitoring system can serve as
an early warning system.
D. Inter-Agency Agreement
The sharing of information between agencies
is always a more or less challenging affair. Agencies differ in their confidentiality
provisions, in their financial arrangements for data sharing, and in their
responsiveness to one another concerning data requests. It helps a great
deal to have personal contacts within the agency whose data is needed.
It also helps to be prepared to pay for the requested data files, especially
if programming time is required to create them. In California, the Medicaid
eligibility system has provided extensive longitudinal files for research
on welfare spell durations and for social program evaluation. Several studies
based on the Medi-Cal records have included shared data from other organizations,
as we discussed earlier. One of the major challenges in the development
of new statistical systems is the negotiation of interagency agreements.
One way to facilitate this would be to develop standard interagency agreements
with similar specifications, so that authorized users of information may
be held to a single standard. The problems here are formidable. There are
many different agencies with very different enabling legislation.
E. Confidentiality
One of the major impediments to interagency
agreement will be legitimate concerns with confidentiality. The Act seems
to be of two minds on this issue. On the one hand it restricts disclosure
and access to information in a number of places. (EN#46)
But on the other hand the TANF portion of the Act has few references to
privacy or confidentiality issues, except that as part of each State plan,
a provision must be included that addresses how the State will, "Take such
reasonable steps as the State deems necessary to restrict the use and disclosure
of information about individuals and families receiving assistance under
the program attributable to funds provided by the Federal Government"
(page 10). (EN#47) Furthermore,
as we have seen, the Act includes extensive provisions for matching data
across datasets using Social Security numbers. It also has a provision
for a study of a prototype counterfeit-resistant social security card so
as to provide individuals with reliable proof of citizenship.
As researchers, we have a strong interest
in access to data, but we have no interest in compromising the privacy
of individuals. Hence, while we might see exciting research opportunities
in the ability to link different datasets, we also see dangers of violations
of confidentiality. We are not experts in this field, but we believe that
one of the greatest challenges facing those who want to develop new systems
for monitoring welfare reform is protecting people's privacy while maintaining
a clear picture of how their lives have been affected by the new legislation.
As information is power, and the linkage
of information systems is required under this legislation, much careful
thought must go into developing, not only comprehensive, easy to understand,
and uniform confidentiality requirements, but also ways to allow access
to data for research and evaluation. Providing accurate information and
good research on the many social groups affected by the Act can make an
important contribution to improving policies in ways that can assist poor
people.
VI. Conclusions
The concept of block grants to States carries
a connotation of reduced paperwork, reporting, bureaucracy, and red tape.
However, in the Personal Responsibility and Work Opportunity Reconciliation
Act of 1996, there is a large and increasingly important Federal role in
compiling information from the States for specially mandated studies, and
for assuring that the programs defined and implemented differently in the
States remain accountable to the intent of the Federal legislation. Because
of the quantity and specificity of the Act's reporting requirements and
because States will suffer serious financial penalties for not complying
with them, the many changes under block grants may very well increase the
need for data collection.
In this paper we have attempted to integrate
an implementation perspective with the planning, monitoring, research and
evaluation perspective. The implementation perspective looks at the way
new programs are likely to be implemented and tries to develop systems
that are compatible with the programs. This perspective tries to understand
the motivations of the various groups and agencies involved in the implementation
of the programs in order to be realistic about what can be accomplished.
It has the defect of getting locked into the presuppositions of the legislation,
or the world view of the actors involved.
In adopting the planning, monitoring, research
and evaluation perspective, we have attempted to describe what needs to
be done to get a reasonable strategy or strategies for data collection
and a statistical system in place. This is a useful contrast to the implementation
perspective, but it can falter because it fails to understand the intent
of the legislation and the goals of the various actors.
We believe that both perspectives make
a case for the approach we have taken at UC DATA in our work with the California
Department of Social Services. By linking many different datasets, each
with its own strengths and weaknesses, we can produce a more complete picture
of what is happening in the lives of groups of people affected by welfare
reform. As we move to a block granted welfare system and see an increasingly
diverse set of state programs, let us not forget that there are many reasons
why many of the nation's poorest families may end up outside of the new
system. Some may be success stories; others may not be so successful. It
is a task of welfare researchers in the next century to document where
both groups end up.
Responsibility and Work Opportunity
Act of 1996
There are many useful summaries of the
new Act, and this is not the place to go into detail about what it does.
(EN#3) The goals of the Act are listed in Figure
1 and many of the major titles contribute in some way to these goals
by requiring work, facilitating the establishment of paternity, enforcing
child support payments, providing child care, and penalizing teenage pregnancy.
As shown in Figure 2, the Temporary Assistance
for Needy Families (TANF) program created by Title I provides time-limited
welfare through block grants to the states. Title III greatly strengthens
child support enforcement and tracking. Title VI provides child care, and
Title VIII cuts food stamp benefits and adds a work requirement to the
program for adults between the ages of 18 and 50 who are not supporting
minor children. Title V makes only relatively minor changes in child protection
programs, but it funds a major study of children at risk for child abuse.
Title II narrows the scope of Supplemental Security Income for
disabled children and Title IV restricts welfare and public benefits for
aliens.
Program Performance Standards for TANF
--- The most conspicuous and important feature of Title I of the Act
is its mandatory work requirements. Figure
3 summarizes these requirements. They have three components: (1) a
requirement that a certain fraction (which increases over time) of the
total heads of households on Temporary Assistance be engaged in work activities
as defined by the Act, (2) a definition of the allowable work activities,
and (3) a set of standards for the minimum number of hours that the head
of the household must be engaged in "work activities" and for the minimum
number of hours that the head must be engaged in a specific subset of these
activities. (EN#4) A
little reflection suggests that it will be a daunting task to devise a
system for keeping track of all this information. Yet, the Act requires
Quarterly Reports which include the information necessary to calculate
participation rates, and it penalizes those states which do not meet the
work performance standards. (EN#5)
The Quarterly Reports require not only the information on Figure 3, but
also the additional information described in Figure
4. Note that one of the requirements of the law is a sample of closed
as well as open cases. Not only do these reports require a lot of information,
they must also be submitted quickly. To avoid a penalty of four percent
of the block grant, reports must be submitted no later than the end of
the quarter following the one for which the data are collected.
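The bookkeeping implied by these requirements can be sketched as follows. This is a simplified illustration, not the Act's actual calculation: the field name `weekly_work_hours`, the sample cases, and the hours threshold passed in are hypothetical, and the due-date function assumes that reporting quarters are calendar quarters.

```python
from datetime import date, timedelta


def participation_rate(cases, min_weekly_hours):
    """Share of household heads meeting a weekly hours threshold."""
    if not cases:
        return 0.0
    meeting = sum(1 for c in cases if c["weekly_work_hours"] >= min_weekly_hours)
    return meeting / len(cases)


def report_due_date(year, quarter):
    """Last day of the quarter following the reporting quarter (1 to 4)."""
    # Count quarters from year 0, move two quarters ahead, then step back
    # one day from the first day of that later quarter.
    index = year * 4 + (quarter - 1) + 2
    later_year, later_q0 = divmod(index, 4)
    first_day = date(later_year, later_q0 * 3 + 1, 1)
    return first_day - timedelta(days=1)


# Hypothetical caseload: weekly hours in work activities for four household heads.
sample_cases = [
    {"case_id": "A1", "weekly_work_hours": 25},
    {"case_id": "B2", "weekly_work_hours": 10},
    {"case_id": "C3", "weekly_work_hours": 20},
    {"case_id": "D4", "weekly_work_hours": 0},
]

print(participation_rate(sample_cases, min_weekly_hours=20))
print(report_due_date(1997, 1))
```

Even this toy version needs hours recorded for every household head in every reporting period, which hints at how much a case management system must track to support the real calculation.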
worked for 20 hours or more per week (page
227).
A. Technical Challenges