Reducing Barriers to Effective Social and Policy Entrepreneurship

Robert Litan - Former Director, Economic Studies, Brookings Institution
Ella Bell Smith - Professor of Business Administration, Tuck School of Business at Dartmouth
Matthew Slaughter - Paul Danos Dean of the Tuck School of Business at Dartmouth
Robert Lawrence - Faculty Chair of The Practice of Trade Policy at Harvard Kennedy School

Published: October 21, 2021


Despite the barriers that impede the scaling of successful social programs, or that prevent corporations from doing more to address social objectives that are in their long-run financial interest, there are multiple ways of lowering these barriers.

Social Policies and Interventions Should Be Evaluated to the Maximum Extent

In a market setting, prices and sales signal which products and services consumers want scaled, and the capital markets, in addition to profits, provide the resources required to do so. In the non-profit and governmental sectors, peer-reviewed evaluations are the analogue to market signals, and funding tied to evidence (that is, data-driven, evidence-based decision-making) is the analogue to the profit- and sales-driven funding provided by private capital markets.

But what kinds of evaluations? Among social scientists, the generally accepted “gold standard” is the randomized control trial, or RCT, which the FDA has used to screen new pharmaceuticals for safety and efficacy since it was required to do so by Congress in 1962. In an RCT, one randomly chosen group of subjects is given the “treatment” – the new drug or, in the policy world, the intervention or policy – and its performance along the dimension being studied (say, grades or test scores in education, recidivism rates for criminal justice reforms, or labor force participation rates with an unemployment insurance or guaranteed income program) is compared to that of a control group that does not receive the treatment. If there is a difference in outcomes between the two groups over some time period (which can vary depending on the nature of the intervention) and that difference is “statistically significant,” or large enough to be unlikely to have happened by chance, then researchers and policy makers can reasonably infer that the intervention has “worked.” Otherwise, the trial has not shown that the intervention worked.
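
To make the comparison concrete, the following is a minimal sketch in Python of how a difference between a treatment group and a control group might be tested for statistical significance. It uses a permutation test, which is one of several accepted approaches and not necessarily the method used in any study cited in this essay. The test scores are invented purely for illustration, and the function name `permutation_p_value` is our own.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """Return (difference in means, two-sided permutation p-value).

    The p-value estimates how often a difference at least as large as the
    observed one would arise if group labels were assigned purely by chance.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # re-assign "treatment" labels at random
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical test scores, invented for illustration only.
treated_scores = [78, 85, 90, 72, 88, 95, 81, 79, 92, 86]
control_scores = [70, 75, 80, 68, 74, 82, 77, 71, 79, 73]

effect, p_value = permutation_p_value(treated_scores, control_scores)
print(f"estimated effect: {effect:.1f} points, p-value: {p_value:.4f}")
```

A small p-value (conventionally below 0.05) indicates that the observed gap would be unlikely under chance alone. A large p-value means only that the trial did not detect an effect, which is different from showing that none exists.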

Relatively few governmental or social programs have been rigorously studied with an RCT, in part because doing so can be time-consuming and expensive. As in the pharmaceutical arena, there must be enough subjects in both the treatment and control groups (that is, the “sample sizes” must be large enough) to ensure statistical “power,” or the ability to conclude with a high degree of confidence that any difference between the two groups is truly “significant.”
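
The link between sample size and power can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z for the significance level + z for the power) × standard deviation / effect)². The effect sizes and the standard deviation of 15 points are assumptions chosen for illustration, not figures from any study discussed here, and the function name is our own.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, std_dev, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect `effect`.

    Assumes a two-sided test of a difference in means, equal group sizes,
    and a common standard deviation (normal approximation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired power
    n = 2 * ((z_alpha + z_power) * std_dev / effect) ** 2
    return math.ceil(n)


# Hypothetical scenario: outcome measured in test-score points, std. dev. 15.
for effect in (10, 5, 3):
    n = sample_size_per_group(effect, std_dev=15)
    print(f"to detect a {effect}-point effect: about {n} subjects per group")
```

Because the required sample grows with the inverse square of the effect, halving the effect one hopes to detect roughly quadruples the number of subjects needed, which is one reason rigorous trials of modest interventions are costly.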

The Laura and John Arnold Foundation has been a leader in funding and tracking RCTs of social interventions. The Foundation’s long-time former Vice President, Jon Baron, includes these among the successful interventions that have passed the rigorous RCT test [Baron]:

  • Per Scholas Job Training, a program that provides training and employment services in the information technology sector to low-income workers (in the third year after random assignment, workers in the Per Scholas group earned an average of $4,800, or 27 percent, more than workers in the control group);

  • Knowledge Is Power Program (KIPP) elementary and middle schools, a national network of public charter schools whose mission is to help underserved students enroll in and graduate from college (two to three years after random assignment, students in KIPP schools scored 5 to 10 percentile points higher in reading and math than students in the control group); and

  • Teen Options to Prevent Pregnancy (TOPP), an intervention for low-income adolescent mothers that aims to reduce rapid repeat pregnancy and promote healthy birth spacing (20 months after random assignment, 21 percent of the TOPP group had experienced a repeat pregnancy, compared to 38 percent of the control group).

RCTs are also used heavily in the tech and advertising sectors, styled as “A/B” tests, which many companies use to test the effectiveness of different marketing campaigns and website pages, and which many new companies use to fine-tune their early business strategies [Ries].

In addition to cost and time constraints, however, RCTs have other important limitations. They can detect average effects but cannot isolate the impact on specific individuals. In the social policy arena, the results from an RCT in one location may not translate easily to other locations with different cultures or institutions, or from one period to another, since conditions can change [Deaton and Cartwright].

Where RCTs are not possible, there are other accepted quantitative ways to measure program effectiveness, including “quasi-experimental” studies, which compare treatment and control groups whose participants are not randomly assigned and attempt to adjust for all relevant factors other than the treatment in order to isolate its effects. Standard regression models make these kinds of adjustments, including one for “self-selection” bias (for voluntary programs). In addition, for certain non-profits that provide services, evaluations of users of those services can be helpful. As a broad rule, however, innovative programs not yet widely tested should be subject to some type of evaluation in a pilot setting before being scaled up, and should be scaled up only if the evaluations warrant further investment.

Ideally, as Results for America (RFA) has advocated, programs at all levels of government, both new and existing, that are capable of being evaluated, should have some level of assessment, rather than be put on auto-pilot year after year. Importantly, Congress should include funds for assessing benefits, costs and other impacts (such as incentive or disincentive effects of tax law changes) of any spending and tax initiatives proposed by any Administration that make it into law. While broad-based multi-year programs like free college tuition or child tax credits for families with incomes below certain thresholds cannot be run as randomized control trials – because they are available to anyone who qualifies – there are statistical techniques for studying their effectiveness, such as comparisons of behavior of those just below and above the income cut-offs.
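
The cut-off comparison mentioned above can be sketched in a few lines. The example below is a deliberately simplified version of what researchers call a regression discontinuity design: it compares average outcomes for people just below and just above an eligibility threshold. A real study would fit local regressions on each side and test sensitivity to the chosen window. The incomes, outcomes, cut-off, and window width are all hypothetical, as is the function name.

```python
from statistics import mean


def local_cutoff_comparison(records, cutoff, bandwidth):
    """Difference in mean outcomes just below vs. just above a cut-off.

    `records` is a list of (income, outcome) pairs. People with income
    below `cutoff` are assumed to be eligible for the program.
    """
    below = [y for x, y in records if cutoff - bandwidth <= x < cutoff]
    above = [y for x, y in records if cutoff <= x < cutoff + bandwidth]
    if not below or not above:
        raise ValueError("no observations on one side of the cut-off")
    return mean(below) - mean(above)


# Hypothetical data: (family income in $ thousands, enrolled in college? 1/0).
records = [
    (20, 1), (35, 1),                    # far below the cut-off: ignored
    (46, 1), (47, 1), (48, 0), (49, 1),  # just below: eligible
    (50, 0), (51, 1), (52, 0), (53, 0),  # just above: not eligible
    (70, 1), (90, 1),                    # far above the cut-off: ignored
]

gap = local_cutoff_comparison(records, cutoff=50, bandwidth=4)
print(f"enrollment gap at the cut-off: {gap:.2f}")
```

The logic is that families on either side of the threshold are otherwise similar, so a sharp gap in outcomes at the cut-off can plausibly be attributed to the program rather than to income itself.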

Realistically, we recognize that evaluating every government program is not likely. But that shouldn’t preclude trying, perhaps on a schedule that allows for gradual evaluation of existing programs. In addition, where feasible, funding for new programs should at a minimum have an evaluation component built in.

Federal government policymakers should not be exempt from evaluation requirements. As suggested earlier, it would have been ideal if Congress had included in the two infrastructure packages enacted over the summer some relatively small amounts for project-specific assessment, as Professors Glaeser and Poterba have advocated for infrastructure projects, as well as for post-program evaluations. Hopefully, in future legislative packages, administrations and Congress will adopt this idea. In the meantime, executive-branch agencies will continue to be bound by Executive Orders issued by Presidents since the Carter administration to perform cost-benefit assessments before implementing new “major” regulations.

As much progress as RFA has made, evidence-based decision-making has a long way to go. RFA reports on its website that since 2013, only $3 billion in federal funding has been switched for evidence-based reasons.[i] The website also reports that over 100 cities are working with it on evidence-based solutions, but there are no data indicating how many dollars of local spending have been switched on evidence-based grounds. In short, the nation has a long way to go before data-driven, evidence-based decision-making is ingrained throughout policymaking.

Philanthropies can take more risk than governments in funding the development of new social programs and interventions, and ideally evaluations to accompany them. The U.S. is fortunate to have so many foundations actively supporting this kind of work. RFA, for example, could not do what it does without the support of several of America’s largest foundations (see its website, https://results4america.org/, for details).

The Scaling of Social Interventions Should Be Linked to Positive Evaluations

Linking funding of social programs or interventions to positive evaluations would improve financial incentives for scaling these initiatives and, ideally, replace auto-pilot financing. In this way, “social capital markets” would more closely resemble private capital markets for financing for-profit activities.

But for this linkage between evaluation and financing to work properly, the funding source – whether it be government or philanthropic – must know, or be able to find out easily, whether a positive evaluation is legitimate: that is, whether it has been peer reviewed or certified in some sense by a recognized authority.

Bloomberg Philanthropies provides certifications for cities, at three different levels, but not at the program level, which is what is most needed. We have in mind here more than just the registries of studies that some federal agencies maintain, such as the Department of Health and Human Services (in health) and the Department of Education (in education). Non-profits and state and local governments need more than just lists, even if all the entries have passed peer review and been published in highly regarded academic journals. They need more guidance – a way to narrow the lists by the quality of the assessments reported.

Results for America features on its website a compilation of program evaluations across a wide range of subject areas, as well as links to the growing literature on evidence-based decision-making. In addition, the website features its “Moneyball for Government” initiative, which since 2013 has provided reports on program effectiveness across a wide range of areas, such as workforce development and education, and has over 270 leaders from across as participants.[ii] [Nussle and Orszag, eds]. Results for America has assisted evidence-based decision-making in nearly 170 state and local governments. Likewise, Project Evident, another non-profit devoted to advancing evidence-based decision-making, reports similar information on its website (https://www.projectevident.org/).

We urge these organizations, or others working to achieve a similar objective, to go one step further and provide ratings, rankings, or at least classifications – for example, by type of study (RCT, quasi-experimental, statistical with self-selection controls) – in much the same way that Consumer Reports provides ratings of consumer products and services, or the ratings agencies do for new issues of corporate and government debt. Ideally, the ratings or evaluations would be adjusted, using the latest statistical techniques, for “publication bias,” the tendency of academic journals to publish only the most striking results [Andrews and Kasy].

Accelerators and Networking for Social Entrepreneurs

The world of social change could benefit from accelerators analogous to those that serve for-profit entrepreneurs. Project Evident has a “Talent Accelerator” for social entrepreneurs, as part of its broader mission to help non-profits and policy makers build their own capacity for making evidence-based decisions.[iii] Philanthropies should support more initiatives like this one.

Making better decisions is only part of pursuing effective change, however. Social entrepreneurs could benefit greatly from regularly interacting with and learning from each other – ideally, in person, as the pandemic wanes – about how to launch and build effective organizations, and how to navigate government bureaucracies and build community support for evidence-based policies. Best practices and tools for making government more transparent also should be shared on a regular basis.

Philanthropies can make a meaningful difference here, too. Just as is the case for scientific innovations, which often are the outgrowth of insights from one discipline applied to another [Epstein], social entrepreneurs in different fields could benefit from a similar process of cross-fertilization.

More Spotlighting of Organizational Successes

Earlier, we discussed how Role Model effects can be powerful motivators for minorities who may not otherwise be exposed to and thus inspired by individuals like them who have gone on to achieve great success. There is no reason why Role Model effects cannot be effective for everyone, including potential and current change agents.

Toward that end, the media – not just Hollywood, but also television news organizations – can and should do more to report on the entrepreneurial leaders and their organizations that are helping to make our country a more perfect union. This is just as much “news” as the political events of the day.

We mean more than the “feel good” stories that are now a staple of the nightly news shows on the major networks. Instead, we have in mind feature programming that goes into greater depth. CNN’s annual show “Heroes” and NBC’s “Making a Difference” segments celebrating inspiring individuals are helpful steps in this regard. But shows like these should be produced and shown more frequently, with greater emphasis not just on the individuals achieving change, but also on the organizations and the state and local government leaders that are doing so at scale.

The federal government, and specifically the President, also can do more in the same vein by hosting annual “social entrepreneurship” awards ceremonies that give public recognition in the same way. Websites like Results for America and Project Evident, among others, also can do more in this regard. One role model is the Edutopia website (https://www.edutopia.org/) that features videos of best practices in K-12 education. 

Harnessing the Power of Standardized Information to Address Climate Change

We have discussed in earlier sections the power of information to drive outcomes, both in market and non-market settings. In that connection, we have highlighted the importance of standardization so that information can easily be assessed. Prices, for example, are expressed in units of currency that are easily understood. Ranking systems that are transparent and widely accepted, such as the ratings that appear in Consumer Reports or those published by ratings agencies for bond issues, have similar features.

With respect to some of the social objectives that are the focus of this essay, efforts are ongoing to standardize ESG reporting, primarily to harness the power of shareholders to monitor and discipline corporations in pursuit of ESG objectives. Much less attention has been given, at least so far, however, to using ESG labels for products and services.

This imbalance should be rectified. One way to do so is for companies to begin labelling products (think automobiles or appliances) and services (electricity) by the carbon or greenhouse gas (GHG) content embodied both in their production (counting all the inputs) and in their use by purchasers. This recommendation goes beyond, and is fundamentally different from, the energy efficiency ratings, or “Energy Stars,” that the federal government, specifically the Environmental Protection Agency, now assigns to multiple products.[iv] Energy efficiency is related to GHG emissions but is not a direct measure of the emissions that products entail.

To be most useful, the carbon or GHG measures should be expressed in ways that users can easily understand – not necessarily by carbon content per pound (or kilowatt) but valued in monetary terms with dollars (or cents) indicating the monetary value of the damage with which those products or services are associated. The federal government could post annual estimates of the social cost of carbon that companies would be required to use each year in making these disclosures, with clear explanations that the higher the monetary value, the more damage to the climate the product or service is imposing.
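
As a rough illustration of how such a dollar-denominated label might be computed, the sketch below multiplies a product’s lifetime emissions (those embodied in production plus those from use) by a per-tonne social cost of carbon. Every number is hypothetical: the $50 per tonne figure is a placeholder rather than an official estimate, the appliance data are invented, and the class and function names are our own. The sketch also ignores discounting of future emissions, which an actual standard would need to address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductFootprint:
    """Hypothetical emissions profile, in tonnes of CO2-equivalent."""
    name: str
    production_tonnes: float      # embodied in production, all inputs counted
    annual_use_tonnes: float      # emitted per year of use by the purchaser
    expected_years_of_use: float


def climate_damage_dollars(product, social_cost_per_tonne):
    """Monetary value of the lifetime emissions associated with a product."""
    lifetime_tonnes = (
        product.production_tonnes
        + product.annual_use_tonnes * product.expected_years_of_use
    )
    return lifetime_tonnes * social_cost_per_tonne


# Placeholder value for illustration; not an official government estimate.
ASSUMED_SOCIAL_COST_PER_TONNE = 50.0

appliance = ProductFootprint(
    name="hypothetical refrigerator",
    production_tonnes=0.5,
    annual_use_tonnes=0.25,
    expected_years_of_use=12,
)

label_value = climate_damage_dollars(appliance, ASSUMED_SOCIAL_COST_PER_TONNE)
print(f"{appliance.name}: estimated climate damage ${label_value:,.2f}")
```

Under these assumptions the label would show a single dollar figure that shoppers could compare across competing products, with a higher figure signalling greater climate damage.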

A variation of product and service GHG or carbon disclosures is for the government to require major GHG emitters to report annual amounts of GHG emissions entailed in production (again, ideally counting all the inputs into the production process), and then to compile a public inventory of the largest emitters. Cass Sunstein and Richard Thaler provide a model for this concept in their widely acclaimed Nudge, drawing on the powerful example of the Toxic Release Inventory maintained by the EPA, which is based on required reporting by firms and individuals of the quantities of potentially hazardous chemicals they have stored or released [Sunstein and Thaler, pp. 190-91].[v] The authors (Sunstein is one of the nation’s leading legal scholars and Thaler is a Nobel Prize-winning economist) note how consumer pressure has been successful in reducing the use of hazardous chemicals as a result. Similar effects should flow from pervasive GHG-related disclosures, especially in the absence of a carbon tax or its regulatory equivalent, a nationwide system of cap-and-trade.

However, any disclosure-based product or service labeling system requires both widely accepted standards for generating these disclosures and mechanisms for auditing the labels. Unfortunately, as is true in the investment world, there is no single set of carbon or GHG standards generated by a widely accepted standards-setting organization such as Underwriters Labs (UL), a non-profit organization that both develops product safety standards and certifies compliance with them.

Carbon or GHG labelling need not – and may not – wait for the ESG standardization process to be completed. At this writing, the SEC is considering whether to delegate ESG standard-setting to a new organization modeled after the Financial Accounting Standards Board (FASB), which sets financial accounting standards. If this happens, however, it is not clear whether that new ESG standards-setter would go beyond disclosures of the impact of the climate on the finances of specific firms and set standards for the impact those firms are having on the environment, namely carbon footprints of the kind that would help consumers and investors. If the latter type of disclosure is mandated by the SEC, that body could face legal challenges over whether such a step exceeds the agency’s statutory investor protection mandate.

Alternatively, Congress could authorize the EPA to delegate to UL, or perhaps another equivalent body, the job of setting carbon footprint disclosure standards. We are agnostic on whether this body should also have a monopoly on certification, or whether that function, like the auditing of financial statements, should be open to competition but policed by a body like the Public Company Accounting Oversight Board (PCAOB) or a new environmental equivalent.

Yet another approach would be for Congress to begin by appropriating monies for UL or industry-specific carbon labelling standards-setting initiatives and then letting the marketplace take over the labelling function. In fact, some food and consumer packaged goods companies are already providing carbon labels. One study by researchers at New York University found that between 2015 and 2019 over half the growth in consumer products sales consisted of sustainability-marketed products.[vi] Pressure continues to mount from consumers and investors to make these kinds of disclosures [Eaglesham and Shifflett].

With established standards for disclosing carbon footprints, companies wanting to appeal to climate change-sensitive customers will have market-driven incentives to affix, on their own, labels that comply with these standards. As more companies adopt the practice and are rewarded in the marketplace for doing so, others are likely to follow. It is thus quite conceivable that, so long as standards are in place, government-mandated labels may be unnecessary, or at least that the market might be given a chance before mandates are considered.

Harnessing the Power of Transparency to Help Overcome Systemic Racism

Ultimately, eliminating racism will require mindset changes, which, as we note in several places in this essay, can be accomplished only by first forging consensus that a problem exists, followed by the willingness to spend time and resources to address it. While no individual leader of any organization can change the entire landscape, each organization can control its own cultural norms and rules, and over time these can promote more racial equality [Livingston].

One way for businesses to do this is to harness the power of transparency to overcome biases in the workplace by setting transparent diversity goals. Multiple stakeholders would see those goals, set by the companies themselves, and hold the companies accountable. In addition, more companies could follow the emerging trend of tying the compensation of high-ranking executives to meeting those goals [Wahba]. And more companies could use sponsors, not just mentors, inside their organizations to help advance the careers of minorities who might otherwise be left behind.

One critique of enhanced diversity efforts, even if they work, is that affording more inclusive opportunities within corporations ostensibly benefits a fortunate few and will not address the disadvantages faced by large numbers of other Black Americans and minorities. This critique overlooks the power of role models. As the legendary music producer and performer Quincy Jones remarked about overcoming the huge hurdles in his life, including discrimination, in the documentary Quincy, “you can’t be it if you don’t see it.” Jones points to the Black musicians who were his role models and gave him the inspiration to become one himself. We don’t know how many more Black Americans, other minorities and women will be encouraged by seeing more people like themselves in positions of leadership and power, though we have good reason to believe there are potentially many.

Another concern about enhanced diversity efforts is that they allegedly could hurt the effectiveness (or profitability) of organizations because the number of truly “qualified” minorities may be limited, and that employers can’t be held responsible if there is a shortage of talent in the “pipeline.” We offer two responses.

One answer is that an over-emphasis on formal credentials – such as requiring a college degree for many jobs – overlooks people who are qualified, or who can prove they are if given the chance. Diversity efforts, especially if executives have financial incentives to pursue them, can promote innovative efforts within companies to identify talent through other means.

A second answer is that, to the extent there are pipeline shortages of qualified minorities for certain skilled positions, companies have a much greater incentive to support and publicly advocate for funding public K-12 education and for structural reforms that can enhance the engagement and achievement of minorities. The same should be true for more job-specific training, such as that provided by community colleges.

