Weeding Out Tax and Benefits Fraud with the Help of Big Data

Weeding out fraud is a big priority for the public sector. From unpaid and fraudulent taxes to benefit fraud, governments are turning to data to hit back.

In Indiana, for example, the state has begun cracking down on identity thieves, saving a whopping $85 million in false tax refunds, reports The Pew Charitable Trusts. By crunching big data, the state is able to identify false or stolen identities and from there determine which tax returns are fake and then block those refunds.

The program is indicative of efforts that are taking place across the U.S., using data analysis to tackle tax and benefit fraud.

In New York, the city’s Human Resources Administration started running benefit recipients through a computerized pattern-recognition system, uncovering $46.5 million in fraud in 2014, up from $29 million in 2009 (New York Times). By integrating multiple sources of data, such as car registration, property, and business ownership records, investigators can build profiles of potential fraudsters based on irrefutable evidence that they are under-reporting assets, falsely claiming low wages, and so on.

“The data-mining process is extremely important,” Steven Banks, the agency’s commissioner, told the New York Times. “It allows us to zero in on likely fraud so we don’t divert resources to finding what otherwise might be a needle in a haystack.”

Despite the explosion in the use of software to detect fraud and abuse of government social programs, only a few states are drawing on their own databases to identify waste and fraud, because government-wide data sharing is lacking. States house their data (tax, health, DMV, and so on) in separate databases that other agencies can’t easily access, especially when the data isn’t in the same format, and that makes data mining impossible.

This is where big data solutions, such as those offered by Informatica, step in. These solutions help agencies manage, ingest, and correlate data whether it’s structured, unstructured or semi-structured, from any source, in any format and language.
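To make the idea of correlating records across separately held databases concrete, here is a minimal sketch in Python. It is not how Informatica or any agency actually does it. The record layouts, the identifiers, the `normalize_id` and `flag_for_review` helpers, and the dollar figures are all hypothetical and chosen only for illustration. It shows two sources that write the same person's ID differently, a normalization step that lets them be joined, and a simple rule that flags a claim for human review when registered vehicle value exceeds declared assets.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    person_id: str          # ID as stored by the (hypothetical) benefits system
    declared_assets: float  # dollars, as reported by the claimant


# Hypothetical records from two databases that use different ID formats.
claims = [
    Claim("A-100", 2_000.0),
    Claim("A-200", 1_500.0),
]
vehicle_registry = [
    {"owner": "a100", "vehicle_value": 45_000.0},
    {"owner": "a200", "vehicle_value": 1_000.0},
]


def normalize_id(raw: str) -> str:
    """Map differently formatted IDs onto one canonical form."""
    return raw.replace("-", "").strip().upper()


def flag_for_review(claims, registry):
    """Return IDs of claims whose registered vehicle value exceeds declared assets."""
    # Total registered vehicle value per normalized owner ID.
    registered = {}
    for row in registry:
        key = normalize_id(row["owner"])
        registered[key] = registered.get(key, 0.0) + row["vehicle_value"]

    flagged = []
    for claim in claims:
        total = registered.get(normalize_id(claim.person_id), 0.0)
        if total > claim.declared_assets:
            flagged.append(claim.person_id)
    return flagged


print(flag_for_review(claims, vehicle_registry))
```

With this sample data, only the first claim is flagged. A flag in a sketch like this is a lead for an investigator, not proof of fraud. Real systems would also need far more careful identity matching than simple string normalization.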

DLT and Informatica recently partnered on a webinar to explain how state government can transform raw data from across the IT ecosystem – data about people, places and things that are represented differently – to ensure that teams including revenue offices, human services, DMVs, courts, and more have access to the right data. View the webinar on-demand here.

Featured image courtesy of Ken Teegardin via Flickr.

Caron Beesley