Analysis: Federal Data Strategy Action Plan

The summer of 2019 is off to a great start for data professionals seeking to make valuable contributions in the federal public sector. After several solicitations for public comments over the last year, the Office of Management and Budget (OMB) has at last issued the final draft of the Federal Data Strategy. The Federal Data Strategy is designed to help the government accelerate the use of data to drive and deliver mission objectives. The Strategy consists of a mission statement, 10 guiding principles, and a set of 40 practices to guide agencies in leveraging and deriving value from their federal data. The strategy now needs to be operationalized, which brings us to the first draft of the Federal Data Strategy Action Plan, published by OMB on June 4th. The plan seeks to implement the strategy through specific actions, tasks, and deliverables, with designated timelines and measurable outcomes. An overview of the first draft of the 2019-2020 Federal Data Strategy Action Plan is detailed below. OMB has solicited public comments on the first draft, and the comment period will remain open for the next 30 days.

Below are, verbatim, the actions as described and published in their current form within the first draft of the Federal Data Strategy Action Plan (see “Draft of the 2019-2020 Federal Data Strategy Action Plan” https://strategy.data.gov/action-plan/ ). There are three categories of actions, and each is described below. Regardless of category, the actions and the next steps required to realize them provide an excellent opportunity for technology companies that specialize in all facets of the data life cycle: capturing, managing, storing, securing, integrating, analyzing, modeling, and automating data. DLT and our clients can help agencies implement their federal data action plans and deliver data as information assets to drive and achieve mission objectives.

• Shared actions are led by a single agency or existing council for the benefit of all agencies and with available cross-agency resources. They provide government-wide thought leadership, direction, tools, and/or services for implementing the Federal Data Strategy. Funding is identified to implement each of the identified shared actions through the President’s Management Agenda.

• Community actions are taken by a group of agencies around a common topic, usually through an established cross-agency council or other existing coordinating mechanism, and with available cross-agency resources. They represent ongoing, mature, cross-agency priorities that will use the Federal Data Strategy practices and implementation guidance to more quickly and consistently achieve their goals.

• Agency-specific actions are taken by an individual agency and are designed to build capacity using currently available agency resources. They set expectations for progress and success in implementing the practices.

Shared Actions: Government-wide Data Services

Action 1: Create an OMB Data Council

By November 2019, the Office of Management and Budget will establish a cross-office Data Council that will coordinate across statutory offices on information policy development and implementation activities and provide guidance on government-wide data standards and improvements required by statute, such as the Digital Accountability and Transparency Act, the Foundations for Evidence-Based Policymaking Act (hereinafter “Evidence Act”), and the Geospatial Data Act. The OMB Data Council will also provide to agencies a coordinated voice (or response) and common guidance regarding the implementation of the Federal Data Strategy. OMB’s efforts to ensure relevant participants are engaged in data governance will be a model for individual agencies. The OMB Data Council will provide a way to address issues that cross agencies’ and OMB’s statutory functions to help inform government-wide management and budget priorities for data management and use.

Action 2: Develop a Curated Data Science Training and Credentialing Catalog

By February 2020, Federal agencies will have access to a curated catalog of federal and non-federal training offerings in data science, aligned to federal needs.

The General Services Administration (GSA), with federal and non-federal stakeholder input, will create an inventory of data science training and credentialing opportunities used by and available to federal agencies. The catalog will relate training and credentialing to career paths, including on-ramps for federal employees at various stages of development and interest and will describe the required education and expertise to advance to the next stage of training. The catalog will provide federal employees with the beginnings of a roadmap for how data science training and credentialing can match their development goals. This Action will work in tandem with Action 14 to ensure agencies have sufficient hiring and reskilling options to leverage data as a strategic asset.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a GSA responsible party has been assigned.

Action 3: Develop a Data Ethics Framework

By November 2019, the Federal Government will have available a consistent framework for evaluating ethical repercussions and tradeoffs associated with data management and use. GSA will work with academia, professional associations, and federal data stakeholders to create a Data Ethics Framework that provides key decision points and considerations for ethical data management and use that go beyond legal requirements and support the Federal Data Strategy principles. Specifically, GSA will research relevant ethical frameworks for data management and use, will then gather stakeholder feedback on a draft Data Ethics Framework alongside academic and professional association partners, and finally will publish and promote the Data Ethics Framework for the Federal Government. This framework will build on fitness for use assessments, including potential use in automated technologies, such as Artificial Intelligence (AI). This framework will be updated as needed in future Action Plans.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a GSA responsible party has been assigned.

Action 4: Develop a Data Protection Toolkit

By August 2020, the Federal Government will have developed a consistent approach for measuring and mitigating the risk of re-identification from the release of disparate data sets, often referred to as the “mosaic effect.” The Federal Committee on Statistical Methodology will update Statistical Working Paper 22: Report on Statistical Disclosure Limitation Methodology and will collaborate with the Department of Education to create a re-identification risk assessment toolkit for federal agencies. It will include templates based on best practices for assessing, managing, and mitigating the risk that individuals or enterprises are re-identified from the release of confidential federal data. It also will array a suite of approaches for safely accessing data while accounting for confidentiality concerns, from fully open to restricted access in data enclaves. The toolkit will be designed as a user-friendly website in support of both more and less technical users.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a Federal Committee on Statistical Methodology responsible party has been assigned.
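
As an illustration of the kind of measurement Action 4 describes, the following is a minimal sketch, not the toolkit's actual method, which the draft does not specify. It uses k-anonymity (the size of the smallest group of records sharing the same quasi-identifier values) as one commonly used proxy for re-identification risk. The records, field names, and the function name smallest_group_size are all hypothetical. The second call shows the mosaic effect in miniature: adding one more attribute, as might happen when a second data set is linked to the first, makes some records unique.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values for the given quasi-identifier fields."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical released records (illustrative only, not real data).
records = [
    {"zip3": "201", "age_band": "30-39", "sex": "F"},
    {"zip3": "201", "age_band": "30-39", "sex": "F"},
    {"zip3": "201", "age_band": "40-49", "sex": "M"},
    {"zip3": "201", "age_band": "40-49", "sex": "M"},
    {"zip3": "220", "age_band": "30-39", "sex": "M"},
    {"zip3": "220", "age_band": "30-39", "sex": "F"},
]

# With two fields, every combination is shared by at least two records.
k_coarse = smallest_group_size(records, ["zip3", "age_band"])

# Adding a third field (as if linked from another release) isolates
# individual records, so k drops to 1.
k_fine = smallest_group_size(records, ["zip3", "age_band", "sex"])

print(k_coarse, k_fine)
```

A value of k equal to 1 means at least one record is unique on the chosen fields, which is the situation a risk assessment would want to flag before release.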

Action 5: Develop a Repository of Federal Data Strategy Resources and Tools

By November 2019, GSA will create a government-wide repository of tools and resources to assist agencies with implementing the Federal Data Strategy, as well as companion efforts, such as implementing the Evidence Act. The repository may include more detailed practice descriptions, case studies that demonstrate the practices “in action,” and tool kits for implementing individual or groups of practices. The Federal Data Strategy is seeking comments on priorities for populating the repository.

Action 6: Pilot a One-stop Standard Research Application

By August 2020, the Federal Government will pilot a one-stop standard application for accessing federal data assets for research and other evidence building purposes.

The Federal Statistical Research Data Center Program Management Office at the U.S. Census Bureau, in collaboration with member agencies and others with active researcher access programs, will develop an automated and streamlined research application (analogous to TSA Pre-check or Global Entry) that would provide a standardized approach for qualified and trained researchers to access agency data that cannot be made public and would reduce the paperwork burden resulting from duplicative forms. This approach would have the added benefit of also holistically setting consistent and appropriate access requirements and data security and privacy protocols in accordance with applicable laws and regulations.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a U.S. Census Bureau responsible party has been assigned.

Action 7: Pilot an Automated Inventory Tool for Data.gov

By August 2020, the Federal Government will have an automated tool that leverages agency Information Collection Review (ICR) processes and documentation under the Paperwork Reduction Act and possibly other existing sources, to populate metadata, or information about each dataset, on agency enterprise data inventories. This automation tool will complement the inventory requirements under the Evidence Act. This action will begin a multi-year process of deploying an automated approach to populating needed information on agency data inventories in order to address public, cross-agency and intra-agency needs for data discovery and access, leveraging existing processes to the extent feasible. Discovery includes the capacity to identify duplication in a manner that helps to avert unnecessary collections and promotes collaboration to leverage single collections for multiple benefits. Discovery and access both include sufficient metadata to understand whether the data is collected under a pledge of confidentiality or privacy and, if so, whether restricted access versions of the data are available. Inputs to the automated data collection inventory include, but are not limited to, existing processes such as the ICR process required of agencies and OMB under the Paperwork Reduction Act, with a particular focus on OMB control numbers as an identifier and ICR packages as sources of metadata.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a Department of Education responsible party has been assigned and work is currently underway to execute this goal.
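
To make the idea in Action 7 more concrete, here is a minimal sketch of how ICR package metadata might populate inventory entries keyed by OMB control number, and how overlapping collections could be flagged for review. Everything in it is an assumption for illustration: the draft does not define a schema, so the field names, the InventoryEntry class, the control numbers, and the overlap threshold are hypothetical, and the Jaccard overlap test is a stand-in for whatever duplication check the pilot actually adopts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEntry:
    omb_control_number: str   # identifier, as the plan suggests
    title: str
    agency: str
    confidential: bool        # collected under a confidentiality pledge?
    elements: frozenset       # normalized names of collected data elements


def build_inventory(icr_packages):
    """Turn hypothetical ICR package dicts into entries keyed by control number."""
    inventory = {}
    for pkg in icr_packages:
        entry = InventoryEntry(
            omb_control_number=pkg["omb_control_number"],
            title=pkg["title"].strip(),
            agency=pkg["agency"],
            confidential=pkg.get("confidentiality_pledge", False),
            elements=frozenset(e.lower() for e in pkg.get("data_elements", [])),
        )
        inventory[entry.omb_control_number] = entry
    return inventory


def possible_duplicates(inventory, min_overlap=0.5):
    """Return pairs of control numbers whose data elements overlap heavily
    (Jaccard similarity at or above min_overlap)."""
    entries = sorted(inventory.values(), key=lambda e: e.omb_control_number)
    pairs = []
    for i, a in enumerate(entries):
        for b in entries[i + 1:]:
            union = a.elements | b.elements
            if union and len(a.elements & b.elements) / len(union) >= min_overlap:
                pairs.append((a.omb_control_number, b.omb_control_number))
    return pairs


# Hypothetical ICR packages (made-up control numbers and contents).
packages = [
    {"omb_control_number": "0000-0001", "title": " Household Survey A ",
     "agency": "Agency X", "confidentiality_pledge": True,
     "data_elements": ["Name", "Address", "Income"]},
    {"omb_control_number": "0000-0002", "title": "Household Survey B",
     "agency": "Agency Y",
     "data_elements": ["name", "address", "income", "household size"]},
    {"omb_control_number": "0000-0003", "title": "Facility Report",
     "agency": "Agency Z", "data_elements": ["facility id", "emissions"]},
]

inventory = build_inventory(packages)
flagged = possible_duplicates(inventory)
print(flagged)
```

Keying entries by control number keeps each collection addressable, while the confidential flag carries the kind of metadata the plan says discovery and access both need.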

Action 8: Pilot Standard Data Catalogs for Data.gov

By February 2020, the Federal Government will have an improved standard Federal Data Catalog kit pilot for metadata management and data hosting capabilities, in support of legally required federal data catalog requirements. This automation tool will complement the inventory and Federal Data Catalog requirements under the Foundations for Evidence-Based Policymaking Act. The General Services Administration will create a government-wide data catalog platform pilot with a shared code base and cloud hosting that agencies can install quickly and cheaply, and that is customizable enough to support agency needs, leveraging Data.gov’s existing open source codebase and modern container/plugin techniques. This approach will provide cost savings across the Federal Government’s hosting and management of data catalogs, create more complete and sophisticated federal data catalogs with advanced features such as automated quality assurance, and result in increased use and improved user experience for the public and agencies.

This shared action has received financial resources as part of the Cross-Agency Priority Goal: Leveraging Data as a Strategic Asset and a GSA responsible party has been assigned and work is currently underway to execute this goal.

Community Actions: Cross-Agency Collaboration

Action 9: Improve Data Resources for AI Research and Development

By February 2020, the Federal Government, through the implementation of the Executive Order on Maintaining American Leadership in Artificial Intelligence, will have improved the data and computing resources for AI Research and Development.

All agencies shall review their Federal data and models to identify opportunities to increase access and use by the greater non-Federal AI research community in a manner that benefits that community while protecting safety, security, privacy, and confidentiality based on OMB guidance and in response to public feedback (see Action 16). Specifically, agencies shall improve data and model inventory documentation to enable discovery and usability, and shall prioritize improvements to access and quality of AI data and models based on the AI research community’s user feedback.

Action 10: Improve Financial Management Data Standards

By August 2020, the Federal Government, through the implementation of the President’s Management Agenda, will have improved the management and use of several financial management data assets, by:

• Getting Payments Right: The Federal Government will reduce the amount of cash lost to the taxpayer through incorrect payments; clarify and streamline reporting and compliance requirements to focus on actions that make a difference; and partner with states to address improper payments in programs that they administer using federal funds.

• Result Oriented Accountability for Grants: The Federal Government will rebalance compliance efforts with a focus on results for the American taxpayer; standardize grant reporting data and improve data collection in ways that will increase efficiency, promote evaluation, reduce reporting burden, and benefit the American taxpayer; measure progress and share lessons learned and best practices to inform future efforts; and support innovation to achieve results.

• Federal IT Spending Transparency: The Federal Government will improve business, financial, and acquisition outcomes; enable federal executives to make data-driven decisions and analyze trade-offs between cost, quality, and value of IT investments; reduce agency burden for reporting IT budget, spend, and performance data by automating the use of authoritative data sources; and enable IT benchmarking across Federal Government agencies and with other public and private sector organizations.

Action 11: Improve Geospatial Data Standards

By August 2020, the Federal Government will have improved geospatial data standards for all Federal Geographic Data Committee (FGDC) data themes.

The Geospatial Data Act requires the FGDC to establish standards for each of the National Geospatial Data Asset data themes and content standards for metadata, consistent with international standards and with public feedback, excluding public disclosure of any information that reasonably could be expected to cause damage to the national interest, security, or defense of the nation.

Agency-Specific Actions: Agency Activities

Action 12: Constitute a Diverse Data Governance Body

By September 2019, all agencies will have established an appropriately inclusive and empowered data governance body to ensure that agency resources are aligned with agency priorities based on a maturity assessment and mission needs.

All agencies will charter and constitute a data governance body driven by chief data officers (as required by the Evidence Act) with participation from Senior Executives in agency business units, and agency-wide data support functions such as policy leaders, general counsels, privacy officers, statistical advisors, evaluation officers, and chief information officers. Convening appropriate senior level staff and technical experts to discuss data resources, set policy, and recommend future investments may have many benefits including increasing transparency and trust about the data brought to bear in decision-making, aligning goals and procedures to protect privacy and secure data, and reducing the resources needed for data management through new efficiencies. An agency data governance structure—variously called data governance board, data council, and data stewardship committee—identifies the scope of the data that needs to be managed and specifies policies, standards, reporting structures, and roles for data management. The board uses data maturity models to assess agency capabilities and seeks meaningful and broad agency input before recommending data investment priorities. It ensures the monitoring of and compliance with policies, standards, and responsibilities throughout the information lifecycle.

Action 13: Assess Data and Related Infrastructure Maturity

By May 2020, all agencies will conduct an initial maturity assessment focusing on data and data infrastructure (e.g. organizational structures and knowledge bases, policies, workforce skills) needed to answer agency priority questions and to set a baseline for future improvements. This assessment will identify readiness to meet other requirements of the Federal Data Strategy and related legal requirements. It can be used to make investment decisions and to prioritize subsequent action steps.

Action 14: Identify Opportunities to Increase Staff Data Skills

By May 2020, agencies will have begun to identify critical data skills required to support high-quality analysis and evaluation, data management, and privacy protection.

Data-driven decision-making requires not only accessible, high-quality data but also a workforce with adequate knowledge of data security practices and data skills, including data science, statistics, and program evaluation, to leverage insights from data while also safeguarding protected information. In an increasingly complex and data-saturated decision landscape, even staff who traditionally have not employed data in their day-to-day functioning will be better able to meet critical business needs by attaining at least basic data literacy skills.

Achieving parity between skill needs and workforce capacity is an ongoing process involving several sequential steps:

1. Identifying critical data skills for the agency (and relevant sub-units) in the areas of analysis and evaluation, data management, and privacy protection;
2. Assessing current staff capacity for critical data skills, including by surveying staff for unseen skills (and skills that may not be related to employees’ job series); and
3. Developing an initial plan to address gaps between critical data skill needs and current capacity.

The identification of critical data skills will naturally be informed by the determination of agency key questions, and aided by the Data Science catalog created by GSA, but thoughtful consideration should be given to the less visible data skill needs of staff performing non-traditional data roles (such as IT, communications, and finance) as well as the needs of staff fulfilling more traditional programming and analysis functions. An initial plan to address identified skill gaps should establish a mechanism for providing the time and resources staff need to learn and apply new skills. This plan should leverage options for increasing staff capacity, such as acquiring easy-to-use tools and dashboards, making available additional training and educational opportunities, taking advantage of on-the-job learning experiences, participating in intentional data communities, and capitalizing on hiring and retention flexibilities.

Action 15: Identify Data Needs to Answer Key Agency Questions

By August 2020, all agencies will take initial steps to identify the data needed to answer key questions of interest to the agency. The learning agenda process is one critical tool to help agencies identify and prioritize the data needed to answer key agency questions by engaging senior leaders and stakeholders. By focusing on the priority questions, agencies will consider what data are currently available; any issues around data quality or coverage; and if data are not available, how they might be collected or acquired.

Action 16: Identify Priority Datasets for Agency Open Data Plans

By August 2020, agencies will identify and then use their initial list of highest priority datasets as the focus for enhancing their data inventories and catalogs and approaches to secure data access and sharing. Agencies will identify an initial set of priority agency datasets that are key to mission success and/or a priority for stakeholders outside of the agency. These datasets can be the initial focus for testing and implementing improvements to agency comprehensive data inventories and catalogs, as required by the Evidence Act, as well as for improving secure processes for data access and sharing, for concretely engaging stakeholders to help them understand and use federal data, and for obtaining effective feedback on the agency’s planning processes to improve open data access.