Data governance is not new, but how it needs to be managed continues to evolve as new disruptions emerge.
From reevaluating what information should be collected, to determining the best ways to gather and study data for optimal decision-making, to updating protections for integrity and security, data management should never be a stagnant process. This rapidly moving and increasingly complex data infrastructure landscape has led federal agencies to commit resources with new seriousness, from procuring the most advanced systems to investing in workforce training.
Collecting, analyzing, protecting, and securing the abundance of data now available has never been more challenging. The relentless march to the cloud and the deployment of this information are moving faster than the development of new solutions to support these efforts.
Given this conundrum, simplifying data governance requires a combination of internal and external processes. It is up to government CIOs and CTOs to implement best practices that will effectively simplify their data governance. Below are three ways agency technology departments can do this while reducing costs and without risking non-compliance.
Minimize Data Copies
Curbing redundancies is as obvious a first step as it sounds. The best way to simplify how you protect and govern your data is simply to try to have less of it.
Right now, most agencies hold too many copies and replicas of the same transaction, spread across the source system, the data lake, the enterprise data warehouse (EDW), the on-premises analytical data mart and the cloud database. This not only makes it difficult to maintain a single version of the truth, but the sheer volume also burdens systems and staff.
Most importantly, redundant data gives attackers more pathways to exploit. With a logical data fabric built on data virtualization technology, agencies can create virtual data views on top of existing data repositories instead of building new analytical data marts or additional downstream repositories to support new use cases and data requirements.
This means that less data needs to be replicated or copied and that a single authoritative data source can often be maintained. Quite simply, by reducing the repetition and duplication of the data, you also reduce the effort of governing it.
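As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module, with an ordinary SQL view standing in for a data virtualization layer. The table names, columns and figures are hypothetical, and a real data fabric would span separate systems rather than one in-memory database. It is meant only to show that a view serves a new use case without copying any rows.

```python
import sqlite3

# In-memory database standing in for existing repositories (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_transactions (id INTEGER PRIMARY KEY, agency TEXT, amount REAL);
    CREATE TABLE warehouse_agencies (agency TEXT PRIMARY KEY, region TEXT);
    INSERT INTO source_transactions VALUES (1, 'A', 100.0), (2, 'B', 250.0);
    INSERT INTO warehouse_agencies VALUES ('A', 'East'), ('B', 'West');

    -- A virtual view for a new reporting use case: no rows are copied.
    CREATE VIEW spend_by_region AS
        SELECT a.region AS region, SUM(t.amount) AS total
        FROM source_transactions AS t
        JOIN warehouse_agencies AS a ON a.agency = t.agency
        GROUP BY a.region;
""")


def spend_by_region(connection):
    """Query the view; results are computed from the source tables at read time."""
    rows = connection.execute(
        "SELECT region, total FROM spend_by_region ORDER BY region"
    )
    return dict(rows.fetchall())


before = spend_by_region(conn)
# A new transaction lands in the single authoritative table...
conn.execute("INSERT INTO source_transactions VALUES (3, 'A', 50.0)")
# ...and the view reflects it on the next read, with no copy to refresh.
after = spend_by_region(conn)
```

Because the view holds no data of its own, there is nothing extra to secure, reconcile or keep in sync when the source changes.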
Centralize Access Control and Monitoring
In a highly distributed data landscape, it is vital that government agencies always know who has access to what. But as with most overloaded systems, this is often easier said than done.
Different data repositories and technologies each have their own access rules, and matters become more complicated with granular row-level and column-level access control. Furthermore, changing user access often requires the involvement of multiple teams and multiple systems.
The answer is to create a way to monitor access from a single control plane.
Mapping all the data sources into a logical data fabric layer and centralizing all data access points can dramatically simplify the data access control process. Once this is in place, a domain team or group can be defined once within the data virtualization layer and then used to map granular access control over the entire data estate. This gives IT the agility to enforce both agency-wide and department-only data access control.
By doing this, it is also possible to confidently track and report all data usage, including the most sensitive, regardless of the underlying data repositories or the end-user BI tools used.
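To make the single control plane concrete, here is a minimal Python sketch under stated assumptions: the sources, group names, columns and the `query` function are all hypothetical and do not represent any product's API. Each group is defined once, the same rule is applied to every mapped source at row and column level, and every request is recorded in one place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Access rule for one group: visible columns plus an optional row filter."""
    columns: frozenset
    department: Optional[str] = None  # None means agency-wide row access


# Hypothetical repositories mapped into one logical layer.
SOURCES = {
    "data_lake": [
        {"id": 1, "department": "finance", "amount": 100, "ssn": "123-45-6789"},
        {"id": 2, "department": "hr", "amount": 75, "ssn": "987-65-4321"},
    ],
    "cloud_db": [
        {"id": 3, "department": "finance", "amount": 40, "ssn": "555-12-3456"},
    ],
}

# Each group is defined once and applies to every source.
POLICIES = {
    "finance_analysts": Policy(
        columns=frozenset({"id", "department", "amount"}), department="finance"
    ),
    "auditors": Policy(columns=frozenset({"id", "department", "amount", "ssn"})),
}

AUDIT_LOG = []  # one place to track who asked for what


def query(group, source):
    """Return the rows and columns of `source` that `group` is allowed to see."""
    if group not in POLICIES:
        raise PermissionError(f"no policy defined for group {group!r}")
    policy = POLICIES[group]
    AUDIT_LOG.append((group, source))
    return [
        {key: value for key, value in row.items() if key in policy.columns}
        for row in SOURCES[source]
        if policy.department is None or row["department"] == policy.department
    ]


finance_view = query("finance_analysts", "data_lake")
auditor_view = query("auditors", "cloud_db")
```

Changing what a group may see then means editing one entry in `POLICIES`, rather than coordinating changes across each repository.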
Simplify by Knowing Why
Knowing who is accessing the data is different from knowing why they want to access it. Capabilities that track and monitor the insights generated from historical activity are critical for learning how to improve system performance, drive automation, and enable self-service.
By collecting all operational and active metadata within a single platform, federal agencies can also easily and confidently track and report on all data usage and activities. This can be aided by establishing data security policies that clearly define how individuals and groups can access data and the kind of access they are allowed.
Data security policies can allow the data science team to access and view detailed customer profile information while hiding and masking sensitive PII on the fly. Global data security policy is all about providing the right data to the right person at the right time and ensuring that it is done in a scalable and cost-effective way. Such policies can only be developed with a historical understanding of what data is being gathered and viewed, and why.
In an age of increasing complexity in all aspects of data technology and architecture, government agencies need to simplify their approach to data governance to succeed. These steps, joined by decoupling the data security policy from the underlying data repositories, can make a huge difference.
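On-the-fly masking of this kind can be sketched in a few lines of Python. The field names, the role names and the keep-the-last-four-characters rule are assumptions chosen for illustration, not a prescribed standard. The stored record is never altered, and only the copy returned to the requester is masked.

```python
# Assumed set of sensitive fields; a real policy would come from governance metadata.
PII_FIELDS = frozenset({"ssn", "email"})


def mask_value(value):
    """Hide all but the last four characters of a value."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def apply_policy(record, role, unmasked_roles=frozenset({"privacy_officer"})):
    """Return a copy of `record` with PII masked unless `role` is exempt."""
    if role in unmasked_roles:
        return dict(record)
    return {
        key: mask_value(value) if key in PII_FIELDS else value
        for key, value in record.items()
    }


profile = {"name": "Jane Doe", "segment": "veteran", "ssn": "123-45-6789"}
scientist_view = apply_policy(profile, "data_scientist")
officer_view = apply_policy(profile, "privacy_officer")
```

Here the data science role still sees the profile attributes it needs for analysis, while the sensitive field is obscured at read time.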
This is especially true for enabling real-time, interactive access to data across the entire data landscape. It still allows users to create consolidated data views critical for data insights, without tripping over the redundancy nightmare of endless versions and copies of the same data scattered across the platform.
It’s possible to simplify while staying compliant. The time is now.
Syed Rasheed is a Technology Strategist at Denodo, a provider of data integration and management services to companies and governments.
Have an Opinion?
This article is an Op-Ed and the opinions expressed are those of the author. If you would like to respond, or have an editorial of your own you would like to submit, please email C4ISRNET Senior Managing Editor Cary O’Reilly.