How to protect big data without breaking analytics
The new era of computing has arrived: organizations are eager to process, analyze and derive maximum value from big data. However, as the opportunity increases, the challenge of ensuring information is trusted and protected becomes exponentially more difficult. If that challenge is not addressed head on, confidence in big data outcomes is lost and the desire to act upon new insights is stifled.
The question becomes: how can you support business goals and real-time analysis while also ensuring the protection of sensitive data, whatever form it takes (structured, streaming, files and more)?
While this may seem like a daunting task, specific data protection issues can be addressed with a focused, practical approach that offers concrete benefits in the near term. Protecting sensitive information from eyes that don’t need to see it, whether those eyes reside within the organization or within a contractor or other trusted partner, is a reasonable and achievable objective. Let’s break down the problem into three quick tips.
1. Discover and understand sensitive data
Ask five of your colleagues which data records constitute payment card information and you are likely to get five different answers. Before rolling out an enterprise data protection strategy, you should convene a cross-functional team to decide what constitutes sensitive data and what should be protected.
Not all data is high risk. Many data protection efforts fail because the teams behind them don’t understand the distributed data landscape or where the sensitive data resides. Keep in mind that sensitive data is duplicated and shared across production systems, non-production systems and third parties such as business partners and vendors.
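To make the discovery step concrete, here is a minimal sketch of scanning free text for likely payment card numbers. It is an illustration only, not a complete discovery tool: the pattern, the function names and the sample text are all assumptions, and a real program would cover many more data types and data stores. The Luhn checksum is used to weed out digit runs that merely look like card numbers.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return the digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real card.
    sample = "Paid with 4111 1111 1111 1111; order ref 1234 5678 9012 3456."
    print(find_card_numbers(sample))
```

Running the same kind of scan against non-production copies and partner extracts, not just production, is what exposes the duplication described above.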
2. Monitor and audit data activity without slowing down performance
Monitoring and auditing data activity will give you complete insight into the who, what, when and how of all data transactions. With a complete access history, you can understand data and application access patterns, prevent data leakage, enforce data change controls and respond to suspicious activity in real time.
Leading monitoring solutions also deliver automated compliance reports on a scheduled basis, distribute them to oversight teams for electronic sign-offs and escalation, and document the results of remediation activities. Beware of solutions that rely on native logging, as they will likely inhibit rather than support your ability to do analytics in real time.
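As a rough illustration of the "who, what, when and how" idea, the sketch below models one audit event and a single, deliberately simple rule for flagging suspicious access. The field names, the object name payments.cards and the account names are hypothetical, and commercial monitoring products apply far richer policies than this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    who: str        # account that issued the request
    what: str       # object that was touched, such as a table or file
    when: datetime  # time of the access
    how: str        # operation performed, such as SELECT or UPDATE


# Hypothetical policy: only these accounts may touch these sensitive objects.
SENSITIVE_OBJECTS = {"payments.cards"}
APPROVED_ACCOUNTS = {"billing_app"}


def is_suspicious(event: AuditEvent) -> bool:
    """Flag access to a sensitive object by an account that is not approved."""
    return event.what in SENSITIVE_OBJECTS and event.who not in APPROVED_ACCOUNTS


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    events = [
        AuditEvent("billing_app", "payments.cards", now, "SELECT"),
        AuditEvent("intern_01", "payments.cards", now, "SELECT"),
        AuditEvent("intern_01", "marketing.campaigns", now, "SELECT"),
    ]
    for event in events:
        print(event.who, event.what, event.how, "suspicious:", is_suspicious(event))
```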
3. Mask sensitive information in applications, databases, reports, analytics and documents
Masking sensitive information in applications, databases, reports, analytics and documents facilitates information sharing and analytics without compromising data privacy.
Yes, you got that right. You can mask data inside your analytics platforms without breaking anything! The technology known as semantic masking de-identifies data in context, based on rules, to ensure accurate and consistent results for analytics. The value of semantic masking is that it retains the utility (usefulness) of the data while also adhering to compliance and regulatory requirements.
Let’s explore an example scenario: records containing symptoms, gender, age, family income and ethnicity that are shared with researchers. The semantically masked data will have the same symptoms and gender, but the age, family income and ethnicity are intelligently masked to the proper range and to a valid set of data points. As a result, researchers achieve valid results while privacy is protected.
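The scenario above can be sketched in a few lines of code. This is one possible reading of semantic masking, not a description of any particular product: the field names, band widths, placeholder category list and key are all assumptions. Age and income are replaced with a different value from the same band, ethnicity is replaced with a value drawn from a valid set, and a keyed hash makes the masking repeatable so the same record always masks the same way.

```python
import hashlib
import hmac

# Assumptions for illustration only: key, category list and band widths.
SECRET_KEY = b"demo-key-change-me"
VALID_ETHNICITIES = ["Group A", "Group B", "Group C", "Group D"]
AGE_BAND = 10          # ages stay within the same decade
INCOME_BAND = 25000    # incomes stay within the same 25,000-wide band


def _stable_int(field: str, value: str) -> int:
    """Derive a repeatable integer from a field name and value using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, f"{field}:{value}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def mask_in_band(field: str, record_id: str, value: int, width: int) -> int:
    """Replace `value` with a repeatable value from the same band of size `width`."""
    low = (value // width) * width
    return low + _stable_int(field, f"{record_id}:{value}") % width


def mask_record(record: dict) -> dict:
    """Return a copy with age, income and ethnicity masked; other fields are kept."""
    masked = dict(record)
    rid = str(record["id"])
    masked["age"] = mask_in_band("age", rid, record["age"], AGE_BAND)
    masked["family_income"] = mask_in_band(
        "family_income", rid, record["family_income"], INCOME_BAND
    )
    index = _stable_int("ethnicity", rid) % len(VALID_ETHNICITIES)
    masked["ethnicity"] = VALID_ETHNICITIES[index]
    return masked


if __name__ == "__main__":
    original = {
        "id": 17, "symptoms": "persistent cough", "gender": "F",
        "age": 37, "family_income": 61000, "ethnicity": "Group B",
    }
    print(mask_record(original))
```

Because the masked age and income stay in their original bands, range-based analysis still produces the same groupings. Note that swapping ethnicity for another valid value keeps the data well formed but would not preserve analysis by ethnicity; whether that trade-off is acceptable is a policy decision for the team defining the masking rules.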
With 2.5 quintillion bytes of data created every day, now is the time to understand sensitive data and establish business-driven security policies to keep customer, business, personally identifiable information (PII) and other types of sensitive data safe. Discovery, monitoring and auditing, and data masking are the foundation of a successful data security strategy.
The bottom line: the increasing number of analytics systems storing sensitive data exponentially increases the risk of a breach. More data stores mean far greater risk.