Foreword
The global big data industry is in a period of rapid development, and both technological evolution and application innovation are accelerating. New data storage, computing, and analysis technologies such as non-relational databases, distributed and parallel computing, machine learning, and deep mining are evolving quickly. While big data mining and analysis is producing commercial and practical value in fields such as telecommunications, the Internet, finance, transportation, and medicine, it is also beginning to permeate traditional primary and secondary industries. Big data is progressively becoming a basic national strategic resource and an essential factor of production in society.
At the same time, big data security issues are increasingly apparent. Because big data entails centralized storage and management of very high-value data, it has become a major target of cyber attacks. Ransomware attacks on big data and data leaks are becoming more serious every day, and big data security incidents occur frequently around the world. In response to big data security demands, security technologies, solutions, and products are being developed and brought to market, but they lag behind the development of the industry.
During the Politburo’s second collective study session, on the national big data strategy (国家大数据战略第二次集体学习), Chairman Xi Jinping said that China must: build a digital economy with data as a key factor; advance the integrated development of the real economy and the digital economy; and advance integration of the Internet, big data, artificial intelligence, and the real economy. At the same time, we must effectively ensure national data security. This demands that we persist in the overall national security view; establish the proper cybersecurity view; persist in “protecting development through security and using development to support security”; and give full play to big data’s important role in areas such as advancing industrial transformation and upgrading and modernizing national governance. We must also: profoundly recognize the importance and urgency of big data security; clearly identify big data security challenges; actively face complicated and severe security risks; persist in emphasizing both security and development; accelerate the construction of a big data security safeguard system; and ensure that the national big data development strategy is successfully implemented.
This report starts from the transformations brought about by big data and discusses in depth how big data security differs from traditional security. It then focuses on technology, giving an overview of big data security technology and discussing security threats and developments in security safeguard technology in three areas: platform security, data security, and personal privacy security. Finally, based on the current state of big data security technology development, it assesses future directions and offers recommendations, in order to provide a foundation and reference for the development of the big data industry and of security technology.
Table of Contents
Foreword
1. Understanding and Thinking About Big Data Security
2. Overall View of Big Data Security Technology
2.1. Big Data Platform Security
2.2. Data Security
2.3. Privacy Protection
3. Technological Problems and Challenges for Big Data Security
3.1. Platform Security Problems and Challenges
3.2. Data Security Problems and Challenges
3.3. Personal Privacy Security Challenges
4. The Situation of Big Data Security Technology Development
4.1. Big Data Platform Security Technology [Omitted]
4.2. Data Security Technology [Omitted]
4.3. Personal Privacy Protection Technology
4.4. The Present Conditions of Big Data Security Technology Development
5. Recommendations for the Future Development of Big Data Security Technology
5.1. Structure an Integrated Big Data Security Defense System From the Heights of an Overall Security View
5.2. Start From the Aspect of Attack Defense to Strengthen Big Data Platform Security Protection
5.3. With Key Links and Technologies as Breakthrough Points, Improve the Data Security Technology System
5.4. Strengthen Privacy Protection Core Technology Industrialization Investment, Taking Into Account the Two Important Priorities of Data Use and Privacy Protection
5.5. Emphasize Big Data Security Review Technology R&D, and Structure a Third Party Security Review and Assessment System
1 Understanding and Thinking About Big Data Security
Compared with traditional data, big data has distinctive features in scale, processing, applications, and other aspects. Big data is a high-volume, structurally diverse, and timely form of data. Processing it requires new technologies such as computing frameworks and intelligent algorithms. Big data applications emphasize applying new concepts to assisted decision-making and to discovering new knowledge, and even more so to optimizing online closed-loop business workflows. From a security perspective, what influence have these distinctive features of big data had? We believe that:
1.1 Big data has already had a profound impact on the operating mechanisms of the economy, the mode of social life, and national governance. We must understand and solve big data security issues from the “great security” (大安全) perspective.
In the big data development process, resources, technologies, and applications are mutually dependent and develop in an upward spiral. Whether in formulating commercial tactics, in social governance, or in national strategy, big data’s ability to support decision-making is emphasized more and more. But big data must also be seen as a double-edged sword. It may not be possible to predict or prepare for the influence or destructive power of big data analysis and forecasting results. For example, when the analytical results from a U.S. fitness application’s user fitness data were published online, the result was a suspected leak of U.S. military secrets; this was previously unimaginable. In the future, intelligent policy decisions made on the basis of big data may play an even more important role in economic processes, the life of society, and national governance. Big data may have a profound influence on the national “11 kinds of security” (11 种安全).
It is therefore necessary to examine big data security issues from the “great security” perspective. We must examine the scene from the heights of the overall security view, break down traditional modes of thinking in security protection for key technologies, and build a big data security assurance system that touches on economics, law, technology, and other perspectives.
1.2 Big data is gradually evolving into a new-generation fundamental support technology. Big data platforms’ own security is becoming an important factor influencing the security of big data and the integrated real economy.
Big data is becoming a general-purpose data processing technology. In addition to advancing innovation in artificial intelligence, virtual reality, and other new information technology applications, the Internet and big data are accelerating the advancement of digitization, networkization, and intelligentization through deep integration with the real economy. Even so, behind the booming development in informatization and industrialization, security issues are naturally emerging. As the methods of cyber attack on big data platforms change, attack objectives have changed from simply stealing data and paralyzing systems to intervening in and controlling analytical results. Attack effects have shifted from directly observable system downtime and information leakage to small, hard-to-detect errors in analytical results, with consequences that could escalate from a cybersecurity incident to an industrial manufacturing accident. Traditional cybersecurity technology based on monitoring, early warning, and response now has trouble coping with these attacks. We must innovate in theory, counter constantly evolving forms of cyber attack, and design and construct a better big data platform protection system in order to raise the level of cross-sectoral foundational security assurance.
1.3 In the big data era, data value is maximized in the flow process. It is necessary to build a data-centric security defense system that suits the trend of cross-boundary (跨界) data flows.
In the big data era, data is a special kind of asset that, in the process of circulation and use, continually creates new value. In big data applications, therefore, data in motion is the norm and data at rest is the exception. At the same time, it can be foreseen that the future big data business environment will be more open, the business ecosystem more complicated, the roles in processing data more multifaceted, and the boundaries between systems, businesses, and organizations more blurry, leading to even richer and more diversified production, flows, processing, etc., for data. Data’s frequent cross-boundary flows not only may lead to risks of traditional data leaks; they may also produce new risks. Especially in the data sharing channels, traditional data access control technology cannot solve cross-organization data permissions management and data routing issues. Relying only on written contracts or agreements, it is difficult to achieve monitoring and auditing of processing on the data recipient’s side, which could easily lead to the risk of data abuse. The most prominent case is the Cambridge Analytica incident exposed this year. In the future, data sharing and flows will become a hard business requirement. Traditional static isolation security protection methods are thoroughly unable to fulfill data flow security protection needs. We must analyze and judge security risks from the angle of changing trends and build a data-centric, continuous data security protection system.
1.4 Big data promotes the vigorous development of new business models in the digital economy, but the masses face difficult tensions between increasing convenience from ubiquitous information services and protecting personal information rights.
In recent years in China, new business models in e-commerce, mobile payments, the sharing economy, etc., have developed rapidly. Information services built on the Internet, the mobile Internet, and the Internet of Things already permeate every aspect of social life and provide the masses with convenient, efficient, and constant services. For example, with inclusive finance, financial technology companies can use big data mining and analysis of personal data to better understand user needs and provide personalized services. Using big data to control financial risks can realize pipeline operations, minimize operating costs, improve service efficiency, and improve user experience. One Internet financial services enterprise, for instance, has coined the “310” personal credit services model, meaning “three minutes filling out a form, one minute evaluating a loan, and zero human intervention.” Traditional credit services cannot compare with this user experience, and business costs have fallen from 2,000 RMB to 2.3 RMB per loan. Still, users enjoy this convenient service at the cost of ceding (出让) their personal information rights. Information services such as everyday recommendations, personalized newspapers, and no-deposit car rentals are all based on big data mining and analyzing users’ personal data, forming a user profile, and providing a customized service. With big data use as backdrop, however, ubiquitous data collection technology and specialized and diversified data processing technology make it difficult for users to control the conditions of collection and use of their personal information, and users’ right to self-determination over their personal information has naturally been weakened.
In particular, as data sharing between businesses becomes more frequent, combining data from different sources with big data’s extremely strong analytical capabilities may re-identify data that had previously been anonymized, causing today’s de-sensitization technologies to fail and directly threatening users’ privacy.
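The re-identification risk described above can be illustrated with a minimal sketch of a linkage attack. Everything in it is an assumption made for illustration: the function name `link_records`, the field names, and the toy records are invented and do not come from this report. The sketch joins a "de-sensitized" dataset to a second dataset on shared quasi-identifiers and reports the records that match exactly one named person.

```python
from collections import defaultdict


def link_records(anonymized, public, quasi_identifiers):
    """Join two datasets on shared quasi-identifiers (hypothetical helper).

    Returns a dict mapping a name to the sensitive value of every
    anonymized record that matches exactly one person in `public`.
    """
    # Index the named dataset by its quasi-identifier values.
    index = defaultdict(list)
    for person in public:
        key = tuple(person[q] for q in quasi_identifiers)
        index[key].append(person["name"])

    reidentified = {}
    for record in anonymized:
        key = tuple(record[q] for q in quasi_identifiers)
        candidates = index.get(key, [])
        # A unique match means the "anonymous" record is re-identified.
        if len(candidates) == 1:
            reidentified[candidates[0]] = record["sensitive"]
    return reidentified


# Fabricated toy data: names were removed from the first dataset,
# but postal code, birth year, and gender were kept.
anonymized = [
    {"zip": "100081", "birth_year": 1985, "gender": "F", "sensitive": "loan overdue"},
    {"zip": "100081", "birth_year": 1990, "gender": "M", "sensitive": "no issues"},
    {"zip": "200030", "birth_year": 1985, "gender": "F", "sensitive": "high credit limit"},
]
# Fabricated second dataset, shared by another business, that includes names.
public = [
    {"name": "User A", "zip": "100081", "birth_year": 1985, "gender": "F"},
    {"name": "User B", "zip": "100081", "birth_year": 1990, "gender": "M"},
    {"name": "User C", "zip": "100081", "birth_year": 1990, "gender": "M"},
    {"name": "User D", "zip": "200030", "birth_year": 1985, "gender": "F"},
]

result = link_records(anonymized, public, ["zip", "birth_year", "gender"])
print(result)
```

In this toy data, two of the three "anonymous" records are tied to a single named user, while the third stays ambiguous because two people share its quasi-identifiers. Removing names alone therefore does not prevent re-identification once datasets from different sources are combined.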
In sum, big data security is a comprehensive issue touching on areas such as technology, law, regulation, and social governance, and it can influence national security, industrial security, and people’s legitimate rights and interests. At the same time, innovation in areas such as big data’s scope, processing methods, and theories of application will not only bring about change in the security requirements of big data platforms, but also will drive changes in data security protection concepts and bring about requirements and expectations for high-level privacy protection technology.
2 Overall View of Big Data Security Technology
As mentioned above, big data security is a cross-disciplinary, comprehensive issue that can be researched from perspectives including law, economics, and technology. This report uses technology as an entry point to comb through big data’s current security requirements and related technologies. It puts forward an overview of big data security; see Figure 1. In drawing up the overview, we referred to domestic and international (e.g. NIST) big data technology reference frameworks and research. Big data platforms provide storage and computing resources to the applications layered above them, and they host the tools used in data processes such as collection, storage, computing, analysis, and display. We therefore start from big data platforms to assemble the big data security overview.
In the overview, big data security technology systems are divided into three layers: big data platform security, data security, and personal privacy protection, with each resting on the one before it. Big data platforms must ensure not only the security of their own basic components but also provide security assurance mechanisms for the data and applications operating on the platform. On top of platform security assurance, data security protection technology provides security protection strategies for data flows in enterprise applications. Privacy security protection, in turn, protects personal sensitive information on the foundation of data security.