
4 data analytics trends that will dominate 2018



Together with social, mobile, and cloud, analytics and related data technologies have emerged as core disruptors of business in the digital age. As enterprises began shifting from data-generating to data-driven organizations in 2017, data collection and analysis became a growing focus for companies of all kinds.

In this article, we look ahead to 2018 and examine the approaches, roles, and areas of focus that will drive data analytics strategies.

1. Data lakes will need to demonstrate business value or die

2. The CDO will come of age

3. The rise of the data curator?

4. Data governance strategies will be key themes for all C-level executives

Original title: 4 data analytics trends that will dominate 2018

Together with social, mobile and cloud, analytics and associated data technologies have emerged as core business disruptors in the digital age. As companies began the shift from being data-generating to data-powered organizations in 2017, data and analytics became the center of gravity for many enterprises. In 2018, these technologies need to start delivering value. Here are the approaches, roles and concerns that will drive data analytics strategies in the year ahead.

Data lakes will need to demonstrate business value or die

Data has been accumulating in the enterprise at a torrid pace for years. The internet of things (IoT) will only accelerate the creation of data as data sources move from web to mobile to machines.


"This has created a dire need to scale out data pipelines in a cost-effective way," says Guy Churchward, CEO of real-time streaming data platform provider DataTorrent.

For many enterprises, buoyed by technologies like Apache Hadoop, the answer was to create data lakes — enterprise-wide data management platforms for storing all of an organization’s data in native formats. Data lakes promised to break down information silos by providing a single data repository the entire organization could use for everything from business analytics to data mining. Raw and ungoverned, data lakes have been pitched as a big data catch-all and cure-all.
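Conceptually, landing data in a lake is just "write it raw, in native format, partitioned for later discovery." A minimal sketch in Python, where the directory layout, partition key, and function name are illustrative assumptions rather than any product's actual scheme:

```python
import datetime
import json
import tempfile
from pathlib import Path

def land_raw_event(lake_root: Path, source: str, event: dict) -> Path:
    """Append one event to the lake in its native JSON form, partitioned
    by source system and ingestion date; no schema is imposed on write."""
    day = datetime.date.today().isoformat()
    part_dir = lake_root / source / f"ingest_date={day}"
    part_dir.mkdir(parents=True, exist_ok=True)
    # One file per event, numbered by arrival order within the partition.
    path = part_dir / f"event_{len(list(part_dir.iterdir()))}.json"
    path.write_text(json.dumps(event))
    return path

with tempfile.TemporaryDirectory() as tmp:
    written = land_raw_event(Path(tmp), "clickstream",
                             {"user": "alice", "page": "/home"})
    roundtrip = json.loads(written.read_text())
```

The appeal is exactly this simplicity on the write side; as the rest of the section argues, the hard part comes later, on the read side.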

But while data lakes have proven successful for storing massive quantities of data, gaining actionable insights from that data has proven difficult.

"The data lake served companies fantastically well through the data 'at rest' and 'batch' era," Churchward says. "Back in 2015, it started to become clear this architecture was getting overused, but it's now become the Achilles heel for real real-time data analytics. Parking data first, then analyzing it immediately puts companies at a massive disadvantage. When it comes to gaining insights and taking actions as fast as compute can allow, companies relying on stale event data create a total eclipse on visibility, actions, and any possible immediate remediation. This is one area where ‘good enough’ will prove strategically fatal."

Monte Zweben, CEO of Splice Machine, agrees.

"The Hadoop era of disillusionment hits full stride, with many companies drowning in their data lakes, unable to get an ROI because of the complexity of duct-taping Hadoop-based compute engines," Zweben predicts for 2018.

To survive 2018, data lakes will have to start proving their business value, says Ken Hoang, vice president of strategy and alliances at data catalog specialist Alation.

"The new dumping ground of data — data lakes — has gone through experimental deployments over the last few years, and will start to be shut down unless they prove that they can deliver value," Hoang says. "The hallmark for a successful data lake will be having an enterprise catalog that brings information discovery, AI, and information stewarding together to deliver new insights to the business."

However, Hoang doesn't believe all is lost for data lakes. He predicts data lakes and other large data hubs can find a new lease on life with what he calls “super hubs” that can deliver “context-as-a-service” via machine learning.

"Deployments of large data hubs over the last 25 years (e.g., data warehouses, master data management, data lakes, Salesforce and ERP) resulted in more data silos that are not easily understood, related, or shared," Hoang says. "A hub of hubs will bring the ability to relate assets across these hubs, enabling context-as-a-service. This, in turn, will drive more relevant and powerful predictive insights to enable faster and better operational business results."
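Hoang's recurring ingredient, a catalog that makes datasets discoverable across hubs, can be sketched minimally; every class, field, and dataset name below is an illustrative assumption, not a reference to Alation's actual design:

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Searchable metadata about one dataset living somewhere in a hub."""
    name: str
    path: str
    owner: str
    description: str
    tags: set = field(default_factory=set)

class DataCatalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list:
        # Naive discovery: match the term against descriptions and tags.
        term = term.lower()
        return [e for e in self._entries.values()
                if term in e.description.lower() or term in e.tags]

catalog = DataCatalog()
catalog.register(CatalogEntry("clickstream_raw", "/lake/clickstream",
                              "web-team", "Raw site click events", {"web", "raw"}))
catalog.register(CatalogEntry("orders", "/lake/orders",
                              "sales-ops", "Order transactions", {"sales"}))
hits = catalog.search("click")
```

A real enterprise catalog layers lineage, stewardship, and ML-driven recommendations on top, but the core value is the same: analysts can find out what exists before writing a single query.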

Ted Dunning, chief application architect for MapR, predicts a similar shift: With big data systems becoming a center of gravity in terms of storage, access and operations, businesses will look to build a global data fabric that will give comprehensive access to data from many sources and to computation for truly multi-tenant systems.

"We will see more and more businesses treat computation in terms of data flows rather than data that is just processed and landed in a database," Dunning says. "These data flows capture key business events and mirror business structure. A unified data fabric will be the foundation for building these large-scale flow-based systems."

These data fabrics will support multiple kinds of computation that are appropriate in different contexts, Dunning says. "The emerging trend is to have a data fabric that provides data-in-motion and data-at-rest needed for multi-cloud computation provided by things like Kubernetes."
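The flow-based view Dunning describes, where computation is composed stages over moving events rather than queries over landed data, can be sketched with plain Python generators; the stage and field names are illustrative assumptions:

```python
def parse(events):
    """Stage 1: turn raw 'user:action' strings into structured events."""
    for raw in events:
        user, action = raw.split(":")
        yield {"user": user, "action": action}

def only(action):
    """Stage 2 (parameterized): keep events matching one action type."""
    def stage(events):
        for e in events:
            if e["action"] == action:
                yield e
    return stage

def count_by_user(events):
    """Terminal stage: aggregate the flow into per-user counts."""
    counts = {}
    for e in events:
        counts[e["user"]] = counts.get(e["user"], 0) + 1
    return counts

# The feed would normally be an unbounded stream; a list stands in here.
raw_feed = ["alice:buy", "bob:view", "alice:buy", "bob:buy"]
purchases = count_by_user(only("buy")(parse(raw_feed)))
```

Nothing is "landed" between stages; each event moves through the pipeline as it arrives, which is the property that batch-first data lake architectures give up.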

Langley Eide, chief strategy officer of self-service data analytics specialist Alteryx, says IT won't be the only group on the hook for making data lakes deliver value: line-of-business (LOB) analysts and chief data officers (CDOs) will also have to take responsibility in 2018.

"Most analysts have not taken advantage of the vast amount of unstructured resources like clickstream data, IoT data, log data, etc., that have flooded their data lakes — largely because it's difficult to do so," Eide says. "But truthfully, analysts aren't doing their job if they leave this data untouched. It's widely understood that many data lakes are underperforming assets – people don't know what's in there, how to access it, or how to create insights from the data. This reality will change in 2018, as more CDOs and enterprises want better ROI for their data lakes."

Eide predicts that 2018 will see analysts replacing "brute force" tools like Excel and SQL with more programmatic techniques and technologies, like data cataloging, to discover and get more value out of the data.

The CDO will come of age

As part of this new push to get better insights from data, Eide also predicts the CDO role will come into its own in 2018.

"Data is essentially the new oil, and the CDO is beginning to be recognized as the linchpin for tackling one of the most important problems in enterprises today: driving value from data," Eide says. "Often with a budget of less than $10 million, one of the biggest challenges and opportunities for CDOs is making the much-touted self-service opportunity a reality by bringing corporate data assets closer to line-of-business users. In 2018, the CDOs that work to strike a balance between a centralized function and capabilities embedded in LOB will ultimately land the larger budgets."

Eide believes CDOs that enable resources, skills, and functionality to shift rapidly between centers of excellence and LOB will find the most success. For this, Eide says, agile platforms and methodologies are key.

Rise of the data curator?

Tomer Shiran, CEO and co-founder of analytics startup Dremio, a driving force behind the open source Apache Arrow project, predicts that enterprises will see the need for a new role: the data curator.

The data curator, Shiran says, sits between data consumers (analysts and data scientists who use tools like Tableau and Python to answer important questions with data) and data engineers (the people who move and transform data between systems using scripting languages, Spark, Hive, and MapReduce). To be successful, data curators must understand the meaning of the data as well as the technologies that are applied to the data.

"The data curator is responsible for understanding the types of analysis that need to be performed by different groups across the organization, what datasets are well suited for this work, and the steps involved in taking the data from its raw state to the shape and form needed for the job a data consumer will perform," Shiran says. "The data curator uses systems such as self-service data platforms to accelerate the end-to-end process of providing data consumers access to essential datasets without making endless copies of data."
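The "no endless copies" idea can be sketched as a virtual dataset: a named transformation applied lazily at read time over the raw source. All names here are illustrative assumptions, not Dremio's API:

```python
# Raw events as the engineers landed them: machine-friendly units,
# internal columns that consumers should never see.
RAW_EVENTS = [
    {"user": "alice", "ms": 420, "internal_flag": 1},
    {"user": "bob", "ms": 95, "internal_flag": 0},
]

class VirtualDataset:
    """A curated view: the transform runs at read time, so no second
    physical copy of the raw data is ever materialized."""
    def __init__(self, source, transform):
        self.source = source
        self.transform = transform

    def read(self):
        return [self.transform(row) for row in self.source]

# The curator's shaping step: drop internal columns, convert units.
sessions = VirtualDataset(
    RAW_EVENTS,
    lambda r: {"user": r["user"], "seconds": r["ms"] / 1000},
)
shaped = sessions.read()
```

The curator's judgment lives in the transform: which columns are safe to expose, which units consumers expect, which raw sources feed which analyses.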

Data governance strategies will be key themes for all C-level executives

The European Union's General Data Protection Regulation (GDPR) is set to go into effect on May 25, 2018, and it looms like a specter over the analytics field, though not all enterprises are prepared.

The GDPR will apply directly in all EU member states, and it radically changes how companies must seek consent to collect and process the data of EU citizens, explain lawyers from Morrison & Foerster's Global Privacy + Data Security Group: Miriam Wugmeister, Global Privacy co-chair; Lokke Moerel, European Privacy Expert; and John Carlin, Global Risk and Crisis Management chair (and former Assistant Attorney General for the U.S. Department of Justice's National Security Division).

"Companies that rely on consent for all their processing operations will no longer be able to do so, and will need other legal bases (i.e., contractual necessity and legitimate interest)," they explain. "Companies will need to implement a whole new ecosystem for notice and consents."
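One concrete consequence the lawyers describe is that each processing activity needs a documented legal basis, not blanket consent. A minimal sketch of such a register; the GDPR Article 6 basis names are real, but the record layout, class names, and the balancing-test rule are illustrative assumptions:

```python
from enum import Enum

class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACTUAL_NECESSITY = "contractual_necessity"
    LEGITIMATE_INTEREST = "legitimate_interest"

class ProcessingRegister:
    """Records, per processing purpose, which legal basis is relied on."""
    def __init__(self):
        self._records = []

    def record(self, purpose: str, basis: LegalBasis, note: str) -> None:
        # Illustrative guardrail: legitimate interest must be justified.
        if basis is LegalBasis.LEGITIMATE_INTEREST and not note:
            raise ValueError("legitimate interest needs a documented balancing test")
        self._records.append({"purpose": purpose, "basis": basis, "note": note})

    def bases_used(self) -> set:
        return {r["basis"] for r in self._records}

reg = ProcessingRegister()
reg.record("order fulfilment", LegalBasis.CONTRACTUAL_NECESSITY,
           "needed to ship purchased goods")
reg.record("fraud detection", LegalBasis.LEGITIMATE_INTEREST,
           "balancing test filed 2018-01")
```

The point of the sketch is the shift in shape: instead of one consent checkbox covering everything, each purpose carries its own basis and its own paper trail.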

Even though GDPR fines are potentially massive — administrative fines can be up to 20 million euros or 4 percent of annual global turnover, whichever is higher — many enterprises, particularly in the U.S., are not prepared.
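The fine cap quoted above is simply the greater of the two figures; a one-line check of the arithmetic:

```python
def gdpr_fine_cap(annual_turnover_eur: float) -> float:
    """Upper bound on a GDPR administrative fine: the greater of
    EUR 20 million or 4 percent of annual global turnover."""
    return max(20_000_000, 0.04 * annual_turnover_eur)

cap_small = gdpr_fine_cap(100_000_000)    # 4% is 4M, so the 20M floor applies
cap_large = gdpr_fine_cap(1_000_000_000)  # 4% is 40M, exceeding the floor
```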

"When the Y2K boom came around, everyone was preparing for odds that they may or may not face," says Scott Gnau, CTO of Hortonworks. "Today, it seems that barely anyone is properly preparing for the GDPR being enforced in May 2018. Why not? We're currently in a phase where every organization is not only trying to deal for 'what's next,' but they're struggling to maintain and deal with issues that need solving now. Many organizations are likely relying on chief security officers to define the rules, systems, parameters, etc., to help their global system integrators figure out the best course of action. That is not a realistic expectation to put on one individual's role."

Complying with the GDPR properly requires that the C-suite be informed, prepared, and communicative with all facets of the organization, Gnau says. Organizations will need a better handle on the overall governance of their data assets. But large breaches, like the Equifax breach that came to light in 2017, mean they will struggle to balance providing self-service access to data for employees with protecting that same data from prospective threats.

As a result, Gnau predicts data governance will be a focus point for all organizations in 2018.

"A key goal should be developing a system that balances democratization of data, access, self-service analytics, and regulation," Gnau says. "The way we architect data safely going forward will have an impact on everyone — customers in the U.S. and overseas, the media, your partners, and more."
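One way to read Gnau's balance is governance applied at read time: everyone can query, but regulated fields come back masked unless the caller holds a privileged role. A minimal sketch, with field names, roles, and the masking rule all illustrative assumptions:

```python
# Fields treated as regulated/PII in this sketch.
PII_FIELDS = {"email", "ssn"}

def governed_read(rows: list, role: str) -> list:
    """Self-service access for everyone; PII visible only to stewards."""
    if role == "steward":
        return rows
    return [{k: ("***" if k in PII_FIELDS else v) for k, v in row.items()}
            for row in rows]

customers = [{"id": 1, "email": "a@example.com", "plan": "pro"}]
analyst_view = governed_read(customers, "analyst")
```

Democratization and regulation stop competing once the policy is enforced by the access layer rather than by withholding the dataset entirely.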

Zachary Bosin, director of solution marketing for multi-cloud data management specialist Veritas Technologies, predicts a U.S. company will be one of the first to be fined under the GDPR.

"Despite the impending deadline, only 31 percent of companies surveyed by Veritas worldwide believe they are GDPR-compliant," Bosin says. "Penalties for non-compliance are steep, and this regulation will impact every and any company that deals with EU citizens."

