An integrated, strategic approach to higher-value analytics
Leaders and Innovators: How Data-Driven Organizations Are Winning with Analytics shows how businesses leverage enterprise analytics to gain strategic insights for profitability and growth. The key factor is integrated, end-to-end capabilities that encompass data management and analytics from a business and IT perspective; with analytics running inside a database where the data reside, everyday analytical processes become streamlined and more efficient. This book shows you what analytics is, what it can do, and how you can integrate old and new technologies to get more out of your data. Case studies and examples illustrate real-world scenarios in which an optimized analytics system revolutionized an organization's business. Using in-database and in-memory analytics along with Hadoop, you'll be equipped to improve performance while reducing processing time from days or weeks to hours or minutes. This more strategic approach uncovers the opportunities hidden in your data, and the detailed guidance to optimal data management allows you to break through even the biggest data challenges.
With data coming in from every angle in a constant stream, there has never been a greater need for proactive and agile strategies to overcome these struggles in a volatile and competitive economy. This book provides clear guidance and an integrated strategy for organizations seeking greater value from their data and becoming leaders and innovators in the industry.
* Streamline analytics processes and daily tasks
* Integrate traditional tools with new and modern technologies
* Evolve from tactical to strategic behavior
* Explore new analytics methods and applications
The depth and breadth of analytics capabilities, technologies, and potential makes it a bottomless well of insight. But too many organizations falter at implementation: too much, not enough, or the right amount in the wrong way all fail to deliver what an optimized and integrated system could. Leaders and Innovators: How Data-Driven Organizations Are Winning with Analytics shows you how to create the system your organization needs to dramatically improve performance, increase profitability, and drive innovation at all levels for the present and future.
Page count: 333
Foreword
Acknowledgments
About the Author
Introduction
Why You Should Read This Book
Let's Start with Definitions
Industry Trends and Challenges
Who Should Read This Book?
How to Read This Book
Let Your Journey Begin
Endnotes
Chapter 1: The Analytical Data Life Cycle
Stage 1: Data Exploration
Stage 2: Data Preparation
Stage 3: Model Development
Stage 4: Model Deployment
End-to-End Process
Chapter 2: In-Database Processing
Background
Traditional Approach
In-Database Approach
The Need for In-Database Analytics
Success Stories and Use Cases
In-Database Data Quality
Investment for In-Database Processing
Endnotes
Chapter 3: In-Memory Analytics
Background
Traditional Approach
In-Memory Analytics Approach
The Need for In-Memory Analytics
Success Stories and Use Cases
Investment for In-Memory Analytics
Chapter 4: Hadoop
Background
Hadoop in the Big Data Environment
Use Cases for Hadoop
Hadoop Architecture
Best Practices
Benefits of Hadoop
Use Cases and Success Stories
A Collection of Use Cases
Endnote
Chapter 5: Bringing It All Together
Background
Collaborative Data Architecture
Scenarios for the Collaborative Data Architecture
How In-Database, In-Memory, and Hadoop Are Complementary in a Collaborative Data Architecture
Use Cases and Customer Success Stories
Investment and Costs
Endnotes
Chapter 6: Final Thoughts and Conclusion
Five Focus Areas
Cloud Computing
Security: Cyber, Data Breach
Automating Prescriptive Analytics: IoT, Events, and Data Streams
Cognitive Analytics
Anything as a Service (XaaS)
Conclusion
Afterword
Index
End User License Agreement
Chapter 1: The Analytical Data Life Cycle
Figure 1.1 Analytical data life cycle
Figure 1.2 Technologies for the analytical data life cycle
Chapter 2: In-Database Processing
Figure 2.1 Traditional approach to analytics
Figure 2.2 In-database approach to analytics
Figure 2.3 Data and analytic ecosystem
Figure 2.4 Traditional approach
Figure 2.5 In-database process
Chapter 3: In-Memory Analytics
Figure 3.1 In-memory analytics
Figure 3.2 CFO mandate for the project
Figure 3.3 From silos model to a functional, integrated architecture
Figure 3.4 Integrating data and analytics
Figure 3.5 In-memory analytics process
Figure 3.6 Manual lookup and paperwork
Figure 3.7 Distribution of information
Chapter 4: Hadoop
Figure 4.1 Hadoop, HDFS, and MapReduce
Figure 4.2 Traditional architecture
Figure 4.3 Hadoop architecture
Chapter 5: Bringing It All Together
Figure 5.1 Collaborative data architecture
Figure 5.2 Hadoop as a staging warehouse for your structured data
Figure 5.3 Hadoop as a data lake
Figure 5.4 Hadoop for data exploration and discovery
Figure 5.5 In-database processing
Figure 5.6 Hadoop for data exploration
Figure 5.7 No Hadoop
Figure 5.8 Integrating Hadoop, in-database, and in-memory
Figure 5.9 Retail traditional process for analytics
Figure 5.10 Integrating in-database and in-memory
Figure 5.11 Public administration architecture for fraud
Figure 5.12 Integrating Hadoop, in-database, and in-memory
Chapter 6: Final Thoughts and Conclusion
Figure 6.1 Top five focus areas
Figure 6.2 Cloud computing
Figure 6.3 Typical cloud computing services
Figure 6.4 Cyber-attacks by industry
Figure 6.5 Prescriptive analytics
Figure 6.6 Internet of Things connectivity
Figure 6.7 Cognitive, prescriptive, predictive, and descriptive analytics
Figure 6.8 Types of disaster incidents
Introduction
Table 1 Outline of the Chapters
Chapter 2: In-Database Processing
Table 2.1 Traditional Run Times at Different Process
Table 2.2 In-Database Run Times at Different Process
Table 2.3 Benefits of In-Database Processing
Table 2.4 Variations of Title
Table 2.5 Duplicate Records
Chapter 4: Hadoop
Table 4.1 Big Data Sources
Chapter 6: Final Thoughts and Conclusion
Table 6.1 Causes of Data Breaches
Table 6.2 Different Types of Analytics (Descriptive, Predictive, and Prescriptive)
Table 6.3 Industry Uses of IoT
The Wiley & SAS Business Series presents books that help senior-level managers with their critical management decisions.
Titles in the Wiley & SAS Business Series include:
Agile by Design: An Implementation Guide to Analytic Lifecycle Management
by Rachel Alt-Simmons
Analytics in a Big Data World: The Essential Guide to Data Science and its Applications
by Bart Baesens
Bank Fraud: Using Technology to Combat Losses
by Revathi Subramanian
Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics
by Evan Stubbs
Business Forecasting: Practical Problems and Solutions
edited by Michael Gilliland, Len Tashman, and Udo Sglavo
Business Intelligence Applied: Implementing an Effective Information and Communications Technology Infrastructure
by Michael Gendron
Business Intelligence and the Cloud: Strategic Implementation Guide
by Michael S. Gendron
Business Transformation: A Roadmap for Maximizing Organizational Insights
by Aiman Zeid
Data-Driven Healthcare: How Analytics and BI Are Transforming the Industry
by Laura Madsen
Delivering Business Analytics: Practical Guidelines for Best Practice
by Evan Stubbs
Demand-Driven Forecasting: A Structured Approach to Forecasting, Second Edition
by Charles Chase
Demand-Driven Inventory Optimization and Replenishment: Creating a More Efficient Supply Chain
by Robert A. Davis
Developing Human Capital: Using Analytics to Plan and Optimize Your Learning and Development Investments
by Gene Pease, Barbara Beresford, and Lew Walker
Economic and Business Forecasting: Analyzing and Interpreting Econometric Results
by John Silvia, Azhar Iqbal, Kaylyn Swankoski, Sarah Watt, and Sam Bullard
Financial Institution Advantage and the Optimization of Information Processing
by Sean C. Keenan
Financial Risk Management: Applications in Market, Credit, Asset, and Liability Management and Firmwide Risk
by Jimmy Skoglund and Wei Chen
Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection
by Bart Baesens, Veronique Van Vlasselaer, and Wouter Verbeke
Harness Oil and Gas Big Data with Analytics: Optimize Exploration and Production with Data Driven Models
by Keith Holdaway
Health Analytics: Gaining the Insights to Transform Health Care
by Jason Burke
Heuristics in Analytics: A Practical Perspective of What Influences Our Analytical World
by Carlos Andre Reis Pinheiro and Fiona McNeill
Hotel Pricing in a Social World: Driving Value in the Digital Economy
by Kelly McGuire
Implement, Improve and Expand Your Statewide Longitudinal Data System: Creating a Culture of Data in Education
by Jamie McQuiggan and Armistead Sapp
Killer Analytics: Top 20 Metrics Missing from Your Balance Sheet
by Mark Brown
Mobile Learning: A Handbook for Developers, Educators, and Learners
by Scott McQuiggan, Lucy Kosturko, Jamie McQuiggan, and Jennifer Sabourin
The Patient Revolution: How Big Data and Analytics Are Transforming the Healthcare Experience
by Krisa Tailor
Predictive Analytics for Human Resources
by Jac Fitz-enz and John Mattox II
Predictive Business Analytics: Forward-Looking Capabilities to Improve Business Performance
by Lawrence Maisel and Gary Cokins
Statistical Thinking: Improving Business Performance, Second Edition
by Roger W. Hoerl and Ronald D. Snee
Too Big to Ignore: The Business Case for Big Data
by Phil Simon
Trade-Based Money Laundering: The Next Frontier in International Money Laundering Enforcement
by John Cassara
The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions
by Phil Simon
Understanding the Predictive Analytics Lifecycle
by Al Cordoba
Unleashing Your Inner Leader: An Executive Coach Tells All
by Vickie Bevenour
Using Big Data Analytics: Turning Big Data into Big Money
by Jared Dean
Visual Six Sigma, Second Edition
by Ian Cox, Marie Gaudard, Philip Ramsey, Mia Stephens, and Leo Wright
For more information on any of the above titles, please visit www.wiley.com.
Tho H. Nguyen
Copyright © 2016 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available:
ISBN 9781119232575 (Hardcover)
ISBN 9781119276913 (ePDF)
ISBN 9781119276920 (ePub)
Cover design: Wiley
Cover image: © aleksandarvelasevic/iStock.com
This book is dedicated to Ánh, Ana, and family, who provided their unconditional love and support with all the crazy, late nights and frantic weekends it took to complete this book.
By James Taylor
I have been working with advanced analytics since 2001 and have watched the market evolve and mature over the intervening years. Where once only banks focused on predictive analytics to manage risk, now companies across all industries do. The role of advanced analytics in managing the customer journey has gone from innovative to mainstream. The time to develop and deploy advanced analytics has gone from months to seconds, even as the amount of data being analyzed has exploded. Leading companies see analytics as a core competency, not just a point solution, and innovators are increasingly looking at data as a source of future innovation. How to become data-driven and win with analytics is on everyone's to-do list, and books like the one Tho has written are critical in developing a practical plan to achieve data-driven, analytic innovation.
Tho and I met a few years ago when we co-presented on analytics through our work as faculty members of the International Institute for Analytics. We have a shared interest in the technologies and approaches that are both driving the increased use of analytics in organizations and responding to the increased demand from organizations of every size and in every industry.
All organizations have data, and we live in an era where organizations have more of this data digitized and accessible than ever before. More digital channels and more devices generate digital exhaust about our customers, partners, suppliers, and even our equipment. Government and third-party data are increasingly accessible, with marketplaces and APIs making yet more data available to us. Our ability to store and analyze text, audio, image, and video data expands our reach yet further. All these data stretch our data infrastructure to the limit and beyond, driving the adoption of new technologies like in-database and in-memory analytics and Hadoop. But simply storing and managing the data is not enough. To succeed, we need to use the data to drive better decision making. This means we need to understand it, analyze it, and deploy the resulting insights so that they can be acted on. These new technologies must be integrated into an end-to-end data and analytic life cycle if they are to add value.
Over the years, I have spoken with literally hundreds of organizations that are using analytics to improve their decision making. I have helped train thousands of people in the key techniques and skills required for successful adoption of analytic technology. My experience is that organizations that can think coherently about their decision making, especially their day-to-day operational decision making, can see huge benefits from making those decisions more analytically, using their data to see what will work and what will not. Such a data-driven approach to decision making drives a degree of innovation in organizations second to none. Succeeding and innovating with analytical decision making, however, requires a coherent approach to the analytic life cycle and the effective adoption of data management and analytic technologies.
With this book, Tho has provided an overview of the analytical life cycle and technologies required to deliver data-driven analytic innovation. He begins with an overview of the analytical data life cycle: the journey, from data exploration through data preparation and analytic model development to deployment into an organization's decision making, that transforms data into strategic insights using analytics. This sets the scene for chapters on the critical technology categories that are transforming how organizations manage and use data. Each of these technologies is considered and put in its correct place in the life cycle, supported by real customer examples of the value to be gained.
First, in-database processing integrates advanced analytics into a database or data warehouse so data can be analyzed in situ. Eliminating the time and cost of moving data from where it is stored to somewhere it can be analyzed both reduces elapsed time and allows for more data to be processed in business-realistic time frames. Improved accuracy and reduced time to value are the result.
In-memory analytics delivers incredibly fast response to complex analytical problems to reduce time to analyze data. Increasing speed in this way allows for more iterations, more approaches to understanding the data, and greater likelihood of finding useful insight. This increased speed can also be used to analyze fast-changing or streaming data without waiting for the data to be stored somewhere.
Finally, the Hadoop big data ecosystem allows for the collection and management of more data (both structured and semi-structured) than ever before. Organizations that might once have thrown away or archived data perceived as low value can now store and access data cost-effectively. Integrated with traditional data storage techniques, Hadoop allows for broader and more flexible data management across the organization.
These new approaches are combined with an overview of some more traditional techniques to bring it all together at the end with a description of the kind of collaborative data architecture and effective analytic data life cycle required. A final chapter discusses the impact of cloud computing, cyber-security, the Internet of Things (IoT), cognitive computing, and the move to “everything as a service” business models on data and analytics.
If you are one of those business and IT professionals trying to learn how to use data to drive innovation in your organizations and become leaders in your industry, then you need an overview of the data management and analytical processes critical to data-driven success. This book will give you that overview, introduce you to critical best practices, and show you how real companies have already used these processes to succeed.
James Taylor is CEO and principal consultant, Decision Management Solutions, and a faculty member of the International Institute for Analytics. He is the author of Decision Management Systems: A Practical Guide to Using Business Rules and Predictive Analytics (IBM Press, 2012). He also wrote Smart (Enough) Systems (Prentice Hall, 2007) with Neil Raden and The Microguide to Process and Decision Modeling in BPMN/DMN with Tom Debevoise. James is an active consultant, educator, speaker, and writer working with companies all over the world. He is based in Palo Alto, California, and can be reached at [email protected].
First, I would like to recognize you, the reader of this book. Thank you for your interest in learning and in becoming leaders and innovators within your organization. I am contributing the proceeds to worthy causes that focus on technology and science to improve the world, from fighting hunger to advocating education to innovating social change.
There are many people who deserve heartfelt credit for assisting me in writing this book. This book would not have happened without the ultimate support and guidance from my esteemed colleagues and devoted customers from around the world. A sincere appreciation to my friends at Teradata and SAS, who encouraged me to write this book and helped me to validate the technical details and to keep it simple for nontechnical readers to understand.
I owe a huge amount of gratitude to the people who reviewed and provided input word by word, chapter by chapter: specifically, Shelley Sessoms, Bob Matsey, and Paul Segal. Reading pages of technical jargon, trying to follow my thoughts, and translating my words in draft form can be a daunting challenge, but you did it with grace, patience, and swiftness. Thank you for the fantastic input that helped me to fine-tune the message.
A sincere appreciation goes to all marketing professionals, IT professionals, and business professionals who I have interacted with over the years. You have welcomed me, helped me to learn, allowed me to contribute, and provided insights in this book. Finally, to all my family (the Nguyen and Dang crew), the St. Francis Episcopalian sponsors, the Rotary Club (the Jones Family, the Veale Family)—they all have contributed to my success, and I would not be where I am today without them. To my wife and daughter, thank you for being the love of my life and the light of my day.
Tho H. Nguyen
Tho Nguyen came to the United States in 1980 as a refugee from Vietnam with his parents, five sisters, and one brother. Sponsored by the St. Francis Episcopal Church in Greensboro, North Carolina, Tho had abundant guidance and support from his American family who taught him English and acclimated him to an exciting and promising life in America.
Tho holds a Bachelor of Science in Computer Engineering from North Carolina State University and an MBA degree in International Business from the University of Bristol, England. During his MBA studies, Tho attended L'École Nationale des Ponts et Chaussées (ENPC) in Paris, University of Hong Kong, and Berkeley University in California. Tho proudly represented the Rotary Club as an Ambassadorial Scholar which provided him a fresh perspective and a deep appreciation of the world.
With more than 18 years in the Information Technology industry, Tho works closely with technology partners, research and development, and customers globally to drive and deliver value-added business solutions in analytics, data warehousing, and data management. Integrating his technical and business background, Tho has extensive experience in product management, global marketing, and business alliance and strategy management. Tho is a faculty member of the International Institute for Analytics, an active presenter at various conferences, and a blogger on data management and analytics.
In his spare time, Tho enjoys spending time with his family, traveling, running, and playing tennis. He is an avid foodie who is very adventurous and likes to taste cuisines around the world.
Data management and analytic practices have changed dramatically since I entered the industry in 1998. Data volumes are exploding beyond imagination, easily in the petabytes. There are many varieties of data that we are collecting, both structured and semi-structured data. We are acquiring data at much higher velocity, demanding daily renewal, sometimes even hourly. As the Greek philosopher Heraclitus so wisely stated centuries ago, “The only thing that is constant is change.”
The management of data, and how we handle and analyze it, has changed dramatically since the start of the “big data” era. Ultimately, all of the data must deliver information for decision making. It is definitely an exciting time that creates many challenges but also great opportunities for all of us to explore and adopt new and disruptive technologies to help with data management and analytical needs. And, now, the journey of this book begins.
I have attended a number of conferences where I have been able to share with both business and IT audiences the technologies that can help them more effectively manage their data, in return creating a more streamlined analytical life cycle. I have learned from customers the challenges they encounter and the fascinating things they are doing with agile analytics to drive innovation and gain competitive advantage for their companies. These are the biggest and most common themes:
“How can I integrate data management and analytical processes into a unified environment to make my processes run faster?”
“I do NOT have days or weeks to prepare my data for analysis.”
“My analytical process takes days and weeks to complete, and by the time it is completed, the information is outdated.”
“My staff is spending too much time with tactical data management tasks and not enough time focusing on strategic analytical exploration.”
“What can I do to keep my staff from leaving because their work is no longer challenging?”
“My data is scattered all over. Where do I go to get the most current version of the data for analysis?”
A good friend of mine, who is an editor, approached me to consider writing a book that combines real-world customer successes based on the concepts they adopted from presentations and white papers that I authored over the years. After a few months of developing the abstracts, outlines, and chapters, we agreed to proceed with publishing this book with a focus on customer success stories in each section. My goals for this book are to:
Educate on what innovative technologies are available for integrating data management and analytics in a cohesive environment.
Inform about what fascinating technologies leading-edge companies are adopting and implementing to help them solve some of the big data challenges.
Share customer case studies and successes across industries such as retail, banking, telecommunications, e-commerce, and transportation.
Whether you are from business or IT, I believe you will appreciate the real-world best practices and use cases that you can leverage in your profession. These best practices have been proven to help provide faster data-driven insights and decisions.
Writing this book was a privilege and honor. Mixed feelings went through my head as I started writing the book even though I was excited about sharing my experiences and customer successes with other IT and business professionals. The reasons for the mixed feelings were twofold:
Will the technology discussed in this book still be considered as innovative or relevant when the book is published?
How can I bring value to the readers who consider themselves to be innovators and leaders in the IT market?
Customer interactions are very important to me and a highlight in my profession. I have talked to many customers globally, tried to understand their business problems, and advised them on the appropriate technologies and solutions to solve their issues. I also have traveled around the world, sharing with customers and prospects the latest technologies and innovation in the market and how some of the leading-edge companies have adopted them to be more competitive and become the pioneers of managing data and applying analytics in a unified environment. Before I dive into the details, I believe it is appropriate to set the tone and definitions to be referenced throughout this book and some trends in the industry that demand inventive technologies to sustain leadership in a competitive, global economy. The topics of this book are focused on data management and analytics and how to unite these two elements into one single entity for optimal performance, economics, and governance—all of which are key initiatives for business and IT in many corporations.
The term data management has been around for a long time and has transformed into many other trendy buzzwords over the years. However, for simplification purposes, I will use the term data management since it is the foundation for this book. I define data management as a process by which data are acquired, integrated, and stored for data users to access. Data management is often associated with the ETL (extraction, transformation, and load) process to prepare the data for the database or warehouse. The ETL process is very much embedded into the data management environment. The ultimate result from the ETL process is to satisfy data users with reliable and timely data for analytics.
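As a rough illustration of the ETL process described above, the following minimal Python sketch extracts a few raw records, transforms them into a consistent shape, and loads them into a database table for data users to query. It is only a sketch under stated assumptions: the sample records, the table name sales, and the function names extract, transform, and load are hypothetical and chosen for illustration, and real data management environments typically rely on dedicated ETL tools rather than hand-written scripts.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a source system.
RAW_CSV = """customer_id,amount,region
1, 120.50 ,east
2,80.00,WEST
3,,east
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, normalize types and casing."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        clean.append(
            (int(row["customer_id"]), float(amount), row["region"].strip().lower())
        )
    return clean


def load(conn, rows):
    """Load: store the cleaned rows where data users can query them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


# An in-memory SQLite database stands in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer_id, amount, region FROM sales ORDER BY customer_id"
).fetchall()
print(loaded)
```

The record with a missing amount is dropped during the transform step, so only reliable rows reach the table that analysts would query.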
There are many definitions for analytics, and the focus on analytics has recently been on the rise. Its popularity has reemerged since the 1990s because many companies across industries have recognized the value of analytics and the field of data analysis to analyze the past, present, and future with data. Analytics can be very broad and has become the catch-all term for a variety of different business initiatives. According to Gartner, analytics is used to describe statistical and mathematical data analysis that clusters, segments, scores, and predicts what scenarios have happened, are happening, or are most likely to happen.1 Analytics have become the link between IT and business to exploit massive mounds of data. Based on my interactions with customers, I define analytics as a process of analyzing large amounts of data to gain knowledge and understanding about your business and deliver data-driven decisions to make business improvements or changes within an organization.
Now that the definitions have been established, let's examine the state of the IT industry and what customers are sharing with me regarding the challenges they encounter in their organizations:
Data as a differentiator and an asset: Forbes Research concurs that data is a differentiator and an asset.2 As an industry, we are data rich but knowledge poor because organizations are unable to make sense of all the data they collect. We are barely scratching the surface when it comes to analyzing all of the data that we have access to or can acquire. In addition, the ability to analyze the data has become much more complex, and companies may not have the right infrastructure and/or tools to do the job effectively and efficiently. As data volumes continue to grow, it is imperative to have the proper foundation for managing big data and beyond.
Analytics for everything: Customers demand real-time analytics to empower data-driven decisions from the CEO to the factory operator. Based on recent TechRepublic research, 70 percent of the respondents use analytics in some shape or form to drive performance and decisions. Whether it is to open a brand new division or develop another product line, the right decision will have a significant impact on the bottom line and, ultimately, the organization's success. As business becomes more targeted, personalized, and public, it is vital to make precise, fact-based (data-driven), transparent decisions. These decisions need to have an auditable history to show regulatory compliance and risk management.
The “now” factor: It seems that the X factor a company should possess is immediate availability of products and services for its consumers. For example, the retail industry is facing the “now” factor challenge. Extremely low prices and great services are no longer enough to attract consumers. Businesses need to have what consumers are looking for, such as color, size, and fit, when they need it. That is the key to attracting and retaining customers. Consumers are willing to pay a premium for product availability. Based on a retail survey from Forbes, 58 percent said availability is more important than price, and 92 percent said they will not wait for products to come into stock. Companies must outsmart their competition and be able to share information with customers about product and service readiness.
These trends translate into challenges and opportunities for companies in every industry. The customers that I deal with consider the following their top three challenges:
Database performance: With a database architecture that may not scale to match the amount of data, it's difficult to process full data sets or to accomplish data discovery, analysis, and visualization activities.
Analytical capabilities: Because of inefficient data access and time-consuming data preparation, analysts tend to focus on solving access issues instead of running tactical analytical processes and strategic tasks. In addition, there is an inability to develop and process complex analytic models fast enough to keep up with economic changes.
Data quality and integration: A wide variety of data, siloed data marts, and localized data extracts make it difficult to get a handle on exactly how much data there is and what kind. When data are not in one location or data management is disjointed, data quality is questionable. When quality is questionable, results are uncertain.
Data is every organization's strategic asset. Data provide information for operational and strategic decisions. Because we are collecting many more types of data (from websites, social media, mobile, sensors, etc.) and the speed at which we collect the data has significantly accelerated, data volumes have grown exponentially. Customers that I have spoken to have doubled their data volumes in less than 24 months, a pace faster than the one Moore's law, first stated over 50 years ago, described for computing hardware (that the number of transistors on a chip doubles roughly every 24 months). With the pace of change escalating faster than ever, customers are looking for the latest innovations in technology to satisfy the needs of both IT and the business and to transform every challenge into an opportunity that positively impacts profitability and the bottom line. I truly believe that new and innovative technologies such as in-database processing, in-memory analytics, and the emerging Hadoop technology will help tame the challenges of managing big data, uncover new opportunities with analytics, and deliver a higher return on investment by augmenting data management with integrated analytics.
This book is for business and IT professionals who want to learn about new and innovative technologies and learn what their peers have done to be successful in their line of work. It is for the business analysts who want to be smarter at delivering information to different parts of the organization. It is for the data scientists who want to explore new ways to apply analytics. It is for managers, directors, and executives who want to innovate and leverage analytics to make data-driven decisions impacting profitability and the livelihood of their business.
You should read this book if your profession is in one of these groups:
Executive managers, including chief executive officers, chief operating officers, chief strategy officers, chief marketing officers, or any other company leader, who want to innovate and drive efficiency or deliver strategic goals
Line-of-business managers who oversee existing technologies and want to adopt new technologies for the company
Sales managers and account directors who want to introduce new concepts and technologies to their customers
Business professionals such as business analysts, program managers, and offer managers who analyze data and deliver information to the leadership team for decision making
IT professionals who manage the data, ensuring data quality and integration, so that the data can be available for analytics
This book is ideal for professionals who want to improve the data management and analytical processes of their organization, explore new capabilities by applying analytics directly to the data, and learn from others how to be innovative and become pioneers in their organization.
This book can be read in a linear manner, chapter by chapter. It proceeds very much as a process of crawling, walking, sprinting, then running. However, if you are already familiar with the concepts of in-database processing, in-memory analytics, or Hadoop, you can simply skip to the chapter that is most relevant to your situation. If you are not familiar with any of the topics, I highly suggest starting with Chapter 1, as it highlights the analytical life cycle of the data and the typical journey data take to become information and insights for your organization. You can then proceed to Chapters 2 to 4 (crawl, walk, sprint) to see how specific technologies can be applied directly to the data. Chapter 5 (how to run the relay) brings all of the elements together and shows how each technology can help to manage big data and advanced analytics. Chapter 6 discusses the top five focus areas in data management and analytics as well as possible future technologies.
Table 1 provides a description and focus for each chapter.
Table 1 Outline of the Chapters
Chapter 1. The Analytical Data Life Cycle
Description: The purpose of this chapter is to illustrate the typical life cycle of data and the stages (data exploration, data preparation, model development, and model deployment) involved in transforming data into strategic insights using analytics.
Takeaway:
What is the analytical data life cycle?
What are the characteristics of each stage of the life cycle?
What technologies are best suited for each stage of the life cycle?
Chapter 2. In-Database Processing
Description: The purpose of this chapter is to introduce the reader to the concept of in-database processing. In-database processing refers to the integration of advanced analytics into the database or data warehouse. With this capability, analytic processing is optimized to run where the data reside, in parallel, without having to copy or move the data for analysis.
Takeaway:
What is in-database processing?
Why in-database processing?
What processes should leverage in-database processing?
What are some best practices?
What are some use cases and success stories?
What are the benefits of using in-database analytics?
Chapter 3. In-Memory Analytics
Description: The purpose of this chapter is to introduce the reader to the concept of in-memory analytics. This latest innovation provides an entirely new approach to tackling big data by using an in-memory analytics engine to deliver super-fast responses to complex analytical problems.
Takeaway:
What is in-memory analytics?
Why in-memory analytics?
What processes should leverage in-memory analytics?
What are some best practices?
What are some use cases and success stories?
What are the benefits of using in-memory analytics?
Chapter 4. Hadoop and Big Data
Description: The purpose of this chapter is to explain the value of Hadoop. Organizations face unique big data challenges as they collect more data than ever before, both structured and semi-structured. There has never been a greater need for proactive and agile strategies to manage and integrate big data.
Takeaway:
What is Hadoop?
Why Hadoop in a big data environment?
How does Hadoop fit into the modern architecture?
What are some best practices?
What are some use cases and success stories?
What are the benefits of using Hadoop in big data?
Chapter 5. End-to-End – Bringing It All Together
Description: The purpose of this chapter is to summarize and bring together the various technologies and concepts shared in Chapters 2–4. Combining traditional methods with modern and new approaches can save time and money for any organization.
Takeaway:
How are in-database analytics, in-memory analytics, and Hadoop complementary?
What are some use cases and customer success stories?
What are some benefits of an integrated data management and analytic architecture?
Chapter 6. Conclusion and Forward Thoughts
Description: The purpose of this chapter is to conclude the book with the power of having an end-to-end data management and analytics platform for delivering data-driven decisions. It also provides final thoughts about the future of technologies.
Takeaway:
What is the future for data management?
What is the future for analytics?
What are the top five focus areas in data management and analytics?
An organization's most valuable asset is its customers. Yet right next to customers is another precious asset that the enterprise can leverage to attract, retain, and interact with those valuable customers for profitable growth: your data. Every organization that I have encountered faces tidal waves of data streaming in from every direction, from multiple channels and a variety of sources. Data are everywhere, as far as the eye can see! All day, every day, data flow into and through the business and your database or data warehouse environment. Now, let's examine how all your data can be analyzed in an efficient and effective process to deliver data-driven decisions.
1. Gartner, “Analytics,” IT Glossary, http://www.gartner.com/it-glossary/analytics/.
2. Forbes Insights, Betting on Big Data (Jersey City, NJ: Forbes Insights, 2015), http://images.forbes.com/forbesinsights/StudyPDFs/Teradata-BettingOnBigData-REPORT.pdf.