What is Big Data?
Big Data means many things to many people these days, and various definitions of it circulate in the IT industry. Most of them, however, share three notions. First, we are now creating more data, and at a much higher rate, than at any other time in history. Second, this data is mostly unstructured and difficult to analyze. Third, there is tremendous hidden insight in this data.
Big Data has four aspects: Volume, Velocity, Variety, and Value
Thought leaders and solution providers in Big Data commonly agree that it consists of three or four aspects, or dimensions:
Volume: Business and consumer data is growing to a volume that no longer fits our existing technological paradigm, in terms of both solutions and tools.
- Social Media: Facebook, Twitter, and other social media platforms create terabytes of data every day that reflect the attitudes and sentiments of consumers toward products, current affairs, and even policies.
- Machine Data: More and more devices are data-aware and generate billions of events that can be used in analytics to predict behavior, performance, failure, and much more.
- Enterprise Unstructured Data: Companies already hold a share of Big Data internally that has not yet been analyzed or fully leveraged; some put that share at 80% or more. It can include transaction documentation, technical content, communications, events, and much more.
Velocity: This massive volume of data is being created at a higher velocity, which requires us to analyze and react to Big Data quickly. The other aspect of velocity is the speed at which we need to find a needle of insight in the haystack of Big Data.
Tools such as Hadoop have been developed to address the first two V’s of the Big Data paradigm by creating new approaches to storing and retrieving massive amounts of data.
Variety: With new sources of business and consumer data, the variety of data collected has also increased, in both structured and unstructured forms. Video, audio, images, speech, tweets, Facebook posts, metadata, and many more forms of data are now being collected on a daily basis. This data by itself is of little value until it is related to existing structured data, such as the customer or product master data of a business.
Value: Big Data by itself could be considered either a waste of storage or a gold mine. The differentiator is the potential value of this data to the business, whether already perceived or not. Value is purpose-driven and thus needs to be aligned with business intent and objectives. Merely having or collecting a massive volume of Big Data will not serve any business. The intrinsic value lies in how this data can add insight to a company’s strategic or operational decision-making process.
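To make the Variety point concrete, here is a minimal Python sketch of relating unstructured posts to structured product master data. The product IDs, product names, and posts are all hypothetical sample data, and the simple name lookup is only an assumption chosen for illustration, not a description of any particular platform or product.

```python
import re
from collections import Counter

# Hypothetical product master data (structured): product_id -> product name.
PRODUCT_MASTER = {
    "P-100": "Acme Router",
    "P-200": "Acme Phone",
}

# Hypothetical unstructured social media posts.
POSTS = [
    "Loving my new acme phone, battery lasts all day!",
    "The Acme Router keeps dropping my connection.",
    "Great weather today.",
    "Anyone else think the ACME PHONE camera is amazing?",
]


def link_posts_to_products(posts, product_master):
    """Return (product_id, post) pairs for posts that mention a product name."""
    linked = []
    for post in posts:
        for product_id, name in product_master.items():
            # Case-insensitive whole-phrase match on the product name.
            if re.search(r"\b" + re.escape(name) + r"\b", post, re.IGNORECASE):
                linked.append((product_id, post))
    return linked


linked = link_posts_to_products(POSTS, PRODUCT_MASTER)
mentions_per_product = Counter(product_id for product_id, _ in linked)
print(mentions_per_product)
```

On their own the posts are just text. Once each one carries a product ID, they can be counted, joined to sales figures, or reported on alongside the existing structured data.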
Our collaborative approach starts with identifying your business objectives and goals as well as your intrinsic Big Data assets. In our engagement with you, we focus on:
Relevancy – Big Data’s value in your business is an extension of your existing structured data, which consists of master, transactional, and reporting data assets. The strategy is focused on identifying Big Data assets, internal and external, that enrich and extend the business insight of these existing data assets. Zendeux will develop Big Data solutions that specifically address your strategic and operational decision-making goals.
Expertise – We leverage the depth of the Zendeux Alliance to ensure you have the experts and resources to match your Big Data strategy and solution needs.
Big Data Architecture – Before a Big Data implementation, which may cost millions of dollars, a company needs a sound data architecture focused on integrating its Big Data with its existing data model. Zendeux takes legacy data architecture practices to the next level by providing unstructured data modeling processes that build a structured model from unstructured data content. This allows a target schema for the existing and extended unstructured data to be mapped in Big Data platforms or in an existing enterprise data warehouse.
Unstructured Text Processing – Using the state-of-the-art Textual ETL™ developed by Bill Inmon (the father of data warehousing and author of over 52 books), we can extract valuable structured data from free-text content such as legal documents, sales contracts, emails, tweets, technical documentation, and much more to enrich your existing reporting data.
Data Matching & Extraction – When unstructured data processing requires a complex matching mechanism, Zendeux uses its proprietary Fuzzy Matching Engine™ to extract exact data with high accuracy from even the most chaotic data sources.
Big Data Quality – Using the Zendeux Data Management Framework (ZDMF), we ensure the quality and integrity of your Big Data before it is loaded into Hadoop or another Big Data platform. This way, analytics on Big Data can be done with trust and integrity. One of the sacrifices businesses make in Hadoop implementations is quality, as most are designed to address Volume and Velocity. However, once the data has been loaded into Hadoop, it is much more difficult to perform data quality processing.
Trust – The Zendeux Framework helps ensure that the essential areas of Big Data management are addressed before the costly technological implementation starts. Our Big Data strategy and implementation are based on enabling trusted data for trusted decision making.
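As a rough illustration of the text processing, data matching, and data quality steps described above, the following Python sketch extracts fields from free text, fuzzy-matches the extracted name against master data, and rejects questionable records before any load. It is not Textual ETL™, the Fuzzy Matching Engine™, or ZDMF. It uses only the Python standard library (`re` and `difflib`), and the contract wording, customer records, regular expression, and 0.75 similarity threshold are all hypothetical assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical customer master data (structured): customer_id -> legal name.
CUSTOMER_MASTER = {
    "C-001": "Northwind Traders Inc",
    "C-002": "Contoso Limited",
}

# Hypothetical free-text contract snippets (unstructured).
CONTRACTS = [
    "This agreement is made with Northwind Traders, Inc. for a total of $12,500.00.",
    "This agreement is made with Contosso Ltd for a total of $8,000.",
    "This agreement is made with Fabrikam for a total of $300.",
]

# Assumed sentence shape for the sample contracts; real documents vary widely.
PATTERN = re.compile(
    r"made with (?P<party>.+?) for a total of \$(?P<amount>[\d,]+(?:\.\d+)?)"
)


def extract(text):
    """Pull the party name and amount out of one free-text snippet."""
    m = PATTERN.search(text)
    if m is None:
        return None
    amount = float(m.group("amount").replace(",", ""))
    return {"party": m.group("party"), "amount": amount}


def normalize(name):
    """Lowercase and drop punctuation so small spelling noise matters less."""
    return re.sub(r"[^a-z0-9 ]", "", name.lower()).strip()


def best_match(name, master, threshold=0.75):
    """Return (customer_id, score); customer_id is None below the threshold."""
    best_id, best_score = None, 0.0
    for customer_id, master_name in master.items():
        score = SequenceMatcher(
            None, normalize(name), normalize(master_name)
        ).ratio()
        if score > best_score:
            best_id, best_score = customer_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


def quality_check(record):
    """Return rule violations; an empty list means the record may be loaded."""
    problems = []
    if record["customer_id"] is None:
        problems.append("no confident customer match")
    if record["amount"] <= 0:
        problems.append("non-positive amount")
    return problems


accepted, rejected = [], []
for text in CONTRACTS:
    fields = extract(text)
    if fields is None:
        rejected.append((text, ["pattern not found"]))
        continue
    customer_id, score = best_match(fields["party"], CUSTOMER_MASTER)
    record = {"customer_id": customer_id, "amount": fields["amount"]}
    problems = quality_check(record)
    if problems:
        rejected.append((text, problems))
    else:
        accepted.append(record)

print("accepted:", accepted)
print("rejected:", rejected)
```

Checking records before the load step, rather than after, reflects the point made under Big Data Quality: a record with no confident match to master data is set aside for review while it is still cheap to fix.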