Thursday, January 29, 2015

About big data

From: http://www.slideshare.net/timoelliott/ukisug2014-big-data-presentation

Types of big data:


  • Human generated data:
    • Swipe of credit card
    • Scan of bar code
    • Actions captured on mobile phones
  • Machine generated data:
    • Data loggers
    • Sensors


From hindsight to insight to foresight:

  • Descriptive: what happened?
  • Diagnostic: why did it happen?
  • Predictive: will it happen again?
  • Prescriptive: how can we make it happen?
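As a rough illustration of these four stages, here is a minimal sketch on a toy series. The sales figures, the function names and the naive "extend the average change" forecast are all assumptions made up for this example; real analytics would use far richer models.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for illustration only.
sales = [100, 110, 120, 130, 140, 150]

def descriptive(series):
    """What happened: summarize the past."""
    return {"total": sum(series), "average": mean(series)}

def diagnostic(series):
    """Why did it happen: here, only the month-over-month changes."""
    return [b - a for a, b in zip(series, series[1:])]

def predictive(series):
    """Will it happen again: naive forecast that extends the average change."""
    return series[-1] + mean(diagnostic(series))

def prescriptive(series, target):
    """How can we make it happen: extra units needed to hit a target next month."""
    return max(0, target - predictive(series))

summary = descriptive(sales)
forecast = predictive(sales)
gap = prescriptive(sales, target=170)
```

Each function builds on the previous one, which mirrors the progression from hindsight to foresight: the forecast reuses the diagnostic step, and the prescriptive step reuses the forecast.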
Types of data (sorted by % of use, in descending order) - Ref 2:
  • Time series
  • Business transactions
  • Geospatial/location
  • Graph (network)
  • Clickstream
  • Health records
  • Sensor
  • Image
  • Genomic
According to Gartner (Ref 3):
  • Rising toward the Peak of Inflated Expectations:
    • Data science
    • Predictive analytics
  • On the downslope into the Trough of Disillusionment:
    • Complex event processing
    • Big Data
    • Content analytics

Big data can be used for:
  • Engage and empower data consumers:
    • Discover new business opportunities
    • Identify new product opportunities
    • Make more reliable decisions
  • Plan and optimize: improve operational efficiency
  • Personalize experience
Data lake: a storage repository that holds a vast amount of raw data in its native format until it is needed.
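To make the "raw data in its native format" idea concrete, here is a minimal sketch in which a temporary directory stands in for the lake. The file names, the `ingest` and `read_records` helpers and the sample records are hypothetical; the point is only that nothing is parsed at write time, and a schema is applied when the data is read.

```python
import csv
import io
import json
import tempfile
from pathlib import Path

def ingest(lake: Path, name: str, raw: str) -> None:
    """Store the payload exactly as received: no parsing, no schema."""
    (lake / name).write_text(raw, encoding="utf-8")

def read_records(lake: Path):
    """Apply a schema only at read time, chosen by file extension."""
    for path in sorted(lake.iterdir()):
        text = path.read_text(encoding="utf-8")
        if path.suffix == ".json":
            # One JSON object per line.
            yield from (json.loads(line) for line in text.splitlines() if line)
        elif path.suffix == ".csv":
            yield from csv.DictReader(io.StringIO(text))

with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    # Hypothetical human-generated and machine-generated inputs.
    ingest(lake, "clicks.json", '{"user": "a", "page": "/home"}\n')
    ingest(lake, "sensors.csv", "sensor,value\nt1,21.5\n")
    records = list(read_records(lake))
```

Note that the CSV values come back as strings: deciding that `value` is a number is left to whoever reads the data, which is the trade-off of deferring the schema.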
HTAP (hybrid transactional/analytical processing) = OLTP + OLAP: a new generation of in-memory data platforms that can perform both online transaction processing (OLTP) and online analytical processing (OLAP) on the same data, without duplicating it.
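The "no data duplication" point can be sketched with an in-memory SQLite database from the Python standard library. SQLite is not an HTAP platform, and the `orders` table and helper names are assumptions for this example; the sketch only shows transactional writes and an analytical query hitting the same single copy of the data.

```python
import sqlite3

# One in-memory store serves both workloads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

def place_order(customer, amount):
    """OLTP-style work: a small transactional write."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )

def revenue_by_customer():
    """OLAP-style work: aggregate over the same table, with no export or copy."""
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return dict(rows.fetchall())

place_order("alice", 30.0)
place_order("bob", 12.5)
place_order("alice", 20.0)
report = revenue_by_customer()
```

In a traditional setup the aggregate would run against a separate warehouse loaded by a batch job; here the report reflects each order as soon as it is committed.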

Ref 2: Paradigm4 data scientist survey: http://www.paradigm4.com/wp-content/uploads/2014/06/P4-data-scientist-survey-FINAL.pdf
Ref 3: Gartner 2014 hype cycle: http://www.gartner.com/newsroom/id/2819918