What is Big Data?

Big Data usually includes data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. Big data sizes are a constantly moving target; as of 2012 they ranged from a few dozen terabytes to many petabytes in a single data set.

Big Data Everywhere!

Lots of data is being collected and warehoused:

  • Web data, e-commerce
  • Purchases at department/grocery stores
  • Bank/credit card transactions
  • Social networks
  • Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes (2560 terabytes) of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress.
  • Facebook handles 50 billion photos from its user base.
  • eBay.com uses two data warehouses at 7.5 PB and 40 PB, as well as a 40 PB Hadoop cluster, for search, consumer recommendations, and merchandising.

HACE Theorem:

Big Data starts with large-volume, Heterogeneous, Autonomous sources with distributed and decentralized control, and seeks to explore Complex and Evolving relationships among data. These characteristics make it extremely challenging to discover useful knowledge from Big Data.

Some describe it in terms of the 4 V's:


Need for Big Data:
  • What is the maximum file size you have dealt with so far?
    • Movies/Files/Streaming video that you have used?
  • What have you observed?
  • What is the maximum download speed you get?
  • Simple computation                           
    • How much time to just transf  

Memory unit       Size     Binary size
Kilobyte (KB)     10^3     2^10
Megabyte (MB)     10^6     2^20
Gigabyte (GB)     10^9     2^30
Terabyte (TB)     10^12    2^40
Petabyte (PB)     10^15    2^50
Exabyte (EB)      10^18    2^60
Zettabyte (ZB)    10^21    2^70
Yottabyte (YB)    10^24    2^80
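
The unit table above lends itself to a quick back-of-the-envelope calculation of how long it takes just to move a large data set. The sketch below is only an illustration: the function names are made up for this example, the 100 Mbit/s link speed is an assumed figure rather than one from this post, and the transfer time is idealized (no protocol overhead, no congestion). It uses the binary sizes from the table, which is also how the 2.5 petabytes = 2560 terabytes figure quoted for Walmart works out.

```python
# Unit names in the same order as the table: each step is a factor of
# 10^3 (decimal size) or 2^10 (binary size).
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def decimal_bytes(unit: str) -> int:
    """Bytes in one unit using powers of 10 (10^3, 10^6, ...)."""
    return 10 ** (3 * (UNITS.index(unit) + 1))


def binary_bytes(unit: str) -> int:
    """Bytes in one unit using powers of 2 (2^10, 2^20, ...)."""
    return 2 ** (10 * (UNITS.index(unit) + 1))


def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Idealized transfer time: size in bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second


# Hypothetical scenario: move 2.5 PB over an assumed 100 Mbit/s link.
size = int(2.5 * binary_bytes("PB"))
seconds = transfer_seconds(size, 100e6)
days = seconds / 86400  # 86400 seconds per day

print(f"2.5 PB = {size / binary_bytes('TB'):.0f} TB")
print(f"Idealized transfer time at 100 Mbit/s: about {days:.0f} days")
```

Under these assumptions the transfer alone takes years rather than hours, which is one way to see why data of this size is usually processed where it is stored instead of being copied around.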


Big Data Now: 2013 Edition

Current Perspectives from O'Reilly Media
Publisher: O'Reilly
Released: February 2014

Description

In Big Data Now: 2013 Edition, we pulled together our top posts from the O'Reilly Data blog from late fall 2012 through late fall 2013. In 2013, “big data” became more than just a technical term for scientists, engineers, and other technologists; the term entered the mainstream on a myriad of fronts, becoming a household word in news, business, health care, and people’s personal lives.

Posts have been divided into four main chapters:
  • Evolving Tools and Techniques
  • Changing Definitions
  • Real Data
  • Health Care
    


The Culture of Big Data

By Mike Barlow
Publisher: O'Reilly
Released: October 2013

Description

Technology does not exist in a vacuum. In the same way that a plant needs water and nourishment to grow, technology needs people and process to thrive and succeed. Culture (i.e., people and process) is integral and critical to the success of any new technology deployment or implementation.

Big data is not just a technology phenomenon. It has a cultural dimension. It's vitally important to remember that most people have not considered the immense difference between a world seen through the lens of a traditional relational database system and a world seen through the lens of a Hadoop Distributed File System. This paper broadly describes the cultural challenges that accompany efforts to create and sustain big data initiatives in an evolving world whose data management processes are rooted firmly in traditional data warehouse architectures.