Have you ever stopped to think about the amount of data available on the internet? Think about this the next time you run a Google search: the moment you submit your query, you send one of the 2 million searches that Google receives per minute. In that same minute, Facebook users post roughly 700,000 new pieces of content and online shoppers spend about $280,000.
Most of this data is not structured, which means that it is not properly ordered, doesn’t have the same setup or doesn’t come in the same format. That is a shame, because this big pile of data contains very useful information, especially for certain companies. Suppose that you have an Amazon store and you sell a line of products. As you may know, Amazon users, in other words your customers, can leave positive or negative comments. You can imagine that this is exactly the information you are looking for. It’s right there; the only problem is capturing it in a structured way.
This “big pile of data” is known as Big Data. In short: a big heap of unordered data. As stated earlier, this is a problem for companies, since they are used to working with “conventional” data stores such as relational databases or data warehouses, where everything is predefined and nicely structured. They now have to deal with customers blabbing about their products on the internet, without any way to capture what is being said.
To solve this, Big Data tools have been developed and are still being developed as we speak. These tools are designed to work with loose data that has no rules or boundaries. They fall under the category of Business Intelligence. In other words, they are used to analyze data and to build a picture of what is going on. There is, however, one big difference with conventional Business Intelligence based on a data warehouse. Since big data is not structured, it’s very hard to give exact and correct numbers. While a normal data warehouse (one that has been set up correctly) can give you exact numbers about sales, costs and all of that, big data tools can only give you an idea of what is actually going on. Their task is more a predictive one than one of detecting trends or measuring things exactly.
Many industries already use big data. The gaming industry (I’m a nerd, I know) uses big data to track data before and after gameplay and to predict performance. Supermarkets use big data to create personalized flyers for their customers. Even the ideal spot in the store for a product can be predicted using big data.
Fun story: a few years ago, a reasonably large supermarket conducted a study. They wanted to find patterns in the way that people buy products. For example, if you buy a printer, you’ll most likely also buy ink for that piece of hardware. One of the most surprising outcomes was that if diapers were bought, chances were high that a crate of beer was also bought. Reason? Apparently it’s mostly men doing the diaper shopping, and while they are at it they also refill their “to-cope-with-reality” stash.
Result? While the diapers were placed at the front of the store, the beer was placed way over at the other end of the store. Not consumer friendly? No, probably not, but it was a great boost for the sales figures.
Because the supermarket “forced” people to walk through the entire store to get what they needed, shoppers picked up other things along the way that they hadn’t initially planned on buying.
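For the curious, here is a rough sketch of the kind of pattern-finding behind the supermarket story. It counts how often two products end up in the same basket and turns that into a simple "if you buy A, how often do you also buy B" score. The baskets below are invented for illustration, and the function name `pair_confidence` is my own; this is not the method the supermarket in the story actually used, just a minimal way to show the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, made up purely for illustration.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"printer", "ink"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"printer", "ink", "paper"},
]


def pair_confidence(baskets):
    """Map (a, b) to the share of baskets containing a that also contain b."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        # Each product counts once per basket it appears in.
        item_counts.update(basket)
        # Record every pair in both directions, since the score is not symmetric.
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {pair: n / item_counts[pair[0]] for pair, n in pair_counts.items()}


confidence = pair_confidence(baskets)
print("diapers -> beer:", confidence[("diapers", "beer")])
print("printer -> ink:", confidence[("printer", "ink")])
```

In this toy data, two of the three baskets with diapers also contain beer, and every basket with a printer also contains ink. A real study would run over millions of receipts and would also check how common each product is on its own, so that popular items don't look related to everything.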
In other words, as I have said so many times before: be wary of what you post on the internet, because Big Brother is indeed watching you ;-).