Since 1956, when IBM shipped a 5-megabyte hard disk that weighed over a ton and ran on 24-inch (60-cm) discs, technology has leaped forward with many breakthroughs in storage capacity. However, the amount of information created has grown exponentially along with it. There comes a point where there is so much data that traditional methods of processing become inadequate to deal with it.
Data sets like that are called Big Data, and today they continue to be seen as a valuable but mostly untapped resource for online commerce.
Let us take a quick look at the story of big data, shall we? The year 2002 is regarded as the “beginning of the digital age”, marking the growth in digital storage usage, but the term “big data” was coined before that. Forbes reported that the first recorded instance is in an article in the ACM Digital Library dating back to 1997.
In it, the author described a problem encountered when datasets expand to the point where the computer system’s main memory no longer has enough space for them. The phrase “Big data” was used to name that problem. Fast-forward to today, and the Internet of Things and connected devices like smartphones, tablets, and laptops, which grant instant access to the online world, have made creating and uploading information easier than ever before. In 2016, IBM estimated that more than 90% of the world’s data had been created in the 2015-2016 period alone.
According to Storage Servers, about 50 gigabytes of information are predicted to be created per second in 2018 (an enormous amount compared to the meager 100 gigabytes per day in 1992).
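To put those two figures on the same scale, here is a quick back-of-the-envelope calculation. It only uses the two numbers quoted above and converts the per-second rate into a per-day rate; the variable names are just for illustration.

```python
# Convert the quoted 2018 rate (per second) to a daily rate and compare
# it with the quoted 1992 rate (per day).
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

gb_per_second_2018 = 50   # figure quoted in the text
gb_per_day_1992 = 100     # figure quoted in the text

gb_per_day_2018 = gb_per_second_2018 * SECONDS_PER_DAY
growth_factor = gb_per_day_2018 / gb_per_day_1992

print(f"2018: {gb_per_day_2018:,} GB per day")     # 2018: 4,320,000 GB per day
print(f"Growth vs. 1992: {growth_factor:,.0f}x")   # Growth vs. 1992: 43,200x
```

In other words, if both estimates hold, the daily volume in 2018 would be roughly 43,000 times the daily volume in 1992.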
For corporations that deal with millions of users daily, the colossal volume of information they gather poses a big challenge that requires different management methods from the traditional ones. However, keep in mind that the term “big” is used in a relatively subjective sense. In online business, what qualifies as large enough to require restructuring data management at one company can be very different at another, ranging from hundreds of gigabytes to hundreds of terabytes.
For a general understanding of the concept, we need to look at the aspects by which big data can be described. The five most important characteristics of big data are:
Volume: The size of the data. This is the defining characteristic that decides whether an information bank qualifies as big data. It is generally agreed that data whose quantity exceeds the capacity of conventional processing software is considered big data. Today, big data can even reach the zettabyte or brontobyte scale.
Variety: The diversity and complexity of the data. This takes into account the formats (structured, semi-structured and unstructured), the types (text, audio, video, etc.) and the sources (social media, business transactions, podcasts, etc.) that the information comes from.
Velocity: The speed at which data is created and processed. With current technologies and wide access to the Internet (the UN reported that about half of the world’s population is now online), big data is often created in real time. For example, Amazon’s cloud handles more than 500,000 transactions every second, and YouTube takes in 300 hours of uploaded video per minute.
Veracity: The quality of the data. This takes into account how much noise (additional meaningless information) there is in the data sets. Needless to say, the better the quality, the more accurate and valuable the insights one can gain from the information.
Value: The actual value you get from the collected data and bring to your customers. Along with Veracity, this concept was added later than the first three.
The two main applications of big data in online business are optimizing customer service and forecasting trends.
The most valuable advantage that big data brings to vendors is insight into customers’ behavior patterns, which enables businesses to tailor their service to each individual. For example, if about 60% of a customer’s transactions are board games, it is safe to assume that he will likely be interested in the new version of “Call of Cthulhu” as well.
That sounds simple enough, but in reality, it is quite hard to get an accurate and sufficient picture of a customer’s tastes and interests. Perhaps many of you reading this have encountered countless online advertisements that have no relation whatsoever to what you have in mind. This is because customer behavior is very complex and hard to predict: as humans, we can want hundreds of different things, and even though we have certain favorite subjects, at different times we might have completely different needs. Furthermore, when making a buying decision, we take into account dozens of different factors like price, brand name and place, for each of which we have our own unique preferences.
Big data helps paint the fullest, most up-to-date picture of customers by constantly pulling and analyzing information from various sources: not only customers’ transactions but also the keywords they search for, the products they actually click on, their wishlist items, and many other external and internal variables. In the example above, maybe “Call of Cthulhu” won’t catch the customer’s attention because he is more interested in science fiction, not horror.
Or maybe it will, but he won’t buy it unless there is a sale. The only way to produce the closest suggestion possible is to have a system look at all of the customer-related information it has, pick up the patterns (maybe more than half of his transactions have coupons applied, and 40% of the time he chooses the sci-fi genre) and then make a very educated guess about what to pitch to him. This approach still has its limitations, but it has greatly enhanced product recommendation accuracy and made cross-selling and upselling much more effective.
Needless to say, this is a boost to loyalty as well. The more you learn about your buyers, the better you can cater to their needs and meet their expectations, thereby paving the way for a long and lasting relationship with repeat customers.
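The pattern-spotting described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the purchase history, the field names (`genre`, `coupon`, `on_sale`), the helper functions and both thresholds are hypothetical, and a real recommendation system would weigh far more signals than these two.

```python
from collections import Counter

# Hypothetical purchase history for one customer (illustrative data only).
transactions = [
    {"genre": "sci-fi", "coupon": True},
    {"genre": "sci-fi", "coupon": True},
    {"genre": "horror", "coupon": False},
    {"genre": "fantasy", "coupon": True},
    {"genre": "sci-fi", "coupon": False},
]

def build_profile(history):
    """Summarize a purchase history as genre shares and a coupon share."""
    total = len(history)
    genre_counts = Counter(t["genre"] for t in history)
    genre_share = {genre: count / total for genre, count in genre_counts.items()}
    coupon_share = sum(t["coupon"] for t in history) / total
    return {"genre_share": genre_share, "coupon_share": coupon_share}

def should_pitch(profile, product, min_genre_share=0.3, coupon_threshold=0.5):
    """Educated guess: pitch only if the genre fits and the price angle fits."""
    likes_genre = profile["genre_share"].get(product["genre"], 0.0) >= min_genre_share
    # Assumption: a customer who uses coupons on most purchases waits for a sale.
    needs_discount = profile["coupon_share"] > coupon_threshold
    return likes_genre and (product["on_sale"] or not needs_discount)

profile = build_profile(transactions)
print(should_pitch(profile, {"genre": "horror", "on_sale": False}))  # False
print(should_pitch(profile, {"genre": "sci-fi", "on_sale": False}))  # False
print(should_pitch(profile, {"genre": "sci-fi", "on_sale": True}))   # True
```

With this sample history, the customer buys sci-fi 60% of the time and uses a coupon on 60% of purchases, so the sketch only recommends a sci-fi title that is currently on sale.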
From the knowledge gained about each customer, vendors can take it one step further and make strategic predictions about upcoming trends, demand and even the type of users they are going to have. This method is frequently applied by big brands like Amazon and Walmart to forecast bestseller lists before holidays and special occasions. Opinion mining can be combined with big data analysis to rank products in terms of potential popularity and profitability. Forecasting trends is, on the whole, harder to pull off, but it can cut a considerable amount of storage and inventory management expense, especially for those who aim to employ a just-in-time management style.
As stated before, the problem with this potential resource lies predominantly in the capacity to store and process the information. Due to their sheer size, these data sets require special storage and analysis procedures. That being said, there are a number of other factors that can hinder this process and cause additional difficulties for businesses.
First of all, many companies don’t have the methods to collect information, or don’t collect enough data for mining. To benefit from big data, you need to have the data first. You may have picked up by now that what many companies refer to as big data has been available for a long time; people were simply not aware of it. According to Dataconomy, over a third of retailers remain in the dark about their data. Customers’ opinions have always been there; what was missing were channels for customers to easily express them and have them documented. Product reviews, product ratings, comment sections and the like are common communication tools for collecting such information.
Secondly, companies either don’t have tools to utilize the insights gained from data analysis or don’t make full use of them. Big data is, well, big, complex and made of many components, each of which has a chance of being inaccurate, untrue or biased. Larger quantity does not necessarily guarantee better quality. If your source is known to be unreliable (like Twitter posts), the conclusions drawn from it will be unreliable as well. As such, it is said that the model the data is based on is just as important as its content.
Finally, there are always worries about personal privacy and information security. Having somebody know so much about you that they can predict how you will behave (albeit in a narrow way) certainly gives rise to some concerns. Data breaches are a nightmare for both vendors and customers, especially when so much personally identifiable information is involved. Recently, the GDPR officially came into effect for online businesses as a measure to protect customers’ personally identifiable data. I recommend that you keep an eye on this regulation, or you may have to face a million-dollar fine.
As stated above, the definition of “big data” varies between companies. You don’t need to jump right into expensive custom-made programs, because some modules that are suitable for you may already be available out there. For Magento 2 vendors in particular, here are some of the ready-made tools that can be easily integrated into your store:
The most common application of Big Data is in the recommendation system. “Who bought this also bought” is an effective solution which automatically suggests related products to a customer based on their buying history. “Auto Related Product” is another popular take on upselling and cross-selling. Of course, a sophisticated sales system would need more than that, but these two are a good start.
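To give a feel for the idea behind a “who bought this also bought” suggestion, here is a minimal Python sketch based on simple co-purchase counting. It is not the implementation of any Magento 2 module; the order data, the product names and both helper functions are hypothetical and only illustrate the general technique.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical order history: each order is the set of products bought together.
orders = [
    {"board game", "dice set", "card sleeves"},
    {"board game", "card sleeves"},
    {"board game", "card sleeves"},
    {"board game", "dice set"},
    {"puzzle", "card sleeves"},
]

def build_co_purchase_counts(order_history):
    """Count, for every product, how often each other product shared its order."""
    co_counts = defaultdict(Counter)
    for basket in order_history:
        for product, other in permutations(basket, 2):
            co_counts[product][other] += 1
    return co_counts

def also_bought(co_counts, product, top_n=3):
    """Return the products most often bought together with `product`."""
    ranked = co_counts.get(product, Counter()).most_common(top_n)
    return [item for item, _ in ranked]

co_counts = build_co_purchase_counts(orders)
print(also_bought(co_counts, "board game", top_n=2))  # ['card sleeves', 'dice set']
```

In this sample data, card sleeves appear alongside the board game in three orders and the dice set in two, so they are suggested in that order. Raw counts like these favor products that are popular everywhere, which is one reason a sophisticated system needs more than this.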
To sum up, the discovery of big data (or rather its application) has upped the game for online customer service. Vendors need to be aware of this change and take measures if they want to keep up with ever more demanding customers. Despite the name, big data is not something that is only available to big corporations like Google, Alibaba or Amazon. In a certain sense, making use of big data means starting to look at the resources that you have at hand but haven’t thought of before.
Katherine enjoys reading about marketing and online commerce. What intrigues her the most is the different creative ways vendors come up with to facilitate the shopping process in their virtual stores, for example a One step checkout page or a Layered navigation system. Her favorite leisure activities are drawing, writing and spending time with her feline housemate.
Posted by ketharine in Blog . July 05, 2018