Dear Readers,
Data, the Real Asset
In an earlier blog post you read my views on why many software projects are a liability. If you want to jog your memory, go to http://thombrem.blogspot.in/2012/11/software-asset-or-liability.html
So if I think software is a liability, you may ask, what is the asset on your project? Undoubtedly it is the DATA. Data is used to make decisions such as:
- Where to go next (new products and services)
- What processes within your organization need automation or change
- Who's slacking
- What data is NOT being captured
- Why your business is sucking, etc.
So before you start any project, ask:
- What data will be generated by the project?
- Will it be analyzed?
- Will the information extracted from it be useful and actionable?
- Or, is a process going to be automated by this application?
In other words, will Life be easier and more fun after this project? ;)
So your IT consultant may tell you that Scala or Ruby is better than Java or C++. DO NOT get carried away by these fallacies. Keep the focus on Data and the analysis of Data. Everything else is a fool's paradise in IT. And as you will constantly learn: anything that does not make sense goes away eventually! Good riddance!
Brains Vs Big Data
So now you know you need to focus on Data. Perhaps you will start a Big Data project: analyse all your customers' buying patterns, study their social media behaviour and unleash cloud computing to compute their changing behaviour and tastes. Will you become a successful enterprise? Maybe not! But you will surely end up spending many millions of dollars on your IT before you get this epiphany.
Data is useful, but not ALL data is useful. Big Data aims at blindly collecting ALL available data, unleashing analytics on it and producing 'intelligence'. I am not doubting that a correctly implemented Big Data project will succeed. BUT, here again I must ask you to stop and think!
Can the same or near-same results be produced by other means?
YES!
Can it be done at a fraction of the cost?
YES!
How?
Sampling
Representative sampling is one method that immediately comes to mind. I am sure there are many more ways to find out the very same intelligence without Analysis of Absolutely ALL the Data.
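To make the point concrete, here is a minimal sketch of the idea, not a recipe. It assumes a made-up data set of 500,000 synthetic order values (the lognormal shape and every number in it are invented for illustration) and uses a simple random sample, which is only one way of getting a representative sample. It compares the average computed from ALL the data with the average computed from 1% of it.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical "ALL the data": 500,000 synthetic order values.
population = [random.lognormvariate(3.0, 0.5) for _ in range(500_000)]

# A simple random sample of 5,000 orders, i.e. 1% of the data.
sample = random.sample(population, 5_000)

full_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# Standard error of the sample mean: s / sqrt(n).
std_error = statistics.stdev(sample) / len(sample) ** 0.5

# How far off is the 1% answer, relative to the full answer?
relative_gap = abs(sample_mean - full_mean) / full_mean

print(f"mean over all data : {full_mean:.2f}")
print(f"mean over 1% sample: {sample_mean:.2f} (std. error {std_error:.2f})")
print(f"relative gap       : {relative_gap:.2%}")
```

The standard error shrinks with the square root of the sample size, not with the size of the full data set, which is why a small sample can get you most of the way. If your customers fall into very different groups, a stratified sample would probably be a better choice than the plain random one assumed here.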
E.g. the Sampling Theorem (by Nyquist-Shannon) states that if the highest frequency present in a waveform is B hertz, this waveform can be reconstructed exactly from samples taken at a rate above 2B hertz! Bloody brilliant! Now apply this to your data (time series) and figure out how many sample points you actually need to construct your model. This is the beauty of theory and brains vs. brawn.
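If you want to see the theorem at work, here is a small sketch under stated assumptions: the "waveform" is a single hypothetical 5 Hz sine (so B = 5), the sampling rates of 20 Hz and 8 Hz are picked purely for illustration, and the reconstruction uses the textbook Whittaker-Shannon (sinc) interpolation over a finite window, so a small truncation error remains. Real business time series are rarely strictly band-limited, so treat this as a toy.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def signal(t, freq_hz=5.0):
    """Hypothetical band-limited waveform: a single sine (5 Hz by default)."""
    return math.sin(2.0 * math.pi * freq_hz * t)

def reconstruct(t, samples, first_index, rate_hz):
    """Whittaker-Shannon interpolation from uniformly spaced samples."""
    u = t * rate_hz
    return sum(x * sinc(u - (first_index + k)) for k, x in enumerate(samples))

# Case 1: sample at 20 Hz, comfortably above 2B = 10 Hz.
RATE_OK = 20.0
N = 1000  # samples kept on each side of t = 0 (finite window)
samples = [signal(n / RATE_OK) for n in range(-N, N + 1)]

# Compare the reconstruction with the true waveform BETWEEN the sample points.
test_times = [0.013 * k for k in range(-30, 31)]
max_error = max(
    abs(reconstruct(t, samples, -N, RATE_OK) - signal(t)) for t in test_times
)

# Case 2: sample at 8 Hz, below 2B. The 5 Hz samples are now identical to
# those of a sign-flipped 3 Hz sine, so the two cannot be told apart (aliasing).
RATE_LOW = 8.0
aliased = all(
    math.isclose(signal(n / RATE_LOW, 5.0), -signal(n / RATE_LOW, 3.0), abs_tol=1e-9)
    for n in range(50)
)

print(f"max reconstruction error at 20 Hz: {max_error:.6f}")
print(f"5 Hz looks like 3 Hz at 8 Hz     : {aliased}")
```

The first case shows that the values between the samples can be recovered from the samples alone. The second shows what goes wrong when you sample too slowly: two different waveforms produce exactly the same sample points.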
Pattern Novelty
Many times you find patterns that repeat over and over again, with no change. This is likely to happen in your Big Data projects as well, e.g. everybody already knows that the Christmas season means big sales, or that the next tropical storm will trigger panic buying. Unless there is likely to be Pattern Novelty in your domain, I feel you should not be spending hard-earned cash on Big Data projects. In other words, the Big Data project should have lasting value.
Conclusion
Smart organizations harness brains and new ideas, not brawn (Big Data). If you start a Big Data project with a long-lasting goal in mind, you will probably succeed. But if you jump into Big Data because "Everyone else is doing it", you will be going with the flow. And as I like to say... "If you go with the flow, you will end up down the drain."
+Milind K Thombre
(comments Welcome)
Is there any way one can leverage historical data? Clearly Indian retailers have something crude in place to anticipate demand and are improving on it. Any idea what they are doing?
'Indian' retailers come in many flavours. There are the really mega stores you see in malls and then there are the kiranas (local grocers). The local grocer has no checkout line, stocks about 2000 distinct items and knows the price of everything by heart. The big guys hire the MBAs from local colleges who don't know their elbow from their ass. LOL! This is going to be a David and Goliath battle between the local grocer, who uses no software but his brain, and the mega stores, who have an association rule mining engine running in the cloud.
Historical data is mainly made up of seasonality (varying by season) and trend (varying over the years). Retail data is generally highly seasonal and, for successful enterprises, has an upward trend. You can 'fit' a regression model on past data and predict future sales volumes. However, this is not simple mathematics, as you can guess: products keep changing flavours, new products are introduced, old ones are withdrawn, etc. There can be many complicated scenarios which are near impossible to model, yet completely intuitive for a human to predict. Add to this the complication that Indian seasons do not fall on the same day and month each year, etc.
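For the simple case, the 'fit a regression on past data' idea can be sketched as below. Everything here is an assumption made for illustration: the five years of monthly sales are synthetic, the function names are made up, and the model is the plainest one possible (a straight-line trend plus one fixed offset per calendar month, fitted by ordinary least squares). It does nothing about changing product lines or festivals that move around the calendar, which is exactly where it would fall over.

```python
import random

def fit_trend_and_season(sales, period=12):
    """Least-squares fit of sales[t] ~ slope * t + offset[t % period].

    Same as regressing on a time index plus one dummy per season; solved in
    closed form by removing each season's own averages first.
    Needs at least two full periods of data.
    """
    groups = [[] for _ in range(period)]
    for t, y in enumerate(sales):
        groups[t % period].append((t, y))

    # Average time index and average sales within each season.
    means = []
    for g in groups:
        t_mean = sum(t for t, _ in g) / len(g)
        y_mean = sum(y for _, y in g) / len(g)
        means.append((t_mean, y_mean))

    # Slope from deviations around each season's own averages.
    num = den = 0.0
    for g, (t_mean, y_mean) in zip(groups, means):
        for t, y in g:
            num += (t - t_mean) * (y - y_mean)
            den += (t - t_mean) ** 2
    slope = num / den

    offsets = [y_mean - slope * t_mean for t_mean, y_mean in means]
    return slope, offsets

def forecast(t, slope, offsets):
    """Predicted sales for month index t."""
    return slope * t + offsets[t % len(offsets)]

# Synthetic history (all numbers invented): upward trend plus a festive-season bump.
random.seed(7)
TRUE_SLOPE = 2.0
SEASON = [0, -5, -5, 0, 5, 0, -10, -5, 10, 30, 40, 25]

def true_sales(t):
    return 100.0 + TRUE_SLOPE * t + SEASON[t % 12]

history = [true_sales(t) + random.gauss(0.0, 2.0) for t in range(60)]

slope, offsets = fit_trend_and_season(history)
next_year = [forecast(t, slope, offsets) for t in range(60, 72)]
worst_miss = max(abs(f - true_sales(t)) for t, f in zip(range(60, 72), next_year))

print(f"estimated monthly growth: {slope:.2f}")
print(f"worst miss over the next 12 months: {worst_miss:.2f}")
```

On tidy synthetic data like this, the fit does well. The local grocer's advantage shows up the moment the pattern itself changes, because nothing in this model can know about that.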
Also note that Kotelnikov (Russia) had originally proposed the sampling theorem 15-20 years before Nyquist-Shannon.