Dear Readers,
Data: the Real Asset
In an earlier blog post you read my views on why many software projects are a liability. If you want to jog your memory, go to http://thombrem.blogspot.in/2012/11/software-asset-or-liability.html
So if I think software is a liability, you may ask: what is the asset on a project? Undoubtedly it is the DATA. Data is used to make decisions such as:
- Where to go next (new products and services)
- What processes within your organization need automation or change
- Who's slacking
- What data is NOT being captured
- Why your business is sucking, etc.
So before you start a project, ask:
- What data will be generated by the project?
- Will it be analyzed?
- Will the information extracted from it be useful and actionable?
- Or is a process going to be automated by this application?
In other words, will life be easier and more fun after this project? ;)
So your IT consultant may tell you that Scala or Ruby is better than Java or C++. DO NOT get carried away by these fallacies. Keep the focus on data and the analysis of data. Everything else in IT is a fool's paradise. And as you will learn constantly: anything that does not make sense goes away eventually! Good riddance!
Brains Vs Big Data
So now you know you need to focus on data. Perhaps you will start a Big Data project: analyse all your customers' buying patterns, study their social media behaviour and unleash cloud computing to compute their changing behaviour and tastes. Will you become a successful enterprise? Maybe not! But you will surely end up spending many millions of dollars on your IT before you get this epiphany.
Data is useful, but not ALL data is useful. Big Data aims at collecting ALL available data, blindly unleashing analytics on it and producing 'intelligence'. I am not doubting that a correctly implemented Big Data project will succeed. BUT here again I must ask you to stop and think!
Can the same or near-same results be produced by other means?
YES!
Can it be done at a fraction of the cost?
YES!
How?
Sampling
Representative sampling is one method that immediately comes to mind. I am sure there are many more ways to extract the very same intelligence without analysing absolutely ALL the data.
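To make the sampling point concrete, here is a minimal Python sketch, not a recipe. The data set is synthetic and the names (orders, estimate_mean_from_sample) are my own illustrative assumptions. It only shows that a simple random sample of 10,000 values can estimate the average of a much larger data set, along with a standard error that tells you how far to trust the estimate.

```python
import random
import statistics


def estimate_mean_from_sample(values, sample_size, seed=0):
    """Estimate the mean of `values` from a simple random sample.

    Returns (estimate, standard_error). The standard error ignores the
    finite-population correction, which is negligible when the sample
    is a small fraction of the data.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / sample_size ** 0.5
    return estimate, standard_error


# Hypothetical "full" data set: 200,000 synthetic order values.
data_rng = random.Random(42)
orders = [data_rng.gauss(50.0, 12.0) for _ in range(200_000)]

full_mean = statistics.fmean(orders)
sample_mean, std_err = estimate_mean_from_sample(orders, sample_size=10_000)

print(f"mean of all data : {full_mean:.2f}")
print(f"mean of sample   : {sample_mean:.2f} +/- {std_err:.2f}")
```

The standard error shrinks with the square root of the sample size, so quadrupling the sample only halves the uncertainty. Also keep in mind that a simple random sample is representative only if every record has the same chance of being picked.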
e.g. the Sampling Theorem (Nyquist-Shannon) states that if a waveform contains no frequency components above B hertz, it can be reconstructed perfectly from samples taken at a rate greater than 2B samples per second! Bloody brilliant! Now apply this to your data (time series) and figure out how many sample points you need to construct your model. This is the beauty of theory and brains vs brawn.
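A small Python sketch of why the 2B threshold matters, under assumptions of my own: the signal is a single 3 Hz cosine, and the helper names (sample_cosine, max_difference) are hypothetical. It does not reconstruct anything. It only shows that when you sample below 2B, a 3 Hz signal produces exactly the same sample points as a 1 Hz signal (aliasing), so no amount of cleverness can tell them apart, while sampling above 2B keeps them distinct.

```python
import math


def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Return n_samples of cos(2*pi*f*t) taken at the given sample rate."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def max_difference(a, b):
    """Largest absolute difference between two equally long sample lists."""
    return max(abs(x - y) for x, y in zip(a, b))


B = 3.0      # highest frequency in the signal, in hertz (assumed)
ALIAS = 1.0  # a slower signal that 3 Hz masquerades as at a 4 Hz sample rate

# Sampling at 4 Hz, below the required 2B = 6 Hz: the two signals coincide.
diff_slow = max_difference(sample_cosine(B, 4.0, 32),
                           sample_cosine(ALIAS, 4.0, 32))

# Sampling at 8 Hz, above 2B = 6 Hz: the two signals are clearly different.
diff_fast = max_difference(sample_cosine(B, 8.0, 32),
                           sample_cosine(ALIAS, 8.0, 32))

print(f"max difference at 4 Hz sampling: {diff_slow:.2e}")
print(f"max difference at 8 Hz sampling: {diff_fast:.2f}")
```

For business time series the same reasoning applies loosely: if the fastest cycle you care about is weekly, you need more than two observations per week to see it, but far fewer than every single transaction. Real data is rarely strictly band-limited, so treat this as a guide for choosing a sampling rate and not as a guarantee.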
Pattern Novelty
Many times you find patterns that simply repeat over time with no change. This is likely to happen in your Big Data projects as well, e.g. everybody already knows that the Christmas season means big sales, or that the next tropical storm will trigger panic buying. Unless there is likely to be Pattern Novelty in your domain, I feel you should not be spending hard-earned cash on Big Data projects. In other words, the Big Data project should have lasting value.
Conclusion
Smart organizations harness brains and new ideas, not brawn (Big Data). If you start a Big Data project with a long-lasting goal in mind, you will probably succeed. But if you jump into Big Data because "everyone else is doing it", you will be going with the flow. And as I like to say: "If you go with the flow, you will end up down the drain."
Milind K Thombre
(Comments welcome)