Wednesday, 1 July 2009

Of Twitter, Clouds and Google Goo

I have always thought of Twitter as a time waster. However, I have also noticed that many companies and individuals use it as a marketing channel to reach a global audience, not to mention its use as a propaganda tool, as witnessed in the recent Iranian demonstrations. So instead of subscribing to RSS feeds, following tweets is now the "in" thing to do.

Using a social network for commercial and political gain is not new. We have seen it on Facebook and Second Life. However, using tweets as an input for trading decisions sounds dubious. Most tweets are just noise, or even garbage.

Today, out of curiosity, I signed up for Twitter. Within minutes of opening my new account, I already had a follower. To be honest, I was pleasantly surprised and even flattered. Yet when I checked, it turned out to be a prostitute or cyber-pimp pushing some porn site. So to make use of tweets for trading, a system would first have to identify which users to follow and filter out the fake, malicious and manipulative ones, much as virus scanners rely on their virus databases. This is a time-consuming, even labor-intensive, heuristic process that relies on a large volume of data. The system would then have to filter and analyse millions of messages per day to extract the useful information.
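
As a rough sketch of that first filtering step, assuming a hand-maintained blocklist in the spirit of a virus database, something like the following could weed out the obvious junk. The account names, spam phrases and the `Tweet` type are all made up for illustration; nothing here uses Twitter's actual API, and a real system would need far more than string matching.

```python
from dataclasses import dataclass

# Hypothetical blocklist data: these names and phrases are illustrative only.
BLOCKED_USERS = {"spam_bot_01", "fake_tipster"}
SPAM_MARKERS = ("click here", "free pics", "hot stock tip")


@dataclass
class Tweet:
    user: str
    text: str


def is_trusted(tweet: Tweet) -> bool:
    """Reject tweets from blocked users or containing a known spam phrase."""
    if tweet.user.lower() in BLOCKED_USERS:
        return False
    text = tweet.text.lower()
    return not any(marker in text for marker in SPAM_MARKERS)


def filter_tweets(tweets):
    """Keep only the tweets that pass both the user and the phrase checks."""
    return [t for t in tweets if is_trusted(t)]


sample = [
    Tweet("spam_bot_01", "Big news on ACME today"),
    Tweet("market_watcher", "ACME cuts its full-year forecast"),
    Tweet("new_follower", "Click here for FREE pics"),
]
kept = filter_tweets(sample)
print([t.user for t in kept])  # ['market_watcher']
```

Even this toy version shows where the labor goes: someone has to keep both lists current, and a manipulator only needs a fresh account and new wording to slip through.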

Let's put aside the ethical issues behind the practice of 'trading on rumors'. To say that a machine can determine market sentiment by reading tweets is at best an overstatement. Even human beings have trouble reading sentiment in cyberspace, which is why people add all sorts of smileys, emoticons and internet etiquette to assist the reader of the message. Also, the same words can carry drastically different intentions and provoke different reactions depending on cultural, religious and circumstantial background. The idea of trawling through the internet to extract marketing information is not new. People have been attempting it on RSS feeds, newsgroups, user forums, etc. for a few years now. However, those efforts are narrowly targeted at certain types of content and do not run in real time - they are certainly not as ambitious as making trading decisions in real time.

Applying complex fuzzy logic algorithms to large amounts of data (current as well as historical statistics) is very CPU-, memory- and data-intensive. Such jobs are best suited to cloud computing, which many big players are pushing - Sun, Microsoft, Amazon and Google. A couple of days ago I stumbled upon some cloud-computing PR articles and interviews and found this one - 谷雪梅谈云计算 ("Gu Xuemei on Cloud Computing"). I realised that the Google chief engineer interviewed in that video was my high school classmate. We affectionately called her 'Goo' back then. I guess now I have to call her the 'Goo of Google'. In that interview, a question was asked about how Google makes money. Well, I believe that 'knowledge is power', even more so in the information age, and that is how Google makes money. Compared with the newcomers (e.g. Bing, WolframAlpha), Google's search algorithm is quite lazy and unsophisticated - it relies on external links, or the more you pay, the higher your position in the list. Yet it has accumulated a vast amount of historical data and trained its systems to give better results. So Google has now gone a step further to sell infrastructure services and technologies such as GWT, cloud computing, Chrome and Google Wave. It seems Google has better things to do than recycle the Twitter garbage - for now.
