
When Robert Scoble started a conversation about the real-time web as a threat to Google, one of the first things that came to mind was how to solve the problem. Here are some thoughts on that.
The first key issue is to realize that you, or any system you build, will be inundated if you take a raw fire hose approach to the real-time web. We generate gigabytes of data daily, and even Google does not update its indexes or its PageRank systems on a daily basis. Sites like Technorati update when they can, sometimes daily, sometimes not, and they often miss things. If you look only at the raw fire hose, things get disrupted, continuity is broken, and cognitive dissonance follows.
The second key issue is to work with a representative sample, or metadata. The feeds from systems like Digg, StumbleUpon, Reddit, Techmeme, Google RSS public readers for industry thought leaders, the Twitter public feed, FriendFeed, and SocialMedian all provide raw feeds of the information that is hitting their systems. This metadata, data that has already been digested and recommended by people, is a smaller data set, but one that is at least a representative sample of the overall tenor of the internet as it is happening.
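As a rough illustration of working from the recommended sample rather than the fire hose, here is a minimal Python sketch that merges items from several curated feeds and records which sources surfaced each link. The function name `merge_feeds`, the input shape, and the example URLs are all assumptions made up for this sketch; actually fetching and parsing each service's feed is left out.

```python
def merge_feeds(feeds):
    """Merge items from several curated feeds, keyed by URL.

    `feeds` maps a source name to a list of (url, title) pairs
    (a hypothetical shape, not any service's real feed format).
    Returns a dict: url -> {"title": str, "sources": set of source names}.
    """
    merged = {}
    for source, items in feeds.items():
        for url, title in items:
            # The first title seen for a URL is kept; later sources are just recorded.
            entry = merged.setdefault(url, {"title": title, "sources": set()})
            entry["sources"].add(source)
    return merged


# Made-up sample data standing in for already-parsed feeds.
sample_feeds = {
    "digg": [("http://example.com/a", "Story A"), ("http://example.com/b", "Story B")],
    "reddit": [("http://example.com/a", "Story A")],
    "friendfeed": [("http://example.com/c", "Story C"), ("http://example.com/a", "Story A")],
}

merged = merge_feeds(sample_feeds)
for url, entry in merged.items():
    print(url, entry["title"], sorted(entry["sources"]))
```

A link that shows up in several sources at once is a cheap first signal that a concept is gaining traction across communities.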
For data output, you are going to want a dashboard with the ability to drill down, much like executive-level decision-making systems. An overall dashboard that lets you drill down into the conversation is about the only output that will make sense given the volume of input. The home page would take a 100,000-foot view, ranked by authority (or sheer volume of a message) and by the popularity of the concept being presented (not necessarily the article or specific write-up).
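One way the home page ranking could work, sketched in Python under stated assumptions: every mention of a concept adds its author's authority weight to the concept's score, and the raw mention count is kept alongside so sheer volume stays visible. The weights, author handles, and the name `rank_concepts` are hypothetical; how authority would really be measured is an open question this sketch does not answer.

```python
def rank_concepts(mentions, authority, default_authority=1.0):
    """Rank concepts for a top-level dashboard view.

    `mentions` is a list of (concept, author) pairs.
    `authority` maps an author to a weight; unknown authors get `default_authority`.
    Returns a list of (concept, score, count), highest score first.
    """
    totals = {}
    for concept, author in mentions:
        score, count = totals.get(concept, (0.0, 0))
        totals[concept] = (score + authority.get(author, default_authority), count + 1)
    rows = [(concept, score, count) for concept, (score, count) in totals.items()]
    # Sort by score, then by volume, then by name so ties come out in a stable order.
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows


# Made-up weights and mentions, purely for illustration.
authority = {"big_blogger": 5.0, "known_blogger": 3.0}
mentions = [
    ("real time web", "big_blogger"),
    ("real time web", "known_blogger"),
    ("shiny thing", "anon1"),
    ("shiny thing", "anon2"),
    ("shiny thing", "anon3"),
]

ranked = rank_concepts(mentions, authority)
for concept, score, count in ranked:
    print(f"{concept}: score={score} mentions={count}")
```

In this toy data the concept with fewer but weightier mentions ranks first, which is the authority-versus-volume trade-off the dashboard has to make explicit.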
When you drill down into the meme or conversation, you will want to identify the actors and who is speaking the loudest. For example, if Robert Scoble says X is a great idea, and is then backed on that by Louis Gray and Mike Fruchter, quickly followed by WinExtra and Techwag, the view would show the larger actors in this (Robert, Louis), with tags showing how Mike, Dan and Steve played off and interconnected the conversation. You will also want to know where the conversation ended up, how it was consumed (tweeted? retweeted? FriendFeed? Dugg?), and how those branches played into a bigger audience for the concept. You could literally timeline this puppy if you had the information from the metadata, keywords, and a decent database.
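The drill-down described above could start from something as small as the following Python sketch. It assumes each event in a conversation has already been reduced to a time, an actor, and a channel; that record shape, the placeholder actor names, and the function `summarize_conversation` are assumptions for illustration, and "loudest" is approximated here simply as "most events".

```python
from collections import Counter
from datetime import datetime


def summarize_conversation(events):
    """Order a conversation's events and tally who spoke and where.

    Each event is a dict with "time" (datetime), "actor" and "channel" keys.
    Returns (timeline, actors, channels): the events sorted by time, plus
    Counters of events per actor and per channel.
    """
    timeline = sorted(events, key=lambda event: event["time"])
    actors = Counter(event["actor"] for event in timeline)
    channels = Counter(event["channel"] for event in timeline)
    return timeline, actors, channels


# Made-up events with placeholder actors; not real data.
events = [
    {"time": datetime(2009, 1, 1, 9, 30), "actor": "actor_b", "channel": "friendfeed"},
    {"time": datetime(2009, 1, 1, 9, 0), "actor": "actor_a", "channel": "blog"},
    {"time": datetime(2009, 1, 1, 10, 15), "actor": "actor_a", "channel": "twitter"},
    {"time": datetime(2009, 1, 1, 9, 45), "actor": "actor_c", "channel": "twitter"},
]

timeline, actors, channels = summarize_conversation(events)
for event in timeline:
    print(event["time"].isoformat(), event["actor"], event["channel"])
print("loudest:", actors.most_common(1))
print("by channel:", dict(channels))
```

Tracking who played off whom would need an extra field on each event pointing at the item it responds to, which this sketch leaves out.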
Keep the database open for data mining of concept keywords and terms, covering not just drill-down but also the time it takes for a conversation to expand beyond the limited horizon of the individual actor. We can timeline it and dashboard it to see how that meme extends over time and then fades out in favor of the next shiny thing on the internet, or how it turns into long-tail information as the meme begins its normal sine wave distribution curve along that long tail.
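To see a meme rise and fade, the mentions of a concept can be bucketed by time. The Python sketch below groups timestamps by the hour; the hourly bucket size, the sample timestamps, and the name `mentions_per_hour` are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime


def mentions_per_hour(timestamps):
    """Count mentions per clock hour.

    Returns a list of (hour_start, count) pairs sorted by time.
    Hours with no mentions are simply absent from the result.
    """
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(buckets.items())


# Made-up mention times for a single concept.
timestamps = [
    datetime(2009, 1, 1, 9, 5),
    datetime(2009, 1, 1, 9, 40),
    datetime(2009, 1, 1, 10, 10),
    datetime(2009, 1, 1, 10, 20),
    datetime(2009, 1, 1, 10, 55),
    datetime(2009, 1, 1, 12, 30),
]

series = mentions_per_hour(timestamps)
peak_hour, peak_count = max(series, key=lambda point: point[1])
for hour, count in series:
    print(hour.strftime("%H:%M"), "#" * count)
print("peak:", peak_hour.strftime("%H:%M"), peak_count)
```

Plotting this series per concept is the timeline view; the tail after the peak is where a meme either disappears or settles into long-tail traffic.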
Tapping the metadata would not be that hard. Building the industrial-sized database and the programming around the dashboard concept would be the harder part, mostly because it would have to be scalable and accessible to a large group of interested internet users. The point is that it is doable. Even as an exercise in how to vacuum the internet based on metadata, with a simple graphical interface for meme tracking and drill-down, this is not hard, and it has been done by many businesses. By approaching this as an executive decision support process, tapping the real-time web is solvable, and probably without much difficulty at all.
Tags: real time web, tapping, meta data, data mine, interesting, problem, solved, maybe












