Saturday, April 28, 2007

Finding People Across Multiple Social Networks

In the past year, LinkedIn, Facebook, MySpace, Twitter, and others have each registered hundreds of thousands of user profiles.

I'm currently a heavy user of LinkedIn and Facebook. I haven't tried MySpace or Twitter yet. In addition, there are dozens of other user profiles floating around the Internet in blogs, web pages, etc. And there are still old-school sites such as Classmates.com which provide people-finding services through their own local service.

Finding people on the Internet is actually reasonably hard to do. Simply googling a person's name might find the person, but if the name is "John Smith" then you're out of luck. In addition, let's say you want to find everyone in your high school class (a typical Facebook or Classmates use case) - doing this with Google is really tough, because it needs very structured metadata about each person, which Google doesn't index.

So what about creating a search engine that uses metadata-based searching to find people? For example, in Facebook you have employment information, high school information, etc.
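As a rough sketch of what searching on structured fields could look like, here is a tiny in-memory example in Python. The `Profile` fields, the `search` function, and the sample data are all made up for illustration; no network exposes profiles in this shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # Hypothetical metadata fields; real networks store far more.
    network: str
    name: str
    high_school: str = ""
    grad_year: int = 0
    employer: str = ""


def search(profiles, **criteria):
    """Return profiles whose fields equal every given criterion.

    String comparisons ignore case; other values must match exactly.
    """
    def matches(profile):
        for field, wanted in criteria.items():
            actual = getattr(profile, field)
            if isinstance(wanted, str):
                if actual.lower() != wanted.lower():
                    return False
            elif actual != wanted:
                return False
        return True

    return [p for p in profiles if matches(p)]


# Invented sample data spanning three networks.
profiles = [
    Profile("facebook", "John Smith", "Central High", 1995, "Acme"),
    Profile("linkedin", "John Smith", "Westside High", 1988, "Globex"),
    Profile("classmates", "Jane Doe", "Central High", 1995),
]

# "Everyone in my high school class", regardless of network.
classmates = search(profiles, high_school="central high", grad_year=1995)
```

The point is that the query is on fields, not on free text, so two different John Smiths are easy to tell apart by school and year.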

Here are a couple mock-ups to illustrate what I'm thinking of here...





In addition, what about linking profiles between networks so you could see an aggregated profile of a person's blog, Facebook page, MySpace page, every video they've posted to YouTube, etc.?

A couple of issues:

1. Unique identifiers - there would need to be a way to uniquely identify someone across networks. The most obvious identifier to use would be email address, since it is included in most user profiles.

This is a common problem in health care, and there are already good solutions that use probabilistic scoring algorithms to analyze metadata against linkage business rules and determine whether two profiles are really the same person. A simple, obvious example - if two profiles have the same email address, we can probably assume they're the same person. It gets more interesting when you're trying to match without a unique identifier and have to score a variety of attributes such as gender, date of birth, first name, last name, postal code, etc. Given the rich profile detail on most of these network sites, the ability to match profiles should be very strong - if you have two profiles with the same gender, date of birth, last name and a geographic indicator such as postal code, it's pretty safe to assume that they point to the same individual.

2. What information gets aggregated - this is a trickier problem. One solution would be to simply require participating networks to give users an opt-in/opt-out choice about syndicating their profile information to the aggregator. This is similar to what Facebook and LinkedIn do today - the user gets to decide which information is public. Another option would be to let any user see their own aggregated profile and at least diagnose where the information is coming from. Then at least I can easily tell in a single view what is being syndicated and change it.

3. How I get contacted - all networking sites have different rules on how they allow contacts in order to prevent spammers, stalkers, etc. Again, one simple solution would be to delegate the contact rules back to the local network. If I find a Facebook profile and want to contact that person (adding them as a friend), then I would follow the Facebook process to do so (which is quite different from LinkedIn's). This would also allow networking sites to compete on features but still provide a common meeting point.

4. Indexing additional information - it would be great to see additional general information from Google, but optimized for people searching. There must be some interesting algorithms that could be developed for analyzing any web page for common profile metadata. For example, take a standard web-based discussion board. Each message might have at least a name, an email and the message content. If you can make a match on this, then you could aggregate this content together to see what people have posted over the past 6 months.
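The scoring idea from issue 1 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the field names, the weights and the threshold are invented, and a real record-linkage system would estimate its weights from data rather than hard-code them.

```python
# Hypothetical weights: how much evidence agreement on each field provides.
WEIGHTS = {
    "email": 10.0,
    "date_of_birth": 4.0,
    "last_name": 3.0,
    "postal_code": 2.0,
    "first_name": 2.0,
    "gender": 0.5,
}

# Hypothetical cut-off: scores at or above this count as "same person".
MATCH_THRESHOLD = 9.0


def normalize(value):
    """Compare values without regard to case or surrounding whitespace."""
    return str(value).strip().lower()


def match_score(a, b):
    """Add a field's weight on agreement, subtract half of it on disagreement.

    Fields missing from either profile are skipped: no evidence either way.
    """
    score = 0.0
    for field, weight in WEIGHTS.items():
        va, vb = a.get(field), b.get(field)
        if not va or not vb:
            continue
        if normalize(va) == normalize(vb):
            score += weight
        else:
            score -= weight / 2
    return score


def same_person(a, b):
    return match_score(a, b) >= MATCH_THRESHOLD


# Invented sample profiles.
fb = {"first_name": "John", "last_name": "Smith", "gender": "M",
      "date_of_birth": "1977-03-14", "postal_code": "K1A 0B1",
      "email": "js@example.com"}
li = {"first_name": "John", "last_name": "Smith", "gender": "m",
      "date_of_birth": "1977-03-14", "postal_code": "k1a 0b1"}
other = {"first_name": "John", "last_name": "Smith", "gender": "M",
         "date_of_birth": "1985-11-02", "postal_code": "V5K 0A1"}

print(same_person(fb, li))     # no shared email, but everything else agrees
print(same_person(fb, other))  # same name only
```

With these made-up weights, a shared email address is enough on its own, and so is the combination of gender, date of birth, last name and postal code, while a shared name alone is not. Once profiles are linked this way, the discussion board posts from issue 4 could be grouped under the linked person.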

I think there are tonnes of possibilities here, and with the increase in metadata from these networks, we should be able to do a lot better than standard Google searching and having to go from network to network to search for people.
