Over $100m in 12 months backs natural language for the semantic web

Radar Networks is accelerating toward building the world’s largest body of knowledge about what people care about, by having people use Twine to organize their bookmarks.  Unlike other social bookmarking sites, Twine uses natural language processing technology to read people’s bookmarks and categorize them within a substantial ontology.  Using this ontology, Twine not only organizes those bookmarks intelligently but also facilitates social networking and collaborative filtering, which yield more relevant suggestions of others’ bookmarks than other social bookmarking sites can provide.

Twine should rapidly eclipse social bookmarking sites, like Digg and Reddit.  This is no small feat!

The underlying capabilities of Twine present Radar Networks with many other opportunities, too.  Twine could spider out from bookmarks and become a general competitor to Google, as Powerset hopes to become.  Twine could become the semantic web’s Wikipedia, to which Metaweb’s Freebase aspires.

Radar Networks might also lead the way to the value that Reuters seeks through its acquisition of ClearForest and its Calais web service.  Twine is already much more open about its semantic web ontology than Freebase, and its technology is capable of the kind of tagging Calais does.  Calais is accumulating semantic assets, as are Powerset, Metaweb and Radar Networks.  Unlike Calais, however, Twine has an immediately apparent go-to-market plan.

Twine could also compete on various social networking fronts, such as with LinkedIn for professionals or eHarmony for dating.  Twine accumulates deeper insight into what people care about than sites that do not have the ability to categorize semantically.  With a little vocabulary or psychological profiling, Twine could do this extremely well!  For some reason, I don’t think Radar Networks intends for Twine to become a Facebook competitor, but it could if they wanted.  Compared to Twine, OpenSocial is boring.

Radar Networks’ CEO and founder, Nova Spivack, has been quoted as saying:

 “Twine is more like a semantic Facebook, and Metaweb is more like a semantic Wikipedia.”

Maybe it’s that simple.  But I don’t think so.

No, the real battle is for relevance – if not dominance – in Web 3.0, the semantic web. 

The semantic web standards for ontology are only the first step.  In addition to large, semantically rigorous ontologies, this battle will involve natural language processing.  And relatively big, smart money is being applied to the problem.  In the last 12 months, over $100m has been invested in Powerset, Metaweb, Radar Networks and ClearForest!  The locus of attack lies at the intersection of semantic web standards (especially RDF and OWL, the Web Ontology Language), natural language processing, and related web sites and services.

  • Radar Networks is led by Nova Spivack and received early funding from Vulcan Capital, which has been funding advanced artificial intelligence research through the Halo project for years.  Leapfrog Ventures was also in the A round.  A B round of $13m from Velocity Interactive, Draper, and Vulcan, announced in February, brings total investment to $18m.

  • Powerset was founded by Barney Pell to compete with Google by commercializing “lexical functional grammar” technology licensed from Xerox PARC.  Powerset was first funded by visionaries including Esther Dyson, early Googlers, and founders or CEOs of PayPal, LinkedIn, Facebook, Plaxo, Napster, and Answers.com.  Through 2006, Powerset raised $12.5m, most of it from Foundation Capital and Founders Fund, for roughly 30% of the company.

  • Metaweb was co-founded by Danny Hillis.  It raised $15m in a round led by Benchmark Capital in 2006 and $42.5m more this year from Benchmark and Goldman Sachs.

  • Reuters bought ClearForest for $25-30m last April.  The Calais web service is being ramped up this year.

Some of these companies will build huge ontologies, including lots of factual knowledge, that may rapidly rival Doug Lenat’s Cyc through the power of mass collaboration and natural language processing.  Of course, Cyc also involves reasoning technology, such as an inference engine.  These text analyzers do not yet answer questions, as Halo strives to do, or automate governance and compliance, as I have done at Haley and continue to pursue.  But that day will come.

P.S. Check out True Knowledge, and maybe Hakia, too.