
TA/NLP: It’s a jungle out there!

Text analytics and natural language processing have made tremendous advances in the last few years. Unfortunately, there is a lot more to understanding natural language than TA/NLP.

I was reading a paper today about NLP pipelines for question answering. It used machine learning to find which tools are good at which tasks, and then configured a pipeline by selecting, for each type of component, the best tool for the task at hand. The paper has a long list of components, so I checked a few out. Most of the interesting ones were available on the web, so they could be composed into pipelines without much software setup. Looking at these, I quickly tired and grew disappointed. Here are some of the reasons.

I am not surprised by these results.  NLU is hard.  But they are not particularly strong results either.  I’m surprised that people find such results useful (if they do).