Projects BH5

DataArt’s Big Data Competence Center launched an app that processes news from the most influential U.S. and U.K. news sources and converts it into easy-to-understand charts and infographics.

While collecting news, the app performs structured crawling: it takes the article content, title, author and tags from the web page and leaves out the surrounding information noise. During processing, the news articles are split into ten categories (business, politics, sports, etc.), and a natural language processor sorts the articles by the referenced object (person, organization, location) or by the emotions expressed (whether the text has positive or negative connotations).
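As a rough illustration of structured crawling, the sketch below pulls only the title, author, body and tags out of a page and ignores everything else. It is not the app's actual crawler: the CSS class names in FIELD_BY_CLASS, the ArticleExtractor class and the extract_article helper are hypothetical, and real news sites each need their own mapping.

```python
from html.parser import HTMLParser

# Hypothetical mapping from CSS class to article field; real sites differ.
FIELD_BY_CLASS = {"title": "title", "author": "author", "body": "content", "tag": "tags"}

# Elements that never have a closing tag, so they must not affect the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ArticleExtractor(HTMLParser):
    """Collects text only from elements whose class maps to a known field."""

    def __init__(self):
        super().__init__()
        self.article = {"title": "", "author": "", "content": "", "tags": []}
        self._stack = []  # one entry per open element: a field name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((FIELD_BY_CLASS[c] for c in classes if c in FIELD_BY_CLASS), None)
        # Inherit the enclosing field so nested markup (e.g. <p> in the body) is kept.
        if field is None and self._stack:
            field = self._stack[-1]
        self._stack.append(field)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        field = self._stack[-1] if self._stack else None
        text = data.strip()
        if not field or not text:
            return  # text outside a known field is treated as noise
        if field == "tags":
            self.article["tags"].append(text)
        else:
            separator = " " if self.article[field] else ""
            self.article[field] += separator + text


def extract_article(html):
    """Return a dict with title, author, content and tags taken from the page."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return parser.article
```

Navigation bars, ads and other page furniture carry no mapped class, so their text never reaches the resulting record.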

The data is aggregated and recorded in charts or presented in infographics, making it possible, for instance, to find all news posts about a certain person or place or to filter the news by emotional coloring.
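The filtering and aggregation described above can be sketched in a few lines. This is only an in-memory illustration under assumed names: ProcessedArticle, filter_articles and count_by_category are hypothetical and do not reflect the app's real schema or its MongoDB queries.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ProcessedArticle:
    """Hypothetical record for an article after categorization and NLP."""
    title: str
    category: str          # e.g. "business", "politics", "sports"
    sentiment: str         # "positive" or "negative"
    entities: set = field(default_factory=set)  # people, organizations, locations


def filter_articles(articles, entity=None, sentiment=None):
    """Keep articles that mention the entity and/or carry the given sentiment."""
    return [
        article for article in articles
        if (entity is None or entity in article.entities)
        and (sentiment is None or article.sentiment == sentiment)
    ]


def count_by_category(articles):
    """Aggregate article counts per category, the kind of data a chart would plot."""
    return Counter(article.category for article in articles)
```

For example, filter_articles(articles, entity="London", sentiment="negative") would return only the negatively colored posts that mention London, and count_by_category on that result gives the per-category totals for a chart.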

The application was built using pure HTML5 and MongoDB. The product is agile and fast despite the amount of data being analyzed.