What is it?
Conta-me Histórias (Tell me stories) is an online tool that lets users automatically generate temporal summaries of news articles preserved by the Portuguese Web Archive. Over the last decade, we have witnessed an ever-growing volume of online content, which poses new challenges for those who aim to understand a given event. This growth in the volume of data, together with media bias, fake news and filter bubbles, has created new challenges for information access and transparency. For instance, following the media coverage of long-lasting events such as wars, migration or economic crises can often be confusing and demanding. One possible solution is to use timelines to support storytelling, as a way of organizing the different phases of complex events. In this context, we invite students, journalists and researchers to explore our solution in this beta application.
How does it work?
The Arquivo.pt project archives the web periodically. It collects and stores entire websites, processes the data to make it searchable, and provides a full-text search service that enables the retrieval of past versions of a site.
To showcase the data archived by the Portuguese Web Archive (http://arquivo.pt), we show the user the most important excerpts (namely, news titles) of a topic over time. To select the best news titles we rely on YAKE!, a keyword extractor designed by our team and recently awarded Best Short Paper at the 40th European Conference on Information Retrieval (ECIR 2018).
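The idea of picking the most relevant titles per time period can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the application's actual pipeline: the `Headline` type, the `score_title` and `build_timeline` functions, and the sample titles are all hypothetical, and the scorer is a simple term-frequency stand-in rather than YAKE!.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Headline:
    """Hypothetical record for an archived news title."""
    year: int
    title: str


def tokenize(text):
    # Lowercase and keep alphabetic tokens longer than three characters,
    # a crude way to skip most function words.
    return [w for w in text.lower().split() if w.isalpha() and len(w) > 3]


def score_title(title, term_counts):
    # Stand-in relevance score (NOT YAKE!): the average collection
    # frequency of the terms that appear in the title.
    tokens = tokenize(title)
    if not tokens:
        return 0.0
    return sum(term_counts[t] for t in tokens) / len(tokens)


def build_timeline(headlines, per_period=1):
    # Count how often each term occurs across all retrieved titles.
    term_counts = Counter(t for h in headlines for t in tokenize(h.title))

    # Group titles by year, then keep the best-scoring ones in each year.
    by_year = defaultdict(list)
    for h in headlines:
        by_year[h.year].append(h)

    timeline = {}
    for year in sorted(by_year):
        ranked = sorted(
            by_year[year],
            key=lambda h: score_title(h.title, term_counts),
            reverse=True,
        )
        timeline[year] = [h.title for h in ranked[:per_period]]
    return timeline


# Invented placeholder titles, used only to exercise the sketch.
SAMPLE = [
    Headline(2010, "economic crisis deepens unemployment"),
    Headline(2010, "film festival opens today"),
    Headline(2011, "government debates economic crisis"),
    Headline(2011, "heavy rain hits north"),
]

if __name__ == "__main__":
    for year, titles in build_timeline(SAMPLE).items():
        print(year, titles)
```

Grouping by year is an arbitrary choice made for this sketch; the period length and the number of titles kept per period would be tuned to the query and to the amount of archived material.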
Additionally, we use SentiLex-PT01, a sentiment analysis tool designed specifically for Portuguese, to analyze the polarity of headlines.
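As a rough illustration of lexicon-based polarity scoring, the Python sketch below sums word-level polarities over a headline. It rests on assumptions: `TOY_LEXICON` and `headline_polarity` are hypothetical names, the entries are invented for the example and are not taken from SentiLex-PT01, and the application's real scoring may differ.

```python
# Hypothetical toy lexicon mapping words to a polarity of +1 or -1.
# These entries are illustrative only, not SentiLex-PT01 data.
TOY_LEXICON = {
    "crise": -1,
    "derrota": -1,
    "sucesso": 1,
    "triunfo": 1,
}


def headline_polarity(title, lexicon=TOY_LEXICON):
    # Strip surrounding punctuation and lowercase each word before lookup.
    tokens = [w.strip(".,;:!?\"'()").lower() for w in title.split()]

    # Words missing from the lexicon contribute nothing to the total.
    total = sum(lexicon.get(t, 0) for t in tokens)

    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(headline_polarity("Crise agrava derrota"))
```

A plain sum ignores negation and intensity, so it should be read as the simplest possible baseline.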
Finally, we use PAMPO to detect a list of relevant entities related to the query.
We present a web application that lets users generate temporal summaries of large news data sets. One of the main goals of this work is to draw public attention to this promising research area. In the era of post-truth and fake news, web and news archive initiatives are important contributions to preserving history. In this context, our project may be seen as an additional solution that helps users better explore this kind of data.
The name ‘Conta-me Histórias’ is a reference to a popular song by Xutos & Pontapés, one of the most important Portuguese rock bands.
In this project we make use of 24 Portuguese news sources.