Briefly: a tweakable text summarizer

A simple interface to digest voluminous text

June 11, 2024



People have to deal with large volumes of text in their day-to-day work. It is hard to decide what to skip, what to digest and what to read in detail. Time is a precious resource, and we are usually willing to trade off some quality or convenience to save it.

This is where a good summarizer can help. It can help you out of the trilemma outlined above. An automated text summarizer lets you pick an acceptable tradeoff between the quality of a summary and the time required to produce it.

Types of summarizers

There are two types of summarizers: abstractive and extractive. An abstractive summarizer constructs an abstract from the input text, framing its own sentences for the summary. Consequently, it has to be trained on text summarized by human experts.

An extractive summarizer ranks the sentences in the input text and picks the top-ranked ones for the summary. The model doesn’t require training, but it does require an algorithm to pick ‘important’ sentences for the summary.
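To make the rank-and-pick idea concrete, here is a minimal sketch of an extractive summarizer. It scores each sentence by the average frequency of its words across the whole text, which is a deliberately simple stand-in and not the topic-modelling approach Briefly uses. The function name `extractive_summary` and the `top_n` parameter are assumptions made for illustration.

```python
import re
from collections import Counter


def extractive_summary(text, top_n=2):
    """Return the top_n highest-scoring sentences, in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole text (no stop-word removal in this sketch).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, keep the best top_n, restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]
```

Note that every sentence in the output is copied verbatim from the input; the only decision the algorithm makes is which sentences to keep.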

Introducing Briefly

Briefly is an extractive text summarizer. It uses topic modelling to identify subtopics in an article, picks the most relevant and important sentences from each subtopic, and aggregates them into a summary. Because Briefly extracts sentences from the text rather than crafting its own, it preserves the semantics of the original.

Briefly is a tweakable summarizer with a dual interface:

A convenient web interface for interactive, experimental summarization.
A command line interface for batch mode use.

Briefly web interface

Using Briefly

One can use Briefly via the web or the command line interface. Let us look at how to use its options to tune the summary.

min_word_count lets the summarizer ignore sentences that have words with counts less than this number.
merge_threshold controls the identification of subtopics during the process of summarization.
summary_size controls the number of summary sentences from each identified subtopic.
passes is the number of runs over which you would like the summarizer to aggregate the summary.
Finally, the include_context flag lets you include a line of context before and after each summary sentence. This helps smooth out the ‘discontinuities’ in the generated summary.
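As an illustration of the last option, the sketch below shows one way a context flag could work: each picked sentence pulls in its immediate neighbours, and the result is returned in document order without duplicates. This is an assumption about the behaviour based on the description above, not Briefly's actual code; the function name `select_with_context` and its arguments are hypothetical.

```python
def select_with_context(sentences, picked, include_context=False):
    """Return the picked sentences, optionally with one neighbour on each side.

    sentences: all sentences of the text, in order.
    picked: indices of the sentences chosen for the summary.
    """
    keep = set()
    for i in picked:
        keep.add(i)
        if include_context:
            if i > 0:
                keep.add(i - 1)          # the line before
            if i < len(sentences) - 1:
                keep.add(i + 1)          # the line after
    # Sorting restores document order; the set removes overlapping context.
    return [sentences[i] for i in sorted(keep)]
```

With five sentences and only the middle one picked, turning the flag on grows the summary from one sentence to three, which is the trade-off to keep in mind when combining it with summary_size.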

All these options may interact with each other depending on the text to be summarized. You will have to experiment to find what works best for your summarization task.

Experiments with Briefly

To experiment with Briefly, try giving it different genres of text such as news articles, reports and fiction. Briefly gives different results with different types of text.

News articles and reports seem to work better than fiction, stories and rambling narratives.

To get the latest version of Briefly, visit its GitHub page.

Enhancements

One way to make the summarizer available to a large number of people is to implement it as a service that can be reached over the LAN or the internet. We give the service a REST API so that we can communicate with our app remotely.

Communication occurs through resources referred to by endpoints. The endpoint /options holds the options used to get the desired summary output.

Interaction with the API uses three types of requests: GET, PUT and POST. A GET request returns a resource; the options resource can be retrieved this way. A PUT request changes a resource; the parameters inside the options resource are changed using this request. The summary resource, referred to by the /summary endpoint, is different because it does not exist until we make it: a POST request creates the final summary and returns it.

FastAPI runs our REST API on our local server. The screenshot below shows it running:

REST API interface
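To illustrate how the two resources respond to GET, PUT and POST, here is a minimal sketch. It deliberately uses only Python's standard library instead of FastAPI so that it runs without extra dependencies, and it is not Briefly's actual service code. The default option values, the JSON field names `text` and `summary`, the status codes, and the `placeholder_summarize` stand-in are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Option names come from the article; the default values are hypothetical.
options = {
    "min_word_count": 5,
    "merge_threshold": 0.5,
    "summary_size": 2,
    "passes": 1,
    "include_context": False,
}


def placeholder_summarize(text, opts):
    """Stand-in for the real summarizer: keeps the first few sentences."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    kept = sentences[: opts["summary_size"]]
    return ". ".join(kept) + "." if kept else ""


def handle(method, path, payload):
    """Map a request onto the two resources and return (status, body)."""
    if path == "/options" and method == "GET":
        return 200, dict(options)                  # read the resource
    if path == "/options" and method == "PUT":
        unknown = set(payload) - set(options)
        if unknown:
            return 400, {"error": "unknown options: " + ", ".join(sorted(unknown))}
        options.update(payload)                    # change the resource
        return 200, dict(options)
    if path == "/summary" and method == "POST":
        if "text" not in payload:
            return 400, {"error": "missing 'text'"}
        # Create the summary resource and return it.
        return 201, {"summary": placeholder_summarize(payload["text"], options)}
    return 404, {"error": "no such resource"}


class BrieflyHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper: decode JSON, call handle(), encode the reply."""

    def _dispatch(self):
        length = int(self.headers.get("Content-Length") or 0)
        payload = json.loads(self.rfile.read(length)) if length else {}
        status, body = handle(self.command, self.path, payload)
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    do_GET = do_PUT = do_POST = _dispatch


def serve(port=8000):
    """Run the sketch locally, e.g. serve() and then GET http://127.0.0.1:8000/options."""
    HTTPServer(("127.0.0.1", port), BrieflyHandler).serve_forever()
```

Keeping the request logic in `handle` and the HTTP plumbing in `BrieflyHandler` means the same three behaviours could be moved to FastAPI route functions with little change.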