Briefly: a tweakable text summarizer
A simple interface to digest voluminous text
People have to deal with large volumes of text in their day-to-day work. It is hard to decide what to skip, what to digest and what to read in detail. Time is a precious resource, and we are usually willing to trade off some quality or convenience for time.
This is where a good summarizer can help. It can help you out of the trilemma outlined above. An automated text summarizer lets you pick an acceptable tradeoff between the quality of a summary and the time required to produce it.
Types of summarizers
There are two types of summarizers: abstractive and extractive. An abstractive summarizer constructs an abstract from the input text, framing its own sentences for the summary. Consequently, it needs to be trained on text summarized by human experts.
An extractive summarizer ranks the sentences in the input text and picks the top-ranked ones for the summary. The model doesn’t require training, but it does require an algorithm to pick the ‘important’ sentences for the summary.
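To make the rank-and-pick idea concrete, here is a minimal sketch of an extractive summarizer. It scores each sentence by the average frequency of its words, which is only one simple stand-in for an ‘importance’ algorithm and is not the method Briefly itself uses. The names split_sentences, extractive_summary and top_n are made up for this illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, top_n=2):
    """Return the top_n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text (lowercased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best top_n, restore text order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]


if __name__ == "__main__":
    article = "Cats sleep a lot. Cats chase mice. Dogs bark."
    print(extractive_summary(article, top_n=2))
```

Because the output is built only from sentences that already exist in the input, nothing is rephrased; the quality of the summary depends entirely on how good the scoring function is.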
Introducing Briefly
Briefly is an extractive text summarizer. It uses the principle of topic modelling to identify subtopics in an article, then picks the most relevant and important sentences from each subtopic and aggregates them into a summary. Because Briefly extracts sentences from the text rather than crafting its own, it preserves the semantics of the original.
Briefly is a tweakable summarizer with a dual interface:
- A convenient web interface for interactive, experimental summarization.
- A command line interface for batch mode use.
Using Briefly
One can use Briefly via the web or the command line interface. Let us look at how to use its options to tune the summary.
- min_word_count lets the summarizer ignore the sentences that have words with counts less than this number.
- merge_threshold controls the identification of subtopics during the process of summarization.
- summary_size controls the number of summary sentences from each identified subtopic.
- passes is the number of runs over which you would like the summarizer to aggregate the summary.
- Finally, the include_context flag lets you include a line of context before and after each summary sentence. This helps smoothen the ‘discontinuities’ in the generated summary.
All these options may interact with each other depending on the text to be summarized. What works best for your summarization task will have to be determined by you.
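As an illustration of what the include_context flag does, the sketch below adds the sentence before and after each picked sentence. It assumes that a ‘line of context’ means one neighbouring sentence; the function name with_context and its arguments are hypothetical and not taken from Briefly's code.

```python
def with_context(sentences, picked, include_context=True):
    """Return the picked sentences, optionally with one neighbour on each side.

    sentences: all sentences of the text, in order
    picked:    indices of the sentences chosen for the summary
    """
    keep = set()
    for i in picked:
        keep.add(i)
        if include_context:
            if i > 0:
                keep.add(i - 1)          # sentence before
            if i < len(sentences) - 1:
                keep.add(i + 1)          # sentence after
    # A set removes duplicates when contexts overlap; sorting restores text order.
    return [sentences[i] for i in sorted(keep)]


if __name__ == "__main__":
    text = ["A.", "B.", "C.", "D.", "E."]
    print(with_context(text, [2]))
    print(with_context(text, [2], include_context=False))
```

Note that turning the flag on can make the summary up to three times longer, which is one example of how it interacts with summary_size.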
Experiments with Briefly
To experiment with Briefly, try giving it different genres of text such as news articles, reports and fiction. Briefly gives different results with different types of text.
In our experience, news articles and reports work better than fiction, stories and rambling narratives.
To get the latest version of Briefly, visit its GitHub page.
Enhancements
One way to make the summarizer available to a large number of people is to implement it as a service. This means we can make it available over the LAN or over the internet. We give the service a REST API, which lets us communicate with our app remotely.
Communication occurs through resources referred to by endpoints. The endpoint /options contains the options to get the desired summary output.
Interaction with the API occurs with three types of requests: GET, PUT and POST. The GET request returns the resources we need. options is a resource that can be returned this way.
The PUT request can be used to change a resource. The parameters inside the options resource are changed using this request.
However, there is one resource that we have to create. This is the summary resource, referred to by the /summary endpoint. The POST request creates the final summary and returns it.
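The request semantics described above can be sketched without any web framework, as three plain functions over an in-memory options resource. This is an illustration only: the option names come from this article, but the default values, the types, the function names and the summarize callback are all assumptions. In a framework such as FastAPI, functions like these would typically be registered with decorators such as @app.get("/options"), @app.put("/options") and @app.post("/summary").

```python
from dataclasses import dataclass, asdict


@dataclass
class Options:
    # Names follow the article; the defaults and types are placeholders.
    min_word_count: int = 5
    merge_threshold: float = 0.5
    summary_size: int = 2
    passes: int = 1
    include_context: bool = False


_options = Options()  # the single 'options' resource held by the service


def get_options():
    """GET /options: return the current options resource."""
    return asdict(_options)


def put_options(updates):
    """PUT /options: change parameters inside the options resource."""
    unknown = set(updates) - set(asdict(_options))
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    for name, value in updates.items():
        setattr(_options, name, value)
    return asdict(_options)


def post_summary(text, summarize):
    """POST /summary: create a summary of `text` using the current options.

    `summarize` is a stand-in for the real summarizer; it receives the text
    and the options as keyword arguments.
    """
    return {"summary": summarize(text, **asdict(_options))}


if __name__ == "__main__":
    put_options({"summary_size": 3})
    print(get_options())
```

Keeping the options as a resource of their own means a client can tune them once with PUT and then send any number of POST requests to /summary without repeating them.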
FastAPI runs our REST API on our local server. Have a look at a screenshot of how it runs below: