LexRank: Graph-based Lexical Centrality as Salience in Text Summarization

A brief summary of “LexRank: Graph-based Lexical Centrality as Salience in Text Summarization” (Erkan and Radev). Posted on February 11 by anung. See also kalyanadupa/C-LexRank, described as the LexRank algorithm given in the paper.


After the DUC evaluations, a more detailed analysis and a more careful implementation of the method were presented, together with a comparison against degree centrality and centroid-based summarization. X^n(i, j) gives the probability of reaching state j from state i in n transitions. To extract the most important sentences, we apply a thresholding mechanism to the resulting similarity matrix. Purely extractive summaries often give better results than automatic abstractive summaries.


Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Figure 3 shows the graphs that correspond to the adjacency matrices derived by assuming the pair of sentences that have a similarity above 0. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence.
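The thresholding step and the resulting degree centrality can be sketched as follows. This is a minimal illustration, not the authors' code: the similarity values and the threshold of 0.2 are made up for the example, and the function name `degree_centrality` is hypothetical. Self-similarity is excluded from the count here, which is an assumption of this sketch.

```python
def degree_centrality(similarity, threshold):
    """Count, for each sentence, the other sentences whose similarity exceeds the threshold."""
    n = len(similarity)
    degrees = []
    for i in range(n):
        # An edge i-j exists only when the similarity is above the threshold.
        # Assumption: a sentence's similarity to itself is not counted.
        degree = sum(1 for j in range(n) if j != i and similarity[i][j] > threshold)
        degrees.append(degree)
    return degrees


# Toy symmetric cosine-similarity matrix for four sentences (made-up values).
similarity = [
    [1.00, 0.45, 0.05, 0.30],
    [0.45, 1.00, 0.15, 0.25],
    [0.05, 0.15, 1.00, 0.02],
    [0.30, 0.25, 0.02, 1.00],
]

degrees = degree_centrality(similarity, threshold=0.2)
print(degrees)  # [2, 2, 0, 2]
```

Raising the threshold removes edges, so the third sentence in this toy matrix is already isolated at 0.2, while a very low threshold would connect almost every pair.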

The convergence property of Markov chains also provides us with a simple iterative algorithm, called the power method, to compute the stationary distribution (Algorithm 2). All of our approaches are based on the concept of prestige in social networks, which has also inspired many ideas in computer networks and information retrieval.
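A minimal sketch of the power method is given below. It is not a transcription of Algorithm 2: the function name `power_method`, the tolerance, the iteration cap, and the three-state transition matrix are all assumptions chosen for illustration. The matrix is row-stochastic, and the chain is irreducible and aperiodic, so the iteration converges.

```python
def power_method(matrix, tolerance=1e-8, max_iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix."""
    n = len(matrix)
    r = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iterations):
        # Multiply by the transpose: r_next[j] = sum_i matrix[i][j] * r[i].
        r_next = [sum(matrix[i][j] * r[i] for i in range(n)) for j in range(n)]
        change = sum(abs(a - b) for a, b in zip(r_next, r))
        r = r_next
        if change < tolerance:
            break
    return r


# Hypothetical three-state chain; every row sums to 1.
transition = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

stationary = power_method(transition)
print([round(p, 4) for p in stationary])  # [0.25, 0.5, 0.25]
```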

In fact, truly abstractive summarization has not yet reached a mature stage. In this paper we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. All the feature values are normalized so that the sentence that has the highest value gets the score 1, and the sentence with the lowest value gets the score 0.
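The normalization described here is a min-max rescaling. A small sketch follows, with a hypothetical function name `normalize_feature`. Returning 0 for every sentence when all values are equal is an assumption of this sketch, since the text does not say how that case is handled.

```python
def normalize_feature(values):
    """Rescale feature values so the highest becomes 1 and the lowest becomes 0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: with no spread in the values, every sentence gets 0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


scores = normalize_feature([3.0, 7.0, 5.0])
print(scores)  # [0.0, 1.0, 0.5]
```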


In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Even the simplest approach we have taken, degree centrality, is a good enough heuristic to perform better than lead-based and centroid-based summaries.

We will discuss how random walks on sentence-based graphs can help in text summarization. At each iteration, the eigenvector is updated by multiplying with the transpose of the stochastic matrix. Task 2 of both DUC evaluations involves summarization of news document clusters.


Our summarization approach in this paper is to assess the centrality of each sentence in a cluster and extract the most important ones to include in the summary. Zha argues that the terms that appear in many sentences with high salience scores should have high salience scores, and the sentences that contain many terms with high salience scores should also have high salience scores.

The top scores we obtained on all data sets come from our new methods. We test the technique on the problem of text summarization (TS).

Centrality-based Sentence Salience

In this section, we propose several other criteria to assess sentence salience. To determine the similarity between two sentences, we use the cosine similarity metric, which is based on word overlap and idf weighting.
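One way to compute a cosine similarity based on word overlap and idf weighting is sketched below. The details are assumptions of this sketch rather than something stated in the text: sentences are already tokenized, idf is taken as log(N / n_w) with N the number of sentences and n_w the number of sentences containing word w, and the names `idf_weights` and `idf_cosine` are hypothetical.

```python
import math
from collections import Counter


def idf_weights(sentences):
    """Inverse document frequency, treating each tokenized sentence as a document."""
    n = len(sentences)
    document_frequency = Counter()
    for tokens in sentences:
        document_frequency.update(set(tokens))
    # Assumption: idf(w) = log(N / n_w).
    return {word: math.log(n / df) for word, df in document_frequency.items()}


def idf_cosine(tokens_x, tokens_y, idf):
    """Cosine similarity between two sentences using tf * idf term weights."""
    tf_x, tf_y = Counter(tokens_x), Counter(tokens_y)
    shared = tf_x.keys() & tf_y.keys()  # only overlapping words contribute
    numerator = sum(tf_x[w] * tf_y[w] * idf.get(w, 0.0) ** 2 for w in shared)
    norm_x = math.sqrt(sum((tf * idf.get(w, 0.0)) ** 2 for w, tf in tf_x.items()))
    norm_y = math.sqrt(sum((tf * idf.get(w, 0.0)) ** 2 for w, tf in tf_y.items()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return numerator / (norm_x * norm_y)


# Toy tokenized sentences (made up for the example).
sentences = [
    ["markov", "chain", "random", "walk"],
    ["random", "walk", "graph"],
    ["sentence", "summary"],
]
idf = idf_weights(sentences)
sim_01 = idf_cosine(sentences[0], sentences[1], idf)
sim_02 = idf_cosine(sentences[0], sentences[2], idf)
```

Sentences that share no words get a similarity of 0, and rare shared words contribute more than common ones because of the squared idf factor in the numerator.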

In LexRank, we have tried to make use of more of the information in the graph, and got even better results in most of the cases. This is due to the binary discretization we perform on the cosine matrix using a threshold.


LexRank scores for the graphs in Figure 3. The seed paragraphs are then considered representative descriptions of the corresponding subtopics and included in the summary. The higher the threshold, the less informative, or even misleading, the similarity graphs become.

However, there are more advanced techniques of assessing similarity which are often used in the topical clustering of documents or sentences (Hatzivassiloglou et al.). Our LexRank implementation requires the cosine similarity threshold, 0. The results show that degree-based methods, including LexRank, outperform both centroid-based methods and other systems participating in DUC in most of the cases.
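To make the role of the threshold concrete, here is a hedged sketch of a thresholded, LexRank-style scoring pass in plain Python. It is an illustration under stated assumptions, not the implementation referred to above: the function name `lexrank`, the jump probability of 0.15, the threshold of 0.2, and the toy similarity matrix are all made up. Each sentence is linked to itself so that no row of the adjacency matrix is empty.

```python
def lexrank(similarity, threshold, jump_probability=0.15,
            tolerance=1e-8, max_iterations=1000):
    """Score sentences by a random walk on the thresholded similarity graph."""
    n = len(similarity)
    # Binary adjacency matrix; the self-link keeps every row non-empty.
    adjacency = [
        [1.0 if (i == j or similarity[i][j] > threshold) else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Row-normalize so each row is a probability distribution.
    transition = [[value / sum(row) for value in row] for row in adjacency]
    scores = [1.0 / n] * n
    for _ in range(max_iterations):
        # Assumption: PageRank-style mix of a uniform jump and a graph step.
        updated = [
            jump_probability / n
            + (1.0 - jump_probability)
            * sum(transition[i][j] * scores[i] for i in range(n))
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tolerance:
            break
    return scores


# Toy symmetric similarity matrix for four sentences (made-up values).
similarity = [
    [1.00, 0.50, 0.40, 0.30],
    [0.50, 1.00, 0.25, 0.05],
    [0.40, 0.25, 1.00, 0.10],
    [0.30, 0.05, 0.10, 1.00],
]
scores = lexrank(similarity, threshold=0.2)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy graph the first sentence is connected to all the others and receives the highest score, while the last sentence, linked only to the first, receives the lowest.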


Acknowledgments: We would like to thank Mark Newman for providing some useful references for this paper. An intuitive interpretation of the stationary distribution can be understood through the concept of a random walk.

We discuss several methods to compute centrality using the similarity graph. A social network is a mapping of relationships between interacting entities. Each element of the vector r gives the asymptotic probability of ending up in the corresponding state in the long run regardless of the starting state.
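The statement about the vector r can be written out as follows. The notation is an assumption of this note: X denotes the row-stochastic transition matrix of the Markov chain, and the limit holds when the chain is irreducible and aperiodic.

```latex
\[
  \mathbf{r}^{T} X = \mathbf{r}^{T}, \qquad
  \sum_{i} r_i = 1, \qquad
  \lim_{n \to \infty} X^{n}(i, j) = r_j \quad \text{for every starting state } i .
\]
```

The first two conditions say that r is a probability distribution left unchanged by one step of the chain; the limit says that the probability of being in state j after many transitions no longer depends on where the walk started.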

As in all discretization operations, this means a loss of information. Our system, based on LexRank, ranked first in more than one task in the recent DUC evaluation.