Automatic recognition of topic transition for newspaper articles and application to document summary

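Basic information

  • Grant number:
    15500086
  • Principal investigator:
  • Amount:
    $25,000
  • Host institution:
  • Host institution country:
    Japan
  • Project category:
    Grant-in-Aid for Scientific Research (C)
  • Fiscal year:
    2003
  • Funding country:
    Japan
  • Period:
    2003 to 2004
  • Project status:
    Completed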

Project abstract

This study addressed automatic summarization of newspaper articles. As a first step toward precise multi-document summarization, we investigated the following:

(1) A subject template is built from a large-scale corpus and used to extract the follow-up articles of a target article. The extracted follow-up articles are then classified into subject clusters.
(2) Each subject cluster is summarized, and the full set of follow-up articles is then summarized taking into account the connections between the clusters.

Regarding (1), we proposed a topic tracking method that uses subject templates and machine learning (support vector machines). We also showed that the method extracts follow-up articles with high accuracy on large corpora (the Topic Detection and Tracking corpus and articles from the Mainichi Shimbun newspaper) (research papers 4, 5).

Regarding (2), we found that multi-document summarization requires extracting synonyms of each word, and we proposed a method for identifying synonym pairs in Japanese newspaper text (3, 4). To identify synonyms, we compared Lin's method with Hindle's method and found that Lin's method performs better on Japanese documents. We proposed a method for Japanese documents based on Lin's method. We also ran sentence extraction experiments using automatically extracted synonym pairs and the titles of newspaper articles (research papers 1, 2). The procedure was as follows: first, synonyms of the words in article titles were extracted from newspaper articles using the proposed Lin-based method; then sentence extraction was performed using those synonyms. The results show that identifying synonyms is useful for sentence extraction.
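The synonym-then-extract procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes a Lin-style similarity computed over dependency triples (word, relation, co-occurring word), keeping only features with positive information, and it scores sentences by simple overlap with the synonym-expanded title words. The toy triples, the 0.5 threshold, and all function names (`build_info`, `lin_similarity`, `synonym_table`, `extract_sentences`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

def build_info(triples):
    """Map each word to its features (rel, other) with positive information."""
    c_wrv = Counter(triples)
    c_wr = Counter((w, r) for w, r, _ in triples)
    c_rv = Counter((r, v) for _, r, v in triples)
    c_r = Counter(r for _, r, _ in triples)
    info = defaultdict(dict)
    for (w, r, v), n in c_wrv.items():
        # I(w, r, v) = log( |w,r,v| * |*,r,*| / (|w,r,*| * |*,r,v|) )
        value = math.log(n * c_r[r] / (c_wr[(w, r)] * c_rv[(r, v)]))
        if value > 0:
            info[w][(r, v)] = value
    return info

def lin_similarity(info, w1, w2):
    """Information in shared features divided by information in all features."""
    f1, f2 = info.get(w1, {}), info.get(w2, {})
    total = sum(f1.values()) + sum(f2.values())
    if total == 0:
        return 0.0
    shared = f1.keys() & f2.keys()
    return sum(f1[f] + f2[f] for f in shared) / total

def synonym_table(info, threshold):
    """Treat word pairs whose similarity reaches the threshold as synonyms."""
    words = sorted(info)
    table = defaultdict(set)
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if lin_similarity(info, a, b) >= threshold:
                table[a].add(b)
                table[b].add(a)
    return table

def extract_sentences(title_words, sentences, synonyms, top_n=1):
    """Rank tokenized sentences by overlap with title words plus their synonyms."""
    expanded = set(title_words)
    for w in title_words:
        expanded |= synonyms.get(w, set())
    ranked = sorted(sentences, key=lambda s: -len(expanded & set(s)))
    return ranked[:top_n]

# Hypothetical toy data: (word, relation, co-occurring word) triples.
TRIPLES = [
    ("premier", "subj-of", "announce"),
    ("premier", "subj-of", "resign"),
    ("premier", "subj-of", "visit"),
    ("prime_minister", "subj-of", "announce"),
    ("prime_minister", "subj-of", "resign"),
    ("typhoon", "subj-of", "hit"),
    ("typhoon", "subj-of", "approach"),
    ("storm", "subj-of", "hit"),
    ("storm", "subj-of", "approach"),
]
SENTENCES = [
    ["typhoon", "approach", "coast"],
    ["prime_minister", "announce", "plan"],
    ["weather", "be", "fine"],
]

info = build_info(TRIPLES)
synonyms = synonym_table(info, threshold=0.5)
print(extract_sentences(["premier", "resign"], SENTENCES, synonyms))
```

In this toy setup the title contains "premier" while the relevant sentence uses "prime_minister", so plain word overlap would miss it; expanding the title words with automatically identified synonyms is what lets the sentence be selected.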

Project outcomes

Journal papers (28)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi: "Complementing News Stories with Newswire Articles for Topic Tracking" Proceedings of PACLING'03 (Pacific Association for Computational Linguistics 2003). 1. 265-274 (2003)
  • DOI:
  • Publication date:
  • Journal:
  • Impact factor:
    0
  • Authors:
  • Corresponding author:
Extracting Similar Nouns for Sentence Extraction
続報記事抽出のための記事間類似度を利用したSVM学習データの自動生成 (Automatic generation of SVM training data using inter-article similarity for follow-up article extraction)
Complementing news stories with newswire articles for topic tracking
  • DOI:
  • Publication date:
    2003
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yoshimi SUZUKI;Fumiyo FUKUMOTO;Yoshihiro SEKIGUCHI
  • Corresponding author:
    Yoshihiro SEKIGUCHI
Clustering Similar Nouns for Selecting Related News Articles

Other publications by SUZUKI Yoshimi

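Ehrlichia ruminantium多型解析のためのMulti-Locus Variable Number Tandem Repeat Analysis (MLUA)法開発 (Development of a Multi-Locus Variable Number Tandem Repeat Analysis (MLUA) method for polymorphism analysis of Ehrlichia ruminantium)
  • DOI:
  • Publication date:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    CASARETO Beatriz E;NIRAULA Mohan;SUZUKI Toshiyuki;OHBA Hideo;AGOSTINI Sylvain;SUZUKI Yoshimi;中尾亮
  • Corresponding author:
    中尾亮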
Nitrogen fixation in fringing coral reefs: a comparison among different sub-environments
  • DOI:
  • Publication date:
    2009
  • Journal:
  • Impact factor:
    0
  • Authors:
    CASARETO Beatriz E;NIRAULA Mohan;SUZUKI Toshiyuki;OHBA Hideo;AGOSTINI Sylvain;SUZUKI Yoshimi
  • Corresponding author:
    SUZUKI Yoshimi

Other grants of SUZUKI Yoshimi

Construction of a computational model of word sense based on vocabulary, phonology, and pronunciation, and its application to multiple document summarization
  • Grant number:
    18K11429
  • Fiscal year:
    2018
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Developing a competency model of readiness for bioterrorism in public health nurses
  • Grant number:
    17K12598
  • Fiscal year:
    2017
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Tone Understanding of Article based on Word Connotation and Application to Text Summarization
  • Grant number:
    26330247
  • Fiscal year:
    2014
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Evaluation of public health nursing students' achievement levels based on the collaboration with local governments and universities pioneering the introduction of public health nursing electives
  • Grant number:
    26463577
  • Fiscal year:
    2014
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Evaluation of the breast cancer early detection educational program for Filipino women in Japan based on partnership
  • Grant number:
    23593411
  • Fiscal year:
    2011
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Thesaurus construction using corpus of science and technology and its application to document retrieval
  • Grant number:
    20500127
  • Fiscal year:
    2008
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (C)
Developing partnership program to promote Filipino women's health
  • Grant number:
    20890234
  • Fiscal year:
    2008
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Young Scientists (Start-up)
Characteristics of coral bleaching in Mauritius: micro ecosystem and biogeochemistry
  • Grant number:
    19255002
  • Fiscal year:
    2007
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (A)
Chemical and biological study of coral reef around Mauritius: investigation on the bleaching
  • Grant number:
    15255004
  • Fiscal year:
    2003
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (A)
Quantitative relationship on the behavior between organic matters and radionuclides in seawater
  • Grant number:
    13480155
  • Fiscal year:
    2001
  • Funding amount:
    $25,000
  • Project category:
    Grant-in-Aid for Scientific Research (B)