Automatic recognition of topic transition for newspaper articles and application to document summary
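Basic information
- Grant number: 15500086
- Principal investigator:
- Amount: $25,000
- Host institution:
- Host institution country: Japan
- Project category: Grant-in-Aid for Scientific Research (C)
- Fiscal year: 2003
- Funding country: Japan
- Period: 2003 to 2004
- Project status: Completed
- Source:
- Keywords:
Project abstract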
This study addressed automatic summarization of newspaper articles. As a first step toward accurate multi-document summarization, we investigated two tasks. (1) A subject template is built from a large-scale corpus and used to extract the follow-up articles of a target article; the extracted follow-up articles are then classified into subject clusters. (2) Each subject cluster is summarized, and the follow-up articles as a whole are summarized while taking the connections between clusters into account.
Regarding (1), we proposed a topic tracking method that uses subject templates and machine learning (support vector machines). We also showed that the method extracts follow-up articles with high accuracy on large corpora (the Topic Detection and Tracking corpus and articles from the Mainichi Shimbun newspaper) (research papers 4, 5).
Regarding (2), we found that multi-document summarization requires extracting synonyms of each word, and we proposed a method for identifying synonym pairs in Japanese newspaper text (3, 4). To identify synonyms, we compared Lin's method with Hindle's method and found that Lin's method performs better on Japanese documents. We proposed a method for Japanese documents based on Lin's method. We also ran sentence extraction experiments using automatically extracted synonym pairs and the titles of newspaper articles (research papers 1, 2). The procedure is as follows: first, the proposed method based on Lin's method extracts, from newspaper articles, synonyms of the words in article titles; then sentence extraction is performed using those synonyms. The results show that identifying synonyms is useful for sentence extraction.
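The sentence extraction procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the project's implementation: the scoring rule (the number of distinct words a sentence shares with the synonym-expanded title), the function names `expand_title_words` and `extract_sentences`, and the toy English data are all assumptions. The abstract does not say how sentences were scored, and the synonym table here is written by hand rather than produced by the Lin-based method.

```python
from typing import Dict, List, Set


def expand_title_words(title_words: List[str],
                       synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the title words together with any synonyms listed for them."""
    expanded = set(title_words)
    for word in title_words:
        expanded |= synonyms.get(word, set())
    return expanded


def extract_sentences(title_words: List[str],
                      sentences: List[List[str]],
                      synonyms: Dict[str, Set[str]],
                      k: int = 1) -> List[int]:
    """Select the k sentences that share the most distinct words with the
    synonym-expanded title. Returns sentence indices in document order.

    The overlap-count score is an assumption made for illustration only.
    """
    expanded = expand_title_words(title_words, synonyms)
    scored = [(len(expanded & set(tokens)), idx)
              for idx, tokens in enumerate(sentences)]
    # Higher score first; ties are broken by earlier position in the article.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return sorted(idx for _, idx in top)


if __name__ == "__main__":
    # Hypothetical, hand-made example data (already tokenized).
    title = ["quake", "damage"]
    synonym_table = {"quake": {"earthquake", "tremor"}}
    article = [
        ["the", "weather", "was", "mild"],
        ["an", "earthquake", "caused", "heavy", "damage"],
        ["officials", "reported", "a", "tremor"],
    ]
    print(extract_sentences(title, article, synonym_table, k=2))
    print(extract_sentences(title, article, {}, k=2))
```

In this toy example, the third sentence is selected only when the synonym table is supplied, because it mentions "tremor" but neither title word. That is the kind of effect the abstract attributes to synonym identification.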
Project outcomes
Journal articles (28)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi: "Complementing News Stories with Newswire Articles for Topic Tracking" Proceedings of PACLING'03 (Pacific Association for Computational Linguistics 2003). 1. 265-274 (2003)
- DOI:
- Publication year:
- Journal:
- Impact factor: 0
- Authors:
- Corresponding author:
続報記事抽出のための記事間類似度を利用したSVM学習データの自動生成
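Automatic generation of SVM training data using inter-article similarity for follow-up article extraction
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: 北條博之; 鈴木良弥
- Corresponding author: 鈴木良弥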
Extracting Similar Nouns for Sentence Extraction
- DOI:
- Publication year: 2004
- Journal:
- Impact factor: 0
- Authors: Yoshimi Suzuki; Fumiyo Fukumoto
- Corresponding author: Fumiyo Fukumoto
Complementing news stories with newswire articles for topic tracking
- DOI:
- Publication year: 2003
- Journal:
- Impact factor: 0
- Authors: Yoshimi SUZUKI; Fumiyo FUKUMOTO; Yoshihiro SEKIGUCHI
- Corresponding author: Yoshihiro SEKIGUCHI
Sentence Extraction using Similar Words
- DOI:
- Publication year: 2005
- Journal:
- Impact factor: 0
- Authors: Yoshimi Suzuki; Fumiyo Fukumoto
- Corresponding author: Fumiyo Fukumoto
Other publications by SUZUKI Yoshimi
Ehrlichia ruminantium多型解析のためのMulti-Locus Variable Number Tandem Repeat Analysis (MLUA)法開発
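Development of a Multi-Locus Variable Number Tandem Repeat Analysis (MLUA) method for polymorphism analysis of Ehrlichia ruminantium
- DOI:
- Publication year: 2010
- Journal:
- Impact factor: 0
- Authors: CASARETO Beatriz E; NIRAULA Mohan; SUZUKI Toshiyuki; OHBA Hideo; AGOSTINI Sylvain; SUZUKI Yoshimi; 中尾亮
- Corresponding author: 中尾亮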
Nitrogen fixation in fringing coral reefs : a comparison among different sub-environments
- DOI:
- Publication year: 2009
- Journal:
- Impact factor: 0
- Authors: CASARETO Beatriz E; NIRAULA Mohan; SUZUKI Toshiyuki; OHBA Hideo; AGOSTINI Sylvain; SUZUKI Yoshimi
- Corresponding author: SUZUKI Yoshimi
Other grants held by SUZUKI Yoshimi
Construction of a computational model of word sense based on vocabulary, phonology, and pronunciation, and its application to multiple document summarization
- Grant number: 18K11429
- Fiscal year: 2018
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Developing a competency model of readiness for bioterrorism in public health nurses
- Grant number: 17K12598
- Fiscal year: 2017
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Tone Understanding of Article based on Word Connotation and Application to Text Summarization
- Grant number: 26330247
- Fiscal year: 2014
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Evaluation of public health nursing students achievement levels based on the collaboration with local governments and universities pioneering the introduction of public health nursing electives
- Grant number: 26463577
- Fiscal year: 2014
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Evaluation of the breast cancer early detection educational program for Filipino women in Japan based on partnership
- Grant number: 23593411
- Fiscal year: 2011
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Thesaurus construction using corpus of science and technology and its application to document retrieval
- Grant number: 20500127
- Fiscal year: 2008
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (C)
Developing partnership program to promote Filipino women's health
- Grant number: 20890234
- Fiscal year: 2008
- Funding amount: $25,000
- Project category: Grant-in-Aid for Young Scientists (Start-up)
Characteristics of coral bleaching in the Mauritius : micro ecosystem and biogeochemistry
- Grant number: 19255002
- Fiscal year: 2007
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (A)
Chemical and biological study of coral reef around the Mauritius : investigation on the bleaching
- Grant number: 15255004
- Fiscal year: 2003
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (A)
Quantitative relationship on the behavior between organic matters and radionuclides in seawater
- Grant number: 13480155
- Fiscal year: 2001
- Funding amount: $25,000
- Project category: Grant-in-Aid for Scientific Research (B)