Studies on fast pattern matching algorithms based on text compressions

基于文本压缩的快速模式匹配算法研究

基本信息

  • 批准号:
    09680343
  • 负责人:
  • 金额:
    $ 0.45万
  • 依托单位:
  • 依托单位国家:
    日本
  • 项目类别:
    Grant-in-Aid for Scientific Research (C)
  • 财政年份:
    1997
  • 资助国家:
    日本
  • 起止时间:
    1997 至 1998
  • 项目状态:
    已结题

项目摘要

The aim of text compressions is to decrease the amount for storing files in secondary disk stor- ages. Therefore the traditional criterion is the compression ratio. In this project we propose a new criterion to select a compression method. The criterion is the efficiency of string pattern matching in compressed texts without decoding. The goals of this project are :Goal 1 : A faster search in compressed text in comparison with a decompression followed by a simple search.Goal 2 : A faster search in compressed text in comparison with a simple search in uncompressed text.Main results of this research in these two years are summarized as follows.(1) We developed and implemented a multiple pattern matching algorithm in compressed text by the LZW compression method, which is used in the COMPRESS command in UNIX.(2) We also devised a more efficient algorithm for a single pattern in LZW compressed texts, which is based on the Shift-And approach.(3) We proved by experiments that the algorithms of (1) and (2) are approximately twice faster than a decompression followed by a simple search. That is, we have achieved Goal 1.(4) We proved by experiments that the algorithms of (1) and (2) are faster than a simple search on uncompressed texts. That is, we have achieved Goal 2.(5) We also developed compressed pattern matching algorithms for other compression methods, such as, byte pair encoding, Huffman encoding, finite-state encoding, and compression using antidictionaries, and then evaluate them. We have finished this project successfully.
文本压缩的目的是减少在辅助磁盘存储中存储文件的金额。因此,传统标准是压缩比。在此项目中,我们提出了一个新标准来选择一种压缩方法。标准是在不解码的情况下在压缩文本中匹配的字符串模式的效率。该项目的目标是:目标1:与减压后的压缩文本中更快的搜索,然后进行简单搜索。目标2:与未压缩文本中的简单搜索相比,在压缩文本中更快的搜索搜索。这两年的这项研究的结果总结如下所示。 (2)我们还为LZW压缩文本中的单个模式设计了一种更有效的算法,该算法基于移位和方法。(3)我们通过实验证明了(1)和(2)的算法比对压缩的算法(2)的算法快两倍,然后进行了简单的搜索。也就是说,我们已经实现了目标1。(4)我们通过实验证明了(1)和(2)的算法比对未压缩文本的简单搜索要快。也就是说,我们已经实现了目标2。(5)我们还为其他压缩方法开发了压缩模式匹配算法,例如,字节对编码,Huffman编码,有限状态编码和使用反杂种的压缩,然后对其进行评估。我们已经成功完成了这个项目。

项目成果

期刊论文数量(0)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Kida,T.et al.: "Shift-And approach to pattern matching in LZW compressed text" Technical Report DOI-TR-156,Kyushu University. 1-13 (1999)
Kida,T.et al.:“LZW 压缩文本中模式匹配的 Shift-And 方法”技术报告 DOI-TR-156,九州大学。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Kida,T.et al.: "Multiple Pattern Matching in LZW Compressed Text" Proc.Data Compression Conference,DCC'98. 103-112 (1998)
Kida,T.et al.:“LZW 压缩文本中的多重模式匹配”Proc.Data Compression Conference,DCC98。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
T.Kida et al.: "Multiple Pattern Matching in LZW Compressed Text" Proc.Data Compression Conference,DCC'98. (to appear). (1998)
T.Kida 等人:“LZW 压缩文本中的多重模式匹配”Proc.Data Compression Conference,DCC98。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Takeda,M.: "Pattern matching machine for text compressed using finite state model" Technical Report DOI-TR-142,Kyushu University. 1-12 (1997)
Takeda,M.:“使用有限状态模型压缩文本的模式匹配机”技术报告 DOI-TR-142,九州大学。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Shibata, Y.et al.: "Pattern matching in text compressed by using antidictionaries" Technical Report DOI-TR-157, Kyushu University. 1-12 (1999)
Shibata, Y.et al.:“使用反词典压缩的文本中的模式匹配”技术报告 DOI-TR-157,九州大学。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

TAKEDA Masayuki其他文献

TAKEDA Masayuki的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('TAKEDA Masayuki', 18)}}的其他基金

Mechanism of driver mutation positive lung cancer
驱动突变阳性肺癌发生机制
  • 批准号:
    15K21525
  • 财政年份:
    2015
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Young Scientists (B)
Studies on lower urinary tract dysfunction pathogenesis by complex systems network and dynamic homeostasis collapse
复杂系统网络和动态稳态崩溃对下尿路功能障碍发病机制的研究
  • 批准号:
    15H04972
  • 财政年份:
    2015
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
Research on vesicular type transporter and refractory lower urinary tract dysfunction
囊泡型转运蛋白与难治性下尿路功能障碍的研究
  • 批准号:
    26670699
  • 财政年份:
    2014
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Challenging Exploratory Research
Mechanisms And Possible Novel Treatments for Nocturiarelated to Abnormal Circadian Rhythm
与昼夜节律异常相关的夜尿症的机制和可能的新疗法
  • 批准号:
    23659754
  • 财政年份:
    2011
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Challenging Exploratory Research
Teaching Materials and Tools for Education of Information Science for Elementary, Junior High and High School Students
中小学生信息科学教育教材与工具
  • 批准号:
    23650515
  • 财政年份:
    2011
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Challenging Exploratory Research
Research on the afferent transduction in the lower urinary tract
下尿路传入传导的研究
  • 批准号:
    23390381
  • 财政年份:
    2011
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
Foundational technology for light-weight XML-DBMS based on very fast compressed data stream processing
基于极快压缩数据流处理的轻量级 XML-DBMS 基础技术
  • 批准号:
    22300010
  • 财政年份:
    2010
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
Reseaarch on the function and development of new therapy for Ion channels in the lower urinary tract
下尿路离子通道的功能研究及新疗法的开发
  • 批准号:
    20390423
  • 财政年份:
    2008
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
Key Technology for XML DB in Embedded Device Based on Efficient Compressed Pattern Matching
基于高效压缩模式匹配的嵌入式XML数据库关键技术
  • 批准号:
    19300008
  • 财政年份:
    2007
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
Development of efficient machine discovery system based on data compression and pattern matching
基于数据压缩和模式匹配的高效机器发现系统的开发
  • 批准号:
    15300049
  • 财政年份:
    2003
  • 资助金额:
    $ 0.45万
  • 项目类别:
    Grant-in-Aid for Scientific Research (B)
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了