A Hybrid Approach to Computationally Hard Problems : Combining Approximation, Parallelization, and Randomization

计算难题的混合方法:结合近似、并行化和随机化

基本信息

  • 批准号:
    14580390
  • 负责人:
  • 金额:
    $ 1.02万
  • 依托单位:
  • 依托单位国家:
    日本
  • 项目类别:
    Grant-in-Aid for Scientific Research (C)
  • 财政年份:
    2002
  • 资助国家:
    日本
  • 起止时间:
    2002 至 2004
  • 项目状态:
    已结题

项目摘要

In this research project, we studied the effect of a hybrid approach to computationally hard problems. This approach combines three basic approaches (namely, approximation, randomization, and parallelization) to computationally hard problems. Previously, not so many algorithms were based on such a hybrid approach. The main purpose here is to use this hybrid approach to solve computationally hard problems that have not been solved so far. This may lead to the finding of new design techniques of efficient algorithms for hard problems.We focused on three computationally hard problems arising from the field of computational biology. The first is the protein NMR peak assignment problem which is crucial towards the automation of assigning a group of "spin systems" obtained experimentally to a protein sequence of amino acids. We formulated this problem as an interval scheduling problem (ISP), where a protein sequence P of amino acids is viewed as a discrete time interval I (the amino acids on … More P one-to-one correspond to the time units of I), each subset S of spin systems that are known to originate from consecutive amino acids from P is viewed as a "job" js, the preference of assigning S to a subsequence Q of consecutive amino acids on P is viewed as the profit of executing job js in the subinterval of I corresponding to Q, and the goal is to maximize the total profit of executing the jobs (on a single machine) during I. We showed that the interval scheduling problem is Max SNP-hard (even if each job takes either one or two consecutive time units), and designed an efficient 2-approximation algorithm for it. However, our experiments show that the 2-approximation algorithm does not output satisfactory assignments in practice. The reason is as follows : In the real practice of protein NMR peak assignment, each job js usually requires at most 10 consecutive time units, and typically the jobs that require one or two consecutive time units are the most difficult to assign/schedule. For this reason, we then designed several efficient heuristics for the problem ; some of them run on PC-clusters in short (parallel) time. Our experiments show that these heuristics work very well in practice.The second problem we considered is the following : Given a set of species and their similarity data, reconstruct a phylogeny (also called evolutionary tree) so that species are close in the phylogeny if and only if they have high similarity. Assume that the similarity data are represented as a graph G=(V, E), where each vertex represents a species and two vertices are adjacent if they represent species of high similarity. The phylogeny reconstruction problem can then be abstracted as a graph-theoretic problem called the phylogenetic k-th root problem (PR_k), where k is a predetermined proximity threshold. We showed that the problem can be solved in linear time if the input data have no errors and the phylogeny to be constructed is of bounded degree. We also showed that the problem is NP-hard if the input data have errors (no matter the phylogeny to be constructed is of bounded degree or not).The third problem we considered is the problem of DNA sequence alignment with inversions and reversals. Previously, inversions and reversals had not been considered seriously in sequence alignment although there are real in practice ; the only known algorithm previously runs in O(n^2m^2) time and consumes O(n^2m^2) space, where n and m are the lengths of the two input sequences respectively. We designed a space-efficient algorithm for this problem which consumes only O(nm) space with the same amount of time. Our algorithm enables the computation for a pair of DNA sequences of length up to 10,000 to be carried out on an ordinary desktop computer. Less
在该研究项目中,我们研究了用于计算困难问题的混合方法的效果。这种方法结合了三种基本方法(即近似,随机化和并行化)与计算困难问题相结合。以前,并非太多算法是基于这种混合方法。这里的主要目的是使用这种混合方法来解决到目前为止尚未解决的计算困难问题。这可能会导致发现有效问题的有效算法的新设计技术。我们专注于计算生物学领域引起的三个计算上的硬问题。首先是蛋白质NMR峰分配问题,该问题对于分配一组“自旋系统”的自动化至关重要。 We formulated this problem as an interval scheduling problem (ISP), where a protein sequence P of amino acids is viewed as a discrete time interval I (the amino acids on … More P one-to-one correspond to the time units of I), each subset S of spin systems that are known to originate from consecrated amino acids from P is viewed as a "job" js, the preference of assigning S to a subsequence Q of consecrated P上的氨基酸被视为在与Q相对应的I子间隔中执行作业J的利润,其目标是最大化执行工作的总利润(在一台机器上)。我们表明,间隔时间表的时间表是最大的SNP-HARD(即使每个工作都需要一个或两个连续的时间或两个连续的时间单位),并为有效的2-appRroxims ang Algroximims Aldimims Algroximims Algroximsmem。但是,我们的实验表明,在实践中,2-辅助算法不会输出满意度分配。原因如下:在蛋白质NMR峰分配的真实实践中,每个作业JS通常最多需要连续10个单位,通常需要连续一两个时间单位的工作是最难​​分配/时间表的工作。因此,我们为这个问题设计了几种有效的启发式方法。其中一些在短(平行)时间内在PC群集上运行。我们的实验表明,这些启发式方法在实践中非常有效。我们考虑的第二个问题是以下问题:鉴于一组物种及其相似性数据,重建了系统发育(也称为进化树),以便当它们具有高相似性时,系统发育中的物种在系统发育中很接近。假设相似性数据表示为g =(v,e),其中每个顶点代表一个物种,如果它们代表高相似性的物种,则两个顶点相邻。然后可以将系统发育重建问题提取为一种称为系统发育k- root问题(PR_K)的图理论问题,其中k是预定的接近度阈值。我们表明,如果输入数据没有误差,并且要构建的系统发育是有界程度的,则可以在线性时间内解决问题。我们还表明,如果输入数据有误差(无论要构建的系统发育是否均具有界限程度),则该问题是NP-HARD。我们认为的第三个问题是DNA序列对齐与反转和逆转的问题。以前,尽管实践中有真实的,但在顺序一致性上并没有认真考虑反转和逆转。唯一已知的算法先前在O(n^2m^2)时间内运行并消耗O(n^2m^2)空间,其中N和M分别是两个输入序列的长度。我们为此问题设计了一种空间效率的算法,该算法仅消耗相同时间的O(nm)空间。我们的算法可以在普通台式计算机上进行一对长达10,000的DNA序列的计算。较少的

项目成果

期刊论文数量(62)
专著数量(0)
科研奖励数量(0)
会议论文数量(0)
专利数量(0)
Zhi-Zhong Chen: "Tight Upper Bound on the Number of Edges in a Bipartite K_<3,3>-free or K_5-free Graph with an Application"Information Processing Letters. 84・3. 141-145 (2002)
陈志忠:“二部 K_<3,3> 无或 K_5 无图边数的紧上界及其应用”信息处理快报 84・3 (2002)。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Zhi-Zhong Chen: "Computing Phylogenetic Roots with Bounded Degrees and Errors"SIAM Journal on Computing. (In press).
陈志忠:“计算有界度和误差的系统发育根”SIAM 计算杂志。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Approximation algorithms for NMR spectral peak assignment
  • DOI:
    10.1016/s0304-3975(02)00086-5
  • 发表时间:
    2003-04-18
  • 期刊:
  • 影响因子:
    1.1
  • 作者:
    Chen, ZZ;Jiang, T;Xu, Y
  • 通讯作者:
    Xu, Y
Zhi-Zhong Chen: "Better Approximation Algorithms for NMR Spectral Peak Assignment"Lecture Notes in Computer Science. 2452. 82-96 (2002)
陈志忠:“核磁共振谱峰分配的更好近似算法”计算机科学讲义。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
Zhi-Zhong Chen: "More reliable protein NMR peak assignment via improved 2-interval scheduling"Lecture Notes in Computer Science. 2832. 580-592 (2003)
陈志忠:“通过改进的 2 间隔调度更可靠的蛋白质 NMR 峰分配”计算机科学讲义。
  • DOI:
  • 发表时间:
  • 期刊:
  • 影响因子:
    0
  • 作者:
  • 通讯作者:
{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

数据更新时间:{{ journalArticles.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ monograph.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ sciAawards.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ conferencePapers.updateTime }}

{{ item.title }}
  • 作者:
    {{ item.author }}

数据更新时间:{{ patent.updateTime }}

CHEN Zhi-zhong其他文献

CHEN Zhi-zhong的其他文献

{{ item.title }}
{{ item.translation_title }}
  • DOI:
    {{ item.doi }}
  • 发表时间:
    {{ item.publish_year }}
  • 期刊:
  • 影响因子:
    {{ item.factor }}
  • 作者:
    {{ item.authors }}
  • 通讯作者:
    {{ item.author }}

{{ truncateString('CHEN Zhi-zhong', 18)}}的其他基金

Hybrid Approaches to Computationally Hard Problems : Approximation, Randomization, and Parallelization
计算难题的混合方法:近似、随机化和并行化
  • 批准号:
    20500021
  • 财政年份:
    2008
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Grant-in-Aid for Scientific Research (C)

相似国自然基金

非线性随机微分方程的解及其不变概率测度的数值近似算法
  • 批准号:
  • 批准年份:
    2021
  • 资助金额:
    30 万元
  • 项目类别:
    青年科学基金项目
非线性随机微分方程的解及其不变概率测度的数值近似算法
  • 批准号:
    12101392
  • 批准年份:
    2021
  • 资助金额:
    24.00 万元
  • 项目类别:
    青年科学基金项目
随机多目标优化问题的近似算法研究
  • 批准号:
  • 批准年份:
    2020
  • 资助金额:
    24 万元
  • 项目类别:
    青年科学基金项目
非凸随机半定规划的SA算法研究及应用
  • 批准号:
    11701061
  • 批准年份:
    2017
  • 资助金额:
    25.0 万元
  • 项目类别:
    青年科学基金项目
部分集合多重覆盖问题的近似算法
  • 批准号:
    11771013
  • 批准年份:
    2017
  • 资助金额:
    48.0 万元
  • 项目类别:
    面上项目

相似海外基金

On the Study of Symbolic-Numeric Computation Using Randomized and/or Approximation Algorithms
关于使用随机和/或近似算法的符号数值计算的研究
  • 批准号:
    21K11760
  • 财政年份:
    2021
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Grant-in-Aid for Scientific Research (C)
CCF-BSF: AF: Small: New Randomized Approaches in Approximation Algorithms
CCF-BSF:AF:小:近似算法中的新随机方法
  • 批准号:
    1717947
  • 财政年份:
    2017
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Standard Grant
AHybrid Approach to Computationally Hard Problems : Combining Approximation, Parallelization, and Randomization
计算难题的混合方法:结合近似、并行化和随机化
  • 批准号:
    17500012
  • 财政年份:
    2005
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Grant-in-Aid for Scientific Research (C)
Randomized approximation algorithms for combinatorial optimization problems
组合优化问题的随机逼近算法
  • 批准号:
    36809-1997
  • 财政年份:
    2000
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Discovery Grants Program - Individual
Randomized approximation algorithms for combinatorial optimization problems
组合优化问题的随机逼近算法
  • 批准号:
    36809-1997
  • 财政年份:
    1999
  • 资助金额:
    $ 1.02万
  • 项目类别:
    Discovery Grants Program - Individual
{{ showInfoDetail.title }}

作者:{{ showInfoDetail.author }}

知道了