Journal of Artificial Intelligence Resarch: Volume 50の論文一覧

Journal of Artificial Intelligence Resarch Vol. 50 (2014)に記載されている内容を一覧にまとめ、機械翻訳を交えて日本語化し掲載します。

1 論文
2 参考文献
3 関連情報

論文

Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection

Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection / トピックベースの相違度および感度モデルを用いた翻訳規則選択

Translation rule selection is a task of selecting appropriate translation rules for an ambiguous source-language segment. As translation ambiguities are pervasive in statistical machine translation, we introduce two topic-based models for translation rule selection which incorporates global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions.With these topic distributions, we propose a topic dissimilarity model to select desirable (less dissimilar) rules by imposing penalties for rules with a large value of dissimilarity of their topic distributions to those of given documents. In order to encourage the use of non-topic specific translation rules, we also present a topic sensitivity model to balance translation rule selection between generic rules and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information of both the source and target language. We integrate the proposed topic dissimilarity and sensitivity model into hierarchical phrase-based machine translation for synchronous translation rule selection. Experiments show that our topic-based translation rule selection model can substantially improve translation quality.

翻訳規則選択は、曖昧な原言語セグメントに対して適切な翻訳規則を選択するタスクです。統計的機械翻訳においては翻訳の曖昧性が広く存在するため、本研究では、グローバルトピック情報を翻訳の曖昧性解消に組み込む2つのトピックベース翻訳規則選択モデルを導入します。各同期翻訳規則を、原言語側および訳言語側のトピック分布に関連付けます。これらのトピック分布を用いて、トピック分布と与えられた文書のトピック分布との非類似度が大きい規則にペナルティを課すことで、望ましい（より非類似度の低い）規則を選択するトピック非類似度モデルを提案します。トピック非特異性翻訳規則の使用を促進するため、汎用規則とトピック特異性規則の間で翻訳規則選択のバランスをとるトピックセンシティビティモデルも提示します。さらに、訳言語側のトピック分布を原言語側のトピックモデル空間に投影することで、原言語と訳言語の両方のトピック情報を活用できるようにします。提案するトピック非類似度およびセンシティビティモデルを、階層型フレーズベース機械翻訳に統合し、同期翻訳規則選択に利用します。実験により、トピックベースの翻訳ルール選択モデルは翻訳品質を大幅に向上できることが示されています。

Finding Optimal Solutions for Voting Game Design Problems

Finding Optimal Solutions for Voting Game Design Problems / 投票ゲーム設計問題における最適解の探索

In many circumstances where multiple agents need to make a joint decision, voting is used to aggregate the agents’ preferences. Each agent’s vote carries a weight, and if the sum of the weights of the agents in favor of some outcome is larger than or equal to a given quota, then this outcome is decided upon. The distribution of weights leads to a certain distribution of power. Several `power indices’ have been proposed to measure such power. In the so-called inverse problem, we are given a target distribution of power, and are asked to come up with a game in the form of a quota, plus an assignment of weights to the players whose power distribution is as close as possible to the target distribution (according to some specied distance measure).Here we study solution approaches for the larger class of voting game design (VGD) problems, one of which is the inverse problem. In the general VGD problem, the goal is to find a voting game (with a given number of players) that optimizes some function over these games. In the inverse problem, for example, we look for a weighted voting game that minimizes the distance between the distribution of power among the players and a given target distribution of power (according to a given distance measure). Our goal is to find algorithms that solve voting game design problems exactly, and we approach this goal by enumerating all games in the class of games of interest. We first present a doubly exponential algorithm for enumerating the set of simple games. We then improve on this algorithm for the class of weighted voting games and obtain a quadratic exponential (i.e., 2^O(n^2)) algorithm for enumerating them. We show that this improved algorithm runs in output-polynomial time, making it the fastest possible enumeration algorithm up to a polynomial factor. Finally, we propose an exact anytime-algorithm that runs in exponential time for the power index weighted voting game design problem (the `inverse problem’). We implement this algorithm to find a weighted voting game with a normalized Banzhaf power distribution closest to a target power index, and perform experiments to obtain some insights about the set of weighted voting games. We remark that our algorithm is applicable to optimizing any exponential-time computable function, the distance of the normalized Banzhaf index to a target power index is merely taken as an example.

複数のエージェントが共同で意思決定を行う必要がある多くの状況において、エージェントの選好を集約するために投票が用いられます。各エージェントの投票には重みが与えられ、ある結果を支持するエージェントの重みの合計が所定の割り当て以上であれば、その結果が決定されます。重みの分布は、特定の力の分布につながります。このような力を測定するために、いくつかの「力指標」が提案されています。いわゆる逆問題では、目標となる力の分布が与えられ、割り当てという形でゲームを作成し、さらに、（指定された距離尺度に従って）目標分布に可能な限り近い力の分布を持つプレイヤーに重みを割り当てることが求められます。ここでは、より大規模な投票ゲーム設計（VGD）問題に対する解法を研究します。その一つが逆問題です。一般的なVGD問題では、これらのゲーム全体にわたって何らかの関数を最適化する投票ゲーム（プレイヤー数が与えられたもの）を見つけることが目標となります。例えば逆問題では、プレイヤー間の権力分布と、与えられた目標権力分布（与えられた距離尺度に従って）との間の距離を最小化する重み付き投票ゲームを探します。私たちの目標は、投票ゲーム設計問題を正確に解くアルゴリズムを見つけることであり、関心のあるゲームのクラスに属するすべてのゲームを列挙することでこの目標に近づきます。まず、単純なゲーム集合を列挙するための二重指数アルゴリズムを提示します。次に、このアルゴリズムを重み付き投票ゲームのクラス向けに改良し、それらを列挙するための二次指数（すなわち、2^O(n^2)）アルゴリズムを得ます。この改良アルゴリズムは出力多項式時間で実行され、多項式因数まで最速の列挙アルゴリズムであることを示します。最後に、指数時間で実行される、いつでも実行可能な正確なアルゴリズムを提案し、このアルゴリズムは、権力指数重み付き投票ゲーム設計問題（「逆問題」）を対象とします。このアルゴリズムを実装し、正規化されたバンザフ指数分布を持つ重み付き投票ゲームのうち、目標とするべき指数に最も近いものを見つけ、重み付き投票ゲームの集合に関する知見を得るための実験を行いました。なお、このアルゴリズムは指数時間で計算可能な任意の関数の最適化に適用可能です。正規化されたバンザフ指数と目標とするべき指数の距離は、あくまで例として挙げているに過ぎません。

Enhanced Partial Expansion A*

Enhanced Partial Expansion A* / 拡張部分展開A*

When solving instances of problem domains that feature a large branching factor, A* may generate a large number of nodes whose cost is greater than the cost of the optimal solution. We designate such nodes as surplus. Generating surplus nodes and adding them to the OPEN list may dominate both time and memory of the search. A recently introduced variant of A* called Partial Expansion A* (PEA*) deals with the memory aspect of this problem. When expanding a node n, PEA* generates all of its children and puts into OPEN only the children with f = f (n). n is re-inserted in the OPEN list with the f -cost of the best discarded child. This guarantees that surplus nodes are not inserted into OPEN.In this paper, we present a novel variant of A* called Enhanced Partial Expansion A* (EPEA*) that advances the idea of PEA* to address the time aspect. Given a priori domain- and heuristic- specific knowledge, EPEA* generates only the nodes with f = f(n). Although EPEA* is not always applicable or practical, we study several variants of EPEA*, which make it applicable to a large number of domains and heuristics. In particular, the ideas of EPEA* are applicable to IDA* and to the domains where pattern databases are traditionally used. Experimental studies show significant improvements in run-time and memory performance for several standard benchmark applications. We provide several theoretical studies to facilitate an understanding of the new algorithm.

大きな分岐係数を特徴とする問題領域のインスタンスを解く場合、A*は最適解のコストよりも大きなコストを持つノードを大量に生成することがあります。このようなノードを余剰ノードと呼びます。余剰ノードを生成してOPENリストに追加すると、検索の時間とメモリの両方が占有される可能性があります。最近導入されたA*の亜種であるPartial Expansion A* (PEA*)は、この問題のメモリの側面に対処します。ノードnを拡張する場合、PEA*はそのすべての子ノードを生成し、f = f (n)となる子ノードのみをOPENリストに追加します。nは、破棄された最良の子ノードのfコストとともにOPENリストに再挿入されます。これにより、余剰ノードがOPENリストに挿入されないことが保証されます。本稿では、PEA*の考え方を発展させて時間の側面に対処した、A*の新しい亜種であるEnhanced Partial Expansion A* (EPEA*)を紹介します。EPEA*は、領域およびヒューリスティックに固有の事前知識が与えられた場合、f = f(n)となるノードのみを生成します。EPEA*は常に適用可能または実用的であるとは限りませんが、本研究ではEPEA*のいくつかの変種を研究し、多数のドメインとヒューリスティックに適用できるようにします。特に、EPEA*のアイデアはIDA*と、パターンデータベースが伝統的に使用されているドメインに適用できます。実験的研究では、いくつかの標準的なベンチマークアプリケーションで実行時間とメモリパフォーマンスが大幅に向上することが示されています。新しいアルゴリズムの理解を促進するために、いくつかの理論的研究を提供します。

An Efficient Algorithm for Estimating State Sequences in Imprecise Hidden Markov Models

An Efficient Algorithm for Estimating State Sequences in Imprecise Hidden Markov Models / 不正確な隠れマルコフモデルにおける状態系列推定のための効率的なアルゴリズム

We present an efficient exact algorithm for estimating state sequences from outputs or observations in imprecise hidden Markov models (iHMMs). The uncertainty linking one state to the next, and that linking a state to its output, is represented by a set of probability mass functions instead of a single such mass function. We consider as best estimates for state sequences the maximal sequences for the posterior joint state model conditioned on the observed output sequence, associated with a gain function that is the indicator of the state sequence. This corresponds to and generalises finding the state sequence with the highest posterior probability in (precise-probabilistic) HMMs, thereby making our algorithm a generalisation of the one by Viterbi. We argue that the computational complexity of our algorithm is at worst quadratic in the length of the iHMM, cubic in the number of states, and essentially linear in the number of maximal state sequences. An important feature of our imprecise approach is that there may be more than one maximal sequence, typically in those instances where its precise-probabilistic counterpart is sensitive to the choice of prior. For binary iHMMs, we investigate experimentally how the number of maximal state sequences depends on the model parameters. We also present an application in optical character recognition, demonstrating that our algorithm can be usefully applied to robustify the inferences made by its precise-probabilistic counterpart.

不正確な隠れマルコフモデル（iHMM）の出力または観測から状態シーケンスを推定するための効率的で正確なアルゴリズムを紹介します。1つの状態から次の状態、および状態からその出力にリンクする不確実性は、単一の確率質量関数ではなく、一連の確率質量関数によって表されます。状態シーケンスの最良推定値として、観測された出力シーケンスを条件とする事後結合状態モデルの最大シーケンスと、状態シーケンスの指標であるゲイン関数を関連付けたものを検討します。これは、（高精度確率的）HMMにおいて事後確率が最高の状態シーケンスを見つけることに対応し、一般化することで、我々のアルゴリズムをViterbiのアルゴリズムの一般化としています。我々のアルゴリズムの計算量は、最悪の場合でもiHMMの長さの2乗、状態数の3乗、そして最大状態シーケンスの数に本質的に線形であると我々は主張します。我々の不高精度なアプローチの重要な特徴は、典型的には高精度確率的HMMが事前確率の選択に敏感な場合に、複数の最大シーケンスが存在する可能性があることです。バイナリiHMMについては、最大状態シーケンスの数がモデルパラメータにどのように依存するかを実験的に調査します。また、光学式文字認識への応用も示し、我々のアルゴリズムが高精度確率的HMMによる推論を堅牢化するために有効に適用できることを実証します。

Reconnection with the Ideal Tree: A New Approach to Real-Time Search

Reconnection with the Ideal Tree: A New Approach to Real-Time Search / 理想木との再接続：リアルタイム探索への新しいアプローチ

Many applications, ranging from video games to dynamic robotics, require solving single-agent, deterministic search problems in partially known environments under very tight time constraints. Real-Time Heuristic Search (RTHS) algorithms are specifically designed for those applications. As a subroutine, most of them invoke a standard, but bounded, search algorithm that searches for the goal. In this paper we present FRIT, a simple approach for single-agent deterministic search problems under tight constraints and partially known environments that unlike traditional RTHS does not search for the goal but rather searches for a path that connects the current state with a so-called ideal tree T . When the agent observes that an arc in the tree cannot be traversed in the actual environment, it removes such an arc from T and then carries out a reconnection search whose objective is to find a path between the current state and any node in T . The reconnection search is done using an algorithm that is passed as a parameter to FRIT. If such a parameter is an RTHS algorithm, then the resulting algorithm can be an RTHS algorithm. We show, in addition, that FRIT may be fed with a (bounded) complete blind-search algorithm. We evaluate our approach over grid pathfinding benchmarks including game maps and mazes. Our results show that FRIT, used with RTAA*, a standard RTHS algorithm, outperforms RTAA* significantly; by one order of magnitude under tight time constraints. In addition, FRIT(daRTAA*) substantially outperforms daRTAA*, a state-of-the-art RTHS algorithm, usually obtaining solutions 50% cheaper on average when performing the same search effort. Finally, FRIT(BFS), i.e., FRIT using breadth-first-search, obtains best-quality solutions when time is limited compared to Adaptive A* and Repeated A*. Finally we show that Bug2, a pathfinding-specific navigation algorithm, outperforms FRIT(BFS) when planning time is extremely limited, but when given more time, the situation reverses.

ビデオゲームから動的ロボットまで、多くのアプリケーションでは、非常に厳しい時間制約の下、部分的に既知の環境におけるシングルエージェントの決定論的探索問題を解決することが求められます。リアルタイムヒューリスティックサーチ(RTHS)アルゴリズムは、特にこれらのアプリケーション向けに設計されています。サブルーチンとして、そのほとんどは、目標を探索する標準的だが制限付きの探索アルゴリズムを呼び出します。この論文では、厳しい制約と部分的に既知の環境におけるシングルエージェントの決定論的探索問題に対するシンプルなアプローチであるFRITを紹介します。FRITは、従来のRTHSとは異なり、目標を探索するのではなく、現在の状態といわゆる理想ツリーTを接続するパスを探索します。エージェントは、実際の環境ではツリーのアークをトラバースできないことに気付くと、そのようなアークをTから削除し、現在の状態とT内の任意のノードとの間のパスを見つけることを目的とした再接続探索を実行します。再接続探索は、FRITにパラメータとして渡されるアルゴリズムを使用して行われます。このようなパラメータがRTHSアルゴリズムであれば、結果として得られるアルゴリズムもRTHSアルゴリズムになります。さらに、FRITに(有界)完全なブラインドサーチアルゴリズムを入力できることも示します。ゲームマップや迷路などのグリッドパスファインディングベンチマークでこのアプローチを評価します。結果によると、標準RTHSアルゴリズムであるRTAA*と共に使用したFRITは、厳しい時間制約下でRTAA*より1桁大幅に優れたパフォーマンスを発揮します。さらに、FRIT(daRTAA*)は最先端のRTHSアルゴリズムであるdaRTAA*よりも大幅に優れたパフォーマンスを発揮し、通常は同じ探索努力を実行した場合、平均で50%安価にソリューションを取得できます。最後に、FRIT(BFS)、つまり幅優先探索を使用するFRITは、Adaptive A*やRepeated A*と比較して、時間が制限されている場合に最高品質のソリューションを取得します。最後に、経路探索に特化したナビゲーションアルゴリズムであるBug2は、計画時間が極めて限られている場合にはFRIT(BFS)よりも優れた性能を示すが、計画時間をより長くすると状況は逆転することを示す。

Property Directed Reachability for Automated Planning

Property Directed Reachability for Automated Planning / 自動プランニングのためのプロパティ指向到達可能性

Property Directed Reachability (PDR) is a very promising recent method for deciding reachability in symbolically represented transition systems. While originally conceived as a model checking algorithm for hardware circuits, it has already been successfully applied in several other areas. This paper is the first investigation of PDR from the perspective of automated planning.Similarly to the planning as satisfiability paradigm, PDR draws its strength from internally employing an efficient SAT-solver. We show that most standard encoding schemes of planning into SAT can be directly used to turn PDR into a planning algorithm. As a non-obvious alternative, we propose to replace the SAT-solver inside PDR by a planning-specific procedure implementing the same interface. This SAT-solver free variant is not only more efficient, but offers additional insights and opportunities for further improvements. An experimental comparison to the state of the art planners finds it highly competitive, solving most problems on several domains.

特性指向到達可能性（PDR）は、記号的に表現された遷移システムにおける到達可能性を決定するための、非常に有望な最近の手法です。元々はハードウェア回路のモデル検査アルゴリズムとして考案されたが、既に他のいくつかの分野で成功を収めています。本論文は、自動計画の観点からPDRを初めて調査した論文です。充足可能性パラダイムとしての計画と同様に、PDRは効率的なSATソルバーを内部的に採用することでその強みを発揮します。計画をSATに変換する標準的なエンコード方式のほとんどが、PDRを計画アルゴリズムに変換するために直接使用できることを示す。自明ではない代替案として、PDR内のSATソルバーを、同じインターフェースを実装した計画固有の手順に置き換えることを提案します。このSATソルバーを必要としないバリアントは、より効率的であるだけでなく、さらなる改善のための追加の洞察と機会を提供します。最先端のプランナーと実験的に比較すると、このプランナーは非常に競争力があり、いくつかのドメインのほとんどの問題を解決できることがわかります。

Knowledge Forgetting in Answer Set Programming

Knowledge Forgetting in Answer Set Programming / 回答セットプログラミングにおける知識忘却

The ability of discarding or hiding irrelevant information has beenrecognized as an important feature for knowledge based systems, including answer set programming. The notion of strong equivalence in answer set programming plays an important role for different problems as it gives rise to a substitution principle and amounts to knowledge equivalence of logic programs. In this paper, we uniformly propose a semantic knowledge forgetting, called HT- and FLP-forgetting, for logic programs under stable model and FLP-stable model semantics, respectively. Our proposed knowledge forgetting discards exactly the knowledge of a logic program which is relevant to forgotten variables. Thus it preserves strong equivalence in the sense that strongly equivalent logic programs will remain strongly equivalent after forgetting the same variables. We show that this semantic forgetting result is always expressible; and we prove a representation theorem stating that the HT- and FLP-forgetting can be precisely characterized by Zhang-Zhou’s four forgetting postulates under the HT- and FLP-model semantics, respectively. We also reveal underlying connections between the proposed forgetting and the forgetting of propositional logic, and provide complexity results for decision problems in relation to the forgetting. An application of the proposed forgetting is also considered in a conflict solving scenario.

無関係な情報を破棄または隠蔽する能力は、解答セットプログラミングを含む知識ベースシステムの重要な機能として認識されています。解答セットプログラミングにおける強同値性の概念は、置換原理を導き、論理プログラムの知識同値性につながるため、様々な問題において重要な役割を果たしています。本稿では、安定モデル意味論およびFLP安定モデル意味論における論理プログラムに対し、それぞれHT-忘却およびFLP-忘却と呼ばれる意味的知識忘却を統一的に提案します。提案する知識忘却は、忘却された変数に関連する論理プログラムの知識を正確に破棄します。したがって、強同値な論理プログラムは、同じ変数を忘却した後も強同値性を維持するという意味で、強同値性を維持します。この意味的忘却の結果は常に表現可能であることを示す。そして、HT-忘却およびFLP-忘却が、それぞれHT-モデル意味論およびFLP-モデル意味論におけるZhang-Zhouの4つの忘却公理によって正確に特徴付けられることを示す表現定理を証明します。また、提案された忘却と命題論理の忘却との間に根底にある関連性を明らかにし、忘却に関連する意思決定問題における計算量に関する結果を提供します。提案された忘却の適用は、紛争解決シナリオにおいても検討されます。

Game-Theoretic Patrolling with Dynamic Execution Uncertainty and a Case Study on a Real Transit System

Game-Theoretic Patrolling with Dynamic Execution Uncertainty and a Case Study on a Real Transit System / 動的実行の不確実性を考慮したゲーム理論的パトロールと実際の交通システムにおけるケーススタディ

Attacker-Defender Stackelberg security games (SSGs) have emerged as an important research area in multi-agent systems. However, existing SSGs models yield fixed, static, schedules which fail in dynamic domains where defenders face execution uncertainty, i.e., in domains where defenders may face unanticipated disruptions of their schedules. A concrete example is an application involving checking fares on trains, where a defender’s schedule is frequently interrupted by fare evaders, making static schedules useless.To address this shortcoming, this paper provides four main contributions. First, we present a novel general Bayesian Stackelberg game model for security resource allocation in dynamic uncertain domains. In this new model, execution uncertainty is handled by using a Markov decision process (MDP) for generating defender policies. Second, we study the problem of computing a Stackelberg equilibrium for this game and exploit problem structure to reduce it to a polynomial-sized optimization problem. Shifting to evaluation, our third contribution shows, in simulation, that our MDP-based policies overcome the failures of previous SSG algorithms. In so doing, we can now build a complete system, that enables handling of schedule interruptions and, consequently, to conduct some of the first controlled experiments on SSGs in the field. Hence, as our final contribution, we present results from a real-world experiment on Metro trains in Los Angeles validating our MDP-based model, and most importantly, concretely measuring the benefits of SSGs for security resource allocation.

攻撃者-防御者スタックベルグセキュリティゲーム（SSG）は、マルチエージェントシステムにおける重要な研究分野として浮上しています。しかし、既存のSSGモデルは固定的で静的なスケジュールを生成するため、防御側が実行の不確実性に直面する動的な領域、すなわち防御側が予期せぬスケジュールの混乱に直面する可能性のある領域では機能しない。具体的な例として、電車の運賃チェックを含むアプリケーションが挙げられます。この場合、防御側のスケジュールは運賃逃れ者によって頻繁に中断されるため、静的なスケジュールは役に立たない。この欠点に対処するため、本論文は4つの主要な貢献を行う。まず、動的な不確実性を伴う領域におけるセキュリティリソースの割り当てのための、新しい一般ベイズスタックベルグゲームモデルを提示します。この新しいモデルでは、実行の不確実性は、防御側のポリシー生成にマルコフ決定過程（MDP）を用いることで処理されます。第二に、このゲームのスタックベルク均衡を計算する問題を研究し、問題構造を利用してそれを多項式サイズの最適化問題に簡約します。評価に移ると、第三の貢献では、シミュレーションにおいて、MDPベースのポリシーが従来のSSGアルゴリズムの欠点を克服することを示します。これにより、スケジュールの中断に対処できる完全なシステムを構築できるようになり、結果として、現場でSSGに関する最初の制御された実験のいくつかを実施できるようになりました。したがって、最後の貢献として、ロサンゼルスの地下鉄電車での実際の実験の結果を示し、MDPベースのモデルを検証し、最も重要なこととして、セキュリティリソースの割り当てにおけるSSGの利点を具体的に測定しました。

HC-Search: A Learning Framework for Search-based Structured Prediction

HC-Search: A Learning Framework for Search-based Structured Prediction / HC-Search：探索に基づく構造化予測のための学習フレームワーク

Structured prediction is the problem of learning a function that maps structured inputs to structured outputs. Prototypical examples of structured prediction include part-of-speech tagging and semantic segmentation of images. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then employs a separate learned cost function C to select a final prediction among those outputs. The overall loss of this prediction architecture decomposes into the loss due to H not leading to high quality outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decomposition, we minimize the overall loss in a greedy stage-wise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H according to their true losses. Importantly, this training procedure is sensitive to the particular loss function of interest and the time-bound allowed for predictions. Experiments on several benchmark domains show that our approach significantly outperforms several state-of-the-art methods.

構造化予測は、構造化された入力を構造化された出力にマッピングする関数を学習する問題です。構造化予測の典型的な例としては、品詞タグ付けや画像の意味的セグメンテーションなどがあります。検索ベースの構造化予測の最近の成功に触発され、HC-Searchと呼ばれる構造化予測のための新しいフレームワークを紹介します。構造化された入力が与えられると、フレームワークは学習済みのヒューリスティックHに導かれる検索手順を使用して高品質の候補出力を発見し、次に別の学習済みコスト関数Cを使用してそれらの出力の中から最終的な予測を選択します。この予測アーキテクチャの全体的な損失は、Hが高品質の出力につながらないことによる損失と、Cが生成された出力の中から最良のものを選択しないことによる損失に分解されます。この分解に従って、最初に模倣学習によって高品質の出力をすばやく発見するようにHをトレーニングし、次にHによって生成された出力を実際の損失に従って正しくランク付けするようにCをトレーニングすることで、段階的に貪欲に全体的な損失を最小化します。重要なのは、このトレーニング手順が、対象となる特定の損失関数と予測に許可されている時間制限に敏感であることです。いくつかのベンチマークドメインでの実験では、私たちのアプローチがいくつかの最先端手法を大幅に上回っていることが示されています。

A Multivariate Complexity Analysis of Lobbying in Multiple Referenda

A Multivariate Complexity Analysis of Lobbying in Multiple Referenda / 複数の住民投票におけるロビー活動の多変量複雑性分析

Assume that each of n voters may or may not approve each of m issues. If an agent (the lobby) may influence up to k voters, then the central question of the NP-hard Lobbying problem is whether the lobby can choose the voters to be influenced so that as a result each issue gets a majority of approvals. This problem can be modeled as a simple matrix modification problem: Can one replace k rows of a binary n x m-matrix by k all-1 rows such that each column in the resulting matrix has a majority of 1s? Significantly extending on previous work that showed parameterized intractability (W[2]-completeness) with respect to the number k of modified rows, we study how natural parameters such as n, m, k, or the “maximum number of 1s missing for any column to have a majority of 1s” (referred to as “gap value g”) govern the computational complexity of Lobbying. Among other results, we prove that Lobbying is fixed-parameter tractable for parameter m and provide a greedy logarithmic-factor approximation algorithm which solves Lobbying even optimally if m < 5. We also show empirically that this greedy algorithm performs well on general instances. As a further key result, we prove that Lobbying is LOGSNP-complete for constant values g>0, thus providing a first natural complete problem from voting for this complexity class of limited nondeterminism.

n人の有権者それぞれが、m個の議題それぞれを承認するかどうかは任意であると仮定します。エージェント（ロビー活動）が最大k人の有権者に影響を与えることができる場合、NP困難であるロビー活動問題における中心的な問いは、ロビー活動が影響を与えるべき有権者を選択し、結果として各議題が過半数の承認を得られるかどうかです。この問題は、単純な行列修正問題としてモデル化できます。すなわち、n×m行列のk行を、すべて1のk行に置き換え、結果の行列の各列の過半数が1になるようにできるか、という問題です。修正された行数kに関してパラメータ化された扱いにくさ（W[2]完全性）を示した先行研究を大幅に拡張し、n、m、k、あるいは「任意の列の過半数が1となるために必要な1の最大欠落数」（「ギャップ値g」と呼ばれる）といった自然パラメータが、ロビー活動の計算複雑性をどのように支配するかを考察します。その他の結果の中で、我々はロビー活動がパラメータmに対して固定パラメータで扱いやすいことを証明し、m < 5の場合でもロビー活動を最適に解く貪欲対数係数近似アルゴリズムを提供します。また、この貪欲アルゴリズムが一般的なインスタンスで良好に機能することを経験的に示す。さらなる重要な結果として、我々はロビー活動が定数値g>0に対してLOGSNP完全であることを証明し、これにより、この限定された非決定性の複雑性クラスに対する投票から得られる最初の自然完全問題を提供します。

Monotone Temporal Planning: Tractability, Extensions and Applications

Monotone Temporal Planning: Tractability, Extensions and Applications / 単調な時間計画：扱いやすさ、拡張、および応用

This paper describes a polynomially-solvable class of temporal planning problems. Polynomiality follows from two assumptions. Firstly, by supposing that each sub-goal fluent can be established by at most one action, we can quickly determine which actions are necessary in any plan. Secondly, the monotonicity of sub-goal fluents allows us to express planning as an instance of STP≠ (Simple Temporal Problem with difference constraints). This class includes temporally-expressive problems requiring the concurrent execution of actions, with potential applications in the chemical, pharmaceutical and construction industries. We also show that any (temporal) planning problem has a monotone relaxation which can lead to the polynomial-time detection of its unsolvability in certain cases. Indeed we show that our relaxation is orthogonal to relaxations based on the ignore-deletes approach used in classical planning since it preserves deletes and can also exploit temporal information.

本論文では、多項式的に解ける時間的計画問題のクラスについて説明します。多項式性は2つの仮定から導かれます。第1に、各サブゴールフルーエントが最大で1つのアクションによって確立できると仮定することにより、どの計画に必要なアクションを迅速に決定できます。第2に、サブゴールフルーエントの単調性により、計画活動をSTP≠(差分制約付きの単純な時間的問題)のインスタンスとして表現できます。このクラスには、アクションの同時実行を必要とする時間的に表現力のある問題が含まれ、化学、製薬、建設業界への応用が期待されます。また、あらゆる（時間的）計画問題には単調な緩和法が存在し、特定のケースでは多項式時間で解決不可能であることが検出できることを示します。実際、この緩和法は、削除を維持し、時間情報も活用できるため、従来の計画で用いられる削除無視アプローチに基づく緩和法と直交していることを示します。

Improving Delete Relaxation Heuristics Through Explicitly Represented Conjunctions

Improving Delete Relaxation Heuristics Through Explicitly Represented Conjunctions / 明示的に表現された接続詞による削除緩和ヒューリスティックの改善

Heuristic functions based on the delete relaxation compute upper and lower bounds on the optimal delete-relaxation heuristic h+, and are of paramount importance in both optimal and satisficing planning. Here we introduce a principled and flexible technique for improving h+, by augmenting delete-relaxed planning tasks with a limited amount of delete information. This is done by introducing special fluents that explicitly represent conjunctions of fluents in the original planning task, rendering h+ the perfect heuristic h* in the limit. Previous work has introduced a method in which the growth of the task is potentially exponential in the number of conjunctions introduced. We formulate an alternative technique relying on conditional effects, limiting the growth of the task to be linear in this number. We show that this method still renders h+ the perfect heuristic h* in the limit. We propose techniques to find an informative set of conjunctions to be introduced in different settings, and analyze and extend existing methods for lower-bounding and upper-bounding h+ in the presence of conditional effects. We evaluate the resulting heuristic functions empirically on a set of IPC benchmarks, and show that they are sometimes much more informative than standard delete-relaxation heuristics.

削除緩和法に基づくヒューリスティック関数は、最適な削除緩和ヒューリスティックh+の上限と下限を計算し、最適計画と満足化計画の両方において極めて重要です。本稿では、削除緩和された計画タスクに限られた量の削除情報を付加することで、h+を改善するための原理的かつ柔軟な手法を紹介します。これは、元の計画タスクにおけるフルーエントの連言を明示的に表現する特別なフルーエントを導入することで実現され、h+は極限において完璧なヒューリスティックh*となります。先行研究では、タスクの増加が導入する接続詞の数に対して潜在的に指数関数的となる手法が紹介されています。本稿では、条件効果に依存し、タスクの増加がこの数に対して線形となるように制限する代替手法を定式化します。この手法によっても、h+は極限において完全なヒューリスティックh*となることを示す。本稿では、様々な設定において導入すべき有益な接続詞の集合を見つける手法を提案し、条件効果が存在する場合のh+の下限値と上限値を求める既存の手法を分析・拡張します。得られたヒューリスティック関数を一連のIPCベンチマークで経験的に評価し、標準的な削除緩和ヒューリスティックよりもはるかに有益となる場合があることを示す。

Integrating Queueing Theory and Scheduling for Dynamic Scheduling Problems

Integrating Queueing Theory and Scheduling for Dynamic Scheduling Problems / 動的スケジューリング問題における待ち行列理論とスケジューリングの統合

Dynamic scheduling problems consist of both challenging combinatorics, as found in classical scheduling problems, and stochastics due to uncertainty about the arrival times, resource requirements, and processing times of jobs. To address these two challenges, we investigate the integration of queueing theory and scheduling. The former reasons about long-run stochastic system characteristics, whereas the latter typically deals with short-term combinatorics. We investigate two simple problems to isolate the core differences and potential synergies between the two approaches: a two-machine dynamic flowshop and a flexible queueing network. We show for the first time that stability, a fundamental characteristic in queueing theory, can be applied to approaches that periodically solve combinatorial scheduling problems. We empirically demonstrate that for a dynamic flowshop, the use of combinatorial reasoning has little impact on schedule quality beyond queueing approaches. In contrast, for the more complicated flexible queueing network, a novel algorithm that combines long-term guidance from queueing theory with short-term combinatorial decision making outperforms all other tested approaches. To our knowledge, this is the first time that such a hybrid of queueing theory and scheduling techniques has been proposed and evaluated.

動的スケジューリング問題は、古典的なスケジューリング問題に見られるような困難な組合せ論と、ジョブの到着時刻、資源要件、処理時間に関する不確実性に起因する確率論の両方から構成されます。これらの2つの課題に対処するため、我々は待ち行列理論とスケジューリングの統合を研究します。前者は長期的な確率的システム特性について推論を行うのに対し、後者は典型的には短期的な組合せ論を扱います。2つのアプローチの核心的な相違点と潜在的な相乗効果を明らかにするために、2つの単純な問題、すなわち2マシンの動的フローショップと柔軟な待ち行列ネットワークを研究します。我々は、待ち行列理論における基本的な特性である安定性が、組合せスケジューリング問題を周期的に解くアプローチに適用できることを初めて示します。動的フローショップにおいては、組合せ推論の使用は待ち行列アプローチ以外のスケジュール品質にほとんど影響を与えないことを経験的に示します。対照的に、より複雑な柔軟な待ち行列ネットワークにおいては、待ち行列理論からの長期的なガイダンスと短期的な組合せ的意思決定を組み合わせた新しいアルゴリズムが、他のすべてのテスト済みアプローチよりも優れた性能を示します。私たちの知る限り、このようなキューイング理論とスケジューリング技術のハイブリッドが提案され、評価されたのはこれが初めてです。

False-Name Manipulation in Weighted Voting Games is Hard for Probabilistic Polynomial Time

False-Name Manipulation in Weighted Voting Games is Hard for Probabilistic Polynomial Time / 重み付き投票ゲームにおける偽名操作は確率多項式時間では困難

False-name manipulation refers to the question of whether a player in a weighted voting game can increase her power by splitting into several players and distributing her weight among these false identities. Relatedly, the beneficial merging problem asks whether a coalition of players can increase their power in a weighted voting game by merging their weights. For the problems of whether merging or splitting players in weighted voting games is beneficial in terms of the Shapley–Shubik and the normalized Banzhaf index, merely NP-hardness lower bounds are known, leaving the question about their exact complexity open. For the Shapley–Shubik and the probabilistic Banzhaf index, we raise these lower bounds to hardness for PP, “probabilistic polynomial time,” a class considered to be by far a larger class than NP. For both power indices, we provide matching upper bounds for beneficial merging and, whenever the new players’ weights are given, also for beneficial splitting, thus resolving previous conjectures in the affirmative. Relatedly, we consider the beneficial annexation problem, asking whether a single player can increase her power by taking over other players’ weights. It is known that annexation is never disadvantageous for the Shapley–Shubik index, and that beneficial annexation is NP-hard for the normalized Banzhaf index. We show that annexation is never disadvantageous for the probabilistic Banzhaf index either, and for both the Shapley–Shubik index and the probabilistic Banzhaf index we show that it is NP-complete to decide whether annexing another player is advantageous. Moreover, we propose a general framework for merging and splitting that can be applied to different classes and representations of games.

偽名操作とは、重み付け投票ゲームにおいて、プレイヤーが複数のプレイヤーに分裂し、それらの偽名に重みを分配することで、自身の権力を増大させることができるかどうかという問題を指します。関連して、有益な合併問題とは、プレイヤーの連合が重み付け投票ゲームにおいて、重み付けを合併することで権力を増大させることができるかどうかを問うものです。重み付け投票ゲームにおいてプレイヤーを合併させるか分裂させるかという問題については、シャプレー・シュビック法と正規化バンザフ指数の観点から、NP困難性の下限値のみが分かっており、その正確な計算量については未解決のままです。シャプレー・シュビック法と確率的バンザフ指数については、これらの下限値を「確率的多項式時間」であるPPの困難性まで引き上げます。PPはNPよりもはるかに大きなクラスと考えられています。両方のパワー指数について、有益な合併の一致する上限を提供し、新しいプレーヤーの重みが与えられている場合は常に有益な分割の一致する上限も提供することで、以前の推測を肯定的に解決します。関連して、有益な併合問題、つまり、1人のプレーヤーが他のプレーヤーの重みを引き継ぐことで自分のパワーを増やすことができるかどうかを考える問題を考察します。併合はShapley-Shubik指数にとって決して不利にならないこと、そして有益な併合は正規化Banzhaf指数にとってNP困難であることが知られています。確率的Banzhaf指数にとっても併合は決して不利にならないことを示します。また、Shapley-Shubik指数と確率的Banzhaf指数の両方について、他のプレーヤーを併合することが有利かどうかを判断することはNP完全であることを示します。さらに、ゲームのさまざまなクラスと表現に適用できる、合併と分割の一般的なフレームワークを提案します。

Probabilistic Inference in Credal Networks: New Complexity Results

Probabilistic Inference in Credal Networks: New Complexity Results / 信条ネットワークにおける確率推論：新たな複雑性の結果

Credal networks are graph-based statistical models whose parameters take values in a set, instead of being sharply specified as in traditional statistical models (e.g., Bayesian networks). The computational complexity of inferences on such models depends on the irrelevance/independence concept adopted. In this paper, we study inferential complexity under the concepts of epistemic irrelevance and strong independence. We show that inferences under strong independence are NP-hard even in trees with binary variables except for a single ternary one. We prove that under epistemic irrelevance the polynomial-time complexity of inferences in credal trees is not likely to extend to more general models (e.g., singly connected topologies). These results clearly distinguish networks that admit efficient inferences and those where inferences are most likely hard, and settle several open questions regarding their computational complexity. We show that these results remain valid even if we disallow the use of zero probabilities. We also show that the computation of bounds on the probability of the future state in a hidden Markov model is the same whether we assume epistemic irrelevance or strong independence, and we prove a similar result for inference in naive Bayes structures. These inferential equivalences are important for practitioners, as hidden Markov models and naive Bayes structures are used in real applications of imprecise probability.

信条ネットワークはグラフベースの統計モデルであり、そのパラメータは従来の統計モデル（ベイジアンネットワークなど）のように明確に指定されるのではなく、集合的な値をとる。このようなモデルにおける推論の計算複雑性は、採用される無関係性/独立性の概念に依存します。本稿では、認識論的無関係性と強い独立性の概念の下での推論複雑性について考察します。強い独立性の下では、1つの3値変数を除く2値変数を持つ木においても推論はNP困難であることを示す。認識論的無関係性の下では、信条木における推論の多項式時間計算複雑性は、より一般的なモデル（例えば、単連結トポロジー）には拡張されにくいことを証明します。これらの結果は、効率的な推論が可能なネットワークと推論が困難である可能性が高いネットワークを明確に区別し、計算複雑性に関するいくつかの未解決の疑問を解決します。これらの結果は、ゼロ確率の使用を禁止した場合でも有効であることを示す。また、隠れマルコフモデルにおける将来の状態の確率の境界値の計算は、認識論的無関係性を仮定した場合と強い独立性を仮定した場合のどちらでも同じであることを示し、ナイーブベイズ構造における推論についても同様の結果が得られることを証明します。隠れマルコフモデルとナイーブベイズ構造は、不正確な確率を扱う実際のアプリケーションで使用されるため、これらの推論の同値性は実務家にとって重要です。

Planning through Automatic Portfolio Configuration: The PbP Approach

Planning through Automatic Portfolio Configuration: The PbP Approach / 自動ポートフォリオ構成による計画：PbPアプローチ

In the field of domain-independent planning, several powerful planners implementing different techniques have been developed. However, no one of these systems outperforms all others in every known benchmark domain. In this work, we propose a multi-planner approach that automatically configures a portfolio of planning techniques for each given domain. The configuration process for a given domain uses a set of training instances to: (i) compute and analyze some alternative sets of macro-actions for each planner in the portfolio identifying a (possibly empty) useful set, (ii) select a cluster of planners, each one with the identified useful set of macro-actions, that is expected to perform best, and (iii) derive some additional information for configuring the execution scheduling of the selected planners at planning time. The resulting planning system, called PbP (Portfolio- based Planner), has two variants focusing on speed and plan quality. Different versions of PbP entered and won the learning track of the sixth and seventh International Planning Competitions. In this paper, we experimentally analyze PbP considering planning speed and plan quality in depth. We provide a collection of results that help to understand PbPs behavior, and demonstrate the effectiveness of our approach to configuring a portfolio of planners with macro-actions.

ドメイン非依存プランニングの分野では、異なる手法を実装した強力なプランナーが複数開発されています。しかし、これらのシステムのいずれも、既知のすべてのベンチマークドメインにおいて他のすべてのシステムよりも優れた性能を発揮しません。本研究では、各ドメインに対してプランニング手法のポートフォリオを自動的に構成するマルチプランナーアプローチを提案します。特定のドメインの構成プロセスでは、一連のトレーニングインスタンスを使用して、(i)ポートフォリオ内の各プランナーのマクロアクションの代替セットを計算および分析して(空の可能性のある)有用なセットを特定し、(ii)特定されたマクロアクションの有用なセットをそれぞれ持つ、最もパフォーマンスが優れていると予想されるプランナーのクラスターを選択し、(iii)プランニング時に選択されたプランナーの実行スケジュールを構成するための追加情報を導き出します。結果として得られるプランニングシステムはPbP (ポートフォリオベースプランナー)と呼ばれ、速度とプランの品質に重点を置いた2つのバリアントがあります。異なるバージョンのPbPが第6回および第7回国際プランニングコンペティションの学習トラックに出場し、優勝しました。本稿では、プランニングの速度とプランの品質を詳細に考慮してPbPを実験的に分析します。PbPの動作を理解するのに役立つ結果のコレクションを提供し、マクロアクションを持つプランナーのポートフォリオを構成する方法の有効性を示します。

MDD Propagation for Sequence Constraints

MDD Propagation for Sequence Constraints / シーケンス制約のためのMDD伝播

We study propagation for the Sequence constraint in the context of constraint programming based on limited-width MDDs. Our first contribution is proving that establishing MDD-consistency for Sequence is NP-hard. Yet, we also show that this task is fixed parameter tractable with respect to the length of the sub-sequences. In addition, we propose a partial filtering algorithm that relies on a specific decomposition of the constraint and a novel extension of MDD filtering to node domains. We experimentally evaluate the performance of our proposed filtering algorithm, and demonstrate that the strength of the MDD propagation increases as the maximum width is increased. In particular, MDD propagation can outperform conventional domain propagation for Sequence by reducing the search tree size and solving time by several orders of magnitude. Similar improvements are observed with respect to the current best MDD approach that applies the decomposition of Sequence into Among constraints.

我々は、制限された幅のMDDに基づく制約プログラミングの文脈におけるSequence制約の伝播を研究します。我々の最初の貢献は、Sequenceに対するMDD一貫性の確立がNP困難であることを証明したことだ。しかし、このタスクはサブシーケンスの長さに関して固定パラメータで実行可能であることも示す。さらに、制約の特定の分解と、ノードドメインへのMDDフィルタリングの新たな拡張に基づく部分フィルタリングアルゴリズムを提案します。提案するフィルタリングアルゴリズムの性能を実験的に評価し、最大幅が増加するにつれてMDD伝播の強度が増すことを示す。特に、MDD伝播は、探索木のサイズと解決時間を数桁削減することにより、Sequenceに対する従来のドメイン伝播よりも優れた性能を発揮できます。SequenceをAmong制約に分解する現在の最良のMDDアプローチに関しても同様の改善が見られます。

A Decision-Theoretic Model of Assistance

A Decision-Theoretic Model of Assistance / 支援の決定理論的モデル

There is a growing interest in intelligent assistants for a variety of applications from sorting email to helping people with disabilities to do their daily chores. In this paper, we formulate the problem of intelligent assistance in a decision-theoretic framework, and present both theoretical and empirical results. We first introduce a class of POMDPs called hidden-goal MDPs (HGMDPs), which formalizes the problem of interactively assisting an agent whose goal is hidden and whose actions are observable. In spite of its restricted nature, we show that optimal action selection for HGMDPs is PSPACE-complete even for deterministic dynamics. We then introduce a more restricted model called helper action MDPs (HAMDPs), which are sufficient for modeling many real-world problems. We show classes of HAMDPs for which efficient algorithms are possible. More interestingly, for general HAMDPs we show that a simple myopic policy achieves a near optimal regret, compared to an oracle assistant that knows the agent’s goal. We then introduce more sophisticated versions of this policy for the general case of HGMDPs that we combine with a novel approach for quickly learning about the agent being assisted. We evaluate our approach in two game-like computer environments where human subjects perform tasks, and in a real-world domain of providing assistance during folder navigation in a computer desktop environment. The results show that in all three domains the framework results in an assistant that substantially reduces user effort with only modest computation.

電子メールの仕分けから障害のある人の日常の家事支援まで、さまざまなアプリケーションにおけるインテリジェントアシスタントへの関心が高まっています。本稿では、意思決定理論の枠組みにおいて知的支援の問題を定式化し、理論的および実証的な結果を提示します。まず、隠れ目標MDP（HGMDP）と呼ばれるPOMDPのクラスを導入します。これは、目標が隠蔽され、行動が観測可能なエージェントを対話的に支援する問題を定式化します。HGMDPは制約が多いものの、決定論的ダイナミクスにおいても最適な行動選択がPSPACE完全であることを示す。次に、ヘルパーアクションMDP（HAMDP）と呼ばれる、より制約の厳しいモデルを導入します。これは、多くの現実世界の問題をモデル化するのに十分なモデルです。効率的なアルゴリズムが可能なHAMDPのクラスを示す。さらに興味深いことに、一般的なHAMDPにおいて、エージェントの目標を知っているオラクルアシスタントと比較して、単純な近視眼的なポリシーでほぼ最適なリグレットを達成できることを示す。次に、一般的なHGMDPの場合を想定し、このポリシーのより洗練されたバージョンを導入し、支援対象のエージェントについて迅速に学習する新しいアプローチと組み合わせる。我々は、被験者がタスクを実行する2つのゲームのようなコンピュータ環境と、コンピュータのデスクトップ環境でのフォルダナビゲーション中に支援を提供するという現実世界の領域で、このアプローチを評価しました。結果は、3つの領域すべてにおいて、フレームワークが、わずかな計算量でユーザーの労力を大幅に削減するアシスタントを生み出すことを示しています。

Sentiment Analysis of Short Informal Texts

Sentiment Analysis of Short Informal Texts / 短いインフォーマルテキストの感情分析

We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words.The system ranked first in the SemEval-2013 shared task `Sentiment Analysis in Twitter’ (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.

我々は、(a)ツイートやSMSなどの短い非公式テキストメッセージの感情（メッセージレベルのタスク）と(b)メッセージ内の単語またはフレーズの感情（用語レベルのタスク）を検出する最先端の感情分析システムについて説明します。このシステムは、さまざまな表層形式、意味、および感情の特徴を活用した教師あり統計テキスト分類アプローチに基づいています。感情の特徴は、主に、新しい高カバレッジのツイート固有の感情辞書から得られます。これらの辞書は、感情語ハッシュタグ付きのツイートと絵文字付きのツイートから自動的に生成されます。否定文脈における単語の感情を適切に捉えるため、否定語には別途感情辞書が生成されます。このシステムは、SemEval-2013の共通タスク「Twitterにおける感情分析」（タスク2）で1位を獲得し、メッセージレベルタスクでFスコア69.02、用語レベルタスクでFスコア88.93を獲得しました。競技後の改良により、パフォーマンスはFスコア70.45（メッセージレベルタスク）と89.50（用語レベルタスク）に向上しました。このシステムは、SemEval-2013 SMSテストセットと映画レビュー抜粋コーパスという2つの追加データセットでも最先端のパフォーマンスを達成しました。アブレーション実験では、自動生成辞書の使用により、最大6.5パーセントポイントのパフォーマンス向上が実証されました。

Policy Iteration Based on Stochastic Factorization

Policy Iteration Based on Stochastic Factorization / 確率的因子分解に基づく政策反復

When a transition probability matrix is represented as the product of two stochastic matrices, one can swap the factors of the multiplication to obtain another transition matrix that retains some fundamental characteristics of the original. Since the derived matrix can be much smaller than its precursor, this property can be exploited to create a compact version of a Markov decision process (MDP), and hence to reduce the computational cost of dynamic programming. Building on this idea, this paper presents an approximate policy iteration algorithm called policy iteration based on stochastic factorization, or PISF for short. In terms of computational complexity, PISF replaces standard policy iteration’s cubic dependence on the size of the MDP with a function that grows only linearly with the number of states in the model. The proposed algorithm also enjoys nice theoretical properties: it always terminates after a finite number of iterations and returns a decision policy whose performance only depends on the quality of the stochastic factorization. In particular, if the approximation error in the factorization is sufficiently small, PISF computes the optimal value function of the MDP. The paper also discusses practical ways of factoring an MDP and illustrates the usefulness of the proposed algorithm with an application involving a large-scale decision problem of real economical interest.

遷移確率行列が2つの確率行列の積として表される場合、乗算の係数を交換することで、元の行列の基本特性の一部を保持する別の遷移行列を取得できます。導出された行列は元の行列よりもはるかに小さくなる可能性があるため、この特性を利用してマルコフ決定過程(MDP)のコンパクト版を作成し、動的計画法の計算コストを削減できます。このアイデアを基に、本稿では、確率的因子分解に基づくポリシー反復(略してPISF)と呼ばれる近似ポリシー反復アルゴリズムを紹介します。計算の複雑さの観点から、PISFは、標準的なポリシー反復のMDPのサイズに対する3次依存性を、モデル内の状態数に対して線形にのみ増加する関数に置き換えます。提案されたアルゴリズムは優れた理論的特性も備えています。つまり、有限回の反復後に常に終了し、確率的因子分解の品質のみによってパフォーマンスが決まる決定ポリシーを返します。特に、因数分解における近似誤差が十分に小さい場合、PISFはMDPの最適値関数を計算します。本論文では、MDPを因数分解する実際的な方法についても議論し、提案アルゴリズムの有用性を、経済的に重要な大規模意思決定問題を含むアプリケーションで示します。

Speeding Up Iterative Ontology Alignment using Block-Coordinate Descent

Speeding Up Iterative Ontology Alignment using Block-Coordinate Descent / 高速化ブロック座標降下法を用いた反復的オントロジーアラインメント

In domains such as biomedicine, ontologies are prominently utilized for annotating data. Consequently, aligning ontologies facilitates integrating data. Several algorithms exist for automatically aligning ontologies with diverse levels of performance. As alignment applications evolve and exhibit online run time constraints, performing the alignment in a reasonable amount of time without compromising the quality of the alignment is a crucial challenge. A large class of alignment algorithms is iterative and often consumes more time than others in delivering solutions of high quality. We present a novel and general approach for speeding up the multivariable optimization process utilized by these algorithms. Specifically, we use the technique of block-coordinate descent (BCD), which exploits the subdimensions of the alignment problem identified using a partitioning scheme. We integrate this approach into multiple well-known alignment algorithms and show that the enhanced algorithms generate similar or improved alignments in significantly less time on a comprehensive testbed of ontology pairs. Because BCD does not overly constrain how we partition or order the parts, we vary the partitioning and ordering schemes in order to empirically determine the best schemes for each of the selected algorithms. As biomedicine represents a key application domain for ontologies, we introduce a comprehensive biomedical ontology testbed for the community in order to evaluate alignment algorithms. Because biomedical ontologies tend to be large, default iterative techniques find it difficult to produce a good quality alignment within a reasonable amount of time. We align a significant number of ontology pairs from this testbed using BCD-enhanced algorithms. Our contributions represent an important step toward making a significant class of alignment techniques computationally feasible.

バイオメディカルなどの分野では、オントロジーはデータの注釈付けに広く利用されています。そのため、オントロジーを整合させることでデータの統合が容易になります。オントロジーを自動的に整合させるアルゴリズムはいくつか存在し、その性能レベルは様々です。整合アプリケーションが進化し、オンライン実行時間の制約が生じるようになると、整合の品質を損なうことなく妥当な時間で整合を実行することが重要な課題となります。多くの整合アルゴリズムは反復的であり、高品質のソリューションを提供するのに他のアルゴリズムよりも時間がかかることがよくあります。本稿では、これらのアルゴリズムで利用される多変数最適化プロセスを高速化するための、新しく一般的なアプローチを紹介します。具体的には、ブロック座標降下法（BCD）を用います。これは、分割スキームを用いて特定されたアライメント問題のサブディメンションを活用します。このアプローチを複数の既知のアライメントアルゴリズムに統合し、強化されたアルゴリズムが、オントロジーペアの包括的なテストベッドにおいて、大幅に短縮された時間で、同様または改善されたアライメントを生成することを示します。BCDは、部分の分割や順序付けに過度な制約を与えないため、分割と順序付けのスキームを変化させ、選択されたアルゴリズムごとに最適なスキームを経験的に決定します。バイオメディカルはオントロジーの重要な応用分野であるため、アライメントアルゴリズムを評価するために、コミュニティ向けに包括的なバイオメディカルオントロジーテストベッドを導入します。バイオメディカルオントロジーは大規模になる傾向があるため、デフォルトの反復手法では、妥当な時間内に高品質のアライメントを生成することが困難です。このテストベッドから、BCD強化アルゴリズムを用いて、多数のオントロジーペアをアライメントします。私たちの貢献は、重要なクラスのアライメント手法を計算的に実現可能にするための重要な一歩となります。

Arbitration and Stability in Cooperative Games with Overlapping Coalitions

Arbitration and Stability in Cooperative Games with Overlapping Coalitions / 重複連合を持つ協力ゲームにおける調停と安定性

Overlapping Coalition Formation (OCF) games, introduced by Chalkiadakis, Elkind, Markakis, Polukarov and Jennings in 2010, are cooperative games where players can simultaneously participate in several coalitions. Capturing the notion of stability in OCF games is a difficult task:deviating players may continue to contribute resources to joint projects with non-deviators, and the crucial question is what payoffs the deviators expect to receive from such projects. Chalkiadakis et al. introduce three stability concepts for OCF games—the conservative core, the refined core, and the optimistic core—that are based on different answers to this question. In this paper, we propose a unified framework for the study of stability in the OCF setting, which encompasses the stability concepts considered by Chalkiadakis et al. as well as a wide variety of alternative stability concepts. Our approach is based on the notion of arbitration functions, which determine the payoff obtained by the deviators, given their deviation and the current allocation of resources. We provide a characterization of stable outcomes under arbitration. We then conduct an in-depth study of four types of arbitration functions, which correspond to four notions of the core; these include the three notions of the core considered by Chalkiadakis et al. Our results complement those of Chalkiadakis et al. and answer questions left open by their work. In particular, we show that OCF games with the conservative arbitration function are essentially equivalent to non-OCF games, by relating the conservative core of an OCF game to the core of a non-overlapping cooperative game, and use this result to obtain a strictly weaker sufficient condition for conservative core non-emptiness than the one given by Chalkiadakis et al.

重複連合形成（OCF）ゲームは、2010年にChalkiadakis、Elkind、Markakis、Polukarov、Jenningsによって導入され、プレイヤーが複数の連合に同時に参加できる協力ゲームです。OCFゲームにおける安定性の概念を捉えることは困難な作業です。逸脱プレイヤーは非逸脱プレイヤーとの共同プロジェクトにリソースを提供し続ける可能性があり、重要な問題は逸脱プレイヤーがそのようなプロジェクトからどのような利益を期待するかです。Chalkiadakisらは、この質問に対する異なる答えに基づいた、OCFゲームの3つの安定性概念（保守的コア、洗練コア、楽観的コア）を導入しました。本論文では、Chalkiadakisらによって検討された安定性概念だけでなく、さまざまな代替安定性概念を網羅した、OCF設定における安定性の研究のための統一的な枠組みを提案します。我々のアプローチは、逸脱者の逸脱と現在の資源配分を前提として、逸脱者が得る利得を決定する仲裁関数の概念に基づいています。我々は仲裁下における安定的な結果の特徴付けを行う。次に、コアの4つの概念に対応する4種類の仲裁関数について詳細な研究を行う。これには、Chalkiadakisらが考察したコアの3つの概念が含まれます。我々の結果はChalkiadakisらの結果と補完し、彼らの研究で未解決だった疑問に答えるものです。特に、OCFゲームの保守的なコアを非重複協力ゲームのコアに関連付けることで、保守的な仲裁関数を持つOCFゲームが非OCFゲームと本質的に等価であることを示す。そして、この結果を用いて、Chalkiadakisらが示したものよりも厳密に弱い、保守的なコアの非空性の十分条件を得る。

Demand Side Energy Management via Multiagent Coordination in Consumer Cooperatives

Demand Side Energy Management via Multiagent Coordination in Consumer Cooperatives / 消費者協同組合におけるマルチエージェント協調による需要側エネルギー管理

A key challenge in creating a sustainable and energy-efficient society is to make consumer demand adaptive to the supply of energy, especially to the renewable supply. In this article, we propose a partially-centralized organization of consumers (or agents), namely, a consumer cooperative that purchases electricity from the market. In the cooperative, a central coordinator buys the electricity for the whole group. The technical challenge is that consumers make their own demand decisions, based on their private demand constraints and preferences, which they do not share with the coordinator or other agents. We propose a novel multiagent coordination algorithm, to shape the energy demand of the cooperative. To coordinate individual consumers under incomplete information, the coordinator determines virtual price signals that it sends to the consumers to induce them to shift their demands when required. We prove that this algorithm converges to the central optimal solution and minimizes the electric energy cost of the cooperative. Additionally, we present results on the time complexity of the iterative algorithm and its implications for agents’ incentive compatibility. Furthermore, we perform simulations based on real world consumption data to (a) characterize the convergence properties of our algorithm and (b) understand the effect of differing demand characteristics of participants as well as of different price functions on the cost reduction. The results show that the convergence time scales linearly with the agent population size and length of the optimization horizon. Finally, we observe that as participants’ flexibility of shifting their demands increases, cost reduction increases and that the cost reduction is not sensitive to variation in consumption patterns of the consumers.

持続可能でエネルギー効率の高い社会を実現するための重要な課題は、消費者の需要をエネルギー供給、特に再生可能エネルギーの供給に適応させることです。本稿では、部分的に中央集権化された消費者（またはエージェント）組織、すなわち市場から電力を購入する消費者協同組合を提案します。この協同組合では、中央コーディネーターがグループ全体の電力を購入します。技術的な課題は、消費者がコーディネーターや他のエージェントと共有しない、自身の需要制約と嗜好に基づいて、独自の需要決定を行うことです。本稿では、協同組合のエネルギー需要を形成するための、新たなマルチエージェント協調アルゴリズムを提案します。不完全情報下で個々の消費者を調整するために、コーディネーターは仮想的な価格シグナルを決定し、必要に応じて消費者に需要の転換を促す。このアルゴリズムが中央最適解に収束し、協同組合の電力コストを最小化することを証明します。さらに、反復アルゴリズムの時間計算量と、それがエージェントのインセンティブ適合性に及ぼす影響についての結果を示す。さらに、実世界の消費データに基づくシミュレーションを実施し、(a)提案アルゴリズムの収束特性を明らかにし、(b)参加者の異なる需要特性と異なる価格関数がコスト削減に及ぼす影響を理解します。結果は、収束時間はエージェントの個体数と最適化期間の長さに比例することを示しています。最後に、参加者の需要シフトの柔軟性が高まるにつれてコスト削減が増加し、コスト削減は消費者の消費パターンの変動に敏感ではないことがわかる。

Belief Tracking for Planning with Sensing: Width, Complexity and Approximations

Belief Tracking for Planning with Sensing: Width, Complexity and Approximations / センシングを用いた計画立案のためのビリーフトラッキング：幅、複雑性、近似値

We consider the problem of belief tracking in a planning setting where states are valuations over a set of variables that are partially observable, and beliefs stand for the sets of states that are possible. While the problem is intractable in the worst case, it has been recently shown that in deterministic conformant and contingent problems, belief tracking is exponential in a width parameter that is often bounded and small. In this work, we extend these results in two ways. First, we introduce a width notion that applies to non-deterministic problems as well, develop a factored belief tracking algorithm that is exponential in the problem width, and show how it applies to existing benchmarks. Second, we introduce a meaningful, powerful, and sound approximation scheme, beam tracking, that is exponential in a smaller parameter, the problem causal width, and has much broader applicability. We illustrate the value of this algorithm over large instances of problems such as Battleship, Minesweeper, and Wumpus, where it yields state-of-the-art performance in real-time.

我々は、状態が部分的に観測可能な変数の集合に対する評価であり、信念が起こり得る状態の集合を表す計画設定における信念追跡の問題を考える。この問題は最悪のケースでは解決不可能であるが、最近、決定論的適合問題および偶発問題において、信念追跡はしばしば有界かつ小さい幅パラメータに関して指数関数的であることが示されました。本研究では、これらの結果を2つの方法で拡張します。まず、非決定論的問題にも適用される幅の概念を導入し、問題幅に関して指数関数的な因数分解された信念追跡アルゴリズムを開発し、それが既存のベンチマークにどのように適用されるかを示す。次に、ビームトラッキングという、意味があり強力で健全な近似手法を紹介します。この近似手法は、問題の因果幅という小さなパラメータに対して指数関数的に増加し、適用範囲がはるかに広くなります。このアルゴリズムの価値を、バトルシップ、マインスイーパー、ウンプスといった大規模な問題例で実証し、リアルタイムで最先端のパフォーマンスを実現します。

参考文献

Journal of Artificial Intelligence Resarch Vol. 50 (2014)

論文