Journal of Artificial Intelligence Resarch: Volume 49の論文一覧

Journal of Artificial Intelligence Resarch Vol. 49 (2014)に記載されている内容を一覧にまとめ、機械翻訳を交えて日本語化し掲載します。

1 論文
2 参考文献
3 関連情報

論文

Multimodal Distributional Semantics

Multimodal Distributional Semantics / マルチモーダル分布意味論

Distributional semantic models derive computational representations of word meaning from the patterns of co-occurrence of words in text. Such models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, distributional models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. We address the lack of perceptual grounding of distributional models by exploiting computer vision techniques that automatically identify discrete visual words in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of images it is associated with. We propose a flexible architecture to integrate text- and image-based distributional information, and we show in a set of empirical tests that our integrated model is superior to the purely text-based approach, and it provides somewhat complementary semantic information with respect to the latter.

分布意味モデルは、テキスト中の単語の共起パターンから単語の意味の計算的表現を導き出します。このようなモデルは、計算言語学における成功例であり、多くの意味的タスクにおいて意味的関連性の信頼性の高い推定値を提供することができます。しかし、分布モデルはテキストからのみ意味情報を抽出しますが、これは人間の意味的知識の基盤となる豊富な知覚的情報源と比較すると、極めて貧弱な基盤です。本研究では、画像内の個別の視覚的単語を自動的に識別するコンピュータービジョン技術を活用することで、分布モデルの知覚的基盤の欠如に対処します。これにより、単語の分布表現を拡張し、関連する画像の視覚的単語との共起も含めることができます。我々は、テキストベースと画像ベースの分布情報を統合する柔軟なアーキテクチャを提案し、一連の実証的テストにおいて、この統合モデルが純粋にテキストベースのアプローチよりも優れており、後者に関していくぶん補完的な意味情報を提供することを示す。

Towards Minimizing Disappointment in Repeated Games

Towards Minimizing Disappointment in Repeated Games / 繰り返しゲームにおける失望の最小化に向けて

We consider the problem of learning in repeated games against arbitrary associates. Specifically, we study the ability of expert algorithms to quickly learn effective strategies in repeated games, towards the ultimate goal of learning near-optimal behavior against any arbitrary associate within only a handful of interactions. Our contribution is three-fold. First, we advocate a new metric, called disappointment, for evaluating expert algorithms in repeated games. Unlike minimizing traditional notions of regret, minimizing disappointment in repeated games is equivalent to maximizing payoffs. Unfortunately, eliminating disappointment is impossible to guarantee in general. However, it is possible for an expert algorithm to quickly achieve low disappointment against many known classes of algorithms in many games. Second, we show that popular existing expert algorithms often fail to achieve low disappointment against a variety of associates, particularly in early rounds of the game. Finally, we describe a new meta-algorithm that can be applied to existing expert algorithms to substantially reduce disappointment in many two-player repeated games when associates follow various static, reinforcement learning, and expert algorithms.

我々は、任意の相手との繰り返しゲームにおける学習の問題を検討します。具体的には、エキスパートアルゴリズムが繰り返しゲームにおいて効果的な戦略を迅速に学習する能力を研究し、最終目標として、わずか数回のインタラクションで任意の相手に対してほぼ最適な行動を学習します。我々の貢献は3つあります。まず、繰り返しゲームにおけるエキスパートアルゴリズムを評価するための、失望と呼ばれる新しい指標を提唱します。従来の後悔の概念を最小化するのとは異なり、繰り返しゲームにおける失望を最小化することは、利得を最大化することと同義です。残念ながら、失望をなくすことは一般に保証できない。しかし、エキスパートアルゴリズムは、多くのゲームにおいて、多くの既知のアルゴリズムクラスに対して、低い失望を迅速に達成することができます。第二に、既存の一般的なエキスパートアルゴリズムは、特にゲームの初期ラウンドにおいて、多様な仲間に対して低い失望度を達成できないことが多いことを示す。最後に、既存のエキスパートアルゴリズムに適用可能な新しいメタアルゴリズムについて説明します。このメタアルゴリズムは、仲間が様々な静的アルゴリズム、強化学習アルゴリズム、エキスパートアルゴリズムに従う場合、多くの2人対戦繰り返しゲームにおいて失望度を大幅に低減します。

A Procedural Characterization of Solution Concepts in Games

A Procedural Characterization of Solution Concepts in Games / 手続き型特性評価ゲームにおける解概念の理解

We show how game-theoretic solution concepts such as Nash equilibrium, correlated equilibrium, rationalizability, and sequential equilibrium can be given a uniform definition in terms of a knowledge-based program with counterfactual semantics. In a precise sense, this program can be viewed as providing a procedural characterization of rationality.

ナッシュ均衡、相関均衡、合理化可能性、逐次均衡といったゲーム理論的解の概念が、反事実的意味論を持つ知識ベースプログラムによって統一的に定義されることを示す。厳密に言えば、このプログラムは合理性の手続き的特徴付けを提供するものと見なすことができます。

Representing and Reasoning About the Rules of General Games With Imperfect Information

Representing and Reasoning About the Rules of General Games With Imperfect Information / 不完全情報を伴う一般ゲームのルールの表現と推論

A general game player is a system that can play previously unknown games just by being given their rules. For this purpose, the Game Description Language (GDL) has been developed as a high-level knowledge representation formalism to communicate game rules to players. In this paper, we address a fundamental limitation of state-of-the-art methods and systems for General Game Playing, namely, their being confined to deterministic games with complete information about the game state. We develop a simple yet expressive extension of standard GDL that allows for formalising the rules of arbitrary finite, n-player games with randomness and incomplete state knowledge. In the second part of the paper, we address the intricate reasoning challenge for general game-playing systems that comes with the new description language. We develop a full embedding of extended GDL into the Situation Calculus augmented by Scherl and Levesque’s knowledge fluent. We formally prove that this provides a sound and complete reasoning method for players’ knowledge about game states as well as about the knowledge of the other players.

一般ゲームプレイヤーとは、ルールを与えられるだけで、これまで知られていなかったゲームをプレイできるシステムです。この目的のために、ゲーム記述言語（GDL）は、プレイヤーにゲームルールを伝えるための高水準知識表現形式として開発されました。本稿では、一般ゲームプレイのための最先端の方法とシステムの根本的な限界、すなわち、ゲームの状態に関する完全な情報を持つ決定論的ゲームに限定されているという限界に対処します。我々は、ランダム性と不完全な状態知識を持つ任意の有限n人プレイヤーゲームのルールを形式化することを可能にする、標準GDLのシンプルでありながら表現力豊かな拡張を開発します。本論文の後半では、新しい記述言語に伴う、一般的なゲームプレイングシステムにおける複雑な推論課題について論じます。ScherlとLevesqueの知識流暢性によって強化された状況計算への拡張GDLの完全埋め込みを開発します。これにより、プレイヤーのゲーム状態に関する知識だけでなく、他のプレイヤーの知識についても、健全かつ完全な推論手法が提供されることを正式に証明します。

Selfishness Level of Strategic Games

Selfishness Level of Strategic Games / 戦略ゲームの利己的レベル

We introduce a new measure of the discrepancy in strategic games between the social welfare in a Nash equilibrium and in a social optimum, that we call selfishness level. It is the smallest fraction of the social welfare that needs to be offered to each player to achieve that a social optimum is realized in a pure Nash equilibrium. The selfishness level is unrelated to the price of stability and the price of anarchy and is invariant under positive linear transformations of the payoff functions. Also, it naturally applies to other solution concepts and other forms of games.We study the selfishness level of several well-known strategic games. This allows us to quantify the implicit tension within a game between players’ individual interests and the impact of their decisions on the society as a whole. Our analyses reveal that the selfishness level often provides a deeper understanding of the characteristics of the underlying game that influence the players’ willingness to cooperate.In particular, the selfishness level of finite ordinal potential games is finite, while that of weakly acyclic games can be infinite. We derive explicit bounds on the selfishness level of fair cost sharing games and linear congestion games, which depend on specific parameters of the underlying game but are independent of the number of players. Further, we show that the selfishness level of the $n$-players Prisoner’s Dilemma is c/(b(n-1)-c), where b and c are the benefit and cost for cooperation, respectively, that of the n-players public goods game is (1-c/n)/(c-1), where c is the public good multiplier, and that of the Traveler’s Dilemma game is (b-1)/2, where b is the bonus. Finally, the selfishness level of Cournot competition (an example of an infinite ordinal potential game), Tragedy of the Commons, and Bertrand competition is infinite.

戦略ゲームにおけるナッシュ均衡と社会的最適状態における社会的厚生の乖離を測る新しい尺度を導入し、これを利己主義レベルと呼びます。これは、純粋ナッシュ均衡において社会的最適状態を実現するために各プレイヤーに提供する必要がある社会的厚生の最小値です。利己主義レベルは、安定の代償や無秩序の代償とは無関係であり、利己主義レベルは利得関数の正の線形変換に対して不変です。また、他の解概念や他の形式のゲームにも当然適用できます。いくつかのよく知られた戦略ゲームにおける利己主義レベルを研究します。これにより、ゲーム内におけるプレイヤーの個々の利益と、彼らの決定が社会全体に与える影響との間の暗黙の緊張を定量化することが可能になります。私たちの分析は、利己主義レベルが、プレイヤーの協力意欲に影響を与えるゲームの特性をより深く理解する上で役立つことが多いことを明らかにしました。特に、有限順序ポテンシャルゲームの利己主義レベルは有限であるのに対し、弱非巡回ゲームでは利己主義レベルは無限大になることがあります。公平な費用分担ゲームと線形混雑ゲームの利己主義レベルの明確な境界を導出します。これらの境界は、ゲーム固有のパラメータに依存しますが、プレイヤー数には依存しません。さらに、$n$人囚人のジレンマにおける利己主義水準はc/(b(n-1)-c)（bとcはそれぞれ協力の便益と費用）、n人公共財ゲームにおける利己主義水準は(1-c/n)/(c-1)（cは公共財乗数）、旅行者のジレンマゲームにおける利己主義水準は(b-1)/2（bはボーナス）であることを示す。最後に、クールノー競争（無限順序ポテンシャルゲームの一例）、コモンズの悲劇、そしてベルトラン競争における利己主義水準は無限です。

An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise

An Empirical Evaluation of Ranking Measures With Respect to Robustness to Noise / ノイズに対する頑健性に関するランキング尺度の実証的評価

Ranking measures play an important role in model evaluation and selection. Using both synthetic and real-world data sets, we investigate how different types and levels of noise affect the area under the ROC curve (AUC), the area under the ROC convex hull, the scored AUC, the Kolmogorov-Smirnov statistic, and the H-measure. In our experiments, the AUC was, overall, the most robust among these measures, thereby reinvigorating it as a reliable metric despite its well-known deficiencies. This paper also introduces a novel ranking measure, which is remarkably robust to noise yet conceptually simple.

ランキング尺度は、モデルの評価と選択において重要な役割を果たします。合成データセットと実世界データセットの両方を用いて、様々な種類とレベルのノイズがROC曲線下面積（AUC）、ROC凸包下面積、スコア付きAUC、コルモゴロフ・スミルノフ統計量、およびH尺度にどのように影響するかを調査します。実験では、AUCはこれらの尺度の中で全体的に最も堅牢であり、既知の欠陥にもかかわらず、信頼できる指標として再び注目を集めました。本論文では、ノイズに対して非常に堅牢でありながら概念的にシンプルな、新たなランキング尺度も紹介します。

Efficient HEX-Program Evaluation Based on Unfounded Sets

Efficient HEX-Program Evaluation Based on Unfounded Sets / 未根拠集合に基づく効率的なHEXプログラム評価

HEX-programs extend logic programs under the answer set semantics with external computations through external atoms. As reasoning from ground Horn programs with nonmonotonic external atoms of polynomial complexity is already on the second level of the polynomial hierarchy, minimality checking of answer set candidates needs special attention. To this end, we present an approach based on unfounded sets as a generalization of related techniques for ASP programs. The unfounded set detection is expressed as a propositional SAT problem, for which we provide two different encodings and optimizations to them. We then integrate our approach into a previously developed evaluation framework for HEX-programs, which is enriched by additional learning techniques that aim at avoiding the reconstruction of the same or related unfounded sets. Furthermore, we provide a syntactic criterion that allows one to skip the minimality check in many cases. An experimental evaluation shows that the new approach significantly decreases runtime.

HEXプログラムは、外部アトムを介した外部計算によって、解答集合セマンティクスに基づく論理プログラムを拡張します。多項式複雑度の非単調な外部アトムを持つグラウンドホーンプログラムからの推論は、既に多項式階層の第2レベルにあるため、解答集合候補の最小性チェックには特別な注意が必要です。この目的のために、ASPプログラムのための関連手法の一般化として、非根拠集合に基づくアプローチを提示します。根拠のない集合の検出は命題的SAT問題として表現され、我々はこれに対して2つの異なるエンコーディングと最適化を提供します。次に、我々のアプローチを、同一または関連する根拠のない集合の再構築を回避することを目的とした追加の学習手法によって強化された、以前に開発されたHEXプログラムの評価フレームワークに統合します。さらに、多くの場合に最小性チェックを省略できる構文基準を提供します。実験的評価では、この新しいアプローチによって実行時間が大幅に短縮されることが示されました。

Symmetric Subgame-Perfect Equilibria in Resource Allocation

Symmetric Subgame-Perfect Equilibria in Resource Allocation / 資源配分における対称的部分ゲーム完全均衡

We analyze symmetric protocols to rationally coordinate on an asymmetric, efficient allocation in an infinitely repeated N-agent, C-resource allocation problems, where the resources are all homogeneous. Bhaskar proposed one way to achieve this in 2-agent, 1-resource games: Agents start by symmetrically randomizing their actions, and as soon as they each choose different actions, they start to follow a potentially asymmetric “convention” that prescribes their actions from then on. We extend the concept of convention to the general case of infinitely repeated resource allocation games with N agents and C resources. We show that for any convention, there exists a symmetric subgame-perfect equilibrium which implements it. We present two conventions: bourgeois, where agents stick to the first allocation; and market, where agents pay for the use of resources, and observe a global coordination signal which allows them to alternate between different allocations. We define price of anonymity of a convention as a ratio between the maximum social payoff of any (asymmetric) strategy profile and the expected social payoff of the subgame-perfect equilibrium which implements the convention. We show that while the price of anonymity of the bourgeois convention is infinite, the market convention decreases this price by reducing the conflict between the agents.

我々は、Nエージェント、Cリソースの無限反復割り当て問題において、リソースがすべて均質である場合、非対称かつ効率的な割り当てについて合理的に調整するための対称プロトコルを解析します。Bhaskarは、2エージェント、1リソースのゲームでこれを実現する1つの方法を提案した。エージェントは、行動を対称的にランダム化することから始め、それぞれが異なる行動を選択するとすぐに、それ以降の行動を規定する潜在的に非対称な「規約」に従うようになります。我々は、慣習の概念を、N個のエージェントとC個のリソースによる無限に繰り返されるリソース割り当てゲームの一般的なケースに拡張します。任意の慣習に対して、それを実行する対称的なサブゲーム完全な均衡が存在することを示す。我々は2つの慣習を提示します。ブルジョア慣習では、エージェントは最初の割り当てに固執します。市場慣習では、エージェントはリソースの使用に対して支払いを行い、異なる割り当てを切り替えることを可能にするグローバル調整シグナルを観察します。我々は、慣習の匿名性の代償を、任意の(非対称)戦略プロファイルの最大社会的利得と、慣習を実行するサブゲーム完全な均衡の期待社会的利得との比率として定義します。ブルジョア慣習の匿名性の代償は無限であるが、市場慣習ではエージェント間の対立を減らすことによってこの代償が減少することを示す。

Multiagent Only Knowing in Dynamic Systems

Multiagent Only Knowing in Dynamic Systems / 動的システムにおけるマルチエージェントのOnly Knowing（唯一知識）

The idea of “only knowing” a collection of sentences, as proposed by Levesque, has been previously shown to be very useful in characterizing knowledge-based agents: in terms of a specification, a precise and perspicuous account of the beliefs and non-beliefs is obtained in a monotonic setting. Levesque’s logic is based on a first-order modal language with quantifying-in, thus allowing for de re versus de dicto distinctions, among other things. However, the logic and its recent dynamic extension only deal with the case of a single agent. In this work, we propose a first-order multiagent framework with knowledge, actions, sensing and only knowing, that is shown to inherit all the features of the single agent version. Most significantly, we prove reduction theorems by means of which reasoning about knowledge and actions in the framework simplifies to non-epistemic, non-dynamic reasoning about the initial situation.

Levesqueが提唱した「文の集合を「知ることのみ」」という概念は、知識ベースエージェントの特性評価において非常に有用であることが既に示されています。仕様記述の観点から、信念と非信念に関する正確かつ明瞭な記述が単調な設定で得られます。Levesqueの論理は、量化を伴う一階様相言語に基づいており、これによりde reとde dictoの区別などが可能になります。しかし、この論理とその最近の動的拡張は、単一エージェントの場合のみを扱う。本研究では、知識、行動、感知、そして知ることのみを備えた一階マルチエージェントフレームワークを提案します。このフレームワークは、単一エージェント版のすべての特徴を継承します。最も重要なのは、このフレームワークにおける知識と行動に関する推論が、初期状況に関する非認識論的かつ非動的推論へと単純化される縮約定理を証明することです。

Mechanisms for Fair Allocation Problems: No-Punishment Payment Rules in Verifiable Settings

Mechanisms for Fair Allocation Problems: No-Punishment Payment Rules in Verifiable Settings / 公平な割り当て問題のためのメカニズム：検証可能な設定における罰なし支払いルール

Mechanism design is considered in the context of fair allocations of indivisible goods with monetary compensation, by focusing on problems where agents’ declarations on allocated goods can be verified before payments are performed. A setting is considered where verification might be subject to errors, so that payments have to be awarded under the presumption of innocence, as incorrect declared values do not necessarily mean manipulation attempts by the agents. Within this setting, a mechanism is designed that is shown to be truthful, efficient, and budget-balanced. Moreover, agents’ utilities are fairly determined by the Shapley value of suitable coalitional games, and enjoy highly desirable properties such as equal treatment of equals, envy-freeness, and a stronger one called individual-optimality. In particular, the latter property guarantees that, for every agent, her/his utility is the maximum possible one over any alternative optimal allocation. The computational complexity of the proposed mechanism is also studied. It turns out that it is #P-complete so that, to deal with applications with many agents involved, two polynomial-time randomized variants are also proposed: one that is still truthful and efficient, and which is approximately budget-balanced with high probability, and another one that is truthful in expectation, while still budget-balanced and efficient.

金銭補償を伴う不可分財の公平な配分という文脈において、エージェントによる配分財に関する申告が支払い前に検証可能な問題に焦点を当て、メカニズム設計について考察します。検証に誤りが生じる可能性がある状況を想定し、申告値の誤りが必ずしもエージェントによる操作の試みを意味するとは限らないため、無罪推定に基づいて支払いが行われるようにします。この状況下で、誠実性、効率性、そして予算均衡が保たれることが示されるメカニズムが設計されます。さらに、エージェントの効用は適切な提携ゲームのシャプレー値によって公平に決定され、同等者への平等な扱い、非羨望性、そしてより強力な個人最適性といった非常に望ましい特性を備える。特に、後者の特性は、すべてのエージェントにとって、その効用がいかなる代替的な最適配分よりも最大となることを保証します。提案されたメカニズムの計算量についても考察します。これは#P完全であることが判明しているため、多くのエージェントが関与するアプリケーションに対応するために、2つの多項式時間ランダム化変種も提案されています。1つは真実性と効率性を維持し、高い確率で近似的に予算均衡がとれており、もう1つは期待値において真実性を維持しながら予算均衡がとれており効率的です。

Text-Based Twitter User Geolocation Prediction

Text-Based Twitter User Geolocation Prediction / テキストベースのTwitterユーザーの位置情報予測

Geographical location is vital to geospatial applications like local search and event detection. In this paper, we investigate and improve on the task of text-based geolocation prediction of Twitter users. Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its authors location. However, these references are often buried in informal, ungrammatical, and multilingual data, and are therefore non-trivial to identify and exploit. We present an integrated geolocation prediction framework and investigate what factors impact on prediction accuracy. First, we evaluate a range of feature selection methods to obtain location indicative words. We then evaluate the impact of non-geotagged tweets, language, and user-declared metadata on geolocation prediction. In addition, we evaluate the impact of temporal variance on model generalisation, and discuss how users differ in terms of their geolocatability.We achieve state-of-the-art results for the text-based Twitter user geolocation task, and also provide the most extensive exploration of the task to date. Our findings provide valuable insights into the design of robust, practical text-based geolocation prediction systems.

地理的位置は、ローカル検索やイベント検出などの地理空間アプリケーションにとって不可欠です。本稿では、Twitterユーザーのテキストベースの地理位置予測タスクを調査し、改善します。このトピックに関するこれまでの研究では、テキスト内の地理的参照(地名辞典用語、方言語など)が著者の所在地を示すと想定されることが一般的でした。しかし、これらの参照は非公式で文法的に正しくない多言語データに埋もれていることが多く、識別して活用するのは容易ではありません。本稿では、統合された地理位置予測フレームワークを提示し、予測精度に影響を与える要因を調査します。まず、所在地を示す単語を取得するためのさまざまな特徴選択手法を評価します。次に、ジオタグなしツイート、言語、およびユーザーが宣言したメタデータが地理位置情報予測に与える影響を評価します。さらに、時間的変動がモデルの一般化に与える影響を評価し、ユーザーの地理位置情報の取得可能性がどのように異なるかについて議論します。テキストベースのTwitterユーザー地理位置情報タスクにおいて最先端の結果を達成し、これまでで最も広範なタスクの調査結果も提供します。私たちの研究結果は、堅牢で実用的なテキストベースの地理位置情報予測システムの設計に貴重な洞察を提供します。

Robustness and Stability in Constraint Programming under Dynamism and Uncertainty

Robustness and Stability in Constraint Programming under Dynamism and Uncertainty / ダイナミズムと不確実性下における制約計画法の堅牢性と安定性

Many real life problems that can be solved by constraint programming, come from uncertain and dynamic environments. Because of the dynamism, the original problem may change over time, and thus the solution found for the original problem may become invalid. For this reason, dealing with such problems has become an important issue in the fields of constraint programming. In some cases, there exist extant knowledge about the uncertain and dynamic environment. In other cases, this information is fragmentary or unknown. In this paper, we extend the concept of robustness and stability for Constraint Satisfaction Problems (CSPs) with ordered domains, where only limited assumptions need to be made as to possible changes. We present a search algorithm that searches for both robust and stable solutions for CSPs of this nature. It is well-known that meeting both criteria simultaneously is a desirable objective for constraint solving in uncertain and dynamic environments. We also present compelling evidence that our search algorithm outperforms other general-purpose algorithms for dynamic CSPs using random instances and benchmarks derived from real life problems.

制約プログラミングによって解決できる多くの現実の問題は、不確実で動的な環境から生じます。そのダイナミズムのため、元の問題は時間の経過とともに変化する可能性があり、その結果、元の問題に対して見つかった解決策が無効になる可能性があります。このため、このような問題への対処は、制約プログラミングの分野において重要な課題となっています。場合によっては、不確実で動的な環境に関する既存の知識が存在することがあります。他の場合には、この情報は断片的であったり、未知であったりします。本稿では、順序付きドメインを持つ制約充足問題（CSP）における堅牢性と安定性の概念を拡張します。CSPでは、起こり得る変化について限定的な仮定のみで済みます。このような性質を持つCSPに対して、堅牢かつ安定した解を探索する探索アルゴリズムを提示します。不確実で動的な環境における制約解決において、両方の基準を同時に満たすことが望ましい目標であることはよく知られています。また、ランダムインスタンスと実問題から得られたベンチマークを用いて、本探索アルゴリズムが動的CSPに対する他の汎用アルゴリズムよりも優れていることを示す説得力のある証拠を示します。

Information-Theoretic Multi-view Domain Adaptation: A Theoretical and Empirical Study

Information-Theoretic Multi-view Domain Adaptation: A Theoretical and Empirical Study / 情報理論的マルチビュー領域適応：理論的および実証的研究

Multi-view learning aims to improve classification performance by leveraging the consistency among different views of data. The incorporation of multiple views was paid little attention in the studies of domain adaptation, where the view consistency based on source data is largely violated in the target domain due to the distribution gap between different domain data. In this paper, we leverage multiple views for cross-domain document classification. The central idea is to strengthen the views’ consistency on target data by identifying the associations of domain-specific features from different domains. We present an Information-theoretic Multi-view Adaptation Model (IMAM) using a multi-way clustering scheme, where word and link clusters can draw together seemingly unrelated features across domains, which boosts the consistency between document clusterings that are based on the respective word and link views. Moreover, we demonstrate that IMAM can always find the document clustering with the minimal disagreement rate to the overlap of view-based clusterings. We provide both theoretical and empirical justifications of the proposed method. Our experiments show that IMAM significantly outperforms traditional multi-view algorithm co-training, the co-training-based adaptation algorithm CODA, the single-view transfer model CoCC and the large-margin-based multi-view transfer model MVTL-LM.

マルチビュー学習は、異なるデータビュー間の一貫性を活用することで分類性能を向上させることを目指しています。ドメイン適応の研究では、複数のビューの組み込みはあまり考慮されていませんでした。ドメイン適応では、異なるドメインデータ間の分布のギャップにより、ソースデータに基づくビューの一貫性がターゲットドメインでは大きく破れてしまうからです。本稿では、クロスドメイン文書分類に複数のビューを活用します。中心的な考え方は、異なるドメインのドメイン固有の特徴の関連性を特定することで、ターゲットデータにおけるビューの一貫性を強化することです。本稿では、多元クラスタリング手法を用いた情報理論的マルチビュー適応モデル（IMAM）を提示します。この手法では、単語とリンクのクラスタリングによって、ドメイン間で一見無関係に見える特徴をまとめることができ、それぞれの単語とリンクのビューに基づく文書クラスタリング間の一貫性が向上します。さらに、IMAMは、ビューベースのクラスタリングの重複に対する不一致率が常に最小となる文書クラスタリングを見つけられることを示します。提案手法の理論的および実証的正当性を示します。実験では、IMAMが従来のマルチビューアルゴリズムのコトレーニング、コトレーニングベースの適応アルゴリズムCODA、シングルビュー転送モデルCoCC、およびラージマージンベースのマルチビュー転送モデルMVTL-LMを大幅に上回る性能を示すことが示されています。

Large-Scale Optimization for Evaluation Functions with Minimax Search

Large-Scale Optimization for Evaluation Functions with Minimax Search / ミニマックス探索を用いた評価関数の大規模最適化

This paper presents a new method, Minimax Tree Optimization (MMTO), to learn a heuristic evaluation function of a practical alpha-beta search program. The evaluation function may be a linear or non-linear combination of weighted features, and the weights are the parameters to be optimized. To control the search results so that the move decisions agree with the game records of human experts, a well-modeled objective function to be minimized is designed. Moreover, a numerical iterative method is used to nd local minima of the objective function, and more than forty million parameters are adjusted by using a small number of hyper parameters. This method was applied to shogi, a major variant of chess in which the evaluation function must handle a larger state space than in chess. Experimental results show that the large-scale optimization of the evaluation function improves the playing strength of shogi programs, and the new method performs signicantly better than other methods. Implementation of the new method in our shogi program Bonanza made substantial contributions to the program’s rst-place nish in the 2013 World Computer Shogi Championship. Additionally, we present preliminary evidence of broader applicability of our method to other two-player games such as chess.

本論文では、実用的なアルファベータ探索プログラムのヒューリスティック評価関数を学習するための新しい手法、ミニマックスツリー最適化（MMTO）を紹介します。評価関数は、重み付き特徴の線形または非線形の組み合わせであり、重みは最適化されるパラメータです。探索結果を制御して、着手決定が人間の専門家のゲーム記録と一致するように、最小化されるべき適切にモデル化された目的関数が設計されます。さらに、数値反復法を使用して目的関数の局所最小値を見つけ、少数のハイパーパラメータを使用して4000万を超えるパラメータを調整します。この手法は、評価関数がチェスよりも大きな状態空間を処理する必要があるチェスの主要な変種である将棋に適用されました。実験結果から、評価関数の大規模な最適化により将棋プログラムの棋力が向上し、新手法が他の手法よりも大幅に優れた性能を発揮することが示されました。この新手法を将棋プログラムBonanzaに実装したことで、2013年世界コンピュータ将棋選手権におけるプログラムの優勝に大きく貢献しました。さらに、チェスなどの他の2人用ゲームへの本手法の幅広い適用可能性を示す予備的な証拠も示します。

Inapproximability of Treewidth and Related Problems

Inapproximability of Treewidth and Related Problems / 木幅の近似不可能性と関連問題

Graphical models, such as Bayesian Networks and Markov networks play an important role in artificial intelligence and machine learning. Inference is a central problem to be solved on these networks. This, and other problems on these graph models are often known to be hard to solve in general, but tractable on graphs with bounded Treewidth. Therefore, finding or approximating the Treewidth of a graph is a fundamental problem related to inference in graphical models. In this paper, we study the approximability of a number of graph problems: Treewidth and Pathwidth of graphs, Minimum Fill-In, One-Shot Black (and Black-White) pebbling costs of directed acyclic graphs, and a variety of different graph layout problems such as Minimum Cut Linear Arrangement and Interval Graph Completion. We show that, assuming the recently introduced Small Set Expansion Conjecture, all of these problems are NP-hard to approximate to within any constant factor in polynomial time.

ベイジアンネットワークやマルコフネットワークなどのグラフィカルモデルは、人工知能や機械学習において重要な役割を果たしています。推論はこれらのネットワーク上で解決すべき中心的な問題です。この問題、そしてこれらのグラフモデルに関するその他の問題は、一般には解決が難しいことが知られていますが、ツリー幅が制限されたグラフでは扱いやすいことが知られています。したがって、グラフのツリー幅を見つけたり近似したりすることは、グラフィカルモデルにおける推論に関連する基本的な問題です。本論文では、グラフの木幅とパス幅、最小フィルイン問題、有向非巡回グラフのワンショット黒（および黒白）ペブリングコスト、そして最小カット線形配置や区間グラフ補完といった様々なグラフレイアウト問題など、様々なグラフ問題の近似可能性について考察します。最近発表された小集合拡張予想を仮定すると、これらの問題はすべて、任意の定数倍以内で多項式時間で近似することがNP困難であることを示す。

Algorithms and Applications for the Same-Decision Probability

Algorithms and Applications for the Same-Decision Probability / 同一決定確率のアルゴリズムと応用

When making decisions under uncertainty, the optimal choices are often difficult to discern, especially if not enough information has been gathered. Two key questions in this regard relate to whether one should stop the information gathering process and commit to a decision (stopping criterion), and if not, what information to gather next (selection criterion). In this paper, we show that the recently introduced notion, Same-Decision Probability (SDP), can be useful as both a stopping and a selection criterion, as it can provide additional insight and allow for robust decision making in a variety of scenarios. This query has been shown to be highly intractable, being PP^PP-complete, and is exemplary of a class of queries which correspond to the computation of certain expectations. We propose the first exact algorithm for computing the SDP, and demonstrate its effectiveness on several real and synthetic networks. Finally, we present new complexity results, such as the complexity of computing the SDP on models with a Naive Bayes structure. Additionally, we prove that computing the non-myopic value of information is complete for the same complexity class as computing the SDP.

不確実性下で意思決定を行う際、特に十分な情報が収集されていない場合、最適な選択肢を見極めることがしばしば困難です。この点に関して、2つの重要な疑問が存在します。1つは情報収集プロセスを中止し、決定を確定すべきかどうか（中止基準）、もう1つはそうでない場合、次にどのような情報を収集すべきか（選択基準）です。本稿では、最近導入された概念である同一決定確率（SDP）が、様々なシナリオにおいて更なる洞察を提供し、堅牢な意思決定を可能にするため、中止基準としても選択基準としても有用であることを示します。このクエリはPP^PP完全であり、非常に扱いにくいことが示されており、特定の期待値の計算に対応するクエリのクラスの典型例です。本稿では、SDPを計算するための最初の正確なアルゴリズムを提案し、いくつかの実ネットワークおよび合成ネットワークにおいてその有効性を実証します。最後に、ナイーブベイズ構造を持つモデルにおけるSDP計算の複雑性など、新たな計算量に関する結果を示します。さらに、情報の非近視的価値の計算は、SDPの計算と同じ計算量クラスで完全であることを証明します。

Algorithms for Argumentation Semantics: Labeling Attacks as a Generalization of Labeling Arguments

Algorithms for Argumentation Semantics: Labeling Attacks as a Generalization of Labeling Arguments / 議論意味論のためのアルゴリズム：ラベリング議論の一般化としてのラベリング攻撃

A Dung argumentation framework (AF) is a pair (A,R): A is a set of abstract arguments and R ⊆ A×A is a binary relation, so-called the attack relation, for capturing the conflicting arguments. Labeling based algorithms for enumerating extensions (i.e. sets of acceptable arguments) have been set out such that arguments (i.e. elements of A) are the only subject for labeling. In this paper we present implemented algorithms for listing extensions by labeling attacks (i.e. elements of R) along with arguments. Specifically, these algorithms are concerned with enumerating all extensions of an AF under a number of argumentation semantics: preferred, stable, complete, semi stable, stage, ideal and grounded. Our algorithms have impact, in particular, on enumerating extensions of AF-extended models that allow attacks on attacks. To demonstrate this impact, we instantiate our algorithms for an example of such models: namely argumentation frameworks with recursive attacks (AFRA), thereby we end up with unified algorithms that enumerate extensions of any AF/AFRA.

Dungの議論フレームワーク（AF）は、ペア（A,R）です。Aは抽象的な議論の集合であり、R⊆A×Aは、矛盾する議論を捕捉するための、いわゆる攻撃関係と呼ばれる二項関係です。拡張（つまり、許容可能な議論の集合）を列挙するためのラベル付けベースのアルゴリズムは、議論（つまり、Aの要素）のみがラベル付けの対象となるように設定されています。本稿では、攻撃（つまり、Rの要素）と議論をラベル付けすることにより、拡張をリストするための実装済みアルゴリズムを紹介します。具体的には、これらのアルゴリズムは、優先、安定、完全、半安定、段階、理想的、および根拠のある議論セマンティクスの下で、AFのすべての拡張を列挙することに関係しています。私たちのアルゴリズムは、特に、攻撃に対する攻撃を可能にするAF拡張モデルの拡張を列挙することに影響を与えます。この影響を実証するために、我々はそのようなモデルの例として、再帰攻撃を伴う議論フレームワーク（AFRA）に対して我々のアルゴリズムを具体化し、あらゆるAF/AFRAの拡張を列挙する統一アルゴリズムを得る。

Improved Separations of Regular Resolution from Clause Learning Proof Systems

Improved Separations of Regular Resolution from Clause Learning Proof Systems / 正規導出と節学習証明システムの改良された分離

This paper studies the relationship between resolution and conflict driven clause learning (CDCL) without restarts, and refutes some conjectured possible separations. We prove that the guarded, xor-ified pebbling tautology clauses, which Urquhart proved are hard for regular resolution, as well as the guarded graph tautology clauses of Alekhnovich, Johannsen, Pitassi, and Urquhart have polynomial size pool resolution refutations that use only input lemmas as learned clauses. For the latter set of clauses, we extend this to prove that a CDCL search without restarts can refute these clauses in polynomial time, provided it makes the right choices for decision literals and clause learning. This holds even if the CDCL search is required to greedily process conflicts arising from unit propagation. This refutes the conjecture that the guarded graph tautology clauses or the guarded xor-ified pebbling tautology clauses can be used to separate CDCL without restarts from general resolution. Together with subsequent results by Buss and Kolodziejczyk, this means we lack any good conjectures about how to establish the exact logical strength of conflict-driven clause learning without restarts.

本論文では、解決と再開なしの衝突駆動節学習（CDCL）の関係を研究し、いくつかの想定される分離を反駁します。Urquhartが証明したガード付きXOR化ペブリング・トートロジー節、およびAlekhnovich、Johannsen、Pitassi、Urquhartのガード付きグラフ・トートロジー節は、学習済み節として入力補題のみを使用する多項式サイズ・プール解決反駁を持つことを証明します。後者の節セットについては、これを拡張し、決定リテラルと節学習に適切な選択を行うことを条件として、再開なしのCDCL探索はこれらの節を多項式時間で反駁できることを証明します。これは、CDCL探索がユニット伝播から生じる衝突を貪欲に処理する必要がある場合でも成立します。これは、ガード付きグラフ・トートロジー節またはガード付きXOR化ペブリング・トートロジー節を用いて、リスタートなしのCDCLを一般解から分離できるという仮説を反証します。BussとKolodziejczykによるその後の結果と合わせると、リスタートなしの衝突駆動型節学習の正確な論理的強度を確立する方法について、良い仮説が存在しないことを意味します。

Convergence of a Q-learning Variant for Continuous States and Actions

Convergence of a Q-learning Variant for Continuous States and Actions / 連続状態とアクションのためのQ学習バリアントの収束

This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision Processes under the expected total discounted reward criterion when both the state and action spaces are continuous. This algorithm is based on Watkins’ Q-learning, but uses Nadaraya-Watson kernel smoothing to generalize knowledge to unvisited states. As expected, continuity conditions must be imposed on the mean rewards and transition probabilities. Using results from kernel regression theory, this algorithm is proven capable of producing a Q-value function estimate that is uniformly within an arbitrary tolerance of the true Q-value function with probability one. The algorithm is then applied to an example problem to empirically show convergence as well.

本論文では、状態空間と行動空間の両方が連続である場合に、期待総割引報酬基準の下で無限時間マルコフ決定過程を解くための強化学習アルゴリズムを提示します。このアルゴリズムはワトキンスのQ学習に基づいているが、ナダラヤ・ワトソンカーネル平滑化を用いて知識を未訪問状態に一般化します。予想通り、平均報酬と遷移確率には連続条件を課す必要があります。カーネル回帰理論の結果を用いて、このアルゴリズムは真のQ値関数の任意の許容範囲内で確率1で均一なQ値関数推定値を生成できることが証明されています。次に、このアルゴリズムを例題に適用し、収束も実証的に示す。

Comparative Evaluation of Link-Based Approaches for Candidate Ranking in Link-to-Wikipedia Systems

Comparative Evaluation of Link-Based Approaches for Candidate Ranking in Link-to-Wikipedia Systems / Wikipediaへのリンクシステムにおける候補ランキングのためのリンクベースアプローチの比較評価

In recent years, the task of automatically linking pieces of text (anchors) mentioned in a document to Wikipedia articles that represent the meaning of these anchors has received extensive research attention. Typically, link-to-Wikipedia systems try to find a set of Wikipedia articles that are candidates to represent the meaning of the anchor and, later, rank these candidates to select the most appropriate one. In this ranking process the systems rely on context information obtained from the document where the anchor is mentioned and/or from Wikipedia. In this paper we center our attention in the use of Wikipedia links as context information. In particular, we offer a review of several candidate ranking approaches in the state-of-the-art that rely on Wikipedia link information. In addition, we provide a comparative empirical evaluation of the different approaches on five different corpora: the TAC 2010 corpus and four corpora built from actual Wikipedia articles and news items.

近年、文書内で言及されているテキスト片（アンカー）を、そのアンカーの意味を表すWikipediaの記事に自動的にリンクするタスクは、広く研究の注目を集めています。通常、Wikipediaへのリンクシステムは、アンカーの意味を表す候補となるWikipedia記事のセットを探し、その後、これらの候補をランク付けして最も適切なものを選択します。このランク付けプロセスにおいて、システムはアンカーが言及されている文書やWikipediaから取得したコンテキスト情報に依存します。本稿では、コンテキスト情報としてのWikipediaリンクの使用に焦点を当てます。特に、Wikipediaリンク情報に依存する最先端の候補ランク付け手法をいくつかレビューします。さらに、5つの異なるコーパス(TAC 2010コーパスと実際のWikipediaの記事およびニュース項目から構築された4つのコーパス)に対して、さまざまな手法の比較実証評価を提供します。

Closure and Consistency In Logic-Associated Argumentation

Closure and Consistency In Logic-Associated Argumentation / 閉包性と一貫性論理関連議論

Properties like logical closure and consistency are important properties in any logical reasoning system. Caminada and Amgoud showed that not every logic-based argument system satisfies these relevant properties. But under conditions like closure under contraposition or transposition of the monotonic part of the underlying logic, ASPIC-like systems satisfy these properties. In contrast, the logical closure and consistency properties are not well-understood for other well-known and widely applied systems like logic programming or assumption based argumentation. Though conditions like closure under contraposition or transposition seem intuitive in ASPIC-like systems, they rule out many sensible ASPIC-like systems that satisfy both properties of closure and consistency. We present a new condition referred to as the self-contradiction axiom that guarantees the consistency property in both ASPIC-like and assumption-based systems and is implied by both properties of closure under contraposition or transposition. We develop a logic-associated abstract argumentation framework, by associating abstract argumentation with abstract logics to represent the conclusions of arguments. We show that logic-associated abstract argumentation frameworks capture ASPIC-like systems (without preferences) and assumption-based argumentation. We present two simple and natural properties of compactness and cohesion in logic-associated abstract argumentation frameworks and show that they capture the logical closure and consistency properties. We demonstrate that in both assumption-based argumentation and ASPIC-like systems, cohesion follows naturally from the self-contradiction axiom. We further give a translation from ASPIC-like systems (without preferences) into equivalent assumption-based systems that keeps the self-contradiction axiom invariant.

論理的閉包や一貫性などの特性は、あらゆる論理推論システムにおいて重要な特性です。CaminadaとAmgoudは、すべての論理ベースの議論システムがこれらの関連特性を満たすわけではないことを示しました。しかし、基礎となるロジックの単調部分の対置または転置の下での閉包などの条件下では、ASPICのようなシステムはこれらの特性を満たします。対照的に、論理プログラミングや仮定に基づく議論のような、よく知られ広く応用されている他のシステムでは、論理的閉包性と一貫性の特性は十分に理解されていません。対置または転置の下での閉包性などの条件はASPICのようなシステムでは直感的に理解できるように見えますが、これらの条件は、閉包性と一貫性の両方の特性を満たす多くの合理的なASPICのようなシステムを除外します。本稿では、ASPICのようなシステムと仮定に基づくシステムの両方で一貫性特性を保証し、対置または転置の下での閉包性の両方の特性から導かれる、自己矛盾公理と呼ばれる新しい条件を提示します。議論の結論を表現するために抽象的議論を抽象論理に関連付けることにより、論理に関連付けられた抽象的議論フレームワークを開発します。論理に関連付けられた抽象的議論フレームワークが、選好のないASPICのようなシステムと仮定に基づく議論を捉えていることを示します。論理に関連付けられた抽象的議論フレームワークにおけるコンパクト性と凝集性の2つの単純で自然な特性を提示し、それらが論理的閉包性と一貫性の特性を捉えていることを示します。仮定に基づく議論とASPICのようなシステムの両方において、自己矛盾公理から凝集性が自然に導かれることを示す。さらに、選好を持たないASPICのようなシステムから、自己矛盾公理を不変に保つ同等の仮定に基づくシステムへの変換を示す。

参考文献

Journal of Artificial Intelligence Resarch Vol. 49 (2014)

論文