Journal of Artificial Intelligence Resarch: Volume 23の論文一覧

Journal of Artificial Intelligence Resarch Vol. 23 (2005)に記載されている内容を一覧にまとめ、機械翻訳を交えて日本語化し掲載します。
Reinforcement Learning for Agents with Many Sensors and Actuators Acting in Categorizable Environments
分類可能な環境で動作する多数のセンサーとアクチュエータを持つエージェントのための強化学習

In this paper, we confront the problem of applying reinforcement learning to agents that perceive the environment through many sensors and that can perform parallel actions using many actuators as is the case in complex autonomous robots. We argue that reinforcement learning can only be successfully applied to this case if strong assumptions are made on the characteristics of the environment in which the learning is performed, so that the relevant sensor readings and motor commands can be readily identified. The introduction of such assumptions leads to strongly-biased learning systems that can eventually lose the generality of traditional reinforcement-learning algorithms. In this line, we observe that, in realistic situations, the reward received by the robot depends only on a reduced subset of all the executed actions and that only a reduced subset of the sensor inputs (possibly different in each situation and for each action) are relevant to predict the reward. We formalize this property in the so called ‘categorizability assumption’ and we present an algorithm that takes advantage of the categorizability of the environment, allowing a decrease in the learning time with respect to existing reinforcement-learning algorithms. Results of the application of the algorithm to a couple of simulated realistic-robotic problems (landmark-based navigation and the six-legged robot gait generation) are reported to validate our approach and to compare it to existing flat and generalization-based reinforcement-learning approaches.

本稿では、複雑な自律ロボットのように、多数のセンサーを通して環境を認識し、多数のアクチュエータを用いて並列動作を実行できるエージェントに強化学習を適用するという問題に取り組む。強化学習をこのケースにうまく適用するには、学習が行われる環境の特性について強い仮定を立て、関連するセンサーの読み取り値とモータコマンドを容易に識別できるようにする必要があることを我々は主張します。このような仮定の導入は、強い偏りのある学習システムにつながり、最終的には従来の強化学習アルゴリズムの一般性を失う可能性があります。この流れで、現実的な状況では、ロボットが受け取る報酬は実行されたすべてのアクションの縮小されたサブセットのみに依存し、報酬の予測に関連するのはセンサー入力の縮小されたサブセットのみ（状況やアクションごとに異なる可能性があります）であることを観察します。この特性をいわゆる「分類可能性仮定」で形式化し、環境の分類可能性を活用し、既存の強化学習アルゴリズムに比べて学習時間を短縮できるアルゴリズムを提示します。このアルゴリズムを、シミュレートされたいくつかの現実的なロボットの問題（ランドマークベースのナビゲーションと6足ロボットの歩行生成）に適用した結果が報告され、私たちのアプローチを検証し、既存のフラットおよび一般化ベースの強化学習アプローチと比較しています。

Keys, Nominals, and Concrete Domains
キー、名義語、そして具体的なドメイン

Many description logics (DLs) combine knowledge representation on an abstract, logical level with an interface to ‘concrete’ domains like numbers and strings with built-in predicates such as >, +, and prefix-of. These hybrid DLs have turned out to be useful in several application areas, such as reasoning about conceptual database models. We propose to further extend such DLs with key constraints that allow the expression of statements like ‘US citizens are uniquely identified by their social security number’. Based on this idea, we introduce a number of natural description logics and perform a detailed analysis of their decidability and computational complexity. It turns out that naive extensions with key constraints easily lead to undecidability, whereas more careful extensions yield NExpTime-complete DLs for a variety of useful concrete domains.

多くの記述論理(DL)は、抽象的かつ論理的なレベルでの知識表現と、>、+、prefix-ofなどの述語が組み込まれた数値や文字列などの「具体的な」ドメインへのインターフェイスを組み合わせています。これらのハイブリッドDLは、概念データベースモデルに関する推論など、いくつかのアプリケーション領域で有用であることが判明しています。我々は、このような記述論理（DL）をキー制約を用いてさらに拡張し、「米国市民は社会保障番号によって一意に識別される」といった記述を表現できるようにすることを提案します。このアイデアに基づき、いくつかの自然な記述論理を導入し、それらの決定可能性と計算複雑性を詳細に分析します。その結果、キー制約を伴う単純な拡張は容易に決定不能性につながるが、より慎重な拡張は、様々な有用な具体的領域に対してNExpTime完全な記述論理（DL）をもたらすことがわかった。

An Expressive Language and Efficient Execution System for Software Agents
ソフトウェアエージェントのための表現力豊かな言語と効率的な実行システム

Software agents can be used to automate many of the tedious, time-consuming information processing tasks that humans currently have to complete manually. However, to do so, agent plans must be capable of representing the myriad of actions and control flows required to perform those tasks. In addition, since these tasks can require integrating multiple sources of remote information ? typically, a slow, I/O-bound process ? it is desirable to make execution as efficient as possible. To address both of these needs, we present a flexible software agent plan language and a highly parallel execution system that enable the efficient execution of expressive agent plans. The plan language allows complex tasks to be more easily expressed by providing a variety of operators for flexibly processing the data as well as supporting subplans (for modularity) and recursion (for indeterminate looping). The executor is based on a streaming dataflow model of execution to maximize the amount of operator and data parallelism possible at runtime. We have implemented both the language and executor in a system called THESEUS. Our results from testing THESEUS show that streaming dataflow execution can yield significant speedups over both traditional serial (von Neumann) as well as non-streaming dataflow-style execution that existing software and robot agent execution systems currently support. In addition, we show how plans written in the language we present can represent certain types of subtasks that cannot be accomplished using the languages supported by network query engines. Finally, we demonstrate that the increased expressivity of our plan language does not hamper performance; specifically, we show how data can be integrated from multiple remote sources just as efficiently using our architecture as is possible with a state-of-the-art streaming-dataflow network query engine.

ソフトウェアエージェントは、現在人間が手作業で行わなければならない、退屈で時間のかかる情報処理タスクの多くを自動化するために使用できます。しかし、そのためには、エージェントプランが、それらのタスクを実行するために必要な無数のアクションと制御フローを表現できなければなりません。さらに、これらのタスクは複数のリモート情報源（典型的には低速でI/O依存のプロセス）の統合を必要とする場合があるため、実行を可能な限り効率的にすることが望ましいです。これらの両方のニーズに対応するため、我々は柔軟なソフトウェアエージェントプラン言語と、表現力豊かなエージェントプランの効率的な実行を可能にする高度な並列実行システムを提示します。プラン言語は、データを柔軟に処理するための多様な演算子を提供するだけでなく、サブプラン（モジュール性のため）と再帰（不確定ループのため）をサポートすることで、複雑なタスクをより容易に表現できます。エグゼキュータは、ストリーミングデータフロー実行モデルに基づいており、実行時に可能な演算子とデータの並列性を最大化します。我々は、この言語とエグゼキュータの両方をTHESEUSと呼ばれるシステムに実装しました。THESEUSのテスト結果から、ストリーミングデータフロー実行は、従来のシリアル(フォンノイマン)実行だけでなく、既存のソフトウェアおよびロボットエージェント実行システムが現在サポートしている非ストリーミングデータフロースタイルの実行よりも大幅に高速化できることが示されています。さらに、ここで紹介する言語で記述されたプランが、ネットワーククエリエンジンでサポートされている言語では実現できない特定の種類のサブタスクを表現できることも示しています。最後に、プラン言語の表現力の向上がパフォーマンスを低下させないことを示します。具体的には、最先端のストリーミングデータフローネットワーククエリエンジンと同様に、このアーキテクチャを使用して複数のリモートソースからデータを統合できることを示します。

An Improved Search Algorithm for Optimal Multiple-Sequence Alignment
最適な複数配列アライメントのための改良探索アルゴリズム

Multiple sequence alignment (MSA) is a ubiquitous problem in computational biology. Although it is NP-hard to find an optimal solution for an arbitrary number of sequences, due to the importance of this problem researchers are trying to push the limits of exact algorithms further. Since MSA can be cast as a classical path finding problem, it is attracting a growing number of AI researchers interested in heuristic search algorithms as a challenge with actual practical relevance. In this paper, we first review two previous, complementary lines of research. Based on Hirschberg’s algorithm, Dynamic Programming needs O(kN^(k-1)) space to store both the search frontier and the nodes needed to reconstruct the solution path, for k sequences of length N. Best first search, on the other hand, has the advantage of bounding the search space that has to be explored using a heuristic. However, it is necessary to maintain all explored nodes up to the final solution in order to prevent the search from re-expanding them at higher cost. Earlier approaches to reduce the Closed list are either incompatible with pruning methods for the Open list, or must retain at least the boundary of the Closed list. In this article, we present an algorithm that attempts at combining the respective advantages; like A* it uses a heuristic for pruning the search space, but reduces both the maximum Open and Closed size to O(kN^(k-1)), as in Dynamic Programming. The underlying idea is to conduct a series of searches with successively increasing upper bounds, but using the DP ordering as the key for the Open priority queue. With a suitable choice of thresholds, in practice, a running time below four times that of A* can be expected. In our experiments we show that our algorithm outperforms one of the currently most successful algorithms for optimal multiple sequence alignments, Partial Expansion A*, both in time and memory. Moreover, we apply a refined heuristic based on optimal alignments not only of pairs of sequences, but of larger subsets. This idea is not new; however, to make it practically relevant we show that it is equally important to bound the heuristic computation appropriately, or the overhead can obliterate any possible gain. Furthermore, we discuss a number of improvements in time and space efficiency with regard to practical implementations. Our algorithm, used in conjunction with higher-dimensional heuristics, is able to calculate for the first time the optimal alignment for almost all of the problems in Reference 1 of the benchmark database BAliBASE.

多重配列アライメント（MSA）は、計算生物学において普遍的な問題です。任意の数の配列に対して最適解を見つけることはNP困難ですが、この問題の重要性から、研究者は正確なアルゴリズムの限界をさらに押し広げようとしています。MSAは古典的な経路検索問題として捉えることができるため、実用的意義のある課題としてヒューリスティックな探索アルゴリズムに関心を持つAI研究者が増えています。本稿では、まずこれまでの相補的な2つの研究分野について考察します。ヒルシュバーグのアルゴリズムに基づく動的計画法では、長さNのk個のシーケンスに対して、探索フロンティアと解決経路の再構築に必要なノードの両方を格納するためにO(kN^(k-1))の空間が必要です。一方、最良優先探索には、ヒューリスティックを使用して探索する必要がある探索空間を制限できるという利点があります。ただし、探索によってノードが再度拡張されてコストが高くなるのを防ぐために、最終的な解に至るまで探索されたノードをすべて維持する必要があります。クローズドリストを削減する従来のアプローチは、オープンリストの刈り込み方法と互換性がないか、少なくともクローズドリストの境界を保持する必要があります。本稿では、それぞれの利点を組み合わせるアルゴリズムを紹介します。A*と同様に、探索空間の枝刈りにヒューリスティックを用いますが、動的計画法と同様に、最大OpenサイズとClosedサイズの両方をO(kN^(k-1))に削減します。基本的な考え方は、DP順序をOpen優先キューのキーとして用い、上限を段階的に増加させながら一連の探索を実行することです。適切な閾値を選択すれば、実際にはA*の4倍未満の実行時間が期待できます。実験では、このアルゴリズムが、現在最も成功している多重配列アライメント最適化アルゴリズムの一つであるPartial Expansion A*を、時間とメモリの両面で凌駕することを示しています。さらに、配列ペアだけでなく、より大きなサブセットの最適アライメントに基づく改良されたヒューリスティックを適用します。この考え方は新しいものではありませんが、実用化のためには、ヒューリスティック計算を適切に制限することが同様に重要であり、そうでないとオーバーヘッドによって得られるメリットが帳消しになってしまうことを示します。さらに、実用的な実装における時間と空間の効率性を向上させるためのいくつかの改善点についても考察します。高次元ヒューリスティックと組み合わせて使用される当社のアルゴリズムは、ベンチマークデータベースBAliBASEの参照1にあるほぼすべての問題に対する最適なアライメントを初めて計算できます。

Using Memory to Transform Search on the Planning Graph
メモリを用いたプランニンググラフ上の探索の変換

The Graphplan algorithm for generating optimal make-span plans containing parallel sets of actions remains one of the most effective ways to generate such plans. However, despite enhancements on a range of fronts, the approach is currently dominated in terms of speed, by state space planners that employ distance-based heuristics to quickly generate serial plans. We report on a family of strategies that employ available memory to construct a search trace so as to learn from various aspects of Graphplan?s iterative search episodes in order to expedite search in subsequent episodes. The planning approaches can be partitioned into two classes according to the type and extent of search experience captured in the trace. The planners using the more aggressive tracing method are able to avoid much of Graphplan?s redundant search effort, while planners in the second class trade off this aspect in favor of a much higher degree of freedom than Graphplan in traversing the space of ‘states’ generated during regression search on the planning graph. The tactic favored by the second approach, exploiting the search trace to transform the depth-first, IDA* nature of Graphplan?s search into an iterative state space view, is shown to be the more powerful. We demonstrate that distance-based, state space heuristics can be adapted to informed traversal of the search trace used by the second class of planners and develop an augmentation targeted specifically at planning graph search. Guided by such a heuristic, the step-optimal version of the planner in this class clearly dominates even a highly enhanced version of Graphplan. By adopting beam search on the search trace we then show that virtually optimal parallel plans can be generated at speeds quite competitive with a modern heuristic state space planner.

並列アクションセットを含む最適なメイクスパンプランを生成するGraphplanアルゴリズムは、依然としてそのようなプランを生成する最も効果的な方法の一つです。しかしながら、様々な面で改良が加えられているにもかかわらず、このアプローチは速度の点では、距離ベースのヒューリスティックを用いて逐次プランを迅速に生成する状態空間プランナーに支配されています。本稿では、利用可能なメモリを用いて探索トレースを構築し、Graphplanの反復探索エピソードの様々な側面から学習することで、後続のエピソードにおける探索を高速化する一連の戦略について報告します。これらのプランニング手法は、トレースに記録される探索経験の種類と範囲に応じて2つのクラスに分類できます。より積極的なトレース手法を用いるプランナーは、Graphplanの冗長な探索努力の多くを回避できます。一方、後者のクラスのプランナーは、この側面を犠牲にして、プランニンググラフ上の回帰探索中に生成される「状態」空間を探索する際に、Graphplanよりもはるかに高い自由度を得ています。2番目のアプローチで好まれる戦術は、Graphplanの検索の深さ優先、IDA*の性質を反復的な状態空間ビューに変換するために検索トレースを利用するもので、より強力であることが示されています。距離ベースの状態空間ヒューリスティックが、2番目のクラスのプランナーで使用される検索トレースの情報に基づいたトラバースに適応できることを実証し、特にグラフ検索の計画に特化した拡張を開発しました。このようなヒューリスティックに導かれる、このクラスのプランナーのステップ最適バージョンは、Graphplanの高度に拡張されたバージョンよりも明らかに優れています。次に、検索トレースにビーム検索を採用することで、最新のヒューリスティックな状態空間プランナーと十分に競合できる速度で、事実上最適な並列プランを生成できることを示します。

Generalizing Boolean Satisfiability III: Implementation
ブール充足可能性の一般化 III：実装

This is the third of three papers describing ZAP, a satisfiability engine that substantially generalizes existing tools while retaining the performance characteristics of modern high-performance solvers. The fundamental idea underlying ZAP is that many problems passed to such engines contain rich internal structure that is obscured by the Boolean representation used; our goal has been to define a representation in which this structure is apparent and can be exploited to improve computational performance. The first paper surveyed existing work that (knowingly or not) exploited problem structure to improve the performance of satisfiability engines, and the second paper showed that this structure could be understood in terms of groups of permutations acting on individual clauses in any particular Boolean theory. We conclude the series by discussing the techniques needed to implement our ideas, and by reporting on their performance on a variety of problem instances.

本論文は、ZAPを説明する3つの論文のうちの3番目の論文です。ZAPは、既存のツールを大幅に一般化しながらも、最新の高性能ソルバーの性能特性を維持しています。ZAPの根底にある基本的な考え方は、そのようなエンジンに渡される多くの問題が、使用されるブール表現では隠蔽される豊富な内部構造を持っているというものです。私たちの目標は、この構造が明確で、計算性能を向上させるために活用できる表現を定義することです。最初の論文では、問題構造を（意識的か否かに関わらず）充足可能性エンジンの性能向上に活用した既存の研究を調査し、2番目の論文では、この構造が、特定のブール理論における個々の節に作用する順列群の観点から理解できることを示しました。本シリーズの締めくくりとして、私たちのアイデアを実装するために必要な手法について議論し、様々な問題例におけるそれらのパフォーマンスを報告します。

On the Practical use of Variable Elimination in Constraint Optimization Problems: ‘Still-life’ as a Case Study
制約最適化問題における変数除去の実用的利用：『Still-life』を事例として

Variable elimination is a general technique for constraint processing. It is often discarded because of its high space complexity. However, it can be extremely useful when combined with other techniques. In this paper we study the applicability of variable elimination to the challenging problem of finding still-lifes. We illustrate several alternatives: variable elimination as a stand-alone algorithm, interleaved with search, and as a source of good quality lower bounds. We show that these techniques are the best known option both theoretically and empirically. In our experiments we have been able to solve the n=20 instance, which is far beyond reach with alternative approaches.

変数除去は制約処理における一般的な手法です。空間計算量が大きいため、しばしば無視されます。しかし、他の手法と組み合わせると非常に有用です。本論文では、静物画を見つけるという困難な問題への変数除去の適用可能性を検討します。いくつかの代替手法を示します。変数除去を単独のアルゴリズムとして使用する場合、探索とインターリーブする場合、そして高品質の下限値を得るための情報源として使用する場合です。これらの手法は、理論的にも経験的にも最もよく知られている選択肢であることを示します。実験では、n = 20のインスタンスを解くことができましたが、これは他のアプローチでは到底不可能です。

Extremal Behaviour in Multiagent Contract Negotiation
マルチエージェント契約交渉における極値行動

We examine properties of a model of resource allocation in which several agents exchange resources in order to optimise their individual holdings. The schemes discussed relate to well-known negotiation protocols proposed in earlier work and we consider a number of alternative notions of “rationality” covering both quantitative measures, e.g. cooperative and individual rationality and more qualitative forms, e.g. Pigou-Dalton transfers. While it is known that imposing particular rationality and structural restrictions may result in some reallocations of the resource set becoming unrealisable, in this paper we address the issue of the number of restricted rational deals that may be required to implement a particular reallocation when it is possible to do so. We construct examples showing that this number may be exponential (in the number of resources m), even when all of the agent utility functions are monotonic. We further show that k agents may achieve in a single deal a reallocation requiring exponentially many rational deals if at most k-1 agents can participate, this same reallocation being unrealisable by any sequences of rational deals in which at most k-2 agents are involved.

本稿では、複数のエージェントがそれぞれの保有資源を最適化するために資源を交換する資源配分モデルの特性を検証します。ここで論じる手法は、先行研究で提案されたよく知られた交渉プロトコルと関連しており、協調合理性や個人合理性といった定量的な尺度と、ピグー・ダルトン移転といったより定性的な形態の両方を含む「合理性」の代替概念をいくつか検討します。特定の合理性や構造的制約を課すと、資源セットの再配分が実現不可能になる場合があることは知られているが、本稿では、特定の再配分が可能な場合に、それを実現するために必要な制約付き合理的取引の数という問題を取り扱う。エージェントの効用関数がすべて単調である場合でも、この数は（資源数mに対して）指数関数的になる可能性があることを示す例を構築します。さらに、k個のエージェントが、最大でk-1個のエージェントが参加できる場合、指数関数的に多くの合理的取引を必要とする再配分を単一の取引で達成できる可能性があることを示します。この同じ再配分は、最大でk-2個のエージェントが関与する合理的取引のシーケンスでは実現できません。

Hybrid BDI-POMDP Framework for Multiagent Teaming
マルチエージェントチームのためのハイブリッドBDI-POMDPフレームワーク

Many current large-scale multiagent team implementations can be characterized as following the “belief-desire-intention” (BDI) paradigm, with explicit representation of team plans. Despite their promise, current BDI team approaches lack tools for quantitative performance analysis under uncertainty. Distributed partially observable Markov decision problems (POMDPs) are well suited for such analysis, but the complexity of finding optimal policies in such models is highly intractable. The key contribution of this article is a hybrid BDI-POMDP approach, where BDI team plans are exploited to improve POMDP tractability and POMDP analysis improves BDI team plan performance. Concretely, we focus on role allocation, a fundamental problem in BDI teams: which agents to allocate to the different roles in the team. The article provides three key contributions. First, we describe a role allocation technique that takes into account future uncertainties in the domain; prior work in multiagent role allocation has failed to address such uncertainties. To that end, we introduce RMTDP (Role-based Markov Team Decision Problem), a new distributed POMDP model for analysis of role allocations. Our technique gains in tractability by significantly curtailing RMTDP policy search; in particular, BDI team plans provide incomplete RMTDP policies, and the RMTDP policy search fills the gaps in such incomplete policies by searching for the best role allocation. Our second key contribution is a novel decomposition technique to further improve RMTDP policy search efficiency. Even though limited to searching role allocations, there are still combinatorially many role allocations, and evaluating each in RMTDP to identify the best is extremely difficult. Our decomposition technique exploits the structure in the BDI team plans to significantly prune the search space of role allocations. Our third key contribution is a significantly faster policy evaluation algorithm suited for our BDI-POMDP hybrid approach. Finally, we also present experimental results from two domains: mission rehearsal simulation and RoboCupRescue disaster rescue simulation.

現在多くの大規模マルチエージェントチーム実装は、「信念・欲求・意図」（BDI）パラダイムに従い、チームプランを明示的に表現していると言える。その有望性にもかかわらず、現在のBDIチームアプローチには、不確実性下における定量的なパフォーマンス分析のためのツールが不足しています。分散型部分観測マルコフ決定問題（POMDP）はこのような分析に適しているが、このようなモデルにおいて最適なポリシーを見つける複雑さは非常に扱いにくい。本論文の主要な貢献は、BDI-POMDPハイブリッドアプローチです。このアプローチでは、BDIチームプランを活用してPOMDPの扱いやすさを向上させ、POMDP分析によってBDIチームプランのパフォーマンスを向上させる。具体的には、BDIチームにおける基本的な問題である役割割り当て、すなわちチーム内の異なる役割にどのエージェントを割り当てるかに焦点を当てる。本論文では、3つの主要な貢献を提供します。まず、ドメインにおける将来の不確実性を考慮した役割割り当て手法について説明します。マルチエージェント役割割り当てに関するこれまでの研究では、このような不確実性に対処できていない。この目的のために、役割割り当てを分析するための新しい分散型POMDPモデルであるRMTDP（役割ベースマルコフチーム決定問題）を導入します。この手法は、RMTDPポリシー検索を大幅に削減することで扱いやすさを向上させる。特に、BDIチームプランは不完全なRMTDPポリシーを提供するが、RMTDPポリシー検索は最適な役割割り当てを探索することで、そのような不完全なポリシーのギャップを埋める。2つ目の重要な貢献は、RMTDPポリシー検索の効率をさらに向上させる新しい分解手法です。役割割り当ての探索に限定されているとはいえ、組み合わせ的には依然として多くの役割割り当てが存在し、RMTDPでそれぞれを評価して最適なものを特定することは非常に困難です。この分解手法は、BDIチームプランの構造を利用して、役割割り当ての探索空間を大幅に削減します。3つ目の重要な貢献は、BDI-POMDPハイブリッドアプローチに適した、大幅に高速化されたポリシー評価アルゴリズムです。最後に、ミッションリハーサルシミュレーションとRoboCupRescue災害救助シミュレーションの2つの領域からの実験結果も紹介します。

Learning From Labeled And Unlabeled Data: An Empirical Study Across Techniques And Domains
ラベル付きデータとラベルなしデータからの学習：複数の手法と領域を横断した実証研究

There has been increased interest in devising learning techniques that combine unlabeled data with labeled data ? i.e. semi-supervised learning. However, to the best of our knowledge, no study has been performed across various techniques and different types and amounts of labeled and unlabeled data. Moreover, most of the published work on semi-supervised learning techniques assumes that the labeled and unlabeled data come from the same distribution. It is possible for the labeling process to be associated with a selection bias such that the distributions of data points in the labeled and unlabeled sets are different. Not correcting for such bias can result in biased function approximation with potentially poor performance. In this paper, we present an empirical study of various semi-supervised learning techniques on a variety of datasets. We attempt to answer various questions such as the effect of independence or relevance amongst features, the effect of the size of the labeled and unlabeled sets and the effect of noise. We also investigate the impact of sample-selection bias on the semi-supervised learning techniques under study and implement a bivariate probit technique particularly designed to correct for such bias.

ラベルなしデータとラベル付きデータを組み合わせた学習手法、すなわち半教師あり学習の考案への関心が高まっています。しかし、私たちの知る限りでは、様々な手法や、ラベル付きデータとラベルなしデータのタイプや量が異なる場合の研究は行われていません。さらに、半教師あり学習手法に関する公開された研究のほとんどは、ラベル付きデータとラベルなしデータが同じ分布からのものであることを前提としています。ラベル付けプロセスは、ラベル付きセットとラベルなしセットのデータポイントの分布が異なるように選択バイアスと関連付けられる可能性があります。このようなバイアスを修正しないと、偏った関数近似が得られ、パフォーマンスが低下する可能性があります。本稿では、さまざまなデータセットに対するさまざまな半教師あり学習手法の実証的研究を紹介します。特徴間の独立性や関連性の影響、ラベル付きセットとラベルなしセットのサイズの影響、ノイズの影響など、さまざまな疑問に答えようとします。また、本研究で対象とする半教師あり学習手法におけるサンプル選択バイアスの影響を調査し、そのようなバイアスを補正するために特別に設計された二変量プロビット法を実装します。

Combining Knowledge- and Corpus-based Word-Sense-Disambiguation Methods
知識ベースとコーパスベースの語義曖昧性解消手法の融合

In this paper we concentrate on the resolution of the lexical ambiguity that arises when a given word has several different meanings. This specific task is commonly referred to as word sense disambiguation (WSD). The task of WSD consists of assigning the correct sense to words using an electronic dictionary as the source of word definitions. We present two WSD methods based on two main methodological approaches in this research area: a knowledge-based method and a corpus-based method. Our hypothesis is that word-sense disambiguation requires several knowledge sources in order to solve the semantic ambiguity of the words. These sources can be of different kinds— for example, syntagmatic, paradigmatic or statistical information. Our approach combines various sources of knowledge, through combinations of the two WSD methods mentioned above. Mainly, the paper concentrates on how to combine these methods and sources of information in order to achieve good results in the disambiguation. Finally, this paper presents a comprehensive study and experimental work on evaluation of the methods and their combinations.

本稿では、ある単語が複数の異なる意味を持つ場合に生じる語彙の曖昧性の解決に焦点を当てます。この特定のタスクは、一般的に語義曖昧性解消（WSD）と呼ばれます。WSDのタスクは、電子辞書を単語の定義情報源として用い、単語に正しい意味を割り当てることです。本稿では、この研究分野における2つの主要な方法論的アプローチ、すなわち知識ベース手法とコーパスベース手法に基づく2つのWSD手法を紹介します。私たちの仮説は、語義曖昧性解消には、単語の意味的曖昧性を解決するために複数の知識源が必要であるというものです。これらの知識源は、統語的情報、パラダイム的情報、統計的情報など、様々な種類があります。私たちのアプローチは、前述の2つのWSD手法を組み合わせることで、様々な知識源を統合します。本稿では主に、曖昧性解消において良好な結果を得るために、これらの手法と情報源をどのように組み合わせるかに焦点を当てます。最後に、本稿では、これらの手法とその組み合わせの評価に関する包括的な研究と実験を紹介します。

Graduality in Argumentation
議論における段階性

Argumentation is based on the exchange and valuation of interacting arguments, followed by the selection of the most acceptable of them (for example, in order to take a decision, to make a choice). Starting from the framework proposed by Dung in 1995, our purpose is to introduce ‘graduality’ in the selection of the best arguments, i.e., to be able to partition the set of the arguments in more than the two usual subsets of ‘selected’ and ‘non-selected’ arguments in order to represent different levels of selection. Our basic idea is that an argument is all the more acceptable if it can be preferred to its attackers. First, we discuss general principles underlying a ‘gradual’ valuation of arguments based on their interactions. Following these principles, we define several valuation models for an abstract argumentation system. Then, we introduce ‘graduality’ in the concept of acceptability of arguments. We propose new acceptability classes and a refinement of existing classes taking advantage of an available ‘gradual’ valuation.

議論は、相互作用する議論の交換と評価、そしてそれらの中で最も受け入れやすい議論の選択（例えば、意思決定や選択を行うため）に基づいています。1995年にDungが提案した枠組みを出発点として、本研究では、最良の議論の選択に「段階性」を導入することを目的とします。つまり、議論の集合を、通常の「選択された」議論と「選択されない」議論の2つのサブセット以上に分割し、異なる選択レベルを表現することです。私たちの基本的な考え方は、議論が攻撃者よりも好まれる場合、その議論はより受け入れやすいというものです。まず、議論の相互作用に基づく「段階的」評価の根底にある一般原則を考察します。これらの原則に従い、抽象的な議論システムのためのいくつかの評価モデルを定義します。次に、議論の受容可能性の概念に「段階性」を導入します。そして、既存の「段階的」評価を活用した、新しい受容可能性クラスと既存のクラスの改良を提案します。

Combining Spatial and Temporal Logics: Expressiveness vs. Complexity
空間論理と時間論理の融合：表現力 vs. 複雑性

In this paper, we construct and investigate a hierarchy of spatio-temporal formalisms that result from various combinations of propositional spatial and temporal logics such as the propositional temporal logic PTL, the spatial logics RCC-8, BRCC-8, S4u and their fragments. The obtained results give a clear picture of the trade-off between expressiveness and `computational realisability’ within the hierarchy. We demonstrate how different combining principles as well as spatial and temporal primitives can produce NP-, PSPACE-, EXPSPACE-, 2EXPSPACE-complete, and even undecidable spatio-temporal logics out of components that are at most NP- or PSPACE-complete.

本論文では、命題時相論理PTL、空間論理RCC-8、BRCC-8、S4u、およびそれらのフラグメントといった命題時相論理の様々な組み合わせから得られる時空間形式主義の階層を構築し、調査します。得られた結果は、この階層における表現力と「計算実現可能性」のトレードオフを明確に示す。様々な組み合わせ原理や空間・時間プリミティブを用いることで、NP完全またはPSPACE完全な構成要素から、NP完全、PSPACE完全、EXPSPACE完全、2EXPSPACE完全、さらには決定不能な時空間論理を生成できることを明らかにします。

Restricted Value Iteration: Theory and Algorithms
制約付き値反復：理論とアルゴリズム

Value iteration is a popular algorithm for finding near optimal policies for POMDPs. It is inefficient due to the need to account for the entire belief space, which necessitates the solution of large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-optimal policies and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.

値反復法は、POMDPの準最適ポリシーを見つけるための一般的なアルゴリズムです。しかし、信念空間全体を考慮する必要があり、多数の線形計画を解く必要があるため、非効率的です。本稿では、信念サブセットに制限された値反復法について検討します。適切に選択された信念サブセットと組み合わせることで、制限付き値反復法が準最適ポリシーを生成することを示し、特定の信念サブセットが空間と時間の節約をもたらすかどうかを判断するための条件を与える。さらに、情報提供型POMDPとほぼ識別可能なPOMDPという2つの興味深いPOMDPクラスに制限付き値反復法を適用します。

Finding Approximate POMDP solutions Through Belief Compression
信念圧縮によるPOMDP近似解の探索

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional subspace embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this subspace can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta & Schapire, 2002) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.

部分観測マルコフ決定過程（POMDP）に対する方策を求める標準的な価値関数アプローチは、大規模モデルでは一般的に扱いにくいと考えられています。これらのアルゴリズムの扱いにくさは、主に、信念空間全体にわたって正確で最適な方策を計算することに起因するものです。しかし、現実のPOMDP問題では、複雑な方策クラスを持つ問題であっても、良好な制御のために信念空間全体に対する最適な方策を計算する必要がないことがよくあります。制御器が経験する信念は、高次元の信念空間に埋め込まれた構造化された低次元部分空間の近傍に存在することがよくあります。この部分空間のみに対する最適価値関数の良好な近似値を見つける方が、価値関数全体を計算するよりもはるかに容易です。本稿では、信念空間の次元を削減することで大規模POMDPを解くための新しい手法を紹介します。指数族主成分分析（Collins, Dasgupta & Schapire, 2002）を用いて、スパースな高次元の信念空間を、信念状態の学習済み特徴の小さなセットを用いて表現します。次に、低次元の確信特徴のみを用いて計画します。この低次元空間で計画することで、従来の手法で扱えるモデルよりも桁違いに大規模なPOMDPモデルのポリシーを見つけることができます。本稿では、このアルゴリズムを合成問題と移動ロボットナビゲーションタスクに適用する例を示す。

関連記事