Journal of Artificial Intelligence Resarch: Volume 1の論文一覧

Journal of Artificial Intelligence Resarch Vol. 1 (1993)に記載されている内容を一覧にまとめ、機械翻訳を交えて日本語化し掲載します。
The Difficulties of Learning Logic Programs with Cut

カットを用いた論理プログラム学習の難しさ

As real logic programmers normally use cut (!), an effective learning procedure for logic programs should be able to deal with it. Because the cut predicate has only a procedural meaning, clauses containing cut cannot be learned using an extensional evaluation method, as is done in most learning systems. On the other hand, searching a space of possible programs (instead of a space of independent clauses) is unfeasible. An alternative solution is to generate first a candidate base program which covers the positive examples, and then make it consistent by inserting cut where appropriate. The problem of learning programs with cut has not been investigated before and this seems to be a natural and reasonable approach. We generalize this scheme and investigate the difficulties that arise. Some of the major shortcomings are actually caused, in general, by the need for intensional evaluation. As a conclusion, the analysis of this paper suggests, on precise and technical grounds, that learning cut is difficult, and current induction techniques should probably be restricted to purely declarative logic languages.

実際の論理プログラマは通常cut (!)を使用するため、論理プログラムの効果的な学習手順はcutを扱えるはずです。cut述語は手続き的な意味しか持たないため、cutを含む節は、多くの学習システムで行われているように、外延的評価法を用いて学習することはできません。一方、（独立節の空間ではなく）可能なプログラム空間を探索することは現実的ではありません。別の解決策としては、まず正例をカバーする候補となるベースプログラムを生成し、次に適切な場所にcutを挿入することで一貫性を持たせるというものがあります。cutを含むプログラムを学習する問題はこれまで研究されておらず、これは自然で合理的なアプローチであると思われます。私たちはこの手法を一般化し、発生する問題点を調査します。いくつかの大きな欠点は、実際には、一般的に、内包的評価の必要性によって引き起こされます。結論として、本論文の分析は、正確かつ技術的な根拠に基づき、カットの学習は困難であり、現在の帰納的手法は純粋に宣言的論理言語に限定されるべきであることを示唆しています。

Software Agents: Completing Patterns and Constructing User Interfaces

ソフトウェアエージェント：パターンの完成とユーザーインターフェースの構築

To support the goal of allowing users to record and retrieve information, this paper describes an interactive note-taking system for pen-based computers with two distinctive features. First, it actively predicts what the user is going to write. Second, it automatically constructs a custom, button-box user interface on request. The system is an example of a learning-apprentice software- agent. A machine learning component characterizes the syntax and semantics of the user’s information. A performance system uses this learned information to generate completion strings and construct a user interface. Description of Online Appendix: People like to record information. Doing this on paper is initially efficient, but lacks flexibility. Recording information on a computer is less efficient but more powerful. In our new note taking softwre, the user records information directly on a computer. Behind the interface, an agent acts for the user. To help, it provides defaults and constructs a custom user interface. The demonstration is a QuickTime movie of the note taking agent in action. The file is a binhexed self-extracting archive. Macintosh utilities for binhex are available from mac.archive.umich.edu. QuickTime is available from ftp.apple.com in the dts/mac/sys.soft/quicktime.

ユーザーが情報を記録および取得できるようにするという目標をサポートするため、本稿では、2つの際立った機能を備えたペンベースのコンピューター向けの対話型メモ作成システムについて説明します。第1に、ユーザーが書き込む内容を能動的に予測します。第2に、要求に応じてカスタムのボタンボックスユーザーインターフェイスを自動的に構築します。このシステムは、学習見習いソフトウェアエージェントの一例です。機械学習コンポーネントが、ユーザー情報の構文と意味を特徴付けます。パフォーマンスシステムは、この学習した情報を使用して補完文字列を生成し、ユーザーインターフェイスを構築します。オンライン付録の説明:人々は情報を記録するのが好きです。紙に記録するのは最初は効率的ですが、柔軟性に欠けます。コンピューターに情報を記録するのは効率が悪いですが、より強力です。私たちの新しいメモ作成ソフトウェアでは、ユーザーはコンピューターに直接情報を記録します。インターフェイスの背後では、エージェントがユーザーに代わって動作します。エージェントは、ユーザーを支援するために、デフォルトを提供し、カスタムユーザーインターフェイスを構築します。デモは、メモ作成エージェントが動作しているQuickTimeムービーです。ファイルは、binhex形式の自己解凍アーカイブです。binhex用のMacintoshユーティリティはmac.archive.umich.eduから入手できます。QuickTimeはftp.apple.comのdts/mac/sys.soft/quicktimeから入手できます。

An Empirical Analysis of Search in GSAT

GSATにおける探索の実証分析

We describe an extensive study of search in GSAT, an approximation procedure for propositional satisfiability. GSAT performs greedy hill-climbing on the number of satisfied clauses in a truth assignment. Our experiments provide a more complete picture of GSAT’s search than previous accounts. We describe in detail the two phases of search: rapid hill-climbing followed by a long plateau search. We demonstrate that when applied to randomly generated 3SAT problems, there is a very simple scaling with problem size for both the mean number of satisfied clauses and the mean branching rate. Our results allow us to make detailed numerical conjectures about the length of the hill-climbing phase, the average gradient of this phase, and to conjecture that both the average score and average branching rate decay exponentially during plateau search. We end by showing how these results can be used to direct future theoretical analysis. This work provides a case study of how computer experiments can be used to improve understanding of the theoretical properties of algorithms.

我々は、命題充足可能性の近似手順であるGSATにおける探索に関する広範な研究について述べる。GSATは、真理値割り当てにおける充足節の数に対して貪欲な山登り法を実行します。我々の実験は、これまでの報告よりもGSATの探索のより完全な描写を提供します。我々は、急速な山登り法とそれに続く長いプラトー探索という2つの探索段階について詳細に述べる。ランダムに生成された3SAT問題に適用した場合、充足節の平均数と平均分岐率の両方が問題サイズに対して非常に単純なスケーリングを持つことを示す。我々の結果により、山登り法段階の長さ、この段階の平均勾配について詳細な数値的推測を行うことができ、プラトー探索中に平均スコアと平均分岐率の両方が指数関数的に減少すると推測することができます。最後に、これらの結果が将来の理論分析の方向付けにどのように使用できるかを示す。本研究は、アルゴリズムの理論的特性に対する理解を深めるためにコンピュータ実験をどのように使用できるかについてのケーススタディを提供します。

Applying GSATto Non-Clausal Formulas

GSATを非節式に適用する

In this paper we describe how to modify GSAT so that it can be applied to non-clausal formulas. The idea is to use a particular “score” function which gives the number of clauses of the CNF conversion of a formula which are false under a given truth assignment. Its value is computed in linear time, without constructing the CNF conversion itself. The proposed methodology applies to most of the variants of GSAT proposed so far.

本論文では、GSATを非節式に適用できるように修正する方法について述べる。そのアイデアは、与えられた真理値割り当てにおいて偽となる式のCNF変換の節の数を与える特定の「スコア」関数を用いることです。その値は、CNF変換自体を構築することなく、線形時間で計算されます。提案された方法論は、これまで提案されたGSATのほとんどのバリエーションに適用されます。

A Semantics and Complete Algorithm for Subsumption in the CLASSIC Description Logic

CLASSIC記述論理における包摂の意味論と完全アルゴリズム

This paper analyzes the correctness of the subsumption algorithm used in CLASSIC, a description logic-based knowledge representation system that is being used in practical applications. In order to deal efficiently with individuals in CLASSIC descriptions, the developers have had to use an algorithm that is incomplete with respect to the standard, model-theoretic semantics for description logics. We provide a variant semantics for descriptions with respect to which the current implementation is complete, and which can be independently motivated. The soundness and completeness of the polynomial-time subsumption algorithm is established using description graphs, which are an abstracted version of the implementation structures used in CLASSIC, and are of independent interest.

本論文では、実用化されている記述論理に基づく知識表現システムであるCLASSICで用いられる包摂アルゴリズムの正しさを分析します。CLASSIC記述における個体を効率的に扱うために、開発者は記述論理の標準的なモデル理論的意味論に関して不完全なアルゴリズムを使用せざるを得なかった。本論文では、現在の実装が完全であり、独立して動機付け可能な、記述のための別の意味論を提供します。多項式時間包摂アルゴリズムの健全性と完全性は、CLASSICで用いられる実装構造の抽象化バージョンであり、独立した関心事である記述グラフを用いて確立されます。

Exploring the Decision Forest: An Empirical Investigation of Occam’s Razor in Decision Tree Induction

決定森の探究：決定木誘導におけるオッカムの剃刀の実証的研究

We report on a series of experiments in which all decision trees consistent with the training data are constructed. These experiments were run to gain an understanding of the properties of the set of consistent decision trees and the factors that affect the accuracy of individual trees. In particular, we investigated the relationship between the size of a decision tree consistent with some training data and the accuracy of the tree on test data. The experiments were performed on a massively parallel Maspar computer. The results of the experiments on several artificial and two real world problems indicate that, for many of the problems investigated, smaller consistent decision trees are on average less accurate than the average accuracy of slightly larger trees.

本稿では、学習データと整合するすべての決定木を構築する一連の実験について報告します。これらの実験は、整合した決定木集合の特性と、個々の決定木の精度に影響を与える要因を理解するために実施しました。特に、ある学習データと整合する決定木のサイズと、テストデータにおける決定木の精度との関係を調査しました。実験は、超並列Masparコンピュータ上で実施しました。いくつかの人工問題と2つの現実世界の問題を対象とした実験の結果は、調査した多くの問題において、小さな整合した決定木は、わずかに大きな決定木の平均精度よりも平均的に精度が低いことを示しています。

Dynamic Backtracking

動的バックトラッキング

Because of their occasional need to return to shallow points in a search tree, existing backtracking methods can sometimes erase meaningful progress toward solving a search problem. In this paper, we present a method by which backtrack points can be moved deeper in the search space, thereby avoiding this difficulty. The technique developed is a variant of dependency-directed backtracking that uses only polynomial space while still providing useful control information and retaining the completeness guarantees provided by earlier approaches.

既存のバックトラッキング手法は、時折探索木の浅い点に戻る必要があるため、探索問題解決に向けた有意義な進捗を帳消しにしてしまうことがあります。本稿では、バックトラック点を探索空間のより深い位置に移動させることで、この問題を回避する手法を提示します。この手法は、依存性指向型バックトラッキングの変種であり、多項式空間のみを使用しながら、有用な制御情報を提供し、従来の手法で提供されていた完全性保証を維持しています。

Substructure Discovery Using Minimum Description Length and Background Knowledge

最小記述長と背景知識を用いた部分構造発見

The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE’s ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain. Description of Online Appendix: This is a compressed tar file containing the SUBDUE discovery system, written in C. The program accepts as input databases represented in graph form, and will output discovered substructures with their corresponding value.

興味深く反復的な部分構造を識別する能力は、構造データにおける知識の発見に不可欠な要素です。最小記述長の原則に基づくSUBDUE部分構造発見システムの新しいバージョンについて説明します。SUBDUEシステムは、元のデータを圧縮し、データ内の構造概念を表す部分構造を発見します。データ内で以前に発見された部分構造を置き換えることにより、SUBDUEを複数回実行することで、データ内の構造的規則性の階層的な記述が生成されます。SUBDUEは、計算上の制約があるが同一ではない部分構造のインスタンスを識別し、計算上の制約がある場合に2つの部分構造の近さのおおよその尺度を見つける、計算上制限のある不正確なグラフマッチを使用します。最小記述長の原則に加えて、SUBDUEでは他の背景知識を使用して、より適切な部分構造への検索を導くことができます。さまざまなドメインでの実験により、SUBDUEには、元のデータを圧縮できる部分構造を見つける能力があり、ドメインにとって重要な構造概念を発見できることが実証されています。オンライン付録の説明:これは、Cで記述されたSUBDUE発見システムを含む圧縮されたtarファイルです。プログラムは、グラフ形式で表されたデータベースを入力として受け入れ、発見された部分構造とそれに対応する値を出力します。

Learning the Past Tense of English Verbs: The Symbolic Pattern Associator vs. Connectionist Models

英語動詞の過去形の学習：記号パターン連想器 vs. コネクショニストモデル

Learning the past tense of English verbs – a seemingly minor aspect of language acquisition – has generated heated debates since 1986, and has become a landmark task for testing the adequacy of cognitive modeling. Several artificial neural networks (ANNs) have been implemented, and a challenge for better symbolic models has been posed. In this paper, we present a general-purpose Symbolic Pattern Associator (SPA) based upon the decision-tree learning algorithm ID3. We conduct extensive head-to-head comparisons on the generalization ability between ANN models and the SPA under different representations. We conclude that the SPA generalizes the past tense of unseen verbs better than ANN models by a wide margin, and we offer insights as to why this should be the case. We also discuss a new default strategy for decision-tree learning algorithms.

英語動詞の過去形の学習は、言語習得における一見些細な側面であるように思われるが、1986年以来、激しい議論を巻き起こし、認知モデリングの妥当性を検証するための画期的な課題となっています。いくつかの人工ニューラルネットワーク(ANN)が実装されており、より優れた記号モデルを求める課題が提起されています。本稿では、決定木学習アルゴリズムID3に基づく汎用記号パターンアソシエータ(SPA)を紹介します。さまざまな表現におけるANNモデルとSPAの一般化能力について、広範な直接比較を行います。SPAは、ANNモデルよりも大幅に優れた未表示動詞の過去形の一般化を実現していると結論付け、その理由について考察します。また、決定木学習アルゴリズムの新しいデフォルト戦略についても説明します。

Bias-Driven Revision of Logical Domain Theories

バイアス駆動による論理領域理論の改訂

The theory revision problem is the problem of how best to go about revising a deficient domain theory using information contained in examples that expose inaccuracies. In this paper we present our approach to the theory revision problem for propositional domain theories. The approach described here, called PTR, uses probabilities associated with domain theory elements to numerically track the “flow” of proof through the theory. This allows us to measure the precise role of a clause or literal in allowing or preventing a (desired or undesired) derivation for a given example. This information is used to efficiently locate and repair flawed elements of the theory. PTR is proved to converge to a theory which correctly classifies all examples, and shown experimentally to be fast and accurate even for deep theories.

理論修正問題とは、不正確な点を明らかにする例に含まれる情報を用いて、欠陥のある領域理論をどのように修正するのが最善かという問題です。本稿では、命題領域理論における理論修正問題へのアプローチを提示します。ここで述べるPTRと呼ばれるアプローチは、領域理論の要素に関連付けられた確率を用いて、理論における証明の「流れ」を数値的に追跡します。これにより、与えられた例における（望ましい、あるいは望ましくない）導出を許容または阻止する節またはリテラルの役割を正確に測定することができます。この情報は、理論の欠陥要素を効率的に特定し、修正するために用いられます。PTRは、すべての例を正しく分類する理論に収束することが証明されており、実験的に、深い理論であっても高速かつ正確であることが示されています。

Teleo-Reactive Programs for Agent Control

エージェント制御のためのテレオ・リアクティブ・プログラム

A formalism is presented for computing and organizing actions for autonomous agents in dynamic environments. We introduce the notion of teleo-reactive (T-R) programs whose execution entails the construction of circuitry for the continuous computation of the parameters and conditions on which agent action is based. In addition to continuous feedback, T-R programs support parameter binding and recursion. A primary difference between T-R programs and many other circuit-based systems is that the circuitry of T-R programs is more compact; it is constructed at run time and thus does not have to anticipate all the contingencies that might arise over all possible runs. In addition, T-R programs are intuitive and easy to write and are written in a form that is compatible with automatic planning and learning methods. We briefly describe some experimental applications of T-R programs in the control of simulated and actual mobile robots.

動的環境における自律エージェントの行動を計算および組織化するための形式主義が提示されます。本稿では、テレオリアクティブ(T-R)プログラムの概念を導入します。このプログラムを実行すると、エージェントのアクションのベースとなるパラメータと条件を継続的に計算するための回路が構築されます。継続的なフィードバックに加えて、T-Rプログラムはパラメータバインディングと再帰をサポートします。T-Rプログラムと他の多くの回路ベースシステムとの主な違いは、T-Rプログラムの回路がよりコンパクトであることです。回路は実行時に構築されるため、すべての実行で発生する可能性のあるすべての偶発事象を予測する必要はありません。さらに、T-Rプログラムは直感的で記述しやすく、自動計画および学習手法と互換性のある形式で記述されます。本稿では、シミュレーションおよび実際の移動ロボットの制御におけるT-Rプログラムのいくつかの実験的応用について簡単に説明します。

Decidable Reasoning in Terminological Knowledge Representation Systems

用語知識表現システムにおける決定可能推論

Terminological knowledge representation systems (TKRSs) are tools for designing and using knowledge bases that make use of terminological languages (or concept languages). We analyze from a theoretical point of view a TKRS whose capabilities go beyond the ones of presently available TKRSs. The new features studied, often required in practical applications, can be summarized in three main points. First, we consider a highly expressive terminological language, called ALCNR, including general complements of concepts, number restrictions and role conjunction. Second, we allow to express inclusion statements between general concepts, and terminological cycles as a particular case. Third, we prove the decidability of a number of desirable TKRS-deduction services (like satisfiability, subsumption and instance checking) through a sound, complete and terminating calculus for reasoning in ALCNR-knowledge bases. Our calculus extends the general technique of constraint systems. As a byproduct of the proof, we get also the result that inclusion statements in ALCNR can be simulated by terminological cycles, if descriptive semantics is adopted.

用語的知識表現システム（TKRS）は、用語言語（または概念言語）を用いた知識ベースを設計および利用するためのツールです。本研究では、現在利用可能なTKRSの機能を超えるTKRSを理論的な観点から分析します。研究対象となった新機能は、実用化においてしばしば必要とされるものであり、3つの主要な点にまとめることができます。まず、概念の一般補完、数制約、役割結合を含む、表現力の高い用語言語ALCNRについて考察します。次に、一般概念間の包含文と、特殊なケースとしての用語循環の表現を可能にします。3つ目に、ALCNR知識ベースにおける推論のための健全で完全かつ停止性のある計算法を用いて、多くの望ましいTKRS演繹サービス（充足可能性、包摂、インスタンスチェックなど）の決定可能性を証明します。この計算法は、制約システムの一般的な手法を拡張します。証明の副産物として、記述的意味論を採用すれば、ALCNRの包含文を用語循環によってシミュレートできるという結果も得られます。

A Market-Oriented Programming Environment and its Application to Distributed Multicommodity Flow Problems

市場指向プログラミング環境と分散多品種フロー問題への応用

Market price systems constitute a well-understood class of mechanisms that under certain conditions provide effective decentralization of decision making with minimal communication overhead. In a market-oriented programming approach to distributed problem solving, we derive the activities and resource allocations for a set of computational agents by computing the competitive equilibrium of an artificial economy. WALRAS provides basic constructs for defining computational market structures, and protocols for deriving their corresponding price equilibria. In a particular realization of this approach for a form of multicommodity flow problem, we see that careful construction of the decision process according to economic principles can lead to efficient distributed resource allocation, and that the behavior of the system can be meaningfully analyzed in economic terms.

市場価格システムは、特定の条件下で最小限の通信オーバーヘッドで効果的な意思決定の分散化を実現する、広く理解されているメカニズムのクラスを構成します。分散問題解決に対する市場指向プログラミングアプローチでは、人工経済の競争均衡を計算することにより、計算エージェントの集合の活動と資源配分を導出します。WALRASは、計算市場構造を定義するための基本構造と、それに対応する価格均衡を導出するためのプロトコルを提供します。このアプローチを多品目フロー問題に適用した具体的な事例では、経済原則に従って意思決定プロセスを慎重に構築することで効率的な分散資源配分を実現できること、そしてシステムの挙動を経済的な観点から有意義に分析できることがわかります。

関連記事