Concept Learning
Definition: Concept learning is the problem of learning a function that maps examples into two classes, positive and negative. We are given a database of examples that have already been classified as positive or negative. Put differently, concept learning is the process of inducing, from such examples, a function that maps input examples to a Boolean output.
Examples:
- Classifying objects in astronomical images as stars or galaxies
- Classifying animals as vertebrates or invertebrates
Example: Classifying Mushrooms
- Class of Tasks: Predicting poisonous mushrooms
- Performance: Accuracy of classification
- Experience: Database describing mushrooms with their class
- Knowledge to learn: Function mapping mushrooms to {0,1}, where 0 means not-poisonous and 1 means poisonous
- Representation of target knowledge: conjunction of attribute values.
- Learning mechanism: candidate-elimination
Representation of instances:
Features:
- color {red, brown, gray}
- size {small, large}
- shape {round, elongated}
- land {humid, dry}
- air humidity {low,high}
- texture {smooth, rough}
Input and Output Spaces:
- X: The space of all possible examples (input space).
- Y: The space of classes (output space).
An example in X is a feature vector x.
- For instance: x = (red, small, elongated, humid, low, rough)
- X is the cross product of the value sets of all features.
Only a small subset of instances is available in the database of examples.
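To make the size of the input space concrete, the following is a minimal sketch that enumerates X as the cross product of the feature value sets listed above. The names FEATURES and INPUT_SPACE are illustrative and not part of the original notes.

```python
from itertools import product

# Feature value sets as listed in the notes (dictionary name is illustrative).
FEATURES = {
    "color": ["red", "brown", "gray"],
    "size": ["small", "large"],
    "shape": ["round", "elongated"],
    "land": ["humid", "dry"],
    "air_humidity": ["low", "high"],
    "texture": ["smooth", "rough"],
}

# X is the cross product of the value sets of all features.
INPUT_SPACE = list(product(*FEATURES.values()))

# 3 * 2 * 2 * 2 * 2 * 2 = 96 possible mushrooms.
print(len(INPUT_SPACE))

# The example vector from the notes is one element of X.
example = ("red", "small", "elongated", "humid", "low", "rough")
print(example in INPUT_SPACE)
```

With only 96 possible instances this space is small enough to enumerate, but a training database would still typically contain only a few of them.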
Training Examples:
D : The set of training examples.
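D is a set of pairs { (x, c(x)) }, where x is an example in X and c is the target concept. c can be viewed as a function from X to Y or, equivalently, as the subset of the universe of discourse (the set of all possible instances) that it labels positive.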
Example of D:
((red,small,round,humid,low,smooth), poisonous)
((red,small,elongated,humid,low,smooth), poisonous)
((gray,large,elongated,humid,low,rough), not-poisonous)
((red,small,elongated,humid,high,rough), poisonous)
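As a sketch, the four training pairs above could be stored as a list of (feature vector, label) tuples. The 1/0 encoding follows the mapping given earlier (1 for poisonous, 0 for not-poisonous); the name TRAINING_SET is an assumption made for illustration.

```python
# Each entry is (x, c(x)): a feature vector and its class label.
# Feature order: color, size, shape, land, air humidity, texture.
TRAINING_SET = [
    (("red", "small", "round", "humid", "low", "smooth"), 1),
    (("red", "small", "elongated", "humid", "low", "smooth"), 1),
    (("gray", "large", "elongated", "humid", "low", "rough"), 0),
    (("red", "small", "elongated", "humid", "high", "rough"), 1),
]

# Split D into positive and negative examples of the concept.
positives = [x for x, label in TRAINING_SET if label == 1]
negatives = [x for x, label in TRAINING_SET if label == 0]

print(len(positives), "positive,", len(negatives), "negative")
```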
Hypothesis Representation
Any hypothesis h is a function from X to Y
h: X -> Y
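We will explore the space of conjunctions of attribute constraints: a hypothesis specifies, for each feature, either a required value or one of the special symbols.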
Special symbols:
- ?: any value is acceptable
- 0: no value is acceptable
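One possible way to encode such a conjunctive hypothesis is as a tuple with one constraint per feature, where each constraint is a specific value, "?" or "0". The sketch below assumes that encoding; the function name classify and the sample hypothesis h are hypothetical.

```python
ANY = "?"   # any value is acceptable for this feature
NONE = "0"  # no value is acceptable for this feature

def classify(hypothesis, example):
    """Return 1 if the example satisfies every constraint, else 0."""
    for constraint, value in zip(hypothesis, example):
        if constraint == NONE:
            return 0  # a "0" constraint rejects every example
        if constraint != ANY and constraint != value:
            return 0  # a specific value must match exactly
    return 1

# Hypothetical hypothesis: "red AND small AND humid land".
h = ("red", "small", ANY, "humid", ANY, ANY)

print(classify(h, ("red", "small", "elongated", "humid", "low", "rough")))
print(classify(h, ("gray", "large", "elongated", "humid", "low", "rough")))
```

The first call returns 1 because all three specified constraints match; the second returns 0 because the color constraint fails.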
Hypotheses Space:
The space of all hypotheses is denoted by H.
Let h be a hypothesis in H.
Let x be an example of a mushroom.
If h(x) = 1 then x is classified as poisonous; otherwise x is classified as not-poisonous.
Our goal is to find a hypothesis h* that is as “close” as possible to the target concept c; ideally, h*(x) = c(x) for every x in X.
A hypothesis is said to “cover” those examples it classifies as positive.
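The notion of covering, and of how closely a hypothesis agrees with the labels, can be illustrated on the four training examples given earlier. This is a minimal sketch under the tuple encoding assumed here; classify, accuracy and the two hypotheses are illustrative names, not part of the original notes.

```python
def classify(hypothesis, example):
    """Return 1 if the example satisfies every constraint, else 0."""
    ok = all(c == "?" or c == v for c, v in zip(hypothesis, example))
    return 1 if ok else 0

# The training examples from the notes, labels encoded as 1/0.
D = [
    (("red", "small", "round", "humid", "low", "smooth"), 1),
    (("red", "small", "elongated", "humid", "low", "smooth"), 1),
    (("gray", "large", "elongated", "humid", "low", "rough"), 0),
    (("red", "small", "elongated", "humid", "high", "rough"), 1),
]

def covered(hypothesis, data):
    """Examples the hypothesis classifies as positive."""
    return [x for x, _ in data if classify(hypothesis, x) == 1]

def accuracy(hypothesis, data):
    """Fraction of examples whose predicted label equals the given label."""
    hits = sum(classify(hypothesis, x) == label for x, label in data)
    return hits / len(data)

h1 = ("red", "small", "?", "humid", "?", "?")  # hypothetical hypothesis
h2 = ("?", "?", "elongated", "?", "?", "?")    # hypothetical hypothesis

print(len(covered(h1, D)), accuracy(h1, D))
print(len(covered(h2, D)), accuracy(h2, D))
```

Here h1 covers exactly the three poisonous examples, while h2 covers one negative example and misses one positive, so it agrees with the labels on only half of D.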
Assumptions:
- We will explore the space of all conjunctions.
- We assume the target concept falls within this space.
- A hypothesis that approximates the target concept c well over a sufficiently large set of training examples will also achieve high accuracy on unobserved examples. (Inductive Learning Hypothesis)
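The assumption that the target concept lies in the space of conjunctions is a strong one, which a short count makes visible. The sketch below assumes each feature constraint is either a specific value, "?" or "0", and that any hypothesis containing "0" covers no example; the variable names are illustrative.

```python
from math import prod

# Number of values per feature: color, size, shape, land, air humidity, texture.
DOMAIN_SIZES = [3, 2, 2, 2, 2, 2]

# Size of the input space X.
num_instances = prod(DOMAIN_SIZES)

# Syntactically distinct hypotheses: each feature takes a value, "?" or "0".
num_syntactic = prod(k + 2 for k in DOMAIN_SIZES)

# Semantically distinct hypotheses: all hypotheses containing "0" cover
# nothing and collapse into one; the rest use a value or "?" per feature.
num_semantic = 1 + prod(k + 1 for k in DOMAIN_SIZES)

# Every subset of X is a possible target concept.
num_concepts = 2 ** num_instances

print(num_instances, num_syntactic, num_semantic)
print(num_semantic < num_concepts)
```

Under these assumptions the conjunctive space contains 973 distinct hypotheses, compared with 2^96 possible concepts over the 96 instances, so most concepts cannot be expressed as a single conjunction.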