Wednesday, June 28, 2006
AI Community Project -- Sort Of
I actually had a dream about this last night. The idea is not very well thought through, as there are more interesting things for a guy to dream about at night, if you know what I mean.
The idea is to build a community project website where everyone could feed bits of knowledge to build a base for an Artificial Intelligence.
First Stage - Object Recognition
The first thing the AI needs to distinguish is objects. Community members -- or just casual passers-by -- could enter names of various objects. The AI would periodically run a search for these words and gather an image base. Users could help identify which pictures are relevant and which are not.
By applying well-established shape recognition algorithms, the AI could start building an object database and, eventually, start recognizing objects in images on its own.
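To make the user feedback step a bit more concrete, here is a minimal sketch of how relevance votes might be turned into an image base. Everything in it is my own assumption rather than a worked-out design: the function name `relevant_images`, the vote format, and the thresholds `min_votes` and `min_ratio` are all hypothetical.

```python
from collections import defaultdict


def relevant_images(votes, min_votes=3, min_ratio=0.7):
    """Build an image base from user votes.

    votes: iterable of (object_name, image_id, is_relevant) tuples.
    An image is kept for an object only if it received at least
    min_votes votes and at least min_ratio of them were "relevant".
    Both thresholds are arbitrary assumptions.
    """
    tally = defaultdict(lambda: [0, 0])  # (object, image) -> [yes, total]
    for name, image_id, is_relevant in votes:
        counts = tally[(name, image_id)]
        if is_relevant:
            counts[0] += 1
        counts[1] += 1

    image_base = defaultdict(list)
    for (name, image_id), (yes, total) in tally.items():
        if total >= min_votes and yes / total >= min_ratio:
            image_base[name].append(image_id)
    return {name: sorted(ids) for name, ids in image_base.items()}


votes = (
    [("bottle", "img1", True)] * 3
    + [("bottle", "img2", True), ("bottle", "img2", False), ("bottle", "img2", False)]
    + [("glass", "img3", True)]
)
image_base = relevant_images(votes)
```

With these sample votes only `img1` survives for "bottle": `img2` is voted down, and `img3` has too few votes to count yet.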
Second Stage - Composite Objects and Action Recognition
Once the AI recognizes a critical mass of objects, the project could be taken one level up: recognizing composite objects and relating them to each other, resulting in action recognition.
A number of possible actions could be associated with each object. Again, images could be used to give the AI visual clues for distinguishing the action.
For example:
milk pours
man runs
building collapses
By relating the various objects and action info in a composite image, the AI could actually start recognizing what's going on in the picture.
E.g.:
girl [object] + bottle [object] + white liquid (milk) [bottle property] + glass [object] + bottle at nonvertical angle (pours) [action] + bottle close to a glass [relation]
=
girl pours milk into glass from a bottle
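The composition above could be sketched as a single hand-written rule. This is only an illustration under my own assumptions: the function `describe_scene`, the `angle_from_vertical` and `contents` property names, the `close_to` relation, and the 45 degree tilt threshold are all made up for the example, and a real system would need many such rules (or a way to learn them).

```python
def describe_scene(objects, properties, relations):
    """Combine recognized objects, properties and relations into a sentence.

    objects:    set of recognized object names
    properties: dict mapping object name -> dict of its properties
    relations:  set of (object_a, relation, object_b) tuples
    Returns a description, or None if the pouring rule does not match.
    """
    bottle = properties.get("bottle", {})
    pouring = (
        {"bottle", "glass"} <= objects
        # Assumed threshold: a bottle tilted more than 45 degrees pours.
        and bottle.get("angle_from_vertical", 0) > 45
        and ("bottle", "close_to", "glass") in relations
    )
    if not pouring:
        return None
    subject = "girl" if "girl" in objects else "someone"
    liquid = bottle.get("contents", "liquid")
    return f"{subject} pours {liquid} into glass from a bottle"


description = describe_scene(
    objects={"girl", "bottle", "glass"},
    properties={"bottle": {"contents": "milk", "angle_from_vertical": 60}},
    relations={("bottle", "close_to", "glass")},
)
```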
Third Stage - Video Recognition
The introduction of video makes action recognition even simpler, as the AI has not one but numerous pictures to analyze (something like 30 pictures a second).
By relating the analysis results of consecutive images (frames), the AI could work out what is going on even more easily.
Example #1:
Frame #1: tower [object] + angle 90 [property]
Frame #2: tower [object] + angle 89 [property]
Frame #3: tower [object] + angle 88 [property]
Result: Tower is falling.
Example #2:
Frame #1: man [object] + size X [property]
Frame #2: man [object] + size X+1 [property]
Frame #3: man [object] + size X+2 [property]
Result: A man is moving towards the camera.
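Both examples boil down to watching how one property changes from frame to frame. A rough sketch, assuming each frame has already been reduced to a dict with an `object` name and numeric `angle` or `size` properties (the helper names `trend` and `interpret` are mine, and so is the frame format):

```python
def trend(values):
    """Return 'increasing', 'decreasing', or 'none' for per-frame values."""
    if len(values) < 2:
        return "none"
    deltas = [b - a for a, b in zip(values, values[1:])]
    if all(d > 0 for d in deltas):
        return "increasing"
    if all(d < 0 for d in deltas):
        return "decreasing"
    return "none"


def interpret(frames):
    """Turn per-frame properties of one object into action descriptions."""
    if not frames:
        return []
    name = frames[0]["object"]
    results = []
    # A steadily shrinking angle is read as the object falling over.
    if all("angle" in f for f in frames):
        if trend([f["angle"] for f in frames]) == "decreasing":
            results.append(f"{name} is falling")
    # A steadily growing apparent size is read as approaching the camera.
    if all("size" in f for f in frames):
        if trend([f["size"] for f in frames]) == "increasing":
            results.append(f"{name} is moving towards the camera")
    return results


tower = [{"object": "tower", "angle": a} for a in (90, 89, 88)]
man = [{"object": "man", "size": s} for s in (10, 11, 12)]
tower_result = interpret(tower)
man_result = interpret(man)
```

Real footage would be noisy, so a strict frame-by-frame comparison like this would probably need smoothing or a tolerance before it could be trusted.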
Fourth Stage - Speech Recognition and Communications
This is the part where I woke up. But I promise to figure that out by the time someone implements the three previous stages ;)
Promotion
For such a project to work, it needs a critical mass of contributors. Therefore media attention is a must. It should be fairly easy to get, since many popular sources (hi-tech blogs, for example) tend to feature such crazy projects.
Btw, Gizmodo, it's time to feature this site, don't you think? C'mon, you're my third favorite daily read.