 


Submitted by James Matthews, posted on November 02, 2000




Image Description, by James Matthews



This is a screenshot of my edge detection/prototyping program. Basically, on the left is the original image and on the right is an image built from prototypes.

Prototypes are generated by the program looking at the general "features" of the image - for example, the program saw that solid black was one of the prominent features and assigned the colour red to those pixel groups. Look at the black group (second from right), then notice how the left-hand wing has a band of black down the side.

I just thought you guys might be interested in this. It is a little different to the normal IOTDs. It "may" relate to some graphics programming in that you can actually recreate the image (given the right number of prototypes and prototype resolution) quite impressively by using the prototypes. This technique can be utilized with sounds too.
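As a rough illustration of the prototype idea described above, here is a minimal Python sketch. It is not the code behind the screenshot: it assumes a prototype is simply the average of a group of similar 3x3 pixel blocks, found with plain k-means clustering, and every name in it (`extract_blocks`, `build_prototypes`, `rebuild`) is hypothetical.

```python
def extract_blocks(image, size=3):
    """Cut a greyscale image (list of rows of 0-255 ints) into size x size blocks."""
    blocks = []
    for y in range(0, len(image) - size + 1, size):
        for x in range(0, len(image[0]) - size + 1, size):
            blocks.append([image[y + dy][x + dx]
                           for dy in range(size) for dx in range(size)])
    return blocks


def distance(a, b):
    """Squared Euclidean distance between two flattened blocks."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(block, prototypes):
    """Index of the prototype closest to the block."""
    return min(range(len(prototypes)), key=lambda i: distance(block, prototypes[i]))


def build_prototypes(blocks, count, passes=10):
    """Plain k-means: each prototype ends up as the mean of the blocks assigned to it."""
    count = min(count, len(blocks))
    # Assumption: seed with evenly spaced blocks so the result is deterministic.
    step = max(1, len(blocks) // count)
    prototypes = [list(blocks[i * step]) for i in range(count)]
    for _ in range(passes):
        groups = [[] for _ in prototypes]
        for block in blocks:
            groups[nearest(block, prototypes)].append(block)
        for i, group in enumerate(groups):
            if group:  # keep the old prototype if nothing was assigned to it
                prototypes[i] = [sum(col) / len(group) for col in zip(*group)]
    return prototypes


def rebuild(image, prototypes, size=3):
    """Recreate the image by pasting the nearest prototype over every block.

    Pixels outside the last full block (when the image size is not a
    multiple of `size`) are left at 0 in this sketch.
    """
    out = [[0] * len(image[0]) for _ in image]
    for y in range(0, len(image) - size + 1, size):
        for x in range(0, len(image[0]) - size + 1, size):
            block = [image[y + dy][x + dx] for dy in range(size) for dx in range(size)]
            proto = prototypes[nearest(block, prototypes)]
            for dy in range(size):
                for dx in range(size):
                    out[y + dy][x + dx] = round(proto[dy * size + dx])
    return out


# A 6x6 test image: top half black, bottom half white.
image = [[0] * 6 for _ in range(3)] + [[255] * 6 for _ in range(3)]
prototypes = build_prototypes(extract_blocks(image), count=2)
rebuilt = rebuild(image, prototypes)
```

With only two prototypes the toy image is rebuilt exactly, because it contains just two kinds of block. A real photograph would need more prototypes, and the quality of the result would depend on both the prototype count and the block size.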

I'm thinking more along the lines of making a program that looks at a lot of pictures of terrain and then builds prototypes from that data - then you classify some real-world data (let's say a terrain map of England) using the prototypes and get relatively realistic results, using perhaps only 20-30 prototypes. For more details see http://www.generation5.org/vision.shtml.

- James Matthews
Generation5: http://www.generation5.org/
"...At the forefront of Artificial Intelligence..."


Message Center / Reader Comments:
 
Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
 
Cyble

November 02, 2000, 03:36 AM

Could it be used for voice/character recognition? Maybe hooking
it up to a webcam so it could recognize human faces ;) or use those
human faces as a texture map.

Excellent idea, using such a program for texture grabbing for landscapes and such!

 
remo

November 02, 2000, 04:15 AM

that's a very good idea actually - i imagine it would be something a bit like a neural network travelled backwards to create the original image...

i think the problem is that you'd get a lot of artifacts however, much like you get artifacts in human drawings.

 
Cybok^Guideline

November 02, 2000, 04:58 AM

Using that technique for sound is what I'm currently researching. I guess I can find some useful info at your site, gonna check it out right away. Oh, if anyone has some really good links about reducing waveforms to equaliser bands and their amplitudes ((Fast) Fourier)... my mailbox is always open :)
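On the equaliser-band question, the basic idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a naive O(n^2) discrete Fourier transform where a real program would use an FFT, it reports the peak magnitude per band, and the function names and band edges are made up for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns the magnitude of bins 0..n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(total) / n)
    return mags


def equaliser_bands(samples, sample_rate, edges):
    """Peak magnitude inside each frequency band; edges are band limits in Hz."""
    mags = dft_magnitudes(samples)
    n = len(samples)
    bands = [0.0] * (len(edges) - 1)
    for k, mag in enumerate(mags):
        freq = k * sample_rate / n  # centre frequency of bin k
        for b in range(len(bands)):
            if edges[b] <= freq < edges[b + 1]:
                bands[b] = max(bands[b], mag)
    return bands


# A 1 kHz sine sampled at 8 kHz should show up only in the 500-2000 Hz band.
rate = 8000
wave = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(256)]
bands = equaliser_bands(wave, rate, edges=[0, 500, 2000, 4001])
```

Here the middle band comes out at about 0.5 (half the sine's amplitude, since a real signal splits its energy between positive and negative frequencies) and the other two stay near zero.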

James->ever thought of integrating pattern recognition, where you search for patterns in pixels, like strokes, or blocks?

 
John Jensen

November 02, 2000, 05:09 AM

What techniques are you using? Wavelets?

The pictures look good by the way, whatever you're doing it looks quite accurate.

Cheers

John

 
James Matthews

November 02, 2000, 05:39 AM

Right, lemme see if I can answer the questions, comments etc. one by one. I doubt the technique is advanced enough to really be used to recognize individual faces - to recognize the fact that the pixels make up "a face", perhaps. Actually, off-topic, something else I wanted to mess with that has an application to gaming - a guy at MIT took pictures of 63 people, male and female, from around the MIT lab. He found that (after scaling them so their features generally lined up) he could create *any* one of the faces by creating composites of the others (without using the original). I see a great application to games there - take shots of the game programming team and you can quickly create a HUGE number of faces for your ambient characters within the game. I find that character uniformity is one of the few things stopping some games from being really realistic.

Yeah, the artifacts thing is a little problem, but with some more processing you could probably get rid of that. What you see is a one-pass thing (the only reason it is so damn slow is because of the Windows GDI - I am planning to use Microsoft's Vision SDK to speed it up).

Cybok: in a way, I suppose I am searching for blocks and strokes. It is just that the computer decides what kind of strokes and blocks to look for! I keep meaning to download some interesting pictures to test, but never got around to it...I'll try later. I have a document somewhere on DSP, I'll send it to you when I find it.

What I forgot to mention is that the program and source code are at:

http://www.generation5.org/edgedetect.shtml

Thanks,

James.

 
vodzurk

November 02, 2000, 08:02 AM

Apparently if you break the input image into smaller sections, and apply the gradient filter thingey (been a while since I studied this), it processes the histogram for that section better. Only problem is where the sections meet.
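One possible reading of the suggestion above, sketched in Python: treat each section (tile) separately and stretch its intensities to the full 0-255 range. This is an assumption about what "processes the histogram for that section better" means, not the poster's actual method, and the function names are hypothetical.

```python
def stretch_tile(image, out, y0, x0, size):
    """Stretch one tile so its darkest pixel maps to 0 and its brightest to 255."""
    ys = range(y0, min(y0 + size, len(image)))
    xs = range(x0, min(x0 + size, len(image[0])))
    values = [image[y][x] for y in ys for x in xs]
    lo, hi = min(values), max(values)
    span = hi - lo or 1  # avoid dividing by zero on flat tiles
    for y in ys:
        for x in xs:
            out[y][x] = (image[y][x] - lo) * 255 // span


def stretch_by_tiles(image, size):
    """Apply the contrast stretch independently to every size x size tile."""
    out = [[0] * len(image[0]) for _ in image]
    for y0 in range(0, len(image), size):
        for x0 in range(0, len(image[0]), size):
            stretch_tile(image, out, y0, x0, size)
    return out


# One row holding a smooth ramp, 8 pixels wide, split into two 4-pixel tiles.
ramp = [[10, 20, 30, 40, 50, 60, 70, 80]]
result = stretch_by_tiles(ramp, size=4)
# result == [[0, 85, 170, 255, 0, 85, 170, 255]]
```

The output also shows the problem where sections meet: the neighbouring input values 40 and 50 end up as 255 and 0, so a visible seam appears at the tile boundary even though the input was smooth.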

This probably doesn't make any sense, but what the hell, it's my first post.

 
Chaoswizard

November 02, 2000, 09:47 AM

How fast is your technique? I know stuff like this is not usually meant for real-time apps, but if someone can do it in real-time, I think some very interesting effects could be developed.

Chaoswizard

 
Don Neufeld

November 02, 2000, 10:16 AM

Some thoughts on the performance of your algorithm:

GDI is rarely slow, but it is often misused, as it is in your app. I don't know what you mean by microsoft's "Vision SDK", but the solution to your problem is simple: stop calling 9 GetPixels and a SetPixel in your inner loop! Just do a GetDIBits on the bitmap and work directly on the bytes of the bitmap, then use SetDIBits to get the data back into the bitmap.
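The pattern being recommended (fetch the whole bitmap once, then index the raw bytes) can be sketched in Python as below. This is an illustration only, not the code from the program under discussion: the flat `pixels` buffer stands in for the block of bytes that GetDIBits would fill in, the 3x3 filter is a generic Laplacian-style one chosen for the example, and the function name is hypothetical. A real DIB also pads each row to a multiple of four bytes, which this sketch ignores.

```python
def edge_detect(pixels, width, height):
    """3x3 edge detection on a flat 8-bit greyscale buffer.

    `pixels` is a bytes-like object of width*height values, stored row after row.
    """
    out = bytearray(width * height)  # border pixels stay 0
    for y in range(1, height - 1):
        row = y * width
        for x in range(1, width - 1):
            i = row + x
            # Sum of the eight neighbours, read straight from the buffer
            # instead of through one function call per pixel.
            neighbours = (pixels[i - width - 1] + pixels[i - width] + pixels[i - width + 1]
                          + pixels[i - 1] + pixels[i + 1]
                          + pixels[i + width - 1] + pixels[i + width] + pixels[i + width + 1])
            # Response: how far the centre is from the average of its surroundings.
            out[i] = min(255, abs(8 * pixels[i] - neighbours) // 8)
    return out


# 5x5 image: black everywhere except a white centre pixel.
width = height = 5
pixels = bytearray(width * height)
pixels[2 * width + 2] = 255
edges = edge_detect(pixels, width, height)
```

The whole result lives in a second buffer, which is the part that would be handed back in one go (SetDIBits in the Win32 case) rather than written pixel by pixel.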

Also, you will want to provide the definition of the function CED256View::UpdateTotals before the function CED256View::OnEditEdgeDetect, as it can be inlined and will net you a further small speed increase (5%).

 
Cybok^Guideline

November 02, 2000, 11:14 AM

chaoswizard->what you mention is an understatement. When coded correctly, and running realtime, it can produce some frightening results.
I have been thinking about several applications using a variation on this technique in realtime, and it has patent & business potential.
That's the reason I didn't mention it :)
No, but seriously, you can build 'dope shit' with it.
Sorry I can't go too deep into the subject, because I'm not developing these ideas (nothing much concrete yet) alone.

 
malkia/eastern.analog

November 02, 2000, 11:47 AM

As far as I know Microsoft Vision SDK is based on
ImageMagics library - www.ImageMagics.com - and
there are a lot of filters and stuff like this.
Wish it was easier for installation and use
(i think after decompression it took 56MBs and
thousands of files).

Hope some day something will be integrated into
windows for directly doing such stuff....

 
James Matthews

November 02, 2000, 01:14 PM

Oooohh, Get/SetDIBits - never saw that before :) Hey, thanks. About the Vision SDK, here is the link:

http://research.microsoft.com/projects/VisSDK/

I'll try the inline thing first. I'm one for getting the darn thing to work before working on optimizations - thanks for the comments.

It can be very handy for realtime applications, since you can go from 256 levels of detail to 9 (or whatever you wish) relatively easily. It is the typical speed-for-resolution trade.
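If "levels" is read as intensity levels (an assumption, since the post does not spell it out), the reduction from 256 to 9 can be sketched as a simple bucketing step. The function names are hypothetical.

```python
def reduce_levels(pixels, levels=9):
    """Map 0-255 intensities onto `levels` evenly sized buckets, numbered from 0."""
    return [value * levels // 256 for value in pixels]


def bucket_to_grey(bucket, levels=9):
    """Pick a representative grey (the bucket's midpoint) for display."""
    return (2 * bucket + 1) * 256 // (2 * levels)


reduced = reduce_levels([0, 28, 29, 128, 255])
# reduced == [0, 0, 1, 4, 8]
```

Fewer levels mean fewer distinct values for later stages to compare, which is where the speed gain would come from, at the cost of merging intensities such as 0 and 28 into the same bucket.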

Regards,

James.

 
bit64

November 02, 2000, 05:04 PM

Hey nice image.

A couple of things I've been thinking about while reading these posts.
What about generating a vector graphic based on the tangents of some of your more prominent curves? This would allow you to store a relatively small amount of information (the vector graphic) in memory.
Why?
Well, think about this. A machine with a camera connected "looks" at several hundred images a day, processes them with your technique there, and then stores a vector graphic based on tangents to the prominent curves.
The "prototype" image is stored on the hard drive for future reference. Then, using a genetic algorithm with a fitness function based on the vector graphic alone, the computer can determine what an object is in realtime just by looking at it.
Of course it wouldn't know it was a "banana" per se, but it would know that it has seen that object before and perhaps how to react to it.
I dunno, just something I thought of while reading those posts.

 
zed zeek

November 03, 2000, 10:34 AM

look at the plane a diff way the computer will never know its a plane ( at least not in this lifetime 2020)!
i ve gotta admit i luv this (ie computers have shown to have less brains than monkeys) they even struggle with chess.
how long for computers think, will they ever wipe me ass? ive gotta say storiebooks r a long way from the reality

 
goltrpoat

November 03, 2000, 10:35 PM

zed, you're not very familiar with the recent developments in the field of robotics, are you :)

 
fluffy

November 04, 2000, 02:08 AM

goltr: Wow, they have robots to wipe your ass now? :)

 
bit64

November 08, 2000, 09:49 AM

zed, I'm not sure I understand your response. Are you saying that a computer could not analyze that picture and recognize it as a jet?
Because it could, very easily. In my computer vision class in college we wrote programs to do just that. Give the program a picture of a pile of nails, and it could count the nails and then tell you that they were nails.

That's a pretty simple algorithm too. Given further research and development, with the power of computers today, it would be possible for the computer to tell you the manufacturer of the nails, their approximate weight and the material they are made of.

As far as wiping your ass, that would be fairly easy to do. I could take my Lego Mindstorms and devise an ass-wiping machine in just a few minutes for you, if you would like. I would probably charge the cost of the Mindstorms + 500.00 if you're interested ;)

 
This thread contains 16 messages.
 
 