For 1.) you might want to experiment with a Sobel filter, which gives you, e.g., the image derivative in the x direction. You can then threshold the result to get a binary image whose set pixels mark strong changes in the x direction. Finally, count the set pixels in every column: columns with high counts indicate the positions of clear phoneme starts/ends.
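The three steps for 1.) could be sketched roughly as follows. This is a NumPy-only sketch so that it runs on its own; with OpenCV, `cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)` computes the same x derivative. The function names, the `threshold` value and the `min_fraction` parameter are assumptions made for illustration and would need tuning on real images.

```python
import numpy as np

def sobel_x(img):
    """3x3 Sobel derivative in x direction (border pixels are dropped)."""
    img = np.asarray(img, dtype=np.float64)
    # Horizontal central difference: right neighbour minus left neighbour.
    d = img[:, 2:] - img[:, :-2]
    # Vertical smoothing with the Sobel weights 1, 2, 1.
    return d[:-2, :] + 2.0 * d[1:-1, :] + d[2:, :]

def column_edge_counts(img, threshold):
    """Count, per column, the pixels whose |d/dx| exceeds the threshold."""
    binary = np.abs(sobel_x(img)) > threshold  # binary edge image
    return binary.sum(axis=0)

def boundary_columns(img, threshold, min_fraction=0.5):
    """Columns (in original image coordinates) where at least
    min_fraction of the rows show a strong change in x direction."""
    img = np.asarray(img)
    counts = column_edge_counts(img, threshold)
    min_count = min_fraction * (img.shape[0] - 2)
    # +1 compensates for the border column dropped by sobel_x.
    return np.flatnonzero(counts >= min_count) + 1
```

For example, `boundary_columns(img, threshold=200)` returns the column indices that are candidates for a start/end; neighbouring indices usually belong to the same transition and can be merged.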

For 2.) you could use something like a histogram of gradients computed over a given region (along the lines of HOG, the histogram of oriented gradients).
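A minimal sketch of such a descriptor for one region, assuming a simplified single-cell variant (one magnitude-weighted orientation histogram, not the full HOG pipeline with cells and block normalisation). The function name and the choice of 8 bins are assumptions for illustration.

```python
import numpy as np

def gradient_histogram(region, bins=8):
    """Magnitude-weighted histogram of gradient orientations for one region."""
    region = np.asarray(region, dtype=np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(region)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180).
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    total = hist.sum()
    # Normalise so that regions of different size or contrast are comparable.
    return hist / total if total > 0 else hist
```

The resulting vectors of two regions can then be compared, e.g. with a Euclidean or chi-square distance.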