The doctoral dissertation in the field of Computer Science will be examined at the Faculty of Science and Forestry.
What is the topic of your doctoral research, Nilima Shah?
The topic is novel image segmentation methods for machine vision applications. Machines are taking over human tasks, even mimicking our ability to see. A computer can see the outside world by means of a camera. Humans have the gift of vision and intelligence, which makes it simple for us to see and identify things, but making a computer do the same has proven to be a very difficult task. This is what machine vision is about. Computers use digital images, which represent a real image as a set of numbers. To convert the image into numbers, it is divided into small areas known as pixels. The imaging device captures the image and records a numerical value for each pixel. In black and white images, this value is the intensity of light. In color images, it is a color value that combines three primary colors: red, green and blue. Such images are called RGB images (Fig. 1). There are also non-optical images, such as ultrasound or X-ray images, in which the intensity of sound or X-rays is recorded.
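As a small illustration of an image being "a set of numbers", the sketch below stores a made-up 2 x 3 RGB image as nested lists and derives one intensity value per pixel. The pixel values and the helper name `to_intensity` are assumptions for this example, and plain channel averaging is only one simple way to obtain an intensity.

```python
# A tiny 2x3 RGB image: each pixel is a (red, green, blue) triple in 0..255.
# The values are invented purely for illustration.
rgb_image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]


def to_intensity(image):
    """Turn an RGB image into one intensity value per pixel.

    Averaging the three channels is an assumption made for simplicity;
    imaging devices and libraries often weight the channels differently.
    """
    return [[sum(pixel) // 3 for pixel in row] for row in image]


gray_image = to_intensity(rgb_image)
height, width = len(rgb_image), len(rgb_image[0])

print(f"{height} x {width} pixels")
for row in gray_image:
    print(row)
```

Each inner tuple is one pixel, so the whole picture is nothing more than a grid of numbers that a program can compare, group and modify.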
Digital images have many advantages. We can transfer them electronically almost instantaneously, convert them from one medium to another (computer to print or screen) and change them as required. A very promising use of digital images is automatic object recognition, where a computer recognizes an object in a digital image. Recognizing an object directly at the pixel level is very difficult for a computer. Instead, it should first extract larger objects. Objects in an image can be extracted by partitioning the image based on color, shape or borders. This process is called image segmentation. The resulting segments can be given intermediate labels such as human, sky, ground or building, and the final object recognition is done later. Image segmentation is used extensively in robotics, where robots equipped with cameras assemble parts, drive vehicles, clean homes, perform detection and surgery, detect human activity in security applications, and much more.
What are the key findings or observations of your doctoral research?
We attempt to handle the segmentation of generic images. The challenging part here is that the computer doesn’t know what it is looking for and still has to partition the image. In other words, we segment an image without a priori knowledge. Among the many methods that exist for image segmentation, the Mumford-Shah model is used extensively. This model specifies two criteria that are applied simultaneously to segment images. The first is the similarity criterion: find areas in the image whose pixels are similar. The second is the boundary length criterion: keep the boundaries of these areas as short as possible. We start from an initial segmentation that has fragmented, disjoint segments and crooked boundaries, and we want to obtain the optimal (best) segmentation, which has fewer fragments and a minimal boundary length (see Fig. 2). For example, among closed shapes enclosing the same area, a circle has the shortest boundary. To obtain the best result, we need to optimize the Mumford-Shah model using both criteria, that is, iteratively balance the two criteria by making changes to the grouping. There has been a lot of research on optimizing the Mumford-Shah model, but each of the existing approaches has some drawback. We have developed three novel methods to optimize the Mumford-Shah model.
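The balance between the two criteria can be sketched in a few lines of code. The example below is a minimal illustration, not the formulation used in the dissertation: it assumes a grayscale image, measures similarity as the squared deviation of each pixel from the mean of its segment, counts boundary length as the number of neighbouring pixel pairs with different labels, and combines the two with a hypothetical weight `lam`. All function names and pixel values are invented for the example.

```python
def segment_means(image, labels):
    """Mean intensity of every segment."""
    sums, counts = {}, {}
    for row_px, row_lb in zip(image, labels):
        for value, label in zip(row_px, row_lb):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def similarity_term(image, labels):
    """Sum of squared differences between pixels and their segment mean."""
    means = segment_means(image, labels)
    return sum(
        (value - means[label]) ** 2
        for row_px, row_lb in zip(image, labels)
        for value, label in zip(row_px, row_lb)
    )


def boundary_length(labels):
    """Count horizontally or vertically adjacent pixel pairs with different labels."""
    h, w = len(labels), len(labels[0])
    length = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                length += 1
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                length += 1
    return length


def energy(image, labels, lam):
    """Similarity term plus lam times the boundary length (lam is an assumed weight)."""
    return similarity_term(image, labels) + lam * boundary_length(labels)


# Dark left half, bright right half, one slightly off pixel at row 1, column 1.
image = [
    [10, 10, 200, 200],
    [10, 30, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
clean = [[0, 0, 1, 1] for _ in range(4)]          # two compact segments
fragmented = [row[:] for row in clean]
fragmented[1][1] = 2                              # the odd pixel gets its own segment

for lam in (0, 200):
    print(lam, energy(image, clean, lam), energy(image, fragmented, lam))
```

With `lam = 0` only similarity counts, so the fragmented labeling scores better; with a sufficiently large `lam` the extra boundary around the isolated pixel costs more than it gains, and the two-segment labeling wins.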
How can the results of your doctoral research be utilised in practice?
Modern machine-vision methods have higher-level goals such as extracting an anatomical object of interest for the diagnosis of diseases, scene classification, detection of human activity in visual surveillance, assigning a name to a human face in an image, and classifying handwritten characters. A crucial step in the design of such machine-vision systems is the extraction of discriminant objects from the images. Image segmentation is also used extensively in robotics, where robots equipped with cameras assemble parts, drive vehicles, clean homes, perform detection and surgery, and much more. The approach we have proposed can be applied to obtain a low-level segmentation based on color features, which can then be fed to these machine-vision systems.
What are the key research methods and materials used in your doctoral research?
To optimize the Mumford-Shah model, we propose three different approaches using (1) the Douglas-Rachford algorithm, (2) k-means clustering, and (3) the pairwise nearest neighbour (PNN) clustering algorithm. To apply the similarity criterion of the Mumford-Shah model, we use methods that group pixels into the same segment based on their RGB values. These methods are known as clustering methods. For the second criterion of the Mumford-Shah model, we have developed a new and simple method for finding the boundary length of an individual pixel and, in turn, the boundary length of a segment. Our first method obtains an initial segmentation using one of the clustering methods, k-means. The initial segmentation is then optimized using the Douglas-Rachford algorithm. This approach is somewhat complicated and slightly slower than the other methods. Our second method is a variant of the k-means clustering method. Standard k-means obtains the grouping using only the pixel values. In our variant, while grouping, we also consider whether including a pixel in a group increases or decreases the boundary length. Our third method is a variant of another clustering method, pairwise nearest neighbour. This method finds similar areas by merging pairs of segments, starting with segments as small as a single pixel. The pair to be merged is the one that is the most similar in terms of pixel values among all pairs. In our method, when choosing the pair to be merged, we also consider the boundary length: the increase in boundary length caused by the merge should be as small as possible. The methods were tested using a dataset containing 200 random images with labeled objects. The first two methods depend on the initial grouping. Our third method is slower but achieves better segmentation. It does not depend on the initial grouping and provides segmentations with different numbers of partitions.
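The idea behind the second method, letting the boundary length influence the k-means grouping, can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm from the dissertation: the per-pixel boundary cost is taken to be the number of the four neighbouring pixels that carry a different label, the weight `lam` is a hypothetical tuning parameter, and every function name and pixel value is invented for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def local_boundary(labels, y, x, label):
    """Number of 4-neighbours of pixel (y, x) whose label differs from `label`."""
    h, w = len(labels), len(labels[0])
    count = 0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != label:
            count += 1
    return count


def update_centroids(image, labels, old):
    """Recompute each centroid as the mean colour of its pixels."""
    k = len(old)
    sums = [[0.0, 0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    for row_px, row_lb in zip(image, labels):
        for pixel, label in zip(row_px, row_lb):
            counts[label] += 1
            for c in range(3):
                sums[label][c] += pixel[c]
    # Keep the old centroid if a cluster lost all of its pixels.
    return [
        tuple(s / counts[j] for s in sums[j]) if counts[j] else old[j]
        for j in range(k)
    ]


def boundary_aware_kmeans(image, centroids, lam, iterations=10):
    """k-means whose assignment cost also includes lam * local boundary length."""
    h, w, k = len(image), len(image[0]), len(centroids)
    # Initial segmentation: ordinary nearest-centroid assignment.
    labels = [
        [min(range(k), key=lambda j: squared_distance(image[y][x], centroids[j]))
         for x in range(w)]
        for y in range(h)
    ]
    for _ in range(iterations):
        changed = False
        for y in range(h):
            for x in range(w):
                best = min(
                    range(k),
                    key=lambda j: squared_distance(image[y][x], centroids[j])
                    + lam * local_boundary(labels, y, x, j),
                )
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        centroids = update_centroids(image, labels, centroids)
        if not changed:
            break
    return labels


dark, bright, odd = (10, 10, 10), (200, 200, 200), (150, 150, 150)
image = [
    [dark, dark, bright, bright],
    [dark, odd, bright, bright],
    [dark, dark, bright, bright],
    [dark, dark, bright, bright],
]
start = [dark, bright]  # assumed initial centroids

plain = boundary_aware_kmeans(image, start, lam=0)        # behaves like ordinary k-means
smooth = boundary_aware_kmeans(image, start, lam=30000)   # boundary length matters

for row_plain, row_smooth in zip(plain, smooth):
    print(row_plain, row_smooth)
```

With `lam = 0` the odd pixel joins the bright cluster because of its colour alone, which leaves an isolated fragment in the dark region. With a large enough `lam`, staying with its dark neighbours is cheaper, so the fragment disappears. The pairwise nearest neighbour variant described above applies the same trade-off when choosing which pair of segments to merge.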
Automated object segmentation is still an ongoing research problem, as there are many difficulties in segmenting real-world objects. An object can be captured from many different angles and under different lighting conditions, and each such variation produces an image that looks different to the computer. There can also be shadows, and adjacent or overlapping objects. While there is already success in specific applications, the problem of segmenting any image in general is still open.
The doctoral dissertation of Nilima Shah, MCA, entitled Optimizing Mumford-Shah Image Segmentation, will be examined at the Faculty of Science and Forestry on 13 December 2021 at noon. The Opponent will be Professor Tommi Kärkkäinen, University of Jyväskylä, and the Custos will be Professor Pasi Fränti, University of Eastern Finland. The language of the public defence is English. The public examination will be streamed live.
For further information, please contact:
Nilima Shah, nilima91@yahoo.com