Become A Computer Vision Artist With Stanford’s Game Changing ‘Outpainting’ Algorithm (With Github Link)

Overview

Stanford researchers have designed ‘Outpainting’ – an algorithm that extrapolates and extends existing images

At the core of the algorithm are GANs; these were used on a dataset of 36,500 images, with 100 held out in the validation set

You can try it out yourself using a Keras implementation that has been open sourced on GitHub

Introduction

If you are a keen follower of AVBytes, you must have read about a technique called “inpainting” (read up on that in case you haven’t yet, it’s really worth it). It is a popular computer vision technique that aims to restore missing parts in an image and has produced some exquisite results, as you will see in that article. Current state-of-the-art methods for inpainting involve GANs (Generative Adversarial Networks) and CNNs (Convolutional Neural Networks).

It was only a matter of time before someone from the ML community figured out a technique that goes beyond the scope of inpainting. This breakthrough has come from a couple of Stanford researchers, Mark Sabini and Gili Rusak, and the new technique is appropriately named “outpainting”.

This approach extends the use of GANs for inpainting to estimate and imagine what the existing image might look like beyond what can be seen. Then the algorithm expands the image and paints what it has estimated – and the results, as you can see in the image below, are truly astounding.

For the dataset, the researchers used 36,500 images of 256×256 size, which were downsampled to 128×128. 100 images were held out for the validation set.

Even the research paper for outpainting has been written in a user-friendly format. Instead of the usual page after page of theory, the paper is just 2 pages – one that explains how the technique was derived and how it works, and a second that contains the references. Check out the image of the first page below, which lays out a step-by-step approach for designing and executing outpainting:

Wondering how to implement this on your own? Wonder no more – use this GitHub repository as your stepping stone. It is a Keras implementation of outpainting in Python. It gives you the option to either build your model from scratch or use the pretrained model the creator has uploaded. Get started now!

Our take on this

What an awesome concept! If this doesn’t get your interest in computer vision going, I don’t know what will. Take this course to learn all about computer vision with deep learning, and get started on your path toward becoming a CV expert!

For the Keras model, there’s a caveat here – as you’ll read in this Reddit discussion thread, there’s a chance that the Keras model was overfitted. The showcased images were present in the training set itself, so the model was able to extrapolate them convincingly. The model still did fairly well when tested on unseen data, but not as well as first imagined. But don’t let that dissuade you! The Stanford technique is still solid, and there will be far more refined frameworks coming soon using outpainting. Hope to see one from our Analytics Vidhya community as well!

Subscribe to AVBytes here



Computer Vision Using Opencv – With Practical Examples

This article was published as a part of the Data Science Blogathon

Hello Readers!!

OpenCV is a very popular library for computer vision and image processing tasks. It is one of the most widely used open-source Python libraries for working with image data.

It is used in various tasks such as image denoising, image thresholding, edge detection, corner detection, contours, image pyramids, image segmentation, face detection and many more. If you want to know more about OpenCV, check this link.

📌 If you want to know about Python Libraries For Image Processing, then check this Link.

📌If you want to learn Image processing using NumPy, check this link.


Table of Contents

IMPORT LIBRARIES

RGB IMAGE AND RESIZING

GRAYSCALE IMAGE

IMAGE DENOISING

IMAGE THRESHOLDING

IMAGE GRADIENTS

EDGE DETECTION

FOURIER TRANSFORM ON IMAGE

LINE TRANSFORM

CORNER DETECTION

MORPHOLOGICAL TRANSFORMATION OF IMAGE

GEOMETRIC TRANSFORMATION OF IMAGE

CONTOURS

IMAGE PYRAMIDS

COLORSPACE CONVERSION AND OBJECT TRACKING

INTERACTIVE FOREGROUND EXTRACTION

IMAGE SEGMENTATION

IMAGE INPAINTING

TEMPLATE MATCHING

FACE AND EYE DETECTION

IMPORT LIBRARIES 

Import all the required libraries using the below commands:

import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline

# Paths to the two input images used throughout this article
# (illustrative file names; substitute your own images)
paths = ['../input/cv-images/image_1.jpg', '../input/cv-images/image_2.jpg']

RGB IMAGE AND RESIZING

An RGB image, where RGB stands for Red, Green, and Blue, can be considered as three images (one per channel) stacked on top of each other. It is also nicknamed the ‘True Color Image’, as it represents a real-life image as closely as possible and is based on human perception of colours.

The RGB colour model is used to display images on cameras, televisions, and computers.

Resizing all images to a particular height and width will ensure uniformity and thus makes processing them easier since images are naturally available in different sizes.

If the size is reduced, though the processing is faster, data might be lost in the image. If the size is increased, the image may appear fuzzy or pixelated. Additional information is usually filled in using interpolation.

height = 224
width = 224
font_size = 20

plt.figure(figsize=(15, 8))
for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    resized_img = cv2.resize(img, (height, width))
    plt.subplot(1, 2, i + 1).set_title(name[:-4], fontsize=font_size); plt.axis('off')
    plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
plt.show()

GRAYSCALE IMAGE 

Grayscale images are images in shades of grey. They represent the degree of luminosity and carry the intensity information of the pixels in the image. Black is the weakest intensity and white is the strongest intensity.

Grayscale images are efficient as they are simpler and faster than colour images during image processing.

plt.figure(figsize=(15, 8))
for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, 0)
    resized_img = cv2.resize(img, (height, width))
    plt.subplot(1, 2, i + 1).set_title(f'Grayscale {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(resized_img, cmap='gray')
plt.show()

IMAGE DENOISING

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    resized_img = cv2.resize(img, (height, width))
    denoised_img = cv2.medianBlur(resized_img, 5)
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Original {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
    plt.subplot(1, 2, 2).set_title(f'After Median Filtering of {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(cv2.cvtColor(denoised_img, cv2.COLOR_BGR2RGB))
    plt.show()

IMAGE THRESHOLDING

Image Thresholding is self-explanatory. If the pixel value in an image is above a certain threshold, a particular value is assigned and if it is below the threshold, another particular value is assigned.

Adaptive Thresholding does not use a single global threshold value. Instead, a threshold is set for each small region of the image. Hence, there are different thresholds across the image, which produces better results for images with non-uniform illumination. There are different Adaptive Thresholding methods.

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, 0)
    resized_img = cv2.resize(img, (height, width))
    denoised_img = cv2.medianBlur(resized_img, 5)
    th = cv2.adaptiveThreshold(denoised_img, maxValue=255, adaptiveMethod=cv2.ADAPTIVE_THRESH_GAUSSIAN_C, thresholdType=cv2.THRESH_BINARY, blockSize=11, C=2)
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Grayscale {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(resized_img, cmap='gray')
    plt.subplot(1, 2, 2).set_title(f'After Adaptive Thresholding of {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(th, cmap='gray')  # thresholded result is single-channel
    plt.show()

IMAGE GRADIENTS

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, 0)
    resized_img = cv2.resize(img, (height, width))
    laplacian = cv2.Laplacian(resized_img, cv2.CV_64F)
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Grayscale {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(resized_img, cmap='gray')
    plt.subplot(1, 2, 2).set_title(f'After finding Laplacian Derivatives of {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(laplacian, cmap='gray')  # Laplacian output is single-channel
    plt.show()

EDGE DETECTION

Edge Detection is performed using Canny Edge Detection, which is a multi-stage algorithm. The stages are as follows:

Noise Reduction – Smoothen the image using a Gaussian filter

Find Intensity Gradient – Using the Sobel kernel, find the first derivative in the horizontal (Gx) and vertical (Gy) directions.

Non-maximum Suppression – Thin the edges by keeping only pixels that are local maxima along the gradient direction.

Hysteresis Thresholding – Classify edges as strong or weak using two thresholds, and keep weak edges only if they are connected to strong ones.

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, 0)
    resized_img = cv2.resize(img, (height, width))
    edges = cv2.Canny(resized_img, threshold1=100, threshold2=200)
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Grayscale {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(resized_img, cmap='gray')
    plt.subplot(1, 2, 2).set_title(f'After Canny Edge Detection of {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(edges, cmap='gray')  # edge map is single-channel
    plt.show()

FOURIER TRANSFORM ON IMAGE

Fourier Transform analyzes the frequency characteristics of an image. Discrete Fourier Transform is used to find the frequency domain.

Fast Fourier Transform (FFT) is an efficient way of computing the Discrete Fourier Transform. Frequency is usually higher at the edges or wherever noise is present. When the FFT is applied to an image, the zero-frequency (DC) component appears at the top-left corner of the result; to bring it to the centre of the spectrum, the result is shifted by N/2 in both the horizontal and vertical directions.

Finally, the magnitude spectrum of the result is computed. The Fourier Transform is helpful in object detection, as each object has a distinct magnitude spectrum.

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, 0)
    resized_img = cv2.resize(img, (height, width))
    freq = np.fft.fft2(resized_img)
    freq_shift = np.fft.fftshift(freq)
    magnitude_spectrum = 20 * np.log(np.abs(freq_shift))
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Grayscale {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(resized_img, cmap='gray')  # grayscale input is single-channel
    plt.subplot(1, 2, 2).set_title(f'Magnitude Spectrum of {name[:-4]} Image', fontsize=font_size); plt.axis('off')
    plt.imshow(magnitude_spectrum, cmap='gray')
    plt.show()

LINE TRANSFORM

Hough Transform can detect any shape even if it is distorted when presented in mathematical form. A line in the cartesian coordinate system y = mx + c can be put in its polar coordinate system as rho = xcosθ + ysinθ. rho is the perpendicular distance from the origin to the line and θ is the angle formed by the horizontal axis and the perpendicular line in the clockwise direction.

So, the line is represented in these two terms (rho, θ). An array is created for these two terms where rho forms the rows and θ forms the columns. This is called the accumulator. rho is the distance resolution of the accumulator in pixels and θ is the angle resolution of the accumulator in radians.

For every line, its (x, y) values can be put into (rho, θ) values. For every (rho, θ) pair, the accumulator is incremented. This is repeated for every point on the line. A particular (rho, θ) cell is voted for the presence of a line.

This way the cell with the maximum votes implies a presence of a line at rho distance from the origin and at angle θ degrees.

min_line_length = 100
max_line_gap = 10

img = cv2.imread('../input/cv-images/hough-min.png')
resized_img = cv2.resize(img, (height, width))
img_copy = resized_img.copy()
edges = cv2.Canny(resized_img, threshold1=50, threshold2=150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=100, minLineLength=min_line_length, maxLineGap=max_line_gap)
for line in lines:
    for x1, y1, x2, y2 in line:
        hough_lines_img = cv2.line(resized_img, (x1, y1), (x2, y2), color=(0, 255, 0), thickness=2)

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Original Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('After Hough Line Transformation', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(hough_lines_img, cv2.COLOR_BGR2RGB))
plt.show()

CORNER DETECTION

Harris Corner finds the difference in intensity for a displacement in all directions to detect a corner.

img = cv2.imread('../input/cv-images/corners-min.jpg')
resized_img = cv2.resize(img, (height, width))
img_copy = resized_img.copy()
gray = cv2.cvtColor(resized_img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
corners = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
corners = cv2.dilate(corners, None)
# Mark the detected corners in red (threshold on the Harris response)
resized_img[corners > 0.01 * corners.max()] = [0, 0, 255]

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Original Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('After Harris Corner Detection', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
plt.show()

MORPHOLOGICAL TRANSFORMATION OF IMAGE 

Morphological Transformation is usually applied on binary images where it takes an image and a kernel which is a structuring element as inputs. Binary images may contain imperfections like texture and noise.

These transformations help in correcting these imperfections by accounting for the form of the image

kernel = np.ones((5, 5), np.uint8)

plt.figure(figsize=(15, 8))
img = cv2.imread('../input/cv-images/morph-min.jpg', cv2.IMREAD_COLOR)
resized_img = cv2.resize(img, (height, width))
morph_open = cv2.morphologyEx(resized_img, cv2.MORPH_OPEN, kernel)
morph_close = cv2.morphologyEx(morph_open, cv2.MORPH_CLOSE, kernel)
plt.subplot(1, 2, 1).set_title('Original Digit - 7 Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('After Morphological Opening and Closing of Digit - 7 Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(morph_close, cv2.COLOR_BGR2RGB))
plt.show()

GEOMETRIC TRANSFORMATION OF IMAGE 

Geometric Transformation of images is achieved by two transformation functions namely cv2.warpAffine and cv2.warpPerspective that receive a 2×3 and 3×3 transformation matrix respectively.

pts1 = np.float32([[1550, 1170], [2850, 1370], [50, 2600], [1850, 3450]])
pts2 = np.float32([[0, 0], [4160, 0], [0, 3120], [4160, 3120]])

img = cv2.imread('../input/cv-images/book-min.jpg', cv2.IMREAD_COLOR)
transformation_matrix = cv2.getPerspectiveTransform(pts1, pts2)
final_img = cv2.warpPerspective(img, M=transformation_matrix, dsize=(4160, 3120))

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (256, 256))
final_img = cv2.cvtColor(final_img, cv2.COLOR_BGR2RGB)
final_img = cv2.resize(final_img, (256, 256))

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Original Book Image', fontsize=font_size); plt.axis('off')
plt.imshow(img)
plt.subplot(1, 2, 2).set_title('After Perspective Transformation of Book Image', fontsize=font_size); plt.axis('off')
plt.imshow(final_img)
plt.show()

CONTOURS 

Contours are outlines representing the shape or form of objects in an image. They are useful in object detection and recognition. Binary images produce better contours. There are separate functions for finding and drawing contours.

plt.figure(figsize=(15, 8))
img = cv2.imread('contours-min.jpg', cv2.IMREAD_COLOR)
resized_img = cv2.resize(img, (height, width))
contours_img = resized_img.copy()
img_gray = cv2.cvtColor(resized_img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(img_gray, thresh=127, maxval=255, type=cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_NONE)
cv2.drawContours(contours_img, contours, contourIdx=-1, color=(0, 255, 0), thickness=2)
plt.subplot(1, 2, 1).set_title('Original Image', fontsize=font_size); plt.axis('off')
plt.imshow(resized_img)
plt.subplot(1, 2, 2).set_title('After Finding Contours', fontsize=font_size); plt.axis('off')
plt.imshow(contours_img)
plt.show()

IMAGE PYRAMIDS 

Images have a resolution which is the measure of the information in the image. In certain scenarios of image processing like Image Blending, working with images of different resolutions is necessary to make the blend look more realistic.

In OpenCV, images of high resolution can be converted to low resolution and vice-versa. By converting a higher-level image to a lower-level image, the lower-level image becomes 1/4th the area of the higher-level image.

When this is done for a number of iterations and the resultant images are placed next to each other in order, it looks like it is forming a pyramid and hence its name ‘Image Pyramid’

R = cv2.imread('GR-min.jpg', cv2.IMREAD_COLOR)
R = cv2.resize(R, (224, 224))
H = cv2.imread('../input/cv-images/H-min.jpg', cv2.IMREAD_COLOR)
H = cv2.resize(H, (224, 224))

G = R.copy()
guassian_pyramid_c = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    guassian_pyramid_c.append(G)

G = H.copy()
guassian_pyramid_d = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    guassian_pyramid_d.append(G)

laplacian_pyramid_c = [guassian_pyramid_c[5]]
for i in range(5, 0, -1):
    GE = cv2.pyrUp(guassian_pyramid_c[i])
    L = cv2.subtract(guassian_pyramid_c[i - 1], GE)
    laplacian_pyramid_c.append(L)

laplacian_pyramid_d = [guassian_pyramid_d[5]]
for i in range(5, 0, -1):
    guassian_expanded = cv2.pyrUp(guassian_pyramid_d[i])
    L = cv2.subtract(guassian_pyramid_d[i - 1], guassian_expanded)
    laplacian_pyramid_d.append(L)

laplacian_joined = []
for lc, ld in zip(laplacian_pyramid_c, laplacian_pyramid_d):
    r, c, d = lc.shape
    lj = np.hstack((lc[:, 0: int(c / 2)], ld[:, int(c / 2):]))
    laplacian_joined.append(lj)

laplacian_reconstructed = laplacian_joined[0]
for i in range(1, 6):
    laplacian_reconstructed = cv2.pyrUp(laplacian_reconstructed)
    laplacian_reconstructed = cv2.add(laplacian_reconstructed, laplacian_joined[i])

direct = np.hstack((R[:, : int(c / 2)], H[:, int(c / 2):]))

plt.figure(figsize=(30, 20))
plt.subplot(2, 2, 1).set_title('Golden Retriever', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(R, cv2.COLOR_BGR2RGB))
plt.subplot(2, 2, 2).set_title('Husky', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(H, cv2.COLOR_BGR2RGB))
plt.subplot(2, 2, 3).set_title('Direct Joining', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(direct, cv2.COLOR_BGR2RGB))
plt.subplot(2, 2, 4).set_title('Pyramid Blending', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(laplacian_reconstructed, cv2.COLOR_BGR2RGB))
plt.show()

COLORSPACE CONVERSION AND OBJECT TRACKING 

Colourspace conversions such as BGR↔Gray and BGR↔HSV are possible. The BGR↔Gray conversion was previously seen. HSV stands for Hue, Saturation, and Value respectively.

Since HSV describes images in terms of their hue, saturation, and value instead of RGB where R, G, B are all co-related to colour luminance, object discrimination is much easier with HSV images than RGB images.

lower_white = np.array([0, 0, 150])
upper_white = np.array([255, 255, 255])

img = cv2.imread('../input/cv-images/color_space_cat.jpg', cv2.IMREAD_COLOR)
img = cv2.resize(img, (height, width))
background = cv2.imread("../input/cv-images/galaxy.jpg", cv2.IMREAD_COLOR)
background = cv2.resize(background, (height, width))

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_img, lowerb=lower_white, upperb=upper_white)
final_img = cv2.bitwise_and(img, img, mask=mask)
final_img = np.where(final_img == 0, background, final_img)

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Original Cat Image', fontsize=font_size); plt.axis('off')
plt.imshow(img)
plt.subplot(1, 2, 2).set_title('After Object Tracking using Color-space Conversion of Cat Image', fontsize=font_size); plt.axis('off')
plt.imshow(final_img)
plt.show()

INTERACTIVE FOREGROUND EXTRACTION 

The foreground of the image is extracted using user input and the Gaussian Mixture Model (GMM).

img = cv2.imread('Cat.jpg', cv2.IMREAD_COLOR)
img = cv2.resize(img, (height, width))
img_copy = img.copy()

mask = np.zeros(img.shape[:2], np.uint8)
background_model = np.zeros((1, 65), np.float64)
foreground_model = np.zeros((1, 65), np.float64)
rect = (10, 10, 224, 224)

cv2.grabCut(img, mask=mask, rect=rect, bgdModel=background_model, fgdModel=foreground_model, iterCount=5, mode=cv2.GC_INIT_WITH_RECT)
# Pixels marked as background or probable background become 0, the rest become 1
new_mask = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
img = img * new_mask[:, :, np.newaxis]

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Original Cat Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('After Interactive Foreground Extraction of Cat Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

IMAGE SEGMENTATION

Image Segmentation is done using the Watershed Algorithm. This algorithm treats the grayscale image as hills and valleys representing high and low-intensity regions respectively. If these valleys are filled with coloured water and as the water rises, depending on the peaks, different valleys with different coloured water will start to merge.

To avoid this, barriers can be built which gives the segmentation result. This is the concept of the Watershed algorithm. This is an interactive algorithm as one can specify which pixels belong to an object or background. The pixels that one is unsure about can be marked as 0. Then the watershed algorithm is applied on this where it updates the labels given and all the boundaries are marked as -1

img = cv2.imread('lymphocytes-min.jpg', cv2.IMREAD_COLOR)
resized_img = cv2.resize(img, (height, width))
img_copy = resized_img.copy()

gray = cv2.cvtColor(resized_img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, thresh=0, maxval=255, type=cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
opening = cv2.morphologyEx(thresh, op=cv2.MORPH_OPEN, kernel=kernel, iterations=2)
background = cv2.dilate(opening, kernel=kernel, iterations=5)
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, foreground = cv2.threshold(dist_transform, thresh=0.2 * dist_transform.max(), maxval=255, type=cv2.THRESH_BINARY)
foreground = np.uint8(foreground)
unknown = cv2.subtract(background, foreground)

ret, markers = cv2.connectedComponents(foreground)
markers = markers + 1
markers[unknown == 255] = 0
markers = cv2.watershed(resized_img, markers)
resized_img[markers == -1] = [0, 0, 255]

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Lymphocytes Image', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('After Watershed Algorithm', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
plt.show()

IMAGE INPAINTING 

Images may be damaged and require fixing. For example, an image may have no pixel information in certain portions. Image Inpainting will fill all the missing information with the help of the surrounding pixels.

mask = cv2.imread('mask.png', 0)
mask = cv2.resize(mask, (height, width))

for i, path in enumerate(paths):
    name = os.path.split(path)[-1]
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    resized_img = cv2.resize(img, (height, width))
    ret, th = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    inverted_mask = cv2.bitwise_not(th)
    damaged_img = cv2.bitwise_and(resized_img, resized_img, mask=inverted_mask)
    result = cv2.inpaint(resized_img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
    plt.figure(figsize=(15, 8))
    plt.subplot(1, 2, 1).set_title(f'Damaged Image of {name[:-4]}', fontsize=font_size); plt.axis('off')
    plt.imshow(cv2.cvtColor(damaged_img, cv2.COLOR_BGR2RGB))
    plt.subplot(1, 2, 2).set_title(f'After Image Inpainting of {name[:-4]}', fontsize=font_size); plt.axis('off')
    plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
    plt.show()

TEMPLATE MATCHING 

Template Matching matches the template provided to the image in which the template must be found. The template is compared to each patch of the input image. This is similar to a 2D convolution operation. It results in a grayscale image where each pixel denotes the similarity of the neighbourhood pixels to that of the template.

From this output, the maximum/minimum value is determined. This can be regarded as the top-left corner coordinates of the rectangle. By also considering the width and height of the template, the resultant rectangle is the region of the template in the image.

# Load the image and the face template
# (illustrative file names; substitute the images you want to match)
img = cv2.imread('../input/cv-images/selena_taylor.jpg', cv2.IMREAD_COLOR)
img_copy = img.copy()
template = cv2.imread('../input/cv-images/selena_face.jpg', cv2.IMREAD_COLOR)

h, w, c = template.shape
method = cv2.TM_CCOEFF
result = cv2.matchTemplate(img, templ=template, method=method)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(img, top_left, bottom_right, color=(255, 0, 0), thickness=3)

plt.figure(figsize=(30, 20))
plt.subplot(2, 2, 1).set_title('Image of Selena Gomez and Taylor Swift', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(2, 2, 2).set_title('Face Template of Selena Gomez', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(template, cv2.COLOR_BGR2RGB))
plt.subplot(2, 2, 3).set_title('Matching Result', fontsize=35); plt.axis('off')
plt.imshow(result, cmap='gray')
plt.subplot(2, 2, 4).set_title('Detected Face', fontsize=35); plt.axis('off')
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

FACE AND EYE DETECTION

It is done by using Haar Cascades. Check the below code for face and eye detection:

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')

img = cv2.imread('../input/cv-images/elon-min.jpg')
img = cv2.resize(img, (height, width))
img_copy = img.copy()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (fx, fy, fw, fh) in faces:
    img = cv2.rectangle(img, (fx, fy), (fx + fw, fy + fh), (255, 0, 0), 2)
    roi_gray = gray[fy:fy + fh, fx:fx + fw]
    roi_color = img[fy:fy + fh, fx:fx + fw]
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex, ey, ew, eh) in eyes:
        cv2.rectangle(roi_color, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)

plt.figure(figsize=(15, 8))
plt.subplot(1, 2, 1).set_title('Elon Musk', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB))
plt.subplot(1, 2, 2).set_title('Elon Musk - After Face and Eyes Detections', fontsize=font_size); plt.axis('off')
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

End Notes

So in this article, we had a detailed discussion on Computer Vision Using OpenCV. I hope you learned something from this blog and that it will help you in the future. Thanks for reading and for your patience. Good luck!

You can check my articles here: Articles

Email id: [email protected]

Connect with me on LinkedIn: LinkedIn

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.


Performing Computer Vision Task With Opencv And Python

This article was published as a part of the Data Science Blogathon

Introduction

Data is often defined as raw facts. Information refers to data that has undergone processing, and it holds more value than the raw facts themselves. The key difference is that decisions can be taken, and actions made, based on the study of information: you should not base decisions and actions on raw data, only on information.

Source: StudiousGuy

Recap

Following our previous article on Computer Vision, we are now going to further explore the world of Computer Vision in the Python Programming Language, using the OpenCV Python package. This article will show us how to perform a few more of the many operations offered by OpenCV in the Python Programming Language. In the previous article, we examined the below block of code and will now look at a few more aspects to it, in this article.

Python Code:
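As a quick refresher, here is a minimal sketch of what that code might look like, assuming the OpenCV logo has been saved locally as 'opencv_logo.png' (an illustrative file name):

import cv2

# Read the image from disk in GRAYSCALE mode (the cv2.IMREAD_GRAYSCALE flag, i.e. 0)
image = cv2.imread('opencv_logo.png', cv2.IMREAD_GRAYSCALE)

# Display it in a GUI window until the user presses a key
cv2.imshow('OpenCV Logo', image)
cv2.waitKey()
cv2.destroyAllWindows()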



Source: Medium

Understanding CV Basics

As you may have seen, in our previous article, we have read (loaded) the OpenCV Logo image into our system memory, using the OpenCV library’s built-in method, imread(). Upon utilizing this method, we passed in two arguments, namely, a filename and a flag. The filename specified the name and location of the file on your personal computer, while the flag could be seen as the color setting for the image. Upon recollection, we remember that we read the image into our memory in a GRAYSCALE color format.

Now, the most crucial aspect to understanding OpenCV:

Images are data. When you use the OpenCV imread() method, you are converting that raw image data into another datatype. The new datatype is one that all of us on Analytics Vidhya are very familiar with: a NumPy array of integers. Each element of the array represents a pixel’s colour intensity and may hold one or more values (one per channel). Since we have loaded our image in GRAYSCALE colour format, each pixel is represented by a single value ranging from zero (0) to two hundred and fifty-five (255). As one moves from 0 to higher values, the intensity of the pixel increases, making it more striking to the eye.

255 = The Color White.

Essentially what I am trying to convey is as below(I have omitted the source as it is the same as the previous article, and also to allow for the flow of information):

We started with the image below:

Next, we returned the image in GRAYSCALE and obtained the image in a new format as below:

Now, we shall print the contents of the variable which is storing our GRAYSCALE image:

# the variable `image` stores our GRAYSCALE image
print(image)

We receive the output of the above code as follows:

Now, we shall print the type of the variable image:

print(type(image))

The output will be seen as follows:

Fundamentally, OpenCV has transformed our image into a NumPy array, in which there are values from 0 to 255 representing pixel intensity, that correspond to the colors we see in the GRAYSCALE image. Remember GRAYSCALE images will always return an array in which each pixel has a single value that ranges from 0 (Black) to 255 (White).

Returning The Shape of The Array.

Let us print the shape of the NumPy array to the console.

print(image.shape)

The output will be seen as follows:

Our array has 600 rows and 487 columns. In image terminology, one would say that the image has the dimensions 600 pixels (height), by 487 pixels (width).

Printing The Image Using Pixels

Since OpenCV has transformed our image pixels into a NumPy array with integers, we may perform NumPy operations on the array containing image pixel values, and manipulate the array.

Our array is 2-Dimensional. This means it has rows and columns. Let us perform a bit of indexing and slicing on the array, and return the contents.

cv2.imshow("AV", image[0:100])
cv2.waitKey()
cv2.destroyAllWindows()

If one is familiar with the technique of indexing and slicing, one will be able to see that we are attempting to slice a portion of our image (NumPy) array. Again, it is crucial to understand and be conscious of the fact that OpenCV Library in Python Programming Language represents its images and associated objects as NumPy nd-Arrays.

Source: Indian AI Production.

Code Explanation.

Line-by-line explanations for the above block of code are as follows:

cv2.imshow("Analytics Vidhya Computer Vision", image[0:100])

The imshow() method is used to display an image to the screen employing a GUI. However, in this particular instance we passed in a name for the GUI window, and only a portion of the pixel array, using slicing. Specifically, we wish to return the first 100 rows (height) from the image.

cv2.waitKey()

This will wait indefinitely for user interaction (a key press) before execution continues and the window can be closed. You may pass an integer value as an argument representing the duration (in milliseconds) the GUI window should wait before continuing automatically.

cv2.destroyAllWindows()

The above line of code will terminate all active/open OpenCV GUI windows. To terminate a specific GUI window, pass its name as a string to cv2.destroyWindow() instead.

The output will be seen as follows:

Thus, we have successfully returned the first 100 rows of pixels from our image. Feel free to experiment with the image, and look at whether the pixel at a particular position, matches that of the color found on the image itself.

This concludes my article on Computer Vision With Python. I do hope that you enjoyed reading through this article, and have learned a new concept.

Please feel free to connect with me on LinkedIn.

Thank you for your time.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.


Computer Vision Tutorial: A Step


Introduction

What’s the first thing you do when you’re attempting to cross the road? We typically look left and right, take stock of the vehicles on the road, and make our decision. Our brain is able to analyze, in a matter of milliseconds, what kind of vehicle (car, bus, truck, auto, etc.) is coming towards us. Can machines do that?

Now, there are multiple ways of dealing with computer vision challenges. The most popular approach I have come across is based on identifying the objects present in an image, aka, object detection. But what if we want to dive deeper? What if just detecting objects isn’t enough – we want to analyze our image at a much more granular level?

As data scientists, we are always curious to dig deeper into the data. Asking questions like these is why I love working in this field!

In this article, I will introduce you to the concept of image segmentation. It is a powerful computer vision algorithm that builds upon the idea of object detection and takes us to a whole new level of working with image data. This technique opens up so many possibilities – it has blown my mind.

What Is Image Segmentation?

Let’s understand image segmentation using a simple example. Consider the below image:

There’s only one object here – a dog. We can build a straightforward cat-dog classifier model and predict that there’s a dog in the given image. But what if we have both a cat and a dog in a single image?

We can train a multi-label classifier, in that instance. Now, there’s another caveat – we won’t know the location of either animal/object in the image.

That’s where image localization comes into the picture (no pun intended!). It helps us to identify the location of a single object in the given image. In case we have multiple objects present, we then rely on the concept of object detection (OD). We can predict the location along with the class for each object using OD.

Before detecting the objects and even before classifying the image, we need to understand what the image consists of. Enter – Image Segmentation.

How Does Image Segmentation Work?

We can divide or partition the image into various parts called segments. It’s not a great idea to process the entire image at the same time as there will be regions in the image which do not contain any information. By dividing the image into segments, we can make use of the important segments for processing the image. That, in a nutshell, is how image segmentation works.

An image is a collection or set of different pixels. We group together the pixels that have similar attributes using image segmentation. Take a moment to go through the below visual (it’ll give you a practical idea of image segmentation):

Source : cs231n.stanford.edu

Object detection builds a bounding box corresponding to each class in the image. But it tells us nothing about the shape of the object. We only get the set of bounding box coordinates. We want to get more information – this is too vague for our purposes.

Image segmentation creates a pixel-wise mask for each object in the image. This technique gives us a far more granular understanding of the object(s) in the image.

Why do we need to go this deep? Can’t all image processing tasks be solved using simple bounding box coordinates? Let’s take a real-world example to answer this pertinent question.

What Is Image Segmentation Used For?

The shape of the cancerous cells plays a vital role in determining the severity of the cancer. You might have put the pieces together – object detection will not be very useful here. We will only generate bounding boxes which will not help us in identifying the shape of the cells.

Image Segmentation techniques make a MASSIVE impact here. They help us approach this problem in a more granular manner and get more meaningful results. A win-win for everyone in the healthcare industry.

Source: Wikipedia

Here, we can clearly see the shapes of all the cancerous cells. There are many other applications where Image segmentation is transforming industries:

Traffic Control Systems

Self Driving Cars

Locating objects in satellite images

Different Types of Image Segmentation

We can broadly divide image segmentation techniques into two types. Consider the below images:

Can you identify the difference between these two? Both the images are using image segmentation to identify and locate the people present.

In image 1, every pixel belongs to a particular class (either background or person). Also, all the pixels belonging to a particular class are represented by the same color (background as black and person as pink). This is an example of semantic segmentation

Image 2 has also assigned a particular class to each pixel of the image. However, different objects of the same class have different colors (Person 1 as red, Person 2 as green, background as black, etc.). This is an example of instance segmentation

Let me quickly summarize what we’ve learned. If there are 5 people in an image, semantic segmentation will focus on classifying all the people as a single instance. Instance segmentation, on the other hand, will identify each of these people individually.

So far, we have delved into the theoretical concepts of image processing and segmentation. Let’s mix things up a bit – we’ll combine learning concepts with implementing them in Python. I strongly believe that’s the best way to learn and remember any topic.

Region-based Segmentation

One simple way to segment different objects could be to use their pixel values. An important point to note – the pixel values will be different for the objects and the image’s background if there’s a sharp contrast between them.

In this case, we can set a threshold value. The pixel values falling below or above that threshold can be classified accordingly (as an object or the background). This technique is known as Threshold Segmentation.

If we want to divide the image into two regions (object and background), we define a single threshold value. This is known as the global threshold.

If we have multiple objects along with the background, we must define multiple thresholds. These thresholds are collectively known as the local threshold.

Let’s implement what we’ve learned in this section. Download this image and run the below code. It will give you a better understanding of how thresholding works (you can use any image of your choice if you feel like experimenting!).

First, we’ll import the required libraries.

View the code on Gist.

Let’s read the downloaded image and plot it:

View the code on Gist.
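A minimal sketch of these two steps, assuming the downloaded image has been saved as '1.jpeg' (an illustrative file name):

import numpy as np
import matplotlib.pyplot as plt

# Read the downloaded image and plot it
image = plt.imread('1.jpeg')
plt.imshow(image)
plt.show()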

It is a three-channel image (RGB). We need to convert it into grayscale so that we only have a single channel. Doing this will also help us get a better understanding of how the algorithm works.

Python Code:
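A minimal sketch of the conversion, continuing from the `image` array read above (the luminance weights below are one common choice for a grayscale conversion):

# Collapse the three RGB channels into one using standard luminance weights
gray = np.dot(image[..., :3], [0.299, 0.587, 0.114])

plt.imshow(gray, cmap='gray')
plt.show()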



Now, we want to apply a certain threshold to this image. This threshold should separate the image into two parts – the foreground and the background. Before we do that, let’s quickly check the shape of this image:

gray.shape

(192, 263)

The height and width of the image is 192 and 263 respectively. We will take the mean of the pixel values and use that as a threshold. If the pixel value is more than our threshold, we can say that it belongs to an object. If the pixel value is less than the threshold, it will be treated as the background. Let’s code this:

View the code on Gist.
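A minimal sketch of mean-value thresholding, continuing from the grayscale array `gray` above:

# Use the mean pixel value as a single global threshold:
# pixels brighter than the mean become foreground (1), the rest background (0)
threshold = gray.mean()
binary = (gray > threshold).astype(int)

plt.imshow(binary, cmap='gray')
plt.show()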

Nice! The darker region (black) represents the background and the brighter (white) region is the foreground. We can define multiple thresholds as well to detect multiple objects:

View the code on Gist.

Threshold segmentation has some clear advantages:

Calculations are simpler

Fast operation speed

When the object and background have high contrast, this method performs really well

But there are some limitations to this approach. When we don’t have significant grayscale difference, or there is an overlap of the grayscale pixel values, it becomes very difficult to get accurate segments.

Edge Detection Segmentation

What divides two objects in an image? There is always an edge between two adjacent regions with different grayscale values (pixel values). The edges can be considered as the discontinuous local features of an image.

We can make use of this discontinuity to detect edges and hence define a boundary of the object. This helps us in detecting the shapes of multiple objects present in a given image. Now the question is how can we detect these edges? This is where we can make use of filters and convolutions. Refer to this article if you need to learn about these concepts.

The below visual will help you understand how a filter convolves over an image:

Here’s the step-by-step process of how this works:

Take the weight matrix

Put it on top of the image

Perform element-wise multiplication and get the output

Move the weight matrix as per the stride chosen

Convolve until all the pixels of the input are used

One such weight matrix is the sobel operator. It is typically used to detect edges. The sobel operator has two weight matrices – one for detecting horizontal edges and the other for detecting vertical edges. Let me show how these operators look and we will then implement them in Python.

Sobel filter (horizontal) =

 1   2   1
 0   0   0
-1  -2  -1

Sobel filter (vertical) =

-1   0   1
-2   0   2
-1   0   1

Edge detection works by convolving these filters over the given image. Let’s visualize them on this image.

View the code on Gist.

It should be fairly simple for us to understand how the edges are detected in this image. Let’s convert it into grayscale and define the sobel filter (both horizontal and vertical) that will be convolved over this image:

View the code on Gist.

Now, convolve this filter over the image using the convolve function of the ndimage package from scipy.

View the code on Gist.

Let’s plot these results:

View the code on Gist. View the code on Gist.
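A minimal sketch of these steps, using scipy's ndimage.convolve and assuming `gray` is the grayscale array from earlier:

from scipy import ndimage

# Define the horizontal and vertical Sobel kernels
sobel_horizontal = np.array([[ 1,  2,  1],
                             [ 0,  0,  0],
                             [-1, -2, -1]], dtype=float)
sobel_vertical = np.array([[-1, 0, 1],
                           [-2, 0, 2],
                           [-1, 0, 1]], dtype=float)

# Convolve each kernel over the grayscale image
out_horizontal = ndimage.convolve(gray, sobel_horizontal, mode='reflect')
out_vertical = ndimage.convolve(gray, sobel_vertical, mode='reflect')

# Plot the two edge maps
plt.imshow(out_horizontal, cmap='gray'); plt.show()
plt.imshow(out_vertical, cmap='gray'); plt.show()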

Here, we are able to identify the horizontal as well as the vertical edges. There is one more type of filter that can detect both horizontal and vertical edges at the same time. This is called the laplace operator:

 1   1   1
 1  -8   1
 1   1   1

Let’s define this filter in Python and convolve it on the same image:

View the code on Gist.

Next, convolve the filter and print the output:

View the code on Gist.
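A similar sketch for the Laplace operator, continuing from the previous snippet:

# A single kernel that responds to both horizontal and vertical edges
laplace = np.array([[1,  1, 1],
                    [1, -8, 1],
                    [1,  1, 1]], dtype=float)

out_laplace = ndimage.convolve(gray, laplace, mode='reflect')

plt.imshow(out_laplace, cmap='gray')
plt.show()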

Here, we can see that our method has detected both horizontal as well as vertical edges. I encourage you to try it on different images and share your results with me. Remember, the best way to learn is by practicing!

Clustering-based Image Segmentation

This idea might have come to you while reading about image segmentation. Can’t we use clustering techniques to divide images into segments? We certainly can!

In this section, we’ll get an intuition of what clustering is (it’s always good to revise certain concepts!) and how we can use it to segment images.

Clustering is the task of dividing the population (data points) into a number of groups, such that data points in the same groups are more similar to other data points in that same group than those in other groups. These groups are known as clusters.

K-means Clustering

One of the most commonly used clustering algorithms is k-means. Here, the k represents the number of clusters (not to be confused with k-nearest neighbor). Let’s understand how k-means works:

First, randomly select k initial clusters

Randomly assign each data point to any one of the k clusters

Calculate the centers of these clusters

Calculate the distance of all the points from the center of each cluster

Depending on this distance, the points are reassigned to the nearest cluster

Calculate the center of the newly formed clusters

Finally, repeat steps (4), (5) and (6) until either the center of the clusters does not change or we reach the set number of iterations

Let’s put our learning to the test and check how well k-means segments the objects in an image. We will be using this image, so download it, read it, and check its dimensions:

View the code on Gist.

It’s a 3-dimensional image of shape (192, 263, 3). For clustering the image using k-means, we first need to convert it into a 2-dimensional array whose shape will be (length*width, channels). In our example, this will be (192*263, 3).

View the code on Gist.

(50496, 3)

We can see that the image has been converted to a 2-dimensional array. Next, fit the k-means algorithm on this reshaped array and obtain the clusters. The cluster_centers_ attribute of the fitted k-means model will return the cluster centers, and the labels_ attribute will give us the label for each pixel (it will tell us which pixel of the image belongs to which cluster).

View the code on Gist.

I have chosen 5 clusters for this article but you can play around with this number and check the results. Now, let’s bring back the clusters to their original shape, i.e. 3-dimensional image, and plot the results.

View the code on Gist.
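A minimal sketch of this clustering step with scikit-learn, assuming `image` is the (192, 263, 3) array read earlier (5 clusters, as used in the text):

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Flatten the image to (number of pixels, number of channels) for clustering
pixels = image.reshape(-1, 3)

# Cluster the pixel values into 5 groups
kmeans = KMeans(n_clusters=5, random_state=0).fit(pixels)

# Replace every pixel with the centre of its cluster and restore the image shape
segmented = kmeans.cluster_centers_[kmeans.labels_]
segmented = segmented.reshape(image.shape).astype('uint8')

plt.imshow(segmented)
plt.show()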

Amazing, isn’t it? We are able to segment the image pretty well using just 5 clusters. I’m sure you’ll be able to improve the segmentation by increasing the number of clusters.

k-means works really well when we have a small dataset. It can segment the objects in the image and give impressive results. But the algorithm hits a roadblock when applied on a large dataset (more number of images).

It looks at all the samples at every iteration, so the time taken is too high. Hence, it’s also too expensive to implement. And since k-means is a distance-based algorithm, it only applies to convex datasets and is unsuitable for clustering non-convex clusters.

Finally, let’s look at a simple, flexible and general approach for image segmentation.

Mask R-CNN

Data scientists and researchers at Facebook AI Research (FAIR) pioneered a deep learning architecture, called Mask R-CNN, that can create a pixel-wise mask for each object in an image. This is a really cool concept so follow along closely!

Mask R-CNN is an extension of the popular Faster R-CNN object detection architecture. Mask R-CNN adds a branch to the already existing Faster R-CNN outputs. The Faster R-CNN method generates two things for each object in the image:

Its class

The bounding box coordinates

Mask R-CNN adds a third branch to this which outputs the object mask as well. Take a look at the below image to get an intuition of how Mask R-CNN works on the inside:

Source: arxiv.org

We take an image as input and pass it to the ConvNet, which returns the feature map for that image

Region proposal network (RPN) is applied on these feature maps. This returns the object proposals along with their objectness score

A RoI pooling layer is applied on these proposals to bring down all the proposals to the same size

Finally, the proposals are passed to a fully connected layer to classify and output the bounding boxes for objects. It also returns the mask for each proposal

Mask R-CNN is the current state-of-the-art for image segmentation and runs at 5 fps.

Summary of Image Segmentation Techniques

I have summarized the different image segmentation algorithms below. I suggest keeping this handy the next time you’re working on an image segmentation challenge or problem!

Region-Based Segmentation
Description: Separates the objects into different regions based on some threshold value(s).
Advantages: Simple calculations; fast operation speed; performs really well when the object and background have high contrast.
Limitations: When there is no significant grayscale difference, or there is an overlap of the grayscale pixel values, it becomes very difficult to get accurate segments.

Edge Detection Segmentation
Description: Makes use of the discontinuous local features of an image to detect edges and hence define a boundary of the object.
Advantages: Good for images with better contrast between objects.
Limitations: Not suitable when there are too many edges in the image or when there is little contrast between objects.

Segmentation based on Clustering
Description: Divides the pixels of the image into homogeneous clusters.
Advantages: Works really well on small datasets and generates excellent clusters.
Limitations: Computation time is too large and expensive; k-means is a distance-based algorithm and is not suitable for clustering non-convex clusters.

Mask R-CNN
Description: Gives three outputs for each object in the image: its class, bounding box coordinates, and object mask.
Advantages: Simple, flexible and general approach; it is also the current state-of-the-art for image segmentation.
Limitations: High training time.

Conclusion

This article is just the beginning of our journey to learn all about image segmentation. In the next article of this series, we will deep dive into the implementation of Mask R-CNN. So stay tuned!

I have found image segmentation quite a useful function in my deep learning career. The level of granularity I get from these techniques is astounding. It always amazes me how much detail we are able to extract with a few lines of code. I’ve mentioned a couple of useful resources below to help you out in your computer vision journey:

Frequently Asked Questions

Q1. What are the different types of image segmentation?

A. There are mainly 4 types of image segmentation: region-based segmentation, edge detection segmentation, clustering-based segmentation, and mask R-CNN.

Q2. What is the best image segmentation method?

A. Clustering-based segmentation techniques such as k-means clustering are the most commonly used method for image segmentation.

Q3. What is image segmentation?

A. Image segmentation is the process of filtering or categorizing a database of images into classes, subsets, or regions based on certain specific features or characteristics.


How To Become A Data Analyst With No Experience?

Introduction

Is it Possible to Become a Data Analyst With No Experience?

Absolutely! You can pursue a data analyst role with no experience by obtaining the necessary qualifications. Several factors make the data job market accessible to beginners:

Lack of data expertise: The demand for data professionals surpasses the current supply, creating opportunities for newcomers to enter the field.

Emphasis on transferable skills: Data analytics values skills that can be applied from other domains, allowing individuals to leverage their existing abilities.

Rapid market growth: The data market has witnessed exponential growth, increasing the need for skilled professionals across industries.

Hiring data experts becomes a top priority as businesses rely on data-driven strategies. By investing effort, embracing growth, and accessing appropriate training resources, individuals can acquire the expertise needed to thrive in this dynamic field.

How to Become a Data Analyst With No Experience?

Here is your step-wise guide to get a data analyst job with no experience:

Gain Relevant Skills

Master Data Tools

Creating a Professional Portfolio

Networking and Seeking Internship Opportunities

Leveraging Online Learning Platforms

Joining Professional Data Analysis Communities

Showcasing Your Skills Through Personal Projects

Tailoring Your Resume and Cover Letter

Preparing for Data Analyst Interviews

1. Gain Relevant Skills 

It’s not necessarily essential to have a degree in a relevant subject to be a data analyst; however, having one in statistics, mathematics, or computer science might be helpful. You can sign up for in-person training sessions, watch video tutorials, or take online courses to increase your data expertise. Learn Python libraries like Matplotlib and Seaborn and data visualization applications like Tableau, Power BI, and others. Invest time in understanding language syntax, data types, and packages associated with programming languages.

Real-data projects can give you practical experience while instructing you on how to use data in practical settings. You may participate in existing projects or create your own by utilizing some of the freely available public data sets and building your project around those. Experiment with tools like Excel for data handling, SQL for database querying, and statistical software like SAS or SPSS.

Source: Monkeylearn

3. Creating a Professional Portfolio

A portfolio of work can help you get a job because it builds authenticity and displays the projects you participated in in the past. Build a portfolio that showcases your skills by documenting your projects and their results. The portfolio you create might be simple, but it’s a good idea to incorporate a biography portion that highlights your fondness for data analytics and your education and experience. 

Source: BeamJobs

4. Networking and Seeking Internship Opportunities

Networking can result in insightful connections, mentorship possibilities, and even helpful internships. Explore reaching out to the connections you’ve developed from your projects, classes, and self-study. Recruitment websites are a wonderful resource for discovering prospective job opportunities. 

Source: Springboard

5. Leveraging Online Learning Platforms

The best way to upskill in the data analytics field is through a professional online course. A professional certificate from an accredited institution can help you show businesses you have the education needed for the job. The most effective courses focus on projects and give students access to career-services resources. Our Blackbelt Plus Program offers a one-stop solution to all your needs. From guided projects to one-on-one sessions with expert mentors, you get all the resources to become a data analyst!

6. Joining Professional Data Analysis Communities

You may establish connections with business leaders, possible mentors, and recruiters by engaging in discussions, attending events, and interacting with other members. You can build a strong professional standing within the sector by actively participating in such networks. Additionally, it improves your exposure and increases the likelihood that companies would approach you directly about job openings.

Connect, Learn, Thrive: Join the Analytics Vidhya’s Data Community Today!

7. Showcasing Your Skills Through Personal Projects

Suppose you’re completely new to the field of data analytics. In that case, it’s essential to establish a correlation between the skill set of your previous projects and skills for your new job opportunities. Gather relevant data, assess it, and draw insightful inferences. Create a methodology report, then concisely convey what you have learned.

8. Tailoring Your Resume and Cover Letter

Having an all-around resume is a good idea if you wish to apply for multiple jobs. However, tailor your CV if you have a particular position in mind. Edit your resume to emphasize the skills that are essential to the position’s requirements. Wherever possible, highlight your technical skills and projects and quantify your accomplishments.

9. Preparing for Data Analyst Interviews

The interview gives hiring managers a better understanding of your skills and how well you would fit the position. Review core concepts, algorithms, and statistical methods thoroughly. Also, be prepared to demonstrate your problem-solving skills on real-world datasets. You can stand out in an interview by showing that you understand the latest innovations and how they affect data analysis.

Entry-Level Data Analyst Jobs You Can Get Without Experience

Junior data analyst

Quality assurance analyst

Marketing and sales data analyst

Data associate

Research analyst

Data quality analyst

Conclusion

A data analyst uses data to address issues, find solutions, and suggest techniques for improvement. The role involves spotting patterns and using existing data to inform projections and future strategies. Getting your foot in the door as a data analyst is only the beginning. Analytics Vidhya offers various courses to help freshers build their skills.

FAQs

Q1. How do I start as a data analyst if I’m a beginner?

A. Beginners can start as data analysts by learning the fundamentals of data analysis. Sign up for online courses to build the skill set that data analysis requires.

Q2. What qualifications do I need to be a data analyst?

A. Some data analysts have a graduate degree in subjects such as mathematics, statistics, or computer science. However, a degree in a specific subject is not essential; with the right skill set, anyone can become a data analyst.

Q3. Can I become a data analyst as a fresher?

A. Yes, one can be a data analyst as a fresher with the proper skill set and expertise.

Q4. How do I become a data analyst from zero?

A. Start with small steps, such as learning the fundamentals of data analysis, joining an online course to learn the tools used, making a portfolio, networking and applying for various job roles. You can consider signing up for our Blackbelt program to gain the relevant skills.

Related

5 Ways AI Is Changing Game Development


AI has played a major role in pushing games to where they are currently. Game studios use AI in multiple ways to enrich their releases, and its use is only set to grow in the future. Here are 5 ways AI is changing game development.

Pathfinding

Games these days have engaging storylines and sophisticated worlds as standard. Players expect highly textured environments that both entertain and challenge them. For instance, players routinely take in-game characters on long journeys to test the limits of the game’s universe.

Pathfinding, or the act of coding an in-game character’s navigation, is an important game development task. Given the open nature of the average game’s universe these days, developers must take several factors into account when plotting a character’s path.

For instance, if a user decides to take their character exploring in the middle of a main quest or task, how will the character interact with other characters nearby? How will they navigate their terrain, and how will the status of the primary quest affect the path they will take between in-game waypoints?

AI models these complex scenarios and is embedded into most gaming engines. This way, the game’s logic changes in real time and can accommodate almost any decision the user makes. The result is an engaging experience that feels almost like the real world.
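The article does not name a specific algorithm, but A* search on a grid is one of the most common techniques behind this kind of navigation. The sketch below is a minimal, engine-agnostic illustration of the idea, not the approach of any particular game engine; the grid and coordinates are made up for demonstration.

```python
# Minimal A* pathfinding sketch on a 2D grid (0 = walkable, 1 = blocked).
# Illustrative and engine-agnostic; real engines expose richer navigation APIs.
import heapq

def a_star(grid, start, goal):
    rows, cols = len(grid), len(grid[0])

    def heuristic(a, b):
        # Manhattan distance suits 4-directional movement
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    open_set = [(heuristic(start, goal), 0, start, [start])]
    visited = set()

    while open_set:
        _, cost, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                neighbour = (nr, nc)
                new_cost = cost + 1
                heapq.heappush(
                    open_set,
                    (new_cost + heuristic(neighbour, goal), new_cost, neighbour, path + [neighbour]),
                )
    return None  # no path found

# Example: route a character around a wall of obstacles
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(a_star(grid, (0, 0), (2, 0)))
```

In a real engine, the grid would be replaced by a navigation mesh and the cost function would account for terrain, quest state, and nearby characters, but the core search idea is the same.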

Object Detection

While navigating an in-game universe, characters will stumble upon in-game objects. Users can detect objects easily enough. For instance, a vehicle can be used to navigate from one point to another. However, the in-game character is just a piece of code and might struggle to identify every variation an object presents.

For instance, from a coding perspective, a fully functioning, pristine vehicle is different from a damaged one that can still travel short distances. A user might decide to “walk” to the next destination or use the damaged vehicle. To execute the latter task, the in-game character must identify the damaged vehicle as a candidate for a task and engage with it in an expected fashion.

If the character misidentifies the damaged vehicle as a tree and refuses to drive it, users will not hesitate to point out these flaws. AI is being used to create intelligent in-game characters that are more likely to correctly identify objects and their variations.
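As a toy illustration of the idea (not how any particular engine or learned model does it), the sketch below tags in-game objects with a condition and capability check so a scripted character can still recognise a damaged vehicle as a usable candidate. All class names, attributes, and thresholds are hypothetical.

```python
# Toy sketch: representing object variations so a character can pick usable candidates.
# Class names, attributes, and thresholds are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class Vehicle:
    name: str
    condition: float      # 1.0 = pristine, 0.0 = wrecked
    max_range_km: float

    def usable_for(self, trip_km: float) -> bool:
        # A damaged vehicle is still a valid candidate if it can cover the trip
        return self.condition > 0.2 and self.max_range_km * self.condition >= trip_km

vehicles = [
    Vehicle("pristine truck", condition=1.0, max_range_km=300),
    Vehicle("damaged car", condition=0.4, max_range_km=100),
]

# The character considers any vehicle that can handle a 30 km trip,
# instead of rejecting the damaged one outright.
candidates = [v.name for v in vehicles if v.usable_for(30)]
print(candidates)  # ['pristine truck', 'damaged car']
```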

Character Design

A character that can express only one emotion isn’t going to play a convincing role in a storyline with emotional depth. AI, in the form of deep learning algorithms, can now process in-game mechanics and display appropriate emotions. These algorithms also inform character actions, voices, and dialogue.

The result is an immersive experience that gamers will never forget.

Engineering Complex Game Scenarios

Games are becoming more open-ended, with in-game character choices driving the narrative. Coding these possibilities beforehand, while anticipating how one choice affects another, is a highly complicated task. In most cases, it’s impossible to accurately predict which way the story ought to head.

AI is coming to the rescue and is playing a part in creating Finite State Machine (FSM) models for game development. FSM models allow developers to code multiple scenarios into a single package and let the game engine compute and choose the ideal path to take. Thus, developers can give gamers almost infinite freedom and let AI do the heavy lifting when processing in-game logic.
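A finite state machine can be represented very compactly. The sketch below is a minimal, framework-free Python illustration of the idea; the states, triggers, and the guard scenario are hypothetical, and real engines provide far richer FSM and behaviour-tree tooling on top of this.

```python
# Minimal finite state machine (FSM) sketch for a branching game scenario.
# States and triggers are hypothetical; real engines provide richer FSM tooling.
class FiniteStateMachine:
    def __init__(self, initial, transitions):
        self.state = initial
        # transitions maps (current_state, trigger) -> next_state
        self.transitions = transitions

    def fire(self, trigger):
        key = (self.state, trigger)
        if key in self.transitions:
            self.state = self.transitions[key]
        return self.state

# A guard character that reacts differently depending on the player's choices
guard = FiniteStateMachine(
    initial="patrolling",
    transitions={
        ("patrolling", "player_spotted"): "chasing",
        ("chasing", "player_escaped"): "searching",
        ("chasing", "player_caught"): "combat",
        ("searching", "timeout"): "patrolling",
    },
)

print(guard.fire("player_spotted"))  # chasing
print(guard.fire("player_escaped"))  # searching
print(guard.fire("timeout"))         # patrolling
```

Because every scenario is just another entry in the transition table, developers can keep adding branches without rewriting the control flow, and the engine simply looks up the next state at runtime.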

Game Analytics

As game codebases grow more complex, reviewing code and fixing errors is a tough task. There are many nooks and crannies in games these days, and locating the source of an error is close to impossible, given the vast areas developers have to search.

AI is being used to run code tests quickly and isolate errors and potential breaks in the code. Games these days are platform-agnostic: an error on desktop might not show up on mobile, or vice versa. Isolating platform-specific errors is a tough task, and AI helps here too.

Game analytics powered by AI isolates incidents and prevents faulty code releases, giving developers timely alerts to act on and keeping major flaws out of their releases.

Many Applications
