opencv - freeCodeCamp.org

How to Use OpenCV and Python for Computer Vision and AI

Beau Carnes — Mon, 07 Jun 2021 20:54:41 +0000

OpenCV is a popular Python library for real-time computer vision.

We just released a new OpenCV course on the freeCodeCamp.org YouTube channel. This course comes directly from the creators of OpenCV and is the perfect course for beginners.

You will learn how to use OpenCV for Computer Vision and AI. You will learn and get exposed to a wide range of exciting topics like Image & Video Manipulation, Image Enhancement, Filtering, Edge Detection, Object Detection and Tracking, Face Detection and the OpenCV Deep Learning Module.

At the end of this course there is an interview with Dr. Satya Mallick, the CEO of OpenCV.org. Dr. Mallick shares his views on the many opportunities in the Computer Vision and AI job market. He gives advice on how to prepare and get hired for a job in AI.

Here are the sections in this course:

Module 1: Getting Started with Images
Module 2: Basic Image Manipulation
Module 3: Image Annotation
Module 4: Image Enhancement
Module 5: Accessing the Camera
Module 6: Read and Write Videos
Module 7: Image Filtering and Edge Detection
Module 8: Image Features and Image Alignment
Module 9: Image Stitching and Creating Panoramas
Module 10: High Dynamic Range Imaging (HDR)
Module 11: Object Tracking
Module 12: Face Detection
Module 13: Object Detection
Module 14: Pose Estimation using OpenPose
Interview with OpenCV CEO, Dr. Satya Mallick

Watch the course below or on the freeCodeCamp.org YouTube channel (3-hour watch).

Python and OpenCV Course – Create Computer Vision Apps in the Cloud

Beau Carnes — Wed, 05 May 2021 21:11:40 +0000

OpenCV is a library of programming functions mainly aimed at real-time computer vision.

We just published a course on the freeCodeCamp.org YouTube channel that will teach you how to use OpenCV in the cloud with Python.

Misbah Mohammed created this course. He has a lot of experience with machine learning and makes it simple to follow along if you already know some Python.

You will learn how to create computer vision applications in the cloud on Google Colab. You will use AI and machine learning.

Here are the sections in this video:

Lesson 1: Changing color profiles in an image
Image Properties
Lesson 2: Edge Detection
Erosion and Dilation
Lesson 3: Image Manipulation-Noise Removal
Lesson 4: Drawing Shapes and Writing Text on Images
Intermediate Exercise 1: Color Detection
Intermediate Exercise 2: Face Detection
Intermediate Exercise 3: Shape Detection
Project 1: Ball Tracking
Project 2: Face Recognition

Watch the full course below or on the freeCodeCamp.org YouTube channel (3-hour watch).

Learn How to Use the OpenCV Computer Vision Library

Beau Carnes — Wed, 04 Nov 2020 18:07:02 +0000

OpenCV is a cross-platform library that can be used to code real-time computer vision applications. It makes it easier to implement image processing, face detection, and object detection.

We've released a full course on the freeCodeCamp.org YouTube channel that will help you get started with OpenCV. You will learn how to use it with the Python programming language.

The course was created by Jason Dsouza. Jason has been teaching about deep learning and Python for many years.

The course starts with the basics such as reading images and video, image transformations, and drawing on images. Then it covers more advanced concepts such as color spaces, edge detection, and thresholding. Towards the end, you'll learn to build a Deep Computer Vision model to distinguish between the characters in "The Simpsons".

Here are the topics covered in this course:

Section #1 - Basics

Reading Images & Video
Resizing and Rescaling Frames
Drawing Shapes & Putting Text
Essential Functions in OpenCV
Image Transformations
Contour Detection

Section #2 - Advanced

Color Spaces
Color Channels
Blurring
BITWISE operations
Masking
Histogram Computation
Thresholding/Binarizing Images
Edge Detection

Section #3 - Faces:

Face Detection with Haar Cascades
Face Recognition with OpenCV's built-in recognizer

Section #4 - Capstone

Deep Computer Vision

Watch the full course on the freeCodeCamp.org YouTube channel (4-hour watch).

Facial recognition using OpenCV in Java

freeCodeCamp — Wed, 25 Jul 2018 15:29:37 +0000

By Manish Bansal

Ever since the Artificial Intelligence boom began — or the iPhone X advertisement featuring the face unlock feature hit TV screens — I’ve wanted to try this technology. However, once I started googling about it, I typically only found code examples in Python. And being a Java enthusiast for seven years, I got demotivated seeing that. Therefore, I finally decided to hunt for Java open source libraries for this.

Currently, there are various Java libraries out there. But the most popular one I found was OpenCV.

OpenCV is an open source computer vision library that has tons of modules like object detection, face recognition, and augmented reality. Although this library is written in C++, it also offers battle-tested Java bindings.

However, there is one issue. As part of its software release, it offers only a few modules (with Java bindings) out of the box — and facial recognition is not one of them. Therefore, to use it, you need to manually build it.

Wait! What? Why?

Yes — the reason cited by the OpenCV community is that the modules are not completely stable. Therefore, they are not bundled along with the standard release. Hence, they maintain them in a separate repository here.

If you have no or very little C++ experience (like me), you must have already started to feel dizzy about building a C++ library yourself. But don’t worry, I am here to hold your hand and walk you through this tedious process. So let’s begin, shall we?

Building OpenCV for Java from scratch

“Birds flying around a half-built building and construction site” by 贝莉儿 NG on Unsplash

There are existing write-ups on building it, such as this, this, and this. However, none of them worked perfectly for me, as one thing or another was missing. The closest I found, which helped me, is this one. However, you do not need to refer to it. You can follow the steps below and you will be good.

First, you need to have the below software on your PC. Here, I am building a 64-bit version of the library as I own a 64 bit PC. But you can build it for 32-bit as well.

The required software is:

CMake (I used version 3.6.0 rc-4).
Ant (used internally for building JAR)
MinGW — W64 GCC-8.1.0
64 bit JDK 1.8

A word about MinGW: Here, to build this library, we need C++ compilers. You can use Visual Studio tools (VS), which is far better. However, I did not have the luxury to do that, as I built it on my office laptop and VS is licensed software unavailable to Java people here. Therefore, I had to use open source tools, and the best one is MinGW (Minimalist GNU for Windows).

Also, it is very important to use the correct version of MinGW. Download version x86_64-posix-seh, as there is thread support in this version. I have not tried all other versions. But version x86_64-win32-sjlj does not work at all.

To give some more perspective, the build is done by the utility called make which comes as part of MinGW (bin/mingw32-make.exe). make is a task runner for C++ like “Ant” is for Java. But C++ code and make scripts are very much platform-dependent. Hence, to make the distributables platform-independent, the utility CMake is used. CMake generates platform-dependent make scripts.

Generating build configurations using CMake

Step 1: Download the source code zip of both the opencv and opencv_contrib, and extract them into a directory. Further, create a folder called “build” in the same directory (I created “build_posix” as visible in the screenshots).

Step 2: Open CMake. Point “where is the source code” to the opencv extracted folder. Further, point “where to build the binaries” to the “build” folder you created.

Step 3: Add the 64-bit JDK 1.8 bin folder, the MinGW bin folder, and the Ant bin folder to the “PATH” environment variable. This is important, as CMake will look in the environment variables for configuration. If this is not done, then we will have to configure CMake manually in step 5.

In case you have multiple JDKs on your system, a different JDK is already in “PATH”, and you don’t want to add JDK 1.8 to “PATH”, you can skip this. But do configure it manually in step 5.
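Before opening CMake, it can help to confirm that the tools from Step 3 really are visible on “PATH”. The following is a minimal sketch using only Python's standard library; the executable names in `REQUIRED_TOOLS` are my assumptions based on the software list above, so adjust them to your installation.

```python
import shutil

# Executable names are assumptions based on the tools listed above;
# adjust them to match your installation.
REQUIRED_TOOLS = ["java", "ant", "mingw32-make", "cmake"]


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` returns an empty list when everything is reachable. Anything it does list will probably have to be configured by hand in step 5.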

Step 4: Press the “Configure” button, select “MinGW Makefiles”, and click “Finish”. After this, CMake will start configuring your project. It will take a while and, after it finishes configuring, it will show the currently available configurations.

In case you are wondering if the configurations generated for you are correct, you can refer to the logs which got generated for me here and compare.

Step 5: Now comes the most important part — changing the configurations. First, click the checkboxes “Grouped” and “Advanced” to organize the configurations.

Verify that ANT_EXECUTABLE (search “ANT_EXECUTABLE” in the search box) and all five “JAVA” configurations are pointing to the 64-bit JDK 1.8. If Step 3 was done properly, then this will be correct. Otherwise, correct them.

Un-check Python (search “Python”) related check boxes under “BUILD” and “INSTALL” groups as we don’t need Python builds.

Disable “WITH_MSMF”, “WITH_IPP”, and “WITH_TBB”. These libraries are only available for VS.

Edit “OPENCV_EXTRA_MODULES_PATH” under the “OPENCV” group and set it to the “modules” folder under the “opencv_contrib” source folder you extracted earlier.

After this, press the “Configure” button again. This will do the final configurations. You can refer to the logs which got generated for me here.

Note: Make sure to compare your “Configure” logs generated with the one I shared in pastebin above. If you find some major difference, then first try correcting your configurations and press “Configure” again. Otherwise, there are chances that your build will fail and that it will be more difficult to debug.

Step 6: After this, press “Generate”. It will take a few seconds. Then close CMake.
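The GUI steps above can also be expressed as a single CMake command line. The sketch below only assembles the argument list. The generator name and the `WITH_*` and `OPENCV_EXTRA_MODULES_PATH` options come from the steps above, while the two `BUILD_opencv_python*` switches are my assumption about which options the Python checkboxes correspond to, so verify them against your own configuration list. The paths are hypothetical.

```python
def cmake_args(opencv_src, contrib_src):
    """Build a CMake command line mirroring the GUI settings described above.

    Intended to be run with the "build" folder as the working directory.
    """
    return [
        "cmake",
        "-G", "MinGW Makefiles",
        f"-DOPENCV_EXTRA_MODULES_PATH={contrib_src}/modules",
        "-DWITH_MSMF=OFF",
        "-DWITH_IPP=OFF",
        "-DWITH_TBB=OFF",
        # Assumed names for the Python-related build switches.
        "-DBUILD_opencv_python2=OFF",
        "-DBUILD_opencv_python3=OFF",
        opencv_src,
    ]


# Hypothetical usage:
# import subprocess
# subprocess.run(cmake_args("C:/src/opencv", "C:/src/opencv_contrib"), cwd="C:/src/build")
```

Keeping the configuration in a script makes it easier to repeat the build later, although the GUI remains the better place to inspect what CMake actually detected.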

Compiling OpenCV

Now, if all the configurations generated above are correct, this task will be a breeze (of 2–3 hours!). Just open the command prompt, go to the “build” folder, and execute the command below.

mingw32-make.exe  -j5 > buildLogs.txt

Here, -j5 is added, which instructs the make utility to run five jobs in parallel. This will make your build faster, at least theoretically.

Further, do not forget to push the logs to a text file. These might get too big, in which case your command prompt window might truncate them. You need them in case compilation fails. You can refer to the compilation logs generated in my case here.

Note: The order of log statements might not be the same for you, as the build is happening in five parallel jobs.
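One caveat with the redirect shown earlier: `>` captures only standard output, while compilers usually report errors on standard error. As a minimal sketch, the helper below (a hypothetical `run_and_log`, not part of any OpenCV tooling) runs a command and writes both streams to one log file.

```python
import subprocess


def run_and_log(command, log_path, cwd=None):
    """Run a command, writing stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w", encoding="utf-8") as log:
        completed = subprocess.run(
            command, cwd=cwd, stdout=log, stderr=subprocess.STDOUT
        )
    return completed.returncode


# Hypothetical usage from the directory that contains the "build" folder:
# exit_code = run_and_log(["mingw32-make.exe", "-j5"], "buildLogs.txt", cwd="build")
```

A non-zero return value means the build failed, and the reason should then be somewhere in the log file.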

Once the build is over, you can check the “bin” and “lib” folders inside your “build” directory. Inside “bin”, you will have all your opencv.exe’s and libopencv.dll’s and your compiled JAR. Further, “lib” will have your main dll (libopencv_javaxxx.dll) along with some more dependent files.

“bin” folder after successful compilation

“lib” folder after successful compilation

Hands on with OpenCV face recognition API

Photo by rawpixel on Unsplash

Download the project from here and import it into your Eclipse workspace. Further, you will need to add JDK 1.8 as well as the opencv user library (just created above) to this project. Once you are done, you will be ready to test your newly built OpenCV library.

As of this writing, there are three programs in this project.

HelloWorld: you can run this to test if your OpenCV library setup is ok. If this does not work properly, you need to sort this out first. The only issues you will encounter at this point will be related to system environment variables or user library setup.
FaceDetection: you can use this to test the face detection module. It is a different module from face recognition. This is a module which gets shipped along with standard release of OpenCV. As of this writing, we can provide an image as an input to the program, and it will detect all the faces inside the image. The output image has green rectangles drawn on all the detected faces.

Input image for Face Detection program

Output image of face detection program

FaceRecognition: the OpenCV facerec module includes three algorithms:
Eigenfaces
Fisherfaces
Local Binary Patterns Histograms.

For technical details on all these algorithms, you can refer to this official article. For demonstration purposes, I will show you how to use the Eigenfaces algorithm.

First, you need to download training data from the face database. This data contains ten different images for each of 40 distinct subjects (400 images). For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling), and facial details (glasses / no glasses). After extracting them on your computer, you need to prepare a .csv file containing the path of each image, along with their corresponding label.

To make it easy, I have one TrainingData.txt in my Git repository. However, you need to edit the file and alter the paths of images as per your computer directory location.
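If you would rather generate the file than edit paths by hand, a short script can do it. This is a sketch under two assumptions that are not stated in the article: that the extracted database has one folder per subject named `s1`, `s2`, and so on, and that each row is written as `path;label` with a semicolon separator. Check both against TrainingData.txt before relying on it.

```python
import csv
import os


def build_rows(root):
    """Return (image_path, label) pairs for an assumed root/s<label>/<image>.pgm layout."""
    rows = []
    for entry in sorted(os.listdir(root)):
        subject_dir = os.path.join(root, entry)
        # Assumed layout: one folder per subject, named "s" followed by a number.
        if not (os.path.isdir(subject_dir) and entry.startswith("s") and entry[1:].isdigit()):
            continue
        label = int(entry[1:])
        for name in sorted(os.listdir(subject_dir)):
            if name.lower().endswith(".pgm"):
                rows.append((os.path.join(subject_dir, name), label))
    return rows


def write_csv(rows, csv_path, delimiter=";"):
    """Write one 'path<delimiter>label' line per image."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter=delimiter).writerows(rows)
```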

Note: the downloaded face database contains images in .pgm format. This format is not supported by Windows. To convert them to .jpg, I added PGMToJPGConverter to my repository. You can use this to convert the images and have an actual look at the training data.

After this, you can run the face recognition program. Below are the steps performed in the program:

1. The OpenCV library is loaded as usual.
2. The .csv file is read, and two ArrayLists are created: one for the matrices of the images and the other for their corresponding labels.
3. Out of the 400 input images, the last entry in the list data structure is removed and saved for testing the trained model later.
4. After that, the remaining 399 images are used for training the Eigenfaces algorithm.
5. Once training is complete, the model is asked to predict the label of the image we removed in step 3.
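The data handling in these steps (read the file, build two parallel lists, hold out the last entry) can be sketched in a few lines. This is an illustration in Python rather than the Java program from the repository. It assumes `path;label` rows, and it deliberately leaves out image loading and the Eigenfaces training itself.

```python
import csv


def load_samples(csv_path, delimiter=";"):
    """Read 'path;label' rows into two parallel lists."""
    paths, labels = [], []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) != 2:
                continue  # skip blank or malformed lines
            paths.append(row[0])
            labels.append(int(row[1]))
    return paths, labels


def hold_out_last(paths, labels):
    """Split off the final sample for testing; everything else is for training."""
    training = (paths[:-1], labels[:-1])
    test_sample = (paths[-1], labels[-1])
    return training, test_sample
```

With 400 rows this yields 399 training samples and one test sample, matching the description above.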

Output of Face Recognition Program

Here, we can observe that the algorithm is able to predict the label of our test subject with a confidence value of 1807. The lower the value, the better the prediction. Similarly, you can perform this exercise with two other algorithms. The C++ code can be downloaded from here and here.

Update (27th Dec 2018): In case you find building the OpenCV Java bindings painful, I have good news for you. Recently, I found an easier way to get all the OpenCV dependencies for Java. For complete details, please refer to my other article.

Congratulations! You made it to the end. If you liked this article, hit the clap button below. It means a lot to me and it helps other people see the story.

How to use image preprocessing to improve the accuracy of Tesseract

freeCodeCamp — Wed, 06 Jun 2018 13:25:41 +0000

By Berk Kaan Kuguoglu

Previously, on How to get started with Tesseract, I gave you a practical quick-start tutorial on Tesseract using Python. It is a pretty simple overview, but it should help you get started with Tesseract and clear some hurdles that I faced when I was in your shoes. Now, I’m keen on showing you a few more tricks and stuff you can do with Tesseract and OpenCV to improve your overall accuracy.

Where did we leave off last time?

In the previous story, I didn’t bother going into details for the most part. But if you liked the first story, here comes the sequel! So where did we leave off?

Ah, we had a brief overview of rescaling, noise removal, and binarization. Now, it’s time to get down to details and show you a few settings you can play with.

Rescaling

The images that are rescaled are either shrunk or enlarged. If you’re interested in shrinking your image, INTER_AREA is the way to go for you. (Btw, the parameters fx and fy denote the scaling factor in the function below.)

img = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)

On the other hand, as in most cases, you may need to scale your image to a larger size to recognize small characters. In this case, INTER_CUBIC generally performs better than other alternatives, though it’s also slower than others.

img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

If you’d like to trade off some of your image quality for faster performance, you may want to try INTER_LINEAR for enlarging images.

img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)

Blurring

It’s worth mentioning that there are a few blur filters available in the OpenCV library. Image blurring is usually achieved by convolving the image with a low-pass filter kernel. While filters are usually used to blur the image or to reduce noise, there are a few differences between them.

1. Averaging

This convolves the image with a normalized box filter: it takes the average of all the pixels under the kernel area and replaces the central element with that average. It’s pretty self-explanatory, I guess.

img = cv2.blur(img, (5, 5))
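To make the averaging concrete, here is a pure-Python sketch of a box filter on a small grayscale grid (a list of rows). It is an illustration of the idea, not OpenCV's implementation: borders are handled by clamping coordinates, and the result is rounded with Python's `round`, both of which are simplifications of my own.

```python
def box_blur(image, ksize=3):
    """Average each pixel with its ksize x ksize neighbourhood (coordinates are clamped)."""
    height, width = len(image), len(image[0])
    radius = ksize // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            result[y][x] = round(total / (ksize * ksize))
    return result


# A single bright pixel gets spread evenly over its neighbourhood.
spot = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
blurred = box_blur(spot)
```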

2. Gaussian blurring

This works in a similar fashion to averaging, but it uses a Gaussian kernel, instead of a normalized box filter, for convolution. Here, the dimensions of the kernel and the standard deviations in both directions can be determined independently. Gaussian blurring is very useful for removing (guess what?) Gaussian noise from the image. However, Gaussian blurring does not preserve the edges in the input.

img = cv2.GaussianBlur(img, (5, 5), 0)

3. Median blurring

The central element in the kernel area is replaced with the median of all the pixels under the kernel. In particular, this outperforms other blurring methods in removing salt-and-pepper noise from images.

Median blurring is a non-linear filter. Unlike linear filters, it replaces each pixel value with the median of the values in its neighborhood. So, median blurring preserves edges, as the median must be the value of one of the neighboring pixels.

img = cv2.medianBlur(img, 3)
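Both claims, that an isolated noisy pixel disappears and that a sharp edge survives, are easy to check on a tiny example. The following is a pure-Python sketch of a median filter on a list-of-rows image, with clamped borders as a simplification. It is meant to show the idea rather than reproduce OpenCV's implementation.

```python
def median_blur(image, ksize=3):
    """Replace each pixel with the median of its ksize x ksize neighbourhood."""
    height, width = len(image), len(image[0])
    radius = ksize // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = sorted(
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            )
            row.append(window[len(window) // 2])
        result.append(row)
    return result


# One "salt" pixel in a flat grey image.
noisy = [[50] * 5 for _ in range(5)]
noisy[2][2] = 255

# A vertical step edge between dark and bright columns.
step = [[0, 0, 0, 100, 100, 100] for _ in range(3)]
```

Running `median_blur(noisy)` returns a uniformly grey image, while `median_blur(step)` returns the step unchanged.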

4. Bilateral filtering

Speaking of keeping edges sharp, bilateral filtering is quite useful for removing the noise without smoothing the edges. Similar to gaussian blurring, bilateral filtering also uses a gaussian filter to find the gaussian weighted average in the neighborhood. However, it also takes pixel difference into account while blurring the nearby pixels.

Thus, it ensures that only pixels with an intensity similar to the central pixel contribute to the blur, whereas pixels with clearly different values are left out. In doing so, regions with large intensity variation, that is, edges, are preserved.

img = cv2.bilateralFilter(img, 9, 75, 75)
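The combination of a spatial weight and an intensity-difference weight is easiest to see in one dimension. Below is a simplified 1-D sketch of the bilateral idea in plain Python. The parameter names `sigma_space` and `sigma_range` are my own labels for the two Gaussian widths, and this is not how OpenCV implements the filter.

```python
import math


def bilateral_1d(signal, radius=2, sigma_space=2.0, sigma_range=10.0):
    """Smooth a 1-D signal, weighting neighbours by distance and by value similarity."""
    output = []
    n = len(signal)
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            similarity = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            weight = spatial * similarity
            weighted_sum += weight * signal[j]
            weight_total += weight
        output.append(weighted_sum / weight_total)
    return output


# Slightly noisy dark region followed by a slightly noisy bright region.
signal = [10, 12, 10, 12, 200, 202, 200, 202]
smoothed = bilateral_1d(signal)
```

Values on each side of the jump are averaged only with their own side, because the similarity weight for pixels across the jump is practically zero.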

Overall, if you are interested in preserving the edges, go with median blurring or bilateral filtering. On the other hand, Gaussian blurring is likely to be faster than median blurring. Due to its computational complexity, bilateral filtering is the slowest of all methods.

Again, you do you.

Image Thresholding

There’s not a single image thresholding method that fits all types of documents. In reality, all filters perform differently on varying images. For instance, while some filters successfully binarize some images, they may fail to binarize others. Likewise, some filters may work well with those images that other filters cannot binarize well.

I’ll try to cover the basics here, though I do recommend that you read the official documentation of OpenCV on Image Thresholding for more information and the theory behind it.

1. Simple Threshold

You might recall a friend of yours giving you some advice about your life by saying “things are not always black and white”. Well, for a simple threshold, things are pretty straight-forward.

cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

First, you pick a threshold value, say 127. With THRESH_BINARY, if the pixel value is greater than the threshold, it is set to the maximum value (255 here, which is white). Otherwise, it is set to 0 (black). OpenCV provides us with different types of thresholding methods that can be passed as the fourth parameter. I often use binary threshold for most tasks, but for other thresholding methods you may visit the official documentation.
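As a quick sanity check of which side ends up white, here is the THRESH_BINARY rule written out in plain Python for a list-of-rows image: values strictly greater than the threshold become the maximum value, and everything else becomes 0. The function name is hypothetical and it only mirrors the rule, not OpenCV's code.

```python
def threshold_binary(image, thresh=127, maxval=255):
    """Set pixels above thresh to maxval and all other pixels to 0."""
    return [[maxval if pixel > thresh else 0 for pixel in row] for row in image]


binary = threshold_binary([[0, 127, 128, 255]])
```

Note that a pixel equal to the threshold (127 here) ends up black, since the comparison is strict.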

2. Adaptive Threshold

Rather than setting a single global threshold value, we let the algorithm calculate the threshold for small regions of the image. Thus, we end up having various threshold values for different regions of the image, which is great!

cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

There are two adaptive methods for calculating the threshold value. ADAPTIVE_THRESH_MEAN_C uses the mean of the neighborhood area, while ADAPTIVE_THRESH_GAUSSIAN_C uses a Gaussian-weighted sum of the neighborhood values.

We’ve got two more parameters that determine the size of the neighborhood area and the constant value that is subtracted from the result: the fifth and sixth parameters, respectively.
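To see why a local threshold helps with uneven lighting, here is a plain-Python sketch of the mean variant: each pixel is compared with the mean of its neighborhood minus the constant. The names `block_size` and `c` stand in for the fifth and sixth parameters described above. Borders are clamped, and the Gaussian-weighted variant is not shown.

```python
def adaptive_threshold_mean(image, block_size=3, c=2, maxval=255):
    """Threshold each pixel against (mean of its block_size x block_size area) - c."""
    height, width = len(image), len(image[0])
    radius = block_size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            local_threshold = sum(window) / len(window) - c
            row.append(maxval if image[y][x] > local_threshold else 0)
        result.append(row)
    return result


# The same dark stroke on a bright and on a dim background.
bright = adaptive_threshold_mean([[100, 100, 40, 100, 100]])
dim = adaptive_threshold_mean([[50, 50, 20, 50, 50]])
```

Both inputs produce the same output, with only the stroke turning black, whereas a global threshold of 127 would turn both rows entirely black.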

3. Otsu’s Threshold

This method works particularly well with bimodal images, that is, images whose histogram has two peaks. In that case, we would want to pick a threshold value between these peaks, and this is exactly what Otsu’s binarization does.

cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

It’s pretty useful for some cases. But it may fail to binarize images that are not bimodal. So, please take this filter with a grain of salt.
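For the curious, the search Otsu's method performs can be sketched in plain Python: try every threshold and keep the one that maximizes the between-class variance of the two resulting groups. This is a textbook-style sketch over a flat list of 8-bit pixel values, not OpenCV's implementation, and tie-breaking between equally good thresholds may differ.

```python
def otsu_threshold(pixels):
    """Return the threshold in 0..255 that maximizes between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(256):
        weight_low += histogram[level]
        sum_low += level * histogram[level]
        if weight_low == 0:
            continue  # no pixels at or below this level yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


# A bimodal sample: a dark cluster around 20-30 and a bright one around 200-210.
bimodal = [20] * 50 + [30] * 50 + [200] * 50 + [210] * 50
threshold = otsu_threshold(bimodal)
```

For this sample the chosen threshold falls between the two clusters, separating the dark values from the bright ones.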

Types of thresholding

You might have already noticed there is a parameter, or in some cases a combination of a few parameters, that are passed as arguments to determine the type of thresholding, such as THRESH_BINARY. I’m not going into the detail here now, as it is explained clearly in the official documentation.

What next?

So far, we’ve discussed some of the techniques of image pre-processing. You might wonder when exactly you’re going to get your hands dirty. Well, the time has come. Before you get back to your favorite Python IDE (mine is PyCharm, btw), I’m going to show you a few lines of code that will save you some time while trying to find which combination of filters and image manipulations works well with your documents.

Let’s start by defining a switcher function that holds a few combinations of thresholding filters and blurring methods. Once you get the idea, you could also add more filters, incorporating other image pre-processing methods like rescaling into your filter set.

Here I’ve created 20 different combinations of image thresholding methods, blurring methods, and kernel sizes. The switcher function, apply_threshold, takes two arguments, namely an OpenCV image and an integer that denotes the filter. Likewise, since this function returns the OpenCV image as a result, it could easily be integrated into our get_string function from the previous post.

import cv2


def apply_threshold(img, argument):
    # Each entry pairs a blur (method and kernel size) with a thresholding method.
    switcher = {
        1: cv2.threshold(cv2.GaussianBlur(img, (9, 9), 0), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1],
        2: cv2.threshold(cv2.GaussianBlur(img, (7, 7), 0), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1],
        3: cv2.threshold(cv2.GaussianBlur(img, (5, 5), 0), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1],
        # ... (the remaining entries are elided here)
        18: cv2.adaptiveThreshold(cv2.medianBlur(img, 7), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2),
        19: cv2.adaptiveThreshold(cv2.medianBlur(img, 5), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2),
        20: cv2.adaptiveThreshold(cv2.medianBlur(img, 3), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    }
    return switcher.get(argument, "Invalid method")

And, here it comes.

import os

import cv2
import numpy as np
import pytesseract

# Note: output_dir is not defined in this snippet; it is assumed to be a
# module-level variable naming the folder where filtered images are saved.


def get_string(img_path, method):
    # Read image using opencv
    img = cv2.imread(img_path)

    # Extract the file name without the file extension
    file_name = os.path.basename(img_path).split('.')[0]
    file_name = file_name.split()[0]

    # Create a directory for outputs
    output_path = os.path.join(output_dir, file_name)
    if not os.path.exists(output_path):
        os.makedirs(output_path)

    # Rescale the image, if needed.
    img = cv2.resize(img, None, fx=1.5, fy=1.5, interpolation=cv2.INTER_CUBIC)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Apply threshold to get image with only black and white
    img = apply_threshold(img, method)

    # Save the filtered image in the output directory
    save_path = os.path.join(output_path, file_name + "_filter_" + str(method) + ".jpg")
    cv2.imwrite(save_path, img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(img, lang="eng")

    return result

Last words

Now, all we need to do is to write a simple for loop that iterates over the input directory to collect images and applies each filter on the images gathered. I prefer to use glob, or os, for collecting images from directories, and argparse for passing arguments via terminal, like any other sane person would do.

Here I’ve done pretty much the same thing as in my gist, if you’d like to have a look at it. However, feel free to use the tools you feel comfortable with.
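For readers who want a starting point without opening the gist, a loop of this kind could look roughly like the sketch below. It is not the author's script: the helper names, the `--input_dir` flag, and the list of file extensions are all assumptions, and the recognition function is passed in as a parameter so the loop stays independent of OpenCV and Tesseract.

```python
import argparse
import glob
import os


def collect_images(input_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted image paths found directly inside input_dir."""
    paths = glob.glob(os.path.join(input_dir, "*"))
    return sorted(p for p in paths if p.lower().endswith(extensions))


def run_all_filters(image_paths, recognize, methods=range(1, 21)):
    """Call recognize(path, method) for every image and every filter number."""
    results = {}
    for path in image_paths:
        for method in methods:
            results[(path, method)] = recognize(path, method)
    return results


def parse_args(argv=None):
    """Parse the (hypothetical) command-line flag for the input directory."""
    parser = argparse.ArgumentParser(description="Try every filter on every image.")
    parser.add_argument("--input_dir", required=True, help="directory containing images")
    return parser.parse_args(argv)


# Hypothetical usage, with get_string defined as above:
# args = parse_args()
# results = run_all_filters(collect_images(args.input_dir), get_string)
```

Comparing the recognized text for each `(path, method)` pair is then a quick way to spot which filter suits a given kind of document.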

So far, I’ve tried to cover a few useful image pre-processing concepts and implementations, though it’s probably just the tip of the iceberg. I don’t know how much “leisure time” I’m going to have in the upcoming weeks, so, I can’t give you a specific time frame for publishing my next post. However, I’m considering adding at least one more part to this series that explains a few things I left out, such as rotation and de-skewing on images.

Until then, best bet is to just keep your wits about you and continue to look for signs.

How you can get started with Tesseract

freeCodeCamp — Tue, 05 Jun 2018 18:42:00 +0000