Learn the Basic Techniques of Computer Vision and Machine Learning Through Practical Examples

2025.06.09 磐创AI

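OpenCV is an open-source computer vision library that is widely used in computer vision and machine learning. It provides a broad set of image and video processing tools, including feature detection, image recognition, and object tracking.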

In this article, we look at how to use OpenCV for a variety of tasks, with a focus on how to apply machine learning with it.

Let's start with installation. You need the OpenCV library in your environment, which you can install by running the following command:
    pip install opencv-python

or
    conda install -c conda-forge opencv

Once OpenCV is installed, you can start using it in Python code. The following example reads an image file and displays it:
    import cv2
    # read the image
    image = cv2.imread("image.jpg")
    # display the image
    cv2.imshow("Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

OpenCV also provides a wide range of image processing functions. The following example converts an image to grayscale and displays it:
    import cv2
    # read the image
    image = cv2.imread("image.jpg")
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # display the image
    cv2.imshow("Grayscale Image", gray)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
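
To make the conversion less of a black box, here is a NumPy-only sketch of the weighted sum that a BGR-to-grayscale conversion computes. It assumes the standard luma weights (0.299 for red, 0.587 for green, 0.114 for blue); the helper name bgr_to_gray is hypothetical, and OpenCV's own fixed-point implementation may differ by one gray level on some pixels.

```python
import numpy as np


def bgr_to_gray(image_bgr):
    """Weighted sum of the B, G, R channels using the standard luma weights."""
    weights = np.array([0.114, 0.587, 0.299])  # order matches B, G, R
    gray = image_bgr.astype(np.float64) @ weights
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)


# one pure-blue, one pure-green, one pure-red and one white pixel (BGR order)
pixels = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]], dtype=np.uint8
)
print(bgr_to_gray(pixels))
```

Green contributes the most to the result and blue the least, which is why a pure-blue region looks very dark after conversion.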

Another important feature of OpenCV is its ability to detect features in images. For example, you can use OpenCV's cv2.CascadeClassifier class to detect faces in an image:
    import cv2
    # read the image
    image = cv2.imread("image.jpg")
    # create the classifier
    classifier = cv2.CascadeClassifier("path_to_classifier_xml")
    # detect faces
    faces = classifier.detectMultiScale(image, scaleFactor=1.3, minNeighbors=5)
    # draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
    # display the image
    cv2.imshow("Faces", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

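OpenCV also provides a number of machine learning based capabilities such as detection, recognition, and tracking. For example, you can use the cv2.ml module to train and use machine learning models.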
    import cv2
    import numpy as np
    # create the feature and label vectors
    # (cv2.ml expects float32 samples and int32 class labels)
    features = np.array([[1, 2], [3, 4], [5, 6], [7, 8]], dtype=np.float32)
    labels = np.array([1, 2, 3, 4], dtype=np.int32)
    # create the SVM model
    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_LINEAR)
    svm.setC(1.0)
    # train the model
    svm.train(features, cv2.ml.ROW_SAMPLE, labels)
    # test the model on new data
    new_data = np.array([[2, 3], [4, 5]], dtype=np.float32)
    # predict returns (retval, results); results holds one label per row
    result = svm.predict(new_data)
    print(result[1])

In the example above, we used the cv2.ml module to create an SVM model, set its parameters, trained it on our feature and label vectors, and then tested it on new data.

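Another example is deep learning. You can use OpenCV's cv2.dnn module to load and run pre-trained deep learning models; cv2.dnn.readNetFromCaffe, for instance, loads a model that was trained with Caffe.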
    import cv2
    # read the image
    image = cv2.imread("image.jpg")
    # load the deep learning model
    net = cv2.dnn.readNetFromCaffe("path_to_prototxt", "path_to_caffe_model")
    # set the input blob
    blob = cv2.dnn.blobFromImage(image, 1.0, (224, 224), (104, 117, 123))
    net.setInput(blob)
    # get the predictions
    predictions = net.forward()
    # display the predictions
    print(predictions)

In the example above, we used the cv2.dnn module to load a deep learning model, set the input blob, and then used the model to make predictions on our image.
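
The "blob" in that example is a 4D array in batch, channel, height, width order with a per-channel mean subtracted. The NumPy sketch below imitates that layout to show what the network receives. It is a simplification, not OpenCV's implementation: the helper name to_blob is hypothetical, and the image is assumed to already have the target size, so the resize step is left out.

```python
import numpy as np


def to_blob(image_bgr, scale=1.0, mean=(104.0, 117.0, 123.0)):
    """Subtract a per-channel mean, scale, and reorder HWC -> NCHW."""
    blob = (image_bgr.astype(np.float32) - np.array(mean, dtype=np.float32)) * scale
    blob = blob.transpose(2, 0, 1)  # channels first: (C, H, W)
    return blob[np.newaxis, ...]    # add the batch dimension: (1, C, H, W)


# an image whose every pixel equals the mean, so the blob should be all zeros
image = np.full((224, 224, 3), (104, 117, 123), dtype=np.uint8)
blob = to_blob(image)
print(blob.shape, float(blob.max()))
```

The mean values must match the ones used when the model was trained; the triple used here is simply the one from the example above.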

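These are a few examples of how you can use OpenCV for computer vision and machine learning tasks. With its wide range of tools and functions, OpenCV is a powerful library that data scientists can use for their computer vision and machine learning needs.

OpenCV's rich feature set makes it an excellent library for image and video processing and analysis, and its machine learning integration makes it more capable still.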

More advanced examples

Object tracking: OpenCV provides a range of object tracking algorithms that can be used to track objects in a video stream. For example, you can use the cv2.TrackerKCF_create() function to create a KCF (Kernelized Correlation Filters) tracker and then use it to track an object in a video stream. Here is an example:
    import cv2
    # create the video capture object
    cap = cv2.VideoCapture("video.mp4")
    # get the first frame
    ret, frame = cap.read()
    # select the object to track
    bbox = cv2.selectROI(frame, False)
    # create the KCF tracker (some OpenCV builds need opencv-contrib-python for this)
    tracker = cv2.TrackerKCF_create()
    tracker.init(frame, bbox)
    # start the tracking loop
    while True:
        # get the next frame
        ret, frame = cap.read()
        # stop at the end of the video
        if not ret:
            break
        # update the tracker
        success, bbox = tracker.update(frame)
        # check if the tracking failed
        if not success:
            break
        # draw the bounding box
        cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3])), (255, 0, 0), 2)
        # show the frame
        cv2.imshow("Tracking", frame)
        # exit if the user presses the 'q' key
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    # release the video capture and close the window
    cap.release()
    cv2.destroyAllWindows()
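
The drawing call in the loop converts the tracker's (x, y, width, height) box into two corner points inline. A small pure-Python helper can make that step easier to read and also keep the corners inside the frame. This is only a sketch; the name bbox_to_corners and the clamping behaviour are choices made for this illustration.

```python
def bbox_to_corners(bbox, frame_width, frame_height):
    """Convert (x, y, w, h) to integer (top-left, bottom-right) points inside the frame."""
    x, y, w, h = bbox
    left = min(max(int(x), 0), frame_width - 1)
    top = min(max(int(y), 0), frame_height - 1)
    right = min(max(int(x + w), 0), frame_width - 1)
    bottom = min(max(int(y + h), 0), frame_height - 1)
    return (left, top), (right, bottom)


# a box well inside a 640x480 frame, and one that spills over the edges
print(bbox_to_corners((10.6, 20.2, 30.0, 40.9), 640, 480))
print(bbox_to_corners((-5.0, 470.0, 20.0, 30.0), 640, 480))
```

The two returned points can be passed directly as the corner arguments of cv2.rectangle.
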
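
Optical flow: OpenCV provides several optical flow algorithms that can be used to track the motion of objects in a video stream. A popular one is the Farneback algorithm, which estimates the dense optical flow between two frames. The following example uses it to visualize the optical flow in a video stream:
    import cv2
    import numpy as np
    # create the video capture object
    cap = cv2.VideoCapture("video.mp4")
    # get the first frame
    ret, frame1 = cap.read()
    gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    # start the tracking loop
    while True:
        # get the next frame
        ret, frame2 = cap.read()
        # stop at the end of the video
        if not ret:
            break
        gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
        # calculate the optical flow
        flow = cv2.calcOpticalFlowFarneback(gray1, gray2, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # visualize the optical flow: direction as hue, magnitude as brightness
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        hsv = np.zeros((gray1.shape[0], gray1.shape[1], 3), dtype=np.uint8)
        hsv[..., 0] = ang * 180 / np.pi / 2
        hsv[..., 1] = 255
        # assumed completion (the listing breaks off after this assignment),
        # following the usual Farneback visualization
        hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
        cv2.imshow("Optical Flow", cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
        # the current frame becomes the previous frame
        gray1 = gray2
    cap.release()
    cv2.destroyAllWindows()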
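
The visualization step maps each flow vector to a color: its direction becomes the hue and its length becomes the brightness. Below is a NumPy-only sketch of that mapping on a hand-made flow field, so the numbers can be checked by eye. The helper name flow_to_hsv is hypothetical, and brightness is simply scaled by the peak magnitude, which is an assumption of this sketch rather than something the article specifies.

```python
import numpy as np


def flow_to_hsv(flow):
    """Map a (H, W, 2) flow field to an OpenCV-style uint8 HSV image."""
    dx = flow[..., 0].astype(np.float64)
    dy = flow[..., 1].astype(np.float64)
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2 * np.pi)  # radians in [0, 2*pi)

    hsv = np.zeros(flow.shape[:2] + (3,), dtype=np.uint8)
    # OpenCV stores 8-bit hue as degrees / 2, i.e. in the range 0..179
    hsv[..., 0] = (np.round(angle * 180 / np.pi / 2) % 180).astype(np.uint8)
    hsv[..., 1] = 255
    peak = magnitude.max()
    if peak > 0:
        hsv[..., 2] = np.round(magnitude / peak * 255).astype(np.uint8)
    return hsv


# unit motion to the right, down, left, up, and one pixel with no motion
flow = np.array([[[1, 0], [0, 1], [-1, 0], [0, -1], [0, 0]]], dtype=np.float32)
hsv = flow_to_hsv(flow)
print(hsv[0, :, 0])  # hue per pixel
print(hsv[0, :, 2])  # brightness per pixel
```
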

Another example of OpenCV's machine learning capabilities is object detection with a pre-trained model. A popular object detection model is the Single Shot MultiBox Detector (SSD), a deep learning model that can detect multiple objects in an image.
    import cv2
    import numpy as np
    # read the image
    image = cv2.imread("image.jpg")
    # image size, used to scale the normalized box coordinates
    (h, w) = image.shape[:2]
    # read the pre-trained model and config files
    net = cv2.dnn.readNetFromCaffe("ssd.prototxt", "ssd.caffemodel")
    # create a 4D blob from the image
    blob = cv2.dnn.blobFromImage(image, 1.0, (300, 300), (104.0, 177.0, 123.0))
    # set the blob as input to the model
    net.setInput(blob)
    # get the detections
    detections = net.forward()
    # loop over the detections
    for i in range(detections.shape[2]):
        # get the confidence of the detection
        confidence = detections[0, 0, i, 2]
        # filter out weak detections
        if confidence > 0.5:
            # get the coordinates of the detection
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = [int(v) for v in box]
            # draw the detection on the image
            cv2.rectangle(image, (startX, startY), (endX, endY), (0, 0, 255), 2)
    # display the image
    cv2.imshow("Objects", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

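In the example above, we used cv2.dnn.readNetFromCaffe to load the SSD model and its configuration file, created a blob from the input image, set the blob as the model's input, ran a forward pass to obtain the detections, filtered out weak detections, and drew the remaining detections on the image.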
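
The filtering and scaling steps can be tried without a model file by feeding a hand-written detections array of the same shape. The sketch below assumes the usual SSD output layout of (1, 1, N, 7), where each row is [image_id, class_id, confidence, x1, y1, x2, y2] with coordinates normalized to 0..1. The helper name decode_detections and the sample values are made up for illustration.

```python
import numpy as np


def decode_detections(detections, width, height, min_confidence=0.5):
    """Return (class_id, confidence, (x1, y1, x2, y2)) for each strong detection."""
    results = []
    for row in detections[0, 0]:
        confidence = float(row[2])
        if confidence <= min_confidence:
            continue  # filter out weak detections
        # scale the normalized corners to pixel coordinates
        box = row[3:7] * np.array([width, height, width, height])
        x1, y1, x2, y2 = (int(v) for v in box)
        results.append((int(row[1]), confidence, (x1, y1, x2, y2)))
    return results


# two fake detections: one strong (kept) and one weak (dropped)
fake = np.array(
    [[[
        [0, 15, 0.9, 0.1, 0.2, 0.5, 0.6],
        [0, 7, 0.3, 0.0, 0.0, 1.0, 1.0],
    ]]],
    dtype=np.float32,
)
strong = decode_detections(fake, width=300, height=200)
print(strong)
```
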

Another example uses OpenCV's cv2.Tracker classes to track an object in a video.
    import cv2
    # Read video
    cap = cv2.VideoCapture("video.mp4")
    # Read the first frame
    ret, frame = cap.read()
    # Define the region of interest (RoI)
    roi = cv2.selectROI(frame)
    # Initialize the tracker
    tracker = cv2.TrackerKCF_create()
    tracker.init(frame, roi)
    # Loop over the frames
    while True:
        # Read the next frame
        ret, frame = cap.read()
        if not ret:
            break
        # Update the tracker
        success, roi = tracker.update(frame)
        # Draw the RoI
        if success:
            (x, y, w, h) = [int(v) for v in roi]
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Show the frame
        cv2.imshow("Frame", frame)
        # assumed completion (the listing breaks off here): exit on 'q', then clean up
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
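
Another advanced use of OpenCV is making an object in an image disappear with image matting. Image matting is the process of estimating the opacity of each pixel in an image, which lets you separate a foreground object from the background.

The following example uses OpenCV's cv2.createBackgroundSubtractorMOG2 function to extract the foreground object from an image and make it disappear: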
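    import cv2
    # Read the image
    image = cv2.imread("image.jpg")
    # Create the background subtractor (shadow detection off so the mask is binary).
    # MOG2 learns the background from a sequence of frames, so with a single
    # image the mask may cover most of the frame.
    bgSubtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    # Apply the background subtractor to the image
    fgMask = bgSubtractor.apply(image)
    # Use a morphological operator to remove noise
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    fgMask = cv2.morphologyEx(fgMask, cv2.MORPH_CLOSE, kernel)
    # Invert the mask to get the background
    bgMask = cv2.bitwise_not(fgMask)
    # Use the mask to extract the background and the object
    bg = cv2.bitwise_and(image, image, mask=bgMask)
    fg = cv2.bitwise_and(image, image, mask=fgMask)
    # Add an alpha channel; the background is opaque only where bgMask is set
    bg = cv2.cvtColor(bg, cv2.COLOR_BGR2BGRA)
    bg[..., 3] = bgMask
    fg = cv2.cvtColor(fg, cv2.COLOR_BGR2BGRA)
    # Set the object pixels to transparent and clear the rest of the layer
    fg[fgMask > 0] = (255, 255, 255, 0)
    fg[fgMask == 0] = 0
    # Combine the background and the transparent object
    result = cv2.add(bg, fg)
    # Show the result (imshow does not render transparency; save as PNG to keep it)
    cv2.imshow("Object Disappeared", result)
    cv2.waitKey(0)
    cv2.destroyAllWindows()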

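In this example, we used OpenCV's cv2.createBackgroundSubtractorMOG2 function to create a background subtractor and applied it to the image to extract the foreground object.

We then used a morphological operator to remove noise from the mask. After that, we inverted the mask to obtain the background mask, and used the two masks to extract the background and the object.

Finally, we set the object pixels to transparent and combined the background with the transparent object to produce the final result, in which the object has disappeared.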

Summary

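OpenCV is a powerful and widely used library for computer vision and machine learning tasks. It provides a broad set of image and video processing tools, including feature detection, image recognition, object tracking, and machine learning.

The examples in this article demonstrated tasks such as reading and displaying images, converting images to grayscale, detecting features in images, object detection, and image matting.

OpenCV also provides a number of machine learning based capabilities, such as detection, recognition, and tracking with the cv2.ml and cv2.dnn modules. With OpenCV, developers can readily integrate computer vision and machine learning into their projects and build new solutions for a variety of industries.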