Clear Half-Body Portrait Detection

Last updated on the evening of August 22, 2024

Abstract

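Use code to detect whether a video contains a clear half-body portrait of this kind, with the person fairly prominent in the frame.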

Methods

graph LR;

v(Video) --cv2--> f(frame)
f --YOLOv8x-pose--> p(points)
p --> elp([left eye point])
p --> erp([right eye point])
elp --> ed([eye distance])
erp --> ed
ed -- ÷ image width --> ef([ratio]) -- >=0.12 --> FT((obvious))
ef -- <0.12 --> FF((not obvious))
p --> np([nose point])
elp --> bias([bias])
erp --> bias
np --> bias
bias --> re((reasonable))
bias --> nre((unreasonable))
f --Sobel operator--> s([sobel])
s --> t([tenengrad])
t -- >=1000 --> TC((clear))
t -- <1000 --> FC((blur))
FT --> matched
re --> matched
TC --> matched
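
The final step of the graph, where the three branches meet at `matched`, can be sketched as a single frame-level decision. This is only an illustration: `FrameChecks` and `is_matched` are hypothetical names, and the default thresholds are the 0.12 and 1000 used in this post, with boundary values counted as passing.

```python
from dataclasses import dataclass


@dataclass
class FrameChecks:
    """Intermediate values for one detected person (hypothetical container)."""
    eye_ratio: float       # eye distance divided by the image width
    pose_reasonable: bool  # nose position relative to the eyes looks plausible
    tenengrad: float       # sharpness score after scaling


def is_matched(checks: FrameChecks,
               min_eye_ratio: float = 0.12,
               min_tenengrad: float = 1000.0) -> bool:
    """A person matches only if all three branches of the graph pass."""
    obvious = checks.eye_ratio >= min_eye_ratio
    clear = checks.tenengrad >= min_tenengrad
    return obvious and checks.pose_reasonable and clear
```

Keeping the three conditions separate makes it easy to log which one rejected a frame while tuning the thresholds.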

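Frontal Half-Body Portrait Detection

Use yolov8x-pose to detect the keypoints of every person in the frame.
1. A complete human body yields 17 keypoints, indexed 0 to 16. Keypoints 0, 1 and 2 are the nose, left eye and right eye; 5-10 cover the upper body (shoulders, elbows, wrists) and 11-16 the lower body (hips, knees, ankles).
2. Compute eye_distance with the Pythagorean theorem, then use eye_distance / image.shape[1] to judge how much of the picture the portrait occupies. shape[1] is the image width, and in the code below the image has already been cropped to the person's bounding box. Because the projected eye distance shrinks as the head turns, the same ratio also filters out faces turned to the side.
3. Compare the nose position with the midpoint between the eyes to judge whether the face pose is reasonable; this rules out people who are lying down or looking down.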
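
The geometry in steps 2 and 3 can be tried out without a model. The following is a minimal sketch in plain Python, assuming keypoints are given as (x, y) pixel pairs in COCO order; `face_geometry` and `looks_frontal` are hypothetical helper names, and the default thresholds (0.12, 0.5, and 0.2 to 1.5) are the ones that appear in this post's code. Unlike that code, the sketch also enforces the vertical range.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]

# COCO keypoint indices used by YOLOv8 pose models
NOSE, LEFT_EYE, RIGHT_EYE = 0, 1, 2


def face_geometry(keypoints: Sequence[Point], box_width: float) -> Dict[str, float]:
    """Return the three ratios that the frontal-face checks are based on."""
    nose, left_eye, right_eye = keypoints[NOSE], keypoints[LEFT_EYE], keypoints[RIGHT_EYE]
    # Pythagorean theorem: straight-line distance between the two eyes
    eye_distance = math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
    if eye_distance == 0 or box_width <= 0:
        raise ValueError("eyes coincide or the box has no width")
    mid_x = (left_eye[0] + right_eye[0]) / 2
    mid_y = (left_eye[1] + right_eye[1]) / 2
    return {
        "size_ratio": eye_distance / box_width,
        "horizontal_ratio": abs(nose[0] - mid_x) / eye_distance,
        "vertical_ratio": abs(nose[1] - mid_y) / eye_distance,
    }


def looks_frontal(keypoints: Sequence[Point], box_width: float,
                  min_size: float = 0.12, max_offset: float = 0.5,
                  vertical_range: Tuple[float, float] = (0.2, 1.5)) -> bool:
    """Apply the size, symmetry and vertical-range checks to one face."""
    g = face_geometry(keypoints, box_width)
    return (g["size_ratio"] >= min_size
            and g["horizontal_ratio"] < max_offset
            and vertical_range[0] <= g["vertical_ratio"] <= vertical_range[1])


# Eyes 20 px apart in a 100 px wide box, nose centered 10 px below them
frontal = [(50.0, 60.0), (60.0, 50.0), (40.0, 50.0)]
print(face_geometry(frontal, 100.0), looks_frontal(frontal, 100.0))
```

Dividing both offsets by eye_distance makes the pose checks independent of how large the face is, so only the size ratio depends on the box width.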

import logging
import math

import numpy as np


def is_facing(self, image, detection_results, idx=0):
    """
    Decide from the detection results whether the person is a frontal portrait.

    :param image: frame data read with cv2
    :param detection_results: result returned by YOLOv8x-pose
    :param idx: index of the detection to check
    :return: bool
    """
    # Crop the person region
    x1, y1, x2, y2 = detection_results.boxes.xyxy[idx]
    image = image[int(y1):int(y2), int(x1):int(x2)]

    # Use the detection results that were passed in directly
    res = detection_results

    if not hasattr(res, 'keypoints') or not res.keypoints:
        return False
    points = res.keypoints.xy[idx]
    logging.info(f'[{self.video.stem}] Key Points: {points.tolist()}')
    if len(points) < 1:
        return False
    left_eye = points[1]
    right_eye = points[2]
    nose = points[0]
    self.result.update({
        'keypoints': points.tolist()
    })
    # A missing facial keypoint (reported as (0, 0)) means this is not a frontal face
    if any(sum(point) == 0 for point in [left_eye, right_eye, nose]):
        return False
    # Midpoint between the eyes
    mid_point = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    # Euclidean distance between the eyes
    eye_distance = math.sqrt((right_eye[0] - left_eye[0]) ** 2 + (right_eye[1] - left_eye[1]) ** 2)
    if eye_distance == 0:  # coincident eyes would cause a division by zero below
        return False
    # Horizontal offset between the nose and the eye midpoint
    horizontal_offset = float(abs(nose[0] - mid_point[0]))

    # Vertical distance between the nose and the eye midpoint
    vertical_distance = float(abs(nose[1] - mid_point[1]))

    # The nose must lie within 0.5 * eye_distance of the eye midpoint horizontally
    is_symmetrical = horizontal_offset < 0.5 * eye_distance
    self.result.update({
        'horizontal_offset': float(horizontal_offset),
        'eye_distance': float(eye_distance),
        'symmetrical': float(horizontal_offset / eye_distance),
        'is_symmetrical': bool(is_symmetrical)
    })
    # Eye distance relative to the width of the cropped person box
    if eye_distance / image.shape[1] < 0.12:
        return False
    if self.debug:
        self._save_debug_image(image, 'symmetrical',
                               f'{round(horizontal_offset / eye_distance, 2)}-{self.video.parent.name}_{self.video.stem}.png')
    if not is_symmetrical:
        logging.info(f'[{self.video.stem}] Nose is off-center: {horizontal_offset} >= 0.5 * {eye_distance}')
        return False

    # Check whether vertical_distance / eye_distance is in a reasonable range (0.2 to 1.5 here)
    reasonable_vertical_distance = 0.2 <= vertical_distance / eye_distance <= 1.5
    self.result.update({
        'vertical_distance': float(vertical_distance),
        'reasonable_vertical_distance': float(vertical_distance / eye_distance),
        'is_reasonable_vertical_distance': bool(reasonable_vertical_distance)
    })
    if self.debug:
        self._save_debug_image(image, 'reasonable',
                               f'{round(vertical_distance / eye_distance, 2)}-{self.video.parent.name}_{self.video.stem}.png')
    # The vertical check is currently recorded but not enforced:
    # if not reasonable_vertical_distance:
    #     logging.info(
    #         f'[{self.video.stem}] Vertical ratio out of range. {self.vd_ratio[0]} <= {vertical_distance / eye_distance} <= {self.vd_ratio[1]}')
    #     return False

    # Keypoints 5-10 are the upper body; an all-zero sum suggests a portrait shown on a phone screen
    if np.sum(points[5:11].tolist()) == 0:
        logging.info(f'[{self.video.stem}] No upper body')
        return False
    # Keypoints 11-16 are the lower body; if present, this is not a half-body shot (check disabled):
    # if np.sum(points[11:].tolist()) > 0:
    #     logging.info(f'[{self.video.stem}] Lower body present')
    #     return False
    return True
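
To show where `detection_results` comes from, here is a rough sketch of a driver loop built on cv2 and the Ultralytics `YOLO` class. Several parts are assumptions not stated in this post: `detector` stands for any object exposing the `is_facing` and `tenengrad_clarity` methods, the sampling step of 30 frames is arbitrary, and treating the video as matched as soon as one person in one frame passes is just one possible policy. The helper names `iter_frames`, `people_in_frame` and `scan_video` are hypothetical.

```python
from typing import Any, Iterator, Tuple


def iter_frames(video_path: str, step: int = 30) -> Iterator[Tuple[int, Any]]:
    """Yield (frame_index, frame) for every `step`-th frame of the video."""
    import cv2  # imported here so the other helpers load without OpenCV

    cap = cv2.VideoCapture(video_path)
    try:
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or read error
                break
            if index % step == 0:
                yield index, frame
            index += 1
    finally:
        cap.release()


def people_in_frame(model: Any, frame: Any) -> Tuple[Any, int]:
    """Run the pose model on one frame; return (result, number of people)."""
    result = model(frame)[0]  # the model returns one result per input image
    count = 0 if result.boxes is None else len(result.boxes)
    return result, count


def scan_video(video_path: str, detector: Any,
               weights: str = "yolov8x-pose.pt") -> bool:
    """Return True as soon as one sampled frame holds a clear frontal portrait."""
    from ultralytics import YOLO

    model = YOLO(weights)
    for _, frame in iter_frames(video_path):
        result, count = people_in_frame(model, frame)
        for idx in range(count):
            if (detector.is_facing(frame, result, idx)
                    and detector.tenengrad_clarity(frame, result.boxes, idx)):
                return True
    return False
```

Running the cheaper keypoint checks before the Sobel pass means the sharpness score is only computed for people who already look like frontal portraits.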

Sharpness Detection

Sharpness is computed with the Tenengrad measure.

import logging

import cv2
import numpy as np


def tenengrad_clarity(self, image, boxes, idx=0):
    """Use the Tenengrad measure as the image sharpness metric."""
    # Crop the person region
    x1, y1, x2, y2 = boxes.xyxy[idx]
    image = image[int(y1):int(y2), int(x1):int(x2)]

    # Gradients in the x and y directions via the Sobel operator
    sobelx = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
    sobely = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)
    # Gradient magnitude
    gradient_magnitude = np.sqrt(sobelx ** 2 + sobely ** 2)
    # Sum of squared gradient magnitudes (not normalized, so it grows with crop size)
    tenengrad = np.sum(gradient_magnitude ** 2)
    # The raw value is very large, so divide by 1,000,000 to make debugging easier
    tenengrad /= 1000000
    # Intermediate values are stored in a dict to make debugging easier
    self.result.update({
        'clarity': float(tenengrad),
        'is_clear': bool(tenengrad >= 1000)
    })
    logging.info(f'[{self.video.stem}] Sharpness: {tenengrad}')
    if self.debug:
        self._save_debug_image(image, 'sharpness',
                               f'{round(float(tenengrad), 2)}-{self.video.parent.name}_{self.video.stem}.png')
    return self.result['is_clear']
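
To see why the Tenengrad score separates sharp from blurred images, the measure can be computed by hand on a tiny grayscale grid. The sketch below is a dependency-free illustration, not the cv2 implementation used above: `tenengrad` is a hypothetical helper that applies the 3x3 Sobel kernels to interior pixels only, whereas `cv2.Sobel` also extrapolates the border.

```python
from typing import Sequence

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def tenengrad(gray: Sequence[Sequence[float]]) -> float:
    """Sum of squared Sobel gradient magnitudes over the interior pixels."""
    height, width = len(gray), len(gray[0])
    total = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            total += gx * gx + gy * gy  # squared gradient magnitude
    return total


# Same brightness range, once as a hard edge and once as a smooth ramp
sharp = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
blurred = [[0, 51, 102, 153, 204, 255] for _ in range(5)]
print(tenengrad(sharp) > tenengrad(blurred))
```

Squaring the gradient rewards a few strong edges more than many weak ones, which is why the hard edge scores higher than the ramp even though both span the same brightness range.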


#portrait #detection #YOLO #sharpness-detection


https://tippye.github.io/2024/08/22/半身清晰人像检测/

Author

Tippy

Published on

August 22, 2024

