FrankMocap is special in that it is dedicated to 3D motion tracking from 2D video. Whereas the lightweight MediaPipe body and hand tracking also provides some depth information, FrankMocap is specifically optimized to estimate the 3D joint positions (and rotations) of the hands and body as accurately as possible from 2D video. See the main FrankMocap paper in the resources for performance indicators. Note that FrankMocap only supports tracking a single person, and it will actually throw an error if there is more than one person in the frame.
The installation of FrankMocap can be quite cumbersome, and the procedure depends on your operating system (for a description, see the FrankMocap repository linked below). We have installed FrankMocap on a Windows machine following the exact instructions of this helpful tutorial: https://www.youtube.com/watch?v=MYLMM7jOMS4. If your institute has such resources, you can ask your IT department to assist with the installation.
But even when the installation is complete, the output you can then generate is not yet a time series and tracking video that we can work with. FrankMocap generates only single video frames with the pose projected onto them, plus one .pkl file per frame that contains the tracking information. This Python script therefore assists in this last step: it transforms the FrankMocap output into workable time series data and creates a video from the frames. We will extract the joint positions, but note that the script can be adapted if you want the 3D joint rotation angles instead.
The FrankMocap output processed below was generated with the following command:

"python -m demo.demo_frankmocap --input_path ./sampledata/TanDun_youtube_conductorexample.mp4 --out_dir ./mocap_output --save_pred_pkl"
location code: https://github.com/WimPouw/EnvisionBootcamp2021/tree/main/Python/FrankMocapBodyandHandTracking
citation: Pouw, W. & Trujillo, J. P. (2021-11-18). Processing output for 3D single person full-body tracking with FrankMocap [day you visited the site]. Retrieved from: https://wimpouw.github.io/EnvisionBootcamp2021/FrankMocapBodyandHandTracking.html
from IPython.display import HTML
HTML('<iframe width="935" height="584" src="https://www.youtube.com/embed/mw8RymohMp0?start=8668" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>')
import pickle #for opening pkl files
import cv2 #for video processing functions
import os
import numpy as np
import pandas as pd
import csv
#folder that contains the predicted points (one .pkl file per frame)
framefol = "../frankmocap_output/mocap/"
framedata = os.listdir(framefol)
#folder that contains the rendered images
renderfol = "../frankmocap_output/rendered/"
renderdata = os.listdir(renderfol)
#folder for saving the rendered video to
videofold = '../MTVideos/'
#folder for saving the timeseries to
outtsfol = '../OutputTimeseries/'
Let's first render the frames produced by FrankMocap into a motion tracking video and save it to the MTVideos folder.
#check the frame properties (sort so frames are written in order)
images = sorted(renderdata)
checkpixels = cv2.imread(renderfol + images[0])
height, width = checkpixels.shape[:2]
print(width, height)
# choose codec according to format needed
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
video = cv2.VideoWriter(videofold + 'MTvideo.mp4', fourcc, 29.97, (width, height))
#loop through the images and write them to the video
for i in images:
    img = cv2.imread(renderfol + i)
    video.write(img)
#release
video.release()
cv2.destroyAllWindows()
print("done")
We will now extract all the tracked body points from the .pkl files. We will only do this for the upper body, as the rest was not visible in this video. You can of course adapt the script to extract all points.
#each left or right hand entry has:
#pred_joints_smpl
#faces
#bbox_scale_ratio
#bbox_top_left
#pred_camera
#img_cropped
#pred_hand_pose -> this gives the joint angles
#pred_hand_betas
#pred_joints_img*** -> this gives the joint positions (rescaled for camera)
#***we will extract the rescaled joint positions
#https://github.com/facebookresearch/frankmocap/issues/31
#in body+hand mode the body points are the same as OpenPose
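If you want to check what a frame's .pkl file holds before extracting anything, you can load one with pickle and list its keys. Below is a minimal sketch that does this with a mock file mimicking the structure used later in this script; the real FrankMocap files contain more keys (as listed above), and the shapes here are assumptions based on the 21-joint hand and OpenPose body layout.

```python
import os
import pickle
import tempfile

import numpy as np

# Build a mock FrankMocap-style frame file (real .pkl files hold more keys)
mock = {'pred_output_list': [{
    'pred_body_joints_img': np.zeros((25, 3)),   # OpenPose-ordered body joints
    'pred_lhand_joints_img': np.zeros((21, 3)),  # 21 left-hand joints
    'pred_rhand_joints_img': np.zeros((21, 3)),  # 21 right-hand joints
}]}
mockpath = os.path.join(tempfile.gettempdir(), 'mock_frame.pkl')
with open(mockpath, 'wb') as f:
    pickle.dump(mock, f)

# Load it back and inspect the available keys, as you would for a real frame
with open(mockpath, 'rb') as f:
    data = pickle.load(f)
print(sorted(data['pred_output_list'][0].keys()))
```

For a real file, simply point the `open` call at one of the .pkl files in the mocap folder instead.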
markerspositionhand = ['Wrist', 'Thumb_00', 'Thumb_01', 'Thumb_02', 'Thumb_03', 'Index_00', 'Index_01',
'Index_02', 'Index_03', 'Middle_00', 'Middle_01', 'Middle_02', 'Middle_03', 'Ring_00',
'Ring_01', 'Ring_02', 'Ring_03', 'Little_00', 'Little_01', 'Little_02', 'Little_03']
#let's make a list of the variables with x, y, z coordinates
markerhands = []
for i in ['L_', 'R_']:
    for j in markerspositionhand:
        for k in ['x_', 'y_', 'z_']:
            markerhands.append(str.lower(k + i + j))
#for the upper body we only want these points; their order follows OpenPose
markerspositionbody = ['Nose', 'Neck', 'RShoulder', 'RElbow', 'RWrist', 'LShoulder',
'LElbow', 'LWrist', 'MidHip', 'RHip', 'RKnee', 'RAnkle', 'LHip']
#'LKnee', 'LAnkle', 'REye', 'LEye', 'REar', 'LEar', 'LBigToe', 'LSmallToe',
# 'LHeel', 'RBigToe', 'RSmallToe', 'RHeel', 'Background']
markerbody = []
for j in markerspositionbody:
    for k in ['x_', 'y_', 'z_']:
        markerbody.append(str.lower(k + j))
columns = markerhands + markerbody #these are your column names
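As a quick sanity check on the naming loops above: 21 hand markers per hand, times two hands, times three coordinates, plus 13 body markers times three coordinates, gives 165 column names. A compact, self-contained version of the same construction:

```python
# Rebuild the column names compactly (same loop order as above)
hands = ['Wrist'] + [f'{f}_{n:02d}' for f in ('Thumb', 'Index', 'Middle', 'Ring', 'Little')
                     for n in range(4)]
body = ['Nose', 'Neck', 'RShoulder', 'RElbow', 'RWrist', 'LShoulder',
        'LElbow', 'LWrist', 'MidHip', 'RHip', 'RKnee', 'RAnkle', 'LHip']
coords = ('x_', 'y_', 'z_')
columns = ([(k + side + m).lower() for side in ('L_', 'R_') for m in hands for k in coords]
           + [(k + m).lower() for m in body for k in coords])
print(len(columns))  # 2 * 21 * 3 + 13 * 3 = 165
print(columns[0])    # x_l_wrist
```

If the count does not match the number of values per frame later on, the marker lists and the extracted joints are out of sync.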
#make a time series by concatenating row information
timeseries = []
timeseries.append(columns) #the first row holds the column names
for frame in sorted(framedata): #sort so frames are processed in order
    with open(framefol + frame, 'rb') as f:
        data = pickle.load(f)
    timsdat = data['pred_output_list']
    timsdat = pd.DataFrame(timsdat)
    #extract the hand and body joint positions
    leftjoints = np.ndarray.tolist(timsdat['pred_lhand_joints_img'][0])
    rightjoints = np.ndarray.tolist(timsdat['pred_rhand_joints_img'][0])
    bodyjoints = np.ndarray.tolist(timsdat['pred_body_joints_img'][0])
    #get everything in a flat list
    lj = []
    rj = []
    bj = []
    for i in leftjoints:
        lj.extend(i)
    for i in rightjoints:
        rj.extend(i)
    #only keep the upper-body points listed in markerspositionbody
    for it in range(len(markerspositionbody)):
        bj.extend(bodyjoints[it])
    #append the flattened frame to the list
    timeseries.append(lj + rj + bj)
print('done! Now saving to file')
print('done! Now saving to file')
#data to be written row-wise to a csv file
data = timeseries
#open the csv file in 'w+' mode and write the rows
with open(outtsfol + 'timeseries.csv', 'w+', newline='') as file:
    write = csv.writer(file)
    write.writerows(data)
print('done! Timeseries saved')
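Once saved, the time series can be loaded back for analysis. As a small illustration, here is how you could compute the per-frame displacement of the right wrist; this sketch uses a few synthetic rows in place of the real output (in practice you would load `timeseries.csv` with `pd.read_csv`), with the column names as produced above:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the saved timeseries (real data: pd.read_csv of timeseries.csv)
df = pd.DataFrame({
    'x_rwrist': [0.0, 3.0, 6.0],
    'y_rwrist': [0.0, 4.0, 8.0],
    'z_rwrist': [0.0, 0.0, 0.0],
})

# Per-frame displacement: Euclidean distance between consecutive frames
diffs = np.diff(df[['x_rwrist', 'y_rwrist', 'z_rwrist']].values, axis=0)
disp = np.sqrt(np.sum(diffs ** 2, axis=1))
print(disp)  # each step moves 3 in x and 4 in y -> distance 5 per frame
```

Multiplying such displacements by the frame rate gives a speed time series, which is a common starting point for gesture kinematics.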