DECOLOR: Moving Object Detection by Detecting Contiguous Outliers
in the Low-Rank Representation
Xiaowei Zhou et al.
Abstract—Object detection is a fundamental step for automated video analysis in many vision applications. Object detection in a video
is usually performed by object detectors or background subtraction techniques. Often, an object detector requires manually labeled
examples to train a binary classifier, while background subtraction needs a training sequence that contains no objects to build a background model. To automate the analysis, object detection without a separate training phase becomes a critical task. People have tried to tackle this task by using motion information. But existing motion-based methods are usually limited when coping with complex scenarios such as nonrigid motion and dynamic background. In this paper, we show that the above challenges can be addressed in a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR). This formulation integrates object detection and background learning into a single process of optimization, which can be solved by an alternating algorithm efficiently. We explain the relations between DECOLOR and other sparsity-based methods. Experiments on both simulated data and real sequences demonstrate that DECOLOR outperforms the state-of-the-art approaches and it can work effectively on a wide range of complex scenarios.
本文的三个贡献点,分别是:
1.提出了一个在低秩框架下的离群点检测方法,即:DECOLOR.
2.引入了连续先验,用MRF来对相邻像素点之间的联系进行建模.
3.结合了参数运动模型,来弥补相机的运动.使得DECOLOR在动态背景下,仍可以取得不错的效果.
3 C ONTIGUOUS O UTLIER D ETECTION IN THE L OW -R ANK R EPRESENTATION
3.2. Formulation
本文将输入的视频记为D,那么要做的工作就是将D分解为前景F和背景B,其模型的构建分别为:
背景模型:在不考虑动态背景的情况下,视频的背景在没有光照的变化和周期性的纹理的影响时,背景的强度应该是不变的.所以,背景图像可以看作是现行相关的,构成了一个低秩矩阵B,于是就有了:
rank(B) <= K;
前景模型:
所谓前景,也就是相对于背景而言运动的物体,也可以说是 any object that moves differently from the background. 前景目标的运动导致了强度的变化,这就使得低秩的背景无法模拟此类像素点.所以,在低秩表示的时候,这些像素点就成了离群点.另外,就是文章引入了一个先验,即:前景目标应该是连续的区域并且相对较小.可以用马尔科夫随机场来对前景像素点进行建模.马尔科夫随机场是一种图模型(Graph Model),谈到图模型,自然少不了边和点.这里,就是将单个像素点比作是图的一个顶点,而两个像素点之间的联系,就可看作是图的边.根据Ising model 可以得到如下的关于前景S的能量函数:
其中,Uij 代表前景支持Sij的一元势能, Lambda ij,kl 控制着像素点Sij 和Skl之间相关性的强度.一元势能Uij的定义为:
从这个定义看出,ij位置的像素点为前景时,其一元能量为 Lambda ij, 当其为背景时,则是0.当最小化公式(4)时,会使得 Uij 最小,也就使得前景像素点尽量稀疏了.( sparse foreground ). 从公式(4)的第二项可以看出,当ij 和kl位置处的像素点相同时,即 Sij = Skl=0或者=1时,第二项为0;否则就是1.这样最小化公式(4)也使得前景区域和背景区域尽可能的平滑,
信号模型:
信号模型描述了D的构成.在背景区域,Sij = 0, 我们认为 Dij = Bij + εij, 其中εij代表独立同分布的高斯噪声.在前景区域,Sij = 1,背景区域被前景区域所遮挡,所以,Dij等于前景的强度.
将上述三个模型统一起来,就是提出的如下model:
为了使得上述最小化能量可解,作了相关的松弛,即: 用核范数取代了 rank(B)<=K, 将公式(6)写成其对偶的形式,并且引入矩阵操作符,得到最终的能量函数:
其中,A 是图的邻结矩阵.
3.3. Algorithm
公式(7)的目标函数是非凸的,包括连续的和离散的变量.联合的优化是非常困难的,所以,在这里采用了迭代的方式,分别进行前景S和背景B的求解.B-step是一个凸优化问题,S-step是联合优化问题.
3.3.1 Estimation of the Low-Rank Matrix B
给定前景支持S的预测,那么最小化公式(7)就变成了如下的矩阵完全问题:
这是从部分观察来学习一个低秩的矩阵.该问题可以用 soft-impute 算法来求解,并且用到了如下的引理:
重写公式(8),我们有:
利用引理1,通过迭代的利用下列式子,可以得到公式(8)的最优解.
SOFT-IMPUTE 算法对应的code如下:
1 %% function for soft-impute 2 function [Z,Znorm,alpha] = softImpute(X,Z,Omega,alpha0,maxRank) 3 % 4 % This program implements the soft-impute algorithm followed by 5 % postprocessing in the Matrix completion paper Mazumder'10 IJML 6 % min || Z - X ||_Omega + \alpha || Z ||_Nulear 7 % \alpha is decrease from alpha0 to the minima value that makes rank(Z) <= maxRank 8 9 % X is the incomplete matrix 10 % maxRank is the desired rank in the constraint 11 % Omega is the mask with value 1 for data and 0 for missing part 12 if isempty(Z) 13 Z = X; 14 end 15 if isempty(Omega) 16 Omega = true(size(X)); 17 end 18 if isempty(alpha0) 19 [U,D] = svd(X,'econ'); 20 alpha0 = D(2,2); 21 end 22 if isempty(maxRank) 23 maxRank = -1; 24 end 25 % parameters 26 eta = 0.707; 27 epsilon = 1e-4; 28 maxInnerIts = 20; 29 %% trivial 30 % no rank constraint 31 if maxRank >= min(size(X)) 32 Z = X; 33 [U,D] = svd(Z,'econ'); 34 Znorm = sum(diag(D)); 35 alpha = 0; 36 return; 37 end 38 % no observation 39 if sum(Omega(:)) == 0 40 % no data 41 Z = zeros(size(X)); 42 Znorm = 0; 43 alpha = alpha0; 44 return; 45 end 46 %% soft-impute 47 % 1. initialize 48 outIts = 0; 49 alpha = alpha0; 50 % 2. Do for alpha = alpha0 > alpha_1 > alpha_2 > ... > alpha_maxRank 51 disp('begin soft-impute iterations'); 52 while 1 53 outIts = outIts + 1; 54 energy = inf; 55 for innerIts = 1:maxInnerIts 56 % (a)i 57 C = X.*Omega + Z.*(1-Omega); 58 [U,D,V] = svd(C,'econ'); 59 VT = V'; 60 % soft impute 61 d = diag(D); 62 idx = find(d > alpha); 63 Z = U(:,idx) * diag( d(idx) - alpha ) * VT(idx,:); 64 % (a)ii 65 Znorm = sum(d(idx)-alpha); 66 energy_old = energy; 67 energy = alpha*Znorm + norm(Z(Omega(:))-X(Omega(:)),'fro')/2; 68 if abs(energy - energy_old) / energy_old < epsilon 69 break 70 end 71 end 72 % check termination condition of alpha 73 k = length(idx); % rank of Z 74 disp(['alpha = ' num2str(alpha) '; rank = ' num2str(k) '; number of iteration: ' num2str(innerIts)]); 75 if k <= maxRank && alpha > 1e-3 76 alpha = alpha*eta; 77 else 78 break; 79 end 80 end 81 end
3.3.2 Estimation of the Outlier Support S
在假设给定低秩矩阵B的情况下,如何求前景支持S ? 我们可以得到如下的公式:
从文章中得知,当给定B的时候,C 是一个常数,那么可以忽略,Sij前面的那一大堆项,也是定量,于是就成了 Sij 和 ||A vec(S)||1两家独大,这两项即是马尔科夫随机场的两项.
于是,通过上述两种方式,就可以迭代的求解出前景S和背景B.
对应的code 为:
1 for i = 1:size(Dtau,2) 2 GCO_SetDataCost( hMRF, (amplify/gamma)*[ 0.5*(E(:,i)).^2, ~OmegaOut(:,i)*beta + OmegaOut(:,i)*0.5*max(E(:)).^2]' ); 3 GCO_Expansion(hMRF); 4 Omega(:,i) = ( GCO_GetLabeling(hMRF) == 1 )'; 5 energy_cut = energy_cut + double( GCO_ComputeEnergy(hMRF) ); 6 end
额,说累了,歇会...出去走走...
4 EXTENSION TO MOVING ACKGROUND
Here, we use the 2D parametric transforms [60] to model the translation, rotation, and planar deformation of the background.
Dj ○ τj 代表被向量 τj ∈IRp转换后的第j帧图像,其中p表示运动模型的参数的编号,如:p=6代表仿射运动,p=8代表投影运动.所以,本文所提出的分解变成了 D○τ = B+E+ε,其中, D○τ = [D1○τ1, ... , Dn○τn], τ 是一个向量代表所有 τj 的集合.
接下来,我们用D○τ代替公式(7)中的D,并且用B,S来预测 τ,迭代的最小化下列公式:
现在开始讨论如何通过 τ 来最小化公式(20):
此处,我们利用增量优化的方法来解决这个参数运动估计问题:在每一次迭代,我们以小的增幅 △τ 更新 τ, 将 D○τ线性化为D○τ + J△τ ,
其中,J为雅克比矩阵.所以,τ 可以用下列方式进行更新:
通过△τ 的最小化过程时一个最小二乘问题,有闭合解.
实际上,τ1,...,τn 的更新可以分别独立的进行,为了加速DECOLOR的收敛,我们初始化τ,粗略的将每一帧Dj和中间帧Dn/2进行对比.这个预对齐的过程是用robust multiresolution method来进行的.
总的算法,归结为:
-------------- 算法部分结束 ---------------
代码解读:
1.RUN_REAL_MOVING.m
1 clear all 2 close all 3 4 addpath('internal'); 5 addpath(genpath('gco-v3.0')); 6 7 %% data 8 % static background 9 dataList = {'people1','people2','cars6','cars7'}; 10 11 for dataID = 1:4 12 13 dataName = dataList{dataID}; 14 load(['data\' dataName],'ImData'); 15 16 %% run DECOLOR 17 opt.flagAlign = 1; 18 opt.tol = 1e-3; 19 [LowRank,Mask,tau,info] = ObjDetection_DECOLOR(ImData,opt); 20 21 % warp masks to match the original images 22 for i = 1:size(ImData,3) 23 % use [Iwarp,Omega] = warpImg(I,tau,mode,extrapval) 24 Mask(:,:,i) = warpImg(double(Mask(:,:,i)),tau(:,i),1,0)>0.5; 25 cropRatio = 0.01; 26 Mask([1:round(cropRatio*size(Mask,1)),round((1-cropRatio)*size(Mask,1)):end],:,i) = false; 27 Mask(:,[1:round(cropRatio*size(Mask,2)),round((1-cropRatio)*size(Mask,2)):end],i) = false; 28 end 29 save(['result\' dataName '_DECOLOR.mat'],'dataName','Mask','LowRank','tau','info'); 30 31 32 %% displaying 33 load(['result\' dataName '_DECOLOR.mat'],'dataName','Mask','LowRank','tau'); 34 moviename = ['result\' dataName,'_DECOLOR.avi']; fps = 12; 35 mov = avifile(moviename,'fps',fps,'compression','none'); 36 for i = 1:size(ImData,3) 37 figure(1); clf; 38 subplot(2,2,1); 39 imshow(ImData(:,:,i)), axis off, colormap gray; axis off; 40 title('Original image','fontsize',12); 41 subplot(2,2,2); 42 imshow(LowRank(:,:,i)), axis off,colormap gray; axis off; 43 title('Low Rank','fontsize',12); 44 subplot(2,2,3); 45 imshow(ImData(:,:,i)), axis off,colormap gray; axis off; 46 hold on; contour(Mask(:,:,i),[0 0],'y','linewidth',5); 47 title('Segmentation','fontsize',12); 48 subplot(2,2,4); 49 imshow(ImData(:,:,i).*uint8(Mask(:,:,i))), axis off, colormap gray; axis off; 50 title('Foreground','fontsize',12); 51 mov = addframe(mov,getframe(1)); 52 end 53 h = close(mov); 54 55 end
2. ObjDetection_DECOLOR.m
1 function [LowRank,Mask,tau,info] = ObjDetection_DECOLOR(ImData,opt) 2 % This function use DECOLOR to detect moving objects in sequence ImData 3 % http://arxiv.org/PS_cache/arxiv/pdf/1109/1109.0882v1.pdf 4 % eexwzhou@ust.hk 5 % Syntex: [LowRank,Mask,tau] = ObjDetection_DECOLOR(ImData). 6 % Input: 7 % ImData -- 3D array representing a image sequence. 8 % Each image is stored in ImData(:,:,i). 9 % opt -- options. Usually, default setting is good. No need to specify. 10 % opt.K: desired rank of the estimated low-rank component. 11 % Default: \sqrt(min(size(D))) is good generally. 12 % opt.lambda: a constant controls the strength of smoothness regularize 13 % lambda ~ [1 5] is recommended. Default: 5 14 % opt.sigma: STD of noise in the image. If not specified, computed online 15 % opt.flagAlign: whether alighment is needed or not. 16 % opt.tol: convergence precision. Default: 1e-4 17 % Output: 18 % LowRank -- Low-rank background component 19 % Mask -- Segmented object mask 20 % tau - transformation parameters to compensate for camera motion 21 % info -- other information 22 23 disp('^_^^_^^_^^_^^_^^_^ DECOLOR ^_^^_^^_^^_^^_^'); 24 tic; 25 26 %% default parameter setting 27 if ~exist('opt','var'); opt = []; end 28 if ~isfield(opt,'tol'); opt.tol = 1e-4; end 29 if ~isfield(opt,'K'); opt.K = floor(sqrt(size(ImData,3))); end 30 if ~isfield(opt,'lambda'); opt.lambda = 5; end % gamma = opt.lambda * beta; 31 if ~isfield(opt,'sigma'); opt.sigma = []; end % sigma can be estimated online 32 if ~isfield(opt,'flagAlign'); opt.flagAlign = false; end % alignment or not 33 34 %% variable initialize 35 ImData = mat2gray(ImData); % 0~1 36 ImMean = mean(ImData(:)); 37 ImData = ImData - ImMean; % subtract mean is recommended 38 numImg = size(ImData,3); 39 sizeImg = [size(ImData,1),size(ImData,2)]; 40 if opt.flagAlign == true && sizeImg(2) > 1 41 disp('----------- Pre-alignment ----------'); 42 [ImTrans,tau] = preAlign(ImData); 43 Dtau = reshape(ImTrans,prod(sizeImg),numImg); 44 else 45 tau = []; 46 Dtau = reshape(ImData,prod(sizeImg),numImg); 47 end 48 maxOuterIts = 20; 49 alpha = []; % Default setting by soft-impute 50 beta = 0.5*(std(Dtau(:,1)))^2; % Start from a big value 51 minbeta = 0.5*(3*std(Dtau(:,1))/20)^2; % lower bound: suppose SNR <= 20 52 sigma = opt.sigma; % if empty, will be estimated online 53 B = Dtau; % the low-rank matrix 54 Omega = true(size(Dtau)); % background support 55 OmegaOut = false(size(Dtau)); % OmegaOut is to record the extrapolated regions 56 ObjArea = sum(~Omega(:)); 57 minObjArea = numel(Dtau(:,1))/1e4; % minimum number of outliers 58 59 % graph cuts initialization 60 % GCO toolbox is called 61 if opt.lambda > 0 62 hMRF = GCO_Create(prod(sizeImg),2); 63 GCO_SetSmoothCost( hMRF, [0 1;1 0] ); 64 AdjMatrix = getAdj(sizeImg); 65 amplify = 10 * opt.lambda; 66 GCO_SetNeighbors( hMRF, amplify * AdjMatrix ); 67 end 68 69 %% outer loop 70 energy_old = inf; % total energy 71 for outerIts = 1:maxOuterIts 72 disp(['---------------- Outer Loop: ' num2str(outerIts) ' ----------------']); 73 74 %% update tau 75 if opt.flagAlign == true && sizeImg(2) > 1 76 disp('*** Estimate Transformation ***'); 77 for i = 1:numImg 78 % update once 79 [Iwarp,tau(:,i),dummy,Lout] = regImg(reshape(B(:,i),sizeImg),ImData(:,:,i),tau(:,i),double(reshape(Omega(:,i),sizeImg)),1); 80 Dtau(:,i) = reshape(Iwarp,prod(sizeImg),1); 81 OmegaOut(:,i) = reshape(Lout,prod(sizeImg),1); % extrapolated regions 82 end 83 end 84 85 %% update B 86 disp('*** Estimate Low-rank Matrix *** '); 87 [B,Bnorm,alpha] = softImpute(Dtau,B,~OmegaOut&Omega,alpha,opt.K); 88 E = Dtau - B; 89 90 %% estimate sigma 91 if isempty(opt.sigma) 92 sigma_old = sigma; 93 residue = sort(E(~OmegaOut(:)&Omega(:))); 94 truncate = 0.005; 95 idx1 = round(truncate*length(residue))+1; 96 idx2 = round((1-truncate)*length(residue)); 97 sigma = std(residue(idx1:idx2)); 98 if abs(sigma_old-sigma)/abs(sigma_old) < 0.01 99 sigma = sigma_old; 100 end 101 end 102 % update beta 103 if ObjArea < minObjArea 104 beta = beta/2; 105 else 106 beta = min(max([beta/2,0.5*(3*sigma)^2 minbeta]),beta); 107 end 108 gamma = opt.lambda * beta; 109 110 %% estimate S = ~Omega; 111 disp('*** Estimate Outlier Support *** '); 112 disp(['$$$ beta = ' num2str(beta) '; gamma = ' num2str(gamma) '; sigma = ' num2str(sigma)]); 113 if opt.lambda > 0 114 % call GCO to run graph cuts 115 energy_cut = 0; 116 for i = 1:size(Dtau,2) 117 GCO_SetDataCost( hMRF, (amplify/gamma)*[ 0.5*(E(:,i)).^2, ~OmegaOut(:,i)*beta + OmegaOut(:,i)*0.5*max(E(:)).^2]' ); 118 GCO_Expansion(hMRF); 119 Omega(:,i) = ( GCO_GetLabeling(hMRF) == 1 )'; 120 energy_cut = energy_cut + double( GCO_ComputeEnergy(hMRF) ); 121 end 122 ObjArea = sum(Omega(:)==0); 123 energy_cut = (gamma/amplify) * energy_cut; 124 else 125 % direct hard thresholding if no smoothness 126 Omega = 0.5*E.^2 < beta; 127 ObjArea = sum(Omega(:)==0); 128 energy_cut = 0.5*norm(Dtau-B-E,'fro')^2 + beta*ObjArea; 129 end 130 131 %% display energy 132 energy = energy_cut + alpha * Bnorm; 133 disp(['>>> the object area is ' num2str(ObjArea)]); 134 disp(['>>> the objectvive energy is ' num2str(energy)]); 135 136 %% check termination condition 137 if ObjArea > minObjArea && abs(energy_old-energy)/energy < opt.tol; break; end 138 energy_old = energy; 139 140 end 141 142 LowRank = uint8(mat2gray(reshape(B,size(ImData))+ImMean)*256); 143 Mask = reshape(~Omega,size(ImData)); 144 145 info.opt = opt; 146 info.time = toc; 147 info.outerIts = outerIts; 148 info.energy = energy; 149 info.rank = rank(B); 150 info.alpha = alpha; 151 info.beta = beta; 152 info.sigma = sigma; 153 154 if opt.lambda > 0 155 GCO_Delete(hMRF); 156 end 157 158 end 159 160 161 162 %% function to get the adjcent matirx of the graph 163 function W = getAdj(sizeData) 164 numSites = prod(sizeData); 165 id1 = [1:numSites, 1:numSites, 1:numSites]; 166 id2 = [ 1+1:numSites+1,... 167 1+sizeData(1):numSites+sizeData(1),... 168 1+sizeData(1)*sizeData(2):numSites+sizeData(1)*sizeData(2)]; 169 value = ones(1,3*numSites); 170 W = sparse(id1,id2,value); 171 W = W(1:numSites,1:numSites); 172 end 173 174 175 %% function for soft-impute 176 function [Z,Znorm,alpha] = softImpute(X,Z,Omega,alpha0,maxRank) 177 % 178 % This program implements the soft-impute algorithm followed by 179 % postprocessing in the Matrix completion paper Mazumder'10 IJML 180 % min || Z - X ||_Omega + \alpha || Z ||_Nulear 181 % \alpha is decrease from alpha0 to the minima value that makes rank(Z) <= maxRank 182 183 % X is the incomplete matrix 184 % maxRank is the desired rank in the constraint 185 % Omega is the mask with value 1 for data and 0 for missing part 186 if isempty(Z) 187 Z = X; 188 end 189 if isempty(Omega) 190 Omega = true(size(X)); 191 end 192 if isempty(alpha0) 193 [U,D] = svd(X,'econ'); 194 alpha0 = D(2,2); 195 end 196 if isempty(maxRank) 197 maxRank = -1; 198 end 199 % parameters 200 eta = 0.707; 201 epsilon = 1e-4; 202 maxInnerIts = 20; 203 %% trivial 204 % no rank constraint 205 if maxRank >= min(size(X)) 206 Z = X; 207 [U,D] = svd(Z,'econ'); 208 Znorm = sum(diag(D)); 209 alpha = 0; 210 return; 211 end 212 % no observation 213 if sum(Omega(:)) == 0 214 % no data 215 Z = zeros(size(X)); 216 Znorm = 0; 217 alpha = alpha0; 218 return; 219 end 220 %% soft-impute 221 % 1. initialize 222 outIts = 0; 223 alpha = alpha0; 224 % 2. Do for alpha = alpha0 > alpha_1 > alpha_2 > ... > alpha_maxRank 225 disp('begin soft-impute iterations'); 226 while 1 227 outIts = outIts + 1; 228 energy = inf; 229 for innerIts = 1:maxInnerIts 230 % (a)i 231 C = X.*Omega + Z.*(1-Omega); 232 [U,D,V] = svd(C,'econ'); 233 VT = V'; 234 % soft impute 235 d = diag(D); 236 idx = find(d > alpha); 237 Z = U(:,idx) * diag( d(idx) - alpha ) * VT(idx,:); 238 % (a)ii 239 Znorm = sum(d(idx)-alpha); 240 energy_old = energy; 241 energy = alpha*Znorm + norm(Z(Omega(:))-X(Omega(:)),'fro')/2; 242 if abs(energy - energy_old) / energy_old < epsilon 243 break 244 end 245 end 246 % check termination condition of alpha 247 k = length(idx); % rank of Z 248 disp(['alpha = ' num2str(alpha) '; rank = ' num2str(k) '; number of iteration: ' num2str(innerIts)]); 249 if k <= maxRank && alpha > 1e-3 250 alpha = alpha*eta; 251 else 252 break; 253 end 254 end 255 end