% Compute KL distace D(p||q) = sum_x(p(x)*log2(p(x)/q(x)) % Input: % ctable is a contingency table % a g % 500 4 6 % 501 5 4 % 502 4 1 with missing % % contingency table % a g % 500 4 6 10 % 501 5 4 9 % 502 4 1 5 % 13 11 24 % % P : % a g % 500 4/24 6/24 % 501 5/24 4/24 % 502 4/24 1/24 total of ctable = 24 % % Q: % a g % 500 13.10 11.10 % 501 13.9 11.9 % 502 13.5 11.5 then / by 24*24 % % Output: % Kullback-Leibler divergence % log base 2 used % % Date: 5/5/2006 % Compute KL distace % % KL(p,q) = Sum Pi log2(pi/qi) % Type 2: pi = xi/mean, qi = 1/N % Date: 5/5/2006 function kltable = my_ctablekl(ctables) sm = length(ctables); %number of snps in raw data kltable = zeros(sm, 1); for k=1:sm ctable = ctables{k}; [m n] = size(ctable); p = ctable; q = ctable; % normalize each cell by the total of ctable total = sum(sum(p)); for i=1:m for j=1:n p(i,j) = p(i,j) / total; end end rowsum = sum(q'); % length(upatn) colsum = sum(q); % 2 for i=1:m for j=1:n q(i,j) = (rowsum(i) * colsum(j)) / (total * total); end end for i=1:m for j=1:n if p(i,j) ~= 0 kltable(k) = kltable(k) + p(i,j) * log2(p(i,j) / q(i,j)); end end end end