fix fluence for mua -> 0 #164

epini · 2023-03-17T16:56:51Z

Dear Prof. Fang,

I was playing with MCXlab at low absorption values, but it seems that below a certain threshold the fluence fails numerically and becomes eventually zero. The threshold can be lowered by turning on the DUSE_DOUBLE and DUSE_MORE_DOUBLE flags, but in principle the correct limit for low absorption is simply obtained by accumulating the appropriate Taylor expansion for the exponential w0 * (1 - exp(-prop.mua * len)) / prop.mua.
To take advantage of this one can introduce an if condition depending on the prop.mua / prop.mus ratio instead of the currently implemented zero value for prop.mua == 0. See the picture below relative to a homogeneous medium.

The dashed line value is calculated as the average pathlength in the medium (which is given by line 1894 of mcx_core.cu). Strangely enough, a small difference is visible between when the if condition kicks in. A Kahan summation may be needed somewhere? Adding the same if condition and Taylor expansion at line 1894 returns instead a smooth convergence to the dashed value.

Perhaps this is not the ideal solution, but I think it could still be preferable compared to the drastic loss of precision at low mua and the incorrect fluence value that is currently set for null absorption.

fangq · 2023-03-17T20:01:02Z

thanks, I am supportive of this patch, but the implementation is not not exactly robust. see my comment to the code.

fangq · 2023-03-17T20:01:46Z

src/mcx_core.cu

                    weight = w0 - p.w;
                } else if (gcfg->outputtype == otFluence || gcfg->outputtype == otFlux) {
-                    weight = (prop.mua == 0.f) ? 0.f : ((w0 - p.w) / (prop.mua));
+                    weight = (prop.mua/prop.mus < 0.001) ? (w0 * len * (1 - prop.mua * len / 2)) : ((w0 - p.w) / (prop.mua));


currently, mcx simulation kernel could accept domains with either mua or mus to be exactly 0. with your new formula

weight = (prop.mua/prop.mus < 0.001) ? (w0 * len * (1 - prop.mua * len / 2)) : ((w0 - p.w) / (prop.mua));

I see it can raise an exception if mus is 0, or both mua/mus is exactly 0.

please consider a formula that can also accommodate the case when mus=0.

several minor comments:

for numerical constants, please add f at the end so that it won't be treated as "double constant" which will trigger double-precision hardware/computation, which is slow on consumer GPUs.

same for integer constants, convert those to 1.f or 2.f could avoid unnecessary type conversions

avoid using division "/" in a CUDA code. it is one of the most expensive operators in the GPU kernel, even more expensive than sin/cos. try to convert to multiplication whenever possible, for example, instead of /2 , change it to *0.5f

epini · 2023-03-22T14:31:49Z

Thank you for your quick and tutorial reply. I've just started learning to code and your suggestions were very useful.

I came up with a slightly different solution that hopefully solves some of the previous shortcomings. Rather using a condition based on prop.mus, we can probably compare prop.mua to len on a step-wise basis. This avoids potential divide-by-zero scenarios, it is compatible with null prop.mua and/or prop.mus, and smooths out the abrupt step at the previous activation threshold.
Below you can see the updated fluence values for a scattering and a clear medium case.

I still cannot explain the origin of this small residual gap for vanishing absorption (which however is only present in the scattering case). Do you have any suggestion about what could be causing it?

fangq · 2023-04-02T16:12:32Z

I still cannot explain the origin of this small residual gap for vanishing absorption (which however is only present in the scattering case). Do you have any suggestion about what could be causing it?

sorry for the delay. I am trying to figure out the math of your patch.

my understanding is that your are trying to calculate the limit lim_{mua->0} [(w-w0)/mua] to weight, but if I replace
w-w0 = exp(-mua*len) and take the limit, isn't the result simply len?

why you had w0 * len * (1 - prop.mua * len / 2)?

if you replace this term by len, will the small residual disappear?

epini · 2023-04-03T08:58:11Z

No, the small residual does not disappear when using either fewer or more terms in the expansion.
The limit for mua -> 0 is indeed len. In fact, the second term does eventually vanish - but the residual does not.
The code would probably work well even just using len as the limit. However, it is desirable that the if condition kicks in as early as prop.mua * len < 0.001f in order to avoid seeing the oscillations due to the finite precision.
At prop.mua * len < 0.001f, however, we may not be sufficiently close to mua -> 0 to get away with just the leading term in the expansion. This is why I also kept the second one.

As I mentioned when I first created the pull request, what is particularly weird to me is that if we use just len, then the result should be identical to the ppath accumulator at line 1894 of mcx_core.cu.
Instead, there's this small residual between the two (identical?) calculations. If, just for the sake of argument, one tries to add more expansion terms at line 1894 (i.e., misusing ppath to estimate the fluence), then in that case the limit does eventually converge to the expected value, without any residual. Yet, this does not happen when using the same code at line 1938.

fangq · 2023-04-04T17:11:33Z

@epini, would you be able to share your test script that I can reproduce the "small residual"? I want to take a deeper look.

epini · 2023-04-12T12:03:08Z

Sure, here is the code. Of course the results depend on which modification have been implemented before compiling the executable as previously specified.

%% Validation of new fluence rate method proposed by F. Martelli
% Biomedical Optics Express Vol. 14, Issue 1, pp. 148-162 (2023)
% https://doi.org/10.1364/BOE.477339

clear
close all

%% material input settings

N=100; % number of layers
N_mua=100; % number of mua values
data=zeros(2,N,N_mua); % initialize matrix for all simulations data

cfg.unitinmm=0.1; % cubic voxel size in mm
sf=cfg.unitinmm; % scaling factor
L=N*sf; % sample thickness in mm
sz=2*sf; % sample dimensions along xy plane
g=0;
mus=0.1; % mm^-1
n=1.4;
v=3e11/n; % speed of light inside the medium (mm/s)
vol_matrix=ones(uint8(sz/sf),uint8(sz/sf),N);

for j=1:N
    vol_matrix(:,:,j)=j; % define a medium per layer
end

cfg.vol=vol_matrix;

index = 1;
for mua=logspace(-15,6,N_mua) % set values for mua

    cfg.prop=[0 0 1 1;
        repmat([mua mus g n], [N,1])];
    cfg.bc='cc_cc_000001'; % cyclic boundary condition to mimic horizontally infinite slabs

    % source input settings
    cfg.nphoton=1e6;
    cfg.srcpos=[sz/(2*sf)+1 sz/(2*sf)+1 0];
    cfg.srcdir=[0 0 1];
    cfg.srctype='pencil';

    % time intervals steps
    cfg.tstart=0; % s
    cfg.tend=1e-7; % s
    cfg.tstep=cfg.tend; % s

    % other settings
    cfg.gpuid=1;
    cfg.autopilot=1;
    cfg.issaveexit=1;
    cfg.debuglevel='p';

    % output settings
    cfg.issaveref=0;
    cfg.issave2pt=1; % if = 0 disables volumetric data output to speed up simulation
    cfg.isreflect=1;
    cfg.outputtype='fluence';

    cfg.detpos=[sz/(2*sf)+0.5 sz/(2*sf)+0.5 N/2+0.5 N*2];
    cfg.maxdetphoton=cfg.nphoton; % over 1e7 runs out of memory on my laptop
    cfg.savedetflag='dpw'; % detected photons recorded parameters

    % detected photon analysis
    fprintf("Simulation started mua = %g mus = %g\n", mua, mus);
    [flux,detps]=mcxlab(cfg);

    % detector
    pos_x = sf*(detps.p(:,1) - sz/(2*sf)+0.5);
    pos_y = sf*(detps.p(:,2)- sz/(2*sf)+0.5);
    pos_z = (detps.p(:,3));

    PBM_fluence = detps.ppath(:,:); % use ppath to retrieve fluence for mua=0 (Martelli's method)
    path_times = sum(PBM_fluence, 2)./v;

    % PBM fluence evaluation
    layer_PBM = zeros(1,L/sf); % Martelli's method (Pathlength Based Method)
    layer_ABM = zeros(1,L/sf); % mcx native method (Absorption Based Method)

    for k=1:N
        layer_PBM(k) = mean(PBM_fluence(:,k));
        AMB_flux = flux.data(:,:,k,:)*(sf^2); % fluence per time bin in k_th layer
        layer_ABM(k) =  sum(AMB_flux, 'all'); % fluence integrated over all time bins in k_th layer
    end

    data(1,:,index)=layer_PBM;
    data(2,:,index)=layer_ABM;

    index = index + 1;
end

%% plot fluence in one layer

layer = 1;

figure(4), hold on
plot(logspace(-15,6,N_mua), squeeze(data(1,layer,:)), 'k--')
plot(logspace(-15,6,N_mua), squeeze(data(2,layer,:)), 'b-')

ylabel('$\displaystyle \langle\Phi\rangle$  [Wmm$^{-2}]$','interpreter','latex', 'Fontsize', 16)
xlabel('$\displaystyle \mu_{\textrm{a}}$ (mm$^{-1}$)','interpreter','latex', 'Fontsize', 16)
legend('zero absorption', 'MCX')
set(gca, 'xscale', 'log')
axis([1e-15 1e6 0 6])
title(strcat('Single precision with if condition, mus=', num2str(mus), ' mm^{-1}, n=1.4'))

fangq · 2023-07-22T00:09:18Z

@epini, sorry for the delay. I finally have a chance to run your test script, and now I see what you meant by a small dependency.

I merged your patch, and also simplified it further with only the 1st order Taylor expansion instead of the 2nd order because they show nearly no difference in my test.

Regarding the dependency, I saw you used the accumulation of the partial paths to obtain the reference solution. if you look at the 1st order Taylor expansion, I think they are essentially the same formula, so I don't see a reason the two ways should yield different result. It is likely due to a book keep issue in the code somewhere.

I used mcxpreview(cfg) after running cfg=rmfield(cfg,'detpos'); to check out your settings, and noticed that your source is located 1 mm below the bottom of the non-zero voxels. By default, mcx assumes the bottom-lower corner of the voxeled domain as [1,1,1]; you will have to set cfg.issrcfrom0=1 to shift it to [0,0,0]. Because of that air gap, I think a number of things could happen.

I've tested two possibilities 1) disabling specular reflection using cfg.isspecular=0, and 2) shifting the origin [0,0,0] to the bottom-lower corner of the voxel domain by setting cfg.issrcfrom0=1, I did not see either of the change could remove the small difference. But I am confident it must be an error somewhere else in the script. I spoke to Fabrizio about his path-based approach during this year's PhotonicsWest, and he agreed with me that mcx's fluence accommodation method is consistent with his approach.

anyhow, I appreciate your patch and excited that this simple patch allows mcx to get meaningful fluence in non-absorbing medium.

patch in #164 breaks https://github.com/fangq/mmc/blob/master/mmclab/example/demo_dualmesh_output.m

#235 was reported by Thomas Rialan in https://groups.google.com/g/mcx-users/c/3zvP5W4OGnM revert #164 to fix high error reported by Jessica Lund in https://groups.google.com/g/mcx-users/c/hr6lqpfTYbg

fangq · 2025-01-22T15:56:39Z

hi @epini, I found the reason why the fluence does not convert to the expected value when mua->0. It was due to an incorrect variable - len is the pathlength the photon has to move in the current step, however, mcx only saves weight when a photon moves out of a voxel to minimize global memory write, we should have used the accommulative pathlength between every saving, which is f.pathpen.

after replacing len by f.pathpen in the above commit, I am able to get exact match using your above test script

patch in #164 breaks https://github.com/fangq/mmc/blob/master/mmclab/example/demo_dualmesh_output.m

fix fluence for mua -> 0

3a4f7ed

fangq reviewed Mar 17, 2023

View reviewed changes

allow mus=0, avoid unnecessary casting of scalars to double

291adf5

fangq merged commit 481c4b5 into fangq:master Jul 21, 2023

fangq added a commit that referenced this pull request Jul 21, 2023

simplify mua->0 approximation, drop high order term, #164

87167cb

fangq added a commit that referenced this pull request Sep 4, 2023

fix regression caused by #164 for mus=0 region

0b98843

patch in #164 breaks https://github.com/fangq/mmc/blob/master/mmclab/example/demo_dualmesh_output.m

fangq added a commit that referenced this pull request Sep 5, 2023

update pmcx after fixing the regression due to #164

53e2681

fangq added a commit that referenced this pull request Jan 22, 2025

[bug] fix incorrect per-voxel pathlength when mua->0, #164

24b445e

fangq added a commit to fangq/mcxcl that referenced this pull request Jan 22, 2025

[bug] fix incorrect per-voxel pathlength when mua->0, fangq/mcx#164

febd932

fangq added a commit that referenced this pull request Feb 14, 2025

fix regression caused by #164 for mus=0 region

ff49281

patch in #164 breaks https://github.com/fangq/mmc/blob/master/mmclab/example/demo_dualmesh_output.m

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix fluence for mua -> 0 #164

fix fluence for mua -> 0 #164

Uh oh!

epini commented Mar 17, 2023

Uh oh!

fangq commented Mar 17, 2023

Uh oh!

fangq Mar 17, 2023

Uh oh!

fangq Mar 17, 2023

Uh oh!

epini commented Mar 22, 2023

Uh oh!

fangq commented Apr 2, 2023 •

edited

Loading

Uh oh!

epini commented Apr 3, 2023

Uh oh!

fangq commented Apr 4, 2023

Uh oh!

epini commented Apr 12, 2023

Uh oh!

fangq commented Jul 22, 2023

Uh oh!

fangq commented Jan 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix fluence for mua -> 0 #164

fix fluence for mua -> 0 #164

Uh oh!

Conversation

epini commented Mar 17, 2023

Uh oh!

fangq commented Mar 17, 2023

Uh oh!

fangq Mar 17, 2023

Choose a reason for hiding this comment

Uh oh!

fangq Mar 17, 2023

Choose a reason for hiding this comment

Uh oh!

epini commented Mar 22, 2023

Uh oh!

fangq commented Apr 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

epini commented Apr 3, 2023

Uh oh!

fangq commented Apr 4, 2023

Uh oh!

epini commented Apr 12, 2023

Uh oh!

fangq commented Jul 22, 2023

Uh oh!

fangq commented Jan 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fangq commented Apr 2, 2023 •

edited

Loading