Variations in regression "mean" output predictions with variations in test set #800

### Describe the bug

Using the `"mean"` output type on my local macOS machine, I see small deviations in the predictions depending on whether the test samples are predicted in a single call or one at a time. Example script below:


### Steps/Code to Reproduce

```python
import numpy as np
import sklearn.datasets
from tabpfn import TabPFNRegressor

X, y, _ = sklearn.datasets.make_regression(
    n_samples=9, n_features=3, random_state=0, coef=True
)

model = TabPFNRegressor(n_estimators=2, random_state=42, device="cpu")
model.fit(X, y)

# ---- Unbatched prediction: the full test set in a single predict call ----
pred_full = model.predict(X, output_type="mean")

# ---- Single-sample batched prediction: one predict call per test row ----
preds_one = np.concatenate(
    [model.predict(X[i : i + 1], output_type="mean") for i in range(len(X))]
)

print("Full:     ", pred_full)
print("Batched:  ", preds_one)
print("Max |Δ|:  ", np.max(np.abs(pred_full - preds_one)))
print("Per-sample Δ:", pred_full - preds_one)

np.testing.assert_allclose(pred_full, preds_one, atol=1e-5, rtol=1e-5)
```
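To check whether the deviation also depends on the batch size, and not only on full-set versus single-sample prediction, the comparison above can be generalized. This is a minimal sketch: `predict_in_chunks` and `max_deviation_by_chunk_size` are hypothetical helpers, not part of TabPFN, and `predict_fn` is assumed to be any callable that maps a 2-D array to one prediction per row.

```python
import numpy as np


def predict_in_chunks(predict_fn, X, chunk_size):
    """Call predict_fn on consecutive row chunks of X and concatenate the results."""
    X = np.asarray(X)
    parts = [
        predict_fn(X[start : start + chunk_size])
        for start in range(0, len(X), chunk_size)
    ]
    return np.concatenate(parts)


def max_deviation_by_chunk_size(predict_fn, X, chunk_sizes):
    """Map each chunk size to the max |difference| from the single-call prediction."""
    reference = np.asarray(predict_fn(X))
    return {
        size: float(np.max(np.abs(reference - predict_in_chunks(predict_fn, X, size))))
        for size in chunk_sizes
    }
```

With the fitted model from the script, a call such as `max_deviation_by_chunk_size(lambda X_: model.predict(X_, output_type="mean"), X, [1, 2, 3, 9])` would report one maximum deviation per chunk size.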

### Expected Results

Identical predictions, regardless of whether the test samples are passed in a single `predict` call or one at a time.

### Actual Results

```
Full:      [ 198.92455   100.43968    49.008053  172.73198    93.20077  -259.5672
   33.52475   140.33487    62.212387]
Batched:   [ 198.92467   100.43905    49.007904  172.73172    93.20089  -259.56866
   33.524574  140.33533    62.212276]
Max |Δ|:   0.0014648438
Per-sample Δ: [-0.00012207  0.00063324  0.00014877  0.0002594  -0.00012207  0.00146484
  0.00017548 -0.00045776  0.00011063]
```
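To put these numbers in context, the sketch below re-expresses the reported deviations as relative errors and as multiples of the float32 spacing (ULPs) at each value. It assumes the predictions are float32, which the printed precision suggests but this report does not confirm. `deviation_report` is a hypothetical helper, and the arrays are copied from the printed output above, so they may differ from the in-memory values in the last digit.

```python
import numpy as np

# Values copied from the printed output above (assumed to be float32).
pred_full = np.array(
    [198.92455, 100.43968, 49.008053, 172.73198, 93.20077,
     -259.5672, 33.52475, 140.33487, 62.212387],
    dtype=np.float32,
)
preds_one = np.array(
    [198.92467, 100.43905, 49.007904, 172.73172, 93.20089,
     -259.56866, 33.524574, 140.33533, 62.212276],
    dtype=np.float32,
)


def deviation_report(a, b):
    """Return absolute, relative, and ULP-scaled differences between two float32 arrays."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    # Larger magnitude of each pair, used as the reference scale.
    scale = np.maximum(np.abs(a), np.abs(b))
    # Subtract in float64 so the difference itself adds no rounding error.
    abs_diff = np.abs(a.astype(np.float64) - b.astype(np.float64))
    rel_diff = abs_diff / scale.astype(np.float64)
    # np.spacing gives the gap to the next representable float32 at each scale.
    ulps = abs_diff / np.spacing(scale).astype(np.float64)
    return abs_diff, rel_diff, ulps


abs_diff, rel_diff, ulps = deviation_report(pred_full, preds_one)
print("Max absolute deviation:", abs_diff.max())
print("Max relative deviation:", rel_diff.max())
print("Deviation in float32 ULPs:", ulps)
```

Looking at the relative and ULP-scaled figures next to the absolute ones may help when triaging, because the absolute deviation grows with the magnitude of the prediction.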

### Versions

```shell
tabpfn:  6.4.1
torch:   2.10.0
numpy:   2.3.3
sklearn: 1.6.1
scipy:   1.16.2
Python:  3.13.3
```

In my local experiments, torch 2.8.0 is even worse in this regard.
