Skip to content

Commit dc71fb6

Browse files
djbwsmb49
authored andcommitted
cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
BugLink: https://bugs.launchpad.net/bugs/2099996 commit 101c268 upstream. In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: <TASK> cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/[email protected] [1] Cc: [email protected] Fixes: 176baef ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Alison Schofield <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Zijun Hu <[email protected]> Signed-off-by: Dan Williams <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> CVE-2024-50226 Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
1 parent a76027c commit dc71fb6

File tree

6 files changed

+100
-53
lines changed

6 files changed

+100
-53
lines changed

drivers/base/core.c

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4019,6 +4019,41 @@ int device_for_each_child_reverse(struct device *parent, void *data,
40194019
}
40204020
EXPORT_SYMBOL_GPL(device_for_each_child_reverse);
40214021

4022+
/**
4023+
* device_for_each_child_reverse_from - device child iterator in reversed order.
4024+
* @parent: parent struct device.
4025+
* @from: optional starting point in child list
4026+
* @fn: function to be called for each device.
4027+
* @data: data for the callback.
4028+
*
4029+
* Iterate over @parent's child devices, starting at @from, and call @fn
4030+
* for each, passing it @data. This helper is identical to
4031+
* device_for_each_child_reverse() when @from is NULL.
4032+
*
4033+
* @fn is checked each iteration. If it returns anything other than 0,
4034+
* iteration stop and that value is returned to the caller of
4035+
* device_for_each_child_reverse_from();
4036+
*/
4037+
int device_for_each_child_reverse_from(struct device *parent,
4038+
struct device *from, const void *data,
4039+
int (*fn)(struct device *, const void *))
4040+
{
4041+
struct klist_iter i;
4042+
struct device *child;
4043+
int error = 0;
4044+
4045+
if (!parent->p)
4046+
return 0;
4047+
4048+
klist_iter_init_node(&parent->p->klist_children, &i,
4049+
(from ? &from->p->knode_parent : NULL));
4050+
while ((child = prev_device(&i)) && !error)
4051+
error = fn(child, data);
4052+
klist_iter_exit(&i);
4053+
return error;
4054+
}
4055+
EXPORT_SYMBOL_GPL(device_for_each_child_reverse_from);
4056+
40224057
/**
40234058
* device_find_child - device iterator for locating a particular device.
40244059
* @parent: parent struct device

drivers/cxl/core/hdm.c

Lines changed: 42 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -713,7 +713,44 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld)
713713
return 0;
714714
}
715715

716-
static int cxl_decoder_reset(struct cxl_decoder *cxld)
716+
static int commit_reap(struct device *dev, const void *data)
717+
{
718+
struct cxl_port *port = to_cxl_port(dev->parent);
719+
struct cxl_decoder *cxld;
720+
721+
if (!is_switch_decoder(dev) && !is_endpoint_decoder(dev))
722+
return 0;
723+
724+
cxld = to_cxl_decoder(dev);
725+
if (port->commit_end == cxld->id &&
726+
((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
727+
port->commit_end--;
728+
dev_dbg(&port->dev, "reap: %s commit_end: %d\n",
729+
dev_name(&cxld->dev), port->commit_end);
730+
}
731+
732+
return 0;
733+
}
734+
735+
void cxl_port_commit_reap(struct cxl_decoder *cxld)
736+
{
737+
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
738+
739+
lockdep_assert_held_write(&cxl_region_rwsem);
740+
741+
/*
742+
* Once the highest committed decoder is disabled, free any other
743+
* decoders that were pinned allocated by out-of-order release.
744+
*/
745+
port->commit_end--;
746+
dev_dbg(&port->dev, "reap: %s commit_end: %d\n", dev_name(&cxld->dev),
747+
port->commit_end);
748+
device_for_each_child_reverse_from(&port->dev, &cxld->dev, NULL,
749+
commit_reap);
750+
}
751+
EXPORT_SYMBOL_NS_GPL(cxl_port_commit_reap, CXL);
752+
753+
static void cxl_decoder_reset(struct cxl_decoder *cxld)
717754
{
718755
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
719756
struct cxl_hdm *cxlhdm = dev_get_drvdata(&port->dev);
@@ -722,14 +759,14 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
722759
u32 ctrl;
723760

724761
if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)
725-
return 0;
762+
return;
726763

727-
if (port->commit_end != id) {
764+
if (port->commit_end == id)
765+
cxl_port_commit_reap(cxld);
766+
else
728767
dev_dbg(&port->dev,
729768
"%s: out of order reset, expected decoder%d.%d\n",
730769
dev_name(&cxld->dev), port->id, port->commit_end);
731-
return -EBUSY;
732-
}
733770

734771
down_read(&cxl_dpa_rwsem);
735772
ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
@@ -742,7 +779,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
742779
writel(0, hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
743780
up_read(&cxl_dpa_rwsem);
744781

745-
port->commit_end--;
746782
cxld->flags &= ~CXL_DECODER_F_ENABLE;
747783

748784
/* Userspace is now responsible for reconfiguring this decoder */
@@ -752,8 +788,6 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
752788
cxled = to_cxl_endpoint_decoder(&cxld->dev);
753789
cxled->state = CXL_DECODER_STATE_MANUAL;
754790
}
755-
756-
return 0;
757791
}
758792

759793
static int cxl_setup_hdm_decoder_from_dvsec(

drivers/cxl/core/region.c

Lines changed: 13 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -128,8 +128,8 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
128128
"Bypassing cpu_cache_invalidate_memregion() for testing!\n");
129129
return 0;
130130
} else {
131-
dev_err(&cxlr->dev,
132-
"Failed to synchronize CPU cache state\n");
131+
dev_WARN(&cxlr->dev,
132+
"Failed to synchronize CPU cache state\n");
133133
return -ENXIO;
134134
}
135135
}
@@ -138,19 +138,17 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
138138
return 0;
139139
}
140140

141-
static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
141+
static void cxl_region_decode_reset(struct cxl_region *cxlr, int count)
142142
{
143143
struct cxl_region_params *p = &cxlr->params;
144-
int i, rc = 0;
144+
int i;
145145

146146
/*
147-
* Before region teardown attempt to flush, and if the flush
148-
* fails cancel the region teardown for data consistency
149-
* concerns
147+
* Before region teardown attempt to flush, evict any data cached for
148+
* this region, or scream loudly about missing arch / platform support
149+
* for CXL teardown.
150150
*/
151-
rc = cxl_region_invalidate_memregion(cxlr);
152-
if (rc)
153-
return rc;
151+
cxl_region_invalidate_memregion(cxlr);
154152

155153
for (i = count - 1; i >= 0; i--) {
156154
struct cxl_endpoint_decoder *cxled = p->targets[i];
@@ -173,23 +171,17 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
173171
cxl_rr = cxl_rr_load(iter, cxlr);
174172
cxld = cxl_rr->decoder;
175173
if (cxld->reset)
176-
rc = cxld->reset(cxld);
177-
if (rc)
178-
return rc;
174+
cxld->reset(cxld);
179175
set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
180176
}
181177

182178
endpoint_reset:
183-
rc = cxled->cxld.reset(&cxled->cxld);
184-
if (rc)
185-
return rc;
179+
cxled->cxld.reset(&cxled->cxld);
186180
set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
187181
}
188182

189183
/* all decoders associated with this region have been torn down */
190184
clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
191-
192-
return 0;
193185
}
194186

195187
static int commit_decoder(struct cxl_decoder *cxld)
@@ -305,16 +297,8 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
305297
* still pending.
306298
*/
307299
if (p->state == CXL_CONFIG_RESET_PENDING) {
308-
rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
309-
/*
310-
* Revert to committed since there may still be active
311-
* decoders associated with this region, or move forward
312-
* to active to mark the reset successful
313-
*/
314-
if (rc)
315-
p->state = CXL_CONFIG_COMMIT;
316-
else
317-
p->state = CXL_CONFIG_ACTIVE;
300+
cxl_region_decode_reset(cxlr, p->interleave_ways);
301+
p->state = CXL_CONFIG_ACTIVE;
318302
}
319303
}
320304

@@ -1946,13 +1930,7 @@ static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
19461930
get_device(&cxlr->dev);
19471931

19481932
if (p->state > CXL_CONFIG_ACTIVE) {
1949-
/*
1950-
* TODO: tear down all impacted regions if a device is
1951-
* removed out of order
1952-
*/
1953-
rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
1954-
if (rc)
1955-
goto out;
1933+
cxl_region_decode_reset(cxlr, p->interleave_ways);
19561934
p->state = CXL_CONFIG_ACTIVE;
19571935
}
19581936

drivers/cxl/cxl.h

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -356,7 +356,7 @@ struct cxl_decoder {
356356
struct cxl_region *region;
357357
unsigned long flags;
358358
int (*commit)(struct cxl_decoder *cxld);
359-
int (*reset)(struct cxl_decoder *cxld);
359+
void (*reset)(struct cxl_decoder *cxld);
360360
};
361361

362362
/*
@@ -727,6 +727,7 @@ static inline bool is_cxl_root(struct cxl_port *port)
727727
int cxl_num_decoders_committed(struct cxl_port *port);
728728
bool is_cxl_port(const struct device *dev);
729729
struct cxl_port *to_cxl_port(const struct device *dev);
730+
void cxl_port_commit_reap(struct cxl_decoder *cxld);
730731
struct pci_bus;
731732
int devm_cxl_register_pci_bus(struct device *host, struct device *uport_dev,
732733
struct pci_bus *bus);

include/linux/device.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1061,6 +1061,9 @@ int device_for_each_child(struct device *dev, void *data,
10611061
int (*fn)(struct device *dev, void *data));
10621062
int device_for_each_child_reverse(struct device *dev, void *data,
10631063
int (*fn)(struct device *dev, void *data));
1064+
int device_for_each_child_reverse_from(struct device *parent,
1065+
struct device *from, const void *data,
1066+
int (*fn)(struct device *, const void *));
10641067
struct device *device_find_child(struct device *dev, void *data,
10651068
int (*match)(struct device *dev, void *data));
10661069
struct device *device_find_child_by_name(struct device *parent,

tools/testing/cxl/test/cxl.c

Lines changed: 5 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -693,26 +693,22 @@ static int mock_decoder_commit(struct cxl_decoder *cxld)
693693
return 0;
694694
}
695695

696-
static int mock_decoder_reset(struct cxl_decoder *cxld)
696+
static void mock_decoder_reset(struct cxl_decoder *cxld)
697697
{
698698
struct cxl_port *port = to_cxl_port(cxld->dev.parent);
699699
int id = cxld->id;
700700

701701
if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)
702-
return 0;
702+
return;
703703

704704
dev_dbg(&port->dev, "%s reset\n", dev_name(&cxld->dev));
705-
if (port->commit_end != id) {
705+
if (port->commit_end == id)
706+
cxl_port_commit_reap(cxld);
707+
else
706708
dev_dbg(&port->dev,
707709
"%s: out of order reset, expected decoder%d.%d\n",
708710
dev_name(&cxld->dev), port->id, port->commit_end);
709-
return -EBUSY;
710-
}
711-
712-
port->commit_end--;
713711
cxld->flags &= ~CXL_DECODER_F_ENABLE;
714-
715-
return 0;
716712
}
717713

718714
static void default_mock_decoder(struct cxl_decoder *cxld)

0 commit comments

Comments
 (0)