Skip to content

Conversation

geerlingguy
Copy link
Owner

See ongoing discussion in our Intel Arc GPU thread: geerlingguy/raspberry-pi-pcie-devices#510 (comment)

To test this patch, see the first comment in the issue linked above.

P33M and others added 30 commits January 17, 2025 14:31
A user has reported that a card of this model from late 2021 doesn't
work, so extend the date range and make it match on all card sizes.

Signed-off-by: Jonathan Bell <[email protected]>
Add two quirk properties that control whether or not the controller
issues many more handshakes to FS/HS Async endpoints in a single
(micro)frame. Enabling these can significantly increase throughput for
endpoints that frequently respond with NAKs.

Signed-off-by: Jonathan Bell <[email protected]>
If a device frequently NAKs, it can exhaust the scheduled handshakes in
a frame. It will then not get polled by the controller until the next
frame interval. This is most noticeable on FS devices as the controller
schedules a small set of transactions only once per full-speed frame.

Setting the ENH_PER_NAK_FS/LS bits in the GUCTL1 register increases the
number of transactions that can be scheduled to Async (Control/Bulk)
endpoints in the respective frame time. In the FS case, this only
applies to FS devices directly connected to root ports.

Signed-off-by: Jonathan Bell <[email protected]>
For platforms that have xHCI controllers attached over PCIe, and
non-coherent routes to main memory, a theoretical race exists between
posting new TRBs to a ring, and writing to the doorbell register.

In a contended system, write traffic from the CPU may be stalled before
the memory controller, whereas the CPU to Endpoint route is separate
and not likely to be contended. Similarly, the DMA route from the
endpoint to main memory may be separate and uncontended.

Therefore the xHCI can receive a doorbell write and find a stale view
of a transfer ring. In cases where only a single TRB is ping-ponged at
a time, this can cause the endpoint to not get polled at all.

Adding a readl() before the write forces a round-trip transaction
across PCIe, definitively serialising the CPU along the PCI
producer-consumer ordering rules.

Signed-off-by: Jonathan Bell <[email protected]>
The DHT11 datasheet is pretty cryptic, but it does suggest that after
each integer value (humidity and temperature) there are "decimal"
values. Validate these as integers in the range 0-9 and treat them as
tenths of a unit.

Link: raspberrypi#6220

Signed-off-by: Phil Elwell <[email protected]>
…ctors

The non-desktop property "Indicates the output should be ignored for
purposes of displaying a standard desktop environment or console."

That sounds like it should be true for all writeback and virtual
connectors as you shouldn't render a desktop to them, so set it
by default.

Signed-off-by: Dave Stevenson <[email protected]>
The limit of 32 planes per DRM device is dictated by the use
of planes_mask returning a u32.

Change to a u64 such that 64 planes can be supported by a device.

Signed-off-by: Dave Stevenson <[email protected]>
Some hardware will implement transpose as a rotation operation,
which when combined with X and Y reflect can result in a rotation,
but is a discrete operation in its own right.

Add an option for transpose only.

Signed-off-by: Dave Stevenson <[email protected]>
Some connectors, particularly writeback, can implement flip
or transpose operations as writing back to memory.

Add a connector rotation property to control this.

Signed-off-by: Dave Stevenson <[email protected]>
For devices where transfer lengths are not known upfront, there is a
danger when the destination is wider than the source that partial words
can be lost at the end of a transfer. Ideally the controller would be
able to flush the residue, but it can't - it's not even possible to tell
that there is any.

Instead, allow the client driver to avoid the problem by setting a
smaller width.

Signed-off-by: Phil Elwell <[email protected]>
SPI transfers are of defined length, unlike some UART traffic, so it is
safe to let the DMA controller choose a suitable memory width.

Signed-off-by: Phil Elwell <[email protected]>
In order to avoid losing residue bytes when a receive is terminated
early, set the destination width to single bytes.

Link: raspberrypi#6365

Signed-off-by: Phil Elwell <[email protected]>
The xHC may commence Host Initiated Data Moves for streaming endpoints -
see USB3.2 spec s8.12.1.4.2.4. However, this behaviour is typically
counterproductive as the submission of UAS URBs in {Status, Data,
Command} order and 1 outstanding IO per stream ID means the device never
enters Move Data after a HIMD for Status or Data stages with the same
stream ID. For OUT transfers this is especially inefficient as the host
will start transmitting multiple bulk packets as a burst, all of which
get NAKed by the device - wasting bandwidth.

Also, some buggy UAS adapters don't properly handle the EP flow control
state this creates - e.g. RTL9210.

Set Host Initiated Data Move Disable to always defer stream selection to
the device. xHC implementations may treat this field as "don't care,
forced to 1" anyway - xHCI 1.2 s4.12.1.

Signed-off-by: Jonathan Bell <[email protected]>
The default link frequency of 450MHz has been noted to interfere
with GPS if they are in close proximty.
Add the option for 453 and 456MHz to move the signal slightly out
of the band. (447MHz can not be offered as corruption is then observed
on the 133x992 10bit mode).

Signed-off-by: Dave Stevenson <[email protected]>

fixup imx477 gps
Copy of the imx708 change.

Signed-off-by: Dave Stevenson <[email protected]>
Attempting to start a non-idle channel causes an error message to be
logged, and is inefficient. Test for emptiness of the desc_issued list
before doing so.

Signed-off-by: Phil Elwell <[email protected]>
The Raspberry Pi RP1 includes 2 M3 cores running firmware. This driver
adds a mailbox communication channel to them via a doorbell and some
shared memory.

Signed-off-by: Phil Elwell <[email protected]>
The RP1 firmware runs a simple communications channel over some shared
memory and a mailbox. This driver provides access to that channel.

Signed-off-by: Phil Elwell <[email protected]>
Declare the communications channel to RP1.

Signed-off-by: Phil Elwell <[email protected]>
Provide remote access to the PIO hardware in RP1. There is a single
instance, with 4 state machines.

Signed-off-by: Phil Elwell <[email protected]>
Declare the device that proxies RP1's PIO hardware.

Signed-off-by: Phil Elwell <[email protected]>
Use the PIO hardware on RP1 to implement a PWM interface.

Signed-off-by: Phil Elwell <[email protected]>
Enable building of the pwm-pio-rp1 driver, which is Pi 5-specific.

Signed-off-by: Phil Elwell <[email protected]>
Add an overlay to enable a single-channel PIO-assisted PWM interface on any
header pin.

Signed-off-by: Phil Elwell <[email protected]>
This is a SPI to powerline chipset with host-side Ethernet interface.
Is is usually used in e-mobility environments, e.g. on
Electrical Vehicle Supply Equipment (EVSE) side.

Signed-off-by: Michael Heimpold <[email protected]>
The documentation isn't very clear explaining how to enable SPI CS
active-high and it takes a long time to understand it. Adding a specific
overlay as a simple example on how to invert this signal can help
understand the solution.

Link: https://forums.raspberrypi.com/viewtopic.php?t=378222
Signed-off-by: Iker Pedrosa <[email protected]>
If autonomous speed negotiation is unreliable then brcm_pcie_set_gen()
can be used to override/limit this behaviour. However, setting the limit
after the linkup poll means the link can temporarily enter a bad speed
which may result in link failure. Move the speed setup to just prior to
releasing perst_n.

Fixes: 0693b42 ("PCI: brcmstb: Split post-link up initialization to brcm_pcie_start_link()")

Signed-off-by: Jonathan Bell <[email protected]>
pelwell and others added 24 commits January 17, 2025 14:32
Designing a PCM512X-based soundcard with no external SCLK is a valid
choice supported by the driver. Don't alarm users with messages that
say "No SCLK, using BCLK: -2" - reclassify them as debug information.

Signed-off-by: Phil Elwell <[email protected]>
Controls which only exist when snd_soc_register_card returns can't be
modified before then. Move the setting of volume limits to just before
the end of the probe function.

Link: raspberrypi#6527

Signed-off-by: Phil Elwell <[email protected]>
The codec control Digital Playback Volume is one of the controls deleted
by the allo-piano-dac-plus driver. It is effectively replaced by the
soundcard controls Master Playback Volume and Subwoofer Playback Volume.

Delete the code that sets the volume limit on those codec controls - the
limits on the soundcard volume controls are sufficient.

Signed-off-by: Phil Elwell <[email protected]>
Enable nhpoly1305_neon module to speed up Adiantum disk encryption.
Ensure that rp1_pio_open fails if the device failed to probe.

Link: raspberrypi#6593

Signed-off-by: Phil Elwell <[email protected]>
Simplify the implementation of rp1_firmware_get, requiring its clients
to have a valid 'firmware' property. Also make it return NULL on error.

Link: raspberrypi#6593

Signed-off-by: Phil Elwell <[email protected]>
We want to be able to control the interop surface exposed by Command
Queueing across bcm2712 products to a more restrictive default, with
selectable disable and permissive behaviour.

Changing the bool to a cell lets it relay a tristate value.

Also add the override parameter to CM5 as CM5-lite may interface with
arbitrary eMMC or SD cards.

(Reimplemented on 6.12 - bcm2712 dts now has a downstream/upstream split.)

Signed-off-by: Jonathan Bell <[email protected]>
We have found that many SD cards in the field, even of the same make and
model, have latent bugs in their CQ implementation. Some product lines
have fewer bugs with newer manufacture dates, but this is not a
guarantee that a particular card is at a particular firmware revision
level.

Many of these bugs lead to card hangs or data corruption. Add a quirk to
mark a card as having a tested, working CQ implementation and ignore the
capability if absent.

Signed-off-by: Jonathan Bell <[email protected]>
These cards have a known-good CQ implementation and are based on a
Longsys product. Add the MANFID for Longsys SD, and the particular CID
details for the Pi card.

Signed-off-by: Jonathan Bell <[email protected]>
Implement a tristate-style option for "supports-cqe". If the property is
absent or zero, disable CQ completely. For 1, enable CQ unconditionally
for eMMC cards, and known-good SD cards. For 2, enable for eMMC cards,
and all SD cards that are not known-bad.

The sdhci-brcmstb driver needs to know about the tristate as its probe
sequence would otherwise override a disable in mmc_of_parse().

Signed-off-by: Jonathan Bell <[email protected]>
commit c13094b upstream.

on 32-bit kernels, iomap_write_delalloc_scan() was inadvertently using a
32-bit position due to folio_next_index() returning an unsigned long.
This could lead to an infinite loop when writing to an xfs filesystem.

Signed-off-by: Marco Nelissen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
The DT property arm,cpu-registers-not-fw-configured tells the kernel
that the ARM architectural timer has not been configured by the
firmware. This prevents the use of a vDSO - a faster alternative to a
syscall for some common kernel operations.

However, on Pi 4 the firmware does configure the timer, so this property
is unnecessary. Delete it.

Signed-off-by: Phil Elwell <[email protected]>
Make sure the sdhost driver doesn't use requests bigger than SWIOTLB
can handle.

Copied from [1].

Link: raspberrypi#6589
Signed-off-by: Phil Elwell <[email protected]>
[1] d4dd9bc ("mmc: bcm2835: Take SWIOTLB memory size limitation
into account")
We can avoid calling the v3d_clock_up_put and v3d_clock_up_get
when a job is submitted to a CPU queue. We don't need to change
the V3D core frequency to run a CPU job as it is executed on
the CPU. This way we avoid delaying timestamps CPU jobs by 4.5ms
that is the time that it takes the firmware to increase the V3D
core frequency.

Fixes: fe6a858 ("drm/v3d: Correct clock settng calls to new APIs")
Signed-off-by: Jose Maria Casanova Crespo <[email protected]>
Reviewed-by: Maíra Canal <[email protected]>
Prior to [1], an fb_ops member of 0 was intepreted as a request for a
default value. This saves source code but requires special handling by
the framework, slowing down all accesses for no runtime benefit.

Use the new __FB_DEFAULT_ macros to explicitly select default handlers
in the bcm2708_fb driver. Also remove the pointless wrappers around
cfb_fillrect and cfb_imageblit - call them directly.

Link: https://forums.raspberrypi.com/viewtopic.php?p=2286016#p2286016
Signed-off-by: Phil Elwell <[email protected]>
[1] 8813e86 ("fbdev: Remove default file-I/O implementations")
Now that the upstream driver is overclockable, switch to using it in
preference of the downstream driver (which can be deleted by a followup
commit).

Signed-off-by: Phil Elwell <[email protected]>
The principal differences between the downstream SDHOST driver and the
version accepted upstream driver are that the upstream version loses the
overclock support and DMA configuration via DT, but gains some tidying
up (and maintenance by the upstream devs).

Add the missing features (with the exception of the low-overhead logging)
as a patch to the upstream driver.

Signed-off-by: Phil Elwell <[email protected]>
Commit ceddfd4 ("media: i2c: imx219: Support four-lane operation")
added support for device tree to allow configuration of the sensor to
use 4 lanes with a link frequency of 363MHz, and amended the advertised
pixel rate to 280.8MPix/s.

However it didn't change any of the PLL settings, so actually it would
have been running effectively overclocked in the MIPI block, and with
the frame rate and exposure calculations being wrong.

The pixel rate and link frequency advertised were taken from the "Clock
Setting Example" section of the datasheet. However those are based on an
external clock of 12MHz, and are unachievable with a clock of 24MHz (it
seems PREPLLCLK_VT_DIV and PREPLLCK_OP_DIV can ONLY be set via the
automatic configuration doumented in "9-1-2 EXCK_FREQ setting depend on
INCK frequency).

Dropping all support for the 363MHz link frequency would cause problems
for existing users, so allow it from device tree, but log a warning that
the requested value is not being truly applied.

Fixes: ceddfd4 ("media: i2c: imx219: Support four-lane operation")
Co-developed-by: Peyton Howe <[email protected]>
Signed-off-by: Peyton Howe <[email protected]>
Signed-off-by: Dave Stevenson <[email protected]>
List V4L2_PIX_FMT_YUV422P as supported by the PiSP backend hardware.

Signed-off-by: Naushir Patuck <[email protected]>
These fields should not be set by either the user or the kernel driver
so remove them. Replace them with padding bytes to maintain backward
compatibility with existing userland applications.

Signed-off-by: Naushir Patuck <[email protected]>
Commit e442e5c ("arch:arm:boot:dts:overlays: Added waveshare 13.3inch
panel support") added an extra touch controller for the new panels.
On systems with old panels, it ends up spamming the kernel log as that
touch controller isn't there to respond.

Fixes: e442e5c ("arch:arm:boot:dts:overlays: Added waveshare 13.3inch panel support")
Signed-off-by: Dave Stevenson <[email protected]>
@geerlingguy
Copy link
Owner Author

The sleeps may not be necessary with newer kernel versions (6.15.y) and the patch from yanghaku in place (see geerlingguy/raspberry-pi-pcie-devices#764). I also need to rebase on the rpi-6.15.y branch...

@geerlingguy
Copy link
Owner Author

Superceded by #10

@geerlingguy geerlingguy closed this Oct 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.