Last week we surmised that the ESP32-S3’s new LCD peripheral might be used for general-purpose parallel output (not just LCDs) under DMA control…perhaps even for things like RGB LED matrices and concurrent NeoPixel strands. These have been done with prior ESP32 generations, but the new S3 peripherals work a bit differently and there’s a learning curve.
Last time we worked through the puzzle of driving HUB75 RGB LED matrices. That code has since been merged into the Adafruit_Protomatter Arduino library, for anyone to use or to dissect toward similar goals.
Concurrent NeoPixel strands have been made to work here in the secret Adafruit bunker, but not yet merged into Adafruit_NeoPXL8. It’s a tremendous amount of code, and for the sake of a “learnable” project, let’s work through a minimal example that simply cycles through eight LEDs. The same ideas can scale up to bigger things.
The anode legs of eight LEDs are connected to ESP32-S3 Feather pins SCL, 5, 6, 9, 10, 11, 12 and 13 (consecutive pins along one side of the Feather board). The cathode legs are connected to ground (shown here with a 100Ω resistor for each as it’s considered Proper Form, but I won’t tell anyone if you want to go commando for a quick test):
If you don’t have the parts for this, you can still see a bit of the action on the board’s built-in pin 13 LED, possibly making this the world’s most complex “blink” sketch. Alternately or in addition, a NeoPXL8 FeatherWing could be used (with slight changes to the pin numbers) for easy interfacing to a logic analyzer. That’s something pretty nifty with the ESP32 family vs. most other microcontrollers right now…pin multiplexing is super capable, we’re not tied to specific pins or even a specific order.
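To make the "any signal to any pin, in any order" idea a little more concrete, here's a small host-side sketch in plain C++ (no ESP32 required). The names `PinRoute`, `routes` and `pinsDrivenHigh` are made up for this illustration, and the pin numbers are placeholders rather than a verified Feather pinout. It only models which pins would go high for a given output byte; on real hardware the routing is handled by the chip's GPIO matrix.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical routing entry: one bit of each output byte goes to one GPIO.
struct PinRoute {
  int8_t pin;   // GPIO number (placeholder values below)
  uint8_t bit;  // Which bit (0-7) of the output byte appears on that pin
};

// Pins need not be consecutive, nor in ascending order.
static const PinRoute routes[] = {
  {13, 0}, {5, 1}, {12, 2}, {6, 3}, {11, 4}, {9, 5}, {10, 6}, {4, 7},
};

// Return the GPIO numbers that would be driven high for one output byte,
// listed in the same order as the routing table.
std::vector<int> pinsDrivenHigh(uint8_t value) {
  std::vector<int> high;
  for (size_t i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
    if (value & (1u << routes[i].bit)) high.push_back(routes[i].pin);
  }
  return high;
}
```

With this made-up table, a byte of 0x05 (bits 0 and 2 set) would light whatever is attached to pins 13 and 12. Rearranging the table changes the wiring order without touching the data being sent.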
The code’s thoroughly commented, so I won’t break down every step here. Needless to say, the LCD peripheral is weird and fussy. Oftentimes the Espressif ESP32-S3 Technical Reference Manual will say one thing, but reality goes a slightly different way. To be fair, the documentation is still a preliminary work in progress…but perhaps more importantly, the LCD peripheral probably wasn’t intended to be used this way, it’s just a wacky hack.
/* Simple example of using the ESP32-S3's LCD peripheral for general-purpose
   (non-LCD) parallel data output with DMA. Connect 8 LEDs (or logic analyzer),
   cycles through a pattern among them at about 1 Hz.
   This code is ONLY for the ESP32-S3, NOT the S2, C3 or original ESP32.
   None of this is authoritative canon, just a lot of trial and error
   w/datasheet and register poking. Probably more robust ways of doing this
   still TBD. */

#include <driver/periph_ctrl.h>
#include <esp_private/gdma.h>
#include <esp_rom_gpio.h> // Needed for esp_rom_gpio_connect_out_signal()
#include <hal/dma_types.h>
#include <hal/gpio_hal.h>
#include <soc/lcd_cam_struct.h>

gdma_channel_handle_t dma_chan; // DMA channel
dma_descriptor_t desc;          // DMA descriptor
uint8_t data[8][312];           // Transmit buffer (2496 bytes total)

// End-of-DMA-transfer callback
static IRAM_ATTR bool dma_callback(gdma_channel_handle_t dma_chan,
  gdma_event_data_t *event_data, void *user_data) {
  // This DMA callback seems to trigger a moment before the last data has
  // issued (buffering between DMA & LCD peripheral?), so pause a moment
  // before stopping LCD data out. The ideal delay may depend on the LCD
  // clock rate...this one was determined empirically by monitoring on a
  // logic analyzer. YMMV.
  delayMicroseconds(30);
  // The LCD peripheral stops transmitting at the end of the DMA xfer, but
  // clear the lcd_start flag anyway -- we poll it in loop() to decide when
  // the transfer has finished, and the same flag is set later to trigger
  // the next transfer.
  LCD_CAM.lcd_user.lcd_start = 0;
  return true;
}

void setup() {
  // LCD_CAM peripheral isn't enabled by default -- MUST begin with this:
  periph_module_enable(PERIPH_LCD_CAM_MODULE);
  periph_module_reset(PERIPH_LCD_CAM_MODULE);

  // Reset LCD bus
  LCD_CAM.lcd_user.lcd_reset = 1;
  esp_rom_delay_us(100);

  // Configure LCD clock. Since this program generates human-perceptible
  // output and not data for LED matrices or NeoPixels, use almost the
  // slowest LCD clock rate possible. The S3-mini module used on Feather
  // ESP32-S3 has a 40 MHz crystal. A 2-stage clock division of 1:16000
  // is applied (250*64), yielding 2,500 Hz. Still much too fast for
  // human eyes, so later we set up the data to repeat each output byte
  // many times over.
  LCD_CAM.lcd_clock.clk_en = 1;             // Enable peripheral clock
  LCD_CAM.lcd_clock.lcd_clk_sel = 1;        // XTAL_CLK source
  LCD_CAM.lcd_clock.lcd_ck_out_edge = 0;    // PCLK low in 1st half cycle
  LCD_CAM.lcd_clock.lcd_ck_idle_edge = 0;   // PCLK low idle
  LCD_CAM.lcd_clock.lcd_clk_equ_sysclk = 0; // PCLK = CLK / (CLKCNT_N+1)
  LCD_CAM.lcd_clock.lcd_clkm_div_num = 250; // 1st stage 1:250 divide
  LCD_CAM.lcd_clock.lcd_clkm_div_a = 0;     // 0/1 fractional divide
  LCD_CAM.lcd_clock.lcd_clkm_div_b = 1;
  LCD_CAM.lcd_clock.lcd_clkcnt_n = 63;      // 2nd stage 1:64 divide
  // See section 26.3.3.1 of the ESP32S3 Technical Reference Manual
  // for information on other clock sources and dividers.

  // Configure LCD frame format. This is where we fiddle the peripheral
  // to provide generic 8-bit output rather than actually driving an LCD.
  // There's also a 16-bit mode but that's not shown here.
  LCD_CAM.lcd_ctrl.lcd_rgb_mode_en = 0;    // i8080 mode (not RGB)
  LCD_CAM.lcd_rgb_yuv.lcd_conv_bypass = 0; // Disable RGB/YUV converter
  LCD_CAM.lcd_misc.lcd_next_frame_en = 0;  // Do NOT auto-frame
  LCD_CAM.lcd_data_dout_mode.val = 0;      // No data delays
  LCD_CAM.lcd_user.lcd_always_out_en = 1;  // Enable 'always out' mode
  LCD_CAM.lcd_user.lcd_8bits_order = 0;    // Do not swap bytes
  LCD_CAM.lcd_user.lcd_bit_order = 0;      // Do not reverse bit order
  LCD_CAM.lcd_user.lcd_2byte_en = 0;       // 8-bit data mode
  LCD_CAM.lcd_user.lcd_dummy = 1;          // Dummy phase(s) @ LCD start
  LCD_CAM.lcd_user.lcd_dummy_cyclelen = 0; // 1 dummy phase
  LCD_CAM.lcd_user.lcd_cmd = 0;            // No command at LCD start
  // "Dummy phases" are initial LCD peripheral clock cycles before data
  // begins transmitting when requested. After much testing, determined
  // that at least one dummy phase MUST be enabled for DMA to trigger
  // reliably. A problem with dummy phase(s) is if we're also using the
  // LCD_PCLK_IDX signal (not used in this code, but Adafruit_Protomatter
  // does)...the clock signal will start a couple of pulses before data,
  // which may or may not be problematic in some situations. You can
  // disable the dummy phase but need to keep the LCD TX FIFO primed
  // in that case, which gets complex.
  // always_out_en is set above to allow arbitrary-length transfers,
  // else lcd_dout_cyclelen is used...but is limited to 8K. Long (>4K)
  // transfers need DMA linked lists, not used here but mentioned later.

  // Route 8 LCD data signals to GPIO pins
  const struct {
    int8_t  pin;
    uint8_t signal;
  } mux[] = {
    { SCL, LCD_DATA_OUT0_IDX }, // These are 8 consecutive pins down one
    { 5,   LCD_DATA_OUT1_IDX }, // side of the ESP32-S3 Feather. The ESP32
    { 6,   LCD_DATA_OUT2_IDX }, // has super flexible pin MUX capabilities,
    { 9,   LCD_DATA_OUT3_IDX }, // so any signal can go to any pin!
    { 10,  LCD_DATA_OUT4_IDX },
    { 11,  LCD_DATA_OUT5_IDX },
    { 12,  LCD_DATA_OUT6_IDX },
    { 13,  LCD_DATA_OUT7_IDX },
  };
  for (int i = 0; i < 8; i++) {
    esp_rom_gpio_connect_out_signal(mux[i].pin, mux[i].signal, false, false);
    gpio_hal_iomux_func_sel(GPIO_PIN_MUX_REG[mux[i].pin], PIN_FUNC_GPIO);
    gpio_set_drive_capability((gpio_num_t)mux[i].pin, (gpio_drive_cap_t)3);
  }

  // This program has a known fixed-size data buffer (2496 bytes) that fits
  // in a single DMA descriptor (max 4095 bytes). Large transfers would
  // require a linked list of descriptors, but here it's just one...
  desc.dw0.owner = DMA_DESCRIPTOR_BUFFER_OWNER_DMA;
  desc.dw0.suc_eof = 1; // Last descriptor
  desc.next = NULL;     // No linked list
  // Remaining descriptor elements are initialized before each DMA transfer.

  // Allocate DMA channel and connect it to the LCD peripheral
  gdma_channel_alloc_config_t dma_chan_config = {
    .sibling_chan = NULL,
    .direction = GDMA_CHANNEL_DIRECTION_TX,
    .flags = {
      .reserve_sibling = 0
    }
  };
  gdma_new_channel(&dma_chan_config, &dma_chan);
  gdma_connect(dma_chan, GDMA_MAKE_TRIGGER(GDMA_TRIG_PERIPH_LCD, 0));
  gdma_strategy_config_t strategy_config = {
    .owner_check = false,
    .auto_update_desc = false
  };
  gdma_apply_strategy(dma_chan, &strategy_config);

  // Enable DMA transfer callback
  gdma_tx_event_callbacks_t tx_cbs = {
    .on_trans_eof = dma_callback
  };
  gdma_register_tx_event_callbacks(dma_chan, &tx_cbs, NULL);

  // As mentioned earlier, the slowest clock we can get to the LCD
  // peripheral is 40 MHz / 250 / 64 = 2500 Hz. To make an even slower
  // bit pattern that's perceptible, we just repeat each value many
  // times over. The pattern here just counts through each of 8 bits
  // (each LED lights in sequence)...so to get this to repeat at about
  // 1 Hz, each LED is lit for 2500/8 or 312 cycles, hence the
  // data[8][312] declaration at the start of this code (it's not
  // precisely 1 Hz because reality is messy, but sufficient for demo).
  // In actual use, say controlling an LED matrix or NeoPixels, such
  // shenanigans aren't necessary, as these operate at multiple MHz
  // with much smaller clock dividers and can use 1 byte per datum.
  for (int i = 0; i < (sizeof(data) / sizeof(data[0])); i++) { // 0 to 7
    for (int j = 0; j < sizeof(data[0]); j++) {                // 0 to 311
      data[i][j] = 1 << i;
    }
  }
}

void loop() {
  // This uses a busy loop to wait for each DMA transfer to complete...
  // but the whole point of DMA is that one's code can do other work in
  // the interim. The CPU is totally free while the transfer runs!
  while (LCD_CAM.lcd_user.lcd_start); // Wait for DMA completion callback

  // After much experimentation, each of these steps is required to get
  // a clean start on the next LCD transfer:
  gdma_reset(dma_chan);                 // Reset DMA to known state
  LCD_CAM.lcd_user.lcd_dout = 1;        // Enable data out
  LCD_CAM.lcd_user.lcd_update = 1;      // Update registers
  LCD_CAM.lcd_misc.lcd_afifo_reset = 1; // Reset LCD TX FIFO

  // This program happens to send the same data over and over...but,
  // if desired, one could fill the data buffer with a new bit pattern
  // here, or point to a completely different buffer each time through.
  // With two buffers, one can make best use of time by filling each
  // with new data before the busy loop above, alternating between them.

  // Reset elements of DMA descriptor. Just one in this code, long
  // transfers would loop through a linked list.
  desc.dw0.size = desc.dw0.length = sizeof(data);
  desc.buffer = data;

  gdma_start(dma_chan, (intptr_t)&desc); // Start DMA w/updated descriptor(s)
  delayMicroseconds(1);                  // Must 'bake' a moment before...
  LCD_CAM.lcd_user.lcd_start = 1;        // Trigger LCD DMA transfer
}
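The code comments mention that transfers longer than one descriptor's 4095-byte limit need a linked list of DMA descriptors. Here's a rough host-side sketch of how such a chain might be laid out. `Desc` is a simplified stand-in invented for this example, NOT the real `dma_descriptor_t`, and `descriptorsNeeded()` and `buildChain()` are hypothetical helpers. Real hardware may also impose alignment requirements on each chunk, so treat this as an illustration of the bookkeeping only, not tested driver code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a DMA descriptor (NOT the real dma_descriptor_t).
struct Desc {
  uint32_t size;    // Buffer capacity in bytes
  uint32_t length;  // Number of valid bytes to send
  bool eof;         // True only on the last descriptor in the chain
  uint8_t *buffer;  // Start of this descriptor's slice of the data
  Desc *next;       // Next descriptor, or nullptr at the end of the chain
};

const size_t MAX_PER_DESC = 4095; // Per-descriptor limit noted in the sketch

// Number of descriptors needed to cover 'len' bytes (rounds up).
size_t descriptorsNeeded(size_t len) {
  return (len + MAX_PER_DESC - 1) / MAX_PER_DESC;
}

// Fill 'chain' so its descriptors cover buf[0] through buf[len-1] in order.
// 'capacity' is how many descriptors 'chain' can hold. Returns count used.
size_t buildChain(Desc *chain, size_t capacity, uint8_t *buf, size_t len) {
  size_t n = descriptorsNeeded(len);
  assert(n <= capacity); // Caller must provide enough descriptors
  for (size_t i = 0; i < n; i++) {
    size_t chunk = (len > MAX_PER_DESC) ? MAX_PER_DESC : len;
    chain[i].size = chain[i].length = (uint32_t)chunk;
    chain[i].buffer = buf;
    chain[i].eof = (i == n - 1);                      // Flag last one only
    chain[i].next = (i == n - 1) ? nullptr : &chain[i + 1];
    buf += chunk;
    len -= chunk;
  }
  return n;
}
```

For the 2496-byte buffer in this project, `descriptorsNeeded()` returns 1, which matches the single descriptor used above. A 10,000-byte buffer would need three: two full 4095-byte chunks and a final 1810-byte chunk carrying the end-of-transfer flag.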