January 31, 2023

Kyu networking -- Rip off the NetBSD startup for ARM

Pablo Picasso once said “good artists borrow, great artists steal.”

And Steve Jobs quoted him, famously saying in 1996:

"Picasso had a saying -- 'good artists copy; great artists steal' -- and we have always been shameless about stealing great ideas."

Stealing (or sharing, depending on your point of view) is what open source is all about. There are two important things when doing so. One is not to claim other people's work as your own. Retain any authorship comments and just be honest, for crying out loud. (Maybe be thankful too.) The second is to retain any licensing and/or copyright headers. Then you are good to go.

What I want to rip off from NetBSD

To start with, just two files (and whatever essential header files they also pull in).

First, the entire file "arch/arm/arm/cpufunc_asm_armv7.S". This also pulls in five header files, which I also copy. I may be able to prune these down, but if I do, it will be after I get everything working. This is done, and it now compiles as part of Kyu without error.

Next I am looking at "arch/arm/arm/armv6_start.S". It is quick work to get it using the include files that I sorted out for cpufunc above. I get a variety of unsatisfied references which bear close scrutiny:

 in function `generic_savevars':
undefined reference to `kern_vtopdiff'
undefined reference to `uboot_args'
 in function `generic_vstartv7':
undefined reference to `cpu_info_store'
undefined reference to `arm_cpu_topology_set'

undefined reference to `start'
 in function `generic_vprint':
 in function `generic_prints':
 in function `generic_printx':
undefined reference to `uartputc'

The reference to "start" is the easy one. This would be in locore.S on NetBSD and is the label branched to when armv6_start is all done. I can make this whatever I want.

All the references to uartputc are conditional on VERBOSE_INIT_ARM, which is not defined for my build (but easily could be).

kern_vtopdiff is a location that is set by this startup code. The startup code figures out the virtual to physical address offset and posts it here for the kernel to use later. uboot_args is a 4 element array somewhere that the startup code fills with information. cpu_info_store is an array of structures in arm/arm_machdep.c. The startup code is once again placing information into the first of these structures (for core 0) for NetBSD to pick up later.

arm_cpu_topology_set() is a function (the code calls it with a "bl" instruction). It is in arm_cpu_topology.c. The startup code reads the MPIDR register and passes it to this function. The function extracts values and then calls another routine -- cpu_topology_set() -- with the extracted values.
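
To make "extracts values" concrete, here is a minimal sketch of splitting an ARMv7 MPIDR value into its affinity fields. The struct and the helper name mpidr_decode are my own inventions for illustration, not NetBSD's code; only the bit positions come from the ARM architecture definition of the register.

```c
#include <stdint.h>

/* Fields of the ARMv7 MPIDR (Multiprocessor Affinity Register). */
struct mpidr_fields {
    unsigned aff0;  /* bits [7:0]   typically the core number in a cluster */
    unsigned aff1;  /* bits [15:8]  typically the cluster number */
    unsigned aff2;  /* bits [23:16] */
    unsigned mt;    /* bit 24: lowest affinity level is hardware threads */
    unsigned up;    /* bit 30: set on a uniprocessor system */
};

/* Hypothetical helper: split a raw MPIDR value into its fields. */
static struct mpidr_fields mpidr_decode(uint32_t mpidr)
{
    struct mpidr_fields f;

    f.aff0 = mpidr & 0xff;
    f.aff1 = (mpidr >> 8) & 0xff;
    f.aff2 = (mpidr >> 16) & 0xff;
    f.mt   = (mpidr >> 24) & 1;
    f.up   = (mpidr >> 30) & 1;
    return f;
}
```

On a single cluster quad core part I would expect aff0 to run from 0 to 3 across the cores, but that is something to verify on the hardware rather than take from this sketch.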

My thinking is that all of these can be commented out or made to call stubs (although uartputc would be useful and easy enough to set up).
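
A stub file to satisfy the linker might look something like the following. This is only a sketch: the types, the one-element cpu_info_store, the ci_mpidr_seen field and the argument list of arm_cpu_topology_set() are my assumptions, not copied from NetBSD. Keep in mind that the assembly reaches into struct cpu_info through generated offsets such as CI_ARM_CPUID, so a real stand-in has to agree with whatever offsets the build uses.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-in for NetBSD's struct cpu_info. */
struct cpu_info {
    uint32_t ci_arm_cpuid;   /* the startup code stores the MIDR here */
    uint32_t ci_mpidr_seen;  /* hypothetical: remembers what the stub was given */
};

/* Locations the startup code writes to. */
uintptr_t kern_vtopdiff;             /* virtual to physical offset */
uintptr_t uboot_args[4];             /* 4 element array filled in at startup */
struct cpu_info cpu_info_store[1];   /* core 0 only */

/* Stub: just record the MPIDR instead of building a topology. */
void arm_cpu_topology_set(struct cpu_info *ci, uint32_t mpidr)
{
    ci->ci_mpidr_seen = mpidr;
}

/* Stub: swallow early console output until a real UART hook exists. */
void uartputc(int c)
{
    (void)c;
}
```

The "start" label is not here because it is an assembly label, not something C can provide.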

What is KASAN?

This is an interesting NetBSD facility called the "kernel address sanitizer". It is certainly worth looking at (they say it catches bugs at run time), but it isn't something I need to dig into right now.

What about locore.S?

When the code in armv6_start.S finishes, it branches to "start" in locore.S. Let's take a close look at that to see what we don't want to leave out.

I will put my comments up front and let the listing below end this page.
First of all note that "uartputc" is defined here and bounces the call to _platform_early_putchar, which is somewhat like what I would do.
Also note the block of data at .Lstart. They use a load multiple (ldmia) to cleverly load this into several registers at once. The first two items are the start and end addresses for the clearing of BSS. The r8 register gets a pointer to "cpu_info_store", which we have seen before, and the last item goes into sp. The CI_ARM_CPUID field gets the value of the Main ID Register (MIDR), the first of these two ID registers:

MRC p15, 0, <Rt>, c0, c0, 0    ; Read Main ID Register
MRC p15, 0, <Rt>, c0, c0, 5    ; Read Multiprocessor Affinity Register
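
As a rough illustration of what ends up in CI_ARM_CPUID, here is a sketch that pulls the fields out of a MIDR value. The helper name midr_decode and the struct are hypothetical. The sample value 0x410FC075 in the note after the code is an assumption on my part (what I would expect from an ARM Cortex-A7 at revision r0p5), not something read from the hardware.

```c
#include <stdint.h>

/* Fields of the ARMv7 MIDR (Main ID Register). */
struct midr_fields {
    unsigned implementer;   /* bits [31:24], 0x41 ('A') is ARM Ltd */
    unsigned variant;       /* bits [23:20], major revision */
    unsigned architecture;  /* bits [19:16], 0xF means "see the CPUID registers" */
    unsigned part;          /* bits [15:4],  0xC07 is the Cortex-A7 */
    unsigned revision;      /* bits [3:0],   minor revision */
};

/* Hypothetical helper: split a raw MIDR value into its fields. */
static struct midr_fields midr_decode(uint32_t midr)
{
    struct midr_fields f;

    f.implementer  = (midr >> 24) & 0xff;
    f.variant      = (midr >> 20) & 0xf;
    f.architecture = (midr >> 16) & 0xf;
    f.part         = (midr >> 4) & 0xfff;
    f.revision     = midr & 0xf;
    return f;
}
```

Feeding it 0x410FC075 gives implementer 0x41, variant 0, architecture 0xF, part 0xC07 and revision 5.
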

The MIDR identifies the implementer, variant, architecture, part number, and revision of the core (the values I care about are those for the Cortex-A7 MPCore).

The macro _ARM_ARCH_DWORD_OK will be defined on the v7; this is handled in nb_cdefs.h by way of a bit of obtuse logic. So we can zero the BSS quickly, 8 bytes at a time, using a pair of registers.
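
For anyone who reads C more easily than ARM assembly, the .L1 loop in the listing amounts to something like this sketch. The function name zero_bss is mine; in the real startup the two bounds are the _edata and _end words from .Lstart, and each 64-bit store here stands in for one strd of the r4/r5 pair.

```c
#include <stdint.h>

/*
 * C rendering of the .L1 loop: store 8 bytes of zero, advance, and
 * repeat while the pointer is still below the end address.  Like the
 * assembly, it stores first and compares afterwards, so it always
 * writes at least once.
 */
static void zero_bss(uint64_t *p, const uint64_t *end)
{
    do {
        *p++ = 0;
    } while (p < end);
}
```

One thing this makes visible: if the region is not a multiple of 8 bytes long, the last store runs a little past the end, so the start and end had better be suitably aligned.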

I have no idea what the TPIDRPRW macros are about. A search on this reveals that this is an ARM register described as "PL1 only Thread ID Register". The description indicates that this is a scratch register visible only in PL1 that the OS can use to keep track of which thread is running.

MRC p15, 0, <Rt>, c13, c0, 4    ; Read TPIDRPRW into Rt
MCR p15, 0, <Rt>, c13, c0, 4    ; Write Rt to TPIDRPRW

Neither TPIDRPRW macro is defined in my build, and although they are also referenced in armv6_start.S, that happens only if MULTIPROCESSOR is defined (and it is not). All this is somewhat mysterious, but ultimately the information goes into cpu_info_store[] and is not something I am concerned with.
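
If I ever did want to use this register from C, a pair of accessors along these lines would probably do. This is a sketch under assumptions: the function names are hypothetical, the inline assembly is GCC/Clang syntax, and the non-ARM branch is only there so the code can be compiled and exercised on a host machine.

```c
#include <stdint.h>

#if !defined(__arm__)
/* Host-side stand-in so the sketch can run off-target. */
static uintptr_t fake_tpidrprw;
#endif

/* Write a value (say, a pointer to the current cpu_info) to TPIDRPRW. */
static inline void tpidrprw_write(uintptr_t v)
{
#if defined(__arm__)
    __asm__ volatile("mcr p15, 0, %0, c13, c0, 4" : : "r"(v));
#else
    fake_tpidrprw = v;
#endif
}

/* Read back whatever was last stored in TPIDRPRW. */
static inline uintptr_t tpidrprw_read(void)
{
#if defined(__arm__)
    uintptr_t v;
    __asm__ volatile("mrc p15, 0, %0, c13, c0, 4" : "=r"(v));
    return v;
#else
    return fake_tpidrprw;
#endif
}
```

On real hardware these only work in a privileged mode (PL1), which is the whole point of the register.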

Notice how they clear fp so backtraces will terminate nicely.
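
Why a zero fp ends a backtrace is easier to see with a sketch of the walker. The frame built by "push {fp, ip, lr, pc}" followed by "sub fp, ip, #4" leaves fp pointing at the saved pc, with the saved lr, ip and caller's fp in the three words below it. The function below is my own illustration of walking such frames, not code from NetBSD or Kyu.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count frames in an APCS-style chain.  fp points at the saved-pc slot:
 *   fp[0]  saved pc      fp[-1] saved lr
 *   fp[-2] saved ip/sp   fp[-3] caller's fp
 * A caller's fp of zero marks the outermost frame.  The limit guards
 * against a corrupt chain that never reaches zero.
 */
static size_t backtrace_depth(const uintptr_t *fp, size_t limit)
{
    size_t depth = 0;

    while (fp != NULL && depth < limit) {
        depth++;
        fp = (const uintptr_t *)fp[-3];
    }
    return depth;
}
```

A real backtrace would print fp[-1] (the return address) at each step instead of just counting.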

Also notice the call to initarm(). This is a big deal, and there is a multitude of implementations, one for each board platform. My build uses the one in arch/evbarm/fdt/fdt_machdep.c -- if you want to avoid the project of learning about the fdt, then look at any of the other 30 or so for specific platforms. They do a lot of hardware initialization (which is of course board dependent). This routine returns the value to be used for the sp (in r0, of course), and the code in "start" copies that return value into the sp.
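
For Kyu, a first cut at initarm() could be nearly empty. The sketch below assumes the routine takes a single pointer argument and returns an address-sized value; the stack array, its size and its name are all hypothetical. ARM stacks grow downward, so what gets returned is the address just past the top of the array.

```c
#include <stdint.h>

#define KYU_STACK_SIZE (16 * 1024)   /* hypothetical size */

/* Hypothetical boot stack, 8-byte aligned as the ARM ABI expects. */
static _Alignas(8) uint8_t kyu_boot_stack[KYU_STACK_SIZE];

/* Minimal stand-in for initarm(): no hardware setup, just pick a stack. */
uintptr_t initarm(void *arg)
{
    (void)arg;

    /* Board specific initialization would go here. */

    return (uintptr_t)(kyu_boot_stack + KYU_STACK_SIZE);
}
```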

ASENTRY_NP(start)
        mrs     r1, cpsr                /* fetch CPSR value */
        msr     spsr_sx, r1             /* set SPSR[23:8] to known value */

        /*
         * Get bss bounds (r1, r2), curlwp or curcpu (r8), and set initial
         * stack.
         */
        adr     r1, .Lstart
        ldmia   r1, {r1, r2, r8, sp}

#if defined(TPIDRPRW_IS_CURCPU) || defined(TPIDRPRW_IS_CURLWP)
        mcr     p15, 0, r8, c13, c0, 4
#endif
#if defined(TPIDRPRW_IS_CURLWP)
        ldr     r8, [r8, #L_CPU]        /* r8 needs curcpu in it */
#endif

        mov     r4, #0
#ifdef _ARM_ARCH_DWORD_OK
        mov     r5, #0
#endif
.L1:
#ifdef _ARM_ARCH_DWORD_OK
        strd    r4, r5, [r1], #0x0008   /* Zero the bss */
#else
        str     r4, [r1], #0x0004       /* Zero the bss */
#endif
        cmp     r1, r2
        blt     .L1

        mrc     p15, 0, r3, c0, c0, 0   /* get our cpuid and save it early */
        str     r3, [r8, #CI_ARM_CPUID]

        mov     fp, #0x00000000         /* trace back starts here */
        bl      _C_LABEL(initarm)       /* Off we go */

        /* initarm will return the new stack pointer. */
        mov     sp, r0

        mov     fp, #0x00000000         /* trace back starts here */
        mov     ip, sp
        push    {fp, ip, lr, pc}
        sub     fp, ip, #4

        bl      _C_LABEL(main)          /* call main()! */

        adr     r0, .Lmainreturned
        b       _C_LABEL(panic)

ENTRY_NP(uartputc)
#ifdef EARLYCONS
        b       ___CONCAT(EARLYCONS, _platform_early_putchar)
#endif
        RET
ASEND(uartputc)

.Lstart:
        .word   _edata
        .word   _end
#if defined(TPIDRPRW_IS_CURLWP)
        .word   _C_LABEL(lwp0)
#else
        .word   _C_LABEL(cpu_info_store)
#endif
#if !defined(__HAVE_GENERIC_START)
        .word   svcstk_end
#else
        .word   start_stacks_top
#endif

.Lmainreturned:
        .asciz  "main() returned"
        .align  0
ASEND(start)

