
48-pixel Image Routine

Overview


This is a quick tutorial on how to draw a 48-pixel image on the screen, often used for a 6-digit score routine, to display a logo on the bottom of the screen, or for a title screen display. This tutorial assumes that you already know how to position and display player sprites on the screen.

How This Trick Works


We can generate a 48-pixel image by exploiting the fact that we can set the player graphics to display multiple copies. If we set each player to display 3 close copies, and position P1 8 pixels to the right of P0, then we have an unbroken 48-pixel image that alternates between the P0 and P1 graphics.

To turn this into a single 48-pixel image (or, e.g., 6 separate score digits), we swap out the values of the GRP0 and GRP1 registers while the line is being drawn, changing what each copy of a player graphic displays. For example, after the first copy of the P0 graphic is drawn, GRP0 is updated with new graphics before the second copy is displayed, and so on.

The timing needs to be very precise for this trick to work correctly, and some tricks are used to be able to update the graphics quickly enough to display the whole image correctly.

Creating Your Own 48-pixel Image


To generate a 48-pixel image, first set the NUSIZ0 and NUSIZ1 registers to a value that displays 3 close copies of each player (e.g. $03). Next, you will need to position the player graphics at the desired position on the screen, with P1 set 8 pixels to the right of P0. For this example, we will center the image, positioning P0 at pixel 55 of the visible screen, and P1 at pixel 63.
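As a rough sketch of that setup (not taken from the sample source), the code below sets both NUSIZ registers and places P1 8 pixels to the right of P0. It assumes the standard vcs.h register names and the SLEEP macro from macro.h. The SLEEP value is a placeholder: it controls the absolute position and needs to be tuned, together with HMP0, to land P0 exactly at pixel 55. It also assumes that both RESP strobes happen during the visible part of the scanline.

```asm
; Setup sketch (assumes vcs.h and macro.h; SLEEP value is a placeholder)
    lda #$03            ; three copies, close spacing
    sta NUSIZ0
    sta NUSIZ1

    sta WSYNC
    SLEEP 34            ; placeholder coarse delay; tune for your position
    sta RESP0           ; coarse position for P0
    sta RESP1           ; 3 cycles (9 pixels) later, so P1 is 9 pixels right
    sta HMCLR           ; clear all fine-motion registers (HMP0 = 0)
    lda #$10            ; HMP1: move P1 one pixel left
    sta HMP1
    sta WSYNC
    sta HMOVE           ; apply fine motion; P1 is now 8 pixels right of P0
```

Strobing RESP1 directly after RESP0 always leaves the players 9 pixels apart, which is why a one-pixel HMOVE correction on P1 is used here to close the gap to 8.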

Image Data


We will need the image data for the 48-pixel image. It will need to be in 8-pixel wide chunks, as the image will be built one graphics register at a time. I personally use the "PlayerPal" online Atari 2600 sprite editor, using 6 image frames for my image data, and have the tool generate data for me.

PlayerPal Editor

In this example, we will use graphics that display the text "HELLO!" in large letters on the screen. Below is the data in 8-pixel chunks, from left to right:

; "HELLO!" graphic data
    if >. != >[.+hello_length]
        align 256
    endif
hello0
    .byte %01000010
    .byte %11100111
    .byte %11100111
    .byte %11100111
    .byte %11100111
    .byte %11111111
    .byte %11111111
    .byte %11100111
    .byte %11100111
    .byte %11100111
    .byte %11100111
    .byte %01000010
hello_length = * - hello0

    if >. != >[.+hello_length]
        align 256
    endif
hello1
    .byte %00011111
    .byte %00111111
    .byte %00111111
    .byte %00110000
    .byte %00110000
    .byte %00111100
    .byte %00111100
    .byte %00110000
    .byte %00110000
    .byte %00111111
    .byte %00111111
    .byte %00011111

    if >. != >[.+hello_length]
        align 256
    endif
hello2
    .byte %00000000
    .byte %10001111
    .byte %00011111
    .byte %00011111
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %10011100
    .byte %00001000
    
    if >. != >[.+hello_length]
        align 256
    endif
hello3
    .byte %00000000
    .byte %00001111
    .byte %10011111
    .byte %00011111
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00011100
    .byte %00001000
    
    if >. != >[.+hello_length]
        align 256
    endif
hello4
    .byte %00001111
    .byte %00011111
    .byte %10011111
    .byte %00011101
    .byte %00011000
    .byte %00011000
    .byte %00011000
    .byte %00011000
    .byte %00011101
    .byte %00011111
    .byte %00011111
    .byte %00001111
    
    if >. != >[.+hello_length]
        align 256
    endif
hello5
    .byte %10000110
    .byte %11000110
    .byte %11000000
    .byte %11000110
    .byte %11001111
    .byte %11001111
    .byte %11001111
    .byte %11001111
    .byte %11001111
    .byte %11001111
    .byte %11001111
    .byte %10000110


Note that we generate a hello_length label based on the number of bytes in the first chunk to serve as a counter for our image routine, and for checking if we cross any page boundaries. We don't want any of these data tables to cross any page boundaries, since that will affect the timing of our data reads, and our image will likely display incorrectly. For that reason, each of the data tables has a DASM assembler directive similar to this one that forces the data to be aligned by inserting filler bytes if the table crosses a page boundary:

; Check for crossing page boundary
    if >. != >[.+hello_length]
        align 256
    endif


Calculating the Kernel Timing


In order to correctly produce our 48-pixel image, each write to a graphics register must land in a narrow window of time determined by the position of our image. The first two graphics registers can be filled at the beginning of the scanline before the image is drawn, but the other 4 need to be written after the previous copy has displayed on the screen, and before the next copy begins to display. The diagram below shows the 6 image segments that make up the 48-pixel image (G0 - G5), and the player graphic that makes up each segment (P0 or P1).

[Diagram: the six 8-pixel image segments G0 - G5, alternating between P0 and P1]

G0P0: May be written anytime before G0P0 begins to display on the screen.
G1P1: May be written anytime before G1P1 begins to display on the screen.
G2P0: May be written after G0P0 finishes displaying, but before G2P0 begins to display.
G3P1: May be written after G1P1 finishes displaying, but before G3P1 begins to display.
G4P0: May be written after G2P0 finishes displaying, but before G4P0 begins to display.
G5P1: May be written after G3P1 finishes displaying, but before G5P1 begins to display.

Since we have already decided on the positioning of our image above, we can calculate the exact cycle range for these writes. P0 is positioned at 55, and P1 is positioned at 63. Therefore:

G0P0: Begins displaying at 55, ends displaying at 63
G1P1: Begins displaying at 63, ends displaying at 71
G2P0: Begins displaying at 71, ends displaying at 79
G3P1: Begins displaying at 79, ends displaying at 87
G4P0: Begins displaying at 87, ends displaying at 95
G5P1: Begins displaying at 95, ends displaying at 103

From this information, we can figure out the range in which we should do our graphics register writes. For the beginning of the range, the first two segments can be written anytime before our image displays. For the other four, the range starts right after the previous copy for that graphics register has finished displaying. So, e.g., G2P0 may first be written when G0P0 ends displaying at 63.

The end of the range for each segment is when that segment begins to display, so in our example, it is 55 for G0P0, 63 for G1P1, etc., giving us:

G0P0: May be written anytime before 55
G1P1: May be written anytime before 63
G2P0: May be written in the range of 63-71
G3P1: May be written in the range of 71-79
G4P0: May be written in the range of 79-87
G5P1: May be written in the range of 87-95

We can convert these pixel positions to CPU cycles by adding 68 to the screen position (for horizontal blank), and dividing by 3, since each CPU cycle lasts 3 pixels. For the beginning of the range, we round up to the nearest cycle, and for the end of the range, we round down.

G0P0: Must be written by cycle 41 ((55+68)/3)
G1P1: Must be written by cycle 43 ((63+68)/3, rounded down)
G2P0: Must be written in the range of 44-46 ((63+68)/3 rounded up, and (71+68)/3 rounded down)
G3P1: Must be written in the range of 47-49 ((71+68)/3 rounded up, and (79+68)/3)
G4P0: Must be written in the range of 49-51 ((79+68)/3, and (87+68)/3 rounded down)
G5P1: Must be written in the range of 52-54 ((87+68)/3 rounded up, and (95+68)/3 rounded down)
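If you want to try a different horizontal position, the same arithmetic can be left to the assembler. The sketch below is only an illustration: the symbol names are made up for this example and are not used elsewhere in the tutorial. It relies on DASM's integer division rounding down, so adding 2 before dividing by 3 gives the rounded-up value.

```asm
; Write windows as assembler constants (symbol names are illustrative)
HBLANK_CLOCKS = 68
P0_X          = 55
P1_X          = P0_X + 8

G0_LAST  = [P0_X + HBLANK_CLOCKS] / 3           ; 41
G1_LAST  = [P1_X + HBLANK_CLOCKS] / 3           ; 43
G2_FIRST = [P0_X + 8 + HBLANK_CLOCKS + 2] / 3   ; 44 (rounded up)
G2_LAST  = [P0_X + 16 + HBLANK_CLOCKS] / 3      ; 46
G3_FIRST = [P1_X + 8 + HBLANK_CLOCKS + 2] / 3   ; 47 (rounded up)
G3_LAST  = [P1_X + 16 + HBLANK_CLOCKS] / 3      ; 49
G4_FIRST = [P0_X + 24 + HBLANK_CLOCKS + 2] / 3  ; 49 (rounded up)
G4_LAST  = [P0_X + 32 + HBLANK_CLOCKS] / 3      ; 51
G5_FIRST = [P1_X + 24 + HBLANK_CLOCKS + 2] / 3  ; 52 (rounded up)
G5_LAST  = [P1_X + 32 + HBLANK_CLOCKS] / 3      ; 54
```

Changing P0_X regenerates every window, which makes it easier to see how far the kernel's store instructions would need to shift for a new position.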

Using VDEL to Make the Timing Work


Now that we know our cycle ranges, we can see the timing is extremely tight for our graphics register writes. After the first two registers, there is only a 10-cycle range between the first cycle our third segment may be written and the last cycle our last segment may be written. Since a write to a graphics register takes 3 cycles, this does not leave any time to read new graphics data. For example, if our write for G2P0 ends at cycle 44, then the write to G3P1 could end at cycle 47, the write to G4P0 at cycle 50, and finally the write to G5P1 at cycle 53. That's 4 writes in rapid succession with no reads from memory. The problem is, we only have 3 registers to load up beforehand:

; Not enough registers
    sta GRP0		;  3	(44)
    stx GRP1		;  3	(47)
    sty GRP0		;  3	(50)
    ??? GRP1		;  3	(53)


Where do we find another register to be able to make the timing here work? The answer is to make use of the VDELP0 and VDELP1 registers. If we write a #1 to these registers, a write to a player graphics register goes to a buffer rather than directly to the register, and the buffered value is only displayed once the other player's graphics register is written to. So, if VDELP0 and VDELP1 are both enabled, then a write to GRP0 will go to a buffer, and anything in the buffer for GRP1 will then be displayed. Likewise, a write to GRP1 will go to a buffer, and the buffer for GRP0 will then be displayed.

In the example below, I use brackets to indicate writes to the graphics buffer, and no brackets to indicate graphics that are displayed:

; VDELP* in practice 
    lda hello0,y
    sta GRP0	; hello0->[GRP0] (hello0 is written to GRP0's buffer)
    lda hello1,y
    sta GRP1	 ; hello1->[GRP1] hello0->GRP0 (hello1 is written to GRP1's buffer; hello0 is displayed)
    lda hello2,y
    sta GRP0	; hello2->[GRP0] hello1->GRP1 (hello2 is written to GRP0's buffer; hello1 is displayed)


At this point, hello0 will be displayed in GRP0, and hello1 will be displayed in GRP1, and hello2 is waiting in GRP0's buffer. This effectively gives us an extra register to make our tight timing work.
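The example above assumes that vertical delay has already been switched on for both players. A minimal sketch of that step, assuming the standard vcs.h register names, would run once before the kernel:

```asm
; Enable vertical delay for both players (sketch; assumes vcs.h)
    lda #1              ; bit 0 set = vertical delay on
    sta VDELP0
    sta VDELP1
```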

Putting it All Together


Now that we know our timing and how to make it work, we can finally write a kernel to display our big image. Remember that since we are using VDEL, we need to consult the correct timing for the segment that is actually being displayed rather than what is being written to a buffer. Here's an example of a kernel that will work for our positioning and data:

; kernel example
    ldy #(hello_length-1)
    sty ImageHeight
BigGraphicLoop
    sta WSYNC           ; 3     (0)
    lda hello0,y        ; 4     (4)
    sta GRP0            ; 3     (7)     hello0->[GRP0]
    lda hello1,y        ; 4     (11)
    sta GRP1            ; 3     (14)    hello1->[GRP1], hello0->GRP0
    lda hello2,y        ; 4     (18)
    sta GRP0            ; 3     (21)    hello2->[GRP0], hello1->GRP1
    lda hello3,y        ; 4     (25*)   
    tax                 ; 2     (27)    hello3->X
    lda hello4,y        ; 4     (31)
    sta Temp            ; 3     (34)
    lda hello5,y        ; 4     (38)    hello5->A
    ldy Temp            ; 3     (41)

    stx GRP1            ; 3     (44)    hello3->[GRP1], hello2->GRP0
    sty GRP0            ; 3     (47)    hello4->[GRP0], hello3->GRP1
    sta GRP1            ; 3     (50)    hello5->[GRP1], hello4->GRP0
    sta GRP0            ; 3     (53)    hello5->GRP1    
    dec ImageHeight     ; 5     (58)
    ldy ImageHeight     ; 3     (61)
    bpl BigGraphicLoop  ; 2/3   (64)


Note that the last write to GRP0 is just to display what is in GRP1's buffer, and it does not matter what is actually written there.
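By the same logic, blanking the players after the loop takes three writes rather than two, because the last value still has to be pushed out of GRP1's buffer. The following is a sketch of one way to clean up, assuming nothing else needs vertical delay afterwards. It uses the same bracket notation as above:

```asm
; After the loop (sketch): blank both players, then turn VDEL off
    lda #0
    sta GRP0            ; 0->[GRP0]
    sta GRP1            ; 0->[GRP1], 0->GRP0
    sta GRP0            ; 0->GRP1
    sta VDELP0          ; vertical delay off
    sta VDELP1
```

Without something like this, the graphics left in the registers from the final row would keep being drawn on the scanlines that follow.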

Sample Source File



Here is a bare-bones source file showing the code from this tutorial in use:

Source File

[stella] The scores / 48-pixel highres routine explained!