Upper vs Lower Case - Assembler Related

jasonsbeer

New Member
Jan 31, 2020
4
0
1
I hope you guys don't mind me asking a question or two here related to assembler with the X16. I'm eager to teach myself, but I've run into something odd. At least it seems odd.

When I run the program below, I expect the output to be "Welcome to beautiful Forestville." What I get is "wELCOME TO BEAUTIFUL fORESTVILLE." I've attached a screenshot. Any thoughts?

Code:
screen_init:
    lda     #$93
    jsr    CHROUT        ; CLR/HOME
    lda     #$03
    jsr     screen_set_charset    ; Set lower/upper case mode

    ldx    #$00        ; X register used to index string
loop:    lda    msg_intro, x    ; Load character from BUFFER into A register
    cmp    $00        ; Is this character = $00
    beq    end        ; If the character is $00, jump to END:
    jsr    CHROUT        ; Call KERNAL API to print char in A register
    inx            ; Increment X register
    jmp    loop        ; Jump back to loop label to get next character
end:    rts            ; Return to caller

msg_intro: .ASCIIZ "Welcome to beautiful Forestville."
[Attached screenshot: Untitled.png]
 

BruceMcF

Active Member
May 19, 2019
208
63
28
That's PETSCII.

Way back when ASCII was upper case only, Commodore used the unallocated codes with bit 6 set for graphics characters, and then used bit 7 for inverted bitmaps, so inverse video was as easy as toggling the high bit of the screen code. Then, when upper/lower case was implemented, since the routines used unshifted keys for the upper case ASCII characters and shifted keys for the graphics characters, Commodore cleverly just set up the upper/lower case character ROM with the lower case characters where the upper case had been and the upper case where the graphics had been.

So they didn't have to change the keyboard scan routine at all. Just by doubling the size of the character ROM and toggling its highest address bit, they supported swapping between upper case/graphics mode for games and lower/upper case mode for productivity applications.

Of course, ASCII added lower case by setting bit 5 (adding $20) on the upper case codes, so that bit of Commodore cleverness means that ASCII upper/lower case and PETSCII upper/lower case are flipped around.

What assembler are you using? Some of them have PETSCII modes built in. Also, I believe you can switch to an ISO character set if you wish, but don't recall offhand how that works.
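To make the flip concrete, here is a minimal sketch of the word "Welcome" encoded by hand for PETSCII upper/lower case mode, in ca65 syntax. The label msg_petscii is made up for illustration, and the byte values assume the standard Commodore PETSCII layout, where shifted (upper case) letters sit at $C1-$DA and unshifted (lower case) letters sit at $41-$5A.

```asm
; Sketch only: "Welcome" hand-encoded for PETSCII upper/lower case mode.
; msg_petscii is a hypothetical label, not part of the original program.
msg_petscii:
    .byte $D7                           ; 'W': ASCII $57 becomes PETSCII $D7
    .byte $45, $4C, $43, $4F, $4D, $45  ; "elcome": ASCII $65.. becomes PETSCII $45..
    .byte $00                           ; terminator expected by the print loop
```

Raw .byte values are not translated by the assembler, so a table like this should print the same way whether or not a PETSCII mode is enabled.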
 

Wertyloo

Member
Sep 15, 2019
76
11
8
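Whoa!
"cmp $00" after "loop: lda msg_intro, x" is a no-no: it compares A with the byte stored at address $00, not with the value zero. You want "cmp #0", which is unneeded anyway, because when you lda something, that automatically sets Z if the value is zero. So: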

loop: lda msg_intro, x
beq end
jsr CHROUT
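
For reference, here is a sketch of the whole loop with that change applied, in ca65 syntax. It assumes CHROUT lives at the usual KERNAL address $FFD2 and that CHROUT leaves X untouched, as the original program already relies on. The label print_msg is made up for illustration.

```asm
CHROUT = $FFD2              ; assumed KERNAL "print character in A" entry point

; cmp $00  would compare A with the byte stored AT address $00 (zero page).
; cmp #$00 would compare A with the literal value 0 (immediate).
; Neither is needed here, because lda already sets the Z flag.

print_msg:                  ; hypothetical label for this sketch
    ldx #$00                ; X indexes into the string
loop:
    lda msg_intro, x        ; loading A sets Z when the byte is $00
    beq end                 ; terminator reached, stop printing
    jsr CHROUT              ; print the character in A
    inx                     ; move to the next character
    jmp loop
end:
    rts

msg_intro: .asciiz "Welcome to beautiful Forestville."
```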
 

jasonsbeer

BruceMcF said: "What assembler are you using? Some of them have PETSCII modes built in."
I'm using ca65 on Ubuntu. Your comment led me to find the discussion below. If you pass "-t c64" in the assemble command, ca65 automatically takes care of the PETSCII character mappings. Once I did this, the text came out in the proper case, as expected. I'm not sure if there are other implications of using this switch on an X16 program, but it looks like a quick and easy solution.

From the CC65 manual...
The -t switch is needed when translating the text.s file, so the text is converted from the input character-set (usually ISO-8859-1) into the target character-set (PETSCII, in this example) by the assembler.
https://www.lemon64.com/forum/viewtopic.php?t=72236

I'm going to study this some more. Thanks!
 

jasonsbeer

Wertyloo said: "cmp $00" is a no-no; you want "cmp #0", which is unneeded because lda already sets Z if the value is zero.
OK. That makes sense. Do the branch commands always reference the A register? I'm still scratching my head on these. All the examples I've found are very technical and hard for me to fully understand. And there's a bunch: BEQ, BNE, BCC, BMI, BCS, BPL. Maybe more? I will keep digging. Thanks!
 

Jestin

New Member
Jan 25, 2020
16
13
3
For anyone reading this and using the ACME assembler, you can define different string encodings with !pet, !scr, and !raw. I'm sure ca65 has its equivalents, but I'm unfamiliar.
 

Jestin

The branch commands always reference the A register?
No, but loading the A register will set the flags that the branch operations do reference. For example, beq will branch if the Z flag is set, regardless of what is in A. However most (if not all) operations that put values in A will also set the Z flag if A is zero. So if you do a lda #0, you can check the docs on lda and see that it sets the N and Z flags, depending on what you load. Since you set A to zero, you can assume that Z has been set and beq will branch.

I've been referencing this page for my 6502 reference. It's extremely sparse documentation, but does get straight to the point about each op code. I consider its brevity a feature, not a bug. For example, here's the entire doc on lda:
[Image: 1580597626970.png, the reference entry for lda]
and here's the entire doc on beq:
[Image: 1580597695769.png, the reference entry for beq]

Although not very thorough, it tells you everything you need to know. lda sets the N and Z flags, and beq branches if Z is set. Simple.
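
A small sketch of that point, in ca65 syntax: beq looks only at the Z flag, and Z reflects whichever instruction touched it last, not necessarily A. The labels flag_demo and was_zero are made up for illustration, and CHROUT is assumed to be at the usual KERNAL address $FFD2.

```asm
CHROUT = $FFD2      ; assumed KERNAL "print character in A" entry point

flag_demo:          ; hypothetical label for this sketch
    lda #$41        ; A = $41; Z is cleared because the value is nonzero
    ldx #$00        ; X = 0; Z is now set, although A still holds $41
    beq was_zero    ; taken: beq tests Z, which reflects the ldx, not A
    rts             ; not reached in this sketch
was_zero:
    jsr CHROUT      ; prints whatever character code $41 maps to
    rts
```

So the branch is taken even though A is nonzero, because the most recent flag-setting instruction loaded zero into X.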
 

BruceMcF

For anyone reading this and using the ACME assembler, you can define different string encodings with !pet, !scr, and !raw.
Quite. Or you can stick to ASCII and all upper case, which will be all lower case in upper/lower mode, and all upper case in upper/graphics mode.

You can forget these things after not having used any Commodore 8-bit systems for twenty-five years ... I spent two days debugging CAM64TH without realizing that what I thought were garbage characters showing up on the screen were graphics characters corresponding to "ok" because the emulated C64 had not been set into upper/lower case mode.
 

Wertyloo

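Just a little footnote:
The X register is only 8 bits wide, and the end of your text is marked by a $00 byte.
Because you are using "loop: lda msg_intro, x", know that if your text is LONGER than 255 characters, X WILL WRAP AROUND to 0 and printing begins with the 1st char again on the next iteration.
So something like this self-modifying version fixes it (the code has to run from RAM):
    lda #<msg_intro     ; low byte of the string address
    sta loop+1          ; patch the operand of the lda at loop
    lda #>msg_intro     ; high byte of the string address
    sta loop+2
    ldx #0
loop: lda msg_intro, x  ; operand high byte gets bumped after each 256-byte page
    beq end             ; $00 terminator: done
    jsr CHROUT
    inx
    bne loop            ; X has not wrapped yet, stay on this page
    inc loop+2          ; X wrapped to 0: advance to the next page
    jmp loop
end: rts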

now you can print out whole screen-long msgs :)
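
If self-modifying code is not an option, the same idea can be sketched with a zero-page pointer and indirect indexed addressing. This is only a sketch under a few assumptions: ptr = $22 assumes that zero-page pair is free for user programs on the X16, CHROUT is assumed to be at $FFD2 and to preserve Y, and the labels print_long, @next, and @done are made up for illustration.

```asm
CHROUT = $FFD2          ; assumed KERNAL "print character in A" entry point
ptr    = $22            ; ASSUMPTION: a free zero-page pair ($22/$23)

print_long:             ; hypothetical label for this sketch
    lda #<msg_intro     ; low byte of the string address
    sta ptr
    lda #>msg_intro     ; high byte of the string address
    sta ptr+1
    ldy #$00            ; Y indexes within the current 256-byte page
@next:
    lda (ptr),y         ; fetch the byte at ptr + Y
    beq @done           ; $00 terminator: done
    jsr CHROUT
    iny
    bne @next           ; Y has not wrapped yet, stay on this page
    inc ptr+1           ; Y wrapped to 0: advance the pointer by one page
    jmp @next
@done:
    rts

msg_intro: .asciiz "Welcome to beautiful Forestville."
```

The pointer version also works when the code sits in ROM, at the cost of two zero-page bytes.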
 

SlithyMatt

New Member
Sep 14, 2019
9
9
3
If you are using ca65 and don't want to use the default mappings, you can override them with .charmap directives. I preserve ASCII mappings for certain characters by adding lines like:
.charmap $41, $41 ; "A"
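
The same directive can also cover a whole range at once when combined with .repeat. The sketch below goes the other way from the line above and maps ASCII letters onto PETSCII letters by hand, roughly what a PETSCII target does for you. It assumes the standard PETSCII layout (upper case at $C1-$DA, lower case at $41-$5A), and the label msg_mapped is made up for illustration.

```asm
; Sketch only: remap ASCII letters to PETSCII letters with .charmap.
.repeat 26, i
    .charmap $41 + i, $C1 + i   ; 'A'..'Z' become PETSCII shifted letters
    .charmap $61 + i, $41 + i   ; 'a'..'z' become PETSCII unshifted letters
.endrepeat

; Strings assembled after this point use the mappings above.
msg_mapped: .asciiz "Welcome to beautiful Forestville."
```

The mappings only affect character and string literals that appear after the .charmap lines, so they belong near the top of the file.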
 