base64

Base64 Encode/Decode



\ BASE64 Encode/Decode
\ base64.f
\ version 1.0.0
\ Bob Dickow, March 29, 2018

BASE @

DECIMAL

\ Build a translation table data-smart word, defered so you can delegate 
\ your own words to map to and from streamed 6-bit tokens to printable 
\ characters:

DEFER >BASE64CHR  \ given 6-bit n index, returns base64 representation
DEFER BASE64CHR>  \ given chr , returns 6-bit representation

\ Compiling word to build standard base 64 6-bit chars lookup:

: BUILD-BASE64-ENCODING-LOOKUP: ( n-index <name> -- chr ) 
	CREATE
	'A' C, 'B' C, 'C' C, 'D' C, 'E' C, 'F' C, 
	'G' C, 'H' C, 'I' C, 'J' C, 'K' C, 'L' C, 
	'M' C, 'N' C, 'O' C, 'P' C, 'Q' C, 'R' C, 
	'S' C, 'T' C, 'U' C, 'V' C, 'W' C, 'X' C, 
	'Y' C, 'Z' C, 'a' C, 'b' C, 'c' C, 'd' C, 
	'e' C, 'f' C, 'g' C, 'h' C, 'i' C, 'j' C, 
	'k' C, 'l' C, 'm' C, 'n' C, 'o' C, 'p' C, 
	'q' C, 'r' C, 's' C, 't' C, 'u' C, 'v' C, 
	'w' C, 'x' C, 'y' C, 'z' C, '0' C, '1' C,
	'2' C, '3' C, '4' C, '5' C, '6' C, '7' C,
	'8' C, '9' C, '+' C, '/' C, DOES> + C@ ;

\ For decoding, caculate the output 6-bit value given an ASCII chr token
\ instead of reverse lookup table. This actually profiles as faster.

: STANDARD-BASE64CHR> ( base64chr -- n )
  DUP 96 > IF
    71 -
  ELSE
    DUP 64 > IF
      65 -
    ELSE
      DUP 47 > IF
        4 +
      ELSE \ handles the '+' or '/'
        DUP 47 = IF 
          DROP 63
        ELSE
          DROP 62
        THEN
      THEN
    THEN
  THEN  ;

\ Set the word to build lookup table above:

\ Compile the lookup word that, given a 6-bit indexer, 
\ returns a printable character for encoding:

BUILD-BASE64-ENCODING-LOOKUP: >BASE64CHR-LOOKUP \ the encoding lookup

' >BASE64CHR-LOOKUP IS >BASE64CHR \ set the vector for encoding
' STANDARD-BASE64CHR> IS BASE64CHR>  \ set the vector for decoding

\ padding characters (usually '=') are added to encoded strings
\ if they are not an even tetrad of 6-bit units

VARIABLE PADDING-CHR

'=' PADDING-CHR !  \ initialize default '=' padding at end of base64 string.

\ Use the PADDING? variable to set a global that will assume
\ padded ( if PADDING is TRUE) or non-padded (if PADDING? is FALSE) base64 result. 
\ Padding characters are the '=' or '==' string endings. Some environments assume
\ unpadded base64 strings, such as for e-mail HELO authentication dialogs etc.

VARIABLE PADDING? 
 
PADDING? ON \  initialize default to TRUE; use padded endings on encoded output strings.

\ BASE64-ENCODE-LEN -- Helper word. Use to return required buffer size for the encoded output given
\ length n of un-encoded input data in bytes.

: BASE64-ENCODE-LEN ( n -- n )
  1- 3 /MOD NIP 4 * 4 + 
  PADDING? @ FALSE = IF
    DUP
    3 /MOD DROP
    ?DUP IF
      3 SWAP - 1+ -
    THEN
  THEN ;

: BASE64-DECODE-LEN ( c-addr-encoded n -- n )
  2 0 DO
  	2DUP 1-  + C@ PADDING-CHR = 
  	+
  LOOP
  NIP
  6 * 8 / ;

\ Provided buffer and length, return adjusted length:

: TRIM-PADDING ( c-addr-encoded n -- c-addr-encoded n )
  1- 
  2 0 DO
  	2DUP  + C@ PADDING-CHR @ = 
  	+
  LOOP 1+ ;

\ For flexibility, the encoding and decoding words do not assume
\ space for counts or zero-byte delimiters.

\ >BASE64 (say "TO BASE64") is the high level encoding word.
\ Creates base64 encoded string and places in buffer (c-addr-output).
\ To use, set up a buffer of size 3:4 input:output 
\ length proportions. (Use BASE64-ENCODE-LEN ( source-len -- result-len )
\ helper word for convenience in calculating storage bytes.)
\ The input string or data is a byte-aligned buffer.
\ n-len is the length in bytes of the buffer.

: >BASE64 ( c-addr-output c-addr-source-data n-len -- c-addr-output n-len ) 
  {: output-buffer source-data len | accum chr-index out-index  :}
  0 TO accum  0 TO chr-index  0 TO out-index
  len 0= IF output-buffer 0 EXIT THEN
  BEGIN  \ looping through each input byte, bit-twiddling and assembling output data as base64 characters.
     source-data chr-index + C@ DUP
     1 +TO chr-index
     $FC AND  2 RSHIFT
     >BASE64CHR output-buffer out-index + C!
     1 +TO out-index
     3 AND  4 LSHIFT  TO accum
     chr-index len >=  IF   
       accum 
       >BASE64CHR output-buffer out-index + C!
       1 +TO out-index
     THEN
	   chr-index len < IF
	     source-data chr-index + C@ DUP
	     1 +TO chr-index
	     $F0 AND  4 RSHIFT
	     accum OR 
	     >BASE64CHR output-buffer out-index + C!
	     1 +TO out-index
	     $0F AND 2 LSHIFT
	     TO accum
	     chr-index len >= IF
         accum  
         >BASE64CHR output-buffer out-index + C!
         1 +TO out-index
       THEN
	     chr-index len < IF
		     source-data chr-index + C@ DUP
		     1 +TO chr-index
		     $C0 AND 6 RSHIFT accum OR 
		     >BASE64CHR output-buffer out-index + C!
		     1 +TO out-index
		     $3F AND
		     >BASE64CHR output-buffer out-index + C!
		     1 +TO out-index
	     THEN
    THEN
    chr-index len >=
	UNTIL
  PADDING? @ IF
	  chr-index 3 /MOD DROP
		?DUP IF
		   3 SWAP -  0 DO 
			 	   PADDING-CHR @ output-buffer out-index + C!
			 	   1 +TO out-index
		   LOOP
		THEN
	THEN
	output-buffer out-index ;

\ BASE64> (say "BASE64 FROM") is the high level decoding word.
\ To use, set up a buffer (c-addr-output) of size 4:3 input:output 
\ length proportions. (Use BASE64-DECODE-LEN ( source-len -- target-len )
\ word for convenience.)
\ The input string or data is a byte-aligned buffer buffer.
\ n-len is the length in bytes of the buffer.

: BASE64> ( c-addr-output c-addr-encoded n-len  -- )
  {: output-buffer source-data len | accum chr-index out-index exit-flg :}
  0 TO accum   0 TO chr-index   0 TO out-index
  source-data len TRIM-PADDING TO len DROP
  FALSE TO exit-flg
  BEGIN
    source-data chr-index + C@  
    1 +TO chr-index
    BASE64CHR>                  
    2 LSHIFT          
    source-data chr-index + C@  
    1 +TO chr-index
    BASE64CHR>  DUP             
    $30 AND 4 RSHIFT ROT OR     
    output-buffer out-index + C! 
    1 +TO out-index
    chr-index len < IF
      $0F AND 4 LSHIFT          
      source-data chr-index + C@  
      BASE64CHR>   
      1 +TO chr-index
      DUP $3C AND 2 RSHIFT  ROT OR 
      output-buffer out-index + C!
		  1 +TO out-index
      chr-index len < IF
	      $3 AND 6 LSHIFT 
	      source-data chr-index + C@
	      1 +TO chr-index
	      BASE64CHR> OR
	      output-buffer out-index + C!
	      1 +TO out-index
      ELSE
        DROP 
        TRUE TO exit-flg
      THEN
    ELSE
      DROP
      TRUE TO exit-flg
    THEN
    chr-index len >= exit-flg OR
  UNTIL
  output-buffer out-index ;

BASE !

by Budsy

avatar of Budsy

Versions

1.0.0

Download current as zip

Tags

gpl, base64, encoding, decoding, mime

Dependencies

None

Dependents

None