Tokenize – a cross-platform BBC BASIC tokeniser

Announcement from Steve Fryatt, 21st May, 2014.

After several months of development and discussions in comp.sys.acorn.programmer, I’m pleased to formally announce the existence of Tokenize: a cross-platform tokeniser for BBC[1] BASIC V and VI. Although still very much a work in progress, it is now usable enough that others might find it useful.

Tokenize will take ASCII text versions of BASIC programs (what RISC OS users might know as “BasText” format) using untokenised keywords, and turn them into “proper” BASIC files – ready to run on RISC OS machines.
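To give a rough idea of what that conversion involves, here is a minimal Python sketch. It is not Tokenize's own code: the function names are made up for illustration, the keyword table is only a small subset of the tokens BBC BASIC defines, and the line layout (a carriage return, the line number as two bytes, a length byte, then the body, with the program closed by &0D &FF) is the usual Acorn file format as I understand it. Details such as encoded line-number references after GOTO are left out entirely.

```python
# Small, illustrative subset of BBC BASIC keyword tokens. The real table is
# much larger, and this is an assumption-laden sketch, not Tokenize's code.
TOKENS = {"PRINT": 0xF1, "REM": 0xF4, "FOR": 0xE3,
          "NEXT": 0xED, "TO": 0xB8, "END": 0xE0}


def tokenise_line(number: int, text: str) -> bytes:
    """Encode one numbered line as: CR, number (high, low), length, body."""
    body = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # String literals are copied untouched, quotes included.
            end = text.find('"', i + 1)
            end = len(text) if end == -1 else end + 1
            body += text[i:end].encode("latin-1")
            i = end
            continue
        word = next((w for w in TOKENS if text.startswith(w, i)), None)
        if word is not None:
            body.append(TOKENS[word])
            i += len(word)
            if word == "REM":
                # Everything after REM is a comment and stays as plain text.
                body += text[i:].encode("latin-1")
                break
            continue
        if ch.isalpha():
            # Copy a whole identifier so keywords inside it are not matched.
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] in "_%$"):
                j += 1
            body += text[i:j].encode("latin-1")
            i = j
            continue
        body += ch.encode("latin-1")
        i += 1
    length = len(body) + 4  # the length byte counts the 4-byte header too
    if length > 255:
        raise ValueError("line too long to store in one length byte")
    return bytes([0x0D, number >> 8, number & 0xFF, length]) + bytes(body)


def tokenise_program(lines) -> bytes:
    """Join (number, text) pairs and append the end-of-program marker."""
    return b"".join(tokenise_line(n, t) for n, t in lines) + b"\x0d\xff"


# tokenise_line(10, 'PRINT "TO"').hex() gives '0d000a0af12022544f22'
```

Note how PRINT shrinks to the single byte &F1, while the TO inside the quoted string is left alone.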

The main purpose of Tokenize was to allow BASIC programs to be stored in revision control systems as plain text (so that changes can be seen and manipulated), while still being quick and easy to convert into ‘real’ BASIC files.

In addition, it has the following functionality:

  • Optionally apply various CRUNCH operations to code (including more flexible handling of opening REMs for preserving copyright info).
  • Convert Tab indentations into spaces.
  • Link multiple source files into a single BASIC program, and/or include files referenced by the LIBRARY command (removing the commands in the process).
  • Convert SWI names in SYS commands into numbers (using C headers such as swis.h on non-RISC OS platforms).
  • Replace variables with constant values from the command line, to allow the embedding of build dates, version info and so on into software at the time of tokenising.
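As an illustration of the last two items, the following Python sketch shows one way such source-level rewrites could work. It is a guess at the general idea rather than a description of how Tokenize does it: the function names, the tiny SWI table (Tokenize is said to read these values from headers such as swis.h) and the rule of skipping text inside string literals are all assumptions made for the example, and REM statements are not treated specially.

```python
import re

# Hypothetical lookup table for the example; only a few well-known SWIs.
SWI_NUMBERS = {"OS_WriteC": 0x00, "OS_NewLine": 0x03,
               "Wimp_Initialise": 0x400C0}


def replace_swi_names(line: str, swis=SWI_NUMBERS) -> str:
    """Rewrite SYS "Name" as SYS &number when the name is in the table."""
    def repl(match):
        number = swis.get(match.group(1))
        # Unknown names are left exactly as written.
        return match.group(0) if number is None else f"SYS &{number:X}"
    return re.sub(r'SYS\s*"([A-Za-z0-9_]+)"', repl, line)


def substitute_constants(line: str, constants: dict) -> str:
    """Replace whole variable names with constant text, outside strings."""
    if not constants:
        return line
    # Longest names first, so a name never matches as a prefix of another.
    names = sorted(constants, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])(" + "|".join(map(re.escape, names)) +
        r")(?![A-Za-z0-9_%$])")
    # Splitting on a capturing group puts string literals at odd indices.
    parts = re.split(r'("[^"]*")', line)
    for index in range(0, len(parts), 2):
        parts[index] = pattern.sub(lambda m: constants[m.group(1)],
                                   parts[index])
    return "".join(parts)


# replace_swi_names('SYS "Wimp_Initialise",310') gives 'SYS &400C0,310'
```

With constants = {"build_date$": '"21 May 2014"'}, a line such as PRINT build_date$ would become PRINT "21 May 2014" before tokenising, which is the sort of build-time stamping the feature list describes.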

Tokenize is written in C, and as such can be built for RISC OS or other platforms. A pre-built copy for RISC OS can be found on my website, and there’s also a pre-built version that works on recent copies of Ubuntu Linux. It can also be found in the GCCSDK Autobuilder as the “native-tokenize” package, allowing it to be built for use on any Linux system where the GCCSDK is installed.

Tokenize is open source, and licensed under the EUPL (which itself is “compatible” with the GPLv2). Source code is on my website, and is also available from’s SVN repositories.


  1. In his announcement, Steve used ‘ARM’ rather than ‘BBC’. This may have been to differentiate between the version of BBC BASIC that runs on RISC OS and the versions of BASIC (BBC and other) that run on Windows and other platforms. However, since RISCOSitory is focused on RISC OS, the distinction isn’t necessary here. Furthermore, thanks to platforms such as the Raspberry Pi, there are now other versions of BASIC that run on ARM – just not on RISC OS. Perhaps ‘RISC OS BASIC’ might have been a better term to describe the specific version of BASIC that is supplied as standard with the operating system.
