VVM Bytecode

Introduction

A bytecode file [extension .vvm or .vvm.gz] is a binary program or library module for the Vanilla Virtual Machine.

The Machine

The Vanilla Virtual Machine operates [conceptually] on a higher level than, say, the JVM. It deals directly with things like local variables, types and closures, rather than requiring VVM bytecode to manipulate the stack directly. This has several advantages:

  • Code is denser because it is higher level
    • This means files are smaller and [raw bytecode] programs are smaller in memory
  • Compilers for arbitrary languages can determine a module's information from its binary representation without having to parse its source code.
  • It can - at the VM level - be optimised to a greater extent based on information available only at runtime
  • The specification is smaller, more understandable and easier to maintain
  • It is simpler to create a compiler targeting VVM code
  • It is simpler to create a new VVM

But it also poses several difficulties:

  • Bytecode is more directly related to the source code, which is often undesirable in closed source projects
    • This can easily be countered by automated code mangling, which makes the bytecode harder to understand [and so harder to reverse-engineer] without affecting its functionality or efficiency
  • It must still be able to represent code compiled from a huge group of different languages
    • This will simply add some complexity to the specification - it is not really easier or simpler to make such languages inter-operate on a lower level VM
  • It is a more complicated process to optimise inside the VM
    • This shifts existing complexity from the compiler to the VM rather than adding new complexity

File Format

Each VVM bytecode file corresponds to exactly one source module/file. It is the equivalent of a .class file in Java. All binary values mentioned herein are little endian, unless otherwise specified, and any text is encoded as UTF-8.

Script Line

If the file starts with a '#', the first line [up to \n] of the file must be ignored.
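As a minimal sketch of this rule, the function below skips an optional script line and then checks for the "VVM1" magic word covered in the next section. The name `structured_data_offset` is hypothetical, and the sketch assumes the terminating \n belongs to the script line and is skipped along with it.

```python
MAGIC = b"VVM1"


def structured_data_offset(data: bytes) -> int:
    """Return the offset of the first byte after the script line and magic word."""
    pos = 0
    # An optional script line starts with '#' and runs up to the first newline.
    if data[:1] == b"#":
        newline = data.find(b"\n")
        if newline == -1:
            raise ValueError("script line is not terminated by a newline")
        # Assumption: the newline itself is part of the ignored line.
        pos = newline + 1
    # The magic word must come next.
    if data[pos:pos + len(MAGIC)] != MAGIC:
        raise ValueError("missing VVM1 magic word")
    return pos + len(MAGIC)
```

A loader could call this once on the raw file contents and hand everything from the returned offset onwards to the structured-data decoder.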

Magic Word

"VVM1" should be the first 4 characters in the file [after the optional script line]

Structured Data

The rest of the file conforms to a structure - similar to an XML document, though simpler. LISP users will be familiar with its textual representation:

(i like (poo in a (bucket)))

Brackets enclose a section, and the first member of each is the op-code or type of the section. For example, a function return operation with the argument 33 can be represented by (return (int32-constant 33)).
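The bracketed text form can be read with a very small recursive parser. The sketch below is illustrative only, not the actual assembler: the names `tokenize` and `parse` are hypothetical, atoms are kept as plain strings, and quoted strings or comments are not handled because their syntax is not described here.

```python
def tokenize(text: str) -> list:
    """Split the text into brackets and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text: str):
    """Parse one bracketed section into nested Python lists of strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while True:
                if pos >= len(tokens):
                    raise ValueError("missing closing bracket")
                if tokens[pos] == ")":
                    pos += 1
                    return items
                items.append(read())
        if token == ")":
            raise ValueError("unexpected closing bracket")
        return token

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the section")
    return result


section = parse("(return (int32-constant 33))")
opcode, argument = section[0], section[1]
```

Here `opcode` is the string "return" and `argument` is the nested list for the int32-constant section, matching the rule that the first member of each section names its op-code or type.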

Binary Representation

This textual list-based approach is used as input to the assembler and output from the disassembler. However, it is not dense enough for distribution, so actual bytecode files are encoded as follows:

[Single byte section type/opcode] [32-bit section size, if required by type] [the section's data or sub-sections]
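The layout above can be sketched as a pair of helper functions. This is an illustration under assumptions rather than a reference implementation: the names `encode_section` and `decode_sized_section` are hypothetical, the size word is assumed to be unsigned and to count only the bytes that follow it, and it is written little endian as stated for all binary values in this format.

```python
import struct


def encode_section(opcode: int, payload: bytes, has_size_word: bool) -> bytes:
    """Build one section: type byte, optional 32-bit size, then the data."""
    header = bytes([opcode])
    if has_size_word:
        # "<I" is an unsigned 32-bit little-endian integer.
        # Assumption: the size covers the payload only, not the header.
        header += struct.pack("<I", len(payload))
    return header + payload


def decode_sized_section(data: bytes, offset: int = 0):
    """Read a section that carries a size word.

    Returns (opcode, payload, offset of the next section).
    """
    opcode = data[offset]
    (size,) = struct.unpack_from("<I", data, offset + 1)
    start = offset + 5
    payload = data[start:start + size]
    if len(payload) != size:
        raise ValueError("section is truncated")
    return opcode, payload, start + size


# A type section (0x01) nested inside a module section (0x00).
inner = encode_section(0x01, b"abc", has_size_word=True)
outer = encode_section(0x00, inner, has_size_word=True)
```

Because a sized section states its own length, a reader that does not understand a given section type can still skip over it and continue with the next one.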

Op-Codes and Sections table

Byte Value   Size word?   Name     Description
0x00         Yes          module   Encapsulates the entire module
0x01         Yes          type     An abstract type definition
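A decoder or disassembler would typically hold this table as a lookup structure. The sketch below covers only the two rows listed above; the names `SectionType`, `SECTION_TYPES` and `lookup` are hypothetical and not part of the format.

```python
from typing import NamedTuple


class SectionType(NamedTuple):
    name: str
    has_size_word: bool
    description: str


# Only the entries listed in the table; other byte values are unknown here.
SECTION_TYPES = {
    0x00: SectionType("module", True, "Encapsulates the entire module"),
    0x01: SectionType("type", True, "An abstract type definition"),
}


def lookup(opcode: int) -> SectionType:
    """Return the table entry for a byte value, or raise for unlisted values."""
    try:
        return SECTION_TYPES[opcode]
    except KeyError:
        raise ValueError(f"unknown section type 0x{opcode:02x}") from None
```

Keeping the "Size word?" column alongside the name lets the decoder decide, from the first byte alone, whether a 32-bit size follows.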
