VVM Bytecode


Introduction

A bytecode file [extension .vvm or .vvm.gz] is a binary program or library module for the Vanilla Virtual Machine.

The Machine

The Vanilla Virtual Machine operates [conceptually] on a higher level than, say, the JVM. It deals directly with things like local variables, types and closures, rather than requiring VVM bytecode to manipulate the stack directly. This has several advantages:

  • Code is denser because it is higher level
    • This means files are smaller and [raw bytecode] programs are smaller in memory
  • Compilers for arbitrary languages can determine a module's information from its binary representation without having to parse its source code.
  • It can - at the VM level - be optimised to a greater extent based on information available only at runtime
  • The specification is smaller, more understandable and easier to maintain
  • It is simpler to create a compiler targeting VVM code
  • It is simpler to create a new VVM

But it also poses several difficulties:

  • Bytecode is more directly related to the source code, which is often undesirable in closed source projects
    • This can easily be countered by automated code mangling, which makes the bytecode less understandable [harder to reverse-engineer] without affecting its functionality or efficiency
  • It must still be able to represent code compiled from a huge group of different languages
    • This will simply add some complexity to the specification - it is not really easier or simpler to make such languages inter-operate on a lower level VM
  • It is a more complicated process to optimise inside the VM
    • This is more a switch of such complexity from compiler to VM than additional complexity

File Format

Each VVM bytecode file corresponds to exactly one source module/file. It is the equivalent of a .class file in Java. All binary values mentioned herein are little endian, unless otherwise specified, and any text is encoded as UTF-8.

Script Line

If the file starts with a '#', the first line [up to \n] of the file must be ignored.

Magic Word

"VVM1" should be the first 4 characters in the file [after the optional script line]
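
The script line and magic word rules can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name strip_header is hypothetical, and it assumes the file has already been read into memory (and decompressed, if it was a .vvm.gz file).

```python
MAGIC = b"VVM1"  # magic word from the section above


def strip_header(data: bytes) -> bytes:
    """Return the bytes that follow the optional script line and the magic word.

    Hypothetical helper for illustration; the name is not from the spec.
    """
    # Optional script line: if the file starts with '#', ignore everything
    # up to and including the first '\n'.
    if data.startswith(b"#"):
        newline = data.find(b"\n")
        if newline == -1:
            raise ValueError("script line is not terminated by a newline")
        data = data[newline + 1:]

    # The magic word must come next.
    if data[:4] != MAGIC:
        raise ValueError("missing VVM1 magic word")
    return data[4:]
```

For example, strip_header(b"#!/usr/bin/vvm\nVVM1...") would skip the script line and the magic word and return only the trailing bytes [the interpreter path here is made up for the example].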

Structured Data

The rest of the file conforms to a structure - similar to an XML document, though simpler. LISP users will be familiar with its textual representation:

(i like (poo in a (bucket)))

Brackets enclose a section, and the first member of each is the op-code or type of the section. For example, a function return operation with the argument 33 can be represented by (return (int32-constant 33)).
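
To make the textual form concrete, here is a minimal reader that turns it into nested Python lists. It is a sketch under assumptions: it treats every atom as a plain string, splits only on whitespace and brackets, and does not handle quoted strings or comments, none of which are specified above. The function names are hypothetical.

```python
def tokenize(text: str) -> list:
    """Split the text into '(' , ')' and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text: str):
    """Parse one bracketed section into nested lists of string atoms."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while True:
                if pos >= len(tokens):
                    raise ValueError("missing ')'")
                if tokens[pos] == ")":
                    pos += 1
                    return items
                items.append(read())
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok  # an atom, kept as a string

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the section")
    return result


# The first member of each list is the op-code or type of the section.
section = parse("(return (int32-constant 33))")
print(section)  # ['return', ['int32-constant', '33']]
```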

Binary Representation

This textual list-based approach is used as input to the assembler and output from the disassembler. However, it is not dense enough for distribution, so actual bytecode files are encoded as follows:

[Single byte section type/opcode] [32-bit section size, if required by type] [the section's data or sub-sections]

Op-Codes and Sections table

Byte Value   Size word?   Name     Description
0x00         Yes          module   Encapsulates the entire module
0x01         Yes          type     An abstract type definition
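
The binary layout can be illustrated with a small encoder and decoder. This is a sketch, not a reference implementation. It assumes the 32-bit size word is unsigned and counts only the bytes of the section's data [not the opcode byte or the size word itself], which the text above does not state. The function names are hypothetical; the opcodes 0x00 and 0x01 come from the table.

```python
import struct

MODULE = 0x00  # from the table: encapsulates the entire module
TYPE = 0x01    # from the table: an abstract type definition


def encode_section(opcode: int, payload: bytes, has_size_word: bool = True) -> bytes:
    """Encode one section: opcode byte, optional little-endian size, then data.

    Assumption: the size word is the length of the payload only.
    """
    header = struct.pack("<B", opcode)
    if has_size_word:
        header += struct.pack("<I", len(payload))
    return header + payload


def decode_section(data: bytes, offset: int = 0):
    """Decode a section that carries a size word.

    Returns (opcode, payload, offset of the byte after the section).
    """
    opcode = data[offset]
    (size,) = struct.unpack_from("<I", data, offset + 1)
    start = offset + 5
    end = start + size
    if end > len(data):
        raise ValueError("section size runs past the end of the data")
    return opcode, data[start:end], end


# A module section containing a single, empty type section.
encoded = encode_section(MODULE, encode_section(TYPE, b""))
print(encoded.hex())  # 00050000000100000000
```

Sub-sections are simply encoded sections placed in the parent's data, so decoding the payload returned for the module yields the nested type section.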
