[RFC007] Bytecode interpreter #2045

yannham · 2024-09-17T15:06:36Z

Although the name is a bit pompous, the goal of this RFC is mostly to be a working document for designing a more compact and efficient run-time representation for Nickel expressions.

While this is something that won't be user-facing (at least not directly), and thus can be changed later without breaking backward compatibility, the technical scope of this effort is large enough that I think it's better to discuss it formally here before attempting a first implementation.

github-actions · 2024-09-17T15:14:06Z

Bencher Report

Branch	2045/merge
Testbed	ubuntu-latest

Benchmark	Latency (ns)
fibonacci 10	488,970.00
foldl arrays 50	1,916,800.00
foldl arrays 500	25,113,000.00
foldr strings 50	7,363,100.00
foldr strings 500	65,138,000.00
generate normal 250	47,324,000.00
generate normal 50	2,059,700.00
generate normal unchecked 1000	59,119,000.00
generate normal unchecked 200	2,911,900.00
pidigits 100	3,252,400.00
pipe normal 20	1,499,800.00
pipe normal 200	12,879,000.00
product 30	843,320.00
scalar 10	1,529,500.00
sum 30	846,870.00

yannham · 2024-10-15T13:07:52Z

Some parts might need refinement, but I think it's in good shape for a first round of reviews.

jneem · 2024-10-16T15:14:28Z

rfcs/007-bytecode-interpreter.md

+#### AST
+
+The first one is an AST and would more or less correspond to the current unique
+representation, minus runtime-specific constructors. We could have gone closer


I think we can also get rid of some efficiency-oriented duplication in the current representation, like the distinction between LetPattern/Let and RecRecord/Record. Getting rid of these would be convenient for both the LSP and the typechecker, I think.

Agreed. In fact I started to draft the first representation and did get rid of the Let and Fun to keep only the pattern variants.
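
For illustration, a minimal sketch of what keeping only the pattern variant of `let` could look like. All type, variant, and function names below are hypothetical and heavily simplified; they are not Nickel's actual definitions.

```rust
/// Hypothetical, simplified pattern type (not Nickel's real one).
#[derive(Debug, Clone, PartialEq)]
pub enum Pattern {
    /// A plain identifier, e.g. `x` in `let x = 1 in x`.
    Any(String),
    /// A record destructuring pattern, e.g. `{a, b}`.
    Record(Vec<String>),
}

/// Hypothetical, simplified AST with a single `Let` form.
#[derive(Debug, Clone, PartialEq)]
pub enum Ast {
    Num(f64),
    Var(String),
    /// A plain binding is just the `Pattern::Any` case, so no separate
    /// `Let`/`LetPattern` split is needed at this level.
    Let {
        pattern: Pattern,
        bound: Box<Ast>,
        body: Box<Ast>,
    },
}

/// Consumers (an LSP or a typechecker, say) only have one `Let` arm to handle.
pub fn bound_idents(ast: &Ast) -> Vec<&str> {
    match ast {
        Ast::Let { pattern: Pattern::Any(id), .. } => vec![id.as_str()],
        Ast::Let { pattern: Pattern::Record(fields), .. } => {
            fields.iter().map(String::as_str).collect()
        }
        _ => Vec::new(),
    }
}
```

The efficiency-oriented split could then be reintroduced, if needed, only in the later run-time representation.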

jneem · 2024-10-16T15:24:25Z

rfcs/007-bytecode-interpreter.md

+record and the empty array, for example with `enum Record { Empty,
+NonEmpty(RecordData) }`. This should use the same space as `RecordData` in Rust
+(if `RecordData` is a pointer, at least) and save an allocation for empty
+structures.


I'm a bit confused about what discriminants are present in the "top-level" representation (I'm not sure what's the right term, but I'm talking about the word-sized thing that packs in a pointer with some discriminant). Above, it sounded like we were only going to inline null and boolean in the discriminant; here, you're also proposing to put empty records and empty arrays? It seems like there isn't enough room in the pointer alignment for all of these.

By the way, x86_64, aarch64, and riscv-64 all max out at 48 bits of address space. So on these architectures we can pack lots more stuff at the most-significant end of the top-level representation.

> By the way, x86_64, aarch64, and riscv-64 all max out at 48 bits of address space. So on these architectures we can pack lots more stuff at the most-significant end of the top-level representation.

Yes, but it's less portable. Although we don't need Nickel to run on embedded, I think we can do with one bit for now, and explore those other possibilities later.
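
To make the trade-off concrete, here is a rough sketch of high-bit packing on a plain 64-bit word. It assumes that every address fits in the low 48 bits with the upper 16 bits zero, which is exactly the non-portable assumption at stake; the constant and function names are made up for the example, and nothing is dereferenced.

```rust
/// Assumed number of significant address bits (not true on every platform).
const ADDR_BITS: u32 = 48;
const ADDR_MASK: u64 = (1u64 << ADDR_BITS) - 1;

/// Packs a 16-bit tag into the most-significant end of a word.
/// Returns `None` if the address does not fit the 48-bit assumption.
pub fn pack(addr: u64, tag: u16) -> Option<u64> {
    if addr & !ADDR_MASK != 0 {
        return None;
    }
    Some(((tag as u64) << ADDR_BITS) | addr)
}

/// Splits a packed word back into `(address, tag)`.
pub fn unpack(word: u64) -> (u64, u16) {
    (word & ADDR_MASK, (word >> ADDR_BITS) as u16)
}
```

The `None` branch is where portability bites: a platform handing out wider addresses would make the scheme unusable, whereas alignment bits at the least-significant end are available everywhere.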

> I'm a bit confused about what discriminants are present in the "top-level" representation (I'm not sure what's the right term, but I'm talking about the word-sized thing that packs in a pointer with some discriminant). Above, it sounded like we were only going to inline null and boolean in the discriminant; here, you're also proposing to put empty records and empty arrays? It seems like there isn't enough room in the pointer alignment for all of these.

Your first understanding is right. At the top level, there is only one discriminant, which distinguishes bool, null and pointer. If we follow the pointer, we find many representations: arrays, numbers, etc. Here I'm talking about the representation of the pointee, which can itself be a pointer to something else (typically, I guess your immutable vec representation would be mostly a pointer to the root plus some parameters). In a sense the 1-word representation can hold any data, and I'm talking about this specific data when the pointee represents a record. Does that make sense?

Also, since we need a discriminant at the beginning of the pointee (is it an Array? a Record? etc.) and that discriminant will need to be aligned (although it might be merged with some other metadata as well), we'll probably have some more space here. We can even special-case EmptyArray and EmptyRecord as dedicated discriminants, instead of bothering to make Record an actual enum.
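
A rough sketch of that idea, assuming an 8-byte-aligned header at the start of every pointee. The `Kind` and `Header` names, the set of variants, and the 7 metadata bytes are all illustrative assumptions, not the RFC's actual layout.

```rust
use std::mem::size_of;

/// Hypothetical pointee discriminant, one byte wide.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Number,
    Array,
    Record,
    /// Special discriminants: no payload follows the header.
    EmptyArray,
    EmptyRecord,
}

/// Hypothetical header placed at the beginning of the pointee. Because it
/// must be aligned anyway, the bytes after the tag are free for metadata.
#[repr(C, align(8))]
#[derive(Debug, Clone, Copy)]
pub struct Header {
    pub kind: Kind,
    pub meta: [u8; 7],
}

impl Header {
    pub fn new(kind: Kind) -> Self {
        Header { kind, meta: [0; 7] }
    }

    /// Empty containers are fully described by the discriminant alone.
    pub fn has_payload(&self) -> bool {
        !matches!(self.kind, Kind::EmptyArray | Kind::EmptyRecord)
    }
}

/// Size of the header in bytes: the tag plus the spare metadata bytes.
pub fn header_size() -> usize {
    size_of::<Header>()
}
```

With this layout a one-byte discriminant leaves room for many more than five kinds, so adding the two empty cases costs nothing in space.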

Even on 32-bit, we can have lots more values at the top level, right? Assuming our pointers are all 4-byte aligned, anything ending in 01, 10, or 11 is not a pointer. But then for each of those non-pointer values we still have 30 bits left to store actual data. So couldn't we have variants for EmptyArray and EmptyRecord without even following that single top-level pointer? And we'd still have room for small integers...

Sure, we have a lot more room. It's just that you can't encode anything that needs a full machine word or more, which is the case for most non-trivial data structures. This is typically why OCaml has 31-bit and 63-bit integers: so that they can be unboxed. But indeed, special values like empty containers could in theory also be put directly at the top level.
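
As a sketch of the scheme discussed in the last two comments, here is one possible 2-bit tagging of a 32-bit word, simulated with a plain `u32`. The tag assignment (`00` pointer, `01` small integer, `10` special constant, `11` unused) and all names are assumptions made for the example, not something the RFC specifies.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decoded {
    Pointer(u32),
    SmallInt(i32),
    Null,
    Bool(bool),
    EmptyArray,
    EmptyRecord,
}

const TAG_MASK: u32 = 0b11;
const TAG_PTR: u32 = 0b00; // 4-byte-aligned pointers end in 00
const TAG_INT: u32 = 0b01; // 30-bit signed integer in the upper bits
const TAG_CONST: u32 = 0b10; // small table of special constants

pub const NULL: u32 = TAG_CONST;
pub const FALSE: u32 = (1 << 2) | TAG_CONST;
pub const TRUE: u32 = (2 << 2) | TAG_CONST;
pub const EMPTY_ARRAY: u32 = (3 << 2) | TAG_CONST;
pub const EMPTY_RECORD: u32 = (4 << 2) | TAG_CONST;

/// Encodes `n` inline if it fits in 30 signed bits; otherwise it would
/// have to be boxed behind a pointer.
pub fn encode_int(n: i32) -> Option<u32> {
    if (-(1 << 29)..(1 << 29)).contains(&n) {
        Some(((n as u32) << 2) | TAG_INT)
    } else {
        None
    }
}

pub fn decode(word: u32) -> Option<Decoded> {
    match word & TAG_MASK {
        TAG_PTR => Some(Decoded::Pointer(word)),
        // The arithmetic shift restores the sign of negative integers.
        TAG_INT => Some(Decoded::SmallInt((word as i32) >> 2)),
        TAG_CONST => match word >> 2 {
            0 => Some(Decoded::Null),
            1 => Some(Decoded::Bool(false)),
            2 => Some(Decoded::Bool(true)),
            3 => Some(Decoded::EmptyArray),
            4 => Some(Decoded::EmptyRecord),
            _ => None,
        },
        _ => None, // tag 11 is left unused in this sketch
    }
}
```

The `None` case of `encode_int` is the limitation mentioned above: anything needing the full 32 bits cannot be stored inline.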

Ok, makes sense. I was just confused about what discriminants were where. Our current Vector has a root: Option<Rc<Node>>, so it's already avoiding allocation for empty arrays.
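
A small sketch of why neither form costs extra space or an allocation for the empty case. `Node` and `RecordData` are stand-in definitions, and `RecordData` is assumed to sit behind a `Box`; the sizes reported are what current rustc produces through its null-pointer (niche) optimization.

```rust
use std::mem::size_of;
use std::rc::Rc;

/// Stand-in for the vector's tree node (hypothetical contents).
pub struct Node {
    pub children: Vec<u32>,
}

/// Shape mentioned above: an empty vector is `root: None`, no allocation.
pub struct Vector {
    pub root: Option<Rc<Node>>,
    pub len: usize,
}

impl Vector {
    pub fn new() -> Self {
        Vector { root: None, len: 0 }
    }
}

/// Stand-in for the record payload (hypothetical contents).
pub struct RecordData {
    pub fields: Vec<(String, u32)>,
}

/// Shape from the RFC excerpt, assuming the payload is behind a pointer.
pub enum Record {
    Empty,
    NonEmpty(Box<RecordData>),
}

/// Sizes in bytes of: the bare `Rc`, the optional `Rc`, the bare `Box`,
/// and the two-variant `Record` enum.
pub fn sizes() -> [usize; 4] {
    [
        size_of::<Rc<Node>>(),
        size_of::<Option<Rc<Node>>>(),
        size_of::<Box<RecordData>>(),
        size_of::<Record>(),
    ]
}
```

All four sizes come out as one machine word, because the null pointer value is reused to encode `None` and `Empty`.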

First draft (incomplete) of RFC007

f038447

Start writing about OCaml abstract machine

ca3ea52

Add some criterion on the VM comparisons

fead82f

Complete the OCaml VM description

55c763f

First bit about the Lua VM

00355f1

More on Lua VM

bd12104

Improve previous text, start V8 section

16cfc00

More on V8; timid start of Haskell

48b804c

Small chunk on Haskell/STG

6cebc7e

yannham mentioned this pull request Oct 2, 2024

[Performance] Reduce size of term #2022

Closed

More content on Tvix; drafty draft of a proposal

b64c323

Pass on the whole document, more details on V8 closures

dd1d007

More on STG and Tvix, and a bit more raw notes on the proposal

3aaf162

Full pass on existing VMs, a few more raw notes on proposal

6a220c3

yannham added 2 commits October 9, 2024 18:20

More of the proposal

6dfb394

More proposal

58a9e91

yannham added 4 commits October 11, 2024 15:25

More proposal

5aea596

More proposal

6c76a9e

Pass on most of the proposal

08c213e

Small pass on the STG, remove useless and vague paragraph

c2073a3

yannham marked this pull request as ready for review October 15, 2024 13:06

yannham requested review from aspiwack and jneem October 15, 2024 13:07

jneem approved these changes Oct 16, 2024
