Singing a new TOON, a more efficient data format for AI

The Token-Oriented Object Notation (TOON) format is here. Much as JSON trimmed the bulk of XML, TOON aims to do the same to JSON. Why? Tokens cost money! Developed by Johann Schopplich, TOON reduces the size of the data (by 30-50%), and therefore the number of tokens burned when ingesting it. From his GitHub page:


Why TOON?

AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money – and standard JSON is verbose and token-expensive:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

YAML conveys the same information with fewer tokens:

users:
  - id: 1
    name: Alice
    role: admin
  - id: 2
    name: Bob
    role: user

TOON conveys the same information with even fewer tokens:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
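
To make the tabular layout above concrete, here is a rough Python sketch that produces it for the simplest case: a non-empty list of flat records that all share the same fields. The function name `to_toon_table` is made up for illustration and is not part of any TOON library. It skips quoting, nesting, and type handling (booleans, nulls), so treat it as a picture of the idea rather than a conforming encoder.

```python
import json


def to_toon_table(key, rows):
    """Encode a non-empty list of flat, uniform dicts in TOON's tabular style.

    Simplified sketch: no quoting or escaping, no nested values.
    """
    if not rows:
        raise ValueError("this sketch only handles non-empty lists")

    # The field list is declared once in the header instead of on every row.
    fields = list(rows[0].keys())
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"

    lines = [header]
    for row in rows:
        if list(row.keys()) != fields:
            raise ValueError("every row must have the same fields in the same order")
        cells = [str(row[field]) for field in fields]
        for cell in cells:
            # Values containing the delimiter or a newline would need quoting,
            # which this sketch does not implement.
            if "," in cell or "\n" in cell:
                raise ValueError("value needs quoting: " + repr(cell))
        lines.append("  " + ",".join(cells))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

toon_text = to_toon_table("users", users)
print(toon_text)

# Character counts are only a rough proxy for token counts,
# but they show where the savings come from.
print(len(json.dumps({"users": users})), "characters as JSON")
print(len(toon_text), "characters as TOON")
```

The saving comes from stating the field names once in the header rather than repeating them in every object.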

There are .NET libraries that convert C# objects to TOON. Maybe we’ll see them included for Epicor Functions someday. :person_shrugging:

8 Likes

CSV is making a comeback, baby!

6 Likes

TOON is if CSV and YAML had a baby!

5 Likes

I’ve done the same thing by converting the JSON to plain object arrays and including the schema alongside them.

A little extra coding work, but works fantastic.
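
The poster did not share code, so the following is only a guess at what that approach might look like in Python: the field names are pulled out once as a schema, and each record becomes a plain array of values. The function name `to_schema_and_rows` and the `schema` / `rows` keys are assumptions chosen for illustration.

```python
import json


def to_schema_and_rows(records):
    """Split a list of uniform dicts into one schema list plus value arrays."""
    if not records:
        return {"schema": [], "rows": []}

    # Take the field order from the first record; a record missing one of
    # these fields will raise KeyError below.
    schema = list(records[0].keys())
    rows = [[record[field] for field in schema] for record in records]
    return {"schema": schema, "rows": rows}


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

compact = to_schema_and_rows(users)

# Compact separators drop the optional whitespace from the JSON output.
print(json.dumps(compact, separators=(",", ":")))
```

The result is still valid JSON, so it needs no special parser on the receiving end, while avoiding the repeated keys.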

2 Likes

Looks like a few bleeding-edge VS Code extensions are up.

1 Like