
Rule proposal: prefer TypedArrays over node:buffer #1885

Open · 3 tasks
jimmywarting opened this issue Jan 22, 2023 · 5 comments

Comments

@jimmywarting
Contributor

Description

This rule is mostly targeted at cross-platform compatibility. node:buffer is a Node-specific module, and I mostly see it as something that bloats browser bundles and runs slower (in non-Node.js contexts, where npm:buffer is used in place of node:buffer).

  • TextEncoder / TextDecoder can be used instead to turn your buffer to / from string / Uint8Array, and in the browser they are actually much faster than Buffer.from(str) (which originally used string_decoder). See the sketch after this list.
  • DataView can be used instead for reading / writing bytes.
  • base64 and hex strings should be discouraged because they take up more bandwidth and aren't well suited for JSON or large data. There are better ways to send things in binary format, like FormData or just uploading raw data. Encoding / decoding also wastes processing time, and strings in V8 are capped at 512 MiB; now we have TypedArrays that handle large data better.
  • buffer.slice also overrides uint8array.slice with subarray behavior, which is unexpected, and it is now deprecated as well.
  • The buffer userland polyfill for the browser is also lagging behind.
  • Buffer.isBuffer can be replaced with instanceof Uint8Array to accept both Uint8Array and node:buffer. Something even better would be to accept any TypedArray using ArrayBuffer.isView.
  • TextDecoder is also able to handle a BOM, unlike buf.toString().
  • Node.js internal modules have also already become more accepting of any TypedArray, whereas initially they only supported Buffer and wouldn't accept a Uint8Array.

I bet if you read this issue then maybe you will also be convinced that node:buffer is just unnecessary.
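
A minimal sketch of those replacements (the variable names are just illustrative):

// Strings <-> bytes without node:buffer
const bytes = new TextEncoder().encode('hello world') // Uint8Array
const text = new TextDecoder().decode(bytes) // strips a leading BOM, unlike buf.toString()

// Type checks that accept plain Uint8Arrays as well as Buffers
console.log(bytes instanceof Uint8Array) // also true for Buffer, which extends Uint8Array
console.log(ArrayBuffer.isView(bytes)) // true for any TypedArray or DataView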

Fail

Buffer.from('hello world')

Pass

new TextEncoder().encode('hello world')

Additional Info

Replacing node:buffer with Uint8Array can be a lot of work, and maybe not that easy to autofix.
Dividing it up into smaller tasks could be easier, like:

  • prefer TextDecoder over buf.toString() and node:string_decoder
  • prefer instanceof Uint8Array over Buffer.isBuffer, or something like that...
  • prefer DataView over the read/write methods (a Fail/Pass sketch follows this list)
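
For the DataView rule, a rough Fail/Pass sketch (the values are just illustrative):

Fail

const buf = Buffer.alloc(8)
buf.writeUInt32BE(0xdeadbeef, 0)
const value = buf.readUInt32BE(0)

Pass

const bytes = new Uint8Array(8)
const view = new DataView(bytes.buffer)
view.setUint32(0, 0xdeadbeef, false) // false = big-endian, matching the BE methods
const value = view.getUint32(0, false)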
@voxpelli
Member

Is there a rule for this in some plug-in?

@LinusU
Member

LinusU commented Jan 25, 2023

Generally I'm positive towards using TypedArrays instead; that's how I work myself. It might be a bit too strict though, and I think we should evaluate each case thoroughly to make sure it's strictly better in all ways.

base64 and hex strings should be discouraged because they take up more bandwidth and aren't well suited for JSON or large data

Hex is commonly used for e.g. ids or for writing out small numbers.

If I recall correctly, Base64 inside JSON that is then GZip-compressed is almost as small as the raw data. And it has the big upside of working in a lot of places that already support JSON (e.g. GraphQL comes to mind).
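
A quick way to sanity-check that (a rough Node sketch; exact numbers depend on the input, and random bytes stand in here for already-compressed data like images):

import { gzipSync } from 'node:zlib'
import { randomBytes } from 'node:crypto'

const raw = randomBytes(1024 * 1024) // 1 MiB of incompressible data
const json = JSON.stringify({ data: raw.toString('base64') })

// gzip recovers most of the 4/3 Base64 expansion, so the two end up close in size
console.log('raw, gzipped:', gzipSync(raw).length)
console.log('base64 in JSON, gzipped:', gzipSync(json).length)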

I don't think that Standard should disallow Hex or Base64 encodings...

prefer TextDecoder over buf.toString() and node:string_decoder

This sounds nice, but we should benchmark that there isn't a penalty to creating a new TextEncoder() for every encode call. I know that when we changed some code to use the Intl date formatter, creating a new instance every time we formatted a date was a severe bottleneck.
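
A minimal benchmark sketch for that (the iteration count is arbitrary):

const N = 1e6

console.time('new TextEncoder() per call')
for (let i = 0; i < N; i++) new TextEncoder().encode('hello world')
console.timeEnd('new TextEncoder() per call')

const shared = new TextEncoder()
console.time('shared TextEncoder')
for (let i = 0; i < N; i++) shared.encode('hello world')
console.timeEnd('shared TextEncoder')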

@voxpelli
Member

I would expect a rule like this to appear in e.g. https://github.com/sindresorhus/eslint-plugin-unicorn before we add it here.

standard to me isn’t about pushing a change in the ecosystem but rather to help the ecosystem conform to the very standards it has already adopted and thus help it stay consistent with itself.

There should be a rule for this somewhere if the ecosystem has adopted it. Then we can add it here and check how much it breaks; when the breakage is small enough we can open PRs to fix the last few places that break, and then merge it here.

@jimmywarting
Contributor Author

jimmywarting commented Jan 25, 2023

if you want to send both JSON and binary data then you could do something like

const files = [file1, file2, file3]
const fd = new FormData()

fd.set('json', JSON.stringify({
  lastModified: files.map(file => file.lastModified)
}))
files.forEach(file => fd.append('fileUploads', file))
fetch('/upload', { body: fd, ... })

and then on the receiving (server) side, you could do:

async function income(req, res) {
  const fd = await new Response(req, { headers: req.headers }).formData()
  const files = fd.getAll('fileUploads')
  const json = JSON.parse(fd.get('json'))
}

And there you have a transport that supports both JSON and binary data.

You could even do the receiving part on the client side with fetch('/api/xyz').then(res => res.formData()), where you encode the payload into a FormData on the server side.


Another thing I like to do is simply encode a custom central JSON "directory" blob that describes how large each file is, and then encode everything into a new Blob with the binary parts appended after the JSON:
new Blob([json_size_in_binary, json, ...files]). Then I split the JSON back off from the binary part, as sketched below.
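
A rough sketch of that layout, assuming a 4-byte big-endian length prefix (just one possible choice) and that file1 and file2 are Blobs already in scope:

// Build: [4-byte JSON length][JSON directory][file bytes...]
const parts = [file1, file2]
const directoryJson = new TextEncoder().encode(JSON.stringify({ sizes: parts.map(f => f.size) }))
const sizeField = new Uint8Array(4)
new DataView(sizeField.buffer).setUint32(0, directoryJson.byteLength, false)
const packed = new Blob([sizeField, directoryJson, ...parts])

// Parse: read the length prefix, then split the JSON off from the binary tail
const jsonSize = new DataView(await packed.slice(0, 4).arrayBuffer()).getUint32(0, false)
const directory = JSON.parse(await packed.slice(4, 4 + jsonSize).text())
const binary = packed.slice(4 + jsonSize) // still just a reference, no copy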

I have also started to like and use cbor-x, because it works better than any JSON format. (It recently got support for encoding Blobs as well, without having to read them until you actually need them, much like how the Blob constructor only creates reference points instead of copying the data.)

Honestly I wish we got Response.prototype.cbor instead... browsers already use CBOR a lot, e.g. for WebAuthn.

@LinusU
Member

LinusU commented Jan 26, 2023

if you want to send both JSON and binary data then you could do something like [...]

This doesn't help me if I want to add e.g. a thumbnail field to something in my GraphQL server. That would require swapping out the transport layer, and all clients would have to be updated.

If Base64 inside JSON that is then GZip-compressed is comparable in size, then what benefit do I get by doing all that work?


That said, CBOR seems cool and it would be interesting to use it instead of JSON as the transport layer in e.g. GraphQL. But I think that change needs to come from other projects rather than from a linter pushing it.

standard to me isn’t about pushing a change in the ecosystem but rather to help the ecosystem conform to the very standards it has already adopted and thus help it stay consistent with itself.

I think that this was very well put 👍
