
Chat: sync token limit at model import time #3486

Merged: 17 commits merged into main from bee/sync-model-token on Mar 22, 2024
Conversation

@abeatrix (Contributor):

Building block for #3455

Previously, we had to check the authStatus and config every time we wanted to get the token limit for a model. This was inefficient: every caller needed access to authProvider and repeated the same config checks.

With this change, we sync the token limit along with the model when we import models. By including the contextWindow property in the model definitions, we can access the token limit for each model anywhere, without repeating the checks we used to do at import time.

This eliminates scattered token limit checks throughout the codebase and centralizes the token limit configuration within the model definitions, making the code easier to understand, maintain, and modify.
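To make that concrete, here is a minimal sketch of the shape this gives each model definition. The interface, field names beyond contextWindow, and the values are illustrative assumptions, not the actual definitions in lib/shared/src/models/dotcom.ts:

```ts
// Illustrative only: field names beyond `contextWindow` are assumptions.
interface ChatModel {
    model: string // e.g. 'anthropic/claude-2.1'
    default: boolean
    contextWindow: number // token limit, fixed when the model is imported
}

const defaultModels: ChatModel[] = [
    // The contextWindow value here is hypothetical, not the real limit.
    { model: 'anthropic/claude-2.1', default: true, contextWindow: 7000 },
]

// Any caller can read the limit directly, with no authStatus or config checks:
const tokenLimit = defaultModels[0].contextWindow
```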

Test plan

Check token limit

@dominiccooney (Contributor) left a comment:

Thanks for working on this! Some suggestions for naming to help people catch mixing up byte/character and token counts.

lib/shared/src/models/dotcom.ts (outdated, resolved):
```ts
) {
    const splittedModel = model.split('/')
    this.provider = getProviderName(splittedModel[0])
    this.title = splittedModel[1]?.replaceAll('-', ' ')
    this.default = isDefaultModel
    this.default = true
    this.contextWindow = tokenLimit ? tokenLimit * 4 : DEFAULT_FAST_MODEL_TOKEN_LIMIT
```
@dominiccooney (Contributor):

It would be great to put `tokenLimit * 4` inside a function, say `tokenCountToByteCount(tokenLimit)`, so that when we make this accurate we don't have to hunt the codebase for the digit 4 and can instead find uses of that function.

@abeatrix (Author):

@dominiccooney Thank you for pointing this out! I have renamed them to use character limit (since we already have a constant called `CHARS_PER_TOKEN` and a function called `tokensToChars`) and token limit, and updated the code based on your feedback.
I also added unit tests to cover the token and character limits.
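For reference, a minimal sketch of the pattern under the new names. The constant's value of 4 is assumed from the `tokenLimit * 4` expression in the excerpt above; the actual implementation in the codebase may differ:

```ts
// Assumed value, taken from the `tokenLimit * 4` expression above.
export const CHARS_PER_TOKEN = 4

// A named conversion instead of a bare `* 4`: when the estimate is made
// accurate later, there is one place to change, and call sites are findable.
export function tokensToChars(tokens: number): number {
    return tokens * CHARS_PER_TOKEN
}
```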

lib/shared/src/models/index.ts (outdated, resolved)
@abeatrix requested review from @dominiccooney and a team on March 21, 2024.
@dominiccooney (Contributor) left a comment:

Thanks for making it clear about tokens vs chars.

We can simplify this more. See how `ModelProvider.get` has a side effect, overwriting the primary models if the user is a dotcom user? It is surprising for a "get" to change state. And see how many auth status listeners push auth status to the model provider? That's too much. This can be simplified:

  • Make a class for holding all the models.
  • Move the statics from `ModelProvider` so they become instance properties of the new class.
  • Create an instance of the class when the extension starts.
  • Subscribe this new instance to auth status changes.
  • Do the dotcom handling along with the rest of the auth status change handling.
  • Then you don't need to call `setModelProvider` from all the other places where auth status changes.

Ideally, pass the instance of the new class to every place that needs models. But if that is too hard in one patch, you can make one static method to hang onto the single instance of it and we will clean it up later.
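A rough sketch of the shape being suggested. `ModelsService`, `AuthStatus`, `getDotComDefaultModels`, and the subscription signature are illustrative names and assumptions, not the actual API:

```ts
// Minimal stand-in types; the real ones live in lib/shared.
interface AuthStatus {
    isDotCom: boolean
}

interface ModelProvider {
    model: string
    contextWindow: number
}

// Assumed helper returning the dotcom default model list.
declare function getDotComDefaultModels(): ModelProvider[]

class ModelsService {
    // Former statics on ModelProvider become instance properties here.
    private primaryModels: ModelProvider[] = []

    constructor(onAuthStatusChange: (cb: (status: AuthStatus) => void) => void) {
        // One subscription, created when the extension starts, replaces the
        // N scattered listeners that pushed auth status to the model provider.
        onAuthStatusChange(status => this.handleAuthStatusChange(status))
    }

    private handleAuthStatusChange(status: AuthStatus): void {
        // Dotcom handling lives with the rest of the auth handling, instead
        // of as a surprising side effect inside a `get`.
        this.primaryModels = status.isDotCom ? getDotComDefaultModels() : []
    }

    getModels(): ModelProvider[] {
        // A pure read: `get` no longer mutates state.
        return this.primaryModels
    }
}
```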

lib/shared/src/models/index.ts (outdated, resolved)
vscode/src/chat/chat-view/SimpleChatPanelProvider.ts (outdated, resolved)
@dominiccooney (Contributor) left a comment:

Very nice. Some more feedback inline; I leave it up to your judgement.

lib/shared/src/models/index.ts (outdated, resolved):
```ts
if (process.env.CODY_SHIM_TESTING === 'true') {
    return []
}
// TODO (bee) watch file change to determine if a new model is added
```
@dominiccooney (Contributor):

There's no file to watch?

@abeatrix (Author):

I was thinking we could watch for changes in ~/.ollama/models/manifests/registry.ollama.ai/library, where the Ollama models are stored.
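A minimal sketch of what that watcher could look like with Node's fs.watch; `refreshOllamaModels` is a hypothetical hook into the model registry, and recursive watching has platform caveats:

```ts
import { watch } from 'node:fs'
import { homedir } from 'node:os'
import { join } from 'node:path'

// The directory mentioned above, where Ollama keeps its model manifests.
const ollamaLibrary = join(
    homedir(),
    '.ollama/models/manifests/registry.ollama.ai/library'
)

// Refresh the model list whenever a manifest is added, removed, or changed.
// Note: { recursive: true } is supported on macOS and Windows, and on Linux
// only in recent Node.js versions.
watch(ollamaLibrary, { recursive: true }, (_eventType, filename) => {
    console.log(`Ollama manifest changed: ${filename ?? 'unknown'}`)
    // refreshOllamaModels() // hypothetical hook into the model registry
})
```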

lib/shared/src/models/index.ts (outdated, resolved):
```ts
        ])
    )
}
setModelProviders(authStatus)
```
@dominiccooney (Contributor):

I guess one place is better than N places... Because the models are static, I think we should wire up setting the model providers to auth status at the top level. For example, if we made `ChatManager` lazier but still queried models for edits, we would have a bug.

Having less global state, and giving the model provider object responsibility for keeping itself up-to-date instead of free-riding on nearby objects, will make it easier to keep evolving this code.
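Continuing the illustrative `ModelsService` sketch from the earlier comment, the top-level wiring might look like this (the entry point and parameter shape are hypothetical):

```ts
// Hypothetical extension entry point; names are illustrative.
export function activate(authProvider: {
    onAuthStatusChange(cb: (status: AuthStatus) => void): void
}): ModelsService {
    // Wire models to auth status once, at the top level, so ChatManager and
    // friends never need to call setModelProviders themselves.
    const modelsService = new ModelsService(cb => authProvider.onAuthStatusChange(cb))
    return modelsService
}
```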

@abeatrix merged commit da1fe66 into main on Mar 22, 2024. 20 checks passed.
@abeatrix deleted the bee/sync-model-token branch on March 22, 2024.