Chat: sync token limit at model import time #3486
Conversation
Thanks for working on this! Some suggestions for naming to help people catch mixing up byte/character and token counts.
...bindings/kotlin/lib/src/main/kotlin/com/sourcegraph/cody/protocol_generated/ModelProvider.kt
lib/shared/src/models/index.ts
```ts
) {
    const splittedModel = model.split('/')
    this.provider = getProviderName(splittedModel[0])
    this.title = splittedModel[1]?.replaceAll('-', ' ')
    this.default = isDefaultModel
    this.default = true
    this.contextWindow = tokenLimit ? tokenLimit * 4 : DEFAULT_FAST_MODEL_TOKEN_LIMIT
```
It would be great to put `tokenLimit * 4` inside a function, say `tokenCountToByteCount(tokenLimit)`, so that when we make this accurate we don't have to hunt the codebase for the digit "4" and can instead find uses of that function.
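A minimal sketch of the suggested helper. The names (`tokenCountToByteCount`, `BYTES_PER_TOKEN`) are assumptions for illustration, not the actual Cody identifiers:

```ts
// Rough approximation used at the call site today: ~4 bytes/chars per token.
// Naming the constant and wrapping the math makes the "4" easy to find and
// refine later, instead of grepping for a bare digit.
const BYTES_PER_TOKEN = 4

function tokenCountToByteCount(tokenCount: number): number {
    return tokenCount * BYTES_PER_TOKEN
}

// The assignment above would then read:
// this.contextWindow = tokenLimit
//     ? tokenCountToByteCount(tokenLimit)
//     : DEFAULT_FAST_MODEL_TOKEN_LIMIT
```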
@dominiccooney Thank you for pointing this out! I have renamed them to use character limit (since we already have a constant called `CHARS_PER_TOKEN` and a function called `tokensToChars`) and token limit, and updated the code based on your feedback!
Also added some unit tests to cover the token and character limits.
…/protocol_generated/ModelProvider.kt Co-authored-by: Dominic Cooney <dominic.cooney@sourcegraph.com>
Thanks for making it clear about tokens vs chars.
We can simplify this more. See how `ModelProvider.get` has a side effect, overwriting the primary models if the user is a dotcom user? It is surprising to have "get" change state. See how many auth status listeners push auth status to the model provider? That's too much. That can be simplified:
- Make a class for holding all the models.
- Move the statics from ModelProvider, so they become instance properties of the new class.
- Create an instance of the class when the extension starts.
- Subscribe this new instance to auth status changes.
- Do the dotcom handling along with the rest of the auth status change handling.
- Then you don't need to call `setModelProvider` from all the other places auth status changes.
Ideally, pass the instance of the new class to every place that needs models. But if that is too hard in one patch, you can make one static method to hang onto the single instance of it and we will clean it up later.
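The steps above could be sketched roughly like this. All names here (`ModelsService`, `AuthStatus`, the subscribe callback shape) are hypothetical, not the actual Cody APIs:

```ts
// Illustrative types standing in for the real ones.
interface Model { title: string; contextWindow: number }
interface AuthStatus { isDotCom: boolean }

// One class owns all the models and keeps itself up to date by subscribing
// to auth status changes, instead of N call sites pushing state into statics.
class ModelsService {
    private models: Model[] = []

    constructor(subscribe: (listener: (status: AuthStatus) => void) => void) {
        // One subscription at startup replaces scattered setModelProvider calls.
        subscribe(status => this.onAuthStatusChange(status))
    }

    private onAuthStatusChange(status: AuthStatus): void {
        // Dotcom handling lives with the rest of the auth status handling.
        this.models = status.isDotCom ? this.dotComModels() : this.enterpriseModels()
    }

    private dotComModels(): Model[] {
        return [{ title: 'fast model', contextWindow: 7000 }]
    }

    private enterpriseModels(): Model[] {
        return [] // populated from site config in the real implementation
    }

    // A plain getter with no side effects, unlike the current ModelProvider.get.
    getModels(): Model[] {
        return this.models
    }
}
```

The instance would be created at extension startup and passed to consumers (or, as a stopgap, exposed via a single static accessor).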
Very nice. Some more feedback inline I leave up to your judgement.
lib/shared/src/models/index.ts
Outdated
```ts
if (process.env.CODY_SHIM_TESTING === 'true') {
    return []
}
// TODO (bee) watch file change to determine if a new model is added
```
There's no file to watch?
I was thinking we can watch for changes in `~/.ollama/models/manifests/registry.ollama.ai/library`, where the ollama models are stored.
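A rough sketch of that idea using Node's built-in `fs.watch`. The directory path and the function name are assumptions; debouncing and error handling are omitted:

```ts
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Directory where Ollama stores model manifests for pulled library models.
const manifestsDir = path.join(
    os.homedir(),
    '.ollama/models/manifests/registry.ollama.ai/library'
)

// Invoke onChange whenever a manifest is added or removed, so newly pulled
// models can be picked up without restarting the extension.
function watchOllamaModels(onChange: () => void): fs.FSWatcher | undefined {
    if (!fs.existsSync(manifestsDir)) {
        return undefined // Ollama not installed or no models pulled yet
    }
    return fs.watch(manifestsDir, () => onChange())
}
```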
```ts
    ])
)
}
setModelProviders(authStatus)
```
I guess one place is better than N places... Because the models are static, I think we should wire up setting the model providers to auth status at the top level. For example, say we make `ChatManager` lazier but still query models for edits, then we have a bug.
Having less global state, and giving the model provider object responsibility for keeping itself up-to-date instead of free-riding on nearby objects, will make it easier to keep evolving this code.
Building block for #3455
Previously, we had to check the `authStatus` and `config` every time we wanted to get the token limit for a model. This approach was inefficient, as it required the caller to have access to `authProvider` and to repeat the same config checks.

With this change, we sync the token limit along with the model when we import models. By including the `contextWindow` property in the model definitions, we can access the token limit for each model anywhere, without repeating the checks we did when importing the model.

This eliminates scattered token limit checks throughout the codebase, making the code more maintainable. It centralizes the token limit configuration within the model definitions, making it easier to understand and modify.
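The described change can be sketched as follows. The shapes and names (`importModel`, the default constant's value) are illustrative, not the actual Cody definitions:

```ts
// Each model definition carries its own contextWindow, resolved once at
// import time, so callers just read the property instead of re-checking
// authStatus/config everywhere.
interface ModelProvider {
    model: string
    contextWindow: number // token limit, fixed when the model is imported
}

const DEFAULT_FAST_MODEL_TOKEN_LIMIT = 7000

function importModel(model: string, tokenLimit?: number): ModelProvider {
    return {
        model,
        contextWindow: tokenLimit ?? DEFAULT_FAST_MODEL_TOKEN_LIMIT,
    }
}

// Anywhere in the codebase, no auth/config checks needed:
// const limit = provider.contextWindow
```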
Test plan
Check token limit