docs/code-search/code-navigation/precise_code_navigation.mdx (1 addition & 1 deletion)
@@ -48,7 +48,7 @@ Precise Code Navigation requires language-specific indexes to be generated and u
  | Rust |[rust-analyzer](https://sourcegraph.com/github.com/rust-lang/rust-analyzer)| 🟢 Generally available |
  | Python |[scip-python](https://sourcegraph.com/github.com/sourcegraph/scip-python)| 🟢 Generally available |
  | Ruby |[scip-ruby](https://sourcegraph.com/github.com/sourcegraph/scip-ruby)| 🟢 Generally available |
- | C#, Visual Basic |[scip-dotnet](https://github.com/sourcegraph/scip-dotnet)| 🟡 Partially available |
+ | C#, Visual Basic |[scip-dotnet](https://github.com/sourcegraph/scip-dotnet)| 🟢 Generally available |

The easiest way to configure precise code navigation is with [auto-indexing](/code-search/code-navigation/auto_indexing). This feature uses [Sourcegraph executors](/admin/executors/) to automatically create indexes for the code, keeping precise code navigation available and up-to-date.
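If auto-indexing is not available, an index can also be produced locally with one of the indexers above and uploaded with [src-cli](https://github.com/sourcegraph/src-cli). A rough sketch using scip-python follows; indexer flags vary by version, and the project name here is a placeholder:

```shell
# From the root of a Python repository: write an index.scip file.
scip-python index . --project-name my-project

# Upload the index to the connected Sourcegraph instance.
src code-intel upload -file=index.scip
```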
- is not currently recommended (there is a known performance bug with this
- method which will prevent autocomplete from working correctly. (internal
- issue: PRIME-662)
- </Callout>
+ #### AWS Bedrock: Latency optimization
+
+ <Callout type="note">Latency optimization for AWS Bedrock is available in Sourcegraph v6.5 and later.</Callout>
+
+ AWS Bedrock supports [Latency Optimized Inference](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html), which can reduce autocomplete latency with models like Claude 3.5 Haiku by up to ~40%.
+
+ To use Bedrock's latency optimized inference feature for a specific model with Cody, configure the `"latencyOptimization": "optimized"` setting under the `serverSideConfig` of any model in `modelOverrides`. For example:
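For illustration only, a minimal `modelOverrides` entry might look like the sketch below. The `modelRef`, `modelName`, and context-window values are placeholder assumptions; the `latencyOptimization` field under `serverSideConfig` is the setting described above.

```json
{
  "modelOverrides": [
    {
      "modelRef": "aws-bedrock::v1::claude-3-5-haiku",
      "displayName": "Claude 3.5 Haiku (latency optimized)",
      "modelName": "anthropic.claude-3-5-haiku-20241022-v1:0",
      "capabilities": ["autocomplete", "chat"],
      "category": "speed",
      "status": "stable",
      "contextWindow": {
        "maxInputTokens": 45000,
        "maxOutputTokens": 4000
      },
      "serverSideConfig": {
        "type": "awsBedrock",
        // Fields above are illustrative; this enables latency optimized inference:
        "latencyOptimization": "optimized"
      }
    }
  ]
}
```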
See also [Debugging: running a latency test](#debugging-running-a-latency-test).
### Example: Using GCP Vertex AI
@@ -194,3 +237,37 @@ To enable StarCoder, go to **Site admin > Site configuration** (`/site-admin/con
```
Users of the Cody extensions will automatically pick up this change when connected to your Enterprise instance.
+
+ ## Debugging: Running a latency test
+
+ <Callout type="note">Debugging latency optimized inference is supported in Sourcegraph v6.5 and later.</Callout>
+
+ Site administrators can test completions latency by sending a special debug command in any Cody chat window (on the web, in the editor, etc.):
+
+ ```shell
+ cody_debug:::{"latencytest": 100}
+ ```
+
+ Cody will then perform `100` quick `Hello, please respond with a short message.` requests against the LLM model selected in the dropdown and measure the time taken to get the first streaming event back (for example, the first token from the model). It records timing information for all of these requests and then responds with a report indicating the latency between the Sourcegraph `frontend` container and the LLM API:
+
+ ```shell
+ Starting latency test with 10 requests...
+
+ Individual timings:
+
+ [... how long each request took ...]
+
+ Summary:
+
+ * Requests: 10/10 successful
+ * Average: 882ms
+ * Minimum: 435ms
+ * Maximum: 1.3s
+ ```
+
+ This can be helpful to get a feel for the latency of particular models, or of models with different configurations, such as when using the AWS Bedrock Latency Optimized Inference feature.
+
+ A few important considerations:
+
+ - Debug commands are only available to site administrators and have no effect when used by regular users.
+ - Sourcegraph's built-in Grafana monitoring also has a full `Completions` dashboard for monitoring LLM requests, performance, etc.
docs/cody/enterprise/model-config-examples.mdx (0 additions & 10 deletions)
@@ -792,14 +792,4 @@ Provisioned throughput for Amazon Bedrock models can be configured using the `"a
  ](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_InstanceMetadataOptionsRequest.html#:~:text=HttpPutResponseHopLimit) instance metadata option to a higher value (e.g., 2) to ensure that the metadata service can be accessed from the frontend container running in the EC2 instance. See [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-IMDS-existing-instances.html) for instructions.
  </Callout>
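As a sketch, the hop-limit change described in the callout above can be applied with the AWS CLI; the instance ID below is a placeholder:

```shell
# Raise the IMDS hop limit so containers on the instance can reach the metadata service.
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-put-response-hop-limit 2 \
  --http-endpoint enabled
```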
- <Callout type="warning">
- We only recommend configuring AWS Bedrock to use an accessToken for
- authentication. Specifying no accessToken (e.g. to use [IAM roles for EC2 /
public/llms.txt (0 additions & 20 deletions)
@@ -15668,16 +15668,6 @@ Provisioned throughput for Amazon Bedrock models can be configured using the `"a
  ](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_InstanceMetadataOptionsRequest.html#:~:text=HttpPutResponseHopLimit) instance metadata option to a higher value (e.g., 2) to ensure that the metadata service can be accessed from the frontend container running in the EC2 instance. See [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-IMDS-existing-instances.html) for instructions.
  </Callout>
- <Callout type="warning">
- We only recommend configuring AWS Bedrock to use an accessToken for
- authentication. Specifying no accessToken (e.g. to use [IAM roles for EC2 /