Commit 7f16d3a

FEATURE: Cohere Command R support (#558)
- Added Cohere Command models (Command, Command Light, Command R, Command R Plus) to the available model list
- Added a new site setting `ai_cohere_api_key` for configuring the Cohere API key
- Implemented a new `DiscourseAi::Completions::Endpoints::Cohere` class to handle interactions with the Cohere API, including:
  - Translating request parameters to the Cohere API format
  - Parsing Cohere API responses
  - Supporting streaming and non-streaming completions
  - Supporting "tools", which allow the model to call back to Discourse to look up additional information
- Implemented a new `DiscourseAi::Completions::Dialects::Command` class to translate between the generic Discourse AI prompt format and the Cohere Command format
- Added specs covering the new Cohere endpoint and dialect classes
- Updated `DiscourseAi::AiBot::Bot.guess_model` to map the new Cohere model to the appropriate bot user

In summary, this PR adds support for using the Cohere Command family of models with the Discourse AI plugin. It handles configuring API keys, making requests to the Cohere API, and translating between Discourse's generic prompt format and Cohere's specific format. Thorough test coverage was added for the new functionality.
1 parent eb93b21 commit 7f16d3a
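The prompt translation described above can be sketched independently of the plugin. The following is a minimal standalone sketch, assuming a simplified `{ type:, content: }` message schema rather than the plugin's actual `Dialect` API; it shows how a generic conversation maps onto Cohere's `preamble` / `chat_history` / `message` shape:

```ruby
# Minimal sketch: translate a generic message list into Cohere's chat
# format. The input schema here is a simplified assumption, not the
# plugin's real Dialect interface.
def to_cohere_prompt(messages)
  prompt = { chat_history: [] }

  messages.each do |msg|
    case msg[:type]
    when :system
      prompt[:preamble] = msg[:content]
    when :model
      prompt[:chat_history] << { role: "CHATBOT", message: msg[:content] }
    when :user
      prompt[:chat_history] << { role: "USER", message: msg[:content] }
    end
  end

  # Cohere takes the latest user turn as :message, not as part of
  # chat_history, so pull the last USER entry out of the history.
  last_user = prompt[:chat_history].rindex { |m| m[:role] == "USER" }
  prompt[:message] = prompt[:chat_history].delete_at(last_user)[:message] if last_user

  prompt
end

messages = [
  { type: :system, content: "You are a helpful bot" },
  { type: :user, content: "Hello" },
  { type: :model, content: "Hi there" },
  { type: :user, content: "What is Discourse?" },
]

p to_cohere_prompt(messages)
```

The real `Command` dialect adds more on top of this (tool calls, trimming to the context window, a tools preamble), but the core reshaping is the same.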

File tree

11 files changed

+484
-0
lines changed


app/models/ai_api_audit_log.rb

Lines changed: 1 addition & 0 deletions

@@ -7,6 +7,7 @@ module Provider
     HuggingFaceTextGeneration = 3
     Gemini = 4
     Vllm = 5
+    Cohere = 6
   end
 end

config/locales/client.en.yml

Lines changed: 1 addition & 0 deletions

@@ -268,6 +268,7 @@ en:
       claude-3-opus: "Claude 3 Opus"
       claude-3-sonnet: "Claude 3 Sonnet"
       claude-3-haiku: "Claude 3 Haiku"
+      cohere-command-r-plus: "Cohere Command R Plus"
       gpt-4: "GPT-4"
       gpt-4-turbo: "GPT-4 Turbo"
       gpt-3:

config/locales/server.en.yml

Lines changed: 1 addition & 0 deletions

@@ -50,6 +50,7 @@ en:
     ai_openai_embeddings_url: "Custom URL used for the OpenAI embeddings API. (in the case of Azure it can be: https://COMPANY.openai.azure.com/openai/deployments/DEPLOYMENT/embeddings?api-version=2023-05-15)"
     ai_openai_api_key: "API key for OpenAI API"
     ai_anthropic_api_key: "API key for Anthropic API"
+    ai_cohere_api_key: "API key for Cohere API"
     ai_hugging_face_api_url: "Custom URL used for OpenSource LLM inference. Compatible with https://github.com/huggingface/text-generation-inference"
     ai_hugging_face_api_key: API key for Hugging Face API
     ai_hugging_face_token_limit: Max tokens Hugging Face API can use per request

config/settings.yml

Lines changed: 4 additions & 0 deletions

@@ -110,6 +110,9 @@ discourse_ai:
   ai_anthropic_api_key:
     default: ""
     secret: true
+  ai_cohere_api_key:
+    default: ""
+    secret: true
   ai_stability_api_key:
     default: ""
     secret: true

@@ -336,6 +339,7 @@ discourse_ai:
       - claude-3-opus
       - claude-3-sonnet
       - claude-3-haiku
+      - cohere-command-r-plus
   ai_bot_add_to_header:
     default: true
     client: true

lib/ai_bot/bot.rb

Lines changed: 2 additions & 0 deletions

@@ -180,6 +180,8 @@ def self.guess_model(bot_user)
     when DiscourseAi::AiBot::EntryPoint::CLAUDE_3_OPUS_ID
       # no bedrock support yet 18-03
       "anthropic:claude-3-opus"
+    when DiscourseAi::AiBot::EntryPoint::COHERE_COMMAND_R_PLUS
+      "cohere:command-r-plus"
     when DiscourseAi::AiBot::EntryPoint::CLAUDE_3_SONNET_ID
       if DiscourseAi::Completions::Endpoints::AwsBedrock.correctly_configured?(
         "claude-3-sonnet",

lib/ai_bot/entry_point.rb

Lines changed: 4 additions & 0 deletions

@@ -17,6 +17,7 @@ class EntryPoint
     CLAUDE_3_OPUS_ID = -117
     CLAUDE_3_SONNET_ID = -118
     CLAUDE_3_HAIKU_ID = -119
+    COHERE_COMMAND_R_PLUS = -120

     BOTS = [
       [GPT4_ID, "gpt4_bot", "gpt-4"],

@@ -29,6 +30,7 @@ class EntryPoint
       [CLAUDE_3_OPUS_ID, "claude_3_opus_bot", "claude-3-opus"],
       [CLAUDE_3_SONNET_ID, "claude_3_sonnet_bot", "claude-3-sonnet"],
       [CLAUDE_3_HAIKU_ID, "claude_3_haiku_bot", "claude-3-haiku"],
+      [COHERE_COMMAND_R_PLUS, "cohere_command_bot", "cohere-command-r-plus"],
     ]

     BOT_USER_IDS = BOTS.map(&:first)

@@ -67,6 +69,8 @@ def self.map_bot_model_to_user_id(model_name)
       CLAUDE_3_SONNET_ID
     in "claude-3-haiku"
       CLAUDE_3_HAIKU_ID
+    in "cohere-command-r-plus"
+      COHERE_COMMAND_R_PLUS
     else
       nil
     end
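`map_bot_model_to_user_id` uses Ruby's `case`/`in` pattern matching (available since Ruby 2.7) rather than the classic `case`/`when`. A minimal standalone sketch of the same mapping, with the constants inlined to match the IDs in the diff above:

```ruby
# Sketch of the case/in mapping from model name to bot user id.
# Constants are inlined here; the real code defines them on EntryPoint.
CLAUDE_3_HAIKU_ID = -119
COHERE_COMMAND_R_PLUS = -120

def map_bot_model_to_user_id(model_name)
  case model_name
  in "claude-3-haiku"
    CLAUDE_3_HAIKU_ID
  in "cohere-command-r-plus"
    COHERE_COMMAND_R_PLUS
  else
    nil
  end
end

p map_bot_model_to_user_id("cohere-command-r-plus") # -120
```

For literal string patterns `in` behaves like `when`; the pattern-matching form is what allows the surrounding method to also destructure richer inputs if needed.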

lib/completions/dialects/command.rb

Lines changed: 107 additions & 0 deletions (new file)

# frozen_string_literal: true

# see: https://docs.cohere.com/reference/chat
#
module DiscourseAi
  module Completions
    module Dialects
      class Command < Dialect
        class << self
          def can_translate?(model_name)
            %w[command-light command command-r command-r-plus].include?(model_name)
          end

          def tokenizer
            DiscourseAi::Tokenizer::OpenAiTokenizer
          end
        end

        VALID_ID_REGEX = /\A[a-zA-Z0-9_]+\z/

        def translate
          messages = prompt.messages

          # drop a trailing model (assistant) message; Cohere's chat API
          # expects the final turn to come from the user
          if messages.last[:type] == :model
            messages = messages.dup
            messages.pop
          end

          trimmed_messages = trim_messages(messages)

          chat_history = []
          system_message = nil

          prompt = {}

          trimmed_messages.each do |msg|
            case msg[:type]
            when :system
              if system_message
                chat_history << { role: "SYSTEM", message: msg[:content] }
              else
                system_message = msg[:content]
              end
            when :model
              chat_history << { role: "CHATBOT", message: msg[:content] }
            when :tool_call
              chat_history << { role: "CHATBOT", message: tool_call_to_xml(msg) }
            when :tool
              chat_history << { role: "USER", message: tool_result_to_xml(msg) }
            when :user
              user_message = { role: "USER", message: msg[:content] }
              user_message[:message] = "#{msg[:id]}: #{msg[:content]}" if msg[:id]
              chat_history << user_message
            end
          end

          tools_prompt = build_tools_prompt
          prompt[:preamble] = +"#{system_message}"
          if tools_prompt.present?
            prompt[:preamble] << "\n#{tools_prompt}"
            prompt[:preamble] << "\nNEVER attempt to run tools using JSON, always use XML. Lives depend on it."
          end

          prompt[:chat_history] = chat_history if chat_history.present?

          chat_history.reverse_each do |msg|
            if msg[:role] == "USER"
              prompt[:message] = msg[:message]
              chat_history.delete(msg)
              break
            end
          end

          prompt
        end

        def max_prompt_tokens
          case model_name
          when "command-light"
            4096
          when "command"
            8192
          when "command-r"
            131_072
          when "command-r-plus"
            131_072
          else
            8192
          end
        end

        private

        def per_message_overhead
          0
        end

        def calculate_message_token(context)
          self.class.tokenizer.size(context[:content].to_s + context[:name].to_s)
        end
      end
    end
  end
end

lib/completions/dialects/dialect.rb

Lines changed: 1 addition & 0 deletions

@@ -17,6 +17,7 @@ def dialect_for(model_name)
       DiscourseAi::Completions::Dialects::Gemini,
       DiscourseAi::Completions::Dialects::Mixtral,
       DiscourseAi::Completions::Dialects::Claude,
+      DiscourseAi::Completions::Dialects::Command,
     ]

     if Rails.env.test? || Rails.env.development?
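Registering the new dialect is all that is needed for model routing, because `dialect_for` uses a first-match registry: each dialect class answers `can_translate?` for the model names it owns. A minimal sketch of that pattern, with stub classes standing in for the real dialects:

```ruby
# Sketch of the first-match dialect registry. Stub classes stand in
# for the real DiscourseAi::Completions::Dialects classes.
class Claude
  def self.can_translate?(model_name)
    model_name.start_with?("claude")
  end
end

class Command
  def self.can_translate?(model_name)
    %w[command-light command command-r command-r-plus].include?(model_name)
  end
end

DIALECTS = [Claude, Command]

def dialect_for(model_name)
  DIALECTS.find { |d| d.can_translate?(model_name) } ||
    raise("no dialect found for #{model_name}")
end

puts dialect_for("command-r-plus") # Command
```

Because the first match wins, a new dialect only needs to claim model names no earlier entry in the list already claims.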

lib/completions/endpoints/base.rb

Lines changed: 1 addition & 0 deletions

@@ -16,6 +16,7 @@ def endpoint_for(provider_name, model_name)
       DiscourseAi::Completions::Endpoints::Gemini,
       DiscourseAi::Completions::Endpoints::Vllm,
       DiscourseAi::Completions::Endpoints::Anthropic,
+      DiscourseAi::Completions::Endpoints::Cohere,
     ]

     if Rails.env.test? || Rails.env.development?

lib/completions/endpoints/cohere.rb

Lines changed: 114 additions & 0 deletions (new file)

# frozen_string_literal: true

module DiscourseAi
  module Completions
    module Endpoints
      class Cohere < Base
        class << self
          def can_contact?(endpoint_name, model_name)
            return false unless endpoint_name == "cohere"

            %w[command-light command command-r command-r-plus].include?(model_name)
          end

          def dependant_setting_names
            %w[ai_cohere_api_key]
          end

          def correctly_configured?(model_name)
            SiteSetting.ai_cohere_api_key.present?
          end

          def endpoint_name(model_name)
            "Cohere - #{model_name}"
          end
        end

        def normalize_model_params(model_params)
          model_params = model_params.dup

          model_params[:p] = model_params.delete(:top_p) if model_params[:top_p]

          model_params
        end

        def default_options(dialect)
          options = { model: "command-r-plus" }

          options[:stop_sequences] = ["</function_calls>"] if dialect.prompt.has_tools?
          options
        end

        def provider_id
          AiApiAuditLog::Provider::Cohere
        end

        private

        def model_uri
          URI("https://api.cohere.ai/v1/chat")
        end

        def prepare_payload(prompt, model_params, dialect)
          payload = default_options(dialect).merge(model_params).merge(prompt)

          payload[:stream] = true if @streaming_mode

          payload
        end

        def prepare_request(payload)
          headers = {
            "Content-Type" => "application/json",
            "Authorization" => "Bearer #{SiteSetting.ai_cohere_api_key}",
          }

          Net::HTTP::Post.new(model_uri, headers).tap { |r| r.body = payload }
        end

        def extract_completion_from(response_raw)
          parsed = JSON.parse(response_raw, symbolize_names: true)

          if @streaming_mode
            if parsed[:event_type] == "text-generation"
              parsed[:text]
            else
              if parsed[:event_type] == "stream-end"
                @input_tokens = parsed.dig(:response, :meta, :billed_units, :input_tokens)
                @output_tokens = parsed.dig(:response, :meta, :billed_units, :output_tokens)
              end
              nil
            end
          else
            @input_tokens = parsed.dig(:meta, :billed_units, :input_tokens)
            @output_tokens = parsed.dig(:meta, :billed_units, :output_tokens)
            parsed[:text].to_s
          end
        end

        def final_log_update(log)
          log.request_tokens = @input_tokens if @input_tokens
          log.response_tokens = @output_tokens if @output_tokens
        end

        def partials_from(decoded_chunk)
          decoded_chunk.split("\n").compact
        end

        def extract_prompt_for_tokenizer(prompt)
          text = +""
          if prompt[:chat_history]
            text << prompt[:chat_history]
              .map { |message| message[:content] || message["content"] || "" }
              .join("\n")
          end

          text << prompt[:message] if prompt[:message]
          text << prompt[:preamble] if prompt[:preamble]

          text
        end
      end
    end
  end
end
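The streaming path in `extract_completion_from` can be exercised in isolation: Cohere streams newline-delimited JSON events, where `text-generation` events carry a text fragment and the final `stream-end` event carries the billed token counts. A minimal sketch of that reduction; the sample events below are hand-written to match the field names in the diff, not a captured API response:

```ruby
require "json"

# Sketch: reduce a stream of newline-delimited Cohere chat events to
# the accumulated text plus the billed_units reported at stream end.
def consume_stream(raw_chunks)
  text = +""
  usage = nil

  raw_chunks.each do |chunk|
    chunk.split("\n").each do |line|
      event = JSON.parse(line, symbolize_names: true)
      case event[:event_type]
      when "text-generation"
        text << event[:text]
      when "stream-end"
        usage = event.dig(:response, :meta, :billed_units)
      end
    end
  end

  [text, usage]
end

# Illustrative events; a network chunk may contain more than one line.
chunks = [
  %({"event_type":"text-generation","text":"Hello"}\n),
  %({"event_type":"text-generation","text":" world"}\n) +
    %({"event_type":"stream-end","response":{"meta":{"billed_units":{"input_tokens":3,"output_tokens":2}}}}),
]

text, usage = consume_stream(chunks)
puts text
```

The real endpoint does the same split in `partials_from` and the same per-event dispatch in `extract_completion_from`, but yields each text fragment to the caller as it arrives instead of accumulating it.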
