Cognition Appliances

High-performance AI inference and agent appliances

What do you do?

We develop high-performance Cognition Appliances software (an AI inference and agent software stack) and hardware that make it fast and easy for enterprises to run AI services.

For example, our Channeled Attention algorithm shows significant performance improvements across a range of LLM inference models. On top of LLM inference, our multi-agent framework allows enterprises to build accurate and deterministic agents.
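
The internals of Channeled Attention are not described here, so as orientation only, below is a minimal NumPy sketch of the standard scaled dot-product attention that optimized inference kernels of this kind accelerate. The function name and tensor shapes are illustrative, not part of the Cognition Appliances API.

    import numpy as np

    def scaled_dot_product_attention(q, k, v):
        # Reference semantics only: softmax(Q K^T / sqrt(d)) V.
        # Optimized kernels such as Channeled Attention target this
        # computation; their internals are not shown here.
        d = q.shape[-1]
        scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)    # (seq_q, seq_k)
        scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        return weights @ v                              # (seq_q, d_v)

    # Example: 4 query tokens attending over 6 key/value tokens.
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal(s) for s in ((4, 64), (6, 64), (6, 64)))
    out = scaled_dot_product_attention(q, k, v)         # shape (4, 64)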

Connected over high-speed Ethernet, a cluster of Cognition Appliances can deliver very high throughput at low latency. Entry-level Cognition Appliances support commodity GPU devices to reduce cost, while high-end systems use top-tier GPU devices for ultra-low latency.

For more information, contact info@cognitionappliances.com.

Our team has rich experience and deep knowledge in:

  • Transformer-based LLM training and inference algorithms

  • Reinforcement Learning training and inference algorithms

  • GPU programming and embedded software (firmware)

  • High-performance cluster development with commodity hardware

AI Inference Software Architecture

Cognition Appliances software uses a layered architecture to address the challenges of delivering high-performance, high-quality AI services:

  • The GPU Device Kernels layer primarily addresses device utilization efficiency.

  • The AI Model Computing layer keeps the devices busy.

  • The AI Service Management layer uses AI to manage all AI services across the AI clusters, so the system maximizes output while minimizing response time.
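
As a rough illustration of how the three layers compose, here is a minimal sketch in Python. All class and method names are hypothetical assumptions, not the actual Cognition Appliances API.

    class GpuDeviceKernels:
        """Bottom layer: focuses on device utilization efficiency."""
        def run(self, op: str, payload: str) -> str:
            # In the real stack this would launch optimized GPU kernels.
            return f"kernel[{op}]({payload})"

    class AiModelComputing:
        """Middle layer: batches and pipelines work to keep devices busy."""
        def __init__(self, kernels: GpuDeviceKernels):
            self.kernels = kernels
        def infer(self, model: str, request: str) -> str:
            # Stand-in for batching, caching, and pipelining across devices.
            return self.kernels.run(model, request)

    class AiServiceManagement:
        """Top layer: manages all AI services across the cluster to
        maximize output while minimizing response time."""
        def __init__(self, computing: AiModelComputing):
            self.computing = computing
        def handle(self, request: str) -> str:
            model = "llm-default"  # placeholder for a learned scheduling policy
            return self.computing.infer(model, request)

    stack = AiServiceManagement(AiModelComputing(GpuDeviceKernels()))
    print(stack.handle("summarize this document"))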

Multi-Agent Architecture

Service Agents:

Service agents provide APIs to all clients. Each service agent can connect to one or more router agents, depending on performance and redundancy requirements.

Router Agents:

Router agents route requests from service agents to the appropriate domain expert agents. Router agents are AI-based and have self-learning capabilities.

Domain Expert Agents:

Domain expert agents perform domain-specific tasks. A sketch of how the three agent types fit together follows below.
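
Here is a minimal Python sketch of the service-to-router-to-expert flow. The class names and the keyword-based routing rule are assumptions for illustration; the actual router agents are AI-based and self-learning.

    class DomainExpertAgent:
        """Performs domain-specific tasks."""
        def __init__(self, domain: str):
            self.domain = domain
        def perform(self, task: str) -> str:
            return f"[{self.domain}] handled: {task}"

    class RouterAgent:
        """Routes requests from service agents to the proper domain expert."""
        def __init__(self, experts: dict[str, DomainExpertAgent]):
            self.experts = experts
        def route(self, task: str) -> str:
            # Stand-in for the learned routing model: pick the first expert
            # whose domain keyword appears in the task text.
            for domain, expert in self.experts.items():
                if domain in task.lower():
                    return expert.perform(task)
            return self.experts["general"].perform(task)

    class ServiceAgent:
        """Exposes the client-facing API; may connect to several routers
        for performance and redundancy."""
        def __init__(self, routers: list[RouterAgent]):
            self.routers = routers
        def serve(self, task: str) -> str:
            return self.routers[0].route(task)  # failover/load balancing elided

    experts = {d: DomainExpertAgent(d) for d in ("finance", "legal", "general")}
    service = ServiceAgent([RouterAgent(experts)])
    print(service.serve("Review this legal contract"))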