Sagence AI Emerges from Stealth Tackling Economic Viability of Inference Hardware for Generative AI

Sagence AI today emerged from stealth, unveiling a groundbreaking analog in-memory compute architecture that directly addresses the untenable power/performance/price and environmental sustainability conundrum facing AI inferencing. Driven by its industry-first architectural innovations using analog technology, Sagence AI makes possible improvements of multiple orders of magnitude in energy efficiency and cost, while sustaining performance equivalent to high-performance GPU/CPU-based systems.

Compared to the industry’s leading volume GPU processing the Llama2-70B large language model, with performance normalized to 666K tokens/sec, Sagence technology performs with 10X lower power, 20X lower price, and 20X smaller rack space. Using a modular chiplet architecture for maximum integration, Sagence technology makes possible a highly efficient inference machine that scales from data center generative AI to edge computer vision applications across multiple industries. This previously unimaginable balance of high performance and low power at affordable cost addresses the growing ROI problem for generative AI applications at scale, as AI compute in the data center shifts from training models to deploying them for inference tasks.

“A fundamental advancement in AI inference hardware is vital to the future of AI. Use of large language models (LLMs) and Generative AI drives demand for rapid and massive change at the nucleus of computing, requiring an unprecedented combination of highest performance at lowest power and economics that match costs to the value created,” said Vishal Sarin, CEO & Founder, Sagence AI. “The legacy computing devices today that are capable of extreme high performance AI inference cost too much to be economically viable and consume too much energy to be environmentally sustainable. Our mission is to break those performance and economic limitations in an environmentally responsible way.”

“The demands of the new generation of AI models have resulted in accelerators with massive on-package memory and consequently extremely high-power consumption. Between 2018 and today, the most powerful GPUs have gone from 300W to 1200W, while top-tier server CPUs have caught up to the power consumption levels of NVIDIA’s A100 GPU from 2020,” said Alexander Harrowell, Principal Analyst, Advanced Computing, Omdia. “This has knock-on effects for data center cooling, electrical distribution, AI applications’ unit economics, and much else. One way out of the bind is to rediscover analog computing, which offers much lower power consumption, very low latency, and permits working with mature process nodes.”

On the Frontier of Analog In-memory Compute

Sagence AI leads the industry on the frontier of in-memory compute innovation. Sagence technology is the first to do deep subthreshold compute inside multi-level memory cells, an unprecedented combination that opens doors to the orders-of-magnitude improvements necessary to deliver inference at scale. As digital technology reaches the limits of its ability to scale power and cost, Sagence has innovated a new analog path forward, leveraging the inherent benefits of analog in energy efficiency and cost to make possible mass adoption of AI that is both economically viable and environmentally sustainable.

In-memory Computing Aligned to AI Inference

In-memory computing aligns closely with the essential elements of efficiency in AI inference applications. Merging storage and compute inside memory cells eliminates single-purpose memory storage and the complex scheduled multiply-accumulate circuits that run the vector-matrix multiplication integral to AI computing. The resulting chips and systems are much simpler, lower in cost, lower in power, and vastly more capable in compute.
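To make the vector-matrix multiplication mentioned above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Sagence's design: the function names, the evenly spaced weight levels, and the level count are all hypothetical. The first function models the digital approach as a sequence of scheduled multiply-accumulate steps. The second models the in-memory idea in the abstract: each weight is held in a multi-level cell (approximated here by snapping it to one of a few discrete values), and every output is formed by summing the per-cell products of a column in place.

```python
def mac_vector_matrix(x, W):
    """Digital-style model: one scheduled multiply-accumulate per weight."""
    cols = len(W[0])
    out = [0.0] * cols
    for i, xi in enumerate(x):          # fetch each input once
        for j in range(cols):           # fetch each weight, multiply, accumulate
            out[j] += xi * W[i][j]
    return out


def quantize_to_levels(w, levels, w_max):
    """Snap a weight to the nearest of `levels` evenly spaced values
    in [-w_max, w_max]. Hypothetical stand-in for a multi-level cell."""
    step = 2 * w_max / (levels - 1)
    idx = round((w + w_max) / step)
    idx = max(0, min(levels - 1, idx))  # clamp out-of-range weights
    return -w_max + idx * step


def in_memory_vector_matrix(x, W, levels=16, w_max=1.0):
    """Abstract in-memory model: weights live in multi-level cells and each
    column's products are summed where the weights are stored.
    The level count and weight range are assumptions for illustration."""
    Wq = [[quantize_to_levels(w, levels, w_max) for w in row] for row in W]
    return [sum(xi * row[j] for xi, row in zip(x, Wq))
            for j in range(len(Wq[0]))]


x = [1.0, 2.0]
W = [[0.5, -1.0],
     [1.0, 0.0]]
digital = mac_vector_matrix(x, W)
in_memory = in_memory_vector_matrix(x, W, levels=5, w_max=1.0)
```

With weights that fall exactly on the stored levels, as in this small example, both functions return the same result. With arbitrary weights, the second function differs by a quantization error that shrinks as the number of levels grows.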

Sagence views the AI inference challenge not as a general-purpose computing problem, but as a mathematically intensive data processing problem. Managing the massive amount of arithmetic processing needed to “run” a neural network on CPU/GPU digital machines requires extremely complicated hardware reuse and hardware scheduling. The natural hardware solution is not a general-purpose computing machine, but rather an architecture that more closely mirrors how biological neural networks operate.

Shannon Davis

Shannon writes, edits, and produces Semiconductor Digest’s news articles, email newsletters, blogs, webcasts, and social media posts. She holds a bachelor’s degree in journalism from Huntington University in Huntington, IN. In addition to her years of freelance business reporting, Shannon has also worked in marketing and public relations in the renewable energy and healthcare industries.
