16, 8, and 4-bit Floating Point Formats — How Does it Work? | by Dmitrii Eliuseev | Sep, 2023

September 29, 2023
by Dmitrii Eliuseev

Let’s go into bits and bytes

For almost 50 years, since Kernighan and Ritchie published the first edition of their C language book, it has been common knowledge that a single-precision “float” type is 32 bits wide and a double-precision type is 64 bits. There was also an 80-bit “long double” type with extended precision, and together these types covered almost all needs in floating-point data processing. In the last few years, however, the advent of large neural network models has pushed developers toward the other end of the spectrum: shrinking floating-point types as much as possible.

Honestly, I was surprised when I discovered that a 4-bit floating-point format exists. How on Earth is that possible? The best way to find out is to test it ourselves. In this article, we will explore the most popular floating-point formats, build a simple neural network, and see how it works.

Let’s get started.

A “Standard” 32-bit Floating Point

Before going into “extreme” formats, let’s recall the standard one. The IEEE 754 standard for floating-point arithmetic was established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). A number in the 32-bit float type is laid out as three bit fields.

Here, the first bit is the sign, the next 8 bits represent the exponent, and the last 23 bits represent the mantissa. The final value is computed from these three fields.
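
For normalized numbers, the standard IEEE 754 single-precision rule can be written as below. The symbols s (sign bit), E (raw 8-bit exponent), and m_i (mantissa bits, most significant first) are notation assumed here for illustration; special cases such as zero, subnormals, infinity, and NaN are not covered by this expression.

```latex
v = (-1)^{s} \times 2^{E - 127} \times \left(1 + \sum_{i=1}^{23} m_i \, 2^{-i}\right)
```

As a quick check, take s = 0, E = 124, and only m_2 = 1: the value is 2^{-3} × (1 + 2^{-2}) = 0.125 × 1.25 = 0.15625.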

This simple helper function allows us to print a floating point value in binary form:

import struct

def print_float32(val: float):
    """ Print Float32 in a binary form """
    m = struct.unpack('I', struct.pack('f', val))[0]
    return format(m, 'b').zfill(32)

print_float32(0.15625)
# > 00111110001000000000000000000000
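
To make the three fields visible, the same 32-bit pattern can be split with shifts and masks. This is a minimal sketch: the helper name float32_fields is hypothetical, and the explicit big-endian format strings '>f' and '>I' are my choice, not something the article specifies.

```python
import struct

def float32_fields(val: float):
    """Return (sign, exponent_raw, mantissa) of val stored as a 32-bit float."""
    # Reinterpret the 4 bytes of the float as an unsigned 32-bit integer
    bits = struct.unpack('>I', struct.pack('>f', val))[0]
    sign = bits >> 31                   # top bit
    exponent_raw = (bits >> 23) & 0xFF  # next 8 bits, still biased
    mantissa = bits & 0x7FFFFF          # low 23 bits
    return sign, exponent_raw, mantissa

sign, exponent_raw, mantissa = float32_fields(0.15625)
print(sign, format(exponent_raw, '08b'), format(mantissa, '023b'))
# 0 01111100 01000000000000000000000
```

The raw exponent 01111100 is 124, which becomes -3 once the bias of 127 is subtracted.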

Let’s also make another helper for backward conversion, which will be useful later:

def ieee_754_conversion(sign, exponent_raw, mantissa, exp_len=8, mant_len=23):
    """ Convert binary data into the floating point value """
    sign_mult = -1 if sign == 1 else 1
    exponent = exponent_raw - (2 ** (exp_len - 1) - 1)
    mant_mult = 1
    for b in range(mant_len - 1, -1, -1):
        if mantissa & (2 **…
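
The listing above is cut off in this copy. As a hedged sketch, and not necessarily the author's code, a conversion with the same parameters could look like the following. The function name ieee_754_sketch is hypothetical, and the sketch assumes normalized numbers only (no zero, subnormals, infinity, or NaN).

```python
def ieee_754_sketch(sign, exponent_raw, mantissa, exp_len=8, mant_len=23):
    """Convert sign/exponent/mantissa bit fields into a float (normalized values only)."""
    sign_mult = -1 if sign == 1 else 1
    # Remove the exponent bias, e.g. 127 for an 8-bit exponent
    exponent = exponent_raw - (2 ** (exp_len - 1) - 1)
    # Start from the implicit leading 1, then add 2^-k for each set mantissa bit
    mant_mult = 1.0
    for b in range(mant_len - 1, -1, -1):
        if mantissa & (2 ** b):
            mant_mult += 2.0 ** (b - mant_len)
    return sign_mult * (2.0 ** exponent) * mant_mult

print(ieee_754_sketch(0b0, 0b01111100, 0b01000000000000000000000))
# 0.15625
```

Keeping exp_len and mant_len as parameters means the same routine can be tried with other field widths, as long as the format follows the same sign/exponent/mantissa layout.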
