本文从数学的角度来介绍无符号数和有符号数的定义, 关系及转换.
1. 无符号数和有符号数的定义
设 $n$ 维位向量: $\vec{x} = [x_0,x_1,\cdots, x_{n-1}] \in \{0, 1 \}^n$.
无符号数的定义, $B2U(\vec{x}) := \sum_{i = 0}^{n - 1} {{x_i}{2^i}}$.
有符号数的定义, $B2T(\vec{x}) := -x_{n-1}2^{n-1} + \sum_{i=0}^{n-2}{{x_i}{2^i}}$.
注意: 这里的有符号数采用的是 two's complement 编码方式, 这是绝大多数计算机系统采用的方式.
2. 无符号数和有符号数的取值范围
$B2U_{\min}(\vec{x}) = 0$, 这时 $\vec{x} = [0, 0, \cdots, 0]$
$B2U_{\max}(\vec{x}) = \sum_{i = 0}^{n - 1} {{2^i}}$, 这时 $\vec{x} = [1, 1, \cdots, 1]$
$B2T_{\min}(\vec{x}) = -2^{n-1}$, 这时 $\vec{x} = [1, 0, \cdots, 0]$
$B2T_{\max}(\vec{x}) = \sum_{i=0}^{n-2}{{2^i}} = 2^{n-1} - 1$, 这时 $\vec{x} = [0, 1,\cdots, 1]$
所以:
$$\begin{cases}
B2U(\vec{x}) \in\{0, \cdots, 2^{n}-1\} \\
B2T(\vec{x}) \in\{-2^{n-1}, \cdots, 2^{n-1} - 1\}
\end{cases}$$
3. 无符号数和有符号数转换函数的逆
可以证明 $B2U$ 和 $B2T$ 是双射 (bijection) 的, 所以可以定义它们的逆:
$$\begin{cases} U2B = B2U^{-1}: \{0, \cdots, 2^{n}-1\} \rightarrow \{0, 1 \}^n \\ T2B = B2T^{-1}: \{-2^{n-1}, \cdots, 2^{n-1} - 1\} \rightarrow \{0, 1 \}^n \end{cases}$$
4. 无符号数和有符号数之间的关系
根据定义, 可得
$$B2U(\vec{x}) - B2T(\vec{x}) = x_{n-1}{2^{n-1}} + x_{n-1}2^{n-1} = x_{n-1}{2^{n}}$$
具体地,
$$\begin{aligned}
B2U(\vec{x}) &= B2T(\vec{x}) + x_{n-1}{2^{n}} \\
&= \begin{cases}
B2T(\vec{x}) & x_{n-1} = 0 \ \textit{i.e.,}\ B2T(\vec{x}) \geq 0 \\
B2T(\vec{x}) + 2^n & x_{n-1} = 1 \ \textit{i.e.,}\ B2T(\vec{x}) < 0
\end{cases} \\
&= B2T(\vec{x}) \bmod 2^{n} \\
\end{aligned}$$
$$\begin{aligned} B2T(\vec{x}) &= B2U(\vec{x}) - x_{n-1}{2^{n}} \\ &= \begin{cases} B2U(\vec{x}) & x_{n-1} = 0 \ \textit{i.e.,}\ B2U(\vec{x}) < 2^{n-1}\\ B2U(\vec{x}) - 2^n & x_{n-1} = 1 \ \textit{i.e.,}\ B2U(\vec{x}) \geq 2^{n-1} \end{cases}\\ &= (B2U(\vec{x}) + 2^{n-1}) \bmod 2^{n} - 2^{n-1} \end{aligned}$$
5. 无符号数和有符号数之间的转换
定义无符号数到有符号数的转换函数: $U2T(u) := B2T(U2B(u))$, 其中 $u \in \{0, \cdots, 2^{n}-1\}$
定义有符号数到无符号数的转换函数: $T2U(t) := B2U(T2B(t))$, 其中 $t \in \{-2^{n-1}, \cdots, 2^{n-1} - 1\}$
根据定义及上一小节的结论, 可以得到:
$$\begin{cases}
\begin{aligned}
T2U(t) &:= B2U(T2B(t)) \\
&= B2T(T2B(t)) \bmod 2^{n} \\
&= t \bmod 2^{n}
\end{aligned}\\
\begin{aligned}
U2T(u) &:= B2T(U2B(u)) \\
&= (B2U(U2B(u)) + 2^{n-1}) \bmod 2^{n} - 2^{n-1} \\
&= (u + 2^{n-1}) \bmod 2^{n} - 2^{n-1} \\
\end{aligned}\\
\end{cases}$$
测试代码 (可以打开该链接运行该段代码: http://coliru.stacked-crooked.com/a/5a896386dbc1f143) 如下:
#include <iostream>
int main()
{
int n = 8;
for (int u = 0; u <= 255; ++u)
{
int u2t_v1 = (signed char)u;
// U2T(u) = (u + 2^{n-1}) \bmod 2^{n} - 2^{n-1}
int u2t_v2 = ((u + (1 << (n - 1))) % (1 << n)) - (1 << (n -1));
std::cout << u << "\t" << u2t_v1 << "\t" << u2t_v2 << std::endl;
}
std::cout << "==========================" <<std::endl;
for (int t = -128; t <= 127; ++t)
{
int t2u_v1 = (unsigned char)t;
// T2U(t) = t \bmod 2^{n}
// mod op in math is different from % op in C++ since C++11
// the sign of the result of mod in math is the same as the one of divisor
// the sign of the result of % since C++11 is the same as the one of dividend
// so dividend add (1 << n) to make it positive
int t2u_v2 = (t + (1 << n)) % (1 << n);
std::cout << t << "\t" << t2u_v1 << "\t" << t2u_v2 << std::endl;
}
std::getchar();
return 0;
}
参考
- CSAPP
更新记录
- 20211226, 发布