Lesson 21 Computer System Security: Data Security and Protocols
In previous lessons, we discussed reliability. Reliability deals with probabilistic faults (bad sectors on a hard drive, cosmic rays flipping bits in memory); its adversary is the laws of nature.
Security, by contrast, deals with malicious attacks. Attackers are intelligent: they actively look for weaknesses and craft specific inputs to break the system. Security engineering therefore cannot stop at handling probabilistic failures the way reliability engineering does; it must also reason adversarially, as in a game against an opponent.
1. The Cornerstone of Security Defense: Isolation
Before we turn to cryptography, note that the simplest, and often the most effective, security measure is isolation.
- Physical Isolation: Data inside the system is trusted; data outside (open environment) is dangerous.
- Identity Isolation:
  - Remote attacks $\to$ defended by authentication.
  - Local attacks $\to$ defended by access control (ACLs).
- Privilege Isolation: Even for internal personnel, the “Principle of Least Privilege” should be followed to limit CRUD operations.
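The least-privilege idea can be sketched as a default-deny permission check. This is only an illustration: the `Perm` flags, the `ACL` table, and the user and resource names are all hypothetical, not part of the lesson.

```python
from enum import Flag, auto


class Perm(Flag):
    """CRUD permissions as combinable flags."""
    NONE = 0
    CREATE = auto()
    READ = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical ACL: (user, resource) -> permissions explicitly granted.
ACL = {
    ("alice", "orders"): Perm.CREATE | Perm.READ | Perm.UPDATE,
    ("bob", "orders"): Perm.READ,
}


def is_allowed(user: str, resource: str, wanted: Perm) -> bool:
    """Allow the operation only if every requested permission was granted."""
    granted = ACL.get((user, resource), Perm.NONE)  # unknown pairs get nothing
    return (granted & wanted) == wanted


print(is_allowed("bob", "orders", Perm.READ))
print(is_allowed("bob", "orders", Perm.DELETE))
```

The design choice worth noticing is the default: a user or resource missing from the table gets `Perm.NONE`, so forgetting to configure someone fails closed rather than open.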
2. The Cryptography Toolbox
When data must leave the secure isolation zone and be exposed to the public internet, we need mathematical tools: Cryptography.
2.1 Symmetric Encryption
- Principle: Encryption and decryption use the same key $K$.
- Classical ciphers:
  - Caesar cipher: a simple substitution (each letter is shifted by a fixed amount).
  - Vernam cipher (one-time pad): $C = P \oplus K$. This is perfectly secret in theory, provided the key $K$ is truly random, as long as the plaintext, and used only once. In practice it is rarely usable, because securely distributing that much key material is as hard as protecting the message itself.
- Modern ciphers:
  - Stream cipher: mimics the one-time pad, using a seed to generate a pseudo-random keystream that is XORed with the plaintext bit by bit.
  - Block cipher: such as AES. Divides the data into fixed-size blocks (e.g., 128 bits) and passes each block through multiple rounds of complex mathematical transformations.
- Modes of operation: plain block-by-block encryption (ECB) is unsafe because identical plaintext blocks produce identical ciphertext blocks, which exposes patterns. We need to introduce randomness:
  - CBC (Cipher Block Chaining): $C_i = E_k(P_i \oplus C_{i-1})$, with $C_0$ set to a random initialization vector (IV). Each block is XORed with the previous ciphertext block before encryption, which breaks the patterns.
  - CTR (Counter Mode): turns a block cipher into a stream cipher by encrypting successive counter values to produce a keystream, which is XORed with the plaintext.
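The difference between the modes can be seen in a small experiment. The sketch below does not use AES: to stay within the standard library it substitutes a toy 4-round Feistel cipher built on SHA-256 (an assumption made purely for illustration, and not secure). All function names are hypothetical, and the plaintext is assumed to be a multiple of the block size for ECB and CBC, since padding is left out.

```python
import hashlib
import os

BLOCK = 16  # bytes (128 bits), the same block size as AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))  # truncates to the shorter input


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel cipher standing in for E_k. NOT secure, NOT AES."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, pt: bytes) -> bytes:
    return b"".join(toy_encrypt_block(key, b) for b in blocks(pt))


def cbc_encrypt(key: bytes, iv: bytes, pt: bytes) -> bytes:
    out, prev = [], iv  # C_0 = IV
    for b in blocks(pt):
        prev = toy_encrypt_block(key, xor(b, prev))  # C_i = E_k(P_i xor C_{i-1})
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ct: bytes) -> bytes:
    out, prev = [], iv
    for c in blocks(ct):
        out.append(xor(toy_decrypt_block(key, c), prev))
        prev = c
    return b"".join(out)


def ctr_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """CTR: encrypt nonce||counter to get a keystream; same call decrypts."""
    out = []
    for i, b in enumerate(blocks(data)):
        keystream = toy_encrypt_block(key, nonce + i.to_bytes(8, "big"))
        out.append(xor(b, keystream))
    return b"".join(out)


key, iv, nonce = os.urandom(16), os.urandom(16), os.urandom(8)
pt = b"ATTACK AT DAWN!!" * 2  # two identical 16-byte blocks

ecb = ecb_encrypt(key, pt)
cbc = cbc_encrypt(key, iv, pt)
print("ECB blocks equal:", ecb[:16] == ecb[16:])
print("CBC blocks equal:", cbc[:16] == cbc[16:])
```

Under ECB the two identical plaintext blocks encrypt to the same ciphertext block, which is exactly the leak described above. Under CBC the second block is mixed with the first ciphertext block before encryption, so the repetition disappears.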
2.2 Asymmetric Encryption
- Pain Point: Symmetric encryption is fast, but Key Distribution is troublesome. How do you safely tell the other party the key?
- Principle: Keys are divided into Public Key ($K_{pub}$) and Private Key ($K_{priv}$).
- RSA: Based on the difficulty of factoring large integers.
- Diffie-Hellman (DH): Based on the discrete logarithm problem, specialized for key exchange.
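- Applications:
  - Encryption: encrypt with the public key $\to$ decrypt with the private key (protects data confidentiality).
  - Signature: sign with the private key $\to$ verify with the public key (provides identity authentication and tamper detection). For RSA this is often described as “encrypting with the private key”, but in general signing and encryption are distinct operations.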
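Both applications can be illustrated with textbook RSA. The sketch below is a teaching toy under stated assumptions: the modulus is built from two known Mersenne primes and is far too small for real use, no padding scheme is applied (real systems use one), and the helper names are hypothetical.

```python
import hashlib

# Toy parameters: two known Mersenne primes. Far too small for real security.
p, q = 2**61 - 1, 2**89 - 1
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 65537                      # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Anyone can encrypt with the public key (e, n); requires m < n."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(c, d, n)


def digest_int(msg: bytes) -> int:
    # Reduce the SHA-256 digest modulo n so that it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n


def sign(msg: bytes) -> int:
    """Signing applies the private exponent to the message digest."""
    return pow(digest_int(msg), d, n)


def verify(msg: bytes, sig: int) -> bool:
    """Verification needs only the public key."""
    return pow(sig, e, n) == digest_int(msg)


m = int.from_bytes(b"hello", "big")
assert decrypt(encrypt(m)) == m

sig = sign(b"pay alice 100")
print(verify(b"pay alice 100", sig))
print(verify(b"pay alice 900", sig))
```

The two directions use the same arithmetic but serve different goals: encryption keeps the message secret from everyone except the private-key holder, while a signature can be checked by anyone and only produced by the private-key holder.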
2.3 Integrity and Authentication
- Hash (Digest): The fingerprint of data. Changing even one bit causes drastic changes in the Hash value. Used for tamper-proofing.
- MAC (Message Authentication Code): a keyed hash (HMAC is a common construction). Only those who hold the key can generate the correct MAC. Used for symmetric authentication.
- Digital Signature: hash the message, then sign the digest with the private key. Only the owner of the private key can sign, and anyone can verify with the public key. Used for asymmetric authentication and non-repudiation.
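The first two tools are available directly in Python's standard library (`hashlib` and `hmac`). The following is a minimal sketch; the messages and the shared key are made-up values for illustration.

```python
import hashlib
import hmac

msg = b"transfer 100 to alice"
tampered = b"transfer 900 to alice"  # a single character changed

# Hash: anyone can compute it, so on its own it only detects accidental
# changes or changes by someone who cannot also replace the digest.
digest = hashlib.sha256(msg).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()
print(digest)
print(digest_tampered)

# MAC: a keyed hash. Without the key, an attacker cannot recompute the tag.
shared_key = b"hypothetical-shared-secret"
tag = hmac.new(shared_key, msg, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, received_tag: bytes) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid leaking where tags differ
    return hmac.compare_digest(expected, received_tag)


print(verify_mac(shared_key, msg, tag))
print(verify_mac(shared_key, tampered, tag))
```

A plain hash protects integrity only if the digest itself reaches the receiver through a trusted channel; the MAC removes that requirement by binding the tag to a secret both sides share.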
Modern Cryptography Golden Combo:
Since asymmetric encryption is too slow (orders of magnitude slower than symmetric), we usually use Hybrid Encryption:
- Use Asymmetric Encryption (like RSA) to exchange a short Symmetric Key.
- Use Symmetric Encryption (like AES) to transmit the actual high-volume data.
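The two steps of hybrid encryption can be sketched end to end. As before, this is a toy under stated assumptions: textbook RSA with a small modulus made from two Mersenne primes wraps the session key, and a SHA-256 counter keystream stands in for AES. Neither is secure as written, and the helper name `keystream_xor` is hypothetical.

```python
import hashlib
import os

# Receiver's toy RSA key pair (far too small for real use, no padding).
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in for a symmetric cipher: XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + (i // 32).to_bytes(8, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[i:i + 32], pad))
    return bytes(out)


payload = b"a large amount of application data " * 100

# Sender: pick a fresh symmetric key, wrap it with the receiver's public key,
# then encrypt the bulk data with the fast symmetric cipher.
session_key = os.urandom(16)                      # 128 bits, smaller than n
wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
ciphertext = keystream_xor(session_key, payload)

# Receiver: unwrap the session key with the private key, then decrypt.
recovered_key = pow(wrapped_key, d, n).to_bytes(16, "big")
plaintext = keystream_xor(recovered_key, ciphertext)
print(plaintext == payload)
```

The expensive public-key operation runs once on 16 bytes, regardless of how large the payload is; everything else is cheap symmetric work.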
3. Security Protocols: TLS/SSL
With the tools above, how do we design a secure communication protocol (like HTTPS)? We need to solve five major challenges:
- Negotiation: What algorithms to use?
- Key: How to safely generate and distribute keys?
- Authentication: How do I know you are the real bank website?
- Confidentiality: Data is not eavesdropped.
- Integrity: Data is not tampered with.
3.1 Trust Chain: CA Certificate System
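If an attacker sitting in the middle intercepts the public key and substitutes their own (a Man-in-the-Middle attack), asymmetric encryption alone offers no protection, because the client would simply encrypt to the attacker’s key. Therefore, a CA (Certificate Authority) is needed.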
- Certificate = Server Public Key + Server Identity Info + CA’s Digital Signature.
- Verification: Browsers have built-in public keys of trusted Root CAs to verify if the certificate’s signature is legitimate.
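Python's standard `ssl` module performs the same kind of verification a browser does. The sketch below shows the defaults of `ssl.create_default_context()`; the helper `fetch_peer_subject` is a hypothetical name, is not called here, and needs network access when it is.

```python
import socket
import ssl

# The default context loads the system's trusted root CA certificates,
# requires a valid certificate chain, and checks the hostname.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)


def fetch_peer_subject(host: str, port: int = 443):
    """Connect, let the handshake verify the chain, return the cert subject.

    Raises ssl.SSLCertVerificationError if the certificate does not chain
    to a trusted root or does not match the hostname.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

With network access, a call such as `fetch_peer_subject("example.com")` returns the subject fields of the server's certificate, and fails with an exception if the chain cannot be validated.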
3.2 TLS Handshake Process
- ClientHello & ServerHello:
  - Both parties exchange random numbers (Client_Random, Server_Random).
  - They negotiate the Cipher Suite.
- Authentication & Key Exchange:
  - The Server sends its Certificate (containing the Public Key $K_{pub}$).
  - The Client verifies the Certificate.
  - The Client generates a Pre-Master Secret, encrypts it with the Server’s Public Key $K_{pub}$, and sends it to the Server (for RSA key exchange).
  - Note: with DH key exchange, both parties compute the Pre-Master Secret independently.
- Generate Master Secret:
  - Both parties feed the three parameters they now hold (Client_Random + Server_Random + Pre-Master Secret) into a key derivation algorithm to produce the final Master Secret, and then derive from it the symmetric key pairs and MAC key pairs used for reading and writing.
- ChangeCipherSpec & Finished:
  - Both parties send ChangeCipherSpec, meaning “I will start encrypting subsequent messages”.
  - Each then sends a Finished message (containing a Hash and MAC of all previous handshake messages) to verify that the handshake was not tampered with.
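The "Generate Master Secret" step can be made concrete. The sketch below follows the pseudo-random function defined for TLS 1.2 in RFC 5246 (HMAC-SHA256 based), as far as I understand that specification; the lesson itself only says "an algorithm". The random values are generated locally for the demo, and the key sizes assume a hypothetical suite with 32-byte MAC keys and 16-byte encryption keys.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246: A(0)=seed, A(i)=HMAC(secret, A(i-1))."""
    out, a = b"", seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)


# Stand-ins for the values exchanged during the handshake.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master_secret = os.urandom(48)

# Three inputs -> 48-byte Master Secret.
master_secret = prf(pre_master_secret, b"master secret",
                    client_random + server_random, 48)

# Master Secret -> key block, sliced into one MAC key and one encryption key
# per direction (sizes are assumptions for this sketch).
key_block = prf(master_secret, b"key expansion",
                server_random + client_random, 2 * 32 + 2 * 16)
client_mac_key = key_block[0:32]
server_mac_key = key_block[32:64]
client_write_key = key_block[64:80]
server_write_key = key_block[80:96]

print(len(master_secret), len(key_block))
```

Because both sides hold the same three inputs, they arrive at identical keys without ever sending those keys over the wire. Mixing in the two random values ensures that every connection gets fresh keys even if the same Pre-Master Secret were somehow reused.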
3.3 Record Protocol
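The TCP connection is already established before the handshake begins. Once the handshake completes, the real application data travels over that connection, and each message passes through the following steps: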
- Segmentation & Compression.
- Add MAC: Guarantees integrity.
- Encrypt: Encrypt using the negotiated symmetric key.
- Add Header: attach the SSL/TLS record header (content type, protocol version, length).
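The four steps can be sketched as a pair of functions. This is a simplified model, not a wire-compatible implementation: compression is skipped, a SHA-256 counter keystream stands in for the negotiated symmetric cipher, and the function names are hypothetical. The 5-byte header layout (type, version, length), the 2^14-byte fragment limit, and the inclusion of a sequence number in the MAC follow my understanding of TLS 1.2.

```python
import hashlib
import hmac
import struct

MAX_FRAGMENT = 2**14          # largest plaintext fragment per record
APPLICATION_DATA = 23         # record content type
VERSION = (3, 3)              # TLS 1.2 on the wire
TAG_LEN = 32                  # HMAC-SHA256 output size


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT secure): XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + (i // 32).to_bytes(4, "big")).digest()
        out.extend(x ^ y for x, y in zip(data[i:i + 32], pad))
    return bytes(out)


def _mac(mac_key: bytes, seq: int, frag: bytes) -> bytes:
    pseudo_header = struct.pack(">QBBBH", seq, APPLICATION_DATA, *VERSION, len(frag))
    return hmac.new(mac_key, pseudo_header + frag, hashlib.sha256).digest()


def protect(data: bytes, mac_key: bytes, enc_key: bytes) -> list:
    records = []
    for seq, off in enumerate(range(0, len(data), MAX_FRAGMENT)):
        frag = data[off:off + MAX_FRAGMENT]                # 1. fragment
        tag = _mac(mac_key, seq, frag)                     # 2. add MAC
        body = keystream_xor(enc_key, struct.pack(">Q", seq), frag + tag)  # 3. encrypt
        header = struct.pack(">BBBH", APPLICATION_DATA, *VERSION, len(body))
        records.append(header + body)                      # 4. add header
    return records


def unprotect(records: list, mac_key: bytes, enc_key: bytes) -> bytes:
    out = b""
    for seq, rec in enumerate(records):
        _type, _major, _minor, length = struct.unpack(">BBBH", rec[:5])
        body = keystream_xor(enc_key, struct.pack(">Q", seq), rec[5:5 + length])
        frag, tag = body[:-TAG_LEN], body[-TAG_LEN:]
        if not hmac.compare_digest(tag, _mac(mac_key, seq, frag)):
            raise ValueError("bad record MAC")
        out += frag
    return out


mac_key, enc_key = b"m" * 32, b"e" * 16   # would come from the key block
data = b"x" * 40000
records = protect(data, mac_key, enc_key)
print(len(records), unprotect(records, mac_key, enc_key) == data)
```

Including the sequence number in the MAC means an attacker cannot reorder, drop, or replay records without the receiver noticing, even though each record is protected separately.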