Lesson 22 Security (Isolation, Identity, and Access Control)
In the previous lesson, we used TLS to build a secure communication channel, which protects data while it travels over the network. Now assume the data has arrived safely inside the server.
The core tension in system security is this: to be secure, we need isolation; to be useful, we need sharing. This lesson looks at how to build trusted systems in untrusted environments and how to control a subject’s access to an object.
1. Isolation: The Prerequisite for All Security
If a system has no boundaries, there is nothing to protect. Isolation means keeping data in a “cage,” and the form of that cage keeps evolving:
- Physical isolation: the most primitive form, and also the strongest. Two physical machines with no network connection between them are air-gapped.
  - Problem: communication is difficult and resource utilization is low.
- Virtual machine isolation: a hypervisor uses hardware virtualization support to run multiple OSs on the same hardware.
  - Mechanism: privilege levels (Ring 0 vs Ring 3, plus a more privileged hypervisor mode on hardware with virtualization extensions) + virtual memory.
- Process isolation: the basis of modern operating systems.
  - Mechanism: a separate virtual address space per process + IPC (Inter-Process Communication) as the controlled channel between processes.
Core challenge: the better the isolation, the harder the sharing. The art of system design lies in opening a controlled gap in the boundary.
2. Authentication
2.1 The Three Factors of Authentication
- Know: What you know (Password, security questions).
- Have: What you have (Keycard, USB token, SMS code).
- Be: What you are (Fingerprint, Face ID).
2.2 Challenge: Replay Attack
The biggest difference between the digital world and the physical world is that data can be copied perfectly.
An attacker does not need to crack your password. It is enough to capture your encrypted login packet and send it again; the server will believe you are logging in again.
- Countermeasure: introduce freshness.
  - Nonce: a number used once; the server rejects any value it has already seen.
  - Timestamp: the server rejects messages outside a short time window (requires clock synchronization).
  - Challenge-response: the server sends a random challenge, the user signs it with a private key (the response), and the server verifies the signature. Because every challenge is different, every valid response is different, so a recorded response is useless for a later login.
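The challenge-response idea can be sketched in a few lines. To stay within the Python standard library, this sketch swaps the private-key signature described above for an HMAC over a shared secret key; the freshness argument is the same. The `Server` class and `respond` function are hypothetical names used only for illustration.

```python
import hashlib
import hmac
import secrets


class Server:
    """Issues one-time challenges and verifies responses (illustrative only)."""

    def __init__(self, shared_key):
        self._key = shared_key
        self._pending = set()  # challenges issued but not yet answered

    def new_challenge(self):
        challenge = secrets.token_bytes(16)  # fresh random value per login attempt
        self._pending.add(challenge)
        return challenge

    def verify(self, challenge, response):
        if challenge not in self._pending:
            return False  # unknown or already-used challenge: treat as a replay
        self._pending.discard(challenge)  # each challenge is accepted at most once
        expected = hmac.new(self._key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def respond(shared_key, challenge):
    """Client side: prove knowledge of the key without sending the key itself."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


key = secrets.token_bytes(32)
server = Server(key)

c1 = server.new_challenge()
r1 = respond(key, c1)
first_login = server.verify(c1, r1)  # fresh challenge, correct response
replayed = server.verify(c1, r1)     # same captured pair sent a second time

c2 = server.new_challenge()
stale = server.verify(c2, r1)        # old response against a new challenge
```

Here `first_login` is accepted, while `replayed` and `stale` are both rejected: a captured response only ever matches the one challenge it was computed for, and that challenge is consumed on first use.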
2.3 Evolution of Authentication Systems
- Password:
  - Defense: store a salted hash. A random per-user salt defeats precomputed rainbow tables.
  - Limitation: vulnerable to credential stuffing and phishing.
- Two-factor (2FA/U2F):
  - Introduces a hardware device. The browser binds the TLS channel (Channel ID) to the authentication request, so that a man-in-the-middle (an impostor who poses as each side to the other and relays the messages) cannot reuse the response on a different connection.
- Public Key Infrastructure (PKI):
  - The ultimate answer to “Who am I?” is to use a public key as the identity.
  - A chain of trust rooted in a CA (Certificate Authority) binds public keys to human-readable names.
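The salted-hash defense listed under passwords can be illustrated with a minimal sketch. It assumes PBKDF2-HMAC-SHA256 from the standard library as the hash function and an iteration count picked only for this example; `hash_password` and `check_password` are hypothetical helper names.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only


def hash_password(password):
    """Return (salt, digest); the server stores both, never the password."""
    salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


# Two users who happen to choose the same password.
salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")

same_digest = digest_a == digest_b                    # salts differ, so digests differ
good_login = check_password("hunter2", salt_a, digest_a)
bad_login = check_password("wrong-guess", salt_a, digest_a)
```

Because each stored digest depends on its own salt, a table precomputed for unsalted hashes does not apply, and identical passwords no longer produce identical database entries.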
3. Authorization (Access Control)
After confirming “Who you are”, the next step is deciding “What you can do”.
3.1 Access Control Matrix
This is a conceptual model. Rows are subjects (users/processes), columns are objects (files/resources), and each cell holds the permissions (R/W/X) that the subject has on the object.
Because the matrix is very sparse, engineering practice stores it in one of two ways:
- ACL (Access Control List): stored by column.
  - Perspective: file A says, “Zhang San can read, Li Si can write.”
  - Pros: resources are easy to manage, and permissions are easy to revoke (just change the file’s attributes).
  - Cons: it is hard to answer “Which files does Zhang San have permissions on?”
- Capabilities (tickets): stored by row.
  - Perspective: Zhang San carries tickets in his pocket for “Read File A” and “Write File B”.
  - Pros: permission checks are very fast, and permissions are easy to delegate (hand the ticket to someone else).
  - Cons: revocation is very hard (you do not know how many copies of the ticket exist or who holds them).
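The two storage layouts can be sketched as two indexes over the same sparse matrix. The subject and object names below (`zhangsan`, `lisi`, `wangwu`, `file_a`, `file_b`) and the helper functions are assumptions made for illustration; real systems store ACLs with the object and capabilities as unforgeable tokens rather than plain dictionaries.

```python
from collections import defaultdict

# Sparse access matrix: only the non-empty cells are stored.
matrix = {
    ("zhangsan", "file_a"): {"r"},
    ("lisi", "file_a"): {"w"},
    ("zhangsan", "file_b"): {"w"},
}

acl = defaultdict(dict)           # object -> {subject: rights}, one column per object
capabilities = defaultdict(dict)  # subject -> {object: rights}, one row per subject
for (subject, obj), rights in matrix.items():
    acl[obj][subject] = set(rights)
    capabilities[subject][obj] = set(rights)


def acl_allows(subject, obj, right):
    """ACL check: ask the object whether this subject is on its list."""
    return right in acl.get(obj, {}).get(subject, set())


def objects_via_acl(subject):
    """With ACLs only, listing a subject's objects means scanning every object."""
    return sorted(obj for obj, entries in acl.items() if subject in entries)


def objects_via_capabilities(subject):
    """With capabilities, the subject's own row already is the answer."""
    return sorted(capabilities.get(subject, {}))


zhangsan_acl_view = objects_via_acl("zhangsan")
zhangsan_cap_view = objects_via_capabilities("zhangsan")

# ACL revocation: edit one entry on the object itself.
del acl["file_a"]["lisi"]
lisi_can_write = acl_allows("lisi", "file_a", "w")

# Capability delegation: Zhang San hands a copy of his tickets to Wang Wu.
wangwu_tickets = {obj: set(r) for obj, r in capabilities["zhangsan"].items()}
# Taking Zhang San's own ticket back does not touch the copy already handed out.
del capabilities["zhangsan"]["file_a"]
copy_still_valid = "r" in wangwu_tickets["file_a"]
```

After the ACL edit, `lisi_can_write` is false immediately. After removing Zhang San's ticket, `copy_still_valid` remains true, which is the revocation problem described above.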
3.2 Policy Models
- DAC (Discretionary Access Control): e.g. Linux file permissions. The resource owner decides who can access it.
- MAC (Mandatory Access Control): e.g. military systems, SELinux. The system administrator mandates the security policy, and users cannot change it themselves.
- RBAC (Role-Based Access Control): e.g. enterprise systems. Permissions are granted to roles, and users are assigned to roles: User $\to$ Role $\to$ Permission.
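The User $\to$ Role $\to$ Permission indirection in RBAC can be sketched as two lookup tables. The role names, permission strings, and helper functions here are hypothetical and serve only to show the shape of the check.

```python
# User -> roles (assignment is managed per user).
user_roles = {
    "zhangsan": {"engineer"},
    "lisi": {"engineer", "release_manager"},
}

# Role -> permissions (policy is managed per role, not per user).
role_permissions = {
    "engineer": {"repo:read", "repo:write"},
    "release_manager": {"deploy:prod"},
}


def permissions_of(user):
    """Union of the permissions of every role the user holds."""
    perms = set()
    for role in user_roles.get(user, set()):
        perms |= role_permissions.get(role, set())
    return perms


def is_allowed(user, permission):
    return permission in permissions_of(user)


zhangsan_deploy = is_allowed("zhangsan", "deploy:prod")
lisi_deploy = is_allowed("lisi", "deploy:prod")

# Changing one role's policy affects every user who holds that role.
role_permissions["engineer"].discard("repo:write")
zhangsan_write_after = is_allowed("zhangsan", "repo:write")
lisi_write_after = is_allowed("lisi", "repo:write")
```

The design point is that no permission is ever attached to a user directly, so onboarding, offboarding, and policy changes each touch exactly one table.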
4. Information Flow Control (IFC)
ACLs control only “who can access data”; they cannot control “where the data flows after access.” Example: you have the right to read a confidential file, but after reading it you send it to a competitor over the network. ACLs are powerless against this.
We need Information Flow Control, which tracks where data flows at run time. It is usually based on security labels, such as “Top Secret,” “Confidential,” and “Public.”
4.1 Bell-LaPadula Model (Confidentiality)
Aims to prevent data leaks.
- No Read Up (NRU): a subject cannot read information above its own level.
- No Write Down (NWD): a subject cannot write to a level below its own.
If you read “Top Secret” information and then write it into a “Public” file, the secret is leaked. This is how a Trojan horse steals data: it runs with your clearance, reads what you are allowed to read, and writes it somewhere a lower level can see. NWD blocks that second step.
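The two Bell-LaPadula rules reduce to two comparisons on security levels. The sketch below assumes a simple total order over the three labels mentioned earlier; `Level`, `blp_can_read`, and `blp_can_write` are hypothetical names.

```python
from enum import IntEnum


class Level(IntEnum):
    PUBLIC = 0
    CONFIDENTIAL = 1
    TOP_SECRET = 2


def blp_can_read(subject, obj):
    """No Read Up: the subject's level must be at least the object's level."""
    return subject >= obj


def blp_can_write(subject, obj):
    """No Write Down: the subject's level must be at most the object's level."""
    return subject <= obj


# A Trojan horse running with TOP_SECRET clearance.
trojan = Level.TOP_SECRET
reads_secret = blp_can_read(trojan, Level.TOP_SECRET)   # allowed
leaks_to_public = blp_can_write(trojan, Level.PUBLIC)    # blocked by NWD

# A PUBLIC subject may still write upward, but cannot read upward.
public_reads_secret = blp_can_read(Level.PUBLIC, Level.TOP_SECRET)
public_writes_secret = blp_can_write(Level.PUBLIC, Level.TOP_SECRET)
```

The Trojan can read the secret but has nowhere lower to put it, which is exactly the leak path the model is meant to close.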
4.2 Biba Model (Integrity)
Aims to prevent data pollution.
- No Read Down (NRD): high-level subjects cannot read low-level data (so that they are not misled by garbage information).
- No Write Up (NWU): low-level subjects cannot modify high-level data (so that they cannot corrupt system configuration).
Summary: the BLP model keeps secrets locked “up above,” while the Biba model keeps garbage out “down below.”
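Biba is the mirror image of BLP: the same two comparisons, with the directions swapped and the levels read as integrity rather than secrecy. A minimal sketch, assuming three hypothetical integrity levels:

```python
from enum import IntEnum


class Integrity(IntEnum):
    UNTRUSTED = 0  # e.g. data downloaded from the network
    USER = 1
    SYSTEM = 2     # e.g. system configuration


def biba_can_read(subject, obj):
    """No Read Down: a subject may only read data at or above its own integrity."""
    return subject <= obj


def biba_can_write(subject, obj):
    """No Write Up: a subject may only write data at or below its own integrity."""
    return subject >= obj


system_reads_untrusted = biba_can_read(Integrity.SYSTEM, Integrity.UNTRUSTED)
untrusted_writes_system = biba_can_write(Integrity.UNTRUSTED, Integrity.SYSTEM)
user_reads_system = biba_can_read(Integrity.USER, Integrity.SYSTEM)
system_writes_user = biba_can_write(Integrity.SYSTEM, Integrity.USER)
```

The first two checks are denied and the last two are allowed: trusted components never consume or get overwritten by less trusted data, while trustworthy data may flow downward freely.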