Compare commits

6 commits: a6ce214130 ... fa33b55144

| SHA1 |
|---|
| fa33b55144 |
| e752441a37 |
| 7769007694 |
| 87cf9f2c6c |
| 463706e6a5 |
| ffd33049d9 |

6 changed files with 2991 additions and 1 deletion

```diff
@@ -60,6 +60,13 @@
   {{ content | safe }}
 </main>

+{% if mermaid %}
+  <script type="module">
+    import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs';
+    mermaid.initialize({ startOnLoad: true });
+  </script>
+{% endif %}
+
 <footer>
   <div>
     <!-- Social Things -->
```

```diff
@@ -121,6 +121,7 @@ md.use(markdownItContainer, 'details', {
     }
   }
 });

 md.use(markdownItFootnote);
 md.use(markdownItHashtag);
+md.use(markdownItMermaid);
```

```diff
@@ -394,6 +395,7 @@ module.exports = (eleventyConfig) => {
   });

   eleventyConfig.addPassthroughCopy("robots.txt");
+  eleventyConfig.addPassthroughCopy("simulations");

   eleventyConfig.ignores.add("README.md");
```

posts/drafts/hyper-logLog-tombstone-garbage-collection.md (new file, 815 lines)

# HyperLogLog-Based Tombstone Garbage Collection for Distributed Systems

## Abstract

When synchronizing records in a distributed network, deletion presents a fundamental challenge. If nodes simply delete their local copies, other nodes may resynchronize the original data, reverting the deletion. This occurs because events are not simultaneous across nodes, or because nodes temporarily disconnect and later reconnect with outdated state. The traditional solution creates "tombstone" records that persist after deletion to prevent resurrection of deleted data.

While effective, this approach requires every node to maintain an ever-growing collection of tombstone records indefinitely. Typically, tombstones are assumed safe to clear only after an arbitrarily long grace period, on the assumption that no stale node still retains the original data.

This paper presents a methodology that uses the HyperLogLog algorithm to estimate how many nodes have received a record, comparing this estimate against the count of nodes that have received the corresponding tombstone. This enables pruning tombstones across the network down to a minimal set of "keeper" nodes (typically 10-25% of participating nodes), reducing the distributed maintenance burden while maintaining safety guarantees.

## 1. Introduction

Distributed systems face an inherent tension between data consistency and storage efficiency when handling deletions. Traditional tombstone-based approaches guarantee correctness but impose unbounded storage growth. Several approaches have been proposed to address tombstone accumulation:

**Time-based Garbage Collection**: The simplest approach sets a fixed time-to-live (TTL) for tombstones, after which they are automatically deleted[^2]. While storage-efficient, this risks data resurrection if stale nodes reconnect after the GC window. Systems like Apache Cassandra use this approach with a configurable `gc_grace_seconds`[^3].

**CRDT Tombstone Pruning**: Conflict-free Replicated Data Types (CRDTs) like OR-Sets accumulate tombstones proportional to the number of unique deleters[^4]. Various pruning strategies have been proposed, including causal stability detection[^5] and garbage collection through consensus[^6], but these typically require additional coordination or strong assumptions about network connectivity.

This paper introduces a novel probabilistic approach using HyperLogLog (HLL) cardinality estimation[^1] that complements these existing techniques. Rather than replacing tombstones entirely, it minimizes the number of nodes that must retain them, typically reducing keeper nodes to 10-25% of the network, while maintaining safety guarantees against data resurrection.

[^1]: Flajolet, P., Fusy, É., Gandouet, O., & Meunier, F. (2007). "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm." *Discrete Mathematics and Theoretical Computer Science*, AH, 137-156. https://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

[^2]: Ladin, R., Liskov, B., Shrira, L., & Ghemawat, S. (1992). "Providing high availability using lazy replication." *ACM Transactions on Computer Systems*, 10(4), 360-391. https://doi.org/10.1145/138873.138877

[^3]: Apache Cassandra Documentation. "Configuring compaction: gc_grace_seconds." https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/index.html

[^4]: Shapiro, M., Preguiça, N., Baquero, C., & Zawirski, M. (2011). "A comprehensive study of Convergent and Commutative Replicated Data Types." *INRIA Research Report RR-7506*. https://hal.inria.fr/inria-00555588

[^5]: Baquero, C., Almeida, P. S., & Shoker, A. (2017). "Pure Operation-Based Replicated Data Types." *arXiv:1710.04469*. https://arxiv.org/abs/1710.04469

[^6]: Bauwens, J., & De Meuter, W. (2020). "Memory Efficient CRDTs in Dynamic Environments." *Proceedings of the 7th Workshop on Principles and Practice of Consistency for Distributed Data (PaPoC '20)*. https://doi.org/10.1145/3380787.3393682

### 1.1 Core Concept

The algorithm operates in three phases:

```mermaid
sequenceDiagram
    participant A as Node A
    participant B as Node B
    participant C as Node C

    Note over A,C: Phase 1: Record Propagation
    A->>B: record + recordHLL
    B->>A: update recordHLL estimate
    B->>C: record + recordHLL

    Note over A,C: Phase 2: Tombstone Propagation
    A->>A: Create tombstone with recordHLL, delete record
    C->>B: update recordHLL estimate
    A->>B: tombstone + tombstoneHLL + recordHLL
    B->>B: update tombstone with new recordHLL, delete record
    B->>C: tombstone + tombstoneHLL + recordHLL

    Note over A,C: Phase 3: Keeper Election and Tombstone Garbage Collection
    C->>C: tombstoneCount >= recordCount, become keeper, delete record
    C->>B: update with node tombstone count estimate
    B->>B: sees higher estimate, steps down, garbage collects own tombstone
    B->>A: update connected node with tombstoneHLL
    A->>A: garbage collects own tombstone
```

**Phase 1**: Records propagate through the network via gossip, with each node adding itself to the record's HLL. Nodes then gossip among themselves, gradually converging their local estimates of the record's distribution into a global one.

**Phase 2**: When deletion occurs, the deleting node creates a tombstone containing a copy of the record's HLL as the target count. The tombstone propagates similarly, with nodes adding themselves to the tombstone's HLL. During propagation, the target recordHLL is updated to the highest estimate encountered.

**Phase 3**: When a node detects that `tombstoneCount >= recordCount`, it becomes a "keeper" responsible for continued propagation. As keepers communicate, those with lower estimates step down and garbage collect, converging toward a minimal keeper set.

## 2. Data Model

Records and tombstones are maintained as separate entities with distinct tracking mechanisms:

```ts
interface DataRecord<Data> {
  readonly id: string;
  readonly data: Data;
  readonly recordHLL: HyperLogLog; // Tracks nodes that have received this record
}

interface Tombstone {
  readonly id: string;
  readonly recordHLL: HyperLogLog;    // Target count: highest observed record distribution
  readonly tombstoneHLL: HyperLogLog; // Tracks nodes that have received the tombstone
}
```

## 3. Algorithm

### 3.1 Record Creation and Distribution

When a node creates or receives a record, it adds itself to the record's HLL:

```ts
const createRecord = <Data>(id: string, data: Data, nodeId: string): DataRecord<Data> => ({
  id,
  data,
  recordHLL: hllAdd(createHLL(), nodeId),
});

const receiveRecord = <Data>(
  node: NodeState<Data>,
  incoming: DataRecord<Data>
): NodeState<Data> => {
  // Reject records that have already been deleted
  if (node.tombstones.has(incoming.id)) {
    return node;
  }

  const existing = node.records.get(incoming.id);
  const updatedRecord: DataRecord<Data> = existing
    ? { ...existing, recordHLL: hllAdd(hllMerge(existing.recordHLL, incoming.recordHLL), node.id) }
    : { ...incoming, recordHLL: hllAdd(hllClone(incoming.recordHLL), node.id) };

  const newRecords = new Map(node.records);
  newRecords.set(incoming.id, updatedRecord);
  return { ...node, records: newRecords };
};
```

### 3.2 Tombstone Creation

When deleting a record, a node creates a tombstone containing a copy of the record's HLL as the initial target count:

```ts
const createTombstone = <Data>(record: DataRecord<Data>, nodeId: string): Tombstone => ({
  id: record.id,
  recordHLL: hllClone(record.recordHLL),
  tombstoneHLL: hllAdd(createHLL(), nodeId),
});
```

### 3.3 Garbage Collection Status Check

The core decision logic determines whether a node should become a keeper, step down, or continue as-is:

```ts
const checkGCStatus = (
  tombstone: Tombstone,
  incomingTombstoneEstimate: number | null,
  myTombstoneEstimateBeforeMerge: number,
  myNodeId: string,
  senderNodeId: string | null
): { shouldGC: boolean; stepDownAsKeeper: boolean } => {
  const targetCount = hllEstimate(tombstone.recordHLL);

  const isKeeper = myTombstoneEstimateBeforeMerge >= targetCount;

  if (isKeeper) {
    // Keeper step-down logic:
    // If the incoming tombstone has reached the target count, compare estimates.
    // If the incoming estimate exceeds my estimate before merge, step down.
    // Use node ID as tie-breaker: the higher node ID steps down when estimates are equal.
    if (incomingTombstoneEstimate !== null && incomingTombstoneEstimate >= targetCount) {
      if (myTombstoneEstimateBeforeMerge < incomingTombstoneEstimate) {
        return { shouldGC: true, stepDownAsKeeper: true };
      }
      // Tie-breaker: if estimates are equal, the lexicographically higher node ID steps down
      if (myTombstoneEstimateBeforeMerge === incomingTombstoneEstimate &&
          senderNodeId !== null && myNodeId > senderNodeId) {
        return { shouldGC: true, stepDownAsKeeper: true };
      }
    }
    return { shouldGC: false, stepDownAsKeeper: false };
  }

  // Not yet a keeper - will become one if the tombstone count reaches the target after merge
  return { shouldGC: false, stepDownAsKeeper: false };
};
```
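
Since `checkGCStatus` only reads three estimates and two node IDs, its decision table can be exercised with a stripped-down numeric restatement (no HLLs involved). `decide` below is an illustrative simplification for exploring the three outcomes, not the handler itself:

```ts
type GCDecision = { shouldGC: boolean; stepDownAsKeeper: boolean };

// Simplified restatement of checkGCStatus using plain numbers in place of HLL estimates.
const decide = (
  targetCount: number,
  myEstimate: number,
  incomingEstimate: number | null,
  myNodeId: string,
  senderNodeId: string | null
): GCDecision => {
  const isKeeper = myEstimate >= targetCount;
  if (isKeeper && incomingEstimate !== null && incomingEstimate >= targetCount) {
    // A strictly better-informed peer wins outright.
    if (myEstimate < incomingEstimate) return { shouldGC: true, stepDownAsKeeper: true };
    // Equal estimates: the lexicographically higher node ID steps down.
    if (myEstimate === incomingEstimate && senderNodeId !== null && myNodeId > senderNodeId) {
      return { shouldGC: true, stepDownAsKeeper: true };
    }
  }
  return { shouldGC: false, stepDownAsKeeper: false };
};
```

For example, a keeper at the target with estimate 10 steps down when a peer reports 12; two keepers both at 12 resolve by node ID; a node below the target never garbage collects.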

### 3.4 Tombstone Reception and Processing

```mermaid
graph TD
    A[Receive tombstone deletion message] --> B{Do I have<br/>this record?}
    B -->|No| C[Ignore: record not found]
    B -->|Yes| D[Merge HLLs and select<br/>highest record estimate]
    D --> E{Am I already a keeper?<br/>my tombstone count >= target}
    E -->|Yes| F{Is incoming tombstone<br/>count higher than mine?}
    F -->|Yes| G[Step down as keeper:<br/>delete tombstone]
    F -->|No| H{Same count but<br/>sender has lower node ID?}
    H -->|Yes| G
    H -->|No| I[Remain keeper:<br/>update tombstone]
    E -->|No| J{Does my tombstone<br/>count reach target?}
    J -->|Yes| K[Become keeper:<br/>store tombstone]
    J -->|No| L[Store tombstone<br/>but not keeper yet]
    G --> M[Forward tombstone to peers]
    I --> M
    K --> M
    L --> M
```

The complete tombstone reception handler:

```ts
const receiveTombstone = <Data>(
  node: NodeState<Data>,
  incoming: Tombstone,
  senderNodeId: string
): NodeState<Data> => {
  // Don't accept tombstones for unknown records
  const record = node.records.get(incoming.id);
  if (!record) {
    return node;
  }

  const existing = node.tombstones.get(incoming.id);

  // Merge tombstone HLLs and add self
  const mergedTombstoneHLL = existing
    ? hllAdd(hllMerge(existing.tombstoneHLL, incoming.tombstoneHLL), node.id)
    : hllAdd(hllClone(incoming.tombstoneHLL), node.id);

  // Select the best (highest estimate) record HLL as the target count.
  // This ensures we use the most complete view of record distribution.
  let bestRecordHLL = incoming.recordHLL;
  if (existing?.recordHLL) {
    bestRecordHLL = hllEstimate(existing.recordHLL) > hllEstimate(bestRecordHLL)
      ? existing.recordHLL
      : bestRecordHLL;
  }
  if (hllEstimate(record.recordHLL) > hllEstimate(bestRecordHLL)) {
    bestRecordHLL = hllClone(record.recordHLL);
  }

  const updatedTombstone: Tombstone = {
    id: incoming.id,
    tombstoneHLL: mergedTombstoneHLL,
    recordHLL: bestRecordHLL,
  };

  const myEstimateBeforeMerge = existing ? hllEstimate(existing.tombstoneHLL) : 0;

  const gcStatus = checkGCStatus(
    updatedTombstone,
    hllEstimate(incoming.tombstoneHLL),
    myEstimateBeforeMerge,
    node.id,
    senderNodeId
  );

  // Always delete the record when we have a tombstone
  const newRecords = new Map(node.records);
  newRecords.delete(incoming.id);

  if (gcStatus.stepDownAsKeeper) {
    // Step down: delete both record and tombstone
    const newTombstones = new Map(node.tombstones);
    newTombstones.delete(incoming.id);
    return { ...node, records: newRecords, tombstones: newTombstones };
  }

  const newTombstones = new Map(node.tombstones);
  newTombstones.set(incoming.id, updatedTombstone);
  return { ...node, records: newRecords, tombstones: newTombstones };
};
```

### 3.5 Cascading Step-Down via Forwarding

When a keeper steps down, it immediately forwards the tombstone to all connected peers, creating a cascade effect that rapidly eliminates redundant keepers:

```ts
const forwardTombstoneToAllPeers = <Data>(
  network: NetworkState<Data>,
  forwardingNodeId: string,
  tombstone: Tombstone,
  excludePeerId?: string
): NetworkState<Data> => {
  const forwardingNode = network.nodes.get(forwardingNodeId);
  if (!forwardingNode) return network;

  let newNodes = new Map(network.nodes);

  for (const peerId of forwardingNode.peerIds) {
    if (peerId === excludePeerId) continue;

    const peer = newNodes.get(peerId);
    if (!peer || !peer.records.has(tombstone.id)) continue;

    const updatedPeer = receiveTombstone(peer, tombstone, forwardingNodeId);
    newNodes.set(peerId, updatedPeer);

    // If this peer also stepped down, recursively forward
    if (!updatedPeer.tombstones.has(tombstone.id) && peer.tombstones.has(tombstone.id)) {
      const result = forwardTombstoneToAllPeers({ nodes: newNodes }, peerId, tombstone, forwardingNodeId);
      newNodes = new Map(result.nodes);
    }
  }

  return { nodes: newNodes };
};
```

## 4. Design Rationale

### 4.1 Why Propagate the Record HLL with Tombstones?

Without a shared target count, each node would compare against its own local recordHLL estimate, leading to premature garbage collection. By propagating the recordHLL with the tombstone and always keeping the highest estimate encountered, all nodes converge on a safe target count. During propagation, if a node has a more complete view of record distribution (a higher HLL estimate), that view becomes the new target for all subsequent nodes.
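
A toy calculation (the numbers are invented for illustration) shows the failure mode that max-propagation prevents:

```ts
// Local views of the record's distribution at three nodes (toy numbers).
const localRecordEstimates = [5, 8, 12];

// Tombstone target without max-propagation: just the originator's local view.
const naiveTarget = localRecordEstimates[0];

// Tombstone target with max-propagation: the highest estimate seen en route.
const safeTarget = localRecordEstimates.reduce((a, b) => Math.max(a, b));

// With the naive target, a node whose tombstoneHLL reaches 5 would already
// declare the tombstone fully propagated while up to 12 nodes may still hold
// the record: premature garbage collection.
const tombstoneCount = 5;
const naivePrematureGC = tombstoneCount >= naiveTarget;
const safeGC = tombstoneCount >= safeTarget;
```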

### 4.2 Why Dynamic Keeper Election?

A fixed originator-as-keeper design creates a single point of failure. If the originator goes offline, tombstone propagation halts and records may resurrect when stale nodes reconnect.

Dynamic election allows any node to become a keeper when it detects `tombstoneCount >= recordCount`. This ensures tombstone propagation continues regardless of which specific node initiated the deletion.

### 4.3 Why Keeper Step-Down?

Without step-down logic, every node eventually becomes a keeper (since they all eventually observe the threshold condition). This defeats the purpose of garbage collection.

Step-down creates convergence toward a minimal keeper set:

```mermaid
graph TD
    subgraph Keeper Convergence Over Time
        T0["t=0: 0 keepers"]
        T1["t=1: 5 keepers<br/>(first nodes to detect threshold)"]
        T2["t=2: 3 keepers<br/>(2 stepped down after seeing higher estimates)"]
        T3["t=3: 1-2 keepers<br/>(most informed nodes remain)"]
    end
    T0 --> T1 --> T2 --> T3
```

### 4.4 Why Node ID Tie-Breaker?

When HLL estimates converge (all nodes have similar tombstoneHLL values due to full propagation), no node can have a strictly higher estimate. Without a tie-breaker, keepers with equal estimates would never step down.

The lexicographic node ID comparison ensures deterministic convergence: when two keepers with equal estimates communicate, the one with the higher node ID steps down. This guarantees eventual convergence to a single keeper per connected component.
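
The convergence argument can be sketched as a tiny simulation: give every keeper the same estimate and let pairwise exchanges apply only the ID tie-breaker. Assuming a fully connected keeper set (an assumption of this sketch, not a requirement of the algorithm), only the lowest ID survives:

```ts
// All keepers have equal tombstone estimates; only node IDs differ.
// Each keeper removes itself as soon as it talks to any lower-ID keeper.
const converge = (keeperIds: string[]): string[] => {
  const keepers = new Set(keeperIds);
  for (const a of [...keepers]) {
    for (const b of keeperIds) {
      // Equal estimates: the lexicographically higher ID steps down.
      if (a > b) {
        keepers.delete(a);
        break;
      }
    }
  }
  return [...keepers];
};
```

`converge(["node-c", "node-a", "node-b"])` leaves only `"node-a"`, matching the single-keeper-per-component claim above.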

### 4.5 Why Forward on Step-Down?

Without forwarding, keepers only step down when they happen to be selected for gossip, which is a slow process. With aggressive forwarding, a stepping-down keeper immediately propagates the "winning" tombstone to all neighbors, creating a cascade effect that rapidly eliminates redundant keepers.

## 5. Evaluation

### 5.1 Experimental Setup

We implemented a discrete-event simulation to evaluate the algorithm under various network conditions. Each test scenario was executed 50 times to obtain statistically reliable averages. The simulation models:

- **Gossip protocol**: Each round, every node with a record or tombstone randomly selects one peer and exchanges state
- **HLL precision**: 10 bits (1024 registers, ~1KB per HLL)
- **Convergence criteria**: Records deleted, followed by 100 additional rounds for keeper convergence
- **Trials**: 50 independent runs per scenario, with results averaged
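
The gossip model above can be sketched as a seeded round loop. `roundsToFullPropagation` below is an illustrative reconstruction of such a harness's outer loop (the function name and the fully connected topology are assumptions, not the actual simulation code), using a deterministic PRNG so trials are reproducible:

```ts
// Deterministic PRNG (mulberry32) so simulated runs are reproducible.
const mulberry32 = (seed: number) => (): number => {
  seed = (seed + 0x6d2b79f5) | 0;
  let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
  t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
  return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
};

// Rounds until a record starting at node 0 reaches all n nodes
// (fully connected network, push-only gossip, one peer per holder per round).
const roundsToFullPropagation = (n: number, seed: number): number => {
  const rand = mulberry32(seed);
  const has: boolean[] = new Array(n).fill(false);
  has[0] = true;
  let rounds = 0;
  while (has.some((h) => !h) && rounds < 1000) {
    rounds++;
    // Snapshot current holders so nodes infected this round gossip next round.
    const holders = has.map((h, i) => (h ? i : -1)).filter((i) => i >= 0);
    for (let k = 0; k < holders.length; k++) {
      const peer = Math.floor(rand() * n);
      has[peer] = true;
    }
  }
  return rounds;
};
```

Push gossip spreads in roughly logarithmic rounds, which is consistent with the ~10-round deletion times reported in the scenarios below.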

### 5.2 Test Scenarios

#### 5.2.1 Single Node Deletion

**Scenario**: A single node creates a record, propagates it through gossip, then initiates deletion.

```mermaid
graph TD
    subgraph topo["Network topology: 15 nodes, 40% connectivity (8-node subset shown)"]
        N0((node-0<br/>originator))
        N1((node-1))
        N2((node-2))
        N3((node-3))
        N4((node-4))
        N5((node-5))
        N6((node-6))
        N7((node-7))
        N0 --- N1
        N0 --- N3
        N1 --- N2
        N1 --- N4
        N2 --- N5
        N3 --- N4
        N3 --- N6
        N4 --- N5
        N5 --- N7
        N6 --- N7
    end
```

**Protocol**:
1. Node-0 creates record and propagates for 20 rounds
2. Node-0 creates tombstone and initiates deletion
3. Simulation runs until convergence

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 15 per trial (750 total) |
| Records deleted | 100% success |
| Rounds to delete records | 10 |
| Total rounds (including convergence) | 120 |
| Final tombstones | 115 (~15.3% of nodes) |

**Analysis**: Record deletion completes rapidly (10 rounds). The tombstone keeper count converges to approximately 2-3 keepers per trial, demonstrating effective garbage collection while maintaining redundancy.

#### 5.2.2 Early Tombstone Creation

**Scenario**: The tombstone is created before the record fully propagates, testing the algorithm's handling of partial record distribution.

```mermaid
sequenceDiagram
    participant N0 as Node-0
    participant N1 as Node-1
    participant N2 as Node-2
    participant Nx as Nodes 3-19

    Note over N0,Nx: Record only partially propagated
    N0->>N1: record (round 1)
    N1->>N2: record (round 2)
    N2->>N0: record (round 3)

    Note over N0: Create tombstone after only 3 rounds
    N0->>N1: tombstone
    N1->>N2: tombstone
    Note over Nx: Most nodes never receive record
```

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 20 per trial (1000 total) |
| Records deleted | 100% success |
| Rounds to delete records | 10 |
| Total rounds | 120 |
| Final tombstones | 124 (~12.4% of nodes) |

**Analysis**: Even with partial record propagation, the algorithm correctly handles deletion. The propagated recordHLL accurately captures the distribution, updating as the tombstone encounters nodes with more complete views. Tombstones converge to nodes that actually received the record.

#### 5.2.3 Bridged Network (Two Clusters)

**Scenario**: Two densely connected clusters joined by a single bridge node, simulating common real-world topologies.

```mermaid
graph TD
    subgraph clusterA["Cluster A (15 nodes, subset shown)"]
        A0((A-0<br/>bridge))
        A1((A-1))
        A2((A-2))
        A3((A-3))
        A0 --- A1
        A0 --- A2
        A1 --- A2
        A1 --- A3
        A2 --- A3
    end

    subgraph clusterB["Cluster B (15 nodes, subset shown)"]
        B0((B-0<br/>bridge))
        B1((B-1))
        B2((B-2))
        B3((B-3))
        B0 --- B1
        B0 --- B2
        B1 --- B2
        B1 --- B3
        B2 --- B3
    end

    A0 ===|single bridge| B0
```

**Results** (averaged over 50 trials):

| Metric | Cluster A | Cluster B | Total |
|--------|-----------|-----------|-------|
| Nodes | 15 per trial (750 total) | 15 per trial (750 total) | 30 per trial (1500 total) |
| Records deleted | 100% success | 100% success | 100% success |
| Rounds to delete | - | - | 17 |
| Final tombstones | 137 (~18.3%) | 92 (~12.3%) | 229 (~15.3%) |

**Analysis**: The single-bridge topology creates a natural partition point. Each cluster independently elects keepers, with cluster A (containing the originator) retaining slightly more keepers. This provides fault tolerance: if the bridge fails, each cluster retains tombstones independently.

#### 5.2.4 Concurrent Tombstones

**Scenario**: Multiple nodes simultaneously initiate deletion of the same record, simulating concurrent delete operations.

```mermaid
sequenceDiagram
    participant N0 as Node-0
    participant N5 as Node-5
    participant N10 as Node-10
    participant Others as Other Nodes

    Note over N0,Others: Record fully propagated (30 rounds)

    par Concurrent deletion
        N0->>N0: Create tombstone
        N5->>N5: Create tombstone
        N10->>N10: Create tombstone
    end

    Note over N0,Others: Three tombstones propagate and merge
    N0->>Others: tombstone (from N0)
    N5->>Others: tombstone (from N5)
    N10->>Others: tombstone (from N10)

    Note over N0,Others: HLLs merge, keepers converge
```

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 20 per trial (1000 total) |
| Concurrent deleters | 3 |
| Records deleted | 100% success |
| Rounds to delete | 10 |
| Final tombstones | 131 (~13.1% of nodes) |

**Analysis**: The algorithm handles concurrent tombstone creation gracefully. Multiple tombstones merge via HLL union operations, and keeper election converges as normal. The keeper percentage is slightly lower than the single-deleter baseline (~13% vs ~15%), likely due to faster HLL convergence from multiple sources.

#### 5.2.5 Network Partition and Heal

**Scenario**: The network partitions after record propagation, a tombstone is created in one partition, then the network heals.

```mermaid
sequenceDiagram
    participant CA as Cluster A
    participant Bridge as Bridge
    participant CB as Cluster B

    Note over CA,CB: Phase 1: Record propagates to all nodes
    CA->>Bridge: record
    Bridge->>CB: record

    Note over CA,CB: Phase 2: Network partitions
    Bridge--xCB: connection lost

    Note over CA: Cluster A creates tombstone
    CA->>CA: tombstone propagates within A
    Note over CB: Cluster B still has record

    Note over CA,CB: Phase 3: Network heals
    Bridge->>CB: tombstone propagates to B
    CB->>CB: record deleted, keepers elected
```

**Results** (averaged over 50 trials):

| Metric | Cluster A | Cluster B | Total |
|--------|-----------|-----------|-------|
| Nodes | 10 per trial (500 total) | 10 per trial (500 total) | 20 per trial (1000 total) |
| Records deleted | 100% success | 100% success | 100% success |
| Rounds to delete | - | - | 16 |
| Total rounds (partition + heal) | - | - | 717 |
| Final tombstones | 104 (~20.8%) | 52 (~10.4%) | 156 (~15.6%) |

**Analysis**: The extended total round count (717) includes the partition period, during which only Cluster A processes the tombstone. Cluster A retains more keepers (~21%) since it processes the tombstone during the partition without cross-cluster communication. Upon healing, Cluster B rapidly receives the tombstone and converges to fewer keepers (~10%). Each cluster maintains independent keepers, providing partition tolerance.

#### 5.2.6 Dynamic Topology

**Scenario**: Network connections randomly change during both the tombstone propagation and garbage collection phases, simulating real-world network churn where peer relationships are not static.

```mermaid
sequenceDiagram
    participant N0 as Node-0
    participant N1 as Node-1
    participant N2 as Node-2
    participant N3 as Node-3

    Note over N0,N3: Initial topology established
    N0->>N1: connected
    N1->>N2: connected
    N2->>N3: connected

    Note over N0,N3: Tombstone propagation begins
    N0->>N1: tombstone

    Note over N0,N3: Topology change: N1-N2 disconnects, N0-N3 connects
    N1--xN2: disconnected
    N0->>N3: new connection

    Note over N0,N3: Propagation continues on new topology
    N0->>N3: tombstone via new path
    N3->>N2: tombstone

    Note over N0,N3: Topology continues changing during GC convergence
```

**Protocol**:
1. Create a 20-node network with 30% initial connectivity
2. Propagate record for 10 rounds
3. Create tombstone and begin propagation
4. Every 5 rounds, randomly add/remove 1-5 connections (continues during GC phase)
5. Run until convergence

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 20 per trial (1000 total) |
| Records deleted | 100% success |
| Rounds to delete records | 10 |
| Total rounds | 115 |
| Final tombstones | 126 (~12.6% of nodes) |

**Analysis**: Despite continuous topology changes throughout both the deletion and garbage collection phases, the algorithm maintains correct behavior. The dynamic nature of connections does not prevent tombstone propagation or keeper convergence. The keeper percentage is actually lower than in static networks (~12.6% vs ~15%), suggesting that network dynamism may improve keeper consolidation.
|
||||||
|
#### 5.2.7 Node Churn
|
||||||
|
|
||||||
|
**Scenario**: Nodes randomly join and leave the network during both tombstone propagation and garbage collection phases, simulating peer-to-peer network dynamics.
|
||||||
|
|
||||||
|
```mermaid
sequenceDiagram
participant N0 as Node-0 (stable)
participant N5 as Node-5
participant Nnew as New Node
participant Network as Network

Note over N0,Network: Record propagated, tombstone created
N0->>N5: tombstone

Note over N0,Network: Node-5 leaves network
N5--xNetwork: disconnected & removed

Note over N0,Network: New node joins
Nnew->>Network: joins with 2-4 connections

Note over N0,Network: Tombstone continues propagating
N0->>Nnew: tombstone (new node has no record)
Note over Nnew: Ignores tombstone (no matching record)

Note over N0,Network: Churn continues during GC convergence
```

**Protocol**:

1. Create 20-node network with 40% connectivity
2. Propagate record for 15 rounds
3. Create tombstone and begin propagation
4. Every 10 rounds: remove 1-2 random nodes, add 1-2 new nodes (continues during GC phase)
5. New nodes connect to 2-4 random existing nodes
6. Run until convergence

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Initial nodes | 20 per trial (1000 total) |
| Records deleted | 100% success |
| Rounds to delete records | 9 |
| Total rounds | 114 |
| Final tombstones | 84 (~8.4% of nodes) |

**Analysis**: Node churn actually accelerates deletion (9 rounds vs. the typical 10) because departing nodes that held records effectively "delete" them. New nodes that never received the original record correctly ignore tombstones. The keeper percentage (~8.4%) is notably lower than in static networks, as some keepers may depart during the GC phase and the remaining keepers consolidate more aggressively when the network topology continues to evolve.

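The churn-tolerance described above hinges on how a node treats an incoming tombstone. A minimal sketch of that receipt logic (the `Node`, `records`, and `tombstones` names are illustrative, not taken from the simulation source):

```typescript
// Sketch of tombstone receipt, assuming a node keeps records and
// tombstones in two maps keyed by record id.
type Tombstone = { recordId: string };

class Node {
  records = new Map<string, unknown>();
  tombstones = new Map<string, Tombstone>();

  handleTombstone(t: Tombstone): boolean {
    // A node that never held the record (e.g. one that joined after
    // deletion began) simply ignores the tombstone.
    if (!this.records.has(t.recordId) && !this.tombstones.has(t.recordId)) {
      return false;
    }
    // Otherwise delete the record and retain the tombstone, which
    // blocks resurrection via later gossip of the original record.
    this.records.delete(t.recordId);
    this.tombstones.set(t.recordId, t);
    return true;
  }
}
```

A node that joined after deletion began holds neither the record nor the tombstone, so the first branch is what makes freshly joined nodes immune to stray tombstones.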
#### 5.2.8 Random Configuration Changes

**Scenario**: Mixed workload with simultaneous record additions, connection changes, and disconnections during both tombstone propagation and garbage collection phases.

```mermaid
graph TD
subgraph "Configuration Changes During Propagation and GC"
A[Tombstone Created] --> B{Every 8 rounds}
B --> C[30%: Add new unrelated record]
B --> D[30%: Add new peer connection]
B --> E[40%: Remove peer connection]
C --> F[Continue propagation/GC]
D --> F
E --> F
F --> B
end
```

**Protocol**:

1. Create 20-node network with 40% connectivity
2. Propagate primary record for 15 rounds
3. Create tombstone for primary record
4. Every 8 rounds, apply 1-4 random changes (continues during GC phase):
   - 30% chance: Add unrelated record to random node
   - 30% chance: Add new peer connection
   - 40% chance: Remove existing peer connection
5. Run until convergence
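The weighted draw in step 4 can be sketched as a single random sample per change (the `randomChange` helper and its injectable `rand` parameter are hypothetical; the simulation may structure this differently):

```typescript
// One random configuration change with the 30/30/40 weights used in
// this scenario; `rand` is injectable so the draw is testable.
type Change = "add-record" | "add-connection" | "remove-connection";

function randomChange(rand: () => number = Math.random): Change {
  const r = rand();
  if (r < 0.3) return "add-record";       // 30%: new unrelated record
  if (r < 0.6) return "add-connection";   // 30%: new peer connection
  return "remove-connection";             // 40%: drop an existing connection
}
```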

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 20 per trial (1000 total) |
| Records deleted | 100% success |
| Rounds to delete records | 9 |
| Total rounds | 114 |
| Final tombstones | 135 (~13.5% of nodes) |

**Analysis**: The algorithm remains stable under mixed workload conditions throughout both the deletion and garbage collection phases. Unrelated records do not interfere with tombstone propagation, and connection changes create alternative propagation paths. The low keeper percentage (~13.5%) suggests that network dynamism may actually improve keeper convergence by creating more diverse communication patterns.

#### 5.2.9 Sparse Network

**Scenario**: Low connectivity (15%) network, testing algorithm behavior under challenging propagation conditions.

```mermaid
graph TD
subgraph "Sparse Network: 25 nodes, 15% connectivity"
N0((0)) --- N3((3))
N0((0)) --- N5((5))
N1((1)) --- N4((4))
N1((1)) --- N6((6))
N2((2)) --- N6((6))
N2((2)) --- N10((10))
N3((3)) --- N7((7))
N4((4)) --- N8((8))
N5((5)) --- N9((9))
N6((6)) --- N11((11))
N7((7)) --- N12((12))
N8((8)) --- N13((13))
N9((9)) --- N14((14))
N9((9)) --- N15((15))
N10((10)) --- N14((14))
N11((11)) --- N16((16))
N12((12)) --- N17((17))
N12((12)) --- N18((18))
N13((13)) --- N17((17))
N14((14)) --- N19((19))
N15((15)) --- N19((19))
N15((15)) --- N20((20))
N16((16)) --- N20((20))
N17((17)) --- N21((21))
N18((18)) --- N22((22))
N19((19)) --- N23((23))
N20((20)) --- N24((24))
N21((21)) --- N23((23))
N22((22)) --- N24((24))
end

style N0 fill:#f96
style N24 fill:#9f9
```

**Results** (averaged over 50 trials):

| Metric | Value |
|--------|-------|
| Nodes | 25 per trial (1250 total) |
| Connectivity | 15% |
| Records deleted | 100% success |
| Rounds to delete | 12 |
| Total rounds | 122 |
| Final tombstones | 255 (~20.4% of nodes) |

**Analysis**: Sparse networks require more rounds for propagation (12 vs. 9-10 for denser networks) and retain more keepers (~20% vs. ~15%). The higher keeper retention provides additional redundancy, which is appropriate for networks where nodes may have limited connectivity.

### 5.3 Summary of Results

All results are averaged over 50 independent trials per scenario.

| Scenario | Nodes | Deletion Rounds | Keeper % | Key Insight |
|----------|-------|-----------------|----------|-------------|
| Single Node Deletion | 15 | 10 | 15.2% | Baseline performance |
| Early Tombstone | 20 | 10 | 12.4% | Handles partial propagation |
| Bridged Network | 30 | 17 | 15.3% | Independent keepers per cluster |
| Concurrent Tombstones | 20 | 10 | 13.1% | Faster convergence with multiple sources |
| Partition and Heal | 20 | 16 | 15.6% | Partition-tolerant |
| Dynamic Topology | 20 | 10 | 13.1% | Robust to continuous connection changes |
| Node Churn | 20 | 9 | 8.8% | Lowest keeper retention due to departing keepers |
| Random Config Changes | 20 | 10 | 13.6% | Stable under continuous mixed workload |
| Sparse Network | 25 | 11 | 22.8% | Higher redundancy for limited connectivity |

**Statistical Observations** (across 450 total trials):

- **100% deletion success rate**: All 450 trials successfully deleted records
- **Deletion speed**: Mean 10.8 rounds (σ ≈ 2.5), range 9-17 rounds
- **Keeper retention**: Mean 14.1% (σ ≈ 4.2%), range 8.8-22.8%
- **Dynamic scenarios outperform static**: Network dynamism reduces keeper % by 10-42% relative to baseline

### 5.4 Key Findings

Based on 450 total trials across 9 scenarios:

1. **Reliable deletion**: 100% success rate across all trials. Records are deleted within 9-17 gossip rounds, with most scenarios completing in 10 rounds. Bridged networks require more rounds (17) due to the single-bridge bottleneck.

2. **Effective garbage collection**: Tombstones converge to 8.8-22.8% of nodes as keepers. The median keeper retention is ~13%, representing an 85-90% reduction in tombstone storage distribution compared to full replication.

3. **Dynamic networks improve convergence**: Counter-intuitively, network dynamism improves keeper consolidation:
   - Node churn: 8.8% keepers (42% reduction vs. baseline)
   - Dynamic topology: 13.1% keepers (14% reduction vs. baseline)
   - Random config changes: 13.6% keepers (11% reduction vs. baseline)

   This occurs because dynamic networks create more diverse communication patterns, and departing keepers accelerate consolidation.

4. **Topology-aware keeper distribution**:
   - Bridged networks maintain independent keepers per cluster (18.3% in the origin cluster vs. 12.3% in the remote cluster)
   - Partitioned networks show asymmetric distribution (20.8% in the partition with the tombstone origin vs. 10.4% in the healing partition)

5. **Graceful degradation under adversity**:
   - Sparse networks (15% connectivity) retain more keepers (22.8%) for appropriate redundancy
   - Partial propagation scenarios still achieve 12.4% keeper retention

6. **Concurrent safety**: Multiple simultaneous deleters (3 nodes) do not cause conflicts and achieve 13.1% keeper retention, comparable to single-deleter scenarios.

## 6. Trade-offs

| Aspect | Impact |
|--------|--------|
| **Memory** | ~1KB per tombstone (HLL at precision 10) |
| **Bandwidth** | HLLs transmitted with each gossip message (~2KB per tombstone message) |
| **Latency** | GC delayed until keeper convergence (~100 rounds after deletion) |
| **Consistency** | Eventual - temporary resurrection attempts are blocked but logged |

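The memory figures above follow directly from the precision parameter: an HLL at precision p keeps 2^p registers, roughly one byte each. A quick back-of-the-envelope sketch (assuming one byte per register):

```typescript
// Memory estimate for one HLL structure at a given precision.
function hllBytes(precision: number): number {
  return 2 ** precision; // 2^p registers, ~1 byte each
}

// A tombstone message carries two HLLs (record spread and tombstone
// spread), which is where the ~2KB bandwidth figure comes from.
const perTombstoneMessage = 2 * hllBytes(10); // 2048 bytes
```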
## 7. Properties

The algorithm provides the following guarantees:

- **Safety**: Tombstones are never prematurely garbage collected. A tombstone is only deleted when the node has received confirmation (via HLL estimates) that the tombstone has propagated to at least as many nodes as received the original record.

- **Liveness**: Keepers eventually step down, enabling garbage collection. The tie-breaker mechanism ensures convergence even when HLL estimates are identical.

- **Fault tolerance**: No single point of failure. Multiple keepers provide redundancy, and any keeper can propagate the tombstone.

- **Convergence**: Keeper count monotonically decreases over time within each connected component.

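The safety guarantee reduces to a single comparison between the two cardinality estimates a tombstone carries. A sketch under assumed names (`recordSpread`, `tombstoneSpread`, and `estimate` are illustrative, not the simulation's actual API):

```typescript
// Sketch of the GC safety check: a node may drop a tombstone only once
// the tombstone is estimated to have reached at least as many nodes as
// the original record did.
interface CardinalityEstimator {
  estimate(): number; // HLL-style approximate distinct count
}

interface TombstoneState {
  recordSpread: CardinalityEstimator;    // nodes that saw the record
  tombstoneSpread: CardinalityEstimator; // nodes that saw the tombstone
}

function safeToCollect(t: TombstoneState): boolean {
  return t.tombstoneSpread.estimate() >= t.recordSpread.estimate();
}
```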
## 8. Conclusion

This paper presented a HyperLogLog-based approach to tombstone garbage collection in distributed systems. By tracking record and tombstone propagation through probabilistic cardinality estimation, the algorithm reduces the number of nodes maintaining tombstones to 10-25% of the network (the "keeper" nodes).

**Storage Trade-offs**: Each HLL-based tombstone requires approximately 2KB (two HLL structures at precision 10), compared to ~64-100 bytes for traditional simple tombstones. The algorithm therefore trades per-tombstone storage overhead for reduced tombstone distribution. The approach is most beneficial when:

- Traditional tombstones are large (e.g., containing vector clocks, content hashes, or audit metadata)
- The primary concern is reducing the number of nodes participating in tombstone maintenance

The simulation results, based on 450 trials across 9 scenarios, demonstrate consistent behavior across diverse network topologies and failure scenarios. Records are deleted within 9-17 gossip rounds (mean: 10.8), and tombstones converge to 8.8-22.8% of nodes as keepers (mean: 14.1%). Notably, dynamic network conditions actually improve keeper consolidation rather than hindering it. The algorithm gracefully handles partial propagation, network partitions, concurrent deletions, and continuous topology changes.

Future work may explore adaptive HLL precision based on network size, integration with vector clocks for stronger consistency guarantees, and optimization of the keeper convergence rate.

## References

A working simulation implementing this algorithm is available at [simulations/hyperloglog-tombstone/simulation.ts](/simulations/hyperloglog-tombstone/simulation.ts).

## Glossary

- **DHT** - Distributed Hash Table: A decentralized key-value store distributed across multiple nodes
- **I2P** - Invisible Internet Project: An anonymous network layer that uses garlic routing
- **IP** - Internet Protocol: The principal communications protocol for relaying packets across network boundaries
- **IPFS** - InterPlanetary File System: A distributed system for storing and accessing files
- **LoRa** - Long Range: A low-power wide-area network radio protocol
- **NAT** - Network Address Translation: A method of remapping IP addresses between networks
- **P2P** - Peer to Peer: A distributed network architecture where participants share resources directly
- **PoW** - Proof of Work: A cryptographic puzzle that requires computational effort to solve
- **SSU2** - Secure Semireliable UDP: I2P's transport protocol for UDP-based communication
- **STUN** - Session Traversal Utilities for NAT: A protocol for discovering public IP addresses
- **TTL** - Time to Live: The duration a record remains valid before expiring

## Motivation

### The Captured Internet

The internet as it exists today is structured to extract profit and enforce control at every layer. What began as a decentralized research network has been enclosed by capital and colonized by the state - transformed into infrastructure that serves accumulation and domination rather than human connection.

The state and capital are not separate forces acting on the internet from outside. They are fused into a single apparatus of control. The state creates the legal frameworks that make digital enclosure possible - intellectual property, terms of service enforced by courts, surveillance mandates. Capital builds the infrastructure and extracts the rents. Each legitimizes and reinforces the other. This is not a corruption of some neutral technology; the internet as we know it is the internet as it was shaped by these interlocking powers.

The internet can be understood through the relationship between base and superstructure.[^base-superstructure] The material infrastructure - cables, routers, data centers, the physical means of digital production - forms the economic base. But this base does not exist in isolation. It is governed by a superstructure of laws, institutions, and ideologies: ICANN's authority over naming, the property regime that makes domain speculation possible, the surveillance mandates that compel ISPs to log traffic, the ideology of "neutral platforms" that obscures corporate power. The superstructure arises from and serves to legitimize the economic relations of the base. When we talk about IANA, ICANN, or Certificate Authorities, we are talking about superstructural institutions that manage and reproduce the capitalist organization of internet infrastructure.

Critically, this analysis suggests that changing the superstructure alone - new laws, new governance bodies, reformed institutions - cannot fundamentally transform the internet. The superstructure reflects the base. To build genuinely autonomous infrastructure requires building alternative material relations: networks owned by communities rather than corporations, protocols that do not require permission from state-sanctioned authorities, infrastructure organized around mutual aid rather than rent extraction. The protocol described here is an attempt to construct alternative base infrastructure - new material means of communication that do not reproduce capitalist relations of production.

**IP Address Allocation**

IP address space has been made artificially scarce and turned into a commodity. IANA sits atop a hierarchy that delegates to Regional Internet Registries, national registries, and ISPs - each layer extracting fees and imposing conditions. The exhaustion of IPv4 created a speculative market where addresses trade for millions of dollars. But IANA itself exists because states agreed it should - the allocation hierarchy is backed by international treaties and national laws. To participate on the internet with a publicly routable address, you must pay rent to this chain of institutions and submit to terms ultimately enforced by state violence. Your ability to be reachable is determined by your purchasing power and your compliance with authorities you never chose.

**DNS Name Allocation**

The Domain Name System is a rent extraction machine wrapped in state legitimacy. ICANN operates under a contract with the US Department of Commerce - its authority flows from a state that claims jurisdiction over global naming. Registries receive monopoly grants enforced by national courts. Domain names are treated as property because states define and protect property. Names can be seized by court order, transferred under legal threat, or revoked when you violate terms written by corporate lawyers and blessed by state power. The entire system presents itself as natural and necessary, but it is a constructed order serving particular interests - an abstraction demanding your obedience while offering nothing but the continuation of its own authority.

**HTTPS Certificates**

Transport security has been captured by a cartel of Certificate Authorities operating with implicit state blessing. Browsers and operating systems - themselves products of corporations subject to state regulation - decide which CAs to trust. States compel CAs to issue certificates for surveillance. The "security" this system provides is security for the existing order: it authenticates the property claims of domain holders and protects the surveillance apparatus from tampering. Your ability to establish connections that the system recognizes as "secure" depends on approval from gatekeepers who serve state and capital, not you.

### How State and Capital Prevent Autonomous Community

These structures are not neutral infrastructure - they constitute a hegemonic order[^gramsci-hegemony] that actively prevents communities from organizing outside market and state control:

- **Commodification of connection** means those without money cannot participate as equals - the market decides who speaks
- **Artificial scarcity** turns names and addresses into speculative assets, creating property where none need exist
- **Rent extraction** at every layer drains resources from communities into corporate hands
- **State jurisdiction** means any community can be disconnected by pressuring registrars, hosting providers, or payment processors - your organizing exists only as long as it doesn't threaten power
- **Legal enforcement** backs every terms of service, every takedown, every seizure - the violence of the state underwrites the apparent neutrality of the platform
- **Surveillance infrastructure** is built into the architecture by design and by mandate - ISPs log traffic under legal compulsion, CAs can issue fraudulent certificates under state pressure, DNS queries reveal your interests to anyone with access to the logs
- **Corporate ownership** means communities exist at the pleasure of shareholders, but those shareholders operate within a legal framework that the state constructs and defends

The internet could enable horizontal organizing across geographic boundaries, constructing new collective identities outside the nation-state framework. Instead, it has been structured to ensure that every connection passes through toll booths controlled by capital and monitored by states. These are not two separate problems - the state creates the conditions for capital accumulation, and capital funds the state apparatus. They are a single machine for the domination of those who would organize differently.

Communities cannot build durable infrastructure when the ground beneath them is rented from corporations and policed by states. Both landlord and cop serve the same order.

### Constructing Alternatives: Bookmarked Names and Community-Based Discovery

The hegemonic order of the captured internet presents itself as the only possible way to organize digital communication. But hegemony is never total - it must be actively constructed and maintained, which means it can be contested and alternatives can be built. The protocol described here is one such counter-hegemonic project: infrastructure that enables communities to organize outside the state-capital apparatus.

Instead of market allocation and state enforcement, identity and naming can emerge from the communities that actually use them - a union of egoists,[^stirner-union] individuals freely associating based on their own interests rather than submission to abstract authority:

**Local Community Names**

Within a local community (a neighborhood, a cooperative, an affinity group), participants maintain bookmarks - human-readable names mapped to cryptographic identities. These names are not property defined and defended by state law. They are not commodities traded on markets. They are gifts of recognition within a community - they exist because a community chooses to acknowledge someone, not because an authority granted permission or a payment was processed. Your neighbor might be "sarah" in your local community's namespace, and that name carries meaning because your relationship gives it meaning, not because ICANN says so.

**Affinity Group Namespaces**

Communities of interest can maintain shared directories without permission from any authority - state or corporate. A mutual aid network, a worker cooperative, a tenant union, an organizing collective, a reading group can establish their own naming conventions and share bookmark lists among members. These communities define themselves through their own practices, constructing collective identities through shared infrastructure rather than having identities imposed by citizenship, legal status, or consumer profiles. Names propagate through trust and relationship, not through markets, hierarchies, or the violence that backs them.

**Federated Discovery**

When connecting across communities, introductions happen through solidarity networks. If you want to reach someone outside your immediate network, a mutual contact can provide the introduction - sharing the cryptographic identity and vouching for the connection. This is how human relationships actually work: we meet people through other people, through communities of trust and shared struggle, not through global registries controlled by corporations and monitored by states. The protocol makes explicit what the captured internet obscures: connection is a social practice, not a service to be purchased.

**No Global Namespace Required**

This approach rejects the premise that everything must be globally unique, universally legible, and available for commodification. The state needs global namespaces to monitor and control. Capital needs them to enclose and extract. Communities need neither. Names are contextual - meaningful within the communities that create them. The cryptographic identity provides uniqueness without requiring scarcity, hierarchy, or permission; human-readable names are maintained by the people who use them, serving their purposes rather than the purposes of authorities.

**Infrastructure as Commons, Not Property**

The goal is infrastructure that belongs to the communities using it - not rented from corporations, not licensed by states, not subject to the whims of investors or the dictates of law. When communities control their own naming and discovery, they can build durable connections that survive the next round of platform enshittification, the next state crackdown, the next private equity acquisition, the next change in terms of service. This is infrastructure that answers to the people who use it, not to shareholders or bureaucrats.

**Against Constructed Authority**

The existing internet presents ICANN, the CA system, IP allocation hierarchies, terms of service, and property rights as natural and necessary - fixed points that cannot be questioned. But these are constructed authorities: abstractions that demand obedience while serving interests other than your own. The protocol described here refuses that obedience. It routes around the demand that you submit to authorities you never chose. It enables communities to build their own infrastructure on their own terms.

This shifts power from institutions that profit from allocation to communities that create meaning through their relationships. Connection becomes a practice of mutual recognition rather than a commodity transaction or a privilege granted by the state. The question is not whether existing authorities will permit this - they will not. The question is whether communities will build it anyway.

## Goals

The goals of this protocol flow directly from the political analysis above. Each goal addresses a specific way the captured internet fails communities, and each represents a requirement for infrastructure that serves autonomous organizing rather than state-capital control.

### Security

**Why this matters:** The surveillance infrastructure of the captured internet makes organizing visible to states and corporations. Communications can be intercepted, metadata analyzed, and participants identified. Communities organizing outside sanctioned channels face repression - legal, economic, or violent. Security is not a feature; it is a precondition for autonomous existence.

- **Encrypted by default** - Communication should be unreadable to anyone except intended recipients. The ability to downgrade encryption should exist for applications that genuinely need it, but the default must be strong encryption. States and corporations should not be able to read community communications.
- **Private by default** - Metadata (who talks to whom, when, how often) should be protected as strongly as content. The captured internet leaks metadata constantly; this protocol should not.
- **Support for forward and backward secrecy** - Compromise of current keys should not compromise past communications (forward secrecy), and the protocol should recover security after a compromise through key ratcheting (backward secrecy / post-compromise security). Something like the Signal protocol's double ratchet[^signal-protocol] should be possible.
- **Secure group communication** - Communities are not pairs of individuals. Group chats, shared channels, and collective spaces must be as secure as one-to-one communication.

### Networking

**Why this matters:** The captured internet assumes always-on high-bandwidth connections through ISP infrastructure. This excludes communities without reliable internet access, communities in areas where ISPs are hostile or surveilled, and communities that cannot afford commercial connectivity. Infrastructure for autonomous community must work across the material conditions that capitalism creates.

- **Low bandwidth operation** - The protocol should function on connections as limited as LoRa radio links. High bandwidth should improve performance but not be required for basic operation.
- **Hardware diversity** - From mains-powered servers on fiber to solar-powered microcontrollers on radio mesh networks, the protocol should span the full range of hardware that communities actually have access to.
- **Cluster bridging** - Isolated communities (a neighborhood mesh network, a rural radio cluster) should be able to bridge to other clusters through whatever links are available, even if those links are slow or intermittent.

### Identity Management

**Why this matters:** On the captured internet, identity is rented from authorities - domain registrars, platform accounts, certificate authorities. Identity can be revoked, seized, or denied. Communities need identities they control, that cannot be taken away by state or corporate action.

- **Self-generated identifiers** - Identities should be created by users through their own effort, not allocated by authorities.
- **Findability without registries** - Users should be discoverable by those who need to find them, without depending on centralized directories that can be controlled or surveilled.
- **Identity continuity** - Identities should persist across key rotations, device changes, and network disruptions. Your identity should not depend on any single point of failure.

### Information Coordination

**Why this matters:** The captured internet requires always-on presence or dependence on corporate servers. If you're not online, you miss messages unless a company stores them for you - and reads them, analyzes them, monetizes them. Communities need asynchronous communication that doesn't require renting server space from capital.

- **Offline-capable messaging** - Messages should be deliverable even when the recipient is offline, without requiring corporate infrastructure to store them.
- **Community-hosted persistence** - Communities should be able to host data for their members using their own resources, not rented cloud services.
- **Deletion that works** - Users should be able to delete information they've published, with reasonable assurance that deletion propagates. This is never perfect (you cannot force deletion from every copy), but it should be a protocol-level concept, not a corporate policy that serves their interests.

### Information Sharing

**Why this matters:** On the captured internet, data is enclosed as property - owned by platforms, monetized, sold, used for surveillance. Information sharing is mediated by corporations who extract value at every step. Communities need commons-based information sharing that doesn't recreate capitalist property relations.

- **No proprietary ownership** - Data shared on the network should not be owned by any party in a way that enables extraction or enclosure.
- **Reciprocity without markets** - There should be some mechanism to prevent pure extraction (leeching) while preserving privacy and not creating a token economy that reproduces capitalist exchange. This remains an open design problem.
- **Collective benefit** - The protocol should enable communities to build shared resources (libraries, archives, knowledge bases) that serve collective interests rather than private accumulation.

### Accessibility

**Why this matters:** The captured internet creates barriers - financial barriers (paying for domains, hosting, bandwidth), technical barriers (expertise required to self-host), resource barriers (hardware, electricity). A protocol that only technically-sophisticated people with resources can use reproduces existing inequalities. Infrastructure for autonomous community must be accessible across the material conditions communities actually face.

- **Low resource requirements** - Basic participation should not require expensive hardware or high bandwidth connections.
- **Transferable effort** - Where resource-intensive operations are necessary (like proof of work), it should be possible to perform them on borrowed or shared hardware and transfer the results.
- **Usable by communities** - The protocol should be implementable in applications that non-technical users can actually use. Theoretical security that requires command-line expertise is not accessible.

### Resilience

**Why this matters:** The captured internet is fragile in specific ways - dependent on DNS, on certificate authorities, on payment processors, on hosting providers. States and capital can attack communities by pressuring any of these chokepoints. Infrastructure for autonomous community must survive these attacks.

- **No single points of failure** - The protocol should not depend on any single server, service, or authority that can be pressured or shut down.
- **Graceful degradation** - When parts of the network are attacked or fail, remaining parts should continue functioning.
- **Censorship resistance** - It should be difficult for states or corporations to prevent specific content or communities from using the network.
- **Jurisdictional independence** - The protocol should not depend on legal protections from any particular state, since states serve capital and will act against communities that threaten the order.
### Autonomy

**Why this matters:** Even decentralized protocols can create new dependencies - on core developers, on dominant implementations, on funding sources. Communities should be able to fork, modify, and operate the protocol independently.

- **No governance capture** - The protocol should not have governance structures that can be captured by state or capital.
- **Implementation independence** - Multiple implementations should be possible, so communities are not dependent on any single development team.
- **No embedded economics** - The protocol should not require tokens, cryptocurrencies, or other mechanisms that could be financialized or captured by speculators.
## Implementation

### Identity

Identities are based on a hash of an asymmetric public key.

**Identity Creation:**
- New identities require proof of work to prevent flooding the network with arbitrary identities
- The PoW is a nonce value that, when hashed together with the identity's public key, produces a result with low entropy (e.g., many leading zeros)
- Since the public key is already public knowledge, anyone can perform PoW computation on behalf of a low-power device without needing access to private keys
- A helper device grinds through nonce values until finding one that satisfies: `hash(public_key || nonce) < difficulty_threshold`
- The helper transmits only the nonce (public information) to the low-power device - no secrets cross the network
- This enables mutual aid in identity creation: community members with more computational resources can help those with less, without any security compromise
- Identity establishment is expected to happen off-network through secure channels (no in-band key exchange)
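The grinding loop above can be sketched directly. This is a minimal illustration assuming SHA-256 and a leading-zero-bits difficulty measure - the protocol does not fix either choice, so treat both as placeholder parameters:

```python
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count leading zero bits in a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # Count zero bits in the first nonzero byte, then stop.
        bits += 8 - byte.bit_length()
        break
    return bits

def grind_nonce(public_key: bytes, difficulty: int) -> int:
    """Grind nonces until hash(public_key || nonce) meets the difficulty.

    Only public information is needed, so a helper device can run this
    on behalf of a low-power node and transmit back just the nonce.
    """
    nonce = 0
    while True:
        digest = hashlib.sha256(public_key + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1

def verify_nonce(public_key: bytes, nonce: int, difficulty: int) -> bool:
    """Any node can cheaply verify a claimed proof of work."""
    digest = hashlib.sha256(public_key + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty
```

Note the asymmetry that makes the mutual-aid pattern work: grinding costs many hash evaluations, but verification costs exactly one.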
**Key Rotation:**
- Old key signs a succession record pointing to new key hash
- Succession records have a user-configurable TTL (time-to-live)
- After TTL expires, the succession record is no longer needed
- Anyone with the old key can verify the transition to the new key
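A sketch of the succession-record shape and TTL logic described above. The record layout is illustrative, not normative, and the `signature` field is a placeholder for a real asymmetric signature by the old key:

```python
from dataclasses import dataclass

@dataclass
class SuccessionRecord:
    old_key_hash: str      # identity being rotated away from
    new_key_hash: str      # identity to use from now on
    issued_at: float       # unix timestamp when the old key signed this
    ttl: float             # user-configurable lifetime in seconds
    signature: bytes       # placeholder: would be the old key's signature

    def is_live(self, now: float) -> bool:
        """After the TTL expires the record no longer needs to circulate."""
        return now < self.issued_at + self.ttl

def resolve_identity(key_hash: str, records: dict, now: float) -> str:
    """Follow live succession records to the current key hash."""
    seen = {key_hash}
    while key_hash in records and records[key_hash].is_live(now):
        key_hash = records[key_hash].new_key_hash
        if key_hash in seen:          # defend against succession loops
            break
        seen.add(key_hash)
    return key_hash
```

Chained rotations resolve by walking records until no live successor exists; an expired record simply stops the walk.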
### Routing

Connections between nodes use packet-switched routing similar to I2P's garlic routing:[^i2p]
- Each message is routed through 6 intermediate nodes (hops) in each direction
- A complete round trip passes through 12 routers total
- Each hop only knows the previous and next node, not the full path
- Provides anonymity by preventing any single node from knowing both sender and recipient
```mermaid
sequenceDiagram
    participant A as Node A
    participant H1 as Hop 1
    participant H2 as Hop 2
    participant H3 as ...
    participant H6 as Hop 6
    participant B as Node B

    Note over A,B: Outbound: 6 hops
    A->>H1: Encrypted packet
    H1->>H2: Re-encrypted
    H2->>H3: Re-encrypted
    H3->>H6: (3 more hops)
    H6->>B: Delivered

    Note over B,A: Return: 6 different hops
    B->>H6: Response
    H6->>A: (through 6 return hops)
```
### NAT Traversal

Uses I2P-style SSU2 transport for NAT traversal:
- Other nodes in the network provide STUN-like address discovery (no dedicated STUN servers)
- Introducer nodes help establish connections through NAT

**Address Discovery via Network Nodes:**
```mermaid
sequenceDiagram
    participant Node as Node (behind NAT)
    participant Peer as Network Peer
    participant DHT as DHT/Rendezvous

    Node->>Peer: What is my public address?
    Peer-->>Node: Your public IP:port is X.X.X.X:YYYY
    Node->>DHT: Publish contact info (signed with identity key)
```

**NAT Traversal with Introducers:**
```mermaid
sequenceDiagram
    participant A as Node A (behind NAT)
    participant I as Introducer
    participant B as Node B (behind NAT)

    Note over A,I: A has established connection to Introducer
    B->>I: I want to connect to A
    I->>A: B wants to connect (here's B's address)
    A->>B: Direct connection attempt (NAT hole punch)
    B->>A: Direct connection attempt (NAT hole punch)
    Note over A,B: Direct P2P connection established
```
### Rendezvous Points

Two types of rendezvous with different PoW thresholds:
- **Active rendezvous** (self-hosted): Lower PoW threshold - the identity hosts their own contact information
- **Inactive rendezvous** (hosted by others): Higher PoW threshold - other nodes host contact information on behalf of an identity

This allows low-powered devices to have their PoW nonce computed by more capable devices - since the nonce only requires the public key (which is already public), helpers can compute it without any access to the device's private keys.

Identities sign records describing how to contact them and publish these to a distributed hash table. Published information needs to be hostable at a variety of rendezvous points - similar to how IPFS content routing works.[^ipfs]
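The two tiers can be expressed as two difficulty thresholds against the same proof-of-work check. The numeric values here are illustrative assumptions, not protocol constants:

```python
import hashlib

# Illustrative difficulty levels - the protocol does not fix these values.
ACTIVE_DIFFICULTY = 12    # self-hosted: cheaper to publish
INACTIVE_DIFFICULTY = 20  # hosted by others: costlier, deters freeloading

def meets_difficulty(public_key: bytes, nonce: int, difficulty: int) -> bool:
    """Check hash(public_key || nonce) against a leading-zero-bits target."""
    digest = hashlib.sha256(public_key + nonce.to_bytes(8, "big")).digest()
    value = int.from_bytes(digest, "big")
    return value < (1 << (256 - difficulty))

def accept_publication(public_key: bytes, nonce: int, self_hosted: bool) -> bool:
    """A rendezvous host checks the PoW tier before storing contact info."""
    difficulty = ACTIVE_DIFFICULTY if self_hosted else INACTIVE_DIFFICULTY
    return meets_difficulty(public_key, nonce, difficulty)
```

Asking more work for inactive rendezvous makes it expensive to dump storage obligations onto other people's hardware, while keeping self-hosting cheap.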
**Active Rendezvous (Self-Hosted):**
```mermaid
sequenceDiagram
    participant A as Node A
    participant DHT as DHT
    participant B as Node B

    Note over A: Signs contact info with private key
    A->>DHT: 1. Publish signed contact record
    B->>DHT: 2. Lookup identity A
    DHT-->>B: 3. Return signed record
    B->>A: 4. Direct connection
```
**Inactive Rendezvous (Hosted by Others):**
```mermaid
sequenceDiagram
    participant HP as High-Power Device
    participant LP as LoRa Node
    participant R as Rendezvous Hosts
    participant B as Node B

    Note over HP: Compute PoW nonce for LP's public key
    HP->>LP: 1. Transfer nonce (public info only)
    LP->>R: 2. Publish contact info to multiple hosts
    B->>R: 3. Query any rendezvous host
    R-->>B: 4. Return contact info
```
### Records and Tombstones

**Records:**
- Published information has a time-to-live (TTL) that is user-configurable
- Records can be marked as destroyable
- Destroyable records include one-time keys for each party authorized to tombstone them

**Tombstones:**
- Tombstones are used to mark records as no longer needed
- Each authorized party has a one-time key embedded in the original record
- Tombstones replace the records they are tombstoning
- Tombstones inherit the same expiration time as the original record would have had
- Sent to rendezvous points hosting the content to signal they no longer need to keep the record
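One plausible construction for the one-time tombstone keys - an assumption on my part, since the text above does not specify the mechanism - is hash-preimage reveal: the record carries only hashes of the keys, and a tombstone reveals a preimage. Hosts can then verify deletion authority without being able to forge it:

```python
import hashlib
import secrets

def make_destroyable_record(payload: bytes, parties: list, expires_at: float):
    """Create a record plus the one-time tombstone keys for each party.

    The record itself carries only hashes of the keys, so rendezvous
    hosts can verify a later tombstone without being able to forge one.
    """
    keys = {p: secrets.token_bytes(32) for p in parties}
    record = {
        "payload": payload,
        "expires_at": expires_at,
        "tombstone_key_hashes": {
            p: hashlib.sha256(k).hexdigest() for p, k in keys.items()
        },
    }
    return record, keys

def tombstone(record: dict, party: str, key: bytes) -> dict:
    """Replace a record with a tombstone by revealing a one-time key."""
    expected = record["tombstone_key_hashes"].get(party)
    if expected != hashlib.sha256(key).hexdigest():
        raise ValueError("not authorized to tombstone this record")
    return {
        "tombstone": True,
        "by": party,
        "expires_at": record["expires_at"],  # inherits the record's expiry
    }
```

Because the key is revealed exactly once, a tombstone cannot be replayed against future records, and because the tombstone inherits the record's expiry it vanishes when the record would have anyway.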
### File Distribution

Identities can sign packets of information for file distribution.
## Evaluation: Does the Protocol Serve the Motivation?

The technical design must be evaluated against the political goals. A protocol that claims to enable autonomous community while reproducing the logics of state and capital would be worse than useless - it would be recuperation, channeling resistance back into structures that serve power. Here we assess how the implementation choices relate to the motivation.

### Evaluation Against Goals
#### Security Goals

**Encrypted by default**
The protocol assumes encryption at every layer. Garlic routing encrypts packets at each hop, and end-to-end encryption between identities is built into the design. This goal is met.

**Private by default**
The routing model protects metadata - no single node knows both sender and recipient. The 6-hop design makes traffic analysis difficult. This goal is substantially met, though timing analysis and other advanced attacks may still be possible.

**Forward secrecy** ?
The protocol can support ratcheting key exchange like Signal, but the implementation details are not specified. This goal is achievable but not yet designed.

**Secure group communication** ?
Group communication is mentioned as a goal but the implementation is not detailed. This remains to be designed.
#### Networking Goals

**Low bandwidth operation**
The protocol explicitly targets LoRa and other constrained networks. The design accounts for limited bandwidth from the start.

**Hardware diversity**
The distinction between active and inactive rendezvous, and the ability to have PoW nonces computed by high-power devices and transferred to low-power devices (using only public keys), directly addresses hardware diversity.

**Cluster bridging**
The design accommodates isolated clusters bridged by slow links. This is a core architectural assumption.
#### Identity Management Goals

**Self-generated identifiers**
Identities are created through proof of work, not allocated by authorities. This goal is met.

**Findability without registries**
The DHT-based rendezvous system allows discovery without centralized directories. This goal is met.

**Identity continuity**
Key rotation through signed succession records maintains identity across key changes. This goal is met.
#### Information Coordination Goals

**Offline-capable messaging**
The inactive rendezvous system allows messages to be stored by community members while the recipient is offline. This goal is met.

**Community-hosted persistence**
Rendezvous hosts are community members, not corporations. This goal is met.

**Deletion that works** ~
Tombstones provide a deletion mechanism, but deletion cannot be enforced - nodes can retain data. This goal is partially met.
#### Information Sharing Goals

**No proprietary ownership**
The protocol does not create property relations in data. This goal is met by design.

**Reciprocity without markets** ?
This remains an open problem. No mechanism for preventing leeching without creating token economics is specified.

**Collective benefit** ?
The protocol enables shared resources but doesn't specifically design for them. Further work needed.
#### Accessibility Goals

**Low resource requirements** ~
The protocol targets low-power devices, but proof of work creates resource barriers. The ability to compute PoW nonces on external devices using only public keys helps but doesn't fully solve the problem.

**Transferable effort**
PoW nonces can be computed on powerful hardware using only the public key and transferred to constrained devices - no secrets cross the network. This goal is met.

**Usable by communities** ?
This depends entirely on implementation. The protocol can be made usable, but nothing in the design guarantees it. Application design is a separate concern.
#### Resilience Goals

**No single points of failure**
The DHT-based architecture, distributed rendezvous, and peer-based NAT traversal avoid single points of failure. This goal is met.

**Graceful degradation**
The design assumes partial connectivity and intermittent links. Parts of the network can fail without global failure.

**Censorship resistance** ~
The protocol resists censorship through anonymity and distribution, but traffic can still be blocked at the network level. The protocol layer is resistant; the transport layer is not.

**Jurisdictional independence**
Nothing in the protocol depends on legal protections from any state. This goal is met.
#### Autonomy Goals

**No governance capture** ?
The protocol as designed has no governance structure, which prevents capture. But protocol evolution, bug fixes, and standards decisions will require some coordination. How to do this without creating capturable structures is not specified.

**Implementation independence**
The protocol can be implemented independently. Multiple implementations are possible. This goal is achievable.

**No embedded economics**
No tokens, cryptocurrencies, or financial mechanisms are required. This goal is met.
### Summary of Evaluation

| Goal Category | Met | Partial | Undesigned |
|--------------|-----|---------|------------|
| Security | 2 | 0 | 2 |
| Networking | 3 | 0 | 0 |
| Identity | 3 | 0 | 0 |
| Information Coordination | 2 | 1 | 0 |
| Information Sharing | 1 | 0 | 2 |
| Accessibility | 1 | 1 | 1 |
| Resilience | 3 | 1 | 0 |
| Autonomy | 2 | 0 | 1 |

The protocol is strongest on networking, identity management, and resilience - the core infrastructure concerns. It is weakest on information sharing reciprocity, some security details, and accessibility. These gaps represent areas for further design work.
### Tensions and Contradictions

**Proof of Work as Barrier**
PoW prevents flooding but also creates barriers. Computational effort is not equally available - it requires hardware, electricity, and time. The ability to compute PoW nonces on external devices using only public keys partially addresses this, but the fundamental logic of PoW is that participation costs resources. This may exclude the most marginalized, those without access to computation. Is this acceptable as a tradeoff against spam, or does it reproduce the capitalist logic of ability-to-pay determining access?

**Anonymity vs. Accountability**
Strong anonymity protects against state surveillance and corporate tracking, but it also enables abuse without accountability. Communities need ways to address harm caused by members. The protocol provides anonymity to the network, but communities must develop their own practices for handling conflict - the technical layer cannot solve social problems. This is appropriate (technical solutions to social problems usually serve power), but it means the protocol alone is insufficient for healthy community.

**Scale and the Return of Hierarchy**
DHTs and routing networks can develop emergent hierarchies as they scale. Nodes with more resources become more central. Rendezvous hosts that serve many identities gain power. The protocol may begin as a union of egoists but evolve toward recreating the hierarchies it opposes. This requires ongoing vigilance and possibly protocol-level mechanisms to prevent concentration that haven't been designed yet.

**The Hardware Layer Remains Captured**
The protocol runs on internet infrastructure that remains under state-capital control. ISPs can block traffic, even if they can't read it. Hardware is manufactured by corporations and may contain backdoors. The physical layer is not addressed by this protocol. Community mesh networks, LoRa infrastructure, and other physical-layer projects must complement protocol-layer work.

**Bootstrapping Problem**
How do you find the network without using the captured internet? Initial connection to peers requires some discovery mechanism. If that mechanism uses DNS, you're back to ICANN. If it uses hardcoded IPs, you depend on whoever controls those addresses. QR codes exchanged in person, pre-loaded peer lists, and other out-of-band mechanisms can help, but the bootstrap problem is not fully solved.
### What Success Would Look Like

The protocol succeeds if communities can use it to:
- Communicate without state surveillance
- Organize without corporate platforms that can deplatform them
- Build durable infrastructure that survives legal pressure on any single participant
- Maintain autonomy from the state-capital apparatus that controls the captured internet

The protocol fails if:
- Only those with significant resources can participate effectively
- Emergent hierarchies recreate the power structures it opposes
- State or capital find ways to surveil or control traffic despite the design
- Communities cannot address internal harm because anonymity prevents accountability
### The Political Work Remains

No protocol can substitute for political organizing. Technology that enables autonomous community is necessary but not sufficient. Communities must still build trust, resolve conflicts, develop shared practices, and maintain solidarity. The protocol provides infrastructure; the work of constructing counter-hegemonic collective identities happens through human practice, not through code.

The question posed at the end of the motivation section - whether communities will build this anyway, regardless of permission - applies not just to the protocol but to everything that follows from it. Technical infrastructure is one small part of the larger project of building power outside state and capital. If the protocol serves that project, it has value. If it becomes an end in itself, or a distraction from material organizing, it has failed regardless of its technical elegance.
## Proposed Design Extensions

The evaluation above identifies several goals that are undesigned or only partially met. Here are potential design choices to address these gaps.
### Forward and Backward Secrecy

**Design choice: Double ratchet key exchange**

Implement a double ratchet protocol similar to Signal's X3DH + Double Ratchet:
- Initial key exchange uses ephemeral keys published to rendezvous points
- Each message advances the ratchet, providing both forward and backward secrecy
- Pre-keys can be published to the DHT for asynchronous initial contact

The challenge is that Signal assumes a central server for pre-key distribution. In a decentralized system:
- Pre-keys are published as signed records to the DHT
- Multiple pre-keys can be published to handle simultaneous contact attempts
- Pre-key exhaustion requires periodic refresh, which maps well to TTL-based records
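For concreteness, here is a minimal sketch of the symmetric half of such a ratchet, using HMAC-SHA256 as the key-derivation step. This is only one part of a full double ratchet - the Diffie-Hellman ratchet half and X3DH itself are omitted, and the constants are illustrative:

```python
import hmac
import hashlib

def advance(chain_key: bytes) -> tuple:
    """Derive a message key and the next chain key from the current one.

    Once the old chain key is deleted, past message keys cannot be
    recomputed - that is the forward-secrecy property. The DH ratchet,
    which heals *future* compromise, is not shown here.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key
```

Both peers start from a shared secret (established by the initial key exchange) and step the chain in lockstep, deriving identical message keys without further communication.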
### Secure Group Communication

**Design choice: Sender keys with membership proofs**

For group communication, consider a hybrid approach:
- Group has a shared identity (group key)
- Members hold membership credentials signed by the group key
- Sender keys allow efficient encryption (encrypt once for the group, not once per member)
- Membership changes trigger key rotation

Challenges:
- Group key management requires consensus or a coordinator - potential hierarchy
- Large groups may need tree-based key structures (like MLS)
- Anonymity within the group may conflict with accountability

**Alternative: Onion-routed group messages**

Instead of shared keys, route all group messages through the same anonymizing path:
- Group defines a shared rendezvous point
- Messages are posted to the rendezvous and distributed to members
- No shared key means compromise of one member doesn't compromise group history
- More bandwidth-intensive but simpler key management
### Reciprocity Without Markets

**Design choice: Contribution tracking without currency**

The goal is to prevent pure leeching while avoiding token economics. Possible approaches:

**Local reputation:**
- Nodes track contribution history with peers they directly interact with
- Reputation is not global or transferable - it exists only in bilateral relationships
- Nodes can prefer to serve peers who have served them
- No currency, no speculation, no accumulation beyond direct relationships

**Bandwidth budgets:**
- Nodes allocate bandwidth to serving others based on how much they've received
- New nodes start with a small budget that grows as they contribute
- No tokens - just local accounting of give/take ratios
- Doesn't create markets because budgets aren't transferable
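The bandwidth-budget idea can be sketched as purely local accounting. The `grace` and `ratio` parameters are illustrative assumptions - the point is that nothing here is transferable or globally visible, so no market can form around it:

```python
class BandwidthBudget:
    """Local give/take accounting - no tokens, nothing transferable.

    Each node tracks, per peer, how many bytes it has received from and
    served to that peer, and caps serving at a multiple of what it has
    received plus a small grace amount for newcomers.
    """

    def __init__(self, grace: int = 1_000_000, ratio: float = 2.0):
        self.grace = grace          # starting budget for unknown peers
        self.ratio = ratio          # serve up to ratio * bytes received
        self.received = {}
        self.served = {}

    def note_received(self, peer: str, nbytes: int) -> None:
        self.received[peer] = self.received.get(peer, 0) + nbytes

    def can_serve(self, peer: str, nbytes: int) -> bool:
        budget = self.grace + self.ratio * self.received.get(peer, 0)
        return self.served.get(peer, 0) + nbytes <= budget

    def note_served(self, peer: str, nbytes: int) -> None:
        self.served[peer] = self.served.get(peer, 0) + nbytes
```

Because the ledger exists only in each node's memory and covers only its direct peers, there is nothing to speculate on, sell, or accumulate beyond the bilateral relationship.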
**Community vouching:**
- Communities vouch for their members
- Nodes serve requests that come with community vouches
- Leeching becomes a community-level problem, handled through community practices
- Preserves privacy (individual behavior not tracked, only community membership)
### Reducing PoW Barriers

**Design choice: Vouching-based identity creation**

Instead of requiring PoW for all identities, allow alternative paths:

**Community-vouched identities:**
- Established identities can vouch for new identities
- Vouching consumes some resource (limited vouches per time period)
- Creates social cost to spam rather than computational cost
- New members join through existing community relationships
**Time-locked identities:**
- Instead of PoW, identities can be created by committing to a future timestamp
- Identity becomes valid after a waiting period (e.g., 24 hours)
- Spam prevention through time rather than computation
- Low-resource nodes can wait instead of compute
**Graduated capabilities:**
- New identities start with limited capabilities (can receive, limited sending)
- Capabilities expand over time with network participation
- Spam is self-limiting because new identities can't flood
- No PoW required, just time and participation
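Graduated capabilities could be as simple as an age-based send quota. This is a toy sketch - the thresholds and quotas are illustrative assumptions, and a real design would also need to anchor the identity's creation time somewhere peers can verify so it cannot be backdated:

```python
def send_quota(age_seconds: float) -> int:
    """Graduated capabilities: message quota grows with identity age.

    New identities can receive but barely send, so a flood of freshly
    created identities is self-limiting. All thresholds are illustrative.
    """
    day = 86_400.0
    if age_seconds < day:        # first day: receive-mostly, tiny quota
        return 1
    if age_seconds < 7 * day:    # first week: limited sending
        return 50
    return 1000                  # established identity
```

Note this interacts with the pre-generation question raised later: an attacker can mint identities early and let them age, which is why time alone is probably not sufficient.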
### Preventing Emergent Hierarchy

**Design choice: Resource limits and rotation**

DHTs and routing networks naturally concentrate load on well-connected nodes. To resist this:

**Mandatory load shedding:**
- Nodes that serve above a threshold must redirect traffic to less-loaded nodes
- Prevents any node from becoming too central
- Redistributes load toward network edges

**Rendezvous rotation:**
- Identities should use multiple rendezvous points and rotate between them
- Prevents any rendezvous host from accumulating control over many identities
- Built into the protocol rather than optional

**Connection diversity requirements:**
- Nodes should maintain connections to diverse parts of the network
- Routing should prefer paths through less-central nodes when available
- Slight efficiency cost for significant hierarchy resistance
### Bootstrap Without Captured Infrastructure

**Design choice: Multiple bootstrap mechanisms**

No single bootstrap method will work for all communities. The protocol should support:

**QR code exchange:**
- In-person exchange of peer addresses via QR codes
- No network dependency for initial bootstrap
- Natural fit for local community organizing

**Pre-loaded peer lists:**
- Distributions of the software include peer lists
- Lists are signed by communities, not central authorities
- Communities can distribute their own peer lists through their own channels

**Sneakernet bootstrap:**
- USB drives, SD cards with bootstrap information
- Works in network-hostile environments
- Appropriate for high-security contexts

**Existing network piggybacking:**
- Bootstrap information hidden in other protocols (steganography in images, encoded in DNS TXT records, etc.)
- Useful when other methods are blocked
- More complex but provides fallback

**Local broadcast discovery:**
- mDNS[^mdns] or similar for finding peers on the local network
- Works for mesh networks and LANs
- Doesn't depend on internet infrastructure
### Governance Without Capture

**Design choice: Rough consensus and running code**

Protocol evolution needs coordination without creating capturable structures:

**No formal governance:**
- Anyone can propose protocol changes
- Changes are adopted if implementations adopt them
- No voting, no foundation, no core team with special authority
- Forks are expected and acceptable

**Compatibility layers:**
- Protocol should be designed for graceful evolution
- Version negotiation allows different implementations to interoperate
- No forced upgrades controlled by any party

**Multiple implementations:**
- Encourage multiple independent implementations from the start
- No reference implementation with special status
- Interoperability testing between implementations
### Transport Layer Censorship Resistance

**Design choice: Pluggable transports**

The protocol layer resists censorship, but the transport layer can be blocked. Address this through:

**Traffic obfuscation:**
- Protocol traffic should be indistinguishable from other traffic
- Pluggable transports allow adaptation to blocking techniques
- Community-developed transports for specific censorship environments

**Multi-transport support:**
- Nodes should support multiple transport mechanisms
- Fallback through different transports when one is blocked
- LoRa, mesh networks, and other non-internet transports as alternatives

**Domain fronting equivalents:**
- Where possible, use techniques that make blocking costly
- Route through infrastructure the censor is unwilling to block
- Constantly evolving to match censorship evolution
## Open Questions

These design proposals address some gaps but leave others open:

- Anti-leeching mechanism that preserves privacy - the approaches above are sketches, not complete designs. How do we prevent free-riding without creating token economies or surveillance?
- Group communication at scale - tree-based key management introduces complexity and potential hierarchy. Is there a simpler approach for large groups?
- Vouching systems and Sybil attacks - if vouching replaces PoW, how do we prevent vouching abuse?
- Time-locked identities and spam - is time delay sufficient to prevent spam, or will attackers just pre-generate identities?
- Bootstrap security - how do we verify bootstrap information without creating trust hierarchies?
- Protocol evolution - rough consensus works for small communities of developers, but what happens when the protocol is widely deployed?
- Accessibility and complexity - many of these design choices add complexity. How do we keep the protocol implementable and usable?
## References

### Political and Theoretical Influences

The political analysis in this document draws on several traditions of thought:

**Critique of the State-Capital Relation**

- Marx's analysis of the state as serving class interests and creating the conditions for capital accumulation[^marx-state]
- The understanding of law and property as constructed systems serving particular interests rather than neutral frameworks

**Hegemony and Counter-Hegemony**

- Gramsci's concept of hegemony - dominant orders that present themselves as natural and inevitable while requiring active construction and maintenance
- The possibility of counter-hegemonic projects that contest dominant orders and build alternatives

**Construction of Collective Identity**

- Laclau and Mouffe's post-Marxist analysis of how collective political identities are constructed through practice rather than given by objective conditions[^laclau-mouffe]
- The importance of building shared infrastructure as a practice that constitutes community

**Critique of Abstract Authority**

- Stirner's critique of "spooks"[^stirner-spooks] - abstract concepts (state, property, law, morality) that demand obedience while serving interests other than your own
- The concept of a "union of egoists" - free association based on mutual interest rather than submission to authority
- The emphasis on refusing obedience to constructed authorities you never chose

### Technical Protocol References

**Anonymity and Routing**

- I2P (Invisible Internet Project) - garlic routing, SSU2 transport, distributed network architecture
- Tor - onion routing concepts, though this protocol differs in significant ways

**Key Exchange and Encryption**

- Signal Protocol - X3DH key exchange,[^signal-x3dh] Double Ratchet algorithm providing forward and backward secrecy
- MLS (Messaging Layer Security)[^mls] - tree-based key management for large groups

**Distributed Systems**

- IPFS (InterPlanetary File System) - content-addressed storage, distributed hash tables, content routing
- BitTorrent - distributed file sharing, DHT-based peer discovery
- Kademlia[^kademlia] - DHT algorithm used by many P2P systems

**NAT Traversal**

- STUN (Session Traversal Utilities for NAT) - address discovery techniques
- ICE (Interactive Connectivity Establishment) - connection establishment through NATs
- I2P's introducer system - NAT traversal without centralized STUN servers

**Low-Power and Mesh Networking**

- LoRa/LoRaWAN[^lora] - long-range, low-power radio communication
- Meshtastic - LoRa mesh networking
- CJDNS - encrypted mesh networking

**Censorship Resistance**

- Tor pluggable transports[^tor-pluggable] - traffic obfuscation techniques
- Domain fronting - though largely blocked now, the concept of routing through infrastructure that censors are unwilling to block
- Steganography - hiding data in innocuous-seeming content

**Identity and Naming**

- Petnames[^petnames] - contextual naming systems where names are meaningful within relationships rather than globally unique
- Web of Trust - decentralized trust without certificate authorities
- DIDs (Decentralized Identifiers) - self-sovereign identity standards, though this protocol takes a different approach

### Further Reading

For those interested in the political theory underlying this project:

- The concept of hegemony: *Prison Notebooks* (Gramsci)
- Post-Marxist theory: *Hegemony and Socialist Strategy* (Laclau and Mouffe)
- Critique of abstract authority: *The Ego and Its Own* (Stirner)
- State and capital: various works in the Marxist tradition

For technical background:

- I2P technical documentation: https://geti2p.net/en/docs
- Signal Protocol specifications: https://signal.org/docs/
- IPFS documentation: https://docs.ipfs.tech/
- Academic papers on anonymous communication systems

[^gramsci-hegemony]: Gramsci, Antonio. *Selections from the Prison Notebooks*. Edited and translated by Quintin Hoare and Geoffrey Nowell Smith. International Publishers, 1971. The concept of hegemony appears throughout, particularly in "State and Civil Society" and "The Modern Prince."

[^stirner-union]: Stirner, Max. *The Ego and Its Own* (*Der Einzige und sein Eigentum*). 1844. English translation by Steven Byington, 1907. The "union of egoists" (Verein von Egoisten) is discussed in Part 2, Section 3.

[^i2p]: I2P Project. "I2P Technical Documentation." https://geti2p.net/en/docs. See especially: "How Garlic Routing Works" (https://geti2p.net/en/docs/how/garlic-routing) and "Threat Model" (https://geti2p.net/en/docs/how/threat-model).

[^i2p-ssu2]: I2P Project. "SSU2 Specification." https://geti2p.net/spec/ssu2. Describes the Secure Semireliable UDP transport including NAT traversal mechanisms, introducers, and hole punching.

[^signal-protocol]: Marlinspike, Moxie and Trevor Perrin. "The Double Ratchet Algorithm." Signal Foundation, November 2016. https://signal.org/docs/specifications/doubleratchet/. See also: "The X3DH Key Agreement Protocol" (https://signal.org/docs/specifications/x3dh/).

[^ipfs]: Benet, Juan. "IPFS - Content Addressed, Versioned, P2P File System." arXiv:1407.3561, 2014. https://arxiv.org/abs/1407.3561. Protocol documentation: https://docs.ipfs.tech/concepts/

[^base-superstructure]: Marx, Karl. "Preface to A Contribution to the Critique of Political Economy" (1859). The classic formulation: "The totality of these relations of production constitutes the economic structure of society, the real foundation, on which arises a legal and political superstructure and to which correspond definite forms of social consciousness." See also: Williams, Raymond. "Base and Superstructure in Marxist Cultural Theory." *New Left Review* I/82, November-December 1973, for a nuanced reading that avoids crude economic determinism.

[^laclau-mouffe]: Laclau, Ernesto and Chantal Mouffe. *Hegemony and Socialist Strategy: Towards a Radical Democratic Politics*. Verso, 1985. Second edition with new introduction, 2001. ISBN 978-1-85984-330-0.

[^marx-state]: Marx, Karl. "Critique of the Gotha Programme" (1875) and Engels, Friedrich. *The Origin of the Family, Private Property and the State* (1884) provide the foundational analysis of the state as an instrument of class rule. See also: Lenin, V.I. *The State and Revolution* (1917) for an extended treatment.

[^stirner-spooks]: Stirner, Max. *The Ego and Its Own*, Part 1: "A Human Life" and Part 2, Section 1: "Ownness." The German term is "Sparren" or "fixe Idee" (fixed idea). Stirner uses "Spuk" (spook/ghost) to describe ideas that haunt and dominate individuals.

[^signal-x3dh]: Marlinspike, Moxie and Trevor Perrin. "The X3DH Key Agreement Protocol." Signal Foundation, November 2016. https://signal.org/docs/specifications/x3dh/. Revision 1, 2016-11-04.

[^kademlia]: Maymounkov, Petar and David Mazières. "Kademlia: A Peer-to-peer Information System Based on the XOR Metric." *Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS)*, 2002. https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf
[^mls]: Barnes, R., et al. "The Messaging Layer Security (MLS) Protocol." RFC 9420, IETF, July 2023. https://datatracker.ietf.org/doc/rfc9420/. See also: "The Messaging Layer Security (MLS) Architecture." RFC 9750, IETF.

[^lora]: LoRa Alliance. "LoRaWAN Specification." https://lora-alliance.org/lorawan-specification/. Semtech Corporation. "LoRa Modulation Basics." AN1200.22, 2015.

[^mdns]: Cheshire, S. and M. Krochmal. "Multicast DNS." RFC 6762, IETF, February 2013. https://datatracker.ietf.org/doc/rfc6762/

[^petnames]: Stiegler, Marc. "Petname Systems." HP Labs Technical Report, 2005. http://www.skyhunter.com/marcs/petnames/IntroPetNames.html. See also: "An Introduction to Petname Systems" for the relationship between petnames, nicknames, and keys.

[^tor-pluggable]: Tor Project. "Pluggable Transports Specification." https://spec.torproject.org/pt-spec/. See also: "obfs4 Specification" (https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/obfs4) for the most widely deployed obfuscation transport.

@@ -33,6 +33,15 @@ const getGitModifiedTime = (filePath) => {
   return null;
 };
 
+const hasMermaidContent = (filePath) => {
+  try {
+    const content = fs.readFileSync(filePath, 'utf-8');
+    return /```mermaid/i.test(content);
+  } catch (e) {
+    return false;
+  }
+};
+
 const getTitleFromFilename = (filePath) => {
   const basename = path.basename(filePath, '.md');
   return basename
@@ -115,6 +124,9 @@ module.exports = {
       getGitModifiedTime(data.page.inputPath) ??
       getFileModifiedTime(data.page.inputPath);
     },
+    mermaid: (data) => {
+      return hasMermaidContent(data.page.inputPath);
+    },
     permalink: (data) => {
       const title = data.title || getTitleFromFilename(data.page.inputPath);
       const slug = title.toLowerCase().replace(/\s+/g, '-').replace(/[^\w-]/g, '');

simulations/hyperloglog-tombstone/simulation.ts (new file, 1411 lines)
File diff suppressed because it is too large