Table of Contents
- Revision
- Introduction to Library Zero
- Requirements
- Desired User Experience
- Threats
- Goals of Library Zero
- Software knowns
- Software unknowns
- Exploring possible components
- How does it all get started?
- Leeching
- Seeding
- UX
- Private trackers
- Public trackers
Revision
Originally published 2023 February 26. Updated 2024 October 9.
Introduction to Library Zero
This document is a work-in-progress design document for Library Zero, a human-rights focused Rust-based BitTorrent application designed to operate exclusively over Tor onion services. Library Zero will be developed as a stand-alone application that interacts only with itself, enhancing the Tor network while allowing anonymous access and contribution to a distributed library of data.
Unlike legacy BitTorrent applications that depend on centralized search and tracking functionality, Library Zero aims to base discovery on a Web of Trust model for peer connections while refactoring search and tracking into new distributed models. Every Library Zero node can become a Library, with its own Reference, and References are automatically shared based on trust. Searching for files is performed locally, because trusted Libraries automatically share their complete References. Further, Libraries not only hold copies of their own References, but also make copies of the References of the Libraries they connect to. This is how Libraries can become known, trusted, and shared.
Library Zero is not intended to be used by everyone. Because every node is required to run as a Tor middle relay, Library Zero users must understand and apply the technical concepts and requirements involved.
Requirements
Every Library Zero node must be:
- a Tor middle relay that operates in parallel to BitTorrent operations.
- a BitTorrent client that accesses libraries via Tor onion services.
- a BitTorrent server that hosts a library on Tor onion services.
Desired User Experience
Leeching
A user wants to download a file from a Library.
The user downloads then installs Library Zero for BSD, Linux, macOS, or Windows.
The user sees a Get Started UI (tab 1) educating the user how to use Library Zero.
2a. The Get Started tab first informs the user that to be able to use Library Zero, they must first contribute to the Tor network as a Tor middle relay. The user is provided clear education about what this means, and then the user must accept, then configure, and then start a Tor middle relay. Once they click Accept, a new tab will appear, a Public Tor Relay tab (tab 2). Now on the Public Tor Relay tab, the user is further educated on how to set up their Tor middle relay. It asks for basic information and provides facts about what the user must do, like enabling port forwarding on their home firewall. Once started, and the UI shows it is successfully connected to the Tor network as a relay, the Tor Relay tab will show the status and basic metrics of their Tor middle relay. (In the background, a new Search tab (tab 3) and My Public Library tab (tab 4) are also made available.) The user is then prompted to return to the Get Started tab.
2b. Library Zero then educates the user about searching other libraries and instructs the user wishing to download something to click on the Search tab (tab 3). In the Search tab, Library Zero asks the user to enter one or more Tor onion addresses. The user, knowing a Tor onion address of their local library, pastes the onion address in and clicks Connect.
Once connected, the user is prompted to pin the Tor onion address as a trusted Library. Pinned, trusted libraries are kept in a sub-tab of the Search tab (tab 3). The user may pin the onion address so that when Library Zero is opened, the app will always connect to this Library. (In the background, metadata for the Library is downloaded, and if the Library has opted into naming itself, the name will automatically populate in Library Zero. Additionally, the entire Library Reference data is downloaded to the user's Library Zero application.) The user is shown a simple search field. The user can search for anything, and all metadata fields will be searched in the Reference of the Library they are connected to, but the search being performed is local. (In the background, Library Zero finds all of the onion addresses for the Library they are connected to, and those other onion addresses can be viewed by the user in the advanced view of a pinned Library. By default, a connection with a Library uses 3 Tor onion addresses, but a Library may opt into having more onion addresses.)
The user finds one of the two items that they are looking for in the connected Library's Reference. The user clicks Download. The item is immediately queued for download in the Download tab. (In the background, Library Zero already has all of the data about a file, and how to begin downloading the data, because of the auto-downloaded Reference data. The Reference data does not contain a list of other seeders and leechers (other onion addresses). As soon as the user clicks Download, the list of other onion addresses (other leechers and seeders of the file) is provided by the connected Library. Library Zero multiplexes a download over the available onion addresses. If authentication is successful with other leechers and seeders, multiplexing with those Libraries also begins.) Once a download is 100% complete, a user may use their file.
Since the user was only able to find one of two items, the user is able to opt into searching the References of extended Libraries.
5a. (Extended Trusted) Also on the Search tab (tab 3), the user is prompted to select how many degrees of separation they want to search when it comes to the trusted libraries of the library that they trust. Education is provided about what this means. (With one onion address belonging to one Library, a user is able to connect to and search the Reference of said Library. The Library being connected to, being an independent operator, opts into pinning their own trusted Libraries, which, to the user we're talking about, is two-degrees of separation from the user. If the user opts into searching two degrees of separation, they then download the References of those trusted Libraries, and the user may opt into pinning those libraries as trusted Libraries.) The user selects two degrees of separation, then performs the same search but against five References, all local searches, because the Library they trust trusts four Libraries. The original Library that the user pinned as trusted trusts a different library that has the second file they are looking for. The user clicks Download.
5b. (Extended Untrusted) Also on the Search tab (tab 3), the user is prompted to select how many degrees of separation they want to search when it comes to untrusted Libraries. An untrusted library is a Library seen simply by being a peer, either seeding or leeching. Those peers, who are also Library Zero users, will automatically share their References with the user. The user is able to opt into searching (downloading the entire Reference first) References they come into contact with. If users decide that the untrusted Libraries contain high quality files, they can pin that library as trusted, like in step 3.
5c. Extended Trusted and Extended Untrusted libraries have simple check boxes in the Search Tab for enabling or disabling them when performing Reference searches.
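The local search described in the steps above, including the check boxes that enable or disable Extended Trusted and Extended Untrusted References, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a design decision: the type names (`Entry`, `Reference`, `Trust`, `SearchScope`), the field layout, and the case-insensitive substring matching are all hypothetical.

```rust
/// One item listed in a Library's Reference. Field names are assumptions.
#[derive(Debug, Clone)]
pub struct Entry {
    pub title: String,
    /// Free-form metadata pairs, e.g. ("author", "...").
    pub metadata: Vec<(String, String)>,
}

/// How the local user relates to the Library that published a Reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Pinned,
    ExtendedTrusted,
    ExtendedUntrusted,
}

/// A locally stored copy of a Library's Reference.
pub struct Reference {
    pub library_onion: String,
    pub trust: Trust,
    pub entries: Vec<Entry>,
}

/// The two check boxes from the Search tab. Pinned Libraries are always searched.
pub struct SearchScope {
    pub extended_trusted: bool,
    pub extended_untrusted: bool,
}

impl SearchScope {
    fn includes(&self, trust: Trust) -> bool {
        match trust {
            Trust::Pinned => true,
            Trust::ExtendedTrusted => self.extended_trusted,
            Trust::ExtendedUntrusted => self.extended_untrusted,
        }
    }
}

/// Searches titles and all metadata values of every in-scope Reference.
/// No network access happens here: only local copies are consulted.
pub fn search<'a>(
    refs: &'a [Reference],
    scope: &SearchScope,
    query: &str,
) -> Vec<(&'a str, &'a Entry)> {
    let needle = query.to_lowercase();
    let mut hits = Vec::new();
    for r in refs.iter().filter(|r| scope.includes(r.trust)) {
        for e in &r.entries {
            let matched = e.title.to_lowercase().contains(&needle)
                || e.metadata
                    .iter()
                    .any(|(_, value)| value.to_lowercase().contains(&needle));
            if matched {
                hits.push((r.library_onion.as_str(), e));
            }
        }
    }
    hits
}
```

Each hit carries the onion address of the Library whose Reference matched, which is what the user would need in order to request the peer list when clicking Download.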
Seeding
A user wants to share a file with the world.
See 1a.
The user sees a Get Started UI (the first tab) educating the user how to use Library Zero.
7a. See 2a.
7b. Library Zero instructs the user wishing to share something to click on the My Public Library tab (tab 4). In the My Public Library tab, the user is asked to select a file or folder to share. The user selects a folder of MP4 video files. Library Zero cleanly presents the user with the file and folder structure along with file and folder names. Below each file and folder name is its associated metadata. Library Zero offers a one-click button to anonymize the metadata of all selected files and folders to a standard configuration that every Library Zero application applies, which the user clicks. Optionally, Library Zero also provides one-click options to anonymize select metadata, categorically, like any "author" fields or "time and date" data. The file and folder names are left intact, and the user does not change them. The user clicks a Next button. Library Zero prompts the user with one last screen informing them how to safeguard their anonymity, and asking them to acknowledge that what they choose to share must be shared carefully while aiming to not break any of their laws. The user clicks the Share button. (In the background, Library Zero zips the folder into one file, generates a Reference dataset, and creates a set of Library onion addresses to share with anyone.)
The user is now able to share their onion address with anyone. Go to 2b.
Now that a user's Library Zero application has generated its own Reference, there is a new My Public Reference tab under the Share tab (tab 3). There a user can review the entire Reference data that anyone with their onion address can see and download. Until a user shares something, there is no Reference file, so nothing is shared with other Library Zero users.
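The one-click and per-category metadata anonymization from the seeding walkthrough might look something like the sketch below. Everything here is an assumption made for illustration: the `Category` enum, the key-to-category mapping in `category_of`, and the idea that the "standard configuration" simply strips every categorized field. File and folder names are not touched, matching the walkthrough.

```rust
use std::collections::BTreeMap;

/// Metadata categories a user can strip with one click (hypothetical set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Author,
    TimeAndDate,
}

/// Hypothetical mapping from a metadata key to its category.
/// Keys that match nothing here are treated as uncategorized and kept.
fn category_of(key: &str) -> Option<Category> {
    match key.to_lowercase().as_str() {
        "author" | "artist" | "creator" => Some(Category::Author),
        "created" | "modified" | "creation_time" => Some(Category::TimeAndDate),
        _ => None,
    }
}

/// Removes every metadata field that belongs to one of the given categories.
pub fn anonymize_categories(metadata: &mut BTreeMap<String, String>, remove: &[Category]) {
    metadata.retain(|key, _| match category_of(key) {
        Some(category) => !remove.contains(&category),
        None => true,
    });
}

/// Assumed "standard configuration": strip all categorized fields at once.
pub fn anonymize_all(metadata: &mut BTreeMap<String, String>) {
    anonymize_categories(metadata, &[Category::Author, Category::TimeAndDate]);
}
```

A real implementation would have to parse container formats such as MP4 to reach the metadata in the first place; this sketch only shows the filtering step on already-extracted key/value pairs.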
Threats
Governments who don't want their residents to read freely or become exposed to cultures other than their own.
Capitalists who want capitalism to be more important than human rights.
Politicians and Advertisers who want to control what users are exposed to and when.
Malicious users who want to trash the functionality or user experience of Library Zero.
Goals of Library Zero
Allow users to access and contribute to a distributed library of data, anonymously, over the internet. Tor onion services can meaningfully protect the physical location of all libraries, and from there, aspects of Tor and other software design choices will protect the identity of any library user.
The Universal Declaration of Human Rights must dictate design and architecture choices for Library Zero, weighed against known threats. Distributed systems, layered cryptography, and layered networking empower users to retain their human rights.
Users should be able to access and contribute to a global library from anywhere with minimal effort. Tor onion services excel at achieving this goal as a reverse proxy.
The protection of a user's identity is more important than performance. As such, a successful and complete download or upload in a privacy-preserving manner takes priority over speed.
Today, the Tor network is limited in its capacity, so BitTorrent is discouraged. Library Zero must contribute back to the Tor network, and it will do this by becoming a Tor middle relay. How much anyone is able to download with Library Zero will be limited by how much they give back to the Tor network. The more people that use Library Zero, the faster and more robust the Tor network will be. Secondarily, because Library Zero is a Tor middle relay, users have plausible deniability when it comes to contributing to others' Libraries.
The Tor network is limited to TCP traffic. UDP traffic generated by some BitTorrent clients today may also undermine a user's expectations of privacy. Library Zero must utilize TCP protocols to be properly routable over Tor.
The Tor network is made up of relays run by volunteers all around the world, and those relays are limited in various network and compute capacities. A Tor onion service connection is made up of a 6-hop circuit, of which the slowest of the 6 hops will limit the maximum throughput of a download or upload. Library Zero must multiplex onion services, which BitTorrent makes easy since all files being shared are broken down into small chunks, and it doesn't matter which chunk arrives first or last since a successful download will require 100% of all chunks.
Library Zero must be modular to support other transport types, such as mixnets, but initially will be designed to leverage well known anonymity protocols and networks such as Tor. Specifically, Library Zero will exclusively use Tor onion services.
Library Zero must not be interoperable with legacy BitTorrent over the clear-web. Those platforms, when used in the public domain, are negligent in protecting users and should be abandoned.
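To make the multiplexing goal above concrete, the arithmetic behind it can be sketched as follows: a single 6-hop circuit is bounded by its slowest hop, while several circuits used in parallel are bounded by the sum of their individual limits. This is an idealized model with invented function names. It assumes the circuits do not share a bottleneck relay and that the user's own connection is not the limiting factor.

```rust
/// Throughput of a single circuit: bounded by its slowest hop.
/// Values are per-hop capacities in kilobits per second.
pub fn circuit_throughput(hops_kbps: &[u32]) -> u32 {
    hops_kbps.iter().copied().min().unwrap_or(0)
}

/// Idealized upper bound when chunks are multiplexed across several
/// independent circuits: the sum of each circuit's own bottleneck.
pub fn multiplexed_throughput(circuits: &[Vec<u32>]) -> u32 {
    circuits
        .iter()
        .map(|hops| circuit_throughput(hops))
        .sum()
}
```

For example, two circuits whose slowest hops carry 300 and 400 kbps would be bounded at 300 kbps and 400 kbps individually, and at 700 kbps combined under this model.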
Software knowns
Arti is Tor Project's Tor daemon written in Rust. Arti will be used to:
- interface with the Tor network
- operate a Tor middle relay
- access remote onion services
- generate local onion services
Software unknowns
Since Tor provides automatic end-to-end encryption, does Library Zero need to be responsible for any additional transport cryptography? For example, Tor does not yet implement any quantum safe algorithms.
Tor has claimed within recent years that it is aiming to support UDP. When will that be?
Rust BitTorrent applications like rqbit exist, but how much can be forked, and how much will have to be rewritten?
How much of the DHT protocol can safely be ported? Or, because of its intended design, is it too anti-privacy, inefficient, or insecure to be used?
What should the TCP-based transport protocol be inside of the Tor onion circuits since Tor does not yet support UDP-based protocols? Hyper could be used for UDP to HTTP/2 conversion. Or, again, should a BitTorrent client be written from scratch to never use UDP in the first place?
aria2-onion-downloader exists for multiplexing onion/HTTP downloads, but how well is it designed?
"Serde is a framework for serializing and deserializing Rust data structures efficiently and generically."
Exploring possible components
I presume that, to use Arti with Hyper and rqbit, a custom transport for Hyper that uses Arti to handle the underlying network connections would be needed. This might be done by passing an Arti data stream to the lower-level connection API in Hyper's hyper::client::conn module, or by writing a custom connector, which may allow using Arti as the underlying network transport for Hyper's client. Serde could be responsible for parsing and serializing the data exchanged between rqbit and Hyper. It works with a defined data format (such as JSON or a binary format) to ensure that data is structured correctly and can be understood by other nodes in the network.
In the context of using Serde with the rqbit and Hyper libraries, I presume that data would be serialized and deserialized between the two libraries to facilitate communication between the different layers of the application stack. Specifically, rqbit deals with the BitTorrent protocol, which defines a set of messages that are sent between peers participating in a swarm. These messages contain information about which pieces of a shared file are available, which pieces are still needed, etc. When these messages are passed between rqbit and Hyper, they need to be serialized from rqbit's internal representation into a format that can be sent over the network, and then deserialized back into rqbit's internal representation on the receiving end. Serde provides a convenient and efficient way to do this serialization and deserialization, and thus acts as a bridge between rqbit and Hyper.
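To illustrate the kind of serialization round trip described above, here is a dependency-free sketch that encodes and decodes three standard BitTorrent peer wire messages (a 4-byte big-endian length prefix, a 1-byte message ID, then the payload). It is written by hand instead of with Serde purely to keep the example self-contained. The `PeerMessage` type is an assumption for illustration and is not rqbit's internal representation.

```rust
/// A small subset of BitTorrent peer wire messages.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PeerMessage {
    Interested,
    Have { piece: u32 },
    Request { piece: u32, begin: u32, length: u32 },
}

impl PeerMessage {
    /// Serializes into `<length prefix><message id><payload>`.
    pub fn encode(&self) -> Vec<u8> {
        let (id, payload): (u8, Vec<u8>) = match self {
            PeerMessage::Interested => (2, Vec::new()),
            PeerMessage::Have { piece } => (4, piece.to_be_bytes().to_vec()),
            PeerMessage::Request { piece, begin, length } => {
                let mut p = Vec::with_capacity(12);
                p.extend_from_slice(&piece.to_be_bytes());
                p.extend_from_slice(&begin.to_be_bytes());
                p.extend_from_slice(&length.to_be_bytes());
                (6, p)
            }
        };
        let mut out = Vec::with_capacity(5 + payload.len());
        // The length prefix counts the id byte plus the payload.
        out.extend_from_slice(&(payload.len() as u32 + 1).to_be_bytes());
        out.push(id);
        out.extend_from_slice(&payload);
        out
    }

    /// Parses exactly one complete frame. Returns None for keep-alives,
    /// truncated frames, and message types this sketch does not cover.
    pub fn decode(frame: &[u8]) -> Option<PeerMessage> {
        if frame.len() < 5 {
            return None;
        }
        let len = u32::from_be_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
        if frame.len() != 4 + len {
            return None;
        }
        let payload = &frame[5..];
        // Reads the i-th big-endian u32 from the payload; only called
        // after the payload length has been checked by the match below.
        let word = |i: usize| -> u32 {
            let b = &payload[i * 4..i * 4 + 4];
            u32::from_be_bytes([b[0], b[1], b[2], b[3]])
        };
        match (frame[4], payload.len()) {
            (2, 0) => Some(PeerMessage::Interested),
            (4, 4) => Some(PeerMessage::Have { piece: word(0) }),
            (6, 12) => Some(PeerMessage::Request {
                piece: word(0),
                begin: word(1),
                length: word(2),
            }),
            _ => None,
        }
    }
}
```

Whether Library Zero keeps this wire format unchanged inside onion circuits, or wraps it in another framing, is still one of the open questions listed under Software unknowns.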
For rqbit, we would also need to modify the networking code to use Arti for making connections to other peers in the BitTorrent network. This could involve creating a custom networking layer that uses Arti to establish connections with other peers and handle incoming data. Overall, using Arti with Hyper and rqbit would require some significant modifications to both libraries, but it is certainly possible to make it work.
The peer selection algorithm in rqbit is responsible for selecting which peers to connect to based on a variety of factors, such as availability, download speed, and number of active connections. It is an important part of the overall performance and efficiency of the client, as the selection of good peers can significantly improve the speed and reliability of the downloads.
Multiplexing onions
Due to the network performance limitations of data passing through 6 different, globally distributed ISPs (a standard Tor onion circuit), multiplexing download/upload streams is prudent. This has the additional benefit of distributing data via increasingly diverse data paths around the world, making it significantly harder to perform network analysis to deanonymize users. Onion services can be created dynamically and automatically depending on a number of factors, including network performance and file size. By default, the number of streams should be nine. To generate multiple Tor onion services and multiplex network streams, custom code would need to be created. This code would need to interface with several aspects of the application, including:
- Arti: The code would need to interface with Arti, the Rust implementation of Tor, to generate multiple Tor onion services (per file? per GB?). This would involve using the Arti API to create and manage Tor circuits and onion services.
- Hyper: The code would also need to interface with Hyper, the HTTP library for Rust, to multiplex network streams. This would involve using the Hyper API to manage HTTP/2 streams, which would allow multiple concurrent requests and responses to be sent over a single connection, but should be able to scale up across multiple onions.
- rqbit: The code would also need to interface with rqbit, a Rust implementation of the BitTorrent protocol. This would involve using the rqbit API to manage peer connections and piece selection, as well as to handle the actual data transfer over the network.
- Serde: Finally, the code would need to use Serde, the Rust library for serializing and deserializing data, to encode and decode messages between peers. This would be necessary to communicate information about available pieces and to negotiate the transfer of data between peers.
Overall, the custom code would need to coordinate these various components to ensure that multiple concurrent transfers were taking place over multiple Tor onion services and that data is being efficiently multiplexed to maximize download and upload performance.
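One small piece of that coordination, deciding which stream fetches which piece, could be sketched as below. The default of nine streams comes from the text above. The round-robin policy, the function names, and the failure handling are assumptions chosen for simplicity. A real scheduler would likely weigh measured stream performance as well.

```rust
/// Default number of multiplexed streams, as proposed above.
pub const DEFAULT_STREAMS: usize = 9;

/// Distributes piece indices across streams in round-robin order.
/// Piece order does not matter for correctness, since a download
/// only completes once every piece has arrived.
pub fn assign_pieces(piece_count: u32, streams: usize) -> Vec<Vec<u32>> {
    let streams = streams.max(1);
    let mut plan = vec![Vec::new(); streams];
    for piece in 0..piece_count {
        plan[piece as usize % streams].push(piece);
    }
    plan
}

/// Drops a failed stream from the plan and spreads its pending pieces
/// over the remaining streams. Does nothing if the index is out of
/// range or if no other stream is left to take over.
pub fn reassign_failed(plan: &mut Vec<Vec<u32>>, failed: usize) {
    if failed >= plan.len() || plan.len() < 2 {
        return;
    }
    let orphaned = plan.remove(failed);
    let remaining = plan.len();
    for (i, piece) in orphaned.into_iter().enumerate() {
        plan[i % remaining].push(piece);
    }
}
```

A caller would start with `assign_pieces(piece_count, DEFAULT_STREAMS)` and call `reassign_failed` whenever an onion stream drops, so that no piece is left without an owner.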
Tracking
To make the tracker functionality of BitTorrent distributed (not centralized in any way), a new tracker application, written in Rust, would need to be created to run as a distributed, federating system. In this model, the tracker must exist on all nodes in the network automatically. This approach has several advantages over traditional centralized tracker systems, including increased resilience, scalability, privacy, and plausible deniability. Making every node a tracker also makes it easy to self-host files without needing someone else's tracker. The reverse-proxy aspect of Tor onion services makes this trivial from any network.
A modified version of the application Torrust Tracker could be used. To make it distributed, the following modifications might need to be made:
- Peer-to-peer communication: Torrust Tracker would need to be modified to support peer-to-peer communication between all nodes in the network. This would involve implementing a distributed messaging protocol that allows nodes to communicate with each other directly, without the need for a central server.
- Distributed data storage: The tracker would need to be modified to store its data in a distributed manner, such as in a distributed hash table (DHT). A DHT is a decentralized system for storing and retrieving key-value pairs that can be used to store information about the torrents being shared on the network.
- Load balancing: In a distributed system, load balancing becomes important to ensure that no single node becomes overloaded with requests. To achieve this, Torrust Tracker would need to be modified to distribute incoming requests across all the nodes in the network, using a load balancing algorithm.
- Fault tolerance: To ensure that the tracker remains available in the event of node failures, it would need to be modified to handle node failures gracefully, by redistributing the workload across the remaining nodes in the network.
- Security: To ensure the security and privacy of the users on the network, the tracker would need to be modified to run over the Tor network using Arti, providing end-to-end encryption and anonymization.
Every node on the network is a tracker. In addition, every tracker can choose to become a mirror for any other tracker (which doesn't mean it copies all the data, just the metadata). Becoming a tracker mirror should be as simple as copying and pasting the Tor onion address of the tracker, which is the only identifier of a node. Therefore, every node operator can run multiple instances and trivially copy over tracker data. This way, when an operator needs to restart hardware or software, they can leave one instance online so that related tracker data is still accessible to the rest of the network. If the original tracker does not ever come back online, that is not a problem. Copying tracker data is a one-time event (full backup), and an operator can choose to automatically keep the tracker data up to date, or to do it manually. But each copy of a tracker becomes its own net-new onion service. Even though becoming a tracker mirror is a one-time event, that does not apply to keeping track of the peers that have copies of the file data related to the tracker data. Address data must be shared synchronously in near real-time between all peers that share tracker data and file data.
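The distinction drawn above, between a one-time copy of tracker metadata and the ongoing sharing of peer addresses, can be illustrated with a small in-memory model. The `TrackerTable` type, its fields, and the use of a plain string as the torrent identifier are hypothetical. Networking, authentication, and the near real-time transport are left out entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory tracker state held by a node (hypothetical layout).
#[derive(Debug, Clone, Default)]
pub struct TrackerTable {
    /// Torrent identifier -> descriptive metadata (just a name here).
    pub torrents: BTreeMap<String, String>,
    /// Torrent identifier -> onion addresses of peers holding its data.
    pub peers: BTreeMap<String, BTreeSet<String>>,
}

impl TrackerTable {
    pub fn add_torrent(&mut self, id: &str, name: &str) {
        self.torrents.insert(id.to_string(), name.to_string());
    }

    /// Records a peer for a known torrent. Returns false if this
    /// tracker holds no metadata for the given torrent.
    pub fn announce(&mut self, id: &str, onion: &str) -> bool {
        if !self.torrents.contains_key(id) {
            return false;
        }
        self.peers
            .entry(id.to_string())
            .or_default()
            .insert(onion.to_string());
        true
    }

    /// One-time mirror: copies the metadata only, not the peer lists
    /// and not any file data.
    pub fn mirror_of(source: &TrackerTable) -> TrackerTable {
        TrackerTable {
            torrents: source.torrents.clone(),
            peers: BTreeMap::new(),
        }
    }

    /// Ongoing step: merges peer addresses for torrents this table
    /// already mirrors. Unknown torrents are ignored.
    pub fn sync_peers_from(&mut self, source: &TrackerTable) {
        for (id, addresses) in &source.peers {
            if self.torrents.contains_key(id) {
                self.peers
                    .entry(id.clone())
                    .or_default()
                    .extend(addresses.iter().cloned());
            }
        }
    }
}
```

In this model a mirror stays useful even if the original tracker never returns, because it holds its own copy of the metadata and keeps accumulating peer addresses from any other node that shares the same torrents.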
Tor middle relay
Classically, with BitTorrent, the share ratio is what determines how much someone can download. In this torified version of a BitTorrent application, the share ratio needs to be pre-determined by how much Tor middle relay traffic the user has provided to the network. The Tor network has limited bandwidth and resources, and using it for high-volume file sharing could negatively impact the network's performance. By using middle relays as a measure of contribution, users would be incentivized to provide resources to the network without overburdening it. Determining a fair and effective share ratio based on Tor middle relay traffic could be challenging and would require careful consideration and testing. Remember that Tor onion services only utilize middle relays, not exit relays. So substantially increasing the size of the network with thousands of new middle relays of this type would not affect, and would not contribute to, exit relaying.
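As a starting point for that testing, a relay-based share ratio could be expressed as simple arithmetic: relayed bytes earn a download allowance at some rate, and downloads spend it. The sketch below is illustrative only. The rate parameter (`download_percent_of_relayed`) is a made-up knob, and how relayed traffic would be measured or verified is not addressed.

```rust
/// Bytes the user may still download, given how much middle relay traffic
/// they have carried. `download_percent_of_relayed` is a hypothetical
/// policy knob: 50 means every 2 relayed bytes earn 1 downloadable byte.
pub fn remaining_allowance(
    relayed_bytes: u64,
    downloaded_bytes: u64,
    download_percent_of_relayed: u64,
) -> u64 {
    let earned = relayed_bytes.saturating_mul(download_percent_of_relayed) / 100;
    earned.saturating_sub(downloaded_bytes)
}

/// Whether a download of `request_bytes` fits in the remaining allowance.
pub fn may_download(
    relayed_bytes: u64,
    downloaded_bytes: u64,
    download_percent_of_relayed: u64,
    request_bytes: u64,
) -> bool {
    request_bytes
        <= remaining_allowance(relayed_bytes, downloaded_bytes, download_percent_of_relayed)
}
```

Saturating arithmetic keeps the result at zero when a user has already downloaded more than they have earned, so the allowance can never go negative.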
How does it all get started?
UI
The app depends on a web browser as an app interface.
First use
First, once launched, the app (node):
- Allows the user to see configuration, search, file management, and share management via a local URL (http://127.0.0.1:port) or an onion URI (http://v3onion.onion:port/management/token).
- Becomes a Tor middle relay
- Displays basic statistics and log output from various Tor services.
A node operator must know at least one other existing, online node (from friends, from a trusted clear-web website, from Reddit, etc). Every node that a user manually adds is considered a trusted node. Once a trusted node is added to the app, the user's local app will use Tor onion services to do two things:
- Check against the trusted node to see if its software version is newer. If newer, the local app will automatically track (tracker, aka redistribute), download, and seed the newer software. It will not install the update automatically: the user must initiate the update, and the app gives the user the ability to manually validate checksums against the app maintainers' website.
- Make tracker data from the trusted node searchable upon connecting, without the local app becoming a mirror of that tracker data.
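The version check in the first bullet above might reduce to a comparison like the one sketched here. The `major.minor.patch` version format is an assumption, since the document does not specify one. The functions only decide whether the newer software should be downloaded and seeded, and never trigger installation.

```rust
/// Parses a version string of the assumed form "major.minor.patch".
pub fn parse_version(s: &str) -> Option<(u32, u32, u32)> {
    let mut parts = s.trim().split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // More than three components: not this format.
    }
    Some((major, minor, patch))
}

/// True only if both versions parse and the trusted node's version is
/// strictly newer. Unparseable input is never treated as an update.
pub fn remote_is_newer(local: &str, remote: &str) -> bool {
    match (parse_version(local), parse_version(remote)) {
        (Some(l), Some(r)) => r > l, // Tuples compare field by field.
        _ => false,
    }
}
```

Comparing parsed numbers instead of raw strings matters: as text, "0.9.10" would sort after "0.10.0", which would hide a legitimate update.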
Web of trust
Connecting to a trusted node (one-degree of separation) can provide further access to the trusted nodes of the trusted node that the user connected to (two-degrees of separation). However, trust only works for one-degree of separation by default in order to minimize local performance issues. After adding a first node to trust (1deg), the user adding the node to trust can opt-in to adding up to N-degrees of separation for node trust. In other words: if a trusted node (1deg) trusts two nodes (2deg), the app user will in effect trust three total nodes. If the user allows up to three-degrees of separation for trust and the two (2deg) trusted nodes each trust two nodes (3deg), the user will in effect trust seven nodes (1 + 2 + 4). If any of those degrees-of-separation trusts 10,000 nodes, you can see how that might quickly overwhelm the user's local app, and is why they need to be careful about adding nodes to trust based on their hardware, software, and network limitations. Limiting searchable and shareable access to nodes via delegated trust also helps keep the network somewhat flat (prevents extreme bloating), while still allowing users to easily access and share.
Leeching
A user wishing to download file data must first mirror the tracker data of the file they wish to download, further enhancing the distribution of tracker data. Once the tracker data is 100% mirrored, the source node adds the onion service of the user to its tracker table, and all nodes that trust and mirror that node then become aware of this new node and what files it is offering, but the new node is not trusted by any node.
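The degrees-of-separation arithmetic above can be reproduced with a depth-limited breadth-first walk over the trust graph. This sketch assumes the trust relationships are already known locally as a map from each node's onion address to the addresses it trusts. The function name and data layout are illustrative only.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Returns every node reachable from the user's manually pinned nodes
/// within `max_degree` degrees of separation. Pinned nodes count as
/// degree 1, the nodes they trust as degree 2, and so on.
pub fn trusted_within<'a>(
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    pinned: &[&'a str],
    max_degree: u32,
) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::new();
    if max_degree >= 1 {
        for &node in pinned {
            if seen.insert(node) {
                queue.push_back((node, 1u32));
            }
        }
    }
    while let Some((node, degree)) = queue.pop_front() {
        if degree >= max_degree {
            continue; // Do not expand past the user's chosen limit.
        }
        if let Some(trusted) = graph.get(&node) {
            for &next in trusted {
                // `insert` returns false for nodes already counted,
                // which also protects against trust cycles.
                if seen.insert(next) {
                    queue.push_back((next, degree + 1));
                }
            }
        }
    }
    seen
}
```

With one pinned node that trusts two nodes, each of which trusts two more, limits of 1, 2, and 3 degrees yield 1, 3, and 7 trusted nodes respectively, matching the 1 + 2 + 4 example in the text.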
Seeding
Something
UX
From the UI:
- Users (node operators) can point the app to any local file or folder that they wish to share.
- The application will automatically generate a tracker file for the user that gets self-hosted.
- The user will be required to input information about the file(s) they are about to share. Here is where there should be additional user education about not deanonymizing one's self, if applicable.
- After confirming the files to be shared, the app will automatically make the tracker data and file data ready to be shared (as a private tracker by default).
Private trackers
Being able to keep shared data limited (not publicly shared) is an important feature. By default, data that is ready to be shared will only be privately available. Meaning:
- the tracker for this data will effectively be a private tracker and a random onion URI (http://v3onion.onion:port/private/token) will be generated exclusively for this tracker and file.
- in order to share access to this file in its default state, a user must share the onion URI out-of-band from the application.
Public trackers
Being able to trivially share data with the whole world is also an important feature. Once data has been made available for private sharing, a user can opt-in to making it publicly available.
- With a single click, a user can convert something from a dedicated private tracker into a public share via a new onion URI (http://v3onion.onion:port/public/token).
- Sharing this onion URI with anyone, or with the general public, will allow any app user to access this user's public trackers and any publicly shared data.
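The private-by-default behavior and the one-click switch to a public share could be modeled as below. The URI shapes follow the examples in the text. The `Share` type and its fields are assumptions, and the random tokens are passed in as parameters on the assumption that they come from a cryptographically secure generator elsewhere in the application.

```rust
/// Whether a share is reachable only via an out-of-band link or publicly.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Private,
    Public,
}

/// A shared tracker and its file data (hypothetical layout).
#[derive(Debug, Clone)]
pub struct Share {
    pub onion_host: String,
    pub port: u16,
    pub visibility: Visibility,
    /// Random token; assumed to be generated by a CSPRNG elsewhere.
    pub token: String,
}

impl Share {
    /// New shares always start out private.
    pub fn new_private(onion_host: &str, port: u16, token: &str) -> Share {
        Share {
            onion_host: onion_host.to_string(),
            port,
            visibility: Visibility::Private,
            token: token.to_string(),
        }
    }

    /// Builds the onion URI in the shape used in the text above.
    pub fn uri(&self) -> String {
        let segment = match self.visibility {
            Visibility::Private => "private",
            Visibility::Public => "public",
        };
        format!(
            "http://{}:{}/{}/{}",
            self.onion_host, self.port, segment, self.token
        )
    }

    /// Converts the share to public under a new token, so the public
    /// URI differs from the private one that was shared out-of-band.
    pub fn make_public(&mut self, new_token: &str) {
        self.visibility = Visibility::Public;
        self.token = new_token.to_string();
    }
}
```

Issuing a fresh token on conversion is a design choice in this sketch, not something the text requires. It means knowing the public link reveals nothing about the earlier private link.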