@zhongdao 2018-11-21T10:46:15.000000Z


BitTorrent Protocol (Translation)

Original URL: http://www.bittorrent.org/beps/bep_0003.html

BEP: 3
Title: The BitTorrent Protocol Specification
Version: 0e08ddf84d8d3bf101cdf897fc312f2774588c9e
Last-Modified: Sat Feb 4 12:58:40 2017 +0100
Author: Bram Cohen <bram@bittorrent.com>
Status: Final
Type: Standard
Created: 10-Jan-2008
Post-History: 24-Jun-2009 (arvid@bittorrent.com), clarified the encoding of strings in torrent files. 20-Oct-2012 (arvid@bittorrent.com), clarified that info-hash is the digest of the bencoding found in the .torrent file; introduced some references to new BEPs and cleaned up formatting. 11-Oct-2013 (arvid@bittorrent.com), corrected the accepted and de-facto sizes for request messages. 04-Feb-2017 (the8472.bep@infinite-source.de), further info-hash clarifications, added resources for new implementors.

BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.


A BitTorrent file distribution consists of these entities:

- An ordinary web server
- A static 'metainfo' file
- A BitTorrent tracker
- An 'original' downloader
- The end user web browsers
- The end user downloaders


There are ideally many end users for a single file.

To start serving, a host goes through the following steps:


  1. Start running a tracker (or, more likely, have one running already).
  2. Start running an ordinary web server, such as Apache, or have one already.
  3. Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).
  4. Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
  5. Put the metainfo file on the web server.
  6. Link to the metainfo (.torrent) file from some other web page.
  7. Start a downloader which already has the complete file (the 'origin').

To start downloading, a user does the following:


  1. Install BitTorrent (or have done so already).
  2. Surf the web.
  3. Click on a link to a .torrent file.
  4. Select where to save the file locally, or select a partial download to resume.
  5. Wait for download to complete.
  6. Tell downloader to exit (it keeps uploading until this happens).



metainfo files

Metainfo files (also known as .torrent files) are bencoded dictionaries with the following keys:
announce - The URL of the tracker.

info - This maps to a dictionary, with keys described below.

All strings in a .torrent file that contain text must be UTF-8 encoded.

info dictionary

The name key maps to a UTF-8 encoded string which is the suggested name to save the file (or directory) as. It is purely advisory.

piece length maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one which may be truncated. piece length is almost always a power of two, most commonly 2^18 = 256 K (BitTorrent prior to version 3.2 uses 2^20 = 1 M as default).

pieces maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.
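The relationship between piece length and pieces can be sketched in Python as follows. This is a minimal illustration, not the reference implementation; the function names `make_pieces`, `piece_hash`, and `verify_piece` are hypothetical.

```python
import hashlib

SHA1_LEN = 20  # length of one SHA1 digest in bytes


def make_pieces(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA1 digest of every fixed-size piece of `data`."""
    digests = []
    for start in range(0, len(data), piece_length):
        piece = data[start:start + piece_length]  # the last piece may be shorter
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


def piece_hash(pieces: bytes, index: int) -> bytes:
    """Return the 20-byte digest stored for the piece at `index`."""
    if len(pieces) % SHA1_LEN != 0:
        raise ValueError("pieces length must be a multiple of 20")
    if not 0 <= index < len(pieces) // SHA1_LEN:
        raise IndexError("piece index out of range")
    return pieces[index * SHA1_LEN:(index + 1) * SHA1_LEN]


def verify_piece(pieces: bytes, index: int, piece: bytes) -> bool:
    """Check downloaded piece data against the digest from the metainfo file."""
    return hashlib.sha1(piece).digest() == piece_hash(pieces, index)
```

A downloader would run a check like `verify_piece` on each completed piece before announcing that it has that piece.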

There is also either a key length or a key files, but never both and never neither. If length is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.

In the single file case, length maps to the length of the file in bytes.

For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:

length - The length of the file, in bytes.

path - A list of UTF-8 encoded strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).

In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.
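One way to picture the multi-file case is to map each file onto its byte range in the concatenated stream. The sketch below assumes an already decoded info dictionary with text keys (a real decoder would typically yield byte-string keys); `file_layout` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def file_layout(info: dict) -> list[tuple[str, int, int]]:
    """Return (relative path, start offset, length) for every file.

    Offsets refer to the single virtual file obtained by concatenating
    the files in the order they appear in the files list.
    """
    name = info["name"]
    if "length" in info:
        # Single file case: name is the file name itself.
        return [(name, 0, info["length"])]

    layout = []
    offset = 0
    for entry in info["files"]:
        if not entry["path"]:
            raise ValueError("a zero length path list is an error")
        # Multi-file case: name is the directory that holds everything.
        path = str(PurePosixPath(name, *entry["path"]))
        layout.append((path, offset, entry["length"]))
        offset += entry["length"]
    return layout
```

Because pieces are cut from the concatenated stream, a single piece can span the boundary between two files.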


Tracker GET requests have the following keys:

info_hash - The 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. This value will almost certainly have to be escaped.

peer_id - A string of length 20 which this downloader uses as its id.

ip - An optional parameter giving the IP (or dns name) which this peer is at.

port - The port number this peer is listening on.

uploaded - The total amount uploaded so far, encoded in base ten ascii.

downloaded - The total amount downloaded so far, encoded in base ten ascii.

left - The number of bytes this peer still has to download, encoded in base ten ascii.

event - An optional key which maps to started, completed, or stopped (or empty, which is the same as not being present).

Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason, then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular rerequests, and peers. peers maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or dns name as a string, and port number, respectively. Note that downloaders may rerequest at unscheduled times if an event happens or they need more peers.
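A tracker response in the non-compact form described above could be read roughly as follows. The decoder is a minimal sketch of bencoding as defined in BEP 3 (integers `i<n>e`, strings `<length>:<bytes>`, lists `l...e`, dictionaries `d...e`); `bdecode` and `parse_tracker_response` are hypothetical names and error handling is kept deliberately small.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at `pos`; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():                    # string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {pos}")


def parse_tracker_response(body: bytes):
    """Return (interval, [(peer id, ip, port), ...]) or raise on failure."""
    response, _ = bdecode(body)
    if b"failure reason" in response:
        raise RuntimeError(response[b"failure reason"].decode("utf-8", "replace"))
    peers = [(p[b"peer id"], p[b"ip"].decode("ascii"), p[b"port"])
             for p in response[b"peers"]]
    return response[b"interval"], peers
```

A client that receives the compact peer list of BEP 23 instead would need a different parser for the peers value.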

More commonly, trackers return a compact representation of the peer list; see BEP 23.

If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.

It is common to announce over a UDP tracker protocol as well.

peer protocol

BitTorrent's peer protocol operates over TCP or uTP.

Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.

The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.

Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.

Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader doesn't have something they would currently ask a peer for if unchoked, they must express lack of interest, even while they are choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.
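The two bits of state on each end of a connection can be modelled as four flags seen from one side. This is only a sketch; the class and field names (`ConnectionState`, `am_choking`, and so on) are assumptions and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    # Connections start out choked and not interested in both directions.
    am_choking: bool = True        # we are choking the peer
    am_interested: bool = False    # we want something the peer has
    peer_choking: bool = True      # the peer is choking us
    peer_interested: bool = False  # the peer wants something we have

    def update_interest(self, peer_has: set, we_have: set) -> None:
        """Keep our interest flag current, even while we are choked."""
        self.am_interested = bool(peer_has - we_have)

    def can_download(self) -> bool:
        """Data flows to us when we are interested and the peer is not choking."""
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        """Data flows to the peer when it is interested and we are not choking."""
        return self.peer_interested and not self.am_choking
```

Calling `update_interest` whenever either side gains a piece is what lets the remote peer predict whether unchoking us would start a transfer immediately.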

Connections start out choked and not interested.

When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called 'pipelining'). On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.

The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.

All later integers sent in the protocol are encoded as four bytes big-endian.

After the fixed headers come eight reserved bytes, which are all zero in all current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.

Next comes the 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it's raw instead of quoted.) If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.
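The hash in question can be illustrated with a small bencoder. The sketch below follows the bencoding rules of BEP 3 (dictionary keys sorted as raw byte strings); `bencode` and `info_hash` are hypothetical names. Note the assumption it makes: re-encoding a decoded info dictionary only reproduces the real info-hash when the original file was canonically encoded, so a robust client would hash the exact bytes found in the .torrent file instead.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, byte/text strings, lists and dicts in bencoded form."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:" % len(value) + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys must be strings and appear in sorted order (raw bytes).
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> bytes:
    """Raw 20-byte SHA1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).digest()
```

The raw digest goes into the handshake as is, while the tracker request carries the same bytes in URL-escaped form.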

After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.
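Putting the handshake fields together gives a fixed 68-byte message: 1 length byte, 19 bytes of protocol string, 8 reserved bytes, 20 bytes of info hash, and 20 bytes of peer id. A minimal sketch, with hypothetical function names:

```python
PROTOCOL = b"BitTorrent protocol"  # 19 bytes, prefixed by its own length
HANDSHAKE_LEN = 1 + len(PROTOCOL) + 8 + 20 + 20  # 68 bytes in total


def build_handshake(info_hash: bytes, peer_id: bytes,
                    reserved: bytes = bytes(8)) -> bytes:
    """Assemble the handshake: length prefix, protocol name, reserved, hash, id."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash and peer_id are 20 bytes, reserved is 8")
    return bytes([len(PROTOCOL)]) + PROTOCOL + reserved + info_hash + peer_id


def parse_handshake(data: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a received handshake into (reserved, info_hash, peer_id)."""
    if len(data) != HANDSHAKE_LEN:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PROTOCOL) or data[1:20] != PROTOCOL:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]
```

A receiver would compare the returned info hash with its own downloads, and the initiating side would compare the returned peer id with the one it expected, severing the connection on a mismatch.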

That's it for handshaking. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.
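The length-prefixed stream can be split into messages with a small buffer loop. This sketch assumes the four-byte big-endian prefix described earlier; `frame` and `split_messages` are hypothetical names.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a message body with its length as a four-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def split_messages(buffer: bytes):
    """Return ([(type byte, payload), ...], unconsumed bytes).

    Zero-length messages are keepalives and are skipped.
    """
    messages = []
    pos = 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break  # the rest of this message has not arrived yet
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue  # keepalive
        messages.append((body[0], body[1:]))
    return messages, buffer[pos:]
```

The unconsumed tail is kept and prepended to the next chunk read from the socket.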

peer messages

All non-keepalive messages start with a single byte which gives their type.

The possible values are:

0 - choke
1 - unchoke
2 - interested
3 - not interested
4 - have
5 - bitfield
6 - request
7 - piece
8 - cancel

'choke', 'unchoke', 'interested', and 'not interested' have no payload.

'bitfield' is only ever sent as the first message. Its payload is a bitfield with each index that downloader has sent set to one and the rest set to zero. Downloaders which don't have anything yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0 - 7 from high bit to low bit, respectively. The next byte corresponds to indices 8 - 15, and so on. Spare bits at the end are set to zero.
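The bit ordering is easy to get backwards, so here is a small sketch of packing and unpacking a bitfield payload. The function names are hypothetical, and rejecting non-zero spare bits is a design choice of this sketch rather than something the text above requires.

```python
def encode_bitfield(have: set[int], num_pieces: int) -> bytes:
    """Pack piece indices into a bitfield, index 0 in the high bit of byte 0."""
    field = bytearray((num_pieces + 7) // 8)  # spare bits stay zero
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError(f"piece index {index} out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field: bytes, num_pieces: int) -> set[int]:
    """Unpack a bitfield payload into the set of piece indices it marks."""
    if len(field) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    for index in range(num_pieces, len(field) * 8):
        if field[index // 8] & (0x80 >> (index % 8)):
            raise ValueError("spare bits must be zero")
    return {i for i in range(num_pieces) if field[i // 8] & (0x80 >> (i % 8))}
```

For ten pieces with indices 0, 7, and 8 present, the payload is the two bytes 0x81 0x80.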

The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.

'request' messages contain an index, begin, and length. begin is a byte offset within the piece and length is the number of bytes requested. Length is generally a power of two unless it gets truncated by the end of the file. All current implementations use 2^14 (16 kiB), and close connections which request an amount greater than that.
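Splitting one piece into 16 kiB requests might look like the sketch below. The type byte 6 for 'request' is the value BEP 3 assigns to that message; `block_requests` is a hypothetical helper, and the output includes the four-byte length prefix used on the wire.

```python
import struct

BLOCK_SIZE = 2 ** 14  # 16 kiB, the request size current implementations use
REQUEST = 6           # message type byte for 'request' in BEP 3


def block_requests(index: int, piece_size: int) -> list[bytes]:
    """Build the framed request messages that together cover one piece."""
    messages = []
    for begin in range(0, piece_size, BLOCK_SIZE):
        # The final block is truncated at the end of the piece.
        length = min(BLOCK_SIZE, piece_size - begin)
        payload = struct.pack(">BIII", REQUEST, index, begin, length)
        messages.append(struct.pack(">I", len(payload)) + payload)
    return messages
```

Keeping several of these messages in flight at once is the pipelining described earlier.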

'cancel' messages have the same payload as request messages. They are generally only sent towards the end of a download, during what's called 'endgame mode'. When a download is almost complete, there's a tendency for the last few pieces to all be downloaded off a single hosed modem line, taking a very long time. To make sure the last few pieces come in quickly, once requests for all pieces a given downloader doesn't have yet are currently pending, it sends requests for everything to everyone it's downloading from. To keep this from becoming horribly inefficient, it sends cancels to everyone else every time a piece arrives.

'piece' messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. It's possible for an unexpected piece to arrive if choke and unchoke messages are sent in quick succession and/or transfer is going very slowly.

Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.
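Random piece selection can be sketched in a few lines. The `pick_piece` name and the `in_progress` parameter are assumptions made for illustration; real clients often layer further policies on top.

```python
import random


def pick_piece(we_have: set[int], peer_has: set[int],
               in_progress: frozenset = frozenset(), rng=random):
    """Pick a random piece the peer has that we neither have nor are fetching."""
    candidates = sorted(peer_has - we_have - in_progress)
    if not candidates:
        return None  # nothing to ask this peer for: we are not interested
    return rng.choice(candidates)
```

A `None` result corresponds to the case where the downloader must tell the peer it is not interested.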

Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure that they get a consistent download rate.

The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.

There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.

The currently deployed choking algorithm avoids fibrillation by only changing who's choked once every ten seconds. It does reciprocation and number of uploads capping by unchoking the four peers which it has the best download rates from and are interested. Peers which have a better upload rate but aren't interested get unchoked and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.

For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders). Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.

Resources

Copyright

This document has been placed in the public domain.


Introduction to the BitTorrent protocol (includes diagrams; worth a look)