@zhongdao 2019-04-16T11:07:56.000000Z Word count: 11650, Views: 5721

A Survey of Large-File Distribution Schemes Based on P2P Networks



Preface

Diagram comparing why a P2P architecture is faster than a client-server architecture
image_1c621j0vb1dcv1imrnccp6v6o9.png-83.9kB

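When a large file must be delivered to many receivers within a limited time, centralized transfer over FTP is not a wise choice; sharing it over P2P is far more efficient.
In a client-server architecture, the distribution time grows linearly with the number of peers.
In a P2P architecture, the distribution time levels off toward a constant as the number of peers grows.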

image_1c5ih79ql1d43pgo1pj7hio129iu.png-19.7kB
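
The two trends can be reproduced with the standard lower-bound model for file distribution time. The following is a minimal sketch, assuming a file of size `file_size`, a server upload rate `server_up`, a minimum peer download rate `min_down`, and `n` peers that all upload at the same rate `peer_up`. The function names and the numbers are illustrative assumptions, not measurements from this survey.

```python
def client_server_time(n, file_size, server_up, min_down):
    """Lower bound on distribution time when the server sends every copy itself."""
    return max(n * file_size / server_up, file_size / min_down)


def p2p_time(n, file_size, server_up, min_down, peer_up):
    """Lower bound on distribution time when every peer also uploads at peer_up."""
    total_upload = server_up + n * peer_up
    return max(file_size / server_up, file_size / min_down, n * file_size / total_upload)


if __name__ == "__main__":
    # Illustrative numbers: 1000 Mbit file, 100 Mbit/s server upload,
    # 50 Mbit/s peer download, 10 Mbit/s peer upload.
    for n in (10, 100, 1000):
        cs = client_server_time(n, 1000, 100, 50)
        p2p = p2p_time(n, 1000, 100, 50, 10)
        print(f"N={n:5d}  client-server={cs:8.1f}s  p2p={p2p:6.1f}s")
```

With these numbers the client-server bound grows in proportion to `n`, while the P2P bound never exceeds `file_size / peer_up`, which is 100 seconds here.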

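The most popular P2P protocol is BitTorrent, which has many clients. Programs with a graphical interface require manual operation, whereas command-line programs are far more flexible and can automate tasks. On Linux there are not many P2P clients with command-line support; ctorrent is one of them.
This leads to the DIY steps below. If you only want to use it, you can skip the part about compiling from source and download and install it directly from the URL provided.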

When sharing files over P2P with a BitTorrent tracker, the network topology looks like this:
image_1c5ic6jjhb7b30s1q33dicn4n4u.png-11.7kB

The earliest well-known P2P system dates back to 1999-2000, with the emergence of Napster.
image_1c70alab110brjjr1h761mmo15b99.png-19.8kB
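The next generation of P2P networks, such as Gnutella, Freenet, and BitTorrent, was built on a decentralized design, which avoids the limitations and problems that come with centralization.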
image_1c70alhp5t9btao1ojp1j5kqt6m.png-22.4kB

Introduction to the BitTorrent protocol

Existing solutions

Alibaba Dragonfly

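Details
Open source
Server ports: 8001, 8002
Client port: the upload service port is dynamic, not fixed
Protocol: HTTP
Support for private networks behind firewalls and routers over the Internet: Dragonfly is not suited to this scenario
Port mapping: unclear
NAT traversal support: not supported
Maximum file size: supports large files on the order of 10 GB
Number of clients: transfers at a scale of 100,000 clients and above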

GitHub repository:
https://github.com/alibaba/Dragonfly

TCP/UDP connections and how NAT traversal works

https://www.codeproject.com/Articles/1199384/NAT-traversal-for-Software-Developers

Basic principles and implementation of P2P communication
https://www.pppan.net/blog/detail/2017-12-16-p2p-over-middle-box

Bittorrent and NAT Traversal

Q: I know about NAT traversal and about STUN, TURN and ICE and its use. I want to know whether these are implemented in peer to peer file sharing application like bittorrent. Whether trackers facilitate peers behind NATs to communicate with each other by helping in creating direct connection using STUN or relay through TURN. In the case of Distributed Hash Table(DHT) how one peer would communicate with other peer behind NAT ?


A: BitTorrent does not need to connect to any particular member in a swarm, it's not a p2p chat protocol where two specific end points want to talk to each other. All it cares about is that the connection graph of the swarm has a sufficiently high connectivity degree.

In other words, getting clients behind a NATs to talk to each other is somewhat desirable, but not to the point where major resources, such as traffic forwarding, would be expended on that goal. Failure is an option.

Thus it does not use sip/turn/etc.

Various clients use some combination of the following approaches to improve connectivity for the bulk transport connections:

* PCP, NAT-PMP or UPnP-IGD negotiation with the gateway
* port reuse socket options to exploit end-point independent (EIM) NAT mappings
* the largely undocumented ut_holepunch extension that uses mutually reachable swarm members in place of stun servers.
* an optional UDP-based transport protocol (µTP) that can be used in combination with the previous points. generally nat traversal is easier to achieve with udp
* IPv6 capability signalling, which in principle allows clients to upgrade their connections and then gossip about v6 peers via PEX/DHT.
In the case of the DHT only the first two points (gateway negotiation and port reuse) are used. The overhead of attempting nat traversal for a single request-reply cycle would be >100% and is not worth it.
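
The port-reuse item relies on endpoint-independent mapping (EIM): when a client sends all of its traffic from the same local address and port, an EIM NAT maps it to the same external port regardless of the destination. The toy simulation below contrasts that with a symmetric (destination-dependent) NAT. The `Nat` class, the port numbers, and the addresses are all hypothetical and chosen for illustration only; this does not model any real device.

```python
import itertools


class Nat:
    """Toy NAT that hands out external ports; not a model of any real device."""

    def __init__(self, endpoint_independent):
        self.endpoint_independent = endpoint_independent
        self.mappings = {}
        self._ports = itertools.count(40000)  # hypothetical external port range

    def external_port(self, local, remote):
        """Return the external port used when `local` sends to `remote`."""
        # EIM keys the mapping on the local endpoint only; a symmetric NAT
        # also keys it on the destination, so each destination gets a new port.
        key = local if self.endpoint_independent else (local, remote)
        if key not in self.mappings:
            self.mappings[key] = next(self._ports)
        return self.mappings[key]


local = ("192.168.1.10", 6881)
tracker = ("203.0.113.5", 80)
peer = ("198.51.100.7", 51413)

eim = Nat(endpoint_independent=True)
symmetric = Nat(endpoint_independent=False)

# With EIM, the port the tracker observed is also the port the peer can reach.
print(eim.external_port(local, tracker), eim.external_port(local, peer))
# With a symmetric NAT, the two destinations see different ports.
print(symmetric.external_port(local, tracker), symmetric.external_port(local, peer))
```

This is why reusing one local port for every connection helps: behind an EIM NAT, the external endpoint learned by one swarm member stays valid for the others.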


Why most BitTorrent clients do not support NAT traversal:

https://stackoverflow.com/questions/37367769/how-nat-traversal-works-in-case-of-peer-to-peer-protocols-like-bittorrent

https://news.ycombinator.com/item?id=8177353


Solutions to the problems P2P faces

Other options:
* Use UDP-based WebRTC
* Use

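Port forwarding and NAT

In computer networking, port forwarding (or port mapping) is an application of network address translation (NAT) that redirects a communication request from one address and port number combination to another while the packets traverse a network gateway such as a router or firewall. The technique is most commonly used to make services on a host inside a protected internal network available to hosts on the other side of the gateway (the external network), by remapping the destination IP address and port number of the communication to the internal host.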
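
As a rough illustration of the remapping described above, a gateway can be thought of as keeping a table from external ports to internal hosts. The table contents and the `forward` helper below are hypothetical; real gateways implement this in the kernel or firmware, not in application code.

```python
# Hypothetical forwarding table on the gateway: external port -> internal host and port.
FORWARD_RULES = {
    6881: ("192.168.1.10", 6881),
    8080: ("192.168.1.20", 80),
}


def forward(dst_port, rules=FORWARD_RULES):
    """Return the internal (ip, port) an inbound packet is redirected to, or None."""
    return rules.get(dst_port)


print(forward(8080))  # ('192.168.1.20', 80)
print(forward(22))    # None
```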

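However, on today's IPv4-based Internet, most office and home networks sit behind a NAT router or firewall and use private IP addresses. For P2P communication and data transfer, if you can configure the NAT of the internal network, you can set up a port mapping directly so that the port the internal P2P software listens on becomes reachable from outside. Port mappings can be set up automatically with the UPnP protocol, which requires support from both the application and the router. With multiple levels of NAT, this approach may not work.
If your NAT box supports NAT-PMP or UPnP, you could use that. The simplest way would be to create a cgo binding to libminiupnpc, or you could go fishing for a native Go library.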

If your NAT doesn't support either of these protocols, then you're probably out of luck, as hole punching is a fragile and difficult technique that you will probably find difficult to implement.
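
Whether a host needs a port mapping in the first place depends on whether its address is private. A minimal sketch, assuming IPv4 and the three RFC 1918 private ranges; `is_rfc1918` is a hypothetical helper name, and a host with a public address may of course still sit behind a firewall.

```python
import ipaddress

# The three RFC 1918 private IPv4 ranges.
PRIVATE_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr):
    """Return True if `addr` falls inside one of the RFC 1918 private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in PRIVATE_NETS)


for addr in ("192.168.1.10", "172.32.0.1", "8.8.8.8"):
    print(addr, is_rfc1918(addr))
```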

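NAT traversal

Is there an approach that does not require configuring the router? That inevitably raises the problem of NAT traversal. There are many methods. Because of how UDP works, hole punching is easier over UDP than over TCP, and the different types of NAT devices have given rise to techniques such as STUN, TURN, and ICE. All of these require support in the client application as well as an intermediate server on the public Internet.
The initial traversal needs the help of an intermediate public server. For the subsequent data transfer, some methods bypass the server, while others relay the data through the server and consume its bandwidth.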

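In relay mode, traffic has to be forwarded by the server, so the transfer speed depends on the server's downlink bandwidth and the client's uplink bandwidth. Broadband uplink bandwidth is usually low, which caps the outbound speed.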

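If the application itself has weak support for this, a quick solution is to use an intranet tunneling tool to build a tunnel from the public Internet into the internal network. SSH's built-in reverse tunneling can do this in place of tools such as ngrok or frp. frp itself has also started to support a P2P mode, which is worth trying out.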
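
The SSH reverse tunnel mentioned above comes down to a single `ssh -R` invocation. The sketch below only builds the command line; the host name `relay.example.com`, the user name, and the port numbers are placeholders, and `reverse_tunnel_command` is a hypothetical helper.

```python
def reverse_tunnel_command(public_host, remote_port, local_port, user="user"):
    """Build an `ssh -R` command that exposes a local port on a public host."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-R", f"{remote_port}:localhost:{local_port}",
        f"{user}@{public_host}",
    ]


cmd = reverse_tunnel_command("relay.example.com", 8022, 22)
print(" ".join(cmd))
# To actually open the tunnel: subprocess.run(cmd, check=True)
```

By default the forwarded port on the public host listens only on its loopback interface; exposing it to other machines depends on the server's `GatewayPorts` setting in `sshd_config`.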

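In the long run, the end goal is a TCP/UDP-based RPC and file transfer protocol that supports STUN and TURN servers to handle open IPs/ports and incoming connections. Within the scope of this project, however, we need a quick way to get around NAT and firewalls. If the application does not support this, it has to be done with an intranet tunneling tool such as ngrok or frp.
If the central server only helps establish the initial connection and the peers then talk to each other directly, there is no such bandwidth limit. If all traffic is tunneled through and forwarded by the central server, it is constrained by that server's bandwidth and computing resources. frp supports a P2P mode, which can achieve what we want.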
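
The difference between the two modes can be put in rough numbers. The sketch below assumes that the relay server splits its bandwidth evenly across concurrent sessions, which is a simplification; the function names and the rates (in Mbit/s) are illustrative.

```python
def relayed_rate(sender_up, server_down, server_up, receiver_down, sessions=1):
    """Per-session rate when all traffic is forwarded by one relay server."""
    return min(sender_up, server_down / sessions, server_up / sessions, receiver_down)


def direct_rate(sender_up, receiver_down):
    """Rate when the two peers talk directly after the initial handshake."""
    return min(sender_up, receiver_down)


# One session: the sender's 20 Mbit/s uplink is the bottleneck either way.
print(relayed_rate(20, 100, 100, 100), direct_rate(20, 100))
# Ten concurrent sessions through the same relay: the server becomes the bottleneck.
print(relayed_rate(20, 100, 100, 100, sessions=10))
```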

p2p: Bittorrent

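Animated demonstration of the whole distribution process:
http://mg8.org/processing/bt.html

Download process example:
Graphical view of a file being downloaded in pieces:
image_1c64jr03qhhc16291fopoohqv09.png-76.2kB
A file is split into many pieces, and the pieces are downloaded concurrently.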
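
Splitting a file into pieces can be sketched in a few lines. BitTorrent v1 torrents store one SHA-1 digest per piece so that each downloaded piece can be verified on its own. The piece length and the sample data below are illustrative assumptions, and the helper names are hypothetical.

```python
import hashlib


def split_into_pieces(data, piece_length):
    """Cut `data` into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces):
    """SHA-1 digest per piece, as used to verify pieces in BitTorrent v1 torrents."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


data = bytes(range(256)) * 10           # 2560 bytes of stand-in file content
pieces = split_into_pieces(data, 1024)  # real torrents use much larger pieces
hashes = piece_hashes(pieces)
print(len(pieces), [len(p) for p in pieces])
```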

Peer connections and download status:
image_1c64k7fi21rpgol0okei16vrh1m.png-71.7kB
Pieces are downloaded from every peer that has the file.
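
One way to picture downloading from every peer at once is to spread the pieces across the peers that hold them. The sketch below uses a simple least-loaded assignment; it is an illustrative scheduling rule, not the piece-selection algorithm of any particular BitTorrent client, and the peer names are made up.

```python
def assign_pieces(have):
    """Pick, for each piece, the least-loaded peer that has it.

    `have` maps a peer name to the set of piece indexes that peer holds.
    """
    load = {peer: 0 for peer in have}
    plan = {}
    all_pieces = sorted(set().union(*have.values()))
    for piece in all_pieces:
        owners = [peer for peer in have if piece in have[peer]]
        # Ties are broken by peer name so the result is deterministic.
        chosen = min(owners, key=lambda peer: (load[peer], peer))
        plan[piece] = chosen
        load[chosen] += 1
    return plan


have = {"peer_a": {0, 1, 2, 3}, "peer_b": {1, 3}, "peer_c": {2, 3}}
print(assign_pieces(have))
```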

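Syncthing, an open-source alternative to BitTorrent

Syncthing's standout feature is that it uses P2P distributed technology similar to Resilio Sync (BitTorrent Sync), letting multiple devices synchronize files with one another in real time without a central server.
Syncthing supports file versioning, which automatically keeps historical versions of files: each time a file changes, a new version is added (the total number to keep is configurable). If a file is deleted, lost, overwritten, or hit by a sync error, the user can restore it from one of its historical versions.
Syncthing is free and open source, and runs on mainstream platforms including Windows, Mac, Linux, and Android. Beyond PCs and phones, it also runs easily on hardware such as some routers and the Raspberry Pi, where it presents a web-based interface. Syncthing also offers a Chinese-language interface.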
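
The idea behind keeping a bounded number of historical versions can be sketched as follows. This is a toy in-memory model written for illustration; it is not how Syncthing implements versioning, and `VersionedFile` is a hypothetical class.

```python
from collections import deque


class VersionedFile:
    """Keeps the current content plus at most `keep` older versions."""

    def __init__(self, content, keep=3):
        self.current = content
        self.history = deque(maxlen=keep)  # oldest versions fall off automatically

    def write(self, content):
        """Archive the current content, then replace it."""
        self.history.append(self.current)
        self.current = content

    def restore(self, steps_back=1):
        """Make an older version current again, archiving the present one."""
        old = self.history[-steps_back]
        self.write(old)
        return old


f = VersionedFile("v1", keep=2)
for text in ("v2", "v3", "v4"):
    f.write(text)
print(list(f.history), f.current)
```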

WebRTC-based BitTorrent

Intro to BitTorrent and WebTorrent - JSConf.asia 2014
https://www.youtube.com/watch?v=Fx-AsXMZyfc
WebTorrent: How I built a BitTorrent client in the browser
https://www.youtube.com/watch?v=3w_6dfqrpzk

References

p2p

A consolidated overview of P2P network technology principles
http://blog.csdn.net/EricFantastic/article/details/49582731

A brief history of P2P distributed networks
https://segmentfault.com/a/1190000011919321

A Brief History of P2P Content Distribution, in 10 Major Steps
https://medium.com/paratii/a-brief-history-of-p2p-content-distribution-in-10-major-steps-6d6733d25122

Xorro P2P How we built a BitTorrent-like P2P network from scratch
https://xorro-p2p.github.io/

rsync

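Methods for efficient data synchronization and efficiency tests: packing, compressing, transferring, and unpacking in one pipeline (20150105)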
http://blog.csdn.net/xuyaqun/article/details/42422791

http://www.beyondoracle.com/2008/09/20/sync-backups-between-multiple-servers/

HOW TO: SPEED UP FILE TRANSFERS IN LINUX USING RSYNC WITH GNU PARALLEL
http://www.yourownlinux.com/2015/04/speed-up-file-transfers-using-rsync-with-gnu-parallel.html

murder

https://github.com/lg/murder

syncthing

code:
https://github.com/syncthing/syncthing
doc:
https://docs.syncthing.net/intro/getting-started.html

setup

Starting Syncthing Automatically
https://docs.syncthing.net/users/autostart.html
Installing and using Syncthing
https://www.jianshu.com/p/4235cc85c32d
Setting up Syncthing on an Ubuntu 16.04 server
http://drup.org/setting-syncthing-ubuntu-1604-server

tools

A command line tool for the Syncthing API.
https://github.com/classicsc/syncthingmanager

Bittorrent

https://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients

https://en.wikipedia.org/wiki/Comparison_of_file-sharing_applications

How BitTorrent finds torrents
http://blog.bittorrent.com/2015/10/29/how-bittorrent-finds-torrents/

Chinese translation of the BitTorrent DHT protocol
https://segmentfault.com/a/1190000002528378

Architecture of a search engine for magnet links and BT torrents based on the DHT network
https://segmentfault.com/a/1190000002528510

Optimizing BitTorrent transfers

BitTorrent optimization and troubleshooting guide
https://ubuntuforums.org/showthread.php?t=1259923

How to Make the Best Torrents
https://torrentfreak.com/how-to-make-the-best-torrents-081121/

Let's talk about how to improve BT download speed (using Xunlei Express Edition as an example)
http://www.dkys.org/archives/1377.html

Introduction to UPnP and its risks
http://www.dkys.org/archives/1298.html

Resilio

https://www.resilio.com/

The Fastest Way to Move Files
https://www.resilio.com/tech/p2p-is-faster/

Speed Calculator/ File Transfer Time
https://www.resilio.com/speed-calculator/

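Intranet, NAT, and related issues

Intranet tunneling, remote control, and port mapping: a summary of eight methods
https://zhuanlan.zhihu.com/p/26147793

NAT traversal (UDP hole punching)
https://blog.csdn.net/qq_23928491/article/details/86747862

UDP hole punching and detecting the NAT type: a UDP-based scheme for intelligent P2P NAT traversal
https://blog.csdn.net/jilijelon/article/details/9361475

p2p_tun: a kcptun-based P2P bidirectional proxy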
https://blog.csdn.net/qq_23928491/article/details/86747862

Ngrok

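Analysis and implementation of how ngrok intranet tunneling works
http://www.dkys.org/archives/92.html

Using the open-source Ngrok to set up your own free public domain name
http://blog.csdn.net/weixin_36065510/article/details/52904509

Intranet tunneling in one minute (setting up an ngrok server)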
http://blog.csdn.net/zhangguo5/article/details/77848658

FRP

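Official frp documentation:
https://github.com/fatedier/frp/blob/master/README_zh.md

IT Guy's VPS tutorial series, part 1: intranet tunneling (Frp), a rescue for those without a public IP
https://post.smzdm.com/p/566063/

Recommending a very handy intranet tunneling tool: FRP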
https://mp.weixin.qq.com/s?__biz=MzI3MTI2NzkxMA==&mid=2247485670&idx=1&sn=df62f2df93f112a7bc0b8d7e843bbc16&chksm=eac529cfddb2a0d9b0fb22324f3eaf5cffeb8e0a56d16efb87ad97d3cca6479e96e12c68eb88&mpshare=1&scene=23&srcid=0131VDjF0WIqduxU6j5sc9sg#rd

SSH port forwarding

Intranet tunneling with SSH reverse tunnels
http://blog.csdn.net/lidongshengajz/article/details/73482908

Notes on SSH port forwarding (IPv6 and port mapping)
https://www.simongong.net/ssh-e7-ab-af-e5-8f-a3-e8-bd-ac-e5-8f-91-e7-ac-94-e8-ae-b0ipv6-e4-b8-8e-e7-ab-af-e5-8f-a3-e6-98-a0-e5-b0-84/

OpenSSH/Cookbook/Proxies and Jump Hosts
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts#Jump_Hosts_--_Passing_Through_a_Gateway_or_Two

SSH principles and usage (2): remote operations and port forwarding
http://www.ruanyifeng.com/blog/2011/12/ssh_port_forwarding.html

SSL/TLS

An overview of how the SSL/TLS protocol works
http://www.ruanyifeng.com/blog/2014/02/ssl_tls.html

The SSL/TLS protocol illustrated
http://www.ruanyifeng.com/blog/2014/09/illustration-ssl.html

NAT Traversal

Basic principles and applications of NAT technology
https://www.cnblogs.com/dongzhuangdian/p/5105844.html
Principles of intranet tunneling tools and hands-on development
http://blog.csdn.net/moshenglv/article/details/78789019

Session Traversal Utilities for NAT (STUN) is a standardized protocol for such address discovery including NAT classification. Traversal Using Relays around NAT (TURN) places a third-party server to relay messages between two clients when direct media traffic between peers is not allowed by a firewall.
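
To make the address-discovery step concrete, the sketch below builds a STUN Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a response, following the message layout of RFC 5389. It covers IPv4 only and leaves out sending the request over UDP; the helper names are hypothetical and no particular STUN server is assumed.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, attribute length (0), magic cookie, 12-byte ID."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_xor_mapped_address(message):
    """Return (ip, port) from the first IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    length = struct.unpack("!H", message[2:4])[0]
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[pos:pos + 4])
        value = message[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port and address are XORed with the magic cookie on the wire.
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


request = build_binding_request()
print(len(request), request[:8].hex())
```

A client would send `request` to a STUN server over UDP and pass the reply to `parse_xor_mapped_address` to learn the public address and port its NAT assigned.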

Interactive Connectivity Establishment
https://en.wikipedia.org/wiki/Interactive_Connectivity_Establishment

uTorrent | Setting up your network [and some advanc...
http://help.utorrent.com/customer/en/portal/topics/76901-setting-up-your-network-and-some-advanced-setup-tips-/articles

Implementing a TCP Hole Punching NAT Traversal Solution for P2P Applications Using Netty
http://www.cs.stir.ac.uk/courses/ITNP99/PastDissertations/2009-2010/Dissertations/HaddadZ.pdf

TCP hole punching
https://en.wikipedia.org/wiki/TCP_hole_punching

Reliable p2p network connections in Rust with NAT traversal. One of the most needed libraries for any server-less, decentralised project.
https://github.com/maidsafe/crust

A New Method for Symmetric NAT Traversal in UDP and TCP
https://github.com/maidsafe/crust

DHT-based NAT Traversal
http://docs.maidsafe.net/Whitepapers/pdf/DHTbasedNATTraversal.pdf
