@Macux 2017-08-21T12:31:33.000000Z

Anti-Fraud Research

Mobvista

Anti-Fraud Research
- 1. Third-party anti-fraud approaches
- 2. State of anti-fraud research (papers)

1. Third-party anti-fraud approaches

1.1 Adjust anti-fraud approach

Distribution Modeling examines the click-to-install time distribution to determine instances of fraud. If an install is tied to statistically abnormal click behavior, we ignore the attribution and instead attribute the install to the next-best tracker.

Anonymous IP: the IP belongs to an anonymous proxy or to an anonymous VPN service.

Too many engagements: the same device_Id clicks the ad many times.
(1) If attributions flagged as high-level engagement share the same:
a) Tracker ID
b) App Token
c) Device tag
then the attribution is discarded and the install is attributed to the next-best tracker.
(2) For attributions made by fingerprint matching, if those flagged as high-level engagement share the same:
a) IP address
b) Device type
c) Device name
d) OS name
e) OS version
then the attribution is discarded and the install is attributed to the next-best tracker.
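The grouping rule above can be sketched as follows. This is a minimal illustration, not Adjust's implementation: the field names, the `MAX_CLICKS_PER_KEY` threshold, and the choice of "most recent unflagged click" as the next-best tracker are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical threshold: more clicks than this per key counts as "too many".
MAX_CLICKS_PER_KEY = 10

# Key fields for device-ID attribution and for fingerprint attribution.
DEVICE_KEY = ("tracker_id", "app_token", "device_tag")
FINGERPRINT_KEY = ("ip", "device_type", "device_name", "os_name", "os_version")


def high_engagement_keys(clicks, key_fields, max_clicks=MAX_CLICKS_PER_KEY):
    """Return the set of key tuples that occur more than max_clicks times."""
    counts = Counter(tuple(click[f] for f in key_fields) for click in clicks)
    return {key for key, n in counts.items() if n > max_clicks}


def next_best_click(candidates, flagged, key_fields):
    """Pick the most recent candidate click whose key is not flagged, or None."""
    for click in sorted(candidates, key=lambda c: c["ts"], reverse=True):
        if tuple(click[f] for f in key_fields) not in flagged:
            return click
    return None
```

The same two functions cover case (2) by passing `FINGERPRINT_KEY` instead of `DEVICE_KEY`.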

1.2 AppsFlyer anti-fraud approach

DeviceRank
- AppsFlyer grades devices using its own database. Grade C is the worst, and installs from grade-C devices are filtered out directly.
- How: a large, high-dimensional feature set is used to score each device. The feature set includes:
  (1) device details: OS and version number
  (2) historical fraud modeling scoring (the device's fraud history); the output is a score, possibly a generalized one
  (3) IP address
  (4) geolocation
  (5) transactional data
  (6) verified transaction data
  (7) mean-time-to-install, MTTI
  (8) Click to Install Time, CTIT
  (9) etc.
Rule-based detection
(1) Advanced Distribution Modeling (per channel)
- Relies on one key metric: Click to Install Time (CTIT).
- In a fraud-free world, CTIT distribution models would look like this:
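- Unreasonably short install times (under a few seconds) indicate install hijacking. (CTIT too short -> fraud)
- Nearly uniform distribution (flat lines) over a number of days indicate click flooding. (Install counts that stay flat across time windows -> fraud. Only click flooding can produce this: users install N days after the click, every such install is still attributed to this one channel, and the curve is very flat.)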
  
  How to identify click flooding:
  (1). High click volume and low conversion rates suggest that clicks are being dramatically over-reported (click-flooding).
  (2). Long CTIT times indicate that installs are being attributed to random clicks from a click flood.
  (3). High contributor rates indicate that the source in question is dramatically over-reporting clicks.
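The first two indicators can be computed per source with a few lines. The sketch below is only illustrative: the function names and every threshold (`min_clicks`, `max_cr`, `min_long_share`, the one-day cutoff for a "long" CTIT) are assumed values, not figures from AppsFlyer.

```python
def click_flood_signals(clicks, installs, ctit_seconds, long_ctit=24 * 3600):
    """Summarise one source: conversion rate and share of long CTITs."""
    conversion_rate = installs / clicks if clicks else 0.0
    if ctit_seconds:
        long_share = sum(1 for t in ctit_seconds if t > long_ctit) / len(ctit_seconds)
    else:
        long_share = 0.0
    return {
        "clicks": clicks,
        "conversion_rate": conversion_rate,
        "long_ctit_share": long_share,
    }


def looks_like_click_flood(signals, min_clicks=100_000, max_cr=0.001, min_long_share=0.3):
    """High click volume, low conversion rate, and many long CTITs together."""
    return (
        signals["clicks"] >= min_clicks
        and signals["conversion_rate"] <= max_cr
        and signals["long_ctit_share"] >= min_long_share
    )
```

In practice the thresholds would be benchmarked against other sources for the same app rather than fixed.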
(2) Mobile Install Fraud (per device; of limited relevance to us)
- Covers installs from new devices (devices not in the current database) and installs from LAT devices.
- Large numbers of installs from new devices indicate a DeviceID reset marathon. Remember, only pre-install campaigns should have high New Device rates. So a high New Device Install Rate (installs from new devices / total installs) indicates fraud.
- High concentrations of installs with Limit Ad Tracking (LAT) enabled indicate install fraud. (A high ratio of installs from LAT devices to total installs -> fraud)
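Both ratios are simple to compute once installs carry a device ID and a LAT flag. In this sketch the record fields `device_id` and `lat_enabled` and the `known_device_ids` set are assumed names chosen for illustration.

```python
def install_rates(installs, known_device_ids):
    """Return (new_device_rate, lat_rate) for a batch of installs.

    installs: iterable of dicts with "device_id" and "lat_enabled" keys.
    known_device_ids: set of device IDs already present in the database.
    """
    installs = list(installs)
    total = len(installs)
    if total == 0:
        return 0.0, 0.0
    new_devices = sum(1 for i in installs if i["device_id"] not in known_device_ids)
    lat_devices = sum(1 for i in installs if i["lat_enabled"])
    return new_devices / total, lat_devices / total
```

What counts as "high" depends on the campaign type, since pre-install campaigns legitimately show high new-device rates.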

1.3 Kochava anti-fraud approach

Rule-based detection
Some key points:
- Mean time to install defines the time between click and install and varies by app and network.
  （1）Mapping apps have a very low MTTI – users download when they need it.
  （2）Games have a high MTTI – users tend to download with the intention of playing, but it can take days before they launch the app.
- The vast majority of installs happen within very close proximity to the attributed click. Kochava identifies statistically significant variances between the location of the click and that of the install. These variances may be leading indicators for inaccurate geo-targeting and fraud. (The correlation may or may not exist, and it may or may not be significant.)
- Kochava identifies fraudulent or mistargeted clicks by looking for instances where the platform captured does not match the platform of the advertised app. For example, if an ad is for an iOS app but the click is from an Android device. This may indicate that bot farms are generating fraudulent traffic. It may also indicate poorly targeted traffic. (Mismatch between the device type targeted by the ad and the device type reported at install -> fraud)
Some well-known points
- Large click volumes from the same IP address.
- Large click volumes from the same device ID with identical timestamps.
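The location and platform checks can be sketched together. This is a simplification: Kochava is described as testing for statistically significant variance, while the example uses a fixed distance cutoff (`max_km`, an assumed value) on the great-circle distance between click and install. Field names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_suspicious(click, install, app_platform, max_km=500.0):
    """Flag a click/install pair on platform mismatch or large geo distance."""
    platform_mismatch = click["platform"] != app_platform
    distance = haversine_km(click["lat"], click["lon"], install["lat"], install["lon"])
    return platform_mismatch or distance > max_km
```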

2. State of anti-fraud research (papers)

2.1 fraud-whitepaper

4 basic forms of fraudulent activity and standard industry solutions

Faulty Targeting: generates clicks and installs from untargeted and unwanted users.
(1) Fraud pattern: Faulty targeting refers to traffic that comes from mistargeted countries or device types.
(2) Countermeasure: Only attributions from targeted countries or device types can match, and any install from a user not matching the targeting criteria will not be attributed.

Automating User Activity: fakes installs on simulated devices.
(1) Fraud pattern: Clicks, installs, sessions, and in some cases even in-app user behavior are triggered endlessly by server-side software.
(2) Countermeasure: Any install coming from an IP address associated with the aforementioned services (VPNs, well-known IP ranges) should be rejected, blocking the faulty attribution before it happens.

Poaching Organic Installs with click spam and pre-loading ads.
(1) Fraud pattern: Sending a myriad of background clicks for a multitude of offers from as many devices on the market as possible. (similar to VBA)
(2) Countermeasure: Create an attribution model based on gathered click-to-install data. [flat distribution -> fraud; exponential distribution -> genuine]
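One way to turn the flat-versus-exponential rule into a test is to compare the log-likelihood of the observed click-to-install times under both models. The sketch below is an assumption about how such a model could look, not the whitepaper's method; it ignores truncation at the attribution window. With n samples, mean m and window W, the uniform log-likelihood is -n·ln(W) and the exponential one (rate 1/m) is -n·(ln(m) + 1).

```python
import math


def classify_ctit(ctit_seconds, window_seconds):
    """Return "exponential" or "flat", whichever model fits the CTITs better.

    ctit_seconds: positive click-to-install times in seconds.
    window_seconds: attribution window length (support of the uniform model).
    """
    n = len(ctit_seconds)
    if n == 0:
        raise ValueError("need at least one click-to-install time")
    mean = sum(ctit_seconds) / n
    if mean <= 0:
        raise ValueError("click-to-install times must be positive")
    # Exponential with maximum-likelihood rate 1/mean.
    ll_exponential = -n * (math.log(mean) + 1.0)
    # Uniform on [0, window_seconds].
    ll_uniform = -n * math.log(window_seconds)
    return "exponential" if ll_exponential > ll_uniform else "flat"
```

Under these assumptions the decision reduces to whether the mean CTIT is below W/e, so a source whose installs are spread evenly over a 7-day window is classified as flat.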

Faking SDK-Triggered Installs: driven via fraudulent HTTP calls
(1) Fraud pattern: Track partners who aren't properly encrypting their data, and then use this information to spoof SDK-transmitted install data.

(2) Countermeasure: Prevent tampered HTTP calls.
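The whitepaper does not say how tampering is prevented. One common technique, shown here purely as an assumed example, is to sign each SDK payload with an HMAC over a shared secret so the server can reject modified or forged calls. The payload fields and the secret handling are hypothetical.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(payload: dict, signature: str, secret: bytes) -> bool:
    """Check the signature in constant time; False means tampered or forged."""
    expected = sign_payload(payload, secret)
    return hmac.compare_digest(expected, signature)
```

A signature alone does not stop replayed calls; a timestamp or nonce inside the signed payload would be needed for that.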

2.2 Mobile_Fraud_eBook

The main prevention methods include:

Active IP, UserAgent and deviceId filtering.

Distribution Modelling. Detecting anomalies such as MTTI, geographic distribution, click volume by IP address and deviceId, UserAgent versus IP benchmarks and more.

Device ranking.

Install and in-app receipt validation, by connecting to the app store's servers to validate the legitimacy of an install or in-app purchase.

The following examples will help us open our eyes to potential threats:

IP-related

Large number of clicks / installs / unique identifiers from the same IP.

Different IP locations between the ad click and the install / first launch

Consistency/patterns

Click / install every 20 seconds

Players / users from a specific source always drop off at the exact same point in a game / app (e.g. before a game tutorial, before a registration)

Large number of installs from the same device brand / model
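A pattern such as "a click or install every 20 seconds" shows up as near-constant gaps between events. A minimal sketch, assuming Unix timestamps per source and an illustrative cutoff `max_cv` on the coefficient of variation of the gaps:

```python
import statistics


def is_metronomic(timestamps, max_cv=0.05, min_events=5):
    """True if events arrive at an almost perfectly regular interval."""
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        # Every event carries the same timestamp.
        return True
    # Coefficient of variation: spread of the gaps relative to their mean.
    return statistics.pstdev(gaps) / mean_gap <= max_cv
```

Human traffic has irregular gaps, so its coefficient of variation is typically far above such a cutoff.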

DeviceId-related

Different identifiers for the same device.

Multiple IDFAs for a single IDFV (identifier for vendor).
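The IDFA-per-IDFV check is a group-and-count. In this sketch the event fields `idfv` and `idfa` and the `max_idfas` threshold are assumptions; a threshold above 1 may be sensible because a real user can also reset the advertising identifier.

```python
from collections import defaultdict


def idfvs_with_multiple_idfas(events, max_idfas=1):
    """Map each IDFV that was seen with more than max_idfas distinct IDFAs."""
    seen = defaultdict(set)
    for event in events:
        seen[event["idfv"]].add(event["idfa"])
    return {idfv: idfas for idfv, idfas in seen.items() if len(idfas) > max_idfas}
```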

Performance-related

A sharp increase in install volume together with a stark decline in day-1 retention.

Premium traffic performing like low quality traffic.

Suspiciously low pricing.

Extremely low conversion rates.

Extremely high uninstall rates.

Mismatches:

App versions different from the versions available in the store.

Platform mismatches between ad click and install.

Geographic mismatches between ad click and install.

Other issues:

Appearance of GEOs not included in the targeting criteria

In-app events whose transaction value does not exist in the app.

Device IDs that increase in the same pattern.

Large volume of installs without data on carrier / city / country

2.3 applift-fraud-ebook-the-next-battleground

The Typology of Fraud
1. Compliance Fraud：Deceitful tactics that do not directly require any specific kind of technology but aim to exploit platform vulnerabilities.
2. Technical Fraud：Fraud committed through the use of technology to "game" the ad tech system.
Technical Fraud
1. Automatic Redirection：Fraudsters implement a click-tracking link in the impression pixel, so the click is triggered at the time when the banner is loaded, but not when the user actually clicks on the ad. If users decide not to download the app now, the device is already tagged by the tracking solution. Later on, if the user decides to download the app, the conversion will ultimately be attributed to the fraudster.
2. Ad Stacking： Fraudsters stack several invisible banners on top of each other, while only one of them can be effectively seen. Therefore, at the time of the click, it is sent not only to the original app page, but also to many different app pages as well.
3. Click Stuffing：Clicks are generated in the background of a device without the user noticing, but if they try to install an app, this conversion will be attributed to the malicious media source.
4. Click Injection: To perform click injection, fraudsters first need the user to install a malicious app (usually disguised as a useful app) that can monitor the device's activity and detect when the user is about to install a new app. This trick is possible only on Android devices, where a malicious app can get permission to receive information about many actions performed on the device, in particular about the install of another app.
5. Bots/Emulator：Bots mimic human behavior in order to fake actions, leading to fraudulent impressions, clicks, installs, and post-install events. There are several ways to spot bots, but the best way is to analyze in-app behavior after installs. Most of the time, installs made by bots don't generate any actions within the app.
6. Undisclosed Incentivized Traffic: Incentivized users stop using the app once they have received the reward, so this traffic generally has a low retention rate.
Fraud Detection
1. IP Filtering and Blocking. Fraudulent publishers that generate bot traffic tend to use different hosting solutions to deliver traffic, therefore those installs are usually manufactured from similar IPs.
2. Analysis of Devices. Publisher 5 will be marked as suspicious by our algorithm. The traffic in this case comes only from devices with three different OS versions, which is very different from the range of OS versions seen for the rest of the publishers.
3. Intraday Distribution of Installs. If we look at the traffic that was identified as fraud, we can see that there is a very flat distribution of installs with an abnormal spike at 3:00 AM.
4. Click-To-Install Time
  
  There are other issues that affect the Click-To-Install Time (CTIT) distribution, such as the actual size of the app. The heavier the app, the more likely we will see a flatter distribution, since it takes more time for users to download and install the app. Because it takes a bit of time, users may not want to wait for an app to download, so they’ll continue with other activity and then open the app much later. This can lead to very long CTITs, which are perfectly normal. When evaluating these KPIs marketers need to be aware of these other influencing factors. (The size of the app package can also produce a long tail in the CTIT curve.)
  CTIT should be examined at several time scales:
  (a) If CTIT is concentrated within the first hour and there is no obvious long tail, fraud is less likely.
  (b) If (a) holds but CTIT is concentrated in a very short interval (say 20 seconds), it is likely fraud; this pattern is caused by click injection.
  (c) When CTIT is viewed by day, an obvious long tail (beyond 1 day) indicates fraud; this pattern is caused by click spamming.
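The three time scales in (a) to (c) can be summarised as shares of installs per bucket. The bucket boundaries (20 seconds, 1 hour, 1 day) come from the notes above; the cutoffs `max_short`, `max_tail` and `min_first_hour` are assumed values for illustration, and app size should be considered before acting on a long tail.

```python
def ctit_profile(ctit_seconds):
    """Share of installs under 20 s, within 1 h, and beyond 1 day."""
    n = len(ctit_seconds)
    if n == 0:
        raise ValueError("need at least one click-to-install time")
    return {
        "under_20s": sum(1 for t in ctit_seconds if t < 20) / n,
        "within_1h": sum(1 for t in ctit_seconds if t <= 3600) / n,
        "over_1d": sum(1 for t in ctit_seconds if t > 86400) / n,
    }


def ctit_verdict(profile, max_short=0.3, max_tail=0.1, min_first_hour=0.5):
    """Apply checks (b), (c), (a) in that order and return a label."""
    if profile["under_20s"] > max_short:
        return "suspicious: very short CTIT"
    if profile["over_1d"] > max_tail:
        return "suspicious: long tail"
    if profile["within_1h"] >= min_first_hour:
        return "likely genuine"
    return "inconclusive"
```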
Fraud Fighting Matrix

2.4 FraudShield

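Anti-fraud approaches
1. IP Filter
2. Incentivized Traffic
  
  The channel is very likely carrying incentivized traffic during this time window (a sudden jump in conversion rate). Incentivized traffic is a form of fraud, as noted earlier.
3. Click Spam
  
  The intraday distribution of normal traffic is never a flat distribution.
4. Fraud Profile
  
  On mobile this corresponds to: for normal traffic, the fingerprint information should be a balanced mix.
5. Goals Report
  
  Set goals that only real traffic can achieve, for example reaching level 5 or higher in a game.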
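For the click spam check in item 3, one possible way to quantify how flat an intraday distribution is uses the normalised entropy of the hourly histogram: 1.0 means perfectly flat, 0.0 means everything falls in a single hour. This metric is an assumption chosen for illustration, not something FraudShield documents.

```python
import math


def hourly_flatness(event_hours):
    """Normalised entropy of events over the 24 hours of the day (0.0 to 1.0)."""
    counts = [0] * 24
    for hour in event_hours:
        counts[hour % 24] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one event")
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c)
    return entropy / math.log(24)
```

Normal traffic follows a day/night cycle and scores visibly below 1.0; a source that stays near 1.0 day after day deserves a closer look.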

2.5 click-fraud-detection-on-advertiser-side

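Anti-fraud approaches
1. Use supervised machine learning to classify clicks as fraudulent or not (bot click vs. real click), with a C4.5 decision tree as the classifier.
2. Feature set: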