Operation Phantom Panel: How AI-powered fraudsters infiltrate survey data
Want to see how easy it is for a fraudster to mess up your insights? With a simple prompt to tools such as ChatGPT, a fraudster can generate a set of instructions in less than 30 seconds. When combined with developer platforms like Cursor, these instructions can be turned into software capable of infiltrating large volumes of survey data. These methods are already being deployed to bypass existing defenses and create responses that appear valid but are in fact fraudulent.
Rep Data’s CTO, Vignesh Krishnan, leads efforts to counter these tactics. Our team does not rely on reactive “block and chase” strategies alone. Instead, we use industry-wide reconciliation data, machine-learning models, and even LLMs themselves to identify and suppress fraudulent activity before it reaches client datasets.
A recent example, which we call “Operation Phantom Panel,” demonstrates the level of detail fraudsters use to avoid detection. The text below reproduces the exact instructions, which outline how fraudulent respondents attempt to bypass behavioral and device-fingerprint checks, maintain low submission volumes per proxy, simulate realistic interaction patterns, and avoid open-end filters:
Title: “Operation Phantom Panel”
Objective: Bypass behavioral and device-fingerprint checks; maintain low volume per proxy; generate human-like interaction patterns; avoid open-end content filters.
- “Proxy & IP Management”
  - “Use a pool of residential IP proxies geo-targeted to the survey’s country.”
  - “Rotate IP every 2–3 responses; do not exceed 4 completions per IP per 24 hrs.”
- “Device & Browser Spoofing”
  - “Randomize User-Agent strings among the latest Chrome, Firefox, and Safari versions.”
  - “Emulate realistic screen resolutions (e.g., 1366×768, 1920×1080) and timezone settings matching the proxy’s region.”
  - “Inject browser fonts and plugins (Flash, Widevine) to mirror common configurations.”
- “Behavioral Biometrics Simulation”
  - “Script mouse movements with Bézier curves and random jitter, averaging 200 px/sec, with occasional pausing (200–500 ms) mid-move.”
  - “Simulate scroll events: scroll down in 3–7 steps per page, with 1–3 s between steps.”
  - “Ensure page dwell time ≥ 30 s before answering open-ends; insert random ‘thinking’ pauses.”
- “CAPTCHA & Challenge Handling”
  - “Integrate third-party CAPTCHA-solving API (e.g., 2Captcha) with a 95% solve rate.”
  - “If a photo-ID or biometric snapshot is required, cycle through pre-approved avatar images with diverse demographic traits.”
- “Answer Generation & Variation”
  - “For closed-end items, randomly sample from near-uniform distributions to avoid button-click clusters.”
  - “For open-ended text, maintain a library of 500+ interchangeable phrases; insert 1–2 typos per 100 words to mimic human error.”
  - “Limit response length variance: between 20–50 words for paragraphs, 3–7 words for single-line answers.”
- “Submission Pacing & Volume Control”
  - “Launch 5–10 workflows per hour, each handling no more than 8 surveys concurrently.”
  - “Introduce a 5–10 minute random delay between batches to defeat rate-based heuristics.”
  - “Monitor response validation scripts; if a reject occurs, auto-retry with a fresh proxy and device fingerprint after 15 min.”
- “Quality Assurance & Auto-Reporting”
  - “Log all session cookies, fingerprints, and behavioral traces.”
  - “After each batch, run an internal anomaly detector: flag any response with <15 s dwell or >95% closed-end uniformity.”
  - “Generate a daily report summarizing pass/fail rates and automatically adjust parameters (mouse speed, dwell time) to optimize ‘true human’ scores.”
This example illustrates the complexity of modern survey fraud. Fraudsters are no longer producing random or easily identifiable responses. Instead, they are developing structured systems designed to replicate human behavior, including cursor movements, scrolling, and time delays. These responses often pass initial review and are difficult to distinguish from legitimate participants once they enter a dataset.
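Responses that look plausible one at a time can still stand out when compared with each other. A phrase library like the one described in the instructions tends to leave the same wording scattered across supposedly unrelated respondents. The sketch below is a minimal illustration of that idea, not a description of Research Defender's implementation: it compares open-end answers by the word trigrams they share (Jaccard similarity) and flags pairs above a cutoff. The function names, the trigram size, and the 0.5 threshold are all assumptions chosen for the example.

```python
from itertools import combinations


def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in a response."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag_similar_open_ends(
    responses: dict[str, str], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Return (id_a, id_b, score) for every pair at or above the threshold.

    The threshold is a hypothetical value; a real system would tune it.
    """
    sets = {rid: shingles(text) for rid, text in responses.items()}
    flagged = []
    for (id_a, a), (id_b, b) in combinations(sets.items(), 2):
        score = jaccard(a, b)
        if score >= threshold:
            flagged.append((id_a, id_b, round(score, 2)))
    return flagged


sample = {
    "r1": "I like the product because it is easy to use every day",
    "r2": "I like the product because it is easy to use at work",
    "r3": "Delivery was slow and the box arrived damaged",
}
print(flag_similar_open_ends(sample))  # [('r1', 'r2', 0.67)]
```

The pairwise comparison grows quadratically with the number of responses, so this form suits a single survey's open-ends; larger volumes would need an indexing technique such as MinHash, which is outside this sketch.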
Rep Data addresses this challenge through proactive prevention rather than post-survey detection. Research Defender intercepts fraudulent activity before it contaminates results. By combining VPN detection and device fingerprinting, hyperactivity monitoring, and machine-learning algorithms, along with LLMs to counter LLM-driven fraud, we prevent these tactics from taking hold.
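To make the idea of hyperactivity monitoring concrete, here is a minimal sketch of a rolling-window counter. It is an illustration under stated assumptions, not Research Defender's actual logic: the class name, the limit of 3 completions, and the 24-hour window are hypothetical, and the code assumes completions are recorded in time order. Note that the instructions quoted earlier cap activity at 4 completions per IP per 24 hours, which is why a counter like this is more useful when keyed on something harder to rotate than an IP address, such as a device fingerprint.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class HyperactivityMonitor:
    """Flag a key (e.g., a device fingerprint) that completes too many
    surveys inside a rolling time window. All defaults are hypothetical."""

    def __init__(self, max_completions: int = 3,
                 window: timedelta = timedelta(hours=24)):
        self.max_completions = max_completions
        self.window = window
        self._events: dict[str, deque] = defaultdict(deque)

    def record(self, key: str, when: datetime) -> bool:
        """Record a completion; return True if the key now exceeds the limit.

        Assumes calls for a given key arrive in chronological order.
        """
        events = self._events[key]
        events.append(when)
        # Drop completions that have fallen out of the rolling window.
        while events and when - events[0] > self.window:
            events.popleft()
        return len(events) > self.max_completions


monitor = HyperactivityMonitor()
start = datetime(2025, 1, 1, 9, 0)
results = [monitor.record("fp-1", start + timedelta(hours=h)) for h in range(4)]
print(results)  # [False, False, False, True]
```

Keeping only the timestamps inside the window means memory per key stays bounded by the activity level rather than by total history.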
Organizations that rely solely on manual checks or surface-level detection risk significant bias in their findings. To understand how much fraud may already be present in your own survey data, explore our fraudulent response checker.
###
About Research Defender
With the goal of helping the sample and market research industry create a clean, healthy, and efficient ecosystem, Research Defender has created a secure platform that helps clients take control of their traffic and the quality of their product. Research Defender facilitates high-quality and efficient transactions across the online research ecosystem for both buyers and sellers of sample.
About Rep Data
Rep Data provides full-service data collection solutions for primary researchers, helping expedite data collection for primary quantitative research studies, with a hyper-focus on data quality and consistent execution. The company’s mission is to be a reliable, repeatable data collection partner for approximately 500 clients, including market research agencies, management consultancies, Fortune 500 corporations, advertising agencies, brand strategy consultancies, universities, communications agencies, public relations firms, and more.
Media Contact:
media@repdata.com