{"id":997291,"date":"2023-05-26T15:46:00","date_gmt":"2023-05-26T07:46:00","guid":{"rendered":"https:\/\/geetests.com\/article\/aigc-bot-mitigation-exploration"},"modified":"2025-09-12T18:12:24","modified_gmt":"2025-09-12T10:12:24","slug":"aigc-bot-mitigation-exploration","status":"publish","type":"post","link":"\/en\/article\/aigc-bot-mitigation-exploration","title":{"rendered":"GeeTest&#8217;s AIGC Journey: Reshaping CAPTCHA Verification and Bot Mitigation"},"content":{"rendered":"<div class=\"vgblk-rw-wrapper limit-wrapper\"><span class=\"ql-size-14px\">Since GeeTest pioneered the new generation of intelligent CAPTCHA in 2013, we&#8217;ve been at the forefront of evolving and enhancing bot detection &amp; mitigation technologies. Adopted by nearly 400,000 developers worldwide over the past decade, our ground-breaking &#8220;Behavioral Verification&#8221; has not only improved user experience but has also fortified the battle against bot-generated fraudulent activities. Leading companies such as Binance, Imperva, miHoYo, and Agoda have integrated our CAPTCHA into their systems, transitioning away from traditional verification methods.<\/span><\/p>\n<p><span class=\"ql-size-14px\">In this ever-advancing cybersecurity landscape, we at GeeTest have consistently pushed boundaries, exploring and integrating cutting-edge technologies. 
From the early adoption of <\/span><a class=\"ql-size-14px\" href=\"https:\/\/en.wikipedia.org\/wiki\/Neural_style_transfer\" target=\"_blank\" rel=\"noopener noreferrer\">Neural Style Transfer<\/a><span class=\"ql-size-14px\"> technology in 2016, we&#8217;ve sought the perfect balance between user experience and security.<\/span><\/p>\n<p class=\"ql-align-center\"><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/neural-style-transfer.png\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">The working mechanism of Neural Style Transfer (<\/span><a style=\"color: #8f959e;\" href=\"https:\/\/arxiv.org\/pdf\/1508.06576.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Image Source<\/a><span style=\"color: #8f959e;\">)<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/geetest-semantic-understanding.png\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">GeeTest&#8217;s innovation integrating &#8220;semantic understanding&#8221; with &#8220;Neural Style Transfer&#8221;, protected by patent ZL 201830130077.X (Source: GeeTest)<\/span><\/p>\n<p><span class=\"ql-size-14px\">Today, as the realm of AI-Generated Content (AIGC) &#8211; including Text-to-Image models, Large-scale Text-to-Image Generation Models (LTGMs), Large Language Models (LLMs), and beyond &#8211; continues to evolve, we seize every opportunity to learn, experiment, and assimilate these advanced technologies to further enhance our bot protection strategies.<\/span><\/p>\n<p><span class=\"ql-size-14px\">This article takes you behind the scenes of our exploration into these generative AI technologies, offering insights into our process, progress, and intriguing experimental results.<\/span><\/p>\n<h2><strong>Understanding Text-to-Image Models<\/strong><\/h2>\n<p><span class=\"ql-size-14px\">Text-to-image models represent a pioneering approach within Multimodal Deep 
Learning. They can produce images that semantically correspond to text descriptions. Through the process of connecting visual features and linguistic information, these models bridge the gap between textual prompts and their visual counterparts.<\/span><\/p>\n<p><span class=\"ql-size-14px\">Trained on extensive datasets containing paired natural language descriptions and corresponding images, these models extract key features from the text and map them onto the visual representation in the images. The training process involves sophisticated semantic comprehension and image synthesis.<\/span><\/p>\n<p><span class=\"ql-size-14px\">With this training, the models can generate images from previously unseen textual descriptions. The text is first encoded into a feature vector, which then facilitates the synthesis of a corresponding image through a generator network.<\/span><\/p>\n<p><span class=\"ql-size-14px\">The applications of these models span across several domains, covering the creation of realistic product images for e-commerce websites, development of visual aids for individuals with disabilities, generation of images for virtual and augmented reality applications, and production of CAPTCHA images.<\/span><\/p>\n<h2><strong>Addressing Challenges in AI Image Generation<\/strong><\/h2>\n<p><span class=\"ql-size-14px\">In my work with AI image generation, I&#8217;ve found that there are several challenges to be addressed in terms of accuracy, controllability, and scalability.<\/span><\/p>\n<h3><strong>Accuracy<\/strong><\/h3>\n<p><span class=\"ql-size-14px\">One of the main issues with AI image generation processes relates to accuracy. It&#8217;s clear that most pre-trained models are designed around English, which often leads to translation-induced ambiguities and mismatches between text and images. 
A word with multiple meanings can cause significant confusion.<\/span><\/p>\n<p><span class=\"ql-size-14px\">For example, the images shown below represent a classic case of translation ambiguity. The English term &#8220;crane&#8221; can refer to either a construction machine or a bird, hence the model generates both types of images.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/crane-example.jpg\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">Source: GeeTest<\/span><\/p>\n<p><span class=\"ql-size-14px\">Here, we illustrate another challenge faced when generating images based on our existing prompt library. For instance, when the prompt is &#8220;electric mouse&#8221;, the term &#8220;mouse&#8221;, which could signify an animal or a computer peripheral in English, was mistranslated to signify an actual &#8220;mouse&#8221;, resulting in generated images depicting only the rodent. Furthermore, a conspicuous mismatch can be noted in the image at coordinates [1,2], where the generated image bears no relation to the provided prompt.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/electric-mouse-example.jpeg\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">Source: GeeTest<\/span><\/p>\n<p><span class=\"ql-size-14px\">Similarly, this issue of semantic inconsistency persists with the prompt &#8220;can&#8221;. 
While typically referring to an aluminum container, it unexpectedly generates a variety of unrelated images, including a cat, an aluminum suitcase, an aluminum beverage can, and a barn-like structure, among others.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/can-example.jpg\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">Source: GeeTest<\/span><\/p>\n<p><span class=\"ql-size-14px\">Next, we present an instance where the model, switched to a cartoon style, encountered both semantic ambiguity and sensitivity. Using the same prompt &#8220;electric mouse&#8221;, we observed a variety of responses. Remarkably, it produced images ranging from mice (the animal) morphed into the form of computer mice, to anime characters exhibiting mouse-like features. This demonstrates that the challenges persist across styles and contexts, reaffirming the need for careful attention and innovative solutions in AI image generation tasks.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/electric-mouse-cartoon.png\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">Source: GeeTest<\/span><\/p>\n<p><span class=\"ql-size-14px\">These instances underline the need for models to better handle language nuances and improve the accuracy of the images generated. One avenue to explore is an open-source project like <\/span><a class=\"ql-size-14px\" href=\"https:\/\/fengshenbang-lm.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Fengshenbang-LM<\/a><span class=\"ql-size-14px\">, built around Chinese by IDEA-CCNL. Given its localized structure, Fengshenbang could serve as a potential solution to accuracy-related issues in Chinese-language applications. 
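<\/span><\/p>
<p><span class=\"ql-size-14px\">One lightweight mitigation for this kind of ambiguity is to rewrite known-ambiguous terms into qualified phrases before they ever reach the generation model. The sketch below is a toy illustration only; the mapping and function name are hypothetical, not GeeTest&#8217;s production code:<\/span><\/p>

```python
# Toy sketch: rewrite known-ambiguous prompt terms into qualified
# phrases before they reach a text-to-image model.
# DISAMBIGUATION and disambiguate() are hypothetical names used
# purely for illustration.
DISAMBIGUATION = {
    'crane': 'crane (the bird)',       # vs. the construction machine
    'mouse': 'computer mouse',         # vs. the rodent
    'can': 'aluminum beverage can',    # vs. the modal verb
}

def disambiguate(prompt):
    # Replace each ambiguous word with its qualified phrase;
    # leave every other word untouched.
    return ' '.join(DISAMBIGUATION.get(w.lower(), w) for w in prompt.split())

print(disambiguate('electric mouse'))  # -> electric computer mouse
```

<p><span class=\"ql-size-14px\">In practice, such a lexicon would need to be curated per source language and reviewed by humans, since blind substitution can itself introduce mismatches.<\/span><\/p>
<p><span class=\"ql-size-14px\">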
Yet, in large-scale applications, the blend of advanced AI and human supervision remains critical to ensuring appropriate image generation.<\/span><\/p>\n<h3><strong>Controllability<\/strong><\/h3>\n<p><span class=\"ql-size-14px\">Sensitive material and issues of fairness and bias can be partially addressed with a safety checker. However, implementing additional measures can significantly enhance these mitigation efforts:<\/span><\/p>\n<ul>\n<li><strong class=\"ql-size-14px\">Diverse Data Collection<\/strong><span class=\"ql-size-14px\">: Ensuring that datasets used for image generation are representative, encompassing individuals and scenes from various backgrounds, cultures, and races to produce unbiased images.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Equitable Model Training<\/strong><span class=\"ql-size-14px\">: Employing fair and equitable methods when training text-to-image models. Techniques like &#8220;fairness constraints&#8221; ensure that generated images are free from discriminatory features. 
Specifically, models are trained to avoid discriminatory attributes such as age, race, gender, and others, thereby eliminating potential biases.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Supervision and Auditing<\/strong><span class=\"ql-size-14px\">: Regular human supervision and auditing during the model training and image generation stages ensure images meet ethical and moral standards.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Avoiding Sensitive Topics<\/strong><span class=\"ql-size-14px\">: Steering clear of generating images related to race, gender, religion, politics, or other sensitive subjects.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Transparency<\/strong><span class=\"ql-size-14px\">: Making public the training methods, datasets, and auditing procedures helps the community understand the use and potential impacts of these technologies.<\/span><\/li>\n<\/ul>\n<h3><strong>Scalability<\/strong><\/h3>\n<p><span class=\"ql-size-14px\">In the sphere of AI image generation, scalability plays a critical role, especially in the context of mass content production. Achieving scalability requires careful computational resource planning and GPU resource allocation. This involves addressing three core demands:<\/span><\/p>\n<ul>\n<li><strong class=\"ql-size-14px\">Model Servicing<\/strong><span class=\"ql-size-14px\">: Large models, by their very nature, require GPU resources for inference. These resources, while indispensable, can be cost-prohibitive, especially when acquired from cloud-based services.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Resource Optimization<\/strong><span class=\"ql-size-14px\">: Initially, to limit one-time investment, GPU resources are used on a pay-as-you-go basis. 
This strategy avoids hefty monthly rentals and promotes efficient resource usage.<\/span><\/li>\n<li><strong class=\"ql-size-14px\">Efficient Scaling<\/strong><span class=\"ql-size-14px\">: It&#8217;s imperative that the codebase for model servicing remains as compact as possible, facilitating straightforward horizontal scaling.<\/span><\/li>\n<\/ul>\n<p><span class=\"ql-size-14px\">Responding to these pressing needs, we&#8217;ve built a model service architecture leveraging Ray and K8s. This system enables a minimalist approach to code volume in model service deployment while providing flexibility for easy horizontal scaling.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/A-model-device-based-on-Ray-and-K8s.png\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">A model service based on Ray and K8s (Source: GeeTest)<\/span><\/p>\n<p><span class=\"ql-size-14px\">As illustrated above, this architecture enables us to deploy a model service with the least possible code volume.<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/image-6.png\" alt=\"\"><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/image-1-3.png\" alt=\"\"><\/p>\n<p class=\"ql-align-center\"><span style=\"color: #8f959e;\">Source: GeeTest<\/span><\/p>\n<p><span class=\"ql-size-14px\">These strategic improvements significantly bolster scalability and agility, both of which are vital for the large-scale application of AIGC in bot mitigation strategies.<\/span><\/p>\n<h2><strong>Integrating Generative AI into GeeTest CAPTCHA Frameworks<\/strong><\/h2>\n<p><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/geetest_application1.png\" alt=\"\"><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/geetest_application2.png\" alt=\"\"><\/p>\n<p 
class=\"ql-align-center\"><span class=\"ql-size-14px\" style=\"color: #8f959e;\">Real-world applications of GeeTest CAPTCHAs generated by Generative AI Models (Source: GeeTest)<\/span><\/p>\n<h2><strong>Looking Ahead<\/strong><\/h2>\n<p><span class=\"ql-size-14px\">As we&#8217;ve explored, the integration of Generative AI technologies into GeeTest&#8217;s bot mitigation strategies brings promising possibilities, yet also presents intriguing challenges. Our proactive approach to resolving accuracy, controllability, and scalability issues has paved the way for significant advancements in CAPTCHA technology. As we look to the future, our focus remains on developing advanced AI image generation techniques to combat threats like &#8220;ML model cracking&#8221; and &#8220;CAPTCHA solving farms&#8221;. These forms of attacks often work in conjunction, where manual labor initially solves CAPTCHA for bots, followed by the utilization of AI\/ML algorithms to train models for automatic cracking. Therefore, our focus lies in enhancing the updating speed of images and their resistance against model cracking.<\/span><\/p>\n<p><span class=\"ql-size-14px\">Currently, GeeTest processes billions of daily API calls and maintains a stringent service solution, backed by our extensive experience in image security. We ensure automated updates of our entire online image resources every hour, incorporating 50,000 images across 200 categories. For clients with higher instantaneous requirements, we offer automated updates of 10,000 images in 50 categories every 10 minutes.<\/span><\/p>\n<p><span class=\"ql-size-14px\">I look forward to sharing more about our ongoing explorations and trials in the world of AI-generated content (AIGC). We&#8217;ll also dive into more real-world adversarial examples using LTGMs, discussing in detail how our models offer anti-cracking advantages in CAPTCHA verification and bot prevention. 
The journey continues, and I&#8217;m excited for what lies ahead.<\/span><\/p>\n<p><a href=\"https:\/\/www.geetest.com\/en\/Register_en\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/geetests.com\/wp-content\/uploads\/2025\/09\/ad_01_728_90-2-1.png\" alt=\"\"><\/a><\/div>\n<p><!-- .vgblk-rw-wrapper --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Embark on GeeTest&#8217;s pioneering journey as we navigate the world of AI-Generated Content (AIGC). Explore how we are leveraging advanced AI technologies to enhance bot mitigation strategies and shape the future of CAPTCHA technology.<\/p>\n","protected":false},"author":2,"featured_media":996844,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[90],"tags":[],"class_list":["post-997291","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cyberwatch"],"_links":{"self":[{"href":"\/en\/wp-json\/wp\/v2\/posts\/997291","targetHints":{"allow":["GET"]}}],"collection":[{"href":"\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"\/en\/wp-json\/wp\/v2\/comments?post=997291"}],"version-history":[{"count":2,"href":"\/en\/wp-json\/wp\/v2\/posts\/997291\/revisions"}],"predecessor-version":[{"id":997512,"href":"\/en\/wp-json\/wp\/v2\/posts\/997291\/revisions\/997512"}],"wp:featuredmedia":[{"embeddable":true,"href":"\/en\/wp-json\/wp\/v2\/media\/996844"}],"wp:attachment":[{"href":"\/en\/wp-json\/wp\/v2\/media?parent=997291"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"\/en\/wp-json\/wp\/v2\/categories?post=997291"},{"taxonomy":"post_tag","embeddable":true,"href":"\/en\/wp-json\/wp\/v2\/tags?post=997291"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}