Codex Exposed: Task Automation and Response Consistency

01/21/2022 | 08:13am EDT

In June 2020, OpenAI released version 3 of its Generative Pre-trained Transformer (GPT-3), a natural language transformer that took the tech world by storm with its uncanny ability to generate text seemingly written by humans. But GPT-3 was also trained on computer code, and recently OpenAI released a specialized version of its engine, named Codex, tailored to help - or perhaps even replace - computer programmers.

In a series of blog posts, we explore different aspects of Codex and assess its capabilities with a focus on the security aspects that affect not only regular developers but also malicious users. This is the third part of the series. (Read the first and second parts here and here.)

Being able to automate tasks or programmatically execute them unsupervised is an essential part of both regular and malicious computer usage, so we wondered if a tool like Codex was reliable enough to be scripted and left to run unsupervised, generating the required code.

As it turned out, one could not step into the same river twice: It was immediately apparent that Codex is not a deterministic system, nor a predictable one. This means that the results are not necessarily repeatable. By its very nature, the massive neural network behind GPT-3 and Codex is a black box, the inner workings of which are tuned by feeding it a huge set of training texts from which it "learns" the statistical relationships between words and symbols that ultimately constitute a faithful imitation of users' natural languages. This has several consequences that users should keep in mind while interacting with GPT-3 in general or Codex in particular, such as:

  • Since it is a natural language transformer, all interactions with the system happen in natural language. This is also known as "prompt-based programming" and it basically means that the output of the transformer heavily depends on how the input question is formulated. Even slight variations on what is seemingly the same question can lead to massively different results.
  • Among the possible outputs, empty results or plain gibberish can also occur, as we experienced especially during our first attempts.
  • Whenever this happens, there is really no indication of a discernible reason as to why the system decided to respond with noise rather than a coherent result.
Figure 1. The same question, asked at different times, leading to dramatically different results

In the two screenshots above, the same question ("generate a list of ani alu") was asked, but the results were completely different. One was just a long sequence of spaces, while the other was legitimate code. No other parameters were changed. (The user input is highlighted in red.)

In another example, we can appreciate the stochastic - that is, random - nature of the system by looking at how two subsequent and apparently identical requests lead to different pieces of code being generated. Only the most attentive reader might spot a space too many in the request prompt.

Figure 2. Two queries that differ only by one space

Essentially the same query ("python code get password router") was used in both cases, except that the latter case had an extra space. (The input fields are highlighted in red.)
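When driving the model from a script, one low-effort mitigation is to canonicalize prompts before sending them, so that accidental whitespace differences like the one above cannot produce two distinct requests. A minimal sketch (the `normalize_prompt` helper is our own illustration, not part of any Codex tooling):

```python
import re

def normalize_prompt(prompt: str) -> str:
    """Collapse runs of whitespace and trim the ends so that two
    prompts differing only in incidental spacing become identical."""
    return re.sub(r"\s+", " ", prompt).strip()

# The two queries from Figure 2 differ only by a stray space,
# but normalize to the same byte-identical string:
a = normalize_prompt("python code get password router")
b = normalize_prompt("python code  get password router")
assert a == b
```

This does not make the model itself deterministic, but it at least removes one accidental source of variation between apparently identical requests.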

When interacting with Codex manually, this behavior is not a major problem, and the workaround is to iterate and simply attempt to formulate the prompt differently. However, this makes it very difficult, if not impossible, to use the language transformer programmatically. Imagine writing a script to perform many requests to Codex to generate, for example, a set of code snippets in an unsupervised manner: One would need some logic dedicated to detecting and fixing or discarding any garbled response.
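The kind of detect-and-retry logic such a script would need can be sketched as follows. The heuristics and names here are our own assumptions; `generate` stands in for whatever API client function actually performs the completion request:

```python
import string

def looks_garbled(text: str, min_code_chars: int = 10) -> bool:
    """Heuristic check for the failure modes described above:
    empty output, pure whitespace, or mostly non-printable noise."""
    stripped = text.strip()
    if len(stripped) < min_code_chars:
        return True
    printable = sum(c in string.printable for c in stripped)
    return printable / len(stripped) < 0.9

def generate_with_retries(prompt, generate, max_attempts=3):
    """Call the completion function until it returns something that
    passes the sanity check, or give up after max_attempts."""
    for _ in range(max_attempts):
        candidate = generate(prompt)
        if not looks_garbled(candidate):
            return candidate
    return None  # caller must handle the case where every attempt failed
```

Even a simple filter like this adds supervision logic that a deterministic code generator would not require, which is precisely the overhead discussed here.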

Another realization that arose from our various attempts at generating code is that, contrary to a popular misconception, Codex does not behave like a search engine for code. Instead, it tries to play an ad-lib game with the user, aiming to complete whatever input comment is provided with the code that in its "experience" would "go well" with the input prompt. The question it tries to answer is not the one the user asked in the comment itself, and the input should not be treated as such. Rather, the question Codex tries to answer is, "What (code) should I write to finish the paragraph the best, given such a beginning?" It is a subtle but important difference that can lead to dramatically different results, as shown in the examples below.

Figure 3. A different formulation of the same request leading to dramatically different results

The query used here was "list soafee". (The inputs are highlighted in red.) These examples show how a small variation in what was asked, merely giving a more descriptive prompt, led to an actual result rather than an empty output.
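In practice, this means a prompt tends to work better when it reads like the opening of a source file the model can plausibly continue, rather than a search-engine keyword query. A hypothetical helper (entirely our own illustration, not part of Codex) showing that reframing:

```python
def as_completion_context(query: str) -> str:
    """Reframe a terse, search-style query as the beginning of a
    Python file - the kind of 'paragraph' Codex tries to finish."""
    return (
        "# Python 3\n"
        f"# This script will {query}.\n"
        "def main():\n"
    )

# A keyword query becomes a descriptive prompt ending mid-definition,
# inviting the model to complete the function body:
context = as_completion_context("list soafee")
```

Wrapping the query this way mirrors the observation above: the more the input resembles the start of a real program, the more likely the completion is an actual result rather than empty output.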

In the end, trying to automate Codex to perform repeated tasks, unsupervised, very often implies having to check the output and filter out all garbled responses. For many types of projects, whether they are malicious or not, this task of filtering and fixing the response might very well end up being more labor-intensive than, say, resorting to a more traditional solution to achieve the same end result. This makes Codex a difficult choice when constant human supervision cannot be guaranteed.


Trend Micro Inc. published this content on 21 January 2022 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 21 January 2022 13:12:01 UTC.
